Commit Graph

818 Commits

Author SHA1 Message Date
Andreas Rheinhardt 48cf1d878c avformat/matroskadec: Add support for FlagTextDescriptions
This is the equivalent of the WebM "D_WEBVTT/DESCRIPTIONS" and is
therefore only exported for subtitles.

Reviewed-by: Ridley Combs <rcombs@rcombs.me>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-22 04:14:20 +01:00
Andreas Rheinhardt 51e1067729 avformat/matroskadec: Add support for FlagHearing/VisualImpaired
Given that our disposition flags provide no way to distinguish the
cases of "track is unsuitable for hearing impaired users" and "it is
unknown whether the track is suitable for hearing impaired users" we do
not need to use a CountedElement for these flags.

Reviewed-by: Ridley Combs <rcombs@rcombs.me>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-22 04:12:25 +01:00
Andreas Rheinhardt 18b3deee22 avformat/matroskadec: Add support for FlagCommentary
Hint: Matroska actually provides a way to distinguish the cases of
"track is no commentary track" and "it is unknown whether the track
is a commentary track", but our disposition flags do not. Therefore
we need not use a CountedElement.

Reviewed-by: Ridley Combs <rcombs@rcombs.me>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-22 04:12:02 +01:00
Andreas Rheinhardt 05ae0b239f avformat/matroskadec: Beautify setting default values
Reviewed-by: Ridley Combs <rcombs@rcombs.me>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-22 03:58:18 +01:00
Andreas Rheinhardt 7471f473c5 avformat/matroskadec: Reindent after the previous commit
Reviewed-by: Ridley Combs <rcombs@rcombs.me>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-22 03:58:13 +01:00
Andreas Rheinhardt 0a99c3dadc avformat/matroskadec: Make reading zero-length elements spec-compliant
For a very long time, the payload of integer and float elements had to
have a length > 0. Our parser treated such invalid elements as having a
value zero. But now it has been defined what an EBML element with length
zero means: It is a shorthand for the default value. This has also been
defined for strings (both ASCII and UTF-8). This commit modifies our
parser to support this.

Reviewed-by: Ridley Combs <rcombs@rcombs.me>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-22 03:57:52 +01:00
Andreas Rheinhardt 2f1a5621d3 avformat/matroskadec: Don't use fake default value for ReferenceBlock
This has been done in order to find out whether this element is present
at all; but this can now be done in a cleaner way by using a CountedElement
for it.

Reviewed-by: Ridley Combs <rcombs@rcombs.me>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-22 03:57:52 +01:00
Andreas Rheinhardt e8f3901604 avformat/matroskadec: Check min_luminance more thoroughly
In the absence of an explicitly coded minimal luminance, the current
code inferred it to be -1, an invalid value. Yet it did not check the
value lateron at all, so that if a valid maximum luminance is
encountered, but no minimal luminance, an invalid minimal luminance of
-1 is exported. If an minimal luminance element with a negative value is
present, it is exported, too. This can be simply fixed by adding a check
for the value of the element.

Yet given that a minimal luminance of zero Cd/m² is legal and can be
coded with a length of zero, we must not use a fake default value to
find out whether the element is present or not. Therefore this patch
uses an explicit counter for it.

While just at it, also check for max_luminance > min_luminance.

Reviewed-by: Ridley Combs <rcombs@rcombs.me>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-22 03:57:52 +01:00
Andreas Rheinhardt 5b9a0e6031 avformat/matroskadec: Allow to count the number of element occurences
Up until now, the generic EBML reader used by the Matroska demuxer did
not have the capability to record whether an element was actually
present or not; instead, in cases where it mattered one typically added
an invalid default value and checked whether the value is valid (in
which case it is guaranteed to be present). This worked pretty well so
far, yet the EBML specifications have evolved: It is now legal to use
zero-length elements for floats, ints, uints and strings (both ASCII and
UTF-8); the value of these elements is the default value of the element
(if it has one) or zero for scalar types and an empty string for
strings. Furthermore, having a default value does no longer imply that
the element may be presumed to be present (with its default value) if it
is absent; this is only true if the element is mandatory, too.

These rules are designed to allow size savings as follows: Consider the
newly added FlagOriginal: It being zero means the track is not in its
original language, it being one means it is. For backward compatibility
reasons, neither of the two values may be inferred automatically in the
absence of the element. But one can still save a byte when one wants to
write the element with a value of zero, as one can write the integer with
a length of zero: 0x55AE 80 instead of 0x55AE 81 00. In the former case,
a parser has to infer the value of the element to be zero (which is the
element's default value).

When encountering an element with length zero, our parser always infers
a value of zero (or an empty string); this is wrong for values with
a different default value. It needs to infer the default value (or zero
in its absence) and this precludes using an invalid default value for
elements like FlagOriginal. Ergo one needs to be able to record whether
an element is present or not by other means. This patch allows to use a
simple counter for this. While just at it, some invalid and unnecessary
default values have been removed (mastering metadata elements used
default values of -1.0, despite these elements only being used if they
are > 0).

Reviewed-by: Ridley Combs <rcombs@rcombs.me>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-22 03:57:16 +01:00
Michael Niedermayer 7b88dd8f0c avformat/matroskadec: Sanity check codec_id/track type
Fixes: memleak
Fixes: 27766/clusterfuzz-testcase-minimized-ffmpeg_dem_MATROSKA_fuzzer-5198300814508032

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-12-09 21:41:15 +01:00
Steve Lhomme 5bd870a212 avformat/matroskadec: only use the track duration if it exists
No need to multiplying/dividing when we know it's zero.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2020-11-20 15:20:24 +01:00
Steve Lhomme 3a2b786db0 avformat/matroskadec: adjust the cluster time to the track timebase
The Block timestamp read in matroska_parse_block() is in track timebase and is
passed on as such to the AVPacket which uses this timebase.

In the normal case the Cluster and Track timebases are the same because the
track->time_scale is 1.0. But when it is not the case, the values in Cluster
timebase need to be transformed in Track timebase so they can be added
together.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2020-11-20 15:20:24 +01:00
Steve Lhomme b00d2210e4 avformat/matroskadec: add a warning when the track TimestampScale won't be used
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2020-11-20 15:20:24 +01:00
Anton Khirnov cea7c19cda lavf: move AVStream.*index_entries* to AVStreamInternal
Those are private fields, no reason to have them exposed in a public
header. Since there are some (semi-)public fields located after these,
even though this section is supposed to be private, keep some dummy
padding there until the next major bump to preserve ABI compatibility.
2020-10-28 14:59:28 +01:00
Anton Khirnov 108864acee lavf: move AVStream.{request_probe,skip_to_keyframe} to AVStreamInternal
Those are private fields, no reason to have them exposed in a public
header.
2020-10-28 14:57:01 +01:00
James Almer 8a81820624 avcodec/packet: move AVPacketList definition and function helpers over from libavformat
And replace the flags parameter with a function callback that can be used to
copy the contents of the packet (e.g, av_packet_ref and av_packet_copy_props).

Signed-off-by: James Almer <jamrial@gmail.com>
2020-09-15 09:53:39 -03:00
Andreas Rheinhardt 6f134c6d17 avformat/matroskadec: Slightly simplify version check
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-07-24 04:25:45 +02:00
Andreas Rheinhardt 880519c1de avformat/matroskadec: Avoid undefined pointer arithmetic
The Matroska demuxer currently always opens a GetByteContext to read the
content of the projection's private data buffer; it does this even if
there is no private data buffer in which case opening the GetByteContext
will lead to a NULL + 0 which is undefined behaviour.
Furthermore, in this case the code relied both on the implicit checks
of the bytestream2 API as well as on the fact that it returns zero
if there is not enough data available.

Both of these issues have been addressed by not using the bytestream API
any more; instead the data is simply read directly by using AV_RB. This
is possible because the offsets are constants.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-07-24 04:25:45 +02:00
Andreas Rheinhardt 0841063ce6 avformat/matroskadec: Fix memleaks in WebM DASH manifest demuxer
In certain error scenarios, the underlying Matroska demuxer was not
properly closed, causing leaks.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-06-15 16:35:28 +02:00
Andreas Rheinhardt 1ef30571a0 avformat/matroskadec: Use right number of tracks
When demuxing a Matroska/WebM file, streams are added for tracks and for
attachments, so that the array containing the former can be NULL even
when the corresponding AVFormatContext has streams. So check for there
to be tracks in the MatroskaDemuxContext instead of just streams in the
AVFormatContext before dereferencing the pointer to the tracks.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-06-15 16:15:47 +02:00
Andreas Rheinhardt 3714d452b8 avformat/matroskadec: Fix handling gigantic durations
matroska_parse_block currently asserts that the duration is not equal to
AV_NOPTS_VALUE, but there is nothing that actually guarantees this. It
is easy to create (spec-compliant) files which run into this assert;
so replace it and instead cap the duration to INT64_MAX, as the duration
field of an AVPacket is an int64_t.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-06-15 15:44:27 +02:00
Andreas Rheinhardt cbe336c9e8 avformat/matroskadec: Move AVBufferRef instead of copying, fix memleak
EBML binary elements are already made reference-counted when read;
so when populating the AVStream.attached_pic, one does not need to
allocate a new buffer for the data; instead the current code just
creates a new reference to the underlying AVBuffer. But this can be
improved even further: Just move the already existing reference.

This also fixes a memleak that happens upon error because
matroska_read_close has not been called in this scenario.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-06-15 15:44:27 +02:00
Andreas Rheinhardt 1fd8528c4e avformat/matroskadec: Beautify matroska_parse_laces()
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-26 06:19:25 +02:00
Andreas Rheinhardt 39f5bb6a3f avformat/matroskadec: Use proper context for logging
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-23 07:07:43 +02:00
Andreas Rheinhardt 3bd26b285e avformat/matroskadec: Export FileDescription as title tag
Each AttachedFile in Matroska can have a FileDescription element that
contains a human-friendly name for the attached file; yet this element
has been ignored up until now. This commit changes this and exports it
as title tag instead (the Matroska muxer mapped the title tag to the
AttachedFile element since support for Attachments was added).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-19 05:02:20 +02:00
Andreas Rheinhardt ff4da60fb8 avformat/matroskadec: Allow multiple Tags elements
The Matroska specification allows multiple (level 1) Tags elements per
file, yet our demuxer didn't: While it parsed any amount of Tags
elements it found in front of the Clusters (albeit with warnings because
of duplicate elements), it would treat any Tags element only referenced
via a SeekHead entry as already parsed if any Tags element has already
been parsed; therefore this Tags element would not be parsed at all.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-08 13:42:35 +02:00
Andreas Rheinhardt 7e9103535a avformat/matroskadec: Improve handling of circular SeekHeads
There can be more than one SeekHead in a Matroska file, but most of the
other level 1 elements can only occur once.* Therefore the Matroska
demuxer only allows one entry per ID in its internal list of level 1
elements known to it; the only exception to this are SeekHeads.

The only exception to this are SeekHeads: When one is encountered
(either directly or in the list of entries read from SeekHeads),
a new entry in the list of known level-1 elements is always added,
even when this entry is actually already known.

This leads to lots of seeks in case of circular SeekHeads: Each time a
SeekHead is parsed, a new entry for a SeekHead will be added to the list
of entries read from SeekHeads. The exception for SeekHeads mentioned
above now implies that this SeekHead will always appear new and unparsed
and parsing will be attempted. This continued until the list of known
level-1 elements is full.

Fixing this is pretty simple: Don't add a new entry for a SeekHead if
its position matches the position of an already known SeekHead.

*: Actually, there can be multiple Tags and several other level 1
elements are "identically recurring" which means they may be resent
multiple times, but each instance must be absolutely identical to the
previous.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-08 13:39:24 +02:00
Andreas Rheinhardt 7c243eece3 avformat/matroskadec: Sanitize SeekHead entries
A Seek element in a Matroska SeekHead should contain a SeekID and a
SeekPosition element and upon reading, they should be sanitized:

Given that IDs are restricted to 32 bit, longer SeekIDs should be treated
as invalid. Instead currently the lower 32 bits have been used.

For SeekPosition, no checks were performed for the element to be
present and if present, whether it was excessively large (i.e. the
absolute file position described by it exceeding INT64_MAX). The
SeekPosition element had a default value of -1 which means that a check
seems to have been intended; but it was not implemented. This commit adds
a check for overflow to the calculation of the absolute file position of
the referenced level 1 elements.
Using -1 (i.e. UINT64_MAX) as default value for SeekPosition implies that
a Seek element without SeekPosition will run afoul of this check.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-08 13:37:40 +02:00
Andreas Rheinhardt 5767a2ed74 avformat/matroskadec: Free right buffer on error
Since commit 979b5b8959, reverting the
Matroska ContentCompression is no longer done inside
matroska_parse_frame() (the function that creates AVPackets out of the
parsed data (unless we are dealing with certain codecs that need special
handling)), but instead in matroska_parse_block(). As a consequence,
the data that matroska_parse_frame() receives is no longer always owned
by an AVBuffer; it is owned by an AVBuffer iff no ContentCompression needed
to be reversed; otherwise the data is independently allocated and needs
to be freed on error.

Whether the data is owned by an AVBuffer or not is indicated by a variable
buf of type AVBufferRef *: If it is NULL, the data is independently
allocated, if not it is owned by the underlying AVBuffer (and is used to
avoid copying the data when creating the AVPackets).

Because the allocation of the buffer holding the uncompressed data happens
outside of matroska_parse_frame() (if a ContentCompression needs to be
reversed), the data is passed as uint8_t ** in order to not leave any
dangling pointers behind in matroska_parse_block() should the data need to
be freed: In case of errors, said uint8_t ** would be av_freep()'ed in
case buf indicated the data to be independently allocated.

Yet there is a problem with this: Some codecs (namely WavPack and
ProRes) need special handling: Their packets are only stored in
Matroska in a stripped form to save space and the demuxer reconstructs
full packets. This involved allocating a new, enlarged buffer. And if
an error happens when trying to wrap this new buffer into an AVBuffer,
this buffer needs to be freed; yet instead the given uint8_t ** (holding
the uncompressed, yet still stripped form of the data) would be freed
(av_freep()'ed) which certainly leads to a memleak of the new buffer;
even worse, in case the track does not use ContentCompression the given
uint8_t ** must not be freed as the actual data is owned by an AVBuffer
and the data given to matroska_parse_frame() is not the start of the
actual allocated buffer at all.

Both of these issues are fixed by always freeing the current data in
case it is independently allocated. Furthermore, while it would be
possible to track whether the pointer from matroska_parse_block() needs
to be reset or not, there is no gain in doing so, as the pointer is not
used at all afterwards and the sematics are clear: If the data passed
to matroska_parse_frame() is independently allocated, then ownership
of the data passes to matroska_parse_frame(). So don't pass the data
via uint8_t **.

Fixes Coverity ID 1462661 (the issue as described by Coverity is btw
a false positive: It thinks that this error can be triggered by ProRes
with a size of zero after reconstructing the original packets, but the
reconstructed packets can't have a size of zero).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-04 07:51:29 +02:00
Andreas Rheinhardt 39fb1e968a avformat/matroskadec: Cosmetics
Reindentation as well as marking several variables used for demuxing
RealAudio as const to clearly see that they don't change during
demuxing.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-01 08:35:48 +02:00
Andreas Rheinhardt 979b5b8959 avformat/matroskadec: Support ContentCompression for all codecs
The Matroska demuxer has three functions for creating packets out of
the data read: One for certain RealAudio codecs (ATRAC3, cook, sipr,
RealAudio 28.8), one for WebVTT (actually, the WebM flavour of it) and
one for all the others. Only the last function supported Matroska's
ContentCompression (e.g. it reversed zlib compression or added the
removed headers to the packets). But in Matroska, all tracks are allowed
to be compressed. This commit adds support for this.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-01 08:31:20 +02:00
Andreas Rheinhardt 96012d17a9 avformat/matroskadec: Cache whether a track needs to be decoded
There is no need to recheck this for every frame.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-01 08:15:07 +02:00
Andreas Rheinhardt b577968cab avformat/matroskadec: Improve forward compability
Matroska is built around the principle that a reader does not need to
understand everything in a file in order to be able to make use of it;
it just needs to ignore the data it doesn't know about.

Our demuxer typically follows this principle, but there is one important
instance where it does not: A Block belonging to a TrackEntry with no
associated stream is treated as invalid data (i.e. the demuxer will try
to resync to the next level 1 element because it takes this as a sign
that it has lost sync). Given that we do not create streams if we don't
know or don't support the type of the TrackEntry, this impairs this
demuxer's forward compability.

Furthermore, ignoring Blocks belonging to a TrackEntry without
corresponding stream can (in future commits) also be used to ignore
TrackEntries with obviously bogus entries without affecting the other
TrackEntries (by not creating a stream for said TrackEntry).

Finally, given that matroska_find_track_by_num() already emits its own
error message in case there is no TrackEntry with a given TrackNumber,
the error message (with level AV_LOG_INFO) for this can be removed.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-01 08:06:52 +02:00
Andreas Rheinhardt e471faf962 avformat/matroskadec: Don't discard valid packets
A Block (meaning both a Block in a BlockGroup as well as a SimpleBlock)
must have at least three bytes after the field containing the encoded
TrackNumber. So if there are <= 3 bytes, the Matroska demuxer would
skip this block, believing it to be an empty, but valid Block.

This might discard valid nonempty Blocks, namely if the track uses header
stripping. And certain definitely spec-incompliant Blocks don't raise
errors: Those with two or less bytes left after the encoded TrackNumber
and those with three bytes left, but with flags indicating that the Block
uses lacing as then there has to be further data describing the lacing.

Furthermore, zero-sized packets were still possible because only the
size of the last entry of a lace was checked.

This commit fixes this. All spec-compliant Blocks that contain data
(even if side data only) are now returned to the caller; spec-compliant
Blocks that don't contain anything are not returned.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-01 07:58:53 +02:00
Andreas Rheinhardt 4b1c19a054 avformat/matroskadec: Simplify checks for cook and ATRAC3
Some conditions which don't change and which can therefore be checked
in read_header() were instead rechecked upon parsing each block. This
has been changed.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-01 07:50:30 +02:00
Andreas Rheinhardt bdaa98dd4a avformat/matroskadec: Don't output uninitialized data for RealAudio 28.8
The Matroska demuxer splits every sequence of h Matroska Blocks into
h * w / cfs packets of size cfs; here h (sub_packet_h), w (frame_size)
and cfs (coded_framesize) are parameters from the track's CodecPrivate.

It does this by splitting the Block's data in h/2 pieces of size cfs each
and putting them into a buffer at offset m * 2 * w + n * cfs where
m (range 0..(h/2 - 1)) indicates the index of the current piece in the
current Block and n (range 0..(h - 1)) is the index of the current Block
in the current sequence of Blocks. The data in this buffer is then used
for the output packets.

The problem is that there is currently no check to actually guarantee
that no uninitialized data will be output. One instance where this is
trivially so is if h == 1; another is if cfs * h is so small that the
input pieces do not cover everything that is output. In order to
preclude this, rmdec.c checks for h * cfs == 2 * w and h >= 2. The
former requirement certainly makes much sense, as it means that for
every given m the input pieces (corresponding to the h different values
of n) form a nonoverlapping partition of the two adjacent frames of size w
corresponding to m. But precluding h == 1 is not enough, other odd
values can cause problems, too. That is because the assumption behind
the code is that h frames of size w contain data to be output, although
the real number is h/2 * 2. E.g. for h = 3, cfs = 2 and w = 3 the
current code would output four (== h * w / cfs) packets. although only
data for three (== h/2 * h) packets has been read.

(Notice that if h * cfs == 2 * w, h being even is equivalent to
cfs dividing w; the latter condition also seems very reasonable:
It means that the subframes are a partition of the frames.)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-01 07:42:16 +02:00
Andreas Rheinhardt 4f5c6c1b0e avformat/matroskadec: Fix buffer overflow when demuxing RealAudio 28.8
RealAudio 28.8 (like other RealAudio codecs) uses a special demuxing
mode in which the data of the existing Matroska Blocks is not simply
forwarded as-is. Instead data from several Blocks is recombined
together to output several packets. The parameters governing this
process are parsed from the CodecPrivate: Coded framesize (cfs), frame
size (w) and sub_packet_h (h).

During demuxing, h/2 pieces of data of size cfs each are read from every
Matroska (Simple)Block and put at offset m * 2 * w + n * cfs of a buffer
of size h * w, where m ranges from 0 to h/2 - 1 for each Block while n
is initially zero and incremented after a Block has been parsed until it
is h, at which poin the assembled packets are output and n reset.

The highest offset is given by (h/2 - 1) * 2 * w + (h - 1) * cfs + cfs
while the destination buffer's size is given by h * w. For even h, this
leads to a buffer overflow (and potential segfault) if h * cfs > 2 * w;
for odd h, the condition is h * cfs > 3 * w.

This commit adds a check to rule this out.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-01 07:27:36 +02:00
Andreas Rheinhardt c91e3690d9 avformat/matroskadec: Fix demuxing RealAudio 28.8
RealAudio 28.8 does not need or use sub_packet_size for its demuxing
and this field is therefore commonly set to zero. But since 18ca491b
the Real Audio specific demuxing is no longer applied if sub_packet_size
is zero because the codepath for cook and ATRAC3 divide by it; this made
these files undecodable.

Furthermore, since 569d18aa (merged in 2c8d876d) sub_packet_size being
zero is used as an indicator for invalid data, so that a file containing
such a track was completely skipped.

This commit fixes this by not checking sub_packet_size for RealAudio
28.8 at all.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-01 07:20:22 +02:00
Andreas Rheinhardt c6f60b90f0 avformat/matroskadec: Simplify check for RealAudio
They need a special parsing mode and in order to find out whether this
mode is in use, several checks have to be performed. They can all be
combined into one: If the buffer that is only used to assemble their
packets has been allocated, use the RealAudio parsing mode.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-01 07:11:40 +02:00
Andreas Rheinhardt 8287c20153 avformat/matroskadec: Reject sipr flavor > 3
Only flavors 0..3 seem to exist. E.g. rmdec.c treats any flavor > 3
as invalid data. Furthermore, we do not know how big the packets to
create ought to be given that for sipr these values are not read from
the bitstream, but from a table.

Furthermore, flavor is only used for sipr, so only check it for sipr;
rmdec.c does the same. (The old check for flavor being < 0 was
always wrong given that flavor is an int that is read via avio_rb16(),
so it has been removed completely.)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-05-01 06:52:34 +02:00
Andreas Rheinhardt 67e957b43a avformat/matroska: Move mime_tag lists to matroskadec
They are not used any more by the muxer.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-04-20 21:24:18 +02:00
Andreas Rheinhardt 3059b7746a avformat/matroskadec: Remove redundant setting of chapter titles
Chapter titles are added to the chapter's metadata since 6cb6e159,
yet since 012867f0 (the predecessor of) avpriv_new_chapter() already
adds the title to the chapter's metadata. So setting it again in
matroskadec.c is redundant and expensive.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-04-19 00:33:34 +02:00
Andreas Rheinhardt 048bc3fe31 avformat/matroskadec: Add a workaround for missing WavPack extradata
mkvmerge versions 6.2 to 40.0 had a bug that made it not propagate the
WavPack extradata (containing the WavPack version) during remuxing from
a Matroska file; currently our demuxer would treat every WavPack block
encountered as invalid data (unless the WavPack stream is to be
discarded (i.e. the streams discard is >= AVDISCARD_ALL)) and try to
resync to the next level 1 element.

Luckily, the WavPack version is currently not really important; so we
fix this problem by assuming a version. David Bryant, the creator of
WavPack, recommended using version 0x410 (the most recent version) for
this. And this is what this commit does.

A FATE-test for this has been added.

Reviewed-by: David Bryant <david@wavpack.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-04-02 07:12:01 +02:00
Andreas Rheinhardt ba36a07734 avformat/matroskadec: Don't discard the upper 32bits of TrackNumber
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-03-26 19:49:45 +01:00
Steve Lhomme b5dd964cdc avformat/matroskadec: fix the type of the TrackLanguage
It's an ASCII string, not a UTF-8 string.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-03-26 05:51:49 +01:00
Andreas Rheinhardt 40d9cbdc22 avformat/matroskadec: Use AV_DICT_DONT_STRDUP_VAL to save av_strdup
This will likely also fix CID 1452562, a false positive resulting from
Coverity thinking that av_dict_set() automatically frees its key and
value parameters (even without the AV_DICT_DONT_STRDUP_* flags).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-01-01 16:38:28 +01:00
Andreas Rheinhardt 2ff687c17f avformat/matroskadec: Fix lzo decompression
When a Matroska Block is only stored in compressed form, the size of
the uncompressed block is not explicitly coded and therefore not known
before decompressing it. Therefore the demuxer uses a guess for the
uncompressed size: The first guess is three times the compressed size
and if this is not enough, it is repeatedly incremented by a factor of
three. But when this happens with lzo, the decompression is neither
resumed nor started again. Instead when av_lzo1x_decode indicates that x
bytes of input data could not be decoded, because the output buffer is
already full, the first (not the last) x bytes of the input buffer are
resent for decoding in the next try; they overwrite already decoded
data.

This commit fixes this by instead restarting the decompression anew,
just with a bigger buffer.

This seems to be a regression since 935ec5a1.

A FATE-test for this has been added.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2019-12-28 22:40:13 -03:00
Andreas Rheinhardt af50f0a515 avformat/matroskadec: Fix use-after-free when demuxing ProRes
ProRes in Matroska is supposed to not contain the first atom header
(containing a size field and the tag "icpf") and therefore the Matroska
demuxer has to recreate it; this involves an allocation and copy, of
course. Whether the old buffer (containing the data without the atom
header) needs to be freed or not depends upon whether it is what was
directly read (in which case it is owned by an AVBuffer) or whether it
has been allocated when reversing the track's content compression (e.g.
zlib compression) that Matroska supports.

So there are three pointers involved: The one pointing to the directly
read data (owned by the AVBuffer), the one pointing to the currently
valid data (which coincides with the former if no content compression
needed to be reverted) and the one pointing to the new data with the
first atom header. The check for whether to free the second of these is
simply whether the first two are different.

This works mostly, but there is a complication: Some muxers don't strip
the first atom header away and in this case, it is also not reinserted
and no new buffer is allocated; instead, the second and the third
pointers agree. In this case, one must never free the second buffer.
Yet it is currently done if the track is e.g. zlib compressed.
This commit fixes this.

This is a regression since b8e75a2a.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2019-12-07 12:36:21 -03:00
Andreas Rheinhardt d5274f86a8 avformat/matroskadec: Reuse AVIOContext
When parsing EBML lacing, for every number read, a new AVIOContext has
been initialized (via ffio_init_context()) just for this number. This
has been changed: The context is kept now.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2019-12-04 23:11:37 -03:00
Andreas Rheinhardt dbe3be6744 avformat/matroskadec: Improve frame size parsing error messages
When parsing the sizes of the frames in a lace fails, sometimes no
error message was raised (e.g. when using xiph or fixed-size lacing).
Only EBML lacing generated error messages (which were wrongly declared
as AV_LOG_INFO), but even here not all errors resulted in an error
message. So add a generic error message to catch them all.

Moreover, if parsing one of the EBML numbers fails, ebml_read_num already
emits its own error messages, so that all that is needed is a generic error
message to indicate that this happened during parsing the sizes of the
frames in a block; in other words, the error messages specific to
parsing EBML lace numbers can be and have been removed.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2019-12-04 23:11:37 -03:00