This fixes when resuming certain broken h264 files encoded by x264. See
FFmpeg commit 840b41b2a643fc8f0617c0370125a19c02c6b586 about the x264
bug itself.
Normally, the unregistered user data SEI (that contains the x264 version
string) is informational only. But libavcodec uses it to workaround a
x264 bug, which was recently fixed in both libavcodec and x264. The fact
that both encoder and decoder were buggy is the reason that it was not
found earlier, and there are apparently a lot of files around created by
the broken decoder. If libavcodec sees the SEI, this bug can be worked
around by using the old behavior.
If you resume a file with mpv (i.e. seeking when the file loads),
libavcodec never sees the first video packet. Consequently it has to
assume the file is not broken, and never applies the workaround,
resulting in garbage being played.
Fix this by always feeding the first video packet to the decoder on
init, and then flushing the codec (to avoid that an unwanted image is
output). Flushing the codec does not remove info such as the x264
version. We also abuse the fact that the first avcodec_send_packet()
always pushes the frame into the decoder (so we don't have to trigger
the decoder by requsting an output frame).
More the ignore_eof field to the internal demux_stream struct. This is
relatively messy, because the internal struct exists only once the
stream is created, and after that setting the ignore_eof flag is a race
condition. We could bother with adding demux_add_sh_stream() parameters
for this, but let's not. So in theory a tiny race condition is
introduced, which can never be triggered since all demux API functions
are called by the playback thread only anyway.
Fix that ts_offset is accessed without log (this was introduced much
earlier by myself).
Introduce an alternative way of avoiding the annoying EOF reached
messages by not resetting the EOF flags for CC streams when a CC packet
is added. This makes the second commit in the PR which added the
original fix unnecessary.
As another cosmetic change merge the check in cached_demux_control()
into a single if().
In the future, the CC pseudo-stream should probably be replaced with an
entire pseudo-demuxer or such, which would avoid some of the messiness
(or maybe not, we don't know yet).
This adds handling of spherical video metadata: retrieving it from
demux_lavf and demux_mkv, passing it through filters, and adjusting it
with vf_format. This does not include support for rendering this type of
video.
We don't expect we need/want to support the other projection types like
cube maps, so we don't include that for now. They can be added later as
needed.
Also raise the maximum sizes of stringified image params, since they
can get really long.
All authors of the current code have agreed.
For most of its life, MPlayer used Microsoft structs like WAVEFORMATEX
to describe media content. It appears these were copied from wine in
61c5a99851. Copyright is unclear, but mpv completely removed use of
these structs anyway. (demux_mkv.c still contains code to read these
fields from a byte stream, but the struct is fully gone.)
42f97b2b82: cehoyos (who probably didn't agree with LGPL) applied a
patch by someone who agreed. It's unknown whether cehoyos modified the
patch (and thus his copyright would apply), but the changed code was
removed anyway. (The code was first moved somewhere else, then removed.)
efd53eed61: the patch author was not asked. Although the mkv_sh_sub_t
struct was later moved to stheader.h, the added field was removed
without replacement.
f6878753fb: nick, who could not be reached, added an include guard, but
the guard was changed several times later, and it's probably not
copyrightable anyway.
afb0fd5ea1: the same nick adds a field that was later replaced and
finally removed again in 8cd6b20571.
Implementation-wise, the values from the demuxer/codec header are merged
with the values from the decoder such that the former are used only
where the latter are unknown (0/auto).
Instead of passing through double float timestamps opaquely, pass real
timestamps. Do so by always setting a valid timebase on the
AVCodecContext for audio and video decoding.
Specifically try not to round timestamps to a too coarse timebase, which
could round off small adjustments to timestamps (such as for start time
rebasing or demux_timeline). If the timebase is considered too coarse,
make it finer.
This gets rid of the need to do this specifically for some hardware
decoding wrapper. The old method of passing through double timestamps
was also a bit questionable. While libavcodec is not supposed to
interpret timestamps at all if no timebase is provided, it was
needlessly tricky. Also, it actually does compare them with
AV_NOPTS_VALUE. This change will probably also reduce confusion in the
future.
...and ignore it. The main purpose is for retrieving per-track
replaygain tags. Other than that per-track tags are not used or accessed
by the playback core yet.
The demuxer infrastructure is still not really good with that whole
synchronization thing (at least in part due to being inherited from
mplayer's single-threaded architecture). A convoluted mechanism is
needed to transport the tags from demuxer thread to user thread. Two
factors contribute to the complexity: tags can change during playback,
and tracks (i.e. struct sh_stream) are not duplicated per thread.
In particular, we update the way replaygain tags are retrieved. We first
try to use per-track tags (common in Matroska) and global tags
(effectively formats like mp3). This part fixes#3405.
AVFormatContext.codec is deprecated now, and you're supposed to use
AVFormatContext.codecpar instead.
Handle this for all of the normal playback code.
Encoding mode isn't touched.
This is mainly a refactor. I'm hoping it will make some things easier
in the future due to cleanly separating codec metadata and stream
metadata.
Also, declare that the "codec" field can not be NULL anymore. demux.c
will set it to "" if it's NULL when added. This gets rid of a corner
case everything had to handle, but which rarely happened.
Just so I can remove a few lines from dec_sub.c.
This is slightly inelegant, as the whole subtitle file has to be read
into memory, converted at once in memory, and then provided to
libavformat in an awkward way by creating a memory stream instead of
using demuxer->stream. It also won't be possible to force the charset on
subtitles in binary container formats - but this wasn't exposed before,
and we just hope this won't be ever needed. (One motivation was fixing
broken files with non-UTF8 muxed.) It also won't be possible to change
the charset on the fly, but this was not exposed either.
Since commit 6d9cb893, subtitle state doesn't survive timeline switches
(ordered chapters etc.). So there is no point in caching the state per
sh_stream anymore (which would be required to deal with multiple
segments). Move the cache to struct track.
(Whether it's worth caching the subtitle state just for the situation
when subtitle tracks get reselected is questionable. But for now, it's
nice to have the subtitles immediately show up when reselecting a
subtitle.)
MPlayer traditionally always used the display aspect ratio, e.g. 16:9,
while FFmpeg uses the sample (aka pixel) aspect ratio.
Both have a bunch of advantages and disadvantages. Actually, it seems
using sample aspect ratio is generally nicer. The main reason for the
change is making mpv closer to how FFmpeg works in order to make life
easier. It's also nice that everything uses integer fractions instead
of floats now (except --video-aspect option/property).
Note that there is at least 1 user-visible change: vf_dsize now does
not set the display size, only the display aspect ratio. This is
because the image_params d_w/d_h fields did not just set the display
aspect, but also the size (except in encoding mode).
Slightly simpler, and removes the need to pre-read all subtitle packets.
This still does the subtitle charset conversion on the packet level
(instead converting when parsing the file), so in theory this still
could provide a way to change the charset at runtime. But maybe even
this should be removed, as FFmpeg is somewhat likely to get its own
charset detection and conversion mechanism in the future. (Would have
to keep the subtitle file in memory to allow changing the charset on
the fly, I guess.)
At least Matroska files have a "forced" flag (in addition to the
"default" flag). Export this flag. Treat it almost like the default
flag, but with slightly higher priority.
MPlayer traditionally had completely separate sh_ structs for
audio/video/subs, without a good way to share fields. This meant that
fields shared across all these headers had to be duplicated. This commit
deduplicates essentially the last remaining duplicated fields.
Remove the old implementation for these properties. It was never very
good, often returned very innaccurate values or just 0, and was static
even if the source was variable bitrate. Replace it with the
implementation of "packet-video-bitrate". Mark the "packet-..."
properties as deprecated. (The effective difference is different
formatting, and returning the raw value in bits instead of kilobits.)
Also extend the documentation a little.
It appears at least some decoders (sipr?) need the
AVCodecContext.bit_rate field set, so this one is still passed through.
Trying to handle such video is almost worthless, but it was requested by
at least 2 users.
If there are no timestamps, enable byte seeking by setting
ts_resets_possible. Use the video FPS (wherever it comes from) and the
audio samplerate for timing. The latter was already done by making the
first packet emit DTS=0; remove this again and do it "properly" in a
higher level.
Remove coded_width and coded_height. This was originally added in commit
fd7dde40, when BITMAPINFOHEADER was killed. The separate fields became
redundant in commit e68f4be1. Remove them (nothing passed to the
decoders actually changes with _this_ commit).
Apparently using the stream index is the best way to refer to the same
streams across multiple FFmpeg-using programs, even if the stream index
itself is rarely meaningful in any way.
For Matroska, there are some possible problems, depending how FFmpeg
actually adds streams. Normally they seem to match though.
See previous commits. This finally replaces directly reading the file
data into a struct with reading them manually. In theory this is more
portable (no alignment issues and other things). For the most part,
it's nice seeing this gone.
MPlayer traditionally did this because it made sense: the most important
formats (avi, asf/wmv) used Microsoft formats, and many important
decoders (win32 binary codecs) also did. But the world has changed, and
I've always wanted to get rid of this thing from the codebase.
demux_mkv.c internally still uses it, because, guess what, Matroska has
a VfW muxing mode, which uses these data structures natively.
For a while, we used this to transfer PCM from demuxer to the filter
chain. We had a special "codec" that mapped what MPlayer used to do
(MPlayer passes the AF sample format over an extra field to ad_pcm,
which specially interprets it).
Do this by providing a mp_set_pcm_codec() function, which describes a
sample format in a generic way, and sets the appropriate demuxer header
fields so that libavcodec interprets it correctly. We use the fact that
libavcodec has separate PCM decoders for each format. These are
systematically named, so we can easily map them.
This has the advantage that we can change the audio filter chain as we
like, without losing features from the "rawaudio" demuxer. In fact, this
commit also gets rid of the audio filter chain formats completely.
Instead have an explicit list of PCM formats. (We could even just have
the user pass libavcodec PCM decoder names directly, but that would be
annoying in other ways.)
Until now, the audio chain could handle both little endian and big
endian formats. This actually doesn't make much sense, since the audio
API and the HW will most likely prefer native formats. Or at the very
least, it should be trivial for audio drivers to do the byte swapping
themselves.
From now on, the audio chain contains native-endian formats only. All
AOs and some filters are adjusted. af_convertsignendian.c is now wrongly
named, but the filter name is adjusted. In some cases, the audio
infrastructure was reused on the demuxer side, but that is relatively
easy to rectify.
This is a quite intrusive and radical change. It's possible that it will
break some things (especially if they're obscure or not Linux), so watch
out for regressions. It's probably still better to do it the bulldozer
way, since slow transition and researching foreign platforms would take
a lot of time and effort.
--hls-bitrate=min/max lets you select the min or max bitrate. That's it.
Something more sophisticated might be possible, but is probably not even
worth the effort.
This inserts an automatic conversion filter if a Matroska file is marked
as 3D (StereoMode element). The basic idea is similar to video rotation
and colorspace handling: the 3D mode is added as a property to the video
params. Depending on this property, a video filter can be inserted.
As of this commit, extending mp_image_params is actually completely
unnecessary - but the idea is that it will make it easier to integrate
with VOs supporting stereo 3D mogrification. Although vo_opengl does
support some stereo rendering, it didn't support the mode my sample file
used, so I'll leave that part for later.
Not that most mappings from Matroska mode to vf_stereo3d mode are
probably wrong, and some are missing.
Assuming that Matroska modes, and vf_stereo3d in modes, and out modes
are all the same might be an oversimplification - we'll see.
See issue #1045.
This adds a thread to the demuxer which reads packets asynchronously.
It will do so until a configurable minimum packet queue size is
reached. (See options.rst additions.)
For now, the thread is disabled by default. There are some corner cases
that have to be fixed, such as fixing cache behavior with webradios.
Note that most interaction with the demuxer is still blocking, so if
e.g. network dies, the player will still freeze. But this change will
make it possible to remove most causes for freezing.
Most of the new code in demux.c actually consists of weird caches to
compensate for thread-safety issues (with the previously single-threaded
design), or to avoid blocking by having to wait on the demuxer thread.
Most of the changes in the player are due to the fact that we must not
access the source stream directly. the demuxer thread already accesses
it, and the stream stuff is not thread-safe.
For timeline stuff (like ordered chapters), we enable the thread for the
current segment only. We also clear its packet queue on seek, so that
the remaining (unconsumed) readahead buffer doesn't waste memory.
Keep in mind that insane subtitles (such as ASS typesetting muxed into
mkv files) will practically disable the readahead, because the total
queue size is considered when checking whether the minimum queue size
was reached.
It's unlikely that files with multiple audio tracks and with replaygain
actually happen, but this change might help avoid minor corner cases
with later changes.
The i_bps members of the sh_audio and dev_video structs are mostly used
for displaying the average audio and video bitrates. Keeping them in
bits-per-second avoids truncating them to bytes-per-second and changing
them back lateron.
Instead of parsing the ASS file in demux_libass.c and trying to pass the
ASS_Track to the subtitle renderer, just read all file data in
demux_libass.c, and let the subtitle renderer pass the file contents to
ass_process_codec_private(). (This happens to parse full files too.)
Makes the code simpler, though it also relies harder on the (messy)
probe logic in demux_libass.c.
The mplayer decoder (spudec.c) actually handled this. There was explicit
code for binary palettes (16 32 bit values), and the subtitle resolution
was handled by video resolution coincidentally matching the subtitle
resolution.
Whoever puts vobsub into mp4 should be punished.
Fixes the sample gundam_sample.mp4, closes github issue #547.
So, FFmpeg/Libav requires us to figure out video timestamps ourselves
(see last 10 commits or so), but the methods it provides for this aren't
even sufficient. In particular, everything that uses AVI-style DTS (avi,
vfw-muxed mkv, possibly mpeg4-in-ogm) with a codec that has an internal
frame delay is broken. In this case, libavcodec will shift the packet-
to-image correspondence by the codec delay, meaning that with a delay=1,
the first AVFrame.pkt_dts is not 0, but that of the second packet. All
timestamps will appear shifted. The start time (e.g. the time displayed
when doing "mpv file.avi --pause") will not be exactly 0.
(According to Libav developers, this is how it's supposed to work; just
that the first DTS values are normally negative with formats that use
DTS "properly". Who cares if it doesn't work at all with very common
video formats? There's no indication that they'll fix this soon,
either. An elegant workaround is missing too.)
Add a hack to re-enable the old PTS code for AVI and vfw-muxed MKV.
Since these timestamps are not reorderd, we wouldn't need to sort them,
but it's less code this way (and possibly more robust, should a demuxer
unexpectedly output PTS).
The original intention of all the timestamp changes recently was
actually to get rid of demuxer-specific hacks and the old timestamp
sorting code, but it looks like this didn't work out. Yet another case
where trying to replace native MPlayer functionality with FFmpeg/Libav
led to disadvantages and bugs. (Note that the old PTS sorting code
doesn't and can't handle frame dropping correctly, though.)
Bug reports:
https://trac.ffmpeg.org/ticket/3178https://bugzilla.libav.org/show_bug.cgi?id=600
This used to be needed to access the generic stream header from the
specific headers, which in turn was needed because the decoders had
access only to the specific headers. This is not the case anymore, so
this can finally be removed again.
Also move the "format" field from the specific headers to sh_stream.
This is similar to the sh_audio commit.
This is mostly cosmetic in nature, except that it also adds automatical
freeing of the decoder driver's state struct (which was in
sh_video->context, now in dec_video->priv).
Also remove all the stheader.h fields that are not needed anymore.
sh_audio is supposed to contain file headers, not whatever was decoded.
Fix this, and write the decoded format to separate fields in the decoder
context, the dec_audio.decoded field. (Note that this field is really
only needed to communicate the audio format from decoder driver to the
generic code, so no other code accesses it.)
Move all state that basically changes during decoding or is needed in
order to manage decoding itself into a new struct (dec_audio).
sh_audio (defined in stheader.h) is supposed to be the audio stream
header. This should reflect the file headers for the stream. Putting the
decoder context there is strange design, to say the least.
Most libavcodec decoders output non-interleaved audio. Add direct
support for this, and remove the hack that repacked non-interleaved
audio back to packed audio.
Remove the minlen argument from the decoder callback. Instead of
forcing every decoder to have its own decode loop to fill the buffer
until minlen is reached, leave this to the caller. So if a decoder
doesn't return enough data, it's simply called again. (In future, I
even want to change it so that decoders don't read packets directly,
but instead the caller has to pass packets to the decoders. This fits
well with this change, because now the decoder callback typically
decodes at most one packet.)
ad_mpg123.c receives some heavy refactoring. The main problem is that
it wanted to handle format changes when there was no data in the decode
output buffer yet. This sounds reasonable, but actually it would write
data into a buffer prepared for old data, since the caller doesn't know
about the format change yet. (I.e. the best place for a format change
would be _after_ writing the last sample to the output buffer.) It's
possible that this code was not perfectly sane before this commit,
and perhaps lost one frame of data after a format change, but I didn't
confirm this. Trying to fix this, I ended up rewriting the decoding
and also the probing.
This member was redundant. sh_audio->sample_format indicates the sample
size already.
The TV code is a bit strange: the redundant sample size was part of the
internal TV interface. Assume it's really redundant and not something
else. The PCM decoder ignores the sample size anyway.