Just rearranging shit. Setting SEEK_HR for backstep seeks actually
doesn't have much meaning, but disables the weird audio snapping for
"keyframe" seeks, and I don't know it's late.
This code used to be simpler, but now it's enough that it should be
factored into a single function.
Both uses of the new function are annoyingly different. The first use is
the special case when a decoder tries to read packets, but the demuxer
doesn't see any (like mp4 files with sparse video packets, which
actually turned out to be chapter thumbnail "tracks"). Then the other
stream queues will overflow, and the stream with no packets is marked
EOF to avoid stalling playback.
The second case is when the demuxer returns global EOF.
It would be more awkward to have the loop iterating the streams in the
function, because then you'd need a weird parameter to control the
behavior.
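A rough sketch of the resulting shape (all names approximate):

    static void mark_stream_eof(struct demux_stream *ds)
    {
        // set ds->eof, update the seek ranges, wake up the reader
    }

    // Use 1: one starving stream while the other queues overflow:
    mark_stream_eof(ds);

    // Use 2: the demuxer returned global EOF; the loop stays in the
    // caller, avoiding a weird behavior-controlling parameter:
    for (int n = 0; n < in->num_streams; n++)
        mark_stream_eof(in->streams[n]->ds);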
Just "mpv file.mkv --play-direction=backward" did not work, because
backward demuxing from the very end was not implemented. This is another
corner case, because the resume mechanism so far requires a packet
"position" (dts or pos) as reference. Now "EOF" is another possible
reference.
Also, the backstep mechanism could cause streams to find different
playback start positions, basically leading to random playback start
(instead of what you specified with --start). This happens only if
backstep seeks are involved (i.e. no cached data yet), but since this is
usually the case at playback start, it always happened. It was racy too,
because it depended on the order the decoders on other threads requested
new data. The comment below "resume_earlier" has some more blabla.
Some other details are changed.
I'm giving up on the "from_cache" parameter, and don't try to detect the
situation when the demuxer does not seek properly. Instead, always seek
back, hopefully some more.
Don't try to adjust the backstep seek target by an arbitrary value of
1.0 seconds anymore. Instead, always rely on the value provided by the
user via --demuxer-backward-playback-step. If the demuxer should really
get "stuck" and somehow miss the seek target badly, or the user sets the
option value to 0, then the demuxer will not make any progress and just
eat CPU. (Although due to backward seek semantics used for backstep
seeks, even a very small seek step size will work. Just not 0.)
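For illustration, the target computation boils down to something like
this (the option's field name is approximate):

    // Backward seek semantics snap to a keyframe strictly before the
    // target, so any step > 0 makes progress; 0 does not.
    double target = last_known_pts - opts->back_seek_size;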
It seems this also fixes backstepping correctly when the initial seek
ended at the last keyframe range. (The explanation above was about the
case when it ends at EOF. These two cases are different. In the former,
you just need to step to the previous keyframe range, which was broken
because it didn't always react correctly to reaching EOF. In the latter,
you need to do a separate search for the last keyframe.)
Fixes the same thing as the previous commit did with demux_mkv. I'm not
sure if this is correct or a good idea (well, it works with my sample
file).
There are some shady things in this, but describing them would require
too many expletives.
In this scenario, the demuxer will output timestamps offset by the codec
delay (e.g. negative timestamps at the start; mkv simulates those), and
the trimming in the decoder (often libavcodec, but ad_lavc.c in our
case) will adjust the timestamps back (e.g. stream actually starts at
0).
This offset needs to be taken into account when seeking. This worked in
the uncached case. (demux_mkv.c is a bit tricky in that the index is
already in the offset space, so it compensates even though the seek call
does not reference codec_delay.) But in the cached case, backward seeks
did not seek far enough, and forward seeks went too far.
Fix this by adding the codec delay to the index search. We need to get
"earlier" packets, so e.g. seeking to position 0 really gets the initial
packets with negative timestamps.
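In other words, the index search now does something like this (a
sketch; where exactly the delay value comes from is glossed over):

    // Search with the delay added, so a seek to 0 still finds the
    // initial packets that carry negative (pre-padding) timestamps:
    double search_pts = seek_pts + codec_delay;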
This also adjusts the seek range start. This is also pretty obvious: if
the beginning of the file is cached, the seek range should start at 0,
not a negative value. We compare 0-based timestamps to it later on.
Not sure if this is the best approach. I also could have thought
about/checked some corner cases harder. But fuck this shit.
Not fixing duration (who cares) or end trimming, which would reduce the
seek range and duration (who cares).
This is a bad approach, and should be handled by a codec parameter field
(in mp_codec_params or AVCodecParameters).
It's bad because it's overly complicated, and has potential to break
demuxer cache assumptions: packets that were "intended" for seek
resuming may suddenly appear in the middle of a stream, when you seek
back and play a cached part again. (In general it was fine though,
because seek range joining tends to remove the first audio packet of the
next range when trying to find an overlap.)
demux_mkv.c does not try to export its codec_delay field through the
codec parameters mentioned above. In the only case I spotted this
element, the codec itself (opus) set this field within libavcodec. And I
think that's actually how it should be. On the other hand, a file could
in theory set this field via mkv headers if the codec is too stupid to
have such a field internally. But I don't really care until I see such a
file.
The end trimming is still sort of needed (though not sure if anything
uses it, other than the opus/mkv test sample I was using). The decoder
can't know whether something is the last packet, until it's too late.
The codec_delay field is still needed to offset timestamps.
Only timestamps that enter or leave the demuxer API should be adjusted
by ts_offset (which is usually the start time). queue_seek() is also
used by backward demux seeks, which uses an internal timestamp.
Raw audio formats can be accessed sample-wise, and logically audio
packets demuxed from it would contain only 1 sample. This is
inefficient, so raw audio demuxers typically "bundle" multiple samples
in one packet.
The problem for the demuxer cache and backward playback is that they
need properly aligned packets to make seeking "deterministic". The
requirement is that if you read some packets, and then seek back, you
eventually see the same packets again. demux_raw basically allowed
seeking into the middle of a previously returned packet, which makes it
impossible to make the transition seamless. (Unless you'd be aware of
the packet data format and cut them to make it seamless, which is too
complex for such a use case.)
Solve this by always aligning seeks to packet boundaries. This reduces
the seek accuracy to the arbitrarily chosen packet size. But you can use
hr-seek to fix this. Not having to make raw audio an awful special case
is worth the "stupid" suggestion to use hr-seek.
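The alignment itself is simple arithmetic, roughly (field names
hypothetical):

    // Round the requested position down to a whole packet, so repeated
    // seeks always cut the stream at the same byte offsets:
    int64_t samples = (int64_t)(seek_pts * samplerate);
    int64_t pkt     = samples / samples_per_packet;
    int64_t offset  = data_start + pkt * packet_bytes;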
It appears this also fixes seeking into the middle of a frame, which
could and did happen (not sure if this code was ever tested - it goes back to
removing the code duplication between the former demux_rawaudio.c and
demux_rawvideo.c).
If you really cared, you could introduce a seek flag that controls
whether the seek is aligned or not. Then code which requires
"deterministic" demuxing could set it. But this isn't really useful for
us, and we'd always set the flag anyway, unless maybe the caching were
forced disabled.
libavformat's wav demuxer exhibits the same issue. We can't fix it (it
would require the unpleasant experience of contributing to FFmpeg), so
document this in options.rst. In theory, this also affects seek range
joining, but the only bad effect should be that cached data is
discarded.
This is for uncompressed data, so every frame is a "keyframe". This is
part of making this demuxer work with the demuxer layer caching and
backward playback.
See manpage additions. This is a huge hack. You can bet there are shit
tons of bugs. It's literally forcing square pegs into round holes.
Hopefully, the manpage wall of text makes it clear enough that the whole
shit can easily crash and burn. (Although it shouldn't literally crash.
That would be a bug. It possibly _could_ start a fire by entering some
sort of endless loop, not a literal one, just something where it tries
to do work without making progress.)
(Some obvious bugs I simply ignored for this initial version, but
there's a number of potential bugs I can't even imagine. Normal playback
should remain completely unaffected, though.)
How this works is also described in the manpage. Basically, we demux in
reverse, then we decode in reverse, then we render in reverse.
The decoding part is the simplest: just reorder the decoder output. This
weirdly integrates with the timeline/ordered chapter code, which also
has special requirements on feeding the packets to the decoder in a
non-straightforward way (it doesn't conflict, although a mess of bugs
breaks correct slicing of segments, so EDL/ordered chapter playback is
broken in backward direction).
Backward demuxing is pretty involved. In theory, it could be much
easier: simply iterating the usual demuxer output backward. But this
just doesn't fit into our code, so there's a cthulhu nightmare of shit.
To be specific, each stream (audio, video) is reversed separately. At
least this means we can do backward playback within cached content (for
example, you could play backwards in a live stream; on that note, it
disables prefetching, which would lead to losing new live video, but
this could be avoided).
The fuckmess also meant that I didn't bother trying to support
subtitles. Subtitles are a problem because they're "sparse" streams.
They need to be "passively" demuxed: you don't try to read a subtitle
packet, you demux audio and video, and then look whether there was a
subtitle packet. This means to get subtitles for a time range, you need
to know that you demuxed video and audio over this range, which becomes
pretty messy when you demux audio and video backwards separately.
Backward display is the most weird (and potentially buggy) part. To
avoid that we need to touch a LOT of timing code, we negate all
timestamps. The basic idea is that due to the negation, all
comparisons and subtractions of timestamps keep working, and you don't
need to touch every single one of them to "reverse" them.
E.g.:

    bool before = pts_a < pts_b;

would need to be:

    bool before = forward
        ? pts_a < pts_b
        : pts_a > pts_b;

or:

    bool before = pts_a * dir < pts_b * dir;

or, as it's implemented now, just do this after decoding:

    pts_a *= dir;
    pts_b *= dir;

and then in the normal timing/renderer code:

    bool before = pts_a < pts_b;
Consequently, we don't need many changes in the latter code. But some
assumptions inherently true for forward playback may have been broken
anyway. What is mainly needed is fixing places where values are passed
between positive and negative "domains". For example, seeking and
timestamp user display always uses positive timestamps. The main mess is
that it's not obvious which domain a given variable should or does use.
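Sketched out, a domain boundary is just a multiplication, and since dir
is 1 or -1, the same operation converts in both directions (names
approximate):

    double internal = user_pts * play_dir;  // positive -> internal domain
    double user     = internal * play_dir;  // internal -> positive domain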
Well, in my tests with a single file, it suddenly started to work when I
did this. I'm honestly surprised that it did, and that I didn't have to
change a single line in the timing code past decoder (just something
minor to make external/cached text subtitles display). I committed it
immediately while avoiding thinking about it. But there really likely
are subtle problems of all sorts.
As far as I'm aware, gstreamer also supports backward playback. When I
looked at this years ago, I couldn't find a way to actually try this,
and I didn't revisit it now. Back then I also read talk slides from the
person who implemented it, and I'm not sure if and which ideas I might
have taken from it. It's possible that the timestamp reversal is
inspired by it, but I didn't check. (I think it claimed that it could
avoid large changes by changing a sign?)
VapourSynth has some sort of reverse function, which provides a backward
view on a video. The function itself is trivial to implement, as
VapourSynth aims to provide random access to video by frame numbers (so
you just request decreasing frame numbers). From what I remember, it
wasn't exactly fluid, but it worked. It's implemented by creating an
index, and seeking to the target on demand, and a bunch of caching. mpv
could use it, but it would either require using VapourSynth as demuxer
and decoder for everything, or replacing the current file every time
something is supposed to be played backwards.
FFmpeg's libavfilter has reversal filters for audio and video. These
require buffering the entire media data of the file, and don't really
fit into mpv's architecture. It could be used by playing a libavfilter
graph that also demuxes, but that's like VapourSynth but worse.
The demuxer layer can start a thread to decouple the rest of the player
from blocking I/O (such as network accesses). But this particular
function does not support running with the thread enabled. The mutex use
within it exists only because thread_work() may temporarily unlock the mutex,
and unlocking an unlocked mutex is not allowed. Most of the rest of the
code still does proper locking, even if it's pointless and effectively
single-threaded.
To make this look slightly cleaner, extend the mutex around the rest of
the code (like threaded code would have to do). This is mostly a
cosmetic change.
The demuxer cache benefits slightly from knowing where the current file
or stream begins. For example, seeking "left most" when the start is
cached would not trigger a low level seek (which would be followed by
messy range joining when it notices that the newly demuxed packets
overlap with an existing range).
Unfortunately, since multimedia is so crazy (or actually FFmpeg in its
quite imperfect attempt to be able to demux anything), it's hard to tell
where a file starts. There is no feedback whether a specific seek went
to the start of the file. Packets are not tagged with a flag indicating
they were demuxed from the start position. There is no index available
that could be used to cross-check this (even if the file contains a full
and "perfect" index, like mp4). You could go by the timestamps, but who
says streams start at 0? Streams can start at extremely high
timestamps (transport streams like to do that), or they could start
at negative times (e.g. files with audio pre-padding will do that), and
maybe some file formats simply allow negative timestamps and could start
at any negative time. Even if the affected file formats don't allow it
in theory, they may in practice. In addition, FFmpeg exports a
start_time field, which may or may not be useful. (mpv's internal mkv
demuxer also exports such a field, but doesn't bother to set it for
efficiency and robustness reasons.)
Anyway, this is all a huge load of crap, so I decided that if the user
performs a seek command to time 0 or earlier, we consider the first
packet demuxed from each stream to be at the start of the file. In
addition, just trust the start_time field. This is the "shitty" part of
this commit.
One common case of negative timestamps is audio pre-padding. Demuxers
normally behave sanely, and will treat 0 as the start of the file, and
the first packets demuxed will have negative timestamps (since they
contain data to discard), which doesn't break our assumptions in this
commit. (Although, unfortunately, they do break some other demuxer cache
assumptions, and the first cached range will be shown as starting at a
negative time.)
Implementation-wise, this is quite simple. Just split the existing
initial_state flag into two, since we want to deal with two separate
aspects. In addition, this avoids the refresh seek on track switching
when it happens right after a seek, instead of only after opening the
demuxer.
There were 3 packet reading functions: the "old" demux_read_packet()
that blocked (leftover from MPlayer times, but was still used until
recently by some obscure code), the "new" demux_read_packet_async(), and
the special demux_read_any_packet(), that is used by pseudo-demuxers
like demux_edl.
The first two could be used both in threaded and un-threaded mode. This
made 5 cases in total. Some bits of logic were spread across all of them.
Unify the logic. A recent commit made demux_read_packet() private, and
the code for it in threaded mode disappears. The difference between
threaded and un-threaded is minimized.
It's possible that this commit causes random regression. Enjoy.
There are 3 packet reading functions in the demux API, which all
function completely differently. One of them, demux_read_packet(), has
only 1 caller, which is in dec_sub.c. Change this caller to use
demux_read_packet_async() instead. Since it really wants to do a
blocking call, setup some proper waiting. This uses mp_dispatch_queue,
because even though it's overkill, it needs the least code.
In practice, waiting actually never happens. This code is only called on
code paths where everything is already read into memory (libavformat's
subtitle demuxers simply behave this way). It's still a bit of a
"coincidence", so implement it properly anyway.
If subtitle decoder init fails, we still need to unset the demuxer
wakeup callback. Add a sub_destroy() call to the failure path. This also
happens to fix a missed pthread_mutex_destroy() call (in practice this
was a nop, or a memory leak on BSDs).
I'm not sure about this, but it looks like a bug. If a stream didn't
have packets, but the joined range does, the stream should obviously
read the packets added by the joined range. Until now, due to
reader_head being NULL, reading was only resumed if a _new_ packet was
added by actual demuxing (in add_packet_locked()), which means the
stream would suddenly skip ahead, past the original end of the joined
range.
Change it so that it will pick up the new range.
Also, clear the skip_to_keyframe flag. Nothing useful can come from this
flag being set; in the first place, the first packet of a range (that
isn't the current range) should start with a keyframe. Some code
probably enforced it (although it's fuzzy).
Completely untested.
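Roughly, the fix amounts to this when switching to the joined range (a
sketch using the field names mentioned above):

    if (!ds->reader_head) {
        ds->reader_head = queue->head;  // resume from the joined range
        ds->skip_to_keyframe = false;   // its first packet should be a keyframe
    }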
When doing a seek to the end of the cache, ds->skip_to_keyframe can be
set to true. Then some packets passed to add_packet_locked() may have to
be skipped. In some aspects, the skipped packet was still treated as if
it was going to be returned to the reader.
It almost doesn't matter though: it only caused a redundant wakeup_ds()
call, and could pass the packet to the stream recorder. Fix it anyway.
If a DASH-hack EDL has an init fragment set, it opens the init
fragment as such to get the track layout (including codec etc.) and
avoids opening actual fragments until actual playback. It does not get
added to the source array, so it leaks on exit, which triggers an
obscure (but very justified) assertion in thread_tools.c:106. Fix the
leak by adding the additional demuxer instance to the sources arrays,
which gets it freed.
This is a regression from when I rewrote some of the timeline handling.
I decided that in order to make memory management slightly simpler,
freeing a timeline should only free elements in the sources array. That
is OK; I just didn't re-test with pseudo-DASH that has init fragments,
and just hit a video that uses that by accident. These videos are
rather scarce (apparently) so it happened only now.
The real solution would probably be adding demuxer reference counting.
This EDL memory management is just too messy, and throwing refcounting
at such problems is an effective and popular fix. Then you'd get
debugging nightmares with incorrect refcounts too, though.
If you have a EDL stream with separate sources for audio and video
stream (like ytdl_hook now creates), you can get the problem that the
video stream seeks to a different position than audio due to different
key frame granularity.
In particular, if you seek backward, the video might undershoot the seek
target by a lot. Then video will resume from an earlier position than
audio, and the player plays silence. This is annoying.
Fix this by explicitly implementing a heuristic to detect separate
audio/video streams, determining where a video seek ends up, and then
seeking the audio stream to the video destination. This also makes sure
to not seek audio with SEEK_FORWARD, so it will always seek before the
video position. Non-precise seeks still skip audio to the video target,
so this helps with ensuring that audio is present at the final seek
target.
The implementation is very annoying, because the only way to determine
the seek target is to actually read a packet. Thus a 1-packet queue
needs to be added. In theory, we could get the seek target from the
index of the video file (especially if it's mp4), but libavformat does
not have public API that exports this index, so we're stuck with this
roundabout generic method.
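Conceptually (helper names hypothetical; MP_NOPTS_VALUE is real):

    // Peek the 1-packet queue to learn where the video seek landed:
    double video_pts = peek_first_video_pts(vsrc);
    if (video_pts != MP_NOPTS_VALUE)
        seek_stream(asrc, video_pts, 0); // no SEEK_FORWARD: land at/before video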
Note that this is only for non-precise seeks. If precise seeks are done,
the problem is handled by the frontend by skipping unwanted video
frames. But non-precise seeking should still work. (Personally I prefer
non-precise seek mode by default because it's still significantly
faster.)
It also needs to be said that this is the 4th implementation of this
seek adjustment thing in mpv. The 1st implementation is in the frontend
(look for MPContext.seek_slave). This works only if the external audio
stream is known as such on the frontend level. The 2nd implementation is
in the demuxer level packet cache (top of execute_cache_seek()). This is
similar to code that any demuxer needs to handle non-precise seeks
sufficiently nicely. The 3rd is in demux_mkv.c. Since mkv is an
interleaved format, this implementation mostly consists of trying to
pick index entries for video packets if a video stream is selected.
Maybe these "redundant" implementations could be avoided by exposing
separate streams through the demuxer API (and making them individually
seekable) or something like this, but this is messy and not without
problems for multiple reasons. So for now this commit is the best way to
fix the observed behavior.
Instead of just using "edl/" for the file format, report "mkv_oc" if it's
generated from ordered chapters, "cue/" if from .cue, "multi/" if it's
from EDL but only for adding separate streams, "dash/" if it's from EDL
but only using the DASH hack, and "edl/" for everything else.
The EDL variants are mostly special-cased to the variants the ytdl
wrapper usually generates.
This has no effect other than what the command.c file-format property
returns.
Remove the singly linked list hack, replace it with a slightly more
proper data structure. This probably gets rid of a few minor bugs along
the way, caused by the awkward nonsensical sharing/duplication of some
fields.
Another change (because I'm touching everything related to timeline
anyway) is that I'm removing the special semantics for parts[num_parts].
This is now strictly out of bounds, and instead of using the start time
of the next/beyond-last part, there is an end time field now.
Unfortunately, this also requires touching the code for cue and mkv
ordered chapters. From some superficial testing, they still seem to
mostly work.
One observable change is that the "no_chapters" header is per-stream
now, which is arguably more correct, and getting the old behavior would
require adding code to handle it as special-case, so just adjust
ytdl_hook.lua to the new behavior.
Used by the next commit. It mostly exposes part of mp4_dash
functionality. It actually makes little sense other than for ytdl
special-use. See next commit.
Normal EDL needs to clip packets coming from the underlying demuxer to
the segment range (including complicated stuff due to frame reordering).
This is unwanted in pseudo-DASH mode. A broken or subtly incorrect
manifest would lead to "bad stuff" happening. The intention of the
pseudo-DASH mode is to literally concatenate fragments.
This fixes that there were weird delay ("buffering") when seeking into
the last part of a seekable range. The exact case which triggers it is
when SEEK_FORWARD is used, and the seek pts is after the second-last
keyframe, but before the end of the range. In that case,
find_seek_target() returned NULL, and the cache layer waited for the
_next_ keyframe from the underlying demuxer before resuming playback.
find_seek_target() returned NULL, because the last keyframe had
kf_seek_pts unset. This field contains the lowest PTS in the packet
range from the keyframe until the next keyframe (or EOF). For normal
seeks, this is needed because keyframes don't necessarily have the
minimum PTS in the packet range, so it needs to be computed by waiting
for all packets until the next keyframe (or EOF).
Strictly speaking, this behavior was correct, but it meant that the
caller would set ds->skip_to_keyframe, which waits for the next newly
demuxed keyframe. No packets were returned to the decoder until this
happened, usually resulting in the frontend entering "buffering" mode.
What it really needs to do is returning the last keyframe in the cache.
In this situation, the seek target points in the middle of the last
completely cached packet range (as delimited by keyframes), and
SEEK_FORWARD is supposed to skip to the next keyframe. This is in line
with the basic assumptions the packet cache makes (e.g. the keyframe
flag means it's possible to start decoding, and the frames decoded from
it and following packets will strictly have PTS values above the
previous keyframe range). This means in this situation the kf_seek_pts
value doesn't matter either.
So fix this situation by explicitly detecting it and then returning the
last cached keyframe.
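The added case in find_seek_target() looks roughly like this (a sketch
assembled from the names used in this message):

    if (!target && (flags & SEEK_FORWARD) && keyframe_latest &&
        pts <= queue->seek_end)
        target = keyframe_latest; // middle of the last cached keyframe range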
Should the search loop look at all packets, instead of only keyframe
ones? This would mean it can know that it's within the last keyframe
range (without looking at queue->seek_end). Maybe this would be a bit
more natural for the SEEK_FORWARD case, but due to PTS reordering it
doesn't sound like a useful thing to do.
Should skip_to_keyframe be checked by the code that sets kf_seek_pts to
a known value? This wouldn't help too much; the frontend would still go
into "buffering" mode for no reason until the packet range is completed,
although it would resume from the correct range.
Should a NULL return always unconditionally use keyframe_latest? This
makes sense because the seek PTS is usually already in the cached range,
so this is the only case that should happen. But there are scary special
cases, like sparse subtitle streams, or other uses of find_seek_target()
which could be out of range now or in future. Basically, don't "risk"
it.
One other potential problem with this is that the "adjust seek target"
code will be disabled in this case. It checks kf_seek_pts, and if it's
unset, the adjustment is not done. Maybe this could be changed to use
the queue's seek_end time, but I'm not sure if this is fully kosher. On
the other hand, I think the main use for this adjustment is with
backwards seeks, so this shouldn't matter.
A previous commit dealing with audio/video stream merging mentioned how
seeking forward entered "buffering" mode for unknown reasons; this
commit fixes this issue.
demux_timeline doesn't do any transport accesses itself. The slave
demuxers do this (these will actually access the stream layer and
perform e.g. network accesses). As a consequence, demux_timeline always
reported 0 bytes read, and network speed display didn't work.
Fix this by awkwardly reporting the amount of read bytes upwards. This
is not very nice, and requires explicit calls whenever the slave "might"
have read data.
Due to the way the reporting is done, it only works if the slaves do not
run demuxer threads, which makes things even less nice. (Fortunately
they don't anyway, because it would be a waste of resources.) Some
identifiers contain the word "hack" as a warning.
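The reporting boils down to something like this (names approximate; as
said, the real identifiers carry "hack"):

    static void update_slave_stats(struct demuxer *tl, struct demuxer *slave)
    {
        // Forward whatever the slave read since last time to the top level:
        demux_report_unbuffered_read_bytes(tl, demux_get_bytes_read_hack(slave));
    }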
Some of the stupidity comes from the fact that demux.c itself resets the
stats randomly in order to calculate the bytes_per_second value, which
is useless for a slave, but of course is still done, because demux.c
itself is not aware of whether it's on the slave or top-level layer.
Unfortunately, this will have to do.
In theory, the demuxer thread/cache layer should be separated from
demuxer implementations. This would get rid of all the awkwardness and
nonsense. For example, the only threading involved would be the caching
layer, completely separate from demuxers themselves. It'd be the only
thing that calculates speed rates for the player frontend, too (instead of
doing it for each demuxer, even if unused).
It was an ugly hack, and the next commit will make it even uglier.
Slightly reduce the ugliness to prevent death of too many brain cells,
though it's still an ugly hack.
The cleanup is really minor, but I guess the following commit would be
much worse otherwise. In particular, this commit checks accesses
(instead of having a public field with evil access rules), which should
avoid misunderstandings and incorrect use. Strictly speaking, the added
field is redundant, but the next commit complicates it a bit.
I think this is better. On the other hand, this is a behavior change.
The EDL "spec" says that unknown fields are igored. But strictly
speaking, unknown headers are not "fields", but unknown entities.
EDL "headers" were always an afterthought, and kind of hacked on top of
the existing code. Improve it slightly, and make it follow the
conventions of the normal parsing. Basically use the same code structure
for them, just that they use different field names.
This commit adds an extension to mpv EDL, which basically allows you to
do the same as --audio-file, --external-file, etc. in a single EDL file.
This is a relatively quick & dirty implementation. The dirty part lies
in the fact that several shortcuts are taken. For example, struct
timeline now forms a singly linked list, which is really weird, but also
means the other timeline using demuxers (cue, mkv) don't need to be
touched. Also, memory management becomes even worse (weird object
ownership rules that are just fragile WTFs). There are some other
dubious small changes, mostly related to the weird representation of
separate streams.
demux_timeline.c contains the actual implementation of the separate
stream handling. For the most part, most things that used to be on the
top level are now in struct virtual_source, of which one for each
separate stream exists. This is basically like running multiple
demux_edl.c in parallel. Some changes could strictly speaking be split
into a separate commit, such as the stream_map type change.
Mostly untested. Seems to work for the intended purpose. Potential for
regressions for other timeline uses (like ordered chapters) is probably
low. One thing which could definitely break and which I didn't test is
the pseudo-DASH fragmented EDL code, of which ytdl can trigger various
forms in obscure situations. (Uh why don't we have a test suite.)
Background:
The intention is to use this for the ytdl wrapper. A certain streaming
site from a particularly brain damaged and plain evil Silicon Valley
company usually provides streams as separate audio and video streams.
The ytdl wrapper simply uses audio-add (i.e. adding it as external
track, like with --audio-file), which works mostly fine. Unfortunately,
mpv manages caching completely separately for external files. This has
the following potential problems:
1. Seek ranges are rendered incorrectly. They always use the "main"
stream, in this case the video stream. E.g. clicking into a cached range
on the OSC could trigger a low level seek if the audio stream is
actually not cached at the target position.
2. The stream cache bloats unnecessarily. Each stream may allocate the
full configured maximum cache size, which is not what the user intends
to do. Cached ranges are not pruned the same way, which creates disjoint
cache ranges, which only use memory and won't help with fast seeking or
playback.
3. mpv will try to aggressively read from both streams. This is done
from different threads, with no regard which stream is more important.
So it might happen that one stream starves the other one, especially if
they have different bitrates.
4. Every stream will use a separate thread, which is an unnecessary
waste of system resources.
In theory, the following solutions are available (this commit works
towards D):
A. Centrally manage reading and caching of all streams. A single thread
would do all I/O, and decide from which stream it should read next. As
long as the total TCP/socket buffering is not too high, this should be
effective to avoid starvation issues. This can also manage the cached
ranges better. It would also get rid of the quite useless additional
demuxer threads. This solution is conceptually simple, but requires
refactoring the entire demuxer middle layer.
B. Attempt to coordinate the demuxer threads. This would maintain a
shared cache and readahead state to solve the mentioned problems
explicitly. While this sounds simple and like an incremental change,
it's probably hard to implement, and creates more messy special cases;
solution A. seems just a better and simpler variant of this. (On the
other hand, A. requires refactoring more code.)
C. Render an intersection of the seek ranges across all streams. This
fixes only problem 1.
D. Merge all streams in a dedicated wrapper demuxer. The general demuxer
layer remains unchanged, and reading from separate streams is handled as
special case. This effectively achieves the same as A. In particular,
caching is simply handled by the usual demuxer cache layer, which sees
the wrapper demuxer as a single stream of interleaved packets. One
implementation variant of this is to reuse the EDL infrastructure, which
this commit does.
All in all, solution A would be preferable, because it's cleaner and
works for all external streams in general.
Some previous commit tried to prepare for implementing solution A. This
could still happen. But it could take years until this is finally
seriously started and finished. In any case, this commit doesn't block
or complicate such attempts, which is also why it's the way to go.
It's worth mentioning that original mplayer handles external files by
creating a wrapper demuxer. This is like a less ideal mixture of A. and
D. (The similarity with A. is that extending the mplayer approach to be
fully dynamic and without certain disadvantages caused by the wrapper
would end up with A. anyway. The similarity with D. is that due to the
wrapper, no higher level code needs to be changed.)
struct stream used to include the stream buffer, including peek buffer,
inline in the struct. It could not be resized, which means the maximum
peek size was set in stone. This meant demux_lavf.c could peek only so
much data.
Change it to use a dynamic buffer. Because it's possible, keep the
inline buffer for default buffer sizes (which are basically always used
outside of file opening). It's unknown whether it really helps with
anything. Probably not.
This is also the fallback plan in case we need something like the old
stream cache in order to deal with mp4 + unseekable http: the code can
now be easily changed to use any buffer size.
The only thing left is the notification for track switching. Just get
rid of that.
There's probably no real reason to get rid of control(), but why not. I
think I was actually trying to do some real work but fuck that.
Subtitles (and a few other file types, like playlists) are not streamed,
but fully read on opening. This means keeping the file handle or network
socket open is a waste of resources and could cause other weird
behavior. This is why there's a hack to close them after opening.
Change this hack to make the demuxer itself do this, which is less
weird. (Until recently, demuxer->stream ownership was more complex,
which is why it was done this way.)
There is some evil shit due to a huge ownership/lifetime mess of various
objects. Especially EDL (the currently only nested demuxer case)
requires being careful about mp_cancel and passing down stream pointers.
As one defensive programming measure, stop accessing the "stream"
variable in open_given_type(), even where it would still work. This
includes removing a redundant line of code, and removing the peek call,
which should not be needed anymore, as the remaining demuxers do this
mostly correctly.
I always wanted to get rid of this, because it makes the ownership rules
for the stream pointer really awkward. demux_edl.c was the only
remaining user of this. Replace it with a semi-clever idea: the init
segment shit can be used to pass the "file" contents as memory block,
and "memory://" itself provides an empty stream. I have no idea if this
actually works, because I didn't immediately find a test stream (would
have to be some youtube DASH shit).
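The idea, sketched (demuxer_params does have an init_fragment field;
the exact call site is approximate):

    struct demuxer_params params = {
        .init_fragment = file_contents, // the old "file" contents as memory
    };
    struct demuxer *d = demux_open_url("memory://", &params, cancel, global);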
Instead of going through those weird DEMUXER_CTRLs, query this
information directly. I'm not sure which kind of brain damage made me
use CTRLs for these. Since there are no other DEMUXER_CTRLs that make
sense for the frontend, remove the remaining infrastructure for them
too.
The stream size return was the only thing that still required doing
STREAM_CTRLs from frontend through the demuxer layer. This can be done
much easier, so rip it out. Also rip out the now unused infrastructure
for STREAM_CTRLs via demuxer layer.
Apparently this was so that when playing a video file from a .rar file,
it would load external subtitles with the same name (instead of looking
for mpv's rar:// mangled URL). This was requested on github almost 5
years ago. Seems like a weird feature, and I don't care. Drop it,
because it complicates some in-progress change.
This code set pkt->stream to a value which I'm not sure whether it's
correct. A recent commit overwrote it with a value that is definitely
correct.
There appears to be an off by one error. No fucking clue whether this
was somehow correct, but applying an apparent fix does not seem to break
anything, so whatever.
The "program" property could switch between TS programs. It was rather
complex and rather obscure (even if you deal with TS captures, you
usually don't need it). If anyone actually needs it (did anyone ever
attempt to even use it?), it should be rewritten. The demuxer should
export a program list, and the frontend should handle the "cycling"
logic.
Linux analog TV support (via tv://) was excessively complex, and
whenever I attempted to use it (cameras or loopback devices), it didn't
work well, or would have required some major work to update it. It's
very much stuck in the analog past (my favorite are the frequency tables
in frequencies.c for analog TV channels which don't exist anymore).
Especially cameras and such work fine with libavdevice and better than
tv://, for example:
mpv av://v4l2:/dev/video0
(adding --profile=low-latency --untimed even makes it mostly realtime)
Adding a new input layer that targets such "modern" uses would be
acceptable, if anyone is interested in it. The old TV code is just too
focused on actual analog TV.
DVB is rather obscure, but has an active maintainer, so don't remove it.
However, the demux/stream ctrl layer must go, so remove controls for
channel switching. Most of these could be reimplemented by using the
normal method for option runtime changes.
This removes anything related to DVD/BD/CD that negatively affected the
core code. It includes trying to rewrite timestamps (since DVDs and
Blurays do not set packet stream timestamps to playback time, and can
even have resets mid-stream), export of chapters, stream languages,
export of title/track lists, and all that.
Only basic seeking is supported. It is very much possible that seeking
completely fails on some discs (on some parts of the timeline), because
timestamp rewriting was removed.
Note that I don't give a shit about optical media. If you want to watch
them, rip them. Keeping some bare support for DVD/BD is the most I'm
going to do to appease the type of lazy, obnoxious users who will care.
There are other players which are better at optical discs.
Obvious mistake. This reported 44 bytes more data than what was
available. Could cause out of bounds reads. Security researchers would
claim a major victory if they found something like this in more popular
software, and would create a website for it.
Manual changes done:
* Merged the interface-changes under the already master'd changes.
* Moved the hwdec-related option changes to video/decode/vd_lavc.c.
The seek range update was too early and did not take the removed head
packets into account, and therefore missed that the queue was not BOF
anymore. This led to not being able to seek backward before the first
packet of the first seek range.
Fix it by moving the seek range update after the possible removal and
the change of the BOF flag.
Fixes: #6522
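Sketched, the reordering is (update_seek_ranges() as named above; the
pruning step is approximate):

    prune_removed_head_packets(queue); // may clear the queue's BOF flag
    update_seek_ranges(range);         // now sees the corrected BOF state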
Commit e392d6610d modified the native
demuxer to use track gain as a fallback for album gain if the latter is
not present. This commit makes functionally equivalent changes in the
libavformat demuxer.
If the number of chapters is 0, the chapter list can be NULL. clang
complains that we pass NULL to qsort(). This is yet another pointless UB
that exists for no reason other than wasting your time.
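The guard is trivial (comparator name hypothetical):

    // qsort() with a NULL base is UB even when the count is 0, so skip it:
    if (num_chapters > 0)
        qsort(chapters, num_chapters, sizeof(chapters[0]), cmp_chapter);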
The redundancy here always annoyed me. Back then I didn't change it
because it's hard to test and I just had fixed something. This doesn't
matter anymore, so simplify it, without testing and with the risk that
something breaks (why care).
--record-file is nice, but only sometimes. If you watch some sort of
livestream which you want to record, it's actually much nicer not to
record what you're currently "seeing", but anything you're receiving.
In theory, this could be easily done with custom I/O. In practice, all
the halfassed garbage in FFmpeg shits itself and fucks up like there's
no tomorrow. There are several problems:
1. FFmpeg pretends you can do custom I/O, but in reality there's a lot
that custom I/O can't do. hls.c even contains explicit checks to disable
important things if custom I/O is used! In particular, you can't use the
HTTP keepalive functionality (needed for somewhat decent HLS
performance), because some cranky asshole in the cursed FFmpeg dev.
community blocked it.
2. The implementation of nested I/O callbacks (io_open/io_close) is
bogus and halfassed (like everything in FFmpeg, really). It will call
io_open on some URLs without ever calling io_close. Instead, it'll call
avio_close() on the context directly. From what I can tell, avio_close()
is incompatible with custom I/O anyway (overwhelmed by their own garbage,
the FFmpeg devs created the io_close callback for this reason, because
they couldn't fix their own fucking garbage). This commit adds some
shitty workaround for this (technically triggers UB, but with that
garbage heap of a library we depend on it's not like it matters).
3. Even then, you can't proxy I/O contexts (see 1.), but we can just
keep track of the opened nested I/O contexts. The bytes_read is
documented as not public, but reading it is literally the only way to
get what we want.
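So the stats collection ends up as something like this (a sketch;
bytes_read is a real AVIOContext field, the bookkeeping around it is
approximate):

    int64_t total = 0;
    for (int n = 0; n < p->num_opened; n++) // AVIOContexts seen in io_open
        total += p->opened[n]->bytes_read;  // "not public", but all we have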
A more reasonable approach would probably be using curl. It could
transparently handle the keep-alive thing, as well as propagating
cookies etc. (which doesn't work with the FFmpeg approach if you use
custom I/O). Of course even better if there were an independent HLS
implementation anywhere. FFmpeg's HLS support is so embarrassingly
pathetic and just goes to show that they belong to the past
(multimedia from 2000-2010) and should either modernize or fuck off.
With FFmpeg's shit-crusted structures, toxic communities, and retarded
assholes denying progress, probably the latter. Did I already mention
that FFmpeg is a shit fucked steaming pile of garbage shit?
And all just to get some basic I/O stats, that any proper HLS consumer
requires in order to implement adaptive streaming correctly (i.e.
browser based players, and nothing FFmshit based).
I encountered a stream that fails with "Could not demux init fragment.".
It turns out this is a regression from the recent change to that code.
The assumption was that demux_lavf.c would treat this as concatenated
stream - which it does, but not for probing.
Doing this transparently is hard without doing it properly. Doing it
properly would mean creating some sort of stream_concat (reminiscent of
that FFmpeg security bug). I probably don't want to go there, and I
think libavformat should just support this directly, so whatever.
Hack-fix this with the knowledge that the init segment will always
contain the headers.
FFmpeg is retarded enough not to give us any indication whether it is
(unless we query fields not in the ABI/API). I bet FFmpeg developers
love it when library users have to litter their code with duplicated
information.
The demuxer cache is the only cache now. Might need another change to
combat seeking failures in mp4 etc. The only bad thing is the loss of
cache-speed, which was sort of nice to have.
When the current packet queue was completely empty, and EOF was reached,
the queue->is_eof flag was not correctly set to true. Change this by
reading ds->eof to check whether the stream is considered EOF. We also
need to make sure update_seek_ranges() is called in this case, so change
the code to simply call it when queue->is_eof changes.
Also, read_packet() needs to call adjust_seek_range_on_packet() if
ds->eof changes. In that case, the decoder also needs to be notified
about EOF. So both of these should be called when ds->eof changes to
true. (Other code outside of this function deals with the case when
ds->eof is changed to false.)
In addition, this code was kind of shoddy about calling wakeup_ds()
correctly. It looks like there was an inverted condition, and sent a
wakeup to the decoder only when ds->eof was already true, which is
obviously bogus. The final EOF case tried to be somehow clever about
checking in->last_eof for notifying the codec, which is sort of OK, but
seems to be strictly worse than just checking whether ds->eof changed.
Fix these things.
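Sketched with the names used above (a simplification of the actual
logic):

    bool eof = !ds->reader_head && in->last_eof;
    if (eof && !ds->eof) {
        ds->eof = true;
        adjust_seek_range_on_packet(ds, NULL); // may flip queue->is_eof
        wakeup_ds(ds);                         // tell the decoder about EOF
    }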
Passing NULL to mp_get_config_group() returns the main option struct.
This is just a dumb hack to deal with inconsistencies caused by legacy
things (as I'll claim), and will probably be changed in the future. So
before littering the whole code base with hard-to-find NULL parameters,
require callers to use an easy-to-find separate define.
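Something like this (the define's name is hypothetical):

    #define GLOBAL_CONFIG NULL // greppable marker for "the main option struct"

    void *opts = mp_get_config_group(ta_parent, global, GLOBAL_CONFIG);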
This will enable the player core to terminate the demuxers in a "nicer"
way without having to block on network. If it just used demux_free(), it
would either have to block on network, or like currently, essentially
kill all I/O forcefully.
The API is slightly awkward, because demuxer lifetime is bound to its
allocation. On the other hand, changing that would also be awkward, and
introduce weird in-between states that would have to be handled in tons
of places.
Currently unused, to be used later.
Always give each demuxer its own mp_cancel instance. This makes
management of the mp_cancel things much easier. Also, instead of having
add/remove functions for mp_cancel slaves, replace them with a simpler
to use set_parent function. Remove cancel_and_free_demuxer(), which had
mpctx as parameter only to check an assumption. With this commit,
demuxers have their own mp_cancel, so add demux_cancel_and_free() which
makes use of it.
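Usage now looks roughly like this (mp_cancel_new()/mp_cancel_set_parent()
as introduced here; the surrounding code is approximate):

    demuxer->cancel = mp_cancel_new(demuxer);      // every demuxer owns one
    mp_cancel_set_parent(demuxer->cancel, parent); // aborts propagate downward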
Them being separate is just dumb. Replace them with a single
demux_free() function, and free its stream by default. Not freeing the
stream is only needed in 1 special case (demux_disc.c), use a special
flag to not free the stream in this case.
The properties/commands touched in this commit are all for obscure
special inputs (BD/DVD/DVB/TV), and they all block on the demuxer/stream
layer. For network streams, this blocking is very unwelcome. They will
affect playback and probably introduce pauses and frame drops. The
player can even freeze fully, and the logic that tries to make playback
abortable even if frozen complicates the player.
Since the mentioned accesses are not needed for network streams, but
they will block on network streams even though they're going to fail,
add a flag that coarsely enables/disables these accesses. Essentially it
establishes a whitelist of demuxers/streams which support them.
In theory you could access BD/DVD images over network (or add such
support, I don't think it's a thing in mpv). In these cases these
controls still can block and could even "freeze" the player completely.
Writing to the "program" and "cache-size" properties still can block
even for network streams. Just don't use them if you don't want freezes.
Don't allow it to freeze everything when loading a playlist from network
(although you definitely shouldn't do that, but whatever).
This also affects the really obscure --ordered-chapters-files option.
The --playlist option on the other hand has no choice but to freeze the
shit, because there's no concept of aborting the player during command
line parsing.
It seems a bit inappropriate to have dumped this into stream.c, even if
it's roughly speaking its main user. At least it made its way somewhat
unfortunately to other components not related to the stream or demuxer
layer at all.
I'm too greedy to give this weird helper its own file, so dump it into
thread_tools.c.
Probably a somewhat pointless change.
If a stream starts later than the others at the start of the file, it
shouldn't restrict the seek range to the time stamp where it begins.
This is similar to the previous commit, just for the other end.
Normally, the seek range is the minimum overlap of the cached ranges of
each stream. But if one of the streams ends earlier, this leads to the
seek range getting cut off, even if you could seek there.
Change it so that EOF streams cannot restrict the end of the seek range.
They can only extend it. This is the opposite from not-EOF streams, so
they need to be handled separately. In particular, they get excluded from
normal end range calculation, but when full EOF is reached, all streams
are EOF, and the maximum end time can be used to set the seek end time.
(In theory we could also take the max with the demuxer signaled total
file duration, but let's not for now.)
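The rule, sketched (MP_PTS_MIN/MP_PTS_MAX are the NOPTS-aware helpers;
the loop context is omitted):

    if (ds->eof)
        seek_end = MP_PTS_MAX(seek_end, queue->seek_end); // may only extend
    else
        seek_end = MP_PTS_MIN(seek_end, queue->seek_end); // restricts as before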
Also, if a stream is completely empty, essentially skip it, instead of
considering the range unseekable. (Also, we don't need to mess with
seek_start in this case, because it will be NOPTS and is skipped
anyway.)
Fixes several issues playing back mpegts with video streams marked
as having "still images". For example, see this video which has
frames only every 6s: https://s3.amazonaws.com/tmm1/music-choice.ts
Changes include:
- start playback right away, without waiting for first video frame
- do not consider the sparse video stream in demuxer underrun detection
- do not require multiple video frames for the VO
- use audio as the master stream for demuxer metadata events
- use audio stream for playback time
Signed-off-by: Aman Gupta <aman@tmm1.net>
With -v -v ("debug" level), which is the default for --log-file, this
would log every damn Matroska EBML element and some other uninteresting
things, which was very noisy.
Adjust the log levels to make them less noisy. Also, change some log
calls to MP_ERR for things which are actually errors.
Going by ISO 639.2, "und" means "Undetermined". Whatever it's supposed
to mean, in practice it's used for "unset". We prefer if the language
tag remains simply unset in this case.
This removes an ugliness with mp4 in particular, because libavformat
will export unset languages as such, which affects most mp4 files.
This makes ICY title changes show up at approximately the correct time,
even if the demuxer buffer is huge. (It'll still be wrong if the stream
byte cache contains a meaningful amount of data.)
It should have the same effect for mid-stream metadata changes in e.g.
OGG (untested).
This is still somewhat fishy, but in parts due to ICY being fishy, and
FFmpeg's metadata change API being somewhat fishy. For example, what
happens if you seek? With FFmpeg AVFMT_EVENT_FLAG_METADATA_UPDATED and
AVSTREAM_EVENT_FLAG_METADATA_UPDATED we hope that FFmpeg will correctly
restore the correct metadata when the first packet is returned.
If you seek with ICY, we're out of luck, and some audio will be
associated with the wrong tag until we get a new title through ICY
metadata update at an essentially random point (it's mostly inherent to
ICY). Then the tags will switch back and forth, and this behavior will
stick with the data stored in the demuxer cache. Fortunately, this can
happen only if the HTTP stream is actually seekable, which it usually is
not for ICY things. Seeking doesn't even make sense with ICY, since you
can't know the exact metadata location. Basically ICY metadata sucks.
Some complexity is due to a microoptimization: I didn't want additional
atomic accesses for each packet if no timed metadata is used. (It
probably doesn't matter at all.)
This fixes an issue where captions stop rendering after an
in-demuxer-cache seek, because the demuxer keeps waiting to find
a keyframe (ds->skip_to_keyframe set to true in execute_cache_seek).
FFmpeg marks audio tracks which are not meant to be played standalone
as DEPENDENT. These are typically used in DVB broadcasts for audio
descriptions, and are meant to be mixed into the main audio track during
playback.
I changed avio_flush() and introduced avformat_flush() exactly for this
reason.
Used with DVD/BD only (on seeks and when setting the "angle" property).
Seems to work, but wasn't tested too thoroughly (I don't care about
optical discs, I only want this ugly stuff gone that might even violate
the API/ABI).
Some shittily muxed files (by a certain HandBrake+libavformat combo)
contain a SeekHead pointing to a SeekHead at the end of the file, which
in turn points to track headers (also at the end of the file). This
failed because the demuxer didn't bother to actually read the elements
listed by the second SeekHead, so no track headers were read, and
playback broke.
Somehow commit 6fe75c38 broke this for no reason. It adds a "needed"
field, which seems completely pointless and replaced the "parsed" flag
in an incomplete way. In particular, the "needed" field was not set when
a _recursive_ SeekHead was read, so those elements were not read. Just
get rid of the field and use "parsed" instead.
Quickly tested by a person who had FFmpeg linked with libaom.
Seems as simple as the VP9 mappings, where there is no extradata/
initialization data off-band, and just stuff in the packets
themselves.
Do note that the AV1 video format itself at this point is still
not frozen, so what you might produce one day might not be
decodable the following day.
When this happens, network calls are forcibly aborted (more or less),
but demuxers might keep going, as most of them do not check for forced
exits properly. This can possibly lead to broken packets being added.
Also do not attempt to read more packets in this situation.
Also do not print a stream open failed message if opening was aborted
anyway.
Since the demuxer cache addition, ds->queue->head can actually be set to
non-NULL, but the decoder can still be at EOF (with no packets to come).
This made it report an unknown buffered size, instead of 0. Fix this by
checking the decoder part of the packet queue instead.
Probably doesn't matter much, but fixes an annoying "???" on the CLI
status line in some situations.
Naturally, there's more than one fourcc that indicates an mjpeg
stream.
I have a particular ancient webcam here (Logitech QuickCam Messenger)
that only supports the single 'JPEG' format, but there are other
devices out there which support both 'JPEG' and 'MJPG' with no visible
differences, and others where the streams are slightly different.
Regardless of those details, it remains correct to treat 'JPEG'
the same as 'MJPG' from a stream consumption perspective.
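Wherever the fourcc is translated, the mapping is as simple as this (a
sketch; MP_FOURCC exists, the surrounding switch is illustrative):

    case MP_FOURCC('J','P','E','G'): // fall through: treat exactly like 'MJPG'
    case MP_FOURCC('M','J','P','G'):
        codec = "mjpeg";
        break;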
It's a mess: mp3 files have user tags as global metadata (because the
id3v2 tag is global and there is only 1 stream), while OGG files have it
per-track (because it's per-stream on the lowest level). mpv needs to
try to make something nice out of the mess.
It did so by trying to detect audio-only OGG files, and then copying the
per-stream metadata to the global metadata. Make the heuristic for
detecting this slightly more clever, so it works for files with extra,
unrelated streams, like the awful libavformat cover art hack.
Fixes #5577.
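The heuristic amounts to roughly this (sketch with approximated mpv
types):

    #include <stdbool.h>

    struct demux_packet;
    enum stream_type { STREAM_VIDEO, STREAM_AUDIO, STREAM_SUB };
    struct sh_stream {
        enum stream_type type;
        struct demux_packet *attached_picture; // cover art hack, if any
    };

    // "Audio-only": exactly one audio stream, and any video streams are
    // just attached cover art pictures.
    static bool is_audio_only(struct sh_stream **streams, int num_streams)
    {
        int num_audio = 0;
        for (int n = 0; n < num_streams; n++) {
            if (streams[n]->type == STREAM_AUDIO)
                num_audio++;
            if (streams[n]->type == STREAM_VIDEO &&
                !streams[n]->attached_picture)
                return false; // a real video stream disqualifies it
        }
        return num_audio == 1;
    }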
It appears some (or all) mkv files with EAC3 are muxed in a way that
breaks FFmpeg's spdifenc. I suspect it's because either dependent
substream packets are located in their own packets, or the reverse. Or
possibly this is a case where the muxer did not respect packet boundaries
at all. Enabling the EAC3 parser seems to fix this anyway, because why
waste your precious time on retarded Dolby bullshit technology? (Which
idiot came up with this shitty substream garbage?)
Observed with dolby_digital_plus_channel_check_lossless-DWEU.mkv.
Fixes #5578.
This includes codec/muxer/demuxer iteration (different iteration
function, registration functions deprecated), and the renaming of
AVFormatContext.filename to url (plus making it a malloced string).
Libav doesn't have the new API yet, so it will break. I hope they will
add the new APIs too.
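For reference, the new iteration API works like this (real FFmpeg 4.0
API, usage sketch):

    #include <stdio.h>
    #include <libavcodec/avcodec.h>
    #include <libavformat/avformat.h>

    // Opaque-iterator enumeration; replaces the deprecated
    // av_register_all()/av_iformat_next()/av_codec_next() style.
    static void list_formats_and_codecs(void)
    {
        void *it = NULL;
        const AVInputFormat *demuxer;
        while ((demuxer = av_demuxer_iterate(&it)))
            printf("demuxer: %s\n", demuxer->name);

        it = NULL;
        const AVCodec *codec;
        while ((codec = av_codec_iterate(&it)))
            printf("codec: %s\n", codec->name);
    }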
Reduce backward/forward from 400MB/400MB to 50MB/150MB. Too many
complaints about high memory usage.
Note that external tracks (like ytdl DASH with external audio tracks)
will double the amounts, because an external track uses its own demuxer
and cache.
This reverts commit b7f90be567.
The author agreed to the relicensing now (if that code is affected by
the original copyright at all - that was the only line possibly left of
it).
If tags like TITLE have the whole parameter in " quotes, strip them.
Also remove the leading whitespace; even a single leading space was
always included before.
Fixes #5462.
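The cleanup amounts to this (sketch; the real code uses mpv's bstr
helpers):

    #include <ctype.h>
    #include <string.h>

    // Skip leading whitespace, then strip surrounding "" quotes in place.
    static char *clean_tag_param(char *s)
    {
        while (isspace((unsigned char)s[0]))
            s++;
        size_t len = strlen(s);
        if (len >= 2 && s[0] == '"' && s[len - 1] == '"') {
            s[len - 1] = '\0';
            s++;
        }
        return s;
    }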
Move dec_video.c to filters/f_decoder_wrapper.c. It essentially becomes
a source filter. vd.h mostly disappears, because mp_filter takes care of
the dataflow, but its remains are in struct mp_decoder_fns.
One goal is to simplify dataflow by letting the filter framework handle
it (or more accurately, using its conventions). One result is that the
decode calls disappear from video.c, because we simply connect the
decoder wrapper and the filter chain with mp_pin_connect().
Another goal is to eventually remove the code duplication between the
audio and video paths for this. This commit prepares for this by trying
to make f_decoder_wrapper.c extensible, so it can be used for audio as
well later.
Decoder framedropping changes a bit. It doesn't seem to be worse than
before, and it's an obscure feature, so I'm content with its new state.
Some special code that was apparently meant to avoid dropping too many
frames in a row is removed, though.
I'm not sure how the source code tree should be organized. For one,
video/decode/vd_lavc.c is the only file in its directory, which is a bit
annoying.
This is supposed to help making data flow easier and wakeup handling
more efficient. Once that change is done, reading a packet on any
stream won't have to wakeup and poll all decoders (which helps reducing
the mess even if all decoders are on the same thread).
This also improves the accuracy of wakeups by tracking better whether
a wakeup is needed.
AV_DISPOSITION_ATTACHED_PIC usually means the video track isn't real,
and merely reflects the presence of an embedded image in tag data (such
as ID3v2 tags), with some inconsistent hack to make libavformat return
it as a video packet once.
Except it doesn't mean that. It can be randomly set on other streams
that do sort of behave like video streams, such as chapter thumbnail
tracks in mp4 files. AV_DISPOSITION_TIMED_THUMBNAILS is set in these
cases. In theory, there can supposedly be more such cases, but only the
chapter thumbnail one currently exists. So add it as exception.
This restores displaying these thumbnails as video frames, for better or
worse. (Before, only the first thumbnail was displayed.)
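The check boils down to this (sketch; both disposition flags are real
libavformat API):

    #include <stdbool.h>
    #include <libavformat/avformat.h>

    // Attached pictures are "fake" video, unless they are timed
    // thumbnails (mp4 chapter thumbnail tracks), which behave like a
    // real, if sparse, video stream.
    static bool is_coverart(const AVStream *st)
    {
        return (st->disposition & AV_DISPOSITION_ATTACHED_PIC) &&
              !(st->disposition & AV_DISPOSITION_TIMED_THUMBNAILS);
    }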
Requires newest FFmpeg git, which has a change that makes the HLS
demuxer set an AVFMTCTX_UNSEEKABLE flag if seeking is not available,
which is the case for HLS live streams. This should make the player
frontend behave pretty well, instead of crapping up irrecoverably.
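The check itself is simple (sketch):

    #include <stdbool.h>
    #include <libavformat/avformat.h>

    // ctx_flags can change at runtime (the HLS demuxer sets this for
    // live streams), so it's worth rechecking during demuxing.
    static bool stream_seekable(const AVFormatContext *avfc)
    {
        return !(avfc->ctx_flags & AVFMTCTX_UNSEEKABLE);
    }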
And use it for 2 demuxer options. It could be used for more options
later. (Though the --cache options can not use this, because they use KB
as base unit.)
I found that at least for mjpeg streams, FFmpeg will set packet pts/dts
anyway. The mjpeg raw video demuxer (along with some other raw formats)
has a "framerate" demuxer option which defaults to 25, so all mjpeg
streams will be played at 25 FPS by default.
mpv doesn't like this much. If AVFMT_NOTIMESTAMPS is set, it prints a
warning, which might show a bogus FPS value for the assumed framerate.
The code was originally written with the assumption that FFmpeg would
not set pts/dts for such formats, but since it does, the printed
estimated framerate will never be used. --fps will also not be used by
default in this situation.
To make this hopefully less confusing, explicitly state the situation
when the AVFMT_NOTIMESTAMPS flag is set, and give instructions on how
to work around it.
Also, remove the warning in dec_video.c. We don't know what FPS it's
going to assume anyway. If there are really no timestamps in the stream,
it will trigger our normal missing pts workaround. Add the assumed FPS
there.
In theory, we could just clear packet timestamps if AVFMT_NOTIMESTAMPS
is set, and make up our own timestamps. That is non-trivial for advanced
video codecs like h264, so I'm not going there. For seeking and
buffering estimation the situation thus remains half-broken.
This is a mitigation for #5419.
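The new message amounts to something like this (sketch; the real code
uses mpv's logging, and the exact option names may differ):

    #include <stdio.h>
    #include <libavformat/avformat.h>

    static void warn_if_no_timestamps(const AVFormatContext *avfc)
    {
        if (avfc->iformat->flags & AVFMT_NOTIMESTAMPS) {
            fprintf(stderr, "This format has no timestamps; libavformat "
                    "makes them up. Use --fps or the demuxer's framerate "
                    "option to override the assumed rate.\n");
        }
    }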
It was actually already implemented as ta_dup_ptrtype(), but that seems
like a clunky name. Also we still use the talloc_ names throughout the
source, and I'd rather use an old name instead of mixing inconsistent
naming conventions.
If you play a video with an external audio track, and do backwards
keyframe seeks, then audio can be missing. This is because a backwards
seek can end up way before the seek target (this is just how this seek
mode works). The audio file will be seeked at the correct seek target
(since audio usually has a much higher seek granularity), which results
in silence being played until the video reaches the originally intended
seek target.
There was a hack in audio.c to deal with this. Replace it with a
different hack. The new hack probably works about as well as the old
hack, except it doesn't add weird crap to the audio resync path (which
is some of the worst code here, so this is some nice preparation for
rewriting it). As a more practical advantage, it doesn't discard the
audio demuxer packet cache. The old code did, which probably ruined
seeking in youtube DASH streams.
A non-hacky solution would be handling external files in the demuxer
layer. Then chaining the seeks would be pretty easy. But we're pretty
far from that, because it would either require intrusive changes to the
demuxer layer, or wouldn't be flexible enough to load/unload external
files at runtime. Maybe later.
Similar to 1eec7d2315, but for the beginning of the stream (named BOF in
this commit).
We can know this only if demuxing actually started from the beginning.
If there is a seek to the beginning (even if you use --start=-1000), we
don't know in general whether the demuxer truly returns the start of the
file. We could probably make a heuristic with assuming that this is what
happens if the seek target is before the start time or so, but this is
not included in this commit.
libavformat's cover art hack (aka attached pictures) breaks the ability
of the demuxer cache to keep multiple seek ranges. This happens because
the cover art packet has neither position nor timestamp, and libavformat
gives us the packet even though we intended to drop it.
The cover art hack works by adding the cover art packet to the read
packet stream once when demuxing starts (or after seeks). mpv treats
cover art in a similar way internally, but we have to compensate for
libavformat's shortcomings, and add the cover art packet ourselves when
we need it. So we don't want libavformat to return the packet.
We normally prevent this in demux_lavf.c/select_tracks() and explicitly
disable cover art streams. (We add it in dequeue_packet() instead.) But
libavformat will actually add the cover art packet even if we disable
the cover art stream, because it adds it at initialization time, and
does not bother to check again in av_read_frame() (apparently). The
packet is actually read, and upsets the demuxer cache logic. In
addition, this also means we probably decoded the cover art picture
twice in some situations.
Fix this by explicitly checking/discarding this in yet another place.
(Screw this hack...)
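The extra check sits in the packet read loop, roughly like this
(sketch; stream_selected() is a hypothetical stand-in for the player's
track selection state):

    #include <stdbool.h>
    #include <libavformat/avformat.h>

    static bool stream_selected(AVFormatContext *avfc, int index);

    // Drop the attached picture packet libavformat returns even for
    // deselected streams, so it can't upset the demuxer cache (or get
    // decoded twice).
    static int read_next(AVFormatContext *avfc, AVPacket *pkt)
    {
        for (;;) {
            int r = av_read_frame(avfc, pkt);
            if (r < 0)
                return r;
            AVStream *st = avfc->streams[pkt->stream_index];
            if ((st->disposition & AV_DISPOSITION_ATTACHED_PIC) &&
                !stream_selected(avfc, pkt->stream_index)) {
                av_packet_unref(pkt);
                continue;
            }
            return 0;
        }
    }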
The impact was that you couldn't exactly seek to the join point with a
keyframe seek, even though there was a keyframe. This commit fixes it by
preserving the necessary metadata that got lost on cached range joining.
This is so absurdly obscure that it gets a longer code comment.
This warning was printed when the demuxer cache tried to join two
adjacent seek ranges, but failed if the last keyframe in the second
range was within the (overlapping) first range. This is a weird corner
case which to support probably would not be worth it.
So this code just printed a warning and discarded the second range. As
it turns out, this can happen relatively often if you seek a lot, and
the seek ranges are very tiny (such as consisting of only 1 keyframe).
Dropping the second range in these cases is OK and probably cheaper than
trying to actually join them. Change the warning to verbose level.
(It seems this could actually be "supported", because if keyframe_latest
is not set, there will be no other keyframes, so it could just be unset,
with the exception that q1->keyframe_latest in the code below must not
be overwritten. But still, too much trouble for a special case that
likely does not matter, and it would have to be tested too.)
This means if the user tries to seek past EOF, and we know EOF was seen
already, then use a cached seek, instead of triggering a low level seek.
This requires some annoying tracking, but seems pretty simple otherwise.
One advantage of doing this is that if the user tries to do this kind of
seek, there's no unnecessary waiting for a reaction by network (and in
most cases, redundant downloading of data just to discard it again).
Another is that this avoids creating overlapping seek ranges: previously, the
low level seek would naturally create a new range. Then it would read and add
data from the end of the stream due to the low level demuxer not being able to
seek to the target and selecting the last seek point before the end of the
stream. Consequently, this new range would overlap with the previous cached
range. But since the cache joining code is written such that you join the
current range with the _next_ range (instead of the previous as it would be
needed in this case), the overlapping ranges were left alone, until seeking back
to the previous range. That was ugly, sort of harmless, and could happen in
other cases, but this avoidable case was pretty easy to trigger.
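The tracking boils down to something like this (all names
hypothetical):

    // If EOF was already seen, a seek past the known end can be served
    // from the cache: clamp to the cached end instead of issuing a low
    // level (network) seek.
    if (in->eof_seen && seek_pts >= in->cached_end_pts)
        return cache_seek(in, in->cached_end_pts, flags);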
Export them as explicitly undocumented debugging fields for the
"demuxer-cache-state" property.
Should be somewhat helpful to debug "wtf is the demuxer doing"
situations better, especially when seeking. It also becomes visible how
long the demuxer is blocked on an "old" seek when you keep seeking while
the first seek hasn't finished.
update_seek_ranges() has some special code that attempts to correctly
adjust seek ranges for subtitle tracks. (Subtitle are a nightmare for
seek ranges, because they are sparse, so using the packet list is not
enough to reliably determine the valid cached range.)
This had code like this inside the modified if statement:
    range->seek_start = MP_PTS_MAX(range->seek_start, <something>);
If seek_start is NOPTS, then seek_start will be set to <something>,
breaking some other code that checks seek_start for NOPTS to see if it's
empty. Fix this by explicitly checking whether seek_start is NOPTS
before adjusting it.
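I.e. the fixed variant looks like this (<something> stays a placeholder
for whatever bound the code computes):

    if (range->seek_start != MP_NOPTS_VALUE)
        range->seek_start = MP_PTS_MAX(range->seek_start, <something>);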
The crash happened in prune_old_packets() because the range was marked
as non-empty, yet there was no packet in it to prune. This was with
files with muxed subtitles, when seeking back to the start. This should
not happen anymore with the change. Also add an assert() to
check_queue_consistency() that checks for this specific case.
There's still some mess. In theory, subtitle tracks could be completely
empty, yet their seek range would span the entire file. Seek range
tracking of subtitle files is slightly broken (even before this change).
Some of this should probably be revisited later, including not just
using seek_start to determine whether a seek range should be pruned due
to being empty.
The x264 hack requires reading the first video packet, which in turn we
handle with a hack in demux_mkv.c to get the packet without having to
add special crap to demux.c. Another useless MKV feature (which they
enabled by default at one point and which caused many demuxers to break
completely, only to disable it again when it was too late) conflicts
with this, because we actually pass the raw block as packet contents,
instead of the data after "decompression".
Fix this by calling demux_mkv_decode().
This fixes resuming certain broken h264 files encoded by x264. See
FFmpeg commit 840b41b2a643fc8f0617c0370125a19c02c6b586 about the x264
bug itself.
Normally, the unregistered user data SEI (that contains the x264 version
string) is informational only. But libavcodec uses it to work around an
x264 bug, which was recently fixed in both libavcodec and x264. The fact
that both encoder and decoder were buggy is the reason it was not found
earlier, and there are apparently a lot of files around created by the
broken encoder. If libavcodec sees the SEI, this bug can be worked
around by using the old behavior.
If you resume a file with mpv (i.e. seeking when the file loads),
libavcodec never sees the first video packet. Consequently it has to
assume the file is not broken, and never applies the workaround,
resulting in garbage being played.
Fix this by always feeding the first video packet to the decoder on
init, and then flushing the codec (to avoid that an unwanted image is
output). Flushing the codec does not remove info such as the x264
version. We also abuse the fact that the first avcodec_send_packet()
always pushes the frame into the decoder (so we don't have to trigger
the decoder by requesting an output frame).
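The priming step, as a sketch (both libavcodec calls are real API;
where first_pkt comes from is simplified):

    #include <libavcodec/avcodec.h>

    // Let the decoder see the x264 version SEI in the first packet,
    // then flush so no unwanted frame is output. Flushing does not
    // discard the parsed x264 version info.
    static void prime_decoder(AVCodecContext *avctx, const AVPacket *first_pkt)
    {
        if (avcodec_send_packet(avctx, first_pkt) >= 0)
            avcodec_flush_buffers(avctx);
    }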
This will help with things like livestreams.
As a minor detail, subtitles are excluded, because they sometimes have
"unused" events after video and audio ends. To avoid this annoying
corner case, just ignore them.
Before this change and before the seekable stream cache became a thing,
we could possibly seek using the stream cache. But we couldn't know
whether the seek would succeed. We knew the available byte range, but
could in general not tell whether a demuxer would stay within the range
when trying to seek to a specific time position. We preferred to have
safe defaults, so seeking in streams that were detected as unseekable
were not honored. We allowed overriding this via --force-seekable=yes,
in which case it depended on your luck whether the seek would work, or
the player crapped its pants.
With the demuxer packet cache, we can tell exactly whether a seek will
work (at least if there's only 1 seek range). We can just let seeks go
through. Everything to allow this is already in place, and this commit
just moves around some minor things.
Note that the demux_seek() return value was not used before, because low
level (i.e. network level) seeks are usually asynchronous, and if they
fail, the state is pretty much undefined. We simply repurpose the return
value to signal whether cache seeking worked. If it didn't, we can just
resume playback normally, because demuxing continues unaffected, and no
decoders are reset.
This should be particularly helpful to people who for some reason stream
data into stdin via streamlink and such.
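On the player side this boils down to (sketch; simplified):

    // If the cache seek failed (unseekable stream, target not cached),
    // demuxing continues unaffected and no decoder was reset, so just
    // ignore the seek request and keep playing.
    if (!demux_seek(demuxer, seek_pts, flags))
        return;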
Caused by the relatively recent change to packet parsing. This time it
was probably triggered by lace type 0, which reduces the byte length of
a 0 sized packet to 3 (timestamp + flag) instead of 4 (lace count for
other lace types). The thing about laces is just my guess as to why it
worked for other 0 sized packets, though.
Also remove the redundant and now incorrect check below.
Fixes #5271.