Essentially, this lets video.c decide whether to consider a video track
cover art, instead of having the decoder wrapper use the lower level
sh_stream flag.
Some pain because of the dumb threading shit. Moving the code further
down to make some of it part of the lock should not change behavior,
although it could (framedrop nonsense).
This commit should not change actual behavior, and is only preparation
for the following commit.
I don't see the point of this. Not doing it may defer an error to later.
That's OK? For now, it seems better to reduce the encoding internal API.
If someone can demonstrate that this is needed, I might reimplement it
in a different way.
This replaces the two buffers (ao_chain.ao_buffer in the core, and
buffer_state.buffers in the AO) with a single queue. Instead of having a
byte based buffer, the queue is simply a list of audio frames, as output
by the decoder. This should make dataflow simpler and reduce copying.
It also attempts to simplify fill_audio_out_buffers(), the function I
always hated most, because it's full of subtle and buggy logic.
Unfortunately, I got assaulted by corner cases, dumb features (attempt
at seamless looping, really?), and other crap, so it got pretty
complicated again. fill_audio_out_buffers() is still full of subtle and
buggy logic. Maybe it got worse. On the other hand, maybe there really
is some progress. Who knows.
Originally, the data flow parts was meant to be in f_output_chain, but
due to tricky interactions with the playloop code, it's now in the dummy
filter in audio.c.
At least this improves the way the audio PTS is passed to the encoder in
encoding mode. Now it attempts to pass frames directly, along with the
pts, which should minimize timestamp problems. But to be honest, encoder
mode is one big kludge that shouldn't exist in this way.
This commit should be considered pre-alpha code. There are lots of bugs
still hiding.
This mode drops or repeats audio data to adapt to video speed, instead
of resampling it or such. It was added to deal with SPDIF. The
implementation was part of fill_audio_out_buffers() - the entire
function is something whose complexity exploded in my face, and which I
want to clean up, and this is hopefully a first step.
Put it in a filter, and mess with the shitty glue code. It's all sort of
roundabout and illogical, but that can be rectified later. The important
part is that it works much like the resample or scaletempo filters.
For PCM audio, this does not work on samples anymore. This makes it much
worse. But for PCM you can use saner mechanisms that sound better. Also,
something about PTS tracking is wrong. But not wasting more time on
this.
Can be useful to force it to adapt to extreme speed changes, while a
higher limit would just use a fraction closer to the original video
speed.
Probably useful for testing only.
The wakeup at the end of VO frame rendering seems redundant, because
after rendering almost no state changes. The player core can queue a new
frame once frame rendering begins, and there's a separate wakeup for
this. The only thing that actually changes is in->rendering. The only
thing that seems to depend on it and can trigger a wakeup is the
vo_still_displaying() function. Change it so that it needs an explicit
call to a new API function, so we can avoid wakeups in the common case.
The vo_still_displaying() code is mostly just moved around due to
locking and for avoiding forward declarations.
Also a somewhat risky change (tasty new bugs).
This should be unnecessary, since the VO itself performs wakeups once a
new frame can be queued. The only situation I can think of where this
might be required are EOF situations (which are always strange).
If I'm wrong, there'll be fun new bugs, probably causing frame drops or
temporary stalls.
I may (optionally) move decoding to a separate thread in a future
change. It's a bit attractive to move the entire decoder wrapper to
there, so if the demuxer has a new packet, it doesn't have to wake up
the main thread, and can directly wake up the decoder. (Although that's
bullshit, since there's a queue in between, and libavcodec's
multi-threaded decoding plays cross-threads ping pong with packets
anyway. On the other hand, the main thread would still have to shuffle
the packets around, so whatever, just seems like better design.)
As preparation, there shouldn't be any mutable state exposed by the
wrapper. But there's still a large number of corner-caseish crap, so
just use setters/getters for them. This recorder thing will inherently
not work, so it'll have to be disabled if threads are used.
This is a bit painful, but probably still the right thing. Like
speculatively pulling teeth.
Try to deal with various corner cases. But when I fix one thing, another
thing breaks. (And it's 50/50 whether I find the breakage immediately or
a few months later.) So results may vary.
The default for--hr-seek is changed to "default" (not creative enough to
find a better name). In this mode, audio seeking is exact if there is no
video, or if the video has only a single frame. This change is actually
pretty dumb, since audio frames are usually small enough that exact
seeking does not really add much. But it gets rid of some weird special
cases.
Internally, the most important change is that is_coverart and is_sparse
handling is merged. is_sparse was originally just a special case for
weird .ts streams that have the corresponding low-level flag set. The
idea is that they're pretty similar anyway, so this would reduce the
number of corner cases. But I'm not sure if this doesn't break the
original intended use case for it (I don't have a sample anyway).
This changes last-frame handling, and respects the duration of the last
frame only if audio is disabled. This is mostly "coincidental" due to
the need to make seeking past EOF trigger player exit, and is caused by
setting STATUS_EOF early. On the other hand, this might have been this
way before (see removed chunk close to it).
Hr-seek past the last frame instantly enters EOF, which means
handle_playback_time() will not set playback_pts to the video PTS (as
all video frames are skipped), which leads to the playback time being
taken from the last seek target. This results in confusing behavior,
especially since the seek time will be clipped to the file duration for
display, but not for further relative seeks.
Obviously, the time should be set to the last video frame, so use the
last video frame as fallback if both audio and video have ended. Also,
since the same problem exists with audio-only playback, add a fallback
for audio PTS too. We don't know which was the "last" fragment of media
played (to decide whether to use the audio or video PTS as the
fallback), but it doesn't matter since the maximum works.
This could lead to some undesired effects. In particular the audio PTS
is basically a bad guess, and is for example not clipped against --end.
(But the ridiculous way audio syncing and clamping currently works, I'm
not going to touch that shit unless I rewrite it completely.) The cover
art case is slightly broken: using --keep-open with keyframe seeks will
result in 0 as playback PTS (the video PTS). OK, who cares, it got late.
Also casually get rid of last_vo_pts, since that barely made any sense
at all.
Fixes: #7487
The seeking logic saves the last video frame it has seen (for example
for being able to seek to the last frame, or backstepping).
Unfortunately, the frame was fed back to the filtering pipeline in
situations when it shouldn't have. Then it's an out of order frame,
because it really saves the last _discarded_ frame.
For example, seeking to the end of a file with --keep-open, shift+up,
shift+down => invalid video pts warning due to saved_frame being fed
back.
Explicitly discard saved_frame when it's obviously not needed anymore.
The removed accesses to "r" are strictly speaking unrelated (just
const-propagating them).
Due to asynchronicity, we generally can't guarantee that a video frame
matches up with other events such as playback time change exactly (since
decoding, presentation, and property update all happen at different
times). This is a complaint in the referenced bug report, where
screenshot filenames in each-frame screenshot did not use the correct
timestamp, and instead was lagging behind by 1 frame.
But in this case, synchronicity was already pretty much forced with wait
calls. The only problem was that the playback time was updated at a
later time, which results in the observed 1 frame lag. Fix this by
moving the place where the screenshot is triggered in this mode.
Normal screenshots may still have the old problem. There is no effort
made to guarantee the timestamps absolutely line up, same as with the
OSD. (If you want a guarantee, you need to use a video filter, such as
libavfilter's drawtext. These will obviously use the proper timestamp,
instead of going through the somewhat asynchronous property etc. system
in the player frontend.)
Fixes: #7433
The VO underrun detection (just a weak heuristic) added in commit f26dfb
flagged the underrun state every time it was checked, and since the
check happened in every playloop iteration, this caused the playloop to
wake up itself on every iteration. It burned an entire core while in
this state.
Fix this by flagging this condition only once (as it should be), and
requiring that a frame is displayed to trigger it again. This makes it
work similar as the audio underrun check.
The bug report referenced below says --demuxer-thread=no avoided this.
This is because the demuxer layer doesn't do proper underrun reporting
if the reader thread is disabled.
Fixes: #7259
If you have a normal file with audio and video, and keep "spamming"
forward hr-seeks, the player just kept showing the last video frame
instead of exiting or playing the next file. This started happening
since commit 6bcda94cb. Although not a bug per se, it was odd, and very
user-noticable.
The main problem was that the pending seek command was processed before
the EOF was "noticed". Processing the command reset everything, so the
player did not terminate playback, but repeated the seek.
This commit restores the old behavior.
For one, it makes video return the correct status (video.c). The
parameter is a bit ugly, but better than duplicating the logic or having
another MPContext field. (As a minor detail, setting r=VD_EOF makes sure
have_new_frame() returns true, rather than going through another
iteration or whatever the hell will happen instead, which would clobber
logical_eof.)
Another thing is making the seek logic actually wait until the seek
outcome has been determined if audio is also active. Audio needs to wait
for video in order to get the video seek target position. (Which in turn
is because hr-seek still "snaps" to video frames. You can't seek in
between two frames, so audio can't just use the seek target, but always
has to wait on the timestamp of the video frame. This has other
disadvantages and is a misdesign, but not something I'll fix today.)
In theory, this might make hr-seeks less responsive, because it needs to
fully decode/filter the audio too, but in practice most time is spent on
video, which had to be fully decoded before this change. (In general,
hr-seek could probably just show a random frame when a queued hr-seek
overrides the current hr-seek, which would probably lead to a better
user experience, but that's out of scope.)
Fixes: #7206
Hr-seek has some sort of tolerance against timestamps, where it allows
for up to 5ms deviation. This means it can work only for videos with up
to 200 FPS framerate. There were complains about how it doesn't work
with videos beyond some high fps. (1000 was mentioned, although that
sounds more like it's about the limit that .mkv has.)
I suspect this is because otherwise, it might be hard to hit a timestamp
with --start, which specifies timestamps as integer, and thus will most
likely never represent a timestamp exactly. Another part of the problem
is that mpv uses 64 bit floats for timestamps, so fractional parts are
never represented exactly. (Both the "tolerance" and using floats for
timestamps were things introduced before my time.)
Anyway, in the backstep case, we can be relatively sure that the
timestamp will be exact (as in, the same unmodified value that was
returned by the filter chain), so we can make an exception for that, in
order to fix backstep.
Untested. (For that you have users.)
May help with #7208.
Until now, this didn't work, since the external image had pts 0; so
enabling video at a later time did nothing, because the image was
discarded. Since hrseek now ends on the last frame (instead of nothing),
reusing the hrseek mechanism solves this, and we don't even need to
treat the cursed coverart case separately.
See what the added code comment says. Normally when this is needed, it's
the cover art case. But this flag is not set when using an external
image. This gives weird seek behavior, because the frame will be
"normally" displayed for its determined duration, and during normal
video playback, the video pts will be used - which is always 0 here.
This should happen only if audio is active. Otherwise, we're more or
less in image viewer mode, where the image should be displayed for a
configured duration.
This gives much better behavior in general, and is what we want if video
somehow ends earlier than audio.
A common special is using an audio file with an external image file.
This commit makes things like switching aspect ratio work (provided the
demuxer for the image behaves correctly, which currently isn't the case
with demux_mf.c). Since the image file had timestamp 0, it was usually
skipped by hr-seek, and changed properties weren't applied to it at the
start of the filter chain.
It appears commit 4ad68d9452 broke handling the first video
frame duration through roundabout ways (I think because the duration of
the first frame was now available at all in the normal case). The first
frame was cut short, which showed up especially with looping, or if the
file had a low FPS.
This questionable change seems to fix it without breaking any other
known cases => push and call it a day.
The display-sync mode did not have this problem.
Fixes: #7150
On a audio/video desync by more than 0.5 seconds, display-sync mode was
disabled, and not enabled again (until playback restart, e.g. a seek).
The idea was that it this only happens when this playback mode is broken
and can't perform well anyway (A/V desync is a clear indication that
something is very wrong). Instead of behaving like a god damn POS, it
should revert to the more robust audio-sync mode.
Unfortunately, this could happen sporadically due to temporary system
performance problems, such as toggling fullscreen. Users didn't like
this, and asked for a function to disable it, or to recover in some
other way.
This mechanism is questionable anyway. If an ignorant user enables
display-sync, and encounters problems with it (without being able to
determine that display-sync is messing up), the player will still behave
like a POS on every playback, and even after every seek. It might
actually be helpful to fail more consistently. Also, I've found that
it's sill relatively reliable anyway even without this mechanism.
So just remove the fallback.
Fixes: #7048
The --cache-pause feature (enabled by default) will pause playback for a
while if network runs out of data. If this is not done, then playback
will go on frame-wise (as packets are slowly read from the network and
then instantly decoded and displayed). This feature is actually useless,
as you won't get nice playback no matter what if network is too slow,
but I guess I still prefer this behavior for some reason.
This commit changes this behavior from using the demuxer cache state
only, to trying to use underrun information from the AO/VO. This means
if you have a very large audio buffer, then cache-pausing will trigger
once that buffer is depleted, which will be some time _after_ the
demuxer cache has run out.
This requires explicit support from the AO. Otherwise, the behavior
should be mostly the same as before this commit.
This does not care about the AO buffer. In theory, the AO may underrun,
then the player will write some data to the AO buffer, then the AO will
recover and play this bit of data, then the player will probably trigger
the cache-pause behavior. The probability of this happening should be
pretty low, so I will hold off fixing this until the next refactor of
the AO chain (if ever).
The VO underflow detection was devised and tested in 5 minutes, and may
not be correct. At least I'm fairly sure that the combination of all the
factors should make incorrect behavior relatively unlikely, but problems
are possible.
Also, the demux_reader_state.underrun field may be inaccurate. It's only
the present state at the time demux_get_reader_state() was called, and
may exclude past underruns. In theory, this could cause "close" cases to
be missed. Then you might get an audio underrun without cache-pausing
acting on it. If the stars align, this could happen multiple times in
the row, effectively making this feature not work.
The most user-visible consequence of this change is that the user
will now see an AO underrun warning every time the cache runs out.
Maybe this cache-pause feature should just be removed...
Unless --video-latency-hacks, always decode 2 frames on playback
restart. This in turn will always compute the correct frame duration
(even for the first frame), which in turn happens to fix that playback
with an image at the beginning breaks display.
If a still image precedes video, and the size/format of the frame is
different from that of the video following it, the incorrect frame
duration caused vo_reconfig2() to be called early, causing the window to
resize, and the renderer to clear the image to black. Specifically, it
hit the default value of 1 second duration (for still images), so the
image was displayed for 1 second, and changed to black until the next
proper video frame was displayed.
Normally this does not happen. Even if a video file displays still
images, it normally repeats the still image at the video's FPS (which is
sane). But you can construct such files, or use EDL to construct
something similarly behaving.
This change may increase seek latency a bit in audio video-sync mode
(the default). It needs to wait until 2 frames are decoded, before it
bothers to display the first frame. This is done even when seeking. In
theory it might be good to introduce a "seek preview" mode, which shows
the target image without all the preparations needed for starting
playback. (For example, it could not decode audio.) But since I'm using
video-sync=display-resample, which already needed to always decode 2
frames, I don't think this is a terribly high priority, nor do I
consider the slightly slower seeking a regression.
Fixes: #6765
Track switching doesn't run reset_playback_state(), so a track enabled
at runtime during backward playback would lead to a messed up state.
This commit just does a bad code monkey fix to this. It feels like there
needs to be a much better way to propagate this state.
We need to transform the timestamp returned by get_play_end_pts().
I considered making it return the transformed timestamp directly. There
are 4 callers; 2 need a transformed timestamps, 2 don't. So I guess it
doesn't matter.
E.g. "mpv null:// --demuxer=rawvideo" will "hang" by waiting for video
EOF forever. It's not signalled correctly because of the last-frame
corner case, which attempts to wait until the current frame is finally
displayed (which is signalled by whether a new frame can be queued, see
commit 1a339fa09d for some details). If no frame was ever queued, the VO
is not configured, and vo_is_ready_for_frame() never returns true.
Fix this by using vo_has_frame(), which seems to be exactly the correct
thing we need.
Basically reimplement the async behavior on top of the async command
code. With this, all screenshot commands are async, and the "async"
prefix basically does nothing. The prefix now behaves exactly like with
other commands that use spawn_thread.
This also means using the prefix in the preset input.conf is pointless
(without effect) and misleading, so remove that.
The each_frame mode was actually particularly painful in making this
change, since the player wants to block for it when writing a
screenshot, and generally doesn't fit into the new infrastructure. It
was still relatively easy to reimplement by copying the original command
and then repeating it on each frame. The waiting is reentrant now, so
move the call in video.c to a "safer" spot.
One way to observe how the new semantics interact with everything is
using the mpv repl script and sending a screenshot command through it.
Without async flag, the script will freeze while writing the screenshot
(while playback continues), while with async flag it continues.
Fixes several issues playing back mpegts with video streams marked
as having "still images". For example, see this video which has
frames only every 6s: https://s3.amazonaws.com/tmm1/music-choice.ts
Changes include:
- start playback right away, without waiting for first video frame
- do not consider the sparse video stream in demuxer underrun detection
- do not require multiple video frames for the VO
- use audio as the master stream for demuxer metadata events
- use audio stream for playback time
Signed-off-by: Aman Gupta <aman@tmm1.net>
Until recently, ao_lavc and vo_lavc started encoding whenever the core
happened to send them data. Since audio and video are not initialized at
the same time, and the muxer was not necessarily opened when the first
encoder started to produce data, the resulting packets were put into a
queue. As soon as the muxer was opened, the queue was flushed.
Change this to make the core wait with sending data until all encoders
are initialized. This has the advantage that we don't need to queue up
the packets.
The video timing code could just decide that EOF was reached before it
was displayed. This is not really a problem for normal playback (if you
use something like --keep-open it'd show the last frame anyway,
otherwise it'd at best flash it on screen before destroying the window).
But in encode mode, it really matters, and makes the difference between
having one frame more or less in the output file.
Fix this by waiting for the VO before starting the real EOF.
vo_is_ready_for_frame() is normally used to determine when the VO frame
queue has enough space to send a new frame. Since the VO frame queue is
currently at most 1 frame, it being signaled means the remaining frame
was consumed and thus sent to the VO driver. If it returns false, it
will wake up the playloop as soon as the state changes.
I also considered using vo_still_displaying(), but it's not reliable,
because it checks the realtime of the frame end display time.
The main change is that we wait with opening the muxer ("writing
headers") until we have data from all streams. This fixes race
conditions at init due to broken assumptions in the old code.
This also changes a lot of other stuff. I found and fixed a few API
violations (often things for which better mechanisms were invented, and
the old ones are not valid anymore). I try to get away from the public
mutex and shared fields in encode_lavc_context. For now it's still
needed for some timestamp-related fields, but most are gone. It also
removes some bad code duplication between audio and video paths.
1. I want to get away from mp_image_params (maybe).
2. For encoding mode, it's convenient to get the nominal_fps, which is
a mp_image field, and not in mp_image_params.
There is some sort-of awkwardness here, because option access needs to
happen in a synchronized manner, and the framedrop flag is not in the VO
option struct. Remove the mp_read_option_raw() call and the awkward
change notification via VO_EVENT_WIN_STATE from command.c, and pass it
through as new vo_frame flag.
The playback start logic explicitly waits until the first frame has been
displayed. Usually this will introduce a wait of 1 vsync. For normal
playback this doesn't matter, but with respect to low latency needs,
this only leads to additional data getting queued up in the demuxer or
network buffers.
Another thing is that the timing logic decodes 1 frame ahead (= 1 frame
extra latency) to determine the exact duration of a frame.
To be fair, there doesn't really seem to be a hard reason why this is
needed. With the current code, enabling the option does lead to A/V
desync sometimes (if the demuxer FPS is too inaccurate), and also frame
drops at playback start in some situations. But this all seems to be
avoidable, if the timing logic were to be rewritten completely, which
should probably happen in the future. Thus the new option comes with the
warning that it can be removed any time. This is also why the option has
"hack" in the name.
The purpose of the new API is to make it useable with other APIs than
OpenGL, especially D3D11 and vulkan. In theory it's now possible to
support other vo_gpu backends, as well as backends that don't use the
vo_gpu code at all.
This also aims to get rid of the dumb mpv_get_sub_api() function. The
life cycle of the new mpv_render_context is a bit different from
mpv_opengl_cb_context, and you explicitly create/destroy the new
context, instead of calling init/uninit on an object returned by
mpv_get_sub_api().
In other to make the render API generic, it's annoyingly EGL style, and
requires you to pass in API-specific objects to generic functions. This
is to avoid explicit objects like the internal ra API has, because that
sounds more complicated and annoying for an API that's supposed to never
change.
The opengl_cb API will continue to exist for a bit longer, but
internally there are already a few tradeoffs, like reduced
thread-safety.
Mostly untested. Seems to work fine with mpc-qt.
This fixes playback stalls on some mediacodec hardware decoders,
which expect that frame buffers will be rendered and returned back
to the decoder as soon as possible.
Specifically, the issue was observed on an NVidia SHIELD Android TV,
only when playing an H264 sample which switched between interlaced
and non-interlaced frames. On an interlacing change, the decoder
expects all outstanding frames would be returned to it before it
would emit any new frames. Since a single extra frame always remained
buffered by mpv, playback would stall. After this commit, no extra
frames are buffered by mpv when using vo_mediacodec_embed.
Use the decoder wrapper that was introduced for video. This removes all
code duplication the old audio decoder wrapper had with the video code.
(The audio wrapper was copy pasted from the video one over a decade ago,
and has been kept in sync ever since by the power of copy&paste. Since
the original copy&paste was possibly done by someone who did not answer
to the LGPL relicensing, this should also remove all doubts about
whether any of this code is left, since we now completely remove any
code that could possibly have been based on it.)
There is some complication with spdif handling, and a minor behavior
change (it will restrict the list of codecs to spdif if spdif is to be
used), but there should not be any difference in practice.
Move dec_video.c to filters/f_decoder_wrapper.c. It essentially becomes
a source filter. vd.h mostly disappears, because mp_filter takes care of
the dataflow, but its remains are in struct mp_decoder_fns.
One goal is to simplify dataflow by letting the filter framework handle
it (or more accurately, using its conventions). One result is that the
decode calls disappear from video.c, because we simply connect the
decoder wrapper and the filter chain with mp_pin_connect().
Another goal is to eventually remove the code duplication between the
audio and video paths for this. This commit prepares for this by trying
to make f_decoder_wrapper.c extensible, so it can be used for audio as
well later.
Decoder framedropping changes a bit. It doesn't seem to be worse than
before, and it's an obscure feature, so I'm content with its new state.
Some special code that was apparently meant to avoid dropping too many
frames in a row is removed, though.
I'm not sure how the source code tree should be organized. For one,
video/decode/vd_lavc.c is the only file in its directory, which is a bit
annoying.
Get rid of the old vf.c code. Replace it with a generic filtering
framework, which can potentially handle more than just --vf. At least
reimplementing --af with this code is planned.
This changes some --vf semantics (including runtime behavior and the
"vf" command). The most important ones are listed in interface-changes.
vf_convert.c is renamed to f_swscale.c. It is now an internal filter
that can not be inserted by the user manually.
f_lavfi.c is a refactor of player/lavfi.c. The latter will be removed
once --lavfi-complex is reimplemented on top of f_lavfi.c. (which is
conceptually easy, but a big mess due to the data flow changes).
The existing filters are all changed heavily. The data flow of the new
filter framework is different. Especially EOF handling changes - EOF is
now a "frame" rather than a state, and must be passed through exactly
once.
Another major thing is that all filters must support dynamic format
changes. The filter reconfig() function goes away. (This sounds complex,
but since all filters need to handle EOF draining anyway, they can use
the same code, and it removes the mess with reconfig() having to predict
the output format, which completely breaks with libavfilter anyway.)
In addition, there is no automatic format negotiation or conversion.
libavfilter's primitive and insufficient API simply doesn't allow us to
do this in a reasonable way. Instead, filters can use f_autoconvert as
sub-filter, and tell it which formats they support. This filter will in
turn add actual conversion filters, such as f_swscale, to perform
necessary format changes.
vf_vapoursynth.c uses the same basic principle of operation as before,
but with worryingly different details in data flow. Still appears to
work.
The hardware deint filters (vf_vavpp.c, vf_d3d11vpp.c, vf_vdpaupp.c) are
heavily changed. Fortunately, they all used refqueue.c, which is for
sharing the data flow logic (especially for managing future/past
surfaces and such). It turns out it can be used to factor out most of
the data flow. Some of these filters accepted software input. Instead of
having ad-hoc upload code in each filter, surface upload is now
delegated to f_autoconvert, which can use f_hwupload to perform this.
Exporting VO capabilities is still a big mess (mp_stream_info stuff).
The D3D11 code drops the redundant image formats, and all code uses the
hw_subfmt (sw_format in FFmpeg) instead. Although that too seems to be a
big mess for now.
f_async_queue is unused.
I've decided that MP_TRACE means “noisy spam per frame”, whereas
MP_DBG just means “more verbose debugging messages than MSGL_V”.
Basically, MSGL_DBG shouldn't create spam per frame like it currently
does, and MSGL_V should make sense to the end-user and provide mostly
additional informational output.
MP_DBG is basically what I want to make the new default for --log-file,
so the cut-off point for MP_DBG is if we probably want to know if for
debugging purposes but the user most likely doesn't care about on the
terminal.
Also, the debug callbacks for libass and ffmpeg got bumped in their
verbosity levels slightly, because being external components they're a
bit less relevant to mpv debugging, and a bit too over-eager in what
they consider to be relevant information.
I exclusively used the "try it on my machine and remove messages from
MSGL_* until it does what I want it to" approach of refactoring, so
YMMV.
update_subtitles() makes sure all subtitle packets at/before the given
PTS have been read and processed. Normally, this function is only called
before sending a frame to the VO. This is too late for vf_sub, which
expects the subtitles to be updated before feeding a frame to the
filters.
Apparently this was specifically a problem for the first frame.
Subsequent frames might have been ok due to general prefetching.
(This will fail anyway, should a filter dare to add an offset to the
timestamps of the filered frames before they pass to vf_sub.)
Fixes#5194.