This uses a different method to piece segments together. The old
approach basically changes to a new file (with a new start offset) any
time a segment ends. This meant waiting for audio/video end on segment
end, and then changing to the new segment all at once. It had a very
weird impact on the playback core, and some things (like truly gapless
segment transitions, or frame backstepping) just didn't work.
The new approach adds the demux_timeline pseudo-demuxer, which presents
an uniform packet stream from the many segments. This is pretty similar
to how ordered chapters are implemented everywhere else. It also reminds
of the FFmpeg concat pseudo-demuxer.
The "pure" version of this approach doesn't work though. Segments can
actually have different codec configurations (different extradata), and
subtitles are most likely broken too. (Subtitles have multiple corner
cases which break the pure stream-concatenation approach completely.)
To counter this, we do two things:
- Reinit the decoder with each segment. We go as far as allowing
concatenating files with completely different codecs for the sake
of EDL (which also uses the timeline infrastructure). A "lighter"
approach would try to make use of decoder mechanism to update e.g.
the extradata, but that seems fragile.
- Clip decoded data to segment boundaries. This is equivalent to
normal playback core mechanisms like hr-seek, but now the playback
core doesn't need to care about these things.
These two mechanisms are equivalent to what happened in the old
implementation, except they don't happen in the playback core anymore.
In other words, the playback core is completely relieved from timeline
implementation details. (Which honestly is exactly what I'm trying to
do here. I don't think ordered chapter behavior deserves improvement,
even if it's bad - but I want to get it out from the playback core.)
There is code duplication between audio and video decoder common code.
This is awful and could be shareable - but this will happen later.
Note that the audio path has some code to clip audio frames for the
purpose of codec preroll/gapless handling, but it's not shared as
sharing it would cause more pain than it would help.
Until now (and in mplayer traditionally), avi timestamps were handled
with a timestamp FIFO. AVI timestamps are essentially just strictly
increasing frame numbers and are not reordered like normal timestamps.
Limiting the FIFO is required because frames can be dropped. To make
it worse, frame dropping can't be distinguished from the decoder not
returning output due to increasing the buffering required for B-frames.
("Measuring" the buffering at playback start seems like an interesting
idea, but won't work as the buffering could be increased mid-playback.)
Another problem are skipped frames (packets with data, but which do
not contain a video frame).
Besides dropped and skipped frames, there is the problem that we can't
always know the delay. External decoders like MMAL are not going to
tell us. (And later perhaps others, like direct VideoToolbox usage.)
In general, this works not-well enough that I prefer the solution of
passing through AVI timestamps as DTS. This is slightly incorrect,
because most decoders treat DTS as mpeg-style timestamps, which
already include a b-frame delay, and thus will be shifted by a few
frames. This means there will be a problem with A/V sync in some
situations.
Note that the FFmpeg AVI demuxer shifts timestamps by an additional
amount (which increases after the first seek!?!?), which makes the
situation worse. It works well with VfW-muxed Matroska files, though.
On RPI, the first X timestamps are broken until the MMAL decoder "locks
on".
Will be helpful for the coming filter support. I planned on merging
audio/video decoding, but this will have to wait a bit longer, so only
remove the duplicate status codes.
vo_chain_uninit() isn't supposed to care much about the decoder
(although decoders and outputs still go strictly together, so there is
not much of an actual difference now).
Also unset track.d_video correctly.
Remove a stale declaration from dec_video.h as well.
This moves some code related to decoding from video.c to dec_video.c,
and also removes some accesses to dec_video.c from the filtering code.
dec_video.ch is starting to make sense, and simply returns video frames
from a demuxer stream. The API exposed is also somewhat intended to be
easily changeable to move decoding to a separate thread, if we ever want
this (due to libavcodec already being threaded, I don't see much of a
reason, but it might still be helpful).
The aspect ratio calculations are cached (mainly so that aspect ratio
related messages are not logged on every frame). The cache is not clared
anymore when video filters are reconfigured, but changing the
video-aspect-ratio property relied on it. Make it explicit.
Fixes#2714.
Lots of noise to remove the vfilter/vo fields from dec_video.
From now on, video filtering and output will still be done together,
summarized under struct vo_chain.
There is the question where exactly the vf_chain should go in such a
decoupled architecture. The end goal is being able to place a "complex"
filter between video decoders and output (which will culminate in
natural integration of A->V filters for natural integration of
libavfilter audio visualizations). The vf_chain is still useful for
"final" processing, such as format conversions and deinterlacing. Also,
there's only 1 VO and 1 --vf option. So having 1 vf_chain for a VO seems
ideal, since otherwise there would be no natural way to handle all these
existing options and mechanisms.
There is still some work required to truly decouple decoding.
Instead of handling this on filter chain reinit, do it directly after
the decoder. This makes the code less entangled. In particular, this
gets rid of the really weird "override params" concept in the video
filter code.
The last_format/fixed_formats have some redundance with decoder_output,
but unfortunately the latter has a slightly different use.
Use the first encountered packet PTS/DTS as base, instead of the last
one. This does not add the amount of frames buffered in the codec to the
PTS offset, and thus is better.
Also, don't add the frame time if there was no decoded frame yet. The
first frame should obviously have the timestamp of the first packet
(going by this heuristic).
While b-frame reordering limits the maximum required number to around
16, the number of additionally buffered frames can be much higher.
Guess when this actually matters? (For the libavcodec MMAL wrapper.)
Useless. Sometimes it might be useful to make some extremely broken
files work, but on the other hand --no-correct-pts is sufficient for
these cases.
While we still need some of the code for AVI, the "auto" mode in
particular inflated the size of the code.
When showing cover art, the decoding logic pretends that the source has
an infinite number of frames. This slightly simplifies dealing with
filter data flow. It was done by feeding the same packet repeatedly to
the decoder (each decode run produces new output).
Change this by decoding once at the video initialization. This is easier
to follow, and increases robustness in case of broken images. Usually,
we try to tolerate decoding errors, so decoding normally continues, but
in this case it would just burn the CPU for no reason.
Fixes#2056.
Remove the old implementation for these properties. It was never very
good, often returned very innaccurate values or just 0, and was static
even if the source was variable bitrate. Replace it with the
implementation of "packet-video-bitrate". Mark the "packet-..."
properties as deprecated. (The effective difference is different
formatting, and returning the raw value in bits instead of kilobits.)
Also extend the documentation a little.
It appears at least some decoders (sipr?) need the
AVCodecContext.bit_rate field set, so this one is still passed through.
DVD and Bluray (and to some extent cdda) require awful hacks all over
the codebase to make them work. The main reason is that they act like
container, but are entirely implemented on the stream layer. The raw
mpeg data resulting from these streams must be "extended" with the
container-like metadata transported via STREAM_CTRLs. The result were
hacks all over demux.c and some higher-level parts.
Add a "disc" pseudo-demuxer, and move all these hacks and special-cases
to it.
The i_bps members of the sh_audio and dev_video structs are mostly used
for displaying the average audio and video bitrates. Keeping them in
bits-per-second avoids truncating them to bytes-per-second and changing
them back lateron.
Change how the video decoding loop works. The structure should now be a
bit easier to follow. The interactions on format changes are (probably)
simpler. This also aligns the decoding loop with future planned changes,
such as moving various things to separate threads.
Until now, the player didn't care to drain frames on video reconfig.
Instead, the VO was reconfigured (i.e. resized) before the queued frames
finished displaying. This can for example be observed by passing
multiple images with different size as mf:// filename. Then the window
would resize one frame before image with the new size is displayed. With
--vo=vdpau, the effect is worse, because this VO queues more than 1
frame internally.
Fix this by explicitly draining buffered frames before video reconfig.
Raise the display time of the last frame. Otherwise, the last frame
would be shown for a very short time only. This usually doesn't matter,
but helps when playing image files. This is a byproduct of frame
draining, because normally, video timing is based on the frames queued
to the VO, and we can't do that with frames of different size or format.
So we pretend that the frame before the change is the last frame in
order to time it. This code is incorrect though: it tries to use the
framerate, which often doesn't make sense. But it's good enough to test
this code with mf://.
This should help fixing some issues (like not draining video frames
correctly on reinit), as well as decoupling the decoder, filter chain,
and VO code.
I also wanted to make the hardware video decoding fallback work properly
if software-only video filters are inserted. This currently has the
issue that the fallback is too violent, and throws away a bunch of
demuxer packets needed to restart software decoding properly. But
keeping "backup" packets turned out as too hacky, so I'm not doing this,
at least not yet.
This adds vf_chain, which unlike vf_instance refers to the filter chain
as a whole. This makes the filter API less awkward, and will allow
handling format negotiation better.
Using --start with files that use DTS only, or which simply have broken
PTS timestamps, would incorrectly drop frames and possibly not execute
the seek correctly.
Add yet another heuristic to detect this. The intent is that --start and
hr-seeks in general should work correctly, but in order to keep things
fast, we still want to allow frame dropping during hr-seek if there are
no problems doing so. Do this by disabling frame dropping by default,
but re-enabling it if there are no problems found for a while. As a
consequence, --start might be somewhat slower, but normal user
interaction should remain as fast as before.
Note that there's something subtle about the added code: the
has_broken_packet_pts field is checked even before the first packet is
fed to dec_video.c, so the field must not be set to 0 right on start.
It's not initially set to 0 anyway, because the heuristic requires
decoding some images before enabling frame drop anyway.
Note 2: it's not clear whether frame dropping during hr-seek really
helps; I didn't benchmark it.
The d_video->pts field was a bit strange. The code overwrote it multiple
times (on decoding, on filtering, then once again...), and it wasn't
really clear what purpose this field had exactly. Replace it with the
mpctx->video_next_pts field, which is relatively unambiguous.
Move the decreasing PTS check to dec_video.c. This means it acts on
decoder output, not on filter output. (Just like in the previous commit,
assume the filter chain is sane.) Drop the jitter vs. reset semantics;
the dec_video.c determined PTS never goes backwards, and demuxer
timestamps don't "jitter".
Refactor the PTS handling code to make it cleaner, and to separate the
bits that use PTS sorting.
Add a heuristic to fall back to DTS if the PTS us non-monotonic. This
code is based on what FFmpeg/Libav use for ffplay/avplay and also
best_effort_timestamp (which is only in FFmpeg). Basically, this 1. just
uses the DTS if PTS is unset, and 2. ignores PTS entirely if PTS is non-
monotonic, but DTS is sorted.
The code is pretty much the same as in Libav [1]. I'm not sure if all of
it is really needed, or if it does more than what the paragraph above
mentions. But maybe it's fine to cargo-cult this.
This heuristic fixes playback of mpeg4 in ogm, which returns packets
with PTS==DTS, even though the PTS timestamps should follow codec
reordering. This is probably a libavformat demuxer bug, but good luck
trying to fix it.
The way vd_lavc.c returns the frame PTS and DTS to dec_video.c is a bit
inelegant, but maybe better than trying to mess the PTS back into the
decoder callback again.
[1] https://git.libav.org/?p=libav.git;a=blob;f=cmdutils.c;h=3f1c667075724c5cde69d840ed5ed7d992898334;hb=fa515c2088e1d082d45741bbd5c05e13b0500804#l1431
These used the suffix _resync_stream, which is a bit misleading. Nothing
gets "resynchronized", they really just reset state.
(Some audio decoders actually used to "resync" by reading packets for
resuming playback, but that's not the case anymore.)
Also move the function in dec_video.c to the top of the file.
This means the code that tries to figure out the timestamp from
demuxer and decoder output is now all in dec_video.c. We set the
final timestamp on the returned image (mp_image.pts), as well as
the d_video->pts field.
The way the player uses d_video->pts field is still a bit messy. Maybe
this could be cleaned up later.
Now the --no-correct-pts mode is like the normal mode, just with
different timestamp calculations. The semantics should be about the
same as before this commit.
Before this commit, this mode estimated the frame time by subtracting
successive packet PTS values. This is complete non-sense for video
codecs which use reordering. The code compensated frame times for these
non-sense using the FPS value, but confused the rest of the player with
non-sense jumping around timestamps. So, all in all this mode is not
very useful.
Repurpose this mode for fixed frame rate playback. This gives almost the
same behavior as the old mode with forced framerate (--fps option). The
result is simpler and often more robust.
Instead of passing the PTS as separate field, pass it as part of the
usual data structures. Basically, this removes strange artifacts from
the API. (It's not finished, though: the final decoded PTS goes through
strange paths, and filter_video() finally overwrites the decoded
mp_image's pts field with it.)
We also stop using libavcodec's reordered_opaque fields, and use
AVPacket.pts and AVFrame.pkt_pts. This is slightly unorthodox, because
these pts fields are not "really" opaque anymore, yet we treat them as
such. But the end result should be the same, and reordered_opaque is
marked as partially deprecated (it's not clear whether it's really
deprecated).
If the --fps option was given (MPOpts->force_fps), the demuxer FPS value
was overwritten with the forced value. This was fine, since the demuxer
value wasn't needed anymore. But with the recent changes not to write to
the demuxer stream headers, we don't want to do this anymore. So
maintain the (forced/updated) FPS value in dec_video->fps.
The removed code in loadfile.c is probably redundant, and an artifact
from past refactorings.
Note that sub.c will now always use the demuxer FPS value, instead of
the user override value. I think this is fine, because it used the
demuxer's video size values too. (And it's rare that these values are
used at all.)
Now the actual decoder doesn't need to care about this anymore, and it's
handled in generic code instead. This simplifies vd_lavc.c, and in
particular we don't need to detect format changes in the old way
anymore.
The only reason why these structs were dynamically allocated was to
avoid recursive includes in stheader.h, which is (or was) a very central
file included by almost all other files. (If a struct is referenced via
a pointer type only, it can be forward referenced, and the definition of
the struct is not needed.) Now that they're out of stheader.h, this
difference doesn't matter anymore, and the code can be simplified.
Also sneak in some sanity checks.
This is similar to the sh_audio commit.
This is mostly cosmetic in nature, except that it also adds automatical
freeing of the decoder driver's state struct (which was in
sh_video->context, now in dec_video->priv).
Also remove all the stheader.h fields that are not needed anymore.
This drops the --pp option, which was probably broken for a while. The
option automatically inserted the "pp" filter. The value passed to it
was ignored (which is probably broken, it always selected maximal
quality).
Inserting this filter can be done simply with --vf=pp, so this is not
needed anymore.
This means most code accessing this struct must now include hwdec.h
instead of dec_video.h. I just put it into dec_video.h at first because
I thought a separate file would be a waste, but it's more proper to do
it this way, as there are too many files which include dec_video.h only
to get the mp_hwdec_info definition.
vo_opengl always loads the hwdec backend lazily, so hwdec_request_api()
has to be called to possibly load it. This makes vf_vavpp work with
software decoding. (Hardware decoding loads the backend before the
filter is initialized, so this case is different.)
Also, the VFCTRL_GET_HWDEC_INFO call doesn't need to be checked. If it
fails, the info will be left blank.
Most hardware decoding APIs provide some OpenGL interop. This allows
using vo_opengl, without having to read the video data back from GPU.
This requires adding a backend for each hardware decoding API. (Each
backend is an entry in gl_hwdec_vaglx[].) The backends expose video data
as a set of OpenGL textures.
Add infrastructure to support this. The next commit will add support for
VA-API.
Until now, video output levels (obscure feature, like using TV screens
that require RGB output in limited range, similar to YUY) still required
handling of VOCTRL_SET_YUV_COLORSPACE. Simplify this, and use the new
mp_image_params code. This gets rid of some code. VOCTRL_SET_YUV_COLORSPACE
is not needed at all anymore in VOs that use the reconfig callback. The
result of VOCTRL_GET_YUV_COLORSPACE is now used only used for the
colormatrix related properties (basically, for display on OSD). For
other VOs, VOCTRL_SET_YUV_COLORSPACE will be sent only once after config
instead of twice.