Will be helpful for the upcoming filter support. I planned on merging
audio/video decoding, but this will have to wait a bit longer, so only
remove the duplicate status codes.
Now, I'm not sure if this really needs to be mentioned explicitly. But
then again, there have been issues and pull requests that went without
being properly looked at for a week or longer.
Since there can be multiple backends for a single API (vaapi can use GLX
or EGL), not logging the exact backend name is annoying. So add it. At
the same time, there is no need to duplicate the name as used by the
--hwdec options, so replace it with the numeric hwdec API ID.
Let's fix broken samples with a questionable heuristic and without real
reasoning. Until this gets fixed properly, this is a good compromise,
though. A proper fix would resync audio and video without brutally
resetting the decoders; but on the other hand, not doing the brutal
reset could cause issues in other obscure corner cases that such a
resync might trigger.
This code is tricky because it has to wake up the mainloop to make
progress while syncing audio, but also has to avoid waking it up when
that's not needed. Failure to do so either burns CPU by never going to
sleep, or causes apparent "freezes" by going to sleep (playback
continues only once the mainloop is woken up for some other reason,
e.g. due to user input).
In this case, simply starting A/V playback with --start=5 and removing
an unrelated wakeup in osd.c can trigger such a "freeze". The unrelated
wakeup hid this bug, but it was a bug nonetheless.
(Can't wait to rewrite this shitty audio resync code. And it's all my
fault.)
Commit b53cb8de added a delay queue for decoded frames. This is supposed
to be used with copy-back decoders like dxva2-copy and vaapi-copy.
Surfaces returned by them can't be referenced after uninitializing the
decoders, so they have to be released before destroying the decoder.
Move the flush_all() call above decoder uninit accordingly. Also move
the destruction of the AVFrame used for decoding (just to be
defensive - normally it doesn't hold any reference).
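Schematically, the corrected teardown order looks like this (a minimal
sketch; the function and its parameters are illustrative, not mpv's
actual code):

    #include <libavcodec/avcodec.h>

    // Hypothetical teardown helper; "flush_delayed" stands in for mpv's
    // flush_all(). The ordering is the point: surfaces from copy-back
    // decoders must be released before the decoder itself is destroyed.
    static void uninit_video_decoder(void (*flush_delayed)(void),
                                     AVCodecContext **avctx, AVFrame **frame)
    {
        flush_delayed();              // release delayed copy-back surfaces first
        av_frame_free(frame);         // defensive: normally holds no reference
        avcodec_free_context(avctx);  // only now destroy the decoder
    }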
We just need to provide an entrypoint for it, and move the main init
code to a separate function. This gets rid of the messy video chain full
reinit in command.c, which completely destroyed and recreated the video
state for the purpose of mid-stream hw/sw switching.
Don't give the "software_fallback_decoder" field special meaning. Always
set it, and rename it to "decoder". Whether hw decoding is used is
determined by the "hwdec" field already.
These changes don't make too much sense without context, but are
preparation for later. Then the audio_src/video_src fields will
actually be NULL under some circumstances.
Before this commit, reinit_audio_chain() did 2 things: creating all the
management data structures and initializing the decoder, and handling
lazy filter/output init (as well as dealing with format changes). For
the second purpose, it could be called multiple times (even though it
wasn't really idempotent). This was pretty weird, so split them into
separate functions. The new function is actually idempotent too.
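The split follows the usual pattern (a rough sketch; the names and the
guard flag are made up for illustration):

    #include <stdbool.h>

    struct ao_chain_state {
        bool lazy_inited;
        // ...management data structures, decoder, filters, AO...
    };

    // One-time setup: create management structures, init the decoder.
    static void audio_chain_create(struct ao_chain_state *c) { /* ... */ }

    // Lazy init, now genuinely idempotent: safe to call repeatedly.
    static void audio_chain_lazy_init(struct ao_chain_state *c)
    {
        if (c->lazy_inited)
            return;             // repeated calls are no-ops
        // ...create filters/output once the decoded format is known...
        c->lazy_inited = true;
    }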
It also turns out the reinit functions don't have to call themselves
recursively for the spdif PCM fallback.
Regression caused by commit 3b95dd47. Also see commit 4c25b000. We can
either use video_next_pts and add "delay", or just use video_pts; any
other combination breaks. The reason the assumption that delay==0 at
this point was wrong is exactly that after displaying the first video
frame (usually done before audio resync), a new frame might be "added"
immediately, resulting in a new video_next_pts and "delay", which will
still amount to video_pts.
Fixes #2770. (The reason why display-sync was blamed in this issue is
that enabling display-sync in the options forces a prefetch of 2 frames
instead of 1 for seeks/playback restart, which triggers the issue, even
if display-sync is not actually enabled. In this case, display-sync is
never enabled because the frames have an unusually high frame duration.
This is also what exposed the initial desync issue.)
This code tries to deal with broken PTS timestamps, but since commit
271cabe6 it didn't always overwrite the previous timestamp as it should
have. This mattered only if there were broken timestamps in the video
stream.
Also remove the pointless prev_codec_pts variables, since the decoder
doesn't overwrite these fields anymore.
This also makes it refcounted, i.e. the new AVFrame will reference the
mp_audio buffers, instead of potentially forcing the consumer of the
AVFrame to copy the data.
All the extra code is for handling the >8 channels case, which requires
rather messy handling of the extended_ fields (not our fault).
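Roughly, the wrapping looks like this (a sketch only: planar float is
assumed, planes[] stands for the mp_audio plane pointers, and the
refcounting via extended_buf is omitted):

    #include <string.h>
    #include <libavutil/frame.h>
    #include <libavutil/mem.h>
    #include <libavutil/samplefmt.h>

    static AVFrame *wrap_planes(uint8_t **planes, int num_channels,
                                int num_samples)
    {
        AVFrame *frame = av_frame_alloc();
        frame->nb_samples = num_samples;
        frame->format = AV_SAMPLE_FMT_FLTP;
        if (num_channels > AV_NUM_DATA_POINTERS) {
            // data[] holds at most 8 pointers; with more channels, all
            // planes must go into a separately allocated extended_data.
            frame->extended_data =
                av_calloc(num_channels, sizeof(frame->extended_data[0]));
            for (int n = 0; n < num_channels; n++)
                frame->extended_data[n] = planes[n];
            memcpy(frame->data, frame->extended_data, sizeof(frame->data));
        } else {
            frame->extended_data = frame->data;
            for (int n = 0; n < num_channels; n++)
                frame->data[n] = planes[n];
        }
        return frame;
    }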
Correctly avoid a reload if the current device was specified by the user through
--audio-device. Previously, we only recognized if the user had specified
--ao=wasapi:device=.
FFmpeg can generate such files. It's unclear whether they're allowed by
Matroska. mkvinfo shows packet timestamps in both forms (one of them
must be a bug), and at least libavformat's demuxer treats timestamps
as signed.
GLES requires this. Some more common sampler types have default
precisions, but not usampler2D. Newer ANGLE builds verify this more
strictly than older builds, so this wasn't caught before.
Fixes #2761.
GLES does not support high bit depth fixed point textures for unknown
reasons, so direct 10 bit input is not possible. But we can still use
integer textures, which are supported by GLES 3.0. These store integer
data just like the standard fixed point textures, except they are not
normalized on sampling. They also don't support bilinear filtering, and
require a special sampler ("usampler2D").
While these texture formats enable us to shuffle the data to the GPU,
they're rather impractical with the requirements mentioned above and our
current architecture. One problem is that most code assumes it can
always use bilinear scaling (even if bilinear is never used when using
appropriate scale/cscale options). Another is that we don't have any
concept of running a function on a texture in a uniform way.
So for now, run a simple conversion step through a FBO. The FBO will use
the rgba16f format normally, which gives enough bits for 10 bit, and
will at least gracefully degrade with higher depth input.
This is bound to be much slower than a more "direct" method, but at
least it works and is simple to implement.
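For illustration, the conversion pass boils down to a fragment shader
along these lines (not the actual vo_opengl snippet; 10 bit data
assumed):

    // GLES 3.0 shader, embedded as a C string the way mpv assembles its
    // shaders. Note the explicit precision for usampler2D (GLES defines
    // no default for it), and that sampling returns unnormalized integers.
    static const char *const integer_conv_frag =
        "#version 300 es\n"
        "precision highp float;\n"
        "precision highp usampler2D;\n"
        "uniform usampler2D tex;\n"
        "in vec2 texcoord;\n"
        "out vec4 color;\n"
        "void main() {\n"
        "    uvec4 c = texture(tex, texcoord);\n"
        "    color = vec4(c) / 1023.0;\n" // normalize 10 bit to [0,1]
        "}\n";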
The odd change of function call order in init_video() is to properly
disable "dumb mode" (no FBO use) if these texture formats are in use.
This was never reset - which absolutely can't be right. If the renderer
somehow switches back to another codepath, it certainly has to be reset.
Maybe this was hard to hit, as the normalization is going to be
idempotent in simpler cases (like rendering RGBA input).
Also get rid of the "merged" variable.
This seems generally easier when using libmpv (and was already requested
and implemented before: see commit 327a779a; it was reverted some time
later).
With the weird internal logic we have to deal with, in particular the
--softvol=no case (using system volume), and using the audio API's mixer
(--softvol=auto on some systems), we still can't avoid all glitches and
corner cases that complicate this issue so much. The API user is
recommended to either use --softvol=yes or auto, or to watch the new
mixer-active property and assume the volume/mute properties only have
significant values while the mixer is active.
Remaining glitches:
- changing the volume/mute properties has no effect if no internal mixer
is used (--softvol=no) and the mixer is not active; the actual mixer
controls do not change, only the property values
- --volume/--mute do not have an effect on the volume/mute properties
before mixer initialization (the options strictly are only applied
during mixer init)
- volume-max is 100 while the mixer is not active
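For libmpv users, the recommended approach amounts to something like
this (a minimal sketch; error handling and the event loop are omitted):

    #include <stdio.h>
    #include <string.h>
    #include <mpv/client.h>

    static void observe_mixer(mpv_handle *mpv)
    {
        mpv_observe_property(mpv, 0, "mixer-active", MPV_FORMAT_FLAG);
        mpv_observe_property(mpv, 0, "volume", MPV_FORMAT_DOUBLE);
        mpv_observe_property(mpv, 0, "mute", MPV_FORMAT_FLAG);
    }

    static void on_property_change(mpv_event *ev)
    {
        if (ev->event_id != MPV_EVENT_PROPERTY_CHANGE)
            return;
        mpv_event_property *prop = ev->data;
        if (strcmp(prop->name, "mixer-active") == 0 &&
            prop->format == MPV_FORMAT_FLAG) {
            // volume/mute only carry significant values while this is true
            printf("mixer active: %d\n", *(int *)prop->data);
        }
    }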
Commit b53cb8de increased this by the number of additionally delayed
surfaces. But since this is only enabled in copy-back mode (which is
what process_image is about), the other additional surfaces, which
account for the direct rendering case, can be ignored.
Often requested. The main argument, that prominent scalers like sharpen
change the image even if no scaling happens, disappeared anyway.
("sharpen", unsharp masking, is neither prominent nor a scaler anymore.
This is an artifact from MPlayer, which fuses unsharp masking with
bilinear scaling in order to make it single-pass, or so.)
Another fix for the crazy and insane nvidia preemption behavior.
This time, the situation is that we are using vo_opengl with vdpau
interop, and that vdpau got preempted in the background while mpv was
sitting idly. This can be e.g. reproduced by using:
--force-window=immediate --idle --hwdec=vdpau
and switching VTs. Then after switching back, load a video file.
This will not let mp_vdpau_handle_preemption() perform preemption
recovery, simply because it will do so only once vdp_decoder_create()
has been called. There are some other API calls which trigger
preemption, but many don't.
Due to the way the libavcodec API works, vdp_decoder_create() comes way
too late: libavcodec calls it only after get_format returns. It then
notices that creating the decoder fails, and continues calling
get_format without the vdpau format. We could perhaps force it to
reinit again (by adding a call to vdpau.c that checks for preemption
and sets hwdec_request_reinit), but this seems like too much of a mess.
Solve it by calling an API function in mp_vdpau_handle_preemption()
that empirically does trigger preemption:
output_surface_put_bits_native(). This call is useless, and in fact
should be doing nothing (an empty update VdpRect). There's a slight
chance that in theory it will slow down operation, but in practice it's
bound to be harmless. It's likely the cheapest and simplest API call
I've found that can trigger the fallback this way.
(The driver is closed source, so it was up to trial & error.)
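Schematically, the probe is just an empty put_bits call (a sketch; the
function pointer and surface are assumed to come from the usual vdpau
setup):

    #include <stdbool.h>
    #include <stdint.h>
    #include <vdpau/vdpau.h>

    // A do-nothing update: empty destination rect, no source data. On a
    // healthy display this is a no-op; after preemption the driver
    // returns VDP_STATUS_DISPLAY_PREEMPTED, which is what we detect.
    static bool display_preempted(VdpOutputSurfacePutBitsNative *put_bits,
                                  VdpOutputSurface surface)
    {
        VdpRect empty = {0, 0, 0, 0};
        const void *const data[1] = {NULL};
        const uint32_t pitches[1] = {0};
        return put_bits(surface, data, pitches, &empty)
               == VDP_STATUS_DISPLAY_PREEMPTED;
    }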
Also, when initializing decoding, allow initial preemption recovery,
which is needed to pass the test mentioned above.
With the format left untouched, this would just try to reinit with a
spdif format again.
We're not clearing the format in reset_audio_state(), so the audio
chain can be recreated any time without having to wait for a frame to
be decoded.
Some VOs had support for these - remove them.
Typically, these formats will have only some use in cases where using
RGB software conversion with libswscale is faster than letting the
VO/GPU do it (i.e. almost never). For the sake of testing this case,
keep IMGFMT_RGB565. This is the least messy format, because it has no
padding/alpha bits with unknown semantics.
Note that decoding to these formats still works. We'll let libswscale
repack the data to whatever the VO in use can take.
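For illustration, that repack is conceptually just a swscale conversion
(parameters made up; RGBA chosen as an arbitrary VO-friendly target):

    #include <libswscale/swscale.h>

    static void repack_rgb565(const uint8_t *src, int src_stride,
                              uint8_t *dst, int dst_stride, int w, int h)
    {
        struct SwsContext *sws =
            sws_getContext(w, h, AV_PIX_FMT_RGB565,
                           w, h, AV_PIX_FMT_RGBA,
                           SWS_POINT, NULL, NULL, NULL);
        const uint8_t *const src_planes[4] = {src};
        const int src_strides[4] = {src_stride};
        uint8_t *const dst_planes[4] = {dst};
        const int dst_strides[4] = {dst_stride};
        sws_scale(sws, src_planes, src_strides, 0, h,
                  dst_planes, dst_strides);
        sws_freeContext(sws);
    }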