One downside of this approach is that it bypasses the mixer cache, but
while this is not ideal for performance reasons, the status quo is also
simply broken so I'd rather have a slower implementation that works than
a faster implementation that does not.
And as it turns out, updating the OSD state and invalidating the mixer
cache correctly is sufficiently nontrivial to do in a clean way, so I'd
rather have this code that I can be reasonably certain does the right
thing.
Fixes#9923 as discussed. Also fixes#9928.
It is possible for vo_gpu_next to attempt a resize before the windowing
backend is fully initialized. In practice, this can happen on wayland
which means libplacebo attempts a 0x0 resize. Depending on the API, a
0x0 resize may be allowed (vulkan or d3d11), but libplacebo just returns
a 0 in this case which mpv doesn't do anything with anyway. In the case
of opengl, this usage is explictly forbidden and will result in a
warning which may confuse users. Solve this by just not trying a resize
if dwidth and dheight in the vo are not available. Fixes#10083.
This makes use of the new frame acquire/release callbacks to hold on to
hwdec images only as long as necessary. This should greatly improve the
smoothness/efficiency of hwdec interop, by not holding on to them for
longer than needed.
This also avoids the need to pool hwdec mappers altogether.
Should fix#10067 as well, since frames are now only mapped when we
actually use them.
Pulling in <libplacebo/utils/libav.h> in particular triggers the
notorious _av_vkfmt_from_pixfmt linking issue when FFmpeg is built
without Vulkan support.
When I introduced the concept of lazy loading of hwdecs by img format,
I did not propagate the probing flag correctly, leading to the new
normal loading path not runnng with probing set, meaning that any
errors would show up, creating unnecessary noise.
This change fixes this regression.
This code fails if we have DR buffers, but none of them correspond to
the current frame. Normally only happens if e.g. changing the decoder at
runtime, since DR buffers are not properly reinit in that case.
(Arguably a separate bug)
The current map_frame() code fails to clean up after itself on the
failure paths. But if map_frame returns false, no cleanup code is ever
attempted. Add the relevant calls to clean up state manually,
throughout.
So, turns out the approach in 7f67a553f doesn't work for all codecs. In
particular, sometimes lavc will internally allocate a new AVBuffer that
just points at the old AVBuffer but has a different opaque field for
some reason. In these cases, the DR metadata doesn't survive the
round-trip through libavcodec.
I explored several alternative ways of solving this problem, including
adding new mp_image fields, but in the end none of them survived the
round-trip through AVFrame and back. The `priv` and `opaque` fields
in respectively `mp_image` and `AVFrame` are also too heavily overloaded
to be of much help.
In the end, bite the bullet and use the same approach as done in
`vo_gpu`, which is to just keep track of a list of all allocations. This
is a really ugly way of doing things IMO, but ultimately, completely
safe.
Some decoders, notably hevcdec, will unconditionally memset() the entire
AVBufferRef based on the AVBufferRef size. This is bad news for us,
since it also overwrites our `struct dr_buf`.
Rewrite this code to make it more robust - keep track of the DR buf
metadata in a separate allocation instead. Has the unfortunate downside
of technically being undefined behavior if `opaque` is not at least 64
bits in size, though, but avoids this issue.
Maybe there's a better way for us to unconditionally keep track of DR
allocation metadata. I could try adding it into the `mp_image` itself.
Maybe on a rainy day. For now, this works.
Fixes#9949
Might fix#9526
There are two major ways of going about this:
1. Expose the native ra_gl/ra_pl/ra_d3d11 objects to the pre-existing
hwdec mappers, and then add code in vo_gpu_next to rewrap those
ra_tex objects into pl_tex.
2. Wrap the underlying pl_opengl/pl_d3d11 into a ra_pl object and expose
it to the hwdec mappers, then directly use the resulting pl_tex.
I ultimately opted for approach 1 because it enables compatibility with
more hardware decoders, specifically including ones that use native
OpenGL calls currently. The second approach only really works with
cuda_vk and vaapi_pl.
This avoids decoding/caching more frames in advance than necessary. In
particular, this is very important for hwdec, which generally can't have
too many decoded frames in a pool at the same time.
libplacebo v198 fixed this properly by adding the ability to flip planes
directly, which is done automatically by the swapchain helpers.
As such, we no longer need to concern ourselves with hacky logic to flip
planes using the crop. This also removes the need for the OSD coordinate
hack on OpenGL.
Render subs at the output resolution, rather than the video resolution.
Uses the new APIs found in libplacebo 197+, to allow controlling the OSD
resolution even for image-attached overlays.
Also fixes an issue where the overlay state did not get correctly
updated while paused. To avoid regenerating the OSD / flushing the cache
constantly, we keep track of OSD changes and only regenerate the OSD
when the OSD state is expected to change in some way (e.g. resolution
change). This requires introducing a new VOCTRL to inform the VO when
the UPDATE_OSD-tagged options have changed.
Fixes#9744, #9524, #9399 and #9398.
This is an annoying special case only really needed because of the
`vflip` filter. mpv handles this by directly adjusting the plane
transform. The libplacebo API, unfortunately, does not allow passing the
required information for this to work smoothly.
Long-term I plan on adding support for plane flipping in libplacebo
directly, but for the meantime, we will have to work-around it by moving
the flipping to the whole-image `crop` instead. Not an ideal solution
but better than crashing.
Fixes#9855
The only way to fix the channel order here is to create the texture with
bgra format. Incidentally, there used to be a way to set the component
map for overlays directly, but no more. Shouldn't matter since
everything supports bgra8 anyways, though.
Fixes#9699
This merges the old desaturation control options into a single
enumeration, with the goal of both simplifying how these options work
and also making this list more extensible (including, notably, new
options only supported by vo_gpu_next).
For the hybrid option, I decided to port the (slightly tweaked) values
from libplacebo's pre-refactor defaults, rather than the old values we
had in mpv, to more visually match the look of the vo_gpu_next hybrid.
Merge --gamut-clipping and --gamut-warning into a single option,
--gamut-mapping-mode, better corresponding to the new vo_gpu_next APIs
and allowing us to easily extend this option as new modes are added in
the future.
This is only needed on Android and supposed to handle a context
resize without reconfiguring the image parameters. reconfig()
already does exactly this so plug it in there.
`plane_data_from_imgfmt` doesn't zero-initialize the struct, so this
contained invalid values for e.g. `row_stride`, causing formats to
*randomly* fail. (Especially any formats with specific alignment
requirements)
Might fix#9424 and #9425.
Somewhat annoying but still relatively straightforward. There are
several ways to approach this, but I settled on reusing the pl_queue as
a cheap way to get access to the currently mapped frame. This saves us
from having to process `vo_frame` at all, and also avoids any overhead
from re-uploading the same frame twice.
(However maybe there's some circumstance in which `vo_frame` needs to be
queried/updated first, to get a screenshot of the correct frame? I'm not
sure.)
I also had the option of going with either pl_render_image() on the
extract pl_frame, or just calling pl_render_image_mix directly on it. I
went for the latter, because in the optimal case, this allows the
rendered frame to be directly retrieved from the cache, actually
entirely avoiding any sort of recompute overhead. This makes e.g. ctrl+s
during playback essentially free. (Except for the download cost,
obviously)
It would be even neater if we could make this VOCTRL asynchronous and
thereby leverage libplacebo's asynchronous download capabilities. But oh
well. That will have to wait for a sufficiently rainy day.
Closes#9388
As noticed in #9526, apparently there's some case in which DR buffers
get corrupted. Add an explicit sentinel check to try and figure out
which cases these are.
This is done to avoid cluttering vo_gpu_next.c with more ifdeffery and context-specific code
when additional backends are added in the near future.
Eventually gpu_ctx is intended to take the place of ra_ctx to further separate gpu and gpu_next.
This is needed when the color system is not explicitly tagged, but
instead needs to be inferred by the VO.
Note that there exists the function mp_image_params_guess_csp for this
sort of stuff, but it contains a lot of baggage that I don't want to
replicate, in order to move as much of this logic into pl_renderer as
possible, and therefore also give it the best chance of knowing what
shortcuts it can and can't take.
Fixes the other half of https://github.com/mpv-player/mpv/issues/9499
Adding vsync_offset to the pts in pl_queue_update actually messes up
frame timings if one isn't using interpolation. The easiest way to see
this is to have the monitor's refresh rate at an integer multiple of a
video during a panning shot (classic case). There will be very visible
judder/stutter in this case that does not happen in vo_gpu. The cause of
this is the addition of the extra vsync_offset. Just match the semantics
of vo_gpu where this is only used when interpolating.
Even when not display synced. Prevents redraw overhead for refreshes
while paused.
Also make the logic slightly clearer to follow (since it's inverted).
This almost perfectly recreates the semantics of --vo=gpu, i.e.:
- still frames are never interpolated
- non-repeated frames bypass single frame cache
The only difference is that libplacebo doesn't do a cache/blit on the
full output image, but rather it re-runs the last rendering step. This
has some advantages and some drawbacks. The most notable advantage is
that it also allows re-using the image contents when the only thing that
changes is the OSD (whereas `--vo=gpu` would force a full re-render for
that). The most notable drawback is that it also implies going through
the dithering and output LUT logic on redraws. All in all, I think this
is a pretty good trade-off in favor of `--vo=gpu-next`.
Fully fixes the last remaining performance difference in #9430.
Completely untested, since Linux still can't into HDR in 2021. Somebody
please make sure it works.
Technically covers #8219, since gpu-context=drm can be combined with
vo=gpu-next.
This required an upstream API change to implement in a way that doesn't
unnecessarily re-push or upload frames that won't be used. I consider
this a big enough bug to justify bumping the minimum version for it.
Closes#9401