Although we can support vulkan multiplane images, cuda lacks any such
support, and so cannot natively import such images for interop. It's
possible that we can do separate exports for each plane in the image
and have it work, but for now, we can selectively disable multiplane
when we know that we'll be consuming cuda frames.
As a reminder, even though cuda is the frame source, interop is one way
so the vulkan images have to be imported to cuda before we copy the
frame contents over.
This logic here is slightly more complex than I'd like but you can't
just set the flag blindly, as it will cause hwframes ctx creation to
fail if the format is packed or if it's planar rgb. Oh well.
Certain combinations of hardware formats require the use of hwmap to
transfer frames between the formats, rather than hwupload, which will
fail if attempted.
To keep the usage of vf_format for HW -> HW transfers as intuitive as
possible, we should detect these cases and do the map operation instead
of uploading.
For now, the relevant cases are moving between VAAPI and Vulkan, and
VAAPI and DRM Prime, in both directions. I have introduced the IMGFMT
entry for Vulkan here so that I can put in the complete mapping table.
It's actually not useless, as you can map to Vulkan, use a Vulkan
filter and then map back to VAAPI for display output.
Basically predicts what mp_image_hw_download() will do. It's pretty
simple if it gets the full mp_image. (Taking just a imgfmt would make
this pretty hard/impossible or inaccurate.)
Used in one of the following commits.
FFmpeg has its own rather "special" image pools (AVHWFramesContext)
specifically for hardware decoding. So it's not really practical to use
our own pool implementation. Add these helpers, which make it easier to
use FFmpeg's code in mpv.
Remove the max_count creation parameter, because it's pointless and
rarely ever did anything. Add a talloc parent parameter instead (which
is something completely different, but convenient, and all callers needs
to be changed anyway).
Instead of clearing the pool when the now removed maximum is reached,
clear it on image parameter changes instead.
Makes va_surface_download() call mp_image_hw_download() for
libavutil-allocated surfaces, which in turn calls
av_hwframe_transfer_data().
mp_image_hw_download() is actually not specific to vaapi, and can be
used for any hw surface allocated by libavutil.
Provide a way for the user to add mp_images to the pool. This is required for
dxva2, for which using set_allocator is extremely awkward since all the d3d9
surfaces must be allocated in advance and all together.
Until now, failure to allocate image data resulted in a crash (i.e.
abort() was called). This was intentional, because it's pretty silly to
degrade playback, and in almost all situations, the OOM will probably
kill you anyway. (And then there's the standard Linux overcommit
behavior, which also will kill you at some point.)
But I changed my opinion, so here we go. This change does not affect
_all_ memory allocations, just image data. Now in most failure cases,
the output will just be skipped. For video filters, this coincidentally
means that failure is treated as EOF (because the playback core assumes
EOF if nothing comes out of the video filter chain). In other
situations, output might be in some way degraded, like skipping frames,
not scaling OSD, and such.
Functions whose return values changed semantics:
mp_image_alloc
mp_image_new_copy
mp_image_new_ref
mp_image_make_writeable
mp_image_setrefp
mp_image_to_av_frame_and_unref
mp_image_from_av_frame
mp_image_new_external_ref
mp_image_new_custom_ref
mp_image_pool_make_writeable
mp_image_pool_get
mp_image_pool_new_copy
mp_vdpau_mixed_frame_create
vf_alloc_out_image
vf_make_out_image_writeable
glGetWindowScreenshot
The plan is to get rid of the custom VAAPI and possibly VDPAU surface
allocators.
Add custom surface allocation, because hwaccel surfaces are allocated
completely differently from software surfaces.
Add optional LRU allocation, which is (probably) helpful for hwaccel,
but (probably) less optimal for software surfaces.
mp_image_pool_get_no_alloc() is specifically for VAAPI, which can't
allocate new decoder surfaces after decoder init.
Image formats used to be FourCCs, so unsigned int was better. But now
it's annoying and the only difference is that unsigned int is more to
type than int.
See previous commits. Also simplify this thing: 2 flags per pool image
are enough to avoid a weird central refcount and an associated shared
object keeping the refcount. We could even just store these two flags
in the mp_image itself (like in mp_image.flags or mp_image.priv), but
let's not for the sake of readability.
Refcounting will conceptually allocate and free images all the time
when using the filter chain. Add a pool that makes these reallocations
cheap.
This only affects the image data, not mp_image structs and similar small
allocations. Small allocations are always fast with reasonable memory
managers, while large image data will trigger mmap/munmap calls each
time.