Although we can support Vulkan multiplane images, CUDA lacks any such
support, and so cannot natively import such images for interop. It's
possible that we could do separate exports for each plane in the image
and have it work, but for now, we can selectively disable multiplane
when we know that we'll be consuming CUDA frames.
As a reminder: even though CUDA is the frame source, interop is one-way,
so the Vulkan images have to be imported to CUDA before we copy the
frame contents over.
The logic here is slightly more complex than I'd like, but you can't
just set the flag blindly, as that will cause hwframes ctx creation to
fail if the format is packed or if it's planar RGB. Oh well.
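As a rough illustration (not the actual mpv code), the shape of that
check can be expressed with libavutil's pixel format descriptors; the
helper name here is made up:
```
#include <stdbool.h>
#include <libavutil/pixdesc.h>

// Hypothetical helper: only planar, non-RGB formats with more than one
// plane are candidates for disabling multiplane; packed formats and
// planar RGB must be left alone, or hwframes ctx creation fails.
static bool should_disable_multiplane(enum AVPixelFormat sw_fmt)
{
    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(sw_fmt);
    if (!desc)
        return false;
    return (desc->flags & AV_PIX_FMT_FLAG_PLANAR) &&
           !(desc->flags & AV_PIX_FMT_FLAG_RGB) &&
           av_pix_fmt_count_planes(sw_fmt) > 1;
}
```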
Certain combinations of hardware formats require the use of hwmap to
transfer frames between the formats, rather than hwupload, which will
fail if attempted.
To keep the usage of vf_format for HW -> HW transfers as intuitive as
possible, we should detect these cases and do the map operation instead
of uploading.
For now, the relevant cases are moving between VAAPI and Vulkan, and
VAAPI and DRM Prime, in both directions. I have introduced the IMGFMT
entry for Vulkan here so that I can put in the complete mapping table.
It's not actually useless, either: you can map to Vulkan, run a Vulkan
filter, and then map back to VAAPI for display output.
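A hedged sketch of what such a mapping table can look like (the IMGFMT_*
constants come from mpv's video/img_format.h, the table contents follow
the cases named above, and the helper function is hypothetical):
```
#include <stdbool.h>
#include <stddef.h>
#include "video/img_format.h"  // mpv tree header providing IMGFMT_*

static const struct {
    int from, to;  // source and destination hw formats
} hw_map_pairs[] = {
    {IMGFMT_VAAPI,    IMGFMT_VULKAN},
    {IMGFMT_VULKAN,   IMGFMT_VAAPI},
    {IMGFMT_VAAPI,    IMGFMT_DRMPRIME},
    {IMGFMT_DRMPRIME, IMGFMT_VAAPI},
};

// Hypothetical helper: true if this HW -> HW transfer must be done with
// hwmap; everything else is expected to go through hwupload.
static bool hw_to_hw_needs_map(int from, int to)
{
    for (size_t i = 0; i < sizeof(hw_map_pairs) / sizeof(hw_map_pairs[0]); i++) {
        if (hw_map_pairs[i].from == from && hw_map_pairs[i].to == to)
            return true;
    }
    return false;
}
```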
Historically, HW -> HW uploads did not exist, so the current code
assumes they will never happen. But as part of introducing Vulkan
support into FFmpeg, we added HW -> HW support to enable transfers
between Vulkan and CUDA.
Today, that means you can use the lavfi hwupload filter with the
correct configuration (and the previous changes in this series), but it
would be more convenient to enable HW -> HW in the format filter so
that the transfers can be done more intuitively:
```
--vf=format=fmt=cuda
```
and
```
--vf=format=fmt=vulkan
```
Most of the work here is skipping logic that is specific to SW -> HW
uploads, which do format conversion. There is no ability to do inline
conversion when moving between HW formats, so the format must be
mutually understood to begin with.
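Schematically, the distinction looks something like this (a sketch with
made-up names, not the actual filter code):
```
#include <stdbool.h>

// Hypothetical sketch of the upload decision described above.
struct transfer_sketch {
    bool src_is_hw;   // source frames are already in a hw format
    int src_sw_fmt;   // software format behind the source frames
    int dst_sw_fmt;   // software format the target hwframes ctx expects
};

static bool transfer_possible(const struct transfer_sketch *p)
{
    if (p->src_is_hw) {
        // HW -> HW: no inline conversion is possible, so the format
        // must already be mutually understood.
        return p->src_sw_fmt == p->dst_sw_fmt;
    }
    // SW -> HW: a conversion step can be inserted before the upload,
    // so a format mismatch is not fatal.
    return true;
}
```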
Additional work needs to be done to enable transfers between VAAPI
and Vulkan, which use mapping rather than uploads. I'll tackle that
in the next change.
Basically predicts what mp_image_hw_download() will do. It's pretty
simple if it gets the full mp_image. (Taking just an imgfmt would make
this pretty hard/impossible or inaccurate.)
Used in one of the following commits.
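Not mpv's actual helper, but a hedged sketch of how libavutil can be
asked the same question for a libavutil-backed hw frame:
```
#include <libavutil/frame.h>
#include <libavutil/hwcontext.h>
#include <libavutil/mem.h>

// Ask the hwframes context which software formats a download can
// produce and take the first (preferred) one.
static enum AVPixelFormat predict_download_format(const AVFrame *hw_frame)
{
    enum AVPixelFormat *fmts = NULL;
    enum AVPixelFormat res = AV_PIX_FMT_NONE;

    if (hw_frame->hw_frames_ctx &&
        av_hwframe_transfer_get_formats(hw_frame->hw_frames_ctx,
                                        AV_HWFRAME_TRANSFER_DIRECTION_FROM,
                                        &fmts, 0) >= 0 && fmts)
        res = fmts[0];  // list is AV_PIX_FMT_NONE-terminated

    av_free(fmts);
    return res;
}
```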
FFmpeg has its own rather "special" image pools (AVHWFramesContext)
specifically for hardware decoding. So it's not really practical to use
our own pool implementation. Add these helpers, which make it easier to
use FFmpeg's code in mpv.
Remove the max_count creation parameter, because it's pointless and
rarely ever did anything. Add a talloc parent parameter instead (which
is something completely different, but convenient, and all callers need
to be changed anyway).
Instead of clearing the pool when the now removed maximum is reached,
clear it on image parameter changes instead.
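For reference, a minimal sketch of the FFmpeg side these helpers wrap
(error handling shortened; the function here is illustrative, not one of
the new helpers):
```
#include <libavutil/buffer.h>
#include <libavutil/frame.h>
#include <libavutil/hwcontext.h>

static AVFrame *get_pool_frame(AVBufferRef *device_ref,
                               enum AVPixelFormat hw_fmt,
                               enum AVPixelFormat sw_fmt, int w, int h)
{
    AVBufferRef *frames_ref = av_hwframe_ctx_alloc(device_ref);
    if (!frames_ref)
        return NULL;

    AVHWFramesContext *fctx = (AVHWFramesContext *)frames_ref->data;
    fctx->format    = hw_fmt;   // e.g. AV_PIX_FMT_VAAPI
    fctx->sw_format = sw_fmt;   // e.g. AV_PIX_FMT_NV12
    fctx->width     = w;
    fctx->height    = h;
    // (some device types also require initial_pool_size to be set)

    AVFrame *frame = av_frame_alloc();
    if (!frame || av_hwframe_ctx_init(frames_ref) < 0 ||
        av_hwframe_get_buffer(frames_ref, frame, 0) < 0) {
        av_frame_free(&frame);
        frame = NULL;
    }
    av_buffer_unref(&frames_ref);  // the frame keeps its own reference
    return frame;
}
```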
This leaked 2 unreffed AVFrame structs (roughly 1KB) per decoded frame.
Can I blame the FFmpeg API and the weird difference between freeing and
unreffing an AVFrame?
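For context, the API difference in question (real libavutil functions,
trivial example):
```
#include <libavutil/frame.h>

static void reuse_frame(AVFrame *frame)
{
    // Drops the buffer references held by the frame, but keeps the
    // AVFrame struct itself around for reuse.
    av_frame_unref(frame);
}

static void discard_frame(AVFrame **frame)
{
    // Unrefs *and* frees the struct, setting *frame to NULL. Without
    // this, the struct itself is leaked even though its data was
    // released, which is exactly the leak described above.
    av_frame_free(frame);
}
```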
This was phased out, and by now was used only by vdpau. Drop the
mechanism and the vdpau special code, which means screenshots won't
include the vf_vdpaupp processing anymore. (I don't care enough about
vdpau, it's on its way out.)
This reverts commit 0ce3dce03a.
We actually explicitly add read-only frames in some of the hwaccel code
via mp_image_new_custom_ref(), which sets AV_BUFFER_FLAG_READONLY. It's
probably better to keep it this way.
Fixes #4652 (and some related issues with D3D).
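For illustration, this is the libavutil mechanism involved (a sketch;
the wrapper function is made up):
```
#include <stdint.h>
#include <libavutil/buffer.h>

static void noop_free(void *opaque, uint8_t *data)
{
    (void)opaque;
    (void)data;  // the underlying storage is owned elsewhere
}

// Wrap externally owned memory as a read-only AVBufferRef, the same
// kind of reference the hwaccel code ends up with via
// mp_image_new_custom_ref(), which sets AV_BUFFER_FLAG_READONLY.
static AVBufferRef *wrap_readonly(uint8_t *data, size_t size)
{
    AVBufferRef *ref = av_buffer_create(data, size, noop_free, NULL,
                                        AV_BUFFER_FLAG_READONLY);
    // For such refs, av_buffer_is_writable() returns 0.
    return ref;
}
```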
Remove the feature of adding read-only frames to mp_image_pool_add().
This makes no sense, because an image pool is an allocator, and must
always return writable images. Also check these assumptions earlier.
Makes va_surface_download() call mp_image_hw_download() for
libavutil-allocated surfaces, which in turn calls
av_hwframe_transfer_data().
mp_image_hw_download() is actually not specific to vaapi, and can be
used for any hw surface allocated by libavutil.
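A minimal sketch of that generic download path: copy a hw-backed AVFrame
into a freshly allocated software AVFrame via av_hwframe_transfer_data()
(simplified, with no explicit format choice; this mirrors what the mpv
helper does, but is not its actual code):
```
#include <libavutil/frame.h>
#include <libavutil/hwcontext.h>

static AVFrame *download_hw_frame(const AVFrame *hw_frame)
{
    AVFrame *sw = av_frame_alloc();
    if (!sw)
        return NULL;
    // Leaving sw->format unset (AV_PIX_FMT_NONE) should let libavutil
    // pick the preferred software format for this hw surface type.
    if (av_hwframe_transfer_data(sw, hw_frame, 0) < 0 ||
        av_frame_copy_props(sw, hw_frame) < 0) {
        av_frame_free(&sw);
        return NULL;
    }
    return sw;
}
```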
Provide a way for the user to add mp_images to the pool. This is required for
dxva2, for which using set_allocator is extremely awkward, since all the d3d9
surfaces must be allocated in advance and all at once.
This covers source files which were added in mplayer2 and mpv times
only, and where all code is covered by LGPL relicensing agreements.
There are probably more files to which this applies, but I'm being
conservative here.
A file named ao_sdl.c exists in MPlayer too, but the mpv one is a
complete rewrite, and was added some time after the original ao_sdl.c
was removed. The same applies to vo_sdl.c, for which the SDL2 API is
radically different in addition (MPlayer supports SDL 1.2 only).
common.c contains only code written by me. But common.h is a strange
case: although it originally was named mp_common.h and exists in MPlayer
too, by now it contains only definitions written by uau and me. The
exceptions are the CONTROL_ defines, so the license of common.h is not
being changed yet.
codec_tags.c contained once large tables generated from MPlayer's
codecs.conf, but all of these tables were removed.
From demux_playlist.c I'm removing a code fragment from someone who was
not asked; this probably could be done later (see commit 15dccc37).
misc.c is a bit complicated to reason about (it was split off from
mplayer.c and thus contains random functions from that file), but
actually all of its functions were added post-MPlayer. The exception is
get_relative_time(), which was written by uau, but looks similar to 3
different implementations in the Unix/win32/OSX timer source files. I'm
not sure what that means with regard to copyright, so I've just moved it
into another still-GPL source file for now.
screenshot.c once had some minor parts of MPlayer's vf_screenshot.c, but
they're all gone.
mpv had refcounted frames before libav*, so we were not using
libavutil's facilities. Change this and drop our own code.
Since AVFrames are not actually refcounted (only the image data they
reference is), the semantics change a bit. This affects mainly
mp_image_pool, which was operating on whole images instead of buffers.
While we could work on AVBufferRefs instead (and use AVBufferPool),
this doesn't work for use with hardware decoding, which doesn't
map cleanly to FFmpeg's reference counting. But it worked out. One
weird consequence is that we still need our custom image data
allocation function (for normal image data), because AVFrame's own
allocation uses multiple buffers.
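A small illustration of those semantics with the real libavutil calls
(the wrapper function is just for the example):
```
#include <libavutil/frame.h>

// Create a second view onto frame's data: the AVFrame structs are
// separate (and not refcounted themselves), but both now hold
// references to the same underlying AVBufferRefs in frame->buf[].
static AVFrame *make_view(const AVFrame *frame)
{
    AVFrame *view = av_frame_alloc();
    if (!view)
        return NULL;
    if (av_frame_ref(view, frame) < 0) {
        av_frame_free(&view);
        return NULL;
    }
    // The image data stays alive until *both* frames have been unreffed
    // or freed; freeing one of the structs does not copy or drop the
    // other's data.
    return view;
}
```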
There also seems to be a timing-dependent problem with vaapi (the
pool appears to be "leaking" surfaces). I don't know if this is a new
problem, or whether the code changes just happened to cause it more
often. Raising the number of reserved surfaces seemed to fix it, but
since it appears to be timing dependent, and I couldn't find anything
wrong with the code, I'm just going to assume it's not a new bug.
Until now, failure to allocate image data resulted in a crash (i.e.
abort() was called). This was intentional, because it's pretty silly to
degrade playback, and in almost all situations, the OOM will probably
kill you anyway. (And then there's the standard Linux overcommit
behavior, which also will kill you at some point.)
But I changed my opinion, so here we go. This change does not affect
_all_ memory allocations, just image data. Now in most failure cases,
the output will just be skipped. For video filters, this coincidentally
means that failure is treated as EOF (because the playback core assumes
EOF if nothing comes out of the video filter chain). In other
situations, output might be in some way degraded, like skipping frames,
not scaling OSD, and such.
Functions whose return value semantics changed:
mp_image_alloc
mp_image_new_copy
mp_image_new_ref
mp_image_make_writeable
mp_image_setrefp
mp_image_to_av_frame_and_unref
mp_image_from_av_frame
mp_image_new_external_ref
mp_image_new_custom_ref
mp_image_pool_make_writeable
mp_image_pool_get
mp_image_pool_new_copy
mp_vdpau_mixed_frame_create
vf_alloc_out_image
vf_make_out_image_writeable
glGetWindowScreenshot
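As an example of the new calling convention (a sketch; mp_image_alloc()
is assumed to take the usual imgfmt/width/height triple from mpv's
mp_image.h, and the wrapper is made up):
```
#include "video/mp_image.h"  // mpv tree header

static struct mp_image *alloc_or_skip(int imgfmt, int w, int h)
{
    struct mp_image *img = mp_image_alloc(imgfmt, w, h);
    if (!img) {
        // Allocation failed: skip this output instead of aborting.
        // For a video filter this ends up being treated like EOF,
        // as described above.
        return NULL;
    }
    return img;
}
```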
The plan is to get rid of the custom VAAPI and possibly VDPAU surface
allocators.
Add custom surface allocation, because hwaccel surfaces are allocated
completely differently from software surfaces.
Add optional LRU allocation, which is (probably) helpful for hwaccel,
but (probably) less optimal for software surfaces.
mp_image_pool_get_no_alloc() is specifically for VAAPI, which can't
allocate new decoder surfaces after decoder init.
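A hedged usage sketch of the new entry point (the function names come
from the text above; the exact parameters are assumed):
```
#include <stdbool.h>
#include "video/mp_image_pool.h"  // mpv tree header

static struct mp_image *get_surface(struct mp_image_pool *pool,
                                    int imgfmt, int w, int h,
                                    bool decoder_initialized)
{
    if (decoder_initialized) {
        // VAAPI: no new decoder surfaces may be allocated after decoder
        // init, so only return an already-allocated, currently unused
        // image from the pool (or NULL).
        return mp_image_pool_get_no_alloc(pool, imgfmt, w, h);
    }
    return mp_image_pool_get(pool, imgfmt, w, h);
}
```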
Image formats used to be FourCCs, so unsigned int was better. But now
it's annoying, and the only difference is that unsigned int is more
typing than int.
pthreads should be available anywhere. Even if not, for environments
without threads a pthread wrapper could be provided that can't actually
start threads, thus disabling features that require threads.
Make pthreads mandatory in order to simplify build dependencies and to
reduce ifdeffery. (Admittedly, there wasn't much complexity, but maybe
we will use pthreads more in the future, and then it'd become a real
bother.)
Change talloc destructors so that they can never signal failure and
don't return a status code. This makes our talloc copy even more
incompatible with upstream talloc, but on the other hand this is
preparation for getting rid of talloc entirely.
(The talloc replacement in the next commit won't allow the talloc_free
equivalent to fail, and the destructor return value would be useless.
But I don't want to change any mpv code either; the idea is that the
talloc replacement commit can be reverted for some time in order to
test whether the talloc replacement introduced a regression.)
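Purely as an illustration of the shape of the change (these typedefs are
not mpv's actual ones):
```
// Old-style destructor: could return nonzero to signal failure, which
// upstream talloc interprets as "don't free this chunk".
typedef int (*old_destructor_fn)(void *ptr);

// New-style destructor: runs its cleanup and cannot fail.
typedef void (*new_destructor_fn)(void *ptr);
```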
See previous commits. Also simplify this thing: 2 flags per pool image
are enough to avoid a weird central refcount and an associated shared
object keeping the refcount. We could even just store these two flags
in the mp_image itself (like in mp_image.flags or mp_image.priv), but
let's not for the sake of readability.
Refcounting will conceptually allocate and free images all the time
when using the filter chain. Add a pool that makes these reallocations
cheap.
This only affects the image data, not mp_image structs and similar small
allocations. Small allocations are always fast with reasonable memory
managers, while large image data will trigger mmap/munmap calls each
time.
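A hedged usage sketch of the intended pattern (pool function parameters
assumed as above; headers and talloc_free() usage are from the mpv
tree):
```
#include "talloc.h"              // mpv's talloc header (name assumed)
#include "video/mp_image.h"
#include "video/mp_image_pool.h"

static void produce_one_frame(struct mp_image_pool *pool,
                              int imgfmt, int w, int h)
{
    struct mp_image *out = mp_image_pool_get(pool, imgfmt, w, h);
    if (!out)
        return;  // allocation failed: skip this frame
    // ... fill 'out' with the filtered image and pass it downstream ...
    // Releasing the last reference returns the buffer to the pool
    // instead of freeing it, so the next get is cheap.
    talloc_free(out);
}
```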