Commit Graph

55 Commits

Author SHA1 Message Date
James Almer e36eb94048 avutil: use the buffer_size_t typedef where required
Signed-off-by: James Almer <jamrial@gmail.com>
2021-03-10 20:26:37 -03:00
Xu Guangxin ae97f69ce1 avutils/vulkan: hwmap, respect src frame resolution
fixes http://trac.ffmpeg.org/ticket/9055

The hw decoder may allocate a large frame from AVHWFramesContext, and adjust width and height based on bitstream.
We need to use resolution from src frame instead of AVHWFramesContext.

test command:
ffmpeg -loglevel debug -hide_banner -hwaccel vaapi -init_hw_device vaapi=va:/dev/dri/renderD128 -hwaccel_device va -hwaccel_output_format vaapi -init_hw_device vulkan=vulk -filter_hw_device vulk -i 1920x1080.264 -c:v libx264 -r:v 30 -profile:v high -preset veryfast -vf "hwmap,chromaber_vulkan=0:0,hwdownload,format=nv12" -map 0 -y vaapiouts.mkv

expected:
No green bar at bottom.
2021-01-22 04:30:42 +01:00
Lynne b51b9bbd42
hwcontext_vulkan: wait and signal semaphores when transferring to CUDA
Same as when downloading. Not sure why this isn't done, probably
because the CUDA code predates the sync mechanism we settled on.
2020-12-05 23:53:23 +01:00
Lynne 659e6e9c88
hwcontext_vulkan: reduce priority for PACK32 formats
Due to some endian-dependent overlap, these should be used last.
2020-11-27 02:58:02 +01:00
Niklas Haas 3fbc74582f
hwcontext_vulkan: optionally enable more functionality
These two extensions and two features are both optionally used by
libplacebo to speed up rendering, so it makes sense for libavutil to
automatically enable them as well.
2020-11-25 23:14:47 +01:00
Lynne 2aeb152653
hwcontext_vulkan: support additional pixel formats
We support every single packed format possible now.
There are some fringe leftover mappings which are uninteresting.
2020-11-25 23:14:47 +01:00
Lynne 48b3532183
hwcontext_vulkan: fix incorrect A/0BGR mapping
Vulkan formats with a PACK suffix define native endianess.
Vulkan formats without a PACK suffix are in bytestream order.

Pixel formats with a LE/BE suffix define endianess.
Pixel formats without LE/BE suffix are in bytestream order.
2020-11-25 23:14:46 +01:00
Lynne 3aa8de12ab
hwcontext_vulkan: simplify plane size calculations and support 4-plane formats
Needed to support YUVA.
2020-11-25 23:14:46 +01:00
Lynne 7b274a9b89
hwcontext_vulkan: do not segfault when failing to init a AVHWFramesContext
frames_uninit is always called on failure, and the free_exec_ctx function
did not zero the pool when freeing it, so it resulted in a double free.
2020-11-25 23:14:46 +01:00
Lynne 18a6535b08
hwcontext_vulkan: always attempt to map host memory when transferring
This relies on the fact that host memory is always going to be required
to be aligned to the platform's page size, which means we can adjust
the pointers when we map them to buffers and therefore skip an entire
copy. This has already had extensive testing in libplacebo without
problems, so its safe to use here as well.

Speeds up downloads and uploads on platforms which do not pool their
memory hugely, but less so on platforms that do.

We can pool the buffers ourselves, but that can come as a later patch
if necessary.
2020-11-25 23:14:01 +01:00
Lynne 9cf1811d3d
hwcontext_vulkan: check for memory size before choosing type
It makes allocation a bit more robust in case some weird device with
weird drivers which segments memory in weird ways appears.
2020-11-25 23:06:36 +01:00
Lynne ff29ca2f1f
hwcontext_vulkan: correctly access the p->extensions bitmask
Its a 64-bit bitfield being put directly into an int.
2020-11-25 23:06:36 +01:00
Lynne b83e0560f7
hwcontext_vulkan: unify download/upload functions
They were identical, save for variable names and order.
2020-11-25 23:06:35 +01:00
Lynne b4f9d05301
hwcontext_vulkan: add VkExternalMemoryBufferCreateInfo to imported buffers
Its a validation layer thing.
2020-11-25 23:06:35 +01:00
Lynne 10b3c9b533
hwcontext_vulkan: do not use uninitialized variables on errors in CUDA code 2020-11-25 23:06:35 +01:00
Lynne fe3ea13131
hwcontext_vulkan: remove plane size alignment checks when host importing
The process space is guaranteed to be aligned to the page size, hence we're
never going to map outside of our address space.
There are more optimizations to do with respect to chroma plane alignment and
buffer offsets, but that can be done later.
2020-08-02 22:48:51 +02:00
Lynne 64b12624e2
hwcontext_vulkan: fix uploading and downloading from/to flipped images
We want to copy the lowest amount of bytes per line, but while the buffer
stride is sanitized, the src/dst stride can be negative, and negative numbers
of bytes do not make a lot of sense.
2020-05-26 12:03:42 +01:00
Lynne bf056caf54
hwcontext_vulkan: check for dedicated allocation when mapping from drm/vaapi
Some vendors (AMD) require dedicated allocation to be used for all imported
images.
2020-05-26 10:52:11 +01:00
Lynne b6d4bedbb1
hwcontext_vulkan: initialize the frames context when deriving
Otherwise, the frames context is considered to be ready to handle
mapping, and it doesn't get initialized the normal way through
.frames_init.
2020-05-26 10:52:10 +01:00
Lynne 6bb718aabd
hwcontext_vulkan: use dedicated allocation for buffers when necessary 2020-05-26 10:52:10 +01:00
Lynne 4dcb50c58a
hwcontext_vulkan: use host mapped buffers when uploading and downloading
Speeds up both use cases by 30%.
2020-05-26 10:52:10 +01:00
Lynne dc9cf7f2cd
hwcontext_vulkan: move physical device feature discovery to device_init
Otherwise custom vulkan device contexts won't work.
2020-05-23 19:07:46 +01:00
Lynne d870e75c39
hwcontext_vulkan: split uploading and downloading contexts
This allows us to speed up only-uploading or only-downloading use cases.
2020-05-23 19:07:45 +01:00
Lynne 192997dd7f
hwcontext_vulkan: set usage for DRM imports to the frames context usage
They're nothing special, and there's no reason they should always use the
default flags.
2020-05-23 19:07:43 +01:00
Lynne 2c6366590e
hwcontext_vulkan: do not OR the user-specified usage with our default flags
Some users may need special formats that aren't available when the STORAGE
flag bit is set, which would result in allocations failing.
2020-05-23 19:07:41 +01:00
Lynne 98405422be
hwcontext_vulkan: actually use the frames exec context for prep/import/export
This was never actually used, likely due to confusion, as the device context
also had one used for uploads and downloads.
Also, since we're only using it for very quick image barriers (which are
practically free on all hardware), use the compute queue instead of the
transfer queue.
2020-05-23 19:07:39 +01:00
Lynne 3dd3d1b7fb
hwcontext_vulkan: support user-provided pools
If an external pool was provided we skipped all of frames init,
including the exec context.
2020-05-23 19:07:37 +01:00
Lynne c0b0807871
hwcontext_vulkan: use all enabled queues for transfers, make uploads async
This commit makes full use of the enabled queues to provide asynchronous
uploads of images (downloads remain synchronous).
For a pure uploading use cases, the performance gains can be significant.
2020-05-23 19:07:36 +01:00
Lynne cdb949a05c
hwcontext_vulkan: wrap ImageBufs into AVBufferRefs
Makes it easier to support multiple queues
2020-05-23 19:07:34 +01:00
Lynne ea1a7f6064
hwcontext_vulkan: expose the enabled device features
With this, the puzzle of making libplacebo, ffmpeg and any other Vulkan
API users interoperable is complete.
Users of both libraries can initialize one another's contexts without having
to create a new one.
2020-05-23 19:07:30 +01:00
Lynne 01c7539f30
hwcontext_vulkan: expose the amount of queues for each queue family
This, along with the next patch, are the last missing pieces to being
interoperable with libplacebo.
2020-05-23 19:07:29 +01:00
Lynne 2e08b39444
hwcontext: add av_hwdevice_ctx_create_derived_opts
This allows for users who derive devices to set options for the
new device context they derive.
The main use case of this is to allow users to enable extensions
(such as surface drawing extensions) in Vulkan while deriving from
the device their frames are on. That way, users don't need to write
any initialization code themselves, since the Vulkan spec invalidates
mixing instances, physical devices and active devices.
Apart from Vulkan, other hwcontexts ignore the opts argument since they
don't support options at all (or in VAAPI and OpenCL's case, options are
currently only used for device selection, which device_derive overrides).
2020-05-23 19:07:26 +01:00
Lynne 858f786eb9 hwcontext_vulkan: fix incorrect print argument 2020-05-14 21:06:24 +01:00
Lynne 4b7e13931f
hwcontext_vulkan: don't add the optional VK_KHR_surface extension by default
Both API and CLI users can enable any extension they'd like using the options.
2020-05-12 21:32:34 +01:00
Lynne 251e4ad0ad
hwcontext_vulkan: don't error on unavailable user-specified extensions
Only warn instead. API users can find out which extensions were unavailable
by using the enabled_inst_extensions and enabled_dev_extensions fields.
This eliminates having to trial-and-error to find which extensions were missing.
2020-05-12 21:32:32 +01:00
Lynne 6025e66f98
hwcontext_vulkan: use the maximum amount of queues for each family
Due to our AVHWDevice infrastructure, where API users are offered a way
to derive contexts rather than always create new one, our filterchains,
being supported by a single hardware device context, can grow to considerable
size.
Hence, in such situations, using the maximum amount of queues the device offers
can be benefitial to eliminating bottlenecks where queue submissions on the
same family have to wait for the previous one to finish.
2020-05-12 21:32:30 +01:00
Lynne 0e39fce1e1
hwcontext_vulkan: update prepare_frame() for multiple semaphores when exporting 2020-05-12 21:32:24 +01:00
Lynne 70d396c8af Revert "hwcontext_vulkan: only use one semaphore per image"
This reverts commit 97b526c192.
It broke the API, and assumed no other APIs used multiple semaphores.
This also disallowed certain optimizations to happen.

Dealing with APIs that give or expect single semaphores is easier when
we use per-image semaphores.
2020-05-11 23:48:26 +01:00
Lynne fc99a24782
hwcontext_vulkan: convert to general layout and transfer queue when exporting
The specs note that images should be in the GENERAL layout when exporting
for maximum compatibility.
CUDA exported images are handled differently, and the queue is the same,
so we don't need to do that there.
2020-05-10 23:20:49 +01:00
Lynne 875c1707e5
hwcontext_vulkan: create all images with concurrent sharing mode
As it turns out, we were already assuming and treating all images as if they had
concurrent access mode. This just changes the flag to CONCURRENT, which has less
restrictions than EXCLUSIVE, and fixed validation messages on machines with
multiple queues.
The validation layer didn't pick this up because the machine I was testing on
had only a single queue.
2020-05-10 23:20:49 +01:00
Lynne 7c080dc190
hwcontext_vulkan: fix inverted condition when exporting images to drm_prime
Calling vkGetImageSubresourceLayout is only legal for linear and drm images.
2020-05-10 23:20:49 +01:00
Lynne acfef378b7
hwcontext_vulkan: update debugging layer name 2020-05-10 23:20:48 +01:00
Lynne 030a565baf
hwcontext_vulkan: remove unused internal REQUIRED extension flag
This is a leftover from an old version which used the 1.0 Vulkan API
with the maintenance extensions being required.
2020-05-10 23:20:48 +01:00
Lynne dccd07f66d
hwcontext_vulkan: expose enabled device and instance extensions
This solves a huge oversight - it lets users reliably use their own
AVVulkanDeviceContext. Otherwise, the extensions supplied and enabled
are not discoverable by anything outside of hwcontext_vulkan.
Also clarifies that any user-supplied VkInstance must be at least 1.1.
2020-05-10 23:20:48 +01:00
Lynne 3c5e5a5095
hwcontext_vulkan: let users enable device and instance extensions using options
Also documents all options supported by the hwdevice.
This lets users enable all extensions they need without writing their own
instance initialization code.
2020-05-10 23:20:47 +01:00
Lynne b69f5a72ce hwcontext_vulkan: optionally enable the VK_KHR_surface extension if available
This allows any phys_device derived to be used as a display rendering device.
2020-05-10 11:23:10 +01:00
Lynne e3c7b22451
hwcontext_vulkan: correctly download and upload flipped images
We derive the destination buffer stride from the input stride,
which meant if the image was flipped with a negative stride,
we'd be FFALIGNING a negative number which ends up being huge,
thus making the Vulkan buffer allocation fail and the whole
image transfer fail.

Only found out about this as OpenGL compositors can copy an entire
image with a single call if its flipped, rather than iterate over
each line.
2020-04-21 19:00:51 +01:00
Lynne 97b526c192 hwcontext_vulkan: only use one semaphore per image
The idea was to allow separate planes to be filtered independently, however,
in hindsight, literaly nothing uses separate per-plane semaphores and it
would only work when each plane is backed by separate device memory.
2020-04-07 12:52:56 +01:00
Lynne ecc3dceff4 hwcontext_vulkan: fix imported image bitmask 2020-03-17 22:52:00 +00:00
Lynne 6353b9e4ab hwcontext_vulkan: support more than one plane per DMABUF layer
Requires the dmabuf modifiers extension.
Allows for importing of compressed images with a second plane.
2020-03-12 18:59:12 +00:00