mpv/man at e72093581bdf07784d6889035c3751cbc7fb8ca0 - mpv

Philip Langdale da1073c247 vo_gpu: vulkan: hwdec_cuda: Add support for Vulkan interop

Despite their place in the tree, hwdecs can be loaded and used just fine by the Vulkan GPU backend. In this change we add Vulkan interop support to the cuda/nvdec hwdec.

The overall process is mostly straightforward, so the main observation here is that I had to implement it using an intermediate Vulkan buffer because the direct VkImage usage is blocked by a bug in the nvidia driver. When that gets fixed, I will revisit this.

Nevertheless, the intermediate buffer copy is very cheap as it's all device memory from start to finish. Overall CPU utilisation is pretty much the same as with the OpenGL GPU backend.

Note that we cannot use a single intermediate buffer - rather there is a pool of them. This is done because the CUDA memcpys are not explicitly synchronised with the texture uploads.

In the basic case, this doesn't matter because the hwdec is not asked to map and copy the next frame until after the previous one is rendered. In the interpolation case, we need extra future frames available immediately, so we'll be asked to map/copy those frames and Vulkan will be asked to render them. So far, harmless, right? No.

All the Vulkan rendering, including the upload steps, is batched together and ends up running very asynchronously from the CUDA copies. The end result is that all the copies happen one after another, and only then do the uploads happen, which means all textures are uploaded with the same, final, frame data. Whoops. Unsurprisingly this results in jerky motion because every 3/4 frames are identical.

The buffer pool ensures that we do not overwrite a buffer that is still waiting to be uploaded. The ra_buf_pool implementation automatically checks if existing buffers are available for use and only creates a new one if it really has to. It's hard to say for sure what the maximum number of buffers might be but we believe it won't be so large as to make this strategy unusable. The highest I've seen is 12 when using interpolation with tscale=bicubic.

A future optimisation here is to synchronise the CUDA copies with respect to the Vulkan uploads. This can be done with shared semaphores that would ensure the copy of the second frame only happens after the upload of the first frame, and so on. This isn't trivial to implement as I'd have to first adjust the hwdec code to use asynchronous CUDA; without that, there's no way to use the semaphore for synchronisation. This should result in fewer intermediate buffers being required.

2018-10-22 21:35:48 +02:00
af.rst	f_lavfi: add an option to use old audio PTS handling for af_lavfi	2018-04-15 23:11:33 +03:00
ao.rst	ao_pulse: reduce requested device buffer size	2018-04-15 23:11:33 +03:00
changes.rst	…
encode.rst	encode: remove old timestamp handling	2018-05-03 01:08:44 +03:00
input.rst	manpage: Correct show-text duration default value	2018-08-05 23:02:01 +02:00
ipc.rst	ipc: alias set_property_string to set_property	2018-05-25 10:45:59 +02:00
javascript.rst	js: implement mp.register_idle	2018-04-07 16:02:19 -07:00
libmpv.rst	…
lua.rst	scripting: change when/how player waits for scripts being loaded	2018-04-18 01:17:41 +03:00
mpv.rst	man: mention stats in interactive control	2018-10-14 21:56:34 +03:00
options.rst	vo_gpu: vulkan: hwdec_cuda: Add support for Vulkan interop	2018-10-22 21:35:48 +02:00
osc.rst	config: replace config dir lua-settings/ with dir script-opts/	2018-04-07 16:02:16 -07:00
stats.rst	config: replace config dir lua-settings/ with dir script-opts/	2018-04-07 16:02:16 -07:00
vf.rst	manpage: fix --vf exclamation mark description	2018-08-05 23:01:45 +02:00
vo.rst	manpage: minor fix to --drm-format	2018-09-30 14:22:49 +03:00
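
The buffer-pool reasoning in the commit message above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not mpv's ra_buf_pool code: `BufferPool`, `acquire`, `release` and `render_batch` are hypothetical names, plain Python values stand in for device memory, and "copy" and "upload" are modelled as ordinary assignments and reads. It shows why a single intermediate buffer yields identical frames when all copies run before all uploads, and how a pool that only reuses free buffers avoids that.

```python
class BufferPool:
    """Toy pool (hypothetical): reuse a buffer only once its upload is done."""

    def __init__(self):
        self.buffers = []  # each entry: {"data": ..., "busy": bool}

    def acquire(self):
        # Prefer an existing buffer that is no longer waiting to be uploaded.
        for buf in self.buffers:
            if not buf["busy"]:
                buf["busy"] = True
                return buf
        # Otherwise grow the pool, mirroring "only creates a new one if it has to".
        buf = {"data": None, "busy": True}
        self.buffers.append(buf)
        return buf

    def release(self, buf):
        buf["busy"] = False


def render_batch(frames, pool=None):
    """Run every copy first, then every upload, as in the batched case."""
    shared = {"data": None, "busy": False}  # the single-buffer variant
    pending = []
    for frame in frames:
        buf = pool.acquire() if pool is not None else shared
        buf["data"] = frame  # stands in for the CUDA copy
        pending.append(buf)
    uploaded = []
    for buf in pending:
        uploaded.append(buf["data"])  # stands in for the deferred upload
        if pool is not None:
            pool.release(buf)
    return uploaded


if __name__ == "__main__":
    print(render_batch(["f1", "f2", "f3"]))                # single shared buffer
    print(render_batch(["f1", "f2", "f3"], BufferPool()))  # pooled buffers
```

With the shared buffer every upload reads the last frame copied; with the pool each upload reads its own frame, and buffers released after one batch are reused by the next instead of being reallocated.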