mpv/video/hwdec.h

134 lines
4.8 KiB
C
Raw Normal View History

#ifndef MP_HWDEC_H_
#define MP_HWDEC_H_
#include "options/m_option.h"
struct mp_image_pool;
// keep in sync with --hwdec option (see mp_hwdec_names)
enum hwdec_type {
HWDEC_NONE = 0,
HWDEC_AUTO,
HWDEC_AUTO_COPY,
HWDEC_VDPAU,
HWDEC_VDPAU_COPY,
HWDEC_VIDEOTOOLBOX,
HWDEC_VIDEOTOOLBOX_COPY,
HWDEC_VAAPI,
HWDEC_VAAPI_COPY,
HWDEC_DXVA2,
HWDEC_DXVA2_COPY,
HWDEC_D3D11VA,
HWDEC_D3D11VA_COPY,
HWDEC_RPI,
HWDEC_RPI_COPY,
HWDEC_MEDIACODEC,
HWDEC_MEDIACODEC_COPY,
hwdec/opengl: Add support for CUDA and cuvid/NvDecode Nvidia's "NvDecode" API (up until recently called "cuvid" is a cross platform, but nvidia proprietary API that exposes their hardware video decoding capabilities. It is analogous to their DXVA or VDPAU support on Windows or Linux but without using platform specific API calls. As a rule, you'd rather use DXVA or VDPAU as these are more mature and well supported APIs, but on Linux, VDPAU is falling behind the hardware capabilities, and there's no sign that nvidia are making the investments to update it. Most concretely, this means that there is no VP8/9 or HEVC Main10 support in VDPAU. On the other hand, NvDecode does export vp8/9 and partial support for HEVC Main10 (more on that below). ffmpeg already has support in the form of the "cuvid" family of decoders. Due to the design of the API, it is best exposed as a full decoder rather than an hwaccel. As such, there are decoders like h264_cuvid, hevc_cuvid, etc. These decoders support two output paths today - in both cases, NV12 frames are returned, either in CUDA device memory or regular system memory. In the case of the system memory path, the decoders can be used as-is in mpv today with a command line like: mpv --vd=lavc:h264_cuvid foobar.mp4 Doing this will take advantage of hardware decoding, but the cost of the memcpy to system memory adds up, especially for high resolution video (4K etc). To avoid that, we need an hwdec that takes advantage of CUDA's OpenGL interop to copy from device memory into OpenGL textures. That is what this change implements. The process is relatively simple as only basic device context aquisition needs to be done by us - the CUDA buffer pool is managed by the decoder - thankfully. The hwdec looks a bit like the vdpau interop one - the hwdec maintains a single set of plane textures and each output frame is repeatedly mapped into these textures to pass on. The frames are always in NV12 format, at least until 10bit output supports emerges. The only slightly interesting part of the copying process is that CUDA works by associating PBOs, so we need to define these for each of the textures. TODO Items: * I need to add a download_image function for screenshots. This would do the same copy to system memory that the decoder's system memory output does. * There are items to investigate on the ffmpeg side. There appears to be a problem with timestamps for some content. Final note: I mentioned HEVC Main10. While there is no 10bit output support, NvDecode can return dithered 8bit NV12 so you can take advantage of the hardware acceleration. This particular mode requires compiling ffmpeg with a modified header (or possibly the CUDA 8 RC) and is not upstream in ffmpeg yet. Usage: You will need to specify vo=opengl and hwdec=cuda. Note that hwdec=auto will probably not work as it will try to use vdpau first. mpv --hwdec=cuda --vo=opengl foobar.mp4 If you want to use filters that require frames in system memory, just use the decoder directly without the hwdec, as documented above.
2016-09-04 22:23:55 +00:00
HWDEC_CUDA,
HWDEC_CUDA_COPY,
HWDEC_NVDEC,
HWDEC_NVDEC_COPY,
HWDEC_CRYSTALHD,
HWDEC_RKMPP,
};
#define HWDEC_IS_AUTO(x) ((x) == HWDEC_AUTO || (x) == HWDEC_AUTO_COPY)
// hwdec_type names (options.c)
extern const struct m_opt_choice_alternatives mp_hwdec_names[];
struct mp_hwdec_ctx {
enum hwdec_type type; // (never HWDEC_NONE or HWDEC_IS_AUTO)
const char *driver_name; // NULL if unknown/not loaded
// This is never NULL. Its meaning depends on the .type field:
2017-03-23 10:16:02 +00:00
// HWDEC_VDPAU: struct mp_vdpau_ctx*
// HWDEC_VIDEOTOOLBOX: non-NULL dummy pointer
// HWDEC_VAAPI: struct mp_vaapi_ctx*
// HWDEC_D3D11VA: ID3D11Device*
// HWDEC_DXVA2: IDirect3DDevice9*
// HWDEC_CUDA: CUcontext*
void *ctx;
// libavutil-wrapped context, if available.
struct AVBufferRef *av_device_ref; // AVHWDeviceContext*
// List of IMGFMT_s, terminated with 0. NULL if N/A.
int *supported_formats;
// Hint to generic code: it's using a wrapper API
bool emulated;
vdpau: crappy hack to allow initializing hw decoding after preemption If vo_opengl is used, and vo_opengl already created the vdpau interop (for whatever reasons), and then preemption happens, and then you try to enable hw decoding, it failed. The reason was that preemption recovery is not run at any point before libavcodec accesses the vdpau device. The actual impact was that with libmpv + opengl-cb use, hardware decoding was permanently broken after display mode switching (something that caused the display to get preempted at least with older drivers). With mpv CLI, you can for example enable hw decoding during playback, then disable it, VT switch to console, switch back to X, and try to enable hw decoding again. This is mostly because libav* does not deal with preemption, and NVIDIA driver preemption behavior being horrible garbage. In addition to being misdesigned API, the preemption callback is not called before you try to access vdpau API, and then only with _some_ accesses. In summary, the preemption callback was never called, neither before nor after libavcodec tried to init the decoder. So we have to get mp_vdpau_handle_preemption() called before libavcodec accesses it. This in turn will do a dummy API access which usually triggers the preemption callback immediately (with NVIDIA's drivers). In addition, we have to update the AVHWDeviceContext's device. In theory it could change (in practice it usually seems to use handle "0"). Creating a new device would cause chaos, as we don't have a concept of switching the device context on the fly. So we simply update it directly. I'm fairly sure this violates the libav* API, but it's the best we can do.
2017-05-19 13:24:38 +00:00
// Optional. Crap for vdpau. Makes sure preemption recovery is run if needed.
void (*restore_device)(struct mp_hwdec_ctx *ctx);
// Optional. Do not set for VO-bound devices.
void (*destroy)(struct mp_hwdec_ctx *ctx);
};
// Used to communicate hardware decoder device handles from VO to video decoder.
struct mp_hwdec_devices;
struct mp_hwdec_devices *hwdec_devices_create(void);
void hwdec_devices_destroy(struct mp_hwdec_devices *devs);
// Return the device context for the given API type. Returns NULL if none
// available. Logically, the returned pointer remains valid until VO
// uninitialization is started (all users of it must be uninitialized before).
// hwdec_devices_request() may be used before this to lazily load devices.
struct mp_hwdec_ctx *hwdec_devices_get(struct mp_hwdec_devices *devs,
enum hwdec_type type);
// For code which still strictly assumes there is 1 (or none) device.
struct mp_hwdec_ctx *hwdec_devices_get_first(struct mp_hwdec_devices *devs);
// Add this to the list of internal devices. Adding the same pointer twice must
// be avoided.
void hwdec_devices_add(struct mp_hwdec_devices *devs, struct mp_hwdec_ctx *ctx);
// Remove this from the list of internal devices. Idempotent/ignores entries
vo_gpu: make it possible to load multiple hwdec interop drivers Make the VO<->decoder interface capable of supporting multiple hwdec APIs at once. The main gain is that this simplifies autoprobing a lot. Before this change, it could happen that the VO loaded the "wrong" hwdec API, and the decoder was stuck with the choice (breaking hw decoding). With the change applied, the VO simply loads all available APIs, so autoprobing trickery is left entirely to the decoder. In the past, we were quite careful about not accidentally loading the wrong interop drivers. This was in part to make sure autoprobing works, but also because libva had this obnoxious bug of dumping garbage to stderr when using the API. libva was fixed, so this is not a problem anymore. The --opengl-hwdec-interop option is changed in various ways (again...), and renamed to --gpu-hwdec-interop. It does not have much use anymore, other than debugging. It's notable that the order in the hwdec interop array ra_hwdec_drivers[] still matters if multiple drivers support the same image formats, so the option can explicitly force one, if that should ever be necessary, or more likely, for debugging. One example are the ra_hwdec_d3d11egl and ra_hwdec_d3d11eglrgb drivers, which both support d3d11 input. vo_gpu now always loads the interop lazily by default, but when it does, it loads them all. vo_opengl_cb now always loads them when the GL context handle is initialized. I don't expect that this causes any problems. It's now possible to do things like changing between vdpau and nvdec decoding at runtime. This is also preparation for cleaning up vd_lavc.c hwdec autoprobing. It's another reason why hwdec_devices_request_all() does not take a hwdec type anymore.
2017-12-01 04:05:00 +00:00
// not added yet. This is not thread-safe.
void hwdec_devices_remove(struct mp_hwdec_devices *devs, struct mp_hwdec_ctx *ctx);
// Can be used to enable lazy loading of an API with hwdec_devices_request().
// If used at all, this must be set/unset during initialization/uninitialization,
// as concurrent use with hwdec_devices_request() is a race condition.
void hwdec_devices_set_loader(struct mp_hwdec_devices *devs,
vo_gpu: make it possible to load multiple hwdec interop drivers Make the VO<->decoder interface capable of supporting multiple hwdec APIs at once. The main gain is that this simplifies autoprobing a lot. Before this change, it could happen that the VO loaded the "wrong" hwdec API, and the decoder was stuck with the choice (breaking hw decoding). With the change applied, the VO simply loads all available APIs, so autoprobing trickery is left entirely to the decoder. In the past, we were quite careful about not accidentally loading the wrong interop drivers. This was in part to make sure autoprobing works, but also because libva had this obnoxious bug of dumping garbage to stderr when using the API. libva was fixed, so this is not a problem anymore. The --opengl-hwdec-interop option is changed in various ways (again...), and renamed to --gpu-hwdec-interop. It does not have much use anymore, other than debugging. It's notable that the order in the hwdec interop array ra_hwdec_drivers[] still matters if multiple drivers support the same image formats, so the option can explicitly force one, if that should ever be necessary, or more likely, for debugging. One example are the ra_hwdec_d3d11egl and ra_hwdec_d3d11eglrgb drivers, which both support d3d11 input. vo_gpu now always loads the interop lazily by default, but when it does, it loads them all. vo_opengl_cb now always loads them when the GL context handle is initialized. I don't expect that this causes any problems. It's now possible to do things like changing between vdpau and nvdec decoding at runtime. This is also preparation for cleaning up vd_lavc.c hwdec autoprobing. It's another reason why hwdec_devices_request_all() does not take a hwdec type anymore.
2017-12-01 04:05:00 +00:00
void (*load_api)(void *ctx), void *load_api_ctx);
vo_gpu: make it possible to load multiple hwdec interop drivers Make the VO<->decoder interface capable of supporting multiple hwdec APIs at once. The main gain is that this simplifies autoprobing a lot. Before this change, it could happen that the VO loaded the "wrong" hwdec API, and the decoder was stuck with the choice (breaking hw decoding). With the change applied, the VO simply loads all available APIs, so autoprobing trickery is left entirely to the decoder. In the past, we were quite careful about not accidentally loading the wrong interop drivers. This was in part to make sure autoprobing works, but also because libva had this obnoxious bug of dumping garbage to stderr when using the API. libva was fixed, so this is not a problem anymore. The --opengl-hwdec-interop option is changed in various ways (again...), and renamed to --gpu-hwdec-interop. It does not have much use anymore, other than debugging. It's notable that the order in the hwdec interop array ra_hwdec_drivers[] still matters if multiple drivers support the same image formats, so the option can explicitly force one, if that should ever be necessary, or more likely, for debugging. One example are the ra_hwdec_d3d11egl and ra_hwdec_d3d11eglrgb drivers, which both support d3d11 input. vo_gpu now always loads the interop lazily by default, but when it does, it loads them all. vo_opengl_cb now always loads them when the GL context handle is initialized. I don't expect that this causes any problems. It's now possible to do things like changing between vdpau and nvdec decoding at runtime. This is also preparation for cleaning up vd_lavc.c hwdec autoprobing. It's another reason why hwdec_devices_request_all() does not take a hwdec type anymore.
2017-12-01 04:05:00 +00:00
// Cause VO to lazily load all devices, and will block until this is done (even
// if not available).
void hwdec_devices_request_all(struct mp_hwdec_devices *devs);
// Convenience function:
// - return NULL if devs==NULL
// - call hwdec_devices_request(devs, type)
// - call hwdec_devices_get(devs, type)
// - then return the mp_hwdec_ctx.ctx field
void *hwdec_devices_load(struct mp_hwdec_devices *devs, enum hwdec_type type);
vo_gpu: make it possible to load multiple hwdec interop drivers Make the VO<->decoder interface capable of supporting multiple hwdec APIs at once. The main gain is that this simplifies autoprobing a lot. Before this change, it could happen that the VO loaded the "wrong" hwdec API, and the decoder was stuck with the choice (breaking hw decoding). With the change applied, the VO simply loads all available APIs, so autoprobing trickery is left entirely to the decoder. In the past, we were quite careful about not accidentally loading the wrong interop drivers. This was in part to make sure autoprobing works, but also because libva had this obnoxious bug of dumping garbage to stderr when using the API. libva was fixed, so this is not a problem anymore. The --opengl-hwdec-interop option is changed in various ways (again...), and renamed to --gpu-hwdec-interop. It does not have much use anymore, other than debugging. It's notable that the order in the hwdec interop array ra_hwdec_drivers[] still matters if multiple drivers support the same image formats, so the option can explicitly force one, if that should ever be necessary, or more likely, for debugging. One example are the ra_hwdec_d3d11egl and ra_hwdec_d3d11eglrgb drivers, which both support d3d11 input. vo_gpu now always loads the interop lazily by default, but when it does, it loads them all. vo_opengl_cb now always loads them when the GL context handle is initialized. I don't expect that this causes any problems. It's now possible to do things like changing between vdpau and nvdec decoding at runtime. This is also preparation for cleaning up vd_lavc.c hwdec autoprobing. It's another reason why hwdec_devices_request_all() does not take a hwdec type anymore.
2017-12-01 04:05:00 +00:00
// Return "," concatenated list (for introspection/debugging). Use talloc_free().
char *hwdec_devices_get_names(struct mp_hwdec_devices *devs);
2017-10-16 15:06:01 +00:00
struct mp_image;
// Per AV_HWDEVICE_TYPE_* functions, queryable via hwdec_get_hwcontext_fns().
// For now, all entries are strictly optional.
struct hwcontext_fns {
int av_hwdevice_type;
// Set any mp_image fields that require hwcontext specific code, such as
// fields or flags not present in AVFrame or AVHWFramesContext in a
// portable way. This is called directly after img is converted from an
// AVFrame, with all other fields already set. img.hwctx will be set, and
// use the correct AV_HWDEVICE_TYPE_.
void (*complete_image_params)(struct mp_image *img);
// Fill in special format-specific requirements.
void (*refine_hwframes)(struct AVBufferRef *hw_frames_ctx);
};
// The parameter is of type enum AVHWDeviceType (as in int to avoid extensive
// recursive includes). May return NULL for unknown device types.
const struct hwcontext_fns *hwdec_get_hwcontext_fns(int av_hwdevice_type);
#endif