ANGLE was missing texture() overloads in the shader compiler for
GL_TEXTURE_EXTERNAL_OES textures. Support has been added upstream,
so we can use it now.
MSDN documents this as "Introduced in Windows 8.1". I assume that on
Windows 7 this field will simply be ignored. Too bad for Windows 7 users.
Also, I'm not using D3D11_VIDEO_PROCESSOR_NOMINAL_RANGE_16_235 and
D3D11_VIDEO_PROCESSOR_NOMINAL_RANGE_0_255, because these are apparently
completely missing from the MinGW headers. (Such a damn pain.)
ANGLE is _really_ annoying to build. (Requires a special toolchain and a
recent MSVC version.) This has caused various issues for people trying
to build mpv against ANGLE (apparently linking it against a prebuilt
binary doesn't count, and using binaries from potentially untrusted
sources is not wanted).
Dynamically loading ANGLE is going to be a huge convenience. This commit
implements this, with special focus on keeping it source compatible with
a normal build that links ANGLE at build time.
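A minimal sketch of the idea, with hypothetical names (pfn_eglGetDisplay
and angle_load are illustrative; the real loader covers the whole EGL
API, not just one entry point):

    #include <windows.h>
    #include <stdbool.h>
    #include <EGL/egl.h>

    // One function pointer per EGL entry point; a macro maps the public
    // name onto it, so existing call sites compile unchanged.
    static EGLDisplay (EGLAPIENTRY *pfn_eglGetDisplay)(EGLNativeDisplayType);
    #define eglGetDisplay pfn_eglGetDisplay

    static bool angle_load(void)
    {
        HMODULE egl = LoadLibraryW(L"libEGL.dll");
        if (!egl)
            return false;
        pfn_eglGetDisplay =
            (EGLDisplay (EGLAPIENTRY *)(EGLNativeDisplayType))
                GetProcAddress(egl, "eglGetDisplay");
        return pfn_eglGetDisplay != NULL;
    }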
This uses EGL_ANGLE_stream_producer_d3d_texture_nv12 and related
extensions to map the D3D textures coming from the hardware decoder
directly in GL.
In theory this would be trivial to achieve, but unfortunately ANGLE does
not have a mechanism to "import" D3D textures as GL textures. Instead,
an awkward mechanism via EGL_KHR_stream was implemented, which involves
at least 5 extensions and a lot of glue code. (Even worse than VAAPI EGL
interop, and very far from the simplicity you get on OSX.)
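For reference, the glue roughly boils down to the following sequence (a
condensed sketch: the extension entry points have to be resolved with
eglGetProcAddress first, the attribute lists are abbreviated, and error
handling is omitted):

    // Consumer: two GL_TEXTURE_EXTERNAL_OES textures (Y and UV planes)
    // bound to the active texture units.
    static const EGLAttrib consumer_attrs[] = {
        EGL_COLOR_BUFFER_TYPE, EGL_YUV_BUFFER_EXT,
        EGL_YUV_NUMBER_OF_PLANES_EXT, 2,
        EGL_NONE,
    };
    EGLStreamKHR stream = eglCreateStreamKHR(display, NULL);
    eglStreamConsumerGLTextureExternalAttribsNV(display, stream,
                                                (EGLAttrib *)consumer_attrs);

    // Producer: feeds D3D11 NV12 textures into the stream.
    eglCreateStreamProducerD3DTextureNV12ANGLE(display, stream, NULL);

    // Per frame: post the decoder output texture, then acquire it in GL.
    EGLAttrib frame_attrs[] = {
        EGL_D3D_TEXTURE_SUBRESOURCE_ID_ANGLE, subresource,
        EGL_NONE,
    };
    eglStreamPostD3DTextureNV12ANGLE(display, stream, (void *)d3d_tex,
                                     frame_attrs);
    eglStreamConsumerAcquireKHR(display, stream);
    // ...sample both planes in the shader...
    eglStreamConsumerReleaseKHR(display, stream);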
The ANGLE mechanism so far supports only the NV12 texture format, which
means 10 bit won't work. It also does not work in ES3 mode yet. For
these reasons, the "old" ID3D11VideoProcessor code is kept and used as a
fallback.
Rename gl_hwdec_driver.map_image to map_frame, and let it fill out a
struct gl_hwdec_frame describing the exact texture layout. This gives
more flexibility to what the hwdec interop can export. In particular, it
can export strange component orders/permutations and textures with
padded size. (The latter originating from cropped video.)
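In code, the exported description is roughly this shape (a sketch; the
actual header is authoritative):

    struct gl_hwdec_plane {
        GLuint gl_texture;
        GLenum gl_target;
        int tex_w, tex_h;   // allocated texture size; can be larger than
                            // the visible size, e.g. with cropped video
    };

    struct gl_hwdec_frame {
        struct gl_hwdec_plane planes[4];
    };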
The way gl_hwdec_frame works is in the spirit of the rest of the
vo_opengl video processing code, which tends to put as much information
as possible in immediate state (as part of the dataflow) rather than
declaring it globally. To some degree this duplicates the texplane and img_tex
structs, but until we somehow unify those, it's better to give the hwdec
state its own struct. The fact that changing the hwdec struct would
require changes and testing on at least 4 platform/GPU combinations
makes duplicating it almost a requirement to avoid pain later.
Make gl_hwdec_driver.reinit set the new image format and remove the
gl_hwdec.converted_imgfmt field.
Likewise, gl_hwdec.gl_texture_target is replaced with
gl_hwdec_plane.gl_target.
Split out an init_image_desc function from init_format. The latter is not
called in the hwdec case at all anymore. Setting up most of struct
texplane is also completely separate in the hwdec and normal cases.
video.c does not check whether the hwdec "mapped" image format is
supported. This should not really happen anyway, and if it does, the
hwdec interop backend must fail at creation time, so this is not an
issue.
The main change is with video/hwdec.h. mp_hwdec_info is made opaque (and
renamed to mp_hwdec_devices). Its accessors are mainly thread-safe (or
documented where not), which makes the whole thing saner and cleaner. In
particular, thread-safety rules become less subtle and more obvious.
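A sketch of what the accessor-style interface looks like (illustrative;
see the header for the full set of functions):

    struct mp_hwdec_devices;  // opaque; definition private to hwdec.c

    struct mp_hwdec_devices *hwdec_devices_create(void);
    void hwdec_devices_destroy(struct mp_hwdec_devices *devs);

    // Thread-safe: locking happens inside the implementation.
    struct mp_hwdec_ctx *hwdec_devices_get(struct mp_hwdec_devices *devs,
                                           enum hwdec_type type);
    void hwdec_devices_add(struct mp_hwdec_devices *devs,
                           struct mp_hwdec_ctx *ctx);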
The new internal API makes it easier to support multiple OpenGL interop
backends. (Although this is not done yet, and it's not clear whether it
ever will.)
This also removes all the API-specific fields from mp_hwdec_ctx and
replaces them with a "ctx" field. For d3d in particular, we drop the
mp_d3d_ctx struct completely, and pass the interfaces directly.
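So for d3d11va, the device handle is now passed along these lines (a
sketch; field names besides "ctx" are illustrative):

    struct mp_hwdec_ctx {
        enum hwdec_type type;
        void *ctx;  // API-specific, e.g. ID3D11Device * for d3d11va
    };

    // Producer (VO) side:
    hwctx->ctx = device;                  // ID3D11Device *

    // Consumer (decoder) side:
    ID3D11Device *device = hwctx->ctx;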
Remove the emulation checks from vaapi.c and vdpau.c; they are
pointless, and the checks that matter are done on the VO layer.
The d3d hardware decoders might slightly change behavior: dxva2-copy
will not use the VO device anymore if the VO supports proper interop.
This pretty much assumes that in such cases the VO will not use any
form of exclusive mode, which makes using the VO device in copy mode
unnecessary.
This is a big refactor. Some things may be untested and could be broken.
If ANGLE was probed before (but rejected), the ANGLE API can remain
"initialized", and eglGetCurrentDisplay() will return a non-NULL
EGLDisplay. If a native GL context is then used, the ANGLE/EGL API will
(apparently) keep working alongside the native OpenGL API. Since GL
objects are just numbers, they'll simply fail to interact, and OpenGL
will get invalid textures. For some reason this will result in black
textures.
With VAAPI-EGL, something similar could happen in theory, but didn't in
practice.
Including initguid.h at the top of a file that references GUIDs causes
the GUIDs to be defined globally with __declspec(selectany). The
'selectany' attribute tells the linker to consolidate multiple
definitions of each GUID, which would be great except that, in Cygwin
and MinGW GCC 6.1, this method of linking makes the GUIDs conflict with
the ones declared in libuuid.a.
Since initguid.h obsoletes libuuid.a in modern compilers that support
__declspec(selectany), add initguid.h to all files that use GUIDs and
remove libuuid.a from the build.
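Concretely, the include order in each such file becomes:

    #include <initguid.h>  // first: makes DEFINE_GUID in later headers
                           // emit __declspec(selectany) definitions
    #include <d3d11.h>     // GUIDs are now defined inline, so nothing
                           // needs to be resolved from libuuid.a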
Fixes #3097
Basically this gets rid of the need for the accessors in d3d11va.h, and
the code can be cleaned up a little bit.
Note that libavcodec only defines an ID3D11VideoDecoderOutputView pointer
in the last plane pointer, but it tolerates/passes through the other
plane pointers we set.
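A sketch of the resulting plane-pointer use (the indices for the texture
and subresource are illustrative; d3d11va.h is authoritative):

    mpi->planes[1] = (uint8_t *)texture;             // ID3D11Texture2D *
    mpi->planes[2] = (uint8_t *)(intptr_t)subindex;  // array slice index
    mpi->planes[3] = (uint8_t *)view;  // ID3D11VideoDecoderOutputView *,
                                       // the only one libavcodec defines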
This uses ID3D11VideoProcessor to convert the video to a RGBA surface,
which is then bound to ANGLE. Currently ANGLE does not provide any way
to bind NV12 surfaces directly, so this will have to do.
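Per frame, the conversion itself is a single blit (a sketch without
error handling; in_view/out_view are the input/output views on the
decoder and target textures):

    D3D11_VIDEO_PROCESSOR_STREAM stream = {
        .Enable = TRUE,
        .pInputSurface = in_view,   // ID3D11VideoProcessorInputView *
    };
    ID3D11VideoContext_VideoProcessorBlt(video_ctx, video_proc,
                                         out_view, 0, 1, &stream);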
ID3D11VideoContext1 would give us slightly more control over the
colorspace conversion, though it's still not good, and it's not
available in the MinGW headers yet.
The video processor is created lazily, because we need the coded
frame size, of which AVFrame and mp_image have no concept. Doing the
creation lazily is less of a pain than somehow hacking the coded frame
size into mp_image.
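A sketch of the lazy (re)creation, with illustrative field names
(p->c_w/p->c_h cache the last coded size):

    if (!p->video_proc || c_w != p->c_w || c_h != p->c_h) {
        // drop the old processor/enumerator here, then:
        D3D11_VIDEO_PROCESSOR_CONTENT_DESC vpdesc = {
            .InputWidth   = c_w,   // coded size, taken from the
            .InputHeight  = c_h,   // decoder texture's description
            .OutputWidth  = c_w,
            .OutputHeight = c_h,
            .Usage = D3D11_VIDEO_USAGE_PLAYBACK_NORMAL,
        };
        ID3D11VideoDevice_CreateVideoProcessorEnumerator(p->video_dev,
                &vpdesc, &p->vp_enum);
        ID3D11VideoDevice_CreateVideoProcessor(p->video_dev, p->vp_enum,
                0, &p->video_proc);
        p->c_w = c_w;
        p->c_h = c_h;
    }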
I'm not really sure how ID3D11VideoProcessorInputView is supposed to
work. We recreate it on every frame, which is simple and hopefully
doesn't affect performance.
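What the per-frame creation looks like (a sketch; texture and subindex
come from the mp_image plane pointers described earlier):

    D3D11_VIDEO_PROCESSOR_INPUT_VIEW_DESC indesc = {
        .ViewDimension = D3D11_VPIV_DIMENSION_TEXTURE2D,
        .Texture2D = { .ArraySlice = subindex },
    };
    ID3D11VideoProcessorInputView *in_view = NULL;
    ID3D11VideoDevice_CreateVideoProcessorInputView(p->video_dev,
            (ID3D11Resource *)texture, p->vp_enum, &indesc, &in_view);
    // ...use in_view for this frame's VideoProcessorBlt...
    ID3D11VideoProcessorInputView_Release(in_view);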