Commit Graph

18 Commits

Author SHA1 Message Date
James Ross-Gowan 888f4e63a4 vo_gpu: d3d11: set the ra_format.storable flag
This flag was added in e2976e662d, but it was only set for Vulkan. In
D3D11 it can be set from info in D3D11_FEATURE_FORMAT_SUPPORT2.
2019-10-27 00:45:27 +11:00
James Ross-Gowan 6002e2705f vo_gpu: d3d11: use linear filtering for wrapped textures
This affects hwdec_dxva2dxgi, which uses ra_d3d11_wrap_tex to wrap RGB
video frames that are shared with a D3D9 device. Without it, mpv uses
nearest instead of bilinear scaling with --scale=bilinear (the default)
and --hwdec=dxva2. It's kind of hard to believe this bug has gone
unnoticed for almost two years, but that seems to have been the case.

Fixes: #7042
2019-10-10 19:52:19 +11:00
James Ross-Gowan 80552ab28e vo_gpu: d3d11: fix storage lifetime of compound literals
Somehow I got the idea that compound literals had function-scoped
lifetime. Instead, like all other objects with automatic storage
duration, compound literals are block-scoped, so they become invalid
after exiting the block they were declared in. It seems like a recent
change to GCC actually reuses the memory that the compound literals
used to occupy, which was causing a few bugs.

The pattern of conditionally assigning a pointer to a compound literal
was used in a few places in ra_d3d11 where the Direct3D API expects
either a pointer to an initialised struct or NULL. Change these to
ensure the lifetime of the struct includes the API call.

Should fix #6775.
2019-08-20 18:12:21 +10:00
Philip Langdale e2976e662d video/out/gpu: Add a `storable` flag to ra_format
While `ra` supports the concept of a texture as a storage
destination, it does not support the concept of a texture format
being usable for a storage texture. This can lead to us attempting
to create a texture from an incompatible format, with undefined
results.

So, let's introduce an explicit format flag for storage and use
it. In `ra_pl` we can simply reflect the `storable` flag. For
GL and D3D, we'll need to write some new code to do the compatibility
checks. I'm not going to do it here because it's not a regression;
we were already implicitly assuming all formats were storable.

Fixes #6657
2019-07-08 00:59:28 +02:00
James Ross-Gowan cc38035841 vo_gpu: d3d11: use the SPIRV-Cross C API directly
When the D3D11 backend was first written, SPIRV-Cross only had a C++ API
and no guarantee of API or ABI stability, so instead of using
SPIRV-Cross directly, mpv used an unofficial C wrapper called crossc.

Now that KhronosGroup/SPIRV-Cross#611 is resolved, SPIRV-Cross has an
official C API that can be used instead, so remove crossc and use
SPIRV-Cross directly.
2019-06-12 23:03:55 +03:00
Niklas Haas f0b6860d62 vo_gpu: index desc namespaces by ra
No reason to require them be constant. This allows them to depend on
runtime characteristics of the `ra`.
2019-04-21 23:55:22 +03:00
James Ross-Gowan 1b80e124db vo_gpu: d3d11: implement tex_download()
This allows the new GPU screenshot functionality introduced in
9f595f3a80 to work with the D3D11 backend. It replaces the old window
screenshot functionality, which was shared between D3D11 and ANGLE. The
old code can be removed, since it's not needed by ANGLE anymore either.
2018-02-13 21:25:15 +11:00
James Ross-Gowan 88c29b1301 vo_gpu: hwdec_dxva2dxgi: initial implementation
This enables DXVA2 hardware decoding with ra_d3d11. It should be useful
for Windows 7, where D3D11VA is not available. Images are transfered
from D3D9 to D3D11 using D3D9Ex surface sharing[1].

Following Microsoft's recommendations, it uses a queue of shared
surfaces, similar to Microsoft's ISurfaceQueue. This will hopefully
prevent surface sharing from impacting parallelism and allow multiple
D3D11 frames to be in-flight at once.

[1]: https://msdn.microsoft.com/en-us/library/windows/desktop/ee913554.aspx
2018-01-06 11:26:15 +11:00
James Ross-Gowan 7677c7c32c vo_gpu: d3d11: avoid copying staging buffers to cbuffers
Apparently some Intel drivers have a bug where copying from staging
buffers to constant buffers does not work. We used to keep a copy of the
buffer data in a staging buffer to enable partial constant buffer
updates. To work around this bug, keep the copy in talloc-allocated
system memory instead.

There doesn't seem to be any noticable performance difference from
keeping the copy in system memory. Our cbuffers are probably too small
for it to matter anyway.

See also: https://crbug.com/593024

Fixes #5293
2018-01-01 20:31:45 +11:00
James Ross-Gowan 6ab7e0d465 vo_gpu: d3d11: check for timestamp query support
Apparently timestamp queries are optional for 10level9 devices. Check
for support when creating the device rather than spamming error messages
during rendering. CreateQuery can be used to check for support by
passing NULL as the final parameter.

See:
https://msdn.microsoft.com/en-us/library/windows/desktop/ff476150.aspx#ID3D11Device_CreateQuery
2017-12-09 19:53:53 +11:00
James Ross-Gowan 4569f39da5 vo_gpu: d3d11: don't use runtime version for UAV slot count
FL 11_1 is only supported with the Direct3D 11.1 runtime anyway, so
there is no need to check both the runtime version and the feature
level.
2017-11-19 21:49:27 +11:00
James Ross-Gowan 021fe791e1 vo_gpu: d3d11: mark the bgra8 format as unordered
Whoops. I was confused by the double-negative here.
2017-11-19 21:17:31 +11:00
James Ross-Gowan 9b5d062d36 vo_gpu: d3d11: remove flipped texture upload hack
Made unnecessary by 4a6b04bdb9.
2017-11-12 17:10:22 +11:00
James Ross-Gowan e7bf5576e5 vo_gpu: hwdec_d3d11va: allow zero-copy video decoding
Like the manual says, this is technically undefined behaviour. See:
https://msdn.microsoft.com/en-us/library/windows/desktop/ff476085.aspx

In particular, MSDN says texture arrays created with the BIND_DECODER
flag cannot be used with CreateShaderResourceView, which means they
can't be sampled through SRVs like normal Direct3D textures. However,
some programs (Google Chrome included) do this anyway for performance
and power-usage reasons, and it appears to work with most drivers.

Older AMD drivers had a "bug" with zero-copy decoding, but this appears
to have been fixed. See #3255, #3464 and http://crbug.com/623029.
2017-11-07 20:27:13 +11:00
James Ross-Gowan b258d82d6e vo_gpu: d3d11: enhance cache invalidation
The shader cache in ra_d3d11 caches the result of shaderc, crossc and
the D3DCompiler DLL, so it should be invalidated when any of those
components are updated. This should make the cache more reliable, which
makes it safer to enable gpu-shader-cache-dir. Shader compilation is
slow with D3D11, so gpu-shader-cache-dir is highly necessary
2017-11-07 20:27:13 +11:00
James Ross-Gowan b9c1286893 vo_gpu: d3d11: log shader compilation times
Some shaders take a _long_ time to compile with the Direct3D compiler.
The ANGLE backend had this problem too, to a certain extent. Logging
should help identify which shaders cause long stalls and could also help
with benchmarking ways of reducing compile times.
2017-11-07 20:27:13 +11:00
James Ross-Gowan 9b2dae79b1 vo_gpu: d3d11: add RA caps for ra_d3d11
ra_d3d11 uses the SPIR-V compiler to translate GLSL to SPIR-V, which is
then translated to HLSL. This means it always exposes the same GLSL
version that the SPIR-V compiler supports (4.50 for shaderc/glslang.)

Despite claiming to support GLSL 4.50, some features that are tied to
the GLSL version in OpenGL are not supported by ra_d3d11 when targeting
legacy Direct3D feature levels.

This includes two features that mpv relies on:
- Reading from gl_FragCoord in the fragment shader (requires FL 10_0)
- textureGather from any texture component (requires FL 11_0)

These features have been exposed as new RA caps.
2017-11-07 20:27:13 +11:00
James Ross-Gowan 68eac1a1e7 vo_gpu: d3d11: initial implementation
This is a new RA/vo_gpu backend that uses Direct3D 11. The GLSL
generated by vo_gpu is cross-compiled to HLSL with SPIRV-Cross.

What works:

- All of mpv's internal shaders should work, including compute shaders.

- Some external shaders have been tested and work, including RAVU and
  adaptive-sharpen.

- Non-dumb mode works, even on very old hardware. Most features work at
  feature level 9_3 and all features work at feature level 10_0. Some
  features also work at feature level 9_1 and 9_2, but without high-bit-
  depth FBOs, it's not very useful. (Hardware this old is probably not
  fast enough for advanced features anyway.)

  Note: This is more compatible than ANGLE, which requires 9_3 to work
  at all (GLES 2.0,) and 10_1 for non-dumb-mode (GLES 3.0.)

- Hardware decoding with D3D11VA, including decoding of 10-bit formats
  without truncation to 8-bit.

What doesn't work / can be improved:

- PBO upload and direct rendering does not work yet. Direct rendering
  requires persistent-mapped PBOs because the decoder needs to be able
  to read data from images that have already been decoded and uploaded.
  Unfortunately, it seems like persistent-mapped PBOs are fundamentally
  incompatible with D3D11, which requires all resources to use driver-
  managed memory and requires memory to be unmapped (and hence pointers
  to be invalidated) when a resource is used in a draw or copy
  operation.

  However it might be possible to use D3D11's limited multithreading
  capabilities to emulate some features of PBOs, like asynchronous
  texture uploading.

- The blit() and clear() operations don't have equivalents in the D3D11
  API that handle all cases, so in most cases, they have to be emulated
  with a shader. This is currently done inside ra_d3d11, but ideally it
  would be done in generic code, so it can take advantage of mpv's
  shader generation utilities.

- SPIRV-Cross is used through a NIH C-compatible wrapper library, since
  it does not expose a C interface itself.

  The library is available here: https://github.com/rossy/crossc

- The D3D11 context could be made to support more modern DXGI features
  in future. For example, it should be possible to add support for
  high-bit-depth and HDR output with DXGI 1.5/1.6.
2017-11-07 20:27:13 +11:00