Commit Graph

231 Commits

Author SHA1 Message Date
wm4 b7eae31834 vo_gpu: hwdec_d3d11eglrgb: remove this
Finally. Since with the previous commit we can (probably) handle
P010 directly, this hack isn't needed anymore.
2019-10-16 23:41:06 +02:00
Jan Ekström eaa3c1c922 vo_gpu/d3d11: fix memleak of the adapter description string 2019-10-15 22:12:48 +03:00
Jan Ekström 03e7a36a73 vo_gpu/d3d11: remove unnecessary nullptr check
mp_to_utf8 will abort in case of either invalid input or OOM.
2019-10-15 22:12:48 +03:00
Jan Ekström 89f4ce9d6f vo_gpu/d3d11: switch adapter selection to case-insensitive startswith
This lets users set values such as "intel" or "nvidia" as the
adapter vendor is generally noted in the beginning of the
description string.
2019-10-15 22:12:48 +03:00
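The prefix match described in 89f4ce9d6f can be illustrated with a minimal sketch. This is not mpv's C code (which the next commit says works on bstr); the function names and the adapter description strings are hypothetical and only show why a value such as "nvidia" selects the right adapter.

```python
def adapter_matches(description: str, requested: str) -> bool:
    """Return True if `description` starts with `requested`, ignoring case."""
    return description.casefold().startswith(requested.casefold())


def pick_adapter(descriptions, requested):
    """Return the first adapter description matching the prefix, or None."""
    for desc in descriptions:
        if adapter_matches(desc, requested):
            return desc
    return None


# Hypothetical description strings, as a driver might report them.
adapters = ["Intel(R) UHD Graphics 620", "NVIDIA GeForce GTX 1060"]
print(pick_adapter(adapters, "nvidia"))
```

Because the vendor name usually leads the description string, a short lowercase prefix is enough to disambiguate on typical dual-GPU systems.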
Jan Ekström 684ffd13b4 vo_gpu/d3d11: fixup adapter selection by switching it all to bstr
I did ponder if I should have done this right away, and it seems
like not doing it at first was a mistake.
2019-10-15 22:12:48 +03:00
Jan Ekström 648d785930 vo_gpu/d3d11: add support for configuring swap chain format
Query information on the system output most linked to the swap chain,
and utilize either a user-configured format or, depending on the
system output's bit depth, 8bit RGBA or 10bit RGB with 2bit alpha.
2019-10-13 22:31:33 +11:00
Jan Ekström 1f76e69145 vo_gpu/d3d11: add adapter name validation and listing with "help"
Not the prettiest way to get it done, but seems to work.
2019-09-29 19:39:26 +03:00
Jan Ekström bca6e14702 vo_gpu/d3d11: refactor pthread_once d3d11 loading to function
Lets us reuse this in the future.
2019-09-29 19:39:26 +03:00
Jan Ekström b7438d3aff vo_gpu/d3d11: utilize the passed adapter name
Normalize nullptr and an empty string both to nullptr to simplify
handling. API users cannot set a value back to nullptr, so both
an empty string as well as nullptr should behave the same.
2019-09-29 19:39:26 +03:00
Jan Ekström e6447e2e89 vo_gpu/d3d11: add an option for the adapter name
Set it from the adapter name in the d3d11 options.
2019-09-29 19:39:26 +03:00
Jan Ekström e205e179e0 vo_gpu/d3d11_helpers: also load up CreateDXGIFactory1
Just a factory, without a device, is required for listing of devices.
2019-09-29 19:39:26 +03:00
Anton Kindestam 6290420380 vo: make swapchain-depth option generic for all VOs
In preparation for making vo_drm able to use swapchain-depth
2019-09-28 14:10:01 +03:00
Wessel Dankers 643417dd17 video: add pure gamma TRC curves for 2.0, 2.4 and 2.6. 2019-09-27 13:21:41 +02:00
Philip Sequeira 21a5c416d5 options: add M_OPT_FILE to some more options that take files 2019-09-27 13:19:29 +02:00
sfan5 e350ceef4c vo_gpu: vulkan: add Android context 2019-09-27 00:05:06 +03:00
Cameron Cawley db09d77e46 rpi: Update for modern systems 2019-09-20 11:39:06 +02:00
wm4 c6773692ad vo_gpu: remove vdpau/GLX backend
Useless garbage.

This was once added to test whether vdpau presentation feedback could be
used. Results were always unsatisfactory, and now vdpau is dead.
2019-09-19 20:37:05 +02:00
wm4 83d7123dc3 vo_gpu: remove mali-fbdev
Useless at this point, I don't even know if it still works, or how to
test it.
2019-09-19 20:37:05 +02:00
Anton Kindestam e08f235578 drm: fix libmpv ABI breakage introduced in 351c083487
Extending the client-allocated mpv_opengl_drm_params struct
constituted a break of ABI that could cause UB.

Create a clean break by deprecating "drm_params" and related structs
and enum values, and replacing it with "drm_params_v2".

Also fix some comments and code that wrongly assumed that open could
return any other negative number than -1 for failure.

This commit updates the libmpv version to 1.104
2019-09-18 23:59:32 +03:00
Philip Langdale fa0a905ea0 vo_gpu: hwdec_vaapi: Refactor Vulkan and OpenGL interops for VAAPI
Like hwdec_cuda, you get a big #ifdef mess if you try and keep the
OpenGL and Vulkan interops in the same file. So, I've refactored
them into separate files in a similar way.
2019-09-15 17:51:47 -07:00
wm4 0abe34ed21 vo_gpu: x11: remove special vdpau probing, use EGL by default
Originally, vo_gpu/vo_opengl considered the case of Nvidia proprietary
drivers, which required vdpau/GLX, and Intel open source drivers, which
require vaapi/EGL. Since window creation and GPU context creation are
inseparable in mpv's internal API, it had to pick the correct API very
early, or hardware decoding wouldn't work. "x11probe" was introduced for
this reason. It created a GLX context (without showing the window yet),
and checked whether vdpau was available. If yes, it used GLX, if not, it
continued probing x11/EGL. (Obviously it couldn't always fail on GLX
without vdpau, which is why it was a separate "probe" backend.)

Years passed, and now the situation is different. Vdpau is dead. Nvidia
drivers and libavcodec now provide CUDA interop, which requires EGL, and
fixes some of the vdpau problems. AMD drivers now provide vaapi, which
generally works better than vdpau. Intel didn't change.

In particular, vaapi provides working HEVC Main10 support. In theory, it
should work on vdpau too, with quality reduction (no 10 bit surfaces),
but I couldn't get it to work.

So always prefer EGL. And suddenly hardware decoding works. This is
actually rather important, because HEVC is unfortunately on the rise,
despite shitty encoders and unoptimized decoders. The latter may mean
that hardware decoding works better than libavcodec.

This should have been done a long, long time ago.
2019-09-15 20:00:52 +03:00
Niklas Haas a416b3f084 vo_gpu: correctly normalize src.sig_peak
In some cases, src.sig_peak remains undefined as 0, which was definitely
the case when using the OSD, since it never got passed through the usual
color space normalization process. Most robust work-around is to simply
force the normalization at the site where it's needed. This ensures this
value is always valid and defined, to make the peak-dependent logic in
these two functions always work.

Fixes 4b25ec3a9d
Fixes #6917
Fixes #6918
2019-09-15 01:33:27 +02:00
Niklas Haas 4b25ec3a9d vo/gpu: fix check on src/dst peak mismatch
In the past, src peak was always equal to or higher than dst peak. But
since `--target-peak` got introduced, this could no longer be the case.
This leads to an incorrect result (scaling for peak mismatch in gamma
light) unless some other option (CMS, --linear-scaling, etc.) forces the
linearization.

Fixes #6533
2019-09-05 19:13:44 +03:00
wnoun ae8cb39ab2 vo_gpu: fix taking screenshots of rotated videos 2019-08-14 21:54:14 +02:00
Philip Langdale e2976e662d video/out/gpu: Add a `storable` flag to ra_format
While `ra` supports the concept of a texture as a storage
destination, it does not support the concept of a texture format
being usable for a storage texture. This can lead to us attempting
to create a texture from an incompatible format, with undefined
results.

So, let's introduce an explicit format flag for storage and use
it. In `ra_pl` we can simply reflect the `storable` flag. For
GL and D3D, we'll need to write some new code to do the compatibility
checks. I'm not going to do it here because it's not a regression;
we were already implicitly assuming all formats were storable.

Fixes #6657
2019-07-08 00:59:28 +02:00
Bin Jin c9e7473d67 vo_gpu: process three components together in error diffusion
This started as a desperate attempt to lower the memory requirement
of error diffusion, but it later turned out that this change also
improved the rendering performance a lot (by 40% in my tests).

Errors were stored in three uints before this change, each with 24bit
precision. This change encodes them into a single uint, each with 8bit
precision. This reduces the shared memory usage, as well as the number
of atomic operations, by a factor of three.

Before this change, with the minimum required 32kb shared memory, only
the `simple` kernel could be used to render 1080p video, which is mostly
useless compared to `--dither=fruit`. After this change, 32kb can
handle the `burkes` kernel for 1080p, or `sierra-lite` for 4K resolution.
2019-06-16 11:19:44 +02:00
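The packing idea in c9e7473d67 can be sketched as follows. The commit does not spell out the bit layout, so the field order, the bias of 128 and the function names here are assumptions made for illustration only; the real code is a compute shader.

```python
BITS = 8
MASK = (1 << BITS) - 1
BIAS = 1 << (BITS - 1)  # 128: maps signed errors in [-128, 127] to 0..255


def pack_errors(r: int, g: int, b: int) -> int:
    """Pack three signed 8-bit errors into one unsigned integer (assumed layout)."""
    for err in (r, g, b):
        if not -BIAS <= err < BIAS:
            raise ValueError("error does not fit in 8 bits")
    return ((r + BIAS) << 16) | ((g + BIAS) << 8) | (b + BIAS)


def unpack_errors(word: int):
    """Recover the three signed errors from a packed word."""
    return tuple(((word >> shift) & MASK) - BIAS for shift in (16, 8, 0))


packed = pack_errors(-5, 0, 127)
print(hex(packed), unpack_errors(packed))
```

With one word per pixel instead of three, a single shared-memory slot and a single atomic update cover all three channels, which is where the threefold reduction comes from.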
Bin Jin f6fd127fe8 vo_gpu: fix use of existing textures in error diffusion
Error diffusion requires two texture rendering passes. The existing code
reuses `screen_tex` and creates another for this purpose. This generally
works well for opengl, but could potentially be problematic for
vulkan, due to its async nature.
2019-06-16 11:19:44 +02:00
Bin Jin ca2f193671 vo_gpu: implement error diffusion for dithering
This is a straightforward parallel implementation of error diffusion
algorithms in a compute shader. Basically we use a single work group
with the maximal possible size to process the whole image. After a shift
mapping we are able to process all pixels column by column.

A large ring buffer is allocated in shared memory to speed things up.
However, the size of the required shared memory depends linearly on the
height of the video window (or screen height in fullscreen mode). In case
there is not enough shared memory, it will fall back to `--dither=fruit`.

The maximal allowed work group size is hardcoded as 1024. Ideally we
could query `GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS`. But for whatever
reason, it seems most high end cards from nvidia and amd support only
the minimal required value, so I guess we can stick to it for now.
2019-06-16 11:19:44 +02:00
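The "shift mapping" in ca2f193671 can be illustrated with a small sketch. It assumes the Floyd-Steinberg kernel and a per-row shift of 2, neither of which is stated in the commit message; the point is only that once each row is shifted, every pixel a given pixel diffuses error into lies in a later column, so all pixels sharing a column can be processed independently.

```python
from fractions import Fraction

# Floyd-Steinberg kernel (assumed for illustration): (dx, dy, weight).
KERNEL = [(1, 0, Fraction(7, 16)), (-1, 1, Fraction(3, 16)),
          (0, 1, Fraction(5, 16)), (1, 1, Fraction(1, 16))]
SHIFT = 2  # hypothetical per-row shift, sufficient for this kernel


def column_of(x, y):
    """Column index of pixel (x, y) after shifting row y by SHIFT * y."""
    return x + SHIFT * y


def raster_order(w, h):
    return [(x, y) for y in range(h) for x in range(w)]


def column_order(w, h):
    """Visit pixels column by column in the shifted layout."""
    return sorted(raster_order(w, h), key=lambda p: (column_of(*p), p[1]))


def dither(image, order):
    """1-bit error diffusion, visiting pixels in the given order."""
    h, w = len(image), len(image[0])
    work = [[Fraction(v) for v in row] for row in image]  # exact arithmetic
    out = [[0] * w for _ in range(h)]
    for x, y in order:
        old = work[y][x]
        new = 1 if old >= Fraction(1, 2) else 0
        out[y][x] = new
        err = old - new
        for dx, dy, weight in KERNEL:
            nx, ny = x + dx, y + dy
            if 0 <= nx < w and 0 <= ny < h:
                work[ny][nx] += err * weight
    return out
```

Exact fractions are used so that the raster-order and column-order results can be compared for equality without floating-point reordering effects.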
Bin Jin fbe267150d vo_gpu: fix --scaler-resizes-only for fractional ratio scaling
The calculation of the scale factor involves 32-bit floats, and a strict
equality test will effectively ignore the `--scaler-resizes-only` option
for some non-integer scale factors.

Fix this by using non-strict equality check.
2019-06-06 20:01:56 +02:00
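A non-strict equality check of the kind fbe267150d describes might look like the sketch below. The tolerance value and the function name are assumptions; the commit message does not say which epsilon the fix uses.

```python
import math


def is_unscaled(scale: float, rel_tol: float = 1e-6) -> bool:
    """Treat a scale factor as 1.0 if it is within a small relative tolerance."""
    return math.isclose(scale, 1.0, rel_tol=rel_tol)


# A factor that is off from 1.0 only by rounding noise still counts as unscaled.
print(is_unscaled(1.0000001), is_unscaled(1.5))
```

With a strict `scale == 1.0` test, the first value would be classified as a resize and the scaler would run even though the image size does not change.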
Bin Jin f2119d9d88 vo_gpu: expose texture_off to user shader
It will provide low level access to coordinate mapping other than
texmap().
2019-06-06 20:01:56 +02:00
Bin Jin ae1c489b31 vo_gpu: allow user shader to fix texture offset
This commit essentially makes user shader able to fix offset (produced
by other prescaler, for example) like builtin `--scale`.
2019-06-06 20:01:56 +02:00
Philip Langdale b74b39dfb5 vo_gpu: vulkan: Add back context_win for libplacebo
Feature parity with the original ra_vk obviously requires win32 support,
so let's put it back in.
2019-04-21 23:55:22 +03:00
Niklas Haas 7006d6752d vo_gpu: vulkan: use libplacebo instead
This commit rips out the entire mpv vulkan implementation in favor of
exposing lightweight wrappers on top of libplacebo instead, which
provides much of the same except in a more up-to-date and polished form.

This (finally) unifies the code base between mpv and libplacebo, which
is something I've been hoping to do for a long time.

Note: The ra_pl wrappers are abstract enough from the actual libplacebo
device type that we can in theory re-use them for other devices like
d3d11 or even opengl in the future, so I moved them to a separate
directory for the time being. However, the rest of the code is still
vulkan-specific, so I've kept the "vulkan" naming and file paths, rather
than introducing a new `--gpu-api` type. (Which would have ended up
with significantly more code duplication.)

Plus, the code and functionality is similar enough that for most users
this should just be a straight-up drop-in replacement.

Note: This commit excludes some changes; specifically, the updates to
context_win and hwdec_cuda are deferred to separate commits for
authorship reasons.
2019-04-21 23:55:22 +03:00
Niklas Haas a3c808c6c8 vo_gpu: fix segfault when OSD tex creation fails
If !osd->texture, then mpgl_osd_draw_prepare fails.
2019-04-21 23:55:22 +03:00
Niklas Haas f0b6860d62 vo_gpu: index desc namespaces by ra
No reason to require them be constant. This allows them to depend on
runtime characteristics of the `ra`.
2019-04-21 23:55:22 +03:00
Bin Jin dd83b66652 vo_gpu: increase user shader size limit
The old size limit was chosen before LUT textures were supported in user
shaders. At that time, the whole user shader would be compiled and run
on the GPU, which made large user shaders impractical.

With the introduction of LUT textures, the old size limit doesn't make
any sense. For example, a 1024x1024 rgba16f LUT will cost 32MB shader
size.

Fix this by increasing the size limit to a value that's unlikely to be
reached.
2019-03-13 21:47:24 +02:00
Jan Ekström 199aabddcc Merge branch 'master' into pr6360
Manual changes done:
  * Merged the interface-changes under the already master'd changes.
  * Moved the hwdec-related option changes to video/decode/vd_lavc.c.
2019-03-11 01:00:27 +02:00
Bin Jin 1d0349d3b5 vo_gpu: add two useful operators to user shader
modulo operator could be used to check if size is multiple of a
certain number.

equal operator could be used to verify if size of different textures
aligns.
2019-03-09 12:56:11 +01:00
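How the two operators from 1d0349d3b5 could be used in a size condition can be sketched with a tiny reverse-Polish evaluator. This is an illustration only: the token spellings `%` and `=`, the variable names and the evaluator itself are assumptions, not mpv's parser.

```python
BINARY_OPS = {
    "%": lambda a, b: a % b,          # modulo: is a size a multiple of b?
    "=": lambda a, b: float(a == b),  # equality: do two sizes match?
}


def eval_rpn(expr: str, variables: dict) -> float:
    """Evaluate a space-separated RPN expression over named sizes."""
    stack = []
    for token in expr.split():
        if token in BINARY_OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(float(BINARY_OPS[token](left, right)))
        elif token in variables:
            stack.append(float(variables[token]))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


# 1.0 means "condition holds": the width is a multiple of 2.
print(eval_rpn("W 2 % 0 =", {"W": 1920}))
```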
Bin Jin b3cbd46509 vo_gpu: make texture offset available to CHROMA hooks
Before this commit, the texture offset was set after all source textures
were finalized, which means CHROMA hooks were not able to align with
luma planes. This could be problematic for chroma prescalers utilizing
information from the luma plane.

Fix this by finding the reference texture early and setting the global
texture offset early.
2019-03-09 12:56:11 +01:00
zc62 e37c253b92 lcms: allow infinite contrast
Fixes #5980
2019-03-09 12:55:44 +01:00
Niklas Haas 8b563a0346 vo_gpu: fix initial seeding of the peak detect ssbo
This solves some edge cases when using files with very weird metadata
(e.g. MaxCLL 10k and so forth). Instead of just blindly seeding it with
the tagged metadata, forcibly set the initial state from the detected
values.
2019-02-18 01:54:06 +02:00
Niklas Haas 3f1bc25d4d vo_gpu: use dB units for scene change detection
Rather than the linear cd/m^2 units, these (relative) logarithmic units
lend themselves much better to actually detecting scene changes,
especially since the scene averaging was changed to also work
logarithmically.
2019-02-18 01:54:06 +02:00
Niklas Haas b4b719e337 vo_gpu: clamp sigmoid function
Can explode on some clips otherwise
2019-02-18 01:54:06 +02:00
Niklas Haas 258ed5d471 vo_gpu: tone map before gamut mapping
Gamut mapping can take very bright out-of-gamut colors into the
negatives, which completely destroys the color balance (which tone
mapping tries its best to preserve).
2019-02-18 01:54:06 +02:00
Niklas Haas 677ae4f8fe vo_gpu: make --gamut-warning warn on negative colors
As is the case for actually out-of-gamut colors (rather than just too
bright colors).
2019-02-18 01:54:06 +02:00
Niklas Haas 11b58415d5 vo_gpu: improve numerical accuracy of PQ OETF constant
Not a huge deal, but we can do the division in C, which makes the float
constant larger.
2019-02-18 01:54:06 +02:00
Niklas Haas 4e8022da26 vo_gpu: allow color management in dumb mode
There's no point to disallow target-trc/prim in dumb mode, since they
still work fine.
2019-02-18 01:54:06 +02:00
Niklas Haas fdd671188d vo_gpu: improve accuracy of HDR brightness estimation
This change switches to a logarithmic mean to estimate the average
signal brightness. This handles dark scenes with isolated highlights
much more faithfully than the linear mean did, since the log of the
signal roughly corresponds to the perceptual brightness.
2019-02-18 01:54:06 +02:00
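The difference between the two averages in fdd671188d can be seen in a short sketch. The sample values and the small floor that avoids log(0) are assumptions for illustration; the actual computation happens in a compute shader.

```python
import math


def linear_mean(values):
    """Arithmetic mean of the signal levels."""
    return sum(values) / len(values)


def log_mean(values, floor=1e-6):
    """Mean taken in the log domain (geometric mean); `floor` avoids log(0)."""
    total = sum(math.log(max(v, floor)) for v in values)
    return math.exp(total / len(values))


# Hypothetical dark scene: 99 dim samples and one isolated highlight.
scene = [0.01] * 99 + [10.0]
print(linear_mean(scene), log_mean(scene))
```

The single highlight pulls the linear mean up by roughly a factor of ten, while the logarithmic mean stays close to the level of the dim samples.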
Niklas Haas 12e58ff8a6 vo_gpu: allow boosting dark scenes when tone mapping
In theory our "eye adaptation" algorithm works in both ways, both
darkening bright scenes and brightening dark scenes. But I've always
just prevented the latter with a hard clamp, since I wanted to avoid
blowing up dark scenes into looking funny (and full of noise).

But allowing a tiny bit of over-exposure might be a good thing. I won't
change the default just yet (better let users test), but a moderate
value of 1.2 might be better than the current 1.0 limit. Needs testing
especially on dark scenes.
2019-02-18 01:54:06 +02:00
Niklas Haas 6179dcbb79 vo_gpu: redesign peak detection algorithm
The previous approach of using an FIR with tunable hard threshold for
scene changes had several problems:

- the FIR involved annoying hard-coded buffer sizes, high VRAM usage,
  and the FIR sum was prone to numerical overflow which limited the
  number of frames we could average over. We also totally redesign the
  scene change detection.

- the hard scene change detection was prone to both false positives and
  false negatives, each with their own (annoying) issues.

Scrap this entirely and switch to a dual approach of using a simple
single-pole IIR low pass filter to smooth out noise, while using a
softer scene change curve (with tunable low and high thresholds), based
on `smoothstep`. The IIR filter is extremely simple in its
implementation and has an arbitrarily user-tunable cutoff frequency,
while the smoothstep-based scene change curve provides a good, tunable
tradeoff between adaptation speed and stability - without exhibiting
either of the traditional issues associated with the hard cutoff.

Another way to think about the new options is that the "low threshold"
provides a margin of error within which we don't care about small
fluctuations in the scene (which will therefore be smoothed out by the
IIR filter).
2019-02-18 01:54:06 +02:00
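The dual approach described in 6179dcbb79 can be sketched as a single update step. This is a simplified illustration, not the shader code: the parameter names, the units and the way the two parts are combined are assumptions.

```python
def smoothstep(low, high, x):
    """Hermite interpolation from 0 (x <= low) to 1 (x >= high)."""
    t = min(max((x - low) / (high - low), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def update_average(avg, sample, coeff, low, high):
    """One step of IIR smoothing with a soft scene-change override."""
    # Single-pole IIR low pass: move a fraction `coeff` toward the new sample.
    smoothed = avg + coeff * (sample - avg)
    # Soft scene-change weight: 0 below `low`, 1 above `high`.
    jump = smoothstep(low, high, abs(sample - avg))
    # Blend toward the raw sample as the change grows.
    return smoothed + jump * (sample - smoothed)


print(update_average(1.0, 1.05, 0.1, 0.2, 0.8))  # small fluctuation: smoothed
print(update_average(1.0, 5.0, 0.1, 0.2, 0.8))   # large jump: adapts at once
```

Differences below the low threshold are handled purely by the filter, and differences above the high threshold snap the state to the new value, with a smooth transition in between.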
Niklas Haas 3fe882d4ae vo_gpu: improve tone mapping desaturation
Instead of desaturating towards luma, we desaturate towards the
per-channel tone mapped version. This essentially provides a smooth
roll-off towards the "hollywood"-style (non-chromatic) tone mapping
algorithm, which works better for bright content, while continuing to
use the "linear" style (chromatic) tone mapping algorithm for primarily
in-gamut content.

We also split up the desaturation algorithm into strength and exponent,
which allows users to use less aggressive desaturation settings without
affecting the overall curve.
2019-02-18 01:54:06 +02:00
Kotori Itsuka 05f0980b96 vo_gpu: allow resetting target-peak to the trc default
Add "auto" to the possible values of target-peak.  The default value
for target_peak is to calculate the target using mp_trc_nom_peak.
Unfortunately, this default was outside the acceptable range of
10-10000 nits, which prevented its later reassignment.  So add an
"auto" choice to target-peak which lets clients and scripts go back
to using the trc default after assigning a value.
2019-01-23 09:31:35 +01:00
wm4 b1ba7de34d vo: use a struct for vsync feedback stuff
So new useless stuff can be easily added.
2018-12-06 10:30:25 +01:00
wm4 83884fdf03 vo_gpu: glx: use GLX_OML_sync_control for better vsync reporting
Use the extension to compute the (hopefully correct) video delay and
vsync phase.

This is very fuzzy, because the latency will suddenly be applied after
some frames have already been shown. This means there _will_ be "jumps"
in the time accounting, which can lead to strange effects at start of
playback (such as making initial "dropped" etc. frames worse). The only
reasonable way to fix this would be running a few dummy frame swaps at
start of playback until the latency is known. The same happens when
unpausing.

This only affects display-sync mode.

Correct function was not confirmed. It only "looks right". I don't have
the equipment to make scientifically correct measurements.

A potentially bad thing is that we trust the timestamps we're receiving.
Out of bounds timestamps could wreak havoc. On the other hand, this will
probably cause the higher level code to panic and just disable DS.

As a further caveat, this makes a bunch of assumptions about UST
timestamps. If there are delayed frames (i.e. we skipped one or more
vsyncs), the latency logic is mostly reset. There is no attempt to make
the vo.c skipped vsync logic to use this. Also, the latency computation
determines a vsync duration, and there's no effort to reconcile or share
the vo.c logic for determining vsync duration.
2018-12-06 10:30:14 +01:00
Anton Kindestam 8b83c89966 Merge commit '559a400ac36e75a8d73ba263fd7fa6736df1c2da' into wm4-commits--merge-edition
This bumps libmpv version to 1.103
2018-12-05 19:19:24 +01:00
Niklas Haas 5bcac8580d spirv: remove --spirv-compiler=nvidia
This option has been deprecated upstream for a long time, probably
doesn't even work anymore, and won't work moving forwards as we replace
the vulkan code by libplacebo wrappers.

I haven't removed the option completely yet since in theory we could
still add support for e.g. a native glslang wrapper in the future. But
most likely the future of this code is deletion.

As an aside, fix an issue where the man page didn't mention d3d11.
2018-12-01 15:50:23 +02:00
Anton Kindestam f0509d3738 drm: rename plane options to better, invariant, names
This commit bumps the libmpv version to 1.102

drm-osd-plane -> drm-draw-plane
drm-video-plane -> drm-drmprime-video-plane
drm-osd-size -> drm-draw-surface-size

"draw plane", as in the plane that OpenGL draws to, whether it be
video + OSD or just OSD.

"drmprime video plane", as in the plane used for hwdec video imported
via drmprime.

"draw surface size", as in the size of the surface used for the draw plane

The new names are invariant whether or not hwdec_drmprime_drm is being
used. The original naming was very confusing, as when doing
regular rendering (swdec or vaapi) the video would be displayed on the
"OSD plane", and the "Video plane" would remain unused.
2018-12-01 15:42:20 +02:00
dudemanguy 8b6064de76 gpu: prefer wayland context on autodetect 2018-11-19 00:26:39 +02:00
Akemi e72093581b vo_libmpv: support render performance data 2018-11-13 20:43:29 +02:00
Philip Langdale 93f800a00f vo_gpu: vulkan: Add support for exporting buffer memory
The CUDA/Vulkan interop works on the basis of memory being exported
from Vulkan and then imported by CUDA. To enable this, we add a way
to declare a buffer as being intended for export, and then add a
function to do the export.

For now, we support the fd and Handle based exports on Linux and
Windows respectively. There are others, which we can support when
a need arises.

Also note that this is just for exporting buffers, rather than
textures (VkImages). Image import on the CUDA side is supposed to
work, but it is currently buggy and waiting for a new driver release.

Finally, at least with my nvidia hardware and drivers, everything
seems to work even if we don't initialise the buffer with the right
exportability options. Nevertheless I'm enforcing it so that we're
following the spec.
2018-10-22 21:35:48 +02:00
BtbN f3098cd61b vo_gpu: vulkan: fix strncpy truncation in spirv_compiler_init
Fixes GCC8 warning
../video/out/gpu/spirv.c: In function 'spirv_compiler_init':
../video/out/gpu/spirv.c:68:9: warning: 'strncpy' specified bound 32 equals destination size [-Wstringop-truncation]
2018-10-21 23:33:10 +02:00
Niklas Haas 7ad60a7c5e vo_gpu: split --linear-scaling into two separate options
Since linear downscaling makes sense to handle independently from
linear/sigmoid upscaling, we split this option up. Now,
linear-downscaling is its own option that only controls linearization
when downscaling and nothing more. Likewise, linear-upscaling /
sigmoid-upscaling are two mutually exclusive options (the latter
overriding the former) that apply only to upscaling and no longer
implicitly enable linear light downscaling as well.

The old behavior was very confusing, as evidenced by issues such
as #6213. The current behavior should make much more sense, and only
minimally breaks backwards compatibility (since using linear-scaling
directly was very uncommon - most users got this for free as part of
gpu-hq and relied only on that).

Closes #6213.
2018-10-19 22:58:01 +02:00
Niklas Haas 730469cb29 vo_gpu: fix vec3 packing in UBOs/push_constants
For vec3, the alignment and size differ. The current code will pack a
struct like { vec3; float; vec2 } into 8 machine words, whereas the spec
would only use 6.

This actually fixes a real bug: The only place in the code I could find
where it was conceivably possible that a vec3 is followed by a float was
when using --gpu-dumb-mode in combination with --gamma-factor, and only
when --gpu-api=vulkan. So it's no surprise nobody ran into it yet.
2018-09-29 20:15:10 +02:00
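The 8-versus-6-word difference in 730469cb29 follows from the layout rules, where a vec3 is aligned to four words but occupies only three. The sketch below reproduces the arithmetic; the table and function names are illustrative, and the "padded" table is an assumed model of the old behavior rather than a copy of it.

```python
# (alignment, size) in 32-bit machine words.
SPEC_RULES = {"float": (1, 1), "vec2": (2, 2), "vec3": (4, 3), "vec4": (4, 4)}
# Assumed model of the old code: vec3 treated as if size equalled alignment.
PADDED_RULES = dict(SPEC_RULES, vec3=(4, 4))


def layout(members, rules):
    """Return (offsets, words_used) for a struct with the given member types."""
    offsets = []
    cursor = 0
    for member in members:
        align, size = rules[member]
        cursor = (cursor + align - 1) // align * align  # round up to alignment
        offsets.append(cursor)
        cursor += size
    return offsets, cursor


struct = ["vec3", "float", "vec2"]
print(layout(struct, SPEC_RULES))    # float fits in the vec3's fourth word
print(layout(struct, PADDED_RULES))  # float pushed out, vec2 realigned
```

When host and shader disagree on these offsets, the float and every member after it are read from the wrong location, which is the bug the commit describes.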
Niklas Haas 39d10e3359 vo_gpu: use explicit offsets for push constants
These used to be unsupported long ago, but it seems glslang added
support in the meantime. (I don't know which version, but I'm guessing
it was long enough ago that we don't have to add a feature check)

Should hopefully help make push constant layouts more robust against
possible bugs either in our code or in the driver.
2018-09-29 20:15:10 +02:00
sfan5 a4c5a4486e vo_gpu: adjust PRNG variant used by GL shaders
Certain low-end Mali GPUs have a rather low precision and overflow
during the PRNG calculations, thereby breaking e.g. deband-grain.
Modify permute() to avoid this; it does not (noticeably) impact the
quality of the PRNG output.

This problem was observed on:
GL_VENDOR='ARM', GL_RENDERER='Mali-T720'
GL_VERSION='OpenGL ES 3.1 v1.r15p0-00rel0.bdd9e62cdc8c88e0610a16b5901161e9'
2018-09-26 23:53:05 +03:00
Niklas Haas a5b0d59084 vo_gpu: switch to optimization level performance
Upstream has this now. Didn't really make any difference for me (except
making the polar compute shader 2%-3% faster), but maybe it does for
somebody else.
2018-09-01 16:14:22 +02:00
Niklas Haas 1890ca024e vo_gpu: avoid overwriting compute shader block sizes
When using multiple compute shaders as part of the same pass, there can
be a conflict in the block sizes. In the problematic case, the HDR
detection shader can collide with the polar sampling shader. In this
case, the solution is clear - the passes that can handle any size should
"give in" and not overwrite the block sizes.

Fixes #6083.
2018-08-26 12:32:20 +02:00
Tom Yan d48786f682 wscript: split egl-android from android 2018-08-20 17:16:22 +02:00
Jan Ekström 1a893e8257 gpu: prefer 16bit floating point FBO formats to 16bit integer ones
According to earlier discussions, this can improve visual quality.
This only changes the preferred order of the formats, not the
formats themselves.
2018-07-08 16:49:23 +03:00
Niklas Haas 5056777b86 vo_gpu: desaturate after peak detection
This sacrifices some dynamic range for well-behaved sources, but
prevents catastrophic desaturation on badly mastered / too bright
sources. I think that's the better trade-off. This makes the
desaturation algorithm much "safer" to deploy by default, as well. One
could even argue going up to strength 1.0, which works better for some
sources but worse for others. But I think the current strength is the
best trade-off even after this change.
2018-05-31 03:13:50 +03:00
wm4 f8ab59eacd player: get rid of mpv_global.opts
This was always a legacy thing. Remove it by applying an orgy of
mp_get_config_group() calls, and sometimes m_config_cache_alloc() or
mp_read_option_raw().

win32 changes untested.
2018-05-24 19:56:35 +02:00
Niklas Haas 05b392bc94 vo_gpu: allow higher icc-contrast and improve logging
With the advent of actual HDR devices, my real measured ICC profile has
an "infinite" contrast, since the display is completely off on pure
black inputs. 100k:1 might not be enough, so let's just bump it up to
1m:1 to be safe.

Also, improve the logging in the case that the detected contrast is too
high by default.
2018-05-17 22:56:45 +03:00
LongChair 9f2970f28a drm/atomic: refactor hwdec_drmprime_drm with native resources
A new API was introduced that allows passing several native resources.
This uses that mechanism for drm resources rather than the deprecated
opengl-cb structs.

This patch therefore adds two structs that can be used with the drm atomic interop.
 - mpv_opengl_drm_params : which will hold all the drm handles
 - mpv_opengl_drm_osd_size : which will hold osd layer size

This commit adds a drm-osd-size=WxH parameter to the command line, which
allows defining the OSD plane dimensions. The OSD can be upscaled to
screen resolution when having the OSD at video resolution is too heavy.

This is especially useful for UHD modes on embedded devices where
the GPU cannot handle UHD modes at a decent framerate.
2018-05-01 20:48:02 +03:00
Jan Ekström 11f915f5ef vo_gpu/video: disable compute shaders if an FBO format was not available
This is actually more generic and better than just lazily plastering
peak calculation together with dumb mode.
2018-05-01 19:24:53 +03:00
Jan Ekström df65ac95ba vo_gpu/video: add improved logging when a user-specified FBO fails
I don't know if we can just return from this function, so for now
just adding this piece of logging.
2018-05-01 19:24:53 +03:00
Niklas Haas dc16d85379 gpu/video: make HDR peak computing work without work group count
Define a hard-coded value for gl_NumWorkGroups if it is not available.
This adds an additional requirement of needing a shader recompile for
all window size changes.

This was considered a worthwhile compromise as, for example, d3d11
previously lacked any peak computation at all - this is a major quality
of life upgrade.
2018-04-29 03:51:19 +03:00
Jan Ekström 59d422f042 gpu/video: improve HDR peak computation feature check logging
Now that the feature depends on multiple features, log all of
their states in the message.
2018-04-29 03:51:19 +03:00
wm4 c6b9288465 video: remove internal stereo_out flag
Also rename stereo3d to stereo_in. The only real change is that the
vo_gpu OSD code now uses the actual stereo 3D mode, instead of the
--video-stereo-mode value. (Why does this vo_gpu code even exist?)
2018-04-29 02:21:32 +03:00
wm4 0be3a94e0b vo_libmpv: support GPU rendered screenshots
Like DR, this needed a lot of preparation, and here's the boring glue
code that finally implements it.
2018-04-29 02:21:32 +03:00
wm4 9825bbb8cf vo_libmpv: add support for DR
With all the preparation work done, this only has to do the annoying
dance of passing it through all the damn layers.
2018-04-29 02:21:32 +03:00
wm4 6435d9ae7f vo_gpu: move some extra code for screenshot to video.c
This also happens to fix some UB on the error path (target being
declared after the first "goto done;").
2018-04-20 17:05:53 +02:00
wm4 f9bcb5c42c client API: clarify that Display pointers etc. need to stay valid
Normally, MPV_RENDER_PARAM* arguments are copied, unless documented
otherwise. Of course we can't copy X11 Display or Wayland wl_display
types, but for arguments that are "summarized" in a struct (like
MPV_RENDER_PARAM_OPENGL_FBO), a copy is expected.

Also add some unused infrastructure to make this explicit, and to make
it easier to add parameter types that require a copy.

Untested.
2018-04-16 01:21:59 +03:00
wm4 52dd38a48a client API: add a new way to pass X11 Display etc. to render API
Hardware decoding things often need access to additional handles from
the windowing system, such as the X11 or Wayland display when using
vaapi. The opengl-cb had nothing dedicated for this, and used the weird
GL_MP_MPGetNativeDisplay GL extension (which was mpv specific and not
officially registered with OpenGL).

This was awkward, and a pain due to having to emulate GL context
behavior (like needing a TLS variable to store context for the pseudo GL
extension function). In addition (and not inherently due to this), we
could pass only one resource from mpv builtin context backends to
hwdecs. It was also all GL specific.

Replace this with a newer mechanism. It works for all RA backends, not
just GL. The API user can explicitly pass the objects at init time via
mpv_render_context_create(). Multiple resources are naturally possible.

The API uses MPV_RENDER_PARAM_* defines, but internally we use strings.
This is done for 2 reasons: 1. trying to leave libmpv and internal
mechanisms decoupled, 2. not having to add public API for some of the
internal resource types (especially D3D/GL interop stuff).

To remain sane, drop support for obscure half-working opengl-cb things,
like the DRM interop (was missing necessary things), the RPI window
thing (nobody used it), and obscure D3D interop things (not needed with
ANGLE, others were undocumented). In order not to break ABI and the C
API, we don't remove the associated structs from opengl_cb.h.

The parts which are still needed (in particular DRM interop) need to be
ported to the render API.
2018-03-26 19:47:08 +02:00
wm4 fbcf2bf207 vo_gpu: fix anamorphic video screenshots (second try)
This passed the display size as source size to the renderer, which is of
course nonsense. I don't know what I was doing in 569383bc54.

Yet another fix for those damn anamorphic videos.

As a somewhat redundant/cosmetic change, use image_params instead of
real_image_params in the code above. They should have the same dimensions
(but possibly different formats when doing hw decoding), and mixing them
is confusing. p->image_params wins because it's shorter.

Actually fixes #5619.
2018-03-16 23:00:45 +02:00
wm4 569383bc54 vo_gpu: fix anamorphic screenshots
We took the storage size instead of the display size for "unscaled"
screenshots. Even if it's called "unscaled", it's still supposed to
scale to compensate for aspect ratio.
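
To illustrate the storage size versus display size distinction, here is a
minimal sketch. It is not the vo_gpu code: the function name display_size and
the rule of enlarging only one dimension are assumptions made for the example.

```python
from fractions import Fraction


def display_size(storage_w, storage_h, par):
    """Return (w, h) after compensating for a non-square pixel aspect ratio.

    Hypothetical helper: only one dimension is enlarged, so no stored
    pixels are discarded.
    """
    par = Fraction(par)
    if par >= 1:
        # Wide pixels: stretch horizontally.
        return round(storage_w * par), storage_h
    # Tall pixels: stretch vertically instead.
    return storage_w, round(storage_h / par)


# A PAL DVD frame is stored as 720x576 but meant to be shown wider.
pal_widescreen = display_size(720, 576, Fraction(64, 45))
```

An "unscaled" screenshot in the sense of this commit would use the returned
size, not the 720x576 storage size.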

(How many commits fixing anamorphic screenshots in various situations
are there?)

Fixes #5619.
2018-03-15 23:13:53 -07:00
wm4 ecf4d7a843 vo_gpu: error out if there were rendering errors when taking screenshot 2018-03-03 02:38:01 +02:00
wm4 1b786a71c1 vo_gpu: fix taking screenshots of rotated videos
Good old 90° rotation logic messing everything up.
2018-03-03 02:38:01 +02:00
wm4 b037121430 client API: deprecate opengl-cb API and introduce a replacement API
The purpose of the new API is to make it useable with other APIs than
OpenGL, especially D3D11 and vulkan. In theory it's now possible to
support other vo_gpu backends, as well as backends that don't use the
vo_gpu code at all.

This also aims to get rid of the dumb mpv_get_sub_api() function. The
life cycle of the new mpv_render_context is a bit different from
mpv_opengl_cb_context, and you explicitly create/destroy the new
context, instead of calling init/uninit on an object returned by
mpv_get_sub_api().

In order to make the render API generic, it's annoyingly EGL style, and
requires you to pass in API-specific objects to generic functions. This
is to avoid explicit objects like the internal ra API has, because that
sounds more complicated and annoying for an API that's supposed to never
change.

The opengl_cb API will continue to exist for a bit longer, but
internally there are already a few tradeoffs, like reduced
thread-safety.

Mostly untested. Seems to work fine with mpc-qt.
2018-02-28 00:55:06 -08:00
wm4 d6921678b9 vo_gpu: remove a dead declaration 2018-02-28 00:55:06 -08:00
Niklas Haas 1f2d8ed01c vo_gpu: fix mobius tone mapping when sig_peak <= 1.0
Mobius isn't well-defined for sig_peak <= 1.0. We can solve this by just
soft-clamping sig_peak to 1.0. Although, in this case, we can just skip
tone mapping altogether since the limit of mobius as sig_peak -> 1.0 is
just a linear function.
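
The limit argument can be checked numerically. The sketch below is not the
shader code: it assumes a mobius curve of the form M(x) = scale * (x + a) /
(x + b), with the constants solved from M(j) = j, M'(j) = 1 and M(peak) = 1.
The name mobius and the knee parameter j (default 0.3) are assumptions for
the example.

```python
def mobius(x, peak, j=0.3):
    """Mobius-style tone curve: identity below the knee j, smooth roll-off above.

    Assumed form M(x) = scale * (x + a) / (x + b), with a, b and scale
    chosen so that M(j) = j, M'(j) = 1 and M(peak) = 1.
    """
    if peak <= 1.0:
        # The curve is not well-defined here; its limit is the identity,
        # so tone mapping is skipped.
        return x
    if x <= j:
        return x
    a = -j * j * (peak - 1.0) / (j * j - 2.0 * j + peak)
    b = (j * j - 2.0 * j * peak + peak) / (peak - 1.0)
    scale = (b + j) ** 2 / (b - a)
    return scale * (x + a) / (x + b)


# As peak approaches 1.0 from above, the curve flattens into y = x.
near_linear = mobius(0.9, 1.0 + 1e-6)
```

With peak only slightly above 1.0, b grows without bound and the curve
approaches the identity, which is why skipping the pass is equivalent.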
2018-02-25 16:11:26 +02:00
Niklas Haas 66dfb96fa1 vo_gpu: don't tone-map for pure gamut reductions
Based on testing with real-world non-HDR BT.2020 clips, clipping the
color space looks better than attempting to gamut map using a tone
mapping shader that's (by now) optimized for HDR content.

If anything, we'd have to develop a separate gamut mapping shader that
works in LCh space.
2018-02-25 14:57:57 +02:00
Niklas Haas 441e384390 vo_gpu: introduce --target-peak
This solves a number of problems simultaneously:

1. When outputting HLG, this allows tuning the OOTF based on the display
   characteristics.
2. When outputting PQ or other HDR curves, this allows soft-limiting the
   output brightness using the tone mapping algorithm.
3. When outputting SDR, this allows HDR-in-SDR style output, by
   controlling the output brightness directly.

Closes #5521
2018-02-20 22:02:51 +02:00
Niklas Haas 1f881eca65 vo_gpu: correctly parametrize the HLG OOTF by the display peak
The HLG OOTF is defined as a one-parameter family of OOTFs depending on
the display's peak luminance. With the preceding change to OOTF scale
and handling, we no longer have any issues with outputting values in
whatever signal range we need.

So as a result, it's easy for us to support a tunable OOTF which may
(drastically) alter the display brightness. In fact, this is also the
only correct way to do it, because the HLG appearance depends strongly
on the OOTF configuration. For the OOTF, we consult the mastering
display's tagging (via src.sig_peak). For the inverse OOTF, we consult
the output display's target peak.
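
For reference, the one-parameter family can be sketched from the BT.2100
reference formulas. This is not the shader code from this commit: it assumes a
black level of zero and linear RGB input, and the function names are made up
for the example. peak_nits stands in for whichever peak applies (the mastering
display's for the OOTF, the output display's for the inverse).

```python
import math


def hlg_gamma(peak_nits):
    """System gamma as a function of display peak luminance (BT.2100)."""
    return 1.2 + 0.42 * math.log10(peak_nits / 1000.0)


def _luma(rgb):
    # BT.2100 luminance weights for linear RGB.
    r, g, b = rgb
    return 0.2627 * r + 0.6780 * g + 0.0593 * b


def hlg_ootf(rgb, peak_nits):
    """Scene light in [0, 1] -> display light in nits, black level assumed 0."""
    gamma = hlg_gamma(peak_nits)
    y = _luma(rgb)
    gain = peak_nits * y ** (gamma - 1.0) if y > 0 else 0.0
    return tuple(gain * c for c in rgb)


def hlg_inverse_ootf(rgb, peak_nits):
    """Display light in nits -> scene light in [0, 1]."""
    gamma = hlg_gamma(peak_nits)
    y = _luma(rgb) / peak_nits
    gain = y ** ((1.0 - gamma) / gamma) / peak_nits if y > 0 else 0.0
    return tuple(gain * c for c in rgb)
```

Because the gamma depends on the peak, applying the OOTF with one peak and the
inverse with another changes the rendered appearance, which is what makes the
two lookups described above distinct.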
2018-02-20 22:02:51 +02:00
Niklas Haas b9e7478760 vo_gpu: simplify and correct color scale handling
The primary need for this change is the fact that the OOTF was
incorrectly scaled, due to the fact that the application of the OOTF can
itself change the required normalization peak. (Plus, an oversight in
pass_inverse_ootf meant we forgot to normalize at the end of it)

The linearize/delinearize functions still normalize the scale since it's
used in a number of places throughout gpu/video.c, but the color
management function now converts to absolute scale right away, instead
of in an awkward way inside the tone mapping branch. The OOTF functions
now work in absolute scale only.

In addition, minor changes have been made to the way normalization is
handled for tone mapping - we now divide out the dst_peak *after* peak
detection, in order to make the scale of the peak detection buffer
consistent even if the dst_peak were to (hypothetically) change
mid-stream. In theory, we could also do this for desaturation, but doing
the desaturation before tone mapping has the advantage of preserving
much more brightness than the other way around - and even mid-stream
changes are not that drastic here.

Finally, some preparation work has been done for allowing the user to
customize the `dst.sig_peak` in the future.
2018-02-20 22:02:51 +02:00
wm4 f17246fec1 vo_gpu: remove old window screenshot glue code and GL implementation
There is now a better way. Reading the front framebuffer was always a
hack. The new code via VOCTRL_SCREENSHOT renders it into a FBO, which
does not come with the disadvantages of reading the front buffer (like
not being supported by GLES, possibly black regions due to overlapping
windows on some systems).

For now keep VOCTRL_SCREENSHOT_WIN on the VO level, because there are
still some lesser VOs and backends that use it.
2018-02-13 17:45:29 -08:00
James Ross-Gowan 1b80e124db vo_gpu: d3d11: implement tex_download()
This allows the new GPU screenshot functionality introduced in
9f595f3a80 to work with the D3D11 backend. It replaces the old window
screenshot functionality, which was shared between D3D11 and ANGLE. The
old code can be removed, since it's not needed by ANGLE anymore either.
2018-02-13 21:25:15 +11:00
James Ross-Gowan 7d2228c673 vo_gpu: use a variable for the RA_CAP_FRAGCOORD flag
This is just a cosmetic change. Now the RA_CAP_FRAGCOORD check looks
like all the others.
2018-02-13 00:21:26 +02:00
James Ross-Gowan 44dc79dcb0 vo_gpu: check for HDR peak detection in dumb mode too
Similar spirit to edb4970ca8. check_gl_features() has a confusing
early-return. This also adds compute_hdr_peak to the list of options
that is copied to the dumb-mode options struct, since it seems to make a
difference. Otherwise it would be impossible to disable HDR peak
detection in dumb mode.
2018-02-13 00:21:26 +02:00
wm4 9f595f3a80 vo_gpu: make screenshots use the GL renderer
Using the GL renderer for color conversion will make sure screenshots
will use the same conversion as normal video rendering. It can do this
for all types of screenshots.

The logic when to write 16 bit PNGs changes. To approximate the old
behavior, we decide by looking whether the source video format has more
than 8 bits per component. We apply this logic even for window
screenshots. Also, 16 bit PNGs now always include an unused alpha
channel. The reason is that FFmpeg has RGB48 and RGBA64 formats, but no
RGB064. RGB48 is 3 bytes and usually not supported by GPUs for
rendering, so we have to use RGBA64, which forces an alpha channel.

Will break for users who use --target-trc and similar options.

I considered creating a new gl_video context, but it could double GPU
memory use, so I didn't.

This uses FBOs instead of glGetTexImage(), because that increases the
chance it could work on GLES (e.g. ANGLE). Untested. No support for the
Vulkan and D3D11 backends yet.

Fixes #5498. Also fixes #5240, because the code for reading back is not
used with the new code path.
2018-02-11 17:45:51 -08:00
wm4 7b1e73139f vo_gpu: add internal ability to skip osd/subs for rendering
Needed for the following commit.
2018-02-11 17:45:51 -08:00