1
0
mirror of https://github.com/mpv-player/mpv synced 2025-03-11 08:37:59 +00:00
Commit Graph

74 Commits

Author SHA1 Message Date
Bin Jin
f2119d9d88 vo_gpu: expose texture_off to user shader
It will provide low level access to coordinate mapping other than
texmap().
2019-06-06 20:01:56 +02:00
Bin Jin
ae1c489b31 vo_gpu: allow user shader to fix texture offset
This commit essentially makes user shader able to fix offset (produced
by other prescaler, for example) like builtin `--scale`.
2019-06-06 20:01:56 +02:00
Bin Jin
dd83b66652 vo_gpu: increase user shader size limit
The old size limit was chosen before LUT texture was supported in user
shader. At that time, the whole user shader will be compiled and run
on GPU, which makes large user shader impractical to be used.

With the introduction of LUT texture, the old size limit doesn't make
any sense. For example, a 1024x1024 rgba16f LUT will cost 32MB shader
size.

Fix this by increasing the size limit to a value that's unlikely be
reached.
2019-03-13 21:47:24 +02:00
Jan Ekström
199aabddcc Merge branch 'master' into pr6360
Manual changes done:
  * Merged the interface-changes under the already master'd changes.
  * Moved the hwdec-related option changes to video/decode/vd_lavc.c.
2019-03-11 01:00:27 +02:00
Bin Jin
b3cbd46509 vo_gpu: make texture offset available to CHROMA hooks
Before this commit, texture offset is set after all source textures
are finalized. Which means CHROMA hooks won't be able to align with
luma planes. This could be problematic for chroma prescalers utilizing
information from luma plane.

Fix this by find the reference texture early, and set global texture
offset early.
2019-03-09 12:56:11 +01:00
Niklas Haas
8b563a0346 vo_gpu: fix initial seeding of the peak detect ssbo
This solves some edge cases when using files with very weird metadata
(e.g. MaxCLL 10k and so forth). Instead of just blindly seeding it with
the tagged metadata, forcibly set the initial state from the detected
values.
2019-02-18 01:54:06 +02:00
Niklas Haas
3f1bc25d4d vo_gpu: use dB units for scene change detection
Rather than the linear cd/m^2 units, these (relative) logarithmic units
lend themselves much better to actually detecting scene changes,
especially since the scene averaging was changed to also work
logarithmically.
2019-02-18 01:54:06 +02:00
Niklas Haas
b4b719e337 vo_gpu: clamp sigmoid function
Can explode on some clips otherwise
2019-02-18 01:54:06 +02:00
Niklas Haas
4e8022da26 vo_gpu: allow color management in dumb mode
There's no point to disallow target-trc/prim in dumb mode, since they
still work fine.
2019-02-18 01:54:06 +02:00
Niklas Haas
fdd671188d vo_gpu: improve accuracy of HDR brightness estimation
This change switches to a logarithmic mean to estimate the average
signal brightness. This handles dark scenes with isolated highlights
much more faithfully than the linear mean did, since the log of the
signal roughly corresponds to the perceptual brightness.
2019-02-18 01:54:06 +02:00
Niklas Haas
12e58ff8a6 vo_gpu: allow boosting dark scenes when tone mapping
In theory our "eye adaptation" algorithm works in both ways, both
darkening bright scenes and brightening dark scenes. But I've always
just prevented the latter with a hard clamp, since I wanted to avoid
blowing up dark scenes into looking funny (and full of noise).

But allowing a tiny bit of over-exposure might be a good thing. I won't
change the default just yet (better let users test), but a moderate
value of 1.2 might be better than the current 1.0 limit. Needs testing
especially on dark scenes.
2019-02-18 01:54:06 +02:00
Niklas Haas
6179dcbb79 vo_gpu: redesign peak detection algorithm
The previous approach of using an FIR with tunable hard threshold for
scene changes had several problems:

- the FIR involved annoying hard-coded buffer sizes, high VRAM usage,
  and the FIR sum was prone to numerical overflow which limited the
  number of frames we could average over. We also totally redesign the
  scene change detection.

- the hard scene change detection was prone to both false positives and
  false negatives, each with their own (annoying) issues.

Scrap this entirely and switch to a dual approach of using a simple
single-pole IIR low pass filter to smooth out noise, while using a
softer scene change curve (with tunable low and high thresholds), based
on `smoothstep`. The IIR filter is extremely simple in its
implementation and has an arbitrarily user-tunable cutoff frequency,
while the smoothstep-based scene change curve provides a good, tunable
tradeoff between adaptation speed and stability - without exhibiting
either of the traditional issues associated with the hard cutoff.

Another way to think about the new options is that the "low threshold"
provides a margin of error within which we don't care about small
fluctuations in the scene (which will therefore be smoothed out by the
IIR filter).
2019-02-18 01:54:06 +02:00
Niklas Haas
3fe882d4ae vo_gpu: improve tone mapping desaturation
Instead of desaturating towards luma, we desaturate towards the
per-channel tone mapped version. This essentially proves a smooth
roll-off towards the "hollywood"-style (non-chromatic) tone mapping
algorithm, which works better for bright content, while continuing to
use the "linear" style (chromatic) tone mapping algorithm for primarily
in-gamut content.

We also split up the desaturation algorithm into strength and exponent,
which allows users to use less aggressive desaturation settings without
affecting the overall curve.
2019-02-18 01:54:06 +02:00
Kotori Itsuka
05f0980b96 vo_gpu: allow resetting target-peak to the trc default
Add "auto" the possible values of target-peak.  The default value
for target_peak is to calculate the target using mp_trc_nom_peak.
Unfortunately, this default was outside the acceptable range of
10-10000 nits, which prevented its later reassignment.  So add an
"auto" choice to target-peak which lets clients and scripts go back
to using the trc default after assigning a value.
2019-01-23 09:31:35 +01:00
Anton Kindestam
8b83c89966 Merge commit '559a400ac36e75a8d73ba263fd7fa6736df1c2da' into wm4-commits--merge-edition
This bumps libmpv version to 1.103
2018-12-05 19:19:24 +01:00
Niklas Haas
7ad60a7c5e vo_gpu: split --linear-scaling into two separate options
Since linear downscaling makes sense to handle independently from
linear/sigmoid upscaling, we split this option up. Now,
linear-downscaling is its own option that only controls linearization
when downscaling and nothing more. Likewise, linear-upscaling /
sigmoid-upscaling are two mutually exclusive options (the latter
overriding the former) that apply only to upscaling and no longer
implicitly enable linear light downscaling as well.

The old behavior was very confusing, as evidenced by issues such
as #6213. The current behavior should make much more sense, and only
minimally breaks backwards compatibility (since using linear-scaling
directly was very uncommon - most users got this for free as part of
gpu-hq and relied only on that).

Closes #6213.
2018-10-19 22:58:01 +02:00
Niklas Haas
1890ca024e vo_gpu: avoid overwriting compute shader block sizes
When using multiple compute shaders as part of the same pass, there can
be a conflict in the block sizes. In the problematic case, the HDR
detection shader can collide with the polar sampling shader. In this
case, the solution is clear - the passes that can handle any size should
"give in" and not overwrite the block sizes.

Fixes #6083.
2018-08-26 12:32:20 +02:00
Jan Ekström
1a893e8257 gpu: prefer 16bit floating point FBO formats to 16bit integer ones
According to earlier discussions, this can improve visual quality.
This only changes the preferred order of the formats, not the
formats themselves.
2018-07-08 16:49:23 +03:00
wm4
f8ab59eacd player: get rid of mpv_global.opts
This was always a legacy thing. Remove it by applying an orgy of
mp_get_config_group() calls, and sometimes m_config_cache_alloc() or
mp_read_option_raw().

win32 changes untested.
2018-05-24 19:56:35 +02:00
Jan Ekström
11f915f5ef vo_gpu/video: disable compute shaders if an FBO format was not available
This is actually more generic and better than just lazily plastering
peak calculation together with dumb mode.
2018-05-01 19:24:53 +03:00
Jan Ekström
df65ac95ba vo_gpu/video: add improved logging when a user-specified FBO fails
I don't know if we can just return from this function, so for now
just adding this piece of logging.
2018-05-01 19:24:53 +03:00
Niklas Haas
dc16d85379 gpu/video: make HDR peak computing work without work group count
Define a hard-coded value for gl_NumWorkGroups if it is not available.
This adds an additional requirement of needing a shader recompile for
all window size changes.

This was considered a worthwhile compromise as currently f.ex. d3d11
completely lacked any peak computation - this is a major quality of
life upgrade.
2018-04-29 03:51:19 +03:00
Jan Ekström
59d422f042 gpu/video: improve HDR peak computation feature check logging
Now that the feature depends on multiple features, log all of
their states in the message.
2018-04-29 03:51:19 +03:00
wm4
c6b9288465 video: remove internal stereo_out flag
Also rename stereo3d to stereo_in. The only real change is that the
vo_gpu OSD code now uses the actual stereo 3D mode, instead of the
--video-steroe-mode value. (Why does this vo_gpu code even exist?)
2018-04-29 02:21:32 +03:00
wm4
6435d9ae7f vo_gpu: move some extra code for screenshot to video.c
This also happens to fix some UB on the error path (target being
declared after the first "goto done;").
2018-04-20 17:05:53 +02:00
wm4
fbcf2bf207 vo_gpu: fix anamorphic video screenshots (second try)
This passed the display size as source size to the renderer, which is of
course nonsense. I don't know what I was doing in 569383bc54.

Yet another fix for those damn anamorphic videos.

As a somewhat redundant/cosmetic change, use image_params instead of
real_image_params in the code above. They should have the same, dimensions
(but possibly different formats when doing hw decdoing), and mixing them
is confusing. p->image_params wins because it's shorter.

Actually fixes #5619.
2018-03-16 23:00:45 +02:00
wm4
569383bc54 vo_gpu: fix anamorphic screenshots
We took the storage size instead of the display size for "unscaled"
screenshots. Even if it's called "unscaled", it's still supposed to
scale to compensate for aspect ratio.

(How many commits fixing anamorphic screenshots in various situations
are there?)

Fixes #5619.
2018-03-15 23:13:53 -07:00
wm4
ecf4d7a843 vo_gpu: error out if there were rendering errors when taking screenshot 2018-03-03 02:38:01 +02:00
wm4
1b786a71c1 vo_gpu: fix taking screenshots of rotated videos
Good old 90° rotation logic messing everything up.
2018-03-03 02:38:01 +02:00
Niklas Haas
441e384390 vo_gpu: introduce --target-peak
This solves a number of problems simultaneously:

1. When outputting HLG, this allows tuning the OOTF based on the display
   characteristics.
2. When outputting PQ or other HDR curves, this allows soft-limiting the
   output brightness using the tone mapping algorithm.
3. When outputting SDR, this allows HDR-in-SDR style output, by
   controlling the output brightness directly.

Closes #5521
2018-02-20 22:02:51 +02:00
Niklas Haas
b9e7478760 vo_gpu: simplify and correct color scale handling
The primary need for this change is the fact that the OOTF was
incorrectly scaled, due to the fact that the application of the OOTF can
itself change the required normalization peak. (Plus, an oversight in
pass_inverse_ootf meant we forgot to normalize at the end of it)

The linearize/delinearize functions still normalize the scale since it's
used in a number of places throughout gpu/video.c, but the color
management function now converts to absolute scale right away, instead
of in an awkward way inside the tone mapping branch. The OOTF functions
now work in absolute scale only.

In addition, minor changes have been made to the way normalization is
handled for tone mapping - we now divide out the dst_peak *after* peak
detection, in order to make the scale of the peak detection buffer
consistent even if the dst_peak were to (hypothetically) change
mid-stream. In theory, we could also do this for desaturation, but doing
the desaturation before tone mapping has the advantage of preserving
much more brightness than the other way around - and even mid-stream
changes are not that drastic here.

Finally, some preparation work has been done for allowing the user to
customize the `dst.sig_peak` in the future.
2018-02-20 22:02:51 +02:00
James Ross-Gowan
7d2228c673 vo_gpu: use a variable for the RA_CAP_FRAGCOORD flag
This is just a cosmetic change. Now the RA_CAP_FRAGCOORD check looks
like all the others.
2018-02-13 00:21:26 +02:00
James Ross-Gowan
44dc79dcb0 vo_gpu: check for HDR peak detection in dumb mode too
Similar spirit to edb4970ca8. check_gl_features() has a confusing
early-return. This also adds compute_hdr_peak to the list of options
that is copied to the dumb-mode options struct, since it seems to make a
difference. Otherwise it would be impossible to disable HDR peak
detection in dumb mode.
2018-02-13 00:21:26 +02:00
wm4
9f595f3a80 vo_gpu: make screenshots use the GL renderer
Using the GL renderer for color conversion will make sure screenshots
will use the same conversion as normal video rendering. It can do this
for all types of screenshots.

The logic when to write 16 bit PNGs changes. To approximate the old
behavior, we decide by looking whether the source video format has more
than 8 bits per component. We apply this logic even for window
screenshots. Also, 16 bit PNGs now always include an unused alpha
channel. The reason is that FFmpeg has RGB48 and RGBA64 formats, but no
RGB064. RGB48 is 3 bytes and usually not supported by GPUs for
rendering, so we have to use RGBA64, which forces an alpha channel.

Will break for users who use --target-trc and similar options.

I considered creating a new gl_video context, but it could double GPU
memory use, so I didn't.

This uses FBOs instead of glGetTexImage(), because that increases the
chance it could work on GLES (e.g. ANGLE). Untested. No support for the
Vulkan and D3D11 backends yet.

Fixes #5498. Also fixes #5240, because the code for reading back is not
used with the new code path.
2018-02-11 17:45:51 -08:00
wm4
7b1e73139f vo_gpu: add internal ability to skip osd/subs for rendering
Needed for the following commit.
2018-02-11 17:45:51 -08:00
wm4
bff8cfe3f0 vo_gpu: use blit() only if target ra_tex supports it
Even if RA_CAP_BLIT is set, this might just not be enabled for the
target ra_tex.
2018-02-11 17:45:51 -08:00
Niklas Haas
4e7f4f10ce vo_gpu: correctly infer HDR peak detection support
The re-ordering of commits e3d93fd and 0870859 ended up swallowing the
change which made the HDR tone mapping algorithm actually check for
RA_CAP_NUM_GROUPS support.
2018-02-11 16:45:20 -08:00
Niklas Haas
4c2edecd7d vo_gpu: refactor HDR peak detection algorithm
The major changes are as follows:

1. Use `uint32_t` instead of `unsigned int` for the SSBO size
   calculation. This doesn't really matter, since a too-big buffer will
   still work just fine, but since `uint` is a 32-bit integer by
   definition this is the correct way to do it.

2. Pre-divide the frame_sum by the num_wg immediately at the end of a
   frame. This change was made to prevent overflow. At 4K screen size,
   this code is currently already very at risk of overflow, especially
   once I started playing with longer averaging sizes. Pre-dividing this
   out makes it just about fit into 32-bit even for worst-case PQ
   content. (It's technically also faster and easier this way, so I
   should have done it to begin with). Rename `frame_sum` to `frame_avg`
   to clearly signal the change in semantics.

3. Implement a scene transition detection algorithm. This basically
   compares the current frame's average brightness against the
   (averaged) value of the past frames. If it exceeds a threshold, which
   I experimentally configured, we reset the peak detection SSBO's state
   immediately - so that it just contains the current frame. This
   prevents annoying "eye adaptation"-like effects on scene transitions.

4. As a result of the previous change, we can now use a much larger
   buffer size by default, which results in a more stable and less
   flickery result. I experimented with values between 20 and 256 and
   settled on the new value of 64. (I also switched to a power-of-2
   array size, because I like powers of two)
2018-02-11 16:45:20 -08:00
Niklas Haas
e3d93fde2f vo_gpu: port HDR tone mapping algorithm from libplacebo
The current peak detection algorithm was very bugged (which contributed
to the excessive cross-frame flicker without long normalization) and
also didn't take into account the frame average brightness level.

The new algorithm both takes into account frame average brightness (in
addition to peak brightness), and also computes the values in a more
stable/correct way. (The old path was basically undefined behavior)

In addition to improving the algorithm, we also switch to hable tone
mapping by default, and try to enable peak computation automatically
whever possible (compute shaders + SSBOs supported). We also make the
desaturation milder, after extensive testing during libplacebo
development.

I also had to compensate a bit for the representational differences
between mpv and libplacebo (libplacebo treats 1.0 as the reference peak,
but mpv treats it as the nominal peak), but it shouldn't have caused any
problems.

This is still not quite the same as libplacebo, since libplacebo also
allows tagging the desired scene average brightness on the output, and
it also supports reading the scene average brightness from static
metadata (MaxFALL) where available. But those changes are a bit more
involved. It's possible we could also read this from metadata in the
future, but we have problems communicating with AVFrames as it is and I
don't want to touch the mpv colorimetry structs for the time being.
2018-02-05 23:11:18 -08:00
James Ross-Gowan
edb4970ca8 vo_gpu: check for RA_CAP_FRAGCOORD in dumb mode too
The RA_CAP_FRAGCOORD checks apply to dumb mode as well, but they were
after the check for dumb mode, which returns early, so they never ran.

Fixes #5436
2018-01-30 20:22:58 +11:00
wm4
3c1566e736 video: fix crash with vdpau when reinitializing rendering
Using vdpau will allocate additional textures for the reinterleaving
step, which uninit_rendering() will free. This is a problem because the
hwdec image remains mapped when reinitializing, so the reinterleaving
textures are turned into dangling pointers. Fix this by freeing the
reinterleave textures on full uninit instead.

Fixes #5447.
2018-01-27 03:31:53 -08:00
Akemi
828f38e10d video: change some remaining vo_opengl mentions to vo_gpu 2018-01-20 14:43:49 -08:00
wm4
342e36ea11 vo_gpu: skip DR for unsupported image formats
DR (direct rendering) works by having the decoder decode into the GPU
staging buffers, instead of copying the video data on texture upload. We
did this even for formats unsupported by the GPU or the renderer. This
"worked" because the staging memory is untyped, and the video frame was
converted by libswscale to a supported format, and then uploaded with a
copy using the normal non-DR texture upload path.

Even though it "works", we don't gain anything from using the staging
buffers for decoding, since we can't use them for upload anyway. Also,
staging memory might be potentially limited (what really happens is up
to the driver). It's easy to avoid, so just skip it in these cases.
2018-01-18 00:25:00 -08:00
wm4
07753bbb4a vo_gpu: fix broken 10 bit via integer textures playback
The check_gl_features(p) call here checks whether dumb mode can be used.
It uses the field use_integer_conversion, which is set _after_ the call
in the same function. Move check_gl_features() to the end of the
function, when use_integer_conversion is finally set.

Fixes that it tried to use bilinear filtering with integer textures. The
bug disabled the code that is supposed to convert it to non-integer
textures.
2018-01-17 22:59:15 -08:00
Niklas Haas
a42b8b1142 vo_gpu: attempt re-using the FBO format for p->output_tex
This allows RAs with support for non-opaque FBO formats to use a more
appropriate FBO format for the output tex, possibly enabling a more
efficient blit operation.

This requires distinguishing between real formats (which can be used to
create textures) and fake formats (e.g. ra_gl's FBO hack).
2017-12-25 00:47:53 +01:00
Niklas Haas
dcda8bd36a vo_gpu: aggressively prefer async compute
On AMD devices, we only get one graphics pipe but several compute pipes
which can (in theory) run independently. As such, we should prefer
compute shaders over fragment shaders in scenarios where we expect them
to be better for parallelism.

This is amusingly trivial to do, and actually improves performance even
in a single-queue scenario.
2017-12-25 00:47:53 +01:00
Niklas Haas
a3c9685257 vo_gpu: invalidate fbotex before drawing
Don't discard the OSD or pass_draw_to_screen passes though. Could be
faster on some hardware.
2017-12-25 00:47:53 +01:00
Niklas Haas
ba1943ac00 msg: reinterpret a bunch of message levels
I've decided that MP_TRACE means “noisy spam per frame”, whereas
MP_DBG just means “more verbose debugging messages than MSGL_V”.
Basically, MSGL_DBG shouldn't create spam per frame like it currently
does, and MSGL_V should make sense to the end-user and provide mostly
additional informational output.

MP_DBG is basically what I want to make the new default for --log-file,
so the cut-off point for MP_DBG is if we probably want to know if for
debugging purposes but the user most likely doesn't care about on the
terminal.

Also, the debug callbacks for libass and ffmpeg got bumped in their
verbosity levels slightly, because being external components they're a
bit less relevant to mpv debugging, and a bit too over-eager in what
they consider to be relevant information.

I exclusively used the "try it on my machine and remove messages from
MSGL_* until it does what I want it to" approach of refactoring, so
YMMV.
2017-12-15 22:28:47 -08:00
wm4
6047333f0b video: remove code duplication by calling a hwdec loader helper
Make gl_video_load_hwdecs() call gl_video_load_hwdecs_all() when
all HW decoders should be loaded.
2017-12-11 20:44:59 +02:00
wm4
5196c34aec video: properly initialize and set hwdec_interop
Don't reset --gpu-hwdec-interop if vo_gpu uses dumb mode.
2017-12-11 20:44:59 +02:00