When mpv attempts to play a video that is, on average, 60 FPS on a
display that is not exactly 60.00 Hz, two options try to fight each
other: `video-sync-max-video-change` and `interpolation-threshold`.
Normally, container FPS in something such as an .mp4 or a .mkv is
precise enough such that the video can be retimed exactly to the display
Hz and interpolation is not activated.
In cases like certain live streams, or other scenarios where the
container FPS is not known, the default value of 0.0001 for
`interpolation-threshold` is extremely low, and while
`video-sync-max-video-change` retimes the video to what it approximately
knows as the "real" FPS, this may or may not be outside of
`interpolation-threshold`'s logic at any given time, which causes
interpolation to be frequently flipped on and off, giving an appearance
of stuttering or repeated frames that is often quite jarring and makes
a video unwatchable.
This commit changes the default of `interpolation-threshold` to 0.01,
which is the same value as `video-sync-max-video-change`, and guarantees
that if the user accepts a video being retimed to match the display,
they do not additionally have to worry about a much more
precise interpolation threshold randomly flipping on or off. No internal
logic is changed so setting `interpolation-threshold` to -1 will still
disable this logic entirely and always enable interpolation.
The documentation has been updated to reflect this change and give
context to the user for which scenarios they might want to disable
`interpolation-threshold` logic or change it to a smaller value.
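For illustration, a minimal mpv.conf sketch of the kind of setup this
affects (the option names are real; the values are only an example):
    # retime video to the display refresh rate
    video-sync=display-resample
    interpolation=yes
    # 0.01 is the new default; -1 disables the check and always interpolates
    interpolation-threshold=0.01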
Today, validation is only possible for string type options. But there's
no particular reason why it needs to be restricted in this way, and
there are potential uses for allowing other option types to be validated
without forcing the option to reimplement parsing from scratch.
The first part, simply making the validation function an explicit
field instead of overloading priv, is simple enough. But if we only do
that, then the validation function still needs to deal with the raw
pre-parsed string. Instead, we want to allow the value to be parsed
before it is validated. That in turn leads to us having validator
functions that should be type aware. Unfortunately, that means we need
to keep explicit macros like OPT_STRING_VALIDATE() as a way to
enforce the correct typing of the function. Otherwise, we'd have to
have the validator take a void * and hope the implementation can cast
it correctly.
For help, we don't have this problem, as help doesn't look at the
value.
Then, we turn validators that are really help generators into explicit
help functions, and where a validator does both help and validation, we
split it into two parts.
I have, however, left functions that need to query information for both
help and validation as single functions to avoid code duplication.
In this change, I have not added any other OPT_FOO_VALIDATE() macros, as
they are not needed, but I will add some in a separate change to
illustrate the pattern.
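To illustrate the typing problem described above, here is a small
self-contained sketch (simplified, made-up names, not mpv's actual
declarations) of how a macro can enforce the validator's type: the
never-taken ternary arm makes the compiler check that the passed
function matches the expected signature.
    #include <stdio.h>

    struct m_option;
    typedef int (*m_opt_string_validate_fn)(const struct m_option *opt,
                                            const char **value);

    struct m_option {
        const char *name;
        m_opt_string_validate_fn validate; /* explicit field, not abused priv */
    };

    /* never evaluates (type)0, but fails to compile if fn has the wrong type */
    #define EXPECT_TYPE(type, fn) (0 ? (type)0 : (fn))
    #define OPT_STRING_VALIDATE(optname, fn) \
        { .name = (optname), .validate = EXPECT_TYPE(m_opt_string_validate_fn, (fn)) }

    static int validate_nonempty(const struct m_option *opt, const char **value)
    {
        (void)opt;
        return (*value && (*value)[0]) ? 0 : -1; /* reject empty strings */
    }

    static const struct m_option opts[] = {
        OPT_STRING_VALIDATE("profile", validate_nonempty),
    };

    int main(void)
    {
        const char *val = "gpu-hq";
        printf("%s: %d\n", opts[0].name, opts[0].validate(&opts[0], &val));
        return 0;
    }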
in the original commit that removed the conditional clearing, an
incorrect assumption was made that clearing "should be practically free"
and can always be done. though, at least on macOS + intel, this can have
a performance impact of up to 50% increased usage. it might have an
impact on other platforms and setups as well, but this is unconfirmed.
the reason for removing the conditional clearing was to partially work
around a driver bug on very specific setups, X11 with amdgpu and OpenGL,
to clear garbled frames on start. though it still has issues with
garbled frames in other situations, like fullscreening. there is also an
open bug report on the mesa bug tracker about this. setting the
radeonsi_zerovram flag works around all of those issues.
since the flag works around all these issues and the original fix
doesn't work completely, we revert it and keep our optimisation.
Fixes #8273
--dscale= and --*scale-window= (i.e. an empty string) are valid settings
for their respective options (and are, in fact, the defaults).
This fixes the bug that it was impossible to reset e.g. tscale-window
back to the default "unset" setting after setting it once.
Credit goes to @CounterPillow for locating the cause of this bug.
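For example (the option name is real, the invocation purely
illustrative), this now works to get back to the unset default:
    mpv --tscale-window=blackman video.mkv
    # later, at runtime (e.g. via the console), this resets it again:
    set tscale-window ""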
Implementation copy/pasted from:
https://code.videolan.org/videolan/libplacebo/-/commit/f793fc0480f
This brings mpv's tone mapping more in line with industry standard
practices, for a hopefully more consistent result across the board.
Note that we ignore the black point adjustment of the tone mapping
entirely. In theory we could revisit this, if we ever make black point
compensation part of the mpv rendering pipeline.
glslang accepted this, perhaps erroneously, but mesa does not. It seems
to be incorrect. A caveat is that this means *all* SSBOs are now
coherent, but since we only use SSBOs for peak detection, that's a
non-issue. (And besides, marking something as coherent when we don't
perform any synchronization commands on it should be a no-op anyway)
Fixes #7823
The "screenshot window" command (ctrl+s by default) somehow broke video
colors with --gpu-api=vulkan --profile=gpu-hq when playback was paused.
I don't know the cause, but the rest of the code seems to imply
gl_video_reset_surfaces() needs to be called manually to flush some
caches, and it fixes the issue, so I assume there's no great mystery
here.
The original solution for #7017 was sort of a hack, but this hack is no
longer needed because c05e5d9d fixed the underlying issue causing this
error to be spammed in the first place. So just remove the "fix" that
apparently introduced about as many issues as it fixed.
Fixes #7719, hopefully.
This resolves prefixes such as "~/" and "~~/" at the caller, instead of
relying on stream_read_file() to do it. One of the following commits
will remove this from stream_read_file() itself.
Untested.
This was incorrect at least because the colorspace matrix attempted to
center chroma at (conceptually) 0.5, instead of 0. Also, it tried to
apply the fixed point shift logic for component sizes > 8 bit.
There is no float yuv format in mpv/ffmpeg yet, but see next commit,
which enables zimg to output it. I'm assuming zimg defines this format
such that luma is in range [0,1] and chroma in range [-0.5,0.5], with
the levels flag being ignored. This is consistent with H264/5 Annex E (I
think...), and it sort of seems to look right, so that's it.
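As a rough sketch of the assumed layout (full-range integer input, luma
normalized to [0,1], chroma recentered around 0; this is not code from
mpv or zimg):
    #include <stdint.h>

    /* convert one full-range integer YUV sample triple to the float layout
     * described above: luma in [0,1], chroma in [-0.5,0.5] centered at 0 */
    static void yuv_int_to_float(uint16_t y, uint16_t u, uint16_t v, int bits,
                                 float out[3])
    {
        float scale = 1.0f / (float)((1 << bits) - 1);
        out[0] = y * scale;
        out[1] = u * scale - 0.5f;
        out[2] = v * scale - 0.5f;
    }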
For some reason this was never done? Looking through the code, it was
never the case that the frame cache was hit for still frames. I have no
idea why not. It makes a lot of sense to do so.
Notably, this massively improves the performance of updating the OSC
when viewing e.g. large still images, or while paused. (Tested on a
4000x8000 image, the OSC now responds smoothly to user input)
This is mostly for testing. It adds passing the metadata through
the video chain. The metadata can be manipulated with vf_format. Support
for zimg alpha conversion (if built with zimg after it gained alpha
support) is implemented. Premultiplied input is supported in vo_gpu.
Some things still seem to be buggy.
Change all OPT_* macros such that they don't define the entire m_option
initializer, and instead expand only to a part of it, which sets certain
fields. This requires changing almost every option declaration, because
they all use these macros. A declaration now always starts with
{"name", ...
followed by designated initializers only (possibly wrapped in macros).
The OPT_* macros now initialize the .offset and .type fields only,
sometimes also .priv and others.
I think this change makes the option macros less tricky. The old code
had to stuff everything into macro arguments (and attempted to allow
setting arbitrary fields by letting the user pass designated
initializers in the vararg parts). Some of this was made messy due to
C99 and C11 not allowing 0-sized varargs with ',' removal. It's also
possible that this change is pointless, other than cosmetic preferences.
Not too happy about some things. For example, the OPT_CHOICE()
indentation I applied looks a bit ugly.
Much of this change was done with regex search&replace, but some places
required manual editing. In particular, code in "obscure" areas (which I
didn't include in compilation) might be broken now.
In wayland_common.c the author of some option declarations confused the
flags parameter with the default value (though the default value was
also properly set below). I fixed this as part of this change.
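Roughly, a declaration changes shape as in this illustrative sketch
(simplified field set and macro, not mpv's real m_option):
    #include <stddef.h>

    struct m_option {
        const char *name;
        size_t offset;
        int type;            /* stands in for mpv's option type descriptor */
        double min, max;
    };

    struct opts { int example_int; };

    /* new style: the macro expands to only part of the initializer */
    #define OPT_INT(field) \
        .offset = offsetof(struct opts, field), .type = 1

    static const struct m_option options[] = {
        /* the old style expanded to the whole entry instead, e.g.
         *   OPT_INT_OLD("example-int", example_int, 0, 100),         */
        {"example-int", OPT_INT(example_int), .min = 0, .max = 100},
        {0}
    };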
We have this cap now thanks to e2976e662, but we don't actually make
sure our FBOs are storable before we blindly attempt using them with
compute shaders.
There's no more need to unconditionally set `storage_dst = true` as long
as we make sure to include an extra condition on the `fbo_format`
selection to prevent users from accidentally enabling
compute-shader-only features with non-storable FBOs, alongside some
other miscellaneous adjustments to eliminate instances of "assumed
storability" from vo_gpu.
This simply makes the "is the destination FBO format bad?" check a tiny
bit less awful, by making sure we prefer storable FBO formats over
non-storable FBO formats. I'd love to make this also conditional on
whether or not we actually *need* a storable FBO format, but that logic
is decided later, in `pass_draw_to_screen`, and I don't want to
replicate the logic.
Fixes #7017.
Theoretically possible (and quite unlikely due to the small texture
size). The code was originally written with the assumption that texture
allocations can't fail, and it was never updated out of laziness.
Untested.
As we are less and less interested in vdpau, with nvdec and vaapi
being better choices in general on nvidia and AMD respectively, we
might consider removing direct_mode, where we bypass the vdpau
mixer and work directly with yuv textures. Normally, working with
yuv textures would be great, but vdpau built in an assumption that
all frames are delivered as separate fields, causing us to have
to re-interleave them.
nvidia then introduced a new OpenGL extension that can return the
yuv frames as frames, but we can't just unconditionally switch to
that as we'd want to keep supporting older hardware where the drivers
are no longer getting new features. The end result is that we
wouldn't be able to get rid of the old code paths.
Removing direct_mode means we always use the mixer, and work with
rgba frame textures. There are some theoretical limitations to
this, but in practice they probably don't matter much - unsupported
colourspaces don't matter because without 10bit decoding support,
we can't use them anyway, and apparently we're not doing separate
chroma scaling these days, so scaling the rgba doesn't really lose
anything (and the vdpau hq scaling option remains available).
Pretty annoying affair. The vo_gpu code could of course not trigger
rendering from filters yet, so it needed to be extended. Also, this uses
some icky stuff made for vf_sub (and this was the reason I marked vf_sub
as deprecated), so everything is terrible.
Handling the window with this function makes no sense, since windows
and kernels are not the same thing and don't share the same option list.
The only reason it's done is to make sure the char* points at the static
string rather than the dynamically allocated one, which we can do
manually in this function. Rewrite a bit for clarity/quality.
pass_color_map() (in video_shaders.c) and pass_colormanage() (video.c)
both duplicate the condition on whether to do peak computation. Peak
computation requires a compute shader, so if the duplicated conditions
don't match, video_shaders.c will generate a compute shader, but video.c
will try to run it as fragment shader. This leads to a "blue screen".
This can be reproduced by playing an HDTV video with --target-peak=99.
It's not clear how to fix this. Should pass_tone_map() be only invoked
if mp_trc_is_hdr() == true (what pass_colormanage() uses to decide
whether to enable peak computation), or should pass_colormanage() just
tell pass_color_map() to skip peak computation? Decide for the latter,
as it's more robust.
Even if not correct, at least it gets rid of the blue shit.
Fixes: #7149
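Schematically, the chosen approach looks like this toy sketch (names are
made up, not mpv's actual signatures): the decision is made exactly
once, and the shader generator only obeys it.
    #include <stdbool.h>
    #include <stdio.h>

    /* the shader generator no longer re-derives the condition */
    static void generate_color_map(bool detect_peak)
    {
        printf("emitting %s shader\n", detect_peak ? "compute" : "fragment");
    }

    static void colormanage_pass(bool trc_is_hdr, bool have_compute)
    {
        bool detect_peak = trc_is_hdr && have_compute; /* decided once, here */
        generate_color_map(detect_peak);               /* just passed down   */
    }

    int main(void)
    {
        colormanage_pass(true, true);   /* HDR + compute: peak detection on */
        colormanage_pass(false, true);  /* SDR: plain fragment shader       */
        return 0;
    }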
Probably. It's not like these pixel formats are formally specified -
FFmpeg added them because _some_ file format or decoder supports it, and
while that format/codec may define it precisely, the pixel format is
sort of disconnected and just an FFmpeg thing.
In any case, the yuva sample I had at hand uses the full range the
component data type can provide. The old code used the same "shifted"
range as for Y/U/V components, which must have been wrong.
This will not work correctly for packed YUVA formats, but fortunately
they matter even less.
For some reason, the first frame displayed on X11 with amdgpu and OpenGL
will be garbled. This is especially visible if the player starts,
displays a frame, but then still takes a while to properly start
playback.
With --interpolation, the behavior somehow changes (usually gets worse).
I'm not sure what exactly is going on, and the code in video.c is way
too abstruse. Maybe there is some slight possibility that a frame with
uncleared contents gets displayed, which somehow also corrupts another
frame that is displayed immediately after that.
If clear is unconditionally run, this somehow doesn't happen, and you
see a video frame. By any logic this shouldn't happen: a video frame
should always overwrite the background. So I can't exclude that this is
some sort of driver bug, or at least a very obscure interaction.
Clearing should be practically free anyway, so always do it.
Fixes: #7105
This lets us set primaries, transfer function and the target peak
based on what the presenting layer would want us to have.
Now that this mechanism is available, warn if the user has
overridden values such as primaries or transfer function.
Using e.g. --vf=format=0bgr showed obviously wrong colors with --vo=gpu.
The reason is that leading padding wasn't handled correctly.
Try to hack fix it. While the code in copy_image() is somewhat
reasonable, I can't tell what the fuck is going on with that HOOKED
shit. For some reason this HOOKED shit doesn't use copy_image() (???),
or uses it incorrectly. It affects debanding. --deband=no works
correctly. If it's enabled, the crap in hook_prelude() is needed.
I bet there are many more bugs with this. For example, the deband shader
will try to deband the alpha channel if the format abgr is used (because
the correct component order is only established later). This can be
tested by inserting a "color.x = 0;" at the end of the deband shader,
and using --vf=format=rgba vs. abgr.
I cannot comprehend why it doesn't just store explicitly which
components a texture contains, and why it doesn't just read the
components always in a uniform way.
There's a big chance this fix works only by coincidence. This shouldn't
have been so hard either. Time for a complete rewrite?
Internally, vo_gpu uses NaN for some options to indicate a default value
that is different depending on the context (e.g. different scalers).
There are 2 problems with this:
1. you couldn't reset the options to their defaults
2. NaN is a damn mess and shouldn't be part of the API
The option parser already rejected NaN explicitly, which is why 1.
didn't work. Regarding 2., JSON might be a good example, and actually
caused a bug report.
Fix this by mapping NaN to the special value "default". I think I'd
prefer other mechanisms (maybe just having every scaler expose separate
options?), but for now this will do. See you in a future commit, which
painfully deprecates this and replaces it with something else.
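For example (real option names, arbitrary values):
    mpv --scale-param1=1.0 video.mkv      # explicit override
    mpv --scale-param1=default video.mkv  # restore the context-dependent default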
I refrained from using "no" (my favorite magic value for "unset" etc.)
because then I'd have to e.g. make --no-scale-param1 work, which in
addition to a lot of effort looks dumb and nobody will use it.
Here's also an apology for the shitty added test script.
Fixes: #6691
In some cases, src.sig_peak remains undefined as 0, which was definitely
the case when using the OSD, since it never got passed through the usual
color space normalization process. The most robust work-around is to simply
force the normalization at the site where it's needed. This ensures this
value is always valid and defined, to make the peak-dependent logic in
these two functions always work.
Fixes 4b25ec3a9d
Fixes #6917
Fixes #6918
error diffusion requires two texture rendering passes. The existing code
reuses `screen_tex` and creates another one for that purpose. This works
generally well for opengl, but could potentially be problematic for
vulkan, due to its async nature.
This is a straightforward parallel implementation of error diffusion
algorithms in a compute shader. Basically we use a single work group with
the maximal possible size to process the whole image. After a shift
mapping we are able to process all pixels column by column.
A large ring buffer is allocated in shared memory to speed things up.
However, the size of the required shared memory depends linearly on the
height of the video window (or the screen height in fullscreen mode). In
case there is not enough shared memory, it will fall back to `--dither=fruit`.
The maximal allowed work group size is hardcoded as 1024. Ideally we
could query `GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS`. But for whatever
reason, it seems most high end cards from nvidia and amd support only
the minimal required value, so I guess we can stick to it for now.
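For reference, the serial form of the algorithm being parallelized here
(classic Floyd-Steinberg weights on a grayscale buffer; this is not the
shader code):
    /* quantize each pixel to 0/1 and push the rounding error onto
     * not-yet-visited neighbours, weighted 7/16, 3/16, 5/16, 1/16 */
    static void error_diffuse(float *img, int w, int h)
    {
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                float old = img[y * w + x];
                float q = old < 0.5f ? 0.0f : 1.0f;
                float err = old - q;
                img[y * w + x] = q;
                if (x + 1 < w)
                    img[y * w + x + 1] += err * 7 / 16;
                if (y + 1 < h) {
                    if (x > 0)
                        img[(y + 1) * w + x - 1] += err * 3 / 16;
                    img[(y + 1) * w + x] += err * 5 / 16;
                    if (x + 1 < w)
                        img[(y + 1) * w + x + 1] += err * 1 / 16;
                }
            }
        }
    }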
The calculation of the scale factor involves 32-bit floats, and a strict
equality test will effectively ignore the `--scaler-resizes-only` option
for some non-integer scale factors.
Fix this by using a non-strict equality check.
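A minimal sketch of the idea (the tolerance value is illustrative, not
the one used in the code):
    #include <math.h>
    #include <stdbool.h>

    /* treat scale factors as equal if they differ only by rounding noise */
    static bool scale_factor_eq(double a, double b)
    {
        return fabs(a - b) < 1e-3;
    }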
The old size limit was chosen before LUT textures were supported in user
shaders. At that time, the whole user shader would be compiled and run
on the GPU, which made large user shaders impractical.
With the introduction of LUT textures, the old size limit doesn't make
any sense. For example, a 1024x1024 rgba16f LUT will cost 32MB shader
size.
Fix this by increasing the size limit to a value that's unlikely to be
reached.
Manual changes done:
* Merged the interface-changes under the already master'd changes.
* Moved the hwdec-related option changes to video/decode/vd_lavc.c.
Before this commit, the texture offset was set only after all source
textures were finalized, which means CHROMA hooks won't be able to align
with luma planes. This could be problematic for chroma prescalers
utilizing information from the luma plane.
Fix this by finding the reference texture early, and setting the global
texture offset early.
This solves some edge cases when using files with very weird metadata
(e.g. MaxCLL 10k and so forth). Instead of just blindly seeding it with
the tagged metadata, forcibly set the initial state from the detected
values.
Rather than the linear cd/m^2 units, these (relative) logarithmic units
lend themselves much better to actually detecting scene changes,
especially since the scene averaging was changed to also work
logarithmically.
This change switches to a logarithmic mean to estimate the average
signal brightness. This handles dark scenes with isolated highlights
much more faithfully than the linear mean did, since the log of the
signal roughly corresponds to the perceptual brightness.
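Schematically (a sketch of the idea in plain C, not the shader code),
the logarithmic mean is a geometric mean computed in log space, with a
small offset keeping log() finite for black pixels:
    #include <math.h>

    /* log-average of the per-pixel signal values sig[0..n-1] */
    static float log_mean(const float *sig, int n)
    {
        double acc = 0;
        for (int i = 0; i < n; i++)
            acc += log(1e-6 + sig[i]);
        return (float)exp(acc / n);
    }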
In theory our "eye adaptation" algorithm works in both ways, both
darkening bright scenes and brightening dark scenes. But I've always
just prevented the latter with a hard clamp, since I wanted to avoid
blowing up dark scenes into looking funny (and full of noise).
But allowing a tiny bit of over-exposure might be a good thing. I won't
change the default just yet (better let users test), but a moderate
value of 1.2 might be better than the current 1.0 limit. Needs testing
especially on dark scenes.
The previous approach of using an FIR with tunable hard threshold for
scene changes had several problems:
- the FIR involved annoying hard-coded buffer sizes, high VRAM usage,
and a FIR sum that was prone to numerical overflow, which limited the
number of frames we could average over.
- the hard scene change detection was prone to both false positives and
false negatives, each with their own (annoying) issues.
Scrap this entirely, and also totally redesign the scene change
detection: switch to a dual approach of using a simple
single-pole IIR low pass filter to smooth out noise, while using a
softer scene change curve (with tunable low and high thresholds), based
on `smoothstep`. The IIR filter is extremely simple in its
implementation and has an arbitrarily user-tunable cutoff frequency,
while the smoothstep-based scene change curve provides a good, tunable
tradeoff between adaptation speed and stability - without exhibiting
either of the traditional issues associated with the hard cutoff.
Another way to think about the new options is that the "low threshold"
provides a margin of error within which we don't care about small
fluctuations in the scene (which will therefore be smoothed out by the
IIR filter).
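Schematically, one update step of this combination might look like the
sketch below (the names and the exact way the two parts are combined are
assumptions, not mpv's implementation):
    #include <math.h>

    static float smoothstep_f(float lo, float hi, float x)
    {
        x = fminf(fmaxf((x - lo) / (hi - lo), 0.0f), 1.0f);
        return x * x * (3.0f - 2.0f * x);
    }

    /* 'state' is the smoothed brightness estimate, 'cur' the current frame's
     * measurement, 'alpha' the IIR coefficient derived from the user-tunable
     * cutoff, lo/hi the low and high scene-change thresholds */
    static float update_brightness(float state, float cur, float alpha,
                                   float lo, float hi)
    {
        float scene = smoothstep_f(lo, hi, fabsf(cur - state)); /* 0..1         */
        float coeff = alpha + (1.0f - alpha) * scene;           /* snap on cuts */
        return state + coeff * (cur - state);
    }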