They are sort of confusing, and they hide the fact that they have an
alpha component. Using the actual formats directly is no problem, since
these were useful only for big endian systems, something we can't test
anyway.
This was not intended to be committed in 0e3f893606. It disables
the extra wakeup if working==true. I had convinced myself at the time
that the wakeup was really needed, so I have no idea how I didn't notice
this until
someone pointed it out on the commit diff on github (lol).
This resolves prefixes such as "~/" and "~~/" at the caller, instead of
relying on stream_read_file() to do it. One of the following commits
will remove this from stream_read_file() itself.
Untested.
The caller of render_frame() re-iterates without waiting if this
function returns true. That's normally meant for DS (display-sync) mode,
where we draw frames as fast as possible and let the driver do the
waiting.
draw_bmp.c is the software blender for subtitles and OSD. It's used by
encoding mode (burning subtitles), and some VOs, like vo_drm, vo_x11,
vo_xv, and possibly more.
This changes the algorithm: instead of upsampling the video to 4:4:4 and
then blending, the OSD is now downsampled and blended directly into the
video.
This has far-reaching consequences for its internals, and results in an
effective rewrite.
Since I wanted to avoid un-premultiplying, all blending is done with
premultiplied alpha. That's actually the sane thing to do. The old code
just didn't do it, because it's very weird in YUV fixed point.
Essentially, you'd have to compensate for the chroma centering constant
by subtracting src_alpha/255*128. This seemed so hairy (especially with
correct rounding and high bit depths involved) that I went for using
float.
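Roughly what the float path computes per pixel (a sketch with made-up
names; the real code works on whole planes):

    // Premultiplied "over" blend in float. Luma is in [0,1]; chroma is
    // centered at 0 (range [-0.5,0.5]), so "no color" is plain 0 and no
    // src_alpha/255*128 compensation term is needed.
    static inline float blend_component(float video, float osd_premul,
                                        float osd_alpha)
    {
        return osd_premul + video * (1.0f - osd_alpha);
    }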
I think it turned out mostly OK, although it's more complex and less
maintainable than before. reinit() is certainly a bit too long. While it
should be possible to optimize the RGB path more (for example by
blending directly instead of doing the stupid float conversion), the new
code is probably slower overall. vo_xv users probably lose out, because
it takes the slowest path (due to subsampling requirements and using
YUV).
Why this rewrite? Nobody knows. I simply forgot the reason. But you'll
have it anyway. Whether or not this would have required a full rewrite,
at least it supports target alpha now (you can for example hard sub
transparent PNGs, if you ever wanted to use mpv for this).
Remove the check in vf_sub. The new draw_bmp.c is not as reliant on
libswscale anymore (it mostly uses repack.c now), and osd.c now shows an
error message when support is missing.
Formats with chroma subsampling of 4 are not supported, because FFmpeg
doesn't provide pixfmt definitions for alpha variants. We could provide
those ourselves (relatively trivial), but why bother.
This provides a way to convert YUV in fixed point (or pseudo-fixed
point; probably best to say "uint") to float, or rather coefficients
to perform such a conversion. Things like colorspace conversion are out
of scope, so this is simple and strictly per-component.
This is somewhat similar to mp_get_csp_mul(), but includes proper color
range expansion and correct chroma centering. The old function even
seems to have a bug, and assumes something about shifted range for full
range YCbCr, which is wrong.
vo_gpu should probably use the new function eventually.
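As a sketch of what such per-component coefficients look like
(hypothetical struct and function; the real API may differ, and this
assumes bit depths >= 8):

    #include <stdbool.h>

    // float_val = uint_val * mul + offset
    struct uint_to_float { double mul, offset; };

    static struct uint_to_float get_coeffs(int bits, bool limited,
                                           bool chroma)
    {
        // Limited range: luma spans 219 codes, chroma 224, scaled by depth.
        double size = limited ? (chroma ? 224.0 : 219.0) * (1 << (bits - 8))
                              : (double)((1 << bits) - 1);
        // Zero point: chroma centering constant, or luma black level.
        double zero = chroma ? 128.0 * (1 << (bits - 8))
                             : (limited ? 16.0 * (1 << (bits - 8)) : 0);
        return (struct uint_to_float){
            .mul    = 1.0 / size,
            .offset = -zero / size, // centers chroma at 0
        };
    }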
Adding all these so I can use them for obscure processing purposes (see
later draw_bmp commit).
There isn't really a reason why they should exist. On the other hand,
they're just labels for formats that can be handled in a generic way,
and this commit adds support for them in the zimg wrapper and vo_gpu
just by making the formats exist. (Well, vo_gpu had to be fixed in the
previous commit.)
This was incorrect at least because the colorspace matrix attempted to
center chroma at (conceptually) 0.5, instead of 0. Also, it tried to
apply the fixed point shift logic for component sizes > 8 bit.
There is no float yuv format in mpv/ffmpeg yet, but see next commit,
which enables zimg to output it. I'm assuming zimg defines this format
such that luma is in range [0,1] and chroma in range [-0.5,0.5], with
the levels flag being ignored. This is consistent with H264/5 Annex E (I
think...), and it sort of seems to look right, so that's it.
Was broken with a zimg wrapper refucktor before the previous commit. In
addition, it seems this didn't match the vo_drm format, or the format
naming convention. So the order actually changes, and the format is
redefined. (The img_format.h comment was probably wrong.)
Change vo_gpu to the new format as well, so we can still test it.
For whatever purpose. If anything, this makes the zimg wrapper cleaner.
The added tests are not particularly exhaustive, but nice to have. This
also makes the scale_zimg.c test pretty useless, because it only tests
repacking (going through the zimg wrapper). In theory, the repack_tests
things could also be used on scalers, but I guess it doesn't matter.
Some things are added over the previous zimg wrapper code. For example,
some fringe formats can now be expanded to 8 bit per component for
convenience.
Fallback to drmModeAddFB2 if drmModeAddFB2WithModifiers fails. I've
observed it failing on a pinebook pro running manjaro. We also got "0"
as modifiers from FFmpeg anyway, which might or might not have
something to do with this.
Instead of trying to find the source of the problem, just add this
fallback.
Untested (I don't have a platform that requires modifiers to work
here). Might break something, or might fix something. At least this
looks more intuitive to me.
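The shape of the fallback, as a sketch (standard libdrm calls;
surrounding variables assumed):

    int ret = drmModeAddFB2WithModifiers(fd, width, height, format,
                                         handles, pitches, offsets,
                                         modifiers, &fb_id,
                                         DRM_MODE_FB_MODIFIERS);
    if (ret) {
        // e.g. the driver rejects the modifiers: retry the plain variant
        ret = drmModeAddFB2(fd, width, height, format,
                            handles, pitches, offsets, &fb_id, 0);
    }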
The current implementation of presentation feedback was designed to be
used with flip model presentation. With the bitblt model,
GetFrameStatistics returns totally different values and it's not clear
if we can use them at all. Previously, this wasn't a problem because
with the bitblt model, GetFrameStatistics only worked in exclusive
fullscreen. Now that mpv supports exclusive fullscreen, we should
explicitly check for a flip model swapchain before using presentation
feedback.
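A sketch of such a check (C-style COM with COBJMACROS; error handling
omitted):

    DXGI_SWAP_CHAIN_DESC desc;
    IDXGISwapChain_GetDesc(swapchain, &desc);
    bool flip_model = desc.SwapEffect == DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL ||
                      desc.SwapEffect == DXGI_SWAP_EFFECT_FLIP_DISCARD;
    if (flip_model) {
        DXGI_FRAME_STATISTICS stats;
        IDXGISwapChain_GetFrameStatistics(swapchain, &stats);
        // ...use stats for presentation feedback...
    }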
This is really basic for planar image data access; not sure why there
weren't such helpers before.
They also handle trickier formats that use bit-packing; without those,
they would be much simpler. (This affects only BGR4/RGB4/MONOW/MONOB, I
hope whoever invented them is proud of triggering so many special cases
for so little gain.)
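A sketch of what such a helper has to compute for the bit-packed cases
(hypothetical name, fields modeled on mp_image):

    // Return a byte pointer plus bit offset for pixel (x, y) in a plane.
    // bpp may be smaller than 8 for the bit-packed formats above.
    static uint8_t *pixel_ptr(struct mp_image *img, int plane,
                              int x, int y, int *bit_offset)
    {
        size_t bit_pos = (size_t)x * img->fmt.bpp[plane];
        *bit_offset = bit_pos % 8; // non-zero only for sub-byte packing
        return img->planes[plane] + (ptrdiff_t)y * img->stride[plane]
                                  + bit_pos / 8;
    }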
These were still mapped to MP errors during probing, but they also get
triggered when instance creation fails due to lack of support for e.g.
wayland. Since waylandvk is probed above x11vk, we should probably
suppress these by default.
Closes #7626
For some reason this was never done? Looking through the code, it was
never the case that the frame cache was hit for still frames. I have no
idea why not. It makes a lot of sense to do so.
Notably, this massively improves the performance of updating the OSC
when viewing e.g. large still images, or while paused. (Tested on a
4000x8000 image, the OSC now responds smoothly to user input)
Only _writes_ are aligned, so the assumption doesn't work for reads. But
it's easy to fix by rounding x0 down to a byte boundary. Writing
pixels outside of the read area is allowed, and we don't go out of
buffer bounds.
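The alignment fix amounts to something like this (sketch; bpp is bits
per pixel of the packed format and assumed to divide 8):

    // Round the left edge down so reads start on a byte boundary.
    // E.g. bpp == 1: round down to a multiple of 8 pixels.
    x0 -= x0 % (8 / bpp);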
Patch by anon32, permission to do anything with it.
This is mostly for testing. It adds passing the metadata through the
video chain. The metadata can be manipulated with vf_format. Support
for zimg alpha conversion (if built with zimg after it gained alpha
support) is implemented. Premultiplied input is supported in vo_gpu.
Some things still seem to be buggy.
Used to be used by vo_x11, and some other situations where software
conversion was employed. Haven't seen anyone complain about how software
brightness controls went away (originating from mplayer), so whatever,
it won't be needed again.
In the wayland code, the left mouse click is treated a bit differently.
Dragging the left click allows mpv to request a window move to the
compositor. In some cases, this can also request a window resize if the
osc-windowcontrols are enabled. These functions had the strange side
effect of messing up mpv's deadzone (it seemed to disappear completely).
A harmless enough workaround is to just explicitly send an UP event for
left click after the move/resize functions are finished executing. The
xdg_toplevel move and resize functions both finish after the button
press is let go, so we are guaranteed to have the left click in the UP
state here. Sending this event probably unconfuses some calculation
somewhere thus fixing the deadzone bug. It feels a little silly, but
it's safe and works. Fixes#7651.
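In code, the workaround looks roughly like this (field names are
assumptions based on typical mpv wayland state):

    xdg_toplevel_move(wl->xdg_toplevel, wl->seat, serial);
    // The compositor swallows the release during the interactive move,
    // so tell mpv's input core ourselves that the button is up.
    mp_input_put_key(wl->vo->input_ctx, MP_MBTN_LEFT | MP_KEY_STATE_UP);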
Move the lookup of GBRP and planar gray/alpha formats to separate
functions in some cases.
Make setup_regular_rgb_packer() not use a 4:4:4 YUV format to "pass"
RGB. This was used as a "trick" to avoid the stupid GBRP plane
permutation, but it was severely confusing, so get rid of it. Just do the
reordering, even if the zimg wrapper itself will reorder it back (which
is so stupid that I used the other approach at first). The comment
saying IMGFMT_420P was bogus of course; typically it was IMGFMT_444P.
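For reference, the permutation in question (GBRP stores its planes in
G, B, R order):

    // Map R,G,B component index to GBRP plane index.
    static const int gbrp_plane[3] = {2, 0, 1}; // R -> 2, G -> 0, B -> 1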
This was inconsistent for unknown reasons. monob was the way we wanted
it, and handling of monow was missing.
See the previous "img_format: add some mpv-only helper formats" commit.
Matters for the zimg wrapper.
mp_imgfmt_get_forced_csp() should be consistent with the MP_CSP_RGB/YUV
flags.
At least the different handling of the XYZ exception was a mess, even if
the result was the same.
Utterly useless, but the intention is to make dealing with corner case
pixel formats (forced upon us by FFmpeg, very rarely) less of a pain.
The zimg wrapper will use them. (It already supports these formats
automatically, but it will help with its internals.)
Y1 is considered RGB, even though gray formats are generally treated as
YUV for various reasons. mpv will default all YUV formats to limited
range internally, which makes no sense for a 1 bit format, so this is a
problem. I wanted to avoid that mp_image_params_guess_csp() (which
applies the default) explicitly checks for an image format, so although
a bit janky, this seems to be a good solution, especially because I
really don't give a shit about these formats, other than having to
handle them. It's notable that AV_PIX_FMT_MONOBLACK (also 1 bit gray,
just packed) already explicitly marked itself as RGB.
When I added mp_regular_imgfmt, I made the chroma subsampling use the
actual chroma division factor, instead of a shift (log2 of the actual
value). I had some ideas about how this was (probably?) more intuitive
and general. But nothing ever uses non-power of 2 subsampling (except
jpeg in rare cases apparently, because the world is a bad place).
Change the fields back to use shifts and rename them to avoid mistakes.
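The relationship between the two representations, for clarity (the
shift field names here are an assumption):

    // 4:2:0 => chroma_xs = chroma_ys = 1; the old division factor is
    // recovered as 1 << shift.
    int chroma_w = (luma_w + (1 << chroma_xs) - 1) >> chroma_xs;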
Make this slightly less ad-hoc. Also correct the missing alpha flag for
yap8/yap16.
Despite reduced redundancy, the LOC is going up anyway... whatever.
On FreeBSD and DragonFly, the kernel checks whether `frsig` is valid and
aborts with `EINVAL` if it isn't. However, `frsig` was never implemented.
$ build/mpv --gpu-context=drm /path/to/video.mkv
[...]
[vo/gpu] VT_SETMODE failed: Invalid argument
[vo/gpu/opengl] Failed to set up VT switcher. Terminal switching will be unavailable.
[...]
$ ./waf configure
Checking for vt.h : no
Checking for DRM : vt.h not found
[...]
../test.c:1:10: fatal error: 'sys/vt.h' file not found
#include <sys/vt.h>
^~~~~~~~~~
$ build/mpv --gpu-context=drm /path/to/video.mkv
Error parsing option gpu-context (option parameter could not be parsed)
Setting commandline option --gpu-context=drm failed.
Exiting... (Fatal error)
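A minimal sketch of the implied workaround, assuming any valid signal
number satisfies the kernel's check (the signal choice here is
arbitrary, not necessarily what the actual code does):

    #include <signal.h>
    #include <sys/ioctl.h>
    #include <sys/consio.h> /* FreeBSD; Linux has struct vt_mode in <sys/vt.h> */

    static int setup_vt_mode(int tty_fd)
    {
        struct vt_mode vt = {
            .mode   = VT_PROCESS,
            .relsig = SIGUSR1,
            .acqsig = SIGUSR2,
            /* never implemented, but validated on FreeBSD/DragonFly */
            .frsig  = SIGUSR1, /* assumption: any valid signal passes */
        };
        return ioctl(tty_fd, VT_SETMODE, &vt);
    }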
One not-so-nice hack in the wayland code is the assumption of when a
window is hidden (out of view from the compositor) and an arbitrary
delay for enabling/disabling the usage of presentation time. Since you
do not receive any presentation feedback when a window is hidden on
wayland (a feature or misfeature depending on who you ask), the ust is
updated based on the refresh_nsec statistic gathered from the previous
feedback event.
The flaw with this is that refresh_nsec basically just reports back the
display's refresh rate (1 / refresh_rate * 10^9). It doesn't tell you
how long the vsync interval really was. So as a video is left playing
out of view, the wl->last_queue_display_time becomes increasingly
inaccurate. This led to a vsync spike when bringing the mpv window back
into sight after it was hidden for a period of time. The hack for
working around this is to just wait a while before enabling presentation
time again. The discrepancy between the "bogus"
wl->last_queue_display_time and the actual value you get from the
feedback only happens initially after a switch. If you just discard
those values, you avoid the dramatic vsync spike.
It turns out that there's a smarter way to do this. Just use mp_time_us
deltas. The whole reason for these hacks is because
wl->last_queue_display_time wasn't close enough to how long it would
take for a frame to actually display if it wasn't hidden. Instead, mpv's
internal timer can be used, and the difference between wayland_sync_swap
calls is a close enough proxy for the vsync interval (certainly better
than using the monitor's refresh rate). This avoids the entire conundrum
of massive vsync spikes when bringing the player back into view, and it
means we can get rid of extra crap like wl->hidden.
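A sketch of the new approach (field names are hypothetical; mp_time_us()
is mpv's monotonic microsecond clock):

    int64_t now_us = mp_time_us();
    if (wl->last_sync_us > 0) {
        // Delta between wayland_sync_swap calls as a vsync-interval proxy.
        wl->vsync_interval_us = now_us - wl->last_sync_us;
    }
    wl->last_sync_us = now_us;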
In theory this mostly happens automatically, especially after the 5
vsync limit disables this already. But if we uninit before 5 vsyncs are
rendered, this can get left in a dangling 'enabled' state, which leaks a
debug report callback.
Always explicitly disable it just to be on the safe side.
I must have messed this up when I actually added the Y210 format
(because that one is correct). So my comment in the commit adding this
about the FFmpeg pixfmt doxygen being wrong was wrong.
I'd like to use this opportunity to complain once more about the
existence of these terrible pixel formats.
Again worthless, slow, and only for libswscale parity.
With this, we support all formats libswscale supports, except bayer
input, and rgb4/bgr4 output. We even support some formats libswscale
doesn't.
It's possible that the zimg wrapper isn't always as fast as libswscale.
But there is optimization potential: the inner repack loops are
self-contained enough that they could reasonably be implemented in
assembler (probably), and doing everything slice-wise should reduce the
overhead of the separate pack/unpack stages.
Just lazily tested.
The comment on AV_PIX_FMT_Y210LE seems to be wrong. It claims it's "like
YUYV422", but it seems more like YVYU422, at least the way libswscale
treats it as input. Maybe Intel pays its developers too much?
The repacker inner loop is probably rather inefficient. In theory we
could optimize it by reading the packed pixels as words, doing the
component reshuffling using compile time values etc., but I'd rather
keep the code size small. It's already bad enough that we have to
support 16 bit per component variants, just because this one Intel guy
couldn't keep it in his pants. In general, I can't be bothered to spend
time on optimizing it; I'm only doing this for fun (i.e. masochistic
obligation).
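For reference, the general shape of such an inner loop (a simplified
sketch for a YUYV-like packed 4:2:2 layout, not the actual code):

    // Unpack 2 pixels (4 packed components) per iteration into planes.
    for (int x = 0; x < w / 2; x++) {
        y_plane[2 * x + 0] = src[4 * x + 0];
        u_plane[x]         = src[4 * x + 1];
        y_plane[2 * x + 1] = src[4 * x + 2];
        v_plane[x]         = src[4 * x + 3];
    }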
This covers 8 and 16 bit packed RGB formats. It doesn't really help with
any actual use-cases, other than giving the finger to libswscale.
One problem is with different color depths. For example, rgb565 provides
1 bit more resolution to the green channel. zimg can only dither to a
uniform depth. I tried dithering to the highest depth and shifting away
1 bit for the lower channels, but that looked ugly (or I messed up
somewhere), so instead it dithers to the lowest depth, and adjusts the
value range if needed. Testing with bgr4_byte (extreme case with 1/2/1
depths), it looks more "grainy" (ordered dithering artifacts) than
libswscale, but it also looks cleaner and smoother. It doesn't have
libswscale's weird red-shift. So I call it a success.
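For rgb565, that means dithering all channels to 5 bits and then
expanding green back to the 6-bit range, e.g. with the usual
bit-replication trick (a sketch):

    // Approximate round(g5 * 63.0 / 31.0) without a division.
    unsigned g6 = (g5 << 1) | (g5 >> 4);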
Big endian formats need to be handled explicitly; the generic big endian
swapper code assumes byte-aligned components.
Unpacking is done with shifts and 3 LUTs. This is symmetric to the
packer. Using a generated palette might be better, but I preferred to
keep the symmetry, and not having to mess with a generated palette and
the pal8 code.
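E.g. for rgb565, unpacking with shifts and 3 LUTs looks roughly like
this (sketch; the LUTs expand 5/6 bit values to 8 bit, i.e.
lut5[v] = v * 255 / 31 and so on):

    uint16_t pix = src[x];
    dst_r[x] = lut5[(pix >> 11) & 0x1F];
    dst_g[x] = lut6[(pix >>  5) & 0x3F];
    dst_b[x] = lut5[ pix        & 0x1F];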
This uses FFmpeg pixfmt constants directly. I would have preferred
keeping zimg completely separate. But neither do I want to add an IMGFMT
alias for every one of these formats, nor do I want to extend our imgfmt
code such that it can provide a complete description of each packed RGB
format (similar to FFmpeg pixdesc).
It also appears that FFmpeg pixdesc as well as the FFmpeg pixfmt doxygen
have an error regarding RGB8: the R/B bit depths are swapped. libswscale
appears to be handling them differently. Not completely sure, as this is
the only packed format case with R/B having different depths (instead
of G, the middle component, where things are symmetric).
One of the extremely annoying dumb things in ffmpeg is that most pixel
formats are available as little endian and big endian variants. (The
sane way would be having native endian formats only.) Usually, most of
the real codecs use native formats only, while non-native formats are
used by fringe raw codecs only. But the PNG encoders and decoders
unfortunately use big endian formats, and since PNG is such a popular
format, this causes problems for us. In particular, the current zimg
wrapper will refuse to work (and mpv will fall back to sws) when writing
non-8 bit PNGs.
So add non-native endian support to zimg. This is done in a fairly
"generic" way (which means lots of potential for bugs). If input is a
"regular" format (and just byte-swapped), the rest happens
automatically, which happens to cover all interesting formats.
Some things could be more efficient; for example, the endian swap is
done in a separate pass before the data is passed to the unpacker. You
could make endian swapping part of the actual unpacking process, which
might be slightly
faster. You could avoid copying twice in some cases (such as when
there's no actual repacker, or if alignment needs to be corrected). But
I don't really care. It's reasonably fast for the normal case.
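The separate pass amounts to something like this (sketch for 16-bit
components):

    // Byte-swap every 16-bit component in place, then unpack normally.
    for (size_t i = 0; i < num_words; i++)
        words[i] = (uint16_t)((words[i] >> 8) | (words[i] << 8));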
Not entirely sure whether this is correct. Some (but not many) formats
are covered by the tests, some I tested manually. Some I can't even
test, because libswscale doesn't support them (like nv20*).