For some reason this was never done? Looking through the code, it was
never the case that the frame cache was hit for still frames. I have no
idea why not. It makes a lot of sense to do so.
Notably, this massively improves the performance of updating the OSC
when viewing e.g. large still images, or while paused. (Tested on a
4000x8000 image, the OSC now responds smoothly to user input)
Only _writes_ are aligned, so the assumption doesn't work for reads. But
it's easy to fix by rounding down x0 to the next byte boundary. Writing
pixels outside of the read area is allowed, and we don't go out of
buffer bounds.
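The fix amounts to something like this (a minimal sketch for sub-byte
formats; the helper name is made up for illustration):

  // Round x0 down so that reads start on a byte boundary. Assumes a
  // packed format with less than 8 bits per pixel (e.g. monow/monob).
  static int align_x0_down(int x0, int bits_per_pixel)
  {
      int pixels_per_byte = 8 / bits_per_pixel;  // 8 for 1 bpp formats
      return x0 - x0 % pixels_per_byte;          // extra pixels get written anyway
  }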
Patch by anon32, permission to do anything with it.
This is mostly for testing. It adds passing the metadata through the
video chain. The metadata can be manipulated with vf_format. Support for
zimg alpha conversion (if built with a zimg version that has gained alpha
support) is implemented. Premultiplied input is supported in vo_gpu.
Some things still seem to be buggy.
Used to be used by vo_x11, and in some other situations where software
conversion was employed. Haven't seen anyone complain about how software
brightness controls went away (originating from mplayer), so whatever,
it won't be needed again.
In the wayland code, the left mouse click is treated a bit differently.
Dragging the left click allows mpv to request a window move to the
compositor. In some cases, this can also request a window resize if the
osc-windowcontrols are enabled. These functions had the strange side
effect of messing up mpv's deadzone (it seemed to disappear completely).
A harmless enough workaround is to just explicitly send an UP event for
left click after the move/resize functions are finished executing. The
xdg_toplevel move and resize functions both finish after the button
press is let go, so we are guaranteed to have the left click in the UP
state here. Sending this event probably unconfuses some calculation
somewhere, thus fixing the deadzone bug. It feels a little silly, but
it's safe and works. Fixes #7651.
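The workaround boils down to a single call after the move/resize returns
(mp_input_put_key() and the key constants are real mpv input APIs; the
exact call site here is approximate):

  // The button is already released by the time xdg_toplevel_move()/
  // xdg_toplevel_resize() return, so tell the core about it explicitly.
  mp_input_put_key(wl->vo->input_ctx, MP_MBTN_LEFT | MP_KEY_STATE_UP);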
Move lookup of GBRP or planar gray/alpha formats to separate functions
in some cases.
Make setup_regular_rgb_packer() not use a 4:4:4 YUV format to "pass"
RGB. This was used as a "trick" to avoid the stupid GBRP plane
permutation, but it was severely confusing, so get rid of it. Just do
the reordering, even if the zimg wrapper itself will reorder it back
(which is so stupid that I used the other approach at first). The comment
saying IMGFMT_420P was bogus of course; typically it was IMGFMT_444P.
This was inconsistent for unknown reasons. monob was the way we wanted
it, and handling of monow was missing.
See the previous "img_format: add some mpv-only helper formats" commit.
Matters for the zimg wrapper.
mp_imgfmt_get_forced_csp() should be consistent with the MP_CSP_RGB/YUV
flags.
At least the different handling of the XYZ exception was a mess, even if
the result was the same.
Utterly useless, but the intention is to make dealing with corner case
pixel formats (forced upon us by FFmpeg, very rarely) less of a pain.
The zimg wrapper will use them. (It already supports these formats
automatically, but it will help with its internals.)
Y1 is considered RGB, even though gray formats are generally treated as
YUV for various reasons. mpv will default all YUV formats to limited
range internally, which makes no sense for a 1 bit format, so this is a
problem. I wanted to avoid having mp_image_params_guess_csp() (which
applies the default) explicitly check for an image format, so although
it's a bit janky, this seems to be a good solution, especially because I
really don't give a shit about these formats, other than having to
handle them. It's notable that AV_PIX_FMT_MONOBLACK (also 1 bit gray,
just packed) already explicitly marked itself as RGB.
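In terms of what mp_image_params_guess_csp() ends up doing, the effect is
roughly this (simplified, not the literal code):

  // A forced RGB csp short-circuits the "YUV defaults to limited range"
  // logic, so 1 bit gray doesn't end up as limited-range YUV.
  enum mp_csp forced = mp_imgfmt_get_forced_csp(params->imgfmt);
  if (forced == MP_CSP_RGB) {
      params->color.space = MP_CSP_RGB;
      params->color.levels = MP_CSP_LEVELS_PC;  // full range
  }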
When I added mp_regular_imgfmt, I made the chroma subsampling use the
actual chroma division factor, instead of a shift (log2 of the actual
value). I had some ideas about how this was (probably?) more intuitive
and general. But nothing ever uses non-power of 2 subsampling (except
jpeg in rare cases apparently, because the world is a bad place).
Change the fields back to use shifts and rename them to avoid mistakes.
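The old division-factor values are still trivially recovered from the
shifts (field names approximate):

  // With shifts, 4:2:0 has chroma_xs = chroma_ys = 1, and the old-style
  // factors are just powers of 2:
  int chroma_w = 1 << fmt.chroma_xs;  // -> 2
  int chroma_h = 1 << fmt.chroma_ys;  // -> 2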
Make this slightly less ad-hoc. Also correct the missing alpha flag for
yap8/yap16.
Despite reduced redundancy, the LOC is going up anyway... whatever.
On FreeBSD and DragonFly, the kernel checks whether `frsig` is valid and
aborts with `EINVAL` if it is not. However, `frsig` was never implemented.
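The VT_SETMODE setup in question looks roughly like this; filling in any
valid signal for frsig is the kind of change that addresses the EINVAL
(the specific signals shown are placeholders, not necessarily mpv's
choices):

  #include <signal.h>
  #include <sys/ioctl.h>
  #include <sys/vt.h>  // header location differs per OS, see below

  static int setup_vt_switcher(int tty_fd)
  {
      struct vt_mode vt = {
          .mode   = VT_PROCESS,
          .relsig = SIGUSR1,  // VT release
          .acqsig = SIGUSR1,  // VT acquire
          .frsig  = SIGIO,    // validated (but unused) by these kernels
      };
      return ioctl(tty_fd, VT_SETMODE, &vt);
  }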
$ build/mpv --gpu-context=drm /path/to/video.mkv
[...]
[vo/gpu] VT_SETMODE failed: Invalid argument
[vo/gpu/opengl] Failed to set up VT switcher. Terminal switching will be unavailable.
[...]
$ ./waf configure
Checking for vt.h : no
Checking for DRM : vt.h not found
[...]
../test.c:1:10: fatal error: 'sys/vt.h' file not found
#include <sys/vt.h>
^~~~~~~~~~
$ build/mpv --gpu-context=drm /path/to/video.mkv
Error parsing option gpu-context (option parameter could not be parsed)
Setting commandline option --gpu-context=drm failed.
Exiting... (Fatal error)
One not-so-nice hack in the wayland code is the assumption of when a
window is hidden (out of view from the compositor) and an arbitrary
delay for enabling/disabling the usage of presentation time. Since you
do not receive any presentation feedback when a window is hidden on
wayland (a feature or misfeature depending on who you ask), the ust is
updated based on the refresh_nsec statistic gathered from the previous
feedback event.
The flaw with this is that refresh_nsec basically just reports back the
display's nominal refresh interval in nanoseconds (1 / refresh_rate *
10^9). It doesn't tell you
how long the vsync interval really was. So as a video is left playing
out of view, the wl->last_queue_display_time becomes increasingly
inaccurate. This led to a vsync spike when bringing the mpv window back
into sight after it was hidden for a period of time. The hack for
working around this is to just wait a while before enabling presentation
time again. The discrepancy between the "bogus"
wl->last_queue_display_time and the actual value you get from the
feedback only happens initially after a switch. If you just discard
those values, you avoid the dramatic vsync spike.
It turns out that there's a smarter way to do this. Just use mp_time_us
deltas. The whole reason for these hacks is that
wl->last_queue_display_time wasn't close enough to how long it would
take for a frame to actually display if it wasn't hidden. Instead, mpv's
internal timer can be used, and the difference between wayland_sync_swap
calls is a close enough proxy for the vsync interval (certainly better
than using the monitor's refresh rate). This avoids the entire conundrum
of massive vsync spikes when bringing the player back into view, and it
means we can get rid of extra crap like wl->hidden.
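A minimal sketch of the idea (mp_time_us() is mpv's monotonic
microsecond timer; the field name is hypothetical):

  // In wayland_sync_swap(): use the time since the previous call as a
  // stand-in for the vsync interval while no presentation feedback
  // arrives.
  int64_t now_us = mp_time_us();
  int64_t vsync_interval_guess = now_us - wl->last_sync_time_us;
  wl->last_sync_time_us = now_us;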
In theory this mostly happens automatically, especially after the 5
vsync limit disables this already. But if we uninit before 5 vsyncs are
rendered, this can get left in a dangling 'enabled' state, which leaks a
debug report callback.
Always explicitly disable it just to be on the safe side.
I must have messed this up when I actually added the Y210 format
(because that one is correct). So my comment in the commit adding this
about the FFmpeg pixfmt doxygen being wrong was wrong.
I'd like to use this opportunity to complain once more about the
existence of these terrible pixel formats.
Again worthless, slow, and only for libswscale parity.
With this, we support all formats libswscale supports, except bayer
input, and rgb4/bgr4 output. We even support some formats libswscale
doesn't.
It's possible that the zimg wrapper isn't always as fast as libswscale.
But there is optimization potential: the inner repack loops are
self-contained enough that they could reasonably be implemented in
assembler (probably), and doing everything slice-wise should reduce the
overhead of the separate pack/unpack stages.
Just lazily tested.
The comment on AV_PIX_FMT_Y210LE seems to be wrong. It claims it's "like
YUYV422", but it seems more like YVYU422, at least the way libswscale
treats it as input. Maybe Intel pays its developers too much?
The repacker inner loop is probably rather inefficient. In theory we
could optimize it by reading the packed pixels as words, doing the
component reshuffling using compile time values etc., but I'd rather
keep the code size small. It's already bad enough that we have to
support 16 bit per component variants, just because this one Intel guy
couldn't keep it in his pants. In general, I can't be bothered to spend
time on optimizing it; I'm only doing this for fun (i.e. masochistic
obligation).
This covers 8 and 16 bit packed RGB formats. It doesn't really help with
any actual use-cases, other than giving the finger to libswscale.
One problem is with different color depths. For example, rgb565 provides
1 bit more resolution to the green channel. zimg can only dither to a
uniform depth. I tried dithering to the highest depth and shifting away
1 bit for the lower channels, but that looked ugly (or I messed up
somewhere), so instead it dithers to the lowest depth, and adjusts the
value range if needed. Testing with bgr4_byte (extreme case with 1/2/1
depths), it looks more "grainy" (ordered dithering artifacts) than
libswscale, but it also looks cleaner and smoother. It doesn't have
libswscale's weird red-shift. So I call it a success.
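For rgb565 the adjustment amounts to something like this (illustrative,
not the actual packer code):

  #include <stdint.h>

  // Dither all channels to the lowest depth (5 bits), then stretch green
  // back to its native 6 bit range before packing.
  static uint16_t pack_rgb565(uint8_t r5, uint8_t g5, uint8_t b5)  // 0..31
  {
      uint8_t g6 = (g5 * 63 + 15) / 31;  // rescale 0..31 -> 0..63
      return (uint16_t)((r5 << 11) | (g6 << 5) | b5);
  }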
Big endian formats need to be handled explicitly; the generic big endian
swapper code assumes byte-aligned components.
Unpacking is done with shifts and 3 LUTs. This is symmetric to the
packer. Using a generated palette might be better, but I preferred to
keep the symmetry, and not having to mess with a generated palette and
the pal8 code.
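Roughly, the shifts-plus-LUTs unpacking is the mirror image of the
packer (illustrative sketch):

  #include <stdint.h>

  // Unpack one line of rgb565 into 8 bit planes using 3 lookup tables.
  // lut_r/lut_b map 0..31 -> 0..255, lut_g maps 0..63 -> 0..255.
  static void unpack_rgb565_line(uint8_t *dr, uint8_t *dg, uint8_t *db,
                                 const uint16_t *src, int w,
                                 const uint8_t lut_r[32],
                                 const uint8_t lut_g[64],
                                 const uint8_t lut_b[32])
  {
      for (int x = 0; x < w; x++) {
          uint16_t v = src[x];
          dr[x] = lut_r[(v >> 11) & 0x1F];
          dg[x] = lut_g[(v >> 5)  & 0x3F];
          db[x] = lut_b[ v        & 0x1F];
      }
  }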
This uses FFmpeg pixfmt constants directly. I would have preferred
keeping zimg completely separate. But neither do I want to add an IMGFMT
alias for every one of these formats, nor do I want to extend our imgfmt
code such that it can provide a complete description of each packed RGB
format (similar to FFmpeg pixdesc).
It also appears that FFmpeg pixdesc as well as the FFmpeg pixfmt doxygen
have an error regarding RGB8: the R/B bit depths are swapped. libswscale
appears to be handling them differently. Not completely sure, as this is
the only packed format case with R/B having different depths (instead
of G, the middle component, where things are symmetric).
One of the extremely annoying dumb things in ffmpeg is that most pixel
formats are available as little endian and big endian variants. (The
sane way would be having native endian formats only.) Usually, most of
the real codecs use native formats only, while non-native formats are
used by fringe raw codecs only. But the PNG encoders and decoders
unfortunately use big endian formats, and since PNG is such a popular
format, this causes problems for us. In particular, the current zimg
wrapper will refuse to work (and mpv will fall back to sws) when writing
non-8 bit PNGs.
So add non-native endian support to zimg. This is done in a fairly
"generic" way (which means lots of potential for bugs). If input is a
"regular" format (and just byte-swapped), the rest happens
automatically, which happens to cover all interesting formats.
Some things could be more efficient; for example, the endian swap is done
on the data before it's passed to the unpacker. You could make endian
swapping part of the actual unpacking process, which might be slightly
faster. You could avoid copying twice in some cases (such as when
there's no actual repacker, or if alignment needs to be corrected). But
I don't really care. It's reasonably fast for the normal case.
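The "swap the bytes first, then reuse the native-endian path" idea is
roughly this (sketch, not the wrapper's actual helper):

  #include <stdint.h>

  // Byte-swap one line of a 16 bit per component plane so the regular
  // (native endian) unpacker can be reused unchanged afterwards.
  static void swap16_line(uint16_t *dst, const uint16_t *src, int w)
  {
      for (int x = 0; x < w; x++)
          dst[x] = (uint16_t)((src[x] >> 8) | (src[x] << 8));
  }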
Not entirely sure whether this is correct. Some (but not many) formats
are covered by the tests, some I tested manually. Some I can't even
test, because libswscale doesn't support them (like nv20*).
This sucks, but is helpful for testing.
Obviously, it would be much nicer if there were a way to specify _all_
scaler options per filter (if the user wanted), instead of always using
the global options. But this is "too hard" for now. For testing, it is
extremely convenient to select the scaler backend, so add this option,
but make clear that it could go away. We'd delete it once there is a
better mechanism for this.
In display-sync mode, the core doesn't need to be woken up every vsync, but
only every time a new actual video frame needs to be queued. So don't
wake up if there are still frames to repeat.
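The gist in terms of vo.c's internals (the exact condition is
approximate, not the literal code):

  // Only wake the playloop when the current frame has no repeats left,
  // i.e. when a new frame actually needs to be queued.
  if (!(in->current_frame && in->current_frame->num_vsyncs > 0))
      wakeup_core(vo);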
In audio-sync mode, the wakeup is simply redundant, since there's a
separate timer (in->wakeup_pts) to control when to queue a new frame. I
think.
This finally brings the required playloop iterations down to almost the
number of video frames. (As originally intended, really.)
Also a fairly risky change.
The wakeup at the end of VO frame rendering seems redundant, because
after rendering almost no state changes. The player core can queue a new
frame once frame rendering begins, and there's a separate wakeup for
this. The only thing that actually changes is in->rendering. The only
thing that seems to depend on it and can trigger a wakeup is the
vo_still_displaying() function. Change it so that it needs an explicit
call to a new API function, so we can avoid wakeups in the common case.
The vo_still_displaying() code is mostly just moved around due to
locking and for avoiding forward declarations.
Also a somewhat risky change (tasty new bugs).
This was optional, with the intention that normally such options require
a valid format. But there is no reason for this (at least not anymore),
and it's actually more logical to accept "no" in all situations this
option type is used. This also gets rid of the weird min field special
use.
Add an infrastructure for collecting performance-related data, use it in
some places. Add rendering of them to stats.lua.
There were two main goals: minimal impact on the normal code and normal
playback. So all these stats_* function calls either happen only during
initialization, or return immediately if no stats collection is going
on. That's why it does this lazy adding of stats entries etc. (a first
iteration made each stats entry an API thing, instead of just a single
stats_ctx, but I thought that was getting too intrusive in the "normal"
code, even if everything gets worse inside of stats.c).
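A call site ends up looking roughly like this (function names follow the
new stats_* API; treat the exact signatures as approximate):

  // st is created once at init; the start/end calls return immediately
  // if no stats collection is active, so the normal path stays cheap.
  static void draw_with_stats(struct stats_ctx *st)
  {
      stats_time_start(st, "frame-render");
      // ... the actual frame rendering work goes here ...
      stats_time_end(st, "frame-render");
  }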
You could get most of this information from various profilers (including
the extremely primitive --dump-stats thing in mpv), but this makes it
easier to see the most important information at once (at least in
theory), partially because we know best about the context of various
things.
Not very happy with this. It's all pretty primitive and dumb. At this
point I just wanted to get over with it, without necessarily having to
revisit it later, but with having my stupid statistics.
Somehow the code feels terrible. There are a lot of meh decisions in
there that could be better or worse (but mostly could be better), and it
just sucks but it's also trivial and uninteresting and does the job. I
guess I hate programming. It's so tedious and the result is always shit.
Anyway, enjoy.
Until now, it used only coordinates clipped to the screen for this,
which meant no negative margins were ever reported to libass. This broke
proper rendering of explicitly positioned ASS events (libass simply
could not know the real video size in this case.)
Fix this by reporting margins even if they're negative. This makes it
apparently work correctly with vo_gpu at least.
Note that I'm not really sure if anything in the rendering chain
required non-negative margins. If so, and that code implicitly assumed
it, I suppose crashes and such are possible.
The "rule" is that a fallback warning message should be shown only shown
if software decoding was used before, or in other words when either
hwdec was enabled before, but the stream suddenly falls back, or it was
attempted to enable it at runtime, and it didn't work.
The message wasn't printed the first time in the latter case, because
hwdec_notified was not set in forced software decoding mode. Fix it with
this commit. Fortunately, the logic becomes simpler.
Change all OPT_* macros such that they don't define the entire m_option
initializer, and instead expand only to a part of it, which sets certain
fields. This requires changing almost every option declaration, because
they all use these macros. A declaration now always starts with
{"name", ...
followed by designated initializers only (possibly wrapped in macros).
The OPT_* macros now initialize the .offset and .type fields only,
sometimes also .priv and others.
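For a single option the change looks roughly like this (the option shown
is illustrative, not copied from the tree):

  // Old style: the macro produced the whole m_option initializer:
  //   OPT_FLAG("fullscreen", fullscreen, 0),
  // New style: the name comes first, the macro fills in only
  // .offset/.type, and anything else is a plain designated initializer:
  {"fullscreen", OPT_FLAG(fullscreen)},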
I think this change makes the option macros less tricky. The old code
had to stuff everything into macro arguments (and attempted to allow
setting arbitrary fields by letting the user pass designated
initializers in the vararg parts). Some of this was made messy due to
C99 and C11 not allowing 0-sized varargs with ',' removal. It's also
possible that this change is pointless, other than cosmetic preferences.
Not too happy about some things. For example, the OPT_CHOICE()
indentation I applied looks a bit ugly.
Much of this change was done with regex search&replace, but some places
required manual editing. In particular, code in "obscure" areas (which I
didn't include in compilation) might be broken now.
In wayland_common.c the author of some option declarations confused the
flags parameter with the default value (though the default value was
also properly set below). I fixed this with this change.
Previously, the vo wasn't always informed if something about the output
changed during playback. For instance, changing a display's refresh rate
during playback would not update mpv's display fps. Fix this by simply
using VO_EVENT_WIN_STATE in output_handle_done which executes whenever
something about the output is changed.
Allow the --window-maximized and --window-minimized flags to actually
work when the player is started. Since macOS doesn't like using both at
the same time, the minimized state takes precedence over the maximized
state.
Before this commit, option declarations used M_OPT_MIN/M_OPT_MAX (and
some other identifiers based on these) to signal whether an option had
min/max values. Remove these flags, and make it use a range implicitly,
on the condition that min < max is true.
This requires care in all cases when only M_OPT_MIN or M_OPT_MAX were
set (instead of both). Generally, the commit replaces all these
instances with using DBL_MAX/DBL_MIN for the "unset" part of the range.
This also happens to fix some cases where you could pass over-large
values to integer options, which were silently truncated, but now cause
an error.
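In practice a declaration signals its range simply by giving min < max,
e.g. through the M_RANGE helper (the option shown is illustrative):

  // A range is in effect because min < max:
  {"cdda-speed", OPT_INT(speed), M_RANGE(1, 100)},
  // Leaving min/max at their (equal) defaults means "no range".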
This commit has some higher potential for regressions.
This was mostly unused, and has certain problems. Just get rid of it.
It was still used in CDDA (--cdda-span) and a debug option for OpenGL
(--opengl-check-pattern). Replace both of these with 2 options, where
each sets the start/end values of the former span. Both were
undocumented somehow (normally we require all options to be documented),
so I'm not caring about compatibility, and not bothering to add it to
the API changelog.
We have this cap now thanks to e2976e662, but we don't actually make
sure our FBOs are storable before we blindly attempt using them with
compute shaders.
There's no more need to unconditionally set `storage_dst = true` as long
as we make sure to include an extra condition on the `fbo_format`
selection to prevent users from accidentally enabling
compute-shader-only features with non-storable FBOs, alongside some
other miscellaneous adjustments to eliminate instances of "assumed
storability" from vo_gpu.
This simply makes the "is the destination FBO format bad?" check a tiny
bit less awful, by making sure we prefer storable FBO formats over
non-storable FBO formats. I'd love to make this also conditional on
whether or not we actually *need* a storable FBO format, but that logic
is decided later, in `pass_draw_to_screen`, and I don't want to
replicate the logic.
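The preference boils down to something like this when walking the
candidate formats (ra_format's renderable/storable flags are real; the
loop shape is approximate):

  // Prefer storable candidates so compute shader paths don't end up
  // with a non-storable FBO by accident.
  const struct ra_format *best = NULL;
  for (int i = 0; i < ra->num_formats; i++) {
      const struct ra_format *fmt = ra->formats[i];
      if (!fmt->renderable)
          continue;
      if (!best || (fmt->storable && !best->storable))
          best = fmt;  // a storable format beats a non-storable one
  }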
Fixes #7017.
Previously if the --fs-screen option was set, it would only use the
screen if mpv was launched with --fs and only on startup. During
runtime, the toggle would ignore it. Rework the logic here so that mpv's
fullscreen always uses --fs-screen if it is set. Additionally, cleanup
some unneeded cruft in vo_wayland_reconfig and make find_output more
useful.
This commit fixes a bug where the handle for a framebuffer gets
double-freed.
It seems that the same prime fd can get two framebuffers. As the prime
fd is the same, the resulting prime handle is also the same.
This means one handle but 2 framebuffers, which can lead to the following
chain:
1. The first framebuffer gets deleted; its handle also gets freed via
   the ioctl.
2. During the startup phase, not all 4 dumb buffers for overlay drawing
   are set up yet. It can happen that the last dumb buffer gets the
   handle we freed above.
3. The second framebuffer gets freed and the handle is freed again, with
   the result that the 4th dumb buffer's handle is no longer backed by a
   buffer.
4. DRM prime continues to assign handles to its prime fds, which leads
   to the handle that was just freed being reassigned again, but to a
   prime buffer.
5. Now the overlay should be drawn into dumb buffer 4, which still has
   the same handle but is backed by the wrong buffer.
This leads to two different behaviors:
- mpv crashes, as the drm prime buffer's size is calculated from the
  decoder output format. The overlay output format differs and takes
  more space, so the size check in the kernel fails.
- mpv continues to play. This happens when the decoder allocates a
  bigger buffer than needed for the overlay. For example, the overlay is
  Full HD and the decoder output is 4K. This leads to the overlay being
  drawn into the wrong buffer (as it's a drm prime buffer), and results
  in a flicker every fourth step.
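A sketch of the kind of guard that prevents the double free (per-handle
refcounting is an assumption here; the libdrm call itself is real):

  #include <stdint.h>
  #include <xf86drm.h>

  // Only close a GEM handle once no framebuffer references it anymore.
  static void release_handle(int drm_fd, uint32_t handle, int *refcount)
  {
      if (--(*refcount) > 0)
          return;  // another framebuffer still uses this handle
      struct drm_gem_close req = { .handle = handle };
      drmIoctl(drm_fd, DRM_IOCTL_GEM_CLOSE, &req);
  }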