This flips the y-coordinate to be consistent with
other platforms and the manual. Furthermore, it
fixes an unwanted behaviour of the Cocoa
convertRectFromBacking method, where the x- and
y-coordinates were divided by the same factor as
the width and height instead of placing the new
scaled rectangle at the same relative position as
the original unscaled rectangle. This is fixed by
manually calculating the new position.
Fixes #3867
This is simpler and more robust, especially for the hwdec fallback case.
The most annoying issue is that C doesn't support multiple return values
(or sum types), so the decode call gets all awkward.
The hwdec fallback case does not need to try to produce some output
after the fallback anymore. Instead, it can use the normal "replay"
code path.
We invert the "eof" bool that vd_lavc.c used internally. The
receive_frame decoder API returns the inverse of EOF, because
returning "true" from the decode function if EOF was reached
feels awkward.
Usually they happen at the same time, but conflating them is still a bit
unclean and could possibly cause problems in the future. It's also
really unnecessary.
The ffmpeg cuda wrappers need more than 1 packet for determining whether
hw decoding will work. So do something complicated and keep up to 32
packets when trying to do hw decoding, and replay the packets on the
software decoder if it doesn't work.
This code was written in a delirious state, testing for regressions and
determining whether this is worth the trouble will follow later. All mpv
git users are alpha testers as of this moment.
Fixes #3914.
Cover art handling is a disgusting hack that causes a mess in all
components. And this will stay this way. This is the Xth time I've
changed cover art handling, and that will probably also continue.
But change the code such that cover art is injected into the demux
packet stream, instead of having an explicit special case for it in the
decoder glue code. (This is somewhat more similar to the cover art hack
in libavformat.)
To avoid decoding the cover art picture again on each seek, we
need some additional "caching" in player/video.c. Decoding it after each
seek would work as well, but since cover art pictures can be pretty
huge, it's probably ok to invest some lines of code into caching it.
One weird thing is that the cover art packet will remain queued after
seeks, but that is probably not an issue.
In exchange, we can drop the dec_video.c code, which is pretty
convenient for one of the following commits. This code duplicates a
bunch of lower-level decode calls and does icky messing with this weird
state stuff, so I'm glad it goes away.
I'm not sure what systems have <sys/poll.h> (maybe there are historical
reasons why some would), but POSIX defines <poll.h>. Although this code
is full of highly OS specific calls (like ioctl()), there's no reason
not to use the more standard include path.
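In code terms:

    /* POSIX defines <poll.h>; <sys/poll.h> is only a historical alias. */
    #include <poll.h>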
This is available since the first commit after libva 0.39.4. Since the
version wasn't bumped afterwards, we just check some random other symbol
that was added in the meantime (I'd rather not add a configure check).
The libva message callback repeats the endlessly repeated API mistakes
of libraries using global message callback handlers. But it's the only
way to shut up libva's dumb messages to stderr, so add something
complicated and dumb to work around libva's stupidity.
Just some minor refactoring within va_initialize() as preparation for
the next commit.
Also, do not call vaTerminate(display) on failures. All callers already
do this, so this would have led to a double-free.
vo_wayland_wait_events() is going to return when it's time to swap the
buffers anyway, so calling request_frame() before that makes no sense.
Fixes the constant high CPU usage by the compositor when mpv is paused
and the window is in view.
According to MSDN, in WM_SYSCOMMAND messages, the four low-order
bits of the wParam parameter are used internally by the system.
To obtain the correct result when testing the value of wParam,
an application must combine the value 0xFFF0 with the wParam
value by using the bitwise AND operator.
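A small illustration of that rule (the function name is made up for this
sketch):

    #include <windows.h>

    // The low four bits of wParam are used internally by the system, so
    // mask them off before comparing against SC_* values.
    static int is_screensaver_command(WPARAM wParam)
    {
        switch (wParam & 0xFFF0) {
        case SC_SCREENSAVE:
        case SC_MONITORPOWER:
            return 1;
        default:
            return 0;
        }
    }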
Looks quite like a bug. If you have a filter chain with only the
dynaudnorm filter, and call av_buffersrc_add_frame(s, NULL), then
subsequent av_buffersink_get_frame() calls will return EAGAIN instead of
EOF.
This was apparently caused by a recent change in FFmpeg.
Some other circumstances (which I didn't fully analyze and which are due
to the playloop's absurd temporary-EOF behavior on seeks) then led the
decoder loop to send data again, but since libavfilter was stuck in the
EOF state now, it could never recover. It kept sending new input (due to
missing output), until the demuxer refused to return more audio packets.
Each time a filter error was printed.
Fortunately, it's pretty easy to work around. We just set the p->eof
flag as we send an EOF frame to libavfilter. The p->eof flag is used
only to recover from temporary EOF: it resets the filter if new data is
available again. We don't care much about av_buffersink_get_frame()
returning a broken EAGAIN state in this situation and essentially ignore
it, meaning if we get EAGAIN after sending EOF, we assume effectively
that EOF was fully reached.
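Roughly, the workaround looks like this (a sketch only; struct and
function names besides the libavfilter calls are hypothetical):

    #include <stdbool.h>
    #include <errno.h>
    #include <libavfilter/buffersrc.h>
    #include <libavfilter/buffersink.h>
    #include <libavutil/error.h>
    #include <libavutil/frame.h>

    struct priv { bool eof; };  // stand-in for the real filter private state

    static int drain_one(struct priv *p, AVFilterContext *src,
                         AVFilterContext *sink, AVFrame *out)
    {
        if (!p->eof) {
            av_buffersrc_add_frame(src, NULL);  // signal EOF to the graph
            p->eof = true;
        }
        int r = av_buffersink_get_frame(sink, out);
        if (r == AVERROR(EAGAIN) && p->eof)
            r = AVERROR_EOF;  // treat the broken post-EOF EAGAIN as real EOF
        return r;
    }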
This is actually a pretty important fix. eglChooseConfig() might be the
first thing that fails when probing for desktop GL / ES2 / ES3 support,
because EGL_RENDERABLE_TYPE is set to values specific to the underlying
APIs.
Not sure how the hell this worked before. EGL 1.4 implementations
certainly could fail the call with EGL_BAD_ATTRIBUTE if
EGL_RENDERABLE_TYPE has EGL_OPENGL_ES3_BIT set. It's quite possible that
many EGL implementations tolerate invalid EGLConfig values stemming from
uninitialized EGLConfig variables (and eglCreateWindowSurface() is even
specified to return an EGL_BAD_CONFIG error code for "not valid"
EGLConfigs).
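For illustration, a hedged sketch of the kind of probing call involved
(not the exact helper from the codebase); note how the result has to be
checked before the EGLConfig is used:

    #include <stddef.h>
    #include <EGL/egl.h>

    static EGLConfig choose_config(EGLDisplay dpy, EGLint renderable_type)
    {
        EGLint attrs[] = {
            EGL_SURFACE_TYPE, EGL_WINDOW_BIT,
            EGL_RENDERABLE_TYPE, renderable_type,  // e.g. EGL_OPENGL_ES2_BIT
            EGL_RED_SIZE, 8, EGL_GREEN_SIZE, 8, EGL_BLUE_SIZE, 8,
            EGL_NONE,
        };
        EGLConfig config = NULL;
        EGLint num = 0;
        // Can fail with EGL_BAD_ATTRIBUTE if the requested API bit is not
        // supported; ignoring the failure leaves "config" uninitialized.
        if (!eglChooseConfig(dpy, attrs, &config, 1, &num) || num < 1)
            return NULL;
        return config;
    }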
The way it should (probably) work is that selecting an RGBA framebuffer
format will simply make the compositor use the alpha. It works this way
on Wayland. On X11, this is... not done. Instead, both GLX and EGL
report two FB configs, which are exactly the same, except for the
platform-specific visual. Only the latter (non-default) points to a
visual that actually has alpha. So you can't make the pure GLX and EGL
APIs select alpha mode, and you have to override manually.
Or in other words, alpha was hacked violently into X11, in a way that
doesn't really make sense for the sake of compatibility, and forces API
users to wade through metaphorical cow shit to deal with it.
To be fair, some other platforms actually also require you to enable
alpha explicitly (rather than looking at the framebuffer type), but they
skip the metaphorical cow shit step.
So that the EGL code can use it too.
Also print the actual FB config ID, instead of nonsense. (I _think_ once
in the past a certain GLX implementation just used numeric config IDs
cast to EGLConfig - or at least that would explain this nonsense.)
Preparation for the following commits. Since at least theoretically the
config selection depends on the context type (EGL_RENDERABLE_TYPE has
separate bits for ES 2, ES 3, and desktop GL), doing it any other way
would be too painful.
For X11 garbage we have to pass some annoying parameters to EGL context
creation. Add some sort of extensible API, so that adding a new
parameter doesn't break all callers. We still want to keep it as a
single function, because it's so nice isolating all the EGL nonsense API
boilerplate like this. (Did I mention yet that X11 and EGL are garbage?)
Also somewhat simplifies the vo_flags mess in the helper internals.
The chroma alignment renormalization code forgot to account for the fact
that the chroma subsampling ratio has to be rotated.
Unfortunately, doing it this way seems to have somewhat broken the
chroma offset rotation logic for odd-sized subsampled image files. While
this is a bug, it's much, much less noticeable, so it's not nearly as
important as the bug this change fixes. Either way, a future patch needs
to still revise this logic, ideally by redesigning the entire rotation
mechanism.
Conceptually cleaner, although the API claims this is equivalent.
Originally, AVCodecContext fields were used, because not all supported
libavcodec/libavutil versions had the AVFrame fields.
This is not done for chroma_sample_location - it has no AVFrame field.
Helps with gif, probably does unwanted things with other formats.
This doesn't handle --end quite correctly, but this could be added
later.
Fixes #3924.
This replaces the old fullscreen with the native
macOS fullscreen. Additionally, the
--fs-black-out-screens option was removed since
the new API doesn't support it the way the old
one did. It can possibly be re-added if done
manually.
Fixes #2857 #3272 #1352 #2062 #3864
Allow minimizing the borderless/fullscreen window by clicking on the
taskbar button or pressing Win+Down hotkey.
Also fixes #2229 and probably fixes #2451.
According to MSDN, GetWindowLong and SetWindowLong have been
superseded by GetWindowLongPtr and SetWindowLongPtr.
It's a cosmetic code change in this case.
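For example (the style tweak is purely illustrative):

    #include <windows.h>

    static void make_borderless(HWND hwnd)
    {
        // The *Ptr variants stay correct for pointer-sized data on 64-bit
        // Windows; for plain style flags the change is only cosmetic.
        LONG_PTR style = GetWindowLongPtrW(hwnd, GWL_STYLE);
        SetWindowLongPtrW(hwnd, GWL_STYLE, style & ~WS_OVERLAPPEDWINDOW);
    }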
Needs explicit logic. Fixes a pretty bad regression which prefers
vdpau-copy over native vaapi with direct rendering (with --hwdec=auto)
if libvdpau-va-gl1 is present. The reason is that vdpau-copy is above
vaapi, simply because all vdpau hwdecs are grouped and happened to be
listed before vaapi.
Although this is not that bad for copy-mode (unlike the case described
above), it's still a good idea to use our native vaapi code instead.
Apparently we don't always set the viewport to window dimensions
anymore, e.g. if nothing is actually rendered. This means the viewport
can contain old values.
The window screenshot code uses the viewport values to guess the default
framebuffer dimensions. With --force-window --idle --no-osc (which draws
nothing and issues a glClear() command only), taking a screenshot would
yield an image with the wrong size and possibly garbage in it. Fix this
by explicitly passing the currently known window dimensions. Abusing the
values stored in the viewport was questionable anyway.
Fixes a segfault introduced in libwayland
e8ad23266f36521215dcd7cfcc524e0ef67d66dd, where a poison value has been
introduced to catch this kind of use-after-free bug.
Long planned. Leads to some sanity.
There still are some rather gross things. Especially g_groups is ugly,
and a hack that can hopefully be removed. (There is a plan for it, but
whether it's implemented depends on how much energy is left.)
We want to avoid causing problems if libmpv is used in an application
that links cuda, or if the libav* libraries are linked with cuda,
as might happen if the scale_npp filter is used.
The test ended up failing if cuda.h wasn't present, even if cuda.h
isn't used during the actual build.
This test is attempting to establish if the ffmpeg being built
against has dynlink_cuda support. While it might theoretically be
possible to build against the older normally-linked-cuda version
of ffmpeg, it seems more trouble than it's worth.
This was a typo in the extension spec and was probably always broken.
Could have led to broken builds when used with ancient ANGLE headers
(or possibly generic EGL headers).
The latest 375.xx nvidia drivers add support for P016 output
surfaces. In combination with an ffmpeg change to return those
surfaces, we can display them.
The bulk of the work is related to knowing which format you're
dealing with at the right time. Once you know, it's straightforward.
At least with Nvidia drivers, some thread tries to access D3D11 objects
after ANGLE unloads d3d11.dll. Fix this by holding a reference to
d3d11.dll ourselves.
Might fix the crash in #3348. (I wish I knew why though.)
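The fix amounts to something like this (sketch; the function name is
made up):

    #include <windows.h>

    static HMODULE d3d11_dll;

    // Pin d3d11.dll for the lifetime of the process so driver worker
    // threads can't touch D3D11 objects after ANGLE unloads its own
    // reference. Intentionally never freed.
    static void pin_d3d11(void)
    {
        if (!d3d11_dll)
            d3d11_dll = LoadLibraryW(L"d3d11.dll");
    }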
- win32-console-wrapper.c was inconsistently using the explicit Unicode
versions of some Windows API functions and structures.
- vo.c should use llabs for int64_t, since long is 32-bit on Windows.
- vo_direct3d.c had a potential use of an uninitialized variable if it
took the first goto error_exit.
Deactivating this option makes it possible to
circumvent the default OS X behavior of using
points. Windows on HiDPI resolutions won't open
at double the size anymore, and videos are
displayed in their native resolution when windowed.
Fixes #3716
This is a bit unintuitive, but it appears hwdec backends have to unset
hwdec_priv manually in their uninit function. Normally with this idiom
you'd expect the common code to do this (and maybe even freeing the priv
struct). Since other hwdec backends do this quite consistently, just fix
vdpau for now.
Also add an assert to detect similar bugs sooner.
Fixes #3788.
Implementation-wise, the values from the demuxer/codec header are merged
with the values from the decoder such that the former are used only
where the latter are unknown (0/auto).
The intention was that if --blend-subtitles is enabled, the frame should
always be re-rendered instead of using e.g. a cached scaled frame. The
reason is that subtitles can change anyway, e.g. if you pause and change
subtitle size and such.
On the other hand, if the frame is marked as repeated, it should always
use the cached copy. Actually "simplify" this and drop the cache only if
playback is paused (which frame->still indicates indirectly).
Also see PR #3773.
unmap_current_image() is called after rendering. This essentially
invalidates the textures, so we can't assume that the image is still
present.
Also see PR #3773.
This allows us to define the tukey window (and other tapered windows).
Also add a missing option definition for `wblur` while we're at it, to
make testing out window-related stuff easier.
This makes no sense, as the flag is supposed to be used for vsync
purposes only (when literally outputting the screen again with no
changes at all), and redrawing is often used for OSD updates.
It's not that easy to decide whether a frame needs to be
reuploaded/rerendered. Using unique frame IDs for input makes it
slightly easier and more robust. This also removes the use of video PTS
in the interpolation path.
This should also avoid reuploading the video frame if it's just redrawn
in paused mode, or when using OSD/subtitles in cover art mode.
When pthread_cond_timedwait() returns, the condition we are checking for could
be true or false. This code assumed it was always false.
This should be an extremely obscure race condition, since it can happen
only if timeout and the condition changing sort of happen at the same
time, or the lock is held for a longer time (which it normally isn't).
But I could observe it a few times.
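The correct pattern is roughly this (a generic sketch, not the exact mpv
helper): re-check the predicate after the wait returns instead of
assuming it is still false.

    #include <stdbool.h>
    #include <pthread.h>
    #include <time.h>

    // Returns whether the predicate became true before the deadline.
    static bool wait_until(pthread_mutex_t *lock, pthread_cond_t *cond,
                           const struct timespec *deadline, bool *done)
    {
        while (!*done) {
            if (pthread_cond_timedwait(cond, lock, deadline))
                break;  // timed out (or error) - but the flag may have
                        // been set right before the timeout fired
        }
        return *done;   // re-check instead of reporting "false"
    }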
Even if a frame is dropped due to the libmpv API user not drawing a
frame, it should be set as current frame. This avoids dropping a frame
forever in certain circumstances, such as with cover art if the API user
was stuck at initialization or similar.
It turns out the glFlush() call really helps in some cases, though only
in audio timing mode (where we render, then wait for a while, then
display the frame). Add a --opengl-early-flush=auto mode, which does
exactly that.
It's unclear whether this is fine on OSX (strange things going on
there), but it should be.
See #3670.
Thread-local storage in GCC is platform-specific, and some platforms that
are otherwise perfectly capable of running mpv may lack TLS support in GCC.
This change adds a test for GCC variant of TLS and relies on its result
instead of assumption.
Provided that LLVM's `__thread` support is similar to GCC's, the test is
called "GCC/LLVM TLS".
Signed-off-by: wm4 <wm4@nowhere>
At this point, all other hwaccels provide -copy modes, and vdpau is the
exception with not having one. Although there is vf_vdpaurb, it's less
convenient in certain situations, and exposes some issues with the
filter chain code as well.
Both AVFrame.pts and AVFrame.pkt_pts have existed for a long time. Until
now, decoders always returned the pts via the pkt_pts field, while the
pts field was used for encoding and libavfilter only. Recently, pkt_pts
was deprecated, and pts was switched to always carry the pts.
This means we have to be careful not to accidentally use the wrong
field, depending on the libavcodec version. We have to explicitly check
the version numbers. Of course the version numbers are completely
idiotic, because the pkg-config and library names are the
same for FFmpeg and Libav, so we have to deal with this explicitly as
well.
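A hedged sketch of the kind of guard needed (the exact version cutoff
shown is an assumption, not taken from this commit; FFmpeg's micro
version being >= 100 is the usual way to tell it apart from Libav):

    #include <libavcodec/avcodec.h>
    #include <libavutil/version.h>

    static int64_t frame_pts(const AVFrame *frame)
    {
    #if LIBAVCODEC_VERSION_MICRO >= 100 && \
        LIBAVCODEC_VERSION_INT >= AV_VERSION_INT(57, 61, 100)
        return frame->pts;      // newer FFmpeg: pts carries the decoded pts
    #else
        return frame->pkt_pts;  // older FFmpeg and Libav: use pkt_pts
    #endif
    }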
- Change connector selection to accept human readable names (such as
eDP-1, HDMI-A-2) rather than arbitrary numbers.
- Change GPU selection to accept GPU number rather than device paths.
- Merge connector and GPU selection into one --drm-connector.
- Add support for --drm-connector=help.
- Add support for --drm-* in EGL backend.
- Refactor KMS; reduce state sharing across drm_common.
The glFlush() call was made optional recently
since it's not needed in most cases. On OSX though
this is needed since we removed kCGLPFADoubleBuffer
from the context creation, so the glFlush() call
was added to the cocoa backend only.
The CGLFlushDrawable() call can be safely removed
since it only does something when a double
buffered context is used. Also fixes a small typo.
Fixes #3627.
In "dumb mode" (where most features are disabled and which only performs
some basic rendering) we explicitly copy a set of whitelisted options,
and leave all the other options at their default values. Add the new
--opengl-early-flush option to this whitelist. Also remove an option
field accidentally added in the commit adding --opengl-early-flush.
It seems this can cause issues with certain platforms, so better to
disable it by default. The original reason for this isn't overly
justified, and display-sync mode should get rid of the need for it
anyway.
The new option is meant for testing, and will probably be removed if
nobody comes up and reports that enabling the option actually improves
anything.
Reduces code duplication between OpenGL backend and DRM VO.
(The control() for OpenGL backend isn't sufficiently similar to the
VO's control() to consider merging it as a whole - I extracted only the
FPS code.)
When the vaapi decoder is used in copy mode, it creates a dummy
display to render to. In theory, this should support hardware
decoding on a separate GPU that is not actually connected to
any output (like an iGPU which supports more formats than the
external GPU to which the monitor is connected).
However, before this change, only X11 displays were supported as
dummy displays. This caused some graphics drivers (namely
intel-driver) to core dump when they were not actually used as X11
module.
This change introduces support for DRM libva displays, which
allows vaapi-copy to run on such cards which are not actually
rendering the X11 output.
Other than being overly convoluted, this seems to make sense to me.
Except that to get the "rot" transform I have to set flip=true, which
makes no sense at all to me.
Combining rotation and cropping didn't work. It was just completely
broken.
I'm still not sure if this is correct. Chroma positioning seems to be
broken on rotation. There might also be a problem with non-mod-2 frame
sizes. Still, strictly an improvement for both rotated and non-rotated
rendering modes.
Also, this could probably be written in a more elegant way.
Commit aa1047a3 originally added this as:
+ // this is from the DarkPlaces engine, reduces to 3x3. Original code
+ // released under GPL2 or any later version.
According to Rudolf Polzer, the original author (a certain LH) was
actually asked whether it would be ok to put this code under LGPL, and
the author gave his agreement. This code is not from id Software either
(on which large parts of DarkPlaces are based), which is the main
reason why DarkPlaces is under GPL.
So this note is just confusing, and the code always has been LGPL. Fix it.
The video code can deal fine with feeding software image formats to
hwdec interop drivers. In RPI's case, this is preferable for
performance, working around OpenGL bugs (see RPI firmware issue #666),
and because OpenGL rendering doesn't bring too many advantages due to
RPI supporting GLES 2.0 only.
Maybe a way to force the normal video path is needed later. But
currently, this can be tested by just not loading the hwdec interop
driver.
If you run command-line mpv and set --hwdec to something that does
not load the RPI interop layer, you'll even have to use --hwdec-preload
manually to get it enabled.
Was intended to put the GL layer above the standard console. (But
actually that was done already, and the oddness I'm seeing seems to
be an unrelated bug.)
This should make display-names usable on Windows. It returns a list of
GDI monitor names like "\\.\DISPLAY1". Since it may be useful to get the
monitor that Windows considers associated with the window (with
MonitorFromWindow), this will always be returned as the first argument.
This monitor is the one used for display-fps and icc-profile-auto.
We always want to use __declspec(selectany) to declare GUIDs, but
manually including <initguid.h> in every file that used GUIDs was
error-prone. Since all <initguid.h> does is define INITGUID and include
<guiddef.h>, we can remove all references to <initguid.h> and just
compile with -DINITGUID to get the same effect.
Also, this partially reverts 622bcb0 by re-adding libuuid.a to the
build, since apparently some GUIDs (such as GUID_NULL) are not declared
in the source file, even when INITGUID is set.
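Hedged illustration of the effect (the GUID below is made up): with
-DINITGUID on the compiler command line, <guiddef.h> behaves as if
<initguid.h> had been included, so DEFINE_GUID() emits an actual
(selectany) definition instead of an extern declaration.

    #include <guiddef.h>

    /* Made-up GUID, purely for illustration. */
    DEFINE_GUID(GUID_Example, 0x12345678, 0x1234, 0x5678,
                0x9a, 0xbc, 0xde, 0xf0, 0x12, 0x34, 0x56, 0x78);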
Obviously, in the vast majority of cases, there's only one device
in the system, but doing this means we're more likely to get a
usable device in the multi-device case.
CUDA would support decoding on one device and displaying on another,
but the peer memory handling is not transparent, and I have no way
to test it, so I can't really write it.
The documentation around this stuff is poor, but I found an nvidia
sample that demonstrates how to use the interop API most efficiently.
(https://github.com/nvpro-samples/gl_cuda_interop_pingpong_st)
Key lessons are:
1) you can register the texture itself and have cuda write to it,
thereby skipping an additional copy through the PBO.
2) You don't have to be mapped when you do the copy - once you get a
mapped pointer, it remains valid. Magic!
This lets us throw out the PBOs as well as much of the explicit
alignment and stride handling.
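A hedged sketch of that flow based on the sample above (error handling
omitted, names hypothetical; not the literal hwdec code):

    #include <GL/gl.h>
    #include <cuda.h>
    #include <cudaGL.h>

    static CUresult copy_plane_to_texture(GLuint tex, CUdeviceptr src,
                                          size_t pitch, size_t width_bytes,
                                          size_t height)
    {
        CUgraphicsResource res;
        CUarray array;

        // Lesson 1: register the GL texture itself - no PBO in between.
        cuGraphicsGLRegisterImage(&res, tex, GL_TEXTURE_2D,
                                  CU_GRAPHICS_REGISTER_FLAGS_WRITE_DISCARD);
        cuGraphicsMapResources(1, &res, 0);
        cuGraphicsSubResourceGetMappedArray(&array, res, 0, 0);
        // Lesson 2: per the sample, the mapped array stays usable, so the
        // copy doesn't have to happen while mapped.
        cuGraphicsUnmapResources(1, &res, 0);

        CUDA_MEMCPY2D cpy = {
            .srcMemoryType = CU_MEMORYTYPE_DEVICE,
            .srcDevice     = src,
            .srcPitch      = pitch,
            .dstMemoryType = CU_MEMORYTYPE_ARRAY,
            .dstArray      = array,
            .WidthInBytes  = width_bytes,
            .Height        = height,
        };
        return cuMemcpy2D(&cpy);
    }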
CPU usage is slightly (~3%) lower for 4K content in one test case,
so it makes a detectable difference, and presumably saves memory.
On X11, you can change the fullscreen state via the window manager and without
mpv's involvement. In these cases, the internal fullscreen flag has to
be updated.
The hack used for this didn't really work properly. Change it
accordingly. The important thing is that the shadow copy of the option
is updated. This is still not really ideal.
Fixes #3570.
The documentation claims that --video-unscaled will still perform
anamorphic adjustments, and it rightfully should. The current reality is
that it does not, because the video-unscaled size was based on the wrong
set of variables. (encoded width/height instead of nominal display
width/height)
When we rotate the image by 90° or 270°, chroma width and height need
to be swapped.
Fixes #3568.
But is the chroma sub location correct? Who the hell knows...
This is the actual decoder output, with no overrides applied. (Maybe
video-params shouldn't contain the overrides in the first place, but
damage done.)
This really shouldn't be in vd_lavc.c - move it to dec_video.c, where it
also applies aspect overrides. This makes all overrides in one place.
The previous commit contains some required changes for resetting the
image parameters change detection (i.e. it's not done only on video
aspect override changes).
Use the new mechanism, instead of wrapped properties. As usual, extend
the update handling to some options that were forgotten/neglected
before. Rename video_reset_aspect() to video_reset_params() to make it
more "general" (and we can amazingly include write access to
video-aspect as well in this).
Extend the flag-based notification mechanism that was used via
M_OPT_TERM. Make the vo_opengl update mechanism use this (which, btw.,
also fixes compilation with OpenGL renderers forcibly disabled).
While this adds a 3rd mechanism and just seems to further the chaos, I'd
rather have a very simple mechanism now, than actually furthering the
mess by mixing old and new update mechanisms. In particular, we'll be
able to remove quite some property implementations, and replace them
with much simpler update handling. The new update mechanism can also
be more easily refactored once we have a final mechanism that handles
everything in a uniform way.
There were multiple values under M_OPT_EXIT (M_OPT_EXIT-n for n>=0).
Somehow M_OPT_EXIT-n either meant error code n (with n==0 no error?), or
the number of option values consumed (0 or 1). The latter is MPlayer
legacy, which left it to the option type parsers to determine whether an
option took a value or not. All of this was changed in mpv, by requiring
the user to use explicit syntax ("--opt=val" instead of "-opt val").
In any case, the n value wasn't even used (anymore), so rip this all
out. Now M_OPT_EXIT-1 doesn't mean anything, and could be used by a new
error code.
Negative height is used to signal a flipped framebuffer. There's
absolutely no reason to pass this down to overlay_adjust(), and only
requires implementers to deal with an additional special-case.
Instead of using input_ctx for waiting, use the dispatch queue directly.
One big change is that the dispatch queue will just process commands
that come in (e.g. from client API) without returning. This should
reduce unnecessary playloop executions (which is good since the playloop
got a bit fat from rechecking a lot of conditions every iteration).
Since this doesn't force a new playloop iteration on every access, this
has to be enforced manually in some cases.
Normal input (via terminal or VO window) still wakes up the playloop
every time, though that's not too important. It makes testing this
harder, though. If there are missing wakeup calls, it will be noticed
only when using the client API in some form.
At this point we could probably use a normal lock instead of the
dispatch queue stuff.
Currently, calling mp_input_wakeup() will wake up the core thread (also
called the playloop). This seems odd, but currently the core indeed
calls mp_input_wait() when it has nothing more to do. It's done this way
because MPlayer used input_ctx as central "mainloop".
This is probably going to change. Remove direct calls to this function,
and replace it with mp_wakeup_core() calls. ao and vo are changed to use
opaque callbacks and not use input_ctx for this purpose. Other code
already uses opaque callbacks, or has legitimate reasons to use
input_ctx directly (such as sending actual user input).
'cuda-gl' isn't right - you can turn this on without any GL and
get some non-zero benefit (with the cuda-copy hwaccel). So
'cuda-hwaccel' seems more consistent with everything else.
This happened to break because the texture unit wasn't reset to 0, which
some code expects. The OSD code in particular set the OSD texture on the
wrong texture unit, with the result that OSD/OSC was not visible.
A minor cleanup that makes the code simpler, and guarantees that we
cleanup the GL state properly at any point.
We do this by reusing the uniform caching, and assigning each sampler
uniform its own texture unit by incrementing a counter. This has various
subtle consequences for the GL driver, which hopefully don't matter. For
example, it will bind fewer textures at a time, but also rebind them
more often.
For some reason we keep TEXUNIT_VIDEO_NUM, because it limits the number
of hook passes that can be bound at the same time.
OSD rendering is an exception: we do many passes with the same shader,
and rebind the texture each pass. For now, this is handled in an
unclean way, and we make the shader cache reserve texture unit 0 for the
OSD texture. At a later point, we should allocate that one dynamically
too, and just pass the texture unit to the OSD rendering code. Right now
I feel like vo_rpi.c (may it rot in hell) is in the way.
The caller now has to call gl_sc_reset(), and do so _after_ rendering. This
way we can unset OpenGL state that was setup for rendering. This affects
the shader program, for example. The next commit uses this to
automatically manage texture units via the shader cache.
vo_rpi.c changes untested.
Stops Mesa from restricting us to OpenGL 3.0. It also tries to create
GLES 3 contexts for drivers which do not just return a higher context
when requesting GLES 2.
I don't know whether this code is a good or bad idea. A not-so-good
aspect is that we don't check for EGL 1.5 (or 1.4 extensions) for some
of the more advanced context attributes. But EGL implementations should
be able to tolerate it and return an error, and then we'd use the
fallback.
This used to be shared, but since vo_rpi is going to be removed,
untangle them. There was barely any actual code shared since the recent
changes anyway.
As a subtle change, we also stop opening libGLESv2.so explicitly in the
vo_opengl backend, and use RTLD_DEFAULT instead.
Minimal support just for testing.
Only the window surface creation (including size determination) is
really platform specific, so this could be some generic thing with
platform-specific support as some sort of sub-driver, but on the other
hand I don't see much of a need for such a thing.
While most of the fbdev usage is done by the EGL driver, using this
fbdev ioctl is apparently the only way to get the display resolution.
Add a function to egl_helpers.c for creating an EGL context and make
context_x11egl.c use it. This is meant to be generic, and should work
with other windowing APIs as well. The other EGL-using code in mpv can
be switched to it.
The wrong enum got copied here, so it was essentially using the transfer
characteristics as the primaries (instead of the primaries), which
accidentally worked fine most of the time (since the two usually
coincided), but broke on weird/mistagged files.
The consequence of this was that e.g. hardware decoding with VAAPI-EGL
could sometimes not work if the compiler didn't support C11. (Although I
found this one on RPI, which also uses this mechanism.)
If the shader fails to compile, an assertion could trigger in
gl_sc_gen_shader_and_reset() due to the code trying to recreate the
shader every time, and re-appending the uniforms every time. Just reset
the uniform array to fix this.
Some disturbed GL drivers might not return anything for glGetShaderiv()
if the GL state got "lost", so initialize variables just for additional
robustness.
Since vo_rpi is going to be deprecated, better port its features to the
vo_opengl backend.
The most tricky part is the fact that recreating dispmanx elements will
conflict with the GL context. Fortunately, RPI's EGL support is
reasonably compliant, and we can transplant the context to newly created
dispmanx elements, making this much easier. This means unlike vo_rpi,
the GL state will actually not be recreated.
This overlay support specifically skips the OpenGL rendering chain, and
uses GL rendering only for OSD/subtitles. This is for devices which
don't have performant GL support.
hwdec_rpi.c contains code ported from vo_rpi.c. vo_rpi.c is going to be
deprecated. I left in the code for uploading sw surfaces (as it might
be slightly more efficient for rendering sw decoded video), although
it's dead code for now.
The cuvid decoder already knows how to copy back to system memory
if NV12 frames are requested, and this will happen if the decoder
is used without the hwdec.
For convenience, let's add a wrapper hwdec so people don't have
to explicitly pick the cuvid decoder if they want this behaviour.
The nvidia examples use the old (as in CUDA 3.x) interop API which
is deprecated, and I think not even functional on recent versions
of CUDA for windows. As I was following the examples, I used this
old API.
So, let's update to the new API, and hopefully, it'll start working
on windows too.
It was used to determine whether the VO supports VOCTRL_SET_PANSCAN.
With all those changes to property semantics this became unnecessary,
and its only use was dropped at some point.
Just another corner-caseish potential issue. Unlike unreffing the image
manually, unref_current_image() also takes care of properly unmapping
hwdec frames. (The corner-case part of this is that it's probably never
mapped at this point, but it's apparently not entirely guaranteed.)
The " || vimg->mpi" part virtually never seems to trigger, but on the
other hand could possibly create unintended corner cases (for example by
trying to upload a NULL image, which would then be marked as an error
and render a blue screen).
I guess it's a leftover from older times, where a NULL image meant
"redraw the current frame". This is now handled by actually passing
along the current frame.
Nvidia's "NvDecode" API (up until recently called "cuvid" is a cross
platform, but nvidia proprietary API that exposes their hardware
video decoding capabilities. It is analogous to their DXVA or VDPAU
support on Windows or Linux but without using platform specific API
calls.
As a rule, you'd rather use DXVA or VDPAU as these are more mature
and well supported APIs, but on Linux, VDPAU is falling behind the
hardware capabilities, and there's no sign that nvidia are making
the investments to update it.
Most concretely, this means that there is no VP8/9 or HEVC Main10
support in VDPAU. On the other hand, NvDecode does export vp8/9 and
partial support for HEVC Main10 (more on that below).
ffmpeg already has support in the form of the "cuvid" family of
decoders. Due to the design of the API, it is best exposed as a full
decoder rather than an hwaccel. As such, there are decoders like
h264_cuvid, hevc_cuvid, etc.
These decoders support two output paths today - in both cases, NV12
frames are returned, either in CUDA device memory or regular system
memory.
In the case of the system memory path, the decoders can be used
as-is in mpv today with a command line like:
mpv --vd=lavc:h264_cuvid foobar.mp4
Doing this will take advantage of hardware decoding, but the cost
of the memcpy to system memory adds up, especially for high
resolution video (4K etc).
To avoid that, we need an hwdec that takes advantage of CUDA's
OpenGL interop to copy from device memory into OpenGL textures.
That is what this change implements.
The process is relatively simple as only basic device context
acquisition needs to be done by us - the CUDA buffer pool is managed
by the decoder - thankfully.
The hwdec looks a bit like the vdpau interop one - the hwdec
maintains a single set of plane textures and each output frame
is repeatedly mapped into these textures to pass on.
The frames are always in NV12 format, at least until 10bit output
support emerges.
The only slightly interesting part of the copying process is that
CUDA works by associating PBOs, so we need to define these for
each of the textures.
TODO Items:
* I need to add a download_image function for screenshots. This
would do the same copy to system memory that the decoder's
system memory output does.
* There are items to investigate on the ffmpeg side. There appears
to be a problem with timestamps for some content.
Final note: I mentioned HEVC Main10. While there is no 10bit output
support, NvDecode can return dithered 8bit NV12 so you can take
advantage of the hardware acceleration.
This particular mode requires compiling ffmpeg with a modified
header (or possibly the CUDA 8 RC) and is not upstream in ffmpeg
yet.
Usage:
You will need to specify vo=opengl and hwdec=cuda.
Note that hwdec=auto will probably not work as it will try to use
vdpau first.
mpv --hwdec=cuda --vo=opengl foobar.mp4
If you want to use filters that require frames in system memory,
just use the decoder directly without the hwdec, as documented
above.
Instead of copying the options around... just don't. video.c now has
full control over when options are updated. (It still gets notified from
outside, but it decides when the updated options are copied: when
m_config_cache_update() is called.) So there's no need for tricky
stuff, and it can be simplified a bit.
Also change lcms.c. We could do it like video.c, and get the options
from the global config store. But it seems simpler to just provide a
pointer to an option struct, which is arbitrarily mutated from the
outside (from the perspective of lcms.c).
Setting --icc-profile had no effect, until a vo_opengl option was
changed at runtime. We must initialize the renderer for the initial
option state too.
For some reason, the ICC profile gets loaded twice. The next commit
happens to fix this.
I decided that it's too much work to convert all the VO/AOs to the new
option system manually at once. So here's a shitty hack instead, which
achieves almost the same thing. (The only user-visible difference is
that e.g. --vo=name:help will list the sub-options normally, instead of
showing them as deprecation placeholders. Also, the sub-option parser
will verify each option normally, instead of deferring to the global
option parser.)
Another advantage is that once we drop the deprecated options,
converting the remaining things will be easier, because we obviously
don't need to add the compatibility hacks.
Using this mechanism is separate in the next commit to keep the diff
noise down.
Instead of requiring each VO or AO to manually add members to MPOpts and
the global option table, make it possible to register them automatically
via vo_driver/ao_driver.global_opts members. This avoids modifying
options.c/options.h every time, including having to duplicate the exact
ifdeffery used to enable a driver.
On a VOCTRL_UPDATE_PLAYBACK_STATE store the state, and use it to
initialize the next time the task list becomes available.
This actually fixes #3482. Revert commit f2e25e9e because it's not
needed anymore.
With the recent vo_opengl changes it doesn't do anything anymore.
I don't think a deprecation period is necessary, because the command
was always marked as experimental.
vo_opengl sub-options were always rather annoying to handle. It seems
better to make them global options instead. This is simpler and easier
to use. The only disadvantage we are aware of is that it's not clear
that many/all of these new global options work with vo_opengl only.
--vo=opengl-hq is also deprecated.
There is extensive compatibility with the old behavior. One exception is
that --vo-defaults will not apply to opengl-hq (though with opengl it
still works). vo-cmdline is also dysfunctional and will be removed in a
following commit.
These changes also affect opengl-cb.
The update mechanism is still rather inefficient: it requires syncing
with the VO after each option change, rather than batching updates.
There's also no granularity (video.c just updates "everything", and if
auto-ICC profiles are enabled, vo_opengl.c will fetch them on each
update).
Most of the manpage changes were done by Niklas Haas <git@haasn.xyz>.
This is still rather basic.
run_reconfig() and run_control() update the options because it's needed
for panscan (and other video scaling options), and fullscreen, border,
ontop updates. In the old model, these options could be accessed only
while both playback thread and VO threads were locked (i.e. during
synchronous calls like vo_control()), so this should be sufficient in
order not to miss any updates. In the future, a more fine-grained update
mechanism could be added to handle these updates "exactly".
x11_common.c contains an evil hack, as I see no reasonable way to handle
this properly. The VO thread can't "lock" the main thread, so this is
not simple.
Reduce accesses to the renderer opts in vo_opengl.c, and instead add
accessors for them to video.c.
I suppose gamma and maybe icc-auto could be moved to vo_opengl.c
options. Also, the output colorspace could probably be adjusted to what
is really used, not just the options (although it's possible that this
commit changes this, due to video.c mutating its own copy of the options
according to actual renderer capabilities).
But don't deal with this now.
Normally I'd prefer a bunch of smaller functions with fewer parameters
over a single function with a lot of parameters. But future changes will
require messing with the parameters in a slightly more complex way, so a
combined function will be needed anyway. The now-unused "global"
parameter is required for later as well.
Deprecated in favor of user-shaders, which are functionally equivalent
but superior. (Except in the case of scaler-shader, which has no direct
replacement, but it turned out to be a very unpopular feature anyway
- most custom scalers don't fit into the mpv kernel infrastructure and
are therefore implemented as user shaders either way)
Signed-off-by: wm4 <wm4@nowhere>
Positional parameters cause problems because they can be ambiguous with
flag options. If a flag option is removed or turned into a non-flag
option, it'll usually be interpreted as value for the first sub-option
(as positional parameter), resulting in very confusing error messages.
This changes it into a simple "option not found" error.
I don't expect that anyone really used positional parameters with --vo
or --ao. Although the docs for --ao=pulse seem to encourage positional
parameters for the host/sink options, which means it could possibly
annoy some PulseAudio users.
--vf and --af are still mostly used with positional parameters, so this
must be a configurable option in the option parser.
Before this commit, all VOs had to toggle the option flag themselves,
now command.c does it.
I can't really comprehend why it required every VO to do this manually.
Maybe it was for rejecting the property/option change if the VO didn't
support a specific capability. But then it could have checked the VOCTRL
result. In any case, I don't care, and successfully changing the
property without doing anything (with some VOs) is fine too. Many things
work this way now, and it's simpler overall.
This change will be useful for cleaning up VO option handling.
Just a minor refactor along the planned option change. This commit will
make it easier to update (i.e. copy) the VO options without copying
_all_ options. For now, behavior should be equivalent, though.
(The VO options were put into a separate struct quite early - when all
global variables were removed from the source code. It wasn't clear
whether the separate struct would have any actual purpose, but it seems
it will now. Awesome, huh.)
It seems like many GL implementations (including Mesa) choke on this,
while others are fine. We still think that this use of the GL API is
allowed by the standard (at least in the Mesa case), so to reduce
confusion, explicitly check the "controversial" calls, and use an
appropriate error message.
These are different AVCodecContext fields. pkt_timebase is the correct
one for identifying the unit of packet/frame timestamps when decoding,
while time_base is for encoding. Some decoders also overwrite the
time_base field with some unrelated codec metadata.
pkt_timebase does not exist in Libav, so an #if is required.
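So the assignment ends up guarded roughly like this
(HAVE_AVCODEC_PKT_TIMEBASE is a hypothetical configure-generated define,
used here just for illustration):

    #include <libavcodec/avcodec.h>

    static void set_decoder_timebase(AVCodecContext *avctx, AVRational tb)
    {
    #if HAVE_AVCODEC_PKT_TIMEBASE  /* hypothetical configure define */
        avctx->pkt_timebase = tb;  // unit of packet/frame timestamps (decoding)
    #else
        (void)avctx; (void)tb;     // Libav: the field doesn't exist
    #endif
    }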
Because VOCTRL_CHECK_EVENTS is processed asynchronously (as of 088a007),
the GUI thread no longer gets regular wakeups, so the old check that
made sure the video window matched the parent window's size in --wid
embedding mode did not run very often. This made --wid embedding not
very usable.
Instead of polling for window size changes, use Windows hooks to react
to them when they happen. When the parent window is owned by the same
process as the video window, use a WH_CALLWNDPROC hook. When the parent
window is not owned by the same process, WinEvents must be used, which
are not as smooth, but still work for this purpose.
Since neither SetWindowsHookEx nor SetWinEventHook take a context
parameter to send data to the hook function, the hook functions must
find the child window by its class instead, so there are a few changes
to ensure this is fast and the class is unique.
This also fixes up the logic to handle window destruction. When a parent
window is destroyed, its children are also destroyed, so this gives us a
way to react to parent window destruction without polling.
If the video has the same size as the screen, starting with --fs and
then leaving fullscreen doesn't actually leave fullscreen.
The reason is that mpv tries to restore the previous window size if
necessary (otherwise, you'd end up with a Window of nearly the same size
as the screen with some WMs). It will typically restore with the
rectangle set exactly to the screen if no other position or size is
forced. This triggers pre-EWMH fullscreen mode, which WMs detect using
various heuristics.
Apparently we triggered this with mutter (but strangely no other WMs).
It's possible that pre-EWMH fullscreen mode actually requires removing
decorations, and mutter just ignores this. But this is speculation and
I haven't checked.
Work this around by reducing the requested size by 1 pixel if it
happens.
This was observed with mutter 3.18.2.
Fixes #2072.
This should actually be rather safe - we already check whether the
estimated value jitters less than the (possibly untrustworthy) nominal
one. Remove a "safety" check that disabled this code for small
deviations, and make it trigger sooner into playback. Also lower the log
level of messages about using the estimated display FPS down to verbose.
Normally there's another mechanism for smoothing out minor estimation
differences, but that is not good enough here.
This possibly improves behavior as reported in #3433, which can be
reproduced with --vo=null:fps=48.426 --display-fps=48 (though it doesn't
consider the jitter introduced by a real VO).
Doing this required synchronizing with the VO thread, which could lead
to audio dropouts if the VO was frozen (which can happen in practice if
e.g. an opengl_cb user is not doing what the API demands).
Add a way to send asynchronous VOCTRLs, and use that for the playback
state. In theory, it would be better to make this status update a
separate function and to "merge" several queued updates, but that would be
slightly more effort/code, and the update is so infrequent that the
merging would never happen anyway.
The change to vo_destroy() is to make sure all queued asynchronous
requests are finished before making the vo_thread exit.
Even though it's only used on MS Windows, it's run on any platform with
any VO, which makes this worse.
run_control() dereferences an uint32_t as int. Whether this is allowed
depends on what uint32_t is typedefed to (dereferencing an unsigned int
as int should be fine). Fix it by always using int. The uint32_t type
never really made sense.
Instead of passing through double float timestamps opaquely, pass real
timestamps. Do so by always setting a valid timebase on the
AVCodecContext for audio and video decoding.
Specifically try not to round timestamps to a too coarse timebase, which
could round off small adjustments to timestamps (such as for start time
rebasing or demux_timeline). If the timebase is considered too coarse,
make it finer.
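For illustration, converting between seconds and such a timebase looks
roughly like this (a generic sketch; choosing the actual timebase is the
part this commit worries about):

    #include <stdint.h>
    #include <libavutil/avutil.h>
    #include <libavutil/mathematics.h>
    #include <libavutil/rational.h>

    // Seconds -> integer timestamp in the decoder's timebase "tb". If tb
    // is too coarse, small adjustments (start-time rebasing,
    // demux_timeline) get rounded away, which is what should be avoided.
    static int64_t sec_to_ts(double sec, AVRational tb)
    {
        return av_rescale_q((int64_t)(sec * AV_TIME_BASE), AV_TIME_BASE_Q, tb);
    }

    static double ts_to_sec(int64_t ts, AVRational tb)
    {
        return ts * av_q2d(tb);
    }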
This gets rid of the need to do this specifically for some hardware
decoding wrapper. The old method of passing through double timestamps
was also a bit questionable. While libavcodec is not supposed to
interpret timestamps at all if no timebase is provided, it was
needlessly tricky. Also, it actually does compare them with
AV_NOPTS_VALUE. This change will probably also reduce confusion in the
future.
Instead of letting it keep decoding by trying to find a new frame,
"plug" the frame queue by not removing it. (Or actually, by putting
it back instead of discarding it.)
Matters for seamless looping (following commits), and possibly some
other corner cases.
The added function vf_unread_output_frame() is a bit of a sin, but still
reasonable, since its implementation is trivial.
In display-sync mode, the very first video frame is idiotically fully
timed, even though audio has not been synced yet at this point, and the
video frame is more like a "preview" frame. But since it's fully timed,
an underflow is detected if audio takes longer than the display time of
the frame (we send the second frame only after audio is done).
The timing code will try to compensate for the determined desync, but it
really shouldn't. So explicitly discard the timing info in this specific
case. On the other hand, if the first frame still hasn't finished
displaying, we can pretend everything is ok.
This is a hack - ideally, we either would send a frame without timing
info (and then send it again or so when playback starts properly), or we
would add real pause support to the VO, and pause it during syncing.
VOCTRL_CHECK_EVENTS is called on every frame. This is by design, and is
supposed to check the event queue of the windowing API.
With the decoupled GUI thread in w32_common.c this doesn't make too much
sense, and the purpose of VOCTRL_CHECK_EVENTS is really reduced to
checking event flags. Even worse, waiting on the GUI thread can
interfere with playback, since win32 sometimes blocks the event loop
(e.g. clicking the window title bar).
Change the code such that we really only query the event flags. Use
atomics to avoid having to add a new mutex. (We assume we always have
real atomics available. The build system doesn't check this properly,
and it could fall back to dummy atomics, which are not atomic.)
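The flag handling then reduces to something like this (sketch; the names
are not the actual w32_common.c ones):

    #include <stdatomic.h>

    static _Atomic int event_flags;

    // GUI thread: set flags without blocking or waking anything.
    static void flag_events(int flags)
    {
        atomic_fetch_or(&event_flags, flags);
    }

    // VO thread, per frame: fetch and clear the flags without ever
    // having to wait on the GUI thread.
    static int check_events(void)
    {
        return atomic_exchange(&event_flags, 0);
    }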
Should help with #3393. Doesn't help if the core happens to send a
synchronous request, most commonly via VOCTRL_SET_CURSOR_VISIBILITY or
VOCTRL_UPDATE_PLAYBACK_STATE.
Prevents segfaults when a fullscreen switch is issued before fully
initializing the VO.
Doesn't change anything since the schedule_resize is only there to
resize in case the image size switches, which happens long after init.
The problem was that when in fullscreen, switching between images did
not issue a resize event, causing none of the images to be rendered
correctly.
This fixes the problem by issuing a resize event with the screen width
and height.
This commit also moves the zeroing of the events field to when it gets
retrieved by mpv rather than randomly after a resize in the vo/backend
code.
ssurface_handle_configure()'s width and height are just hints given by
the compositor, the application's free to not respect those strictly and
to compensate for e.g. aspect ratio.
This prevents crazy scenarios in which pictures with portrait aspect
ratios have a huge black area to make them 16:9 or whatever the
compositor feels like.
With X11 it was usually left up to the window manager to prevent huge
windows from being out of range, but no Wayland compositor will do
this right now.
Hugely improves usability when using mpv as an image viewer.
Missed during the recent changes.
Also simplify error checking code and check for POLLNVAL
as well (the display fd was never actually checked to be valid).
This requires changing the pixel upload alignment because the odd sizes
might not be aligned to multiples of 4.
Anyway, the restriction has no real benefit and the sizes in between 32
and 64 might be worth using, so just drop it.
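Concretely, the upload just has to relax the row alignment first (sketch;
the function name is made up):

    #include <GL/gl.h>

    static void upload_lut_rows(void)
    {
        // Odd sizes mean rows are no longer multiples of 4 bytes, so the
        // default GL_UNPACK_ALIGNMENT of 4 would skew the upload.
        glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
        /* ... the glTexImage*/glTexSubImage* upload follows here ... */
    }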
Following testing after ebe798a, this is a more than sufficient size to
cover our use case.
The old default was a drop of about 58 dB PSNR using the old code, and
this new default is about 65 dB PSNR, so it's actually an improvement
despite resulting in a smaller size.
There was no outlier whatsoever when comparing sizes around the 64
neighbourhood (with every step corresponding to a PSNR drop of about
0.07 dB), so I picked this since it's a power of two and requires no
change to the current 3dlut-size parsing logic.
I also tested smaller sizes such as 32x32x32 which performed almost as
well on colorful samples, but this results in noticeable black boost in
the dark regions, which is pretty undesirable. Therefore, we should
avoid going much further below 64x64x64.
Either way, this new size is so fast to compute that the 3dlut cache is
almost useless on my end. In fact, it might even be slower to load the
profile from the cache than to recompute it from scratch. (For caches on
a disk. For a cache on a tmpfs, it makes no difference.)
It seems vo_x11_check_events() was supposed to return the currently
flagged events and reset them. But there are many places where
vo_x11_check_events() is called without checking its return value. This
could lead to forgotten events.
Change the code such that they can't get lost.
This code had the exact same texture indexing bug that the original
scaler code had before the introduction of the LUT_POS macro to fix it.
We can re-use this same macro here, and the performance drop is
virtually entirely negligible. The benefit is greatly improved LUT
accuracy as the 3DLUT size decreases - in particular, the old LUT
started introducing more and more black crush the lower your LUT size is
(because the error was essentially an over-contrast bias, with a
magnitude linearly related to the lut size).
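In C terms the correction is equivalent to something like this (a hedged
sketch of the same idea as the LUT_POS macro mentioned above):

    // Map x in [0,1] so that 0 and 1 land on the centers of the first and
    // last texel instead of their outer edges; without this, small LUTs
    // get an over-contrast bias (black crush).
    static float lut_pos(float x, int lut_size)
    {
        float lo = 0.5f / lut_size;
        float hi = 1.0f - 0.5f / lut_size;
        return lo + x * (hi - lo);
    }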
The new code improves black stability as the LUT size decreases, and
only at very low values (16 and below) do black levels start noticeably
getting affected (due to crude linearization of the nonlinear response
curve).
The default value of 3dlut-size is definitely generous enough for this
to make no difference out of the box, but it also causes no performance
drop at all on my machine so I see no harm in improving the logic.
Furthermore, this means we could easily decrease the default 3dlut size
in a future commit, perhaps even down to 64x64x64 as a default. (But
more testing is warranted here)
Both backends have code to close each FD of their wakeup_pipe array.
This array is default-initialized with 0, which means if the backends
exit before the wakeup pipe is created (e.g. when probing), they would
close FD 0.
Initialize the FDs with -1. Then we call close(-1) in these situations,
which is perfectly allowed and has no bad consequences.
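I.e. (sketch, with a made-up struct):

    struct backend_state {
        int wakeup_pipe[2];
    };

    static void state_preinit(struct backend_state *st)
    {
        // -1 instead of the implicit 0, so an early exit path ends up
        // calling close(-1) (a harmless EBADF) rather than closing FD 0.
        st->wakeup_pipe[0] = st->wakeup_pipe[1] = -1;
    }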
This fits natively into the vo/backend and allows to simplify the
polling code.
One new change is the fact that surface_handle_enter flags VO_EVENT_WIN_STATE
and VO_EVENT_RESIZE instead of only VO_EVENT_WIN_STATE. Before this, the code
hackily relied on the timeout and the loop in the wait_frame function to track
and set the scaling factor. Instead, this triggers mpv to run a schedule_resize
and adjust the new VO output dimensions immediately. This is also more accurate
since surface_handle_enter() gets called when a surface is created, moved and
resized, which is exactly what the rest of the player might be interested in.
This uses GLSL mix() instead of going through an indirect texture
access. Easy to implement and might require less resources on some
devices, since the oversample code was already essentially just a
special case of this.
Could be made the new default (as per issue #2685), but that should be
done in a separate commit.
Until now, this has been either handled over vo.event_fd (which should
go away), or by putting event handling on a separate thread. The
backends which do the latter do it for a reason and won't need this, but
X11 and Wayland will, in order to get rid of event_fd.
There's no need to call wl_display_flush() since all the client-side
buffered data has already been flushed prior to polling the fd.
Instead only check for POLLIN and the usual ERR+HUP.
Don't just cause vo_opengl to update the ICC profile every time the
window is moved. Instead, explicitly check if the screen was changed.
Mostly untested.
This makes the difference between passing VA_FRAME_PICTURE or
VA_BOTTOM_FIELD for progressive frames (that should be force-
deinterlaced) to VAProcPipelineParameterBuffer.flags. VA-VPP doesn't
really seem to care, and we can get rid of mp_refqueue_is_interlaced()
entirely. It could be argued it's better to pass field flags instead of
the progressive flag.
"Real" frame flag vs. what we pretend it to be. It always used the real
flag, and thus never deinterlaced unflagged frames, even if the
suboption was set to "no".
The hw_subfmt field roughly corresponds to the field
AVHWFramesContext.sw_format in ffmpeg. The ffmpeg one is of the type
AVPixelFormat (instead of the underlying hardware format), so it's a
good idea to switch to this too for preparation.
Now the hw_subfmt field is an mp_imgfmt instead of an opaque/API-
specific number. VDPAU and Direct3D11 already used mp_imgfmt, but
Videotoolbox and VAAPI had to be switched.
One somewhat user-visible change is that the verbose log will now always
show the hw_subfmt as image format, instead of as nonsensical number.
(In the end it would be good if we could switch to AVHWFramesContext
completely, but the upstream API is incomplete and doesn't cover
Direct3D11 and Videotoolbox.)
This should get mpv working on Windows 7 machines without hardware
accelerated graphics adapters. It already worked on Windows 8 and up
because those systems would silently fall back to WARP if there was no
graphics hardware installed.
The normal MPGL_CAP_SW flag is not set, so unlike other opengl backends,
this will choose a software adapter even if opengl:sw is not specified.
The reason for this is, unlike on Linux, where vo_xv and vo_x11 can be
used, mpv on Windows does not have any VO to fall back on when hardware
acceleration isn't available, so if software adapters are rejected, the
user won't see any video output when using the default settings. WARP
seems to perform quite well, so it should be used in this case.