This is mostly for the case when mpv_render_context_free() is called
while video is going on. This is supposed to gracefully stop video and
deinitialize everything properly. (I feel like it would put too much on
the API user to require that video is stopped before calling this
function. Whether video is running or not is a fairly high-level thing,
and the API user could not do it in a race-free way.)
One problem was that uninit() accessed ctx after ctx->in_use was set to
false. The update(ctx) call was basically a racy use-after-free. It
needed that call to wake up the mpv_render_context_free() loop that
waited for VO uninit. Fix this by triggering the wakeup inside the lock,
and then doing "barrier" locking in mpv_render_context_free().
Another problem was that the wait loop didn't really wait properly. It
seems the had_kill_update field was a botched attempt to do that. It's
indeed quite hairy to do that with update(). Instead make use of the
dispatch queue (infinite timeout, using mp_dispatch_interrupt()), which
handles the problem of having to wait both for dispatch queue updates
and VO uninit at the same time.
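A rough sketch of the resulting shutdown handshake (mp_dispatch_queue_process()
and mp_dispatch_interrupt() are the real functions from misc/dispatch.h; the
struct fields, locking details and function names here are simplified and
partly hypothetical):
    #include <math.h>
    #include <pthread.h>
    // Caller side: wait until the VO thread has finished uninit. The dispatch
    // call sleeps with an "infinite" timeout until either a dispatch item
    // arrives or mp_dispatch_interrupt() is called.
    void render_context_wait_for_uninit(struct mpv_render_context *ctx)
    {
        pthread_mutex_lock(&ctx->lock);
        while (ctx->vo_alive) {                       // hypothetical flag
            pthread_mutex_unlock(&ctx->lock);
            mp_dispatch_queue_process(ctx->dispatch, INFINITY);
            pthread_mutex_lock(&ctx->lock);
        }
        pthread_mutex_unlock(&ctx->lock);
    }
    // VO thread, at the end of uninit: flip the flag and wake the waiter while
    // still holding the lock, so the waiter cannot miss the signal.
    void render_context_signal_uninit(struct mpv_render_context *ctx)
    {
        pthread_mutex_lock(&ctx->lock);
        ctx->vo_alive = false;
        mp_dispatch_interrupt(ctx->dispatch);
        pthread_mutex_unlock(&ctx->lock);
    }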
Preparation for the next commit. Until now, it was only needed if DR was
involved. One reason for not always creating it was that you normally
must not use it if advanced_control is not enabled. This is why e.g.
VOCTRL_SCREENSHOT now checks for that variable; it still can't use
ctx->dispatch if the render API user did not enable it.
render api needs to wait for vo to be destroyed before freeing the context.
The purpose of kill_cb is to wake up render api after vo is destroyed,
but uninit did that before kill_cb, so kill_cb tries using the freed
memory. Remove kill_cb to fix the issue as uninit is able to do the
work.
Equalizer control was redone in 03cf150ff3 (over 2 years
ago). Ever since, the equalizer control structs and the GET voctrl have
been unused. Only the SET voctrl is still used as notification mechanism
(actually a bad hack to avoid some further option change handling
complexity).
Remove the unused parts.
Dear diary,
today I fixed a shitty bug that was all my fault because I made a
horrible mess. (Except it was a horrible mess before I even touched
this shit, but let's not blame others.)
Sometimes, updates to VO options that control video sizing (like panscan)
didn't update the screen correctly. They were delayed until the next
option change or so.
It turns out that if the option update happens at the "same" time as a
VOCTRL, update_opts() doesn't actually notify the vo_driver of the
change. This in turn happened because run_control() called
m_config_cache_update(). The latter function returns true if the options
changed since the last call, and update_opts() also calls it (on the
same config cache) for the same purpose. The update_opts() call, which
is triggered by a third mechanism, comes later, but the cache update
call will return false (as it should). Basically, given the config API,
you can't act differently on multiple update calls and expect it to
work. The skipped handling in update_opts() meant that the notification
required to apply the changed option wasn't run.
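A condensed sketch of the anti-pattern (m_config_cache_update() is the real
mpv function; the field and helper names are illustrative):
    // m_config_cache_update() returns true only for the FIRST call after an
    // option change on a given cache instance.
    static void run_control(struct vo *vo)
    {
        // This call "consumes" the pending change notification...
        m_config_cache_update(vo->opts_cache);
        // ... handle the VOCTRL ...
    }
    static void update_opts(struct vo *vo)
    {
        // ... so this later call sees "nothing changed", and the VO driver
        // notification (VOCTRL_SET_PANSCAN etc.) is silently skipped.
        if (m_config_cache_update(vo->opts_cache))
            notify_vo_driver(vo);   // hypothetical helper
    }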
Fix this by simply calling update_opts() directly instead. Now there's
only 1 m_config_cache_update() call on this specific instance. Fix the
call in run_reconfig() too, so the previous sentence isn't a lie (but it
probably doesn't make a difference in practice due to certain details).
I'm not sure how I even ran into this sort-of race condition. The VOCTRL
that messed up the option update was VOCTRL_UPDATE_PLAYBACK_STATE, which
happens semi-regularly.
Why this config cache shit and all the other shit? Rediscovering this
crap wasn't pleasant. It's a bunch of hacks that became necessary when
the ancient MPlayer architecture made it hard to move the VO to a
separate thread.
All the VO code typically accesses vo->opts (whose fields all used to be
global variables in MPlayer). The frontend changes these on user input.
Putting locking around all the options would be a nightmare, and keeping
a copy of the options in the thread was much simpler. You need a way to
propagate option changes, notify the thread, and update the local copy
too. And the result of these thoughts was the config cache mechanism.
In this specific case, the relevant cache update call in update_opts()
triggers a VOCTRL_SET_PANSCAN to the VO driver, which isn't related to
its former function anymore. Instead, it causes the VO driver to update
the video sizing/placing options, which the generic VO code can't do.
(Mostly because the VO driver includes the windowing stuff and is
responsible for resizing etc. itself.)
VOCTRLs sent by the frontend are even worse. MPlayer had no real runtime
option change mechanism. Some options were vaguely duplicated by
properties, so you could effectively change those options at runtime.
Each of these options had its own VOCTRL, which still exist today, e.g.
VOCTRL_FULLSCREEN, or VOCTRL_ONTOP. I tried to make all options runtime
changeable, and to unify properties with options. But I couldn't be
bothered with updating all VO drivers to listen to option changes
directly, because that would be pretty tedious. So the property code is
still all there and sends the old VOCTRLs. But of course you need to
sync up the options, which is why the run_control() code did that.
(Unrelated: VO_EVENT_FULLSCREEN_STATE is the worst shithack of them all.
Currently, only the frontend can actually write to options (for awful
reasons), so if the fullscreen state changes due to outside interaction,
the VO driver can't update the corresponding option fields. So the VO
notifies the frontend with said VO_EVENT_, and the frontend then sends
VOCTRL_GET_FULLSCREEN, and updates the global copy of the option with
the value returned by that. I still like to think the situation is not
that bad considering the monstrous effort of converting single-threaded
code that had hundreds of options in global variables to multi-threaded
code with no global variables at all.)
m_geometry_apply() will read and modify the dummy variable. It's not
actually used for anything, but valgrind will still warn against
uninitialized data. I'm not sure whether this was UB, but in any case
it's annoying when running valgrind.
During probing on a system with AMD GPU, mpv used to output the
following messages if hardware decoding was enabled:
[ffmpeg] AVHWFramesContext: Failed to create surface: 2 (resource
allocation failed).
[ffmpeg] AVHWFramesContext: Unable to allocate a surface from internal
buffer pool.
This commit removes the message, with hopefully no other side effects.
Long explanations follow, better don't read them, it's just tedious
drivel about the details. People should learn to write concise commit
messages, not drone on and on endlessly all while they have no fucking
point.
The code probes supported hardware pixel formats, and checks whether they
can be mapped as textures. av_hwdevice_get_hwframe_constraints() returns
a list of hardware pixel formats in the valid_sw_formats field (the "sw"
means software, but they're still hardware pixel formats, makes sense).
This contained the format yuv420p, even though this is not a valid
hardware format. Trying to create a surface of this type results in VA
surface creation failure, upon which FFmpeg prints the error messages
above. We'd be fine with this, except FFmpeg has a global log callback,
and there's no way to suppress these messages without creating other
issues.
It turns out that FFmpeg's vaapi implementation returns all formats from
vaQueryImageFormats() if no "hwconfig" is provided. This list includes
yuv420p, which is probably supported for surface upload/download, but
not as native format. Following FFmpeg's logic, it should not appear in
the valid_sw_formats list, because formats for transfers are returned by
another roundabout API.
Idiotically, there doesn't seem to be any vaapi call that determines
whether a format is a valid surface format. All mechanisms to do this
are bound to a VAConfigID (= video codec or video processor), all while
the actual surface creation API strangely does not take a VAConfigID (a
big WTF).
Also, calling the vaCreateSurfaces() API ourselves for probing is out of
the question, because that function is utterly and idiotically complex.
Look at the FFmpeg code and how much effort it requires to set up a
complete set of attributes - we can't duplicate this.
So the only way left to do this is the most idiotic and tedious way:
enumerating all VAProfile (and VAEntrypoints) to create all possible
VAConfigIDs. Each of the VAConfigIDs is associated with a list of
formats, which FFmpeg can return (by passing the ID along with the
"hwconfig"), and which is probed separately.
Note that VAConfigID actually refers to a dynamic instance of something,
and creating a VAConfigID takes not only the VAProfile and the
VAEntrypoint, but also an arbitrary attribute array. In theory, this
means our attempt to get to know all possible configurations cannot
work, but in practice this attribute array seems to be pointless for
decoding and video processing, and FFmpeg doesn't use it (though the
encoding path does use it). This probably just makes it _barely_ OK to
do it this way.
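Condensed, this is roughly the shape of the probing loop (the libva and FFmpeg
calls are the real APIs; error handling and the actual texture-mapping test
are omitted):
    #include <stdlib.h>
    #include <va/va.h>
    #include <libavutil/hwcontext.h>
    #include <libavutil/hwcontext_vaapi.h>
    static void probe_formats(AVBufferRef *device_ref, VADisplay dpy)
    {
        int num_profiles = vaMaxNumProfiles(dpy);
        VAProfile *profiles = calloc(num_profiles, sizeof(*profiles));
        vaQueryConfigProfiles(dpy, profiles, &num_profiles);
        for (int p = 0; p < num_profiles; p++) {
            int num_ep = vaMaxNumEntrypoints(dpy);
            VAEntrypoint *ep = calloc(num_ep, sizeof(*ep));
            vaQueryConfigEntrypoints(dpy, profiles[p], ep, &num_ep);
            for (int e = 0; e < num_ep; e++) {
                VAConfigID config = VA_INVALID_ID;
                if (vaCreateConfig(dpy, profiles[p], ep[e], NULL, 0,
                                   &config) != VA_STATUS_SUCCESS)
                    continue;
                // The "malloc a struct to pass a single damn integer" part:
                AVVAAPIHWConfig *hwconfig = av_hwdevice_hwconfig_alloc(device_ref);
                hwconfig->config_id = config;
                AVHWFramesConstraints *cs =
                    av_hwdevice_get_hwframe_constraints(device_ref, hwconfig);
                // ... probe each entry of cs->valid_sw_formats (if not NULL),
                //     skipping formats that were already probed ...
                av_hwframe_constraints_free(&cs);
                av_free(hwconfig);
                vaDestroyConfig(dpy, config);
            }
            free(ep);
        }
        free(profiles);
    }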
Could we discard all this probing shit, and somehow do it another way?
Probably not. The EGL API for mapping surfaces doesn't even seem to
provide a way to enumerate supported formats, we may not even know
whether DRM/dmabuf interop is actually supported (AFAIR the EGL
extensions are present even if they don't work), nor do we know whether
the VAAPI driver supports this interop (not sure). So actually trying is
the only way.
Further, mpv initializes the decoder on another thread, where you
can't just access OpenGL state. This suckage is mostly to be blamed on
OpenGL itself and its crazy thread boundedness. In theory, this could be
done anyway (see how software decoding "direct rendering" tries to get
around this). But to make it worse, the decoder never cares about the
list of supported formats determined by this code; instead,
f_autoconvert.c tries to deal with it and insert a video processor
(well, good luck with this crap, I bet it doesn't even work). So this
whole endeavor might be pointless, other than the fact that failed
probing can disable use of vaapi (which is correct and necessary). But
if you have a shovel, you don't use it to smash the flat end on the heap
of shit that's piled up before you, or do you?
While this method probably works, it's still orgasmically tedious. It
was tedious before: we had to create a real surface, create a GL
texture, map the surface with it, then destroy everything again. But the
added code is tedious on its own. Highlights include the need to malloc
a FFmpeg struct just to pass a single damn integer, the need to
enumerate "entrypoints" for each VA profile, even though all profiles
have exactly 1 entrypoint, and the kind of obnoxious way how vaapi
requires you to preallocate arrays for returned things, even though they
for example reasonably be returned as immutable arrays or have some
other simpler API.
The main grand fuckup is of course that vaapi requires a VAConfigID to
query surface properties, but not for creating surfaces. This
awkwardness even affected the FFmpeg API design, which has a "hwconfig"
concept that is only used by vaapi (vaapi is only 1 out of 10 hardware
decoding APIs supported by the FFmpeg hwcontext stuff). Maybe I'm just
missing something. It's as if vaapi required setting radioactive shit on
fire. Look how clean the native D3D11 code is instead. (Even the ANGLE
code manages to avoid being this fucked up. Or the VDPAU code, despite
supporting multiple mapping methods.)
Another only barely related change is that the valid_sw_formats field
can be NULL, and the API explicitly documents this. Technically, the mpv
code was buggy for not checking this, although until now the FFmpeg
implementation could not actually return it, since we always passed NULL
for the hwconfig parameter.
No functional changes, just preparation for the next commit. Split the
probing into multiple functions. Prepare for the yet unused possibility
to pass AVVAAPIHWConfig to probing. try_format_pixfmt() now assumes it
can be called multiple times with the same format, so it filters the
format.
The format probing is now something like O(n^2) for n formats, but n
will most likely remain something under 50 or so.
Useless garbage.
This was once added to test whether vdpau presentation feedback could be
used. Results were always unsatisfactory, and now vdpau is dead.
Semantics a bit questionable. This is done for the OSC (next commit),
and a comment added to the manpage explicitly states this. Meaning this is
probably garbage and needs to be revisited when the OSC changes and/or
someone wants to use this margin feature for something else.
Not sure about the subtitle thing. It's imaginable that someone uses
these options to create empty borders for subtitles on the bottom, so
subtitles should be located there. On the other hand, this gives a
rather unpolished user experience when using the (later added) OSC
feature to not overlap with the video. There's not much of a point if
the OSC still overlaps the video. However, I'm too lazy to think about
this, so it stays like it is.
--video-margin-ratio-left=0.2 --video-margin-ratio-right=0.9 (added in
the next commit) will set f_w to inf, resulting in some garbage
being propagated. Later, the OSD margins are computed from values before
various sanity clamping is applied, which makes libass suffer from
bullshit values.
I'm very sure it's OK and more correct to compute the OSD margins using
the later values, but I'm not sure about that.
This one is probably not terribly obvious from just the valgrind log,
but a wayland dev explained it to me just a second ago. Whenever mpv
sends events to the screen with wl_display_dispatch, wayland internally
allocates memory to a struct wl_proxy object if a new id is found. Quite
a few more things happen to that proxy object, but eventually mpv stores
the data on the client-side in a wrapper type of struct (struct
wl_data_offer). mpv's data_device_listener keeps track of those proxies
and frees the memory when appropriate. Of course, mpv is constantly
sending events to the screen and does so until the user quits the
player. What happens here is that one final wl_display_dispatch is called
right before the user quits the player and before mpv's
data_device_listener can handle that object. So the result is that you
always have one extra dangling proxy that doesn't get properly freed.
The solution is to just simply call wl_data_offer_destroy before closing
the wl_display to free that final dangling wl_proxy.
Extending the client-allocated mpv_opengl_drm_params struct
constituted a break of ABI that could cause UB.
Create a clean break by deprecating "drm_params" and related structs
and enum values, and replacing it with "drm_params_v2".
Also fix some comments and code that wrongly assumed that open could
return any other negative number than -1 for failure.
This commit updates the libmpv version to 1.104
Like hwdec_cuda, you get a big #ifdef mess if you try and keep the
OpenGL and Vulkan interops in the same file. So, I've refactored
them into separate files in a similar way.
Originally, vo_gpu/vo_opengl considered the case of Nvidia proprietary
drivers, which required vdpau/GLX, and Intel open source drivers, which
require vaapi/EGL. Since window creation and GPU context creation are
inseparable in mpv's internal API, it had to pick the correct API very
early, or hardware decoding wouldn't work. "x11probe" was introduced for
this reason. It created a GLX context (without showing the window yet),
and checked whether vdpau was available. If yes, it used GLX, if not, it
continued probing x11/EGL. (Obviously it couldn't always fail on GLX
without vdpau, which is why it was a separate "probe" backend.)
Years passed, and now the situation is different. Vdpau is dead. Nvidia
drivers and libavcodec now provide CUDA interop, which requires EGL, and
fixes some of the vdpau problems. AMD drivers now provide vaapi, which
generally works better than vdpau. Intel didn't change.
In particular, vaapi provides working HEVC Main10 support. In theory, it
should work on vdpau too, with quality reduction (no 10 bit surfaces),
but I couldn't get it to work.
So always prefer EGL. And suddenly hardware decoding works. This is
actually rather important, because HEVC is unfortunately on the rise,
despite shitty encoders and unoptimized decoders. The latter may mean
that hardware decoding works better than libavcodec.
This should have been done a long, long time ago.
In some cases, src.sig_peak remains undefined as 0, which was definitely
the case when using the OSD, since it never got passed through the usual
color space normalization process. Most robust work-around is to simply
force the normalization at the site where it's needed. This ensures this
value is always valid and defined, to make the peak-dependent logic in
these two functions always work.
Fixes 4b25ec3a9d
Fixes #6917
Fixes #6918
Mesa supports the EGL_CHROMIUM_sync_control extension, and it's
available out of the box with AMD drivers. In practice, this is exactly
the same as GLX_OML_sync_control, but for EGL. The extension
specification is separate from the GLX one though, and buried somewhere
in the Chromium code.
This appears to work, although I don't know if it really works.
In theory, this could be useful for other EGL targets. Support code for
it could have been added to egl_helpers.c to avoid some minor duplicated
glue code if another EGL target were to provide this extension. I didn't
bother with that. ANGLE on Windows can't support it, because the
extension spec. explicitly requires POSIX timers. ANGLE on Linux/OSX is
actively harmful for mpv and hopefully won't ever use it. Wayland uses
EGL, but has its own fancy presentation feedback stuff (and besides, I
don't think basic video player functionality works on Wayland at all).
context_drm_egl maybe? But I think DRM has its own stuff.
So the next commit can make EGL use it. EGL has a quite similar
function that practically works the same. Although it's relatively
trivial, it's still tricky, and probably shouldn't end up as duplicated
code.
There are no functional changes, except initialization, and how failure
of the glXGetSyncValues call is handled. Also, some comments mention the
EGL extension.
Note that there's no intention for this code to handle anything else
than the very specific OML sync extension (and its EGL equivalent). This
is just too weirdly specific to the weird idiosyncrasies of the
extension, and it makes no sense to extend it to handle anything else.
(Such as Wayland or DXGI presentation feedback.)
In the past, src peak was always equal to or higher than dst peak. But
since `--target-peak` got introduced, this could no longer be the case.
This leads to an incorrect result (scaling for peak mismatch in gamma
light) unless some other option (CMS, --linear-scaling, etc.) forces the
linearization.
Fixes #6533
We collect a 'vulkan-device' option today but then don't actually
pass it on, so it's useless. Once that's fixed, it can be used
to select a specific vulkan device by name.
Tested with the new nvidia offload feature to select between the
nvidia and intel GPUs.
Somehow I got the idea that compound literals had function-scoped
lifetime. Instead, like all other objects with automatic storage
duration, compound literals are block-scoped, so they become invalid
after exiting the block they were declared in. It seems like a recent
change to GCC actually reuses the memory that the compound literals
used to occupy, which was causing a few bugs.
The pattern of conditionally assigning a pointer to a compound literal
was used in a few places in ra_d3d11 where the Direct3D API expects
either a pointer to an initialised struct or NULL. Change these to
ensure the lifetime of the struct includes the API call.
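A hedged illustration of the pattern (the surrounding D3D11 variables are
placeholders; the point is only the lifetime of the compound literal):
    // Broken: the compound literal lives only until the end of the block it
    // was written in -- here, the body of the "if" -- so the pointer may
    // refer to dead (and possibly reused) stack memory at the call site.
    const D3D11_SUBRESOURCE_DATA *pdata = NULL;
    if (initial_data)
        pdata = &(D3D11_SUBRESOURCE_DATA) { .pSysMem = initial_data };
    hr = ID3D11Device_CreateBuffer(dev, &desc, pdata, &buf);
    // Fixed: declare the struct so its lifetime includes the API call.
    D3D11_SUBRESOURCE_DATA sdata = { .pSysMem = initial_data };
    hr = ID3D11Device_CreateBuffer(dev, &desc,
                                   initial_data ? &sdata : NULL, &buf);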
Should fix #6775.
This is documented as required (although we did not do it in
the old GL codepath, with no visible problems) and I have seen
transient artifacts after seeking which _appear_ to have gone
away after introducing this.
this migrates our current swift code to version 5 and 4. building is
supported from 10.12.6 and xcode 9.1 onwards.
dynamic linking is the new default, since Apple removed static libs
from their new toolchains and it's the recommended way.
additionally the found macOS SDK version is printed since it's an
important information for finding possible errors now.
Fixes #6470
We saw a segfault when trying to use the intel-media-driver (iHD)
rather than the normal intel va driver. This happened because the
iHD driver reports P010 (and maybe other formats) with multiple
layers to represent the interleaved UV plane. The normal va driver
reports one UV layer to match the plane.
This threw off my logic which assumed that the number of layers
could not exceed the number of planes.
There's a way one could fix this in a fully generalised form, but
I'm just going to do what the EGL path does and assume that:
* Layer 'n' is on Plane 'n' for n < total number of planes
* These layers always start at offset 0 on the plane
You can imagine ways that these assumptions are violated, but at
least the failure will look the same for both EGL and Vulkan
paths.
Today, we normally see a format error when probing because yuyv422
cannot be used, but it's in the normal set of probed formats.
This error is distracting and confusing, so only log probing errors
at the VERBOSE level.
Fixes #6411
This change introduces a vulkan interop path for the vaapi hwdec.
The basic principles are mostly the same as for EGL, with the
exported dma_buf being imported by Vulkan. The biggest difference
is that we cannot reuse the texture as we do with OpenGL - there's
no way to rebind a VkImage to a different piece of memory, as far
as I can see. So, a new texture is created on each map call.
I did not bother implementing a code path for the old libva API as
I think it's safe to assume any system with a working vulkan driver
will have access to a newer libva.
Note that we are using separate layers for the vaapi surface, just
as is done for EGL. This is because libplacebo doesn't support
multiplane images.
This change does not include format negotiation because no driver
implements the vk_ext_image_drm_format_modifier extension that
would be required to do that. In practice, the two formats we care
about (nv12, p010) work correctly, so we are not blocked. A separate
change had to be made in libplacebo to filter out non-fatal validation
errors related to surface sizes due to the lack of format negotiation.
New releases of VDPAU support decoding 4:4:4 content, and that comes
back as NV24 when using 'direct mode' in OpenGL Interop. That means we
need to be a little bit smarter about how we set up the OpenGL
textures.
If the compositor sends a configure event before the surface is initially
mapped, resize gets called before the egl_window gets created, resulting
in a crash in wl_egl_window_resize.
This was fixed back in 618361c697, but was reintroduced when the wayland
code was rewritten in 68f9ee7e0b.
While `ra` supports the concept of a texture as a storage
destination, it does not support the concept of a texture format
being usable for a storage texture. This can lead to us attempting
to create a texture from an incompatible format, with undefined
results.
So, let's introduce an explicit format flag for storage and use
it. In `ra_pl` we can simply reflect the `storable` flag. For
GL and D3D, we'll need to write some new code to do the compatibility
checks. I'm not going to do it here because it's not a regression;
we were already implicitly assuming all formats were storable.
Fixes #6657
This started as a desperate attempt to lower the memory requirement
of error diffusion, but later it turns out that this change also
improved the rendering performance a lot (by 40% as I tested).
Errors were stored in three uints before this change, each with 24-bit
precision. This change encodes them into a single uint, each with 8-bit
precision. This reduced the shared memory usage, as well as number of
atomic operations, all by three times.
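The packing idea, sketched in C for clarity (the real code is GLSL in the
compute shader; the exact bit layout, including the guard bits left to absorb
carries when several neighbours' errors are accumulated, is illustrative):
    #include <stdint.h>
    // Three 8-bit error terms share one 32-bit word, so a single atomic add
    // updates all channels at once instead of needing three atomics.
    static uint32_t pack_err(uint32_t r, uint32_t g, uint32_t b)
    {
        return (r & 0xFFu) | ((g & 0xFFu) << 10) | ((b & 0xFFu) << 20);
    }
    static void unpack_err(uint32_t v, uint32_t *r, uint32_t *g, uint32_t *b)
    {
        *r = v & 0xFFu;
        *g = (v >> 10) & 0xFFu;
        *b = (v >> 20) & 0xFFu;
    }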
Before this change, with the minimum required 32kb shared memory, only
the `simple` kernel can be used to render 1080p video, which is mostly
useless compared to `--dither=fruit`. After this change, 32kb can
handle `burkes` kernel for 1080p, or `sierra-lite` for 4K resolution.
error diffusion requires two texture rendering passes. The existing code
reuses `screen_tex` and creates another for such purpose. This works
generally well for opengl, but could potentially be problematic for
vulkan, due to its async nature.
This is a straightforward parallel implementation of error diffusion
algorithms in compute shader. Basically we use single work group with
maximal possible size to process the whole image. After a shift
mapping we are able to process all pixels column by column.
A large ring buffer is allocated in shared memory to speed things up.
However the size of required shared memory depends linearly on the
height of video window (or screen height in fullscreen mode). In case
there is not enough shared memory, it will fall back to `--dither=fruit`.
The maximal allowed work group size is hardcoded as 1024. Ideally we
could query `GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS`. But for whatever
reason, it seems most high end cards from nvidia and amd support only
the minimal required value, so I guess we can stick to it for now.
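For reference, querying the actual limit instead of hardcoding 1024 would look
like this (not what the code currently does):
    GLint max_invocations = 0;
    glGetIntegerv(GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, &max_invocations);
    // On many nvidia/amd GPUs this still just reports the GL minimum (1024).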
When the D3D11 backend was first written, SPIRV-Cross only had a C++ API
and no guarantee of API or ABI stability, so instead of using
SPIRV-Cross directly, mpv used an unofficial C wrapper called crossc.
Now that KhronosGroup/SPIRV-Cross#611 is resolved, SPIRV-Cross has an
official C API that can be used instead, so remove crossc and use
SPIRV-Cross directly.
The calculation of scale factor involves 32-bit float, and a strict
equality test will effectively ignore `--scaler-resizes-only` option
for some non-integer scale factor.
Fix this by using a non-strict equality check.
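Roughly, the check becomes something like this (the tolerance value is
illustrative):
    #include <math.h>
    #include <stdbool.h>
    // Treat two scale factors as equal if they only differ by float noise,
    // so --scaler-resizes-only isn't defeated by 32-bit rounding.
    static bool scale_factor_eq(double a, double b)
    {
        return fabs(a - b) < 1e-3;
    }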
since the non-native fs is a special legacy case it needs a special
quit routine. it indefinitely waited for an exit fs screen event to
shutdown properly, though that event only fires for the native fs.
now we check if we really are using a native fullscreen and if not
shut down immediately.
Fixes #6704
These were unnecessary for a couple of reasons, but it seems like the
old code went through a lot of effort to avoid duplicating the code to
print a RECT, even though the windowrc gets printed anyway at the end of
the function.
Avoid printing the same windowrc twice by only printing it when it gets
changed (in the w32->current_fs branch.)
This allows selecting the drm mode using a string specification. You
can either select the preferred mode, the mode with the highest
resolution, by specifying WxH[@R] or by its index in the list of modes
as before.
This was implemented by using OPT_STRING_VALIDATE for drm-mode,
instead of OPT_INT. Using a string here also prepares for future
additions to drm-mode that aim to allow specifying a mode by its
resolution.
It is useful when debugging to be able to force atomic off, or as a
workaround if atomic breaks for some user. Legacy modesetting is less
likely to break by virtue of being a less complex API.
The amount of code now present that's specific to Vulkan or OpenGL
has reached the point where we really want to split it out to
avoid a mess of #ifdefs.
At the same time, I'm moving the code to an api neutral location.
the force unwrapping of optionals caused many unpredictable segfaults
instead of gracefully exiting or falling back. besides that, it is bad
practice and the code is a lot more stable now.
This change updates the vulkan interop code to work with the
libplacebo based ra_vk, but also introduces direct VkImage
sharing to avoid the use of the intermediate buffer.
It is also necessary and desirable to introduce explicit
semaphore based synchronisation for operations on the shared
images.
Synchronisation means we can safely reuse the same VkImage for every
mapped frame, by ensuring the frame is copied to the VkImage before
mapping the next frame.
This functionality requires a 417.xx or newer nvidia driver, due to
bugs in the VkImage interop in the earlier 411 and 415 drivers.
It's definitely worth the effort, as the raw throughput is about
twice that of implementation using an intermediate buffer.
When interacting directly with libplacebo, we may need to pass a
pl_fmt based on an ra_format. Although the mapping is currently
trivial, it's worth wrapping to make it easy to adapt if this
changes in the future.
This commit rips out the entire mpv vulkan implementation in favor of
exposing lightweight wrappers on top of libplacebo instead, which
provides much of the same except in a more up-to-date and polished form.
This (finally) unifies the code base between mpv and libplacebo, which
is something I've been hoping to do for a long time.
Note: The ra_pl wrappers are abstract enough from the actual libplacebo
device type that we can in theory re-use them for other devices like
d3d11 or even opengl in the future, so I moved them to a separate
directory for the time being. However, the rest of the code is still
vulkan-specific, so I've kept the "vulkan" naming and file paths, rather
than introducing a new `--gpu-api` type. (Which would have ended up
with significantly more code duplication.)
Plus, the code and functionality is similar enough that for most users
this should just be a straight-up drop-in replacement.
Note: This commit excludes some changes; specifically, the updates to
context_win and hwdec_cuda are deferred to separate commits for
authorship reasons.
half of the materials we used were deprecated with macOS 10.14, broken
and not supported by run time changes of the macOS theme. furthermore
our styling names were completely inconsistent with the actual look
since macOS 10.14, eg ultradark got a lot brighter and couldn't be
considered ultradark anymore.
i decided to drop the old option --macos-title-bar-style and rework
the whole mechanism to allow more freedom. now materials and appearance
can be set separately. even if apple changes the look or semantics in
the future the new options can be easily adapted.
setting the appearance of the window to nil will always use the system
setting and changes on run time switches too. this is only the case for
macOS 10.14.
quite a lot of the title bar functionality and logic was within our
window. since we recently added a custom title bar class to our window
i decided to move all that functionality into that class and in its
own file.
this is also a preparation for the next commits.
i found the old pixel format creation a bit too messy. pixel format
attribute arrays and look ups were all over the place, the actual logic
what kind of format was created was inscrutable, the software pixel
format was hardcoded and no probing was done.
i split the attributes into mandatory and optional ones, one mandatory
for a hardware and software pixel format each, and moved those to the
top of the class. that way new attributes can be easily added to either
the mandatory or optional attributes and they don't mess up the actual
pixel creation logic any more. furthermore both hardware and software
pixel formats are being probed the same way now. to minimise code
duplications the probing was moved into its own function.
only the dragged types NSFilenamesPboardType and NSURLPboardType were
supported to be dropped on the window, which was inconsistent with the
dragged types the dock icon supports. the dock icon additionally supports
strings that represent a URL or a path. the system takes care of
validating the strings properly in the case of the dock icon, but in the
case of dropping on the window it needs to be done manually.
support for strings is added by also allowing the NSPasteboardTypeString
type and manually validating the strings. strings are split by new lines
and trimmed, to also support a list of URLs and paths. every new element
is checked for being a URL or path and only then added to the
playlist.
on init an NSWindow can't be placed on a non-Main screen NSScreen
outside the Main screen's frame bounds.
To fix this we just check if the target screen is different from the
main screen and if that is the case we check the actual position with
the expected one. if they are different we try to reposition the window
to the wanted position.
Fixes #6453
when quitting mpv in fullscreen the System always switches to Space 1
regardless of which Space mpv was on previous to switching to fs. to fix
this we close the window before quitting fs. that way the System
switches back to the Space mpv was previously on before fs.
new events were added but not fetched by the vo, because we didn't
signal the vo that new events were available.
actually wakeup the vo when new events are available.
Regression from 8e3308d687.
Broken cases were:
* --no-cursor-autohide acted like --cursor-autohide=always.
* --cursor-autohide-fs-only always hid the cursor if starting
non-fullscreen; entering fullscreen at least once fixed it.
The old size limit was chosen before LUT texture was supported in user
shader. At that time, the whole user shader would be compiled and run
on the GPU, which made large user shaders impractical to use.
With the introduction of LUT texture, the old size limit doesn't make
any sense. For example, a 1024x1024 rgba16f LUT will cost 32MB shader
size.
Fix this by increasing the size limit to a value that's unlikely to be
reached.
Manual changes done:
* Merged the interface-changes under the already master'd changes.
* Moved the hwdec-related option changes to video/decode/vd_lavc.c.
The modulo operator could be used to check if a size is a multiple of a
certain number.
The equal operator could be used to verify if the sizes of different
textures align.
Before this commit, texture offset is set after all source textures
are finalized. Which means CHROMA hooks won't be able to align with
luma planes. This could be problematic for chroma prescalers utilizing
information from luma plane.
Fix this by finding the reference texture early, and setting the global texture
offset early.
This allows context_drm_egl to use as many buffers as libgbm or the
swapchain_depth setting allows (whichever is smaller).
On pause and on still images (cover art etc.) to make sure that output does not
lag behind user input, the swapchain is drained and reverts to working in a dual
buffered (equivalent to swapchain-depth=1) manner.
When possible (swapchain-depth>=2), the wait on the page flip event is now not
done immediately after queueing, but is deferred to the next invocation of
swap_buffers. Which should give us more CPU time between invocations.
Although, since gbm_surface_has_free_buffers() can only tell us a boolean value
and not how many buffers we have left, we are forced to do this contortionist
dance where we first overshoot until gbm_surface_has_free_buffers() reports 0,
followed by immediately waiting so we can free a buffer, to be able to get the
deferred wait on page flip rolling.
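In pseudo-C, the dance looks roughly like this (helper names and the exact
structure are illustrative; gbm_surface_has_free_buffers() and
gbm_surface_lock_front_buffer() are the real GBM calls):
    // Overshoot: keep queueing flips until GBM runs out of free BOs ...
    while (!gbm_surface_has_free_buffers(p->gbm_surface)) {
        // ... then immediately wait for one page flip to complete
        // (drmHandleEvent() until the page_flip_handler has run), which
        // releases a buffer and gets the deferred-wait scheme rolling.
        wait_on_flip(p);
    }
    struct gbm_bo *bo = gbm_surface_lock_front_buffer(p->gbm_surface);
    queue_flip(p, bo);  // drmModePageFlip(..., DRM_MODE_PAGE_FLIP_EVENT, ...)
    // The wait for this flip is deferred to the next swap_buffers() call.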
With this commit we do not rely on the default vsync fences/latency emulation of
video/out/opengl/context.c, but supply our own, since the places we create and
wait for the fences needs to be somewhat different for best performance.
Minor fixes:
* According to GBM documentation all BOs gotten with
gbm_surface_lock_front_buffer must be released before gbm_surface_destroy is
called on the surface.
* We let the page flip handler function handle the waiting_for_flip flag.
This solves some edge cases when using files with very weird metadata
(e.g. MaxCLL 10k and so forth). Instead of just blindly seeding it with
the tagged metadata, forcibly set the initial state from the detected
values.
Rather than the linear cd/m^2 units, these (relative) logarithmic units
lend themselves much better to actually detecting scene changes,
especially since the scene averaging was changed to also work
logarithmically.
Gamut mapping can take very bright out-of-gamut colors into the
negatives, which completely destroys the color balance (which tone
mapping tries its best to preserve).
This change switches to a logarithmic mean to estimate the average
signal brightness. This handles dark scenes with isolated highlights
much more faithfully than the linear mean did, since the log of the
signal roughly corresponds to the perceptual brightness.
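The idea, reduced to a C sketch (the real implementation runs on the GPU; the
epsilon guarding log(0) is illustrative):
    #include <math.h>
    // Logarithmic (geometric) mean of the per-pixel brightness: isolated
    // highlights barely move it, unlike the arithmetic mean.
    float log_average(const float *luma, int n)
    {
        float log_sum = 0.0f;
        for (int i = 0; i < n; i++)
            log_sum += logf(fmaxf(luma[i], 1e-6f));
        return expf(log_sum / n);
    }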
In theory our "eye adaptation" algorithm works in both ways, both
darkening bright scenes and brightening dark scenes. But I've always
just prevented the latter with a hard clamp, since I wanted to avoid
blowing up dark scenes into looking funny (and full of noise).
But allowing a tiny bit of over-exposure might be a good thing. I won't
change the default just yet (better let users test), but a moderate
value of 1.2 might be better than the current 1.0 limit. Needs testing
especially on dark scenes.
The previous approach of using an FIR with tunable hard threshold for
scene changes had several problems:
- the FIR involved annoying hard-coded buffer sizes, high VRAM usage,
and the FIR sum was prone to numerical overflow which limited the
number of frames we could average over. We also totally redesign the
scene change detection.
- the hard scene change detection was prone to both false positives and
false negatives, each with their own (annoying) issues.
Scrap this entirely and switch to a dual approach of using a simple
single-pole IIR low pass filter to smooth out noise, while using a
softer scene change curve (with tunable low and high thresholds), based
on `smoothstep`. The IIR filter is extremely simple in its
implementation and has an arbitrarily user-tunable cutoff frequency,
while the smoothstep-based scene change curve provides a good, tunable
tradeoff between adaptation speed and stability - without exhibiting
either of the traditional issues associated with the hard cutoff.
Another way to think about the new options is that the "low threshold"
provides a margin of error within which we don't care about small
fluctuations in the scene (which will therefore be smoothed out by the
IIR filter).
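A C sketch of the combined behaviour (the real code is GLSL; parameter names
and the exact blend are illustrative):
    #include <math.h>
    static float smoothstep_f(float lo, float hi, float x)
    {
        x = fminf(fmaxf((x - lo) / (hi - lo), 0.0f), 1.0f);
        return x * x * (3.0f - 2.0f * x);
    }
    // state: smoothed brightness estimate; cur: this frame's measurement;
    // coeff: IIR coefficient derived from the user's cutoff frequency;
    // lo/hi: the low/high scene-change thresholds.
    static float update_estimate(float state, float cur, float coeff,
                                 float lo, float hi)
    {
        float iir = state + coeff * (cur - state);              // smooth noise
        float snap = smoothstep_f(lo, hi, fabsf(cur - state));  // 0..1 cut
        return iir + snap * (cur - iir);                        // jump on cuts
    }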
Instead of desaturating towards luma, we desaturate towards the
per-channel tone mapped version. This essentially provides a smooth
roll-off towards the "hollywood"-style (non-chromatic) tone mapping
algorithm, which works better for bright content, while continuing to
use the "linear" style (chromatic) tone mapping algorithm for primarily
in-gamut content.
We also split up the desaturation algorithm into strength and exponent,
which allows users to use less aggressive desaturation settings without
affecting the overall curve.
This is the naming xdg-shell stable adopted; it doesn’t make much sense
to keep using “shell” everywhere with all functions calling it
“wm_base”.
Finishes what 76211609e3 started.
some safety mechanisms for the async fs animation aren't needed anymore,
due to possible improved logic and slightly different behaviour on new
macOS versions. that safety fallback prevented the Split View because
it always returned a rectangle of the whole screen, instead of just
part/half of it.
Fixes #6443
Add "auto" the possible values of target-peak. The default value
for target_peak is to calculate the target using mp_trc_nom_peak.
Unfortunately, this default was outside the acceptable range of
10-10000 nits, which prevented its later reassignment. So add an
"auto" choice to target-peak which lets clients and scripts go back
to using the trc default after assigning a value.
I misunderstood how this extension works. If I understand it correctly
now, it's worse than I thought. The key thing is that the (ust, msc,
sbc) triple is not for a single swap event. Instead, (ust, msc) run
independently from sbc. Assuming a CFR display/compositor, this means
you can at best know the vsync phase and frequency, but not the exact
time a sbc changed value.
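For reference, this is how the triple is read (the typedef and
glXGetProcAddressARB() come from GLX/glxext.h; variable names are
illustrative):
    PFNGLXGETSYNCVALUESOMLPROC GetSyncValues = (PFNGLXGETSYNCVALUESOMLPROC)
        glXGetProcAddressARB((const GLubyte *)"glXGetSyncValuesOML");
    int64_t ust, msc, sbc;
    if (GetSyncValues && GetSyncValues(x11_display, glx_drawable,
                                       &ust, &msc, &sbc)) {
        // ust/msc describe the vsync clock: ust is the timestamp of the vsync
        // counted by msc. sbc counts completed swaps, but runs independently,
        // so it can't tell you *when* a given swap hit the screen. Vsync
        // duration can still be estimated from two successive readings:
        //   (ust - prev_ust) / (msc - prev_msc), when msc advanced.
    }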
There is GLX_INTEL_swap_event, which might work as expected, but it has
no EGL equivalent (while GLX_OML_sync_control does, in theory).
Redo the context_glx sync code. Now it's either more correct or less
correct. I wanted to add proper skip detection (if a vsync gets skipped
due to rendering taking too long and other problems), but it turned out
to be too complex, so only some unused fields in vo.h are left of it.
The "generic" skip detection has to do.
The vsync_duration field is also unused by vo.c.
Actually this seems to be an improvement. In cases where the flip call
timing is off, but the real driver-level timing apparently still works,
this will not report vsync skips or higher vsync jitter anymore. I could
observe this with screenshots and fullscreen switching. On the other
hand, maybe it just introduces an A/V offset or so.
Why the fuck can't there be a proper API for retrieving these
statistics? I'm not even asking for much.
Use the extension to compute the (hopefully correct) video delay and
vsync phase.
This is very fuzzy, because the latency will suddenly be applied after
some frames have already been shown. This means there _will_ be "jumps"
in the time accounting, which can lead to strange effects at start of
playback (such as making initial "dropped" etc. frames worse). The only
reasonable way to fix this would be running a few dummy frame swaps at
start of playback until the latency is known. The same happens when
unpausing.
This only affects display-sync mode.
Correct functioning was not confirmed. It only "looks right". I don't have
the equipment to make scientifically correct measurements.
A potentially bad thing is that we trust the timestamps we're receiving.
Out of bounds timestamps could wreak havoc. On the other hand, this will
probably cause the higher level code to panic and just disable DS.
As a further caveat, this makes a bunch of assumptions about UST
timestamps. If there are delayed frames (i.e. we skipped one or more
vsyncs), the latency logic is mostly reset. There is no attempt to make
the vo.c skipped vsync logic to use this. Also, the latency computation
determines a vsync duration, and there's no effort to reconcile or share
the vo.c logic for determining vsync duration.
This option has been deprecated upstream for a long time, probably
doesn't even work anymore, and won't work moving forwards as we replace
the vulkan code by libplacebo wrappers.
I haven't removed the option completely yet since in theory we could
still add support for e.g. a native glslang wrapper in the future. But
most likely the future of this code is deletion.
As an aside, fix an issue where the man page didn't mention d3d11.
This commit bumps the libmpv version to 1.102
drm-osd-plane -> drm-draw-plane
drm-video-plane -> drm-drmprime-video-plane
drm-osd-size -> drm-draw-surface-size
"draw plane", as in the plane that OpenGL draws to, whether it be
video + OSD or just OSD.
"drmprime video plane", as in the plane used for hwdec video imported
via drmprime.
"draw surface size", as in the size of the surface used for the draw plane
The new names are invariant whether or not hwdec_drmprime_drm is being
used or not. The original naming was very confusing, as when doing
regular rendering (swdec or vaapi) the video would be displayed on the
"OSD plane", and the "Video plane" would remain unused.
Add general primary/overlay plane option to drm-osd-plane-id and
drm-video-plane-id, so that the user can just request any usable
primary or overlay plane for either of these two options. This should
be somewhat more user-friendly (especially as neither of these two
options currently have a useful help function), as usually you would
only be interested in the type of the plane, and not exactly which
plane gets picked.
By design, some vulkan implementations block until vsync during
vkAcquireNextImageKHR. Since mpv only considers the time that
`swap_buffers` spent blocking as constituting part of the vsync, we can
help it out a bit by pre-emptively calling this function here in order
to improve the accuracy of vsync jitter measurements on vulkan.
(If it fails, we just ignore the error and have the user call it a
second time later - maybe it will work then)
On my system this drops vsync-jitter from ~0.030 to ~0.007, an accuracy
of +/- 100μs. (Which *might* have something to do with the fact that
this is the polling interval for command polling)
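A sketch of the early acquire (vkAcquireNextImageKHR() is the real Vulkan
call; the surrounding handles are illustrative):
    // Right after presenting, try to grab the next swapchain image so any
    // blocking-until-vsync happens here, inside swap_buffers, where mpv's
    // timing code can see it.
    uint32_t imgidx = 0;
    VkResult res = vkAcquireNextImageKHR(vk->dev, vk->swapchain, UINT64_MAX,
                                         acquire_sem, VK_NULL_HANDLE, &imgidx);
    if (res != VK_SUCCESS) {
        // Not fatal: ignore the error and simply acquire again later.
    }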
Makes performance slightly better when using multiple queues by avoiding
unnecessary semaphores due to bad queue selection.
Also remove an aeons-old workaround for an nvidia bug that only ever
existed in the earliest beta vulkan drivers anyway.
We are currently unnecessarily including vulkan headers even when
not building with vulkan support. I also guarded the GL header
inclusion even though this doesn't appear to break anything today.
Fixes #6330.
This makes the default fit on screen, autofit and window-scale
changing behavior to use the screen working area, instead of
the whole screen area.
As a result mpv window doesn't cover the taskbar now when opening
videos with size larger than the screen size.
The actual behavior now is the same as expected behavior for
usecases 1-4 from #4363.
This commit also removes the screenrc from w32 struct.
The screen rect can now be retrieved via `get_screen_area` function,
which was renamed from `update_screen_rect`.
On a multi-monitor system, if the user moved the window between
monitors, this function will return the current screen area under
the window, and not the screen area from monitor specified by
`--screen` option. The `--screen` option sets the initial monitor
the mpv window is displayed on.
Returning -1 in a function with return type bool is the same as
returning true. In the error paths, false should be returned to
indicate that something went wrong.
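In other words:
    #include <stdbool.h>
    static bool probe(void)
    {
        return -1;   // converts to true: any nonzero value does, so the
                     // error path accidentally reported success
    }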
depending on the capabilities of the system and testing of various
attributes the resulting CGL pixel format can change. due to that
probing it can be helpful to know which pixel format is used to create
the CGL context. added some verbose logging that outputs the final pixel
format.
this adds support for GPU rendered screenshots, DR (theoretically) and
possibly other advanced functions in the future that need to be executed
from the rendering thread.
additionally frames that would be off screen or not be displayed when on
screen are being dropped now.
I was inconsistent about this originally, as the functionality was
moved into the core spec in 1.1 and so both suffixed and unsuffixed
versions of everything exist and can be mixed together.
There's no reason to fail to build with 1.0.39+ so I'm fixing the
names.
Currently, the error paths in init() are a bit confusing, and we can
end up trying to pop the current context when there is no context,
which leads to distracting error messages.
I also added an explicit path to return early if the GPU backend is
not OpenGL or Vulkan. It's pointless to do any other cuda init
after that point. (Of course, someone could write more interops.)
Fixes #6256
Since 810acf32d6 video_plane can be NULL
under some circumstances. While there is a check in init, init treats
this as an error condition and would call uninit, which in turn calls
disable_video_plane, which would then segfault. Fix this by including
a NULL check inside disable_video_plane, so that it doesn't try to
disable what isn't there.
Despite their place in the tree, hwdecs can be loaded and used just
fine by the vulkan GPU backend.
In this change we add Vulkan interop support to the cuda/nvdec hwdec.
The overall process is mostly straight forward, so the main observation
here is that I had to implement it using an intermediate Vulkan buffer
because the direct VkImage usage is blocked by a bug in the nvidia
driver. When that gets fixed, I will revisit this.
Nevertheless, the intermediate buffer copy is very cheap as it's all
device memory from start to finish. Overall CPU utilisation is pretty
much the same as with the OpenGL GPU backend.
Note that we cannot use a single intermediate buffer - rather there
is a pool of them. This is done because the cuda memcpys are not
explicitly synchronised with the texture uploads.
In the basic case, this doesn't matter because the hwdec is not
asked to map and copy the next frame until after the previous one
is rendered. In the interpolation case, we need extra future frames
available immediately, so we'll be asked to map/copy those frames
and vulkan will be asked to render them. So far, harmless right? No.
All the vulkan rendering, including the upload steps, are batched
together and end up running very asynchronously from the CUDA copies.
The end result is that all the copies happen one after another, and
only then do the uploads happen, which means all textures are uploaded
the same, final, frame data. Whoops. Unsurprisingly this results in
the jerky motion because every 3/4 frames are identical.
The buffer pool ensures that we do not overwrite a buffer that is
still waiting to be uploaded. The ra_buf_pool implementation
automatically checks if existing buffers are available for use and
only creates a new one if it really has to. It's hard to say for sure
what the maximum number of buffers might be but we believe it won't
be so large as to make this strategy unusable. The highest I've seen
is 12 when using interpolation with tscale=bicubic.
A future optimisation here is to synchronise the CUDA copies with
respect to the vulkan uploads. This can be done with shared semaphores
that would ensure the copy of the second frames only happens after the
upload of the first frame, and so on. This isn't trivial to implement
as I'd have to first adjust the hwdec code to use asynchronous cuda;
without that, there's no way to use the semaphore for synchronisation.
This should result in fewer intermediate buffers being required.
This is arguably a little contrived, but in the case of CUDA interop,
we have to track additional state on the cuda side for each exported
buffer. If we want to be able to manage buffers with an ra_buf_pool,
we need some way to keep that CUDA state associated with each created
buffer. The easiest way to do that is to attach it directly to the
buffers.
The CUDA/Vulkan interop works on the basis of memory being exported
from Vulkan and then imported by CUDA. To enable this, we add a way
to declare a buffer as being intended for export, and then add a
function to do the export.
For now, we support the fd and Handle based exports on Linux and
Windows respectively. There are others, which we can support when
a need arises.
Also note that this is just for exporting buffers, rather than
textures (VkImages). Image import on the CUDA side is supposed to
work, but it is currently buggy and waiting for a new driver release.
Finally, at least with my nvidia hardware and drivers, everything
seems to work even if we don't initialise the buffer with the right
exportability options. Nevertheless I'm enforcing it so that we're
following the spec.
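In rough outline, the Linux export path looks like this (these are the
standard VK_KHR_external_memory / VK_KHR_external_memory_fd structures and
entry points; surrounding variables are illustrative, and vkGetMemoryFdKHR
must be loaded via vkGetDeviceProcAddr):
    // Mark the allocation as exportable at vkAllocateMemory time:
    VkExportMemoryAllocateInfoKHR export_info = {
        .sType = VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO_KHR,
        .handleTypes = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT_KHR,
    };
    VkMemoryAllocateInfo alloc_info = {
        .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
        .pNext = &export_info,
        .allocationSize = size,
        .memoryTypeIndex = mem_type,
    };
    vkAllocateMemory(device, &alloc_info, NULL, &mem);
    // Later, export the memory as a file descriptor for CUDA to import:
    VkMemoryGetFdInfoKHR fd_info = {
        .sType = VK_STRUCTURE_TYPE_MEMORY_GET_FD_INFO_KHR,
        .memory = mem,
        .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT_KHR,
    };
    int fd = -1;
    pfn_vkGetMemoryFdKHR(device, &fd_info, &fd);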
Since the code just broke out of the loop on a match rather than jumping
straight to the end of the function body, it ended up hitting the code
path for when the end of the list was reached.
since we draw our own title bar we lose the standard functionality of
the system provided title bar. because of that we have to reimplement
the functionality of double clicking the title bar. depending on the
system preferences we want to minimize, zoom or do nothing.
Fixes #6223
On a multi monitor setup, when the center of the window was going off
screen, the icc profile would always switch to the profile of the first
screen.
This fixes the issue by defaulting the value to the current screen.
This was based on the texture height, which was a mistake. In some cases
it could exceed the actual size of the buffer, leading to a vulkan API
error. This didn't seem to cause any problems in practice, since a
too-large synchronization is just bad for performance and shouldn't do
any harm internally, but either way, it was still undefined behavior to
submit a barrier outside of the buffer size.
Fix the calculation, thus fixing this issue.
Since linear downscaling makes sense to handle independently from
linear/sigmoid upscaling, we split this option up. Now,
linear-downscaling is its own option that only controls linearization
when downscaling and nothing more. Likewise, linear-upscaling /
sigmoid-upscaling are two mutually exclusive options (the latter
overriding the former) that apply only to upscaling and no longer
implicitly enable linear light downscaling as well.
The old behavior was very confusing, as evidenced by issues such
as #6213. The current behavior should make much more sense, and only
minimally breaks backwards compatibility (since using linear-scaling
directly was very uncommon - most users got this for free as part of
gpu-hq and relied only on that).
Closes #6213.
when entering a Split View a windowDidEnterFullScreen event happens
without a previous toggleFullScreen call. in that case it tries to stop
an animation that was never initiated by us and basically breaks the
system initiated fullscreen, or in this case the Split View. immediately
after entering the fullscreen it tries to stop the animation and
resizes the window, which causes the window to exit fullscreen. only
try to stop an animation that was initiated by us and is safe to stop.
by default the pixel format creation falls back to software renderer
when everything fails. this is mostly needed for VMs. additionally one
can directly request an sw renderer or exclude it entirely.
moved the retrieval of the macOS specific options from the backend
initialisation to the initialisation of the CocoaCB class, the earliest
point possible. this way macOS specific options can be used for the
opengl context creation for example.
This is to improve the experience when running with default settings
on a driver that doesn't have any overlay planes (or indeed only one
plane), but still supports DRM atomic. Since the drmprime video plane
is set to pick an overlay plane by default it would fail on these
drivers due to not being able to create any atomic context. Users with
such cards had to specify --drm-video-plane-id manually to some bogus
value (it's not used after all).
The "video" plane is only ever used by the drmprime-drm hwdec interop,
which is not used at all in the typical usecase where everything is
actually rendered on to the "OSD" plane using EGL, so having an atomic
context without the "video" plane should be fine most of the time.
For vec3, the alignment and size differ. The current code will pack a
struct like { vec3; float; vec2 } into 8 machine words, whereas the spec
would only use 6.
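Worked out for that example under std140/std430 rules (vec3 aligns to 16 bytes
but only occupies 12):
    // vec3  a;  -> offset  0, size 12
    // float b;  -> offset 12 (align 4), size 4
    // vec2  c;  -> offset 16 (align 8), size 8
    //              total 24 bytes = 6 words
    //
    // Treating vec3 as if it also had *size* 16 pushes b to offset 16 and
    // c to offset 24, for 32 bytes = 8 words -- the mismatch noted above.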
This actually fixes a real bug: The only place in the code I could find
where it was conceivably possible that a vec3 is followed by a float was
when using --gpu-dumb-mode in combination with --gamma-factor, and only
when --gpu-api=vulkan. So it's no surprise nobody ran into it yet.
These used to be unsupported long ago, but it seems glslang added
support in the meantime. (I don't know which version, but I'm guessing
it was long enough ago that we don't have to add a feature check)
Should hopefully help make push constant layouts more robust against
possible bugs either in our code or in the driver.
Certain low-end Mali GPUs have a rather low precision and overflow
during the PRNG calculations, thereby breaking e.g. deband-grain.
Modify the permute() to avoid this; this does not noticeably impact the
quality of the PRNG output.
This problem was observed on:
GL_VENDOR='ARM', GL_RENDERER='Mali-T720'
GL_VERSION='OpenGL ES 3.1 v1.r15p0-00rel0.bdd9e62cdc8c88e0610a16b5901161e9'
Upstream has this now. Didn't really make any difference for me (except
making the polar compute shader 2%-3% faster), but maybe it does for
somebody else.
When using multiple compute shaders as part of the same pass, there can
be a conflict in the block sizes. In the problematic case, the HDR
detection shader can collide with the polar sampling shader. In this
case, the solution is clear - the passes that can handle any size should
"give in" and not overwrite the block sizes.
Fixes #6083.
instead of force unwrapping and chaining the optional vars in our
containsMouseLocation function, safely unwrap and guard the resulting
var.
Fixes #6062
Add another parameter to mpv_opengl_drm_params to hold the FD to the
render node, so that the fd can be passed to hwdec_vaegl.
The render node is opened in context_drm_egl and inferred from the
primary device fd using drmGetRenderDeviceNameFromFd.
The previous code did not save enough information about the old state,
and could end up changing what plane the fbcon's FB got attached to,
or in worse case causing a blank screen (observed in some multi-screen
setups on Sandy Bridge).
In addition, refactor the handling of drmModeModeInfo property blobs so
they don't leak, and enable reuse of already created blobs.
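A sketch of that reuse pattern (helper name is hypothetical): create the
mode blob once, cache its id, and destroy it only when it is replaced or
at shutdown.

    #include <xf86drmMode.h>

    static uint32_t ensure_mode_blob(int fd, const drmModeModeInfo *mode,
                                     uint32_t *cached_blob_id)
    {
        if (*cached_blob_id)
            return *cached_blob_id;   // reuse the blob created earlier
        if (drmModeCreatePropertyBlob(fd, mode, sizeof(*mode), cached_blob_id))
            return 0;                 // creation failed
        return *cached_blob_id;
    }

    // On uninit, or when the mode changes:
    //     drmModeDestroyPropertyBlob(fd, cached_blob_id);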
init is a reserved keyword and Swift 4.2 got a bit stricter about using
it. this could be fixed by adding apostrophes around init, but that
makes the code uglier. hence i just renamed init to initialized and, for
consistency, uninit to uninitialized.
Fixes #5899
the pre-allocation was needed because the layer allocated an opengl
context asynchronously itself and we couldn't influence that. so we had to start
the core after the context was actually allocated. furthermore a window,
view and layer hierarchy had to be created so the layer would create
a context.
now, instead of relying on the layer to create a context, we do this
manually and re-use that context later when the layer wants to create
one asynchronously itself.
This sacrifices some dynamic range for well-behaved sources, but
prevents catastrophic desaturation on badly mastered / too bright
sources. I think that's the better trade-off. This makes the
desaturation algorithm much "safer" to deploy by default, as well. One
could even argue going up to strength 1.0, which works better for some
sources but worse for others. But I think the current strength is the
best trade-off even after this change.
For some reason, the X default modifier map binds shift+tab to
ISO_Left_Tab instead of the regular Tab. So to get Shift+TAB recognized
by mpv, we also need to accept ISO_Left_Tab.
This patch matches what other programs such as Qt do, which treat Tab
and ISO_Left_Tab as the same thing.
God only knows why the distinction exists, and why X decides to mix up
its bindings like that.
Fixes #5849
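A sketch of the equivalence (the helper is hypothetical; mpv's actual
X11 key lookup table is more involved):

    #include <X11/Xlib.h>
    #include <X11/keysym.h>

    static KeySym normalize_tab(KeySym sym)
    {
        // X sends ISO_Left_Tab for Shift+Tab under the default modifier
        // map; treat it exactly like a plain Tab, as Qt and others do.
        return sym == XK_ISO_Left_Tab ? XK_Tab : sym;
    }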
If anyone happened to build with GL disabled, this could lead to option
changes not always refreshing the screen. Since vo_gpu is always enabled
now (just not necessarily any backend for it), we can drop the #if
completely.
(The way this works is a bit idiotic - the option cache exists only to
grab the change notification, which will trigger a redraw and make
vo_gpu update its own second copy of them. But at least it avoids some
layering issues for now.)
This was always a legacy thing. Remove it by applying an orgy of
mp_get_config_group() calls, and sometimes m_config_cache_alloc() or
mp_read_option_raw().
win32 changes untested.
With the advent of actual HDR devices, my real measured ICC profile has
an "infinite" contrast, since the display is completely off on pure
black inputs. 100k:1 might not be enough, so let's just bump it up to
1m:1 to be safe.
Also, improve the logging in the case that the detected contrast is too
high by default.
First fix a memory leak when skipping cursor planes by inverting the
check and putting everything but the free in the body.
Then fix a missed drmModeFreePlane by simply copying the fields of the
drmModePlane we are interested in and freeing the drmModePlane struct
early.
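A sketch of that second part, inside the plane enumeration loop (the
is_cursor_plane() helper is hypothetical): copy out the fields we need,
free the drmModePlane right away, and only then decide whether to skip.

    drmModePlane *raw = drmModeGetPlane(fd, plane_ids[i]);
    if (!raw)
        continue;
    uint32_t id             = raw->plane_id;
    uint32_t possible_crtcs = raw->possible_crtcs;
    drmModeFreePlane(raw);            // freed on every path, no leak
    if (is_cursor_plane(fd, id))
        continue;                     // skipping the plane can't leak it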
Until recently, ao_lavc and vo_lavc started encoding whenever the core
happened to send them data. Since audio and video are not initialized at
the same time, and the muxer was not necessarily opened when the first
encoder started to produce data, the resulting packets were put into a
queue. As soon as the muxer was opened, the queue was flushed.
Change this to make the core wait with sending data until all encoders
are initialized. This has the advantage that we don't need to queue up
the packets.
The user won't want to have those in the video (I think). The core can
sporadically issue redraws, which is what you want for actual playback,
but not in encode mode. vo_lavc can explicitly detect those and skip
them. It only requires switching to a more advanced internal VO API.
The comments in vo.h are because vo_lavc draws to one of the images in
order to render OSD. This is OK, but might come as a surprise to whoever
calls draw_frame, so document it. (Current callers are OK with it.)
Inspired by kmscube, first try to pick the Encoder and CRTC already
associated with the selected Connector, if any. Otherwise try to find
the first matching encoder & CRTC like before.
The previous behavior had problems when using atomic
modesetting (crtc_setup_atomic) when we picked an Encoder & CRTC that
was currently being used by the fbcon together with another Encoder.
drmModeSetCrtc was able to "steal" the CRTC in this case, but using
atomic modesetting we do not seem to get this behavior automatically.
This should also improve behavior somewhat when run on a multi-screen
setup with regard to deinit and VT switching (sometimes you still end
up with a blank screen where you previously had a cloned display of
your fbcon).
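Roughly the kmscube-style selection described above (error handling
trimmed):

    // Prefer the encoder/CRTC already wired to the connector; only fall
    // back to scanning if nothing is attached yet.
    uint32_t crtc_id = 0;
    drmModeEncoder *enc = NULL;
    if (connector->encoder_id)
        enc = drmModeGetEncoder(fd, connector->encoder_id);
    if (enc && enc->crtc_id)
        crtc_id = enc->crtc_id;
    if (enc)
        drmModeFreeEncoder(enc);
    if (!crtc_id) {
        // old behavior: find the first encoder & CRTC compatible with
        // this connector
    }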
Add some properties which were forgotten in crtc_setup_atomic.
In both, change to not use the DRM_MODE_PAGE_FLIP_EVENT | DRM_MODE_ATOMIC_NONBLOCK
flags. This should make it more similar to the drmModeSetCrtc it aims to
replace (takes effect directly, and is a blocking call). This also saves us the
trouble of having to set up a poll to wait for the pageflip, which would have
been necessary with DRM_MODE_PAGE_FLIP_EVENT, in both crtc_setup_atomic and
crtc_release_atomic.
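The commit call then looks roughly like this (presumably with
DRM_MODE_ATOMIC_ALLOW_MODESET, since a mode is being set): blocking and
immediate, mirroring drmModeSetCrtc.

    int ret = drmModeAtomicCommit(fd, request,
                                  DRM_MODE_ATOMIC_ALLOW_MODESET, NULL);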
This patch will make sure that the video plane is hidden when unused.
When using high resolution modes, typically UHD, and embedding mpv,
having the video plane sitting in the back when you don't play any video
is eating a lot of memory bandwidth for compositing.
This patch makes sure that the video plane is simply disabled before and
after playback.
This commit adds atomic modesetting when using the atomic renderer.
This is actually needed when using an osd with a smaller size than the screen resolution.
It will also make the drm atomic path more consistent.
We currently use the primary / overlay plane drm objects, assuming that the primary plane is the osd and the overlay plane is the video.
This commit does the following:
- replace the primary / overlay plane members with osd and video plane members, without that assumption
- add two more options to determine which of the primary / overlay planes is associated with osd / video
- default osd to overlay and video to primary if unspecified
This patch adds:
- a DRM connector object to the atomic context
- an fd property to the drm atomic object, as well as a method to read blob-type properties
This ensures that the proper connector is picked up, especially when specifying it
on the command line, and also makes sure we're using the right one when embedding
with interop into an application.
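A sketch of reading a blob-type property from a DRM object (helper name
is hypothetical, cleanup simplified):

    #include <string.h>
    #include <xf86drmMode.h>

    static drmModePropertyBlobPtr get_blob_prop(int fd, uint32_t obj_id,
                                                uint32_t obj_type,
                                                const char *name)
    {
        drmModeObjectProperties *props =
            drmModeObjectGetProperties(fd, obj_id, obj_type);
        drmModePropertyBlobPtr blob = NULL;
        for (uint32_t i = 0; props && i < props->count_props && !blob; i++) {
            drmModePropertyPtr prop = drmModeGetProperty(fd, props->props[i]);
            if (prop && (prop->flags & DRM_MODE_PROP_BLOB) &&
                strcmp(prop->name, name) == 0)
                blob = drmModeGetPropertyBlob(fd, props->prop_values[i]);
            drmModeFreeProperty(prop);
        }
        drmModeFreeObjectProperties(props);
        return blob;
    }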
That new API was introduced and allows passing several native resources.
This uses that mechanism for drm resources rather than the deprecated
opengl-cb structs.
This patch therefore adds two structs that can be used with the drm atomic interop:
- mpv_opengl_drm_params: holds all the drm handles
- mpv_opengl_drm_osd_size: holds the osd layer size
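Usage from an embedding application would look roughly like this (the
struct names are from the commit above; the field names, the
MPV_RENDER_PARAM_DRM_* constants and gl_init_params are from memory and
should be treated as illustrative - check render_gl.h for the real
definitions):

    mpv_opengl_drm_params drm_params = {
        .fd = drm_fd,
        .crtc_id = crtc_id,
        .connector_id = connector_id,
    };
    mpv_opengl_drm_osd_size osd_size = { .width = 1920, .height = 1080 };

    mpv_render_param params[] = {
        {MPV_RENDER_PARAM_API_TYPE, MPV_RENDER_API_TYPE_OPENGL},
        {MPV_RENDER_PARAM_OPENGL_INIT_PARAMS, &gl_init_params},
        {MPV_RENDER_PARAM_DRM_DISPLAY, &drm_params},
        {MPV_RENDER_PARAM_DRM_OSD_SIZE, &osd_size},
        {0}
    };
    mpv_render_context *rctx;
    mpv_render_context_create(&rctx, mpv, params);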
This commit adds a drm-osd-size=WxH parameter to the command line, which
allows defining the OSD plane dimensions. The OSD can be upscaled to the
screen resolution when having the OSD at video resolution is too heavy.
This is especially useful for UHD modes on embedded devices where
the GPU cannot handle UHD modes at a decent framerate.
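For example (the OSD is drawn at 1080p and scaled up to the UHD mode):

    mpv --vo=gpu --gpu-context=drm --drm-osd-size=1920x1080 video.mkv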
Define a hard-coded value for gl_NumWorkGroups if it is not available.
This adds an additional requirement of needing a shader recompile for
all window size changes.
This was considered a worthwhile compromise, as currently e.g. d3d11
completely lacked any peak computation - this is a major quality of
life upgrade.
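One way to hard-code it (and the reason any window resize now forces a
shader recompile) is to inject the dispatch size into the shader
prelude; a sketch, not necessarily the exact mechanism used:

    char prelude[64];
    snprintf(prelude, sizeof(prelude),
             "#define gl_NumWorkGroups uvec3(%u, %u, %u)\n",
             groups_x, groups_y, groups_z);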
This is for working around bugs in certain Android devices. At least one
device fails to sort EGLConfigs by size, so eglChooseConfig() ends up
choosing a config with 5/6/5 bits per r/g/b component. The other
attributes in the affected EGLConfigs did not look like they should
affect the sorting process as specified by the EGL 1.4 standard.
The device was reported as:
Sony Xperia Z3 Tablet Compact
Firmware 6.0.1 build number 23.5.A.1.291
GL_VERSION='OpenGL ES 3.0 V@140.0 AU@ (GIT@I741a3d36ca)'
GL_VENDOR='Qualcomm'
GL_RENDERER='Adreno (TM) 330'
Other Qualcomm/Adreno devices have been reported as unaffected by this
(including some with same GL_RENDERER string).
"Fix" this by always requiring at least 8 bit. This means it would fail
on devices which cannot provide this. We're fine with this.
mpv-android/mpv-android#112
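The attribute list then simply insists on 8 bits per channel (a sketch;
the real code varies surface/renderable type by context):

    EGLint attrs[] = {
        EGL_SURFACE_TYPE,    EGL_WINDOW_BIT,
        EGL_RED_SIZE,        8,
        EGL_GREEN_SIZE,      8,
        EGL_BLUE_SIZE,       8,
        EGL_RENDERABLE_TYPE, EGL_OPENGL_ES2_BIT,
        EGL_NONE
    };
    EGLConfig config;
    EGLint num = 0;
    if (!eglChooseConfig(display, attrs, &config, 1, &num) || num < 1) {
        // the device can't provide 8 bits per component: fail hard
    }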
This was supposed to be a replacement for encode_lavc_discontinuity()
(so we don't need to store last_video_in_pts in a way which requires
synchronization). Unfortunately, VOCTRL_RESET is also called before
termination, and even though it shouldn't matter as far as the VO API is
concerned, it does. It's because vo_lavc.c buffers a frame to compute
the frame duration.
Drop this code. The consequence is that it appears to encode 2 frames
with the same PTS if multiple files are encoded into one. Before this,
it merely dropped a frame (maybe the first of every subsequent file, not
sure).
The main change is that we wait with opening the muxer ("writing
headers") until we have data from all streams. This fixes race
conditions at init due to broken assumptions in the old code.
This also changes a lot of other stuff. I found and fixed a few API
violations (often things for which better mechanisms were invented, and
the old ones are not valid anymore). I try to get away from the public
mutex and shared fields in encode_lavc_context. For now it's still
needed for some timestamp-related fields, but most are gone. It also
removes some bad code duplication between audio and video paths.
1. I want to get away from mp_image_params (maybe).
2. For encoding mode, it's convenient to get the nominal_fps, which is
an mp_image field, and not in mp_image_params.
Also rename stereo3d to stereo_in. The only real change is that the
vo_gpu OSD code now uses the actual stereo 3D mode, instead of the
--video-stereo-mode value. (Why does this vo_gpu code even exist?)
Attempts to enable the following things:
- let a render API user do "proper" audio-sync video timing itself
- make it possible to not re-render repeated frames if the API user has
better mechanisms available (e.g. waiting for a DisplayLink cycle
instead)
- allow the user to delay or skip redraws if it makes sense
Basically this information will be needed by API users who want to be
"clever" about optimizing timing and rendering.
In MPV_RENDER_PARAM_ADVANCED_CONTROL mode, a simple update callback does
not necessarily make the API user redraw. So handle it differently.
For one, setting vo->want_redraw already uses the "normal" redraw path,
which will call draw_frame() and set next_frame.
Then there are redraws triggered by mpv_render_context_set_parameter(),
which are on the render thread, and would require a separate mechanism.
I decided this is not really a good idea, since it's not even clear that
setting an arbitrary parameter should redraw. Also this could trigger an
unbounded number of redraws. The user can trigger redraws manually if
really needed, depending on the parameter that's being set. If we really
wanted vo_libmpv to do this, we could add a new flag like need_redraw,
which would be 4 lines of code or so.
update() used to require the lock, but now it doesn't matter. It's
slightly better to do it outside of the lock now, in case the update
callback reschedules before returning, and the user render thread tries
to acquire the still held lock (which would require 2 more context
switches).
DR (letting the decoder allocate texture memory) requires running the
allocation on the render thread. This is rather hard with the render
API, because the user controls this thread and when it's entered. It was
not possible until now.
This commit adds a bunch of infrastructure to make this possible. We add
a new optional mode (MPV_RENDER_PARAM_ADVANCED_CONTROL) which basically
lets the user's render thread and libmpv agree how this should be done.
Misuse would lead to deadlocks. To make this less likely, strictly
document thread safety/locking issues. In particular, document which
libmpv functions can be called without issues. (The rest has to be
assumed unsafe.)
The worst issue is destruction of the render context while video is
still active. To avoid certain unintended recursive locks (i.e.
deadlocks, unless we'd make the locks recursive), make the update
callback lock separate. Make "killing" the video chain asynchronous, so
we can do extra work while video is being destroyed.
Because losing wakeups is a big deal, setting the update callback now
triggers a wakeup. (It would have been better if the wakeup callback
were a parameter to mpv_render_context_create(), but too late.)
This commit does not add DR yet; the following commit does this.
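From the API user's side, the new mode is opted into at creation time,
roughly like this (gl_init_params, on_render_update and user_ctx are
placeholders; the callback must only wake up your render thread, not
render directly):

    int one = 1;
    mpv_render_param params[] = {
        {MPV_RENDER_PARAM_API_TYPE, MPV_RENDER_API_TYPE_OPENGL},
        {MPV_RENDER_PARAM_OPENGL_INIT_PARAMS, &gl_init_params},
        {MPV_RENDER_PARAM_ADVANCED_CONTROL, &one},
        {0}
    };
    mpv_render_context *rctx;
    mpv_render_context_create(&rctx, mpv, params);
    // As described above, setting the callback now triggers a wakeup.
    mpv_render_context_set_update_callback(rctx, on_render_update, user_ctx);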
I suppose this doesn't matter in practice, i.e. even if calls relayed
over the dispatch queue will cause WndProc to be invoked, WndProc will
never run for a longer time.
Preparation for removing recursion support from the dispatch queue code.
Normally, MPV_RENDER_PARAM* arguments are copied, unless documented
otherwise. Of course we can't copy X11 Display or Wayland wl_display
types, but for arguments that are "summarized" in a struct (like
MPV_RENDER_PARAM_OPENGL_FBO), a copy is expected.
Also add some unused infrastructure to make this explicit, and to make
it easier to add parameter types that require a copy.
Untested.
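For example, because MPV_RENDER_PARAM_OPENGL_FBO is copied, it is fine
to pass a pointer to a stack variable that goes out of scope right after
the call:

    void render_frame(mpv_render_context *rctx, int w, int h)
    {
        mpv_opengl_fbo fbo = {.fbo = 0, .w = w, .h = h};  // copied by mpv
        mpv_render_param params[] = {
            {MPV_RENDER_PARAM_OPENGL_FBO, &fbo},
            {0}
        };
        mpv_render_context_render(rctx, params);
    }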
The CUDA dynamic loader was broken out of ffmpeg into its own repo
and package. This gives us an opportunity to re-use it in mpv and
remove our custom loader logic.
Hardware decoding things often need access to additional handles from
the windowing system, such as the X11 or Wayland display when using
vaapi. The opengl-cb had nothing dedicated for this, and used the weird
GL_MP_MPGetNativeDisplay GL extension (which was mpv specific and not
officially registered with OpenGL).
This was awkward, and a pain due to having to emulate GL context
behavior (like needing a TLS variable to store context for the pseudo GL
extension function). In addition (and not inherently due to this), we
could pass only one resource from mpv builtin context backends to
hwdecs. It was also all GL specific.
Replace this with a newer mechanism. It works for all RA backends, not
just GL. The API user can explicitly pass the objects at init time via
mpv_render_context_create(). Multiple resources are naturally possible.
The API uses MPV_RENDER_PARAM_* defines, but internally we use strings.
This is done for 2 reasons: 1. trying to leave libmpv and internal
mechanisms decoupled, 2. not having to add public API for some of the
internal resource types (especially D3D/GL interop stuff).
To remain sane, drop support for obscure half-working opengl-cb things,
like the DRM interop (was missing necessary things), the RPI window
thing (nobody used it), and obscure D3D interop things (not needed with
ANGLE, others were undocumented). In order not to break ABI and the C
API, we don't remove the associated structs from opengl_cb.h.
The parts which are still needed (in particular DRM interop) need to be
ported to the render API.
we rendered on the displaylink thread which wasn't the best idea. if
rendering took too long or was blocking it also blocked the displaylink
callback. when that happened new vsyncs were reported delayed or not at
all. consequently the mpv_render_context_report_swap function wasn't
called consistently and that could cause bad video playback. so the
rendering is moved to a dedicated dispatch queue. furthermore the update
callback starts a layer update directly instead of the displaylink
callback, making the rendering a bit more consistent.
Right now the atomic request is only alive during the renderloop.
We want it to stay alive until the drm egl context is destroyed, because
some properties might still be set upon interop close.
This patch keeps the request alive outside the renderloop as well.
The context uninit will commit the last request.
commit 2edf00f changed the MPV_EVENT_SHUTDOWN behaviour slightly, such
that it will only be sent once. cocoa-cb relied on it being sent
continuously till all mpv_handles are destroyed. now it manually shuts
down and destroys the mpv_handle after the animation instead of relying
on this removed behaviour.
This passed the display size as source size to the renderer, which is of
course nonsense. I don't know what I was doing in 569383bc54.
Yet another fix for those damn anamorphic videos.
As a somewhat redundant/cosmetic change, use image_params instead of
real_image_params in the code above. They should have the same dimensions
(but possibly different formats when doing hw decoding), and mixing them
is confusing. p->image_params wins because it's shorter.
Actually fixes #5619.
There is some sort-of awkwardness here, because option access needs to
happen in a synchronized manner, and the framedrop flag is not in the VO
option struct. Remove the mp_read_option_raw() call and the awkward
change notification via VO_EVENT_WIN_STATE from command.c, and pass it
through as a new vo_frame flag.
Removes the awkward notification through VO_EVENT_WIN_STATE.
Unfortunately, some awkwardness remains in mp_property_display_fps(),
because the property has conflicting semantics with the option.
We took the storage size instead of the display size for "unscaled"
screenshots. Even if it's called "unscaled", it's still supposed to
scale to compensate for aspect ratio.
(How many commits fixing anamorphic screenshots in various situations
are there?)
Fixes #5619.
the first mouse events, that try to hide the title bar, could happen
before the title bar was actually initialised. that caused our hiding
code to access a nil value. check for an available title bar before
trying to hide it.
there were actually a few small problems. the fatalError() function
wasn't supposed to be called there and caused an "Illegal instruction".
this was replaced by a print and exit() call. the second problem was
that cocoa returns a kCGLBadPixelFormat instead of a kCGLBadAttribute
error, which broke our check, immediately exited our loop, and no working
pixel format was ever created. the third problem was that macOS 10.12
didn't return any errors but also didn't return a pixel format, which
also broke our check. now the code checks for both cases.
Fixes #5631
mouse events and the tracking area are needed for (un)hiding the new
title bar, which was broken when input-cursor=no was set. no tracking
area was ever created and set which completely deactivated any mouse
events. the specific mouse event functions were already deactivated
proactively and have the needed check. no events are being propagated to
the mpv core when input-cursor=no is set, even with an active tracking
area.
The s_size() function, whatever it was supposed to do, caused the
surface size to increase indefinitely. Fix by making it always use the
maximum size that was last used, which is less optimal (many surface
recreations when making the window slowly larger), but at least it
works.
The rotation code didn't mark the old surface as invalid when it was
freed, so it could destroy random other surfaces (let's call it dangling
ID).
Also, the required rotation surface size depends on the rotation mode,
so recreate the surfaces on rotation as well.
We use triple buffering for this interop and we were only unreffing the
data structures, which doesn't destroy the drm buffers.
This patch makes sure that we release the drm buffers at the end of
playback.
we activated the rendering loop a bit too early and it was possible that
the first draw function was called before it was actually ready. this
was a remnant from the old init routine and should have been changed.
start the queue on reconfigure instead of preinit.
on a file change and when the aspect ratio of the window changed, the
first live resize state had a wrong aspect ratio because the new aspect
ratio was only set after the first resize. just set the new content
frame before the resize.
i tried being smart and handling aspect ratio differences manually via
atomic drawing and resizing to aspect-fitted frames. there were a few
issues with that, like unexpected visibility of certain system GUI
elements on entering fullscreen, or visually dropped frames due to the
atomic drawing. now we rely on system mechanics to keep the proper
aspect ratio of our layer, the recommended way. as a side effect it also
fixes a segfault.
Fixes #5581
It turns out that Mali drivers are likely broken, and do not return
GBM_FORMAT_ARGB8888 (they return GBM_FORMAT_XRGB8888) when getting
EGL_NATIVE_VISUAL_ID for any EGLConfig, even though the resulting
EGLConfig appears to be capable of alpha.
It could also be potentially useful to allow an ARGB EGLConfig used
with an XRGB framebuffer on some platforms, so we do that. (cf. weston)
Unrelated indentation fix in gbm_format_to_string.
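One way to express the relaxed match (a sketch of the idea, not the
exact code):

    EGLint visual = 0, alpha = 0;
    eglGetConfigAttrib(display, config, EGL_NATIVE_VISUAL_ID, &visual);
    eglGetConfigAttrib(display, config, EGL_ALPHA_SIZE, &alpha);
    // Accept ARGB visuals, and also XRGB visuals whose config still has
    // alpha bits (the broken Mali case).
    bool ok = visual == GBM_FORMAT_ARGB8888 ||
              (visual == GBM_FORMAT_XRGB8888 && alpha > 0);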
the NSWindowButton enum was moved to be a member of NSWindow and renamed
to ButtonType in SDK 10.13. apparently that wasn't documented anywhere,
not even in the SDK changes document, and the official documentation
makes it look like it was always like this. the old NSWindowButton enum
is still around on SDK 10.13 though, or at least got a typealias, so we
will just use that.
The purpose of the new API is to make it usable with other APIs than
OpenGL, especially D3D11 and vulkan. In theory it's now possible to
support other vo_gpu backends, as well as backends that don't use the
vo_gpu code at all.
This also aims to get rid of the dumb mpv_get_sub_api() function. The
life cycle of the new mpv_render_context is a bit different from
mpv_opengl_cb_context, and you explicitly create/destroy the new
context, instead of calling init/uninit on an object returned by
mpv_get_sub_api().
In order to make the render API generic, it's annoyingly EGL-style, and
requires you to pass in API-specific objects to generic functions. This
is to avoid explicit objects like the internal ra API has, because that
sounds more complicated and annoying for an API that's supposed to never
change.
The opengl_cb API will continue to exist for a bit longer, but
internally there are already a few tradeoffs, like reduced
thread-safety.
Mostly untested. Seems to work fine with mpc-qt.
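The life cycle difference in a nutshell (sketch):

    // Old (opengl-cb):
    //   mpv_opengl_cb_context *cb = mpv_get_sub_api(mpv, MPV_SUB_API_OPENGL_CB);
    //   mpv_opengl_cb_init_gl(cb, NULL, get_proc_address, NULL);
    //   ... mpv_opengl_cb_draw(cb, fbo, w, h); ...
    //   mpv_opengl_cb_uninit_gl(cb);
    //
    // New (render API): the context is created and destroyed explicitly.
    //   mpv_render_context *rctx;
    //   mpv_render_context_create(&rctx, mpv, params);
    //   ... mpv_render_context_render(rctx, frame_params); ...
    //   mpv_render_context_free(rctx);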
when resizing async it's possible that the layer, and the underlying gl
surface, is stretched on an aspect ratio change. to prevent that we do
an atomic resize (resize and draw at the same time). usually at most one
unique frame should be dropped, but depending on performance it's
possible that more are dropped.
the title bar is now within the window bounds instead of outside. same
as QuickTime Player. it supports several standard styles, two dark and
two light ones. additionally we have properly rounded corners now and
the borderless window also has the proper window shadow.
Also make the earliest supported macOS version 10.10.
Fixes #4789, #3944
By blocking the VT switcher signal in the VO thread we get less races
and other oddities.
This gets rid of tearing (at least for me) when VT switching with
--gpu-context=drm.
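A sketch of the blocking itself (assuming the VT switcher uses the usual
SIGUSR1/SIGUSR2 pair; the point is that the VO thread no longer gets
interrupted by these signals at awkward moments):

    #include <pthread.h>
    #include <signal.h>

    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    sigaddset(&set, SIGUSR2);
    pthread_sigmask(SIG_BLOCK, &set, NULL);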