Commit Graph

4242 Commits

Author SHA1 Message Date
Anton Kindestam ba2dee38fb hwdec_drmprime_drm: Missing NULL-check on drm_atomic_context video_plane
Since 810acf32d6 video_plane can be NULL
under some circumstances. While there is a check in init, init treats
this as an error condition and would call uninit, which in turn calls
disable_video_plane, which would then segfault. Fix this by including
a NULL check inside disable_video_plane, so that it doesn't try to
disable what isnt' there.
2018-10-25 13:50:09 +02:00
Philip Langdale da1073c247 vo_gpu: vulkan: hwdec_cuda: Add support for Vulkan interop
Despite their place in the tree, hwdecs can be loaded and used just
fine by the vulkan GPU backend.

In this change we add Vulkan interop support to the cuda/nvdec hwdec.

The overall process is mostly straight forward, so the main observation
here is that I had to implement it using an intermediate Vulkan buffer
because the direct VkImage usage is blocked by a bug in the nvidia
driver. When that gets fixed, I will revist this.

Nevertheless, the intermediate buffer copy is very cheap as it's all
device memory from start to finish. Overall CPU utilisiation is pretty
much the same as with the OpenGL GPU backend.

Note that we cannot use a single intermediate buffer - rather there
is a pool of them. This is done because the cuda memcpys are not
explicitly synchronised with the texture uploads.

In the basic case, this doesn't matter because the hwdec is not
asked to map and copy the next frame until after the previous one
is rendered. In the interpolation case, we need extra future frames
available immediately, so we'll be asked to map/copy those frames
and vulkan will be asked to render them. So far, harmless right? No.

All the vulkan rendering, including the upload steps, are batched
together and end up running very asynchronously from the CUDA copies.

The end result is that all the copies happen one after another, and
only then do the uploads happen, which means all textures are uploaded
the same, final, frame data. Whoops. Unsurprisingly this results in
the jerky motion because every 3/4 frames are identical.

The buffer pool ensures that we do not overwrite a buffer that is
still waiting to be uploaded. The ra_buf_pool implementation
automatically checks if existing buffers are available for use and
only creates a new one if it really has to. It's hard to say for sure
what the maximum number of buffers might be but we believe it won't
be so large as to make this strategy unusable. The highest I've seen
is 12 when using interpolation with tscale=bicubic.

A future optimisation here is to synchronise the CUDA copies with
respect to the vulkan uploads. This can be done with shared semaphores
that would ensure the copy of the second frames only happens after the
upload of the first frame, and so on. This isn't trivial to implement
as I'd have to first adjust the hwdec code to use asynchronous cuda;
without that, there's no way to use the semaphore for synchronisation.
This should result in fewer intermediate buffers being required.
2018-10-22 21:35:48 +02:00
Philip Langdale 621389134a vo_gpu: vulkan: Add a function to get the device UUID
We need this to do device matching for the cuda interop.
2018-10-22 21:35:48 +02:00
Philip Langdale a782197d21 vo_gpu: vulkan: Add arbitrary user data for an ra_vk_buf
This is arguably a little contrived, but in the case of CUDA interop,
we have to track additional state on the cuda side for each exported
buffer. If we want to be able to manage buffers with an ra_buf_pool,
we need some way to keep that CUDA state associated with each created
buffer. The easiest way to do that is to attach it directly to the
buffers.
2018-10-22 21:35:48 +02:00
Philip Langdale 93f800a00f vo_gpu: vulkan: Add support for exporting buffer memory
The CUDA/Vulkan interop works on the basis of memory being exported
from Vulkan and then imported by CUDA. To enable this, we add a way
to declare a buffer as being intended for export, and then add a
function to do the export.

For now, we support the fd and Handle based exports on Linux and
Windows respectively. There are others, which we can support when
a need arises.

Also note that this is just for exporting buffers, rather than
textures (VkImages). Image import on the CUDA side is supposed to
work, but it is currently buggy and waiting for a new driver release.

Finally, at least with my nvidia hardware and drivers, everything
seems to work even if we don't initialise the buffer with the right
exportability options. Nevertheless I'm enforcing it so that we're
following the spec.
2018-10-22 21:35:48 +02:00
Niklas Haas facc63b862 vo_gpu: vulkan: suppress bogus error message on --vulkan-device
Since the code just broke out of the loop on a match rather than jumping
straight to the end of the function body, it ended up hitting the code
path for when the end of the list was reached.
2018-10-21 23:33:36 +02:00
Akemi 9a52b90f04 cocoa-cb: fix double clicking the title bar
since we draw our own title bar we lose the standard functionality of
the system provided title bar. because of that we have to reimplement
the functionality of double clicking the title bar. depending on the
system preferences we want to minimize, zoom or do nothing.

Fixes #6223
2018-10-21 23:33:25 +02:00
BtbN f3098cd61b vo_gpu: vulkan: fix strncpy truncation in spirv_compiler_init
Fixes GCC8 warning
../video/out/gpu/spirv.c: In function 'spirv_compiler_init':
../video/out/gpu/spirv.c:68:9: warning: 'strncpy' specified bound 32 equals destination size [-Wstringop-truncation]
2018-10-21 23:33:10 +02:00
slatchurie 24e21a0236 x11: fix icc profile when the window goes near off screen
On a multi monitor setup, when the center of the window was going off
screen, the icc profile would always switch to the profile of the first
screen.

This fixes the issue by defaulting the value to the current screen.
2018-10-21 23:32:50 +02:00
Niklas Haas 0f3e25cb0a vo_gpu: vulkan: fix the buffer size on partial upload
This was pased on the texture height, which was a mistake. In some cases
it could exceed the actual size of the buffer, leading to a vulkan API
error. This didn't seem to cause any problems in practice, since a
too-large synchronization is just bad for performance and shouldn't do
any harm internally, but either way, it was still undefined behavior to
submit a barrier outside of the buffer size.

Fix the calculation, thus fixing this issue.
2018-10-19 22:58:16 +02:00
Niklas Haas 7ad60a7c5e vo_gpu: split --linear-scaling into two separate options
Since linear downscaling makes sense to handle independently from
linear/sigmoid upscaling, we split this option up. Now,
linear-downscaling is its own option that only controls linearization
when downscaling and nothing more. Likewise, linear-upscaling /
sigmoid-upscaling are two mutually exclusive options (the latter
overriding the former) that apply only to upscaling and no longer
implicitly enable linear light downscaling as well.

The old behavior was very confusing, as evidenced by issues such
as #6213. The current behavior should make much more sense, and only
minimally breaks backwards compatibility (since using linear-scaling
directly was very uncommon - most users got this for free as part of
gpu-hq and relied only on that).

Closes #6213.
2018-10-19 22:58:01 +02:00
Nicolas F 448cbd472e x11_common: replace atoi with strtoul
Using strtol and strtoul is allegedly better practice, and I'm going
for strtoul here because I'm pretty sure X11 displays cannot be in
the negative.
2018-10-19 22:57:49 +02:00
Niklas Haas 104b510774 vo_gpu: opengl: fix segfault when gl->DeleteSync is unavailable
This deinit code was never checked, so this line would always crash on
implementations without support for sync objects.

Fixes #6197.
2018-10-16 01:57:49 +03:00
Akemi 2b0b9bb6a1 cocoa-cb: fix side by side Split View
when entering a Split View a windowDidEnterFullScreen event happens
without a previous toggleFullScreen call. in that case it tries to stop
an animation that was never initiated by us and basically breaks the
system initiated fullscreen, or in this case the Split View. immediately
after entering the fullscreen it tries top stop the animation and
resizes the window, which causes the window to exit fullscreen. only
try to stop an animation that was initiated by us and is safe to stop.
2018-10-02 20:13:19 +03:00
Rodger Combs 1842a932d3 {mac,cocoa}: trim trailing null out of macosx_icon when loading it
This prevents crashes when loading the application icon image.

Suggested-by: Akemi <der.richter@gmx.de>
2018-10-02 00:20:43 +03:00
Akemi 8d2d0f0640 cocoa-cb: add Apple Software Renderer support
by default the pixel format creation falls back to software renderer
when everything fails. this is mostly needed for VMs. additionally one
can directly request an sw renderer or exclude it entirely.
2018-09-30 17:13:34 +03:00
Akemi 44e49aee3c cocoa-cb: move macOS option retrieval to the earliest point possible
moved the retrieval of the macOS specific options from the backend
initialisation to the initialisation of the CocoaCB class, the earliest
point possible. this way macOS specific options can be used for the
opengl context creation for example.
2018-09-30 17:13:34 +03:00
Anton Kindestam 810acf32d6 drm_atomic: Allow to create atomic context w/o drmprime video plane
This is to improve the experience when running with default settings
on a driver that doesn't have any overlay planes (or indeed only one
plane), but still supports DRM atomic. Since the drmprime video plane
is set to pick an overlay plane by default it would fail on these
drivers due to not being able to create any atomic context. Users with
such cards had to specify --drm-video-plane-id manually to some bogus
value (it's not used after all).

The "video" plane is only ever used by the drmprime-drm hwdec interop,
which is not used at all in the typical usecase where everything is
actually rendered on to the "OSD" plane using EGL, so having an atomic
context without the "video" plane should be fine most of the time.
2018-09-30 14:22:49 +03:00
Niklas Haas 730469cb29 vo_gpu: fix vec3 packing in UBOs/push_constants
For vec3, the alignment and size differ. The current code will pack a
struct like { vec3; float; vec2 } into 8 machine words, whereas the spec
would only use 6.

This actually fixes a real bug: The only place in the code I could find
where it was conceivably possible that a vec3 is followed by a float was
when using --gpu-dumb-mode in combination with --gamma-factor, and only
when --gpu-api=vulkan. So it's no surprised nobody ran into it yet.
2018-09-29 20:15:10 +02:00
Niklas Haas 39d10e3359 vo_gpu: use explicit offsets for push constants
These used to be unsupported long ago, but it seems glslang added
support in the meantime. (I don't know which version, but I'm guessing
it was long enough ago that we don't have to add a feature check)

Should hopefully help make push constant layouts more robust against
possible bugs either in our code or in the driver.
2018-09-29 20:15:10 +02:00
sfan5 a4c5a4486e vo_gpu: adjust PRNG variant used by GL shaders
Certain low-end Mali GPUs have a rather low precision and overflow
during the PRNG calculations, thereby breaking e.g. deband-grain.
Modify the permute() to avoid this, this does not impact the
quality of PRNG output (noticeably).

This problem was observed on:
GL_VENDOR='ARM', GL_RENDERER='Mali-T720'
GL_VERSION='OpenGL ES 3.1 v1.r15p0-00rel0.bdd9e62cdc8c88e0610a16b5901161e9'
2018-09-26 23:53:05 +03:00
Niklas Haas 48c38f730d mp_image: strip all HDR peak information from SDR clips
By overriding it with 1.0 (aka SDR). This prevents blowing up on
mistagged clips.

Fixes #6111
2018-09-05 22:09:30 +02:00
Niklas Haas a5b0d59084 vo_gpu: switch to optimization level performance
Upstream has this now. Didn't really make any different for me (except
making the polar compute shader 2%-3% faster), but maybe it does for
somebody else.
2018-09-01 16:14:22 +02:00
Niklas Haas 1890ca024e vo_gpu: avoid overwriting compute shader block sizes
When using multiple compute shaders as part of the same pass, there can
be a conflict in the block sizes. In the problematic case, the HDR
detection shader can collide with the polar sampling shader. In this
case, the solution is clear - the passes that can handle any size should
"give in" and not overwrite the block sizes.

Fixes #6083.
2018-08-26 12:32:20 +02:00
Tom Yan d48786f682 wscript: split egl-android from android 2018-08-20 17:16:22 +02:00
Akemi 049816c145 cocoa-cb: fix crash on macOS 10.10
the colorspace of the layer is only available on 10.11 and upwards.

Fixes #6041
2018-08-11 12:59:50 +02:00
Akemi 6bf0edc59c cocoa-cb: fix crash when no screen is available
instead of force unwrapping and chaining the optional vars in our
containsMouseLocation function, safely unwrap and guard the resulting
var.

Fixes #6062
2018-08-11 12:59:44 +02:00
Anton Kindestam 351c083487 hwdec_vaegl: Fix VAAPI EGL interop used with gpu-context=drm
Add another parameter to mpv_opengl_drm_params to hold the FD to the
render node, so that the fd can be passed to hwdec_vaegl.

The render node is opened in context_drm_egl and inferred from the
primary device fd using drmGetRenderDeviceNameFromFd.
2018-07-09 02:33:35 +03:00
Anton Kindestam 7beee68f8d context_drm_egl: Fix CRTC setup and release code when using atomic
The previous code did not save enough information about the old state,
and could end up changing what plane the fbcon:s FB got attached to,
or in worse case causing a blank screen (observed in some multi-screen
setups on Sandy Bridge).

In addition refactor the handling of drmModeModeInfo property blobs to
not leak, as well as enable reuse of already created blobs.
2018-07-09 02:17:47 +03:00
Anton Kindestam 1298b9d201 context_drm_egl: Fix some memory leaks on error exit
Fix some memory leaks on error exit in crtc_setup_atomic and
crtc_release_atomic.
2018-07-09 02:17:47 +03:00
Jan Ekström 1a893e8257 gpu: prefer 16bit floating point FBO formats to 16bit integer ones
According to earlier discussions, this can improve visual quality.
This only changes the preferred order of the formats, not the
formats themselves.
2018-07-08 16:49:23 +03:00
Akemi cd893626cb cocoa-cb: fix building with Swift 4.2
init is a reserved keyword and Swift 4.2 got a bit stricter about using
it. this could be fixed by adding apostrophes around init but makes the
code uglier. hence i just renamed init to initialized and for
consistency uninit to uninitialized.

Fixes #5899
2018-06-12 01:57:34 +03:00
Akemi 5865086aa8 cocoa-cb: remove pre-allocation of window, view and layer
the pre-allocation was needed because the layer allocated a opengl
context async itself and we couldn't influence that. so we had to start
the core after the context was actually allocated. furthermore a window,
view and layer hierarchy had to be created so the layer would create
a context.
now, instead of relying on the layer to create a context we do this
manually and re-use that context later when the layer wants to create
one async itself.
2018-06-12 01:51:01 +03:00
Akemi 20dffe0621 vo_libmpv: pass vo struct to the control callback 2018-06-12 01:51:01 +03:00
Anton Kindestam 157b242289 hwdec_drmprime_drm: Do not show error message during probing
Change the log-level of an error message that would sometimes show up
during hwdec probing, and could be misleading.
2018-06-08 22:13:39 +03:00
sfan5 7f625ea29b vo_sdl: add support for screensaver VOCTRL's
Previously vo_sdl would unconditonally disable the screensaver,
ignoring the `stop-screensaver` option.
2018-06-02 23:34:38 +03:00
Niklas Haas 5056777b86 vo_gpu: desaturate after peak detection
This sacrifices some dynamic range for well-behaved sources, but
prevents catastrophic desaturation on badly mastered / too bright
sources. I think that's the better trade-off. This makes the
desaturation algorithm much "safer" to deploy by default, as well. One
could even argue going up to strength 1.0, which works better for some
sources but worse for others. But I think the current strength is the
best trade-off even after this change.
2018-05-31 03:13:50 +03:00
wm4 eb08cd75c1 input: add a define for the number of mouse buttons and use it
(Why the fuck are there up to 20 mouse buttons?)
2018-05-25 10:17:06 +02:00
wm4 25525cee62 vd_lavc: minor simplification for get_format fallback
The default get_format does exactly do this, so we don't need to
duplicate it.

The only potential problem with this is that the logic doesn't entirely
prevent that the avcodec_default_get_format hw_device_ctx path is
triggered, which would probably work, but has unknown consequences and
interactions. But the way the logic currently works it can't happen,
provided the hwaccel metadata libavcodec provides is correct.
2018-05-25 10:17:06 +02:00
Niklas Haas fea87c4253 x11: support Shift+TAB
For some reason, the X default modifier map binds shift+tab to
ISO_Left_Tab instead of the regular Tab. So to get Shift+TAB recognized
by mpv, we also need to accept ISO_Left_Tab.

This patch matches what other programs like e.g. Qt do, which treat Tab
and ISO_Left_Tab as the same thing.

God only knows why the distinction exists, and why X decides to mix up
its bindings like that.

Fixes #5849
2018-05-24 22:12:02 +03:00
Rostislav Pehlivanov 0b3d1d6faf wayland_common: require wl_compositor of version 3
We already did require it, in order to call set_buffer_scale. This
just makes it error out more gracefully.
2018-05-20 02:48:23 +03:00
Rostislav Pehlivanov 43d575616c wayland_common: fix maximized state
Window size should not change if the window has been maximized or tiled.
2018-05-20 02:48:23 +03:00
Niklas Haas 05b392bc94 vo_gpu: allow higher icc-contrast and improve logging
With the advent of actual HDR devices, my real measured ICC profile has
an "infinite" contrast, since the display is completely off on pure
black inputs. 100k:1 might not be enough, so let's just bump it up to
1m:1 to be safe.

Also, improve the logging in the case that the detected contrast is too
high by default.
2018-05-17 22:56:45 +03:00
Anton Kindestam a645c6b2ec drm_atomic: Fix memory leaks in drm_atomic_create
First fix a memory leak when skipping cursor planes by inverting the
check and putting everything, but the free, in the body.

Then fix a missed drmModeFreePlane by simply copying the fields of the
drmModePlane we are interested in and freeing the drmModePlane struct
early.
2018-05-08 02:24:40 +03:00
wm4 e02c9b9902 build: make encoding mode non-optional
Makes it easier to not break the build by confusing the ifdeffery.
2018-05-03 01:08:44 +03:00
wm4 0ab3184526 encode: get rid of the output packet queue
Until recently, ao_lavc and vo_lavc started encoding whenever the core
happened to send them data. Since audio and video are not initialized at
the same time, and the muxer was not necessarily opened when the first
encoder started to produce data, the resulting packets were put into a
queue. As soon as the muxer was opened, the queue was flushed.

Change this to make the core wait with sending data until all encoders
are initialized. This has the advantage that we don't need to queue up
the packets.
2018-05-03 01:08:44 +03:00
wm4 958053ff56 vo_lavc: explicitly skip redraw and repeated frames
The user won't want to have those in the video (I think). The core can
sporadically issue redraws, which is what you want for actual playback,
but not in encode mode. vo_lavc can explicitly detect those and skip
them. It only requires switching to a more advanced internal VO API.

The comments in vo.h are because vo_lavc draws to one of the images in
order to render OSD. This is OK, but might come as a surprise to whoever
calls draw_frame, so document it. (Current callers are OK with it.)
2018-05-03 01:08:44 +03:00
wm4 f18c4175ad encode: remove old timestamp handling
This effectively makes --ocopyts the default. The --ocopyts option
itself is also removed, because it's redundant.
2018-05-03 01:08:44 +03:00
Anton Kindestam a9c2a8e162 drm_atomic: Disallow selecting cursor planes using the options 2018-05-01 20:48:02 +03:00
Anton Kindestam 02d40eee1b drm_common: Be smarter when deciding on which CRTC and Encoder to use
Inspired by kmscube, first try to pick the Encoder and CRTC already
associated with the selected Connector, if any. Otherwise try to find
the first matching encoder & CRTC like before.

The previous behavior had problems when using atomic
modesetting (crtc_setup_atomic) when we picked an Encoder & CRTC that
was currently being used by the fbcon together with another Encoder.
drmModeSetCrtc was able to "steal" the CRTC in this case, but using
atomic modesetting we do not seem to get this behavior automatically.

This should also improve behavior somewhat when run on a multi screen
setup with regards to deinit and VT switching (still sometimes you end
up with a blank screen where you previously had a cloned display of
your fbcon)
2018-05-01 20:48:02 +03:00