4010 Commits

Author SHA1 Message Date
Niklas Haas a6aab5dfd6 vo_gpu: vulkan: refine queue family selection algorithm
This gets confused by e.g. SPARSE_BIT on the TRANSFER_BIT, leading to
situations where "more specialized" is ambiguous and the logic breaks
down. So to fix it, only compare the subset we care about.
2017-12-25 00:47:53 +01:00
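The idea in a6aab5dfd6 of comparing only the capability bits we care about can be sketched as follows. This is a minimal illustration, not the code from the commit: the `QF_*` constants are stand-ins for the Vulkan queue flag bits, and `pick_queue_family` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the Vulkan queue capability bits (hypothetical names). */
enum {
    QF_GRAPHICS = 1 << 0,
    QF_COMPUTE  = 1 << 1,
    QF_TRANSFER = 1 << 2,
    QF_SPARSE   = 1 << 3,
};

/* Count the set bits in v. */
static int popcount32(uint32_t v)
{
    int n = 0;
    for (; v; v &= v - 1)
        n++;
    return n;
}

/* Return the index of the most specialized family that supports `wanted`,
 * or -1 if none does. Only the bits in `relevant` take part in the
 * "more specialized" comparison, so unrelated bits such as QF_SPARSE
 * cannot make the result ambiguous. */
static int pick_queue_family(const uint32_t *flags, int count,
                             uint32_t wanted, uint32_t relevant)
{
    assert(count >= 0);
    int best = -1, best_bits = 0;
    for (int i = 0; i < count; i++) {
        if ((flags[i] & wanted) != wanted)
            continue;
        int bits = popcount32(flags[i] & relevant);
        if (best < 0 || bits < best_bits) {
            best = i;
            best_bits = bits;
        }
    }
    return best;
}
```

With families {graphics|compute|transfer|sparse, transfer|sparse, compute|transfer}, a transfer request resolves to the second family once the sparse bit is masked out; without the mask, the second and third families would tie.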
Niklas Haas 2d1769a534 vo_gpu: vulkan: prefer vkCmdCopyImage over vkCmdBlitImage
blit() implies scaling, copy() is the equivalent command to use when the
formats are compatible (same pixel size) and the rects have the same
dimensions.
2017-12-25 00:47:53 +01:00
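The condition described in 2d1769a534 (same pixel size, same rect dimensions) might be checked like this. The struct and function names are assumptions made for the sketch and are not taken from the mpv source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rectangle type: (x0, y0) inclusive, (x1, y1) exclusive. */
struct rect { int x0, y0, x1, y1; };

/* True if a plain copy can replace a blit: the formats have the same
 * pixel size and both rects have identical dimensions (no scaling). */
static bool can_use_copy(size_t src_pixel_size, size_t dst_pixel_size,
                         struct rect src, struct rect dst)
{
    assert(src_pixel_size > 0 && dst_pixel_size > 0);
    return src_pixel_size == dst_pixel_size &&
           (src.x1 - src.x0) == (dst.x1 - dst.x0) &&
           (src.y1 - src.y0) == (dst.y1 - dst.y0);
}
```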
Niklas Haas a42b8b1142 vo_gpu: attempt re-using the FBO format for p->output_tex
This allows RAs with support for non-opaque FBO formats to use a more
appropriate FBO format for the output tex, possibly enabling a more
efficient blit operation.

This requires distinguishing between real formats (which can be used to
create textures) and fake formats (e.g. ra_gl's FBO hack).
2017-12-25 00:47:53 +01:00
Niklas Haas 80540be211 vo_gpu: vulkan: properly depend on the swapchain acquire semaphore
This is now associated with the ra_tex directly and used in the correct
way, rather than hackily done from submit_frame.
2017-12-25 00:47:53 +01:00
Niklas Haas b138bdc01c vo_gpu: vulkan: use correct access flag for present
This needs VK_ACCESS_MEMORY_READ_BIT (spec)
2017-12-25 00:47:53 +01:00
Niklas Haas 8b0a111c59 vo_gpu: vulkan: make the swapchain more robust
Now handles both VK_ERROR_OUT_OF_DATE_KHR and VK_SUBOPTIMAL_KHR for both
vkAcquireNextImageKHR and vkQueuePresentKHR in the correct way.
2017-12-25 00:47:53 +01:00
Niklas Haas dcda8bd36a vo_gpu: aggressively prefer async compute
On AMD devices, we only get one graphics pipe but several compute pipes
which can (in theory) run independently. As such, we should prefer
compute shaders over fragment shaders in scenarios where we expect them
to be better for parallelism.

This is amusingly trivial to do, and actually improves performance even
in a single-queue scenario.
2017-12-25 00:47:53 +01:00
Niklas Haas bded247fb5 vo_gpu: vulkan: support split command pools
Instead of using a single primary queue, we generate multiple
vk_cmdpools and pick the right one dynamically based on the intent.
This has a number of immediate benefits:

1. We can use async texture uploads
2. We can use the DMA engine for buffer updates
3. We can benefit from async compute on AMD GPUs

Unfortunately, the major downside is that due to the lack of QF
ownership tracking, we need to use CONCURRENT sharing for all resources
(buffers *and* images!). In theory, we could try figuring out a way to
get rid of the concurrent sharing for buffers (which is only needed for
compute shader UBOs), but even so, the concurrent sharing mode doesn't
really seem to have a significant impact over here (nvidia). It's
possible that other platforms may disagree.

Our deadlock-avoidance strategy is stupidly simple: Just flush the
command every time we need to switch queues, and make sure all
submission and callbacks happen in FIFO order. This required lifting the
cmds_pending and cmds_queued out from vk_cmdpool to mpvk_ctx, and some
functions died/got moved as a result, but that's a relatively minor
change.

On my hardware this is a fairly significant performance boost, mainly
due to async transfers. (Nvidia doesn't expose separate compute queues
anyway). On AMD, this should be a performance boost as well due to async
compute.
2017-12-25 00:47:53 +01:00
Niklas Haas a3c9685257 vo_gpu: invalidate fbotex before drawing
Don't discard the OSD or pass_draw_to_screen passes though. Could be
faster on some hardware.
2017-12-25 00:47:53 +01:00
Niklas Haas 6186cc79e6 vo_gpu: allow invalidating FBO in renderpass_run
This is especially interesting for vulkan since it allows completely
skipping the layout transition as part of the renderpass. Unfortunately,
that also means it needs to be put into renderpass_params, as opposed to
renderpass_run_params (unlike #4777).

Closes #4777.
2017-12-25 00:47:53 +01:00
Niklas Haas fb1c7bde42 vo_gpu: vulkan: properly track image dependencies
This uses the new vk_signal mechanism to order all access to textures.
This has several advantages:

1. It allows real synchronization of image access across multiple frames
   when using multiple queues for parallelism.

2. It allows using events instead of pipeline barriers, which is a
   finer-grained synchronization primitive that allows for more
   efficient layout transitions over longer durations.

This commit also restructures some of the implicit transition code for
renderpasses to be more flexible and correct. (Note: this technically
drops the ability to transition the image out of undefined layout when
not blending, but that was a bug anyway and needs to be done properly)

vo_gpu: vulkan: remove no-longer-true optimization

The change to the output_tex format makes this no longer true, and it
actually seems to hurt performance now as well. So just don't do it
anymore. I also realized it hurts performance when drawing an OSD, so
it's probably not a good idea anyway.
2017-12-25 00:47:53 +01:00
Niklas Haas f2f91cf570 vo_gpu: vulkan: add a vk_signal abstraction
This combines VkSemaphores and VkEvents into a common umbrella
abstraction which can resolve to either.

We aggressively try to prefer VkEvents over VkSemaphores whenever the
conditions are met (1. we can unsignal the semaphore, i.e. it comes from
the same frame; and 2. it comes from the same queue).
2017-12-25 00:47:53 +01:00
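The two conditions listed in f2f91cf570 can be read as a simple predicate. The sketch below only illustrates that decision; the types and the `resolve_signal` name are hypothetical, and the real abstraction wraps VkEvent and VkSemaphore objects rather than an enum.

```c
#include <assert.h>
#include <stdbool.h>

enum signal_kind { SIGNAL_EVENT, SIGNAL_SEMAPHORE };

/* Where a signal is produced or consumed (hypothetical bookkeeping). */
struct signal_site { int frame; int queue; };

/* Prefer an event when producer and consumer share the frame and the
 * queue; otherwise fall back to a semaphore. */
static enum signal_kind resolve_signal(struct signal_site producer,
                                       struct signal_site consumer)
{
    assert(producer.frame >= 0 && consumer.frame >= 0);
    bool same_frame = producer.frame == consumer.frame;
    bool same_queue = producer.queue == consumer.queue;
    return (same_frame && same_queue) ? SIGNAL_EVENT : SIGNAL_SEMAPHORE;
}
```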
Niklas Haas 5feaaba0fd vo_gpu: vulkan: refactor command submission
Instead of being submitted immediately, commands are appended into an
internal submission queue, and the actual submission is done once per
frame (at the same time as queue cycling). Again, the benefits are not
immediately obvious because nothing benefits from this yet, but it will
make more sense for an upcoming vk_signal mechanism.

This also cleans up the way the ra_vk submission interacts with the
synchronization/callbacks from the ra_vk_ctx. Although currently, the
way the dependency is signalled is a bit hacky: normally it would be
associated with the ra_tex itself and waited on in the appropriate stage
implicitly. But that code is just temporary, so I'm keeping it in there
for a better commit order.
2017-12-25 00:47:53 +01:00
Niklas Haas 885497a445 vo_gpu: vulkan: reorganize vk_cmd slightly
Instead of associating a single VkSemaphore with every command buffer
and allowing the user to ad-hoc wait on it during submission, make the
raw semaphores-to-signal array work like the raw semaphores-to-wait-on
array. Doesn't really provide a clear benefit yet, but it's required for
upcoming modifications.
2017-12-25 00:47:53 +01:00
Niklas Haas 4e34615872 vo_gpu: vulkan: refactor vk_cmdpool
1. No more static arrays (deps / callbacks / queues / cmds)
2. Allows safely recording multiple commands at the same time
3. Uses resources optimally by never over-allocating commands
2017-12-25 00:47:53 +01:00
wm4 3412c1a1aa Restore Libav support
Libav has been broken due to the hwdec changes. This was always a
temporary situation (depended on pending patches to be merged), although
it took a bit longer. This also restores the travis config.

One code change is needed in vd_lavc.c, because it checks the AV_PIX_FMT
for videotoolbox (as opposed to the mpv format identifier), which is not
available in Libav. Add an ifdef; the affected code is for a deprecated
option anyway.
2017-12-21 19:45:32 +01:00
wm4 2ce7face96 hwdec: remove unused fields
These were replaced by a different mechanism, but the old fields weren't
removed.
2017-12-21 19:31:36 +01:00
Aman Gupta 7e2252688b vo_mediacodec_embed: implement hwcontext
Fixes vo_mediacodec_embed, which was broken in 80359c6615
2017-12-20 15:45:55 +11:00
James Ross-Gowan 3d8ca93d23 vo_gpu: win: remove exclusive-fullscreen detection hack
This hack was part of a solution to VSync judder in desktop OpenGL on
Windows. Rather than using blocking-SwapBuffers(), mpv could use
DwmFlush() to wait for the image to be presented by the compositor.
Since this would only work while the compositor was running, and the
compositor was silently disabled when OpenGL entered exclusive
fullscreen mode, mpv needed a way to detect exclusive fullscreen mode.

The code that is being removed could detect exclusive fullscreen mode by
checking the state of an undocumented mutex using undocumented native
API functions, but because of how fragile it was, it was always meant to
be removed when a better solution for accurate VSync in OpenGL was
found. Since then, mpv got the dxinterop backend, which uses desktop
OpenGL but has accurate VSync. It also got a native Direct3D 11 backend,
which is a viable alternative to OpenGL on Windows.

For people who are still using desktop OpenGL with WGL, there shouldn't
be much of a difference, since mpv can use other API functions to detect
exclusive fullscreen.
2017-12-20 14:53:41 +11:00
pavelxdd d13f9d0886 w32_common: refactor and improve window state handling
Refactored and split the `reinit_window_state` code into four
separate functions:
- `update_window_style` used to update window styles without
modifying the window rect.
- `fit_window_on_screen` used to adjust the window size when it is
larger than the screen size. Added a helper function `fit_rect` to
fit one rect on another without using any data from w32 struct.
- `update_fullscreen_state` used to calculate the new fullscreen
state and adjust the window rect accordingly.
- `update_window_state` used to display the window on screen with
new size, position and ontop state.

This commit fixes three issues:
- fixed #4753 by skipping `fit_window_on_screen` for a maximized
window, since maximized window should already fit on the screen.
It should be noted that this bug was only reproducible with
`--fit-border` option which is enabled by default. The cause of the
bug is that after calling the `add_window_borders` for a maximized
window, the rect in result is slightly larger than the screen rect,
which is okay, `SetWindowPos` will interpret it as a maximized state
later, so no auto-fitting to screen size is needed here.
- fixed #5215 by skipping `fit_window_on_screen` when leaving fullscreen.
On a multi-monitor system if the mpv window was stretched to cover
multiple monitors, its size was reset after switching back from
fullscreen to fit the size of the active monitor. Also, when changing
`--ontop` and `--border` options, now only the
`update_window_style` and `update_window_state` functions are used,
so `fit_window_on_screen` is not used for them too.
- fixed #2451 by moving the `ITaskbarList2_MarkFullscreenWindow`
below the `SetWindowPos`. If the taskbar is notified about fullscreen
state before the window is shown on screen, the taskbar button could
be missing until Alt-TAB is pressed, usually it was reproducible on
Windows 8.

Other changes:
- In `update_fullscreen_state` the `reset window bounds` debug
message now reports client area size and position, instead of window area
size and position. This is done for consistency with debug messages
in handling fullscreen state above in this function, since they also print
window bounds of the client area.
- Refactored `gui_thread_reconfig`. Added a new window flag `fit_on_screen`
to fit the window on screen even when leaving fullscreen. This is needed
for the case when the new video opened while the window is still in the
fullscreen state.
- Moved parent and fullscreen state checks out from the WM_MOVING to
`snap_to_screen_edges` function for consistency with other functions.
There's no point in keeping these checks out of the function body.
2017-12-19 23:22:52 +11:00
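A helper like the `fit_rect` mentioned in d13f9d0886 could look roughly like the sketch below. It is an assumption-laden illustration, not the code from the commit: it uses a stand-in `struct rect` with the same field names as the WinAPI RECT, shrinks the rect to the bounds' size while keeping the aspect ratio and the centre, and leaves final positioning to the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the WinAPI RECT structure. */
struct rect { long left, top, right, bottom; };

/* Shrink *rc so it is no larger than *bounds, preserving aspect ratio
 * and centre. Returns false (rc untouched) if it already fits. */
static bool fit_rect_sketch(struct rect *rc, const struct rect *bounds)
{
    assert(rc && bounds);
    long o_w = rc->right - rc->left, o_h = rc->bottom - rc->top;
    long n_w = bounds->right - bounds->left;
    long n_h = bounds->bottom - bounds->top;
    assert(n_w >= 0 && n_h >= 0);
    if (o_w <= 0 || o_h <= 0 || (o_w <= n_w && o_h <= n_h))
        return false;

    /* Shrink along the limiting axis (integer aspect comparison). */
    if ((long long)o_w * n_h > (long long)n_w * o_h)
        n_h = (long)((long long)n_w * o_h / o_w);
    else
        n_w = (long)((long long)n_h * o_w / o_h);

    /* Keep the centre of the original rect. */
    long x = rc->left + o_w / 2 - n_w / 2;
    long y = rc->top + o_h / 2 - n_h / 2;
    rc->left = x;
    rc->top = y;
    rc->right = x + n_w;
    rc->bottom = y + n_h;
    return true;
}
```

Because the helper takes only two rects, it can be exercised without any window state, which is the property the commit message highlights.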
pavelxdd ebd5ae3721 w32_common: use RECT for storing screen and window size & position
When window and screen size and position are stored in RECT, it's
much easier to modify them using WinAPI functions.
Added two macros to get width and height of the rect.
2017-12-19 23:22:52 +11:00
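The two macros mentioned in ebd5ae3721 are probably along these lines. The macro names, the stand-in `struct rect` (mirroring the field names of the WinAPI RECT) and the `rect_set_size` helper are assumptions for this sketch.

```c
#include <assert.h>

/* Stand-in for the WinAPI RECT structure. */
struct rect { long left, top, right, bottom; };

/* Width and height of a rect (argument is a struct, not a pointer). */
#define rect_w(r) ((r).right - (r).left)
#define rect_h(r) ((r).bottom - (r).top)

/* Resize a rect in place, keeping its top-left corner. */
static void rect_set_size(struct rect *r, long w, long h)
{
    assert(r && w >= 0 && h >= 0);
    r->right = r->left + w;
    r->bottom = r->top + h;
}
```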
wm4 9ed8ca2529 vo_gpu: hwdec_drmprime_drm: don't crash for non-GL contexts
Using vulkan with --hwdec crashed because of this.
2017-12-17 11:00:51 -08:00
Niklas Haas ba1943ac00 msg: reinterpret a bunch of message levels
I've decided that MP_TRACE means “noisy spam per frame”, whereas
MP_DBG just means “more verbose debugging messages than MSGL_V”.
Basically, MSGL_DBG shouldn't create spam per frame like it currently
does, and MSGL_V should make sense to the end-user and provide mostly
additional informational output.

MP_DBG is basically what I want to make the new default for --log-file,
so the cut-off point for MP_DBG is whether we probably want to know it for
debugging purposes but the user most likely doesn't care about it on the
terminal.

Also, the debug callbacks for libass and ffmpeg got bumped in their
verbosity levels slightly, because being external components they're a
bit less relevant to mpv debugging, and a bit too over-eager in what
they consider to be relevant information.

I exclusively used the "try it on my machine and remove messages from
MSGL_* until it does what I want it to" approach of refactoring, so
YMMV.
2017-12-15 22:28:47 -08:00
wm4 cedcdc1f3c vd_lavc: rename --hwdec=rpi to --hwdec=mmal
Annoying exception that makes no sense to keep. Normally, users or
client applications will either use --hwdec=auto, or not set the option
at all, both of which lead to the expected result.
2017-12-15 12:32:25 +02:00
wm4 9824a30eb1 vd_lavc: use libavcodec metadata for hardware decoder wrappers
This removes the need to keep an explicit list and to attempt to parse
codec names. Needs latest FFmpeg git.
2017-12-15 12:32:25 +02:00
Vittorio Giovara d7d670fcbf csputils: Add support for Display P3 primaries 2017-12-14 23:31:09 +02:00
Vittorio Giovara 3b0ed13e39 csputils: Fix DCI P3 primaries white point 2017-12-14 23:31:09 +02:00
wm4 26cdd52801 vf_buffer: remove this filter
It has been deprecated for a while and is 100% useless. It was forgotten
in the recent filter purge. Get rid of it.
2017-12-12 22:02:56 +02:00
pavelxdd 6a85f9bf74 w32_common: update outdated comment about wakeup events
mpv hasn't used WM_USER for wakeup events since 91079c0.
Updated the comment.
2017-12-11 11:51:41 -08:00
wm4 308b3cd71b vf_convert: default to limited range when converting RGB to YUV
Full range YUV causes problems everywhere. For example it's usually the
wrong choice when using encoding mode, and libswscale sometimes messes
up when converting to full range too. (In this particular case, we
found that converting rgba->yuv420p16 full range actually seems to
output limited range.)

This actually restores a similar heuristic from the late vf_scale.c.
2017-12-11 21:27:11 +02:00
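For context on 308b3cd71b: in 8-bit video, limited range places luma in 16..235 instead of 0..255. The sketch below shows only that luma mapping, with a hypothetical function name; it is not how vf_convert or libswscale performs the conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Map an 8-bit full-range luma value (0..255) to limited range
 * (16..235), rounding to nearest. */
static uint8_t luma_full_to_limited(uint8_t full)
{
    unsigned out = 16u + (full * 219u + 127u) / 255u;
    assert(out >= 16 && out <= 235);
    return (uint8_t)out;
}
```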
wm4 5e38e03980 vo_gpu: hwdec_drmprime_drm: silence error on failed autoprobing
When autoprobing the hwdec interops (which now happens for all compiled
interops if hardware decoding is used), failure to load an interop
should not print an error in the normal case. So hide it.

(We could make the log level conditional on whether autoprobing is used,
but directly loading it without autoprobing is obscure, and most other
interops don't do this either.)
2017-12-11 20:50:50 +02:00
wm4 92c4be4b6e hwdec: document a forgotten parameter
Add the "all" value to the --gpu-hwdec-interop help output.
2017-12-11 20:44:59 +02:00
wm4 6047333f0b video: remove code duplication by calling a hwdec loader helper
Make gl_video_load_hwdecs() call gl_video_load_hwdecs_all() when
all HW decoders should be loaded.
2017-12-11 20:44:59 +02:00
wm4 5196c34aec video: properly initialize and set hwdec_interop
Don't reset --gpu-hwdec-interop if vo_gpu uses dumb mode.
2017-12-11 20:44:59 +02:00
wm4 a4705e8b59 vd_lavc: always load VO interops with non-copy hw decoders
For METHOD_INTERNAL hwdecs (non-copy cases), make sure the VO interops
are always loaded, because those decoders will output hardware pixel
formats, which will need special support in vo_gpu. Otherwise,
initialization will fail, complaining that it can't convert the output
format to something the VO supports.
2017-12-11 20:44:59 +02:00
Jan Ekström affcccb007 vo: fix a compiler warning by properly printing a 64bit integer 2017-12-11 00:16:01 +02:00
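The usual portable way to print a 64-bit integer in C, which is presumably what affcccb007 switches to, is the `PRId64` macro from `<inttypes.h>`. The helper name below is hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Format a 64-bit signed integer without relying on the width of
 * long or long long. Returns what snprintf returns. */
static int format_i64(char *buf, size_t size, int64_t value)
{
    assert(buf && size > 0);
    return snprintf(buf, size, "%" PRId64, value);
}
```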
LongChair b60ac5b5ba vd_lavc: add rkmpp to the hwdec_wrappers array.
Allows the hwdec to be picked up properly by mpv on Rockchip devices.
2017-12-10 18:24:50 +02:00
James Ross-Gowan 6ab7e0d465 vo_gpu: d3d11: check for timestamp query support
Apparently timestamp queries are optional for 10level9 devices. Check
for support when creating the device rather than spamming error messages
during rendering. CreateQuery can be used to check for support by
passing NULL as the final parameter.

See:
https://msdn.microsoft.com/en-us/library/windows/desktop/ff476150.aspx#ID3D11Device_CreateQuery
2017-12-09 19:53:53 +11:00
pavelxdd 665173d8b2 w32_common: improve the window message state machine
* Distinguish between the window being moved or not.
* Skip trying to snap if currently in full screen or an embedded
  window.
* Exit snapped state if the size changed when the window was being
  moved.
2017-12-07 23:32:56 +02:00
pavelxdd 483437ba91 w32_common: skip window snapping if Windows handled it
Check the expected width and height against up-to-date
window placement. If they do not match, we will consider snapping
to have happened on Windows' side.
2017-12-07 23:32:56 +02:00
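The check described in 483437ba91 amounts to comparing the size mpv expects with the size the window actually has. A minimal sketch, using a stand-in `struct rect` and a hypothetical function name:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the WinAPI RECT structure. */
struct rect { long left, top, right, bottom; };

/* True if the current window placement does not match the size we
 * expected, i.e. the system most likely snapped the window itself. */
static bool snapped_by_system(long expected_w, long expected_h,
                              struct rect actual)
{
    assert(expected_w >= 0 && expected_h >= 0);
    return (actual.right - actual.left) != expected_w ||
           (actual.bottom - actual.top) != expected_h;
}
```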
Rostislav Pehlivanov a743fef837 vo: add support for externally driven renderloop and make wayland use it
Partially fixes display-sync under wayland (though if you change virtual
desktops you'll need to seek to re-enable display-sync).

As an advantage, rendering is completely disabled if you change desktops or
alt+tab so you lose no performance if you leave mpv running elsewhere as long
as it isn't visible.

This could also be ported to other VOs which support it.
2017-12-05 08:26:24 +00:00
James Ross-Gowan 9abb710afb vo_gpu: d3d11_helpers: use better formatting for PCI IDs
The old format was definitely misleading, since it used an 0x prefix and
formatted the device IDs with %d.
2017-12-04 20:11:20 +11:00
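The problem described in 9abb710afb is a hexadecimal-looking prefix combined with a decimal conversion. One unambiguous alternative is zero-padded hex for both IDs; the exact format string and the helper name here are assumptions, not necessarily what the commit uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format a PCI vendor/device pair as four-digit hex, e.g. "10de:1b80".
 * Returns what snprintf returns. */
static int format_pci_id(char *buf, size_t size,
                         unsigned vendor, unsigned device)
{
    assert(buf && size > 0);
    return snprintf(buf, size, "%04x:%04x",
                    vendor & 0xffffu, device & 0xffffu);
}
```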
Nicolas F 744b67d9e5 Fix various typos in log messages 2017-12-03 21:24:18 +01:00
Anton Kindestam cc16cd5aa4 video: probe format of primary plane in drm/egl context
We need to support hardware/drivers which do not support ARGB8888 in
their primary plane.

We also use p->primary_plane_format when creating the gbm surface, to
make sure it always matches (in actuality there should be little
difference).
2017-12-03 17:30:17 +02:00
Anton Kindestam 04e5fbde43 hwdec: whitespace cleanup in hwdec_drmprime_drm.c 2017-12-03 17:30:17 +02:00
Anton Kindestam eb46d46e73 video: fix use of possibly-NULL pointer in drm_egl_init 2017-12-03 17:30:17 +02:00
Anton Kindestam 5129d777a6 video: fix double free in drm_atomic_create_context
Passing in an invalid DRM overlay id with the --drm-overlay option would
cause drmplane to be freed twice: once in the for-loop and once at the
error-handler label fail.

Solve by setting drmplane to NULL after freeing it.

Also, the 'return false' statement after the error handler label should
probably be 'return NULL', given that drm_atomic_create_context returns a
pointer.
2017-12-03 17:30:17 +02:00
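The pattern behind the fix in 5129d777a6 is to clear a pointer right after freeing it inside a loop, so that a shared error label cannot free it a second time. The sketch below reproduces that shape with hypothetical types and names; it is not the DRM code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct plane { unsigned id; };

/* Look for a plane with the wanted id. Returns a heap-allocated plane
 * the caller must free, or NULL if nothing matches. */
static struct plane *find_plane(const unsigned *ids, size_t count,
                                unsigned wanted)
{
    assert(ids || count == 0);
    struct plane *plane = NULL;
    for (size_t i = 0; i < count; i++) {
        plane = malloc(sizeof(*plane));
        if (!plane)
            goto fail;
        plane->id = ids[i];
        if (plane->id == wanted)
            return plane;
        free(plane);
        plane = NULL; /* without this, `fail` would free it again */
    }
    /* No match: fall through to the shared cleanup path. */
fail:
    free(plane);  /* free(NULL) is a no-op */
    return NULL;  /* pointer return type, so NULL rather than false */
}
```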
wm4 d7a02bcb3b build: remove POSIX/sysv shared memory test
vo_x11 and vo_xv need this. According to the Linux manpage, all involved
functions are POSIX-2001 anyway. (I just assumed they were not, because
they're mostly System V UNIX legacy garbage.)
2017-12-02 23:19:13 +01:00
wm4 ca29a5aa9b vd_lavc: don't request native pixfmt with -copy and METHOD_INTERNAL
If the codec uses AV_CODEC_HW_CONFIG_METHOD_INTERNAL, and we're using
the -copy method, then don't request the native pix_fmt. It might not
have an AVFrame.hw_frames_ctx set, and we couldn't read back at all. On
top of that, most of those decoders probably don't provide read-back
when using such opaque formats anyway, while providing separate decoding
modes to decode to RAM.
2017-12-02 21:08:38 +01:00
wm4 292724538c video: remove some more hwdec legacy stuff
Finally get rid of all the HWDEC_* things, and instead rely on the
libavutil equivalents. vdpau still uses a shitty hack, but fuck the
vdpau code.

Remove all the now unneeded remains. The vdpau preemption thing was not
used anymore; if someone cares this could probably be restored.
2017-12-02 04:53:55 +01:00