Commit Graph

3033 Commits

Author SHA1 Message Date
James Ross-Gowan 68eac1a1e7 vo_gpu: d3d11: initial implementation
This is a new RA/vo_gpu backend that uses Direct3D 11. The GLSL
generated by vo_gpu is cross-compiled to HLSL with SPIRV-Cross.

What works:

- All of mpv's internal shaders should work, including compute shaders.

- Some external shaders have been tested and work, including RAVU and
  adaptive-sharpen.

- Non-dumb mode works, even on very old hardware. Most features work at
  feature level 9_3 and all features work at feature level 10_0. Some
  features also work at feature level 9_1 and 9_2, but without high-bit-
  depth FBOs, it's not very useful. (Hardware this old is probably not
  fast enough for advanced features anyway.)

  Note: This is more compatible than ANGLE, which requires 9_3 to work
  at all (GLES 2.0), and 10_1 for non-dumb-mode (GLES 3.0).

- Hardware decoding with D3D11VA, including decoding of 10-bit formats
  without truncation to 8-bit.

What doesn't work / can be improved:

- PBO upload and direct rendering do not work yet. Direct rendering
  requires persistent-mapped PBOs because the decoder needs to be able
  to read data from images that have already been decoded and uploaded.
  Unfortunately, it seems like persistent-mapped PBOs are fundamentally
  incompatible with D3D11, which requires all resources to use driver-
  managed memory and requires memory to be unmapped (and hence pointers
  to be invalidated) when a resource is used in a draw or copy
  operation.

  However it might be possible to use D3D11's limited multithreading
  capabilities to emulate some features of PBOs, like asynchronous
  texture uploading.

- The blit() and clear() operations don't have equivalents in the D3D11
  API that handle all cases, so in most cases, they have to be emulated
  with a shader. This is currently done inside ra_d3d11, but ideally it
  would be done in generic code, so it can take advantage of mpv's
  shader generation utilities.

- SPIRV-Cross is used through a NIH C-compatible wrapper library, since
  it does not expose a C interface itself.

  The library is available here: https://github.com/rossy/crossc

- The D3D11 context could be made to support more modern DXGI features
  in future. For example, it should be possible to add support for
  high-bit-depth and HDR output with DXGI 1.5/1.6.
2017-11-07 20:27:13 +11:00
James Ross-Gowan 8020a62953 vo_gpu: export the GLSL format qualifier for ra_format
Backported from @haasn's change to libplacebo, except in the current RA,
there's nothing to indicate an ra_format can be bound as a storage
image, so there's no way to force all of these formats to have a
glsl_format. Instead, the layout qualifier will be removed if
glsl_format is NULL.

This is needed for the upcoming ra_d3d11 backend. In Direct3D 11, while
loading float values from unorm images often works as expected, it's
technically undefined behaviour, and in Windows 10, it will cause the
debug layer to spam the log with error messages. Also, apparently in
GLSL, the format name must match the image's format exactly (but in
Direct3D, it just has to have the same component type.)
2017-11-07 20:27:13 +11:00
James Ross-Gowan 41dff03f8d vo_gpu: add namespace query mechanism
Backported from @haasn's change to libplacebo. More flexible than the
previous "shared || non-shared" distinction. The extra flexibility is
needed for Direct3D 11, but it also doesn't hurt code-wise.
2017-11-07 20:27:13 +11:00
wm4 793b43020c vo_lavc: remove messy delayed subtitle rendering logic
For some reason vo_lavc's draw_image can buffer the frame and encode it
only later. Also, there is logic for rendering the OSD (i.e. subtitles)
only when needed.

In theory this can lead to subtitles being pruned before it tries to
render them (as the subtitle logic doesn't know that the VO still needs
them later), although this probably never happens in reality.

The worse issue, that actually happened, is that if the last frame gets
buffered, it attempts to render subtitles in the uninit callback. At
this point, the subtitle decoder is already torn down and all subtitles
removed, thus it will draw nothing. This didn't always happen. I'm not
sure why - potentially in the working cases, the frame wasn't buffered.

Since this logic doesn't have much worth, except a minor performance
advantage if frames with subtitles are dropped, just remove it.

Hopefully fixes #4689.
2017-11-07 05:29:26 +01:00
wm4 5261d1b099 vo_gpu: don't re-render hwdec frames when repeating frames
Repeating frames (for display-sync) is not supposed to render the entire
frame again. When using hardware decoding, it unfortunately did: the
renderer uses the frame ID to check whether the frame data changed, and
unmapping the hwdec frame clears it.

Essentially reverts commit 761eeacf54. Back then I probably
thought it would be a good idea to release the hwdec image quickly in
order to return it to the decoder, but they're referenced anyway.

This should increase the performance and reduce GPU work.
2017-11-03 15:11:56 +01:00
wm4 99dd2f57f0 vo_gpu: ra_gl: fix minimum GLSL version to 120
Not sure why there was 110, or why there is even a default.
2017-11-03 11:53:31 +01:00
wm4 2abf20b2b2 vo_gpu: fix mobius tone mapping compatibility to GLSL 120
Normally such code is disabled by have_mglsl==false in
check_gl_features(), but apparently not this one.

Just fix it. Seems also more readable.

Fixes #5069.
2017-11-03 11:53:17 +01:00
wm4 aac74b7c36 vo_gpu: ra_gl: fix crash trying to use glBindBufferBase on GL 2.1
Apparently this is required, but it doesn't check for it. To be fair,
this was tested by creating a compatibility context and pretending it's
GL 2.1. GL_ARB_shader_storage_buffer_object actually requires GL 4.0 or
up, but GL_ARB_uniform_buffer_object requires only GL 2.0.
2017-11-03 11:39:15 +01:00
wm4 501230f2a0 vo_gpu: potentially fix icc-profile-auto updating
vo_gpu.c will call gl_video_icc_auto_enabled() to check whether it
should retrieve the ICC profile. But the value returned by this function
will be outdated, because gl_video_update_options() is not called yet.
Change the order of function calls so that this is done after updating
the options.

(This is fairly chaotic, but I guess this code will be refactored a
dozen times anyway in the future.)
2017-11-01 01:58:16 +01:00
wm4 1c46bd5e50 vo_gpu: remove a redundant ifdef 2017-10-30 18:35:33 +01:00
wm4 b7ce3ac445 vd_lavc: remove need for duplicated cuda GL interop backend
This is just a dumb consequence of HWDEC_ types somehow being part of
both decoder and VO. Obviously, the VO should only care about supporting
specific hardware surface types or providing specific device types, but
until they are separated, stupid unintuitive mismatches will occur.
2017-10-30 18:31:20 +01:00
Ryo Munakata 046fe45950 hwdec_drmprime_drm: fix segv with --hwdec 2017-10-30 12:46:49 +01:00
wm4 6b745769b1 vd_lavc: add support for nvdec hwaccel
See manpage additions.

(In ffmpeg-mpv and Libav, this is still called "cuvid". Libav won't work
yet, because it has no frame params support yet, but this could get
fixed soon.)
2017-10-28 19:59:08 +02:00
Niklas Haas 4701c5ba4f vo_gpu: fix ra_tex_upload_pbo for 2D textures
params->rc was ignored in the calculation for the buffer size. I fucking
hate this stupid ra_tex_upload signature where *rc is randomly relevant
or not.
2017-10-27 16:56:23 +02:00
wm4 d6f33e0b0d vo_gpu: osd: simplify some code
Coverity complains about this, but it's probably a false positive.
Anyway, rewrite it in a slightly more readable way. Now it's more
obvious that it is correct.
2017-10-27 14:19:57 +02:00
Niklas Haas c2d4fd0ef4 vo_gpu: change --tone-mapping-desaturate algorithm
Comparing mpv's implementation against the ACES ODR reference samples
and algorithms, it seems like they're happy desaturating highlights
_way_ more aggressively than mpv currently does. And indeed, looking at
some example clips like The Redwoods (which is actually well-mastered),
the current desaturation produces unnatural-looking brightness fringes
where the sky meets the treeline.

Adjust the algorithm to make it apply to a much larger, more gradual
brightness region; and change the interpretation of the parameter. As a
bonus, the new parameter is actually sanely scaled (higher values = more
desaturation). Also, make it scale based on the signal level instead of
the luminance, to avoid under-desaturating bright blues.
2017-10-25 17:24:27 +02:00
Lionel CHAZALLON 1992bb5151 video : Move drm options to substruct.
This allows grouping them and, above all, querying the group config when
needed and when we don't have access to vo.
2017-10-23 21:08:20 +02:00
Lionel CHAZALLON cfcee4cfe7 Add DRM_PRIME Format Handling and Display for RockChip MPP decoders
This commit allows using the newly introduced AV_PIX_FMT_DRM_PRIME
format in ffmpeg, which lets decoders provide an AVDRMFrameDescriptor
struct.

That struct holds dmabuf fds and information allowing zerocopy rendering
using KMS / DRM Atomic.

This has been tested on RockChip ROCK64 device.
2017-10-23 21:07:24 +02:00
Lionel CHAZALLON 762b8cc300 video : allow drm primary plane to be transparent for egl context
We want the primary plane to be on top of the overlay (video), so we need it to
be 32 bits.
2017-10-23 21:06:53 +02:00
Mark Thompson 26b46950a1 vo_opengl: hwdec_vaegl: Disable vaExportSurfaceHandle()
libva 2.0 (VAAPI 1.0.0) was released without it, but it is scheduled to
be included in libva 2.1.
2017-10-23 11:58:13 +02:00
Rostislav Pehlivanov f8aeda0da9 wayland_common: check monitor scale
Since we divide by it in a couple of places and compositors can be crazy,
it's better to be safe than sorry.
Also checks cursor spawn during init (pointless since it does again on
cursor entry but it's more correct).
2017-10-22 06:49:35 +01:00
Rostislav Pehlivanov 78ef7fb766 wayland_common: improve cursor code and scale cursor properly
It seems the cursor hadn't had its position properly adjusted when scaled.
Hence, bring back correct buffer scaling to make the cursor look fine.
Also the cursor surface now gets created sooner so that's better.
2017-10-22 05:53:20 +01:00
Rostislav Pehlivanov 13fb166d87 wayland_common: don't scale the cursor wl_buffer
Only gnome does something as stupid as always applying scaling to
the cursor rather than just using a larger sized one with HIDPI.
2017-10-19 21:35:20 +01:00
wm4 d3c022779a video: fix alpha handling
Regression since ec6e8a31e0. Removal of the explicit else case
always applies the conversion to premultiplied alpha in the else branch.
We want to scale with multiplied alpha, but we don't want to multiply
with alpha again on top of it.

Fixes #4983, hopefully.
2017-10-19 19:01:33 +02:00
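The rule behind d3c022779a (scale with premultiplied alpha, but never multiply by alpha a second time) can be illustrated with a minimal sketch. The `struct rgba`, `enum alpha_mode`, and `to_premul()` names are hypothetical and are not taken from mpv's code; the sketch only shows the idea of converting once and leaving already-premultiplied input alone.

```c
// Hypothetical types for illustration only; not mpv's actual structures.
struct rgba {
    float r, g, b, a;
};

enum alpha_mode {
    ALPHA_STRAIGHT, // color channels are independent of alpha
    ALPHA_PREMUL,   // color channels have already been multiplied by alpha
};

// Return the color in premultiplied form. Straight-alpha input is
// multiplied by alpha exactly once; premultiplied input is returned
// unchanged, so alpha is never applied twice.
static struct rgba to_premul(struct rgba c, enum alpha_mode mode)
{
    if (mode == ALPHA_STRAIGHT) {
        c.r *= c.a;
        c.g *= c.a;
        c.b *= c.a;
    }
    return c;
}
```

Applying the conversion unconditionally, as the regression did, would darken input that was already premultiplied.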
James Ross-Gowan d9e3bad500 vo_gpu: add rgba16hf to the list of FBO formats
This should be functionally identical to rgba16f, since the formats only
differ in their representation on the CPU, but it could be useful for RA
backends that don't expose rgba16f, like Vulkan. It's definitely useful
for the WIP D3D11 backend.
2017-10-18 23:55:13 +11:00
wm4 747892209f vo_rpi: fix build (probably)
Untested. If it works, fixes #4919.
2017-10-17 09:28:00 +02:00
wm4 77945b2c16 vo_gpu: remove weird p->vo indirection
That's just unnecessary.
2017-10-17 09:09:00 +02:00
wm4 c90f76d322 vo_gpu: fix video sometimes not being rerendered on equalizer change
With video paused, changing the brightness controls (or similar) would
sometimes not rerender the video frame. So the OSD would redraw, but the
video wouldn't change. This is caused by output caching, and a redraw
request is free to return the cached frame. Change it such to invalidate
the cached frame if any of the options or the equalizer change.

In theory, gl_video_reset_surfaces() could be called if the equalizer
changes - this would apparently force interpolation to redraw all
frames. But this looks kind of crappy when changing the equalizer during
playback. It'll "eventually" use the correct settings anyway, and when
paused interpolation is off.
2017-10-17 09:07:35 +02:00
wm4 54d14f5fa8 vo_gpu: remove some minor dead code
This was for the "opengl" compat VO entry, which is now handled
differently.
2017-10-16 11:00:02 +02:00
wm4 7cfae5adce vo_gpu: semi-fix --gpu-context/--gpu-api options and help output
This was confusing at best. Change it to output the actual choices.
(Seems like in the end it's always me who has to clean up other people's
bullshit.)

Context names were not unique - but they should be, so fix it. The whole
point of the original --opengl-backend option was to side-step the
tricky auto-detection, so you know exactly what you get. The goal of
this commit is to make --gpu-context work the same way. Fix the
non-unique names by appending "vk" to the names.

Keep in mind that this was not suitable for selecting the "UI" backend
anyway, since "x11" would force GLX, whereas people on not-NVIDIA
actually want "x11egl". Users trying to use --gpu-context=x11 to force
the X11 backend would always end up with GLX, which would at least break
VAAPI hardware decoding for them. Basically the idea that this option
could select the "UI" type is completely broken - it selects an
implementation, which implies a UI. Selecting the UI type would
require a separate mechanism. (Although in theory this separate
mechanism could be part of the --gpu-context option - in any case,
someone would have to implement it.)

To achieve help output that can actually be understood, just duplicate
the code. Most of that code is duplicated anyway, and trying to share
just the list code with the result of making the output unreadable
doesn't make too much sense. If we wanted to save code/effort, we could
just remove the help output altogether.

--gpu-api has non-unique entries, and it would be nice to group them
(e.g. list all OpenGL capable contexts with "opengl"), but C makes this
simple idea too much of a pain, so don't do it.

Also remove a stray tab from the android entry on the manpage.
2017-10-16 10:57:51 +02:00
Tobias Jakobi 47b1390b80 vo_gpu: mali-fbdev: fix build error
Apparently the context was renamed.
2017-10-13 17:10:52 +02:00
Rostislav Pehlivanov c052849e52 wayland_common: init output_list during main struct init
Otherwise if display connection or xkb init failed the uninit function
could segfault.
2017-10-12 23:18:55 +01:00
Rostislav Pehlivanov 91ebc34344 wayland_common: require wl_output v2 and send MP_INPUT_RELEASE_ALL on uninit
Every compositor (including toy compositors) has had support for wl_output v2
since forever, so there's little point in supporting degraded output for 5 year
old releases (especially considering we require zxdg6 which is far more recent).
2017-10-11 19:59:42 +01:00
James Ross-Gowan b3178eb59e vo_gpu: shaderc: include debug info when --gpu-debug is set
This adds symbol information to the generated SPIR-V, which shows up in
the SPIR-V assembly dump. It's also useful for potential RA backends
that use SPIRV-Cross, since the symbol information is used in the
generated shader source.
2017-10-11 12:22:21 +11:00
wm4 14541ae258 Add checks for HAVE_GPL to various GPL-only source files
This should actually cover all of them, if you take into account that
some unchanged GPL source files include header files with such checks.
Also this was done already for the libaf derived code.

This is only for "safety" and to avoid misunderstandings.
2017-10-10 15:51:16 +02:00
Rostislav Pehlivanov 7c66c2bb75 wayland_common: adjust default cursor size and scale its buffer
It turns out compositors which do scaling scale the cursor as well,
so every single surface needs to get scaled too.

Also, 32 corresponds to the default size for both GTK+ and KDE.
2017-10-10 02:39:39 +01:00
Aman Gupta 1abac5f4aa vo: fix reference to mediacodec_embed 2017-10-09 21:49:01 +02:00
Aman Gupta 502d074a31 vo_gpu: android: fix gpu context 2017-10-09 21:49:01 +02:00
Mark Thompson 05cb8d28af vo_opengl: hwdec_vaegl: Use vaExportSurfaceHandle() if present
This new interface in libva2 offers a cleaner way to export surfaces
which can then be imported to EGL.  In particular, this works with
the Mesa driver, so we can have proper playback without a pointless
download and upload on AMD cards.

This change does nothing with libva1, and will fall back to the
libva1 interface (vaDeriveImage() + vaAcquireBufferHandle()) if
vaExportSurfaceHandle() is not present.
2017-10-09 21:35:49 +02:00
wm4 cdef69103a vo_gpu: simplify opengl alias
This makes the replacement warning message worse, but I don't think I
care enough.
2017-10-09 18:55:44 +02:00
wm4 b43bf12fa6 vo_gpu: remove duplicated options
All these options (like --gpu-context etc.) were duplicated. It's
amazing that it didn't cause more problems than it did.
2017-10-09 18:53:32 +02:00
Mark Thompson c6e7ced7f4 vo_opengl: context_drm_egl: Don't create a new framebuffer for every frame 2017-10-09 18:40:45 +02:00
Aman Gupta 8fc21fd0d5 vo_gpu: add android opengl backend
At the moment, rendering on Android requires ``--vo=opengl-cb`` and
a lot of java<->c++ bridging code to receive and react to
the render callback in java. Performance also suffers with opengl-cb,
due to the overhead of context switching in JNI.

With this patch, Android can render using ``--vo=gpu --gpu-context=android``
(after setting ``--wid`` to point to an android.view.Surface on-screen).
2017-10-09 18:36:54 +02:00
Aman Gupta e80a2a572d vo: add mediacodec_embed output driver
Allows rendering IMGFMT_MEDIACODEC frames directly onto an
android.view.Surface
2017-10-09 18:36:54 +02:00
Aman Gupta 6f0fdac6f1 vo: add VO_CAP_NOREDRAW for upcoming vo_mediacodec_embed
MediaCodec uses a fixed number of output buffers to hold frames, and
expects that output buffers will be released as soon as possible. Once
rendered, the underlying frame is automatically released and cannot be
reused or rerendered.

The new VO_CAP_NOREDRAW forces mpv to release frames immediately after
they are rendered or dropped, to ensure that MediaCodec decoder does not
run out of buffers and stall out.
2017-10-09 18:36:54 +02:00
Rostislav Pehlivanov 4c7c8daf9c wayland_common: implement output tracking, cleanups and bugfixes
This commit:
    - Implements output tracking (e.g. monitor plug/unplug)
    - Creates the surface during registry (no other dependencies)
    - Queues the callback immediately after surface creation
    - Cleaner and better event handling (functions return directly)
    - Better reconfigure handling (resizes reduced to 1 during init)
    - Don't unnecessarily resize  (if dimensions match)

Apart from that, it fixes 2 potential memory leaks (mime type and window
title), 2 string ownership issues (output name and make need to be
dup'd), fixes some style issues (switches were indented) and finally
adds messages when disabling/enabling idle inhibition.

The callback setter function was removed in preparation for the commit
which will use the frame event cb because it was unnecessary.
2017-10-09 02:23:04 +01:00
Niklas Haas 2c046c48ec wayland_common: allow vo_wayland_uninit(NULL)
...again
2017-10-07 21:49:03 +02:00
Rostislav Pehlivanov 9c806bc299 Revert "wayland_common: add support for embedding"
This reverts commit 8d8d4c5cb1.
2017-10-05 17:43:47 +01:00
Rostislav Pehlivanov da30f0ba2b wayland_common: respect close events
Overlooked.
Also add a comment and only set the parent if WinID is set.
2017-10-05 16:58:29 +01:00
Rostislav Pehlivanov 8d8d4c5cb1 wayland_common: add support for embedding 2017-10-05 16:23:15 +01:00
Rostislav Pehlivanov bee6ca5225 wayland_common: reset the LIVE_RESIZING flag when resizing ends
The VO code resets each flag individually, and it doesn't do it for this one.
Also make the prints use the struct names rather than the hardcoded ones,
forgot to add those to the last wayland_common commit.
2017-10-05 15:42:08 +01:00
Rostislav Pehlivanov 72901bb16b wayland_common: don't hardcode protocol names during registry
Use the interface names from the wl_interface structs they provide.
2017-10-04 02:24:01 +01:00
Rostislav Pehlivanov 68f9ee7e0b wayland_common: rewrite from scratch
The wayland code was written more than 4 years ago when wayland wasn't
even at version 1.0. This commit rewrites everything in a more modern way,
switches to using the new xdg v6 shell interface which solves a lot of bugs
and makes mpv tiling-friendly, adds support for drag and drop, adds support
for touchscreens, adds support for KDE's server decorations protocol,
and finally adds support for the new idle-inhibitor protocol.

It does not yet use the frame callback as a main rendering loop driver,
this will happen with a later commit.
2017-10-03 19:36:02 +01:00
Rostislav Pehlivanov 980116360b vo_wayland: remove
This VO was buggy and never worked correctly. Like with wayland_common,
it needs to be rewritten from scratch.
2017-10-03 19:35:59 +01:00
wm4 0c04ce5f0d vo_gpu: gl: implement proper extension string search
The existing code in check_ext() avoided false positives due to
sub-strings, but allowed false negatives. Fix this with slightly better
search code, and make it available as function to other source files.
(There are some cases of strstr() still around.)
2017-10-02 17:30:27 +02:00
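The failure mode described in 0c04ce5f0d is that a search which stops at the first substring hit can miss a later exact match. A minimal sketch of a whole-token search follows, assuming a space-separated extension string. The function name `ext_list_has()` is hypothetical and is not the helper the commit added.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: returns true only if `ext` occurs in `list` as a
// complete, space-delimited token. Unlike a single strstr() call followed
// by a boundary check, it keeps scanning after a partial hit, so a longer
// extension name earlier in the list cannot hide an exact match later on.
static bool ext_list_has(const char *list, const char *ext)
{
    if (!list || !ext)
        return false;
    size_t len = strlen(ext);
    if (len == 0)
        return false;

    const char *p = list;
    while ((p = strstr(p, ext)) != NULL) {
        bool starts_token = (p == list) || (p[-1] == ' ');
        bool ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token)
            return true;
        p++; // partial hit; continue searching from the next character
    }
    return false;
}
```

With the list "GL_EXT_foobar GL_EXT_foo" and the name "GL_EXT_foo", the first hit lies inside "GL_EXT_foobar" and is rejected, and the loop then finds the exact token at the end of the list.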
Niklas Haas eb69e73eb4 vo_gpu: enable 3DLUTs in dumb mode
Unless FBOs are unsupported, this works. In particular, it's required to
get ICC profiles working in voluntary dumb mode. So instead of
blanket-disabling it, only disable it in the have_fbo == false case.
2017-09-30 19:03:34 +02:00
wm4 e544c3f7b3 vaapi: change license to LGPL
Originally mpv vaapi support was based on the MPlayer-vaapi patches.
These were never merged in upstream MPlayer. The license headers
indicated they were GPL-only. Although the actual author agreed to
relicensing, the company employing him to write this code did not, so
the original code is unusable to us.

Fortunately, vaapi support was refactored and rewritten several times,
meaning little code is actually left. The previous commits removed or
moved that to GPL-only code. Namely, vo_vaapi.c remains GPL-only. The
other code went away or became unnecessary mainly because libavcodec
itself gained the ability to manage the hw decoder, and libavutil
provides code to manage vaapi surfaces. We also changed to mainly using
EGL interop, making any of the old rendering code unnecessary.

hwdec_vaglx.c is still GPL. It's possibly relicensable, because much of
it was changed, but I'm not too sure and further investigation would be
required. Also, this has been disabled by default for a while now, so
bothering with this is a waste of time. This commit simply disables it
at compile time as well in LGPL mode.
2017-09-29 18:44:47 +02:00
wm4 6a69e897ff vaapi: move legacy code to vo_vaapi.c
Done for license reasons. vo_vaapi.c is turned into some kind of
dumpster fire, and we'll remove it as soon as I'm mentally ready for
unkind users to complain about removal of this old POS.
2017-09-29 18:32:56 +02:00
Niklas Haas f6fd2a05c4 vo_gpu: vulkan: reword comment
This is fixed upstream (and we now know it's a driver bug) so reword the
comment.
2017-09-29 00:48:39 +02:00
Niklas Haas 22311a767d vo_gpu: force layout std430 for PCs
Seems to be fixed upstream in the nvidia driver, so it's probably a good
idea to 1. force the layout and 2. remove the warning, as it now
actually works. Users with older drivers would run into errors, but they
can still use shaderc as a replacement. (And it's not like the old
status quo was any better)
2017-09-29 00:41:50 +02:00
Niklas Haas 07fa5c8a8f vo_gpu: fix --opengl-gamma redirect
It still pointed at --gpu-gamma, but we decided on --gamma-factor
instead.
2017-09-28 17:21:56 +02:00
Niklas Haas 791b9c4024 vo_gpu: set the correct number of vertex attribs
This was always set to the length of the VAO, but it should have been
set to the number of vertex attribs actually in use for this frame. No
idea how that managed to survive the test framework on nvidia/linux, but
ANGLE caught it.
2017-09-28 12:50:45 +02:00
James Ross-Gowan a65abe2447 vo_gpu: vulkan: add support for Windows 2017-09-28 10:02:22 +10:00
Niklas Haas 67fd5882b8 vo_gpu: make the vertex attribs dynamic
This has several advantages:

1. no more redundant texcoords when we don't need them
2. no more arbitrary limit on how many textures we can bind
3. (that extends to user shaders as well)
4. no more arbitrary limits on tscale radius

To realize this, the VAO was moved from a hacky stateful approach
(gl_sc_set_vertex_attribs) - which always bothered me since it was
required for compute shaders as well even though they ignored it - to be
a proper parameter of gl_sc_dispatch_draw, and internally plumbed into
gl_sc_generate, which will make a (properly mangled) deep copy into
params.vertex_attribs.
2017-09-28 01:54:38 +02:00
Niklas Haas 002a0ce232 vo_gpu: kill some static arrays
This gets rid of the hard-coded limits on the number of hooks, textures
and hook points.
2017-09-28 01:54:33 +02:00
Niklas Haas 868bf4da7d vo_gpu: vulkan: indent queue family enumeration
Consistency
2017-09-27 00:46:20 +02:00
Niklas Haas 5b6b77b8dc vo_gpu: vulkan: normalize use of *Flags and *FlagBits
FlagBits is just the name of the enum. The actual data type representing
a combination of these flags follows the *Flags convention. (The
relevant difference is that the latter is defined to be uint32_t instead
of left implicit)

For consistency, use *Flags everywhere instead of randomly switching
between *Flags and *FlagBits.

Also fix a wrong type name on `stageFlags`, pointed out by @atomnuker
2017-09-27 00:25:18 +02:00
Niklas Haas 0ba6c7d73f vo_gpu: vulkan: optimize redundant pipeline barriers
Using renderpass layout transitions is more optimal and doesn't require
a redundant pipeline barrier.

Since our render passes are static and don't change throughout the
lifetime of a ra_renderpass, we unfortunately don't have much
flexibility here - so just hard-code SHADER_READ_ONLY_OPTIMAL as the
output format as this will be the most common case.

We also can't short-circuit the transition when we need to preserve the
framebuffer contents, since that depends on the current layout; so we
still use an explicit tex_barrier in this case. (Most optimal for this
scenario would be an input attachment anyway)
2017-09-26 23:50:01 +02:00
wm4 9b60398f4e video: remove old videotoolbox support
As in previous commits, you need a very recent FFmpeg (probably git
master).
2017-09-26 19:13:26 +02:00
wm4 ae7db6503b video: drop old D3D11/DXVA2 support
Now you need FFmpeg git, or something.

This also gets rid of the last real use of gpu_memcpy(). libavutil does
that itself. (vaapi.c still used it, but it was essentially unused,
because the code path isn't really in use anymore. It wasn't even
included due to the d3d-hwaccel dependency in wscript.)
2017-09-26 18:58:45 +02:00
Niklas Haas e569050fe4 vo_gpu: fix memleak in spirv.c 2017-09-26 17:32:36 +02:00
Niklas Haas a4e951e80c vo_gpu: explicitly label storage image formats
This is apparently required to get storage images working on
windows/vulkan, and probably good practice either way. Not entirely sure
if it's the best idea to be always storing the value as 32-bit float,
but it should hardly matter in practice (since we're only writing one
sample per thread).

(Leaving them implicit requires the shaderStorageImageWriteWithoutFormat
feature to be enabled, which the windows nvidia vulkan driver doesn't
support, at least not for a GTX 670)
2017-09-26 17:25:46 +02:00
Niklas Haas 47af509e1f vo_gpu: attempt to avoid UBOs for dynamic variables
This makes the radeon driver shut up about frequently updating
STATIC_DRAW UBOs (--opengl-debug), and also reduces the amount of
synchronization necessary for vulkan uniform buffers.

Also add some extra debugging/tracing code paths. I went with a
flags-based approach in case we ever want to extend this.
2017-09-26 17:25:35 +02:00
Niklas Haas ca85a153b4 vo_gpu: vulkan: add support for push constants
Can in theory avoid updating the uniform buffer every frame
2017-09-26 17:25:35 +02:00
Rostislav Pehlivanov ed345ffc2f vo_gpu: vulkan: add support for wayland 2017-09-26 17:25:35 +02:00
Niklas Haas 258487370f vo_gpu: vulkan: generalize SPIR-V compiler
In addition to the built-in nvidia compiler, we now also support a
backend based on libshaderc. shaderc is sort of like glslang except it
has a C API and is available as a dynamic library.

The generated SPIR-V is now cached alongside the VkPipeline in the
cached_program. We use a special cache header to ensure validity of this
cache before passing it blindly to the vulkan implementation, since
passing invalid SPIR-V can cause all sorts of nasty things. It's also
designed to self-invalidate if the compiler gets better, by offering a
catch-all `int compiler_version` that implementations can use as a cache
invalidation marker.
2017-09-26 17:25:35 +02:00
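The cache-header idea from 258487370f (validate the cached blob before handing it to the driver, and invalidate it when the compiler changes) can be sketched roughly as below. The header layout, the `CACHE_MAGIC` value, and the `cache_is_valid()` name are assumptions made for illustration; only the `compiler_version` field is mentioned in the commit message, and the real header may look quite different.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical magic bytes identifying this cache format.
#define CACHE_MAGIC "SKETCH01"

// Hypothetical on-disk header placed in front of the cached SPIR-V.
struct cache_header {
    char magic[8];            // must equal CACHE_MAGIC (no NUL terminator)
    int32_t compiler_version; // bumped whenever the compiler output changes
    uint32_t payload_len;     // number of SPIR-V bytes following the header
};

// Accept a cached blob only if the header is intact, the payload length
// matches, and it was produced by the same compiler version. Anything
// else is treated as a cache miss, so the shader gets recompiled.
static bool cache_is_valid(const void *data, size_t size,
                           int32_t current_compiler_version)
{
    struct cache_header hdr;
    if (!data || size < sizeof(hdr))
        return false;
    memcpy(&hdr, data, sizeof(hdr)); // avoid unaligned access into the blob
    if (memcmp(hdr.magic, CACHE_MAGIC, sizeof(hdr.magic)) != 0)
        return false;
    if (hdr.compiler_version != current_compiler_version)
        return false;
    if (hdr.payload_len != size - sizeof(hdr))
        return false;
    return true;
}
```

Treating every mismatch as a plain cache miss keeps the failure path cheap: the worst case is one extra compile rather than passing stale or corrupt SPIR-V to the Vulkan implementation.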
Niklas Haas 91f23c7067 vo_gpu: vulkan: initial implementation
This time based on ra/vo_gpu. 2017 is the year of the vulkan desktop!

Current problems / limitations / improvement opportunities:

1. The swapchain/flipping code violates the vulkan spec, by assuming
   that the presentation queue will be bounded (in cases where rendering
   is significantly faster than vsync). But apparently, there's simply
   no better way to do this right now, to the point where even the
   stupid cube.c examples from LunarG etc. do it wrong.
   (cf. https://github.com/KhronosGroup/Vulkan-Docs/issues/370)

2. The memory allocator could be improved. (This is a universal
   constant)

3. Could explore using push descriptors instead of descriptor sets,
   especially since we expect to switch descriptors semi-often for some
   passes (like interpolation). Probably won't make a difference, but
   the synchronization overhead might be a factor. Who knows.

4. Parallelism across frames / async transfer is not well-defined, we
   either need to use a better semaphore / command buffer strategy or a
   resource pooling layer to safely handle cross-frame parallelism.
   (That said, I gave resource pooling a try and was not happy with the
   result at all - so I'm still exploring the semaphore strategy)

5. We aggressively use pipeline barriers where events would offer a much
   more fine-grained synchronization mechanism. As a result of this, we
   might be suffering from GPU bubbles due to too-short dependencies on
   objects. (That said, I'm also exploring the use of semaphores as an
   ordering tactic which would allow cross-frame time slicing in theory)

Some minor changes to the vo_gpu and infrastructure, but nothing
consequential.

NOTE: For safety, all use of asynchronous commands / multiple command
pools is currently disabled completely. There are some left-over relics
of this in the code (e.g. the distinction between dev_poll and
pool_poll), but that is kept in place mostly because this will be
re-extended in the future (vulkan rev 2).

The queue count is also currently capped to 1, because the lack of
cross-frame semaphores means we need the implicit synchronization from
the same-queue semantics to guarantee a correct result.
2017-09-26 17:25:35 +02:00
Niklas Haas c82022f349 vo_opengl_cb: fix deprecated option usage
opengl-debug was renamed to gpu-debug
2017-09-26 17:24:39 +02:00
Niklas Haas 89cdccfa6c vo_gpu: fix possible segfault on shader miscompile
Iterations after the first time will fail to realize that the pass was
never created. This function's logic and control flow is so annoying...
2017-09-23 16:36:58 +02:00
James Ross-Gowan 3d119a0e41 vo_gpu: angle: fix misleading struct name
This should have been renamed when it stopped being empty.
2017-09-23 18:33:33 +10:00
Niklas Haas b0ba193b66 vo_gpu: handle texture initialization errors gracefully
Tested by making the ra_tex_resize function always fail (apart from the
initial FBO check). This required a few changes:

1. reset shaders on failed dispatch
2. reset cleanup binds on failed dispatch
3. fall back to initializing the struct image to 1x1 on failure
4. handle output_fbo_valid gracefully
2017-09-23 09:58:27 +02:00
Niklas Haas f3ec494613 vo_gpu: reduce the --alpha=blend-tiles checkerboard intensity
This was sort of grating by default and made it really hard to actually
read e.g. text on top of a transparent background. I decided to approach
the problem from both directions, making the whites darker and the grays
lighter. This brings it closer to the dynamic range of e.g. the
wikipedia transparent svg preview.
2017-09-22 21:14:27 +02:00
Niklas Haas e3288c4597 vo_gpu: simplify compute shader coordinate calculation
Since the removal of FBOTEX_FUZZY, this can be made slightly simpler.
2017-09-22 17:22:53 +02:00
Niklas Haas 62ddc85d17 vo_gpu: simplify structs / names
Due to the plethora of historical baggage from different eras getting
confusing, I decided to simplify and unify the struct organization and
naming scheme.

Structs that got renamed:

1. fbodst     -> ra_fbo  (and moved to gpu/context.h)
2. fbotex     -> removed (redundant after 2af2fa7a)
3. fbosurface -> surface
4. img_tex    -> image

In addition to these structs being renamed, all of the names have been
made consistent. The new scheme is as follows:

struct image img;
struct ra_tex *tex;
struct ra_fbo fbo;

This also affects derived names, e.g. indirect_fbo -> indirect_tex.
Notably also, finish_pass_fbo -> finish_pass_tex and finish_pass_direct
-> finish_pass_fbo.

The new equivalent of fbotex_change() is called ra_tex_resize().

This commit (should) contain no logic changes, just renaming a bunch of
crap.
2017-09-22 16:58:55 +02:00
Niklas Haas 2af2fa7a27 vo_gpu: kill off FBOTEX_FUZZY
I've observed the garbage pixels in more scenarios. They also were never
really needed to begin with, originally being a discovered work-around
for a bug that we fixed since then anyway. Doesn't really seem to even
help resizing, since the OpenGL drivers are all smart enough to pool
resources internally anyway.

Fixes #1814
2017-09-22 16:33:25 +02:00
James Ross-Gowan fab0448c5e Revert "cocoa: re-enable double buffering"
Enabling double buffering fixed some graphical glitches when entering
fullscreen, but it also caused a fullscreen performance regression. We
decided that the glitches were preferable to the performance regression.

This reverts commit cee764849e.
2017-09-22 23:08:46 +10:00
James Ross-Gowan 1d5620a658 vo_gpu: override ra_swapchain_fns for the d3d11 surface
ANGLE can take advantage of some of these when using the external
swapchain-backed surface.
2017-09-22 14:22:14 +02:00
Niklas Haas aabe12b0bc vo_gpu: opengl: fix possible screenshot window crash
gl_read_fbo_contents can fail

Fixes #4905
2017-09-22 14:20:22 +02:00
Niklas Haas d325f30fb5 vo_opengl_cb: fix segfault on uninit
The code used ra_ctx_destroy even though ra_ctx_create was never called
(since it's just a dummy ctx), which led to a conflict of assumptions.
The proper fix is to only use ra_gl_ctx_uninit (mirroring the
ra_gl_ctx_init) and free the dummy ctx manually.

Fixes https://github.com/cmdrkotori/mpc-qt/issues/129
2017-09-22 14:20:11 +02:00
wm4 fba927de41 options: properly handle deprecated options with CLI actions
We want e.g. --opengl-shaders-append=foo to resolve to the new option,
all while printing an option name. --opengl-shader is a similar case.
These options are special, because they apply "actions" on actual
options by specifying a suffix. So the alias/deprecation handling has to
be part of resolving the actual option from prefix and suffix.
2017-09-22 11:31:03 +02:00
wm4 baffe6bcbc vo_gpu: fix autoprobing message 2017-09-22 05:37:54 +02:00
wm4 2b5da4804c build: make vo_gpu + infrastructure non-optional
Also re-add the error message for when no GL backends are found (why
was this removed?).
2017-09-22 05:35:26 +02:00
Aman Gupta 6254b6d637 vo_opengl_cb: hwdec_ios: fix build
[179/188] Compiling video/out/vo_lavc.c
../../video/out/opengl/hwdec_ios.m:135:9: warning: unused variable 'gl' [-Wunused-variable]
    GL *gl = ra_gl_get(mapper->ra);
        ^
../../video/out/opengl/hwdec_ios.m:247:48: warning: incompatible pointer to integer conversion passing 'CVOpenGLESTextureRef' (aka 'struct __CVBuffer *') to parameter of type 'GLuint' (aka 'unsigned int') [-Wint-conversion]
                                               p->gl_planes[i]);
                                               ^~~~~~~~~~~~~~~
../../video/out/opengl/ra_gl.h:9:45: note: passing argument to parameter 'gl_texture' here
                                     GLuint gl_texture);
                                            ^
2 warnings generated.
2017-09-22 05:03:52 +02:00
Niklas Haas 52789d6ca0 vo_gpu: fix vo=opengl legacy alias
Turns out the option code apparently tries to directly talloc_free() the
allocated strings, instead of going through a tactx wrapper or
something. So we can't directly overwrite it. Do something else
instead..
2017-09-21 16:13:52 +02:00
Niklas Haas aefd7a90c9 vo_gpu: fix memleak in ra_gl_ctx
The ctx->ra was never freed properly, nor was p->wrapped_fb.

(TIL: MPV_LEAK_REPORT exists)
2017-09-21 15:51:47 +02:00
Niklas Haas b940691784 vo_gpu: drop the RA_CAP_NESTED_ARRAY req from EWA compute
Almost as fast as the old code, but more general. Notably, glslang
doesn't support nested arrays.

(cf. https://github.com/KhronosGroup/glslang/issues/1057)

Also much cleaner code-wise, so I think I'll keep it even if glslang
implements array_of_arrays.
2017-09-21 15:15:59 +02:00
Niklas Haas 03fee22c4d wayland: allow vo_wayland_uninit(NULL) 2017-09-21 15:15:55 +02:00
Niklas Haas 28b2fa4b7e vo_gpu: fix possible segfault in shader_cache.c
If shader compilation fails in an unexpected way, it can end up calling
renderpass_run on an invalid pass, since current_shader is never cleared.
2017-09-21 15:15:15 +02:00
Niklas Haas db0fb3c48b vo_gpu: fix gamma scale
This never really made sense since the BT.1886 changes. It should get
*brighter* for bright rooms, not darker for dark rooms. Picked some new
values that seemed reasonable-ish.
2017-09-21 15:01:26 +02:00
Niklas Haas 61f5c423be vo_gpu: fix comment on ra_buf_type
This hasn't been true for several iterations of this API.
2017-09-21 15:01:22 +02:00
Niklas Haas e92effb14f vo_gpu: describe the plane merging pass
This can get left unknown if something hooks NATIVE
2017-09-21 15:01:22 +02:00
James Ross-Gowan cee764849e cocoa: re-enable double buffering
This causes a performance regression on 10.11 and newer, but the single
buffered method was broken and could cause partially rendered frames to
be presented to the screen.

This reverts 9f30cd8292 and
e543853a7f.
2017-09-21 15:01:22 +02:00
James Ross-Gowan 75c0c06640 vo_gpu: convert windows/osx hwdecs/contexts to new API 2017-09-21 15:01:17 +02:00
Niklas Haas 65979986a9 vo_opengl: refactor into vo_gpu
This is done in several steps:

1. refactor MPGLContext -> struct ra_ctx
2. move GL-specific stuff in vo_opengl into opengl/context.c
3. generalize context creation to support other APIs, and add --gpu-api
4. rename all of the --opengl- options that are no longer opengl-specific
5. move all of the stuff from opengl/* that isn't GL-specific into gpu/
   (note: opengl/gl_utils.h became opengl/utils.h)
6. rename vo_opengl to vo_gpu
7. to handle window screenshots, the short-term approach was to just add
   it to ra_swapchain_fns. Long term (and for vulkan) this has to be moved to
   ra itself (and vo_gpu altered to compensate), but this was a stop-gap
   measure to prevent this commit from getting too big
8. move ra->fns->flush to ra_gl_ctx instead
9. some other minor changes that I've probably already forgotten

Note: This is one half of a major refactor, the other half of which is
provided by rossy's following commit. This commit enables support for
all linux platforms, while his version enables support for all non-linux
platforms.

Note 2: vo_opengl_cb.c also re-uses ra_gl_ctx so it benefits from the
--opengl- options like --opengl-early-flush, --opengl-finish etc. Should
be a strict superset of the old functionality.

Disclaimer: Since I have no way of compiling mpv on all platforms, some
of these ports were done blindly. Specifically, the blind ports included
context_mali_fbdev.c and context_rpi.c. Since they're both based on
egl_helpers, the port should have gone smoothly without any major
changes required. But if somebody complains about a compile error on
those platforms (assuming anybody actually uses them), you know where to
complain.
2017-09-21 15:00:55 +02:00
Niklas Haas 2f41b834b3 vo_opengl: make the ra_renderpass names consistent
The random space kept screwing me over
2017-09-13 20:53:19 +02:00
Niklas Haas 293c696ddb vo_opengl: use GLX_MESA_swap_control where available
This overrides the use of GLX_SGI_swap_control, because apparently
GLX_SGI_swap_control doesn't support SwapInterval(0), but
GLX_MESA_swap_control does.

Of course, everybody except mesa just accepts SwapInterval(0) even for
GLX_SGI_swap_control, but mesa needs to be the special snowflake here
and reject it, forcing us to load their stupid named extension instead.

Meanwhile khronos has done nothing except spit out GLX_EXT_swap_control
(not to be confused with GL_EXT_swap_control, which is exported by
WGL_EXT_swap_control), that doesn't fix the problem because mesa doesn't
implement it anyway.

What a fucking mess.
2017-09-13 20:53:17 +02:00
Niklas Haas 0bb67b1055 vo_opengl: always initialize uniforms on first use
Even if the contents are entirely zero. In the current code, these
entries were left uninitialized. (Which always worked for nvidia - but
randomly blew up for AMD)
2017-09-12 03:00:47 +02:00
Niklas Haas 0c2cb69597 vo_opengl: generalize UBO packing/handling
This is simultaneously generalized into two directions:
1. Support more sc_uniform types (needed for SC_UNIFORM_TYPE_PUSHC)
2. Support more flexible packing (needed for both PUSHC and ra_d3d11)
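
A minimal sketch of what "more flexible packing" could look like, assuming the packing rule is supplied as a callback per backend. The names `var_layout`, `std140_layout`, `packed_layout` and `place_var` are hypothetical and not taken from mpv; only the std140 alignment rules for float scalars and vectors are standard.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical description of where one uniform may be placed in a buffer.
struct var_layout {
    size_t align; // required alignment of the start offset, in bytes
    size_t size;  // bytes occupied by the value
};

// std140 rules for 32-bit float scalars and vectors (1 to 4 components).
static struct var_layout std140_layout(int components)
{
    assert(components >= 1 && components <= 4);
    // A vec3 is aligned like a vec4 but only occupies three floats.
    size_t align = components == 3 ? 16 : (size_t)components * 4;
    return (struct var_layout){ .align = align, .size = (size_t)components * 4 };
}

// Tightly packed alternative: everything is aligned to 4 bytes.
static struct var_layout packed_layout(int components)
{
    assert(components >= 1 && components <= 4);
    return (struct var_layout){ .align = 4, .size = (size_t)components * 4 };
}

typedef struct var_layout (*layout_fn)(int components);

// Place one variable at the first suitably aligned offset at or after
// *cursor, advance the cursor past it, and return the chosen offset.
static size_t place_var(size_t *cursor, layout_fn layout, int components)
{
    struct var_layout l = layout(components);
    size_t offset = (*cursor + l.align - 1) / l.align * l.align;
    *cursor = offset + l.size;
    return offset;
}
```

The same sequence of variables ends up at different offsets depending on which layout callback is passed, which is the kind of flexibility a push-constant or D3D11 constant-buffer path would need.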
2017-09-12 02:57:45 +02:00
Niklas Haas 3faf1fb0a4 vo: avoid putting large voctrl_performance_data on stack
This is around 512 kB, which is just way too much. Heap-allocate it
instead. Also cut down the max pass count to 64, since 128 was
unrealistically high even for vo_opengl.
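
As an illustration only, the stack-to-heap change can be sketched as below. The struct names and the sample count are hypothetical, and plain `calloc` stands in for whatever allocator the real code uses.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_PASSES 64         // mirrors the reduced pass count from the message
#define SAMPLES_PER_PASS 256  // hypothetical value, chosen for illustration

// Hypothetical stand-ins for the per-pass and per-frame performance data.
struct pass_perf {
    unsigned long long samples[SAMPLES_PER_PASS];
    int count;
};

struct perf_data {
    struct pass_perf passes[MAX_PASSES];
    int num_passes;
};

// Allocate the large struct on the heap, zero-initialized, instead of
// declaring it as a local variable. Returns NULL if allocation fails.
static struct perf_data *perf_data_alloc(void)
{
    return calloc(1, sizeof(struct perf_data));
}

static void perf_data_free(struct perf_data *p)
{
    free(p);
}
```

A caller would check the result of `perf_data_alloc()` for NULL, fill it in, and release it with `perf_data_free()` once the data has been consumed.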
2017-09-11 18:20:18 +02:00
Niklas Haas 71c25df5e6 vo_opengl: refactor timer_pool_measure (again)
Instead of relying on power-of-two buffer sizes and unsigned overflow,
make this code more robust (and also cleaner).

Why can't C get a real modulo operator?
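
For context, C's `%` yields a negative remainder when the left operand is negative, which is what makes backwards indexing into a ring buffer error-prone. A minimal sketch of a wrap helper that works for any buffer size follows; `wrap_index`, `sample_ring` and the buffer size are hypothetical and not the actual timer_pool code.

```c
#include <assert.h>

// Hypothetical helper: modulo that always returns a value in [0, n).
static int wrap_index(int i, int n)
{
    assert(n > 0);
    int r = i % n;            // negative when i is negative
    return r < 0 ? r + n : r;
}

// Hypothetical ring buffer of samples; the size need not be a power of two.
#define SAMPLE_COUNT 5

struct sample_ring {
    unsigned long long samples[SAMPLE_COUNT];
    int next;   // slot the next sample will be written to
    int count;  // number of valid samples, at most SAMPLE_COUNT
};

static void ring_push(struct sample_ring *r, unsigned long long v)
{
    r->samples[r->next] = v;
    r->next = wrap_index(r->next + 1, SAMPLE_COUNT);
    if (r->count < SAMPLE_COUNT)
        r->count++;
}

// Return the sample written `age` pushes ago (0 = most recent).
static unsigned long long ring_get(const struct sample_ring *r, int age)
{
    assert(age >= 0 && age < r->count);
    return r->samples[wrap_index(r->next - 1 - age, SAMPLE_COUNT)];
}
```

Because the wrap is explicit, the lookup in `ring_get` stays in bounds even when `next - 1 - age` goes below zero.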
2017-09-11 02:42:50 +02:00
Niklas Haas 8a4f2f0ac0 vo_opengl: fix out-of-bounds access in timer_pool_measure
This was there even before the refactor, but the refactor exposed the
bug. I hate C's useless fucking modulo operator so much. I've gotten hit
by this exact bug way too many times.
2017-09-11 02:07:04 +02:00
Niklas Haas 0fe4a492c4 vo_opengl: fix out-of-bounds read in update_uniform
Since the addition of UBOs, the assumption that the uniform index
corresponds to the pass->params.inputs index is no longer true. Also,
there's no reason it would even need this - since the `input` is also
available directly in sc_uniform.

I have no idea how I've been using this code for as long as I have
without any segfaults until earlier today.
2017-09-11 01:04:08 +02:00
Niklas Haas 1da53248ab vo_opengl: refactor/fix mp_pass_perf code
This was needlessly complicated and prone to breakage, because even the
references to the ring buffer could end up getting invalidated and
containing garbage data on e.g. shader cache flush. For much the same
reason why we can't keep around the *timer_pool, we're also forced to
hard-copy the entire sample buffer per pass per frame.

Not a huge deal, though. This is, what, a few kB per frame? We have more
pressing CPU performance concerns anyway.

Also simplified/fixed some other code.
2017-09-11 00:35:23 +02:00
Niklas Haas d0c87dd579 vo_opengl: add a gamut warning feature
This clearly highlights all out-of-gamut/clipped pixels. (Either too
bright or too saturated)

Has some (documented) caveats. Also make TONE_MAPPING_CLIP stop actually
clamping the value range (it's unnecessary and breaks this feature).
2017-09-10 18:19:46 +02:00
Niklas Haas 5771f7abf4 vo_opengl: add support for vulkan GLSL dialect
Redefining texture1D / texture3D seems to be illegal, they are already
built-in macros or something. So just use tex1D and tex3D instead.

Additionally, GL_KHR_vulkan_glsl requires using explicit vertex
locations and bindings, so make some changes to facilitate this. (It
also requires explicitly setting location=0 for the color attachment
output)
2017-09-04 13:53:23 +02:00
Niklas Haas 62f0677614 vo_opengl: use rgba16 for 3DLUTs instead of rgb16
Vulkan compat. rgb16 doesn't exist on hardware anyway, might as well
just generate the 3DLUT against rgba16 as well. We've decided this is
the simplest way to do vulkan compatibility: just make sure we never
actually need 3-component textures.
2017-09-04 13:53:14 +02:00
Niklas Haas 8cf5799ab1 vo_opengl: refactor scaler LUT weight packing/loading
This is mostly done so we can support using textures with more
components than the scaler LUTs have entries. But while we're at it,
also change the way the weights are packed so that they're always
sequential with no gaps. This allows us to simplify
pass_sample_separated_get_weights as well.
2017-09-04 13:53:14 +02:00
Niklas Haas f589a3bd78 vo_opengl: scale deband-grain to the signal range
This prevents blowing up for high dynamic range sources, where a noise
level of 48 can suddenly mean 4800.
2017-09-03 21:51:48 +02:00
James Ross-Gowan 9a28088e74 filter_kernels: correct spline64 kernel
This seems to have had some copy/paste errors. It should now match the
implementation in fmtconv:
https://github.com/EleonoreMizo/fmtconv/blob/00453a86dd73/src/fmtcl/ContFirSpline64.cpp#L58-L76
2017-09-03 21:18:06 +02:00
James Ross-Gowan 7897f79217 input: merge mouse wheel and axis keycodes
Mouse wheel bindings have always been a cause of user confusion.
Previously, on Wayland and macOS, precise touchpads would generate AXIS
keycodes and notched mouse wheels would generate mouse button keycodes.
On Windows, both types of device would generate AXIS keycodes and on
X11, both types of device would generate mouse button keycodes. This
made it pretty difficult for users to modify their mouse-wheel bindings,
since it differed between platforms and in some cases, between devices.

To make it more confusing, the keycodes used on Windows were changed in
18a45a42d5 without a deprecation period or adequate communication to
users.

This change aims to make mouse wheel binds less confusing. Both the
mouse button and AXIS keycodes are now deprecated aliases of the new
WHEEL keycodes. This will technically break input configs on Wayland and
macOS that assign different commands to precise and non-precise scroll
events, but this is probably uncommon (if anyone does it at all) and I
think it's a fair tradeoff for finally fixing mouse wheel-related
confusion on other platforms.
2017-09-03 20:31:44 +10:00
James Ross-Gowan 8fe4aa94ee cocoa: fix button numbering for back/forward
It seems like the Cocoa backend used to return the same mpv keycodes for
mouse back/forward as it did for scrolling up and down. Fix this by
explicitly mapping all Cocoa button numbers to the right mpv keycodes.
2017-09-03 20:31:44 +10:00
James Ross-Gowan 957e9a37db input: use mnemonic names for mouse buttons
mpv's mouse button numbering is based on X11 button numbering, which
allows for an arbitrary number of buttons and includes mouse wheel input
as buttons 3-6. This button numbering was used throughout the codebase
and exposed in input.conf, and it was difficult to remember which
physical button each number actually referred to and which referred to
the scroll wheel.

In practice, PC mice only have between two and five buttons and one or
two scroll wheel axes, which are more or less in the same location and
have more or less the same function. This allows us to use names to
refer to the buttons instead of numbers, which makes input.conf syntax a
lot easier to remember. It also makes the syntax robust to changes in
mpv's underlying numbering. The old MOUSE_BTNx names are still
understood as deprecated aliases of the named buttons.

This changes both the input.conf syntax and the MP_MOUSE_BTNx symbols in
the codebase, since I think both would benefit from using names over
numbers, especially since some platforms don't use X11 button numbering
and handle different mouse buttons in different windowing system events.

This also makes the names shorter, since otherwise they would be pretty
long, and it removes the high-numbered MOUSE_BTNx_DBL names, since they
weren't used.

Names are the same as used in Qt:
https://doc.qt.io/qt-5/qt.html#MouseButton-enum
2017-09-03 20:31:44 +10:00
wm4 9f0e358827 vo_opengl: fix overlay mode (again)
Did I mention yet that I regret this overlay mode thing?
2017-08-30 12:19:32 +02:00
wm4 a9571fcc0f vo_opengl: don't discard buffered video on redundant resize calls
If a VO-area option changes, gl_video_resize() is called
unconditionally. This function does something even if the size does not
change (at least it discards buffered frames for interpolation), which
can lead to stutter when you keep firing option change events during
playback.

Check for an actual resize, and if nothing changes, exit early.
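
The early-exit check described here can be sketched roughly as follows. The `rect` and `renderer` structs are simplified, hypothetical stand-ins rather than the real gl_video state.

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical, simplified stand-ins for the renderer state.
struct rect { int x0, y0, x1, y1; };

struct renderer {
    struct rect dst;      // current output rectangle
    int buffered_frames;  // frames queued for interpolation
};

static bool rect_equals(struct rect a, struct rect b)
{
    return a.x0 == b.x0 && a.y0 == b.y0 && a.x1 == b.x1 && a.y1 == b.y1;
}

// Apply a new output rectangle. Returns true if anything changed; on a
// redundant call the buffered frames are left untouched.
static bool renderer_resize(struct renderer *r, struct rect dst)
{
    assert(r);
    if (rect_equals(r->dst, dst))
        return false;          // nothing changed, exit early
    r->dst = dst;
    r->buffered_frames = 0;    // a real resize invalidates queued frames
    return true;
}
```

With this shape, repeated option-change events that do not alter the geometry no longer throw away interpolation state.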
2017-08-29 15:15:34 +02:00
wm4 a46500a2c8 vo_opengl: don't assume imgfmt=0 is valid
Could cause a crash if anything called ra_get_imgfmt_desc(imgfmt=0). Let
it fail correctly. This can happen if a hwdec backend does not set
hw_subfmt correctly.
2017-08-29 15:05:32 +02:00
Niklas Haas cc79d48d22 vo_opengl: fix the renderpass target format at creation time
Required for vulkan.
2017-08-27 14:37:47 +02:00
Niklas Haas 7baa18d5f8 vo_opengl: fix misleading comment in ra.h
tex_upload is not just for buffers
2017-08-27 14:36:28 +02:00
Niklas Haas 1d47473a7b vo_opengl: use UBOs where supported/required
This also introduces RA_CAP_GLOBAL_UNIFORM. If this is not set, UBOs
*must* be used for non-bindings. Currently the cap is ignored though,
and the shader_cache *always* generates UBO-using code where it can.
Could be made an option in principle.

Only enabled for drivers new enough to support explicit UBO offsets,
just in case...

No change to performance, which is probably what we expect.
2017-08-27 14:36:04 +02:00
Niklas Haas 136cf2b770 vo_opengl: add support for UBOs
Not actually used by anything yet, but straightforward enough to add to
the RA API for starters.
2017-08-27 14:36:00 +02:00
Niklas Haas 8404a354e5 vo_opengl: clarify RA_CAP_DIRECT_UPLOAD
This no longer concerns the API user except in as much as the API user
probably wants to know whether or not PBOs are active, so keep around
the CAP field even though it's mostly useless now.
2017-08-27 14:36:00 +02:00
Niklas Haas 7684fda6ac vo_opengl: refactor shader_cache binding
There's no reason to be needlessly wasteful with our binding points
here. Just add a CAP for it.
2017-08-27 14:36:00 +02:00
Niklas Haas 45bae90f4d vo_opengl: be explicit about IMG_RW
Both vulkan and opengl distinguish between rendering to an image and
using an image as a storage attachment. So make this an explicit
capability instead of lumping it in with render_dst. (That way we could
support, for example, using an image as a storage attachment without
requiring a framebuffer)

The real reason for this change is that you can directly use the output
FBO as a storage attachment on vulkan but you can't on opengl, which
makes this param strictly separate from render_dst.
2017-08-27 14:36:00 +02:00
Niklas Haas f40717a664 vo_opengl: use size_t offset for vertex offsets
I don't like the feeling of "reusing" the int binding for this. It
feels... wrong, somehow. I'd prefer to use an explicit "offset" field.
(Plus, I might re-use this for uniform buffers or something)

YMMV
2017-08-27 14:36:00 +02:00
wm4 68dc7d1695 vo_opengl: allow selection of true 32 bit float if float16 unavailable
Shouldn't make a difference for OpenGL (even with the weird duplication
of these functions removed). Might be useful for the WIP vulkan backend.
2017-08-24 22:44:41 +02:00
wm4 1b2185657f vo_direct3d: fix build
Broken by previous commit. Fix completely untested.
2017-08-22 17:32:05 +02:00
wm4 03cf150ff3 video: redo video equalizer option handling
I really wouldn't care much about this, but some parts of the core code
are under HAVE_GPL, so there's some need to get rid of it. Simply turn
the video equalizer from its current fine-grained handling with vf/vo
fallbacks into global options. This makes updating them much simpler.

This removes any possibility of applying video equalizers in filters,
which affects vf_scale, and the previously removed vf_eq. Not a big
loss, since the preferred VOs have this builtin.

Remove video equalizer handling from vo_direct3d, vo_sdl, vo_vaapi, and
vo_xv. I'm not going to waste my time on these legacy VOs.

vo.eq_opts_cache exists _only_ to send a VOCTRL_SET_EQUALIZER, which
exists _only_ to trigger a redraw. This seems silly, but for now I feel
like this is less of a pain. The rest of the equalizer using code is
self-updating.

See commit 96b906a51d for how some video equalizer code was GPL only.
Some command line option names and ranges can probably be traced back to
a GPL only committer, but we don't consider these copyrightable.
2017-08-22 17:01:35 +02:00
wm4 d2bdb72b69 options: add a thread-safe way to notify option updates
So far, we had a thread-safe way to read options, but no option update
notification mechanism. Everything was funneled through the main thread's
central mp_option_change_callback() function. For example, if the
panscan options were changed, the function called vo_control() with
VOCTRL_SET_PANSCAN to manually notify the VO thread of updates. This
worked, but it's pretty inconvenient. Most of these problems come from the
fact that MPlayer was written as a single-threaded program.

This commit works towards a more flexible mechanism. It adds an update
callback to m_config_cache (the thing that is already used for
thread-safe access of global options).

This alone would still be rather inconvenient, at least in context of
VOs. Add another mechanism on top of it that uses mp_dispatch_queue, and
takes care of some annoying synchronization issues. We extend
mp_dispatch_queue itself to make this easier and slightly more
efficient.

As a first application, use this to reimplement certain VO scaling and
renderer options. The update_opts() function translates these to the
"old" VOCTRLs, though.

An annoyingly subtle issue is that m_config_cache's destructor now
releases pending notifications, and must be released before the
associated dispatch queue. Otherwise, it could happen that option
updates during e.g. VO destruction queue or run stale entries, which is
not expected.

Rather untested. The singly-linked list code in dispatch.c is probably
buggy, and I bet some aspects about synchronization are not entirely
sane.
2017-08-22 15:50:33 +02:00
Niklas Haas 09c501a40e vo_opengl: refactor tex_upload to ra_buf_pool
Also refactors the usage of tex_upload to make ra_tex_upload_pbo a
RA-internal thing again.

ra_buf_pool has the main advantage of being dynamically sized depending
on buf_poll, so for OpenGL we'll end up only using one buffer (when not
persistently mapping) - while for vulkan we'll use as many as necessary,
which depends on the swapchain depth anyway.
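
A rough sketch of a pool that sizes itself from polling, under the assumption that a buffer is reused only once the backend reports it idle. `struct buf`, `buf_poll` and `pool_get` are hypothetical simplifications, not the real ra_buf_pool code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

// Stand-in for a GPU buffer; `busy` models "the GPU is still using it".
struct buf { bool busy; };

struct buf_pool {
    struct buf **bufs;
    int num;    // number of buffers currently in the pool
    int index;  // next buffer to hand out
};

// Stand-in for the backend's poll callback: true if the buffer is idle.
static bool buf_poll(struct buf *b) { return !b->busy; }

// Return an idle buffer, growing the pool only when the next candidate
// is still in use. Returns NULL on allocation failure.
static struct buf *pool_get(struct buf_pool *p)
{
    if (p->num == 0 || !buf_poll(p->bufs[p->index])) {
        struct buf **grown = realloc(p->bufs, (p->num + 1) * sizeof(*grown));
        if (!grown)
            return NULL;
        p->bufs = grown;
        struct buf *fresh = calloc(1, sizeof(*fresh));
        if (!fresh)
            return NULL;
        // Insert the new buffer at the current position.
        for (int i = p->num; i > p->index; i--)
            p->bufs[i] = p->bufs[i - 1];
        p->bufs[p->index] = fresh;
        p->num++;
    }
    struct buf *b = p->bufs[p->index];
    p->index = (p->index + 1) % p->num;
    return b;
}

static void pool_free(struct buf_pool *p)
{
    for (int i = 0; i < p->num; i++)
        free(p->bufs[i]);
    free(p->bufs);
    *p = (struct buf_pool){0};
}
```

If every buffer is idle again by the time it comes around, the pool stays at one entry; if the GPU lags behind by several uploads, it grows to however many are in flight.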
2017-08-22 09:55:49 +02:00
wm4 437469c103 x11: fix that window could be resized when using embedding
Somewhat lazy fix. The code isn't particularly robust or correct wrt.
window embedding.

Fixes #4784.
2017-08-21 15:15:55 +02:00
Martin Herkt 82d9419f62 Revert "x11: drop xscrnsaver use"
This broke screensaver/powersave inhibition with at least KDE and
LXDE. This is a release blocker.

Since fdo, KDE and GNOME idiots seem to be unable to reach
a consensus on a simple protocol, this seems unlikely to get
fixed upstream this year, so revert this change.

Fixes #4752.
Breaks #4706 but I don’t give a damn.

This reverts commit 3f75b3c343.
2017-08-20 09:18:39 +02:00
Martin Herkt a82007dd1e Revert "x11: use xdg-screensaver suspend/resume"
This reverts commit 6694048272.
2017-08-20 09:11:07 +02:00
James Ross-Gowan 2d78705b73 context_angle: remove unused variable
Unused since 16e0a39482.
2017-08-20 16:46:02 +10:00
James Ross-Gowan 08ec444ba5 context_angle: replace hard-coded array size 2017-08-19 22:05:29 +10:00
Akemi 344b75f52d osx: code cleanups and cosmetic fixes
silence build warnings, clean up code style and remove unused code.
2017-08-18 19:47:47 +02:00
Niklas Haas 01058b16f9 vo_opengl: allow texture uploads to fail
Surprisingly makes the code shorter, not longer
2017-08-18 02:33:29 +02:00
Niklas Haas abb7e88e3c vo_opengl: clarify the ra_fns.debug_marker 2017-08-18 01:10:37 +02:00
Niklas Haas 46d86da630 vo_opengl: refactor RA texture and buffer updates
- tex_uploads args are moved to a struct
- the ability to directly upload texture data without going through a
  buffer is made explicit
- the concept of buffer updates and buffer polling is made more explicit
  and generalized to buf_update as well (not just mapped buffers)
- the ability to call tex_upload/buf_update on a tex/buf is made
  explicit during tex/buf creation
- uploading from buffers now uses an explicit offset instead of
  implicitly comparing *src against buf->data, because not all buffers
  may actually be persistently mapped
- the initial_data = immutable requirement is dropped. (May be re-added
  later for D3D11 if that ever becomes a thing)

This change helps the vulkan abstraction immensely and also helps move
common code (like the PBO pooling) out of ra_gl and into the
opengl/utils.c

This also technically has the side-benefit / side-constraint of using
PBOs for OSD texture uploads as well, which actually seems to help
performance on machines where --opengl-pbo is faster than the naive code
path. Because of this, I decided to hook up the OSD code to the
opengl-pbo option as well.

One drawback of this refactor is that the GL_STREAM_COPY hack for
texture uploads "got lost", but I think I'm happy with that going away
anyway since DR almost fully deprecates it, and it's not the "right
thing" anyway - but instead an nvidia-only hack to make this stuff work
somewhat better on NUMA systems with discrete GPUs.

Another change is that due to the way fencing works with ra_buf (we get
one fence per ra_buf per upload) we have to use multiple ra_bufs instead
of offsets into a shared buffer. But for OpenGL this is probably better
anyway. It's possible that in future, we could support having
independent “buffer slices” (each with their own fence/sync object), but
this would be an optimization more than anything. I also think that we
could address the underlying problem (memory closeness) differently by
making the ra_vk memory allocator smart enough to chunk together
allocations under the hood.
2017-08-18 00:34:34 +02:00
Niklas Haas 9ca5a2a5d8 vo_opengl: make blitting an explicit capability
Instead of merging it into render_dst. This is better for vulkan,
because blitting in vulkan both does not require a FBO *and* requires a
different image layout.

Also less "hacky" for OpenGL, since now the weird blit=FBO requirement
is an implementation detail of ra_gl
2017-08-18 00:19:14 +02:00
Niklas Haas 8209376468 vo_opengl: make ra_fns.timer_create optional 2017-08-18 00:19:14 +02:00
Niklas Haas b74067bc74 vo_opengl: remove redundant #defines in unsharp_hook
These are no longer valid anyway, and the code doesn't use them.
2017-08-17 09:12:28 +02:00
James Ross-Gowan 16e0a39482 vo_opengl: extract non-ANGLE specific D3D11 code
This extracts non-ANGLE specific code to d3d11_helpers.c, which is
modeled after egl_helpers.c. Currently the only consumer is
context_angle.c, but in future this may allow the D3D11 device and
swapchain creation logic to be reused in other backends.

Also includes small improvements to D3D11 device creation. It is now
possible to create feature level 11_1 devices (though ANGLE does not
support these,) and BGRA swapchains, which might be slightly more
efficient than ARGB, since it's the same format used by the compositor.
2017-08-17 00:28:38 +10:00
wm4 6694048272 x11: use xdg-screensaver suspend/resume
If it doesn't work this time, I'll remove all X11 screensaver code.

Fixes #4763.
2017-08-15 20:32:44 +02:00
wm4 34ab0386cb vo_rpi: fix operation
Commit 697c4389a9 worked "almost". I couldn't test it at the time.
2017-08-15 19:41:23 +02:00
wm4 935df644af vo_opengl: fix incorrect glBindFramebuffer() call
Used the wrong binding.
2017-08-15 19:08:54 +02:00
wm4 b44e81d9c3 vo_opengl: fix dangling pointers when VAOs are not available
This is for legacy GL: if VAOs are not available, the helper has to
specify vertex attributes again on every rendering. gl_vao_init() keeps
the vertex array for this purpose. Unfortunately, a temporary argument
was passed to the function, instead of the permanent copy.

Also, it didn't use num_entries (instead expected the array being
terminated by a {0} entry). Fix that source code indentation too.
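
The class of bug described here (keeping a pointer to an array that does not outlive the call) can be illustrated with the sketch below. It takes a different route from the commit, which passes the permanent copy: here the helper makes its own copy, and it takes an explicit entry count instead of relying on a {0} terminator. All names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical description of one vertex attribute.
struct vertex_attrib {
    const char *name;
    int num_elems;
    size_t offset;
};

struct vao {
    struct vertex_attrib *entries; // owned copy, valid until vao_uninit()
    int num_entries;
};

// Copy the caller's attribute array so later draws never read from
// storage the caller may already have released. Returns 0 on success.
static int vao_init(struct vao *vao, const struct vertex_attrib *entries,
                    int num_entries)
{
    assert(vao && entries && num_entries > 0);
    vao->entries = malloc((size_t)num_entries * sizeof(*entries));
    if (!vao->entries)
        return -1;
    memcpy(vao->entries, entries, (size_t)num_entries * sizeof(*entries));
    vao->num_entries = num_entries;
    return 0;
}

static void vao_uninit(struct vao *vao)
{
    free(vao->entries);
    *vao = (struct vao){0};
}
```

Copying costs one small allocation per VAO, but it removes any lifetime requirement on the caller's array.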
2017-08-15 18:59:59 +02:00
wm4 63b1031ca2 vo_opengl: support float pixel formats
Like AV_PIX_FMT_GBRPF32LE.
2017-08-15 17:00:35 +02:00
wm4 df8cc84f47 vo_opengl: move DR image layouting code to renderer
No reason to have it in a higher level.
2017-08-14 19:57:44 +02:00
wm4 cacc6db2a3 vo_opengl: hwdec_vdpau: use correct source texture size
In commit c6fafbffac we accidentally set the logical texture size to
the cropped video size, which is not correct. This caused rendering
artifacts in some cases.

Use the video surface's size instead. Since the current mp_image_params
contains the cropped size only, wrapper texture creation has to be moved
to the _map function. Move the same code for the mixer case (strictly
speaking this is not needed, but seems more symmetric).

(Also there is no need to clear gl_textures on uninit - leftover from
the old hwdec mapper API. So we just drop that part.)

Fixes #4760.
2017-08-14 12:17:39 +02:00
wm4 c20df5b3e1 vo_opengl: hwdec_ios: fix build 2017-08-11 22:00:44 +02:00
wm4 8b1d4b978d vo_opengl: remove some dead code
These were replaced by ra equivalents, and with the recent changes, all
of them became fully unused.
2017-08-11 21:29:35 +02:00
wm4 1d0bf4073b vo_opengl: handle probing GL texture formats better
Retrieve the depth for each component and internal texture format
separately. Only for 8 bit per component textures we assume that all
bits are used (or else we would in my opinion create too many probe
textures).

Assuming 8 bit components are always correct also fixes operation in
GLES3, where we assumed that each component had -1 bits depth, and thus
all UNORM formats were considered unusable. On GLES, the function to
check the real bit depth is not available. Since GLES has no 16 bit
UNORM textures at all, except with the MPGL_CAP_EXT16 extension, just
drop the special condition for it. (Of course GLES still manages to
introduce a funny special case by allowing GL_LUMINANCE, but not
defining GL_TEXTURE_LUMINANCE_SIZE.)

Should fix #4749.
2017-08-11 21:29:35 +02:00
wm4 e7a9bd6937 vo_opengl: remove another unneeded GL include
Getting mp_pass_perf seriously requires including vo.h???
2017-08-11 21:29:35 +02:00
wm4 697c4389a9 rpi: fix build
Runtime untested, because I get this:

  [vo/rpi] Could not get DISPMANX objects.

This happened even when building older git versions, and on a RPI image
that hasn't changed in the recent years. I don't know how to make this
POS work again, so I guess if there's a bug in the new code, it will
remain broken.
2017-08-11 21:29:35 +02:00
wm4 de3eecce7f vo_opengl: move strictly private ra_gl structs to .c file
So that nothing accidentally accesses these.
2017-08-11 21:29:35 +02:00
wm4 fba4e8aa40 vo_opengl: remove some indirect GL header inclusions from core renderer 2017-08-10 21:36:57 +02:00
wm4 c6fafbffac vo_opengl: separate hwdec context and mapping, port it to use ra
This does two separate rather intrusive things:

 1. Make the hwdec context (which does initialization, provides the
    device to the decoder, and other basic state) and frame mapping
    (getting textures from a mp_image) separate. This is more
    flexible, and you could map multiple images at once. It will
    help removing some hwdec special-casing from video.c.
 2. Switch all hwdec API use to ra. Of course all code is still
    GL specific, but in theory it would be possible to support other
    backends. The most important change is that the hwdec interop
    returns ra objects, instead of anything GL specific. This removes
    the last dependency on GL-specific header files from video.c.

I'm mixing these separate changes because both require essentially
rewriting all the glue code, so better do them at once. For the same
reason, this change isn't done incrementally.

hwdec_ios.m is untested, since I can't test it. Apart from superficial
mistakes, this also requires dealing with Apple's texture format
fuckups: they force you to use GL_LUMINANCE[_ALPHA] instead of GL_RED
and GL_RG. We also need to report the correct format via ra_tex to
the renderer, which is done by find_la_variant(). It's unknown whether
this works correctly.

hwdec_rpi.c as well as vo_rpi.c are still broken. (I need to pull my
RPI out of a dusty pile of devices and cables, so, later.)
2017-08-10 21:24:31 +02:00
wm4 b2fb3f1340 vo_opengl: hwdec_cuda: fix filtering mode
Probably explains quality issues in some cases.
2017-08-09 21:02:23 +02:00
wm4 9c5dcf9398 vo_opengl: shrink the hwdec overlay API
Just remove one callback, and fold the functionality into the other one.
RPI will still not compile, so the hwdec_rpi.c changes are untested.
2017-08-09 20:57:37 +02:00
wm4 7397e8ab42 vo_opengl: add a hack for Apple's broken iOS hwdec stuff
As seen in hwdec_ios.m, it insists on using the legacy luminance alpha
formats for mapped textures.
2017-08-08 17:53:19 +02:00
Niklas Haas 6087f63003 vo_opengl: go back to using GL_TIME_ELAPSED
Less flexible than GL_TIMESTAMP but supported by more platforms. This
will mean that nested queries have to be detected and silently omitted,
but oh well. Not much use for them anyway.

Fixes #4721.
2017-08-08 17:08:25 +02:00
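A GL_TIME_ELAPSED query cannot be started while another one is active, which is why commit 6087f63003 above has to detect nested queries and drop them. The following is only a sketch of that bookkeeping with no GL calls in it; `example_timer_ctx`, `example_timer` and both functions are hypothetical names, not mpv's.

```c
#include <stdbool.h>

// Hypothetical per-context state: is any elapsed-time query running?
struct example_timer_ctx {
    bool query_active;
};

// Hypothetical timer handle.
struct example_timer {
    bool running;
};

// Start a timer. Returns false (and measures nothing) if another timer is
// already running, i.e. the nested query is silently omitted.
static bool example_timer_start(struct example_timer_ctx *ctx,
                                struct example_timer *t)
{
    if (ctx->query_active)
        return false;
    ctx->query_active = true;
    t->running = true;
    return true;
}

// Stop a timer. A timer that was skipped at start time is a no-op here.
static void example_timer_stop(struct example_timer_ctx *ctx,
                               struct example_timer *t)
{
    if (!t->running)
        return;
    t->running = false;
    ctx->query_active = false;
}
```

With this shape, callers can wrap any block in start/stop without knowing whether an outer timer is already measuring it.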
wm4 0b10a07b63 vo_opengl: don't call glGetProgramBinary if GL_PROGRAM_BINARY_LENGTH==0
Noticed in #4717, although the issue might be about something else.
2017-08-08 13:16:37 +02:00
wm4 3f75b3c343 x11: drop xscrnsaver use
It's an ancient X11 protocol extension that apparently nobody uses
anymore (desktop environments in particular have replaced it with
equally bad protocols that require tons of dependencies). Users keep
complaining about it being a required dependency.

The impact is likely minimal to none.

Fixes #4706 and other annoying people.
2017-08-08 12:55:41 +02:00
wm4 c1bcd30b09 vo_opengl: cosmetics to comments 2017-08-08 11:38:29 +02:00
wm4 61c8a147b5 vo_opengl: call ra_free() in the correct context
This also fixes a double free in vo_opengl_cb.c.
2017-08-07 19:57:15 +02:00
wm4 168ffbaf23 client API: more opengl_cb clarifications
Also fix a typo in ra_gl.c. Too greedy for a separate commit.
2017-08-07 19:24:25 +02:00
wm4 d45fbecbb5 vo_opengl: add another ra_format field to exclude insane formats
Generic description of pixel formats is hard. In this case, the Apple
special format for packed YUV could have been interpreted as a RGB
format with funny packing.
2017-08-07 19:18:58 +02:00
wm4 47ea771b7a vo_opengl: further GL API use separation
Move multiple GL-specific things from the renderer to other places like
vo_opengl.c, vo_opengl_cb.c, and ra_gl.c.

The vp_w/vp_h parameters to gl_video_resize() make no sense anymore, and
are implicitly part of struct fbodst.

Checking the main framebuffer depth is moved to vo_opengl.c. For
vo_opengl_cb.c it always assumes 8. The API user now has to override
this manually. The previous heuristic didn't make much sense anyway.

The only remaining dependency on GL is the hwdec stuff, which is harder
to change.
2017-08-07 19:17:28 +02:00
wm4 1adf324d8b vo_opengl: fix minor memory leak
Don't leak the buffer if glGetProgramBinary() fails.
2017-08-07 18:46:40 +02:00
Niklas Haas bed421d483 vo_opengl: nuke ra_gl->first_run
Completely unnecessary, we can just update the uniforms immediately
after creating the program. In theory, for GLSL 4.20+, we could even
skip this, but oh well.
2017-08-07 17:47:04 +02:00
Niklas Haas ecbb02148b vo_opengl: better formatting for enum RA_CAP
Also fixes an issue where 1 << 5 was used twice, probably because of the
terrible formatting obscuring this bug.
2017-08-07 17:46:04 +02:00
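The duplicate `1 << 5` fixed in ecbb02148b is the kind of bug that aligned formatting makes visible and that a compile-time check can catch outright. A minimal sketch, assuming every flag is a single bit; the `EX_CAP_*` names are made up for illustration and are not the real RA_CAP values.

```c
#include <assert.h>

// Hypothetical capability flags, one bit each, aligned so that a repeated
// shift count stands out when reading the column.
enum example_cap {
    EX_CAP_TEX_1D  = 1 << 0,
    EX_CAP_TEX_3D  = 1 << 1,
    EX_CAP_BLIT    = 1 << 2,
    EX_CAP_COMPUTE = 1 << 3,
    EX_CAP_PBO     = 1 << 4,
    EX_CAP_NESTED  = 1 << 5,
};

#define EX_CAP_OR  (EX_CAP_TEX_1D | EX_CAP_TEX_3D | EX_CAP_BLIT | \
                    EX_CAP_COMPUTE | EX_CAP_PBO | EX_CAP_NESTED)
#define EX_CAP_SUM (EX_CAP_TEX_1D + EX_CAP_TEX_3D + EX_CAP_BLIT + \
                    EX_CAP_COMPUTE + EX_CAP_PBO + EX_CAP_NESTED)

// If two single-bit flags shared a bit, the OR would be smaller than the
// sum and compilation would fail here.
static_assert(EX_CAP_OR == EX_CAP_SUM, "two capability flags share a bit");
```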
Niklas Haas 01a40bb1ee vo_opengl: also support RA_VARTYPE_INT vertex attribs
No reason not to.
2017-08-07 17:46:04 +02:00
wm4 346ac1e09f vo_opengl: simplify mirroring and fix it if glBlitFramebuffer is used
The vp_w/vp_h variables and parameters were not really used anymore
(they were redundant with ra_tex w/h) - but vp_h was still used to
identify whether rendering should be done mirrored.

Simplify this by adding a fbodst struct (some bad naming), which
contains the render target texture, and some parameters how it should be
rendered to (for now only flipping). It would not be appropriate to make
this a member of ra_tex, so it's a separate struct.

Introduces a weird regression for the first frame rendered after
interpolation is toggled at runtime, but seems to work otherwise. This
is possibly due to the change that blit() now mirrors, instead of just
copying. (This is also why ra_fns.blit is changed.)

Fixes #4719.
2017-08-07 16:44:15 +02:00
wm4 41ee66d566 vo_opengl: drop pointless fbotex_init() function 2017-08-07 14:34:18 +02:00
Niklas Haas 9581fbe569 vo_opengl: generalize ra_buf to support other buffer objects
This allows us to integrate PBOs and SSBOs into the same abstraction,
with the potential to easily add UBOs if the need arises.
2017-08-07 12:46:30 +02:00
Akemi f550fdaa91 cocoa: add an option to disable the native macOS fullscreen
Fixes #4014
2017-08-06 22:48:26 +02:00
Niklas Haas 494aa0f651 vo_opengl: only mark frames as fresh if they contain a new image
When using dumb mode, we can actually redraw a frame without uploading
it. Marking this as fresh as well results in unpredictable pass
behavior, which is confusing and makes debugging harder. So mark it as a
redraw instead, in that case.
2017-08-06 02:51:11 +02:00
Niklas Haas 988d188d96 vo_opengl: drop ra_gl.h from shader_cache.c
Since the GL *gl is no longer needed for the timers, we can get rid of
the sc->gl dependency. This requires moving a utility function (which is
not GL-specific anyway) out of gl_utils.h and into utils.h
2017-08-06 00:10:22 +02:00
Niklas Haas e5748e891f vo_opengl: measure pass_draw_osd as a whole
In the past, this always measured the per-shader execution times of the
individual OSD parts, which was thrown off because the shader was reused
anyway. (And apparently recording the OSD shader execution times was
removed completely, probably because of them being so unreliable
anyway.)

Since ra_timer no longer has the restriction of not allowing timers to
run concurrently, we can just wrap the entire OSD block inside a single
osd_timer now, and record that. (Technically, this can still be off when
using --blend-subtitles=video/yes and showing a full-screen OSD at the
same time. Maybe this can be done better?)
2017-08-06 00:10:20 +02:00
Niklas Haas f2298f394e vo_opengl: move timers to struct ra
In order to prevent code duplication and keep the ra abstraction as
small as possible, `ra` only implements the actual timer queries,
it does not do pooling/averaging of the results. This is instead moved
to a ra-neutral struct timer_pool in utils.c.
2017-08-06 00:10:20 +02:00
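The split described in f2298f394e keeps the backend responsible only for raw timer results, while pooling and averaging live in backend-neutral code. The sketch below shows one way such a pool could average the most recent samples; the struct layout, the window size and the function names are assumptions for illustration, not the actual `timer_pool` in utils.c.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_POOL_SAMPLES 4  // hypothetical window size

// Hypothetical backend-neutral pool: a ring buffer plus a running sum.
struct example_timer_pool {
    uint64_t samples[EXAMPLE_POOL_SAMPLES];
    int count;     // number of valid samples, at most EXAMPLE_POOL_SAMPLES
    int next;      // slot that the next sample is written to
    uint64_t sum;  // sum of all valid samples
};

// Record one raw result (in nanoseconds) as returned by the backend timer.
static void example_pool_add(struct example_timer_pool *p, uint64_t ns)
{
    assert(p);
    if (p->count == EXAMPLE_POOL_SAMPLES)
        p->sum -= p->samples[p->next];  // overwrite the oldest sample
    else
        p->count++;
    p->samples[p->next] = ns;
    p->sum += ns;
    p->next = (p->next + 1) % EXAMPLE_POOL_SAMPLES;
}

// Average over the samples currently in the window, or 0 if there are none.
static uint64_t example_pool_avg(const struct example_timer_pool *p)
{
    return p->count ? p->sum / (uint64_t)p->count : 0;
}
```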
wm4 56742ecdc9 vo_opengl: ra_gl: make getting GL ptr slightly less tedious 2017-08-05 17:09:25 +02:00
wm4 dddda6e4a5 vo_opengl: move GL state resetting to vo_opengl_cb
This code is pretty much for the sake of vo_opengl_cb API users. It
resets certain state that either the user or our code doesn't reset
correctly. This is somewhat outdated. With GL implicit state being
so awfully large, it seems more reasonable to require that any code
restores the default state when returning to the caller. Some
exceptions are defined in opengl_cb.h.
2017-08-05 16:27:09 +02:00
wm4 333cae74ef vo_opengl: move shader handling to ra
Now all GL-specifics of shader compilation are abstracted through ra.
Of course we still have everything hardcoded to GLSL - that isn't going
to change.

Some things will probably change later - in particular, the way we pass
uniforms and textures to the shader. Currently, there is a confusing
mismatch between "primitive" uniforms like floats, and others like
textures.

Also, SSBOs are not abstracted yet.
2017-08-05 16:27:09 +02:00
wm4 f72a33d2cb vo_opengl: organize ra PBO flag slightly differently
Instead of having a mutable ra_tex field (and the only one), move the
flag to struct ra, since we have only 2 tex_upload user calls anyway,
and both want the same PBO behavior. (At first I considered making it
a RA_TEX_UPLOAD_ flag, but why bother. PBOs are a terribly GL-specific
thing, so we can't expect a reasonable abstraction of it anyway.)
2017-08-05 13:48:46 +02:00
wm4 dd096863fa vo_opengl: make OSD code use ra for textures
This requires a silly extension to ra_fns.tex_upload: since the OSD
texture can be much larger than the actual OSD image data to upload, a
mechanism for uploading only to a small part of the texture is needed.
Otherwise, we'd have to realloc/copy the data, just to pad it, and then
pay for uploading the padding too.

The RA_TEX_UPLOAD_DISCARD flag is not interpreted by GL (not sure how
you'd tell GL about this), but it clarifies the API and might be
helpful if we support other backend APIs in the future.
2017-08-05 13:44:30 +02:00
wm4 8dd4ae13ff vo_opengl: restore OSX "old" hwdec
Probably. Untested.
2017-08-05 13:09:05 +02:00
wm4 aac04c0d64 vo_opengl: split utils.c/h
Actually GL-specific parts go into gl_utils.c/h, the shader cache
(gl_sc*) into shader_cache.c/h.

No semantic changes of any kind, except that the VAO helper is made
public again as part of gl_utils.c (all while the goal for gl_utils.c
itself is to be included by GL-specific code).
2017-08-05 13:09:05 +02:00
wm4 fa4a1c4675 vo_opengl: always use GL_TRIANGLES for all primitives
Will make the ra layer _slightly_ simpler.
2017-08-05 13:09:05 +02:00
wm4 0206efa94a vo_opengl: pass ra objects during rendering instead of GL objects
Another "small" step towards removing GL dependencies from the renderer.
This commit generally passes ra_tex objects instead of GL FBO integer
IDs to various rendering functions. video.c still manually binds the
FBOs when calling shaders.

This also happens to fix a memory leak with output_fbo.
2017-08-05 13:09:05 +02:00
wm4 a796745fd2 vo_opengl: make fbotex helper use ra
Further work removing GL dependencies from the actual video renderer,
and moving them into ra backends.

Use of glInvalidateFramebuffer() falls away. I'd like to keep this, but
it's better to readd it once shader runs are in ra.
2017-08-05 13:09:05 +02:00
wm4 90b53fede6 vo_opengl: drop unused custom texture filter for FBO helper 2017-08-05 13:09:05 +02:00
James Ross-Gowan 037c7a9279 w32_common: handle media keys
This was attempted before in fc9695e63b, but it was reverted in
1b7ce759b1 because it caused conflicts with other software watching
the same keys (See #2041.) It seems like some PCs ship with OEM software
that watches the volume keys without consuming key events and this
causes them to be handled twice, once by mpv and once by the other
software.

In order to prevent conflicts like this, use the WM_APPCOMMAND message
to handle media keys. Returning TRUE from the WM_APPCOMMAND handler
should indicate to the operating system that we consumed the key event
and it should not be propagated to the shell. Also, we now only listen
for keys that are directly related to multimedia playback (eg. the
APPCOMMAND_MEDIA_* keys.) Keys like APPCOMMAND_VOLUME_* are ignored, so
they can be handled by the shell, or by other mixer software.
2017-08-05 02:38:44 +10:00
Rostislav Pehlivanov e406e81477 vo_opengl: always print when getting embedded ICC profile data
The printout in get_vid_profile() gets skipped if icc caching has
been enabled, so always print if an embedded ICC profile has been
provided.
2017-08-04 09:50:13 +01:00
Niklas Haas fee6b287a5 vo_opengl: support embedded ICC profiles
This currently only works when using lcms-based color management
(--icc-profile-*).

In principle, we could also support using lcms even when the user has
not specified an ICC profile, by generating the profile against a fixed
reference (--target-prim/--target-trc) instead. I still might do that
some day, simply because 3dlut provides a higher quality conversion than
our simple gamut mapping does for stuff like BT.2020, and also because
it's now needed to enable embedded ICC profiles. But that would be a
separate change, so preserve the status quo for now.

(Besides, my opinion is still that you should be using an ICC profile if
you care about colors being accurate _at all_)
2017-08-03 21:48:25 +02:00
Niklas Haas 0f956f0929 vo_opengl: use GL_CLIENT_STORAGE_BIT for DR
mesa won't pick client storage unless this bit is set, and we
*absolutely* want to be using client storage for our DR PBOs.
Performance is shit on AMD otherwise. (Nvidia always uses client storage
for persistent coherent buffers whether you tell it it or not, probably
because it's way faster and nvidia doesn't trust users to figure that
out on their own)
2017-08-03 20:06:58 +02:00
wm4 7625bcc716 vo_opengl: remove unused ra_mapped_buffer.preferred_align field
It makes no sense to have this on an already created buffer.

If anything, the ra backend would have to export this as a global value
(e.g. struct ra field), so that whatever allocates the buffer can
account for the required alignment. Since this code is in vo_opengl.c in
the first place, and since GL doesn't dictate any special alignment
here, it doesn't make sense in the first place to export this. (Maybe
something like this will be required later.)
2017-08-03 18:59:43 +02:00
Niklas Haas 2bf094cd55 vo_opengl: don't hardcode texmap0 for polar compute
This was an oversight. The ID shouldn't be hard-coded here, so add it to
sampler_prelude instead.
2017-08-03 18:55:52 +02:00
Niklas Haas 5e89aed934 vo_opengl: don't precompute texcoord in global scope
Breaks on mesa for whatever reason... even though it doesn't generate a
GLSL shader compiler error

Shouldn't make a performance difference for us because we cache `pos`
anyway, and most compute shaders will probably cache all of their
samples to shmem. Might have to re-visit this when we have an actual use
case for repeated sampling inside CS though. (RAVU + anti-ringing is a
possible candidate for that)
2017-08-03 18:50:07 +02:00
Niklas Haas 83f3910398 vo_opengl: make compute shaders more flexible
This allows users to do their own custom sample writing, mainly meant to
address use cases such as RAVU. Also clean up the compute shader code a
bit.
2017-08-03 18:27:36 +02:00
wm4 e7d31d12be vo_opengl: add legend for texture format debug dump 2017-08-03 16:19:57 +02:00
wm4 1479c7bd0d vo_opengl: give special Apple name a more appropriate name
Or less appropriate, as some would argue. The new name is short for
"Apple YUV packed".

(This format is needed only for hardware decoding on rather old Apple
hardware, and a very annoying special case.)
2017-08-03 16:19:56 +02:00
wm4 ffe0526064 vo_opengl: simplify/fix user shader textures
This broke float textures, which were actually used by some shaders.
There were probably some other bugs as well.

Lots of code can be avoided by using ra_tex_params directly, so do that.

The main change is that COMPONENT/FORMAT are replaced by a single FORMAT
directive, which takes different parameters now. Due to the mess with
16/32 bit float textures, and because we want to support other APIs than
just GL in the future, it's not really clear how this should be handled,
and the nice component/type separation makes things actually harder. So
just jump the gun and use the ra_format.name names, which were
originally meant mostly for debugging. (This is probably something that
will be regretted later.)

Still only superficially tested, but seems to work.

Fixes #4708.
2017-08-03 16:19:49 +02:00
Niklas Haas 2bcf04a7bd vo_opengl: fix constexprs on ANGLE
I hate GLES
2017-08-03 14:27:38 +02:00
Niklas Haas 8f484567fc vo_opengl: fix HLG OOTF inverse
Got the "sign" of the second multiplication wrong.
2017-08-03 14:26:35 +02:00
Niklas Haas 5e1e7d32e8 vo_opengl: generalize HDR tone mapping to gamut mapping
Since this code was already written for HDR, and is now per-channel
(because it works better for HDR as well), we can actually reuse this to
get very high quality gamut mapping without clipping. The only required
change is to move the tone mapping from before the gamut map to after
the gamut map. Additionally, we need to also account for changes in the
signal range as a result of applying the CMS when we compute ref_peak,
which is fortunately pretty easy because we only need to consider the
case of primaries mapping to themselves.

Since `HDR` no longer really makes sense as a label, rename it to
`--tone-mapping` in general. Also fits better with
`--tone-mapping-desat` etc.

Arguably we could also rename `--hdr-compute-peak`, but that option is
basically only useful for HDR content anyway because we don't need
information about the signal range for gamut mapping.

This (finally!) gives us reasonably high quality gamut mapping even in
the absence of an ICC profile / 3DLUT.
2017-08-03 12:46:57 +02:00
Niklas Haas 6074cfdfd4 vo_opengl: implement HLG OOTF inverse
Huge thanks to @rusxg for finding this solution, which was previously
believed not to exist. Of course, we still don't actually need it, but I
don't want to leave this half-implemented in case somebody does in the
future.
2017-08-03 12:05:37 +02:00
Alex Notes bda32d99d7 cocoa: fix the support of multiple renderers (GPU switch)
So far, switching between integrated and discrete GPU would cause the
kernel to kill mpv due to an indecipherable buffer error. The technical
note TN2229 from Apple recommends enabling OpenGL Offline Renderers for
every Mac with more GPUs than displays to handle the switch between GPUs.

By ordering the array from the least commonly rejected to the most,
we can sequentially remove PixelFormat attributes to fit the host.

Fixes #2371
2017-07-31 20:23:58 +02:00
Akemi 80758eda17 cocoa: remove usage of FFABS and the dependency on libavutil/common.h 2017-07-31 20:22:33 +02:00
Akemi b726f1eb90 cocoa: distinguish between horizontal and vertical scroll
we need to switch the x and y deltas when Shift is being held because
macOS switches them around. otherwise we would get a horizontal scroll
on a vertical one and vice versa.

additionally we switch from deltaX/Y to scrollingDeltaX/Y since the Apple
docs suggest it's the preferred way now. in my tests both reported the
same values on imprecise scrolls though.
2017-07-31 20:22:33 +02:00
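The Shift handling described in b726f1eb90 amounts to swapping the two axes back before deciding whether a scroll is horizontal or vertical. The real code is Objective-C in the cocoa backend; the plain C sketch below only illustrates the swap, and `scroll_delta` and `undo_shift_swap` are hypothetical names.

```c
#include <stdbool.h>

// Hypothetical container for one scroll event's deltas.
struct scroll_delta {
    double x, y;
};

// macOS reports a vertical wheel movement on the x axis while Shift is
// held, so swap the axes back to recover what the user physically did.
static struct scroll_delta undo_shift_swap(struct scroll_delta d,
                                           bool shift_held)
{
    if (shift_held) {
        double tmp = d.x;
        d.x = d.y;
        d.y = tmp;
    }
    return d;
}
```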
wm4 53188a14bf vo_opengl: manage user shader textures with ra
Drops some features I guess, no idea if those were needed. Untested due
to lack of test cases.
2017-07-30 11:38:52 +02:00
wm4 5429dbf2a2 vo_opengl: fix dither texture filter
Should be GL_NEAREST, not GL_LINEAR.
2017-07-30 09:43:41 +02:00
wm4 ab1ffa1382 vo_opengl: manage ICC LUT texture via ra
Also move the capability check to gl_video_get_lut3d(), because it
seems more convenient (ra won't have a _CAP_EXT16).
2017-07-29 21:23:31 +02:00
wm4 37b7b32d61 vo_opengl: manage scaler LUT textures via ra
Also fix the RA_CAP_ bitmask nonsense.
2017-07-29 20:15:59 +02:00
wm4 8494fdadae vo_opengl: manage dither texture via ra
Also add some more helpers.

Fix the broken math.h include statement.

utils.c uses ra_gl.h internals, which it shouldn't, and which will be
removed again as soon as this code gets converted to ra fully.
2017-07-29 20:14:48 +02:00
wm4 0f9fcf0ed4 vo_opengl: do not use GL format conversion on texture upload
The dither texture data is created as a float array, but uploaded to a
texture with GL_R16 as internal format. We relied on GL to do the
conversion from float to uint16_t. Not all GL variants even support
this: GLES does not provide this conversion (one of the reasons why this
code has a float16 code path). Also, ra is not going to do this. So just
convert on the fly.

Still keep the float16 texture format fallback, because not all GLES
implementations provide GL_R16.

There is some possibility that we'll need to provide some kind of upload
conversion anyway for float->float16. We still rely on GL doing this
implicitly, and all GL variants support it, but with RA there might be
the need for explicit conversion. Even then, it might be best to reduce
the number of conversion cases. I'll worry about this later.
2017-07-29 20:12:43 +02:00
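Commit 0f9fcf0ed4 replaces GL's implicit float to 16-bit conversion with an explicit one done before upload. A minimal sketch of such a conversion, assuming the input is normalized to [0, 1]; the clamping and round-to-nearest choices and the function name `float_to_unorm16` are assumptions here, not taken from the mpv source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Convert normalized floats to 16-bit unorm values suitable for a
// GL_R16-style texture. Out-of-range and NaN inputs are clamped.
static void float_to_unorm16(const float *src, uint16_t *dst, size_t count)
{
    assert(src && dst);
    for (size_t i = 0; i < count; i++) {
        float v = src[i];
        if (!(v > 0.0f))        // also catches NaN
            v = 0.0f;
        else if (v > 1.0f)
            v = 1.0f;
        // Scale to the full 16-bit range and round to nearest.
        dst[i] = (uint16_t)(v * 65535.0f + 0.5f);
    }
}
```

Doing this on the CPU costs one pass over a small array, and it removes the dependency on a conversion that GLES does not offer.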
wm4 6fcc09ff3d vo_opengl: use ra_* for format negotiation too
Format handling via ra_* was added earlier, but the format negotiation
part was forgotten.

Actually move some aspects of it to ra_get_imgfmt_desc(). Also make sure
the unorm and float formats selected by the common format lookup
functions are linear filterable. (For OpenGL, this is implicitly
guaranteed, so it wasn't done before.) Whether these assumptions should
be checked/enforced in the ra code at all is a bit fuzzy, but with ra
being helper code only for the actual video renderer, it's probably
justified.
2017-07-29 20:11:51 +02:00
Niklas Haas 345bb193fe vo_opengl: support loading custom user textures
Parsing the texture data as raw strings makes the textures the most
portable and self-contained. In order to facilitate different types of
shaders, the parse_user_shader interaction has been changed to instead
have it loop through blocks and call the passed functions for each valid
block parsed. This is more modular and also cleaner, with better code
separation.

Closes #4586.
2017-07-27 23:51:05 +02:00
Niklas Haas f1af6e53f0 vo_opengl: slightly refactor user_shaders code
- Each struct tex_hook now stores multiple hooks, this allows us to
  avoid the awkward way the current code has to add the same pass
  multiple times.

- As a consequence, SHADER_MAX_HOOKS was split up into SHADER_MAX_PASSES
  (number of tex_hooks) and SHADER_MAX_HOOKS (number of hooked textures
  per tex_hook), and both numbers decreased correspondingly.

- Instead of having a weird free() callback, we can just leverage
  talloc's recursive free behavior. The only user is the user shaders code
  anyway.
2017-07-27 23:45:17 +02:00
Niklas Haas ea76f79e5d vo_opengl: tone map on the maximum signal component
This actually makes sure we don't decolor due to clipping even when the
signal itself exceeds the luma by a significant factor, which was pretty
common for saturated blues (and to a lesser degree, reds) - most
noticeable in skies etc.

This prevents the turn-the-sky-cyan effect of mobius tone mapping, and
should also improve the other tone mapping modes in quality.
2017-07-27 09:40:12 +02:00
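The idea in ea76f79e5d can be illustrated on the CPU: drive the tone curve with the largest of the three channels and scale all channels by the same factor, so that no channel clips and the hue is preserved. This is a sketch only. The real code is generated GLSL, and the Reinhard curve used as `example_tone_curve` is a placeholder assumption rather than any of mpv's tone mapping modes.

```c
#include <assert.h>

// Placeholder tone curve (simple Reinhard), assumed for illustration only.
static float example_tone_curve(float x)
{
    return x / (1.0f + x);
}

// Tone map a linear RGB triple in place, using the maximum component as
// the signal level instead of the luma.
static void example_tone_map_max(float rgb[3])
{
    assert(rgb);
    float sig = rgb[0];
    if (rgb[1] > sig) sig = rgb[1];
    if (rgb[2] > sig) sig = rgb[2];
    if (sig <= 0.0f)
        return;  // black or invalid input: nothing to scale

    // One common scale factor keeps the channel ratios intact.
    float scale = example_tone_curve(sig) / sig;
    for (int i = 0; i < 3; i++)
        rgb[i] *= scale;
}
```

A saturated blue such as (0.5, 1.0, 4.0) has a luma far below its blue channel; mapping on luma would leave blue above 1.0 and clip it, whereas mapping on the maximum keeps every channel in range.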
Niklas Haas e1cc43182c vo_opengl: fix mpgl_caps bit check
As pointed out by @bjin, this would match if _any_ of the reqs are set.
Need to test for explicit equality.
2017-07-27 00:38:54 +02:00
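The difference between the two checks in e1cc43182c is easy to see in isolation. In this sketch the `EX_*` bits and both helper names are hypothetical; only the two expressions matter.

```c
#include <stdbool.h>

// Hypothetical capability bits, for illustration only.
enum {
    EX_A = 1 << 0,
    EX_B = 1 << 1,
    EX_C = 1 << 2,
};

// Buggy form: true as soon as any one of the required bits is present.
static bool example_has_any(int caps, int reqs)
{
    return (caps & reqs) != 0;
}

// Fixed form: true only if every required bit is present.
static bool example_has_all(int caps, int reqs)
{
    return (caps & reqs) == reqs;
}
```

With `caps = EX_A` and `reqs = EX_A | EX_B`, the first helper accepts the format even though `EX_B` is missing, while the second rejects it.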
wm4 81851febc4 vo_opengl: start work on rendering API abstraction
This starts work on moving OpenGL-specific code out of the general
renderer code, so that we can support other GPU APIs. This is in
a very early stage and it's only a proof of concept. It's unknown
whether this will succeed or result in other backends.

For now, the GL rendering API ("ra") and its only provider (ra_gl) do
texture creation/upload/destruction only. And it's used for the main
video texture only. All other code is still hardcoded to GL.

There is some duplication with ra_format and gl_format handling. In the
end, only the ra variants will be needed (plus the gl_format table of
course). For now, this is simpler, because for some reason lots of hwdec
code still requires the GL variants, and would have to be updated to
use the ra ones.

Currently, the video.c code accesses private ra_gl fields. In the end,
it should not do that of course, and it would not include ra_gl.h.

Probably adds bugs, but you can keep them.
2017-07-26 11:31:43 +02:00
Niklas Haas 5904eddb38 vo_opengl: describe the texture uploading mode
Be a bit more transparent here, which is especially helpful when people
are sending me screenshots of stats pages.
2017-07-26 02:42:23 +02:00
Niklas Haas b31020b193 vo_opengl: check against shmem limits
The radius check was not strict enough, especially not for all
platforms. To fix this, actually check the hardware capabilities instead
of relying on a hard-coded maximum radius.
2017-07-26 01:54:33 +02:00
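Checking against the hardware limit, as b31020b193 does, means computing how much shared memory a work group would need and comparing it with what the driver reports (in GL, the value queried via GL_MAX_COMPUTE_SHARED_MEMORY_SIZE). The sketch below assumes a simple tile model, namely the block plus `radius` pixels of padding on every side with one float per component; that model and the function names are illustrative assumptions, not the formula mpv uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Bytes of shared memory needed to cache the input tile for a work group
// of bw x bh output pixels, under the padding model described above.
static size_t example_tile_shmem_bytes(int bw, int bh, int radius, int comps)
{
    assert(bw > 0 && bh > 0 && radius >= 0 && comps > 0);
    size_t w = (size_t)bw + 2 * (size_t)radius;
    size_t h = (size_t)bh + 2 * (size_t)radius;
    return w * h * (size_t)comps * sizeof(float);
}

// True if the tile fits in the limit reported by the hardware; callers
// would fall back to a non-compute path otherwise.
static bool example_tile_fits(int bw, int bh, int radius, int comps,
                              size_t max_shmem)
{
    return example_tile_shmem_bytes(bw, bh, radius, comps) <= max_shmem;
}
```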
James Ross-Gowan 9875f14ad4 vo_opengl: fix image uniforms for older OpenGL
This explicitly enables the GL_ARB_shader_image_load_store extension,
which seems to fix compute shaders for Intel/GL 3.0.
2017-07-26 08:02:03 +10:00
Niklas Haas 49a648447f vo_opengl: cosmetic change 2017-07-25 20:14:03 +02:00
Niklas Haas f2809e19f0 vo_opengl: add PRINTF_ATTRIBUTE to gl_sc_ssbo
Doesn't uncover any bugs, but apparently we're getting in the habit of
this anyway.
2017-07-25 06:35:10 +02:00
Niklas Haas 62de84cbe3 vo_opengl: kill off FBOTEX_COMPUTE again
The textures not having an FBO actually caused regressions when trying
to render the subtitles on top of this texture (--blend-subtitles),
which still relied on an FBO.

So just kill off the logic entirely. Why worry about a single FBO wasted
when we're allocating like 10 anyway.

Fixes #4657.
2017-07-25 06:32:29 +02:00
Niklas Haas d099e037ef vo_opengl: fix incoherent SSBO usage
According to the OpenGL spec, atomic access to SSBO variables is *not*
guaranteed to be coherent, even when reusing the same SSBO attached to
the same shader across different frames. So we actually need a
glMemoryBarrier here, at least in theory.
2017-07-25 06:11:57 +02:00
Niklas Haas 6c06e7e2a0 vo_opengl: cosmetic fix 2017-07-25 05:23:52 +02:00
Niklas Haas cd226bdfd8 vo_opengl: fix incoherent texture usage
This bug slipped past my attention because nvidia ignores memory
barriers, but this is not necessarily always the case. Since
image_load_store is incoherent (specifically, writing to images from
compute shaders is incoherent) we need to insert a memory barrier to
make it coherent again. Since we only care about texture fetches, that's
the only barrier we need.
2017-07-25 05:22:29 +02:00
Niklas Haas 241d5ebc46 vo_opengl: adjust the rules for linearization
Two changes, compounded into one since they affect the same logic:

1. Never use linearization for HDR downscaling
2. Always use linearization for interpolation

Instead of fixing p->use_linear at the beginning of pass_render_frame,
we flip it on "dynamically" as needed. I plan on killing this
p->use_linear frame (along with other per-pass metadata) and moving them
into their own struct for tracking the "current" state of the video, but
that's a separate/upcoming refactor.

As a small bonus, reduce some code duplication in the interpolation
logic.

Fixes #4631
2017-07-24 23:26:15 +02:00
Bin Jin 13ef6bcf6f vo_opengl: enable compute shader for mesa
Mesa 17.1 supports compute shader but not full specs of OpenGL 4.3.
Change the code to detect OpenGL extension "GL_ARB_compute_shader"
rather than OpenGL version 4.3.

HDR peak detection requires SSBO, and polar scaler requires 2D array
extension. Add these extensions as requirement as well.
2017-07-25 04:07:26 +08:00
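Feature detection by extension name rather than by GL version, as in 13ef6bcf6f, comes down to an exact token match against the extension list. The sketch below assumes the list is available as one space-separated string; `example_has_extension` is a hypothetical helper, not mpv's loader code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Return true if `name` occurs as a whole space-separated token in `exts`.
// A plain strstr() is not enough, because one extension name can be a
// prefix of another.
static bool example_has_extension(const char *exts, const char *name)
{
    size_t len = strlen(name);
    if (len == 0)
        return false;
    for (const char *p = exts; (p = strstr(p, name)) != NULL; p += len) {
        bool starts = p == exts || p[-1] == ' ';
        bool ends = p[len] == ' ' || p[len] == '\0';
        if (starts && ends)
            return true;
    }
    return false;
}
```

A driver that exposes "GL_ARB_compute_shader" without advertising OpenGL 4.3 then passes the check, which is the Mesa 17.1 situation the commit describes.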
Niklas Haas 0c84ee01d5 vo_opengl: support user compute shaders
These are identical to regular fragment shader hooks, but with extra
metadata indicating the preferred block size.
2017-07-24 17:19:34 +02:00
Niklas Haas f338ec4591 vo_opengl: implement compute shader based EWA kernel
This performs almost 50% faster on my machine (!!), from 4650μs down to
about 3176μs for ewa_lanczossharp.

It's possible we could use a similar approach to speed up the separable
scalers, although with vastly simpler code. For separable scalers we'd
also have the additional huge benefit of only needing padding in one
direction, so we could potentially use a big 256x1 kernel or something
to essentially compute an entire row at once.
2017-07-24 17:19:31 +02:00
Niklas Haas b196cadf9f vo_opengl: support HDR peak detection
This is done via compute shaders. As a consequence, the tone mapping
algorithms had to be rewritten to compute their known constants in GLSL
(ahead of time), instead of doing it once. Didn't affect performance.

Using shmem/SSBO atomics in this way is extremely fast on nvidia, but it
might be slow on other platforms. Needs testing.

Unfortunately, setting up the SSBO still requires OpenGL calls, which
means I can't have it in video_shaders.c, where it belongs. But I'll
defer worrying about that until the backend refactor, since then I'll be
breaking up the video/video_shaders structure anyway.
2017-07-24 17:19:31 +02:00
Niklas Haas aad6ba018a vo_opengl: support compute shaders
These can either be invoked as dispatch_compute to do a single
computation, or finish_pass_fbo (after setting compute_size_minimum) to
render to a new texture using a compute shader. To make this stuff all
work transparently, we try really, really hard to make compute shaders
as identical to fragment shaders as possible in their behavior.
2017-07-24 17:19:31 +02:00
Niklas Haas eb54d2ad4d vo_opengl: cut down on FBOTEX_FUZZY abuse
Don't use FBOTEX_FUZZY where the FBO is sized according to
p->texture_w/h, since this changes infrequently (and when it does, we
need to reset everything anyway). No real reason to make this change
other than that it possibly prevents nasty surprises in the future, so I
feel more comfortable about it.
2017-07-24 16:41:38 +02:00
wm4 24dc91907a common, vo_opengl: add/use helper for formatted strings on the stack
Seems like I really like this C99 idiom. No reason not to generalize it
to snprintf(). Introduce mp_tprintf(), which basically applies this idiom to
snprintf(). This macro looks like it returns a string that was allocated
with alloca() on the caller site, except it's portable C99/C11. (And
unlike alloca(), the result is valid only within block scope.)

Use it in 2 places in the vo_opengl code. But it has the potential to
make a whole bunch of weird looking code look slightly nicer.
2017-07-24 08:12:42 +02:00
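The idiom behind 24dc91907a is a C99 compound literal used as a temporary stack buffer. The sketch below shows how such a macro can be built; `EXAMPLE_TPRINTF` and `example_tprintf_buf` are stand-in names and the details are assumptions, so refer to the mpv source for the real mp_tprintf().

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

// Format into a caller-provided buffer and return that buffer, so the
// call can be used directly as an expression.
static char *example_tprintf_buf(char *buf, size_t size, const char *fmt, ...)
{
    assert(buf && size > 0);
    va_list ap;
    va_start(ap, fmt);
    vsnprintf(buf, size, fmt, ap);  // truncates and always terminates
    va_end(ap);
    return buf;
}

// (char[SIZE]){0} is an unnamed array with automatic storage duration in
// the enclosing block, so the returned pointer is valid until that block
// ends. No heap allocation and no alloca() are involved.
#define EXAMPLE_TPRINTF(SIZE, ...) \
    example_tprintf_buf((char[SIZE]){0}, (SIZE), __VA_ARGS__)
```

A call such as `EXAMPLE_TPRINTF(32, "texture%d", 3)` can be passed straight to a function that takes a `const char *`, as long as the result is not kept beyond the enclosing block.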
wm4 3d0f86145c vo_opengl: check format on some printf-like calls
Fix 1 incorrect use.
2017-07-24 08:08:02 +02:00
wm4 64d56114ed vo_opengl: add direct rendering support
Can be enabled via --vd-lavc-dr=yes. See manpage additions for what it
does.

This is reminiscent of the MPlayer -dr flag, but the implementation is
completely different. It's the same basic concept: letting the decoder
render into a GPU buffer to avoid a copy. Unlike MPlayer, this doesn't
try to go through filters (libavfilter doesn't support this anyway).
Unless a filter can work in-place, DR will be silently disabled. MPlayer
had very complex semantics about buffer types and management (which
apparently nobody ever understood) and weird restrictions that mostly
limited it to mpeg2 style codecs. The mpv code does not do any of this,
and just lets the decoder allocate an arbitrary number of untyped
images. (No MPlayer code was used.)

Parts of the code based on work by atomnuker (starting point for the
generic code) and haasn (some GL definitions, some basic PBO code, and
correct fencing).
2017-07-24 04:32:55 +02:00
wm4 bbfd9b5a29 vo_opengl: osd: remove stale declaration
Was missed in the previous changes.
2017-07-23 00:02:02 +02:00