Commit Graph

73 Commits

Author SHA1 Message Date
linkmauve 322eb72679 OpenGL: Also detect softpipe as a software driver
Because it is.
2020-02-25 21:32:04 +02:00
wm4 053297b1ca vo_gpu: opengl: do not free "GL" sub-allocations
This function always expects the GL struct pointer to be a talloc
allocation. So far so bad. But the terrible thing is that _lots_ of code
in mpv didn't quite get this (including the code which introduced the
way it is used this way). For example, in context_glx.c you see this:

struct priv {
    GL gl;
    ...

GL is not a talloc allocation, but since it's at the start of a talloc
allocation, it works anyway. So far so bad. But the really terrible
thing is that mpgl_load_functions2() calls talloc_free_children() on the
GL pointer, which means that all of priv's. This would be unintentional
and could create dangling pointers. And this happens at the about 1
dozen of callers. I'm amazed it didn't broke yet anywhere.

Removing this anti-pattern with making GL "implicitly" a talloc
allocation would be too much effort at this point. So just manually free
the only allocation that the function attached to GL.
2019-11-29 20:23:27 +01:00
Anton Kindestam ae115bd8d8 opengl: Support GL_ARB_sync style fences on OpenGL ES 3.0
OpenGL ES 3.0 and up has suppport for for GL_ARB_sync style fences.
Make sure that mpv can use them.
2019-02-25 01:25:25 +01:00
Akemi 8d2d0f0640 cocoa-cb: add Apple Software Renderer support
by default the pixel format creation falls back to software renderer
when everything fails. this is mostly needed for VMs. additionally one
can directly request an sw renderer or exclude it entirely.
2018-09-30 17:13:34 +03:00
wm4 52dd38a48a client API: add a new way to pass X11 Display etc. to render API
Hardware decoding things often need access to additional handles from
the windowing system, such as the X11 or Wayland display when using
vaapi. The opengl-cb had nothing dedicated for this, and used the weird
GL_MP_MPGetNativeDisplay GL extension (which was mpv specific and not
officially registered with OpenGL).

This was awkward, and a pain due to having to emulate GL context
behavior (like needing a TLS variable to store context for the pseudo GL
extension function). In addition (and not inherently due to this), we
could pass only one resource from mpv builtin context backends to
hwdecs. It was also all GL specific.

Replace this with a newer mechanism. It works for all RA backends, not
just GL. the API user can explicitly pass the objects at init time via
mpv_render_context_create(). Multiple resources are naturally possible.

The API uses MPV_RENDER_PARAM_* defines, but internally we use strings.
This is done for 2 reasons: 1. trying to leave libmpv and internal
mechanisms decoupled, 2. not having to add public API for some of the
internal resource types (especially D3D/GL interop stuff).

To remain sane, drop support for obscure half-working opengl-cb things,
like the DRM interop (was missing necessary things), the RPI window
thing (nobody used it), and obscure D3D interop things (not needed with
ANGLE, others were undocumented). In order not to break ABI and the C
API, we don't remove the associated structs from opengl_cb.h.

The parts which are still needed (in particular DRM interop) needs to be
ported to the render API.
2018-03-26 19:47:08 +02:00
wm4 99dd2f57f0 vo_gpu: ra_gl: fix minimum GLSL version to 120
Not sure why there was 110, or why there is even a default.
2017-11-03 11:53:31 +01:00
wm4 0c04ce5f0d vo_gpu: gl: implement proper extension string search
The existing code in check_ext() avoided false positive due to
sub-strings, but allowed false negatives. Fix this with slightly better
search code, and make it available as function to other source files.
(There are some cases of strstr() still around.)
2017-10-02 17:30:27 +02:00
Niklas Haas 293c696ddb vo_opengl: use GLX_MESA_swap_control where available
This overrides the use of GLX_SGI_swap_control, because apparently
GLX_SGI_swap_control doesn't support SwapInterval(0), but the
GLX_MESA_swap_interval does.

Of course, everybody except mesa just accepts SwapInterval(0) even for
GLX_SGI_swap_control, but mesa needs to be the special snowflake here
and reject it, forcing us to load their stupid named extension instead.

Meanwhile khronos has done nothing except spit out GLX_EXT_swap_control
(not to be confused with GL_EXT_swap_control, which is exported by
WGL_EXT_swap_control), that doesn't fix the problem because mesa doesn't
implement it anyway.

What a fucking mess.
2017-09-13 20:53:17 +02:00
Niklas Haas 1d47473a7b
vo_opengl: use UBOs where supported/required
This also introduces RA_CAP_GLOBAL_UNIFORM. If this is not set, UBOs
*must* be used for non-bindings. Currently the cap is ignored though,
and the shader_cache *always* generates UBO-using code where it can.
Could be made an option in principle.

Only enabled for drivers new enough to support explicit UBO offsets,
just in case...

No change to performance, which is probably what we expect.
2017-08-27 14:36:04 +02:00
Niklas Haas 136cf2b770 vo_opengl: add support for UBOs
Not actually used by anything yet, but straightforward enough to add to
the RA API for starters.
2017-08-27 14:36:00 +02:00
Niklas Haas 46d86da630 vo_opengl: refactor RA texture and buffer updates
- tex_uploads args are moved to a struct
- the ability to directly upload texture data without going through a
  buffer is made explicit
- the concept of buffer updates and buffer polling is made more explicit
  and generalized to buf_update as well (not just mapped buffers)
- the ability to call tex_upload/buf_update on a tex/buf is made
  explicit during tex/buf creation
- uploading from buffers now uses an explicit offset instead of
  implicitly comparing *src against buf->data, because not all buffers
  may actually be persistently mapped
- the initial_data = immutable requirement is dropped. (May be re-added
  later for D3D11 if that ever becomes a thing)

This change helps the vulkan abstraction immensely and also helps move
common code (like the PBO pooling) out of ra_gl and into the
opengl/utils.c

This also technically has the side-benefit / side-constraint of using
PBOs for OSD texture uploads as well, which actually seems to help
performance on machines where --opengl-pbo is faster than the naive code
path. Because of this, I decided to hook up the OSD code to the
opengl-pbo option as well.

One drawback of this refactor is that the GL_STREAM_COPY hack for
texture uploads "got lost", but I think I'm happy with that going away
anyway since DR almost fully deprecates it, and it's not the "right
thing" anyway - but instead an nvidia-only hack to make this stuff work
somewhat better on NUMA systems with discrete GPUs.

Another change is that due to the way fencing works with ra_buf (we get
one fence per ra_buf per upload) we have to use multiple ra_bufs instead
of offsets into a shared buffer. But for OpenGL this is probably better
anyway. It's possible that in future, we could support having
independent “buffer slices” (each with their own fence/sync object), but
this would be an optimization more than anything. I also think that we
could address the underlying problem (memory closeness) differently by
making the ra_vk memory allocator smart enough to chunk together
allocations under the hood.
2017-08-18 00:34:34 +02:00
wm4 47ea771b7a vo_opengl: further GL API use separation
Move multiple GL-specific things from the renderer to other places like
vo_opengl.c, vo_opengl_cb.c, and ra_gl.c.

The vp_w/vp_h parameters to gl_video_resize() make no sense anymore, and
are implicitly part of struct fbodst.

Checking the main framebuffer depth is moved to vo_opengl.c. For
vo_opengl_cb.c it always assumes 8. The API user now has to override
this manually. The previous heuristic didn't make much sense anyway.

The only remaining dependency on GL is the hwdec stuff, which is harder
to change.
2017-08-07 19:17:28 +02:00
Niklas Haas b31020b193
vo_opengl: check against shmem limits
The radius check was not strict enough, especially not for all
platforms. To fix this, actually check the hardware capabilities instead
of relying on a hard-coded maximum radius.
2017-07-26 01:54:33 +02:00
Bin Jin 13ef6bcf6f vo_opengl: enable compute shader for mesa
Mesa 17.1 supports compute shader but not full specs of OpenGL 4.3.
Change the code to detect OpenGL extension "GL_ARB_compute_shader"
rather than OpenGL version 4.3.

HDR peak detection requires SSBO, and polar scaler requires 2D array
extension. Add these extensions as requirement as well.
2017-07-25 04:07:26 +08:00
Niklas Haas b196cadf9f vo_opengl: support HDR peak detection
This is done via compute shaders. As a consequence, the tone mapping
algorithms had to be rewritten to compute their known constants in GLSL
(ahead of time), instead of doing it once. Didn't affect performance.

Using shmem/SSBO atomics in this way is extremely fast on nvidia, but it
might be slow on other platforms. Needs testing.

Unfortunately, setting up the SSBO still requires OpenGL calls, which
means I can't have it in video_shaders.c, where it belongs. But I'll
defer worrying about that until the backend refactor, since then I'll be
breaking up the video/video_shaders structure anyway.
2017-07-24 17:19:31 +02:00
Niklas Haas aad6ba018a vo_opengl: support compute shaders
These can either be invoked as dispatch_compute to do a single
computation, or finish_pass_fbo (after setting compute_size_minimum) to
render to a new texture using a compute shader. To make this stuff all
work transparently, we try really, really hard to make compute shaders
as identical to fragment shaders as possible in their behavior.
2017-07-24 17:19:31 +02:00
wm4 64d56114ed vo_opengl: add direct rendering support
Can be enabled via --vd-lavc-dr=yes. See manpage additions for what it
does.

This reminds of the MPlayer -dr flag, but the implementation is
completely different. It's the same basic concept: letting the decoder
render into a GPU buffer to avoid a copy. Unlike MPlayer, this doesn't
try to go through filters (libavfilter doesn't support this anyway).
Unless a filter can work in-place, DR will be silently disabled. MPlayer
had very complex semantics about buffer types and management (which
apparently nobody ever understood) and weird restrictions that mostly
limited it to mpeg2 style codecs. The mpv code does not do any of this,
and just lets the decoder allocate an arbitrary number of untyped
images. (No MPlayer code was used.)

Parts of the code based on work by atomnuker (starting point for the
generic code) and haasn (some GL definitions, some basic PBO code, and
correct fencing).
2017-07-24 04:32:55 +02:00
Niklas Haas dead206873 vo_opengl: use glBufferSubData instead of glMapBufferRange
Performance seems pretty much unchanged but I no longer get nasty spikes
on NUMA systems, probably because glBufferSubData runs in the driver or
something.

As a simplification of the code, we also just size the PBO to always
have the full size, even for cropped textures. This seems slower but not
by relevant amounts, and only affects e.g. --vf=crop. It also slightly
increases VRAM usage for textures with big strides.

This new code path is especially nice because it no longer depends on
GL_ARB_map_buffer_range, and no longer uses any functions that can
possibly fail, thus simplifying control flow and seemingly deprecating
the manpage's claim about possible image corruption.

In theory we could also reduce NUM_PBO_BUFFERS since it doesn't seem
like we're streaming uploads anyway, but leave it in there just in
case some drivers disagree...
2017-07-16 17:46:24 +02:00
Niklas Haas ad0d6caac7 vo_opengl: use textureGatherOffset for polar filters
This is more efficient on my machine (nvidia), but only when applied to
groups of exactly 4 texels. So we switch to the more efficient
textureGather for groups of 4. Some notes:

- textureGatherOffset seems to be faster than textureGather by a
  non-negligible amount, but for some reason, textureOffset is still
  slower than a straight-up texture
- textureGather* requires GLSL 400; and at least on nvidia, this
  requires actually allocating a GL 4.0 context.
- the code in opengl/common.c that clamped the GLSL version to 330 is
  deprecated, because the old user shader style has been removed
  completely in the meantime
- To combat the growing complexity of the polar sampling code, we drop
  the antiringing functionality from EWA shaders completely, since it
  never really worked well for EWA to begin with. (Horrific artifacting)
2017-07-05 11:21:58 +02:00
wm4 2b616c0682 vo_opengl: drop TLS usage
TLS is a headache. We should avoid it if we can.

The involved mechanism is unfortunately entangled with the unfortunate
libmpv API for returning pointers to host API objects. This has to be
kept until we change the API somehow.

Practically untested out of pure laziness. I'm sure I'll get a bunch of
reports if it's broken.
2017-05-11 17:47:33 +02:00
wm4 759ac6cc93 vo_opengl: add option for caching shaders on disk
Mostly because of ANGLE (sadly).

The implementation became unpleasantly big, but at least it's relatively
self-contained.

I'm not sure to what degree shaders from different drivers are
compatible as in whether a driver would randomly misbehave if it's fed
a binary created by another driver. The useless binayFormat parameter
won't help it, as they can probably easily clash. As usual, OpenGL is
pretty shit here.
2017-04-08 16:43:56 +02:00
wm4 274e71ee8b vo_opengl: add hw overlay support and use it for RPI
This overlay support specifically skips the OpenGL rendering chain, and
uses GL rendering only for OSD/subtitles. This is for devices which
don't have performant GL support.

hwdec_rpi.c contains code ported from vo_rpi.c. vo_rpi.c is going to be
deprecated. I left in the code for uploading sw surfaces (as it might
be slightly more efficient for rendering sw decoded video), although
it's dead code for now.
2016-09-12 19:58:58 +02:00
wm4 b753c229f7 vo_opengl: improve missing function warning
Mostly a cosmetic change. Handling bultin and extension functions
separaterly makes more sense here too.
2016-06-22 21:55:00 +02:00
wm4 7be37337f4 vo_opengl: vdpau interop without RGB conversion
Until now, we've always converted vdpau video surfaces to RGB, and then
mapped the resulting RGB texture. Change this so that the surface is
mapped as NV12 plane textures.

The reason this wasn't done until now is because vdpau surfaces are
mapped in an "interlaced" way as separate fields, even for progressive
video. This requires messy reinterleraving. It turns out that even
though it's an extra processing step, the result can be faster than
going through the video mixer for RGB conversion.

Other than some potential speed-gain, doing this has multiple other
advantages. We can apply our own color conversion, which is important in
more complex cases. We can correctly apply debanding and potentially
other processing that requires chroma-specific or in-YUV handling.

If deinterlacing is enabled, this switches back to the old RGB
conversion method. Until we have at least a primitive deinterlacer in
vo_opengl, this will stay this way. The d3d11 and vaapi code paths are
similar. (Of course these don't require any crazy field reinterleaving.)
2016-06-19 19:58:40 +02:00
Bin Jin 3df95ee57a vo_opengl: remove uniform buffer object routines 2016-06-18 19:16:31 +02:00
James Ross-Gowan 5700da4a8f vo_opengl: use EXT_disjoint_timer_query for timers
This is the ES equivalent to ARB_timer_query. It enables the performance
timers on ANGLE. All the added functions should be identical in
semantics to their desktop GL equivalents.
2016-06-15 20:32:47 +10:00
wm4 788929e4e0 vo_opengl: use standard functions to retrieve display depth
Until now, we've used system-specific API (GLX, EGL, etc.) to retrieve
the depth of the default framebuffer. (We equal this to display depth
and use the determined depth for dithering.)

We can actually retrieve this value through standard GL API, and it
works everywhere (except GLES 2 of course). This simplifies everything a
great deal.

egl_helpers.c is empty now. But I expect that some EGL boilerplate will
be moved to it, so don't remove it yet.
2016-06-14 10:35:43 +02:00
Niklas Haas 8ceb935bd8 vo_opengl: add time queries
To avoid blocking the CPU, we use 8 time objects and rotate through
them, only blocking until the last possible moment (before we need
access to them on the next iteration through the ring buffer). I tested
it out on my machine and 4 query objects were enough to guarantee
block-free querying, but the extra margin shouldn't hurt.

Frame render times are just output at the end of each frame, via MP_DBG.
This might be improved in the future. (In particular, I want to expose
these numbers as properties so that users get some more visible feedback
about render times)

Currently, we measure pass_render_frame and pass_draw_to_screen
separately because the former might be called multiple times due to
interpolation. Doing it this way gives more faithful numbers. Same goes
for frame upload times.
2016-06-07 12:16:15 +02:00
wm4 049e3ccb65 vo_opengl: make ES float texture format checks stricter
Some of these checks became pointless after dropping ES 2.0 support for
extended filtering.

GL_EXT_texture_rg is part of core in ES 3.0, and we already check for
this version, so testing for the extension is redundant.

GL_OES_texture_half_float_linear is also always available, at least as
far as our needs go.

The functionality we need from GL_EXT_color_buffer_half_float is always
available in ES 3.2, and we explicitly check for ES 3.2, so reject this
extension if the ES version is new enough.
2016-05-23 21:27:18 +02:00
wm4 80d702dce8 vo_opengl: make PBOs work on GLES 3.x
For some reason, GLES has no glMapBuffer, only glMapBufferRange.

GLES 2 has no buffer mapping at all, and GL 2.1 does not always have
glMapBufferRange. On those PBOs remain unsupported (there's no reason to
care about GL 2.1 without the extension).

This doesn't actually work on ANGLE, and I have no idea why. (There are
artifacts on OSD, as if parts of the OSD data weren't copied.) It works
on desktop OpenGL and at least 1 other ES 3 implementation. Don't enable
it on ANGLE, I guess.
2016-05-23 21:27:18 +02:00
wm4 75f373cc46 vo_opengl: remove unused glDrawBuffer 2016-05-23 21:27:18 +02:00
wm4 cc72a4e8c3 vo_opengl: support framebuffer invalidation
Not sure how much can be gained with this, as we can't use it properly
yet. For now, this is used only before rendering, which probably does
overwhelmingly nothing.

In the future, this should be used after temporary passes, which could
possibly reduce memory usage and even memory bandwidth usage, depending
on the drivers.
2016-05-23 21:27:18 +02:00
wm4 cc4a0571fa vo_opengl: slightly improve logging of loaded extensions
Only log when actual extensions are loaded, never log anything about
builtins.
2016-05-23 21:27:18 +02:00
wm4 bd41a3ab62 vo_opengl: add detection for the ES texture_rg extension 2016-05-12 21:22:28 +02:00
wm4 84ccebd9b9 vo_opengl: reorganize texture format handling
This merges all knowledge about texture format into a central table.

Most of the work done here is actually identifying which formats exactly
are supported by OpenGL(ES) under which circumstances, and keeping this
information in the format table in a somewhat declarative way. (Although
only to the extend needed by mpv.) In particular, ES and float formats
are a horrible mess.

Again this is a big refactor that might cause regression on "obscure"
configurations.
2016-05-12 21:22:28 +02:00
wm4 d4712af5af vo_opengl: angle: dump translated shaders
Helpful for debugging and such.
2016-05-12 11:17:49 +02:00
wm4 9d16837c99 vo_opengl: support GL_EXT_texture_norm16 on GLES
This gives us 16 bit fixed-point integer texture formats, including
ability to sample from them with linear filtering, and using them as FBO
attachments.

The integer texture format path is still there for the sake of ANGLE,
which does not support GL_EXT_texture_norm16 yet.

The change to pass_dither() is needed, because the code path using
GL_R16 for the dither texture relies on glTexImage2D being able to
convert from GL_FLOAT to GL_R16. GLES does not allow this. This could be
trivially fixed by doing the conversion ourselves, but I'm too lazy to
do this now.
2016-04-27 19:19:56 +02:00
wm4 44644e69f0 vo_opengl: fix an outdated comment
This wasn't updated over multiple iterations.
2016-04-16 16:16:50 +02:00
wm4 6325bdf197 vo_opengl: log if glGetString(GL_VERSION) returns NULL
Typically happens with some implementations if no context is currrent,
or is otherwise broken. This is particularly relevant to the opengl_cb
API, because the API user will have no other indication what went wrong.
2016-04-08 10:57:21 +02:00
wm4 e4ec0f42e4 Change GPL/LGPL dual-licensed files to LGPL
Do this to make the license situation less confusing.

This change should be of no consequence, since LGPL is compatible with
GPL anyway, and making it LGPL-only does not restrict the use with GPL
code.

Additionally, the wording implies that this is allowed, and that we can
just remove the GPL part.
2016-01-19 18:36:34 +01:00
wm4 6154c1d06d vo_opengl: split backend code from common.c to context.c
Now common.c only contains the code for the function loader, while
context.c contains the backend loader/dispatcher.

Not calling it "backend.c", because the central struct is called
MPGLContext.
2015-12-19 14:14:12 +01:00
James Ross-Gowan 3d12312806
vo_opengl: add dxinterop backend
WGL_NV_DX_interop is widely supported by Nvidia and AMD drivers. It
allows a texture to be shared between Direct3D and WGL, so that
rendering can be done with WGL and presentation can be done with
Direct3D. This should allow us to work around some persistent WGL
issues, such as dropped frames with some driver/OS combos, drivers that
buffer frames to increase performance at the cost of latency, and the
inability to disable exclusive fullscreen mode when using WGL to render
to a fullscreen window.

The addition of a DX_interop backend might also enable some cool
Direct3D-specific enhancements in the future, such as using the
GetPresentStatistics API to get accurate frame presentation timestamps.

Note that due to a driver bug, this backend is currently broken on
Intel. It will appear to work as long as the window is not resized too
often, but after a few changes of size it will be unable to share the
newly created renderbuffer with GL. See:
https://software.intel.com/en-us/forums/graphics-driver-bug-reporting/topic/562051
2015-12-11 18:29:18 +01:00
Bin Jin 42a0f4d87b vo_opengl: enable NNEDI3 prescaler on OpenGL ES 3.0
It turns out that both UBO and intBitsToFloat() are supported in
OpenGL ES 3.0[1][2], enable them so that NNEDI3 prescaler can be used
in a wider range of backends.

Also fixes some implicit int-to-float conversions so that the shader
actually compiles on GLES.

Tested on Linux desktop (nvidia 358.16) with "es" sub-option.

[1]: https://www.khronos.org/opengles/sdk/docs/man3/html/glGetUniformBlockIndex.xhtml
[2]: https://www.khronos.org/opengles/sdk/docs/manglsl/docbook4/xhtml/intBitsToFloat.xml
2015-12-02 12:32:02 +01:00
wm4 d5df90a295 vo_opengl: use ANGLE by default if available (except for "hq" preset)
Running mpv with default config will now pick up ANGLE by default. Since
some think ANGLE is still not good enough for hq features, extend the
"es" option to reject GLES backends, and add to to the opengl-hq preset.

One consequence is that mpv will by default use libswscale to convert
10 bit video to 8 bit, before it reaches the VO.
2015-11-21 18:17:14 +01:00
wm4 4fd0cd4a73 vo_opengl: support 3D textures on ANGLE
Unfortunately, color management can still not work, because no GLES
version specified so far support fixed-point 16 bit textures. Maybe
we could use integer textures, but these don't support filtering.
Using float textures would be another possibility.
2015-11-19 21:21:04 +01:00
wm4 b86a59aa4f vo_opengl: better check for float texture support
We don't only need float textures for advanced scaling - we also need
them to be filterable with GL_LINEAR. On GLES, this is not supported
until GLES 3.1, but some implementation expose them with extensions.
2015-11-19 21:17:45 +01:00
Kevin Mitchell 368431f57c vo_opengl: check shader string before sscanfing it 2015-11-19 08:14:06 -08:00
James Ross-Gowan 76e4374d06 vo_opengl: fix ANGLE GLES3 mode
Turns out glGetTexLevelParameter, which is missing in ANGLE, is a
GLES3.1 function. Removing it from the list of core GLES3 functions
makes ANGLE work in GLES3 mode.
2015-11-19 20:37:30 +11:00
James Ross-Gowan e76ec2c963 vo_opengl: add initial ANGLE support
ANGLE is a GLES2 implementation for Windows that uses Direct3D 11 for
rendering, enabling vo_opengl to work on systems with poor OpenGL
drivers and bypassing some of the problems with native GL, such as VSync
in fullscreen mode.

Unfortunately, using GLES2 means that most of vo_opengl's advanced
features will not work, however ANGLE is under rapid development and
GLES3 support is supposed to be coming soon.
2015-11-18 23:07:33 +11:00
wm4 6b22b21651 vo_opengl: attempt to improve GLX vs. EGL backend detection
For the sake of vaapi interop, we want to use EGL, but on the other
hand, but because driver developers are full of shit, vdpau interop will
not work on EGL (even if the driver supports EGL). The latter happens
with both nvidia and AMD Mesa drivers.

Additionally, EGL vaapi interop support can apparently only detected at
runtime by actually using it. While hwdec_vaegl.c already does this, it
would require initializing libva on _every_ system, which will cause
libav to print an unpreventable bullshit message to the terminal.

Try to counter these huge loads of bullshit by adding more fucking
bullshit.
2015-11-16 16:22:23 +01:00