Commit Graph

11 Commits

Author SHA1 Message Date
wm4 1d0bf4073b vo_opengl: handle probing GL texture formats better
Retrieve the depth for each component and internal texture format
separately. Only for 8 bit per component textures we assume that all
bits are used (or else we would in my opinion create too many probe
textures).

Assuming 8 bit components are always correct also fixes operation in
GLES3, where we assumed that each component had -1 bits depth, and this
all UNORM formats were considered unusable. On GLES, the function to
check the real bit depth is not available. Since GLES has no 16 bit
UNORM textures at all, except with the MPGL_CAP_EXT16 extension, just
drop the special condition for it. (Of course GLES still manages to
introduce a funny special case by allowing GL_LUMINANCE , but not
defining GL_TEXTURE_LUMINANCE_SIZE.)

Should fix #4749.
2017-08-11 21:29:35 +02:00
Niklas Haas f2298f394e vo_opengl: move timers to struct ra
In order to prevent code duplication and keep the ra abstraction as
small as possible, `ra` only implements the actual timer queries,
it does not do pooling/averaging of the results. This is instead moved
to a ra-neutral struct timer_pool in utils.c.
2017-08-06 00:10:20 +02:00
Niklas Haas b31020b193
vo_opengl: check against shmem limits
The radius check was not strict enough, especially not for all
platforms. To fix this, actually check the hardware capabilities instead
of relying on a hard-coded maximum radius.
2017-07-26 01:54:33 +02:00
Niklas Haas 6c06e7e2a0
vo_opengl: cosmetic fix 2017-07-25 05:23:52 +02:00
Niklas Haas cd226bdfd8 vo_opengl: fix incoherent texture usage
This bug slipped past my attention because nvidia ignores memory
barriers, but this is not necessarily always the case. Since
image_load_store is incoherent (specifically, writing to images from
compute shaders is incoherent) we need to insert a memory barrier to
make it coherent again. Since we only care about texture fetches, that's
the only barrier we need.
2017-07-25 05:22:29 +02:00
Niklas Haas b196cadf9f vo_opengl: support HDR peak detection
This is done via compute shaders. As a consequence, the tone mapping
algorithms had to be rewritten to compute their known constants in GLSL
(ahead of time), instead of doing it once. Didn't affect performance.

Using shmem/SSBO atomics in this way is extremely fast on nvidia, but it
might be slow on other platforms. Needs testing.

Unfortunately, setting up the SSBO still requires OpenGL calls, which
means I can't have it in video_shaders.c, where it belongs. But I'll
defer worrying about that until the backend refactor, since then I'll be
breaking up the video/video_shaders structure anyway.
2017-07-24 17:19:31 +02:00
Niklas Haas aad6ba018a vo_opengl: support compute shaders
These can either be invoked as dispatch_compute to do a single
computation, or finish_pass_fbo (after setting compute_size_minimum) to
render to a new texture using a compute shader. To make this stuff all
work transparently, we try really, really hard to make compute shaders
as identical to fragment shaders as possible in their behavior.
2017-07-24 17:19:31 +02:00
wm4 64d56114ed vo_opengl: add direct rendering support
Can be enabled via --vd-lavc-dr=yes. See manpage additions for what it
does.

This reminds of the MPlayer -dr flag, but the implementation is
completely different. It's the same basic concept: letting the decoder
render into a GPU buffer to avoid a copy. Unlike MPlayer, this doesn't
try to go through filters (libavfilter doesn't support this anyway).
Unless a filter can work in-place, DR will be silently disabled. MPlayer
had very complex semantics about buffer types and management (which
apparently nobody ever understood) and weird restrictions that mostly
limited it to mpeg2 style codecs. The mpv code does not do any of this,
and just lets the decoder allocate an arbitrary number of untyped
images. (No MPlayer code was used.)

Parts of the code based on work by atomnuker (starting point for the
generic code) and haasn (some GL definitions, some basic PBO code, and
correct fencing).
2017-07-24 04:32:55 +02:00
wm4 7b84297699 vo_opengl: minor cosmetics 2017-04-14 17:35:27 +02:00
wm4 e7940ddbf3 vo_opengl: fix a confused comment 2017-04-08 16:43:56 +02:00
wm4 eb83ee4a4a vo_opengl: add our own copy of OpenGL headers
gl_headers.h is basically header_fixes.h done consequently. It contains
all OpenGL defines (and some typedefs) we need. We don't include GL
headers provided by the system anymore.

Some care has to be taken by certain windowing APIs including all of
gl.h anyway. Then the definitions could clash. Fortunately, redefining
preprocessor symbols to the same content is allowed and ignored. Also,
redefining typedefs to the same thing is allowed in C11. Apparently the
latter is not allowed in C99, so there is an imperfect attempt to avoid
the typedefs if required API symbols are apparently present already.

The nost risky part about this are the standard typedefs and GLAPIENTRY.
The latter is different only on win32 (and at least consistently so).
The typedefs are mostly based on stdint.h typedefs, which khrplatform.h
clumsily emulates on platforms which don't have it. The biggest
difference is that we define GLsizeiptr directly to ptrdiff_t, instead
of checking for the _WIN64 symbol and defining it to long or long long.

This also typedefs GLsync to __GLsync, just like the khronos headers.
Although symbols prefixed with __ are implementation reserved, khronos
also violates this rule, and having the same definition as khronos will
avoid problems on duplicate definitions.

We can simplify the build scripts too. The ios-gl check seems a bit
wrong now (what we really want to test for is EAGLContext), but I can't
test and thus can't improve it.

cuda_dynamic.h redefined two GL symbols; just include the new headers
directly instead.
2017-04-07 15:09:27 +02:00