Commit Graph

105 Commits

Author SHA1 Message Date
wm4 47f2f554a3 vo_opengl: handle alpha with odd bit widths too
Since alpha isn't pulled through the colormatrix (maybe it should?), we
reject alpha formats with odd sizes, such as yuva444p10.

But the awful tex_mul path in vo_opengl does this anyway (at some points
even explicitly), which means there will be a subtle difference in
handling of 16 bit yuv alpha formats. Make it consistent and always
apply the range adjustment to the alpha component. This also means odd
sizes like 10 bit are supported now.

This assumes alpha uses the same "shifted" range as the yuv color
channels for depths larger than 8 bit. I'm not sure whether this is
actually the case.
2015-12-19 16:11:34 +01:00
wm4 a0519f1d18 vo_opengl: cocoa: output premultiplied alpha
Which is apparently what is expected here. (I'm pretty sure X11
compositors want stright alpha, so 2 code paths are needed.)
2015-12-19 14:14:12 +01:00
wm4 3394d37b4e vo_opengl: refactor how framebuffer depth is passed from backends
Store the determined framebuffer depth in struct GL instead of
MPGLContext. This means gl_video_set_output_depth() can be removed, and
also justifies adding new fields describing framebuffer/backend
properties to struct GL instead of having to add more functions just to
shovel the information around.

Keep in mind that mpgl_load_functions() will wipe struct GL, so the
new fields must be set before calling it.
2015-12-19 14:14:12 +01:00
wm4 f24ba544cd vo_opengl: enable brightness/contrast controls for RGB
Why not.

Also, instead of disabling hue/saturation for RGB, just don't apply
them. (They don't make sense for conversion matrixes other than YUV, but
I can't be bothered to keep the fine-grained disabling of UI controls
either.)
2015-12-12 14:47:30 +01:00
wm4 47e6ef0bdf vo_opengl: remove one more XYZ special-case
The XYZ colorspace on XYZ pixfmt is enforced in some sanitation routine.
2015-12-09 17:10:38 +01:00
Bin Jin 6d36c432ab vo_opengl: fix precision loss of fruit dithering matrix
With default setting, the matrix for fruit dithering requires 12 bits
precision (values from 0/4096 to 4095/4096). But 16-bit float
provides only 10 bits. In addition, when `dither-size-fruit=8` is
set, 16 bits are required from the texture format.

Fix this by attempting to use 16 bit integer texture first. This is
still not precise, but should be better than using a half float.
2015-12-09 00:36:48 +01:00
wm4 45ae0716be csputils: rename "yuv2rgb" functions
They're not necessarily restricted to YUV aka YCbCr.

vo_direct3d.c and demux_disc.c (DVD specific code) changes untested.
2015-12-09 00:23:36 +01:00
wm4 c5c7b239b6 csputils, vo_opengl: remove XYZ special case in color matrix retrieval
This just seems unnecessary. Refactor it a bit. There should be no
functional changes.
2015-12-09 00:16:51 +01:00
wm4 c138505813 vo_opengl: enable colormatrix even for RGB input
Enables brightness/contrast controls, and handles gbrp10 correctly.
2015-12-07 23:48:59 +01:00
wm4 663415b914 vo_opengl: fix issues with some obscure pixel formats
The computation of the tex_mul variable was broken in multiple ways.
This variable is used e.g. by debanding for moving expansion of 10 bit
fixed-point input to normalized range to another stage of processing.

One obvious bug was that the rgb555 pixel format was broken. This format
has component_bits=5, but obviously it's already sampled in normalized
range, and does not need expansion. The tex_mul-free code path avoids
this by not using the colormatrix. (The code was originally designed to
work around dealing with the generally complicated pixel formats by only
using the colormatrix in the YUV case.)

Another possible bug was with 10 bit input. It expanded the input by
bringing the [0,2^10) range to [0,1], and then treating the expanded
input as 16 bit input. I didn't bother to check what this actually
computed, but it's somewhat likely it was wrong anyway. Now it uses
mp_get_csp_mul(), and disables expansion when computing the YUV matrix.
2015-12-07 23:48:59 +01:00
Bin Jin c569d4f6ed vo_opengl: decrease default lookup texture size to 64
It turns out that with accurate lookup we can decrease the
default size of texture now. Do it to compensate the performance
loss introduced by the LUT_POS macro.
2015-12-07 23:48:40 +01:00
Bin Jin e6058d3dc3 vo_opengl: make LOOKUP_TEXTURE_SIZE configurable 2015-12-07 23:48:18 +01:00
Bin Jin c1a96de41c vo_opengl: Fix minor LUT sampling error
Define a macro to correct the coordinate for lookup texture. Cache
the corrected coordinate for 1D filter and use mix() to minimize the
performance impact.
2015-12-07 23:48:15 +01:00
wm4 17507b5935 vo_opengl: require --enable-gpl3 for nnedi
There are claims that nnedi3.c doesn't constitute its own new
implementation, but is derived from existing HLSL or OpenCL shaders
distributed under the LGPLv3 license.

Until these are resolved, do the "correct" thing and require
--enable-gpl3 to build nnedi.
2015-12-03 09:32:40 +01:00
Bin Jin 42a0f4d87b vo_opengl: enable NNEDI3 prescaler on OpenGL ES 3.0
It turns out that both UBO and intBitsToFloat() are supported in
OpenGL ES 3.0[1][2], enable them so that NNEDI3 prescaler can be used
in a wider range of backends.

Also fixes some implicit int-to-float conversions so that the shader
actually compiles on GLES.

Tested on Linux desktop (nvidia 358.16) with "es" sub-option.

[1]: https://www.khronos.org/opengles/sdk/docs/man3/html/glGetUniformBlockIndex.xhtml
[2]: https://www.khronos.org/opengles/sdk/docs/manglsl/docbook4/xhtml/intBitsToFloat.xml
2015-12-02 12:32:02 +01:00
wm4 6ff1cd5502 vo_opengl: make tscale=mitchell:tscale-clamp the default
Looks better than "oversample". tscale-clamp suggested by haasn.
2015-11-29 17:55:01 +01:00
wm4 9fc74d5acd vo_opengl: warn if interpolation is enabled, but not display-sync
Try to avoid user confusion.

Reading the global options in this place is a pretty disgusting hack,
but it's still the most robust way.
2015-11-28 20:10:01 +01:00
wm4 318e9801f2 vo_opengl: fix interpolation with display-sync
At least I hope so.

Deriving the duration from the pts was not really correct. It doesn't
include speed adjustments, and becomes completely wrong of the user e.g.
changes the playback speed by a huge amount. Pass through the accurate
duration value by adding a new vo_frame field.

The value for vsync_offset was not correct either. We don't need the
error for the next frame, but the error for the current one. This wasn't
noticed because it makes no difference in symmetric cases, like 24 fps
on 60 Hz.

I'm still not entirely confident in the correctness of this, but it sure
is an improvement.

Also, remove the MP_STATS() calls - they're not really useful to debug
anything anymore.
2015-11-28 15:45:49 +01:00
wm4 7023c383b2 vo: change vo_frame field units
This was just converting back and forth between int64_t/microseconds and
double/seconds. Remove this stupidity. The pts/duration fields are still
in microseconds, but they have no meaning in the display-sync case (also
drop printing the pts field from opengl/video.c - it's always 0).
2015-11-27 22:04:44 +01:00
wm4 1fe64c61be vo_opengl: disable interpolation without display-sync
Without display-sync mode, our guesses wrt. vsync phase etc. are much
worse, and I see no reason to keep the complicated "vsync_timed" code.
2015-11-25 22:10:55 +01:00
wm4 59eb489425 vo_opengl: enable dumb-mode automatically if possible
I decided that I actually can't stand how vo_opengl unnecessarily puts
the video through 3 shader stages (instead of 1). Thus, what was meant
to be a fallback for weak OpenGL implementations, the dumb-mode, now
becomes default if the user settings allow it.

The code required to check for the settings isn't so wild, so I guess
it's manageable. I still hope that one day, our rendering logic can
generate ideal shader stages for this case too.

Note that in theory, dumb-mode could be reenabled at runtime due to a
color management 3D LUT being set, so a separate dumb_mode field is
required. The dumb-mode option can't just be overwritten.
2015-11-19 21:22:24 +01:00
wm4 4fd0cd4a73 vo_opengl: support 3D textures on ANGLE
Unfortunately, color management can still not work, because no GLES
version specified so far support fixed-point 16 bit textures. Maybe
we could use integer textures, but these don't support filtering.
Using float textures would be another possibility.
2015-11-19 21:21:04 +01:00
wm4 6df3fa2ec1 vo_opengl: switch FBO format on GLES
GL_RGB10_A2 is the best fixed-point format we can get on GLES/ANGLE for
now. (Unless we somehow switch to non-normalized integer textures.)
2015-11-19 21:20:50 +01:00
wm4 1a8b06f67e vo_opengl: make 1D textures completely optional
Polar scalers use 1D textures, because they're slightly faster on some
GPUs than 2D textures. But 2D textures work too, so add support for
them.

Allows using these scalers with ANGLE.
2015-11-19 21:20:40 +01:00
wm4 a6fb80baa4 vo_opengl: add RGBA8 framebuffer format, enable non-dumb mode for ES 3.0
This makes advanced scaling sort-of work for GLES 3.0 (on ANGLE). It's
still not very advisable, as 8 bits might not be enough to avoid
debanding. (Ironically, the debanding filter can be enabled, and does
not raise any GL errors - but probably doesn't do anything useful.)
2015-11-19 14:45:06 +01:00
wm4 f9a2fc592f vo_opengl: don't mix floats and integers in dither shader
Some GLSL dialects (GLSL ES 3.00) do not have such implicit conversions.
They have to be made floats for the sake of the shader compiler.
2015-11-19 14:41:49 +01:00
wm4 e24e0ccd68 vo_opengl: force dumb mode if RG textures are not available
Something goes wrong somewhere. Don't bother, it's only needed for
compatibility with our absolute baseline (GL 2.1/GLES 2).

On the other hand, we can process nv12 formats just fine.
2015-11-16 20:09:15 +01:00
wm4 883d311413 vo_opengl: use glBlitFramebuffer to draw repeated frames
In the display-sync, non-interpolation case, and if the display refresh
rate is higher than the video framerate, we duplicate display frames by
rendering exactly the same screen again. The redrawing is cached with a
FBO to speed up the repeat.

Use glBlitFramebuffer() instead of another shader pass. It should be
faster.

For some reason, post-process was run again on each display refresh.
Stop doing this, which should also be slightly faster. The only
disadvantage is that temporal dithering will be run only once per video
frame, but I can live with this.

One aspect is messy: clearing the background is done at the start on the
target framebuffer, so to avoid clearing twice and duplicating the code,
only copy the part of the framebuffer that contains the rendered video.
(Which also gets slightly messy - needs to compensate for coordinate
system flipping.)
2015-11-15 18:30:54 +01:00
wm4 4682b0147e vo_opengl: move the glFlush() call to the renderer 2015-11-10 14:36:23 +01:00
Bin Jin 03bbaad686 vo_opengl: fix 10-bit video prescaling
The nnedi3 prescaler requires a normalized range to work properly,
but the original implementation did the range normalization after
the first step of the first pass. This could lead to severe quality
degradation when debanding is not enabled for NNEDI3.

Fix this issue by passing `tex_mul` into the shader code.

Fixes #2464
2015-11-09 22:48:40 +01:00
wm4 eeb5f98758 vo_opengl: handle GL_ARB_uniform_buffer_object with low GLSL versions
Why is this stupid crap being so much a pain for no reason.
2015-11-09 16:24:01 +01:00
wm4 46cee66563 vo_opengl: rename fancy-downscaling to correct-downscaling
The old name was stupid. Very stupid.
2015-11-07 17:49:14 +01:00
Avi Halachmi (:avih) 0062c98dff vo_opengl: fancy-downscaling: enable also for anamorphic clips 2015-11-07 17:44:50 +01:00
wm4 cfa4952f7c vo_opengl: glBindBufferBase is not part of GL 2.1/GLES 2.0
Commit 27dc834f added it as such.

Also remove the check for glUniformBlockBinding() - it's part of an
extension, and the check glGetUniformBlockIndex() already checks whether
the extension is fully available.
2015-11-06 13:59:33 +01:00
Bin Jin 27dc834f37 vo_opengl: implement NNEDI3 prescaler
Implement NNEDI3, a neural network based deinterlacer.

The shader is reimplemented in GLSL and supports both 8x4 and 8x6
sampling window now. This allows the shader to be licensed
under LGPL2.1 so that it can be used in mpv.

The current implementation supports uploading the NN weights (up to
51kb with placebo setting) in two different way, via uniform buffer
object or hard coding into shader source. UBO requires OpenGL 3.1,
which only guarantee 16kb per block. But I find that 64kb seems to be
a default setting for recent card/driver (which nnedi3 is targeting),
so I think we're fine here (with default nnedi3 setting the size of
weights is 9kb). Hard-coding into shader requires OpenGL 3.3, for the
"intBitsToFloat()" built-in function. This is necessary to precisely
represent these weights in GLSL. I tried several human readable
floating point number format (with really high precision as for
single precision float), but for some reason they are not working
nicely, bad pixels (with NaN value) could be produced with some
weights set.

We could also add support to upload these weights with texture, just
for compatibility reason (etc. upscaling a still image with a low end
graphics card). But as I tested, it's rather slow even with 1D
texture (we probably had to use 2D texture due to dimension size
limitation). Since there is always better choice to do NNEDI3
upscaling for still image (vapoursynth plugin), it's not implemented
in this commit. If this turns out to be a popular demand from the
user, it should be easy to add it later.

For those who wants to optimize the performance a bit further, the
bottleneck seems to be:
1. overhead to upload and access these weights, (in particular,
   the shader code will be regenerated for each frame, it's on CPU
   though).
2. "dot()" performance in the main loop.
3. "exp()" performance in the main loop, there are various fast
   implementation with some bit tricks (probably with the help of the
   intBitsToFloat function).

The code is tested with nvidia card and driver (355.11), on Linux.

Closes #2230
2015-11-05 17:38:20 +01:00
Bin Jin 4c43c30421 vo_opengl: add Super-xBR filter for upscaling
Add the Super-xBR filter for image doubling, and the prescaling framework
to support it.

The shader code was ported from MPDN extensions project, with
modification to process luma only.

This commit is largely inspired by code from #2266, with
`gl_transform_trans()` authored by @haasn taken directly.
2015-11-05 17:38:20 +01:00
Bin Jin 7438f208c3 vo_opengl: make image size dynamic during rendering
This commit marks the image size variables temporary, and renames them
in order to prevent any potential confusion in the future.
2015-11-05 17:38:20 +01:00
wm4 e6a395c297 vo_opengl, vo_opengl_cb: drop unneeded vo_frame fields
next_vsync/prev_vsync was only used to retrieve the vsync duration. We
can get this in a simpler way.

This also removes the vsync duration estimation from vo_opengl_cb.c,
which is probably worthless anyway. (And once interpolation is made
display-sync only, this won't matter at all.)
2015-11-04 21:49:54 +01:00
wm4 8737732035 vo_opengl: cache frames only in display-sync mode
vo_frame.num_vsyncs can be != 1 in some cases in normal sync mode too.
This is not a very exact fix, but in exchange it's robust. (These
vo_frame flags are way too tricky in combination with redrawing and
such.)
2015-10-30 12:53:43 +01:00
wm4 67aab3a9f6 vo_opengl: do not attempt to cache frames in FBO in dumb-mode
There were occasional shader compilation and rendering failures if FBOs
were unavailable. This is caused by the FBO caching code getting active,
even though FBOs are unavailable (i.e. dumb-mode).

Boken by commit 97fc4f.

Fixes #2432.
2015-10-30 12:49:12 +01:00
Bin Jin 17b4fb02b3 vo_opengl: remove source shader leftover
The source shader was removed after deband was introduced.
2015-10-24 17:11:02 +02:00
Niklas Haas 97fc4f4a85 vo_opengl: always cache to an FBO when not interpolating
This speeds up redraws considerably (improving eg. <60 Hz material on a 60 Hz
monitor with display-sync active, or redraws while paused), but slightly
slows down the worst case (eg. video FPS = display FPS).
2015-10-23 19:51:20 +02:00
wm4 e3de309804 vo_opengl: support all kinds of GBRP formats
Adds support for AV_PIX_FMT_GBRP9, AV_PIX_FMT_GBRP10, AV_PIX_FMT_GBRP12,
AV_PIX_FMT_GBRP14, AV_PIX_FMT_GBRP16, AV_PIX_FMT_GBRAP, and
AV_PIX_FMT_GBRAP16.

(Not that it matters, because nobody uses these anyway.)
2015-10-18 18:37:24 +02:00
wm4 4a07205963 vo_opengl: debanding requires GLSL 1.30
We have to disable it, or shader compilation will fail.

Fixes #2362.
2015-10-01 20:44:39 +02:00
wm4 41bf41e416 vo_opengl: do not reset video queue when changing video equalizer
If interpolation is enabled, then this causes heavy artifacts if done
while unpaused. It's preferable to allow a latency of a few frames for
the change to take full effect instead. If this is done paused, the
frame is fully redrawn anyway.
2015-09-30 22:59:34 +02:00
wm4 57831d52dc vo_opengl: actually set hardware decoder mapped texture format
Surfaces used by hardware decoding formats can be mapped exactly like a
specific software pixel format, e.g. RGBA or NV12. p->image_params is
supposed to be set to this format, but it wasn't.

(How did this ever work?)

Also, setting params->imgfmt in the hwdec interop drivers is pointless
and redundant. (Change them to asserts, because why not.)
2015-09-24 23:48:57 +02:00
wm4 cb1c072534 vo_opengl: remove sharpen scalers, add sharpen sub-option
This turns the old scalers (inherited from MPlayer) into a pre-
processing step (after color conversion and before scaling). The code
for the "sharpen5" scaler is reused for this.

The main reason MPlayer implemented this as scalers was perhaps because
FBOs were too expensive, and making it a scaler allowed to implement
this in 1 pass. But unsharp masking is not really a scaler, and I would
guess the result is more like combining bilinear scaling and unsharp
masking.
2015-09-23 22:43:27 +02:00
wm4 65ad85790a vo_opengl: remove unsued chroma_location field
This was redundant to forcing the value with vf_format, so the vo_opengl
sub-option was removed. This field is just a leftover.
2015-09-23 22:16:36 +02:00
wm4 17cd6798a6 vo_opengl: move shader file caching to video.c
It's just about loading and cachign small files, not does not
necessarily have anything to do with shaders. Move it to video.c where
it's used.
2015-09-23 22:13:03 +02:00
wm4 a8eae12af5 vo_opengl: fix shader compilation with debanding and OSX hwdec
2 things are being stupid here: Apple for requiring rectangle textures
with their IOSurface interop for no reason, and OpenGL having a
different sampler type for rectangle textures.
2015-09-10 20:53:47 +02:00
wm4 b4abcbd19d vo_opengl: fix deband sub-option handling
This all has to be done manually.
2015-09-09 20:40:04 +02:00
Niklas Haas 97363e176d vo_opengl: implement debanding (and remove source-shader)
The removal of source-shader is a side effect, since this effectively
replaces it - and the video-reading code has been significantly
restructured to make more sense and be more readable.

This means users no longer have to constantly download and maintain a
separate deband.glsl installation alongside mpv, which was the only real
use case for source-shader that we found either way.
2015-09-09 19:19:23 +02:00
Niklas Haas eb56807b41 vo_opengl: move self-contained shader routines to a separate file
This is mostly to cut down somewhat on the amount of code bloat in
video.c by moving out helper functions (including scaler kernels and
color management routines) to a separate file.

It would certainly be possible to move out more functions (eg. dithering
or CMS code) with some extra effort/refactoring, but this is a start.

Signed-off-by: wm4 <wm4@nowhere>
2015-09-09 18:17:44 +02:00
Niklas Haas 7929e36e93 vo_opengl: reduce code duplication for scaler options
This simple refactor cuts down on the immense amount of overhead and
duplication across all of the related scale-* options.
2015-09-09 18:09:40 +02:00
Niklas Haas 44eda2177d vo_opengl: remove gl_ prefixes from files in video/out/opengl
This is a bit redundant with the name of the directory itself, and not
in line with existing naming conventions.
2015-09-09 18:09:31 +02:00