Commit Graph

282 Commits

Author SHA1 Message Date
Niklas Haas 7c3d78fd82 vo_opengl: support external user hooks
This allows users to add their own near-arbitrary hooks to the vo_opengl
processing pipeline, greatly enhancing the flexibility of user shaders.
This enables, among other things, user shaders such as CrossBilateral,
SuperRes, LumaSharpen and many more.

To make parsing the user shaders easier, shaders are now loaded as
bstrs, and the hooks are set up during video reconfig instead of on
every single frame.
2016-05-15 20:42:02 +02:00
Niklas Haas d53142f9ba vo_opengl: add optional hook points
These are "sequence points" where the image could be rendered out to an
FBO, hooked, and re-loaded if any such hook exists. This is perfect for
things like the current user shaders system, as well as optional effects
like unsharp masking.

Note that since we have to pick *some* FBO to store the optionally
hooked texture, we just store it in an array indexed by an increasing
counter. Since we only ever store as many as MAX_TEXTURE_HOOKS + all
internal hook points entries, this is guaranteed to be enough space.

This commit also removes some of the now unused FBOs.
2016-05-15 20:42:02 +02:00
Niklas Haas 070edd7300 vo_opengl: add hooks and rework pass_read_video
The hook mechanism allows arbitrary processing stages to get dispatched
whenever certain named textures have been "finalized" by the code.

This is mostly meant to serve as a change that opens up the internal
processing in pass_read_video to user scripts, but as a side benefit all
of the code dealing with offsets and plane alignment and other such
confusing things has been rewritten.

This hook mechanism is powerful enough to cover the needs of both
debanding and prescaling (and more), so as a result they can be removed
from pass_read_video entirely and implemented through hooks.

Some avenues for optimization:

- The prescale hook is currently somewhat distributed code-wise. It might be
  cleaner to split it into superxbr and NNEDI3 hooks which can each be
  self-contained.

- It might be possible to move a large part of the hook code out to an
  external file (including the hook definitions for debanding and
  prescaling), which would be very much desired.

- Currently, some stages (chroma merging, integer conversion) will
  *always* run even if unnecessary. I'm planning another series of
  refactors (deferred img_tex) to allow dropping unnecessary shader
  stages like these, but that's probably some ways away. In the meantime
  it would be doable to re-add some of the logic to skip these stages if
  we know we don't need them.

- More hook locations could be added (?)
2016-05-15 20:42:02 +02:00
Niklas Haas 3d4889e91e vo_opengl: minor change to scaler_resizes_only
Instead of rounding down, we round to the nearest float. This reduces
the maximum possible error introduced by this rounding operation. Also
clarify the comment.
2016-05-15 20:42:02 +02:00
wm4 449b948ee8 vo_opengl: remove some pointless compatibility
Remove non-texture_rg compatibility from LUT sampling. OpenGL without
texture_rg support will always trigger dumb-mode, and dumb-mode does not
use LUTs. It used not to, and that was when this made sense.
2016-05-14 12:02:02 +02:00
wm4 3858d37b61 vo_opengl: partially fix 0bgr format support
Fixes broken colors with --vf=format=0bgr (but only if deband is
disabled).

0bgr means the first byte is padding, while the following three bytes
are bgr. From the vo_opengl perspective, it has 4 physical components
with 3 logical components. copy_img_tex() simply copied 3 components
from the physical representation, which means the last component (r) was
sliced off.

Fix this by not using p->color_swizzle for packed formats, and instead
let packed formats set the per-plane swizzle in texplane.swizzle. The
latter applies the swizzle as part of operation in copy_img_tex(), which
essentially moves physical to logical representations.

Unfortunately, debanding (and thus with opengl-hq defaults) is still
broken.
2016-05-13 22:35:42 +02:00
wm4 09e07e92c5 vo_opengl: drop duplicate LUMINANCE_ALPHA handling
This was supposed to handle the absence of GL_ARB_texture_rg. But it's
already handled elsewhere. (init_format() sets texplane.swizzle
accordingly.)
2016-05-13 22:07:25 +02:00
wm4 7d2c6d60da vo_opengl: minor simplification
Make the find_plane_format function take a bit count.

This also makes the function's comment true for the first time the
function and its comment exist. (It was commented as taking bits, but
always took bytes.)
2016-05-13 21:46:08 +02:00
wm4 c9d8bc088c vo_opengl: restrict ES2 FBO formats
Only a few very low bit depth internal formats can be rendered to in
pure ES2 (GL_RGB565 is the "best" one).

Seems like the only potentially reasonable renderable formats in ES2
could be provided via GL_OES_rgb8_rgba8, or half-floats, so don't
bother with this at all.
2016-05-13 18:50:38 +02:00
wm4 f7c81c03b2 vo_opengl: angle: log extension string 2016-05-13 15:39:11 +02:00
wm4 a228bf54c8 vo_opengl: slightly better FBO format check
Now that we know in advance whether an implementation should support a
specific format, we have more flexibility when determining which format
to use.

In particular, we can drop the roundabout ES logic.

I'm not sure if actually trying to create the FBO for probing still has
any value. But it might, so leave it for now.
2016-05-12 21:22:28 +02:00
wm4 0cd217b039 vo_opengl: disable scalers on ES2
Even if everything else is available, the need for first class arrays
breaks it. In theory we could fix this since we don't strictly need
them, but I guess it's not worth bothering.

Also give the misnamed have_mix variable a slightly better name.
2016-05-12 21:22:28 +02:00
wm4 bd41a3ab62 vo_opengl: add detection for the ES texture_rg extension 2016-05-12 21:22:28 +02:00
wm4 84ccebd9b9 vo_opengl: reorganize texture format handling
This merges all knowledge about texture format into a central table.

Most of the work done here is actually identifying which formats exactly
are supported by OpenGL(ES) under which circumstances, and keeping this
information in the format table in a somewhat declarative way. (Although
only to the extend needed by mpv.) In particular, ES and float formats
are a horrible mess.

Again this is a big refactor that might cause regression on "obscure"
configurations.
2016-05-12 21:22:28 +02:00
wm4 e68b510a94 vo_opengl: correctly disable interpolation if tscale can't be used
It'll fail with an assertion in the interpolation code otherwise.
2016-05-12 21:22:28 +02:00
wm4 d4712af5af vo_opengl: angle: dump translated shaders
Helpful for debugging and such.
2016-05-12 11:17:49 +02:00
wm4 01d04b100f vo_opengl: don't use dumb-mode with 10 bit integer texture hack
Recent regression. Caused it to use dumb-mode with integer textures,
which on ANGLE leads to nearest scaling.
2016-05-11 17:41:00 +02:00
wm4 70b3561270 video: add --hwdec=auto-copy mode
This uses the normal autoprobing rules like "auto", but rejects anything
that isn't flagged as copying data back to system memory.

The chunk in command.c was dead code, so remove it instead of updating
it.
2016-05-11 16:20:13 +02:00
wm4 cb694ffd8e vo_opengl: d3d11egl: support full range YUV
MSDN documents this as "Introduced in Windows 8.1.". I assume on Windows
7 this field will simply be ignored. Too bad for Windows 7 users.

Also, I'm not using D3D11_VIDEO_PROCESSOR_NOMINAL_RANGE_16_235 and
D3D11_VIDEO_PROCESSOR_NOMINAL_RANGE_0_255, because these are apparently
completely missing from the MinGW headers. (Such a damn pain.)
2016-05-11 15:40:31 +02:00
wm4 7983cda5bf vo_opengl: d3d11egl: don't require EGL_EXT_device_query
Older ANGLE builds don't export this.

This change is really only for convenience, and I might revert it at
some later point.
2016-05-11 15:40:31 +02:00
wm4 fd82e14888 build: merge d3d11va and dxva2 hwaccel checks
We don't have any reason to disable either. Both are loaded dynamically
at runtime anyway. There is also no reason why dxva2 would disappear
from libavcodec any time soon.
2016-05-11 15:40:31 +02:00
wm4 fde20d10bc vo_opengl: angle: dynamically load ANGLE
ANGLE is _really_ annoying to build. (Requires special toolchain and a
recent MSVC version.) This results in various issues with people
having trouble to build mpv against ANGLE (apparently linking it
against a prebuilt binary doesn't count, or using binaries from
potentially untrusted sources is not wanted).

Dynamically loading ANGLE is going to be a huge convenience. This commit
implements this, with special focus on keeping it source compatible to
a normal build with ANGLE linked at build-time.
2016-05-11 15:39:29 +02:00
wm4 1f6e71c7fa vo_opengl: fix passing along swizzle from hwdec interop
In theory this was needed for the previous commit (but wasn't in
practice, since for hwdec the LUMINANCE_ALPHA mangling is not applied
anymore, and ANGLE uses RG textures in absence of GL_ARB_texture_rg for
whatever crazy reasons).

In practice this caused funky colors on OSX with the uyvy422 format,
which is also fixed in this commit.
2016-05-10 21:12:57 +02:00
wm4 a3d416c3d3 vo_opengl: d3d11egl: native NV12 sampling support
This uses EGL_ANGLE_stream_producer_d3d_texture_nv12 and related
extensions to map the D3D textures coming from the hardware decoder
directly in GL.

In theory this would be trivial to achieve, but unfortunately ANGLE does
not have a mechanism to "import" D3D textures as GL textures. Instead,
an awkward mechanism via EGL_KHR_stream was implemented, which involves
at least 5 extensions and a lot of glue code. (Even worse than VAAPI EGL
interop, and very far from the simplicity you get on OSX.)

The ANGLE mechanism so far supports only the NV12 texture format, which
means 10 bit won't work. It also does not work in ES3 mode yet. For
these reasons, the "old" ID3D11VideoProcessor code is kept and used as a
fallback.
2016-05-10 21:06:34 +02:00
wm4 4b3faf9dc1 vo_opengl: add an angle-es2 backend
It forces es2 mode on ANGLE. Only useful for testing. Since the normal
"angle" backend already falls back to es2 if es3 does not work, this new
backend always exit when autoprobing it.
2016-05-10 20:19:25 +02:00
wm4 12ae19c449 vo_opengl: cosmetics: rename variables
"p" is used for the private context everywhere in the source file, but
renaming it also requires renaming some local variables.
2016-05-10 18:49:49 +02:00
wm4 b0b01aa250 vo_opengl: refactor how hwdec interop exports textures
Rename gl_hwdec_driver.map_image to map_frame, and let it fill out a
struct gl_hwdec_frame describing the exact texture layout. This gives
more flexibility to what the hwdec interop can export. In particular, it
can export strange component orders/permutations and textures with
padded size. (The latter originating from cropped video.)

The way gl_hwdec_frame works is in the spirit of the rest of the
vo_opengl video processing code, which tends to put as much information
in immediate state (as part of the dataflow), instead of declaring it
globally. To some degree this duplicates the texplane and img_tex
structs, but until we somehow unify those, it's better to give the hwdec
state its own struct. The fact that changing the hwdec struct would
require changes and testing on at least 4 platform/GPU combinations
makes duplicating it almost a requirement to avoid pain later.

Make gl_hwdec_driver.reinit set the new image format and remove the
gl_hwdec.converted_imgfmt field.

Likewise, gl_hwdec.gl_texture_target is replaced with
gl_hwdec_plane.gl_target.

Split out a init_image_desc function from init_format. The latter is not
called in the hwdec case at all anymore. Setting up most of struct
texplane is also completely separate in the hwdec and normal cases.

video.c does not check whether the hwdec "mapped" image format is
supported. This should not really happen anyway, and if it does, the
hwdec interop backend must fail at creation time, so this is not an
issue.
2016-05-10 18:42:42 +02:00
wm4 46fff8d31a video: refactor how VO exports hwdec device handles
The main change is with video/hwdec.h. mp_hwdec_info is made opaque (and
renamed to mp_hwdec_devices). Its accessors are mainly thread-safe (or
documented where not), which makes the whole thing saner and cleaner. In
particular, thread-safety rules become less subtle and more obvious.

The new internal API makes it easier to support multiple OpenGL interop
backends. (Although this is not done yet, and it's not clear whether it
ever will.)

This also removes all the API-specific fields from mp_hwdec_ctx and
replaces them with a "ctx" field. For d3d in particular, we drop the
mp_d3d_ctx struct completely, and pass the interfaces directly.

Remove the emulation checks from vaapi.c and vdpau.c; they are
pointless, and the checks that matter are done on the VO layer.

The d3d hardware decoders might slightly change behavior: dxva2-copy
will not use the VO device anymore if the VO supports proper interop.
This pretty much assumes that any in such cases the VO will not use any
form of exclusive mode, which makes using the VO device in copy mode
unnecessary.

This is a big refactor. Some things may be untested and could be broken.
2016-05-09 20:03:22 +02:00
wm4 7a75e7c002 vo_opengl: angle: avoid fullscreen FBO copy for flipping
In order to honor the differences between OpenGL and Direct3D coordinate
systems, ANGLE uses a full FBO copy merely to flip the final frame
vertically. This can be avoided with the EGL_ANGLE_surface_orientation
extension.
2016-05-05 18:44:41 +02:00
wm4 605dd928d3 vo_opengl: angle: call eglTerminate()
I hope that this does what we expect it does: destroy the EGLDisplay
specific to our HDC. (Some implementations will terminate all EGL
contexts in the whole process.)

eglReleaseThread() merely calls eglMakeCurrent(0, 0, 0, 0), which is
not enough.

This commit also fixes the problem fixed with the previous commit,
but I think both changes are needed to make our API usage clean.
2016-05-05 13:46:18 +02:00
wm4 f56555b514 vo_opengl: EGL: fix hwdec probing
If ANGLE was probed before (but rejected), the ANGLE API can remain
"initialized", and eglGetCurrentDisplay() will return a non-NULL
EGLDisplay. Then if a native GL context is used, the ANGLE/EGL API will
then (apparently) keep working alongside native OpenGL API. Since GL
objects are just numbers, they'll simply fail to interact, and OpenGL
will get invalid textures. For some reason this will result in black
textures.

With VAAPI-EGL, something similar could happen in theory, but didn't in
practice.
2016-05-05 13:38:08 +02:00
wm4 833375f88d command: change some hwdec properties
Introduce hwdec-current and hwdec-interop properties.

Deprecate hwdec-detected, which never made a lot of sense, and which is
replaced by the new properties. hwdec-active also becomes useless, as
hwdec-current is a superset, so it's deprecated too (for now).
2016-05-04 16:55:26 +02:00
Niklas Haas 86b5f1463c lcms: don't warn/error on 3dlut cache misses
Cache misses are a normal and expected part of the operation of a cache.
It doesn't really make sense to show a user-visible warning for them.

To work-around this, just skip trying to open the cache if it doesn't
exist yet.
2016-05-04 12:10:55 +02:00
Niklas Haas 9054460bba lcms: improve black point handling (especially BT.1886)
First of all, black point compensation is now on by default. This is
really rather harmless and only improves the result (where "improvement"
means "less black clipping").

Second, this adds an option to limit the ICC profile's contrast, which
helps for untagged matrix profiles that are implicitly black scaled even
in colorimetric intent. (Note that this relies on BPC being enabled to
work properly, which is why the two changes are tied together)

Third, this uses the LittleCMS built in black point estimator instead of
relying on the presence of accurate A2B tables. This also checks tags
and does some amounts of noise elimination.

If the option is unspecified and the profile is missing black point
information, print a warning instructing the user to set the option, and
fall back to 1000 otherwise.
2016-05-04 12:10:45 +02:00
wm4 eefe7ad28b vo_opengl: vdpau: fix certain cases of preemption recovery failures
The vdpau_mixer could fail to be recreated properly if preemption
occured at some point before playback initialization (like when using
--hwdec-preload and the opengl-cb API).

Normally, the vdpau_mixer was supposed to be marked invalid when the
components using it detect a preemption, e.g. in hwdec_vdpau.c. This one
didn't mark the vdpau_mixer as invalid if preemption was detected in
reinit(), only in map_image().

It's cleaner to detect preemption directly in the vdpau_mixer, which
ensures it's always recreated correctly.
2016-05-03 13:56:11 +02:00
James Ross-Gowan 622bcb0e37 win32: replace libuuid.a usage with initguid.h
Including initguid.h at the top of a file that uses references to GUIDs
causes the GUIDs to be declared globally with __declspec(selectany). The
'selectany' attribute tells the linker to consolidate multiple
definitions of each GUID, which would be great except that, in Cygwin
and MinGW GCC 6.1, this method of linking makes the GUIDs conflict with
the ones declared in libuuid.a.

Since initguid.h obsoletes libuuid.a in modern compilers that support
__declspec(selectany), add initguid.h to all files that use GUIDs and
remove libuuid.a from the build.

Fixes #3097
2016-05-01 21:10:24 +10:00
wm4 304d9d58dd vo_opengl: fix build with GLES3 headers
Legacy desktop GL only symbols. Broken by the previous commit.
2016-04-27 20:26:08 +02:00
wm4 9d16837c99 vo_opengl: support GL_EXT_texture_norm16 on GLES
This gives us 16 bit fixed-point integer texture formats, including
ability to sample from them with linear filtering, and using them as FBO
attachments.

The integer texture format path is still there for the sake of ANGLE,
which does not support GL_EXT_texture_norm16 yet.

The change to pass_dither() is needed, because the code path using
GL_R16 for the dither texture relies on glTexImage2D being able to
convert from GL_FLOAT to GL_R16. GLES does not allow this. This could be
trivially fixed by doing the conversion ourselves, but I'm too lazy to
do this now.
2016-04-27 19:19:56 +02:00
wm4 757c8baf8c vo_opengl: always use sized internal formats
This shouldn't make much of a difference, but should make the following
commit simpler.
2016-04-27 19:02:04 +02:00
wm4 9cb036f297 vo_opengl: d3d11egl: minor simplification
This should be ok. eglBindTexImage() just associates the texture, and
does not make a copy (not even a conceptual one).
2016-04-27 14:35:24 +02:00
wm4 0b1ba577b1 vo_opengl: d3d11egl: print warning on unsupported colorspaces
Not much we can do about. If there are many complaints, a mechanism to
automatically disable interop in such cases could be added.
2016-04-27 14:34:46 +02:00
wm4 dff33893f2 d3d11va: store texture/subindex in IMGFMT_D3D11VA plane pointers
Basically this gets rid of the need for the accessors in d3d11va.h, and
the code can be cleaned up a little bit.

Note that libavcodec only defines a ID3D11VideoDecoderOutputView pointer
in the last plane pointers, but it tolerates/passes through the other
plane pointers we set.
2016-04-27 14:06:50 +02:00
wm4 3706918311 vo_opengl: D3D11VA + ANGLE interop
This uses ID3D11VideoProcessor to convert the video to a RGBA surface,
which is then bound to ANGLE. Currently ANGLE does not provide any way
to bind nv12 surfaces directly, so this will have to do.

ID3D11VideoContext1 would give us slightly more control about the
colorspace conversion, though it's still not good, and not available
in MinGW headers yet.

The video processor is created lazily, because we need to have the coded
frame size, of which AVFrame and mp_image have no concept of. Doing the
creation lazily is less of a pain than somehow hacking the coded frame
size into mp_image.

I'm not really sure how ID3D11VideoProcessorInputView is supposed to
work. We recreate it on every frame, which is simple and hopefully
doesn't affect performance.
2016-04-27 13:49:47 +02:00
wm4 d3a26272cd vo_opengl: print error if opengl hwdec interop fails 2016-04-27 13:32:49 +02:00
wm4 244eff9201 vo_opengl: always reset some GL state when leaving renderer
The active texture and some pixelstore parameters are now always reset
to defaults when entering and leaving the renderer. Could be important
for libmpv.
2016-04-22 12:08:21 +02:00
wm4 44644e69f0 vo_opengl: fix an outdated comment
This wasn't updated over multiple iterations.
2016-04-16 16:16:50 +02:00
wm4 b735c0e207 lcms: include math.h
Fixes #3053.
2016-04-15 13:58:41 +02:00
wm4 8c02c92ab9 vo_opengl: rpi: don't include x11 header file
Copy & paste bug.
2016-04-15 09:45:15 +02:00
Niklas Haas e3e03d0f34 vo_opengl: simplify and improve up scale=oversample
Since what we're doing is a linear blend of the four colors, we can just
do it for free by using GPU sampling.

This requires significantly fewer texture fetches and calculations to
compute the final color, making it much more efficient. The code is also
much shorter and simpler.
2016-04-12 16:26:53 +02:00
wm4 f5ff2656e0 vaapi: determine surface format in decoder, not in renderer
Until now, we have made the assumption that a driver will use only 1
hardware surface format. the format is dictated by the driver (you
don't create surfaces with a specific format - you just pass a
rt_format and get a surface that will be in a specific driver-chosen
format).

In particular, the renderer created a dummy surface to probe the format,
and hoped the decoder would produce the same format. Due to a driver
bug this required a workaround to actually get the same format as the
driver did.

Change this so that the format is determined in the decoder. The format
is then passed down as hw_subfmt, which allows the renderer to configure
itself with the correct format. If the hardware surface changes its
format midstream, the renderer can be reconfigured using the normal
mechanisms.

This calls va_surface_init_subformat() each time after the decoder
returns a surface. Since libavcodec/AVFrame has no concept of sub-
formats, this is unavoidable. It creates and destroys a derived
VAImage, but this shouldn't have any bad performance effects (at
least I didn't notice any measurable effects).

Note that vaDeriveImage() failures are silently ignored as some
drivers (the vdpau wrapper) support neither vaDeriveImage, nor EGL
interop. In addition, we still probe whether we can map an image
in the EGL interop code. This is important as it's the only way
to determine whether EGL interop is supported at all. With respect
to the driver bug mentioned above, it doesn't matter which format
the test surface has.

In vf_vavpp, also remove the rt_format guessing business. I think the
existing logic was a bit meaningless anyway. It's not even a given
that vavpp produces the same rt_format for output.
2016-04-11 22:03:26 +02:00