Commit Graph

261 Commits

Author SHA1 Message Date
wm4 fde20d10bc vo_opengl: angle: dynamically load ANGLE
ANGLE is _really_ annoying to build. (Requires special toolchain and a
recent MSVC version.) This results in various issues with people
having trouble to build mpv against ANGLE (apparently linking it
against a prebuilt binary doesn't count, or using binaries from
potentially untrusted sources is not wanted).

Dynamically loading ANGLE is going to be a huge convenience. This commit
implements this, with special focus on keeping it source compatible to
a normal build with ANGLE linked at build-time.
2016-05-11 15:39:29 +02:00
wm4 1f6e71c7fa vo_opengl: fix passing along swizzle from hwdec interop
In theory this was needed for the previous commit (but wasn't in
practice, since for hwdec the LUMINANCE_ALPHA mangling is not applied
anymore, and ANGLE uses RG textures in absence of GL_ARB_texture_rg for
whatever crazy reasons).

In practice this caused funky colors on OSX with the uyvy422 format,
which is also fixed in this commit.
2016-05-10 21:12:57 +02:00
wm4 a3d416c3d3 vo_opengl: d3d11egl: native NV12 sampling support
This uses EGL_ANGLE_stream_producer_d3d_texture_nv12 and related
extensions to map the D3D textures coming from the hardware decoder
directly in GL.

In theory this would be trivial to achieve, but unfortunately ANGLE does
not have a mechanism to "import" D3D textures as GL textures. Instead,
an awkward mechanism via EGL_KHR_stream was implemented, which involves
at least 5 extensions and a lot of glue code. (Even worse than VAAPI EGL
interop, and very far from the simplicity you get on OSX.)

The ANGLE mechanism so far supports only the NV12 texture format, which
means 10 bit won't work. It also does not work in ES3 mode yet. For
these reasons, the "old" ID3D11VideoProcessor code is kept and used as a
fallback.
2016-05-10 21:06:34 +02:00
wm4 4b3faf9dc1 vo_opengl: add an angle-es2 backend
It forces es2 mode on ANGLE. Only useful for testing. Since the normal
"angle" backend already falls back to es2 if es3 does not work, this new
backend always exit when autoprobing it.
2016-05-10 20:19:25 +02:00
wm4 12ae19c449 vo_opengl: cosmetics: rename variables
"p" is used for the private context everywhere in the source file, but
renaming it also requires renaming some local variables.
2016-05-10 18:49:49 +02:00
wm4 b0b01aa250 vo_opengl: refactor how hwdec interop exports textures
Rename gl_hwdec_driver.map_image to map_frame, and let it fill out a
struct gl_hwdec_frame describing the exact texture layout. This gives
more flexibility to what the hwdec interop can export. In particular, it
can export strange component orders/permutations and textures with
padded size. (The latter originating from cropped video.)

The way gl_hwdec_frame works is in the spirit of the rest of the
vo_opengl video processing code, which tends to put as much information
in immediate state (as part of the dataflow), instead of declaring it
globally. To some degree this duplicates the texplane and img_tex
structs, but until we somehow unify those, it's better to give the hwdec
state its own struct. The fact that changing the hwdec struct would
require changes and testing on at least 4 platform/GPU combinations
makes duplicating it almost a requirement to avoid pain later.

Make gl_hwdec_driver.reinit set the new image format and remove the
gl_hwdec.converted_imgfmt field.

Likewise, gl_hwdec.gl_texture_target is replaced with
gl_hwdec_plane.gl_target.

Split out a init_image_desc function from init_format. The latter is not
called in the hwdec case at all anymore. Setting up most of struct
texplane is also completely separate in the hwdec and normal cases.

video.c does not check whether the hwdec "mapped" image format is
supported. This should not really happen anyway, and if it does, the
hwdec interop backend must fail at creation time, so this is not an
issue.
2016-05-10 18:42:42 +02:00
wm4 46fff8d31a video: refactor how VO exports hwdec device handles
The main change is with video/hwdec.h. mp_hwdec_info is made opaque (and
renamed to mp_hwdec_devices). Its accessors are mainly thread-safe (or
documented where not), which makes the whole thing saner and cleaner. In
particular, thread-safety rules become less subtle and more obvious.

The new internal API makes it easier to support multiple OpenGL interop
backends. (Although this is not done yet, and it's not clear whether it
ever will.)

This also removes all the API-specific fields from mp_hwdec_ctx and
replaces them with a "ctx" field. For d3d in particular, we drop the
mp_d3d_ctx struct completely, and pass the interfaces directly.

Remove the emulation checks from vaapi.c and vdpau.c; they are
pointless, and the checks that matter are done on the VO layer.

The d3d hardware decoders might slightly change behavior: dxva2-copy
will not use the VO device anymore if the VO supports proper interop.
This pretty much assumes that any in such cases the VO will not use any
form of exclusive mode, which makes using the VO device in copy mode
unnecessary.

This is a big refactor. Some things may be untested and could be broken.
2016-05-09 20:03:22 +02:00
wm4 7a75e7c002 vo_opengl: angle: avoid fullscreen FBO copy for flipping
In order to honor the differences between OpenGL and Direct3D coordinate
systems, ANGLE uses a full FBO copy merely to flip the final frame
vertically. This can be avoided with the EGL_ANGLE_surface_orientation
extension.
2016-05-05 18:44:41 +02:00
wm4 605dd928d3 vo_opengl: angle: call eglTerminate()
I hope that this does what we expect it does: destroy the EGLDisplay
specific to our HDC. (Some implementations will terminate all EGL
contexts in the whole process.)

eglReleaseThread() merely calls eglMakeCurrent(0, 0, 0, 0), which is
not enough.

This commit also fixes the problem fixed with the previous commit,
but I think both changes are needed to make our API usage clean.
2016-05-05 13:46:18 +02:00
wm4 f56555b514 vo_opengl: EGL: fix hwdec probing
If ANGLE was probed before (but rejected), the ANGLE API can remain
"initialized", and eglGetCurrentDisplay() will return a non-NULL
EGLDisplay. Then if a native GL context is used, the ANGLE/EGL API will
then (apparently) keep working alongside native OpenGL API. Since GL
objects are just numbers, they'll simply fail to interact, and OpenGL
will get invalid textures. For some reason this will result in black
textures.

With VAAPI-EGL, something similar could happen in theory, but didn't in
practice.
2016-05-05 13:38:08 +02:00
wm4 833375f88d command: change some hwdec properties
Introduce hwdec-current and hwdec-interop properties.

Deprecate hwdec-detected, which never made a lot of sense, and which is
replaced by the new properties. hwdec-active also becomes useless, as
hwdec-current is a superset, so it's deprecated too (for now).
2016-05-04 16:55:26 +02:00
Niklas Haas 86b5f1463c lcms: don't warn/error on 3dlut cache misses
Cache misses are a normal and expected part of the operation of a cache.
It doesn't really make sense to show a user-visible warning for them.

To work-around this, just skip trying to open the cache if it doesn't
exist yet.
2016-05-04 12:10:55 +02:00
Niklas Haas 9054460bba lcms: improve black point handling (especially BT.1886)
First of all, black point compensation is now on by default. This is
really rather harmless and only improves the result (where "improvement"
means "less black clipping").

Second, this adds an option to limit the ICC profile's contrast, which
helps for untagged matrix profiles that are implicitly black scaled even
in colorimetric intent. (Note that this relies on BPC being enabled to
work properly, which is why the two changes are tied together)

Third, this uses the LittleCMS built in black point estimator instead of
relying on the presence of accurate A2B tables. This also checks tags
and does some amounts of noise elimination.

If the option is unspecified and the profile is missing black point
information, print a warning instructing the user to set the option, and
fall back to 1000 otherwise.
2016-05-04 12:10:45 +02:00
wm4 eefe7ad28b vo_opengl: vdpau: fix certain cases of preemption recovery failures
The vdpau_mixer could fail to be recreated properly if preemption
occured at some point before playback initialization (like when using
--hwdec-preload and the opengl-cb API).

Normally, the vdpau_mixer was supposed to be marked invalid when the
components using it detect a preemption, e.g. in hwdec_vdpau.c. This one
didn't mark the vdpau_mixer as invalid if preemption was detected in
reinit(), only in map_image().

It's cleaner to detect preemption directly in the vdpau_mixer, which
ensures it's always recreated correctly.
2016-05-03 13:56:11 +02:00
James Ross-Gowan 622bcb0e37 win32: replace libuuid.a usage with initguid.h
Including initguid.h at the top of a file that uses references to GUIDs
causes the GUIDs to be declared globally with __declspec(selectany). The
'selectany' attribute tells the linker to consolidate multiple
definitions of each GUID, which would be great except that, in Cygwin
and MinGW GCC 6.1, this method of linking makes the GUIDs conflict with
the ones declared in libuuid.a.

Since initguid.h obsoletes libuuid.a in modern compilers that support
__declspec(selectany), add initguid.h to all files that use GUIDs and
remove libuuid.a from the build.

Fixes #3097
2016-05-01 21:10:24 +10:00
wm4 304d9d58dd vo_opengl: fix build with GLES3 headers
Legacy desktop GL only symbols. Broken by the previous commit.
2016-04-27 20:26:08 +02:00
wm4 9d16837c99 vo_opengl: support GL_EXT_texture_norm16 on GLES
This gives us 16 bit fixed-point integer texture formats, including
ability to sample from them with linear filtering, and using them as FBO
attachments.

The integer texture format path is still there for the sake of ANGLE,
which does not support GL_EXT_texture_norm16 yet.

The change to pass_dither() is needed, because the code path using
GL_R16 for the dither texture relies on glTexImage2D being able to
convert from GL_FLOAT to GL_R16. GLES does not allow this. This could be
trivially fixed by doing the conversion ourselves, but I'm too lazy to
do this now.
2016-04-27 19:19:56 +02:00
wm4 757c8baf8c vo_opengl: always use sized internal formats
This shouldn't make much of a difference, but should make the following
commit simpler.
2016-04-27 19:02:04 +02:00
wm4 9cb036f297 vo_opengl: d3d11egl: minor simplification
This should be ok. eglBindTexImage() just associates the texture, and
does not make a copy (not even a conceptual one).
2016-04-27 14:35:24 +02:00
wm4 0b1ba577b1 vo_opengl: d3d11egl: print warning on unsupported colorspaces
Not much we can do about. If there are many complaints, a mechanism to
automatically disable interop in such cases could be added.
2016-04-27 14:34:46 +02:00
wm4 dff33893f2 d3d11va: store texture/subindex in IMGFMT_D3D11VA plane pointers
Basically this gets rid of the need for the accessors in d3d11va.h, and
the code can be cleaned up a little bit.

Note that libavcodec only defines a ID3D11VideoDecoderOutputView pointer
in the last plane pointers, but it tolerates/passes through the other
plane pointers we set.
2016-04-27 14:06:50 +02:00
wm4 3706918311 vo_opengl: D3D11VA + ANGLE interop
This uses ID3D11VideoProcessor to convert the video to a RGBA surface,
which is then bound to ANGLE. Currently ANGLE does not provide any way
to bind nv12 surfaces directly, so this will have to do.

ID3D11VideoContext1 would give us slightly more control about the
colorspace conversion, though it's still not good, and not available
in MinGW headers yet.

The video processor is created lazily, because we need to have the coded
frame size, of which AVFrame and mp_image have no concept of. Doing the
creation lazily is less of a pain than somehow hacking the coded frame
size into mp_image.

I'm not really sure how ID3D11VideoProcessorInputView is supposed to
work. We recreate it on every frame, which is simple and hopefully
doesn't affect performance.
2016-04-27 13:49:47 +02:00
wm4 d3a26272cd vo_opengl: print error if opengl hwdec interop fails 2016-04-27 13:32:49 +02:00
wm4 244eff9201 vo_opengl: always reset some GL state when leaving renderer
The active texture and some pixelstore parameters are now always reset
to defaults when entering and leaving the renderer. Could be important
for libmpv.
2016-04-22 12:08:21 +02:00
wm4 44644e69f0 vo_opengl: fix an outdated comment
This wasn't updated over multiple iterations.
2016-04-16 16:16:50 +02:00
wm4 b735c0e207 lcms: include math.h
Fixes #3053.
2016-04-15 13:58:41 +02:00
wm4 8c02c92ab9 vo_opengl: rpi: don't include x11 header file
Copy & paste bug.
2016-04-15 09:45:15 +02:00
Niklas Haas e3e03d0f34 vo_opengl: simplify and improve up scale=oversample
Since what we're doing is a linear blend of the four colors, we can just
do it for free by using GPU sampling.

This requires significantly fewer texture fetches and calculations to
compute the final color, making it much more efficient. The code is also
much shorter and simpler.
2016-04-12 16:26:53 +02:00
wm4 f5ff2656e0 vaapi: determine surface format in decoder, not in renderer
Until now, we have made the assumption that a driver will use only 1
hardware surface format. the format is dictated by the driver (you
don't create surfaces with a specific format - you just pass a
rt_format and get a surface that will be in a specific driver-chosen
format).

In particular, the renderer created a dummy surface to probe the format,
and hoped the decoder would produce the same format. Due to a driver
bug this required a workaround to actually get the same format as the
driver did.

Change this so that the format is determined in the decoder. The format
is then passed down as hw_subfmt, which allows the renderer to configure
itself with the correct format. If the hardware surface changes its
format midstream, the renderer can be reconfigured using the normal
mechanisms.

This calls va_surface_init_subformat() each time after the decoder
returns a surface. Since libavcodec/AVFrame has no concept of sub-
formats, this is unavoidable. It creates and destroys a derived
VAImage, but this shouldn't have any bad performance effects (at
least I didn't notice any measurable effects).

Note that vaDeriveImage() failures are silently ignored as some
drivers (the vdpau wrapper) support neither vaDeriveImage, nor EGL
interop. In addition, we still probe whether we can map an image
in the EGL interop code. This is important as it's the only way
to determine whether EGL interop is supported at all. With respect
to the driver bug mentioned above, it doesn't matter which format
the test surface has.

In vf_vavpp, also remove the rt_format guessing business. I think the
existing logic was a bit meaningless anyway. It's not even a given
that vavpp produces the same rt_format for output.
2016-04-11 22:03:26 +02:00
wm4 87cb2339a6 vo_opengl: improve rotation handling (again)
Apply basic transformations like rotation by 90° and mirroring when
sampling from the source textures. The original idea was making this
part of img_tex.transform, but this didn't work: lots of code plays
tricks on the transform, so manipulating it is not necessarily
transparent, especially when width/height are switched. So add a new
pre_transform field, which is strictly applied before the normal
transform.

This fixes most glitches involved with rotating the image.

Cropping and rotation are now weirdly separated, even though they could
be done in the same step. I think this is not much of a problem, and
has the advantage that changing panscan does not trigger FBO
reallocations (I think...).
2016-04-08 22:21:38 +02:00
wm4 6325bdf197 vo_opengl: log if glGetString(GL_VERSION) returns NULL
Typically happens with some implementations if no context is currrent,
or is otherwise broken. This is particularly relevant to the opengl_cb
API, because the API user will have no other indication what went wrong.
2016-04-08 10:57:21 +02:00
wm4 0812719497 vo_opengl: videotoolbox: use kCVPixelBufferLock_ReadOnly for screenshots
Why not.
2016-04-07 19:55:51 +02:00
wm4 f033481551 videotoolbox: change how videotoolbox format is managed
The underlying intention of this code is to make changing
--videotoolbox-format at runtime work. For this reason, the format can't
just be statically setup, but must be read from the option at runtime.

This means the format is not fixed anymore, and we have to make sure the
renderer is property reinitialized if the format changes. There is
currently no way to trigger reinit on this level, which is why the
mp_image_params.hw_subfmt field was introduced.

One sketchy thing remains: normally, the renderer is supposed to be
involved with VO format negotiation, which would ensure that the VO
can take the format at all. Since the hw_subfmt is not part of this
format negotiation, it's implied the get_vt_fmt() callback only
returns formats supported by the renderer. This is not necessarily
clear because vo_opengl checks this with converted_imgfmt separately.
None of this matters in practice though, because we know all formats
are always supported.

(This still requires somehow triggering decoder reinit to make the
change effective.)
2016-04-07 19:54:58 +02:00
wm4 796b32c4d7 vo_opengl: fix build breakage 2016-04-06 01:21:16 +02:00
wm4 7a5312e9a6 vo_opengl: minor simplification
It's the same functionally.
2016-04-05 20:58:22 +02:00
wm4 afd685490d vo_opengl: fix nnedi + rectangle textures
Shader compilation error due to incompatible samplers.
2016-04-05 20:57:02 +02:00
Niklas Haas a3361ad0ce gl_lcms: choose BT.1886 gamma per-channel
This makes the black point closer (chromatically) to the white point, by
ensuring channels keep their consistent brightness ratios as they go
down to zero.

I also raised the 3DLUT version as this changes semantics and is a
separate commit from the previous one.
2016-04-01 10:27:32 +02:00
Niklas Haas 2dcf18c0c0 vo_opengl: generate 3DLUT against source and use full BT.1886
This commit refactors the 3DLUT loading mechanism to build the 3DLUT
against the original source characteristics of the file. This allows us,
among other things, to use a real BT.1886 profile for the source. This
also allows us to actually use perceptual mappings. Finally, this
reduces errors on standard gamut displays (where the previous 3DLUT
target of BT.2020 was unreasonably wide).

This also improves the overall accuracy of the 3DLUT due to eliminating
rounding errors where possible, and allows for more accurate use of
LUT-based ICC profiles.

The current code is somewhat more ugly than necessary, because the idea
was to implement this commit in a working state first, and then maybe
refactor the profile loading mechanism in a later commit.

Fixes #2815.
2016-04-01 10:27:27 +02:00
Niklas Haas ec6e8a31e0 vo_opengl: draw transparency checkerboard after upscaling
This also draws it after color management etc. In a nutshell, this
change makes the transparency checkerboard independent of upscaling,
panning, cropping etc. It will always be the same apparent size and
position (relative to the window).

It will also be independent of the video colorspace and such things.
(Note: This might cause white imbalance issues if playing a file with a
white point that does not match the display, in absolute colorimetric
mode. But that's uncommon, especially in conjunction with transparent
image files, so it's not a primary concern here)
2016-03-29 22:29:19 +02:00
wm4 dae23fff09 vo_opengl: always premultiply alpha
Until now, we've let the windowing backend decide. But since they
usually require premultiplied alpha, and premultiplied alpha is easier
to handle, hardcode it.
2016-03-29 21:56:38 +02:00
wm4 b95a10c2dd vo_opengl: fix rotation direction
The recent changes fixed rotation handling, but reversed the rotation
direction. The direction is expected to be counter-clockwise, because
demuxers export video rotation metadata as such.
2016-03-29 11:47:16 +02:00
wm4 bd98d9e232 vo_opengl: slightly compress gl_set_debug_logger()
No functional changes.
2016-03-28 18:07:41 +02:00
wm4 b8b2a465d1 vo_opengl: reduce temporary variables in gl_transform_trans()
Using a single gl_transform variable instead of many float ones makes it
easier to see what it's doing. No functional change.
2016-03-28 18:07:18 +02:00
wm4 5827d9cc09 vo_opengl: fix rotation
This has been completely broken since commit 93546f0c. But even before,
rotation handling did not make too much sense. In particular, it rotated
the contents of the cropped image, instead of adjusting the crop
rectangle as well. The result was that things like panscan or zooming
did not behave as expected with rotation applied.

The same is true for vertical flipping. Flipping is triggered by
negative image stride. OpenGL does not support flipping the image on
upload, so it's done as part of the rendering. It can be triggered with
--vf=flip, but other filters and even decoders could setup negative
stride to flip the image.

Fix these issues by applying transforms to texture coordinates properly,
and by making rotation and flipping part of these transforms.

This still doesn't work properly for separated scaling. The issue is
that we'd have to adjust how the passes are done. For now, pick a very
stupid solution by rotating the image to a FBO, and then scaling from
that. This has the avantage that the scale logic doesn't have to be
complicated for such a rare case. It could be improved later.

Prescaling is apparently still broken. I don't know if chroma
positioning works properly either. None of this should affect the case
with no rotation.
2016-03-28 17:02:27 +02:00
wm4 e5b5cc2a2f vo_opengl: fix row-major vs. column-major confusion
gl_transform_vec() assumed column-major, while everything else seemed to
assumed row-major memory organization for gl_transform.m. Also,
gl_transform_trans() seems to contain additional confusion.

This didn't matter until now, as everything has been orthogonal, this
the swapped matrix entries were always 0.
2016-03-28 16:16:09 +02:00
wm4 fb70819048 vo_opengl: don't upload potentially uninitialized memory to GL buffer
If the texture count is lower than 4, entries in va.textcoord[] will
remain uninitialized. While this is unlikely to be a problem (since
these values are unused on the shader side too), it's not nice and might
explain some things which have shown up in valgrind.

Fix by always initializing the whole thing.
2016-03-28 16:13:56 +02:00
wm4 c51fe7944d vo_openg: fix debanding + rectangle-textures 2016-03-27 16:46:01 +02:00
wm4 a76f3e8e46 vo_opengl: minor coding style adjustment 2016-03-24 21:23:22 +01:00
wm4 da015d9d00 vo_opengl: utils: some more minor shader string building optimization
Instead of reallocating almost all of the shader string several times
per pass, build it into a fixed buffer that will be reallocated as
needed.

While this still uses a linear search and full comparison of the shader
text, this will compare the shader's string length first before doing a
full comparison as a nice side effect. (That's also why the fragment
shader is compared first - it's more likely to be different for
different cache entries than the vertex shader stub.)
2016-03-24 21:22:10 +01:00
wm4 30e94fa711 vo_opengl: utils: slightly optimize shader string building
Use bstr as appending buffer, which should avoid frequent reallocation
and copying. The previous commit should help with this a little.
2016-03-23 22:03:53 +01:00