Rename gl_hwdec_driver.map_image to map_frame, and let it fill out a
struct gl_hwdec_frame describing the exact texture layout. This gives
more flexibility to what the hwdec interop can export. In particular, it
can export strange component orders/permutations and textures with
padded size. (The latter originating from cropped video.)
The way gl_hwdec_frame works is in the spirit of the rest of the
vo_opengl video processing code, which tends to put as much information
in immediate state (as part of the dataflow), instead of declaring it
globally. To some degree this duplicates the texplane and img_tex
structs, but until we somehow unify those, it's better to give the hwdec
state its own struct. The fact that changing the hwdec struct would
require changes and testing on at least 4 platform/GPU combinations
makes duplicating it almost a requirement to avoid pain later.
Make gl_hwdec_driver.reinit set the new image format and remove the
gl_hwdec.converted_imgfmt field.
Likewise, gl_hwdec.gl_texture_target is replaced with
gl_hwdec_plane.gl_target.
Split out a init_image_desc function from init_format. The latter is not
called in the hwdec case at all anymore. Setting up most of struct
texplane is also completely separate in the hwdec and normal cases.
video.c does not check whether the hwdec "mapped" image format is
supported. This should not really happen anyway, and if it does, the
hwdec interop backend must fail at creation time, so this is not an
issue.
This gives us 16 bit fixed-point integer texture formats, including
ability to sample from them with linear filtering, and using them as FBO
attachments.
The integer texture format path is still there for the sake of ANGLE,
which does not support GL_EXT_texture_norm16 yet.
The change to pass_dither() is needed, because the code path using
GL_R16 for the dither texture relies on glTexImage2D being able to
convert from GL_FLOAT to GL_R16. GLES does not allow this. This could be
trivially fixed by doing the conversion ourselves, but I'm too lazy to
do this now.
The active texture and some pixelstore parameters are now always reset
to defaults when entering and leaving the renderer. Could be important
for libmpv.
Apply basic transformations like rotation by 90° and mirroring when
sampling from the source textures. The original idea was making this
part of img_tex.transform, but this didn't work: lots of code plays
tricks on the transform, so manipulating it is not necessarily
transparent, especially when width/height are switched. So add a new
pre_transform field, which is strictly applied before the normal
transform.
This fixes most glitches involved with rotating the image.
Cropping and rotation are now weirdly separated, even though they could
be done in the same step. I think this is not much of a problem, and
has the advantage that changing panscan does not trigger FBO
reallocations (I think...).
This commit refactors the 3DLUT loading mechanism to build the 3DLUT
against the original source characteristics of the file. This allows us,
among other things, to use a real BT.1886 profile for the source. This
also allows us to actually use perceptual mappings. Finally, this
reduces errors on standard gamut displays (where the previous 3DLUT
target of BT.2020 was unreasonably wide).
This also improves the overall accuracy of the 3DLUT due to eliminating
rounding errors where possible, and allows for more accurate use of
LUT-based ICC profiles.
The current code is somewhat more ugly than necessary, because the idea
was to implement this commit in a working state first, and then maybe
refactor the profile loading mechanism in a later commit.
Fixes#2815.
This also draws it after color management etc. In a nutshell, this
change makes the transparency checkerboard independent of upscaling,
panning, cropping etc. It will always be the same apparent size and
position (relative to the window).
It will also be independent of the video colorspace and such things.
(Note: This might cause white imbalance issues if playing a file with a
white point that does not match the display, in absolute colorimetric
mode. But that's uncommon, especially in conjunction with transparent
image files, so it's not a primary concern here)
Until now, we've let the windowing backend decide. But since they
usually require premultiplied alpha, and premultiplied alpha is easier
to handle, hardcode it.
The recent changes fixed rotation handling, but reversed the rotation
direction. The direction is expected to be counter-clockwise, because
demuxers export video rotation metadata as such.
This has been completely broken since commit 93546f0c. But even before,
rotation handling did not make too much sense. In particular, it rotated
the contents of the cropped image, instead of adjusting the crop
rectangle as well. The result was that things like panscan or zooming
did not behave as expected with rotation applied.
The same is true for vertical flipping. Flipping is triggered by
negative image stride. OpenGL does not support flipping the image on
upload, so it's done as part of the rendering. It can be triggered with
--vf=flip, but other filters and even decoders could setup negative
stride to flip the image.
Fix these issues by applying transforms to texture coordinates properly,
and by making rotation and flipping part of these transforms.
This still doesn't work properly for separated scaling. The issue is
that we'd have to adjust how the passes are done. For now, pick a very
stupid solution by rotating the image to a FBO, and then scaling from
that. This has the avantage that the scale logic doesn't have to be
complicated for such a rare case. It could be improved later.
Prescaling is apparently still broken. I don't know if chroma
positioning works properly either. None of this should affect the case
with no rotation.
If the texture count is lower than 4, entries in va.textcoord[] will
remain uninitialized. While this is unlikely to be a problem (since
these values are unused on the shader side too), it's not nice and might
explain some things which have shown up in valgrind.
Fix by always initializing the whole thing.
Glitches when resizing are still possible, but are reduced. Other VOs
could support this too, but don't need to do so.
(Totally avoiding glitches would be much more effort, and probably not
worth the trouble. How about you just watch the video the player is
playing, instead of spending your time resizing the window.)
This also gets rid of the kind of hard to read texture swizzle setup and
turns it into something dumber.
Assumes that we don't create any FBOs with 2 channel formats. (Only the
video source textures are handled by this commit.)
This is a fresh implementation from scratch that carries with it
significantly less baggage and verbosity from the previous (ported)
version.
The actual values for the masks and such were copied from the
current code. Behavior and performance should be unaffected.
An important difference between the old code and the new code is that
the new code always explicitly samples from the first component, rather
than being able to process multiple planes at once.
Since prescale-luma only affects luma, I deemed this unnecessary. May
change in the future, if prescale-chroma ever gets implemented. But
prescaling multiple planes would be slow to do this way. (Better would
be to generalize it to differently-sized vectors)
Instead of hard-coding the logic and planes to skip, factor this out
to a reusible function, and instead add the number of relevant
coordinates to the texture state.
Since prescale now literally only affects the luma plane (and the
filters are all designed for luma-only operation either way), the option
has been renamed and the documentation updated to clarify this.
This is a pretty major rewrite of the internal texture binding
mechanic, which makes it more flexible.
In general, the difference between the old and current approaches is
that now, all texture description is held in a struct img_tex and only
explicitly bound with pass_bind. (Once bound, a texture unit is assumed
to be set in stone and no longer tied to the img_tex)
This approach makes the code inside pass_read_video significantly more
flexible and cuts down on the number of weird special cases and
spaghetti logic.
It also has some improvements, e.g. cutting down greatly on the number
of unnecessary conversion passes inside pass_read_video (which was
previously mostly done to cope with the fact that the alternative would
have resulted in a combinatorial explosion of code complexity).
Some other notable changes (and potential improvements):
- texture expansion is now *always* handled in pass_read_video, and the
colormatrix never does this anymore. (Which means the code could
probably be removed from the colormatrix generation logic, modulo some
other VOs)
- struct fbo_tex now stores both its "physical" and "logical"
(configured) size, which cuts down on the amount of width/height
baggage on some function calls
- vo_opengl can now technically support textures with different bit
depths (e.g. 10 bit luma, 8 bit chroma) - but the APIs it queries
inside img_format.c doesn't export this (nor does ffmpeg support it,
really) so the status quo of using the same tex_mul for all planes is
kept.
- dumb_mode is now only needed because of the indirect_fbo being in the
main rendering pipeline. If we reintroduce p->use_indirect and thread
a transform through the entire program this could be skipped where
unnecessary, allowing for the removal of dumb_mode. But I'm not sure
how to do this in a clean way. (Which is part of why it got introduced
to begin with)
- It would be trivial to resurrect source-shader now (it would just be
one extra 'if' inside pass_read_video).
Why was this done so stupidly, with so many complicated special cases,
before? Declare it once so the shader bits don't have to figure out where
and when to do so themselves.
Apple crap (namely hardware decoding interop) forces us to use rectangle
textures for input. But after that we continue with normal textures.
This was not considered for debanding, and the sampler type used for it
can be different depending on the exact render chain. Simply use the
target type of the input texture.
It thinks that integer_conv_fbo[index] is implied to be accessed with up
to index=5. Although that is theoretical only, it has a point that this
makes no sense. Use the same constant for the array allocation, to make
it more uniform and robust.
Fixes CID 1350060.
GLES does not support high bit depth fixed point textures for unknown
reasons, so direct 10 bit input is not possible. But we can still use
integer textures, which are supported by GLES 3.0. These store integer
data just like the standard fixed point textures, except they are not
normalized on sampling. They also don't support bilinear filtering, and
require a special sampler ("usampler2D").
While these texture formats enable us to shuffle the data to the GPU,
they're rather impractical with the requirements mentioned above and our
current architecture. One problem is that most code assumes it can
always use bilinear scaling (even if bilinear is never used when using
appropriate scale/cscale options). Another is that we don't have any
concept of running a function on a texture in an uniform way.
So for now, run a simple conversion step through a FBO. The FBO will use
the rgba16f format normally, which gives enough bits for 10 bit, and
will at least gracefully degrade with higher depth input.
This is bound to be much slower than a more "direct" method, but at
least it works and is simple to implement.
The odd change of function call order in init_video() is to properly
disable "dumb mode" (no FBO use) if these texture formats are in use.
This was never reset - absolutely can't be right. If the renderer
somehow switches back to another codepath, it certainly has to be reset.
Maybe this was hard to hit, as the normalization is going to be
idempotent in simpler cases (like rendering RGBA input).
Also get rid of the "merged" variable.
Often requested. The main argument, that prominent scalers like sharpen
change the image even if no scaling happens, disappeared anyway.
("sharpen", unsharp masking, is neither prominent nor a scaler anymore.
This is an artifact from MPlayer, which fuses unsharp masking with
bilinear scaling in order to make it single-pass, or so.)
Some VOs had support for these - remove them.
Typically, these formats will have only some use in cases where using
RGB software conversion with libswscale is faster than letting the
VO/GPU do it (i.e. almost never). For the sake of testing this case,
keep IMGFMT_RGB565. This is the least messy format, because it has no
padding/alpha bits with unknown semantics.
Note that decoding to these formats still works. We'll let libswscale
repack the data to whatever the VO in use can take.
Do this to make the license situation less confusing.
This change should be of no consequence, since LGPL is compatible with
GPL anyway, and making it LGPL-only does not restrict the use with GPL
code.
Additionally, the wording implies that this is allowed, and that we can
just remove the GPL part.
Should take care of the planned FFmpeg AV_PIX_FMT_P010 addition. (This
will eventually be needed when doing HEVC Main 10 decoding with DXVA2
copyback.)
Add a "blend-tiles" choice to the "alpha" sub-option. This is pretty
simplistic and uses the GL raster position to derive the tiles. A weird
consequence is that using --vo=opengl and --vo=opengl-hq gives different
scaling behavior (screenspace pixel size vs. source video pixel size
16x16 tiles), but it seems we don't have easy access to the original
texture coordinates. Using the rasterpos is probably simpler.
Make this option the default.
Since alpha isn't pulled through the colormatrix (maybe it should?), we
reject alpha formats with odd sizes, such as yuva444p10.
But the awful tex_mul path in vo_opengl does this anyway (at some points
even explicitly), which means there will be a subtle difference in
handling of 16 bit yuv alpha formats. Make it consistent and always
apply the range adjustment to the alpha component. This also means odd
sizes like 10 bit are supported now.
This assumes alpha uses the same "shifted" range as the yuv color
channels for depths larger than 8 bit. I'm not sure whether this is
actually the case.