Commit Graph

44 Commits

Author SHA1 Message Date
wm4 2a89da0c85 zimg: add slice threading and use it by default
This probably makes it much faster (I wouldn't know, I didn't run any
benchmarks ). Seems to work as well (although I'm not sure, it's not
like I'd perform rigorous tests).

The scale_zimg test seems to mysteriously treat color in fully
transparent alpha differently, which makes no sense, and isn't visible
(but makes the test fail). I can't be bothered with investigating this
more. What do you do with failing tests? Correct, you disable them. Or
rather, you disable whatever appears to cause them to fail, which is the
threading in this case.

This change follows mostly the tile_example.cpp. The slice size uses a
minimum of 64, which was suggested by the zimg author. Some of this
commit is a bit inelegant and weird, such as recomputing the scale
factor for every slice, or the way slice_h is managed. Too lazy to make
this more elegant.

zimg git had a regressio around active_region (which is needed by the
slicing), which was fixed in commit 83071706b2e6bc634. Apparently, the
bug was never released, so just add a warning to the manpage.
2020-07-15 22:59:17 +02:00
wm4 f1290f7095 zimg: refactor (move around fields)
The intention is to add slice-threading to the wrapper. For that
purpose, move all zimg related state to a new struct mp_zimg_state.
There is now an array of instances of this struct. The intention is to
have one instance for each thread. As of this commit, this is hardcoded
to 1 thread; the following commit extends this.
2020-07-15 22:59:17 +02:00
wm4 d8002f1dde video: separate repacking code from zimg and make it independent
For whatever purpose. If anything, this makes the zimg wrapper cleaner.

The added tests are not particular exhaustive, but nice to have. This
also makes the scale_zimg.c test pretty useless, because it only tests
repacking (going through the zimg wrapper). In theory, the repack_tests
things could also be used on scalers, but I guess it doesn't matter.

Some things are added over the previous zimg wrapper code. For example,
some fringe formats can now be expanded to 8 bit per component for
convenience.
2020-05-09 18:02:57 +02:00
wm4 d61ced37ae sws_utils: allow setting zimg options directly
One could wonder, why not just use the zimg wrapper directly?
2020-05-09 18:02:57 +02:00
wm4 008faa3d7f zimg: remove C11 aligned_alloc() requirement
It's not available on Windows because MinGW is fucking horrible and
Microsoft are fucking assholes.
2020-05-01 00:59:57 +02:00
wm4 640db1ed3f zimg: don't assume zimg reads are 64 byte aligned
Only _writes_ are aligned, so the assumption doesn't work for reads. But
it's easy to fix by rounding down x0 to the next byte boundary. Writing
pixels outside of the read area is allowed, and we don't go out of
buffer bounds.

Patch by anon32, permission to do anything with it.
2020-04-25 17:38:54 +02:00
wm4 71295fb872 video: add alpha type metadata
This is mostly for testing. It adds passing through the metadata through
the video chain. The metadata can be manipulated with vf_format. Support
for zimg alpha conversion (if built with zimg after it gained alpha
support) is implemented. Support premultiplied input in vo_gpu.

Some things still seem to be buggy.
2020-04-24 14:41:50 +02:00
wm4 ca83080fff zimg: get rid of special "override" fields for low depth RGB/gray
This makes it use the previously added fringe image formats. What is the
purpose of this change? Who knows.
2020-04-23 13:26:07 +02:00
wm4 a854aa234d zimg: slightly cleanup some mpv format handling nonsense
Move lookup GBRP or planar gray/alpha formats to separate functions in
some cases.

Make setup_regular_rgb_packer() not use a 4:4:4 YUV format to "pass"
RGB. This was used as a "trick" to avoid the stupid GBRP plane
permutation, but it confused severely, so get rid of it. Just do the
reordering, even if the zimg wrapper itself will reorder it back (which
is so stupid that I used the other approach at first). The comment
saying IMGFMT_420P was bogus of course; typically it was IMGFMT_444P.
2020-04-23 13:26:07 +02:00
wm4 0f8f6a665b video: change chroma_w/chroma_h fields to use shift instead of size
When I added mp_regular_imgfmt, I made the chroma subsampling use the
actual chroma division factor, instead of a shift (log2 of the actual
value). I had some ideas about how this was (probably?) more intuitive
and general. But nothing ever uses non-power of 2 subsampling (except
jpeg in rare cases apparently, because the world is a bad place).

Change the fields back to use shifts and rename them to avoid mistakes.
2020-04-23 13:24:35 +02:00
wm4 dd30f2658a zimg: fix swapped chroma planes with packed YUV bullshit
I must have messed this up when I actually added the Y210 format
(because that one is correct). So my comment in the commit adding this
about the FFmpeg pixfmt doxygen being wrong was wrong.

I'd like to use this opportunity to complain once more about the
existence of these terrible pixel formats.
2020-04-14 00:28:44 +02:00
wm4 0d792857c5 zimg: fix build with older FFmpeg (troublesome Intel dude format) 2020-04-14 00:01:46 +02:00
wm4 7832204c99 zimg: add support for 1 bit per pixel formats
Again worthless, slow, and only for libswscale parity.

With this, we support all formats libswscale supports, except bayer
input, and rgb4/bgr4 output. We even support some formats libswscale
doesn't.

It's possible that the zimg wrapper isn't always as fast as libswscale.
But there is optimization potential: the inner repack loops are
self-contained enough that they could be reasonably be implemented in
assembler (probably), and doing everything slice-wise should reduce the
overhead of the separate pack/unpack stages.
2020-04-13 20:42:34 +02:00
wm4 afedaf3b61 zimg: add packed YUV bullshit
Just lazily tested.

The comment on AV_PIX_FMT_Y210LE seems to be wrong. It claims it's "like
YUYV422", bit it seems more like YVYU422, at last the way libswscale
input treats it. Maybe Intel pays its developers too much?

The repacker inner lop is probably rather inefficient. In theory we
could optimize it by reading the packed pixels as words, doing the
component reshuffling using compile time values etc., but I'd rather
keep the code size small. It's already bad enough that we have to
support 16 bit per component variants, just because this one Intel guy
couldn't keep it in his pants. In general, I can't be bothered to spend
time on optimizing it; I'm only doing this for fun (i.e. masochistic
obligation).
2020-04-13 20:05:38 +02:00
wm4 f7f947f960 zimg: add support for some RGB fringe formats
This covers 8 and 16 bit packed RGB formats. It doesn't really help with
any actual use-cases, other than giving the finger to libswscale.

One problem is with different color depths. For example, rgb565 provides
1 bit more resolution to the green channel. zimg can only dither to a
uniform depth. I tried dithering to the highest depth and shifting away
1 bit for the lower channels, but that looked ugly (or I messed up
somewhere), so instead it dithers to the lowest depth, and adjusts the
value range if needed. Testing with bgr4_byte (extreme case with 1/2/1
depths), it looks more "grainy" (ordered dithering artifacts) than
libswscale, but it also looks cleaner and smoother. It doesn't have
libswscale's weird red-shift. So I call it a success.

Big endian formats need to be handled explicitly; the generic big endian
swapper code assumes byte-aligned components.

Unpacking is done with shifts and 3 LUTs. This is symmetric to the
packer. Using a generated palette might be better, but I preferred to
keep the symmetry, and not having to mess with a generated palette and
the pal8 code.

This uses FFmepg pixfmts constants directly. I would have preferred
keeping zimg completely separate. But neither do I want to add an IMGFMT
alias for every of these formats, nor do I want to extend our imgfmt
code such that it can provide a complete description of each packed RGB
format (similar to FFmpeg pixdesc).

It also appears that FFmpeg pixdesc as well as the FFmpeg pixfmt doxygen
have an error regarding RGB8: the R/B bit depths are swapped. libswscale
appears to be handling them differently. Not completely sure, as this is
the only packed format case with R/B havuing different depths (instead
of G, the middle component, where things are symmetric).
2020-04-13 15:56:52 +02:00
wm4 a8b84c9a1a zimg: add support for big endian input and output
One of the extremely annoying dumb things in ffmpeg is that most pixel
formats are available as little endian and big endian variants. (The
sane way would be having native endian formats only.) Usually, most of
the real codecs use native formats only, while non-native formats are
used by fringe raw codecs only. But the PNG encoders and decoders
unfortunately use big endian formats, and since PNG it such a popular
format, this causes problems for us. In particular, the current zimg
wrapper will refuse to work (and mpv will fall back to sws) when writing
non-8 bit PNGs.

So add non-native endian support to zimg. This is done in a fairly
"generic" way (which means lots of potential for bugs). If input is a
"regular" format (and just byte-swapped), the rest happens
automatically, which happens to cover all interesting formats.

Some things could be more efficient; for example, unpacking is done on
the data before it's passed to the unpacker. You could make endian
swapping part of the actual unpacking process, which might be slightly
faster. You could avoid copying twice in some cases (such as when
there's no actual repacker, or if alignment needs to be corrected). But
I don't really care. It's reasonably fast for the normal case.

Not entirely sure whether this is correct. Some (but not many) formats
are covered by the tests, some I tested manually. Some I can't even
test, because libswscale doesn't support them (like nv20*).
2020-04-13 15:56:27 +02:00
wm4 26f4f18c06 options: change option macros and all option declarations
Change all OPT_* macros such that they don't define the entire m_option
initializer, and instead expand only to a part of it, which sets certain
fields. This requires changing almost every option declaration, because
they all use these macros. A declaration now always starts with

   {"name", ...

followed by designated initializers only (possibly wrapped in macros).
The OPT_* macros now initialize the .offset and .type fields only,
sometimes also .priv and others.

I think this change makes the option macros less tricky. The old code
had to stuff everything into macro arguments (and attempted to allow
setting arbitrary fields by letting the user pass designated
initializers in the vararg parts). Some of this was made messy due to
C99 and C11 not allowing 0-sized varargs with ',' removal. It's also
possible that this change is pointless, other than cosmetic preferences.

Not too happy about some things. For example, the OPT_CHOICE()
indentation I applied looks a bit ugly.

Much of this change was done with regex search&replace, but some places
required manual editing. In particular, code in "obscure" areas (which I
didn't include in compilation) might be broken now.

In wayland_common.c the author of some option declarations confused the
flags parameter with the default value (though the default value was
also properly set below). I fixed this with this change.
2020-03-18 19:52:01 +01:00
wm4 7e6ea02183 zimg: fix previous odd sizes commit
Obviously, we don't want to lose fractions, and the zimg active_region
fields in fact have the type double. The integer division was wrong.

Also, always set active_region.width/height. It appears zimg behavior
does not change if they're set to the normal integer values, so the
extra check to not set them in this case was worthless.
2020-02-13 01:26:51 +01:00
wm4 7c8b40c38a zimg: correct output to odd (chroma un-aligned) sizes
As suggested by the zimg author: active_region is not supported on
outputs (and the API returns an error), so instead scale to the "full"
surface, but adjust the source rectangle such that the cropped output
image happens to cover the correct region.

Does this even work? Since Balmer Peak doesn't work, I can't really say,
but it seems to look correct.
2020-02-12 18:01:22 +01:00
wm4 0848f3f832 zimg: fix typos in a comment
Also remove the "o" case, which was never implemented (probably was an
idea to output alpha formats, now obsoleted by zimg's full alpha
support).
2020-02-12 17:54:35 +01:00
wm4 8c2179becf zimg: add pal8 unpacker
Some pngs are paletted, so this is vaguely interesting.
2020-02-10 19:01:21 +01:00
wm4 83f070dfb8 zimg: rename zplanes field
This was a confusing name, because 1. there's also a z_planes[] field,
and 2. it was not specific to zimg indexes.

Possibly there used to be an idea involved about supporting alpha to
non-alpha formats by discarding the alpha plane, but zimg does this now
(and zimg will correctly blend the alpha component too).
2020-02-10 17:59:13 +01:00
wm4 a9116ddd38 zimg: support gray/alpha conversion
The special thing about this format is

1. mpv assigns the component ID 4 to alpha, and component IDs 2 and 3
   are not present, which causes some messy details.
2. zimg always wants the alpha plane as plane 3, and plane 1 and 2 are
   not present, while FFmpeg/mpv put the alpha plane as plane 1.

In theory, 2. could be avoided, since FFmpeg actually doesn't have a any
2 plane formats (alpha is either packed, or plane 3). But having to skip
"empty" planes would break expectations.

zplanes is not equivalent to the mpv plane count (actually it was always
used this way), while zimg does not really have a plane count, but does,
in this case, only use plane 0 and 3, while 2 and 3 are unused and
unset. z_planes[] (not zplanes) is now always valid for all 4 array
entries (because it uses zimg indexes), but a -1 entry means it's an
unused plane.

I wonder if these conventions taken by mpv/zimg are not just causing
extra work. Maybe component IDs should just be indexes by the "natural"
order (e.g. R-G-B-A, Y-U-V-A, Y-A), and alpha should be represented as a
field that specifies the component ID for it, or just strictly assume
that 2/4 component formats always use the last component for alpha.
2020-02-10 17:57:01 +01:00
wm4 c31661466b zimg: fix some confusion about plane permutation
We reorder the planes between mpv and zimg conventions. It turns out the
code still confused when which convention was used.

So the way it actually works is that the _only_ place where zimg order
is used is the zimg_image_buffer.plane[] array. plane_aligned[] and
zmask[] were accessed incorrectly, although I guess it rarely had a
reason to fail (plane reordering is mostly for RGB, which has planes of
all the same size).

Adjust some comments accordingly too.
2020-02-10 17:45:20 +01:00
wm4 cca02e51ef zimg: add alpha support
libzimg recently added direct alpha support and new API for it. (The API
change is rather minimal, and it turns out we can easily support old and
new zimg versions.)

This does not support _all_ alpha formats. For example, gray + alpha is
not supported yet, because my stupid design in the zimg wrapper would
require a planar gray + alpha format, while ffmpeg provides only a
packed one.
2020-02-09 19:16:54 +01:00
wm4 afbddbdf8f DOCS/contribute.md, zimg: remove 2 instances of an extraneous "s" 2019-11-07 22:53:13 +01:00
wm4 6eb0dc5447 zimg: support subsampled chroma with non-aligned image sizes 2019-11-02 18:49:17 +01:00
wm4 d26c1c6814 zinmg: stop using GBRP for RGB
Instead, use a YUV planar format. It doesn't matter, since we use the
format only internally and for "management" purposes. We're only
interested in the physical layout, not what colorspace FFmpeg "forcibly"
associates with it.

Also get rid of using the old and slightly sketchy mp_imgfmt_find()
function. Yep, the IMGFMT_RGB30 now "constructs" the planar format,
instead of using a pixfmt constant. Slightly inconvenient, tricky, and
fragile, but I like it, so bugger off.

This whole thing gets rid of some of the strange plane permutations that
were needed earlier.
2019-11-02 18:11:02 +01:00
wm4 20c1bd8a5e zimg: correct RGB30 order (probably)
According to the definition of the GL format, and the definition in
img_format.h, and the actual output by vo_gpu, the order of components
was probably wrong. It's exceedingly likely that the vo_drm format (for
which this was originally written) has the same layout, so this was
probably a bug from when the zimg wrapper code was refactored.
2019-11-02 17:52:13 +01:00
wm4 00838fe0c3 zimg: make --zimg-fast=yes default
This is mostly just because of the odd RGB default gamma issue, which
shouldn't have any real impact. This also sets allow_approximate_gamma,
which I hope is fine for normal use cases.
2019-11-02 02:22:16 +01:00
wm4 c9d685f5d6 zimg: pass through Y plane when repacking nv12
Normally, the Y plane can just be passed directly to zimg, and only the
chroma plane needs to be (de)interleaved. It still needs a copy if the Y
pointer is not aligned, though. (Whether this is actually a problem
depends on the CPU and probably zimg's compiler.)

This requires deciding per plane whether the plane should go through the
repack buffer or not. This logic is active in non-nv12 cases, because
not doing so would require extra code (maybe 2 lines or so).
repack_align is now always called, even if it's planar->planar with all
input aligned, but it won't actually do anything in that case. The
assumption is that zimg won't change behavior if you pass a callback
that does nothing versus passing NULL as callback.
2019-11-02 02:18:24 +01:00
wm4 3b56f82965 zimg: add semi-planar repacker
This is for formats like nv12 (including p010, nv24, etc.). Might be
important for hardware decoding. Previously, this would have forced a
libswscale fallback.

The genericism makes this only slightly more complicated. The main
complication is due to the fact that mixing planar and packed stuff is
insane (thanks, Nvidia).

P010 output will actually happily set any of the 6 bit "padding" LSB,
that are normally supposed to be 0 (for unpadded data there is P016).
Scaling happens with 16 bit precision. Not going to bother adding an
extra packer which zeros them out, or with shifting them in
packing/unpacking. Lets just hope nobody notices.
2019-11-02 01:10:22 +01:00
wm4 4b3666329e zimg: fix out of bounds memory accesses due to broken zmask
We've set all planes to the same zmask. But for subsampled chroma, the
zmask obviously needs to be smaller. This could lead to out of bounds
memory read and write accesses.

Move the align repacker to a single function, since this is now more
convenient.
2019-11-02 00:48:04 +01:00
wm4 04b4155582 zimg: add more packers/unpackers
This probably covers all packed formats which have byte-aligned
component, no alpha, and no subsampling. Everything else needs more
imgfmt metadata, or something even more complicated. Alpha is primarily
not supported, because zimg requires a second scaler instance for it,
and handling packing/unpacking with it is an unacceptable mess.
2019-10-31 22:43:58 +01:00
wm4 a7230dfed0 sws_utils, zimg: destroy vo_x11 and vo_drm performance
Raise swscale and zimg default parameters. This restores screenshot
quality settings (maybe) unset in the commit before. Also expose some
more libswscale and zimg options.

Since these options are also used for VOs like x11 and drm, this will
make x11/drm/etc. much slower. For compensation, provide a profile that
sets the old option values: sw-fast. I'm also enabling zimg here, just
as an experiment.

The core problem is that we have a single set of command line options
which control the settings used for most swscale/zimg uses. This was
done in the previous commit. It cannot differentiate between the VOs,
which need to be realtime and may accept/require lower quality options,
and things like screenshots or vo_image, which can be slower, but should
not sacrifice quality by default.

Should this have two sets of options or something similar to do the
right thing depending on the code which calls libswscale? Maybe. Or
should I just ignore the problem, make it someone else's problem (users
who want to use software conversion VOs), provide a sub-optimal
solution, and call it a day? Definitely, sounds good, pushing to master,
goodbye.
2019-10-31 16:51:12 +01:00
wm4 835586513d sws_utils: shuffle around some shit
Purpose uncertain. I guess it's slightly better, maybe.

The move of the sws/zimg options from VO opts (vo_opt_list) to the
top-level option list is tricky. VO opts have some helper code in vo.c,
that sends VOCTRL_SET_PANSCAN to the VO on every VO opts change. That's
because updating certain VO options used to be this way (and not just
the panscan option). This isn't needed anymore for sws/zimg options, so
explicitly move them away.
2019-10-31 15:26:03 +01:00
wm4 c10ba5eb8e Use mp_log2() instead of av_log2() 2019-10-31 13:17:18 +01:00
wm4 43e903980b zimg: minor name consistency improvement
Now these are like x2ccc10_pack: MSB to LSB, with bit width following
each component (except for components with the same bit width).
2019-10-21 01:38:25 +02:00
wm4 525e712757 zimg: support RGB30 output
This may be used later elsewhere.
2019-10-20 19:41:18 +02:00
wm4 631379bea9 zimg: move component order arrays to top of file 2019-10-20 19:41:18 +02:00
wm4 f23e663a21 zimg: support 3 component 16 bit pixel unpacking
Works for RGB (e.g. rgb48le) and XYZ.

It's unsure whether XYZ is really correctly converted.
2019-10-20 16:16:28 +02:00
wm4 577c00510b zimg: avoid theoretical FFmpeg planar RGB/YUV mixup
The RGB pack/unpack code in theory supports packed, non-subsampled YUV,
although in practice FFmpeg defines no such formats. (Only one with
alpha, but all alpha input is rejected by the current code.)

This would in theory have failed, because we would have selected a GBRP
format (instead of YUV), which makes no sense and would either have been
rejected by zimg (inconsistent parameters), or lead to broken output
(wrong permutation of planes).

Select the correct format and don't permute the planes in the YUV case.
2019-10-20 16:16:28 +02:00
wm4 c9d217979e zimg: add some more colorspace mappings
As suggested by the zimg author. This is mostly related to XYZ support.

It's unclear whether this works. Using the only XYZ test sample we know,
and the next commits to consume the pixfmt, it looks wrong.
2019-10-20 16:16:28 +02:00
wm4 07aa29ed8e video: add zimg wrapper
This provides a very similar API to sws_utils.h, which can be used to
convert and scale from one mp_image to another.

This commit adds only the code, but does not use it anywhere.

The code is quite preliminary and barely tested. It supports only a few
pixel formats, and will return failure for many others. (Unlike
libswscale, which tries to support anything that FFmpeg knows.)

zimg itself accepts only planar formats. Supporting other formats
requires manual packing/unpacking. (Compared to libswscale, the zimg API
is generally lower level, but allows for more flexibility.) Only BGR0
output was actually tested. It appears to work.
2019-10-20 02:17:31 +02:00