The recent changes to the image format metadata broke big endian, and
that was intentional. Some things are inherent to little endian (like
the idea to coalesce bit and byte offsets into a single bit offset), and
they can't be fixed. But some obvious things can be fixed, such as
marking LE vs. BE formats the right way around on BE hosts.
The metadata is formally still in LE, except that if the LE/BE flag
matches the host endian, the host endian can be used when accessing
packed formats with bit shifts, or when computing byte aligned component
byte offsets. The former may work because formats with LE/BE variants
use the same bit offsets after byte swapping, the latter may work
because little endian is the natural concept for addressing memory. But
it will "subtly" fail to do the right thing in some cases, and code
using this can't know, so have fun.
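For illustration, a minimal sketch of what "accessing packed formats with
bit shifts" means here (names invented for the example, not mpv code):

    #include <stdint.h>
    #include <string.h>

    // Sketch: extract one component from a packed 16 bit pixel. If the
    // format's LE/BE flag matches the host endian, the pixel can be
    // loaded as one native word, and the metadata's bit offsets apply
    // directly.
    static unsigned read_component(const uint8_t *p, int bit_offset, int bits)
    {
        uint16_t word;
        memcpy(&word, p, sizeof(word)); // native (host endian) load
        return (word >> bit_offset) & ((1u << bits) - 1);
    }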
Many things are broken, but this makes e.g. vo_gpu mostly work.
My general opinion about BE computers is that you should get a better
computer, you can get one for free from any garbage dump.
This change attempts to fix detection of how endian swapping is to be
performed. Details can be found in the code comments.
It should not change anything, other than fixing handling of the
X2RGB10BE ffmpeg pixel format. This format was detected incorrectly, and
the component location metadata was discarded due to an internal
consistency check. With this commit, it is handled correctly. At first I
thought the X2RGB10BE ffmpeg pixdesc metadata was wrong, but it appears
to be correct. The problem with this format is that it's the first
packed RGB format that requires bit shift to access, and where the
endian word size is larger than the (rounded up) component size, all
while pixdesc would "require" you to perform 3 memory accesses (instead
of 1), and the code tries to reverse this.
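As a concrete sketch (not the actual mpv code) of the 1-access approach the
code tries to restore: load the whole 32 bit word in host order, then shift:

    #include <stdint.h>
    #include <string.h>

    // Sketch: unpack X2RGB10 ((msb) 2X 10R 10G 10B (lsb)) with a single
    // 32 bit load. The word must already be in host byte order, i.e.
    // byte-swapped first when reading the BE variant on an LE host.
    static void unpack_x2rgb10(const uint8_t *p, unsigned rgb[3])
    {
        uint32_t w;
        memcpy(&w, p, sizeof(w));
        rgb[0] = (w >> 20) & 0x3ff; // R
        rgb[1] = (w >> 10) & 0x3ff; // G
        rgb[2] = (w >>  0) & 0x3ff; // B
    }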
It appears that trying to use the pixdesc metadata is much, much more
work than just duplicating it in a saner form. To be fair, most problems
are with big endian, and the mpv internal format does not care much
about endian _hosts_.
A repeat of the previous useless commits.
Pondered whether to use separate fields or just a flags integer for
color and component types; the latter won for now.
Functions like mp_imgfmt_get_component_type() are now discouraged, and
mp_imgfmt_desc.flags is back for defining all information. Some days ago
I felt like the opposite would be the better design. Fortunately, it
doesn't matter.
With this, I think all image format properties that mpv needs are
exhaustively defined all in one place.
I guess I decided to stuff it all into mp_imgfmt_desc (the "old"
struct). This is probably a mistake. At first I was afraid that this
struct would get too fat (probably justified, and hereby happened), but
on the other hand mp_imgfmt_get_desc() (which builds the struct) calls
the former mp_imgfmt_get_layout(), and the separation doesn't make too
much sense anyway. Just merge them.
Still, try to keep out the extra info for packed YUV bullshit. I think
the result is OK, and there's as much information as there was before.
The test output changes a little. There's no independent bits[] array
anymore, so formats which previously did not have it set now show it.
(These formats are mpv-only and are still missing the metadata; to be
added later.) Also, the output for the cursed packed formats changes.
My previous commit added support for this format, but it was still
broken, and prevented the allocation code from working. It's unknown
whether it's correct now (because this pixfmt is so obscure and useless,
there are no known samples around), but who cares.
I thought I'd probably want something like this, so the hardcoded stuff
in repack.c can be removed eventually. Of course this has no purpose at
all, and will not have any. (For now, this provides only metadata, and
nothing uses it, apart from the "test" that dumps it as text.)
This adds full support for AV_PIX_FMT_UYYVYY411 (probably out of spite,
because the format is 100% useless). Support for some mpv-only formats
is missing, ironically.
The code goes through _lengths_ to try to make sense out of the FFmpeg
AVPixFmtDescriptor data. Which makes it even more amazing that the new
metadata basically mirrors pixdesc, and just adds to it. Considering
code complexity and speed issues (it takes time to crunch through all
this shit all the time), and especially the fact that pixdesc is very
_incomplete_, it would probably be better to have our own table of all
formats. But then we'd have to scramble every time FFmpeg adds a new
format, which would be annoying. On the other hand, by using pixdesc, we
get the excitement of seeing whether this code will work, or break
everything in catastrophic ways.
The data structure still sucks a lot. Maybe I'll redo it again. The text
dump is weirdly differently formatted than the C struct - because I'm
not happy with the representation. Maybe I'll redo it all over again.
In summary: this commit does nothing.
Remove the vaguely defined plane_bits and component_bits fields from
struct mp_imgfmt_desc. Add weird replacements for existing uses. Remove
the bytes[] field, replace uses with bpp[].
Fix some potential alignment issues in existing code. As a compromise,
split mp_image_pixel_ptr() into 2 functions, because I think it's a bad
idea to implicitly round, but for some callers being slightly less
strict is convenient.
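A rough sketch of the shape of that split; the struct fields and names here
are assumptions for illustration, not the actual mpv definitions:

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    struct img { // stand-in for struct mp_image
        uint8_t *planes[4];
        ptrdiff_t stride[4];
        int xs[4], ys[4]; // per-plane subsampling shifts
        int bpp[4];       // bits per pixel, per plane
    };

    // Strict variant: coordinates must address a pixel exactly.
    static void *pixel_ptr(struct img *img, int p, int x, int y)
    {
        assert(!(x % (1 << img->xs[p])) && !(y % (1 << img->ys[p])));
        return img->planes[p] + (y >> img->ys[p]) * img->stride[p] +
               (size_t)(x >> img->xs[p]) * img->bpp[p] / 8;
    }

    // Lenient variant: implicitly rounds down to the subsampling grid,
    // for callers where being slightly less strict is convenient.
    static void *pixel_ptr_rounded(struct img *img, int p, int x, int y)
    {
        return pixel_ptr(img, p, x & ~((1 << img->xs[p]) - 1),
                                 y & ~((1 << img->ys[p]) - 1));
    }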
This shouldn't really change anything. In fact, it's a 100% useless
change. I'm just cleaning up what I started almost 8 years ago (see
commit 00653a3eb0). With this I've decided to keep mp_imgfmt_desc,
just removing the weird parts, and keeping the saner parts.
Adding all these so I can use them for obscure processing purposes (see
later draw_bmp commit).
There isn't really a reason why they should exist. On the other hand,
they're just labels for formats that can be handled in a generic way,
and this commit adds support for them in the zimg wrapper and vo_gpu
just by making the formats exist. (Well, vo_gpu had to be fixed in the
previous commit.)
This was inconsistent for unknown reasons. monob was the way we wanted
it, and handling of monow was missing.
See the previous "img_format: add some mpv-only helper formats" commit.
Matters for the zimg wrapper.
mp_imgfmt_get_forced_csp() should be consistent with the MP_CSP_RGB/YUV
flags.
At least the different handling of the XYZ exception was a mess, even if
the result was the same.
Utterly useless, but the intention is to make dealing with corner case
pixel formats (forced upon us by FFmpeg, very rarely) less of a pain.
The zimg wrapper will use them. (It already supports these formats
automatically, but it will help with its internals.)
Y1 is considered RGB, even though gray formats are generally treated as
YUV for various reasons. mpv will default all YUV formats to limited
range internally, which makes no sense for a 1 bit format, so this is a
problem. I wanted to avoid that mp_image_params_guess_csp() (which
applies the default) explicitly checks for an image format, so although
a bit janky, this seems to be a good solution, especially because I
really don't give a shit about these formats, other than having to
handle them. It's notable that AV_PIX_FMT_MONOBLACK (also 1 bit gray,
just packed) already explicitly marked itself as RGB.
When I added mp_regular_imgfmt, I made the chroma subsampling use the
actual chroma division factor, instead of a shift (log2 of the actual
value). I had some ideas about how this was (probably?) more intuitive
and general. But nothing ever uses non-power of 2 subsampling (except
jpeg in rare cases apparently, because the world is a bad place).
Change the fields back to use shifts and rename them to avoid mistakes.
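For example, 4:2:0 subsampling halves chroma in both dimensions, so the
renamed fields store a shift of 1 per axis, and a chroma plane size falls
out of a shift instead of a division (a trivial sketch):

    // Sketch: chroma plane dimension from a luma dimension and a
    // subsampling shift (log2 of the division factor). 4:2:0 => shift 1,
    // so chroma_size(1920, 1) == 960.
    static int chroma_size(int luma_size, int shift)
    {
        return (luma_size + (1 << shift) - 1) >> shift; // round up
    }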
Make this slightly less ad-hoc. Also add the missing alpha flag for
yap8/yap16.
Despite reduced redundancy, the LOC is going up anyway... whatever.
One of the extremely annoying dumb things in ffmpeg is that most pixel
formats are available as little endian and big endian variants. (The
sane way would be having native endian formats only.) Usually, most of
the real codecs use native formats only, while non-native formats are
used by fringe raw codecs only. But the PNG encoders and decoders
unfortunately use big endian formats, and since PNG is such a popular
format, this causes problems for us. In particular, the current zimg
wrapper will refuse to work (and mpv will fall back to sws) when writing
non-8 bit PNGs.
So add non-native endian support to zimg. This is done in a fairly
"generic" way (which means lots of potential for bugs). If input is a
"regular" format (and just byte-swapped), the rest happens
automatically, which happens to cover all interesting formats.
Some things could be more efficient; for example, endian swapping is done
in a separate pass on the data before it's passed to the unpacker. You
could make endian
swapping part of the actual unpacking process, which might be slightly
faster. You could avoid copying twice in some cases (such as when
there's no actual repacker, or if alignment needs to be corrected). But
I don't really care. It's reasonably fast for the normal case.
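The swapping pass itself is about as dumb as it sounds; roughly this shape
(a generic sketch, not the wrapper's actual code):

    #include <stddef.h>
    #include <stdint.h>

    // Sketch: in-place endian swap of one row of 16 bit samples, done as
    // a separate pass before (or after) the actual repacker runs.
    static void swap_row_16(uint16_t *row, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            row[i] = (uint16_t)((row[i] >> 8) | (row[i] << 8));
    }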
Not entirely sure whether this is correct. Some (but not many) formats
are covered by the tests, some I tested manually. Some I can't even
test, because libswscale doesn't support them (like nv20*).
Libav seems rather dead: no release for 2 years, no new git commits in
master for almost a year (with one exception ~6 months ago). From what I
can tell, some developers resigned themselves to the horrifying idea of
posting patches to ffmpeg-devel instead, while the rest of the developers
went on to greener pastures.
Libav was a better project than FFmpeg. Unfortunately, FFmpeg won,
because it managed to keep the name and website. Libav was pushed more
and more into obscurity: while there was initially a big push for Libav,
FFmpeg just remained "in place" and visible for most people. FFmpeg was
slowly draining all manpower and energy from Libav. A big part of this
was that FFmpeg stole code from Libav (regular merges of the entire
Libav git tree), making it some sort of Frankenstein mirror of Libav,
think decaying zombie with additional legs ("features") nailed to it.
"Stealing" surely is the wrong word; I'm just aping the language that
some of the FFmpeg members used to use. All that is in the past now, I'm
probably the only person left who is annoyed by this, and with this
commit I'm finally putting this decade-long problem to an end. I just
thought I'd express my annoyance about this fucking shitshow one last
time.
The most intrusive change in this commit is the resample filter, which
originally used libavresample. Since the FFmpeg developer refused to
enable libavresample by default for drama reasons, and the API was
slightly different, the filter used some big preprocessor mess to make
it compatible with libswresample. All that falls away now. The
simplification to the build system is also significant.
The zimg wrapper "needs" these formats as intermediary when repacking
the normal gray/alpha packed format. The packed format is used by the
png decoder and encoder, and is thus interesting.
Unfortunately, mpv-only formats are a mess right now, because all the
existing code is focused around using the FFmpeg metadata for pixel
formats. This should be improved, but not now, so make the mess worse.
This commit doesn't add support for it to the zimg wrapper yet.
This is fragile enough that it warrants getting "monitored".
This takes the commented test program code from img_format.c, makes it
output to a text file, and then compares it to a "ref" file stored in
git.
Originally, I wanted to do the comparison etc. in a shell or Python
script. But why not do it in C. So mpv calls /usr/bin/diff as a
sub-process now.
This test will start producing different output if FFmpeg adds new pixel
formats or pixel format flags, or if mpv adds new IMGFMT (either aliases
to FFmpeg formats or own formats). That is unavoidable, and requires
manual inspection of the results, and then updating the ref file.
The changes in the non-test code are to guarantee that the format ID
conversion functions only translate between valid IDs.
They were used at some point, but then fell into disuse. In general,
these old flags are all a bit fuzzy, so it's a good idea to remove them
as much as possible.
The comment about MP_IMGFLAG_PAL isn't true anymore. The old meaning was
deprecated at some point, and the flag was removed from "pseudo
paletted" formats. I think mpv at one point changed its own flag from
AV_PIX_FMT_FLAG_PSEUDOPAL to AV_PIX_FMT_FLAG_PAL, when the former was
deprecated, and it became unnecessary to allocate a palette for
non-paletted formats. (The one who deprecated it in FFmpeg was me, if
you were wondering.)
MP_IMGFLAG_PLANAR was used in command.c; use a relatively similar flag
as a replacement.
This is similar to mp_imgfmt_find(), but probably a bit saner. Used by
the next commit. The previous commit is required to make this mapping
unambiguous between all formats.
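Conceptually, it inverts the layout description: fill in a struct that
describes what you want, get a format ID back. A usage sketch (treat the
exact signatures as assumptions):

    #include "video/img_format.h"

    // Sketch (signatures are assumptions): look up the 16 bit variant of
    // a given format by editing its layout description.
    static int find_16bit_variant(int imgfmt)
    {
        struct mp_regular_imgfmt desc;
        if (!mp_get_regular_imgfmt(&desc, imgfmt))
            return 0;
        desc.component_size = 2;              // bytes per component
        return mp_find_regular_imgfmt(&desc); // 0 if no exact match
    }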
As the code comment says, this is needed to disambiguate FFmpeg formats.
This struct only describes the "physical" layout of a format, while
FFmpeg also attaches part of the colorspace information to the format.
The plane pointer checking assert() triggered at least on gray8, because
that has a "pseudo palette" in ffmpeg, which mpv refuses to allocate.
Remove a strange duplicated printf().
Log the component type where available.
(Why is this even here, I hate it when there are commented test programs
in source files.)
Generally, using x86 SIMD efficiently (or crash-free) requires aligning
all data on boundaries of 16, 32, or 64 (depending on instruction set
used). 64 bytes are needed for AVX-512, 32 for old AVX, 16 for SSE. Both
FFmpeg and zimg usually require aligned data for this reason.
FFmpeg is very unclear about alignment. Yes, it requires you to align
data pointers and strides. No, it doesn't tell you how much, except
sometimes (libavcodec has a legacy-looking avcodec_align_dimensions2()
API function, that requires a heavy-weight AVCodecContext as argument).
Sometimes, FFmpeg will take a shit on YOUR and ITS OWN alignment. For
example, vf_crop will randomly reduce alignment of data pointers,
depending on the crop parameters. On the other hand, some libavfilter
filters or libavcodec encoders may randomly crash if they get the wrong
alignment. I have no idea how this thing works at all.
FFmpeg usually doesn't seem to signal alignment internally anywhere, and
usually leaves it to av_malloc() etc. to allocate with proper alignment.
libavutil/mem.c currently has an ALIGN define, which is set to 64 if
FFmpeg is built with AVX-512 support, or as low as 16 if built without
any AVX support. The really funny thing is that a normal FFmpeg build
will e.g. align tiny string allocations to 64 bytes, even if the machine
does not support AVX at all.
For zimg use (in a later commit), we also want guaranteed alignment.
Modern x86 should actually not be much slower at unaligned accesses, but
that doesn't help. zimg's dumb intrinsic code apparently randomly
chooses between aligned or unaligned accesses (depending on compiler, I
guess), and on some CPUs these can even cause crashes. So just treat the
requirement to align as a fact of life.
All this means that we should probably make sure our own allocations are
64 byte aligned. This still doesn't guarantee alignment in all cases, but
it's slightly better than before.
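Concretely, "make sure our own allocations are 64 byte aligned" means
something like this (a generic sketch using plain POSIX, not mpv's
allocator):

    #include <stdlib.h>

    // Sketch: allocate image data with a fixed 64 byte alignment, which
    // covers AVX-512, and therefore anything FFmpeg or zimg might want.
    static void *alloc_aligned64(size_t size)
    {
        void *p = NULL;
        if (posix_memalign(&p, 64, size) != 0)
            return NULL;
        return p;
    }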
This also makes me wonder whether we should always override libavcodec's
buffer pool, just so we have a guaranteed alignment. Currently, we only
do that if --vd-lavc-dr is used (and if that actually works). On the
other hand, it always uses DR on my machine, so who cares.
Get rid of the old vf.c code. Replace it with a generic filtering
framework, which can potentially handle more than just --vf. At least
reimplementing --af with this code is planned.
This changes some --vf semantics (including runtime behavior and the
"vf" command). The most important ones are listed in interface-changes.
vf_convert.c is renamed to f_swscale.c. It is now an internal filter
that can not be inserted by the user manually.
f_lavfi.c is a refactor of player/lavfi.c. The latter will be removed
once --lavfi-complex is reimplemented on top of f_lavfi.c. (which is
conceptually easy, but a big mess due to the data flow changes).
The existing filters are all changed heavily. The data flow of the new
filter framework is different. Especially EOF handling changes - EOF is
now a "frame" rather than a state, and must be passed through exactly
once.
Another major thing is that all filters must support dynamic format
changes. The filter reconfig() function goes away. (This sounds complex,
but since all filters need to handle EOF draining anyway, they can use
the same code, and it removes the mess with reconfig() having to predict
the output format, which completely breaks with libavfilter anyway.)
In addition, there is no automatic format negotiation or conversion.
libavfilter's primitive and insufficient API simply doesn't allow us to
do this in a reasonable way. Instead, filters can use f_autoconvert as
sub-filter, and tell it which formats they support. This filter will in
turn add actual conversion filters, such as f_swscale, to perform
necessary format changes.
vf_vapoursynth.c uses the same basic principle of operation as before,
but with worryingly different details in data flow. Still appears to
work.
The hardware deint filters (vf_vavpp.c, vf_d3d11vpp.c, vf_vdpaupp.c) are
heavily changed. Fortunately, they all used refqueue.c, which is for
sharing the data flow logic (especially for managing future/past
surfaces and such). It turns out it can be used to factor out most of
the data flow. Some of these filters accepted software input. Instead of
having ad-hoc upload code in each filter, surface upload is now
delegated to f_autoconvert, which can use f_hwupload to perform this.
Exporting VO capabilities is still a big mess (mp_stream_info stuff).
The D3D11 code drops the redundant image formats, and all code uses the
hw_subfmt (sw_format in FFmpeg) instead. Although that too seems to be a
big mess for now.
f_async_queue is unused.
This particular format is not marked as AV_PIX_FMT_FLAG_RGB in FFmpeg's
pixdesc table, so mpv assumed it's a YUV format.
This is a regression, since the old code in mp_imgfmt_get_desc() also
treated this format specially to avoid this problem. Another format
which was special-cased in the old code was AV_PIX_FMT_MONOBLACK, so
make an exception for it as well.
Maybe this problem could be avoided by mp_image_params_guess_csp() not
forcing certain colorimetric parameters by the implied colorspace, but
certainly that would cause other problems. At least there are mistagged
files out there that would break. (Do we actually care?)
Fixes #4965.
These are useless and shouldn't be confused with normal RGB formats.
Replace the earlier hack checking the format name with a proper check.
(Not sure when this flag was added. Libav won't have it anyway, but it
also has no bayer formats.)
Add something that allows us to extract the component order from various
RGBA formats. In fact, also handle YUV, GBRP, and XYZ formats with this.
It introduces a new struct mp_regular_imgfmt, that hopefully will
eventually replace struct mp_imgfmt_desc. The latter is still needed by
a lot of code though, especially generic code. Also vo_opengl still uses
the old one, so this commit is sort of incomplete.
Due to its genericness, it's also possible that this commit introduces
rendering bugs, or accepts formats it shouldn't accept.
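Roughly, the struct reduces a format to "N planes of byte-aligned
components in some order"; its approximate shape (from memory, so details
may differ, and the chroma fields were later changed to shifts as
discussed above):

    #include <stdint.h>

    // Approximate shape of struct mp_regular_imgfmt (details may differ).
    struct mp_regular_imgfmt_plane {
        uint8_t num_components;
        uint8_t components[4]; // 1-based component IDs, e.g. R=1 G=2 B=3 A=4
    };

    struct mp_regular_imgfmt {
        uint8_t component_size; // size of each component in bytes
        int8_t component_pad;   // padding bits per component (sign picks side)
        uint8_t num_planes;
        struct mp_regular_imgfmt_plane planes[4];
        uint8_t chroma_w, chroma_h; // chroma division factors (later: shifts)
    };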
The problem with fmt-conversion.h is that "lucabe", who disagreed with
LGPL, originally wrote it. But it was actually rewritten by "reimar"
later. The original switch statement was replaced with a lookup table.
No code other than the imgfmt2pixfmt() function signature survives.
Neither the format pairs (PIXFMT<->IMGFMT), nor the concept of mapping
them, can be copyrighted.
So changing the license should be fine, because reimar and all other
authors involved with the new code agreed to LGPL.
We also don't consider format pairs added later as copyrightable.
(The direct-mapping idea mentioned in the "Copyright" file seems
attractive, and I might implement it later anyway.)
Likewise, there might be some format names added to img_format.h, which
are not covered by relicensing agreements. These all affect "later"
additions, and they follow either the FFmpeg PIXFMT naming or some other
pre-existing logic, so this should be fine.
We now have a video filter that uses the d3d11 video processor, so it
makes no sense to have one in the VO interop code. The VO uses it for
formats not directly supported by ANGLE (so the video data is converted
to an RGB texture, which ANGLE can take in).
Change this so that the video filter is automatically inserted if
needed. Move the code that maps RGB surfaces to its own interop backend.
Add a bunch of new image formats, which are used to enforce the new
constraints, and to automatically insert the filter only when needed.
The added vf mechanism to auto-insert the d3d11vpp filter is very dumb
and primitive, and will work only for this specific purpose. The format
negotiation mechanism in the filter chain is generally not very pretty,
and mostly broken as well. (libavfilter has a different mechanism, and
these mechanisms don't match well, so vf_lavfi uses some sort of hack.
It only works because hwaccel and non-hwaccel formats are strictly
separated.)
The RGB interop is now only used with older ANGLE versions. The only
reason I'm keeping it is because it's relatively isolated (uses only
existing mechanisms and adds no new concepts), and because I want to be
able to compare the behavior of the old code with the new one for
testing. It will be removed eventually.
If ANGLE has NV12 interop, P010 is now handled by converting to NV12
with the video processor, instead of converting it to RGB and using the
old mechanism to import that as a texture.
This is a pretty major rewrite of the internal texture binding
mechanism, which makes it more flexible.
In general, the difference between the old and current approaches is
that now, all texture description is held in a struct img_tex and only
explicitly bound with pass_bind. (Once bound, a texture unit is assumed
to be set in stone and no longer tied to the img_tex.)
This approach makes the code inside pass_read_video significantly more
flexible and cuts down on the number of weird special cases and
spaghetti logic.
It also has some improvements, e.g. cutting down greatly on the number
of unnecessary conversion passes inside pass_read_video (which was
previously mostly done to cope with the fact that the alternative would
have resulted in a combinatorial explosion of code complexity).
Some other notable changes (and potential improvements):
- texture expansion is now *always* handled in pass_read_video, and the
colormatrix never does this anymore. (Which means the code could
probably be removed from the colormatrix generation logic, modulo some
other VOs.)
- struct fbo_tex now stores both its "physical" and "logical"
(configured) size, which cuts down on the amount of width/height
baggage on some function calls
- vo_opengl can now technically support textures with different bit
depths (e.g. 10 bit luma, 8 bit chroma) - but the APIs it queries
inside img_format.c don't export this (nor does ffmpeg support it,
really) so the status quo of using the same tex_mul for all planes is
kept.
- dumb_mode is now only needed because of the indirect_fbo being in the
main rendering pipeline. If we reintroduce p->use_indirect and thread
a transform through the entire program this could be skipped where
unnecessary, allowing for the removal of dumb_mode. But I'm not sure
how to do this in a clean way. (Which is part of why it got introduced
to begin with.)
- It would be trivial to resurrect source-shader now (it would just be
one extra 'if' inside pass_read_video).
AVComponentDescriptor.offset was introduced relatively recently. On
older releases, you have to use AVComponentDescriptor.offset_plus1,
which is now deprecated.
Instead of adding ifdeffery, assume AV_PIX_FMT_NV21 is the only format
for which this applies (and will remain the only case), which is
probably true enough.
A format could declare that some or all LSBs in a component are padding
bits by setting a non-0 AVComponentDescriptor.shift value. This means we
would interpret it incorrectly, because until now we always assumed all
regular formats have the padding in the MSBs.
Not a single format that does this actually exists, though. But an NV12
variant will be added to FFmpeg later.
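For reference, pixdesc's shift field means the raw value has to be shifted
down before masking; roughly (a generic sketch of the semantics):

    // Sketch: component extraction per AVComponentDescriptor semantics.
    // 'shift' counts LSBs that are padding; mpv previously assumed
    // shift == 0 everywhere, i.e. padding only in the MSBs.
    static unsigned get_component(unsigned raw, int shift, int depth)
    {
        return (raw >> shift) & ((1u << depth) - 1);
    }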
This removes the need to define IMGFMT_GBRAP, which fixes compilation
with the current Libav release.
This also makes it automatically pick up a GBRP format with the same bit
width. (Unfortunately, it seems libswscale does not support conversion
to AV_PIX_FMT_GBRAP16, so our code falls back to 8 bit, removing
precision for video covered by subtitles in cases where this code is used.)
Also, when the source video is e.g. 10 bit YUV, upsample to 16 bit.
Whether this is good or bad, it fixes behavior with alpha. Although I'm
not sure if the alpha range is really correct ([0,2^16-1] vs.
[0,255*256]). Keep in mind that libswscale doesn't even agree with the
way we do it.
The computation of the tex_mul variable was broken in multiple ways.
This variable is used e.g. by debanding for moving expansion of 10 bit
fixed-point input to normalized range to another stage of processing.
One obvious bug was that the rgb555 pixel format was broken. This format
has component_bits=5, but obviously it's already sampled in normalized
range, and does not need expansion. The tex_mul-free code path avoids
this by not using the colormatrix. (The code was originally designed to
work around dealing with the generally complicated pixel formats by only
using the colormatrix in the YUV case.)
Another possible bug was with 10 bit input. It expanded the input by
bringing the [0,2^10) range to [0,1], and then treating the expanded
input as 16 bit input. I didn't bother to check what this actually
computed, but it's somewhat likely it was wrong anyway. Now it uses
mp_get_csp_mul(), and disables expansion when computing the YUV matrix.
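For the record, the multiplier in the simple full-range case is just the
ratio of the two ranges (mp_get_csp_mul() additionally accounts for
limited-range YUV); a sketch:

    // Sketch: N significant bits stored in an M bit texture, sampled as
    // normalized fixed point. GL divides by 2^M - 1, but the data only
    // covers [0, 2^N - 1], so scale by the ratio:
    static double expand_mul(int bits, int tex_bits)
    {
        return (double)((1 << tex_bits) - 1) / ((1 << bits) - 1);
    }
    // expand_mul(10, 16) == 65535.0 / 1023.0, roughly 64.06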