A repeat of the previous useless commits.
Pondered whether to use separate fields or just a flags integer for
color and component types; the latter won for now.
Functions like mp_imgfmt_get_component_type() are now discouraged, and
mp_imgfmt_desc.flags is back for defining all information. Some days ago
I felt like the opposite would be the better design. Fortunately, it
doesn't matter.
With this, I think all image format properties that mpv needs are
exhaustively defined all in one place.
I guess I decided to stuff it all into mp_imgfmt_desc (the "old"
struct). This is probably a mistake. At first I was afraid that this
struct would get too fat (probably justified, and hereby happened), but
on the other hand mp_imgfmt_get_desc() (which builds the struct) calls
the former mp_imgfmt_get_layout(), and the separation doesn't make too
much sense anyway. Just merge them.
Still, try to keep out the extra info for packed YUV bullshit. I think
the result is OK, and there's as much information as there was before.
The test output changes a little. There's no independent bits[] array
anymore, so formats which did not previously have set this now show it.
(These formats are mpv-only and are still missing the metadata. To be
added later). Also, the output for the cursed packed formats changes.
I thought I'd probably want something like this, so the hardcoded stuff
in repack.c can be removed eventually. Of course this has no purpose at
all, and will not have any. (For now, this provides only metadata, and
nothing uses it, apart from the "test" that dumps it as text.)
This adds full support for AV_PIX_FMT_UYYVYY411 (probably out of spite,
because the format is 100% useless). Support for some mpv-only formats
is missing, ironically.
The code goes through _lengths_ to try to make sense out of the FFmpeg
AVPixFmtDescriptor data. Which is even more amazing that the new
metadata basically mirrors pixdesc, and just adds to it. Considering
code complexity and speed issues (it takes time to crunch through all
this shit all the time), and especially the fact that pixdesc is very
_incomplete_, it would probably better to have our own table to all
formats. But then we'd not scramble every time FFmpeg adds a new format,
which would be annoying. On the other hand, by using pixdesc, we get the
excitement to see whether this code will work, or break everything in
catastrophic ways.
The data structure still sucks a lot. Maybe I'll redo it again. The text
dump is weirdly differently formatted than the C struct - because I'm
not happy with the representation. Maybe I'll redo it all over again.
In summary: this commit does nothing.
Remove the vaguely defined plane_bits and component_bits fields from
struct mp_imgfmt_desc. Add weird replacements for existing uses. Remove
the bytes[] field, replace uses with bpp[].
Fix some potential alignment issues in existing code. As a compromise,
split mp_image_pixel_ptr() into 2 functions, because I think it's a bad
idea to implicitly round, but for some callers being slightly less
strict is convenient.
This shouldn't really change anything. In fact, it's a 100% useless
change. I'm just cleaning up what I started almost 8 years ago (see
commit 00653a3eb0). With this I've decided to keep mp_imgfmt_desc,
just removing the weird parts, and keeping the saner parts.
They are sort of confusing, and they hide the fact that they have an
alpha component. Using the actual formats directly is no problem, sicne
these were useful only for big endian systems, something we can't test
anyway.
Adding all these so I can use them for obscure processing purposes (see
later draw_bmp commit).
There isn't really a reason why they should exist. On the other hand,
they're just labels for formats that can be handled in a generic way,
and this commit adds support for them in the zimg wrapper and vo_gpu
just by making the formats exist. (Well, vo_gpu had to be fixed in the
previous commit.)
Was broken with a zimg wrapper refucktor before the previous commit. In
addition, it seems this didn't match the vo_drm format, or the format
naming convention. So the order actually changes, and the format is
redefined. (The img_format.h comment was probably wrong.)
Change vo_gpu to the new format as well, so we can still test it.
For whatever purpose. If anything, this makes the zimg wrapper cleaner.
The added tests are not particular exhaustive, but nice to have. This
also makes the scale_zimg.c test pretty useless, because it only tests
repacking (going through the zimg wrapper). In theory, the repack_tests
things could also be used on scalers, but I guess it doesn't matter.
Some things are added over the previous zimg wrapper code. For example,
some fringe formats can now be expanded to 8 bit per component for
convenience.
Utterly useless, but the intention is to make dealing with corner case
pixel formats (forced upon us by FFmpeg, very rarely) less of a pain.
The zimg wrapper will use them. (It already supports these formats
automatically, but it will help with its internals.)
Y1 is considered RGB, even though gray formats are generally treated as
YUV for various reasons. mpv will default all YUV formats to limited
range internally, which makes no sense for a 1 bit format, so this is a
problem. I wanted to avoid that mp_image_params_guess_csp() (which
applies the default) explicitly checks for an image format, so although
a bit janky, this seems to be a good solution, especially because I
really don't give a shit about these formats, other than having to
handle them. It's notable that AV_PIX_FMT_MONOBLACK (also 1 bit gray,
just packed) already explicitly marked itself as RGB.
When I added mp_regular_imgfmt, I made the chroma subsampling use the
actual chroma division factor, instead of a shift (log2 of the actual
value). I had some ideas about how this was (probably?) more intuitive
and general. But nothing ever uses non-power of 2 subsampling (except
jpeg in rare cases apparently, because the world is a bad place).
Change the fields back to use shifts and rename them to avoid mistakes.
Make this slightly less ad-hoc. Also correct the missing alpha flag for
yap8/yap16.
Despite reduced redundancy, the LOC is going up anyway... whatever.
One of the extremely annoying dumb things in ffmpeg is that most pixel
formats are available as little endian and big endian variants. (The
sane way would be having native endian formats only.) Usually, most of
the real codecs use native formats only, while non-native formats are
used by fringe raw codecs only. But the PNG encoders and decoders
unfortunately use big endian formats, and since PNG it such a popular
format, this causes problems for us. In particular, the current zimg
wrapper will refuse to work (and mpv will fall back to sws) when writing
non-8 bit PNGs.
So add non-native endian support to zimg. This is done in a fairly
"generic" way (which means lots of potential for bugs). If input is a
"regular" format (and just byte-swapped), the rest happens
automatically, which happens to cover all interesting formats.
Some things could be more efficient; for example, unpacking is done on
the data before it's passed to the unpacker. You could make endian
swapping part of the actual unpacking process, which might be slightly
faster. You could avoid copying twice in some cases (such as when
there's no actual repacker, or if alignment needs to be corrected). But
I don't really care. It's reasonably fast for the normal case.
Not entirely sure whether this is correct. Some (but not many) formats
are covered by the tests, some I tested manually. Some I can't even
test, because libswscale doesn't support them (like nv20*).
The zimg wrapper "needs" these formats as intermediary when repacking
the normal gray/alpha packed format. The packed format is used by the
png decoder and encoder, and is thus interesting.
Unfortunately, mpv-only formats are a mess right now, because all the
existing code is focused around using the FFmpeg metadata for pixel
formats. This should be improved, but not now, so make the mess worse.
This commit doesn't add support for it to the zimg wrapper yet.
They were used at some point, but then fell into disuse. In general,
these old flags are all a bit fuzzy, so it's a good idea to remove them
as much as possible.
The comment about MP_IMGFLAG_PAL isn't true anymore. The old meaning was
deprecated at some point, and the flag was removed from "pseudo
paletted" formats. I think mpv at one point changed its own flag from
AV_PIX_FMT_FLAG_PSEUDOPAL to AV_PIX_FMT_FLAG_PAL, when the former was
deprecated, and it became unnecessary to allocate a palette for
non-paletted formats. (The one who deprecated in FFmpeg was me, if you
wonder.)
MP_IMGFLAG_PLANAR was used in command.c, use a relatively similar flag
as replacement.
This is similar to mp_imgfmt_find(), but probably a bit saner. Used by
the next commit. The previous commit is required to map this
unambiguously between all formats.
As the code comment says, this is needed to disambiguate FFmpeg formats.
This struct only describes the "physical" layout of a format, while
FFmpeg also attaches part of the colorspace information to the format.
New releases of VDPAU support decoding 4:4:4 content, and that comes
back as NV24 when using 'direct mode' in OpenGL Interop. That means we
need to be a little bit smarter about how we set up the OpenGL
textures.
Using the GL renderer for color conversion will make sure screenshots
will use the same conversion as normal video rendering. It can do this
for all types of screenshots.
The logic when to write 16 bit PNGs changes. To approximate the old
behavior, we decide by looking whether the source video format has more
than 8 bits per component. We apply this logic even for window
screenshots. Also, 16 bit PNGs now always include an unused alpha
channel. The reason is that FFmpeg has RGB48 and RGBA64 formats, but no
RGB064. RGB48 is 3 bytes and usually not supported by GPUs for
rendering, so we have to use RGBA64, which forces an alpha channel.
Will break for users who use --target-trc and similar options.
I considered creating a new gl_video context, but it could double GPU
memory use, so I didn't.
This uses FBOs instead of glGetTexImage(), because that increases the
chance it could work on GLES (e.g. ANGLE). Untested. No support for the
Vulkan and D3D11 backends yet.
Fixes#5498. Also fixes#5240, because the code for reading back is not
used with the new code path.
Get rid of the old vf.c code. Replace it with a generic filtering
framework, which can potentially handle more than just --vf. At least
reimplementing --af with this code is planned.
This changes some --vf semantics (including runtime behavior and the
"vf" command). The most important ones are listed in interface-changes.
vf_convert.c is renamed to f_swscale.c. It is now an internal filter
that can not be inserted by the user manually.
f_lavfi.c is a refactor of player/lavfi.c. The latter will be removed
once --lavfi-complex is reimplemented on top of f_lavfi.c. (which is
conceptually easy, but a big mess due to the data flow changes).
The existing filters are all changed heavily. The data flow of the new
filter framework is different. Especially EOF handling changes - EOF is
now a "frame" rather than a state, and must be passed through exactly
once.
Another major thing is that all filters must support dynamic format
changes. The filter reconfig() function goes away. (This sounds complex,
but since all filters need to handle EOF draining anyway, they can use
the same code, and it removes the mess with reconfig() having to predict
the output format, which completely breaks with libavfilter anyway.)
In addition, there is no automatic format negotiation or conversion.
libavfilter's primitive and insufficient API simply doesn't allow us to
do this in a reasonable way. Instead, filters can use f_autoconvert as
sub-filter, and tell it which formats they support. This filter will in
turn add actual conversion filters, such as f_swscale, to perform
necessary format changes.
vf_vapoursynth.c uses the same basic principle of operation as before,
but with worryingly different details in data flow. Still appears to
work.
The hardware deint filters (vf_vavpp.c, vf_d3d11vpp.c, vf_vdpaupp.c) are
heavily changed. Fortunately, they all used refqueue.c, which is for
sharing the data flow logic (especially for managing future/past
surfaces and such). It turns out it can be used to factor out most of
the data flow. Some of these filters accepted software input. Instead of
having ad-hoc upload code in each filter, surface upload is now
delegated to f_autoconvert, which can use f_hwupload to perform this.
Exporting VO capabilities is still a big mess (mp_stream_info stuff).
The D3D11 code drops the redundant image formats, and all code uses the
hw_subfmt (sw_format in FFmpeg) instead. Although that too seems to be a
big mess for now.
f_async_queue is unused.
This commit allows to use the AV_PIX_FMT_DRM_PRIME newly introduced
format in ffmpeg that allows decoders to provide an AVDRMFrameDescriptor
struct.
That struct holds dmabuf fds and information allowing zerocopy rendering
using KMS / DRM Atomic.
This has been tested on RockChip ROCK64 device.
It never worked. It relied on some obscure texture format to provide the
equivalent of GL_RG or GL_LUMINANCE_ALPHA, but no hardware seemed to
report support for it ever. No idea what's the correct way to do this.
On D3D11 it exists, of course.
(Actually I'd like to remove the whole VO.)
Another legacy annoyance. The only place where packed YUV is still
important is slightly older Apple hardware or drivers, which require
it for efficient hardware decoding.
For vo_opengl and vo_direct3d, these are supported in a generic way.
For vf_vapoursynth, we could probably map its VSFormat struct in a
generic way, but for now do some bullshit.
vf_eq.c actually loses support for these formats. We could add generic
support too (anything that has 8 bit planes will work), but why bother.
The filter is deprecated anyway.
Add something that allows is to extract the component order from various
RGBA formats. In fact, also handle YUV, GBRP, and XYZ formats with this.
It introduces a new struct mp_regular_imgfmt, that hopefully will
eventually replace struct mp_imgfmt_desc. The latter is still needed by
a lot of code though, especially generic code. Also vo_opengl still uses
the old one, so this commit is sort of incomplete.
Due to its genericness, it's also possible that this commit introduces
rendering bugs, or accepts formats it shouldn't accept.
The problem with fmt-conversion.h is that "lucabe", who disagreed with
LGPL, originally wrote it. But it was actually rewritten by "reimar"
later. The original switch statement was replaced with a lookup table.
No code other than the imgfmt2pixfmt() function signature survives.
Neither the format pairs (PIXFMT<->IMGFMT), nor the concept of mapping
them, can be copyrighted.
So changing the license should be fine, because reimar and all other
authors involved with the new code agreed to LGPL.
We also don't consider format pairs added later as copyrightable.
(The direct-mapping idea mentioned in the "Copyright" file seems
attractive, and I might implement in later anyway.)
Likewise, there might be some format names added to img_format.h, which
are not covered by relicensing agreements. These all affect "later"
additions, and they follow either the FFmpeg PIXFMT naming or some other
pre-existing logic, so this should be fine.
The latest 375.xx nvidia drivers add support for P016 output
surfaces. In combination with an ffmpeg change to return those
surfaces, we can display them.
The bulk of the work is related to knowing which format you're
dealing with at the right time. Once you know, it's straight forward.
Nvidia's "NvDecode" API (up until recently called "cuvid" is a cross
platform, but nvidia proprietary API that exposes their hardware
video decoding capabilities. It is analogous to their DXVA or VDPAU
support on Windows or Linux but without using platform specific API
calls.
As a rule, you'd rather use DXVA or VDPAU as these are more mature
and well supported APIs, but on Linux, VDPAU is falling behind the
hardware capabilities, and there's no sign that nvidia are making
the investments to update it.
Most concretely, this means that there is no VP8/9 or HEVC Main10
support in VDPAU. On the other hand, NvDecode does export vp8/9 and
partial support for HEVC Main10 (more on that below).
ffmpeg already has support in the form of the "cuvid" family of
decoders. Due to the design of the API, it is best exposed as a full
decoder rather than an hwaccel. As such, there are decoders like
h264_cuvid, hevc_cuvid, etc.
These decoders support two output paths today - in both cases, NV12
frames are returned, either in CUDA device memory or regular system
memory.
In the case of the system memory path, the decoders can be used
as-is in mpv today with a command line like:
mpv --vd=lavc:h264_cuvid foobar.mp4
Doing this will take advantage of hardware decoding, but the cost
of the memcpy to system memory adds up, especially for high
resolution video (4K etc).
To avoid that, we need an hwdec that takes advantage of CUDA's
OpenGL interop to copy from device memory into OpenGL textures.
That is what this change implements.
The process is relatively simple as only basic device context
aquisition needs to be done by us - the CUDA buffer pool is managed
by the decoder - thankfully.
The hwdec looks a bit like the vdpau interop one - the hwdec
maintains a single set of plane textures and each output frame
is repeatedly mapped into these textures to pass on.
The frames are always in NV12 format, at least until 10bit output
supports emerges.
The only slightly interesting part of the copying process is that
CUDA works by associating PBOs, so we need to define these for
each of the textures.
TODO Items:
* I need to add a download_image function for screenshots. This
would do the same copy to system memory that the decoder's
system memory output does.
* There are items to investigate on the ffmpeg side. There appears
to be a problem with timestamps for some content.
Final note: I mentioned HEVC Main10. While there is no 10bit output
support, NvDecode can return dithered 8bit NV12 so you can take
advantage of the hardware acceleration.
This particular mode requires compiling ffmpeg with a modified
header (or possibly the CUDA 8 RC) and is not upstream in ffmpeg
yet.
Usage:
You will need to specify vo=opengl and hwdec=cuda.
Note that hwdec=auto will probably not work as it will try to use
vdpau first.
mpv --hwdec=cuda --vo=opengl foobar.mp4
If you want to use filters that require frames in system memory,
just use the decoder directly without the hwdec, as documented
above.
We now have a video filter that uses the d3d11 video processor, so it
makes no sense to have one in the VO interop code. The VO uses it for
formats not directly supported by ANGLE (so the video data is converted
to a RGB texture, which ANGLE can take in).
Change this so that the video filter is automatically inserted if
needed. Move the code that maps RGB surfaces to its own inteorp backend.
Add a bunch of new image formats, which are used to enforce the new
constraints, and to automatically insert the filter only when needed.
The added vf mechanism to auto-insert the d3d11vpp filter is very dumb
and primitive, and will work only for this specific purpose. The format
negotiation mechanism in the filter chain is generally not very pretty,
and mostly broken as well. (libavfilter has a different mechanism, and
these mechanisms don't match well, so vf_lavfi uses some sort of hack.
It only works because hwaccel and non-hwaccel formats are strictly
separated.)
The RGB interop is now only used with older ANGLE versions. The only
reason I'm keeping it is because it's relatively isolated (uses only
existing mechanisms and adds no new concepts), and because I want to be
able to compare the behavior of the old code with the new one for
testing. It will be removed eventually.
If ANGLE has NV12 interop, P010 is now handled by converting to NV12
with the video processor, instead of converting it to RGB and using the
old mechanism to import that as a texture.
This commit adds the d3d11va-copy hwdec mode using the ffmpeg d3d11va
api. Functions in common with dxva2 are handled in a separate decode/d3d.c
file. A future commit will rewrite decode/dxva2.c to share this code.