116725 Commits

Author SHA1 Message Date
Ramiro Polla
8744764a4c swscale/x86/yuv2rgb: add ssse3 yuv42{0,2}p -> gbrp unscaled colorspace converters
Note: this implementation is limited to x86_64 due to general purpose
      register pressure.

checkasm --bench on an Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz:
yuv420p_gbrp_8_c: 118.5
yuv420p_gbrp_8_ssse3: 93.3
yuv420p_gbrp_128_c: 1068.3
yuv420p_gbrp_128_ssse3: 319.3
yuv420p_gbrp_1080_c: 8841.8
yuv420p_gbrp_1080_ssse3: 2211.8
yuv420p_gbrp_1920_c: 15903.8
yuv420p_gbrp_1920_ssse3: 3814.3
yuv422p_gbrp_8_c: 144.8
yuv422p_gbrp_8_ssse3: 93.8
yuv422p_gbrp_128_c: 1395.8
yuv422p_gbrp_128_ssse3: 313.0
yuv422p_gbrp_1080_c: 11551.5
yuv422p_gbrp_1080_ssse3: 2240.8
yuv422p_gbrp_1920_c: 20585.3
yuv422p_gbrp_1920_ssse3: 5249.5
yuva420p_gbrp_8_c: 117.5
yuva420p_gbrp_8_ssse3: 92.0
yuva420p_gbrp_128_c: 1593.0
yuva420p_gbrp_128_ssse3: 319.3
yuva420p_gbrp_1080_c: 8694.5
yuva420p_gbrp_1080_ssse3: 2186.0
yuva420p_gbrp_1920_c: 15946.5
yuva420p_gbrp_1920_ssse3: 3805.3
2024-08-18 22:26:14 +02:00
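To put the checkasm figures from 8744764a4c in perspective, the short sketch below turns a few of them into speedup ratios. It only re-uses the yuv420p_gbrp numbers quoted in the commit message; the dictionary layout and the `speedup` helper are illustrative and not part of checkasm, and the numeric suffix of each test name is assumed to be the tested width.

```python
# checkasm --bench figures copied from commit 8744764a4c (lower is faster).
# Keys are the numeric suffix of the test name (assumed to be the width).
bench = {
    8: (118.5, 93.3),
    128: (1068.3, 319.3),
    1080: (8841.8, 2211.8),
    1920: (15903.8, 3814.3),
}


def speedup(c_time, simd_time):
    """Return how many times faster the SIMD version is than the C version."""
    return c_time / simd_time


speedups = {size: speedup(c, ssse3) for size, (c, ssse3) in bench.items()}

for size, ratio in speedups.items():
    print(f"yuv420p_gbrp_{size}: {ratio:.2f}x")
```

With these inputs the ratio grows from roughly 1.3x at the smallest size to a little over 4x at the largest.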
Ramiro Polla
4545205a26 swscale/yuv2rgb: add yuv42{0,2}p -> gbrp unscaled colorspace converters 2024-08-18 22:26:11 +02:00
Ramiro Polla
af5adf57e3 swscale/yuv2rgb: prepare YUV2RGBFUNC macro for multi-planar rgb
This will be used in the upcoming yuv42{0,2}p -> gbrp unscaled
colorspace converters.

There is no difference in performance.
2024-08-18 22:26:08 +02:00
Ramiro Polla
24063e7827 swscale/yuv2rgb: prepare LOADCHROMA/PUTFUNC macros for multi-planar rgb
This will be used in the upcoming yuv42{0,2}p -> gbrp unscaled
colorspace converters.

There is no difference in performance.
2024-08-18 22:26:05 +02:00
Ramiro Polla
5c1c0325cd avcodec/aarch64/me_cmp: add dotprod implementations of sse16 and vsse_intra16
checkasm --bench for Raspberry Pi 5 Model B Rev 1.0:
sse_0_c: 241.5
sse_0_neon: 37.2
sse_0_dotprod: 22.2
vsse_4_c: 148.7
vsse_4_neon: 31.0
vsse_4_dotprod: 15.7
2024-08-17 15:31:48 +02:00
Dale Curtis
a31106d849 lavf/demux: don't reallocate an AVCodecContext when closing a non-open codec.
This results in an unnecessary ~800k allocation with H.264. A
nearby callsite uses avcodec_is_open() to avoid this, so do the
same when exiting avformat_find_stream_info().

Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2024-08-17 12:54:41 +02:00
sfan5
c779766b5c avcodec/mediacodecdec: call MediaCodec.stop on close
Usually the MediaCodec context will be released immediately, or it needs to stay
alive due to existing hardware buffers.

However we can free resources early in the case of
hw_buffer_count == 0 && refcount > 1, which can be reproduced by keeping frames
referenced after flushing and closing. mpv currently behaves like this.

Signed-off-by: sfan5 <sfan5@live.de>
Signed-off-by: Matthieu Bouron <matthieu.bouron@gmail.com>
2024-08-17 09:03:05 +02:00
Timo Rothenpieler
817c6a6762 avformat/hlsenc: correctly reset subtitle stream counter per-varstream
Without resetting it, if there was a previous set of varstreams with
subtitles, it would subtract from all the streams, leading to chaos and
segfaults when trying to access for example stream -1.
2024-08-16 20:22:09 +02:00
James Almer
211c88b9d5 avfilter/f_zmq: fix graph argument
Fixes regression since d566a37003.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-08-16 09:53:22 -03:00
Niklas Haas
7b723ebd5a avcodec/dovi_rpudec: error out on strange RPU formats
Better safe than sorry.
2024-08-16 11:48:02 +02:00
Niklas Haas
3e1b70383e avcodec/dovi_rpuenc: slightly improve profile autodetection
In the absence of an RPU header, we can consult the colorspace tags to
make a more informed guess about whether we're looking at profile 5 or
profile 8.
2024-08-16 11:48:02 +02:00
Niklas Haas
ecea6ed3c9 avcodec/dovi_rpuenc: implement DM metadata compression
This implements limited metadata compression. To be a bit more lenient,
we try and re-order the static extension blocks when testing for an
exact match.

For sanity, and to avoid producing bitstreams we couldn't ourselves
decode, we don't accept partial matches - if some extension blocks
change while others remain static, compression is disabled for the
entire frame.

This shouldn't be an issue in practice because static extension blocks
are stated to remain constant throughout the entire sequence.
2024-08-16 11:48:02 +02:00
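The matching rule described in ecea6ed3c9 (exact match, order ignored, no partial matches) can be illustrated with a minimal sketch. This is not the encoder's code: the block representation as `(level, payload)` tuples and both function names are hypothetical, chosen only to show the all-or-nothing comparison.

```python
from collections import Counter


def static_blocks_match(prev_blocks, cur_blocks):
    """Exact match of two block lists, ignoring order (multiset equality)."""
    return Counter(prev_blocks) == Counter(cur_blocks)


def can_compress(prev_blocks, cur_blocks):
    """All-or-nothing: any difference disables compression for the frame."""
    return static_blocks_match(prev_blocks, cur_blocks)


# Hypothetical static extension blocks as (level, payload) tuples.
previous = [(1, "a"), (6, "b")]
reordered = [(6, "b"), (1, "a")]
partly_changed = [(1, "a"), (6, "c")]

print(can_compress(previous, reordered))       # same blocks, different order
print(can_compress(previous, partly_changed))  # one block differs
```

In this sketch a reordered but otherwise identical set still counts as a match, while a single changed block rules out compression for the whole frame.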
Niklas Haas
9824d1539e avcodec/dovi_rpudec: sanitize DM data before decoding
Some DM types do not fill the whole struct, so just clear it entirely
before filling in the decoded values.
2024-08-16 11:48:02 +02:00
Niklas Haas
45f5f4d3da avcodec/dovi_rpudec: implement limited DM decompression
This implements the limited DM metadata compression scheme described in
chapter 9 of the Dolby Vision bitstream specification.

The spec is a bit unclear about how to handle the presence of static
metadata inside compressed frames; in that it doesn't explicitly forbid
an encoder from repeating redundant metadata. In theory, we would need
to detect this case and then strip the corresponding duplicate metadata
from the existing set of static metadata. However, this is difficult to
implement - especially for the case of metadata blocks which may be
internally repeated (e.g. level 10).

That said, the spec states outright that static metadata should be
constant throughout the entire sequence, so a sane bitstream should not
have any static metadata values changing from one frame to the next (at
least up to a keyframe boundary), and therefore they should never be
present in compressed frames. As a consequence, it makes sense to treat
this as an error state regardless. (Ignoring them by default, or
erroring if either AV_EF_EXPLODE or AV_EF_AGGRESSIVE is set.)

I was not able to find such samples in the wild (outside of artificially
produced test cases for this exact scenario), so I don't think we need
to worry about it until somebody produces one.
2024-08-16 11:48:02 +02:00
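The policy stated in 45f5f4d3da for static metadata found inside a compressed frame (ignore by default, error out if AV_EF_EXPLODE or AV_EF_AGGRESSIVE is set) can be sketched as below. The flag values are assumed to mirror the libavcodec definitions and should be checked against the headers; the function name and its string return values are hypothetical.

```python
# Assumed to mirror libavcodec's err_recognition flag values; verify against
# the headers before relying on them.
AV_EF_EXPLODE = 1 << 3
AV_EF_AGGRESSIVE = 1 << 18


def on_static_metadata_in_compressed_frame(err_recognition):
    """Return "error" or "ignore" following the policy in the commit message."""
    if err_recognition & (AV_EF_EXPLODE | AV_EF_AGGRESSIVE):
        return "error"
    return "ignore"


print(on_static_metadata_in_compressed_frame(0))
print(on_static_metadata_in_compressed_frame(AV_EF_EXPLODE))
```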
Niklas Haas
1c4d4cc368 avcodec/dovi_rpudec: don't unnecessarily allocate DOVIExt 2024-08-16 11:48:02 +02:00
Niklas Haas
a1f96ae157 avcodec/dovi_rpu: separate static ext blocks
Static and dynamic extension blocks are handled differently by metadata
compression, so we need to separate the extension block array into two.
2024-08-16 11:48:02 +02:00
Niklas Haas
f5d6eb4017 avcodec/dovi_rpu: move ext blocks into dedicated struct
Slightly re-organize the logic around extension blocks in order to allow
expanding the state tracking in a following commit.
2024-08-16 11:48:02 +02:00
Niklas Haas
b3d33f11fa avcodec/bsf/dovi_rpu: add new bitstream filter
This can be used to strip dovi metadata, or enable/disable dovi
metadata compression. Possibly more use cases in the future.
2024-08-16 11:48:02 +02:00
Niklas Haas
07712a0cab avcodec/dovi_rpuenc: add configuration for compression
In particular, validate that the chosen compression level is compatible
with the chosen profile.
2024-08-16 11:48:02 +02:00
Niklas Haas
1917270d32 avcodec/dovi_rpuenc: add ff_dovi_configure_ext()
More flexible version of ff_dovi_configure() which does not require an
AVCodecContext. Usable, for example, inside a bitstream filter.
2024-08-16 11:48:02 +02:00
Niklas Haas
765f29c61e avcodec/dovi_rpu: add ff_dovi_get_metadata()
Provides direct access to the AVDOVIMetadata without having to attach it
to a frame.
2024-08-16 11:48:02 +02:00
Niklas Haas
ae3a78593d avcodec/dovi_rpuenc: add a flag to enable compression
Keyframes must reset the metadata compression state, so we need to
also signal this at rpu generation time.

Default to uncompressed, because encoders cannot generally know if
a given frame will be a keyframe before they finish encoding, but also
cannot retroactively attach the RPU. (Within the confines of current
APIs)
2024-08-16 11:48:02 +02:00
Niklas Haas
b3bc8f8e1e avcodec/dovi_rpuenc: make encapsulation optional
And move the choice of desired container to `flags`. This is needed to
handle differing API requirements (e.g. libx265 requires the NAL RBSP,
but CBS BSF requires the unescaped bytes).
2024-08-16 11:48:02 +02:00
Niklas Haas
1e6fdb89bd avcodec/dovi_rpuenc: add flags to ff_dovi_rpu_generate()
Will be used to control compression, encapsulation etc.
2024-08-16 11:48:02 +02:00
Niklas Haas
c62b364dcb avcodec/dovi_rpuenc: respect dv_md_compression
Limited mode can only ever maintain a single VDR RPU reference, and
furthermore requires vdr_rpu_id == 0. So in practice, it will only ever
use VDR RPU slot 0. All remaining slots get flushed in this case, to
avoid leaking partial state.
2024-08-16 11:48:02 +02:00
Niklas Haas
fd00a56653 avcodec/dovi_rpuenc: eliminate unnecessary loop
This struct itself contains vdr_rpu_id, so we can never match it except
in the case of i == vdr_rpu_id. So just directly use this ID.
2024-08-16 11:48:02 +02:00
Niklas Haas
bf92441d6a avcodec/dovi_rpuenc: also copy ext blocks to dovi ctx
As the comment implies, DOVIContext.ext_blocks should also reflect the
current state after ff_dovi_rpu_generate().

Fluff for now, but will be needed once we start implementing metadata
compression for extension blocks as well.
2024-08-16 11:48:02 +02:00
Niklas Haas
a93801b626 avcodec/dovi_rpudec: implement validation for compression
Add some error checking. I've limited it to AV_EF_CAREFUL and
AV_EF_COMPLIANT for now, because we can technically decode such RPUs
just fine.
2024-08-16 11:48:02 +02:00
Niklas Haas
2a2e0aced2 avutil/dovi_meta: document static vs dynamic ext blocks 2024-08-16 11:48:02 +02:00
Niklas Haas
ae31acd702 fate/scalechroma: switch to standard chroma location
Replace the manually specified chroma location by one using standard
notation, arbitrarily "bottomleft" as it is a less common path.

Required if we want to phase out the use of manual chroma locations.
2024-08-16 11:43:37 +02:00
Niklas Haas
f1071dc634 avfilter/vf_zscale: remove unused fields 2024-08-16 11:43:37 +02:00
Niklas Haas
c8bc6fabd7 avfilter/vf_scale: fix 4:1:0 interlaced chroma pos
The current logic hard-coded a check for v_sub == 1. We can extend this
logic slightly to cover the case of interlaced 4:1:0 (which has v_sub ==
2).

Here is a diagram explaining this scenario (with center-siting):

a   a   a   a   a   a   a   a

b   b   b   b   b   b   b   b
      X               X
a   a   a   a   a   a   a   a

b   b   b   b   b   b   b   b

a   a   a   a   a   a   a   a

b   b   b   b   b   b   b   b
      Y               Y
a   a   a   a   a   a   a   a

b   b   b   b   b   b   b   b

a = even luma rows
b = odd luma rows
X = even chroma sample
Y = odd chroma sample

In progressive mode, each chroma sample sits at (h_chr_pos, v_chr_pos) = (384, 384).

Relative to the 8x4 grid of even luma samples (a), the X sample sits at:
  h_chr_pos = 384
  v_chr_pos = 192

Relative to the 8x4 grid of odd luma samples (b), the Y sample sits at:
  h_chr_pos = 384
  v_chr_pos = 576

The new code calculates the correct values in all circumstances.
2024-08-16 11:43:37 +02:00
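The numbers in c8bc6fabd7 can be reproduced from the diagram's geometry. The sketch below is derived from the diagram only and is not claimed to be the code in vf_scale; the function name is hypothetical, and positions are assumed to be in units of 1/256 of a luma row.

```python
def field_v_chr_pos(progressive_pos, v_sub, odd_field):
    """Vertical chroma position as seen from one field, in 1/256 luma rows.

    progressive_pos: position within the full frame (384 in the example)
    v_sub:           log2 of the vertical subsampling (2 for 4:1:0)
    odd_field:       True for the odd-row field (b), False for the even one (a)
    """
    pos = progressive_pos
    if odd_field:
        # The odd field's chroma sample (Y) lies one chroma row further down
        # the frame (256 << v_sub), while the odd field itself starts at
        # luma row 1 (256), so the net shift is the difference of the two.
        pos += (256 << v_sub) - 256
    # Rows of a field are two frame rows apart, so distances are halved.
    return pos // 2


even_pos = field_v_chr_pos(384, 2, odd_field=False)
odd_pos = field_v_chr_pos(384, 2, odd_field=True)
print(even_pos, odd_pos)
```

For the center-sited 4:1:0 case this yields the v_chr_pos values 192 and 576 listed in the commit message; h_chr_pos is unaffected by interlacing and stays at 384.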
Niklas Haas
15a67c0947 avfilter/vf_scale: add in/out_chroma_loc
Currently, this just functions as a more principled and user-friendly
replacement for the (undocumented and hard to use) *_chr_pos fields.

However, the goal is to automatically infer these values from the input
frames' chroma location, and deprecate the manual use of *_chr_pos
altogether. (Indeed, my plans for an swscale replacement will most
likely also end up limiting the set of legal chroma locations to those
permissible by AVFrame properties)
2024-08-16 11:43:37 +02:00
Niklas Haas
18b9687308 avfilter/swscale: always fix interlaced chroma location
The current logic only fixes it when the user does not explicitly
specify the chroma location. However, this does not make a lot of sense.
Since there is no way to specify this property per-field, it effectively
*prevents* the user from being able to correctly scale interlaced frames
with top-aligned chroma.

It makes more sense to consider the user setting in the progressive case
only, and automatically adapt it to the correct interlaced field
positions, following the details of the MPEG specification.
2024-08-16 11:43:37 +02:00
Niklas Haas
6b40be941a swscale/options: relax src/dst_h/v_chr_pos value range
When dealing with 4x subsampling ratios (log2 == 2), such as can arise
with 4:1:1 or 4:1:0, a value range of 512 is not enough to cover the
range of possible scenarios.

For example, bottom-sited chroma in 4:1:0 would require an offset of 768
(three luma rows). Simply double the limit to 1024. I don't see any
place in initFilter() that would experience overflow as a result of this
change, especially since get_local_pos() right-shifts it by the
subsampling ratio again.
2024-08-16 11:43:37 +02:00
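The arithmetic behind 6b40be941a is small enough to check directly. The sketch below assumes 256 units per luma row (768 is described as three luma rows); the helper name and constants are illustrative only.

```python
UNITS_PER_LUMA_ROW = 256  # 768 is described as three luma rows

OLD_LIMIT = 512
NEW_LIMIT = 1024


def bottom_sited_offset(log2_sub):
    """Offset of the last luma row covered by one chroma sample."""
    rows_per_chroma_sample = 1 << log2_sub
    return (rows_per_chroma_sample - 1) * UNITS_PER_LUMA_ROW


offset = bottom_sited_offset(2)  # 4x subsampling, as in 4:1:0
print(offset, offset <= OLD_LIMIT, offset <= NEW_LIMIT)
```

With 4x subsampling the bottom-sited offset exceeds the old limit of 512 but fits within the new limit of 1024.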
Niklas Haas
5d964df5da avfilter/vf_setparams: remove unnecessary options bounds
AV_OPT_TYPE_CONST does not use min/max, we can leave them as 0.
2024-08-16 11:43:37 +02:00
Niklas Haas
201f1cba15 avfilter/vf_setparams: allow setting chroma location
Shockingly, there isn't currently _any_ filter for overriding this.
2024-08-16 11:43:37 +02:00
Niklas Haas
3e064f52eb swscale: document SWS_FULL_CHR_H_* flags
Based on my best understanding of what they do, given the source code.
2024-08-16 11:43:37 +02:00
Fei Wang
be7ab63552 lavc/qsvdec: Add vvc_mp4toannexb bsf for QSV VVC decoder
Fix error:
$ ffmpeg -hwaccel qsv -i input.mp4 -f null -
..
[vvc_qsv @ 0000026890D966C0] Error decoding stream header: unknown error (-1)
[vvc_qsv @ 0000026890D966C0] Error decoding header

Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2024-08-16 14:15:04 +08:00
Lynne
a797317ab1 vulkan_filter: don't require the storage flag for the base frames format
We check for whether subformats support storage immediately below.
Those are the ones we require storage for, rather than the base format
itself.

This permits better reuse of AVHWFrame contexts.

The patch also removes an always-false check in the subformat check.
2024-08-16 01:22:17 +02:00
Lynne
b165f144e7 vulkan_filter: allow reusing frame contexts with DRM tiling
There's no reason not to permit this, particularly if a user wants
to manipulate images which will be exported back to DRM.
2024-08-16 01:22:17 +02:00
Lynne
604dfdb44c hwcontext_vulkan: align host mapping size to minImportedHostPointerAlignment
This was left out of the recent rewrite of the system.
2024-08-16 01:22:16 +02:00
Lynne
18d964fc2c vulkan: enable encoding of images if video_maintenance1 is enabled
Vulkan encoding was designed in a very... consolidated way.
You had to know the exact codec and profile that the image was going to
eventually be encoded as at... image creation time. Unfortunately, as good
as our code is, glimpsing into the exact future isn't what it's capable of.

video_maintenance1 removed that requirement, which only then made encoding
images practically possible.
2024-08-16 01:22:16 +02:00
Lynne
46c13834b6 hwcontext_vulkan: enable VK_KHR_video_maintenance1
We require it for encoding.
2024-08-16 01:22:15 +02:00
Lynne
97e947a2a7 hwcontext_vulkan: setup extensions before features
The issue is that enabling features requires that the device
extension is supported. The extensions bitfield was set later,
so it was always 0, leading to no features being added.
2024-08-16 01:22:15 +02:00
Lynne
c3cbaf39bb hwcontext_vulkan: don't enable deprecated VK_KHR_sampler_ycbcr_conversion extension
It was added to Vulkan 1.1 a long time ago.
Validation layer will warn if this is enabled.
2024-08-16 01:22:15 +02:00
Lynne
3f65d24075 hwcontext_vulkan: fix user layers, add support for different debug modes
The validation layer option only supported GPU-assisted validation.
This is mutually exclusive with shader debug printfs, so we need to
differentiate between the two.

This also fixes issues with user-given layers, and leaks in case of
errors.
2024-08-16 01:22:14 +02:00
Lynne
869f4aec48 vulkan_decode: use the correct queue family for decoding ops
In 680d969a30, the new API was
used to find a queue family for dispatch, but the found queue
family was not used for decoding, just for dispatching.
2024-08-16 01:22:08 +02:00
Anton Khirnov
d566a37003 lavfi: move AVFilterLink.graph to FilterLink 2024-08-15 19:34:27 +02:00
Anton Khirnov
fb3efef1db lavfi: move AVFilterLink.frame_wanted_out to FilterLinkInternal 2024-08-15 19:34:27 +02:00