Commit Graph

2702 Commits

Author SHA1 Message Date
Ramiro Polla
ca889b1328 swscale/aarch64: add neon {lum,chr}ConvertRange16
aarch64 A55:
chrRangeFromJpeg16_1920_c:    32684.2
chrRangeFromJpeg16_1920_neon:  8431.2 (3.88x)
chrRangeToJpeg16_1920_c:      24996.8
chrRangeToJpeg16_1920_neon:    9395.0 (2.66x)
lumRangeFromJpeg16_1920_c:    17305.2
lumRangeFromJpeg16_1920_neon:  4586.5 (3.77x)
lumRangeToJpeg16_1920_c:      21144.8
lumRangeToJpeg16_1920_neon:    5069.8 (4.17x)

aarch64 A76:
chrRangeFromJpeg16_1920_c:    11523.8
chrRangeFromJpeg16_1920_neon:  3367.5 (3.42x)
chrRangeToJpeg16_1920_c:      11655.2
chrRangeToJpeg16_1920_neon:    4087.2 (2.85x)
lumRangeFromJpeg16_1920_c:     5762.0
lumRangeFromJpeg16_1920_neon:  1815.8 (3.17x)
lumRangeToJpeg16_1920_c:       5946.2
lumRangeToJpeg16_1920_neon:    2148.2 (2.77x)
2024-12-05 21:10:29 +01:00
Ramiro Polla
87052c0933 swscale/x86: add sse4 and avx2 {lum,chr}ConvertRange16
chrRangeFromJpeg16_1920_c:    3153.9
chrRangeFromJpeg16_1920_sse4: 1770.0 (1.78x)
chrRangeFromJpeg16_1920_avx2:  891.5 (3.54x)
chrRangeToJpeg16_1920_c:      3165.0
chrRangeToJpeg16_1920_sse4:   1953.2 (1.62x)
chrRangeToJpeg16_1920_avx2:    973.0 (3.25x)
lumRangeFromJpeg16_1920_c:    1298.5
lumRangeFromJpeg16_1920_sse4:  886.5 (1.46x)
lumRangeFromJpeg16_1920_avx2:  447.7 (2.90x)
lumRangeToJpeg16_1920_c:      1905.0
lumRangeToJpeg16_1920_sse4:    993.0 (1.92x)
lumRangeToJpeg16_1920_avx2:    498.9 (3.82x)
2024-12-05 21:10:29 +01:00
Ramiro Polla
6fe4a4ffb6 swscale/aarch64/range_convert: update neon range_convert functions to new API
aarch64 A55:
chrRangeFromJpeg8_1920_c:    28835.2 (1.00x)
chrRangeFromJpeg8_1920_neon:  5313.9 (5.43x)  5308.4 (5.43x)
chrRangeToJpeg8_1920_c:      23074.7 (1.00x)
chrRangeToJpeg8_1920_neon:    5551.3 (4.16x)  5549.2 (4.16x)
lumRangeFromJpeg8_1920_c:    15389.7 (1.00x)
lumRangeFromJpeg8_1920_neon:  3152.3 (4.88x)  3147.7 (4.89x)
lumRangeToJpeg8_1920_c:      19227.8 (1.00x)
lumRangeToJpeg8_1920_neon:    3628.7 (5.30x)  3630.2 (5.30x)

aarch64 A76:
chrRangeFromJpeg8_1920_c:    6324.4 (1.00x)
chrRangeFromJpeg8_1920_neon: 2344.5 (2.70x) 2304.2 (2.74x)
chrRangeToJpeg8_1920_c:      9656.0 (1.00x)
chrRangeToJpeg8_1920_neon:   2824.2 (3.42x) 2794.2 (3.46x)
lumRangeFromJpeg8_1920_c:    4422.0 (1.00x)
lumRangeFromJpeg8_1920_neon: 1104.5 (4.00x) 1106.2 (4.00x)
lumRangeToJpeg8_1920_c:      5949.1 (1.00x)
lumRangeToJpeg8_1920_neon:   1329.8 (4.47x) 1328.2 (4.48x)
2024-12-05 21:10:29 +01:00
Ramiro Polla
be108ebcf4 swscale/x86/range_convert: update sse2 and avx2 range_convert functions to new API
chrRangeFromJpeg8_1920_c:    2127.4 (1.00x)
chrRangeFromJpeg8_1920_sse2:  816.0 (2.61x)  813.5 (2.62x)
chrRangeFromJpeg8_1920_avx2:  408.9 (5.20x)  405.4 (5.25x)
chrRangeToJpeg8_1920_c:      3166.9 (1.00x)
chrRangeToJpeg8_1920_sse2:    815.0 (3.89x)  815.0 (3.89x)
chrRangeToJpeg8_1920_avx2:    404.5 (7.83x)  405.5 (7.81x)
lumRangeFromJpeg8_1920_c:    1263.0 (1.00x)
lumRangeFromJpeg8_1920_sse2:  411.0 (3.07x)  413.2 (3.06x)
lumRangeFromJpeg8_1920_avx2:  200.5 (6.30x)  201.9 (6.26x)
lumRangeToJpeg8_1920_c:      1886.8 (1.00x)
lumRangeToJpeg8_1920_sse2:    412.0 (4.58x)  408.9 (4.61x)
lumRangeToJpeg8_1920_avx2:    208.5 (9.05x)  205.7 (9.17x)
2024-12-05 21:10:29 +01:00
Ramiro Polla
384fe39623 swscale/range_convert: fix mpeg ranges in yuv range conversion for non-8-bit pixel formats
There is an issue with the constants used in YUV to YUV range conversion,
where the upper bound is not respected when converting to mpeg range.

With this commit, the constants are calculated at runtime, depending on
the bit depth. This approach also allows us to more easily understand how
the constants are derived.

For bit depths <= 14, the number of fixed point bits has been set to 14
for all conversions, to simplify the code.
For bit depths > 14, the number of fixed points bits has been raised and
set to 18, to allow for the conversion to be accurate enough for the mpeg
range to be respected.

The convert functions now take the conversion constants (coeff and offset)
as function arguments.
For bit depths <= 14, coeff is unsigned 16-bit and offset is 32-bit.
For bit depths > 14, coeff is unsigned 32-bit and offset is 64-bit.

x86_64:
chrRangeFromJpeg8_1920_c:    2127.4   2125.0  (1.00x)
chrRangeFromJpeg16_1920_c:   2325.2   2127.2  (1.09x)
chrRangeToJpeg8_1920_c:      3166.9   3168.7  (1.00x)
chrRangeToJpeg16_1920_c:     2152.4   3164.8  (0.68x)
lumRangeFromJpeg8_1920_c:    1263.0   1302.5  (0.97x)
lumRangeFromJpeg16_1920_c:   1080.5   1299.2  (0.83x)
lumRangeToJpeg8_1920_c:      1886.8   2112.2  (0.89x)
lumRangeToJpeg16_1920_c:     1077.0   1906.5  (0.56x)

aarch64 A55:
chrRangeFromJpeg8_1920_c:   28835.2  28835.6  (1.00x)
chrRangeFromJpeg16_1920_c:  28839.8  32680.8  (0.88x)
chrRangeToJpeg8_1920_c:     23074.7  23075.4  (1.00x)
chrRangeToJpeg16_1920_c:    17318.9  24996.0  (0.69x)
lumRangeFromJpeg8_1920_c:   15389.7  15384.5  (1.00x)
lumRangeFromJpeg16_1920_c:  15388.2  17306.7  (0.89x)
lumRangeToJpeg8_1920_c:     19227.8  19226.6  (1.00x)
lumRangeToJpeg16_1920_c:    15387.0  21146.3  (0.73x)

aarch64 A76:
chrRangeFromJpeg8_1920_c:    6324.4   6268.1  (1.01x)
chrRangeFromJpeg16_1920_c:   6339.9  11521.5  (0.55x)
chrRangeToJpeg8_1920_c:      9656.0   9612.8  (1.00x)
chrRangeToJpeg16_1920_c:     6340.4  11651.8  (0.54x)
lumRangeFromJpeg8_1920_c:    4422.0   4420.8  (1.00x)
lumRangeFromJpeg16_1920_c:   4420.9   5762.0  (0.77x)
lumRangeToJpeg8_1920_c:      5949.1   5977.5  (1.00x)
lumRangeToJpeg16_1920_c:     4446.8   5946.2  (0.75x)

NOTE: all simd optimizations for range_convert have been disabled.
      they will be re-enabled when they are fixed for each architecture.

NOTE2: the same issue still exists in rgb2yuv conversions, which is not
       addressed in this commit.
2024-12-05 21:10:29 +01:00
Ramiro Polla
58bcdeb742 swscale/aarch64/range_convert: saturate output instead of limiting input
aarch64 A55:
chrRangeFromJpeg8_1920_c:    28836.2 (1.00x)
chrRangeFromJpeg8_1920_neon:  5312.6 (5.43x)  5313.9 (5.43x)
chrRangeToJpeg8_1920_c:      44196.2 (1.00x)
chrRangeToJpeg8_1920_neon:    6034.6 (7.32x)  5551.3 (7.96x)
lumRangeFromJpeg8_1920_c:    15388.5 (1.00x)
lumRangeFromJpeg8_1920_neon:  3150.7 (4.88x)  3152.3 (4.88x)
lumRangeToJpeg8_1920_c:      23069.7 (1.00x)
lumRangeToJpeg8_1920_neon:    3873.2 (5.96x)  3628.7 (6.36x)

aarch64 A76:
chrRangeFromJpeg8_1920_c:     6334.7 (1.00x)
chrRangeFromJpeg8_1920_neon:  2264.5 (2.80x)  2344.5 (2.70x)
chrRangeToJpeg8_1920_c:      11474.5 (1.00x)
chrRangeToJpeg8_1920_neon:    2646.5 (4.34x)  2824.2 (4.06x)
lumRangeFromJpeg8_1920_c:     4453.2 (1.00x)
lumRangeFromJpeg8_1920_neon:  1104.8 (4.03x)  1104.5 (4.03x)
lumRangeToJpeg8_1920_c:       6645.0 (1.00x)
lumRangeToJpeg8_1920_neon:    1310.5 (5.07x)  1329.8 (5.00x)
2024-12-05 21:10:29 +01:00
Ramiro Polla
2d1358a84d swscale/range_convert: saturate output instead of limiting input
For bit depths <= 14, the result is saturated to 15 bits.
For bit depths > 14, the result is saturated to 19 bits.

x86_64:
chrRangeFromJpeg8_1920_c:    2126.5   2127.4  (1.00x)
chrRangeFromJpeg16_1920_c:   2331.4   2325.2  (1.00x)
chrRangeToJpeg8_1920_c:      3163.0   3166.9  (1.00x)
chrRangeToJpeg16_1920_c:     3163.7   2152.4  (1.47x)
lumRangeFromJpeg8_1920_c:    1262.2   1263.0  (1.00x)
lumRangeFromJpeg16_1920_c:   1079.5   1080.5  (1.00x)
lumRangeToJpeg8_1920_c:      1860.5   1886.8  (0.99x)
lumRangeToJpeg16_1920_c:     1910.2   1077.0  (1.77x)

aarch64 A55:
chrRangeFromJpeg8_1920_c:   28836.2  28835.2  (1.00x)
chrRangeFromJpeg16_1920_c:  28840.1  28839.8  (1.00x)
chrRangeToJpeg8_1920_c:     44196.2  23074.7  (1.92x)
chrRangeToJpeg16_1920_c:    36527.3  17318.9  (2.11x)
lumRangeFromJpeg8_1920_c:   15388.5  15389.7  (1.00x)
lumRangeFromJpeg16_1920_c:  15389.3  15388.2  (1.00x)
lumRangeToJpeg8_1920_c:     23069.7  19227.8  (1.20x)
lumRangeToJpeg16_1920_c:    19227.8  15387.0  (1.25x)

aarch64 A76:
chrRangeFromJpeg8_1920_c:    6334.7   6324.4  (1.00x)
chrRangeFromJpeg16_1920_c:   6336.0   6339.9  (1.00x)
chrRangeToJpeg8_1920_c:     11474.5   9656.0  (1.19x)
chrRangeToJpeg16_1920_c:     9640.5   6340.4  (1.52x)
lumRangeFromJpeg8_1920_c:    4453.2   4422.0  (1.01x)
lumRangeFromJpeg16_1920_c:   4414.2   4420.9  (1.00x)
lumRangeToJpeg8_1920_c:      6645.0   5949.1  (1.12x)
lumRangeToJpeg16_1920_c:     6005.2   4446.8  (1.35x)

NOTE: all simd optimizations for range_convert have been disabled
      except for x86, which already had the same behaviour.
      they will be re-enabled when they are fixed for each architecture.
2024-12-05 21:10:29 +01:00
Niklas Haas
2f95bc3cb3 swscale/utils: disable full_chr_h_input optimization for odd width
The basic problem here is that the rgb*ToUV_half_* functions hard-code a
bilinear downsample from src[i] + src[i+1], with no bounds check on the i+1
access.

Due to the signature of the function, we cannot easily plumb the "true" width
into the function body to perform a bounds check. Similarly, we cannot easily
pre-pad the input because it is typically reading from the (const) input
frame, which would require a full memcpy to pad. Either of these solutions are
more trouble than the feature is worth, so just disable it on odd input sizes.

Fixes: use of uninitialized value
Fixes: ticket #11265
Signed-off-by: Niklas Haas <git@haasn.dev>
Sponsored-by: Sovereign Tech Fund
2024-12-04 11:38:47 +01:00
Niklas Haas
79452d382f swscale/graph: fix memleak of cascaded graphs
Just free them directly and discard the parent context.

Fixes: bf738412e8
Signed-off-by: Niklas Haas <git@haasn.dev>
Sponsored-by: Sovereign Tech Fund
2024-12-04 11:38:30 +01:00
Michael Niedermayer
d32dcc07a7
swscale/swscale_unscaled: Fix odd height with nv24_to_yuv420p_chroma()
Fixes: out of array read
Fixes: 71726/clusterfuzz-testcase-ffmpeg_SWS_fuzzer-5876893532880896
Fixes: 377735917/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-6686071112400896

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Approved-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-12-04 04:23:48 +01:00
Michael Niedermayer
aeec39f3c1
swscale/slice: clear allocated memory in alloc_lines()
Fixes: use of uninitialized memory in hScale16To15_c()
Fixes: 373924007/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-5841199968092160

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-12-02 03:14:47 +01:00
Sean McGovern
b9eaf6e05c
swscale/ppc: disable YUV2RGB AltiVec acceleration
The FATE test 'checkasm-sw_yuv2rgb' currently fails on this platform,
in both little- and big-endian configurations with AltiVec enabled.

Disable it for the time being.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-12-02 02:51:39 +01:00
Rémi Denis-Courmont
da1ab7940e riscv: remove unnecessary #include's 2024-11-25 19:29:21 +02:00
Marvin Scholz
6b9f4f36f7 swscale/internal: fix typo in loongarch specific code
Regression from 2d077f9acd
2024-11-25 17:15:00 +01:00
Niklas Haas
3edd1e42b9 tests/swscale: add a benchmarking mode
With the ability to set the thread count as well. This benchmark includes
the constant overhead of context initialization.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-11-25 11:03:54 +01:00
Niklas Haas
59c39a79ca tests/swscale: rewrite on top of new API
This rewrite cleans up the code to use AVFrames and the new swscale API. The
log format has also been simplified and expanded to account for the new
options. (Not yet implemented)

The self testing code path has also been expanded to test the new swscale
implementation against the old one, to serve as an unchanging reference. This
does not accomplish much yet, but serves as a framework for future work.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-11-25 11:03:54 +01:00
Niklas Haas
2a091d4f2e swscale: introduce new, dynamic scaling API
As part of a larger, ongoing effort to modernize and partially rewrite
libswscale, it was decided and generally agreed upon to introduce a new
public API for libswscale. This API is designed to be less stateful, more
explicitly defined, and considerably easier to use than the existing one.

Most of the API work has been already accomplished in the previous commits,
this commit merely introduces the ability to use sws_scale_frame()
dynamically, without prior sws_init_context() calls. Instead, the new API
takes frame properties from the frames themselves, and the implementation is
based on the new SwsGraph API, which we simply reinitialize as needed.

This high-level wrapper also recreates the logic that used to live inside
vf_scale for scaling interlaced frames, enabling it to be reused more easily
by end users.

Finally, this function is designed to simply copy refs directly when nothing
needs to be done, substantially improving throughput of the noop fast path.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-11-25 11:03:50 +01:00
Niklas Haas
bf738412e8 swscale/graph: add new high-level scaler dispatch mechanism
This interface has been designed from the ground up to serve as a new
framework for dispatching various scaling operations at a high level. This
will eventually replace the old ad-hoc system of using cascaded contexts,
as well as allowing us to plug in more dynamic scaling passes requiring
intermediate steps, such as colorspace conversions, etc.

The starter implementation merely piggybacks off the existing sws_init() and
sws_scale(), functions, though it does bring the immediate improvement of
splitting up cascaded functions and pre/post conversion functions into
separate filter passes, which allows them to e.g. be executed in parallel
even when the main scaler is required to be single threaded. Additionally,
a dedicated (multi-threaded) noop memcpy pass substantially improves
throughput of that fast path.

Follow-up commits will eventually expand this to move all of the scaling
decision logic into the graph init function, and also eliminate some of the
current special cases.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-11-25 11:02:16 +01:00
Niklas Haas
c461dcf291 swscale/internal: expose sws_init_single_context() internally
Used by the graph API swscale wrapper, for now.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-11-25 11:02:16 +01:00
Niklas Haas
fb16964009 swscale: organize and better document flags
Group them into an enum rather than random #defines, and document their
behavior a bit more obviously.

Of particular note, I discovered that SWS_DIRECT_BGR is not referenced
anywhere else in the code base. As such, I have moved it to the deprecated
section, alongside SWS_ERROR_DIFFUSION.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-11-25 11:02:12 +01:00
Niklas Haas
6a91a165fd swscale: eliminate redundant SwsInternal accesses
This is a purely cosmetic commit aimed at replacing accesses to
SwsInternal.opts by direct access to SwsContext wherever convenient.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-11-25 10:59:52 +01:00
Niklas Haas
ed5dd67562 swscale: expose SwsContext publicly
Following in the footsteps of the work in the previous commit, it's now
relatively straightforward to expose the options struct publicly as
SwsContext. This is a step towards making this more user friendly, as
well as following API conventions established elsewhere.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-11-25 10:59:49 +01:00
Niklas Haas
2d077f9acd swscale/internal: group user-facing options together
This is a preliminary step to separating these into a new struct. This
commit contains no functional changes, it is a pure search-and-replace.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-11-21 12:49:56 +01:00
Niklas Haas
10d1be2621 swscale/internal: use static_assert for enforcing offsets
Instead of sprinkling av_assert0 into random init functions.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-11-21 12:47:43 +01:00
Niklas Haas
55d5eae411 swscale/options: cosmetic changes
Reorganize the list, fix whitespace, make indentation consistent, and
rename some descriptions for clarity, consistency or informativeness.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-11-21 12:47:14 +01:00
Rémi Denis-Courmont
1912c86af6 sws/range_convert: fix RISC-V chrFromJpeg 2024-11-17 11:28:21 +02:00
James Almer
2eb9c35010 x86/swscale: disable AVX2 yuv2nv12cX functions if accurate_rnd is requested
Signed-off-by: James Almer <jamrial@gmail.com>
2024-11-07 11:16:42 -03:00
James Almer
4047b887fc swscale/swscale_unscaled: add more unscaled planar RGB to planar RGB coverage
Signed-off-by: James Almer <jamrial@gmail.com>
2024-11-06 17:45:47 -03:00
James Almer
c5ebd56500 swscale/swscale_unscaled: add unscaled XV{36,48}LE <-> XV{36,48}BE
Signed-off-by: James Almer <jamrial@gmail.com>
2024-11-06 17:45:47 -03:00
James Almer
271aea60a4 fate/pixfmts: extend the high bit depth test
Also test 8bit formats, and try bitdepth conversion paths.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-11-06 17:44:25 -03:00
James Almer
e7382b4d01 swscale/swscale_unscaled: add unscaled x2rgb10le to packed RGB
Signed-off-by: James Almer <jamrial@gmail.com>
2024-11-06 17:34:32 -03:00
James Almer
ae8ef645ec swscale/swscale_unscaled: add unscaled x2rgb10le to planar RGB
Signed-off-by: James Almer <jamrial@gmail.com>
2024-11-06 17:34:31 -03:00
James Almer
5f5421ec66 swscale/swscale: prevent integer overflow in chrRangeToJpeg16_c
Same as it's done in lumRangeToJpeg16_c(). Plenty of allowed input values can
overflow here.

Fixes: src/libswscale/swscale.c:198:47: runtime error: signed integer overflow: 475328 * 4663 cannot be represented in type 'int'
Signed-off-by: James Almer <jamrial@gmail.com>
2024-11-02 15:01:31 -03:00
James Almer
c029a2f7dd swscale/swscale_unscaled: add unscaled rgb to planar rgba
The fate test reference changes are due to the conversion being a simple
lossless deinterleave, instead of going through a RGB -> YUV -> RGB roundtrip.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-11-02 15:01:31 -03:00
James Almer
5ccc3f0fca swscale/swscale_unscaled: add unscaled hbd planar RGB to x2rgb10le
Signed-off-by: James Almer <jamrial@gmail.com>
2024-11-02 15:01:31 -03:00
James Almer
febc9e8162 swscale/output: add full chroma interpolation support for x2rgb10
Signed-off-by: James Almer <jamrial@gmail.com>
2024-11-02 15:01:31 -03:00
James Almer
78ba06928a swscale/x86/rgb2rgb: add optimized versions of the remaining shuffle_bytes functions
Signed-off-by: James Almer <jamrial@gmail.com>
2024-11-02 15:01:31 -03:00
James Almer
a686d34fea swscale/swscale_unscaled: add unscaled conversion for AYUV/VUYA/UYVA
Signed-off-by: James Almer <jamrial@gmail.com>
2024-11-02 15:01:31 -03:00
Ramiro Polla
8b30daedf7 swscale/range_convert: indent after previous commit 2024-10-27 13:20:56 +01:00
Ramiro Polla
f7ee0195df swscale/range_convert: drop redundant conditionals from arch-specific init functions
These conditions are already checked for in the main init function.
2024-10-27 13:20:56 +01:00
Ramiro Polla
7728b3357d swscale/range_convert: call arch-specific init functions from main init function
This commit also fixes the issue that the call to ff_sws_init_range_convert()
from sws_init_swscale() was not setting up the arch-specific optimizations.
2024-10-27 13:20:56 +01:00
James Almer
a67ba3c132 swscale/output: add XV48 output support
Signed-off-by: James Almer <jamrial@gmail.com>
2024-10-26 00:04:50 -03:00
James Almer
2f13f74791 swscale/input: add XV48 input support
Signed-off-by: James Almer <jamrial@gmail.com>
2024-10-26 00:04:29 -03:00
Michael Niedermayer
3fe3014405
swscale/output: used unsigned for bit accumulation
Fixes: Integer overflow
Fixes: 368725672/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-5009093023563776

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-10-25 22:46:39 +02:00
Michael Niedermayer
14f5d67be3
swscale/rgb2rgb_template: Fix ff_rgb24toyv12_c() with odd height
Fixes: out of array access
Fixes: 368143798/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-6475823425585152

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-10-25 22:46:39 +02:00
Niklas Haas
67adb30322 swscale: rename SwsContext to SwsInternal
And preserve the public SwsContext as separate name. The motivation here
is that I want to turn SwsContext into a public struct, while keeping the
internal implementation hidden. Additionally, I also want to be able to
use multiple internal implementations, e.g. for GPU devices.

This commit does not include any functional changes. For the most part, it is
a simple rename. The only complications arise from the public facing API
functions, which preserve their current type (and hence require an additional
unwrapping step internally), and the checkasm test framework, which directly
accesses SwsInternal.

For consistency, the affected functions that need to maintain a distionction
have generally been changed to refer to the SwsContext as *sws, and the
SwsInternal as *c.

In an upcoming commit, I will provide a backing definition for the public
SwsContext, and update `sws_internal()` to dereference the internal struct
instead of merely casting it.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-10-24 22:50:00 +02:00
Niklas Haas
f1f54d2f82 swscale/x86: use dedicated int for self-modifying MMX dstW
I want to pull options out of SwsInternal, so we need to make this field
a dedicated int that gets updated as appropriate in ff_swscale().

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-10-23 23:12:23 +02:00
Niklas Haas
b03c758600 swscale: add sws_is_noop()
Exactly what it says on the tin.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-10-23 23:06:55 +02:00
Niklas Haas
5e50a56b9c swscale: add new frame testing API
Replacing the old sws_isSupported* API with a more consistent family
of functions that follows the same signature and naming convention,
including a placeholder for testing the color space parameters that
we don't currently implement conversions for.

These functions also perform some extra basic sanity checking.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-10-23 23:06:16 +02:00
Niklas Haas
e2637a083a swscale/utils: add SwsFormat abstraction and helpers
Groups together all relevant color metadata from an AVFrame. While we could
use AVFrame directly, keeping it a separate struct has three advantages:

1. Functions accepting an SwsFormat will definitely not care about the
   data pointers.
2. It clearly separates sanitized and raw metadata, since the function to
   construct an SwsFormat from an AVFrame will also sanitize.
3. It's slightly more lightweight to pass around.

Move these into a new header file "utils.h" to avoid crowding
swscale_internal.h even more, and also to solve a circular dependency issue
down the line.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-10-23 23:04:06 +02:00