Commit Graph

116783 Commits

Author SHA1 Message Date
Rémi Denis-Courmont
8030876d1c checkasm/riscv: align the landing pads 2024-07-25 23:10:14 +03:00
Rémi Denis-Courmont
7dde8be29f checkasm/riscv: add forward-edge CFI landing pads 2024-07-25 23:10:14 +03:00
Rémi Denis-Courmont
4f2472909e sws/riscv: add forward-edge CFI landing pads 2024-07-25 23:10:14 +03:00
Rémi Denis-Courmont
b5c111272b lavfi/riscv: add forward-edge CFI landing pads 2024-07-25 23:10:14 +03:00
Rémi Denis-Courmont
f2c30fe15a lavc/riscv: add forward-edge CFI landing pads 2024-07-25 23:10:14 +03:00
Rémi Denis-Courmont
a14d21a446 lavu/riscv: add forward-edge CFI landing pads 2024-07-25 23:10:14 +03:00
Rémi Denis-Courmont
6319601343 lavu/riscv: assembly for zicfilp LPAD
This instruction, if aligned on a 4-byte boundary, defines a valid target
("landing pad") for an indirect call or jump. Since this instruction is a
HINT, it is safe to assemble even if not included in the target
instruction set architecture.

The necessary alignment is already provided by the `func` macro. However
this still lacks the ELF attribute to indicate that the zicfilp is supported
in simple mode. This is left for future work as the ELF specification is not
ratified as of yet.

This will also nonobviously require the assembler to support zicfilp,
insofar as the `tail` pseudo-instruction shall clobber T2 (instead of T1) as
its temporary register.
2024-07-25 23:10:14 +03:00
Rémi Denis-Courmont
982376660c lavu/riscv: align functions to 4 bytes
Currently the start of the byte range for each function is aligned to
4 bytes. But this can lead to situations whence the function is preceded
by a 2-byte C.NOP at the aligned 4-byte boundary. Then the first actual
instruction and the function symbol are only aligned on 2 bytes.

This forcefully disables compression for the alignment and the symbol,
thus ensuring that there is no padding before the function.
2024-07-25 23:10:14 +03:00
Rémi Denis-Courmont
b62586e310 lavc/h264dsp: use RISC-V B extension
This saves one register and one instruction per transform.
add16 and add16intra thus become stack-less.
2024-07-25 23:10:01 +03:00
Rémi Denis-Courmont
45d7078a21 lavu/riscv: add CPU flag for B bit manipulations
The B extension was finally ratified in May 2024, encompassing:
- Zba (addresses),
- Zbb (basics) and
- Zbs (single bits).
It does not include Zbc (base-2 polynomials).
2024-07-25 23:09:58 +03:00
Rémi Denis-Courmont
529d423012 lavu/riscv: remove bespoke SH{1,2,3}ADD assembler
configure checks that the assembler supports the B extension (or rather
its constituents) anyway. These macros were dodging sanity checks for
unsupported instructions and nothing else.
2024-07-25 18:55:48 +03:00
Rémi Denis-Courmont
5f10173fa1 lavu/riscv: require B or zba explicitly 2024-07-25 18:55:48 +03:00
Rémi Denis-Courmont
e91a8cc4de sws/riscv: require B or zba explicitly 2024-07-25 18:55:48 +03:00
Rémi Denis-Courmont
9108f3e5e1 lavfi/riscv: require B or zba explicitly 2024-07-25 18:55:48 +03:00
Rémi Denis-Courmont
187d4d066a lavc/riscv: require B or zba explicitly 2024-07-25 18:55:48 +03:00
Rémi Denis-Courmont
7f97344bfb lavu/riscv: grok B as an extension
The RISC-V B bit manipulation extension was ratified only two months ago.
But it is strictly equivalent to the union of the zba, zbb and zbs
extensions which were defined almost 3 years earlier. Rather than require
new assembler, we can just match the extension name manually and translate
it into its constituent parts.
2024-07-25 18:55:48 +03:00
Rémi Denis-Courmont
1e7ab200ee lavu/riscv: allow any number of extensions
This reworks the func/endfunc macros to support any number of ISA extension
as parameters.
2024-07-25 18:55:48 +03:00
Araz Iusubov
2128c17739 avcodec/amf_enc: av1 cropping support
Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-25 10:22:02 -03:00
Rémi Denis-Courmont
896c22ef00 lavc/vp8dsp: fix RV32 stack alignment
SP must be a multiple of 16 bytes at all times on POSIX - even in leaf
functions - so that signal handlers have a properly aligned stack.
2024-07-24 22:08:54 +03:00
Jens Frederich
60b1750134
avdevice/dshow: Don't skip audio devices if no video device is present
The search of the current DirectShow device list has been customized so
that audio devices are always found even if no video device is connected.

Signed-off-by: Jens Frederich <jens.frederich@vector.com>
Reviewed-by: Roger Pack <rogerdpack2@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-24 14:45:20 +02:00
Martin Storsjö
97a708a507 checkasm: Increase the tolerance for ac3_sum_square_butterfly_float
Increase the tolerance from 10 ulp to 11 ulp. This fixes occasional
errors for some inputs; the errors could be reproduced on
aarch64/neon builds, with "checkasm --test=ac3dsp 3446175925".

Signed-off-by: Martin Storsjö <martin@martin.st>
2024-07-24 12:10:33 +03:00
Anton Khirnov
d1fa39d08d fftools/ffmpeg: prefer real errors over EOF in err_merge()
Fixes an issue in 6.1 when reading a corrupted file with -xerror would
exit with success. This specific issue is not present in master, but
this should generally be a more robust behaviour.

Reported-by: Andrej Peterka
2024-07-24 08:20:21 +02:00
Lynne
b1b69ccbc0
aacdec: set ac->output_elements upon channel element free
The issue is that ac->output_elements is populated from
ac->che, which may be freed, leaving dangling pointers in this
list.

Should fix clusterfuzz.
2024-07-24 00:32:38 +02:00
Michael Niedermayer
204f7f8cc7
avcodec/hdrenc: Allocate more space
This needs to be double checked or a checking way of writing should be used

Fixes: out of array access
Fixes: 70007/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HDR_fuzzer-5478704150020096

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-23 23:21:17 +02:00
Michael Niedermayer
5dde255abd
avcodec/cfhdenc: Height of 16 is not supported
Fixes: out of array access
Fixes: 68941/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_CFHD_fuzzer-5990952685600768

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-23 23:21:17 +02:00
Michael Niedermayer
a308d79e4d
avcodec/cfhdenc: Allocate more space
Fixes: Assertion failure
Fixes: 68979/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_CFHD_fuzzer-5375874714107904

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-23 23:21:16 +02:00
Michael Niedermayer
6420c1bf30
avcodec/osq: fix integer overflow when applying factor
Fixes: signed integer overflow: -35511773 * 256 cannot be represented in type 'int'
Fixes: 70406/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_OSQ_fuzzer-6545326804434944

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-23 23:21:16 +02:00
Michael Niedermayer
56c334d732
avcodec/osq: avoid using too large numbers for shifts and integers in update_residue_parameter()
Fixes: 2.96539e+09 is outside the range of representable values of type 'int'
Fixes: Assertion n>=0 && n<=32 failed at libavcodec/get_bits.h:423
Fixes: 62241/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_OSQ_fuzzer-4525761925873664
Fixes: 70406/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_OSQ_fuzzer-6545326804434944

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-23 23:21:16 +02:00
Michael Niedermayer
3cd077e282
avcodec/vaapi_encode: Check hwctx
Fixes: null pointer dereference
Fixes: 70376/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_H264_VAAPI_fuzzer-4733551250046976

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-23 23:21:15 +02:00
Michael Niedermayer
2f7aaa33e7
avcodec/aac/aacdec_lpd: Check kv indec
Fixes: index 9 out of bounds for type 'uint32_t [8][8]'
Fixes: 70363/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AAC_LATM_fuzzer-6723855293415424.fuzz

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-23 23:21:15 +02:00
Michael Niedermayer
ae20be8b5d
avcodec/aac/aacdec_usac: Dont leave invalid max_sfb in the context
Fixes: out of array read
Fixes: 70363/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_AAC_LATM_fuzzer-6723855293415424.fuzz

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-23 23:21:15 +02:00
Michael Niedermayer
d52fabaec0
avcodec/hevc/hevcdec: avoid signed shifts with lt_idx_sps
Fixes: left shift of 1 by 31 places cannot be represented in type 'int'
Fixes: 70122/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-5172200613675008

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-23 23:21:14 +02:00
Michael Niedermayer
419eee6356
avcodec/proresdec: Consider negative bits left
Fixes: 70036/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_PRORES_fuzzer-6298797647396864
Fixes: shift exponent 40 is too large for 32-bit type 'uint32_t' (aka 'unsigned int')

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-23 23:21:14 +02:00
Michael Niedermayer
6194cb87cb
avcodec/alsdec: Clear shift_value
(the exact issue is unreproducable but the use of uninitialized data is reproducable)

Should fix: signed integer overflow: -2147483648 - 127 cannot be represented in type 'int'
Should fix: 69881/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_ALS_fuzzer-4751301204836352

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-23 23:21:14 +02:00
Michael Niedermayer
5d9544cfb0
avcodec/hevc/hevcdec: Do not allow slices to depend on failed slices
An alternative would be to leave the context unchanged on failure of hls_slice_header()

Fixes: out of array access
Fixes: NULL pointer dereference
Fixes: 69584/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-5931086299856896
Fixes: 69724/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-5104066422702080
Fixes: 70422/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-5908731129298944

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-23 23:21:13 +02:00
Michael Niedermayer
586f6fda1d
avformat/mov: add an EOF check in IPRP
Fixes: Timeout
Fixes: 69230/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-6540512101203968

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-23 23:21:13 +02:00
Michael Niedermayer
55af81b5a4
Revert "avformat/udp: Fix temporary buffer race"
This is not needed

This reverts commit 7b2f67ea77.
2024-07-23 23:21:13 +02:00
Martin Storsjö
4acb9b7d10 aarch64: vvc: Fix unnecessary extra spaces
Signed-off-by: Martin Storsjö <martin@martin.st>
2024-07-23 16:04:28 +03:00
Martin Storsjö
99598629e8 aarch64: vvc: Consistently use # for immediate constants
Signed-off-by: Martin Storsjö <martin@martin.st>
2024-07-23 15:24:37 +03:00
Martin Storsjö
400843151d aarch64: vvc: Fix compilation of alf.S with MSVC 2022 17.7 and older
Use the "ldur" instruction explicitly, instead of having the
assembler implicitly convert "ldr" instructions to "ldur".

This fixes build errors like these:

libavcodec\aarch64\vvc\alf.o.asm(1023) : error A2518: operand 2: Memory offset must be aligned
        ldr             q22, [x3, #24]
libavcodec\aarch64\vvc\alf.o.asm(1024) : error A2518: operand 2: Memory offset must be aligned
        ldr             q24, [x2, #24]
libavcodec\aarch64\vvc\alf.o.asm(1393) : error A2518: operand 2: Memory offset must be aligned
        ldr             q22, [x3, #24]
libavcodec\aarch64\vvc\alf.o.asm(1394) : error A2518: operand 2: Memory offset must be aligned
        ldr             q24, [x2, #24]

Signed-off-by: Martin Storsjö <martin@martin.st>
2024-07-23 15:24:33 +03:00
aaron
53d0f9afb4 avcodec/electronicarts: decode framerate
Reviewed-by: Peter Ross <pross@xvid.org>
2024-07-23 06:40:30 +10:00
aaron
f44353cfb6 avcodec/adpcm: Mono ADPCM for EA WVE Files
Reviewed-by: Peter Ross <pross@xvid.org>
2024-07-23 06:40:30 +10:00
Rémi Denis-Courmont
0e32192548 lavu/riscv: do not fallback to AT_HWCAP auxillary vector
If __riscv_hwprobe() fails, then the kernel version is presumably too
old. There is not much point falling back to the auxillary vector.

- The Linux kernel requires I, so the flag is always set on Linux, and
  run-time detection is unnecessary. Our RISC-V assembler does anyway not
  support targets without I.

- Linux can compile with or without F and D, but it cannot perform
  run-time detection for them (a kernel with F support will not boot a
  processor without F). The run-time detection is thus useless in that
  case. Besides F and D extensions are used throughout the C code, so
  their run-time detection would not be practical.

- Support for V was added in a later kernel version than riscv_hwprobe(),
  so the system call will always be available if the kernel supports V.
  The only exception would be vendor kernel forks, but those are known to
  haphasardly pretend to support V on systems without actual V support, or
  with only pre-ratification binary-incompatible version. Furthermore, a
  large chunk of our optimisations require Zba and/or Zbb which cannot be
  detected with HWCAP in those kernels.

For what it is worth, OpenJDK already took a similar action. Note that this
keeps AT_HWCAP usage for platforms with neither C run-time <sys/hwprobe.h>
nor kernel <asm/hwprobe.h>, notably kernels other than Linux.
2024-07-22 19:43:51 +03:00
Zhao Zhili
2d4ef304c9 avcodec/vvc: Add aarch64 neon optimization for ALF
vvc_alf_filter_chroma_4x4_8_c: 3.0
vvc_alf_filter_chroma_4x4_8_neon: 1.0
vvc_alf_filter_chroma_4x4_10_c: 2.7
vvc_alf_filter_chroma_4x4_10_neon: 1.0
vvc_alf_filter_chroma_4x4_12_c: 2.7
vvc_alf_filter_chroma_4x4_12_neon: 1.0
vvc_alf_filter_chroma_8x8_8_c: 10.2
vvc_alf_filter_chroma_8x8_8_neon: 3.0
vvc_alf_filter_chroma_8x8_10_c: 10.0
vvc_alf_filter_chroma_8x8_10_neon: 2.5
vvc_alf_filter_chroma_8x8_12_c: 10.0
vvc_alf_filter_chroma_8x8_12_neon: 2.5
vvc_alf_filter_chroma_16x16_8_c: 41.7
vvc_alf_filter_chroma_16x16_8_neon: 11.2
vvc_alf_filter_chroma_16x16_10_c: 39.0
vvc_alf_filter_chroma_16x16_10_neon: 10.0
vvc_alf_filter_chroma_16x16_12_c: 40.2
vvc_alf_filter_chroma_16x16_12_neon: 10.2
vvc_alf_filter_chroma_32x32_8_c: 162.0
vvc_alf_filter_chroma_32x32_8_neon: 45.0
vvc_alf_filter_chroma_32x32_10_c: 155.5
vvc_alf_filter_chroma_32x32_10_neon: 39.5
vvc_alf_filter_chroma_32x32_12_c: 155.5
vvc_alf_filter_chroma_32x32_12_neon: 40.0
vvc_alf_filter_chroma_64x64_8_c: 646.0
vvc_alf_filter_chroma_64x64_8_neon: 175.5
vvc_alf_filter_chroma_64x64_10_c: 708.2
vvc_alf_filter_chroma_64x64_10_neon: 166.7
vvc_alf_filter_chroma_64x64_12_c: 619.2
vvc_alf_filter_chroma_64x64_12_neon: 157.2
vvc_alf_filter_chroma_128x128_8_c: 2611.5
vvc_alf_filter_chroma_128x128_8_neon: 698.2
vvc_alf_filter_chroma_128x128_10_c: 2470.0
vvc_alf_filter_chroma_128x128_10_neon: 616.0
vvc_alf_filter_chroma_128x128_12_c: 2531.5
vvc_alf_filter_chroma_128x128_12_neon: 620.2
vvc_alf_filter_luma_8x8_8_c: 25.2
vvc_alf_filter_luma_8x8_8_neon: 4.2
vvc_alf_filter_luma_8x8_10_c: 18.5
vvc_alf_filter_luma_8x8_10_neon: 4.0
vvc_alf_filter_luma_8x8_12_c: 19.0
vvc_alf_filter_luma_8x8_12_neon: 4.0
vvc_alf_filter_luma_16x16_8_c: 106.5
vvc_alf_filter_luma_16x16_8_neon: 16.2
vvc_alf_filter_luma_16x16_10_c: 75.2
vvc_alf_filter_luma_16x16_10_neon: 14.7
vvc_alf_filter_luma_16x16_12_c: 79.7
vvc_alf_filter_luma_16x16_12_neon: 14.7
vvc_alf_filter_luma_32x32_8_c: 400.5
vvc_alf_filter_luma_32x32_8_neon: 63.2
vvc_alf_filter_luma_32x32_10_c: 299.2
vvc_alf_filter_luma_32x32_10_neon: 57.7
vvc_alf_filter_luma_32x32_12_c: 299.2
vvc_alf_filter_luma_32x32_12_neon: 57.7
vvc_alf_filter_luma_64x64_8_c: 1602.5
vvc_alf_filter_luma_64x64_8_neon: 251.7
vvc_alf_filter_luma_64x64_10_c: 1197.0
vvc_alf_filter_luma_64x64_10_neon: 235.5
vvc_alf_filter_luma_64x64_12_c: 1220.2
vvc_alf_filter_luma_64x64_12_neon: 235.7
vvc_alf_filter_luma_128x128_8_c: 6570.2
vvc_alf_filter_luma_128x128_8_neon: 1007.7
vvc_alf_filter_luma_128x128_10_c: 4822.7
vvc_alf_filter_luma_128x128_10_neon: 936.2
vvc_alf_filter_luma_128x128_12_c: 4791.2
vvc_alf_filter_luma_128x128_12_neon: 938.5

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-07-22 21:09:56 +08:00
ha7sh17
172da370e7 doc/filters: fix outpad labels in libvmaf_cuda example 2024-07-22 13:00:04 +05:30
Rémi Denis-Courmont
9135dffd17 lavc/h264dsp: reduce spills in R-V V idct_add16 2024-07-21 22:39:45 +03:00
Rémi Denis-Courmont
245f76ad74 lavc/h264dsp: reuse the R-V V IDCT DC add functions
This reuses the DC bypass functions from the multiple IDCT functions, to
leverage vector code.

As an added bonus, the caller functions can now rely on the callee functions
to preserve their parameters, thus cutting down on stack spills.
2024-07-21 22:39:45 +03:00
Rémi Denis-Courmont
0a5b5bae89 lavc/h264dsp: correct VL and LMUL in idct_dc_add
T-Head C908 (cycles):
h264_idct4_dc_add_8bpp_c:        94.7
h264_idct4_dc_add_8bpp_rvv_i32:  55.0 (before)
h264_idct4_dc_add_8bpp_rvv_i32:  34.5 (after)
h264_idct4_dc_add_9bpp_c:        94.7
h264_idct4_dc_add_9bpp_rvv_i32:  43.5 (before)
h264_idct4_dc_add_9bpp_rvv_i32:  38.2 (after)
h264_idct4_dc_add_10bpp_c:       94.7
h264_idct4_dc_add_10bpp_rvv_i32: 43.5 (before)
h264_idct4_dc_add_10bpp_rvv_i32: 38.2 (after)
h264_idct4_dc_add_12bpp_c:       94.7
h264_idct4_dc_add_12bpp_rvv_i32: 43.7 (before)
h264_idct4_dc_add_12bpp_rvv_i32: 38.5 (after)
h264_idct4_dc_add_14bpp_c:       94.7
h264_idct4_dc_add_14bpp_rvv_i32: 43.7 (before)
h264_idct4_dc_add_14bpp_rvv_i32: 38.5 (after)
2024-07-21 22:39:45 +03:00
J. Dekker
c9dc2ad09b lavc/h264dsp: move R-V V idct_dc_add
No functional changes. This just moves the assembler so that it can be
referenced by other functions in h264idct_rvv.S with local jumps.

Edited-by: Rémi Denis-Courmont <remi@remlab.net>
2024-07-21 22:39:45 +03:00
Rémi Denis-Courmont
d15169c51f lavc/h264dsp: factor some mostly identical R-V V code 2024-07-21 22:39:45 +03:00