ffmpeg

Commit Graph

Author	SHA1	Message	Date
Rémi Denis-Courmont	25a33665a0	lavc/vp8dsp: remove unused macro parameter	2024-05-26 19:20:48 +03:00
Rémi Denis-Courmont	728a1dd3b6	lavc/rv34dsp: remove stray load immediate	2024-05-26 19:20:45 +03:00
sunyuechi	63697d3350	lavc/vp8dsp: R-V V put_epel hv C908: vp8_put_epel4_h4v4_c: 20.0 vp8_put_epel4_h4v4_rvv_i32: 11.0 vp8_put_epel4_h4v6_c: 25.2 vp8_put_epel4_h4v6_rvv_i32: 13.5 vp8_put_epel4_h6v4_c: 22.2 vp8_put_epel4_h6v4_rvv_i32: 14.5 vp8_put_epel4_h6v6_c: 29.0 vp8_put_epel4_h6v6_rvv_i32: 15.7 vp8_put_epel8_h4v4_c: 73.0 vp8_put_epel8_h4v4_rvv_i32: 22.2 vp8_put_epel8_h4v6_c: 90.5 vp8_put_epel8_h4v6_rvv_i32: 26.7 vp8_put_epel8_h6v4_c: 85.0 vp8_put_epel8_h6v4_rvv_i32: 27.2 vp8_put_epel8_h6v6_c: 104.7 vp8_put_epel8_h6v6_rvv_i32: 29.5 vp8_put_epel16_h4v4_c: 145.5 vp8_put_epel16_h4v4_rvv_i32: 26.5 vp8_put_epel16_h4v6_c: 190.7 vp8_put_epel16_h4v6_rvv_i32: 47.5 vp8_put_epel16_h6v4_c: 173.7 vp8_put_epel16_h6v4_rvv_i32: 33.2 vp8_put_epel16_h6v6_c: 222.2 vp8_put_epel16_h6v6_rvv_i32: 35.5 Amended to disable unsupported RV128. Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>	2024-05-26 15:15:28 +03:00
Rémi Denis-Courmont	0b2316e37f	lavc/sbrdsp: fix inverted boundary check 128-bit is the maximum, not the minimum here. Larger vector sizes can result in reads past the end of the noise value table. This partially reverts commit `cdcb4b98b7`.	2024-05-25 22:03:37 +03:00
Rémi Denis-Courmont	e6b38c944f	lavc/sbrdsp: fix potential overflow in noise table Since the SBR noise application optimisations are currently restricted to hardware with 128-bit vectors, and use a quadruple multiplier, they can load up to 16 32-bit elements. But the "loads" are of 2 segments, or 16 pairs of single precision float. Thus we need to expand the duplicated section of the noise table from 2x8 to 2x16 to avoid overflows.	2024-05-25 22:00:18 +03:00
Andreas Rheinhardt	e9197db4f7	tests/checkasm/vvc_alf: Don't use declare_func_emms VVC does not have MMX code at all, so one can use the stricter declare_func to also check that the MMX state has not been clobbered with (which would be an ABI violation). Reviewed-by: Martin Storsjö <martin@martin.st> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-05-25 14:21:54 +02:00
Andreas Rheinhardt	8e27bd025f	avformat/async,cache: Use more unique context names Otherwise Doxygen thinks any text like "Context for foo" is a link to the async protocol's struct called "Context". Reported-by: Andrew Sayers <ffmpeg-devel@pileofstuff.org> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-05-25 13:52:19 +02:00
Andreas Rheinhardt	edc235e076	avformat/riffenc: Fix outdated comment Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-05-25 13:52:05 +02:00
Andreas Rheinhardt	50c25d1f0a	avformat/matroskaenc: Check ff_put_wav_header() failure Fixes Coverity issue #1506706. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-05-25 13:51:58 +02:00
Andreas Rheinhardt	65763bffb6	avformat/mpegts: Don't use uninitialized value in av_log() It is undefined behaviour in (at least) C11 (see C11 6.3.2.1 (2)). Fixes Coverity issue #1500314. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-05-25 13:51:27 +02:00
Andreas Rheinhardt	d8cad01805	avformat/dhav: Check amount read Prevents potential use of uninitialized data in the following memcmp(). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-05-25 13:51:27 +02:00
Andreas Rheinhardt	cf6d07522a	avformat/dhav: Check ffio_ensure_seekback() Fixes Coverity issue #1492324. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-05-25 13:51:27 +02:00
Andreas Rheinhardt	95faf45af1	avformat/qoadec: Check ffio_ensure_seekback() Fixes Coverity issue #1598406. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-05-25 13:51:27 +02:00
Andreas Rheinhardt	6dc8d4eea8	avformat/westwood_vqa: Check ffio_ensure_seekback() Fixes Coverity issue #1598405. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-05-25 13:51:27 +02:00
Andreas Rheinhardt	590fffe6ad	avformat/gifdec: Check ffio_ensure_seekback() Fixes Coverity issue #1598400. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-05-25 13:51:27 +02:00
Andreas Rheinhardt	b47116be45	avformat/oggdec: Check ffio_ensure_seekback() Fixes Coverity issue #1492327. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-05-25 13:51:27 +02:00
Rémi Denis-Courmont	f883746587	lavc/flacdsp: do not assume maximum R-V VL This loop correctly assumes that VLMAX=16 (4x128-bit vectors with 32-bit elements) and 32 >= pred_order > 16. We need to alternate between VL=16 and VL=t2=pred_order-16 elements to add up to pred_order. The current code requests AVL=a2=pred_order elements. In QEMU and on the K230 hardware, this sets VL=16 as we need. But the specification merely guarantees that we get: ceil(AVL / 2) <= VL <= VLMAX. For instance, if pred_order equals 27, we could end up with VL=14 or VL=15 instead of VL=16. So instead, request literally VLMAX=16.	2024-05-25 10:31:50 +03:00
Andreas Rheinhardt	aff24c1658	avcodec/flacdec: Remove unused variable Forgotten in `0380a03f1f`. Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-05-24 19:05:57 +02:00
Rémi Denis-Courmont	ba38d0e328	lavc/pixblockdsp: add scalar get_pixels_unaligned The code is already there, we just need to use it. get_pixels_unaligned_c: 2.2 get_pixels_unaligned_misaligned: 1.7	2024-05-24 17:53:43 +03:00
Rémi Denis-Courmont	d03cdfa2b6	checkasm/riscv: test misaligned before V Otherwise V functions mask scalar misaligned ones.	2024-05-24 17:53:43 +03:00
James Almer	0920f506a7	checkasm/flacdsp: add a test for lpc33 Signed-off-by: James Almer <jamrial@gmail.com>	2024-05-24 09:23:00 -03:00
James Almer	0380a03f1f	avcodec/flacdsp: split off lpc33 into a dsp function Signed-off-by: James Almer <jamrial@gmail.com>	2024-05-24 09:23:00 -03:00
James Almer	62397bcf6a	avformat/movenc: add support for writing SA3D boxes Signed-off-by: James Almer <jamrial@gmail.com>	2024-05-23 19:06:46 -03:00
James Almer	8c97449482	avutil/channel_layout: add a helper function to get the ambisonic order of a layout Signed-off-by: James Almer <jamrial@gmail.com>	2024-05-23 12:07:19 -03:00
Haihao Xiang	8155808ce6	libavcodec/x86/vvc/vvc_sad: fix assembler error X86ASM libavcodec/x86/vvc/vvc_sad.o libavcodec/x86/vvc/vvc_sad.asm:85: error: invalid number of operands libavcodec/x86/vvc/vvc_sad.asm:87: error: invalid number of operands Signed-off-by: Haihao Xiang <haihao.xiang@intel.com> Signed-off-by: James Almer <jamrial@gmail.com>	2024-05-23 09:12:50 -03:00
Andreas Rheinhardt	ece95dc3dc	avfilter/af_atempo: Fix indentation Forgotten after `b8f74ee57a`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-05-23 10:45:55 +02:00
Andreas Rheinhardt	42e0e05834	avfilter/af_atempo: Simplify resetting The earlier code distinguished between a partial reset (yae_clear()) and a complete reset (yae_release_buffers() which also releases the buffers); this separation existed to avoid allocations, as buffers were reallocated on reconfigs. Yet it is pointless since `a5704659e3`, so simply use yae_release_buffers() everywhere. Reviewed-by: Pavel Koshevoy <pkoshevoy@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-05-23 10:45:25 +02:00
Andreas Rheinhardt	35e7fa0a2e	avfilter/af_atempo: Properly check av_tx_init() Fixes Coverity issue #1516804. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2024-05-23 10:45:16 +02:00
Stone Chen	2e877090f9	tests/checkasm: Add check_vvc_sad to vvc_mc.c Adds checkasm for DMVR SAD AVX2 implementation. Benchmarks ( AMD 7940HS ) vvc_sad_8x8_c: 50.3 vvc_sad_8x8_avx2: 0.3 vvc_sad_16x16_c: 250.3 vvc_sad_16x16_avx2: 10.3 vvc_sad_32x32_c: 1020.3 vvc_sad_32x32_avx2: 60.3 vvc_sad_64x64_c: 3850.3 vvc_sad_64x64_avx2: 220.3 vvc_sad_128x128_c: 14100.3 vvc_sad_128x128_avx2: 840.3 Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2024-05-22 20:36:46 -03:00
Stone Chen	0e52a4e434	libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC Implements AVX2 DMVR (decoder-side motion vector refinement) SAD functions. DMVR SAD is only calculated if w >= 8, h >= 8, and w * h > 128. To reduce complexity, SAD is only calculated on even rows. This is calculated for all video bitdepths, but the values passed to the function are always 16bit (even if the original video bitdepth is 8). The AVX2 implementation uses min/max/sub. Additionally this changes parameters dx and dy from int to intptr_t. This allows dx & dy to be used as pointer offsets without needing to use movsxd. Benchmarks ( AMD 7940HS ) Before: BQTerrace_1920x1080_60_10_420_22_RA.vvc \| 106.0 \| Chimera_8bit_1080P_1000_frames.vvc \| 204.3 \| NovosobornayaSquare_1920x1080.bin \| 197.3 \| RitualDance_1920x1080_60_10_420_37_RA.266 \| 174.0 \| After: BQTerrace_1920x1080_60_10_420_22_RA.vvc \| 109.3 \| Chimera_8bit_1080P_1000_frames.vvc \| 216.0 \| NovosobornayaSquare_1920x1080.bin \| 204.0\| RitualDance_1920x1080_60_10_420_37_RA.266 \| 181.7 \| Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2024-05-22 20:36:21 -03:00
James Almer	3146b77a7d	avformat/mov: store sample_sizes as unsigned ints As defined in Section 8.7.3.2.1 of ISO 14496-12. Any unsupported value will be rejected in mov_build_index() without outright aborting demuxing. Fixes ticket #11005. Signed-off-by: James Almer <jamrial@gmail.com>	2024-05-22 17:46:49 -03:00
James Almer	2d84ee3745	avformat/vvc: fix parsing sps_subpic_id The length of the sps_subpic_id[i] syntax element is sps_subpic_id_len_minus1 + 1 bits. Signed-off-by: James Almer <jamrial@gmail.com>	2024-05-22 17:46:49 -03:00
James Almer	3bd7e3a336	avformat/vvc: initialize some ptl flags Signed-off-by: James Almer <jamrial@gmail.com>	2024-05-22 17:46:49 -03:00
Rémi Denis-Courmont	910d281b21	lavc/h263dsp: R-V V {h,v}_loop_filter Since the horizontal and vertical filters are identical except for a transposition, this uses a common subprocedure with an ad-hoc ABI. To preserve return-address stack prediction, a link register has to be used (cf. the "Control Transfer Instructions" from the RISC-V ISA Manual). The alternate/temporary link register T0 is used here, so that the normal RA is preserved (something Arm cannot do!). To load the strength value based on `qscale`, the shortest possible and PIC-compatible sequence is used: AUIPC; ADD; LBU. The classic LLA; ADD; LBU sequence would add one more instruction since LLA is a convenience alias for AUIPC; ADDI. To ensure that this trick works, relocation relaxation is disabled. To implement the two signed divisions by a power of two toward zero: (x / (1 << SHIFT)) the code relies on the small range of integers involved, computing: (x + (x >> (16 - SHIFT))) >> SHIFT rather than the more general: (x + ((x >> (16 - 1)) & ((1 << SHIFT) - 1))) >> SHIFT Thus one ANDI instruction is avoided. T-Head C908: h263dsp.h_loop_filter_c: 228.2 h263dsp.h_loop_filter_rvv_i32: 144.0 h263dsp.v_loop_filter_c: 242.7 h263dsp.v_loop_filter_rvv_i32: 114.0 (C is probably worse in real use due to less predictable branches.)	2024-05-22 19:15:39 +03:00
James Almer	3d1597d3e2	x86/vvc_alf: use the x86inc instruction macros Let its magic figure out the correct mnemonic based on target instruction set. Signed-off-by: James Almer <jamrial@gmail.com>	2024-05-22 20:51:30 +08:00
llyyr	d1b96c3808	avformat/mov: avoid seeking back to 0 on HEVC open GOP files `ab77b878f1` attempted to fix the issue of broken packets being sent to the decoder by implementing logic that kept attempting to PTS-step backwards until it reached a valid point, however applying this heuristic meant that in files that had no valid points (such as HEVC videos shot on iPhones), we'd seek back to sample 0 on every seek attempt. This meant that files that were previously seekable, albeit with some skipped frames, were not seekable at all now. Relax this heuristic a bit by giving up on seeking to a valid point if we've tried a different sample and we still don't have a valid point to seek to. This may cause some frames to be skipped on seeking but it's better than not being able to seek at all in such files. Fixes: `ab77b878f1` ("avformat/mov: fix seeking with HEVC open GOP files") Fixes: #10585 Signed-off-by: Philip Langdale <philipl@overt.org>	2024-05-21 18:57:44 -07:00
sunyuechi	0c1304ae11	lavc/vp9dsp: R-V V mc avg C908: vp9_avg4_8bpp_c: 1.2 vp9_avg4_8bpp_rvv_i64: 1.0 vp9_avg8_8bpp_c: 3.7 vp9_avg8_8bpp_rvv_i64: 1.5 vp9_avg16_8bpp_c: 14.7 vp9_avg16_8bpp_rvv_i64: 3.5 vp9_avg32_8bpp_c: 57.7 vp9_avg32_8bpp_rvv_i64: 10.0 vp9_avg64_8bpp_c: 229.0 vp9_avg64_8bpp_rvv_i64: 31.7 Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>	2024-05-21 21:28:14 +03:00
Rémi Denis-Courmont	7591eb4055	Revert "lavc/sbrdsp: R-V V neg_odd_64" While this function can easily be written with vectors, it just fails to get any performance improvement. For reference, this is a simpler loop-free implementation that does get better performance than the current one depending on hardware, but still more or less the same metrics as the C code: func ff_sbr_neg_odd_64_rvv, zve64x li a1, 32 addi a0, a0, 7 li t0, 8 vsetvli zero, a1, e8, m2, ta, ma li t1, 0x80 vlse8.v v8, (a0), t0 vxor.vx v8, v8, t1 vsse8.v v8, (a0), t0 ret endfunc This reverts commit `d06fd18f8f`.	2024-05-21 21:26:39 +03:00
Rémi Denis-Courmont	d452db8410	lavc/vc1dsp: R-V V vc1_unescape_buffer Notes: - The loop is biased toward no unescaped bytes as that should be most common. - The input byte array is slid rather than the (8 times smaller) bit-mask, as RISC-V V does not provide a bit-mask (or bit-wise) slide instruction. - There are two comparisons with 0 per iteration, for the same reason. - In case of match, bytes are copied until the first match, and the loop is restarted after the escape byte. Vector compression (vcompress.vm) could discard all escape bytes but that is slower if escape bytes are rare. Further optimisations should be possible, e.g.: - processing 2 bytes fewer per iteration to get rid of 2 slides, - taking a short cut if the input vector contains less than 2 zeroes. But this is a good starting point: T-Head C908: vc1dsp.vc1_unescape_buffer_c: 12749.5 vc1dsp.vc1_unescape_buffer_rvv_i32: 6009.0 SpacemiT X60: vc1dsp.vc1_unescape_buffer_c: 11038.0 vc1dsp.vc1_unescape_buffer_rvv_i32: 2061.0	2024-05-21 21:16:30 +03:00
Martin Storsjö	6093367147	checkasm: h264dsp: Avoid out of buffer writes when benchmarking The loop filters can write before the pointer given to them; the actual test invocations correctly used an offset, while the benchmark calls were lacking an offset. Therefore, when running with benchmarking, these tests could have spurious failures. Signed-off-by: Martin Storsjö <martin@martin.st>	2024-05-21 19:20:06 +03:00
Lynne	d43e123837	checkasm: print bench runs when benchmarking Helps make sense of the possible noise in the results.	2024-05-21 17:48:48 +02:00
J. Dekker	b1adf6d1d0	checkasm: add runs argument to adjust during bench Some timers on certain device and test combinations can produce noisy results, affecting the reliability of performance measurements. One notable example of this is the Canaan K230 RISC-V development board. An option to adjust the number of samples by an exponent (--runs) has been added, allowing developers to increase the sample count for more reliable results. Signed-off-by: J. Dekker <jdek@itanimul.li>	2024-05-21 16:47:45 +02:00
Martin Storsjö	a9dc7dd7fd	checkasm: vvc_alf: Limit benchmarking to a reasonable subset of functions Don't benchmark every single combination of widths and heights; only benchmark cases which are squares (like in vvc_mc.c). Contrary to vvc_mc, which increases sizes by doubling dimensions, vvc_alf tests all sizes in increments of 4. Limit benchmarking to the cases which are powers of two. This reduces the number of benchmarked cases from 3072 down to 18.	2024-05-21 20:20:50 +08:00
Nuo Mi	b8eb8b4f19	Changelog: add DVB compatible information for VVC decoder see https://dvb.org/specifications/verification-validation/vvc-test-content/	2024-05-21 20:20:25 +08:00
Nuo Mi	1b33c9a50a	avcodec/vvcdec: support Reference Picture Resampling passed clips: RPR_A_Alibaba_4.bit RPR_B_Alibaba_3.bit RPR_C_Alibaba_3.bit RPR_D_Qualcomm_1.bit VVC_HDR_UHDTV1_OpenGOP_Max3840x2160_50fps_HLG10_res_change_with_RPR.ts	2024-05-21 20:20:25 +08:00
Nuo Mi	cae0b01282	avcodec/vvcdec: increase edge_emu_buffer for RPR	2024-05-21 20:20:25 +08:00
Nuo Mi	7904ec2d34	avcodec/vvcdec: refact, remove hf_idx and vf_idx from mc_xxx's param list	2024-05-21 20:20:25 +08:00
Nuo Mi	77d971c348	avcodec/vvcdec: refact out luma_prof from luma_prof_bi	2024-05-21 20:20:25 +08:00
Nuo Mi	ac4575594f	avcodec/vvcdec: fix dmvr, bdof, cb_prof for RPR	2024-05-21 20:20:25 +08:00
Nuo Mi	77acd0a0dd	avcodec/vvcdec: inter, wait reference with a different resolution For RPR, the current frame may reference a frame with a different resolution. Therefore, we need to consider frame scaling when we wait for reference pixels.	2024-05-21 20:20:25 +08:00
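
The two division expressions quoted in commit `910d281b21` can be compared in plain C. This is only a sketch, not the committed assembly: it assumes 16-bit elements, assumes that the inner shift in the short form is a logical (unsigned) shift, which is the reading under which both forms agree, and assumes that right-shifting a negative `int` is arithmetic (implementation-defined in C, true on common compilers). The function names are hypothetical.

```c
#include <stdint.h>

/* General form: the arithmetic shift by 15 yields all ones for negative x,
 * which is then masked down to the rounding bias (1 << shift) - 1. */
static int16_t div_pow2_general(int16_t x, unsigned shift)
{
    int bias = (x >> 15) & ((1 << shift) - 1);
    return (int16_t)((x + bias) >> shift);
}

/* Short form: a logical shift of the 16-bit pattern by (16 - shift) keeps
 * only the top `shift` bits. They are all ones for small negative x and
 * all zeroes for small non-negative x, so no mask (ANDI) is needed. */
static int16_t div_pow2_small(int16_t x, unsigned shift)
{
    int bias = (uint16_t)x >> (16 - shift);
    return (int16_t)((x + bias) >> shift);
}

/* Count the inputs in [-2^(16-shift), 2^(16-shift)) where either form
 * disagrees with C's truncating division. */
static int count_mismatches(unsigned shift)
{
    int limit = 1 << (16 - shift);
    int bad = 0;

    for (int x = -limit; x < limit; x++) {
        int ref = x / (1 << shift);
        if (div_pow2_general((int16_t)x, shift) != ref ||
            div_pow2_small((int16_t)x, shift) != ref)
            bad++;
    }
    return bad;
}
```

Calling `count_mismatches(shift)` for a shift between 1 and 15 checks the range in which the top `shift` bits of `x` are pure sign extension; outside that range the short form is not guaranteed to round toward zero.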

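
The loop-free assembly quoted in the revert commit `7591eb4055` flips one byte every 8 bytes, starting at offset 7, for 32 elements. A scalar C sketch of the same idea follows, assuming little-endian IEEE-754 single-precision floats (so that offset 7 is the sign byte of the float at index 1). The function names are hypothetical and this is not FFmpeg's C implementation.

```c
#include <stdint.h>

/* Byte-wise variant mirroring the strided load/XOR/store: toggle bit 7 of
 * the most significant byte of every odd-indexed float.
 * Assumes little-endian IEEE-754 binary32 storage. */
static void neg_odd_64_bytes(float *x)
{
    uint8_t *p = (uint8_t *)x + 7; /* sign byte of x[1] */

    for (int i = 0; i < 32; i++, p += 8)
        *p ^= 0x80;
}

/* Plain reference: negate the 32 odd-indexed elements out of 64. */
static void neg_odd_64_ref(float *x)
{
    for (int i = 1; i < 64; i += 2)
        x[i] = -x[i];
}
```

Both functions modify the array in place, so applying either one twice restores the input.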

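
For readers unfamiliar with what `vc1_unescape_buffer` in commit `d452db8410` has to do, here is a scalar sketch of the escape-byte removal. It assumes the usual VC-1 rule that a 0x03 byte is dropped when it follows two zero bytes and precedes a byte smaller than 4; that rule and the function name are assumptions for illustration, not text from the commit, and this is not the vectorised routine.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy src to dst, dropping each escape byte: a 0x03 preceded by two zero
 * bytes and followed by a byte below 4. dst must hold at least `size`
 * bytes. Returns the number of bytes written. */
static size_t unescape_sketch(const uint8_t *src, size_t size, uint8_t *dst)
{
    size_t n = 0;

    for (size_t i = 0; i < size; i++) {
        if (src[i] == 3 && i >= 2 && src[i - 1] == 0 && src[i - 2] == 0 &&
            i + 1 < size && src[i + 1] < 4)
            continue; /* skip the escape byte itself */
        dst[n++] = src[i];
    }
    return n;
}
```

The common case copies every byte unchanged, which matches the commit's note that the vector loop is biased toward input without escape bytes.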
115394 Commits (All Branches)