Commit Graph

116477 Commits

Author SHA1 Message Date
James Almer ab5c612137 avcodec/Makefile: use the correct path for aacdec_fixed.o when setting its dependencies
Fixes ticket #11112

Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-31 11:32:56 -03:00
Anton Khirnov 43f702a253 lavfi/framesync: avoid forcing frame writability unnecessarily
Callers of ff_framesync_get_frame() generally do not expect the result
to be writable, those that do (e.g. ff_framesync_dualinput_get_writable())
ensure writability themselves.

Significantly reduces memory consumption in complex graphs with
framesync-based filters (e.g. scale, ssim).

Reported-By: Mark Shwartzman
2024-07-31 11:12:45 +02:00
Rémi Denis-Courmont 262168b04e lavc/videodsp: RISC-V zicbop prefetch
There are currently no ways to run-time detect the CPU capability, so we
take it for granted (in the worst case, it will execute NOPs).
2024-07-30 18:41:51 +03:00
Rémi Denis-Courmont 4570b9f3c4 configure: check if assembler supports RV zicbop
zicbop is the Cache Block Operation, Prefetch extension to RVI.
2024-07-30 18:41:51 +03:00
Rémi Denis-Courmont 324eba69f7 lavc/vc1dsp: use saturating arithmetic for RVV inv_trans_dc
T-Head C908 (cycles):
vc1dsp.vc1_inv_trans_4x4_dc_c:      113.7
vc1dsp.vc1_inv_trans_4x4_dc_rvv_i32: 46.5 (before)
vc1dsp.vc1_inv_trans_4x4_dc_rvv_i32: 45.5 (after)
vc1dsp.vc1_inv_trans_4x8_dc_c:      230.7
vc1dsp.vc1_inv_trans_4x8_dc_rvv_i32: 65.7 (before)
vc1dsp.vc1_inv_trans_4x8_dc_rvv_i32: 52.5 (after)
vc1dsp.vc1_inv_trans_8x4_dc_c:      246.7
vc1dsp.vc1_inv_trans_8x4_dc_rvv_i64: 56.7 (before)
vc1dsp.vc1_inv_trans_8x4_dc_rvv_i64: 45.5 (after)
vc1dsp.vc1_inv_trans_8x8_dc_c:      419.7
vc1dsp.vc1_inv_trans_8x8_dc_rvv_i64: 81.2 (before)
vc1dsp.vc1_inv_trans_8x8_dc_rvv_i64: 53.5 (after)
2024-07-30 18:41:51 +03:00
Rémi Denis-Courmont 784a72a116 lavc/vc1dsp: unify R-V V DC bypass functions 2024-07-30 18:41:51 +03:00
Rémi Denis-Courmont bd0c3edb13 lavu/riscv: count bytes rather than words for bswap32
This removes the dependency on Zba at essentially zero cost.
2024-07-30 18:41:51 +03:00
Rémi Denis-Courmont 5171baa228 lavc/ac3dsp: fix R-V CPU requirements
It probably will not matter on any real hardware, but the Zbb optimisations
do not require Zba. And then, we need HAVE_RVV to build the RVV stuff.
2024-07-30 18:41:51 +03:00
Peter Ross 0e09f6d690 avcodec/adpcm: only process right samples when decoding stereo
Fixes Coverity issue #1610760.
2024-07-30 19:55:31 +10:00
Leo Izen 7bb5626fa7
avcodec/pngenc: fix sBIT writing for indexed-color PNGs
We currently write invalid sBIT entries for indexed PNGs, which by PNG
specification[1] must be 3-bytes long. The values also are capped at 8
for indexed-color PNGs, not the palette depth. This patch fixes both of
these issues previously fixed in the decoder, but not the encoder.

[1]: https://www.w3.org/TR/png-3/#11sBIT

Regression since: c125860892.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
Reported-by: Ramiro Polla: <ramiro.polla@gmail.com>
2024-07-30 05:43:36 -04:00
Leo Izen 825606641b
avcodec/pngdec: use 8-bit sBIT cap for indexed PNGs per spec
The PNG specification[1] says that sBIT entries must be at most the bit
depth specified in IHDR, unless the PNG is indexed-color, in which case
sBIT must be between 1 and 8. We should not reject valid sBITs on PNGs
with indexed color.

[1]: https://www.w3.org/TR/png-3/#11sBIT

Regression since 84b454935f.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
Reported-by: Ramiro Polla <ramiro.polla@gmail.com>
2024-07-30 05:43:31 -04:00
Marth64 e2105b2800
avcodec/aacenc: Correct option description for aac_coder fast
The description advertises fast as "Default fast search", but
this has not been the default for a long time (current default
is twoloop).

Signed-off-by: Marth64 <marth64@proxyid.net>
2024-07-30 05:42:50 -04:00
Fei Wang 79b4869959 lavu/hwcontext_qsv: Derive bind flag from frame type if no valid surface
Fix cmd:
ffmpeg.exe -init_hw_device d3d11va=d3d -init_hw_device qsv=qsv@d3d \
-filter_hw_device d3d -hwaccel qsv -hwaccel_output_format qsv      \
-i in.h264 -vf "hwmap,format=d3d11,hwdownload,format=nv12" -y out.yuv

Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Tested-by: Tong Wu <wutong1208@outlook.com>
2024-07-30 13:41:15 +08:00
Fei Wang d30a9fdc80 lavc/qsvdec: Add VVC decoder
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2024-07-30 13:40:21 +08:00
Fei Wang cf9c398fc1 configure: Alphabetical order for av1 codecs
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2024-07-30 13:32:44 +08:00
James Almer 9e7a93c6fd x86/intreadwrite: add SSE2 optimized AV_COPY128U
Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-29 23:17:52 -03:00
James Almer 92b317245c avformat/mov: use AV_WL*A
Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-29 21:33:31 -03:00
James Almer f1fcc3ca5f avformat/matroskadec: use AV_WL32A
Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-29 21:33:31 -03:00
James Almer 09de979ff6 avcodec/amfenc_av1: use AV_WL32A
Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-29 21:33:31 -03:00
James Almer 753f2aeed7 avutil/intreadwrite: add missing aligned read/write macros
Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-29 21:33:31 -03:00
Rémi Denis-Courmont 7b24f96c87 lavc/vp9dsp: remove R-V I intra functions
At this point, they are identical to the C code, except for instruction
ordering. In fact, they are typically slower or no faster than the C code.
2024-07-29 21:16:41 +03:00
Rémi Denis-Courmont 7aa6510fe1 lavc/vp9dsp: copy 8 pixels at once
In the 8-bit case, we can actually read/write 8 aligned pixel values per
load/store, which unsurprisingly tends to be faster on 64-bit systems (and
makes no differences on 32-bit systems). This requires ifdef'ing though.
2024-07-29 21:16:41 +03:00
Rémi Denis-Courmont c98127c00e lavc/vp9dsp: use restrict qualifier for copy/avg MC
Same as previous commit.
2024-07-29 21:16:41 +03:00
Rémi Denis-Courmont 56fc5fc6ce lavc/vp9dsp: restrict vertical intra pointers
This lets the compiler unroll ever so slightly better (at least in the
16x16 case for RISC-V GCC).
2024-07-29 21:16:41 +03:00
James Almer afb06aef7e avcodec/decode: remove unused argument from ff_frame_new_side_data_from_buf()
Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-29 14:00:48 -03:00
James Almer e7d3ff8dcd avformat/mov: check that child boxes of trak are only present inside it
Based on the check done for the stco box.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-28 17:28:19 -03:00
James Almer 2aa63784b5 avformat/mov: check that sample and chunk count is 1 for HEIF
Fixes NULL pointer dereference in broken/fuzzed streams.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-28 17:28:19 -03:00
Rémi Denis-Courmont 39ced529b0 lavu/riscv: implement floating point clips
Unlike x86, fmin/fmax are single instructions, not function calls. They
are much much faster than doing a comparison, then branching based on its
results. With this, audiodsp.vector_clipf gets almost twice as fast, and
a properly unrollled version of it gets 4-5x faster, on SiFive-U74.
This is only the low-hanging fruit: FFMIN and FFMAX are presumably
affected as well.

This likely applies to other instruction sets with native IEEE floats,
especially those lacking a conditional select instruction.
2024-07-28 21:24:58 +03:00
Rémi Denis-Courmont b0b3bea10b lavc/h264dsp: use saturing add/sub for R-V V 8-bit DC add
T-Head C908 (cycles):
h264_idct4_dc_add_8bpp_c:      109.2
h264_idct4_dc_add_8bpp_rvv_i32: 34.5 (before)
h264_idct4_dc_add_8bpp_rvv_i32: 25.5 (after)
h264_idct8_dc_add_8bpp_c:      418.7
h264_idct8_dc_add_8bpp_rvv_i64: 69.5 (before)
h264_idct8_dc_add_8bpp_rvv_i64: 33.5 (after)
2024-07-28 21:24:12 +03:00
Shiyou Yin 4713a5cc24
swscale: [loongarch] Fix checkasm-sw_yuv2rgb failure.
Reviewed-by: 陈昊 <chenhao@loongson.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-28 19:02:16 +02:00
Michael Niedermayer 3b9c6c7fbb
avcodec/adpcm: Remove setting min_channel to value it is already set to
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-28 19:02:15 +02:00
Tong Wu b1d410716b lavc/d3d12va_encode: trim header alignment at output
It is d3d12va's requirement that the FrameStartOffset must be aligned as
per hardware limitation. However, we could trim this alignment at output
to reduce coded size. A aligned_header_size is added to
D3D12VAEncodePicture.

Signed-off-by: Tong Wu <wutong1208@outlook.com>
2024-07-28 17:50:30 +02:00
Rémi Denis-Courmont 9b4655c3a1 lavc/vp8dsp: use saturating add/sub for R-V V DC add
T-Head C908 (cycles):
vp7_idct_dc_add_c:          108.5
vp7_idct_dc_add_rvv_i32:     56.2 (before)
vp7_idct_dc_add_rvv_i32:     47.2 (after)
vp8_idct_dc_add_c:           96.2
vp8_idct_dc_add_rvv_i32:     43.0 (before)
vp8_idct_dc_add_rvv_i32:     34.0 (after)
2024-07-28 17:37:21 +03:00
Rémi Denis-Courmont bbfc0ac9ca lavc/riscv: don't set vxrm if unnecessary
While narrowing clip is nominally a rounding operation, the rounding mode
has no arithmetic consequence if the right shift is by zero bits.
2024-07-28 17:37:21 +03:00
Niklas Haas e42a0763b7 avcodec/dovi_rpudec: clarify semantics
ff_dovi_rpu_parse() and ff_dovi_rpu_generate() are a bit inconsistent in
that they expect different levels of encapsulation, due to the nature of
how this is handled in the context of different APIs. Clarify the status
quo. (And fix an incorrect reference to the RPU payload bytes as 'RBSP')
2024-07-28 12:20:07 +02:00
Niklas Haas 6b66df74b8 avcodec/dovi_rpu: correctly copy num_ext_blocks 2024-07-28 12:20:07 +02:00
Niklas Haas b5aeafc00a fftools/ffprobe: implement dv_md_compression 2024-07-28 12:20:07 +02:00
Niklas Haas 3d5d60d041 avformat/dump: implement dv_md_compression 2024-07-28 12:20:07 +02:00
Niklas Haas ce8166a19c avformat/mpegts: implement dv_md_compression 2024-07-28 12:20:07 +02:00
Niklas Haas b3a9fab9da avformat/dovi_isom: implement dv_md_compression 2024-07-28 12:20:07 +02:00
Niklas Haas cbea92c84d avutil/dovi_meta: add dv_md_compression to cfg record
This field is used to signal the compression method in use.
2024-07-28 12:20:07 +02:00
Zhao Zhili 719e46f54c avcodec/videotoolboxenc: Fix variable type of AV_OPT_TYPE_BOOL
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-07-26 19:54:56 +08:00
Zhao Zhili f4e0f40230 avcodec/videotoolboxenc: Set default bitrate to zero
Zero is auto mode. From the doc of videotoolbox:

   The default bit rate is zero, which indicates that the video
   encoder should determine the size of compressed data.

Before the patch, the default bitrate is 200000 setting by
avcodec/options_table, which doesn't work for most of cases.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-07-26 19:54:20 +08:00
Zhao Zhili d07da7539d avcodec/videotoolboxenc: Fix bitrate doesn't work as expected
Commit 4ef5e7d472 add qmin/qmax support to videotoolbox encoder.
The default value of (qmin, qmax) is (2, 31), which makes bitrate
control doesn't work as users' expectations.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-07-26 19:54:20 +08:00
Rémi Denis-Courmont 8030876d1c checkasm/riscv: align the landing pads 2024-07-25 23:10:14 +03:00
Rémi Denis-Courmont 7dde8be29f checkasm/riscv: add forward-edge CFI landing pads 2024-07-25 23:10:14 +03:00
Rémi Denis-Courmont 4f2472909e sws/riscv: add forward-edge CFI landing pads 2024-07-25 23:10:14 +03:00
Rémi Denis-Courmont b5c111272b lavfi/riscv: add forward-edge CFI landing pads 2024-07-25 23:10:14 +03:00
Rémi Denis-Courmont f2c30fe15a lavc/riscv: add forward-edge CFI landing pads 2024-07-25 23:10:14 +03:00
Rémi Denis-Courmont a14d21a446 lavu/riscv: add forward-edge CFI landing pads 2024-07-25 23:10:14 +03:00