Commit Graph

112938 Commits

Author SHA1 Message Date
Rémi Denis-Courmont 8a984aca59 checkasm/flacdsp: add LPC test 2023-11-18 22:01:59 +02:00
Rémi Denis-Courmont cd6089dc9c riscv: fix builds without Zbb support 2023-11-18 22:01:59 +02:00
Jun Zhao 2d4aef8982 lavfi/Makefile: fix vf_cropdetect missed edge_common
vf_cropdetect depends on edge_common, it's missing in Makefile.

Fix trac issue:
http://trac.ffmpeg.org/ticket/10664

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2023-11-18 20:19:39 +01:00
Diederik de Haas via ffmpeg-devel c07ed10b0e apply spelling fixes
Fix spelling issue as reported by Debian's lintian tool:
accomodate -> accommodate
addtional -> additional
auxillary -> auxiliary
bellow -> below
betweeen -> between
Calulate -> Calculate
coefficents -> coefficients
Defalt -> Default
defaul -> default
higer -> higher
neccesary -> necessary
orignal -> original
ouput -> output
precison -> precision
processsing -> processing
substract -> subtract
Transfered -> Transferred
upto -> up to

Also add several of them to the 'common typos' check in patcheck.

Signed-off-by: Diederik de Haas <didi.debian@cknow.org>
2023-11-18 19:55:42 +01:00
Paul B Mahol 5452cbdc15 avfilter/af_afir: add irnorm and irlink options
Deprecate gtype option.
2023-11-18 17:04:53 +01:00
Rémi Denis-Courmont 07c303b708 lavc/flacdsp: R-V V decorrelate_indep 16-bit packed
flac_decorrelate_indep2_16_c:        981.7
flac_decorrelate_indep2_16_rvv_i32:  199.2
flac_decorrelate_indep4_16_c:       1749.7
flac_decorrelate_indep4_16_rvv_i32:  401.2
flac_decorrelate_indep6_16_c:       2517.7
flac_decorrelate_indep6_16_rvv_i32:  858.0
flac_decorrelate_indep8_16_c:       3285.7
flac_decorrelate_indep8_16_rvv_i32: 1123.5
2023-11-17 23:59:56 +02:00
Rémi Denis-Courmont fb0295e5fd lavc/flacdsp: R-V V decorrelate_indep 32-bit packed
flac_decorrelate_indep2_32_c:       981.7
flac_decorrelate_indep2_32_rvv_i32: 183.7
flac_decorrelate_indep4_32_c:      1749.7
flac_decorrelate_indep4_32_rvv_i32: 362.5
flac_decorrelate_indep6_32_c:      2517.7
flac_decorrelate_indep6_32_rvv_i32: 715.2
flac_decorrelate_indep8_32_c:      3285.7
flac_decorrelate_indep8_32_rvv_i32: 909.0
2023-11-17 23:59:56 +02:00
Rémi Denis-Courmont 6183a69c0b lavc/flacdsp: R-V V decorrelate_ms packed
flac_decorrelate_ms_16_c:       585.5
flac_decorrelate_ms_16_rvv_i32: 263.0
flac_decorrelate_ms_32_c:       584.7
flac_decorrelate_ms_32_rvv_i32: 250.0
2023-11-17 23:59:23 +02:00
Rémi Denis-Courmont 636ae0e0bc lavc/flacdsp: R-V V packed decorrelate_{l,r}s
flac_decorrelate_ms_16_c:       457.2
flac_decorrelate_ms_16_rvv_i32: 203.0
flac_decorrelate_ms_32_c:       457.2
flac_decorrelate_ms_32_rvv_i32: 203.5
flac_decorrelate_rs_16_c:       456.2
flac_decorrelate_rs_16_rvv_i32: 207.0
flac_decorrelate_rs_32_c:       456.2
flac_decorrelate_rs_32_rvv_i32: 210.5
2023-11-17 23:59:22 +02:00
Rémi Denis-Courmont be1675035f checkasm/flacdsp: fix ls/rs/ms tests
decorrelate_ls, _rs and _ms are decorrelate[1], [2] and [3] respectively.
The code ended up testing indep ([0]) as twice, ms never, and misnaming
the other two.
2023-11-17 23:59:22 +02:00
Paul B Mahol 08e97dae20 avfilter/af_adynamicequalizer: add adaptive detection mode 2023-11-17 00:17:54 +01:00
Paul B Mahol 82be1e5c0d avfilter/af_adynamicequalizer: do gain calculations in log domain 2023-11-17 00:17:54 +01:00
sunyuechi afb967b81e af_afir: RISC-V V fcmul_add
Segmented loads are slow, so here we use unit-strided load and narrowing shifts.

c910:
fcmul_add_c: 2179
fcmul_add_rvv_f64: 1652

c908:
fcmul_add_c: 4891.2
fcmul_add_rvv_f64: 2399.5

Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-11-16 20:53:18 +02:00
Rémi Denis-Courmont d076517056 lavc/llauddsp: R-V V scalarproduct_and_madd_int32
scalarproduct_and_madd_int32_c:      10899.7
scalarproduct_and_madd_int32_rvv_i32: 1749.0
2023-11-16 16:53:44 +02:00
Rémi Denis-Courmont 45d0eb3f70 lavc/llauddsp: R-V V scalarproduct_and_madd_int16
scalarproduct_and_madd_int16_c:      10355.7
scalarproduct_and_madd_int16_rvv_i32: 1480.0
2023-11-16 16:53:44 +02:00
Rémi Denis-Courmont 6720a509a7 checkasm: add lossless audio DSP 2023-11-16 16:53:44 +02:00
James Almer 78f55457c9 x86/flacds: clear the high bits from pred_order in lpc_32 functions
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2023-11-15 16:10:15 -03:00
Dai, Jianhui J c9fe9fb863 avcodec/cbs_vp8: Add support for VP8 codec bitstream
This commit adds support for VP8 bitstream read methods to the cbs
codec. This enables the trace_headers bitstream filter to support VP8,
in addition to AV1, H.264, H.265, and VP9. This can be useful for
debugging VP8 stream issues.

The CBS VP8 implements a simple VP8 boolean decoder using GetBitContext
to read the bitstream.

Only the read methods `read_unit` and `split_fragment` are implemented.
The write methods `write_unit` and `assemble_fragment` return the error
code AVERROR_PATCHWELCOME. This is because CBS VP8 write is unlikely to
be used by any applications at the moment. The write methods can be
added later if there is a real need for them.

TESTS: ffmpeg -i fate-suite/vp8/frame_size_change.webm -vcodec copy
-bsf:v trace_headers -f null -

Signed-off-by: Jianhui Dai <jianhui.j.dai@intel.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2023-11-15 10:29:03 -05:00
Dai, Jianhui J 5cb8accd09 avcodec/vp8: Export `vp8_token_update_probs` variable
This commit exports the `vp8_token_update_probs` variable to internal
library scope to facilitate its reuse within the library.

Signed-off-by: Jianhui Dai <jianhui.j.dai@intel.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2023-11-15 10:29:03 -05:00
Rémi Denis-Courmont 90a779bed6 lavc/huffyuvdsp: basic R-V V add_hfyu_left_pred_bgr32
Better performance can probably be achieved with a more intricate
unrolled loop, but this is a start:

add_hfyu_left_pred_bgr32_c: 15084.0
add_hfyu_left_pred_bgr32_rvv_i32: 10280.2

This would actually be cleaner with the RISC-V P extension, but that is
not ratified yet (I think?) and usually not supported if V is supported.
2023-11-15 16:51:07 +02:00
Rémi Denis-Courmont 6b708cd783 checkasm/huffyuvdsp: test for add_hfyu_left_pred_bgr32 2023-11-15 16:51:07 +02:00
Cosmin Stejerean via ffmpeg-devel 575efc0406 tools/general_assembly.pl - add options to print names, emails or both
Signed-off-by: Cosmin Stejerean <cosmin@cosmin.at>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2023-11-14 18:33:07 +01:00
James Almer b360c91752 avcodec/codecpar: mention how to allocate coded_side_data
Signed-off-by: James Almer <jamrial@gmail.com>
2023-11-14 14:26:42 -03:00
Zhao Zhili a1a6a328f0 fftools/ffplay: add hwaccel decoding support
Add vulkan renderer via libplacebo.

Simple usage:
$ ffplay -hwaccel vulkan foo.mp4

Use cuda to vulkan map:
$ ffplay -hwaccel cuda foo.mp4

Create vulkan instance by libplacebo, and enable debug:
$ ffplay -hwaccel vulkan \
	-vulkan_params create_by_placebo=1:debug=1 foo.mp4

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2023-11-15 01:20:11 +08:00
Anton Khirnov 889a022cce fftools/ffmpeg: rework keeping track of file duration for -stream_loop
Current code tracks min/max pts for each stream separately; then when
the file ends it combines them with last frame's duration to compute the
total duration of each stream; finally it selects the longest of those
durations as the file duration.

This is incorrect - the total file duration is the largest timestamp
difference between any frames, regardless of the stream.

Also change the way the last frame information is reported from decoders
to the muxer - previously it would be just the last frame's duration,
now the end timestamp is sent, which is simpler.

Changes the result of the fate-ffmpeg-streamloop-transcode-av test,
where the timestamps are shifted slightly forward. Note that the
matroska demuxer does not return the first audio packet after seeking
(due to buggy interaction betwen the generic code and the demuxer), so
there is a gap in audio.
2023-11-14 18:18:26 +01:00
Anton Khirnov 87016e031f fftools/thread_queue: count receive-finished streams as finished
This ensures that tq_receive() will always return EOF after all streams
were receive-finished, even though the sending side might not have
closed them yet. This may allow the receiver to avoid manually tracking
which streams it has already closed.
2023-11-14 18:18:26 +01:00
Anton Khirnov 4f7b91a698 fftools/thread_queue: do not return elements for receive-finished streams
It does not cause any issues in current callers, but still should not
happen.
2023-11-14 18:18:26 +01:00
Anton Khirnov 7c97a0c63f fftools/ffmpeg: move a few inline function into a new header
Will allow to use them in future commits without including the whole
ffmpeg.h.
2023-11-14 18:18:26 +01:00
Anton Khirnov 6dbde68cb5 lavc/8bps: fix exporting palette after 63767b79a5
It would be left empty on each frame whose packet does not come with
palette attached.
2023-11-14 18:18:26 +01:00
Paul B Mahol 7282137f48 lavfi/af_amix: make sure the output does not depend on input ordering
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2023-11-14 18:18:26 +01:00
Anton Khirnov de85815bfa lavf/mux: do not apply max_interleave_delta to subtitles
It is common for subtitle streams to have large gaps between packets.
When the caller is interleaving packets from multiple files, it can
easily happen that two successive subtitle packets trigger this limit,
even though no excessive buffering is happening.

Should fix #7064
2023-11-14 18:18:26 +01:00
Anton Khirnov 436b972fc8 doc/ffmpeg: expand -bsf documentation
Explain how to pass options to filters.
2023-11-14 18:18:26 +01:00
Anton Khirnov a8d9d6b08d tests/fate: replace deprecated -vsync with -fps_mode 2023-11-14 18:18:26 +01:00
Anton Khirnov 23de85d1ec tests/fate/ffmpeg: replace deprecated -vbsf with -bsf:v 2023-11-14 18:18:26 +01:00
Rémi Denis-Courmont ce467421dc lavc/exrdsp: unroll predictor
With explicit unrolling, we can skip half of the sign bit flips, and
the compiler is then better able to optimise the scalar loop:

predictor_c: 31376.0 (before)
predictor_c: 23703.0 (after)
2023-11-14 19:15:51 +02:00
Rémi Denis-Courmont c536e92207 lavc/sbrdsp: R-V V hf_apply_noise functions
This is restricted to 128-bit vectors as larger vector sizes could read
past the end of the noise array. Support for future hardware with larger
vector sizes is left for some other time.

hf_apply_noise_0_c:       2319.7
hf_apply_noise_0_rvv_f32: 1229.0
hf_apply_noise_1_c:       2539.0
hf_apply_noise_1_rvv_f32: 1244.7
hf_apply_noise_2_c:       2319.7
hf_apply_noise_2_rvv_f32: 1232.7
hf_apply_noise_3_c:       2541.2
hf_apply_noise_3_rvv_f32: 1244.2
2023-11-13 18:34:29 +02:00
Rémi Denis-Courmont 20e6195c54 checkasm: test the noise case of sbrdsp.hf_apply_noise
The tested functions treat s_m[i] == 0 as a special case. Other than
that, the functions are slightly complicated vector additions.

This actually makes the zero case happen pseudorandomly.
2023-11-13 18:34:29 +02:00
Rémi Denis-Courmont 6d60cc7baf sws/rgb2rgb: fix unaligned accesses in R-V V YUYV to I422p
In my personal opinion, we should not need to support unaligned YUY2
pixel maps. They should always be aligned to at least 32 bits, and the
current code assumes just 16 bits. However checkasm does test for
unaligned input bitmaps. QEMU accepts it, but real hardware dose not.

In this particular case, we can at the same time improve performance and
handle unaligned inputs, so do just that.

uyvytoyuv422_c:      104379.0
uyvytoyuv422_c:      104060.0
uyvytoyuv422_rvv_i32: 25284.0 (before)
uyvytoyuv422_rvv_i32: 19303.2 (after)
2023-11-13 18:34:29 +02:00
Rémi Denis-Courmont 5b8b5ec9c5 sws/rgb2rgb: rework R-V V YUY2 to 4:2:2 planar
This saves three scratch registers and three instructions per line. The
performance gains are mostly negligible. The main point is to free up
registers for further rework.
2023-11-13 18:34:29 +02:00
Rémi Denis-Courmont 5b33104fca lavc/sbrdsp: R-V V hf_gen
hf_gen_c:      2922.7
hf_gen_rvv_f32: 731.5
2023-11-13 18:33:02 +02:00
Gyan Doshi 67a2571a55 avcodec/libsvtav1: add version guard for external param
Setting of external param 'force_key_frames' was added in 7bcc1b4eb8.
It is available since v1.1.0 but ffmpeg allows linking against v0.9.0.
2023-11-13 13:14:43 +05:30
Paul B Mahol 84e400ae37 avfilter/buffersrc: switch to activate
Fixes OOM when caller keeps adding frames into filtergraph
that reached EOF by other means, for example EOF is signalled
by other filter in filtergraph or by buffersink.
2023-11-12 23:48:10 +01:00
Evgeny Pavlov da3ce21f68 libavcodec/amfenc: Fix issue with missing headers in AV1 encoder
This commit fixes issue with missing SPS/PPS headers in video
encoded by AMF AV1 encoder.
Missing headers leads to broken seek in MPV video player.
Default value for property AV1_HEADER_INSERTION_MODE shouldn't be setup
to NONE (no headers insertion). We need to skip definition of this property,
because default value depends on USAGE property.

Signed-off-by: Dmitrii Ovchinnikov <ovchinnikov.dmitrii@gmail.com>
2023-11-12 22:57:17 +01:00
Rémi Denis-Courmont 427347309b checkasm: test with random bw value
With a value of zero, the function is a glorified memory copy.
2023-11-12 22:33:11 +02:00
Sebastian Ramacher 250471ea17 avcoded/fft: Fix memory leak if ctx2 is used
Signed-off-by: James Almer <jamrial@gmail.com>
2023-11-12 14:47:56 -03:00
Sebastian Ramacher a562cfee2e avcodec/fft: Use av_mallocz to avoid invalid free/uninit
Signed-off-by: James Almer <jamrial@gmail.com>
2023-11-12 14:47:56 -03:00
Rémi Denis-Courmont cd7b352c53 lavc/sbrdsp: R-V V autocorrelate
With 5 accumulator vectors and 6 inputs, this can only use LMUL=2.
Also the number of vector loop iterations is small, just 5 on 128-bit
vector hardware.

The vector loop is somewhat unusual in that it processes data in
descending memory order, in order to save on vector slides:
in descending order, we can extract elements to carry over to the next
iteration from the bottom of the vectors directly. With ascending order
(see in the Opus postfilter function), there are no ways to get the top
elements directly. On the downside, this requires the use of separate
shift and sub (the would-be SH3SUB instruction does not exist), with
a small pipeline stall on the vector load address.

The edge cases in scalar are done in scalar as this saves on loads
and remains significantly faster than C.

autocorrelate_c: 669.2
autocorrelate_rvv_f32: 421.0
2023-11-12 14:03:09 +02:00
Rémi Denis-Courmont f576a0835b lavc/aacpsdsp: rework R-V V hybrid_synthesis_deint
Given the size of the data set, strided memory accesses cannot be avoided.
We can still do better than the current code.

ps_hybrid_synthesis_deint_c:       12065.5
ps_hybrid_synthesis_deint_rvv_i32: 13650.2 (before)
ps_hybrid_synthesis_deint_rvv_i64:  8181.0 (after)
2023-11-12 14:03:09 +02:00
Rémi Denis-Courmont eb508702a8 lavc/aacpsdsp: rework R-V V add_squares
Segmented loads may be slower than not. So this advantageously uses a
unit-strided load and narrowing shifts instead.

Before:
ps_add_squares_c: 60757.7
ps_add_squares_rvv_f32: 22242.5

After:
ps_add_squares_c: 60516.0
ps_add_squares_rvv_i64: 17067.7
2023-11-12 14:03:09 +02:00
Dave Johansen ab78d22553 avformat/hlsenc: Fix name of flag in error message
Reviewed-by: Steven Liu <lq@chinaffmpeg.org>
2023-11-12 16:47:40 +08:00