Commit Graph

425 Commits

Author SHA1 Message Date
sunyuechi 1c3620b2bb checkasm: test for abs_pow34
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-11 18:42:07 +02:00
Rémi Denis-Courmont b3825bbe45 riscv: test for assembler support
This should fix the build on LLVM 16 and earlier, at the cost of turning
all non-RVV optimisations off.
2023-12-08 17:21:09 +02:00
Martin Storsjö 12598e72e3 checkasm: Fix the signature of float_to_fixed24
The len parameter was changed from unsigned int to size_t in
567c67c6c8.

This fixes crashes in the reference C code, when running checkasm
on aarch64.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-12-02 18:16:09 +02:00
sunyuechi d0ec826077 checkasm/ac3dsp: add float_to_fixed24 test
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-12-01 20:26:48 +02:00
sunyuechi ea6817d2a7 checkasm: test for dcmul_add
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-11-27 17:55:24 +02:00
Rémi Denis-Courmont 7212466e73 checkasm/riscv: report an error upon SIGILL
Terminating the whole checkasm process is not very helpful. This will
report if an illegal instruction occurs while executing a tested
function. This is a common occurrence whilst developping RISC-V
assembler, due to the compatibility between vector configuration and
instruction done at run-time.
2023-11-23 19:04:07 +02:00
Rémi Denis-Courmont 286d674221 checkasm: add helper to report a fatal signal 2023-11-23 18:57:18 +02:00
Rémi Denis-Courmont 8a984aca59 checkasm/flacdsp: add LPC test 2023-11-18 22:01:59 +02:00
Rémi Denis-Courmont be1675035f checkasm/flacdsp: fix ls/rs/ms tests
decorrelate_ls, _rs and _ms are decorrelate[1], [2] and [3] respectively.
The code ended up testing indep ([0]) as twice, ms never, and misnaming
the other two.
2023-11-17 23:59:22 +02:00
Rémi Denis-Courmont 6720a509a7 checkasm: add lossless audio DSP 2023-11-16 16:53:44 +02:00
Rémi Denis-Courmont 6b708cd783 checkasm/huffyuvdsp: test for add_hfyu_left_pred_bgr32 2023-11-15 16:51:07 +02:00
Rémi Denis-Courmont 20e6195c54 checkasm: test the noise case of sbrdsp.hf_apply_noise
The tested functions treat s_m[i] == 0 as a special case. Other than
that, the functions are slightly complicated vector additions.

This actually makes the zero case happen pseudorandomly.
2023-11-13 18:34:29 +02:00
Rémi Denis-Courmont 427347309b checkasm: test with random bw value
With a value of zero, the function is a glorified memory copy.
2023-11-12 22:33:11 +02:00
Andreas Rheinhardt 0228e27ded checkasm/motion: Don't allocate AVCodecContext
Instead use one on the stack to avoid pulling in all
of libavcodec.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-28 00:17:47 +02:00
Andreas Rheinhardt 80afcc8539 avfilter/bwdif: Add proper BWDIFDSPContext
This already avoids unnecessary indirectly included headers
in the arch-specific vf_bwdif_init.c files; it is also in
preparation for splitting the actual functions out of vf_bwdif.c.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-28 00:17:47 +02:00
Andreas Rheinhardt a84fe06112 avcodec/idctdsp: Avoid inclusion of avcodec.h
Not every user of idctdsp.h wants to initialize an IDCTDSPContext;
e.g. the proresdsp only uses ff_init_scantable_permutation()
and the IDCT permutation enum; similarly for cavsdsp and wmv2dsp.
Using a forward declaration here avoids an avcodec.h dependency
in the relevant files.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-11 00:26:34 +02:00
Andreas Rheinhardt f8503b4c33 avutil/internal: Don't auto-include emms.h
Instead include emms.h wherever it is needed.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-04 11:04:45 +02:00
Andreas Rheinhardt a39d6e81fa tests/checkasm/sw_scale: Avoid declare_func_emms where possible
This makes the test stricter because it is checked that the
MMX registers are not accidentally clobbered.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-04 11:04:45 +02:00
Andreas Rheinhardt e7866e00c8 tests/checkasm/llvidencdsp: Don't use declare_func_emms
Only sub_media_pred has an MMXEXT version, so one can use
the version with the stricter check (that checks that
the MMX registers have not been clobbered) for sub_left_predict.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-04 11:04:45 +02:00
Andreas Rheinhardt 3f82b38516 tests/checkasm/hevc_*: Avoid using declare_func_emms where possible
Only the idct_dc and add_residual functions have MMX versions,
so one can use the version with the stricter check (that checks
that the MMX registers have not been clobbered) for all the other
checks.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2023-09-04 11:04:45 +02:00
Matthias Dressel e41bd6e65e checkasm: hevc_sao: Fix a regression in hevc_sao_edge
check_func() might return NULL, in which case the function is not to be
benched. Introduced in cc679054c7.

Signed-off-by: Matthias Dressel <code@deadcode.eu>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-08-24 22:09:37 +03:00
Rémi Denis-Courmont f25ad0fe02 checkasm: improve Linux perf error message
Report the failing system call name, as is convention, rather than just
a rather unhelpful "syscall".
2023-07-22 21:35:15 +03:00
Rémi Denis-Courmont b6585eb04c lavu: add/use flag for RISC-V Zba extension
The code was blindly assuming that Zbb or V implied Zba. While the
earlier is practically always true, the later broke some QEMU setups,
as V was introduced earlier than Zba.
2023-07-19 19:29:35 +03:00
Rémi Denis-Courmont 98e4dd39c5 checkasm: test Zbb before V
Without this, Zbb functions get shadowed by V functions on devices
supporting both extensions, and never tested.
2023-07-19 19:29:35 +03:00
Rémi Denis-Courmont d8ea5f50e2 checkasm: print usage on invalid arguments
This checks that arguments are handled. If not, then this prints a
short usage notice and returns an error.
2023-07-17 18:48:42 +03:00
John Cox 697533e76d avfilter/vf_bwdif: Add a filter_line3 method for optimisation
Add an optional filter_line3 to the available optimisations.

filter_line3 is equivalent to filter_line, memcpy, filter_line

filter_line shares quite a number of loads and some calculations in
common with its next iteration and testing shows that using aarch64
neon filter_line3s performance is 30% better than two filter_lines
and a memcpy.

Adds a test for vf_bwdif filter_line3 to checkasm

Rounds job start lines down to a multiple of 4. This means that if
filter_line3 exists then filter_line will not sometimes be called
once at the end of a slice depending on thread count. The final slice
may do up to 3 extra lines but filter_edge is faster than filter_line
so it is unlikely to create any noticable thread load variation.

Signed-off-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-07-06 00:21:05 +03:00
John Cox 7ed7c00f55 tests/checkasm: Add test for vf_bwdif filter_edge
Signed-off-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-07-06 00:21:05 +03:00
John Cox 7caa8d6b91 tests/checkasm: Add test for vf_bwdif filter_intra
Signed-off-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-07-06 00:21:05 +03:00
Martin Storsjö 397cb623c8 aarch64: Add cpu flags for the dotprod and i8mm extensions
Set these available if they are available unconditionally for
the compiler.

Signed-off-by: Martin Storsjö <martin@martin.st>
2023-06-06 12:40:42 +03:00
Lynne 783270bfd1
checkasm: add h264chroma tests
Checks all variants of put_h264_chroma and avg_h264_chroma.
2023-05-20 20:07:21 +02:00
xufuji456 1e91a39502 checkasm: pass context as pointer
Signed-off-by: xufuji456 <839789740@qq.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-04-13 15:17:04 +03:00
xufuji456 30def6365d checkasm/hevc: add transform_luma test
Signed-off-by: xufuji456 <839789740@qq.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-04-13 15:17:04 +03:00
J. Dekker 68c151cb1b checkasm: add hevc_deblock chroma test
Signed-off-by: J. Dekker <jdek@itanimul.li>
2023-04-06 06:16:57 +02:00
James Darnley 087faf8cac checkasm: add test for bwdif 2023-03-25 02:38:17 +01:00
Lynne bbe95f7353
x86: replace explicit REP_RETs with RETs
From x86inc:
> On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either
> a branch or a branch target. So switch to a 2-byte form of ret in that case.
> We can automatically detect "follows a branch", but not a branch target.
> (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.)

x86inc can automatically determine whether to use REP_RET rather than
REP in most of these cases, so impact is minimal. Additionally, a few
REP_RETs were used unnecessary, despite the return being nowhere near a
branch.

The only CPUs affected were AMD K10s, made between 2007 and 2011, 16
years ago and 12 years ago, respectively.

In the future, everyone involved with x86inc should consider dropping
REP_RETs altogether.
2023-02-01 04:23:55 +01:00
James Darnley eef763c705 checkasm/v210dec: add extra space to the destination arrays 2022-12-21 00:36:49 +01:00
James Darnley 6af453ca38 avcodec/x86: add avx512icl function for v210dec
Ice Lake (Xeon Silver 4316): 2.01x faster (1147±36.8 vs. 571±38.2 decicycles) compared with avx2
2022-12-20 15:02:45 +01:00
James Darnley cfd1c3c0a1 checkasm/v210enc: test the entire width of 10-bit planar input arrays 2022-12-01 18:19:03 +01:00
bwang30 3ab11dc5bb libavfilter/x86/vf_convolution: add sobel filter optimization and unit test with intel AVX512 VNNI
This commit enabled assembly code with intel AVX512 VNNI and added unit test for sobel filter

sobel_c: 4537
sobel_avx512icl 2136

Signed-off-by: bwang30 <bin.wang@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2022-11-14 10:04:16 +08:00
Lynne e0661fc805
dca_core: convert to lavu/tx
Thanks to Martin Storsjö <martin@martin.st> for fixing and testing the
arm32 and aarch64 changes.
2022-11-06 14:39:36 +01:00
James Darnley 1936c06f02 checkasm: add a verbose check function for uint32_t data 2022-11-04 19:37:46 +01:00
Andreas Rheinhardt 37ee36f689 checkasm/idctdsp: Use declare_func_emms only when needed
There is no MMX code for (add|put|put_signed)_pixels_clamped
since commit bfb28b5ce8, so use
declare_func instead of declare_func_emms() to also test that
we are not in MMX mode after return.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-11 14:18:54 +02:00
Andreas Rheinhardt 5102b98b7a checkasm/llviddspenc: Use declare_func_emms only when needed
There is no MMX code for diff_bytes since commit
230ea38de1, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-11 14:18:54 +02:00
Andreas Rheinhardt e814569c8d checkasm/huffyuvdsp: Use declare_func_emms only when needed
There is no MMX code for add_int16 since commit
4b6ffc2880, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-11 14:18:54 +02:00
Andreas Rheinhardt cd8a33bcce checkasm/llviddsp: Be strict about MMX
There is no MMX code for llviddsp after commit
fed07efcde, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-11 14:18:54 +02:00
Andreas Rheinhardt b4e2d67636 checkasm/pixblockdsp: Be strict about MMX
There is no MMX code for pixblockdsp after commit
92b5800277, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-11 14:18:54 +02:00
Andreas Rheinhardt 42921190cb checkasm/audiodsp: Be strict about MMX
There is no MMX code for audiodsp after commit
3d716d38ab, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-11 14:18:54 +02:00
Andreas Rheinhardt 18afaa20f1 checkasm/blockdsp: Be strict about MMX
There is no MMX code for blockdsp after commit
ee551a21dd, so use declare_func
instead of declare_func_emms() to also test that we are not
in MMX mode after return.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-11 14:18:54 +02:00
Andreas Rheinhardt f224c195e0 checkasm/vc1dsp: Use declare_func_emms only when needed
There is no MMX code for vc1_inv_trans_8x8 or
vc1_unescape_buffer, so use declare_func instead of
declare_func_emms() to also test that we are not in MMX
mode after return.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-10-11 14:18:54 +02:00
Rémi Denis-Courmont c962c78901 checkasm: RISC-V 64-bit assembler test harness 2022-10-10 02:23:18 +02:00