bwang30
3ab11dc5bb
libavfilter/x86/vf_convolution: add sobel filter optimization and unit test with intel AVX512 VNNI
...
This commit enables assembly code using Intel AVX512 VNNI and adds a unit test for the sobel filter.
sobel_c: 4537
sobel_avx512icl: 2136
Signed-off-by: bwang30 <bin.wang@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2022-11-14 10:04:16 +08:00
Rémi Denis-Courmont
c962c78901
checkasm: RISC-V 64-bit assembler test harness
2022-10-10 02:23:18 +02:00
Lynne
3ade6a8644
x86/lpc: implement a new Welch windowing function
...
Old one was written with the assumption only even inputs would be given.
This very messy replacement supports even and odd inputs, and supports
AVX2 for extra speed. The buffers given are usually quite big (4k samples),
so the speedup is worth it.
The new SSE version is still faster than the old inline asm version by 33%.
Also checkasm is provided to make sure this monstrosity works.
This fixes some FATE tests.
2022-09-21 07:12:39 +02:00
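For readers unfamiliar with the windowing function named in 3ade6a8644, the following is a minimal sketch of the textbook Welch (parabolic) window applied to integer samples. It is not the code from that commit: the helper name `apply_welch_window` and its signature are hypothetical, and the formula is the standard definition rather than anything quoted from the FFmpeg source. It only illustrates why a single expression can serve both even and odd lengths when the centre is taken as `(len - 1) / 2`.

```c
#include <stddef.h>
#include <stdint.h>

/* Textbook Welch window: w[i] = 1 - ((i - c) / c)^2, with c = (len - 1) / 2.
 * Hypothetical helper for illustration; not FFmpeg's implementation. */
static void apply_welch_window(const int32_t *data, size_t len, double *out)
{
    if (len < 2) {
        /* Degenerate case: c would be 0, so copy samples through unchanged. */
        for (size_t i = 0; i < len; i++)
            out[i] = data[i];
        return;
    }

    const double c = (len - 1) / 2.0;     /* window centre, fractional for even len */
    for (size_t i = 0; i < len; i++) {
        const double x = (i - c) / c;     /* maps index to [-1, 1] */
        out[i] = data[i] * (1.0 - x * x); /* parabola: 0 at the ends, 1 at the centre */
    }
}
```

With five samples of value 1000 this yields 0, 750, 1000, 750, 0; with four samples the two middle outputs are both about 888.9, since no sample sits exactly at the centre.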
James Almer
8f119b501e
tests/checkasm: add a test for VorbisDSPContext
...
Signed-off-by: James Almer <jamrial@gmail.com>
2022-09-19 21:28:23 -03:00
Andreas Rheinhardt
6c4595190e
avcodec/flacdsp: Split encoder-only parts into a ctx of its own
...
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-08-05 03:28:45 +02:00
Swinney, Jonathan
c471cc7474
lavc/aarch64: motion estimation functions in neon
...
- ff_pix_abs16_neon
- ff_pix_abs16_xy2_neon
In direct micro benchmarks of these ff functions versus their C implementations,
these functions performed as follows on AWS Graviton 3.
ff_pix_abs16_neon:
pix_abs_0_0_c: 141.1
pix_abs_0_0_neon: 19.6
ff_pix_abs16_xy2_neon:
pix_abs_0_3_c: 269.1
pix_abs_0_3_neon: 39.3
Tested with:
./tests/checkasm/checkasm --test=motion --bench --disable-linux-perf
Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-06-28 00:51:39 +03:00
Ben Avison
bd3615a81a
checkasm: Add idctdsp add/put-pixels-clamped tests
...
Signed-off-by: Ben Avison <bavison@riscosopen.org>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-04-01 10:03:33 +03:00
Ben Avison
20cb43ea8b
checkasm: Add vc1dsp in-loop deblocking filter tests
...
Note that the benchmarking results for these functions are highly dependent
upon the input data. Therefore, each function is benchmarked twice,
corresponding to the best and worst case complexity of the reference C
implementation. The performance of a real stream decode will fall somewhere
between these two extremes.
Signed-off-by: Ben Avison <bavison@riscosopen.org>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-04-01 10:03:33 +03:00
Mark Reid
9e445a5be2
swscale/x86/output.asm: add x86-optimized planar gbr yuv2anyX functions
...
changes since v2:
* fixed label
changes since v1:
* remove vex instruction on sse4 path
* some load/pack macros use fewer instructions
* fixed some typos
yuv2gbrp_full_X_4_512_c: 12757.6
yuv2gbrp_full_X_4_512_sse2: 8946.6
yuv2gbrp_full_X_4_512_sse4: 5138.6
yuv2gbrp_full_X_4_512_avx2: 3889.6
yuv2gbrap_full_X_4_512_c: 15368.6
yuv2gbrap_full_X_4_512_sse2: 11916.1
yuv2gbrap_full_X_4_512_sse4: 6294.6
yuv2gbrap_full_X_4_512_avx2: 3477.1
yuv2gbrp9be_full_X_4_512_c: 14381.6
yuv2gbrp9be_full_X_4_512_sse2: 9139.1
yuv2gbrp9be_full_X_4_512_sse4: 5150.1
yuv2gbrp9be_full_X_4_512_avx2: 2834.6
yuv2gbrp9le_full_X_4_512_c: 12990.1
yuv2gbrp9le_full_X_4_512_sse2: 9118.1
yuv2gbrp9le_full_X_4_512_sse4: 5132.1
yuv2gbrp9le_full_X_4_512_avx2: 2833.1
yuv2gbrp10be_full_X_4_512_c: 14401.6
yuv2gbrp10be_full_X_4_512_sse2: 9133.1
yuv2gbrp10be_full_X_4_512_sse4: 5126.1
yuv2gbrp10be_full_X_4_512_avx2: 2837.6
yuv2gbrp10le_full_X_4_512_c: 12718.1
yuv2gbrp10le_full_X_4_512_sse2: 9106.1
yuv2gbrp10le_full_X_4_512_sse4: 5120.1
yuv2gbrp10le_full_X_4_512_avx2: 2826.1
yuv2gbrap10be_full_X_4_512_c: 18535.6
yuv2gbrap10be_full_X_4_512_sse2: 33617.6
yuv2gbrap10be_full_X_4_512_sse4: 6264.1
yuv2gbrap10be_full_X_4_512_avx2: 3422.1
yuv2gbrap10le_full_X_4_512_c: 16724.1
yuv2gbrap10le_full_X_4_512_sse2: 11787.1
yuv2gbrap10le_full_X_4_512_sse4: 6282.1
yuv2gbrap10le_full_X_4_512_avx2: 3441.6
yuv2gbrp12be_full_X_4_512_c: 13723.6
yuv2gbrp12be_full_X_4_512_sse2: 9128.1
yuv2gbrp12be_full_X_4_512_sse4: 7997.6
yuv2gbrp12be_full_X_4_512_avx2: 2844.1
yuv2gbrp12le_full_X_4_512_c: 12257.1
yuv2gbrp12le_full_X_4_512_sse2: 9107.6
yuv2gbrp12le_full_X_4_512_sse4: 5142.6
yuv2gbrp12le_full_X_4_512_avx2: 2837.6
yuv2gbrap12be_full_X_4_512_c: 18511.1
yuv2gbrap12be_full_X_4_512_sse2: 12156.6
yuv2gbrap12be_full_X_4_512_sse4: 6251.1
yuv2gbrap12be_full_X_4_512_avx2: 3444.6
yuv2gbrap12le_full_X_4_512_c: 16687.1
yuv2gbrap12le_full_X_4_512_sse2: 11785.1
yuv2gbrap12le_full_X_4_512_sse4: 6243.6
yuv2gbrap12le_full_X_4_512_avx2: 3446.1
yuv2gbrp14be_full_X_4_512_c: 13690.6
yuv2gbrp14be_full_X_4_512_sse2: 9120.6
yuv2gbrp14be_full_X_4_512_sse4: 5138.1
yuv2gbrp14be_full_X_4_512_avx2: 2843.1
yuv2gbrp14le_full_X_4_512_c: 14995.6
yuv2gbrp14le_full_X_4_512_sse2: 9119.1
yuv2gbrp14le_full_X_4_512_sse4: 5126.1
yuv2gbrp14le_full_X_4_512_avx2: 2843.1
yuv2gbrp16be_full_X_4_512_c: 12367.1
yuv2gbrp16be_full_X_4_512_sse2: 8233.6
yuv2gbrp16be_full_X_4_512_sse4: 4820.1
yuv2gbrp16be_full_X_4_512_avx2: 2666.6
yuv2gbrp16le_full_X_4_512_c: 10904.1
yuv2gbrp16le_full_X_4_512_sse2: 8214.1
yuv2gbrp16le_full_X_4_512_sse4: 4824.1
yuv2gbrp16le_full_X_4_512_avx2: 2629.1
yuv2gbrap16be_full_X_4_512_c: 26569.6
yuv2gbrap16be_full_X_4_512_sse2: 10884.1
yuv2gbrap16be_full_X_4_512_sse4: 5488.1
yuv2gbrap16be_full_X_4_512_avx2: 3272.1
yuv2gbrap16le_full_X_4_512_c: 14010.1
yuv2gbrap16le_full_X_4_512_sse2: 10562.1
yuv2gbrap16le_full_X_4_512_sse4: 5463.6
yuv2gbrap16le_full_X_4_512_avx2: 3255.1
yuv2gbrpf32be_full_X_4_512_c: 14524.1
yuv2gbrpf32be_full_X_4_512_sse2: 8552.6
yuv2gbrpf32be_full_X_4_512_sse4: 4636.1
yuv2gbrpf32be_full_X_4_512_avx2: 2474.6
yuv2gbrpf32le_full_X_4_512_c: 13060.6
yuv2gbrpf32le_full_X_4_512_sse2: 9682.6
yuv2gbrpf32le_full_X_4_512_sse4: 4298.1
yuv2gbrpf32le_full_X_4_512_avx2: 2453.1
yuv2gbrapf32be_full_X_4_512_c: 18629.6
yuv2gbrapf32be_full_X_4_512_sse2: 11363.1
yuv2gbrapf32be_full_X_4_512_sse4: 15201.6
yuv2gbrapf32be_full_X_4_512_avx2: 3727.1
yuv2gbrapf32le_full_X_4_512_c: 16677.6
yuv2gbrapf32le_full_X_4_512_sse2: 10221.6
yuv2gbrapf32le_full_X_4_512_sse4: 5693.6
yuv2gbrapf32le_full_X_4_512_avx2: 3656.6
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2022-01-11 16:33:17 -03:00
Lynne
1978b143eb
checkasm: add av_tx FFT SIMD testing code
...
This sadly required making changes to the code itself,
due to the same context needing to be reused for both versions.
The lookup table had to be duplicated for both versions.
2021-04-24 17:19:17 +02:00
Josh Dekker
9c513edb79
checkasm: add hevc_pel tests
...
Co-authored-by: Niklas Haas <git@haasn.xyz>
Signed-off-by: Josh Dekker <josh@itanimul.li>
2021-01-25 09:24:11 +01:00
Josh de Kock
5913cd4e6c
checkasm: add hscale test
...
This tests the hscale 8bpp to 14/18bpp functions with different filter
sizes.
Signed-off-by: Josh de Kock <josh@itanimul.li>
2020-05-15 10:29:30 +01:00
Ting Fu
9691e2a426
checkasm/vf_eq: add test for vf_eq
...
Signed-off-by: Ting Fu <ting.fu@intel.com>
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
2019-09-26 08:10:31 +08:00
Lynne
4ce1e13b54
checkasm: add opusdsp tests
2019-09-11 03:28:22 +01:00
Ruiling Song
8f4963ad25
checkasm/vf_gblur: add test for horiz_slice simd
...
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
2019-06-12 08:54:05 +08:00
James Darnley
76c370af64
checkasm: add test for v210dec
2019-05-02 19:21:37 +02:00
James Almer
06476249cd
Merge commit '7e5bde93a1e7641e1622814dafac0be3f413d79b'
...
* commit '7e5bde93a1e7641e1622814dafac0be3f413d79b':
build: Rename OBJDIRS variable to OUTDIRS
Merged-by: James Almer <jamrial@gmail.com>
2019-03-10 19:31:13 -03:00
Diego Biurrun
7e5bde93a1
build: Rename OBJDIRS variable to OUTDIRS
...
These directories are not just for object files.
2019-02-16 13:09:35 +01:00
James Almer
ba89dc27b5
checkasm: add an af_afir test
...
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2019-01-03 10:12:18 -03:00
Clément Bœsch
f679711c1b
checkasm: add vf_nlmeans test for ssd_integral_image
2018-05-08 10:28:06 +02:00
Josh de Kock
cda43940da
checkasm/Makefile: add EXTRALIBS-libavformat
...
Signed-off-by: Josh de Kock <josh@itanimul.li>
2018-03-31 23:20:16 +01:00
Martin Vignali
a9a7ed4f27
checkasm/swscale : add test for rgb shuffle_bytes func
2018-03-24 20:22:12 +01:00
Yingming Fan
80798e3857
checkasm/hevc_sao : add hevc_sao for checkasm
...
Signed-off-by: James Almer <jamrial@gmail.com>
2018-03-07 23:53:32 -03:00
Muhammad Faiz
81d6501be7
checkasm/Makefile: add EXTRALIBS-swresample
...
Should fix https://ffmpeg.org/pipermail/ffmpeg-devel/2018-February/225058.html
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
2018-02-09 17:50:44 +07:00
Martin Vignali
78b982d3b9
checkasm : add test for losslessvideoencdsp for diff bytes and sub_left_pred
2018-01-28 20:23:16 +01:00
James Almer
da03242778
Revert "checkasm/vf_interlace : add test for lowpass_line 8 and 16"
...
This reverts commit adff97be5e.
It currently fails on Windows targets.
Signed-off-by: James Almer <jamrial@gmail.com>
2017-12-19 19:07:24 -03:00
Martin Vignali
adff97be5e
checkasm/vf_interlace : add test for lowpass_line 8 and 16
2017-12-19 20:59:51 +01:00
Martin Vignali
cefb7e0060
checkasm/vf_hflip : add test for vf_hflip byte and short simd
2017-12-13 11:34:29 +01:00
Martin Vignali
cfce442750
checkasm/vf_threshold : add checkasm test for threshold8
2017-12-03 19:17:15 +01:00
Martin Vignali
4a6aa6d1b2
checkasm : add test for huffyuvdsp add_int16
2017-11-21 09:41:42 +01:00
Martin Vignali
6a7eb65e1b
checkasm : add utvideodsp test
2017-11-21 09:00:27 +01:00
James Almer
6dfcbd80ad
Merge commit '7cb1d9e2dbbe5bf4652be5d78cdd68e956fa3d63'
...
* commit '7cb1d9e2dbbe5bf4652be5d78cdd68e956fa3d63':
build: Fine-grained link-time dependency settings
Also included are bug fix commits 5ff3b5cafc, d9da7151ee and 5e27ef800b.
Merged-by: James Almer <jamrial@gmail.com>
2017-10-11 17:55:25 -03:00
James Almer
7323c896b2
checkasm: add an exrdsp test
...
Signed-off-by: James Almer <jamrial@gmail.com>
2017-09-17 19:01:40 -03:00
James Almer
823cc7e25f
checkasm: add a g722dsp test
...
Signed-off-by: James Almer <jamrial@gmail.com>
2017-07-13 17:00:19 -03:00
Matthieu Bouron
7864e07f4a
checkasm: add sbrdsp tests
2017-07-03 14:28:17 +02:00
Clément Bœsch
edd041e64c
checkasm: add AAC PS tests
...
This includes various fixes and improvements from James Almer.
Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-28 12:22:39 +02:00
Diego Biurrun
fd502f4f5f
build: Generalize yasm/nasm-related variable names
...
None of them are specific to the YASM assembler.
(Cherry-picked from libav commit 39e208f4d4)
Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-21 17:00:29 -03:00
James Almer
5b10f484e2
checkasm: add float_dsp tests
...
Ported from libavutil/tests/float_dsp.c
Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-14 19:20:10 -03:00
James Almer
7b3cb953f7
checkasm: add fixed_dsp tests
...
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
2017-04-11 18:05:13 -03:00
Clément Bœsch
210678d3c5
Merge commit '3794062ab1a13442b06f6d76c54dce51ffa54697'
...
* commit '3794062ab1a13442b06f6d76c54dce51ffa54697':
Remove Plan 9 support
Merged-by: Clément Bœsch <u@pkh.me>
2017-04-09 14:52:00 +02:00
Clément Bœsch
3d4039f964
Merge commit 'ed48a9d8143d2575a4458589cebde69ec326afd8'
...
* commit 'ed48a9d8143d2575a4458589cebde69ec326afd8':
checkasm: Add a test for HEVC add_residual
Merged-by: Clément Bœsch <u@pkh.me>
2017-03-24 12:37:09 +01:00
James Almer
f23078904f
Merge commit '2816f8a8bb33bd67fec5e94f5d357918caf4e055'
...
* commit '2816f8a8bb33bd67fec5e94f5d357918caf4e055':
build: Drop arch-specific checkasm Makefiles
Merged-by: James Almer <jamrial@gmail.com>
2017-03-23 18:01:47 -03:00
Clément Bœsch
7c2a7f9c11
Merge commit '22c3ab18646924ce24dc6017a9e882ff69689e40'
...
* commit '22c3ab18646924ce24dc6017a9e882ff69689e40':
checkasm: Add test for huffyuvdsp add_bytes
huffyuvdsp is renamed to llviddsp to be consistent with our codebase.
Note: af607b7e07 wasn't actually required for this test since this
commit is not actually testing huffyuvdsp.
Merged-by: Clément Bœsch <u@pkh.me>
2017-03-22 16:31:38 +01:00
Clément Bœsch
8414755486
Merge commit 'e9ef6171396dc4106526aaa86b620c61ca3d1017'
...
* commit 'e9ef6171396dc4106526aaa86b620c61ca3d1017':
checkasm: add tests for audiodsp
Merged-by: Clément Bœsch <u@pkh.me>
2017-03-20 19:10:56 +01:00
Clément Bœsch
c50b2164a6
Merge commit '2eb97af66af90ca3978229da151f0b8b3a5d9370'
...
* commit '2eb97af66af90ca3978229da151f0b8b3a5d9370':
checkasm: add a test for blockdsp
Merged-by: Clément Bœsch <u@pkh.me>
2017-03-20 19:05:05 +01:00
Diego Biurrun
39e208f4d4
build: Generalize yasm/nasm-related variable names
...
None of them are specific to the YASM assembler.
2017-03-01 10:18:15 +01:00
Diego Biurrun
7cb1d9e2db
build: Fine-grained link-time dependency settings
...
Previously, all link-time dependencies were added for all libraries,
resulting in bogus link-time dependencies since not all dependencies
are shared across libraries. Also, in some cases like libavutil, not
all dependencies were taken into account, resulting in some cases of
underlinking.
To address all this mess a machinery is added for tracking which
dependency belongs to which library component and then leveraged
to determine correct dependencies for all individual libraries.
2017-03-01 09:00:40 +01:00
Clément Bœsch
92cb9a3869
Merge commit '9064777dbb335ab4809ae09e3fdcc0245f925cdc'
...
* commit '9064777dbb335ab4809ae09e3fdcc0245f925cdc':
checkasm: add HEVC test for testing IDCT DC
Merged-by: Clément Bœsch <cboesch@gopro.com>
2017-02-02 11:40:58 +01:00
Diego Biurrun
3794062ab1
Remove Plan 9 support
...
Supporting the system was a nice joke for the 9 release, but it has
run its course. Nowadays Plan 9 receives no testing and has no
practical usefulness.
2016-12-03 09:15:01 +01:00
Hendrik Leppkes
47f75839e4
Merge commit 'f8d17d53957056c053a46f9320fa7ae6fe1479a5'
...
* commit 'f8d17d53957056c053a46f9320fa7ae6fe1479a5':
checkasm: Add tests for vp8dsp
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-11-14 15:29:08 +01:00