ffmpeg/libswscale/aarch64
Hubert Mazur 2537fdc510 sw_scale: Add specializations for hscale 16 to 19
Provide arm64 NEON-optimized implementations of hscale16To19 for
filter sizes 4, 8 and X4 (any multiple of 4).
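
For reference, the scalar operation that these NEON routines accelerate can be sketched as below. This is a minimal sketch, not FFmpeg's code: the name `hscale16to19_ref`, its parameter list and the explicit `shift` argument are assumptions made for illustration, and the 64-bit accumulator is a choice made here to avoid signed overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference helper (not the FFmpeg function): horizontally
 * scale 16-bit samples into 19-bit intermediates stored as int32_t.
 * filter holds filter_size coefficients per output sample, and
 * filter_pos[i] is the first source index used for output sample i. */
static void hscale16to19_ref(int32_t *dst, int dst_w,
                             const uint16_t *src,
                             const int16_t *filter,
                             const int32_t *filter_pos,
                             int filter_size, int shift)
{
    const int64_t max19 = (1 << 19) - 1;

    assert(filter_size > 0 && shift >= 0);
    for (int i = 0; i < dst_w; i++) {
        int64_t acc = 0; /* wide accumulator: an assumption of this sketch */

        for (int j = 0; j < filter_size; j++)
            acc += (int64_t)src[filter_pos[i] + j] *
                   filter[filter_size * i + j];

        acc >>= shift;                        /* scale the sum down */
        dst[i] = (int32_t)(acc < max19 ? acc : max19); /* clamp the top */
    }
}
```

Assuming full 16-bit input and 14-bit filter coefficients, a shift of 11 brings the 30-bit sum down to 19 bits. With four samples of 65535 and four coefficients of 4096 (summing to 1 << 14), the single output sample is 65535 * 8 = 524280.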

The tests and benchmarks were run on AWS Graviton 2 instances.
The results from the checkasm tool are shown below.

hscale_16_to_19__fs_4_dstW_512_c: 6216.0
hscale_16_to_19__fs_4_dstW_512_neon: 2257.0
hscale_16_to_19__fs_8_dstW_512_c: 10417.7
hscale_16_to_19__fs_8_dstW_512_neon: 3112.5
hscale_16_to_19__fs_12_dstW_512_c: 14890.5
hscale_16_to_19__fs_12_dstW_512_neon: 3899.0
hscale_16_to_19__fs_16_dstW_512_c: 19006.5
hscale_16_to_19__fs_16_dstW_512_neon: 5341.2
hscale_16_to_19__fs_32_dstW_512_c: 36629.5
hscale_16_to_19__fs_32_dstW_512_neon: 9502.7
hscale_16_to_19__fs_40_dstW_512_c: 45477.5
hscale_16_to_19__fs_40_dstW_512_neon: 11552.0
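
The relative gain for each filter size follows from dividing the C timing by the NEON timing. The sketch below does that arithmetic on the figures above; the `struct bench` layout and the function names are hypothetical and are not part of checkasm.

```c
#include <assert.h>
#include <stdio.h>

/* One C/NEON pair from the checkasm figures quoted above. */
struct bench {
    int filter_size;
    double c;
    double neon;
};

static const struct bench results[] = {
    {  4,  6216.0,  2257.0 },
    {  8, 10417.7,  3112.5 },
    { 12, 14890.5,  3899.0 },
    { 16, 19006.5,  5341.2 },
    { 32, 36629.5,  9502.7 },
    { 40, 45477.5, 11552.0 },
};

enum { N_RESULTS = sizeof(results) / sizeof(results[0]) };

/* Speedup of the NEON version over the C version for one pair. */
static double speedup(const struct bench *b)
{
    assert(b->neon > 0.0);
    return b->c / b->neon;
}

/* Print one line per filter size; call this from main(). */
static void print_speedups(void)
{
    for (int i = 0; i < N_RESULTS; i++)
        printf("fs_%-2d  %.2fx\n", results[i].filter_size,
               speedup(&results[i]));
}
```

On these figures the NEON versions come out roughly 2.8 to 3.9 times faster than the C versions, with the smallest gain at filter size 4.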

(Note: the checkasm tests for these functions haven't been
merged, since they fail on x86.)

Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-11-01 15:24:58 +02:00
Makefile            swscale: aarch64: Add a NEON implementation of interleaveBytes  2020-05-15 23:38:17 +03:00
hscale.S            sw_scale: Add specializations for hscale 16 to 19               2022-11-01 15:24:58 +02:00
output.S            swscale/aarch64: add vscale specializations                     2022-08-16 13:40:42 +03:00
rgb2rgb.c           swscale: aarch64: Add a NEON implementation of interleaveBytes  2020-05-15 23:38:17 +03:00
rgb2rgb_neon.S      swscale: aarch64: Add a NEON implementation of interleaveBytes  2020-05-15 23:38:17 +03:00
swscale.c           sw_scale: Add specializations for hscale 16 to 19               2022-11-01 15:24:58 +02:00
swscale_unscaled.c  sws: rename SwsContext.swscale to convert_unscaled              2021-07-03 15:57:53 +02:00
yuv2rgb_neon.S      swscale: aarch64: Fix yuv2rgb with negative strides             2022-10-27 21:49:26 +03:00