ffmpeg/libswscale/aarch64
Hubert Mazur 9ccf8c5bfc sw_scale: Add specializations for hscale 16 to 15
Add arm64 neon implementations for hscale 16 to 15 with filter
sizes 4, 8 and X4.

The tests and benchmarks run on AWS Graviton 2 instances.
The results from a checkasm tool are shown below.

hscale_16_to_15__fs_4_dstW_512_c: 6703.5
hscale_16_to_15__fs_4_dstW_512_neon: 2298.0
hscale_16_to_15__fs_8_dstW_512_c: 10983.0
hscale_16_to_15__fs_8_dstW_512_neon: 3216.5
hscale_16_to_15__fs_12_dstW_512_c: 15526.0
hscale_16_to_15__fs_12_dstW_512_neon: 3993.0
hscale_16_to_15__fs_16_dstW_512_c: 20183.5
hscale_16_to_15__fs_16_dstW_512_neon: 5369.7
hscale_16_to_15__fs_32_dstW_512_c: 39315.2
hscale_16_to_15__fs_32_dstW_512_neon: 9511.2
hscale_16_to_15__fs_40_dstW_512_c: 48995.7
hscale_16_to_15__fs_40_dstW_512_neon: 11570.0

(Note, the checkasm tests for these functions haven't been
merged since they fail on x86.)

Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-11-01 15:24:53 +02:00
..
hscale.S sw_scale: Add specializations for hscale 16 to 15 2022-11-01 15:24:53 +02:00
Makefile swscale: aarch64: Add a NEON implementation of interleaveBytes 2020-05-15 23:38:17 +03:00
output.S swscale/aarch64: add vscale specializations 2022-08-16 13:40:42 +03:00
rgb2rgb_neon.S swscale: aarch64: Add a NEON implementation of interleaveBytes 2020-05-15 23:38:17 +03:00
rgb2rgb.c swscale: aarch64: Add a NEON implementation of interleaveBytes 2020-05-15 23:38:17 +03:00
swscale_unscaled.c sws: rename SwsContext.swscale to convert_unscaled 2021-07-03 15:57:53 +02:00
swscale.c sw_scale: Add specializations for hscale 16 to 15 2022-11-01 15:24:53 +02:00
yuv2rgb_neon.S swscale: aarch64: Fix yuv2rgb with negative strides 2022-10-27 21:49:26 +03:00