Mirror of https://git.ffmpeg.org/ffmpeg.git (synced 2025-01-01 20:42:19 +00:00)
3e708722a2
Use scalar times vector multiply-accumulate instructions instead of vector times vector to remove the need for replicating load instructions, which are slightly slower. On AWS c7g (Graviton 3, Neoverse V1) instances: yuv2yuvX_8_0_512_accurate_neon: 1144.8 987.4; yuv2yuvX_16_0_512_accurate_neon: 2080.5 1869.4. Signed-off-by: Jonathan Swinney <jswinney@amazon.com> Signed-off-by: Martin Storsjö <martin@martin.st> |
||
---|---|---|
.. | ||
hscale.S | ||
Makefile | ||
output.S | ||
rgb2rgb_neon.S | ||
rgb2rgb.c | ||
swscale_unscaled.c | ||
swscale.c | ||
yuv2rgb_neon.S |
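
The commit message above contrasts two ways of applying one filter coefficient to a vector of samples. The sketch below models that difference in portable C; it is not the FFmpeg code. `LANES`, `mac_vector_times_vector`, `mac_scalar_times_vector` and `filter_block` are hypothetical names chosen for illustration, and the fixed-size arrays stand in for NEON registers. On AArch64 the first variant would presumably correspond to a replicating load followed by a vector multiply-accumulate, and the second to a by-element multiply-accumulate, but that mapping is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4 /* hypothetical: models one register of four 32-bit lanes */

/* Variant A: copy the coefficient into every lane first (the job of a
 * replicating load), then multiply-accumulate vector times vector. */
static void mac_vector_times_vector(int32_t acc[LANES],
                                    const int16_t src[LANES], int16_t coeff)
{
    int32_t coeff_vec[LANES];
    for (size_t i = 0; i < LANES; i++)
        coeff_vec[i] = coeff;               /* extra replicate step */
    for (size_t i = 0; i < LANES; i++)
        acc[i] += (int32_t)src[i] * coeff_vec[i];
}

/* Variant B: multiply-accumulate the vector by a single scalar, so the
 * coefficient never has to be replicated into its own vector. */
static void mac_scalar_times_vector(int32_t acc[LANES],
                                    const int16_t src[LANES], int16_t coeff)
{
    for (size_t i = 0; i < LANES; i++)
        acc[i] += (int32_t)src[i] * coeff;
}

/* Hypothetical vertical filter over one block of LANES samples:
 * out[i] = sum over j of rows[j][i] * filter[j]. */
void filter_block(int32_t out[LANES], const int16_t rows[][LANES],
                  const int16_t *filter, size_t taps, int by_scalar)
{
    assert(taps > 0);
    for (size_t i = 0; i < LANES; i++)
        out[i] = 0;
    for (size_t j = 0; j < taps; j++) {
        if (by_scalar)
            mac_scalar_times_vector(out, rows[j], filter[j]);
        else
            mac_vector_times_vector(out, rows[j], filter[j]);
    }
}
```

Both variants produce identical sums; the only difference is whether the coefficient is first expanded into a full vector. That expansion is the step the commit reports removing.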