ffmpeg/libavcodec/aarch64
Martin Storsjö 7e42d5f0ab aarch64: vp8: Optimize vp8_idct_add_neon for aarch64
The previous version was a pretty exact translation of the arm
version. This version does do some unnecessary arithmetic (it does
more operations on vectors that are only half filled; it does 4
uaddw and 4 sqxtun instead of 2 of each), but it reduces the overhead
of packing data together (which could be done for free in the arm
version).
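
For reference, the uaddw/sqxtun pair mentioned above corresponds to the
final "add" step of the function: each 8-bit destination pixel is widened
to 16 bits, the residual is added, and the sum is narrowed back to 8 bits
with unsigned saturation. Below is a minimal scalar C sketch of that step
only, assuming a 4x4 block of residuals stored row-major. The names
clip_u8 and add_residual_4x4 are hypothetical and not part of libavcodec,
and the inverse transform that produces the residuals is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar counterpart of one sqxtun lane: narrow a signed 16-bit value
 * to unsigned 8 bits, saturating to the range 0..255. */
static uint8_t clip_u8(int16_t v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}

/* Hypothetical helper: add a 4x4 block of residuals (row-major, an
 * assumption of this sketch) to the pixels at dst. */
static void add_residual_4x4(uint8_t *dst, ptrdiff_t stride,
                             const int16_t res[16])
{
    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 4; x++) {
            /* Scalar counterpart of one uaddw lane: zero-extend the
             * pixel to 16 bits and add the residual. */
            int16_t sum = (int16_t)(res[y * 4 + x] + dst[x]);
            dst[x] = clip_u8(sum);
        }
        dst += stride;
    }
}
```

A 4x4 block has only 16 pixels, so a vector version has to either pack
several rows into one register or run the widen/add/narrow sequence on
partly filled registers. That is the tradeoff described above.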

This gives a decent speedup on Cortex A53, a minor speedup on
A72 and a very minor slowdown on Cortex A73.

Before:              Cortex A53    A72    A73
vp8_idct_add_neon:         79.7   67.5   65.0
After:
vp8_idct_add_neon:         67.7   64.8   66.7

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:46:28 +02:00
asm-offsets.h
cabac.h
dcadsp_init.c
dcadsp_neon.S
fft_init_aarch64.c
fft_neon.S
fmtconvert_init.c
fmtconvert_neon.S
h264chroma_init_aarch64.c
h264cmc_neon.S
h264dsp_init_aarch64.c h264/aarch64: add intra loop filter neon asm 2019-01-26 12:05:10 +01:00
h264dsp_neon.S h264/aarch64: add intra loop filter neon asm 2019-01-26 12:05:10 +01:00
h264idct_neon.S
h264pred_init.c
h264pred_neon.S
h264qpel_init_aarch64.c
h264qpel_neon.S
hpeldsp_init_aarch64.c
hpeldsp_neon.S
imdct15_init.c
imdct15_neon.S
Makefile aarch64: vp8: Move the vp8dsp makefile entries to the right places 2019-02-19 11:45:53 +02:00
mdct_init.c
mdct_neon.S
mpegaudiodsp_init.c
mpegaudiodsp_neon.S aarch64: Remove a dot from a label 2017-10-18 10:49:33 +03:00
neon.S
neontest.c
rv40dsp_init_aarch64.c
synth_filter_neon.S
vc1dsp_init_aarch64.c
videodsp_init.c
videodsp.S
vorbisdsp_init.c
vorbisdsp_neon.S
vp8dsp_init_aarch64.c aarch64: vp8: Port bilin functions from arm version 2019-02-19 11:46:14 +02:00
vp8dsp_neon.S aarch64: vp8: Optimize vp8_idct_add_neon for aarch64 2019-02-19 11:46:28 +02:00
vp8dsp.h aarch64: vp8: Port bilin functions from arm version 2019-02-19 11:46:14 +02:00
vp9dsp_init_aarch64.c
vp9itxfm_neon.S aarch64: vp9: Fix assembling with Xcode 6.2 and older 2017-06-20 16:14:03 +03:00
vp9lpf_neon.S
vp9mc_neon.S aarch64: vp9: Fix assembling with Xcode 6.2 and older 2017-06-20 16:14:03 +03:00