ffmpeg/libavcodec/arm
Janne Grunau 90b1b9350c arm: add ff_int32_to_float_fmul_array8_neon
Quite a bit faster than int32_to_float_fmul_array8_c calling
ff_int32_to_float_fmul_scalar_neon through FmtConvertContext.
Number of cycles per int32_to_float_fmul_array8 call while decoding
padded.dts on exynos5422:

               before  after   change
cortex-a7:     1270     951    -25%
cortex-a15:     434     285    -34%

checkasm --bench cycle counts:     cortex-a15   cortex-a7
int32_to_float_fmul_array8_c:      1730.4       4384.5
int32_to_float_fmul_array8_neon_c:  571.5       1694.3
int32_to_float_fmul_array8_neon:    374.0       1448.8

The differences between int32_to_float_fmul_array8_neon_c and
int32_to_float_fmul_array8_neon are interesting. The former is the
current behaviour of calling ff_int32_to_float_fmul_scalar_neon
repeatedly from the C function. The raw checkasm numbers differ from
the decoder measurements above since checkasm uses different lengths
than the dca decoder.
2015-12-14 16:45:02 +01:00
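
The "_neon_c" case described in the commit message can be pictured with a rough C sketch. This is not the code from the tree: the function names ending in _sketch and the scalar_fn typedef are hypothetical, and the sketch assumes that len is a multiple of 8 and that each block of 8 samples has its own scale factor. It only illustrates the pattern of a C loop making one indirect call to a scalar kernel per block, which is the overhead a single array8 routine avoids.

```c
#include <stdint.h>

/* Hypothetical stand-in for the scalar kernel: dst[i] = src[i] * mul. */
static void int32_to_float_fmul_scalar_sketch(float *dst, const int32_t *src,
                                              float mul, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = src[i] * mul;
}

/* Hypothetical function pointer type, standing in for the context member
 * through which the C wrapper reaches the optimised scalar kernel. */
typedef void (*scalar_fn)(float *dst, const int32_t *src, float mul, int len);

/* Assumed array8 behaviour: one indirect call per block of 8 samples,
 * each block scaled by the next entry of mul[]. len is assumed to be a
 * multiple of 8. */
static void int32_to_float_fmul_array8_sketch(scalar_fn scalar, float *dst,
                                              const int32_t *src,
                                              const float *mul, int len)
{
    for (int i = 0; i < len; i += 8)
        scalar(&dst[i], &src[i], *mul++, 8);
}
```

With 16 input samples and two scale factors, the wrapper above makes two calls into the scalar kernel. A dedicated array8 routine would handle both blocks without returning to C in between.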
aac.h
aacpsdsp_init_arm.c
aacpsdsp_neon.S
ac3dsp_arm.S
ac3dsp_armv6.S
ac3dsp_init_arm.c
ac3dsp_neon.S
apedsp_init_arm.c
apedsp_neon.S
asm-offsets.h
audiodsp_arm.h
audiodsp_init_arm.c
audiodsp_init_neon.c
audiodsp_neon.S
blockdsp_arm.h
blockdsp_init_arm.c
blockdsp_init_neon.c
blockdsp_neon.S
cabac.h
dca.h
dcadsp_init_arm.c arm: add a cpu flag for the VFPv2 vector mode 2015-12-14 16:42:35 +01:00
dcadsp_neon.S
dcadsp_vfp.S
dct-test.c
fft_fixed_init_arm.c
fft_fixed_neon.S arm: Use .data.rel.ro for const data with relocations 2014-12-09 11:43:25 +02:00
fft_init_arm.c arm: add a cpu flag for the VFPv2 vector mode 2015-12-14 16:42:35 +01:00
fft_neon.S arm: Use .data.rel.ro for const data with relocations 2014-12-09 11:43:25 +02:00
fft_vfp.S arm: Use .data.rel.ro for const data with relocations 2014-12-09 11:43:25 +02:00
flacdsp_arm.S
flacdsp_init_arm.c
fmtconvert_init_arm.c arm: add ff_int32_to_float_fmul_array8_neon 2015-12-14 16:45:02 +01:00
fmtconvert_neon.S arm: add ff_int32_to_float_fmul_array8_neon 2015-12-14 16:45:02 +01:00
fmtconvert_vfp.S
g722dsp_init_arm.c g722: Add ARM NEON implementation for g722_apply_qmf() 2015-02-15 22:47:21 +02:00
g722dsp_neon.S g722: Add ARM NEON implementation for g722_apply_qmf() 2015-02-15 22:47:21 +02:00
h264chroma_init_arm.c
h264cmc_neon.S
h264dsp_init_arm.c
h264dsp_neon.S
h264idct_neon.S
h264pred_init_arm.c h264: arm: use intra pred8x8 functions only for chroma_format_idc <= 1 2015-07-18 00:28:49 +02:00
h264pred_neon.S
h264qpel_init_arm.c
h264qpel_neon.S
hpeldsp_arm.h
hpeldsp_arm.S
hpeldsp_armv6.S
hpeldsp_init_arm.c
hpeldsp_init_armv6.c
hpeldsp_init_neon.c
hpeldsp_neon.S
idct.h
idctdsp_arm.h
idctdsp_arm.S
idctdsp_armv6.S
idctdsp_init_arm.c
idctdsp_init_armv5te.c
idctdsp_init_armv6.c
idctdsp_init_neon.c
idctdsp_neon.S
int_neon.S
jrevdct_arm.S
Makefile configure: Factor out g722dsp module 2015-07-17 18:46:24 +01:00
mathops.h
mdct_fixed_neon.S
mdct_neon.S
mdct_vfp.S
me_cmp_armv6.S
me_cmp_init_arm.c
mlpdsp_armv5te.S arm: mlpdsp: handle pic offset calculation in a macro 2014-12-09 22:00:08 +01:00
mlpdsp_armv6.S
mlpdsp_init_arm.c
mpegaudiodsp_fixed_armv6.S
mpegaudiodsp_init_arm.c
mpegvideo_arm.c
mpegvideo_arm.h
mpegvideo_armv5te_s.S
mpegvideo_armv5te.c
mpegvideo_neon.S
mpegvideoencdsp_armv6.S
mpegvideoencdsp_init_arm.c
neon.S
neontest.c
pixblockdsp_armv6.S
pixblockdsp_init_arm.c
rdft_neon.S
rv34dsp_init_arm.c
rv34dsp_neon.S
rv40dsp_init_arm.c
rv40dsp_neon.S
sbrdsp_init_arm.c
sbrdsp_neon.S
simple_idct_arm.S
simple_idct_armv5te.S
simple_idct_armv6.S
simple_idct_neon.S
startcode_armv6.S
startcode.h
synth_filter_neon.S
synth_filter_vfp.S
vc1dsp_init_arm.c
vc1dsp_init_neon.c
vc1dsp_neon.S
vc1dsp.h
videodsp_arm.h
videodsp_armv5te.S arm: use a local label instead of the function symbol in ff_prefetch_arm 2015-07-20 23:10:29 +02:00
videodsp_init_arm.c
videodsp_init_armv5te.c
vorbisdsp_init_arm.c
vorbisdsp_neon.S
vp3dsp_init_arm.c
vp3dsp_neon.S
vp6dsp_init_arm.c
vp6dsp_neon.S
vp8_armv6.S
vp8.h
vp8dsp_armv6.S
vp8dsp_init_arm.c
vp8dsp_init_armv6.c
vp8dsp_init_neon.c
vp8dsp_neon.S
vp8dsp.h
vp56_arith.h