ffmpeg/libavcodec/arm
Ben Avison 15a29c39d9 truehd: add hand-scheduled ARM asm version of mlp_filter_channel.
Profiling results for overall audio decode and the mlp_filter_channel(_arm)
function in particular are as follows:

              Before          After
              Mean   StdDev   Mean   StdDev  Confidence  Change
6:2 total     380.4  22.0     370.8  17.0    87.4%       +2.6%  (insignificant)
6:2 function  60.7   7.2      36.6   8.1     100.0%      +65.8%
8:2 total     357.0  17.5     343.2  19.0    97.8%       +4.0%  (insignificant)
8:2 function  60.3   8.8      37.3   3.8     100.0%      +61.8%
6:6 total     717.2  23.2     658.4  15.7    100.0%      +8.9%
6:6 function  140.4  12.9     81.5   9.2     100.0%      +72.4%
8:8 total     981.9  16.2     896.2  24.5    100.0%      +9.6%
8:8 function  193.4  15.0     103.3  11.5    100.0%      +87.2%
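
The commit message does not say how the Change column is derived. The figures appear consistent with the ratio of the two means, (before / after - 1) * 100. The helper below is a minimal sketch under that assumption; `change_percent` is a hypothetical name used only for illustration and is not part of the commit.

```c
/*
 * Relative improvement in percent, ASSUMING the table's "Change" column
 * is (before_mean / after_mean - 1) * 100. This formula is inferred from
 * the published figures, not stated in the commit message.
 */
static double change_percent(double before_mean, double after_mean)
{
    return (before_mean / after_mean - 1.0) * 100.0;
}
```

Under this assumption, the "6:2 function" row (60.7 before, 36.6 after) comes out at roughly +65.8, which matches the table. Some rows differ in the last digit, presumably because the published means are themselves rounded.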

Experiments with adding preload instructions to this function yielded no
useful benefit, so these have not been included.

The assembly version has also been tested with a fuzz tester to ensure that
any combination of inputs not exercised by my available test streams still
generates results mathematically identical to those of the C version.
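
The fuzz tester itself is not part of this listing. As a rough illustration of the approach described above, the sketch below feeds the same pseudo-random inputs to a reference routine and to a restructured routine and counts any disagreement. Everything in it is hypothetical: `fir_ref`, `fir_opt`, `rng_next` and `fuzz_compare` are stand-in names, and the simple FIR kernel is not the real mlp_filter_channel.

```c
#include <stdint.h>

#define MAX_TAPS 8

/* Hypothetical reference kernel: plain FIR sum, one tap per iteration. */
static int64_t fir_ref(const int32_t *x, const int16_t *c, int taps)
{
    int64_t acc = 0;
    for (int i = 0; i < taps; i++)
        acc += (int64_t)x[i] * c[i];
    return acc;
}

/* Hypothetical "optimized" kernel: same sum, unrolled over two accumulators. */
static int64_t fir_opt(const int32_t *x, const int16_t *c, int taps)
{
    int64_t a0 = 0, a1 = 0;
    int i = 0;
    for (; i + 1 < taps; i += 2) {
        a0 += (int64_t)x[i]     * c[i];
        a1 += (int64_t)x[i + 1] * c[i + 1];
    }
    if (i < taps)                       /* odd tap count: one tap left over */
        a0 += (int64_t)x[i] * c[i];
    return a0 + a1;
}

/* Small xorshift32 generator so that every run is reproducible from its seed. */
static uint32_t rng_next(uint32_t *state)
{
    uint32_t v = *state;
    v ^= v << 13;
    v ^= v >> 17;
    v ^= v << 5;
    return *state = v;
}

/* Runs `trials` random cases and returns how many produced different results. */
static int fuzz_compare(uint32_t seed, int trials)
{
    uint32_t s = seed ? seed : 1;       /* xorshift must not start at zero */
    int mismatches = 0;

    for (int t = 0; t < trials; t++) {
        int32_t x[MAX_TAPS];
        int16_t c[MAX_TAPS];
        int taps = (int)(rng_next(&s) % (MAX_TAPS + 1));   /* 0..MAX_TAPS */

        for (int i = 0; i < MAX_TAPS; i++) {
            /* 24-bit signed samples and 16-bit signed coefficients keep
             * the 64-bit accumulators far away from overflow. */
            x[i] = (int32_t)(rng_next(&s) & 0xFFFFFF) - 0x800000;
            c[i] = (int16_t)((int32_t)(rng_next(&s) & 0xFFFF) - 0x8000);
        }
        if (fir_ref(x, c, taps) != fir_opt(x, c, taps))
            mismatches++;
    }
    return mismatches;
}
```

A harness along these lines would call `fuzz_compare` with a fixed seed and treat any non-zero return value as a failure. Randomizing the tap count as well as the data is what exercises the corner cases, such as zero or odd orders, that a limited set of test streams may never reach.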

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-03-26 19:53:52 +02:00
Makefile truehd: add hand-scheduled ARM asm version of mlp_filter_channel. 2014-03-26 19:53:52 +02:00
aac.h
aacpsdsp_init_arm.c
aacpsdsp_neon.S
ac3dsp_arm.S
ac3dsp_armv6.S
ac3dsp_init_arm.c dsputil: Move apply_window_int16 to ac3dsp 2013-12-08 17:57:15 +01:00
ac3dsp_neon.S dsputil: Move apply_window_int16 to ac3dsp 2013-12-08 17:57:15 +01:00
asm-offsets.h
cabac.h arm: get_cabac inline asm 2014-03-09 00:45:34 +01:00
dca.h dcadec: simplify decoding of VQ high frequencies 2014-02-28 13:03:22 +01:00
dcadsp_init_arm.c arm: dcadsp: implement decode_hf as external NEON asm 2014-02-28 13:12:19 +01:00
dcadsp_neon.S arm: dcadsp: implement decode_hf as external NEON asm 2014-02-28 13:12:19 +01:00
dcadsp_vfp.S dcadec: remove scaling in lfe_interpolation_fir 2014-02-28 13:00:47 +01:00
dsputil_arm.S arm: Allow overriding the alignment set in the function macro 2014-01-07 19:29:56 +02:00
dsputil_arm.h dsputil: Propagate bit depth information to all (sub)init functions 2014-03-20 05:03:23 -07:00
dsputil_armv6.S
dsputil_init_arm.c dsputil: Propagate bit depth information to all (sub)init functions 2014-03-20 05:03:23 -07:00
dsputil_init_armv5te.c dsputil: Propagate bit depth information to all (sub)init functions 2014-03-20 05:03:23 -07:00
dsputil_init_armv6.c dsputil: Use correct type in me_cmp_func function pointer 2014-03-20 05:03:23 -07:00
dsputil_init_neon.c dsputil: Propagate bit depth information to all (sub)init functions 2014-03-20 05:03:23 -07:00
dsputil_neon.S dsputil: Move apply_window_int16 to ac3dsp 2013-12-08 17:57:15 +01:00
fft_fixed_init_arm.c Rename CONFIG_FFT_FLOAT ---> FFT_FLOAT 2014-01-06 19:12:48 +01:00
fft_fixed_neon.S
fft_init_arm.c arm: dcadsp: Move synth filter initialization to dcadsp file 2013-08-29 11:24:14 +02:00
fft_neon.S
fft_vfp.S arm: Add VFP-accelerated version of fft16 2013-07-22 10:15:41 +03:00
flacdsp_arm.S
flacdsp_init_arm.c
fmtconvert_init_arm.c arm: fmtconvert: Split armv6 fmtconvert code off from vfp code 2013-08-29 11:24:14 +02:00
fmtconvert_neon.S arm: Add X() around all references to extern symbols 2014-02-07 15:13:58 +02:00
fmtconvert_vfp.S arm: fmtconvert: Split armv6 fmtconvert code off from vfp code 2013-08-29 11:24:14 +02:00
fmtconvert_vfp_armv6.S arm: fmtconvert: Split armv6 fmtconvert code off from vfp code 2013-08-29 11:24:14 +02:00
h264chroma_init_arm.c
h264cmc_neon.S vc1: arm: Add NEON no_rnd chroma MC 2013-12-20 14:53:42 +02:00
h264dsp_armv6.S arm: Use the matching endfunc macro instead of the assembler directive directly 2014-01-04 13:53:08 +02:00
h264dsp_init_arm.c arm: cosmetics: Reindent the h264dsp neon init function 2014-01-07 19:29:31 +02:00
h264dsp_neon.S
h264idct_neon.S arm: Add X() around all references to extern symbols 2014-02-07 15:13:58 +02:00
h264pred_init_arm.c
h264pred_neon.S
h264qpel_init_arm.c
h264qpel_neon.S
hpeldsp_arm.S Update dsputil- and SIMD-related comments to match reality more closely 2014-03-13 05:50:29 -07:00
hpeldsp_arm.h arm: Use full filenames as multiple inclusion guards 2014-01-14 00:04:52 +01:00
hpeldsp_armv6.S arm: hpeldsp: fix put_pixels8_y2_{,no_rnd_}armv6 2014-03-08 18:31:57 +01:00
hpeldsp_init_arm.c dsputil: Refactor duplicated CALL_2X_PIXELS / PIXELS16 macros 2014-03-22 06:17:29 -07:00
hpeldsp_init_armv6.c
hpeldsp_init_neon.c
hpeldsp_neon.S
int_neon.S arm: Remove a stray .fpu directive 2014-02-09 18:36:16 +01:00
jrevdct_arm.S
mathops.h
mdct_fixed_neon.S
mdct_neon.S arm: Add X() around all references to extern symbols 2014-02-07 15:13:58 +02:00
mdct_vfp.S arm: Mangle external symbols properly in new vfp assembly files 2013-07-22 14:48:30 +03:00
mlpdsp_armv5te.S truehd: add hand-scheduled ARM asm version of mlp_filter_channel. 2014-03-26 19:53:52 +02:00
mlpdsp_init_arm.c truehd: add hand-scheduled ARM asm version of mlp_filter_channel. 2014-03-26 19:53:52 +02:00
mpegaudiodsp_fixed_armv6.S
mpegaudiodsp_init_arm.c
mpegvideo_arm.c
mpegvideo_arm.h arm: Use full filenames as multiple inclusion guards 2014-01-14 00:04:52 +01:00
mpegvideo_armv5te.c
mpegvideo_armv5te_s.S
mpegvideo_neon.S arm: Add X() around all references to extern symbols 2014-02-07 15:13:58 +02:00
neon.S
neontest.c arm: Add an option for making sure NEON registers aren't clobbered 2014-01-11 00:03:00 +02:00
rdft_neon.S
rv34dsp_init_arm.c
rv34dsp_neon.S
rv40dsp_init_arm.c
rv40dsp_neon.S
sbrdsp_init_arm.c
sbrdsp_neon.S
simple_idct_arm.S arm: Add a missing endfunc macro call 2014-01-04 13:53:02 +02:00
simple_idct_armv5te.S
simple_idct_armv6.S
simple_idct_neon.S
synth_filter_neon.S
synth_filter_vfp.S arm: Mangle external symbols properly in new vfp assembly files 2013-07-22 14:48:30 +03:00
vc1dsp.h vc1: arm: Add NEON assembly 2013-12-20 14:53:39 +02:00
vc1dsp_init_arm.c vc1: arm: Add NEON assembly 2013-12-20 14:53:39 +02:00
vc1dsp_init_neon.c vc1: arm: Add NEON no_rnd chroma MC 2013-12-20 14:53:42 +02:00
vc1dsp_neon.S vc1: arm: Add NEON assembly 2013-12-20 14:53:39 +02:00
videodsp_arm.h
videodsp_armv5te.S Update dsputil- and SIMD-related comments to match reality more closely 2014-03-13 05:50:29 -07:00
videodsp_init_arm.c
videodsp_init_armv5te.c
vorbisdsp_init_arm.c
vorbisdsp_neon.S
vp3dsp_init_arm.c arm: vp3: remove incorrect const in ff_vp3_idct_dc_add_neon declaration 2014-03-09 00:45:33 +01:00
vp3dsp_neon.S arm: Add a missing # as prefix for an immediate constant 2014-01-07 19:30:13 +02:00
vp6dsp_init_arm.c vp56: Mark VP6-only optimizations as such. 2013-08-23 14:42:19 +02:00
vp6dsp_neon.S vp56: Mark VP6-only optimizations as such. 2013-08-23 14:42:19 +02:00
vp8.h
vp8_armv6.S
vp8dsp.h
vp8dsp_armv6.S armv6: vp8: use explicit labels in motion compensation asm 2014-03-12 15:06:05 +01:00
vp8dsp_init_arm.c
vp8dsp_init_armv6.c
vp8dsp_init_neon.c
vp8dsp_neon.S vp8: Use 2 registers for dst_stride and src_stride in neon bilin filter 2014-02-06 09:32:26 +02:00
vp56_arith.h