Commit Graph

5 Commits

Author SHA1 Message Date
Diego Biurrun d4f05ae3b6 sbrdsp: Use standard multiple inclusion guards. 2012-04-04 14:54:11 +02:00
Christophe GISQUET 34454c761f SBR DSP x86: implement SSE sbr_sum_square_sse
The 32bits targets have been compiled with -mfpmath=sse for proper reference.
sbr_sum_square C  /32bits: 82c (unrolled)/102c
               C  /64bits: 69c (unrolled)/82c
               SSE/32bits: 42c
               SSE/64bits: 31c

Use of SSE4.1 dpps to perform the final sum is slower.
Not unrolling to perform 8 operations in a loop yields 10 more cycles.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-02-23 15:50:06 -08:00
Christophe GISQUET 2e74a5abc2 SBR DSP: use intptr_t for the ixh parameter.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-02-23 15:48:40 -08:00
Mans Rullgard be822d77b6 aacsbr: ARM NEON optimised sbrdsp functions
Overall speedup of HE-AAC decoding 2.3x on Cortex-A8, 1.2x on A9.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-01-28 14:56:18 +00:00
Mans Rullgard aac46e088d aacsbr: move some simdable loops to function pointers
This prepares for assembly optimisations by moving the most
time-consuming loops to functions called through pointers
in a new context.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-01-28 14:56:18 +00:00