Commit Graph

54 Commits

Author SHA1 Message Date
Martin Storsjö 99e2012523 x86/arm: Add clobber tests to libavresample
Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-13 14:13:27 +02:00
Derek Buitenhuis 206895708e x86inc: Remove our FMA4 support
This is so we can sync to x264's version of FMA4 support.

This partially reverts commit 79687079a9.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-14 12:39:29 +01:00
Derek Buitenhuis 15748773bf avresample/x86: Switch operand order for mulps
With the forthcoming VEX instruction emulation, mulps
must have only the third operand point to memory, as
this is what vmulps expects.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-14 12:36:11 +01:00
Diego Biurrun 3ac7fa81b2 Consistently use "cpu_flags" as variable/parameter name for CPU flags 2013-07-18 00:31:35 +02:00
Diego Biurrun b6649ab503 cosmetics: Remove unnecessary extern keywords from function declarations 2013-03-27 14:21:45 +01:00
Justin Ruggles a6a3164b13 x86: lavr: add SSE2/AVX dither_int_to_float() 2013-01-08 14:52:43 -05:00
Justin Ruggles 1fb8f6a44f x86: lavr: add SSE2 quantize() for dithering 2013-01-08 14:52:43 -05:00
Justin Ruggles 95d01c3f1c x86: lavr: use the x86inc.asm automatic stack alignment in mixing functions
CC:libav-stable@libav.org
2013-01-05 16:14:35 -05:00
Ronald S. Bultje 7a9e65acee x86: lavr: fix stack allocation for 7 and 8 channel downmixing on x86-32
Fixes crashes on Win32 and stack overruns on x86-32 in general.
2012-11-17 20:16:04 -05:00
Diego Biurrun 2b479bcab0 build: Drop AVX assembly ifdefs
An assembler able to cope with AVX instructions is now required.
2012-11-11 20:43:28 +01:00
Diego Biurrun 4b60fac419 x86: PALIGNR: port to cpuflags 2012-11-09 21:31:31 +01:00
Diego Biurrun 352e18b766 x86: avresample: Add missing colons to assembly labels
YASM accepts labels without colons, but NASM issues warnings.
2012-11-06 12:07:35 +01:00
Diego Biurrun 04581c8c77 x86: yasm: Use complete source path for macro helper %includes
This is more consistent with the way we handle C #includes and
it simplifies the build system.
2012-10-31 00:37:42 +01:00
Diego Biurrun 6860b4081d x86: include x86inc.asm in x86util.asm
This is necessary to allow refactoring some x86util macros with cpuflags.
2012-10-31 00:37:42 +01:00
Justin Ruggles 10e645e9cb lavr: handle clipping in the float to s32 conversion
We cannot clip to INT_MAX because that value cannot be exactly
represented by a float value and ends up overflowing during conversion
anyway. We need to use a slightly smaller float value, which ends up
with slightly inaccurate results for samples which clip or nearly clip,
but it is close enough. Using doubles as intermediates in the conversion
would be more accurate, but it takes about twice as much time.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-10-13 12:34:34 +02:00
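
Illustration of the clipping issue described in commit 10e645e9cb above. This is a minimal sketch in C, not the libavresample code: the function name flt_to_s32_clipped and the choice of clip value are assumptions made for this example. It assumes IEEE 754 single precision, where floats just below 2^31 are spaced 128 apart, so INT32_MAX (2^31 - 1) rounds up to 2^31 when converted to float, and 2^31 - 128 is the largest float that still fits in an int32_t.

```c
#include <stdint.h>

/* Hypothetical helper (not from libavresample): scale a float sample in
 * the nominal range [-1.0, 1.0] to signed 32-bit and clip the result. */
static int32_t flt_to_s32_clipped(float sample)
{
    /* Largest float below 2^31: 2^31 - 128 = 2147483520.
     * Clipping to (float)INT32_MAX would not help, because that value
     * rounds to 2^31, which overflows int32_t on conversion. */
    const float max_flt = 2147483520.0f;
    const float min_flt = -2147483648.0f;  /* -2^31 is exactly representable */
    float scaled = sample * 2147483648.0f; /* scale by 2^31 */

    if (scaled != scaled)  /* NaN: return silence instead of converting */
        return 0;
    if (scaled > max_flt)
        scaled = max_flt;
    if (scaled < min_flt)
        scaled = min_flt;
    return (int32_t)scaled; /* in range now, so the cast is well defined */
}
```

With this sketch a full-scale input of 1.0f yields 2147483520 instead of INT32_MAX, which is the "slightly inaccurate but close enough" result the commit message refers to. Using a double intermediate would allow clipping to exactly INT32_MAX, at the speed cost the message mentions.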
Diego Biurrun e0c6cce447 x86: Replace checks for CPU extensions and flags by convenience macros
This separates code relying on inline from that relying on external
assembly and fixes instances where the coalesced check was incorrect.
2012-09-08 18:18:34 +02:00
Diego Biurrun 17337f54c0 x86: Split inline and external assembly #ifdefs 2012-08-31 01:53:25 +02:00
Diego Biurrun a886b279a0 x86: cosmetics: Comment some #endifs for better readability 2012-08-30 18:50:33 +02:00
Justin Ruggles 06e751a40f lavr: x86: optimized 6-channel flt to fltp conversion 2012-08-23 20:10:57 -04:00
Justin Ruggles e07c9705c8 lavr: x86: optimized 2-channel flt to fltp conversion 2012-08-23 20:10:57 -04:00
Justin Ruggles 5245c9f3ad lavr: x86: optimized 6-channel flt to s16p conversion 2012-08-23 20:10:57 -04:00
Justin Ruggles 31d0d7181d lavr: x86: optimized 2-channel flt to s16p conversion 2012-08-23 20:10:57 -04:00
Justin Ruggles 6092dafb5a lavr: x86: optimized 6-channel s16 to fltp conversion 2012-08-23 20:10:57 -04:00
Justin Ruggles 91851a7b37 lavr: x86: optimized 2-channel s16 to fltp conversion 2012-08-23 20:10:57 -04:00
Justin Ruggles 205ace8843 lavr: x86: optimized 6-channel s16 to s16p conversion 2012-08-23 20:10:57 -04:00
Justin Ruggles 8eeffa8ada lavr: x86: optimized 2-channel s16 to s16p conversion 2012-08-23 20:10:57 -04:00
Justin Ruggles b66e20d2aa lavr: x86: optimized 2-channel fltp to flt conversion 2012-08-23 20:10:56 -04:00
Justin Ruggles d5b4e50c47 lavr: x86: optimized 6-channel fltp to s16 conversion 2012-08-23 20:10:56 -04:00
Justin Ruggles a58a013980 lavr: x86: optimized 2-channel fltp to s16 conversion 2012-08-23 20:10:56 -04:00
Justin Ruggles 90cc27f813 lavr: x86: optimized 6-channel s16p to flt conversion 2012-08-23 20:10:56 -04:00
Justin Ruggles 46f929adad lavr: x86: optimized 2-channel s16p to flt conversion 2012-08-23 20:10:56 -04:00
Justin Ruggles 13df7d2d40 lavr: x86: optimized 6-channel s16p to s16 conversion 2012-08-23 20:10:56 -04:00
Justin Ruggles c0e12535aa lavr: x86: optimized 2-channel s16p to s16 conversion 2012-08-23 20:10:56 -04:00
Mans Rullgard a3df4781f4 x86: add colons after labels
nasm prints a warning if the colon is missing.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-07 15:20:56 +01:00
Justin Ruggles e9da9a3111 lavr: x86: improve non-SSE4 version of S16_TO_S32_SX macro
Removes a false dependency on existing contents of the 2nd dst register,
giving better performance for out-of-order execution (OOE).
2012-07-27 14:21:32 -04:00
Justin Ruggles 2f096bb10e lavr: add x86-optimized mixing functions
Adds optimized functions for mixing 3 through 8 input channels to 1 and 2
output channels in fltp or s16p format with flt coeffs.
2012-07-27 11:25:48 -04:00
Ronald S. Bultje 30b45d9c38 x86inc: automatically insert vzeroupper for YMM functions. 2012-07-26 13:43:16 -07:00
Justin Ruggles 0dadf9d1e9 lavr: x86: add missing vzeroupper in ff_mix_1_to_2_fltp_flt() 2012-07-25 15:41:25 -04:00
Justin Ruggles acd9948e74 lavr: x86: fix ff_conv_fltp_to_flt_6ch function prototypes
Changed to match the number of parameters in conv_func_interleave(), which is
how they are called. The change isn't strictly necessary because the 4th
parameter is not used, but the code is clearer if they match.
2012-06-26 12:29:35 -04:00
Justin Ruggles 14a34d90ad lavr: x86: merge some branches 2012-06-25 13:49:18 -04:00
Justin Ruggles 4e4dd71730 lavr: Add x86-optimized function for flt to s32 conversion 2012-06-18 16:16:59 -04:00
Justin Ruggles 6c63cbfe7a lavr: Add x86-optimized function for flt to s16 conversion 2012-06-18 16:16:59 -04:00
Justin Ruggles 97ce1ba867 lavr: Add x86-optimized functions for s32 to flt conversion 2012-06-18 16:16:59 -04:00
Justin Ruggles 5904f25b9f lavr: Add x86-optimized functions for s32 to s16 conversion 2012-06-18 16:16:59 -04:00
Justin Ruggles d721f67d0a lavr: Add x86-optimized functions for s16 to flt conversion 2012-06-18 16:16:59 -04:00
Justin Ruggles 1168e29df1 lavr: Add x86-optimized function for s16 to s32 conversion 2012-06-18 16:16:59 -04:00
Justin Ruggles f61ce90caa lavr: add x86-optimized functions for mixing 1-to-2 s16p with flt coeffs 2012-06-18 11:24:10 -04:00
Justin Ruggles 29f7490c46 lavr: add x86-optimized functions for mixing 1-to-2 fltp with flt coeffs 2012-06-18 11:24:10 -04:00
Justin Ruggles b75726cb79 lavr: add x86-optimized function for mixing 2 to 1 s16p with q8 coeffs 2012-05-29 15:33:25 -04:00
Justin Ruggles c140fb2cbc lavr: add x86-optimized functions for mixing 2 to 1 s16p with float coeffs 2012-05-29 15:33:18 -04:00