143 Commits

Author SHA1 Message Date
Loren Merritt 25cb0c1a1e x86inc: activate REP_RET automatically
Now RET checks whether it immediately follows a branch, so the
programmer doesn't have to keep track of that condition. REP_RET
is still needed manually when it's a branch target, but that's
much rarer.

The implementation involves lots of spurious labels, but that's OK
because we strip them.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:17:59 -04:00
Alex Smith 08fa828b3f avutil: Fix compilation with inline asm disabled on mingw
Because of -Werror=implicit-function-declaration the build will fail.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-09-22 00:50:32 +03:00
Diego Biurrun 79aec43ce8 x86: Add and use more convenience macros to check CPU extension availability 2013-08-29 13:07:37 +02:00
Diego Biurrun 8410d6e93c avutil: Refactor CPU extension availability macros 2013-08-28 23:54:14 +02:00
Diego Biurrun b78b10c4b7 avutil: Move internal CPU detection function declarations to private header 2013-08-28 23:54:14 +02:00
Diego Biurrun 3ac7fa81b2 Consistently use "cpu_flags" as variable/parameter name for CPU flags 2013-07-18 00:31:35 +02:00
Loren Merritt c8b920a9b7 lls/x86: use 3-operator vaddpd in ADDPD_MEM
Fixes build with yasm-1.1

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2013-07-02 10:15:09 +02:00
Loren Merritt 1221bb6239 x86: lpc: fix a segfault in av_evaluate_lls_sse2() 2013-06-30 23:11:19 +00:00
Loren Merritt b545179fdf x86: lpc: simd av_evaluate_lls
1.5x-1.8x faster on sandybridge

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-06-29 13:23:57 +02:00
Loren Merritt 502ab21af0 x86: lpc: simd av_update_lls
4x-6x faster on sandybridge

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-06-29 13:23:57 +02:00
Diego Biurrun 1fda184a85 avutil: Add av_cold attributes to init functions missing them 2013-05-04 22:48:05 +02:00
Christophe Gisquet 566b7a20fd x86: float dsp: butterflies_float SSE
97c -> 49c
Some codecs could benefit from more unrolling, but AAC doesn't.
2013-05-03 08:08:02 +02:00
Ronald S. Bultje b93b27edb0 dsputil: Make dsputil selectable
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-10 11:04:05 +03:00
Christophe Gisquet 2e81acc687 x86inc: Fix number of operands for cmp* instructions
cmp{p,s}{s,d} instructions do take an imm8 operand.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-04-09 23:55:30 +02:00
Diego Biurrun b6649ab503 cosmetics: Remove unnecessary extern keywords from function declarations 2013-03-27 14:21:45 +01:00
Ronald S. Bultje 0c0828ecc5 x86: Use simple nop codes for <= sse (rather than <= mmx)
The "CentaurHauls family 6 model 9 stepping 8" family of CPUs
(flags: fpu vme de pse tsc msr cx8 sep mtrr pge mov pat mmx fxsr sse
up rng rng_en ace ace_en) SIGILLs on long nop codes.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-02-19 22:33:19 +02:00
Diego Biurrun 4db96649ca avutil: Ensure that emms_c is always defined, even on non-x86 2013-02-14 19:29:04 +01:00
Diego Biurrun ab441e20ff avutil: Move emms code to x86-specific header 2013-02-14 17:37:34 +01:00
Ronald S. Bultje d56668bd80 floatdsp: move scalarproduct_float from dsputil to avfloatdsp.
This makes the aac decoder and all voice codecs independent of dsputil.
2013-01-22 11:55:42 -08:00
Ronald S. Bultje 42d3246948 floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.
Now, nellymoserenc and aacenc no longer depend on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
2013-01-22 11:55:42 -08:00
Ronald S. Bultje 55aa03b9f8 floatdsp: move vector_fmul_add from dsputil to avfloatdsp. 2013-01-22 11:55:42 -08:00
Martin Storsjö f4facd2ce7 x86: Add a Yasm-based emms() replacement
This provides a fallback when building with Yasm enabled, but neither
inline assembly, nor the _mm_empty intrinsic are available or enabled.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-01-18 22:02:13 +01:00
Diego Biurrun d633d12b2c x86inc: Add cvisible macro for C functions with public prefix
This allows defining externally visible library symbols.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-01-18 22:02:03 +01:00
Diego Biurrun ef5d41a553 x86inc: Rename "program_name" to "private_prefix"
The new name is more descriptive and will allow defining a separate
public prefix for externally visible library symbols.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-01-18 20:29:53 +01:00
Martin Storsjö 973b4d44f1 float_dsp: Add #ifdef HAVE_INLINE_ASM around vector_fmul_window
This fixes builds on 64bit MSVC.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-01-17 19:07:35 +02:00
Justin Ruggles e034cc6c60 lavc: Move vector_fmul_window to AVFloatDSPContext
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-16 10:45:45 +01:00
Diego Biurrun dae1d507af x86: Add PAVGB macro to abstract pavgb/pavgusb instruction via cpuflags 2013-01-15 17:29:43 +01:00
Diego Biurrun 320e1d0df3 x86: ABSB2: port to cpuflags 2013-01-15 11:18:51 +01:00
Diego Biurrun 094a7405e5 x86: ABSB: port to cpuflags 2013-01-15 11:18:51 +01:00
Diego Biurrun 51969a652c x86: ABS2: port to cpuflags 2013-01-14 21:56:55 +01:00
Diego Biurrun 5b4dfbffc2 x86: ABS1: port to cpuflags 2013-01-06 13:57:01 +01:00
Ronald S. Bultje a34d9ad969 lavc: merge latest x86inc.asm fixes with x264
Unbreak NASM support.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-12-19 07:27:33 +01:00
Janne Grunau 0995ad8db4 x86inc: fully concatenate tokens to fix macro expansion for nasm
Fixes build errors with nasm introduced in 6f40e9f070 for stack
memory alignment. Noticed by BugMaster.
2012-12-13 23:57:09 +01:00
Ronald S. Bultje 140367aff9 x86inc: fix stack alignment on win64
Signed-off-by: Martin Storsjö <martin@martin.st>
2012-12-12 21:30:49 +02:00
Ronald S. Bultje 6f40e9f070 x86inc: support stack mem allocation and re-alignment in PROLOGUE
Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-12-12 05:23:46 +01:00
Justin Ruggles 1c012e6bfb x86: float_dsp: fix loading of the len parameter on x86-32 2012-12-07 21:19:29 -05:00
Justin Ruggles ecc8b02194 x86: float_dsp: fix compilation of ff_vector_dmul_scalar_avx() on x86-32
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2012-12-06 14:11:15 +01:00
Justin Ruggles b30a363331 x86: af_volume: add SSE2/SSSE3/AVX-optimized s32 volume scaling 2012-12-05 11:23:37 -05:00
Justin Ruggles ac7eb4cb20 float_dsp: add vector_dmul_scalar() to multiply a vector of doubles
Include x86-optimized versions for SSE2 and AVX.
2012-12-05 11:23:36 -05:00
Diego Biurrun 490df522c7 x86: cpu: Drop unused HAVE_RWEFLAGS condition
The test for rweflags was dropped in a previous commit.
2012-11-28 00:28:09 +01:00
Justin Ruggles 947f933687 x86: float_dsp: add SSE version of vector_fmul_scalar() 2012-11-26 11:30:19 -05:00
Diego Biurrun 87af05c575 x86: SPLATD: port to cpuflags 2012-11-18 18:34:05 +01:00
Diego Biurrun 26301caaa1 x86: mmx2 ---> mmxext in asm constructs 2012-11-14 00:58:51 +01:00
Diego Biurrun 2b479bcab0 build: Drop AVX assembly ifdefs
An assembler able to cope with AVX instructions is now required.
2012-11-11 20:43:28 +01:00
Diego Biurrun f0d124f005 x86inc: Set program_name outside of x86inc.asm
This reduces the local difference to the x264 upstream version.
2012-11-11 11:06:19 +01:00
Diego Biurrun 4b60fac419 x86: PALIGNR: port to cpuflags 2012-11-09 21:31:31 +01:00
Diego Biurrun dbb37e7711 x86: PABSW: port to cpuflags 2012-11-05 14:51:10 +01:00
Diego Biurrun 0a7a94f2e5 x86: Refactor PSWAPD fallback implementations and port to cpuflags 2012-11-02 17:05:29 +01:00
Diego Biurrun 26f01bd106 x86: PMINUB: port to cpuflags 2012-11-02 15:38:15 +01:00
Diego Biurrun 61bc2bc7d4 x86util: Add cpuflags_mmxext alias for cpuflags_mmx2
"mmxext" is a more sensible name and more common in outside projects.
2012-11-02 15:22:34 +01:00