Commit Graph

119 Commits

Author SHA1 Message Date
Martin Storsjö 973b4d44f1 float_dsp: Add #ifdef HAVE_INLINE_ASM around vector_fmul_window
This fixes builds on 64bit MSVC.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-01-17 19:07:35 +02:00
Justin Ruggles e034cc6c60 lavc: Move vector_fmul_window to AVFloatDSPContext
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-16 10:45:45 +01:00
Diego Biurrun dae1d507af x86: Add PAVGB macro to abstract pavgb/pavgusb instruction via cpuflags 2013-01-15 17:29:43 +01:00
Diego Biurrun 320e1d0df3 x86: ABSB2: port to cpuflags 2013-01-15 11:18:51 +01:00
Diego Biurrun 094a7405e5 x86: ABSB: port to cpuflags 2013-01-15 11:18:51 +01:00
Diego Biurrun 51969a652c x86: ABS2: port to cpuflags 2013-01-14 21:56:55 +01:00
Diego Biurrun 5b4dfbffc2 x86: ABS1: port to cpuflags 2013-01-06 13:57:01 +01:00
Ronald S. Bultje a34d9ad969 lavc: merge latest x86inc.asm fixes with x264
Unbreak NASM support.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-12-19 07:27:33 +01:00
Janne Grunau 0995ad8db4 x86inc: fully concatenate tokens to fix macro expansion for nasm
Fixes build errors with nasm introduced in 6f40e9f070 for stack
memory alignment. Noticed by BugMaster.
2012-12-13 23:57:09 +01:00
Ronald S. Bultje 140367aff9 x86inc: fix stack alignment on win64
Signed-off-by: Martin Storsjö <martin@martin.st>
2012-12-12 21:30:49 +02:00
Ronald S. Bultje 6f40e9f070 x86inc: support stack mem allocation and re-alignment in PROLOGUE
Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-12-12 05:23:46 +01:00
Justin Ruggles 1c012e6bfb x86: float_dsp: fix loading of the len parameter on x86-32 2012-12-07 21:19:29 -05:00
Justin Ruggles ecc8b02194 x86: float_dsp: fix compilation of ff_vector_dmul_scalar_avx() on x86-32
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2012-12-06 14:11:15 +01:00
Justin Ruggles b30a363331 x86: af_volume: add SSE2/SSSE3/AVX-optimized s32 volume scaling 2012-12-05 11:23:37 -05:00
Justin Ruggles ac7eb4cb20 float_dsp: add vector_dmul_scalar() to multiply a vector of doubles
Include x86-optimized versions for SSE2 and AVX.
2012-12-05 11:23:36 -05:00
Diego Biurrun 490df522c7 x86: cpu: Drop unused HAVE_RWEFLAGS condition
The test for rweflags was dropped in a previous commit.
2012-11-28 00:28:09 +01:00
Justin Ruggles 947f933687 x86: float_dsp: add SSE version of vector_fmul_scalar() 2012-11-26 11:30:19 -05:00
Diego Biurrun 87af05c575 x86: SPLATD: port to cpuflags 2012-11-18 18:34:05 +01:00
Diego Biurrun 26301caaa1 x86: mmx2 ---> mmxext in asm constructs 2012-11-14 00:58:51 +01:00
Diego Biurrun 2b479bcab0 build: Drop AVX assembly ifdefs
An assembler able to cope with AVX instructions is now required.
2012-11-11 20:43:28 +01:00
Diego Biurrun f0d124f005 x86inc: Set program_name outside of x86inc.asm
This reduces the local difference to the x264 upstream version.
2012-11-11 11:06:19 +01:00
Diego Biurrun 4b60fac419 x86: PALIGNR: port to cpuflags 2012-11-09 21:31:31 +01:00
Diego Biurrun dbb37e7711 x86: PABSW: port to cpuflags 2012-11-05 14:51:10 +01:00
Diego Biurrun 0a7a94f2e5 x86: Refactor PSWAPD fallback implementations and port to cpuflags 2012-11-02 17:05:29 +01:00
Diego Biurrun 26f01bd106 x86: PMINUB: port to cpuflags 2012-11-02 15:38:15 +01:00
Diego Biurrun 61bc2bc7d4 x86util: Add cpuflags_mmxext alias for cpuflags_mmx2
"mmxext" is a more sensible name and more common in outside projects.
2012-11-02 15:22:34 +01:00
Diego Biurrun 012f73e271 x86inc: Only define program_name if the macro is unset
This allows overriding the value from outside of the file.
2012-11-02 14:38:00 +01:00
Dave Yeo 9c167914a1 x86: Fix assembly with NASM
Unlike YASM, NASM only looks for include files in the current
directory, not in the directory that included files reside in.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-10-31 10:20:35 +01:00
Diego Biurrun 588fafe7f3 x86: MMX2 ---> MMXEXT in macro names 2012-10-31 01:04:55 +01:00
Diego Biurrun 6860b4081d x86: include x86inc.asm in x86util.asm
This is necessary to allow refactoring some x86util macros with cpuflags.
2012-10-31 00:37:42 +01:00
Ronald S. Bultje 08b028c18d Remove INIT_AVX from x86inc.asm. 2012-10-29 14:51:14 -07:00
Diego Biurrun a7329e5fc2 x86: get_cpu_flags: add necessary ifdefs around function body
ff_get_cpu_flags_x86() requires cpuid(), which is conditionally defined
elsewhere in the file.  Surrounding the function body with ifdefs allows
building even when cpuid is not defined.  An empty cpuflags mask is
returned in this case.
2012-10-04 19:29:14 +02:00
Diego Biurrun f6fbce761e x86: Drop CPU detection intrinsics
Now that there is CPU detection in YASM, there will always be one of
inline or external assembly enabled, which obviates the need to fall
back on CPU detection through compiler intrinsics.
2012-10-04 19:29:14 +02:00
Diego Biurrun 1f6d86991f x86: Add YASM implementations of cpuid and xgetbv from x264
This allows detecting CPU features with builds that have neither
gcc inline assembly nor the right compiler intrinsics enabled.
2012-10-04 19:29:14 +02:00
Diego Biurrun 54b243141e x86: cpu: Break out test for cpuid capabilities into separate function 2012-10-04 18:09:21 +02:00
Diego Biurrun cc5e9e5ff0 x86: ff_get_cpu_flags_x86(): Avoid a pointless variable indirection 2012-10-04 17:58:42 +02:00
Diego Biurrun e0c6cce447 x86: Replace checks for CPU extensions and flags by convenience macros
This separates code relying on inline from that relying on external
assembly and fixes instances where the coalesced check was incorrect.
2012-09-08 18:18:34 +02:00
Justin Ruggles 7327525997 x86: float_dsp: fix ff_vector_fmac_scalar_avx() on Win64
The SWAP macro does not work for explicit xmm/ymm usage, so instead just move
the scalar value from xmm2 to xmm0.
2012-09-07 14:49:10 -04:00
Diego Biurrun f82c4fb27f x86: Add convenience macros to check for CPU extensions and flags 2012-09-04 01:44:59 +02:00
Diego Biurrun 17337f54c0 x86: Split inline and external assembly #ifdefs 2012-08-31 01:53:25 +02:00
Diego Biurrun a886b279a0 x86: cosmetics: Comment some #endifs for better readability 2012-08-30 18:50:33 +02:00
Loren Merritt 7a1944b907 vf_hqdn3d: x86 asm
13% faster on penryn, 16% on sandybridge, 15% on bulldozer
Not simd; a compiler should have generated this, but gcc didn't.
2012-08-26 10:49:14 +00:00
Justin Ruggles 6092dafb5a lavr: x86: optimized 6-channel s16 to fltp conversion 2012-08-23 20:10:57 -04:00
Mans Rullgard 5b170c0bea x86: remove FASTDIV inline asm
GCC 4.3 and later do the right thing with the plain C code.  Earlier
versions in 32-bit mode generate one extra instruction, needlessly
zeroing what would be the high half of the shifted value.  At least
two gcc configurations miscompile the inline asm in some situations.

In 64-bit mode, all gcc versions generate imul r64, r64 followed by
shr.  On Intel i7 and later, this imul is faster 32-bit mul.  On
older Intel and all AMD, it is slightly slower.  On Atom it is much
slower.

Considering where the FASTDIV macro is used, any overall negative
performance impact of this change should be negligible.  If anyone
cares, they should file a bug against gcc and get the instruction
selection fixed.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-22 14:29:10 +01:00
Martin Storsjö 33e112847d Add more missing includes after removing the implicit common.h
Signed-off-by: Martin Storsjö <martin@martin.st>
2012-08-16 10:49:54 +03:00
Martin Storsjö 70766c2182 Add some more missing includes after removing the implicit common.h
Signed-off-by: Martin Storsjö <martin@martin.st>
2012-08-15 23:48:48 +03:00
Mans Rullgard 070a402b60 x86: move MANGLE() and related macros to libavutil/x86/asm.h
These x86-specific macros do not belong in generic code.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-09 00:58:20 +01:00
Mans Rullgard c318626ce2 x86: rename libavutil/x86_cpu.h to libavutil/x86/asm.h
This puts x86-specific things in the x86/ subdirectory where they
belong.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-09 00:58:20 +01:00
Mans Rullgard edd8226795 x86: fix build with nasm 2.08
It appears that something goes wrong in old nasm versions when the
%+ operator is used in the last argument of a macro invocation and
this argument is tested with %ifdef within the macro.  This patch
rearranges the macro arguments such that the %+ operator is never
used in the last argument.
2012-08-07 15:24:34 +01:00
Mans Rullgard 180d43bc67 x86: use nop cpu directives only if supported
nasm does not support 'CPU foonop' directives.  This adds a configure
test for the directive and uses it only if supported.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-07 15:22:20 +01:00