ffmpeg/libavutil/x86
Mans Rullgard 5b170c0bea x86: remove FASTDIV inline asm
GCC 4.3 and later do the right thing with the plain C code.  Earlier
versions in 32-bit mode generate one extra instruction, needlessly
zeroing what would be the high half of the shifted value.  At least
two gcc configurations miscompile the inline asm in some situations.

In 64-bit mode, all gcc versions generate imul r64, r64 followed by
shr.  On Intel i7 and later, this imul is faster than 32-bit mul.  On
older Intel and all AMD, it is slightly slower.  On Atom it is much
slower.

Considering where the FASTDIV macro is used, any overall negative
performance impact of this change should be negligible.  If anyone
cares, they should file a bug against gcc and get the instruction
selection fixed.
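
For illustration only, here is a minimal sketch of the kind of plain C
multiply-and-shift division the message refers to.  The macro body and
the reciprocal table are not shown on this page, so the names
fastdiv_sketch, inverse_of and fastdiv_selfcheck are hypothetical, and
the reciprocal is computed on the fly rather than read from a table.
The result is assumed to match a / b only for limited operand ranges
(roughly while a * b stays below 2^32).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: ceil(2^32 / b), the kind of value a precomputed
 * reciprocal table would hold.  Only meaningful for b >= 2 here. */
static uint32_t inverse_of(uint32_t b)
{
    assert(b >= 2);
    return (uint32_t)(((UINT64_C(1) << 32) + b - 1) / b);
}

/* Plain C form: widen to 64 bits, multiply by the reciprocal, and keep
 * the high 32 bits of the product.  A compiler may lower this to a
 * single multiply (plus a shift in 64-bit mode). */
static uint32_t fastdiv_sketch(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * inverse_of(b)) >> 32);
}

/* Returns 1 if the sketch agrees with a / b over a small range. */
static int fastdiv_selfcheck(void)
{
    for (uint32_t b = 2; b < 256; b++)
        for (uint32_t a = 0; a < 10000; a++)
            if (fastdiv_sketch(a, b) != a / b)
                return 0;
    return 1;
}
```

Writing the operation this way leaves instruction selection to the
compiler, which is the trade-off the message describes.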

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-22 14:29:10 +01:00
..
asm.h x86: move MANGLE() and related macros to libavutil/x86/asm.h 2012-08-09 00:58:20 +01:00
bswap.h x86: place some inline asm under #if HAVE_INLINE_ASM 2012-06-25 13:23:12 +01:00
cpu.c x86: rename libavutil/x86_cpu.h to libavutil/x86/asm.h 2012-08-09 00:58:20 +01:00
float_dsp_init.c float_dsp: add x86-optimized functions for vector_fmac_scalar() 2012-06-18 18:01:14 -04:00
float_dsp.asm x86: add colons after labels 2012-08-07 15:20:56 +01:00
intreadwrite.h
Makefile Add a float DSP framework to libavutil 2012-06-08 13:14:38 -04:00
timer.h x86/timer: implement an intrinsic-based version for rdtsc (AV_READ_TIME). 2012-07-07 13:35:07 -07:00
w64xmmtest.h Add more missing includes after removing the implicit common.h 2012-08-16 10:49:54 +03:00
x86inc.asm x86: fix build with nasm 2.08 2012-08-07 15:24:34 +01:00
x86util.asm dsputil: x86: add SHUFFLE_MASK_W macro 2012-07-22 16:56:58 -04:00