Commit Graph

21 Commits

Author SHA1 Message Date
Dale Curtis 50e30d9bb7 Don't use _tzcnt instrinics with clang for windows w/o BMI.
Technically _tzcnt* intrinsics are only available when the BMI
instruction set is present. However the instruction encoding
degrades to "rep bsf" on older processors.

Clang for Windows debatably restricts the _tzcnt* instrinics behind
the __BMI__ architecture define, so check for its presence or
exclude the usage of these intrinics when clang is present.

See also:
https://ffmpeg.org/pipermail/ffmpeg-devel/2015-November/183404.html
https://bugs.llvm.org/show_bug.cgi?id=30506
http://lists.llvm.org/pipermail/cfe-dev/2016-October/051034.html

Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
Reviewed-by: Matt Oliver <protogonoi@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-10-25 21:50:37 +02:00
Matt Oliver 5ca44ebd99 lavu/intmath.h: fix compilation with msvc10.
Signed-off-by: Matt Oliver <protogonoi@gmail.com>
2016-06-13 13:49:24 +10:00
James Almer a2e1b66460 x86/intmath: disable sse av_clip functions when using ICC
It seems to miscompile them

Should fix fate-ra-288 and fate-twinvq

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
2016-01-21 16:50:51 -03:00
James Almer 36778627e2 x86/intmath: add missing early clobber to output operands
Signed-off-by: James Almer <jamrial@gmail.com>
2016-01-15 13:32:58 -03:00
James Almer f4c1a48483 x86/intmath: add sse optimized av_clipf and av_clipd
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2016-01-07 14:24:01 -03:00
Matt Oliver 58d32c00be avutil/x86/intmath: Fix intrinsic header include when using newer gcc with older icc.
Signed-off-by: Matt Oliver <protogonoi@gmail.com>
2015-11-12 16:54:08 +11:00
Matt Oliver 9105399060 avutil/x86/intmath: Disable use of tzcnt on older intel compilers.
ICC versions older than atleast 12.1.6 dont have the tzcnt intrinsics.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Matt Oliver <protogonoi@gmail.com>
2015-11-11 10:18:08 +11:00
Matt Oliver f984174512 avutil/x86/intmath: Correct intrinsic headers for older compilers.
Signed-off-by: Matt Oliver <protogonoi@gmail.com>
2015-11-09 21:40:33 +11:00
Matt Oliver bff009697d avutil/x86/intmath: Add missing header.
Signed-off-by: Matt Oliver <protogonoi@gmail.com>
2015-11-01 02:11:29 +11:00
Matt Oliver 6c6ac9cb17 avutil/x86/intmath: Use tzcnt in place of bsf.
Signed-off-by: Matt Oliver <protogonoi@gmail.com>
2015-10-31 23:11:32 +11:00
Matt Oliver b0bb1dc62d lavu/intmath.h: Move x86 only msvc/icl functions to x86 specific header.
Signed-off-by: Matt Oliver <protogonoi@gmail.com>
2015-10-19 13:40:51 +11:00
Matt Oliver 216cc1f6fe lavu/intmath.h: Add msvc/icl ctzll optimisations.
Signed-off-by: Matt Oliver <protogonoi@gmail.com>
2015-10-19 13:40:27 +11:00
James Almer 93e7b7fb34 avutil/x86/intmath: add missing check for inline assembly
Signed-off-by: James Almer <jamrial@gmail.com>
2015-06-27 14:33:53 -03:00
James Almer 1e51e517be avutil/x86/intmath: use bzhi gcc builtin in av_mod_uintp2()
Signed-off-by: James Almer <jamrial@gmail.com>
2015-06-27 12:56:55 -03:00
James Almer 60b9373dbd libavutil: add bmi2 optimized av_mod_uintp2
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-03-20 15:47:43 -03:00
James Almer bc65abc8d7 libavutil: add x86 optimized av_popcount
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-25 19:58:00 -03:00
Mans Rullgard 5b170c0bea x86: remove FASTDIV inline asm
GCC 4.3 and later do the right thing with the plain C code.  Earlier
versions in 32-bit mode generate one extra instruction, needlessly
zeroing what would be the high half of the shifted value.  At least
two gcc configurations miscompile the inline asm in some situations.

In 64-bit mode, all gcc versions generate imul r64, r64 followed by
shr.  On Intel i7 and later, this imul is faster 32-bit mul.  On
older Intel and all AMD, it is slightly slower.  On Atom it is much
slower.

Considering where the FASTDIV macro is used, any overall negative
performance impact of this change should be negligible.  If anyone
cares, they should file a bug against gcc and get the instruction
selection fixed.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-22 14:29:10 +01:00
Ronald S. Bultje 8123e0901f x86: place some inline asm under #if HAVE_INLINE_ASM
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-25 13:23:12 +01:00
Mans Rullgard 2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Måns Rullgård 2ed6f39944 Replace many includes of libavutil/common.h with what is actually needed
This reduces the number of false dependencies on header files and
speeds up compilation.

Originally committed as revision 22407 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-09 17:39:19 +00:00
Måns Rullgård 75fb5c24ed Move FASTDIV macro to intmath.h
Originally committed as revision 21335 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-19 23:25:36 +00:00