Commit Graph

13 Commits

Author SHA1 Message Date
Lynne bbe95f7353
x86: replace explicit REP_RETs with RETs
From x86inc:
> On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either
> a branch or a branch target. So switch to a 2-byte form of ret in that case.
> We can automatically detect "follows a branch", but not a branch target.
> (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.)

x86inc can automatically determine whether to use REP_RET rather than
REP in most of these cases, so impact is minimal. Additionally, a few
REP_RETs were used unnecessary, despite the return being nowhere near a
branch.

The only CPUs affected were AMD K10s, made between 2007 and 2011, 16
years ago and 12 years ago, respectively.

In the future, everyone involved with x86inc should consider dropping
REP_RETs altogether.
2023-02-01 04:23:55 +01:00
Andreas Rheinhardt 4e51e48ebd swresample/x86/rematrix: Remove obsolete MMX functions
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2) for x64. So given that the only systems that
benefit from these functions are truely ancient 32bit x86s
they are removed.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2022-06-14 01:28:29 +02:00
James Almer f37a5dcb55 swresample/x86: add missing colon to labels
Silences warnings with Nasm

Signed-off-by: James Almer <jamrial@gmail.com>
2015-07-26 02:51:13 -03:00
Ronald S. Bultje ad75d2b590 x86: Fix compilation with nasm on PPC & OS/2
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 12:36:19 +02:00
Michael Niedermayer 3174616f59 Merge commit '6860b4081d046558c44b1b42f22022ea341a2a73'
* commit '6860b4081d046558c44b1b42f22022ea341a2a73':
  x86: include x86inc.asm in x86util.asm
  cng: Reindent some incorrectly indented lines
  cngdec: Allow flushing the decoder
  cngdec: Make the dbov variable have the right unit
  cngdec: Fix the memset size to cover the full array
  cngdec: Update the LPC coefficients after averaging the reflection coefficients
  configure: fix print_config() with broke awks

Conflicts:
	libavcodec/x86/ac3dsp.asm
	libavcodec/x86/dct32.asm
	libavcodec/x86/deinterlace.asm
	libavcodec/x86/dsputil.asm
	libavcodec/x86/dsputilenc.asm
	libavcodec/x86/fft.asm
	libavcodec/x86/fmtconvert.asm
	libavcodec/x86/h264_chromamc.asm
	libavcodec/x86/h264_deblock.asm
	libavcodec/x86/h264_deblock_10bit.asm
	libavcodec/x86/h264_idct.asm
	libavcodec/x86/h264_idct_10bit.asm
	libavcodec/x86/h264_intrapred.asm
	libavcodec/x86/h264_intrapred_10bit.asm
	libavcodec/x86/h264_weight.asm
	libavcodec/x86/vc1dsp.asm
	libavcodec/x86/vp3dsp.asm
	libavcodec/x86/vp56dsp.asm
	libavcodec/x86/vp8dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-31 13:43:33 +01:00
Carl Eugen Hoyos 52be5428c0 Add some missing _EXTERNAL suffixes to yasm source files. 2012-08-31 15:39:03 +02:00
Michael Niedermayer 68712ce820 swr/x86: 16bit integer mix functions need SSE2 not SSE
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-07 20:52:34 +02:00
Michael Niedermayer 5f8f6243ef swr: fix 10l use of uninitialized data
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-13 13:43:42 +02:00
Michael Niedermayer 728f86edfc swr: mix_2_1_int16_mmx/sse
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-12 17:49:12 +02:00
Michael Niedermayer d504266cef swr: mix_1_1_int16_sse
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-12 16:43:19 +02:00
Michael Niedermayer cbeeaf2593 swr: mix_1_1 int16 MMX
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-12 16:35:13 +02:00
Michael Niedermayer 52afa43691 swr: mix_2_1_float SSE/AVX
Based-on code by Justin Ruggles
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-12 16:35:13 +02:00
Michael Niedermayer beb0cd6acf swr: SIMD rematrixing and SSE/AVX mix_1_1 float
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-12 16:35:07 +02:00