Commit Graph

607 Commits

Author SHA1 Message Date
Diego Biurrun ec36aa6944 x86: Fix linking with some or all of yasm, mmx, optimizations disabled
Some optimized template functions reference optimized symbols, so they
must be explicitly disabled when those symbols are unavailable.
2012-08-30 19:37:32 +02:00
Diego Biurrun a886b279a0 x86: cosmetics: Comment some #endifs for better readability 2012-08-30 18:50:33 +02:00
Diego Biurrun 2e6f93a284 x86: Always compile files with functions that are called unconditionally 2012-08-29 00:27:06 +02:00
Diego Biurrun 2f2aa2e542 x86: mpegvideoenc: fix linking with --disable-mmx
The optimized dct_quantize template functions reference optimized
fdct symbols, so these functions must only be enabled if the relevant
optimizations have been enabled by configure.
2012-08-29 00:26:56 +02:00
Diego Biurrun d39791bf39 x86: mpegvideoenc: Do not abuse HAVE_ variables for template instantiation
This avoids trouble if HAVE_ variables are used elsewhere in the file.
2012-08-29 00:14:52 +02:00
Diego Biurrun bcc45d6348 x86: avcodec: Drop silly "_mmx" suffixes from filenames 2012-08-28 18:37:34 +02:00
Diego Biurrun efbd04c332 x86: avcodec: Drop silly "_sse" suffixes from filenames 2012-08-28 18:37:33 +02:00
Diego Biurrun 3f02c533f3 build: fft: x86: Drop unused YASM-OBJS-FFT- variable 2012-08-27 03:10:58 +02:00
Mans Rullgard db70730291 x86: fft: remove unused fft_dispatch* functions
These functions are not used since the yasm conversion.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-25 23:58:26 +01:00
Diego Biurrun dc40285427 x86: mpegvideo: more sensible names for optimization file and init function 2012-08-24 02:23:16 +02:00
Diego Biurrun d211547ddd x86: mpegvideoenc: Split optimizations off into a separate file 2012-08-24 02:23:16 +02:00
Diego Biurrun 26ce9aec03 dnxhdenc: x86: more sensible names for optimization file and init function 2012-08-24 02:23:15 +02:00
Diego Biurrun 6fa488678f build: x86: Only compile mpegvideo optimizations when necessary 2012-08-22 01:06:33 +02:00
Diego Biurrun 6961bdface x86: avcodec: Consistently name all init files 2012-08-16 11:05:38 +02:00
Martin Storsjö 1d9c2dc89a Don't include common.h from avutil.h
Signed-off-by: Martin Storsjö <martin@martin.st>
2012-08-15 22:32:06 +03:00
Diego Biurrun 29cfdd3767 x86: avcodec: Appropriately name files containing only init functions 2012-08-15 03:24:08 +02:00
Diego Biurrun be12958937 mpegvideo_mmx_template: drop some commented-out cruft 2012-08-15 03:24:07 +02:00
Mans Rullgard 8ec0204ee4 x86: cabac: allow building with suncc
This fixes two issues preventing suncc from building this code.

The undocumented 'a' operand modifier, causing gcc to omit a $ in
front of immediate operands (as required in addresses), is not
supported by suncc.  Luckily, the also undocumented 'c' modifer
has the same effect and is supported.

On some asm statements with a large number of operands, suncc for no
obvious reason fails to correctly substitute some of the operands.
Fortunately, some of the operands in these statements are plain
numbers which can be inserted directly into the code block instead
of passed as operands.

With these changes, the code builds correctly with both gcc and
suncc.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 14:51:52 +01:00
Mans Rullgard c8252e80eb x86: mlpdsp: avoid taking address of void
This code contains a C array of addresses of labels defined in
inline asm.  To do this, the names must be declared as external
in C.  The declared type does not matter since only the address is
used, and for some reason, the author of the code used the 'void'
type despite taking the address of a void expression being invalid.

Changing the type to char, a reasonable choice since the alignment
of the code labels cannot be known or guaranteed, eliminates gcc
warnings and allows building with suncc.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 14:51:52 +01:00
Diego Biurrun 3b9e832e17 x86: Drop silly "_yasm" suffixes from filenames 2012-08-12 17:13:05 +02:00
Mans Rullgard d7a4f8f8b9 Move MASK_ABS macro to libavcodec/mathops.h
This macro is only used in two places, both in libavcodec, so this
is a more sensible place for it.

Two small tweaks to the macro are made:

- removing the trailing semicolon
- dropping unnecessary 'volatile' from the x86 asm

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-09 00:58:20 +01:00
Mans Rullgard c318626ce2 x86: rename libavutil/x86_cpu.h to libavutil/x86/asm.h
This puts x86-specific things in the x86/ subdirectory where they
belong.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-09 00:58:20 +01:00
Dave Yeo 197439c1ef x86: pngdsp: Fix assembly for OS/2
The a.out object format does not allow aligning sections.
On OS/2 LD aligns sections to 16 bytes.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-08-08 15:45:09 +02:00
Mans Rullgard 2b140a3d09 x86: use 32-bit source registers with movd instruction
yasm tolerates mismatch between movd/movq and source register size,
adjusting the instruction according to the register.  nasm is more
strict.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-07 15:21:20 +01:00
Mans Rullgard a3df4781f4 x86: add colons after labels
nasm prints a warning if the colon is missing.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-07 15:20:56 +01:00
Anton Khirnov 36ef5369ee Replace all CODEC_ID_* with AV_CODEC_ID_* 2012-08-07 16:00:24 +02:00
Diego Biurrun 2096857551 x86: h264_idct: Rename x264_add8x4_idct_sse2 --> h264_add8x4_idct_sse2 2012-08-05 21:40:49 +02:00
Ronald S. Bultje 4a8143e73c fft: 3dnow: fix register name typo in DECL_IMDCT macro
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-08-04 00:16:02 +02:00
Diego Biurrun 0c3ff1982c x86: dct32: port to cpuflags 2012-08-03 22:51:06 +02:00
Diego Biurrun 239fdf1b4a x86: build: replace mmx2 by mmxext
Refactoring mmx2/mmxext YASM code with cpuflags will force renames.
So switching to a consistent naming scheme beforehand is sensible.
The name "mmxext" is more official and widespread and also the name
of the CPU flag, as reported e.g. by the Linux kernel.
2012-08-03 22:51:05 +02:00
Ronald S. Bultje da6505ad2f dsputil: make add_hfyu_left_prediction_sse4() support unaligned src.
This makes add_hfyu_left_prediction_sse4() handle sources that are not
16-byte aligned in its own function rather than by proxying the call to
add_hfyu_left_prediction_ssse3(). This fixes a crash on Win64, since the
sse4 version clobberes xmm6, but the ssse3 version (which uses MMX regs)
does not restore it, thus leading to XMM clobbering and RSP being off.

Fixes bug 342.
2012-08-03 11:09:14 -07:00
Diego Biurrun ca844b7be9 x86: Use consistent 3dnowext function and macro name suffixes
Currently there is a wild mix of 3dn2/3dnow2/3dnowext.  Switching to
"3dnowext", which is a more common name of the CPU flag, as reported
e.g. by the Linux kernel, unifies this.
2012-08-03 14:00:47 +02:00
Diego Biurrun 03737412a3 x86: proresdsp: improve SIGNEXTEND macro comments 2012-08-02 22:30:44 +02:00
Diego Biurrun 81905088a1 x86: h264dsp: K&R formatting cosmetics 2012-08-02 20:20:21 +02:00
Ronald S. Bultje c728518b3c x86: fft: fix imdct_half() for AVX
Some calculations were changed in b6a3849 to use mmsize, which was not correct
for the AVX version, which uses INIT_YMM and therefore has mmsize == 32.

Fixes Bug 341.

Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
2012-08-02 13:40:11 -04:00
Mans Rullgard ec7c501ed5 x86: remove libmpeg2 mmx(ext) idct functions
These functions are not faster than other mmx implementations on
any hardware I have been able to test on, and they are horribly
inaccurate.  There is thus no reason to ever use them.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-02 12:14:52 +01:00
Ronald S. Bultje b6a3849adb fft: port FFT/IMDCT 3dnow functions to yasm, and disable on x86-64.
64-bit CPUs always have SSE available, thus there is no need to compile
in the 3dnow functions. This results in smaller binaries.
2012-07-31 21:20:47 -07:00
Ronald S. Bultje 53dfaedc01 x86/dsputilenc: bury inline asm under HAVE_INLINE_ASM. 2012-07-31 20:28:52 -07:00
Diego Biurrun 6376a3ad24 x86: h264dsp: Remove unused variable ff_pb_3_1 2012-08-01 00:17:16 +02:00
Diego Biurrun 8728b381cb x86: h264dsp: Adjust YASM #ifdefs
This fixes compilation with YASM disabled.
2012-07-31 13:54:07 +02:00
Ronald S. Bultje b829b4ce29 h264: convert loop filter strength dsp function to yasm.
This completes the conversion of h264dsp to yasm; note that h264 also
uses some dsputil functions, most notably qpel. Performance-wise, the
yasm-version is ~10 cycles faster (182->172) on x86-64, and ~8 cycles
faster (201->193) on x86-32.
2012-07-30 19:39:47 -07:00
Ronald S. Bultje c83f44dba1 h264_idct_10bit: port x86 assembly to cpuflags. 2012-07-28 08:29:45 -07:00
Ronald S. Bultje b3c5ae5607 fft: rename "z" to "zc" to prevent name collision.
Without this, cglobal will expand "z" to "zh" to access the high byte
in a register's word, which causes a name collision with the ZH(x) macro
further up in this file.
2012-07-28 08:29:44 -07:00
Ronald S. Bultje 4d777eedfd vp3: don't compile mmx IDCT functions on x86-64.
64-bit CPUs always have SSE2, and a SSE2 version exists, thus the MMX
version will never be used.
2012-07-27 20:12:30 -07:00
Ronald S. Bultje a5bbb1242c h264_loopfilter: port x86 simd to cpuflags. 2012-07-27 20:12:11 -07:00
Ronald S. Bultje d07ff3cd5a h264_chromamc_10bit: port x86 simd to cpuflags. 2012-07-27 17:35:49 -07:00
Ronald S. Bultje 4a26fdd852 vp3: port x86 SIMD to cpuflags. 2012-07-27 17:35:49 -07:00
Ronald S. Bultje 76888c64b0 rv34: port x86 SIMD to cpuflags. 2012-07-27 15:13:26 -07:00
Ronald S. Bultje 158744a4cd vp56: only compile MMX SIMD on x86-32.
All x86-64 CPUs have SSE2, so the MMX version will never be used. This
leads to smaller binaries.
2012-07-27 14:40:27 -07:00
Ronald S. Bultje 2734ba787b vp56: port x86 simd to cpuflags. 2012-07-27 14:39:07 -07:00