Commit Graph

500 Commits

Author SHA1 Message Date
David Conrad c979fa030f Remove unused dequantization code from SSE VP3 IDCT
Originally committed as revision 15054 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-30 19:47:47 +00:00
David Conrad 167029a73a Use ff_pw_8 in MMX/SSE VP3 IDCT
Originally committed as revision 15053 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-30 19:41:42 +00:00
David Conrad 21383da8c4 Let ff_pw_8 be used as an SSE constant
Originally committed as revision 15052 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-30 19:40:21 +00:00
Vladimir Voroshilov 2ccddc0211 Add explicit (int) cast to i386 optimized MUL* macros.
Wrong result is returned when 16-bit value is passed as value.
Also fixes "Warning: using `%edx' instead of `%dx' due to `l' suffix".

Originally committed as revision 14981 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-26 19:38:17 +00:00
Alexis Ballier dad6afb4cb stricter constraints of asm() blocks
All these variables are used as left operands of a movd instruction,
which does accept only memory or register operands while the "g"
constraint also allows immediates. Use "rm" instead.
Patch by Alexis Ballier %alexis P ballier A gmail P com%

Originally committed as revision 14941 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-24 08:41:20 +00:00
Loren Merritt 7ca7d5fae0 file which should have been added in r14749
Originally committed as revision 14751 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-14 05:00:25 +00:00
Loren Merritt 75ac287517 missing prototype
Originally committed as revision 14750 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-14 04:41:02 +00:00
Loren Merritt ebceaa1cd5 gcc chokes on the 7 registers needed for float_to_int16_interleave6 (even inside HAVE_7REGS), so write it in yasm
Originally committed as revision 14749 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-14 04:40:46 +00:00
Loren Merritt ee46753739 gcc chokes on xmm constraints, so pessimize int32_to_float_fmul_scalar_sse a little
Originally committed as revision 14748 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-14 04:39:59 +00:00
Loren Merritt 675872382f special case 6 channel version of float_to_int16_interleave
5% faster ac3

Originally committed as revision 14744 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-13 23:36:37 +00:00
Loren Merritt 911e21a306 simd int->float
20% faster ac3 if downmixing, 15% if not

Originally committed as revision 14743 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-13 23:35:40 +00:00
Loren Merritt ac2e556456 simd downmix
13% faster ac3 if downmixing

Originally committed as revision 14742 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-13 23:33:48 +00:00
Loren Merritt 862b98d42c cosmetics in dsp init
Originally committed as revision 14704 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-12 00:51:45 +00:00
Loren Merritt 0a570e826d remove mdct tmp buffer
Originally committed as revision 14702 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-12 00:36:36 +00:00
Loren Merritt 46803f4f67 optimize imdct_half:
remove tmp buffer.
skip fft reinterleave pass, leaving data in a format more convenient for simd.
merge post-rotate with post-reorder.

Originally committed as revision 14700 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-12 00:33:34 +00:00
Loren Merritt 5d0ddd1a9f split-radix FFT
c is 1.9x faster than previous c (on various x86 cpus), sse is 1.6x faster than previous sse.

Originally committed as revision 14698 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-12 00:26:58 +00:00
Loren Merritt bafad220a7 import yasm macros from x264
Originally committed as revision 14697 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-11 23:54:09 +00:00
Uoti Urpala f769b746aa Mark add_png_paeth_prediction_* functions which are only used within this file
as static. patch by Uoti Urpala, uoti.urpala pp1.inet fi

Originally committed as revision 14509 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-08-02 17:32:55 +00:00
Michael Niedermayer 4f20b45fbe Fix h264_loop_filter_strength_mmx2() so it works with PAFF.
fixed at least:
CVFI1_Sony_D.jsv
CVFI1_SVA_C.264
MR6_BT_B.h264

Originally committed as revision 14310 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-19 21:53:54 +00:00
Loren Merritt 5eb0f2a425 float_to_int16_interleave: change src to an array of pointers instead of assuming it's contiguous.
this has no immediate effect, but will allow it to be used in more codecs.

Originally committed as revision 14252 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-16 00:50:12 +00:00
Loren Merritt 4342a7f30b 10l, float_to_int16_interleave_sse/3dnow wrote the wrong samples
Originally committed as revision 14236 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-15 04:11:30 +00:00
Loren Merritt b9fa32082c exploit mdct symmetry
2% faster vorbis on conroe, k8. 7% on celeron.

Originally committed as revision 14207 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-13 15:03:58 +00:00
Loren Merritt f27e1d645e simplify vorbis windowing
Originally committed as revision 14205 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-13 14:56:01 +00:00
Kostya Shishkov d7e1fc4254 SSE2 optimizations for Monkey's Audio decoder vector functions
Originally committed as revision 14161 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-11 04:48:38 +00:00
Alexander Strange bc31447225 Make the function prototype visible to comply with C99 inline.
Fixes building with gcc -std=gnu99.

Originally committed as revision 14140 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-09 17:51:57 +00:00
Michael Niedermayer e98750c373 float_to_int16_sse2()
20% faster than sse

Originally committed as revision 14138 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-09 07:21:12 +00:00
Victor Pollex 1835cda65a Make LOAD4/STORE4 macros more generic.
Patch by Victor Pollex victor pollex web de
Original thread: [PATCH] mmx implementation of vc-1 inverse transformations
Date: 06/21/2008 03:37 PM

Originally committed as revision 14108 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-08 09:24:11 +00:00
Michael Niedermayer 35ee72b1d7 1 c-asm loop less and 1x unroll of float_to_int16_sse()
25% faster

Originally committed as revision 14104 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-07 21:25:18 +00:00
Michael Niedermayer 560fa9bf51 Fix x86-64
Originally committed as revision 14103 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-07 21:04:29 +00:00
Michael Niedermayer 63b737d4f9 dont use C-asm loops and unroll once float_to_int16_3dnow()
30% faster

Originally committed as revision 14102 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-07 20:46:03 +00:00
Alexander Strange 74fd9022b5 Realign newlines.
Originally committed as revision 14023 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-06-28 18:30:50 +00:00
Alexander Strange 00969e1c59 Use MANGLE() instead of memory operands to read globals.
(fixes out of registers with apple gcc 4.2)

Originally committed as revision 14022 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-06-28 18:27:31 +00:00
Reimar Döffinger 00eebe3d6a Fix add_bytes_mmx and add_bytes_l2_mmx for w < 16
Originally committed as revision 13877 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-06-22 07:05:40 +00:00
Michael Niedermayer 0bd134abd3 Simplify vsad16_mmx2().
Originally committed as revision 13193 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-05-17 14:36:44 +00:00
Michael Niedermayer 6bf6a9301b Simplify vsad16_mmx().
Originally committed as revision 13191 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-05-17 14:35:14 +00:00
Michael Niedermayer e13810223a Simplify vsad_intra16_mmx2()
Originally committed as revision 13189 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-05-17 14:33:01 +00:00
Michael Niedermayer 06bb35f94c Simplify vsad_intra16_mmx()
Originally committed as revision 13188 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-05-17 14:31:10 +00:00
Diego Biurrun a12b44d7fb Add missing required header directly.
Originally committed as revision 13103 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-05-09 14:34:52 +00:00
Diego Biurrun 20cd685ae8 Add missing path to #include.
Originally committed as revision 13102 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-05-09 14:33:55 +00:00
Diego Biurrun 245976da2a Use full path for #includes from another directory.
Originally committed as revision 13098 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-05-09 11:56:36 +00:00
Ramiro Polla 40d0e665d0 Do not misuse long as the size of a register in x86.
typedef x86_reg as the appropriate size and use it instead.

Originally committed as revision 13081 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-05-08 21:11:24 +00:00
Diego Biurrun 57105ddd03 Rename i386/cputest.c --> i386/cpuid.c.
Originally committed as revision 13002 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-26 16:02:22 +00:00
Diego Biurrun c88c253d8b cosmetics: __asm__ __volatile__ --> asm volatile
Originally committed as revision 12885 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-17 21:57:52 +00:00
Diego Biurrun 80465c7eed cosmetics: Fix nonstandard indentation.
Originally committed as revision 12863 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-16 20:51:39 +00:00
Jeff Downs 591d87babe Cosmetics:
Break long lines.
Correct spelling in comment (duplicatin -> duplicating)

Originally committed as revision 12862 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-16 20:43:37 +00:00
Jeff Downs 52cb7981e2 Redo r12838, this time using svn copy to create h264_i386.h from cabac.h.
Move decode_significance_x86() and decode_significance_8x8_x86() to
i386-specific file from cabac.h.
New file is h264-oriented and only included from h264.c
Resolves compilation when configured with --disable-optimizations due to
decode_significance_8x8_x86 using last_coeff_flag_offset_8x8, which is
only defined in h264.c

Originally committed as revision 12846 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-16 04:40:21 +00:00
Jeff Downs 3aa9ede400 Revert 12838 to redo it the right way (use svn copy to create new
file based on old).

Originally committed as revision 12845 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-16 04:26:52 +00:00
Alexander Strange f73a6393e7 Add a new xvid-style IDCT using SSE2.
Originally committed as revision 12843 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-16 01:36:14 +00:00
Jeff Downs e6cfd8fffb Move decode_significance_x86() and decode_significance_8x8_x86() to
i386-specific file from cabac.h.
New file is h264-oriented and only included from h264.c
Resolves compilation when configured with --disable-optimizations due to
decode_significance_8x8_x86 using last_coeff_flag_offset_8x8, which is
only defined in h264.c

Originally committed as revision 12838 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-15 13:51:41 +00:00
Luca Barbato 3fbe711832 Eliminate movdqu in vp3dsp_sse2, patch from Alexander Strange astrangeAtithinkswDoTcom
Originally committed as revision 12824 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-14 20:54:23 +00:00