ffmpeg

Commit Graph

Author	SHA1	Message	Date
James Darnley	0a5814c9ba	yadif: x86 assembly for 9 to 14-bit samples. These smaller samples do not need to be unpacked to double words, allowing the code to process more pixels every iteration (still 2 in MMX but 6 in SSE2). It also avoids emulating the missing double word instructions on older instruction sets. Like with the previous code for 16-bit samples, this has been tested on an Athlon64 and a Core2Quad. Athlon64: 1809275 decicycles in C, 32718 runs, 50 skips; 911675 decicycles in mmx, 32727 runs, 41 skips, 2.0x faster; 495284 decicycles in sse2, 32747 runs, 21 skips, 3.7x faster. Core2Quad: 921363 decicycles in C, 32756 runs, 12 skips; 486537 decicycles in mmx, 32764 runs, 4 skips, 1.9x faster; 293296 decicycles in sse2, 32759 runs, 9 skips, 3.1x faster; 284910 decicycles in ssse3, 32759 runs, 9 skips, 3.2x faster. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-16 22:32:54 +01:00
James Darnley	17e7b49501	yadif: x86 assembly for 16-bit samples. This is a fairly dumb copy of the assembly for 8-bit samples but it works and produces identical output to the C version. The options have been tested on an Athlon64 and a Core2Quad. Athlon64: 1810385 decicycles in C, 32726 runs, 42 skips; 1080744 decicycles in mmx, 32744 runs, 24 skips, 1.7x faster; 818315 decicycles in sse2, 32735 runs, 33 skips, 2.2x faster. Core2Quad: 924025 decicycles in C, 32750 runs, 18 skips; 623995 decicycles in mmx, 32767 runs, 1 skips, 1.5x faster; 406223 decicycles in sse2, 32764 runs, 4 skips, 2.3x faster; 387842 decicycles in ssse3, 32767 runs, 1 skips, 2.4x faster; 307726 decicycles in sse4, 32763 runs, 5 skips, 3.0x faster. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-16 22:32:34 +01:00
Diego Biurrun	e66240f22e	avfilter: x86: consistent filenames for filter optimizations	2013-02-04 15:00:47 +01:00
Diego Biurrun	76d90125cd	vf_hqdn3d: x86: Add proper arch optimization initialization	2013-02-01 13:11:45 +01:00
Daniel Kang	899157b308	yadif: Port inline assembly to yasm. Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-01-09 18:41:02 +01:00
Justin Ruggles	f96f1e06a4	x86: af_volume: add SSE2-optimized s16 volume scaling	2012-12-05 11:23:37 -05:00
Diego Biurrun	f6c38c5f4e	avfilter: call x86 init functions under if (ARCH_X86), not if (HAVE_MMX)	2012-10-12 19:58:51 +02:00
Loren Merritt	7a1944b907	vf_hqdn3d: x86 asm 13% faster on penryn, 16% on sandybridge, 15% on bulldozer. Not simd; a compiler should have generated this, but gcc didn't.	2012-08-26 10:49:14 +00:00
Nolan L	d5f187fd33	Add gradfun filter, ported from MPlayer. Patch by Nolan L nol888 <=> gmail >=< com. See thread: Subject: [FFmpeg-devel] [PATCH] Port gradfun to libavfilter (GCI); Date: Mon, 29 Nov 2010 07:18:14 -0500. Originally committed as revision 25942 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-12-12 17:59:10 +00:00
Aurelien Jacobs	fa6f4ebc08	use a Makefile in x86 subdir. Originally committed as revision 25234 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-09-27 21:50:26 +00:00

10 Commits
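
The "Nx faster" figures quoted in the two yadif commits appear to be the C timing divided by the optimized timing, rounded to one decimal place. The sketch below checks that reading against the Athlon64 numbers from commit 0a5814c9ba. The helper name `speedup` and the rounding rule are assumptions for illustration, not something taken from the FFmpeg sources.

```python
def speedup(baseline_decicycles: int, optimized_decicycles: int) -> float:
    """Ratio of baseline time to optimized time, rounded to one decimal.

    Hypothetical helper: the rounding rule is an assumption inferred
    from the figures quoted in the commit messages.
    """
    return round(baseline_decicycles / optimized_decicycles, 1)


# Athlon64 timings quoted in commit 0a5814c9ba (9 to 14-bit samples).
athlon64_c = 1809275
athlon64_simd = {"mmx": 911675, "sse2": 495284}

for name, decicycles in athlon64_simd.items():
    print(f"{name}: {speedup(athlon64_c, decicycles)}x faster")
# prints:
# mmx: 2.0x faster
# sse2: 3.7x faster
```

The same calculation reproduces the Core2Quad figures and those in commit 17e7b49501, for example 924025 / 307726 rounds to the quoted 3.0x for sse4.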