Add x86 implementation using MMX/SSE. Originally committed as revision 21281 to svn://svn.ffmpeg.org/ffmpeg/trunk