4-operation form is preferred over 3-operation because it breaks a long
dependency chain, thus allowing a superscalar processor to execute more
operations in parallel.
The idea was taken from: http://www.zorinaq.com/papers/md5-amd64.html
AMD Athlon(tm) II X3 450 Processor, x86_64
$ for i in $(seq 1 4); do ./avutil_md5_test2; done
size: 1048576 runs: 1024 time: 5.821 +- 0.019
size: 1048576 runs: 1024 time: 5.822 +- 0.019
size: 1048576 runs: 1024 time: 5.841 +- 0.018
size: 1048576 runs: 1024 time: 5.821 +- 0.018
$ for i in $(seq 1 4); do ./avutil_md5_test2; done
size: 1048576 runs: 1024 time: 5.646 +- 0.019
size: 1048576 runs: 1024 time: 5.646 +- 0.018
size: 1048576 runs: 1024 time: 5.642 +- 0.019
size: 1048576 runs: 1024 time: 5.641 +- 0.019
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Where necessary use memcpy instead.
Thanks to Giorgio Vazzana [mywing81 gmail] for
spotting this loop as the cause for the bad
performance.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
They are essential to be able to use the utils without av_malloc()
That is for example use with malloc(), memalign(), some other
private allocation function, on the stack or others.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'e002e3291e6dc7953f843abf56fc14f08f238b21':
Use the new aes/md5/sha/tree allocation functions
avutil: Add functions for allocating opaque contexts for algorithms
svq3: fix pointer type warning
svq3: replace unsafe pointer casting with intreadwrite macros
parseutils-test: various cleanups
Conflicts:
doc/APIchanges
libavcodec/svq3.c
libavutil/parseutils.c
libavutil/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The current API where the plain size is exposed is not of much
use - in most cases it is allocated dynamically anyway.
If allocated e.g. on the stack via an uint8_t array, there's no
guarantee that the struct's members are aligned properly (unless
the array is overallocated and the opaque pointer within it
manually aligned to some unspecified alignment).
Signed-off-by: Martin Storsjö <martin@martin.st>
Basically to make code clearer and adherent to the
standard. RFC 1321, on page 2 states
Let the symbol "+" denote addition of words (i.e., modulo-2^32
addition). Let X <<< s denote the 32-bit value obtained by circularly
shifting (rotating) X left by s bit positions.
on page 3, section 3.3 states:
A four-word buffer (A,B,C,D) is used to compute the message digest.
Here each of A, B, C, D is a 32-bit register.
so the algorithm needs to work with integers that are exactly 32bits
in length. And indeed in struct AVMD5 the MD buffer is declared as
"uint32_t ABCD[4];", while in the function that performs the block
transformation the state variables were "unsigned int"s. On
architectures where sizeof(unsigned int) != sizeof(uint32_t) this
could be a problem, although I can't name such an architecture from
the top of my head.
On a side note, both the reference implementation in RFC 1321 and the
gnulib implementation (used by md5sum program on GNU systems) use
uint32_t in the transform function.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master: (40 commits)
H.264: template left MB handling
H.264: faster fill_decode_caches
H.264: faster write_back_*
H.264: faster fill_filter_caches
H.264: make filter_mb_fast support the case of unavailable top mb
Do not include log.h in avutil.h
Do not include pixfmt.h in avutil.h
Do not include rational.h in avutil.h
Do not include mathematics.h in avutil.h
Do not include intfloat_readwrite.h in avutil.h
Remove return statements following infinite loops without break
RTSP: Doxygen comment cleanup
doxygen: Escape '\' in Doxygen documentation.
md5: cosmetics
md5: use AV_WL32 to write result
md5: add fate test
md5: include correct headers
md5: fix test program
doxygen: Drop array size declarations from Doxygen parameter names.
doxygen: Fix parameter names to match the function prototypes.
...
Conflicts:
libavcodec/x86/dsputil_mmx.c
libavformat/flvenc.c
libavformat/oggenc.c
libavformat/wtv.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Other parts of FFmpeg use NE (native endian) rather than ME (machine).
This makes it consistent.
Originally committed as revision 24169 to svn://svn.ffmpeg.org/ffmpeg/trunk
This reduces the number of false dependencies on header files and
speeds up compilation.
Originally committed as revision 22407 to svn://svn.ffmpeg.org/ffmpeg/trunk
md5.c:150: warning: passing argument 2 of 'av_md5_update' from incompatible pointer type
Originally committed as revision 11665 to svn://svn.ffmpeg.org/ffmpeg/trunk
patch by Carl Eugen Hoyos cehoyos chez ag or at
original thread: [FFmpeg-devel] [PATCH] attribute_unused -> av_unused
date: 05/29/2007 01:23 PM
Originally committed as revision 9155 to svn://svn.ffmpeg.org/ffmpeg/trunk
depending on CONFIG_SMALL this can either be compiled to a fully unrolled kernel / rfc reference style md5 routine
or a single loop similar to what mplayer uses
Originally committed as revision 5565 to svn://svn.ffmpeg.org/ffmpeg/trunk