lavu/sha: Fully unroll the transform function loops
crypto_bench SHA-1 and SHA-256 results on an AMD Athlon X2 7750+, built with mingw32-w64 GCC 4.7.3 x86_64:
Before:
lavu SHA-1 size: 1048576 runs: 1024 time: 9.012 +- 0.162
lavu SHA-256 size: 1048576 runs: 1024 time: 19.625 +- 0.173
After:
lavu SHA-1 size: 1048576 runs: 1024 time: 7.948 +- 0.154
lavu SHA-256 size: 1048576 runs: 1024 time: 17.841 +- 0.170
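For illustration only, a minimal sketch of what fully unrolling a SHA-1 round loop can look like. It is not the code from this patch: the names sha1_schedule, sha1_transform_loop, sha1_transform_unrolled, RND, RND5 and RND20 are hypothetical, and only the 80-round loop is unrolled here while the message schedule stays a loop. The looped variant selects the round function on every iteration and shuffles five variables; the unrolled variant rotates the macro argument order instead, so each round is one addition chain plus one rotate.

```c
#include <stdint.h>

#define ROL(x, n) (((x) << (n)) | ((x) >> (32 - (n))))

/* Round functions from the SHA-1 specification. */
#define F0(b, c, d) (((b) & (c)) | (~(b) & (d)))
#define F1(b, c, d) ((b) ^ (c) ^ (d))
#define F2(b, c, d) (((b) & (c)) | ((b) & (d)) | ((c) & (d)))

/* Expand one 64-byte block into the 80-word message schedule. */
static void sha1_schedule(uint32_t w[80], const uint8_t buf[64])
{
    int i;
    for (i = 0; i < 16; i++)
        w[i] = (uint32_t)buf[4 * i] << 24 | (uint32_t)buf[4 * i + 1] << 16 |
               (uint32_t)buf[4 * i + 2] << 8 | (uint32_t)buf[4 * i + 3];
    for (; i < 80; i++)
        w[i] = ROL(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);
}

/* Looped reference: branches per round and shuffles a..e every iteration. */
static void sha1_transform_loop(uint32_t state[5], const uint8_t buf[64])
{
    uint32_t w[80], a, b, c, d, e, f, k, t;
    int i;

    sha1_schedule(w, buf);
    a = state[0]; b = state[1]; c = state[2]; d = state[3]; e = state[4];
    for (i = 0; i < 80; i++) {
        if (i < 20)      { f = F0(b, c, d); k = 0x5A827999; }
        else if (i < 40) { f = F1(b, c, d); k = 0x6ED9EBA1; }
        else if (i < 60) { f = F2(b, c, d); k = 0x8F1BBCDC; }
        else             { f = F1(b, c, d); k = 0xCA62C1D6; }
        t = ROL(a, 5) + f + e + k + w[i];
        e = d; d = c; c = ROL(b, 30); b = a; a = t;
    }
    state[0] += a; state[1] += b; state[2] += c; state[3] += d; state[4] += e;
}

/* One round without the shuffle: the caller rotates the argument order. */
#define RND(F, K, a, b, c, d, e, i) \
    do { e += ROL(a, 5) + F(b, c, d) + (K) + w[i]; b = ROL(b, 30); } while (0)

/* Five rounds bring the variables back to their original roles. */
#define RND5(F, K, i)                 \
    RND(F, K, a, b, c, d, e, (i));     \
    RND(F, K, e, a, b, c, d, (i) + 1); \
    RND(F, K, d, e, a, b, c, (i) + 2); \
    RND(F, K, c, d, e, a, b, (i) + 3); \
    RND(F, K, b, c, d, e, a, (i) + 4)

/* Twenty rounds share one round function and one constant. */
#define RND20(F, K, i) \
    RND5(F, K, (i)); RND5(F, K, (i) + 5); RND5(F, K, (i) + 10); RND5(F, K, (i) + 15)

/* Fully unrolled rounds: no loop counter, no per-round branch, no shuffle. */
static void sha1_transform_unrolled(uint32_t state[5], const uint8_t buf[64])
{
    uint32_t w[80], a, b, c, d, e;

    sha1_schedule(w, buf);
    a = state[0]; b = state[1]; c = state[2]; d = state[3]; e = state[4];
    RND20(F0, 0x5A827999, 0);
    RND20(F1, 0x6ED9EBA1, 20);
    RND20(F2, 0x8F1BBCDC, 40);
    RND20(F1, 0xCA62C1D6, 60);
    state[0] += a; state[1] += b; state[2] += c; state[3] += d; state[4] += e;
}
```

Both functions take the five-word chaining state and one already padded 64-byte block, and should produce identical state for any input. Whether the unrolled form is faster depends on the compiler and CPU, so it is worth re-running crypto_bench rather than assuming a gain.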
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>