Copy the AVX2 implementation from https://github.com/sneves/blake2-avx2.
Though upstream marks it as experimental, libsodium uses this version.
Signed-off-by: David Sterba <dsterba@suse.com>
Copy the implementation from https://github.com/BLAKE2/BLAKE2, add
runtime detection of SSE2 and add the switch function that selects the
implementation to use.
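
For illustration only, the runtime detection and switch described above
could look roughly like the sketch below. All names here
(blake2_sketch_*, compress_fn) are hypothetical and the compression
bodies are stubs; this is not the code added by this patch. It assumes
GCC or Clang on x86, where __builtin_cpu_supports() is available, and
falls back to the reference path everywhere else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature of a compression routine (not the real API). */
typedef void (*compress_fn)(uint64_t state[8], const uint8_t block[128]);

/* Stub for the portable reference path; only tags the state. */
static void blake2_sketch_compress_ref(uint64_t state[8],
				       const uint8_t block[128])
{
	(void)block;
	state[0] = 1;
}

/* Stub for the SSE2 path; a real one would use SSE2 intrinsics. */
static void blake2_sketch_compress_sse2(uint64_t state[8],
					const uint8_t block[128])
{
	(void)block;
	state[0] = 2;
}

/* Return 1 if the running CPU reports SSE2, 0 otherwise. */
static int blake2_sketch_cpu_has_sse2(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
	__builtin_cpu_init();
	return __builtin_cpu_supports("sse2") ? 1 : 0;
#else
	/* Assumption: without the builtin, report no SSE2. */
	return 0;
#endif
}

/* Currently selected implementation, reference path by default. */
static compress_fn blake2_sketch_impl = blake2_sketch_compress_ref;

/* The switch: pick the implementation once, based on CPU features. */
static void blake2_sketch_select(void)
{
	if (blake2_sketch_cpu_has_sse2())
		blake2_sketch_impl = blake2_sketch_compress_sse2;
	else
		blake2_sketch_impl = blake2_sketch_compress_ref;
}

/* Callers go through this wrapper and never name a specific path. */
static void blake2_sketch_compress(uint64_t state[8],
				   const uint8_t block[128])
{
	assert(blake2_sketch_impl != NULL);
	blake2_sketch_impl(state, block);
}
```

In this sketch the selection runs once at startup, so the per-block
cost is a single indirect call and the reference path stays usable on
CPUs or compilers where the check is not available.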
Signed-off-by: David Sterba <dsterba@suse.com>