On recent x86-64 system with march=native|<cpu>|<microarchitecture
level> gcc/clang will automatically define all the available vector
extensions macros. crypto/blake2-config.h then correctly set all the
HAVE_<EXTENSION> macros.
crypto/blake2-round.h then checks the HAVE_<EXTENSION> macros for
including further headers:
#if defined(HAVE_SSE41)
#include "blake2b-load-sse41.h"
#else
#include "blake2b-load-sse2.h"
#endif
which is wrong. On recent systems it always results in including
blake2b-load-sse41.h. crypto/blake2-round.h itself is included by
crypto/blake2b-sse2.c and now we have a SSE2/SSE4.1 code mixing
resulting in the incompatible type for argument build errors described
in #589.
The idea is to remove the lines above from crypto/blake2-round.h and put
the includes directly into crypto/blake2b-sse2.c and
crypto/blake2b-sse41.c respectively.
Note this slightly diverges from the upstream BLAKE2 sources.
Pull-request: #591
Author: Tino Mai <mai.tino@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Otherwise, the following error occurs:
| In file included from crypto/blake2b-sse2.c:30:
| crypto/blake2b-sse2.c: In function 'blake2b_compress_sse2':
| crypto/blake2b-round.h:32:22: warning: implicit declaration of function '_mm_shuffle_epi8'; did you mean '_mm_shuffle_epi32'? [-Wimplicit-function-declaration]
| 32 | : (-(c) == 24) ? _mm_shuffle_epi8((x), r24) \
| | ^~~~~~~~~~~~~~~~
Note: it's not yet clear what affects this build failure as it otherwise
builds in the tested configurations (gcc/clang, arch, distro), but it
apparently fixes the build in some environments.
Pull-request: #588
Signed-off-by: Alexander Kanavin <alex@linutronix.de>
[ add note ]
Signed-off-by: David Sterba <dsterba@suse.com>
Copy implementation from https://github.com/BLAKE2/BLAKE2, add runtime
detection of SSE2 and add the switch function.
Signed-off-by: David Sterba <dsterba@suse.com>