On recent x86-64 system with march=native|<cpu>|<microarchitecture
level> gcc/clang will automatically define all the available vector
extensions macros. crypto/blake2-config.h then correctly set all the
HAVE_<EXTENSION> macros.
crypto/blake2-round.h then checks the HAVE_<EXTENSION> macros for
including further headers:
#if defined(HAVE_SSE41)
#include "blake2b-load-sse41.h"
#else
#include "blake2b-load-sse2.h"
#endif
which is wrong. On recent systems it always results in including
blake2b-load-sse41.h. crypto/blake2-round.h itself is included by
crypto/blake2b-sse2.c and now we have a SSE2/SSE4.1 code mixing
resulting in the incompatible type for argument build errors described
in #589.
The idea is to remove the lines above from crypto/blake2-round.h and put
the includes directly into crypto/blake2b-sse2.c and
crypto/blake2b-sse41.c respectively.
Note this slightly diverges from the upstream BLAKE2 sources.
Pull-request: #591
Author: Tino Mai <mai.tino@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Otherwise, the following error occurs:
| In file included from crypto/blake2b-sse2.c:30:
| crypto/blake2b-sse2.c: In function 'blake2b_compress_sse2':
| crypto/blake2b-round.h:32:22: warning: implicit declaration of function '_mm_shuffle_epi8'; did you mean '_mm_shuffle_epi32'? [-Wimplicit-function-declaration]
| 32 | : (-(c) == 24) ? _mm_shuffle_epi8((x), r24) \
| | ^~~~~~~~~~~~~~~~
Note: it's not yet clear what affects this build failure as it otherwise
builds in the tested configurations (gcc/clang, arch, distro), but it
apparently fixes the build in some environments.
Pull-request: #588
Signed-off-by: Alexander Kanavin <alex@linutronix.de>
[ add note ]
Signed-off-by: David Sterba <dsterba@suse.com>
Add entries for all crypto backends and test the ones that match what
was configured with --with-crypto.
Signed-off-by: David Sterba <dsterba@suse.com>
Change what hash-speedtest benchmarks according to the
--with-crypto=backend option. Until now it would run the same version
under different names inherited from the builting.
At configure time detect availability of all backends and define all
macros.
Signed-off-by: David Sterba <dsterba@suse.com>
The build fails with crypto backends other than builtin, the
initializers cannot be reached as they're ifdef-ed out. Move
hash_init_accel under the right condition and delete the
algorithm-specific initializers as they're used only by the hash test
and that can simply call hash_init_accel to set the implementation.
All the -m flags need to be detected at configure time and the flag used
for ifdef (HAVE_CFLAG_m*), not the actual feature defined by compiler as
the dispatcher function is not built with the -m flags.
The uname check for x86_64 must be dropped so on i386/i586 we can still
build accelerated version.
Signed-off-by: David Sterba <dsterba@suse.com>
The crc32c selection has been already using the pointer-based approach
so drop the cpuid detection and use our common code for that.
Signed-off-by: David Sterba <dsterba@suse.com>
Change how sha256 implementation is selected. Instead of an if-else
check in the block processing function select the best version and assign
the function pointer. This is slightly faster.
At this point the selection is not implemented properly in
hash-speedtest so all results are from the fastest version. This will
be fixed once all algorithms are converted.
Signed-off-by: David Sterba <dsterba@suse.com>
Change how blake2 implementation is selected. Instead of an if-else check
inside blake2b_compress each time, select the best one and assign the
function pointer. This is slightly faster.
At this point the selection is not implemented properly in
hash-speedtest so all results are from the fastest version. This will
be fixed once all algorithms are converted.
Signed-off-by: David Sterba <dsterba@suse.com>
Prepare a single location that will detect or set accelerated versions
of hash algorithms. Right now it's the crc32c, blake2 and sha256 do
an if-else switch while crc32c sets a function pointer.
Signed-off-by: David Sterba <dsterba@suse.com>
Update xxhash implementation from https://github.com/Cyan4973/xxHash.
This has moved a lot of code so the diff is huge, plus the code we don't
need now for btrfs has been removed (XXH3, XXH32).
There's no significant change in performance.
Signed-off-by: David Sterba <dsterba@suse.com>
There are some stale headers that we don't need and the int types are
using the kernel types and pull kerncompat.h. As this is a basic header
that should minimize dependencies use the standard int types.
Signed-off-by: David Sterba <dsterba@suse.com>
Now that there are more implementations for the hashes test them all on
the vectors if the CPU supports that.
Signed-off-by: David Sterba <dsterba@suse.com>
Add test vectors that are longer that the internal block length.
Accelerated implementations may not be used on the short or unpaded
blocks but we need to test them as well.
Signed-off-by: David Sterba <dsterba@suse.com>
Copy sha256-x86.c from https://github.com/noloader/SHA-Intrinsics, that
uses the compiler intrinsics to implement the update step with the
native x86_64 instructions.
To avoid dependencies of the reference code and the x86 version, check
runtime support only if the compiler also supports -msha.
Signed-off-by: David Sterba <dsterba@suse.com>
Benchmark all accelerated implementations if the CPU supports them. Set
the level before each test, expecting that the implementation switches
the implementation dynamically.
Signed-off-by: David Sterba <dsterba@suse.com>
Copy AVX2 implementation from https://github.com/sneves/blake2-avx2 .
Though this is marked experimental, libsodium uses this version.
Signed-off-by: David Sterba <dsterba@suse.com>
Copy implementation from https://github.com/BLAKE2/BLAKE2, add runtime
detection of SSE2 and add the switch function.
Signed-off-by: David Sterba <dsterba@suse.com>
[FALSE ALERT]
Unlike gcc, clang doesn't really understand the comments, thus it's
reportings tons of fall through related errors:
cmds/reflink.c:124:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
case 'r':
^
cmds/reflink.c:124:3: note: insert '__attribute__((fallthrough));' to silence this warning
case 'r':
^
__attribute__((fallthrough));
cmds/reflink.c:124:3: note: insert 'break;' to avoid fall-through
case 'r':
^
break;
[CAUSE]
Although gcc is fine with /* fallthrough */ comments, clang is not.
[FIX]
So just introduce a fallthrough macro to handle the situation properly,
and use that macro instead.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Lots of code still uses fprintf(stderr, "...") that should be the
error() helper. The kernel-shared code is left out of the conversion for
now.
Signed-off-by: David Sterba <dsterba@suse.com>
There's an ancient macro btrfs_crc32c which is just wrapping crc32c and
not doing anything else, so we can use the crc helper directly.
Signed-off-by: David Sterba <dsterba@suse.com>
Add the GPL v2 header to files where it was missing and is not from an
external source, update to the most recent version with the address.
Signed-off-by: David Sterba <dsterba@suse.com>
Internally it's blake2b but for the user facing output or other command
line interfaces let's call it just BLAKE2.
Signed-off-by: David Sterba <dsterba@suse.com>
With explicit width the default alignment is to the right, using space
is a gnu extension. Fix the following warnings:
crypto/hash-speedtest.c: In function ‘main’:
crypto/hash-speedtest.c:152:15: warning: ' ' flag used with ‘%s’ gnu_printf format [-Wformat=]
152 | printf("% 12s: ", c->name);
| ^
crypto/hash-speedtest.c:172:21: warning: ' ' flag used with ‘%u’ gnu_printf format [-Wformat=]
172 | printf("%s: % 12llu, %s/i % 8llu",
| ^
crypto/hash-speedtest.c:172:34: warning: ' ' flag used with ‘%u’ gnu_printf format [-Wformat=]
172 | printf("%s: % 12llu, %s/i % 8llu",
| ^
Signed-off-by: David Sterba <dsterba@suse.com>
Add test vectors, a subset without keys as found in linux kernel sources
in crypto/test-mgr.h for all supported hash algorithms.
Signed-off-by: David Sterba <dsterba@suse.com>
https://github.com/smuellerDD/libkcapi allows user-space to access the
Linux kernel crypto API. Uses netlink interface and exports easy to use
APIs.
Signed-off-by: David Sterba <dsterba@suse.com>
For environments that require certified implementations of cryptographic
primitives allow to select a library providing them. The requirements
are SHA256 and BLAKE2 (with the 2b variant and 256 bit digest).
For now there are two: libgrcrypt and libsodium (openssl does not
provide the BLAKE2b-256). Accellerated versions are typically provided
and automatically selected.
Signed-off-by: David Sterba <dsterba@suse.com>
xxhash's state and results are always in little, but in progs after the
hash was calculated it was copied to the final buffer via memcpy,
meaning it'd be parsed as a big endian number on big endian machines.
This is incompatible with the kernel implementation of xxhash which
results in erroneous "checksum didn't match" errors on mount.
Fix it by using put_unaligned_le64 which always ensures the resulting
checksum will be copied in little endian format as the kernel expects
it.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206835
Fixes: f070ece2e9 ("btrfs-progs: add xxhash64 to mkfs")
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add definition, crypto wrappers and support to mkfs for blake2 for
checksumming. There are 2 aliases either blake2 or blake2b.
Signed-off-by: David Sterba <dsterba@suse.com>
Upstream commit 997fa5ba1e14b52c554fb03ce39e579e6f27b90c,
git repository: git://github.com/BLAKE2/BLAKE2
The reference implemetation added in this patch is unchanged and will be
modified only to compile in current code base and with minimal other
modifications in case of future sync with upstream code. IOW, the coding
style should stay as-is and does not conform to the other btrfs-progs
code. This is an exception for xxhash and sha256 code as well.
Signed-off-by: David Sterba <dsterba@suse.com>
Add the definition to the checksum types and let mkfs accept it.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
A simple tool to microbenchmark performance of the hashes. Uses rdtsc
for timing, so works only on x86_64.
$ make hash-speedtest
$ ./hash-speedtest [iterations]
Block size: 4096
Iterations: 100000
NULL-NOP: cycles: 56061823, c/i 560
NULL-MEMCPY: cycles: 61296469, c/i 612
CRC32C: cycles: 179961796, c/i 1799
XXHASH: cycles: 138434590, c/i 1384
Signed-off-by: David Sterba <dsterba@suse.com>
The SHA256 is going to be used in the future, so this makes it a second
user and we also have the appropriate directory now.
Signed-off-by: David Sterba <dsterba@suse.com>
With the introduction of xxhash64 to btrfs-progs we created a crypto/
directory for all the hashes used in btrfs (although no
cryptographically secure hash is there yet).
Move the crc32c implementation from kernel-lib/ to crypto/ as well so we
have all hashes consolidated.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Copy of xxhash.[ch] from git://github.com/Cyan4973/xxHash, version
v0.7.1. The include xxh3.h has been commented out as we don't have it
here, otherwise the copy is unchnaged.
Signed-off-by: David Sterba <dsterba@suse.com>