The accelerated crc32c needs to check for two CPU features, the crc32c
instructions is in SSE 4.2 and 'pclmulqdq' is a separate. There's still
old hardware used that does not have the PCLMUL instructions. Detect it
and make it the condition.
The pclmul is not supported on old compilers so also add a
configure-time detection and leave the SSE 4.2 only implementation as
the accelerated one if possible.
Issue: #676
Signed-off-by: David Sterba <dsterba@suse.com>
Copy faster implementation of crc32c from linux kernel as of 6.5-rc7
(x86_64, arch/x86/crypto/crc32c-pcl-intel-asm_64.S). This needs
assembler build support, so detect target architecture so
cross-compilation still works.
Add a special CPU flag so the old and new implementations can be
benchmarked and verified separately.
Sample benchmark:
CPU flags: 0x1ff
CPU features: SSE2 SSSE3 SSE41 SSE42 SHA AVX AVX2 CRC32C_PCL
Block size: 4096
Iterations: 1000000
Implementation: builtin
Units: CPU cycles
NULL-NOP: cycles: 77177218, cycles/i 77
NULL-MEMCPY: cycles: 226313072, cycles/i 226, 62133.395 MiB/s
CRC32C-ref: cycles: 24418596066, cycles/i 24418, 575.859 MiB/s
CRC32C-NI: cycles: 1188335920, cycles/i 1188, 11833.073 MiB/s
CRC32C-PCL: cycles: 463193456, cycles/i 463, 30358.037 MiB/s
XXHASH: cycles: 851606646, cycles/i 851, 16511.916 MiB/s
SHA256-ref: cycles: 74476234956, cycles/i 74476, 188.808 MiB/s
SHA256-NI: cycles: 34198637428, cycles/i 34198, 411.177 MiB/s
BLAKE2-ref: cycles: 14761411664, cycles/i 14761, 952.597 MiB/s
BLAKE2-SSE2: cycles: 18101896796, cycles/i 18101, 776.807 MiB/s
BLAKE2-SSE41: cycles: 12599091062, cycles/i 12599, 1116.087 MiB/s
BLAKE2-AVX2: cycles: 9668247506, cycles/i 9668, 1454.418 MiB/s
The new implementation is about 2.5x faster.
Note: there new version does not work on musl because of linkage
problems (relocations in .rodata), so it's still using the old
implementation.
Signed-off-by: David Sterba <dsterba@suse.com>
The crc32c selection has been already using the pointer-based approach
so drop the cpuid detection and use our common code for that.
Signed-off-by: David Sterba <dsterba@suse.com>
There are some stale headers that we don't need and the int types are
using the kernel types and pull kerncompat.h. As this is a basic header
that should minimize dependencies use the standard int types.
Signed-off-by: David Sterba <dsterba@suse.com>
Add the GPL v2 header to files where it was missing and is not from an
external source, update to the most recent version with the address.
Signed-off-by: David Sterba <dsterba@suse.com>
With the introduction of xxhash64 to btrfs-progs we created a crypto/
directory for all the hashes used in btrfs (although no
cryptographically secure hash is there yet).
Move the crc32c implementation from kernel-lib/ to crypto/ as well so we
have all hashes consolidated.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>