BUILD: makefile: add a few popular ARMv8 CPU targets

This adds the following CPUs to the makefile:
  - armv81    : modern ARM cores (Cortex A55/A75/A76/A78/X1, Neoverse, Graviton2)
  - a72       : ARM Cortex-A72 or A73 (e.g. RPi4, Odroid N2, VIM3, AWS Graviton)
  - a53       : ARM Cortex-A53 or any of its successors in 64-bit mode (e.g. RPi3)
  - armv8-auto: both older and newer ARMv8 cores, with a minor runtime penalty

The rationale for each of them is:
  - a53 is the common denominator of all of its successors, and unlike
    the generic armv8-a it supports CRC32, which is used by the gzip
    compression ;

  - a72 supports the same features but is an out-of-order one that deserves
    better optimizations; it's found in a number of high-performance
    multi-core CPUs mainly oriented towards I/O and network processing
    (Armada 8040, NXP LX2160A, AWS Graviton), and more recently the
    Raspberry Pi 4. The A73 found in VIM3 and Odroid-N2 can use the same
    optimizations ;

  - armv81 is for generic ARMv8.1-A and above; it automatically enables
    CRC32 as well as the LSE atomics, which scale much better. This one
    covers modern ARMv8 cores such as Cortex A55/A75/A76/A77/A78/X1 and
    the Neoverse family such as found in AWS's Graviton2. The LSE
    instructions are essential for large numbers of cores (8 and above) ;

  - armv8-auto dynamically enables support for the LSE extensions when
    they are detected, while remaining compatible with older cores. This
    comes with a small performance penalty (~3%), but the same executable
    will perform optimally on a wider range of hardware. This should be
    the best option for distros. It requires gcc-10 or gcc-9.4 and above.

When no CPU is specified, GCC version 10.2 and above will automatically
emit the wrapper used to detect the LSE extensions at run time.
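
As an illustration of how one of these targets might be picked when
building directly on the target machine, here is a minimal sketch. It
assumes a Linux system where /proc/cpuinfo exposes a "Features" line
listing "atomics" (LSE) and "crc32". The helper name pick_cpu and the
mapping it applies are hypothetical and not part of this commit. For
binaries meant to run on several machines, armv8-auto remains the safer
choice.

```shell
# Hypothetical helper (not part of the makefile): map the feature flags
# reported by the kernel to one of the CPU= values added by this commit.
# $1 is the content of the "Features" line from /proc/cpuinfo.
pick_cpu() {
    features=" $1 "                       # pad so whole words can be matched
    case "$features" in
        *" atomics "*) echo "armv81" ;;   # LSE atomics present: ARMv8.1-A or later
        *" crc32 "*)   echo "a53" ;;      # ARMv8.0 core with the CRC32 extension
        *)             echo "generic" ;;  # nothing useful detected
    esac
}

# Possible usage on the target itself (TARGET value is only an example):
#   feat=$(grep -m1 '^Features' /proc/cpuinfo | cut -d: -f2)
#   make TARGET=linux-glibc CPU="$(pick_cpu "$feat")"
```

The sketch deliberately falls back to a53 instead of a72 for ARMv8.0
cores, since the feature flags alone do not tell in-order and
out-of-order cores apart.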
Willy Tarreau 2021-05-12 09:47:30 +02:00
parent d2acd0b3a7
commit 40a871f09d
2 changed files with 15 additions and 2 deletions

INSTALL

@@ -285,7 +285,10 @@ systems, by passing "USE_SLZ=" to the "make" command.
 Please note that SLZ will benefit from some CPU-specific instructions like the
 availability of the CRC32 extension on some ARM processors. Thus it can further
-improve its performance to build with "CPU=native" on the target system.
+improve its performance to build with "CPU=native" on the target system, or
+"CPU=armv81" (modern systems such as Graviton2 or A55/A75 and beyond),
+"CPU=a72" (e.g. for RPi4, or AWS Graviton), "CPU=a53" (e.g. for RPi3), or
+"CPU=armv8-auto" (automatic detection with minor runtime penalty).
 A second option involves the widely known zlib library, which is very likely
 installed on your system. In order to use zlib, simply pass "USE_ZLIB=1" to the
@@ -421,6 +424,11 @@ one of the following choices to the CPU variable :
 - ultrasparc : Sun UltraSparc I/II/III/IV processor
 - power8 : IBM POWER8 processor
 - power9 : IBM POWER9 processor
+- armv81 : modern ARM cores (Cortex A55/A75/A76/A78/X1, Neoverse, Graviton2)
+- a72 : ARM Cortex-A72 or A73 (e.g. RPi4, Odroid N2, AWS Graviton)
+- a53 : ARM Cortex-A53 or any of its successors in 64-bit mode (e.g. RPi3)
+- armv8-auto : support both older and newer armv8 cores with a minor penalty,
+thanks to gcc 10's outline atomics (default with gcc 10.2).
 - native : use the build machine's specific processor optimizations. Use with
 extreme care, and never in virtualized environments (known to break).
 - generic : any other processor or no CPU-specific optimization. (default)

View File

@@ -162,7 +162,8 @@ TARGET =
 #### TARGET CPU
 # Use CPU=<cpu_name> to optimize for a particular CPU, among the following
 # list :
-# generic, native, i586, i686, ultrasparc, power8, power9, custom
+# generic, native, i586, i686, ultrasparc, power8, power9, custom,
+# a53, a72, armv81, armv8-auto
 CPU = generic
 #### Architecture, used when not building for native architecture
@@ -274,6 +275,10 @@ CPU_CFLAGS.i686 = -O2 -march=i686
 CPU_CFLAGS.ultrasparc = -O6 -mcpu=v9 -mtune=ultrasparc
 CPU_CFLAGS.power8 = -O2 -mcpu=power8 -mtune=power8
 CPU_CFLAGS.power9 = -O2 -mcpu=power9 -mtune=power9
+CPU_CFLAGS.a53 = -O2 -mcpu=cortex-a53
+CPU_CFLAGS.a72 = -O2 -mcpu=cortex-a72
+CPU_CFLAGS.armv81 = -O2 -march=armv8.1-a
+CPU_CFLAGS.armv8-auto = -O2 -march=armv8-a+crc -moutline-atomics
 CPU_CFLAGS = $(CPU_CFLAGS.$(CPU))
 #### ARCH dependent flags, may be overridden by CPU flags