haproxy public development tree
Go to file
Willy Tarreau 75b335abc7 MINOR: fd: don't scan the full fdtab on all threads
During tests, it's pretty visible that with many threads and a large
number of FDs, the process may take time to be ready. The reason for
this is that the full fdtab array is scanned by each and every thread
at boot in fd_reregister_all() in order to make each thread-local
poller adopt the FDs that are relevant to it. The problem is that
when dealing with 1-2M FDs and 64+ threads, it starts to represent
quite a number of loops, and usually the fdtab array doesn't entirely
fit in the CPU's L3 cache, causing extra memory accesses.

It's particularly visible when issuing debugging commands to the CLI
because usually the first one fails while the CPU is at 100% for half
a second (which also is socat's timeout). A quick test with this:

    global
        stats socket /tmp/sock1 level admin mode 666
        stats timeout 1h
        maxconn 2000000

And the following script started in another window:

    while ! time socat -t5 - /tmp/sock1 <<< "show version";do date -Ins;done

shows that it takes 1.58s for the socat instance that succeeds on an
Ampere Altra with 80 cores, this requires to change the timeout (defaults
to half a second) otherwise it returns nothing. In addition it also means
that during reloads, some CPU spikes will be noticed.

Adding a prefetch of the current FD + 16 improves the startup time by 30%
but that's far from being sufficient.

In practice all of this is performed at boot time, a moment at which we
know that extremely few FDs are registered (basically just the listeners),
so FD numbers are usually very low and the rest of the table is scanned
for no benefit. Ideally, knowing upfront how many FDs we have should be
sufficient.

A first approach would consist in counting the entries on a single thread
before registering pollers. It's not necessarily efficient and would take
time anyway.

This patch takes a different approach. It consists in keeping a thread-local
max ("fd_highest") that is updated whenever fd_insert() is called with a
larger number. Of course this is not correct once all threads have started,
but it will remain valid during boot since the same value is used during
startup and is cloned for each thread, and no scheduling happens anywhere
during this period, so that all threads are aware of the highest FD they've
seen registered, even if it had been done in some init code, and this without
having to deal with a shared variable.

Here on the test platform, the script gets its response in 10ms vs 1580
before.
2024-07-15 19:19:13 +02:00
.github CI: weekly QUIC Interop: try to fix private image 2024-07-10 09:43:02 +02:00
addons BUG/MINOR: promex: Remove Help prefix repeated twice for each metric 2024-07-01 10:50:27 +02:00
admin ADMIN: acme.sh: remove the old acme.sh code 2024-05-31 13:37:47 +02:00
dev MEDIUM: mux-spop: Introduce the SPOP multiplexer 2024-07-12 15:27:04 +02:00
doc DOC: spoe: Update SPOE documentation to reflect recent refactoring 2024-07-12 16:38:49 +02:00
examples
include MINOR: fd: don't scan the full fdtab on all threads 2024-07-15 19:19:13 +02:00
reg-tests MEDIUM: check/spoe: Use SPOP multiplexer to perform SPOP health-checks 2024-07-12 15:27:04 +02:00
scripts SCRIPTS: create-release: no more need to skip architecture.txt 2024-07-10 15:38:45 +02:00
src MINOR: fd: don't scan the full fdtab on all threads 2024-07-15 19:19:13 +02:00
tests MAJOR: import: update mt_list to support exponential back-off (try #2) 2024-07-09 16:46:38 +02:00
.cirrus.yml CI: FreeBSD: upgrade image, packages 2024-06-04 11:19:00 +02:00
.gitattributes
.gitignore
.mailmap
.travis.yml
BRANCHES
BSDmakefile
CHANGELOG [RELEASE] Released version 3.1-dev3 2024-07-10 15:39:36 +02:00
CONTRIBUTING
INSTALL DOC: INSTALL: minimum AWS-LC version is v1.22.0 2024-06-14 12:06:03 +02:00
LICENSE
MAINTAINERS MAJOR: spoe: Let the SPOE back into the game 2024-05-22 09:04:38 +02:00
Makefile MEDIUM: mux-spop: Introduce the SPOP multiplexer 2024-07-12 15:27:04 +02:00
README.md DOC: change the link to the FreeBSD CI in README.md 2024-06-03 15:21:29 +02:00
SUBVERS
VERDATE [RELEASE] Released version 3.1-dev3 2024-07-10 15:39:36 +02:00
VERSION [RELEASE] Released version 3.1-dev3 2024-07-10 15:39:36 +02:00

README.md

HAProxy

alpine/musl AWS-LC openssl no-deprecated Illumos NetBSD FreeBSD VTest

HAProxy logo

HAProxy is a free, very fast and reliable reverse-proxy offering high availability, load balancing, and proxying for TCP and HTTP-based applications.

Installation

The INSTALL file describes how to build HAProxy. A list of packages is also available on the wiki.

Getting help

The discourse and the mailing-list are available for questions or configuration assistance. You can also use the slack or IRC channel. Please don't use the issue tracker for these.

The issue tracker is only for bug reports or feature requests.

Documentation

The HAProxy documentation has been split into a number of different files for ease of use. It is available in text format as well as HTML. The wiki is also meant to replace the old architecture guide.

Please refer to the following files depending on what you're looking for:

  • INSTALL for instructions on how to build and install HAProxy
  • BRANCHES to understand the project's life cycle and what version to use
  • LICENSE for the project's license
  • CONTRIBUTING for the process to follow to submit contributions

The more detailed documentation is located into the doc/ directory:

License

HAProxy is licensed under GPL 2 or any later version, the headers under LGPL 2.1. See the LICENSE file for a more detailed explanation.