Commit Graph

8298 Commits

Author SHA1 Message Date
Olivier Houchard
3758eab71c MEDIUM: lb_fwrr: Use one ebtree per thread group.
When using the round-robin load balancer, the major source of contention
is the lbprm lock, that has to be held every time we pick a server.
To mitigate that, make it so there is one tree per thread-group, and
one lock per thread-group. That means we now have a lb_fwrr_per_tgrp
structure that will contain the two lb_fwrr_groups (active and backup) as well
as the lock to protect them in the per-thread lbprm struct, and all
fields in the struct server are now moved to the per-thread structure
too.
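
The per-thread-group locking idea can be illustrated with a minimal sketch.
It is not the lb_fwrr code: the ebtree is replaced by a plain rotating index,
the lock is a C11 atomic_flag spinlock, and every name below (rr_per_tgrp,
rr_pick, NB_TGROUPS, NB_SERVERS) is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NB_TGROUPS 4   /* hypothetical number of thread groups */
#define NB_SERVERS 3   /* hypothetical number of configured servers */

/* One lock and one rotation state per thread group (assumed layout). */
struct rr_per_tgrp {
    atomic_flag lock;   /* protects 'next' for this group only */
    unsigned int next;  /* index of the next server to hand out */
};

static struct rr_per_tgrp rr_groups[NB_TGROUPS];

static void rr_init(void)
{
    for (size_t i = 0; i < NB_TGROUPS; i++) {
        atomic_flag_clear(&rr_groups[i].lock);
        rr_groups[i].next = 0;
    }
}

/* Pick a server for a thread of group 'tgid'. Only that group's lock is
 * taken, so threads from other groups never contend on it. */
static unsigned int rr_pick(unsigned int tgid)
{
    struct rr_per_tgrp *g;
    unsigned int srv;

    assert(tgid < NB_TGROUPS);
    g = &rr_groups[tgid];

    while (atomic_flag_test_and_set_explicit(&g->lock, memory_order_acquire))
        ; /* spin until the group lock is free */
    srv = g->next;
    g->next = (g->next + 1) % NB_SERVERS;
    atomic_flag_clear_explicit(&g->lock, memory_order_release);
    return srv;
}
```

Each group rotates independently here, which is the trade-off of this kind of
split: contention drops, but the rotation order is only local to a group.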
Those changes are mostly mechanical, and bring a good performance
improvement: on a 64-core AMD CPU, with 64 servers configured, we could
process about 620000 requests per second, and we can now process around
1400000 requests per second.
2025-04-17 17:38:23 +02:00
Olivier Houchard
f36f6cfd26 MINOR: proxies: Add a per-thread group lbprm struct.
Add a new structure in the per-thread groups proxy structure, that will
contain whatever is per-thread group in lbprm.
It will be accessed as p->per_tgrp[tgid].lbprm.
2025-04-17 17:38:23 +02:00
Olivier Houchard
7ca1c94ff0 MINOR: lb_fwrr: Move the next weight out of fwrr_group.
Move the "next_weight" outside of fwrr_group, and inside struct lb_fwrr
directly, one for the active servers, one for the backup servers.
We will soon have one fwrr_group per thread group, but next_weight will
be global to all of them.
2025-04-17 17:38:23 +02:00
Olivier Houchard
444125a764 MINOR: servers: Provide a pointer to the server in srv_per_tgroup.
Add a pointer to the server into the struct srv_per_tgroup, so that if
we only have access to that srv_per_tgroup, we can come back to the
corresponding server.
2025-04-17 17:38:23 +02:00
Willy Tarreau
36ec70c526 MINOR: sched: add a new function is_sched_alive() to report scheduler's health
This verifies that the scheduler is still ticking without having to
access the activity[] array nor keeping local copies of the ctxsw
counter. It just tests and sets a flag that is reset after each
return from a ->process() function.
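
A minimal sketch of the test-and-set idea described above, assuming a single
global flag (the real code presumably uses per-thread state). The names
sched_observed, sched_mark_progress and sched_is_alive are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Set by the checker, reset by the scheduler after each handler returns. */
static atomic_bool sched_observed;

/* To be called by the scheduler after each return from a task handler. */
static void sched_mark_progress(void)
{
    atomic_store(&sched_observed, false);
}

/* Returns true if the scheduler made progress since the previous call.
 * The flag is tested and set in one atomic step, so no counter copy is
 * needed on the caller's side. */
static bool sched_is_alive(void)
{
    return !atomic_exchange(&sched_observed, true);
}
```

Two consecutive calls with no handler returning in between report "alive"
then "not alive", which is what a watchdog would look for.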
2025-04-17 16:25:47 +02:00
Willy Tarreau
874ba2afed CLEANUP: debug: no longer set nor use TH_FL_DUMPING_OTHERS
TH_FL_DUMPING_OTHERS was being used to try to perform exclusion between
threads running "show threads" and those producing warnings. Now that it
is much more cleanly handled, we don't need that type of protection
anymore, which was adding to the complexity of the solution. Let's just
get rid of it.
2025-04-17 16:25:47 +02:00
Willy Tarreau
c16d5415a8 MINOR: debug: make ha_stuck_warning() only work for the current thread
Since we no longer call it with a foreign thread, let's simplify its code
and get rid of the special cases that were relying on ha_thread_dump_fill()
and synchronization with a remote thread. We're now only dumping the
current thread so ha_thread_dump_one() is sufficient.
2025-04-17 16:25:47 +02:00
Willy Tarreau
b24d7f248e MINOR: pass a valid buffer pointer to ha_thread_dump_one()
The goal is to let the caller deal with the pointer so that the function
only has to fill that buffer without worrying about locking. This way,
synchronous dumps from "show threads" are produced and emitted directly
without causing undesired locking of the buffer nor risking causing
confusion about thread_dump_buffer containing bits from an interrupted
dump in progress.

It's only the caller that's responsible for notifying the requester of
the end of the dump by setting bit 0 of the pointer if needed (i.e. it's
only done in the debug handler).
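
Signalling completion through bit 0 of a pointer can be sketched as follows.
This only illustrates the general tagged-pointer technique under the
assumption that buffers are at least 2-byte aligned; the helper names are
hypothetical and are not the ones used in the debug code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mark the dump as finished by setting bit 0 of the buffer pointer. */
static inline void *dump_ptr_mark_done(void *p)
{
    assert(((uintptr_t)p & 1) == 0); /* requires an aligned pointer */
    return (void *)((uintptr_t)p | 1);
}

/* Tell whether the requester was notified of the end of the dump. */
static inline bool dump_ptr_is_done(const void *p)
{
    return ((uintptr_t)p & 1) != 0;
}

/* Recover the usable buffer pointer regardless of the tag. */
static inline void *dump_ptr_strip(void *p)
{
    return (void *)((uintptr_t)p & ~(uintptr_t)1);
}
```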
2025-04-17 16:25:47 +02:00
Willy Tarreau
5ac739cd0c MINOR: debug: remove unused case of thr!=tid in ha_thread_dump_one()
This function was initially designed to dump any thread into the presented
buffer, but the way it currently works is that it's always called for the
current thread, and uses the distinction between coming from a sighandler
or being called directly to detect which thread is the caller.

Let's simplify all this by replacing thr with tid everywhere, and using
the thread-local pointers where it makes sense (e.g. th_ctx, th_ctx etc).
The confusing "from_signal" argument is now replaced with "is_caller"
which clearly states whether or not the caller declares being the one
asking for the dump (the logic is inverted, but there are only two call
places with a constant).
2025-04-17 16:25:47 +02:00
Willy Tarreau
6d8a523d14 MINOR: tinfo: keep a copy of the pointer to the thread dump buffer
Instead of using the thread dump buffer for post-mortem analysis, we'll
keep a copy of the assigned pointer whenever it's used, even for warnings
or "show threads". This will offer more opportunities to figure out from a
core what happened, and will give us more freedom regarding the value of
the thread_dump_buffer itself. For example, even at the end of the dump
when the pointer is reset, the last used buffer is now preserved.
2025-04-17 16:25:47 +02:00
Willy Tarreau
337017e2f9 BUG/MINOR: threads: set threads_idle and threads_harmless even with no threads
Some signal handlers rely on these to decide about the level of detail to
provide in dumps, so let's properly fill the info about entering/leaving
idle. Note that for consistency with other tests we're using bitops with
t->ltid_bit, while we could simply assign 0/1 to the fields. But it makes
the code more readable and the whole difference is only 88 bytes on a 3MB
executable.

This bug is not important, and while older versions are likely affected
as well, it's not worth taking the risk to backport this in case it would
wake up an obscure bug.
2025-04-17 16:25:47 +02:00
Amaury Denoyelle
52246249ab MEDIUM: listener/mux-h2: implement idle-ping on frontend side
This commit is the counterpart of the previous one, adapted on the
frontend side. "idle-ping" is added as keyword to bind lines, to be able
to refresh client timeout of idle frontend connections.

H2 MUX behavior remains similar as the previous patch. The only
significant change is in h2c_update_timeout(), as idle-ping is now taken
into account for frontend connections as well. The calculated value is
compared with the http-request/http-keep-alive timeout value. The shorter
delay is then used as the expiration date. As the hr/ka timeouts are based on
idle_start, this allows them to run in parallel with an idle-ping timer.
2025-04-17 14:49:36 +02:00
Amaury Denoyelle
a78a04cfae MEDIUM: server/mux-h2: implement idle-ping on backend side
This commit implements support for idle-ping on the backend side. First,
a new server keyword "idle-ping" is defined in configuration parsing. It
is used to set the corresponding new server member.

The second part of this commit implements idle-ping support on H2 MUX. A
new inlined function conn_idle_ping() is defined to access connection
idle-ping value. Two new connection flags are defined H2_CF_IDL_PING and
H2_CF_IDL_PING_SENT. The first one is set for idle connections via
h2c_update_timeout().

In the h2_timeout_task() handler, if the first flag is set, instead of
releasing the connection as before, the second flag is set and the tasklet is
scheduled. As both flags are now set, h2_process_mux() will proceed to
PING emission. The timer has also been rearmed to the idle-ping value.
If a PING ACK is received before next timeout, connection timer is
refreshed. Else, the connection is released, as with timer expiration.
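
The two-flag timeout logic described above can be sketched as a small state
machine. The flag names come from this message, but their values, the
conn_sketch structure and the handler names are assumptions made for the
illustration; the real handlers do much more.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag names from the commit message; bit values are assumed. */
#define H2_CF_IDL_PING       0x1u  /* idle-ping armed on this idle connection */
#define H2_CF_IDL_PING_SENT  0x2u  /* a PING was requested, ACK still pending */

struct conn_sketch {
    uint32_t flags;
    unsigned int expire_ms;     /* hypothetical timer value */
    unsigned int idle_ping_ms;  /* configured idle-ping delay */
    bool released;
};

/* Timer expiration: the first expiry requests a PING and rearms the timer,
 * a second expiry without any ACK releases the connection. */
static void on_timeout(struct conn_sketch *c)
{
    if ((c->flags & H2_CF_IDL_PING) && !(c->flags & H2_CF_IDL_PING_SENT)) {
        c->flags |= H2_CF_IDL_PING_SENT;
        c->expire_ms = c->idle_ping_ms;
        return;
    }
    c->released = true;
}

/* PING ACK received in time: forget the pending PING, refresh the timer. */
static void on_ping_ack(struct conn_sketch *c)
{
    if (c->flags & H2_CF_IDL_PING_SENT) {
        c->flags &= ~H2_CF_IDL_PING_SENT;
        c->expire_ms = c->idle_ping_ms;
    }
}
```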

Also of importance, special care is needed when a backend connection is
going idle. In this case, the idle-ping timer must be rearmed. Thus a new
invocation of h2c_update_timeout() is performed on h2_detach().
2025-04-17 14:49:36 +02:00
William Lallemand
e778049ffc MINOR: acme: register the task in the ckch_store
This patch registers the task in the ckch_store so we don't run 2 tasks
at the same time for a given certificate.

Move the task creation under the lock and check if there was already a
task under the lock.
2025-04-16 17:12:43 +02:00
William Lallemand
c291a5c73c BUILD: incompatible pointer type suspected with -DDEBUG_UNIT
src/jws.c: In function '__jws_init':
src/jws.c:594:38: error: passing argument 2 of 'hap_register_unittest' from incompatible pointer type [-Wincompatible-pointer-types]
  594 |         hap_register_unittest("jwk", jwk_debug);
      |                                      ^~~~~~~~~
      |                                      |
      |                                      int (*)(int,  char **)
In file included from include/haproxy/api.h:36,
                 from include/import/ebtree.h:251,
                 from include/import/ebmbtree.h:25,
                 from include/haproxy/jwt-t.h:25,
                 from src/jws.c:5:
include/haproxy/init.h:37:52: note: expected 'int (*)(void)' but argument is of type 'int (*)(int,  char **)'
   37 | void hap_register_unittest(const char *name, int (*fct)());
      |                                              ~~~~~~^~~~~~

GCC 15 is warning because the function pointer does not declare its
arguments in the register function.

Should fix issue #2929.
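
For context, GCC 15 defaults to C23, where an empty parameter list such as
"int (*fct)()" means "no arguments", which matches the "expected
'int (*)(void)'" note above. A minimal sketch of a registration function
with an explicit prototype follows; the names are hypothetical and this is
not the actual hap_register_unittest() code.

```c
#include <stddef.h>

/* Explicit prototype: same meaning before and after C23. */
typedef int (*unittest_fn)(int argc, char **argv);

struct unittest_entry {
    const char *name;
    unittest_fn fct;
};

static struct unittest_entry unittests[8];
static size_t nb_unittests;

/* Store the callback; silently ignored when the table is full. */
static void register_unittest_sketch(const char *name, unittest_fn fct)
{
    if (nb_unittests < sizeof(unittests) / sizeof(unittests[0])) {
        unittests[nb_unittests].name = name;
        unittests[nb_unittests].fct = fct;
        nb_unittests++;
    }
}

/* Example callback matching the declared prototype. */
static int demo_unittest(int argc, char **argv)
{
    (void)argv;
    return argc;
}
```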
2025-04-15 15:49:44 +02:00
Willy Tarreau
b708345c17 DEBUG: counters: add the ability to enable/disable updating the COUNT_IF counters
These counters can have a noticeable cost on large machines, though not
dramatic. There's no single good choice to keep them enabled or disabled.
This commit adds multiple choices:
  - DEBUG_COUNTERS set to 2 will automatically enable them by default, while
    1 will disable them by default
  - the global "debug.counters on/off" will allow to change the setting at
    boot, regardless of DEBUG_COUNTERS as long as it was at least 1.
  - the CLI "debug counters on/off" will also allow to change the value at
    run time, allowing to observe a phenomenon while it's happening, or to
    disable counters if it's suspected that their cost is too high

Finally, the "debug counters" command will append "(stopped)" at the end
of the CNT lines when these counters are stopped.

Note that the whole mechanism would easily support being extended to all
counter types by specifying the types to apply to, but it doesn't seem
useful at all and would require the user to also type "cnt" on debug
lines. This may easily be changed in the future if it's found relevant.
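
The combination of a build-time level and a run-time switch can be sketched
as below. Only the DEBUG_COUNTERS name and its 0/1/2 meaning come from this
message; the macro COUNT_IF_SKETCH and the variable counters_enabled are
hypothetical and much simpler than the real COUNT_IF().

```c
#ifndef DEBUG_COUNTERS
#define DEBUG_COUNTERS 1   /* assumed default: built in, off at boot */
#endif

/* Run-time switch: on by default only when DEBUG_COUNTERS is 2 or more.
 * A config keyword or CLI command would flip this variable. */
static int counters_enabled = (DEBUG_COUNTERS >= 2);

#if DEBUG_COUNTERS >= 1
/* Counters are compiled in, and updated only while enabled. */
#define COUNT_IF_SKETCH(cond, ctr) \
    do { if (counters_enabled && (cond)) (ctr)++; } while (0)
#else
/* Counters are compiled out: the run-time switch has no effect. */
#define COUNT_IF_SKETCH(cond, ctr) do { } while (0)
#endif
```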
2025-04-14 19:02:13 +02:00
Willy Tarreau
a142adaba0 DEBUG: counters: make COUNT_IF() only appear at DEBUG_COUNTERS>=1
COUNT_IF() is convenient but can be heavy since some of them were found
to trigger often (roughly 1 counter per request on avg). This might even
have an impact on large setups due to the cost of a shared cache line
bouncing between multiple cores. For now there's no way to disable it,
so let's only enable it when DEBUG_COUNTERS is 1 or above. A future
change will make it configurable.
2025-04-14 19:02:13 +02:00
Willy Tarreau
61d633a3ac DEBUG: rename DEBUG_GLITCHES to DEBUG_COUNTERS and enable it by default
Till now the per-line glitches counters were only enabled with the
confusingly named DEBUG_GLITCHES (which would not turn glitches off
when disabled). Let's instead change it to DEBUG_COUNTERS and make sure
it's enabled by default (though it can still be disabled with
-DDEBUG_COUNTERS=0 just like for DEBUG_STRICT). It will later be
expanded to cover more counters.
2025-04-14 19:02:13 +02:00
William Lallemand
39c05cedff BUILD: acme: enable the ACME feature when JWS is present
The ACME feature depends on the JWS, which currently does not work with
every SSL library. This patch only enables ACME when JWS is enabled.
2025-04-12 01:39:03 +02:00
William Lallemand
5500bda9eb MINOR: acme: implement retrieval of the certificate
Once the Order status is "valid", the certificate URL is accessible.
This patch implements the retrieval of the certificate, which is stored
in ctx->store.
2025-04-12 01:39:03 +02:00
William Lallemand
27fff179fe MINOR: acme: verify the order status once finalized
This implements a call to the order status to check if the certificate
is ready.
2025-04-12 01:39:03 +02:00
William Lallemand
680222b382 MINOR: acme: finalize by sending the CSR
This patch does the finalize step of the ACME task.
This encodes the CSR into base64 format and sends it to the finalize URL.

https://www.rfc-editor.org/rfc/rfc8555#section-7.4
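
RFC 8555 section 7.4 asks for the CSR in the base64url-encoded version of the
DER format, i.e. the URL-safe alphabet without padding. A minimal standalone
encoder is sketched below for illustration; b64url_encode() is a hypothetical
helper, not the function used by the ACME code.

```c
#include <assert.h>
#include <stddef.h>

/* Encode 'len' bytes from 'in' as unpadded base64url into 'out'.
 * Returns the number of characters written (excluding the trailing NUL),
 * or 0 if 'out' is too small (also 0 for an empty input). */
static size_t b64url_encode(const unsigned char *in, size_t len,
                            char *out, size_t outsz)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";
    size_t needed = (len / 3) * 4 + (len % 3 ? len % 3 + 1 : 0);
    size_t i, o = 0;
    unsigned int v;

    assert(in && out);
    if (outsz < needed + 1)
        return 0;

    /* full 3-byte groups produce 4 characters each */
    for (i = 0; i + 2 < len; i += 3) {
        v = ((unsigned int)in[i] << 16) | ((unsigned int)in[i + 1] << 8) | in[i + 2];
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = tbl[(v >> 6) & 63];
        out[o++] = tbl[v & 63];
    }
    /* 1 or 2 trailing bytes produce 2 or 3 characters, without '=' */
    if (len - i == 1) {
        v = (unsigned int)in[i] << 16;
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
    } else if (len - i == 2) {
        v = ((unsigned int)in[i] << 16) | ((unsigned int)in[i + 1] << 8);
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = tbl[(v >> 6) & 63];
    }
    out[o] = '\0';
    return o;
}
```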
2025-04-12 01:29:27 +02:00
William Lallemand
de5dc31a0d MINOR: acme: generate the CSR in a X509_REQ
Generate the X509_REQ using the generated private key and the SAN from
the configuration. This is only done once before the task is started.

It could probably be done at the beginning of the task with the private
key generation once we have a scheduler instead of a CLI command.
2025-04-12 01:29:27 +02:00
William Lallemand
00ba62df15 MINOR: acme: implement a check on the challenge status
This patch implements a check on the challenge URL: once haproxy has asked
for the challenge to be verified, it must verify the status of the
challenge resolution and that there was no error.
2025-04-12 01:29:27 +02:00
William Lallemand
711a13a4b4 MINOR: acme: send the request for challenge ready
This patch sends the "{}" message to specify that a challenge is ready.
It iterates on every challenge URL in the authorization list from the
acme_ctx.

This allows the ACME server to proceed to the challenge validation.
https://www.rfc-editor.org/rfc/rfc8555#section-7.5.1
2025-04-12 01:29:27 +02:00
William Lallemand
ae0bc88f91 MINOR: acme: get the challenges object from the Auth URL
This patch implements the retrieval of the challenge objects from the
authorization URLs. A challenge object contains a token and a
challenge URL that needs to be called once the challenge is set up.

Each authorization URL contains multiple challenge objects, usually one
per challenge type (HTTP-01, DNS-01, ALPN-01...). We only need to keep the
one that is relevant to our configuration.
2025-04-12 01:29:27 +02:00
William Lallemand
4842c5ea8c MINOR: acme: newOrder request retrieve authorizations URLs
This patch implements the newOrder action in the ACME task. In order to
ask for a new certificate, a list of SANs is sent as a JWS payload.
The ACME server replies with a list of Authorization URLs. One Authorization
is created per SAN on an Order.

The authorization URLs are stored in a linked list of 'struct acme_auth'
in acme_ctx, so we can get the challenge URLs from them later.

The location header is also stored as it is the URL of the order object.

https://datatracker.ietf.org/doc/html/rfc8555#section-7.4
2025-04-12 01:29:27 +02:00
William Lallemand
04d393f661 MINOR: acme: generate new account
The new account action in the ACME task uses the same function as the
chkaccount, but onlyReturnExisting is not sent in this case!
2025-04-12 01:29:27 +02:00
William Lallemand
7f9bf4d5f7 MINOR: acme: check if the account exist
This patch implements the retrieval of the KID (account identifier) using
the pkey.

A request is sent to the newAccount URL using the onlyReturnExisting
option, which allows getting the kid of an existing account.

acme_jws_payload() implements a way to generate a JWS payload using the
nonce, pkey and provided URI.
2025-04-12 01:29:27 +02:00
William Lallemand
0aa6dedf72 MINOR: acme: handle the nonce
ACME requests are supposed to be sent with a Nonce, the first Nonce
should be retrieved using the newNonce URI provided by the directory.

This nonce is stored and must be replaced by the new one received in
each response.
2025-04-12 01:29:27 +02:00
William Lallemand
471290458e MINOR: acme: get the ACME directory
The first request of the ACME protocol is getting the list of URLs for
the next steps.

This patch implements the first request and the parsing of the response.

The response is a JSON object so mjson is used to parse it.
2025-04-12 01:29:27 +02:00
William Lallemand
b8209cf697 MINOR: acme/cli: add the 'acme renew' command
The "acme renew" command launches the ACME task for a given certificate.

The CLI parser generates a new private key using the parameters from the
acme section.
2025-04-12 01:29:27 +02:00
William Lallemand
bf6a39c4d1 MINOR: acme: add private key configuration
This commit allows configuring the generated private keys: you can
configure the keytype (RSA/ECDSA), the number of bits or the curves.

Example:

    acme LE
        uri https://acme-staging-v02.api.letsencrypt.org/directory
        account account.key
        contact foobar@example.com
        challenge HTTP-01
        keytype ECDSA
        curves P-384
2025-04-12 01:29:27 +02:00
William Lallemand
2e8c350b95 MINOR: acme: add configuration for the crt-store
Add new acme keywords for the ckch_conf parsing, which will be used on a
crt-store, a crt line in a frontend, or even a crt-list.

The cfg_postparser_acme() is called in order to check if a section referenced
elsewhere really exists in the config file.
2025-04-12 01:29:27 +02:00
William Lallemand
077e2ce84c MINOR: acme: add the acme section in the configuration parser
Add a configuration parser for the new acme section, the section is
configured this way:

    acme letsencrypt
        uri https://acme-staging-v02.api.letsencrypt.org/directory
        account account.key
        contact foobar@example.com
        challenge HTTP-01

When unspecified, the challenge defaults to HTTP-01, and the account key
to "<section_name>.account.key".

Sections are stored in a linked list containing acme_cfg structures. The
configuration parsing is mostly resolved in the postsection parser
cfg_postsection_acme() which is called after the parsing of an acme section.
2025-04-12 01:29:27 +02:00
William Lallemand
20718f40b6 MEDIUM: ssl/ckch: add filename and linenum argument to crt-store parsing
Add filename and linenum arguments to the crt-store / ckch_conf parsing.

It allows using them in the parsing function so that it can emit errors.
2025-04-12 01:29:27 +02:00
Willy Tarreau
00c967fac4 MINOR: master/cli: support bidirectional communications with workers
Some rare commands in the worker require to keep their input open and
terminate when it's closed ("show events -w", "wait"). Others maintain
a per-session context ("set anon on"). But in its default operation
mode, the master CLI passes commands one at a time to the worker, and
closes the CLI's input channel so that the command can immediately
close upon response. This effectively prevents these two specific cases
from being used.

Here the approach that we take is to introduce a bidirectional mode to
connect to the worker, where everything sent to the master is immediately
forwarded to the worker (including the raw command), allowing to queue
multiple commands at once in the same session, and to continue to watch
the input to detect when the client closes. It must be a client's choice
however, since doing so means that the client cannot batch many commands
at once to the master process, but must wait for these commands to complete
before sending new ones. For this reason we use the prefix "@@<pid>" for
this. It works exactly like "@" except that it maintains the channel
open during the whole execution. Similarly to "@<pid>" with no command,
"@@<pid>" will simply open an interactive CLI session to the worker, that
will be ended by "quit" or by closing the connection. This can be convenient
for the user, and possibly for clients willing to dedicate a connection to
the worker.
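
As an illustration of the prefix distinction only, here is a hedged sketch
that tells "@<pid>" from "@@<pid>" on a command line. It handles just the
numeric form described in this message; parse_worker_prefix() and
struct worker_target are hypothetical and unrelated to the master CLI's
actual parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct worker_target {
    long pid;             /* worker PID found after the prefix */
    bool bidirectional;   /* true for "@@<pid>", false for "@<pid>" */
    const char *command;  /* rest of the line, empty for an interactive session */
};

/* Returns true when 'line' starts with "@<pid>" or "@@<pid>". */
static bool parse_worker_prefix(const char *line, struct worker_target *out)
{
    char *end;

    assert(line && out);
    if (*line != '@')
        return false;
    line++;

    out->bidirectional = (*line == '@');
    if (out->bidirectional)
        line++;

    if (*line < '0' || *line > '9')
        return false;
    out->pid = strtol(line, &end, 10);

    while (*end == ' ')
        end++;
    out->command = end;
    return true;
}
```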
2025-04-11 16:09:17 +02:00
Aurelien DARRAGON
fbfeb591f7 MINOR: proxy: add deinit_proxy() helper func
Same as free_proxy(), but does not free the base proxy pointer (ie: the
proxy itself may not be allocated)

Goal is to be able to cleanup statically allocated dummy proxies.
2025-04-10 22:10:31 +02:00
Aurelien DARRAGON
e1cec655ee MINOR: proxy: add setup_new_proxy() function
Split alloc_new_proxy() in two functions: the preparing part is now
handled by setup_new_proxy() which can be called individually, while
alloc_new_proxy() takes care of allocating a new proxy struct and then
calling setup_new_proxy() with the freshly allocated proxy.
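
The setup/alloc split, together with the deinit/free split from the
deinit_proxy() entry, follows a common pattern that can be sketched as below.
The proxy_sketch structure and the *_sketch functions are hypothetical
stand-ins, far simpler than the real proxy code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct proxy_sketch {
    char *id;   /* the only owned member in this sketch */
};

/* Prepare an already existing structure (static or heap). Returns 1 on success. */
static int setup_new_proxy_sketch(struct proxy_sketch *px, const char *name)
{
    assert(px && name);
    memset(px, 0, sizeof(*px));
    px->id = malloc(strlen(name) + 1);
    if (!px->id)
        return 0;
    strcpy(px->id, name);
    return 1;
}

/* Allocate the structure, then delegate to the setup function. */
static struct proxy_sketch *alloc_new_proxy_sketch(const char *name)
{
    struct proxy_sketch *px = malloc(sizeof(*px));

    if (px && !setup_new_proxy_sketch(px, name)) {
        free(px);
        px = NULL;
    }
    return px;
}

/* Release what the proxy owns, but not the structure itself. */
static void deinit_proxy_sketch(struct proxy_sketch *px)
{
    if (!px)
        return;
    free(px->id);
    px->id = NULL;
}

/* Release the members, then the structure. */
static void free_proxy_sketch(struct proxy_sketch *px)
{
    deinit_proxy_sketch(px);
    free(px);
}
```

A statically allocated dummy would pair the setup call with the deinit call,
while heap-allocated instances keep using the alloc/free pair.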
2025-04-10 22:10:31 +02:00
Willy Tarreau
f4634e5a38 MINOR: ring/cli: support delimiting events with a trailing \0 on "show events"
At the moment it is not supported to produce multi-line events on the
"show events" output, simply because the LF character is used as the
default end-of-event mark. However it could be convenient to produce
well-formatted multi-line events, e.g. in JSON or other formats. UNIX
utilities have already faced similar needs in the past and added
"-print0" to "find" and "-0" to "xargs" to mention that the delimiter
is the NUL character. This makes perfect sense since it's never present
in contents, so let's do exactly the same here.

Thus from now on, "show events <ring> -0" will delimit messages using
a \0 instead of a \n, permitting a better and safer encapsulation.
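
On the consumer side, splitting such a stream only requires searching for the
chosen delimiter. The following sketch shows why NUL is safer than LF for
multi-line events; the helper names are hypothetical and the sample data is
made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the next event in 'buf' (delimiter excluded), or -1 when no
 * complete event is present yet. */
static long next_event_len(const char *buf, size_t len, char delim)
{
    const char *end;

    assert(buf);
    end = memchr(buf, delim, len);
    return end ? (long)(end - buf) : -1;
}

/* Count the complete events found in 'buf' for a given delimiter. */
static size_t count_events(const char *buf, size_t len, char delim)
{
    size_t count = 0;
    long n;

    while (len > 0 && (n = next_event_len(buf, len, delim)) >= 0) {
        buf += n + 1;
        len -= (size_t)n + 1;
        count++;
    }
    return count;
}
```

With two multi-line JSON events separated by NUL, splitting on '\0' yields
two events, while splitting on '\n' cuts them into fragments.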
2025-04-08 14:36:35 +02:00
Willy Tarreau
0be6d73e88 MINOR: ring: support arbitrary delimiters through ring_dispatch_messages()
In order to support delimiting output events with other characters than
just the LF, let's pass the delimiter through the API. The default remains
the LF, used by applet_append_line(), and ignored by the log forwarder.
2025-04-08 14:36:35 +02:00
Willy Tarreau
f01ff2478f BUILD: atomics: fix build issue on non-x86/non-arm systems
Commit f435a2e518 ("CLEANUP: atomics: also replace __sync_synchronize()
with __atomic_thread_fence()") replaced the builtins used for barriers,
but the different API required an argument while the macros didn't specify
any, resulting in double parenthesis that were causing obscure build errors
such as "called object type 'void' is not a function or function pointer".
Let's just specify the args for the macro. No backport is needed.
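
A plausible reconstruction of the problem, using hypothetical macro names:
an object-like macro that already carries the fence's argument leaves a
second "()" behind at each call site, while a function-like macro absorbs
it. __atomic_thread_fence() is the GCC/Clang builtin mentioned above.

```c
/* Broken shape (kept as a comment): "barrier_full_sketch()" would expand to
 * "__atomic_thread_fence(__ATOMIC_SEQ_CST)()", i.e. a call on a void value.
 *
 *   #define barrier_full_sketch __atomic_thread_fence(__ATOMIC_SEQ_CST)
 */

/* Fixed shape: the macro declares its (empty) parameter list itself. */
#define barrier_full_sketch() __atomic_thread_fence(__ATOMIC_SEQ_CST)

/* Store a value, then issue a full barrier before reading it back. */
static int publish_sketch(int *slot, int value)
{
    *slot = value;
    barrier_full_sketch();
    return *slot;
}
```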
2025-04-07 09:38:22 +02:00
Aurelien DARRAGON
11d4d0957e MEDIUM: task: make notification_* API thread safe by default
Some notification_* functions were not thread safe by default as they
assumed only one producer would emit events for registered tasks.

While this suited the Lua sockets use-case well, it proved to
be a limitation with some other event sources (ie: lua Queue class).

Instead of having to deal with both the non thread safe and thread
safe variants (_mt suffix), which is error prone, let's make the
entire API thread safe regarding the event list.

Pruning functions still require that only one thread executes them,
with Lua this is always the case because there is one cleanup list
per context.
2025-04-03 17:52:50 +02:00
Aurelien DARRAGON
748dba4859 MINOR: hlua_fcn: register queue class using hlua_register_metatable()
Most lua classes are registered by leveraging the
hlua_register_metatable() helper. Let's use that for the Queue class as
well for consistency.
2025-04-03 17:52:17 +02:00
Aurelien DARRAGON
b77b1a2c3a MINOR: task: add thread safe notification_new and notification_wake variants
notification_new and notification_wake were historically meant to be
called by a single thread doing both the init and the wakeup for other
tasks waiting on the signals.

In this patch, we extend the API so that notification_new and
notification_wake have thread-safe variants that can safely be used with
multiple threads registering on the same list of events and multiple
threads pushing updates on the list.
2025-04-03 17:52:03 +02:00
Amaury Denoyelle
f0f1816f1a MINOR: check: implement check-pool-conn-name srv keyword
This commit is a direct follow-up of the previous one. It defines a new
server keyword check-pool-conn-name. It is used as the default value for
the name parameter of idle connection hash generation.

Its behavior is similar to server keyword pool-conn-name, but reserved
for checks reuse. If check-pool-conn-name is set, it is used in priority
to match a connection for reuse. If unset, a fallback is performed on
check-sni.
2025-04-03 17:19:07 +02:00
Amaury Denoyelle
43367f94f1 MINOR: check/backend: support conn reuse with SNI
Support for connection reuse during server checks was implemented
recently. This is activated with the server keyword check-reuse-pool.

Similarly to stream processing via connect_backend(), a connection hash
is calculated when trying to perform reuse for checks. This is necessary
to retrieve a connection which shares the check connect parameters.
However, idle connections can additionally be tagged using a
pool-conn-name or SNI under connect_backend(). Check reuse does not test
these values, which prevents retrieving a matching connection.

Improve this by using "check-sni" value as idle connection hash input
for check reuse. be_calculate_conn_hash() API has been adjusted so that
name value can be passed as input, both when using streams or checks.

Even with the current patch, there are still some scenarios which cannot
be covered for check connection reuse, most notably when using a
dynamic pool-conn-name/SNI value. It is however at least sufficient to
cover simpler cases.
2025-04-03 17:19:07 +02:00
Willy Tarreau
f435a2e518 CLEANUP: atomics: also replace __sync_synchronize() with __atomic_thread_fence()
The drop of older compilers also allows us to focus on clearer
barriers, so let's use them.
2025-04-03 11:59:31 +02:00
Willy Tarreau
34e3b83f9c CLEANUP: atomics: remove support for gcc < 4.7
The old __sync_* API is no longer necessary since we do not support
gcc before 4.7 anymore. Let's just get rid of this code, the file is
still ugly enough without it.
2025-04-03 11:55:35 +02:00
Ilia Shipitsin
27a6353ceb CLEANUP: assorted typo fixes in the code, commits and doc
2025-04-03 11:37:25 +02:00