Commit Graph

22639 Commits

Author SHA1 Message Date
Christopher Faulet d0d23a7a66 MINOR: spoe: Move spoe_str_to_vsn() into the header file
The function used to convert the SPOE version from a string to an integer is
now located in spoe-t.h header file.

The related issue is #2502.
2024-07-12 15:27:04 +02:00
Christopher Faulet 08b522d6ac MINOR: spoe: Move all stuff regarding the filter/applet in the C file
Structures describing the SPOE applet context, the SPOE filter configuration
and context and the SPOE messages and groups are moved in the C file. In
spoe-t.h file, it remains the structure describing an SPOE agent and flags
used by both sides.

In addition, the SPOE frontend, created for a given SPOE engine, is moved
from the SPOE filter configuration to the SPOE agent structure.

The related issue is #2502.
2024-07-12 15:27:04 +02:00
Christopher Faulet e6145a0ea1 MINOR: spoe: Dynamically alloc the message list per event of an agent
The inline array used to store, the configured messages per event in the
SPOE agent structure, is replaced by a dynamic array, allocated during the
configuration parsing. The main purpose of this change is to be able to move
all stuff regarding the SPOE filter and applet in the C file.

The related issue is #2502.
2024-07-12 15:27:04 +02:00
Christopher Faulet ce53bb6284 MINOR: spoe: Rename some flags and constant to use SPOP prefix
A SPOP multiplexer will be added. Many flags, constants and structures will
be remove from the applet scope. So the "SPOP" prefix is used instead of
"SPOE", to be consistent.

The related issue is #2502.
2024-07-12 15:27:04 +02:00
Christopher Faulet 51ebf644e5 MINOR: stconn: Use a dedicated function to get the opposite sedesc
se_opposite() function is added to let an endpoint retrieve the opposite
endpoint descriptor. Muxes supportng the zero-copy forwarding can now use
it. The se_shutdown() function too. This will be use by the SPOP multiplexer
to be able to retrieve the SPOE agent configuration attached to the applet
on client side.

The related issue is #2502.
2024-07-12 15:27:04 +02:00
Christopher Faulet 4b8098bf48 MINOR: connection: No longer include stconn type header in connection-t.h
It is a small change, but it is cleaner to no include stconn-t.h header in
connection-t.h, mainly to avoid circular definitions.

The related issue is #2502.
2024-07-12 15:27:04 +02:00
Christopher Faulet 33ac3dabcb MEDIUM: applet: Add a .shut callback function for applets
Applets can now define a shutdown callback function, just like the
multiplexer. It is especially usefull to get the abort reason. This will be
pretty useful to get the status code from the SPOP stream to report it at
the SPOe filter level.

The related issue is #2502.
2024-07-12 15:27:04 +02:00
Christopher Faulet 1538c4aa82 MEDIUM: proxy/spoe: Add a SPOP mode
The SPOE was significantly lightened. It is now possible to refactor it to
use a dedicated multiplexer. The first step is to add a SPOP mode for
proxies. The corresponding multiplexer mode is also added.

For now, there is no SPOP multiplexer, so it is only declarative. But at the
end, the SPOP multiplexer will be automatically selected for servers inside
a SPOP backend.

The related issue is #2502.
2024-07-12 15:27:04 +02:00
Christopher Faulet b986952a75 MINOR: spoe: Remove the dedicated SPOE applet task
The dedicated task per SPOE applet is no longer used. So it is removed.

The related issue is #2502.
2024-07-12 15:27:04 +02:00
Christopher Faulet 4e589095d9 MAJOR: spoe: Remove idle applets and pipelining support
Management of idle applets is removed. Consequently, the pipelining support
is also removed. It is a huge change but it should be transparent for the
agents, except regarding the performances. Of course, being able to reuse
already openned connections and being able to multiplex frames on a given
connection is a must have. These features will be restored later.

hello and idle timeout are not longer used. Because an applet is spawned to
process a NOTIFY frame and closed after receiving the ACK reply, the
processing timeout is the only one required. In addition, the parameters to
limit the SPOE applet creation are no longer used too.

The related issue is #2502.
2024-07-12 15:27:04 +02:00
Christopher Faulet 2405881ab0 MINOR: spoe: Remove debugging
All the SPOE debugging is removed. The code will be easier to rework this
way and the debugging will be mainly moved in the SPOP multiplexter via the
trace API.

The related issue is #2502.
2024-07-12 15:27:04 +02:00
Christopher Faulet d37489abef MINOR: spoe: Use only a global engine-id per agent
Because the async mode was removed, it is no longer mandatory to announce a
different engine identifiers per thread for a given SPOE agent. This was
used to be sure requests and the corresponding responses are stuck on the
same thread.

So, now, a SPOE agent only announces one engine identifier on all
connections. No changes should be expected for agents.

The related issue is #2502.
2024-07-12 15:27:04 +02:00
Christopher Faulet 52ad7eb79e MEDIUM: spoe: Remove async mode support
The support for asynchronous mode, the ability to send messages on a
connection and receive the responses on any other connections, is removed.
It appears this feature was a bit overkill. And it is a problem for this
refactoring. This feature is removed and will not be restored at the end.

It is not a big deal for agent supporting the async mode because it is
usable if it is announced on both sides. HAProxy stops to announce it. This
should be transparent for agents.

The related issue is #2502.
2024-07-12 15:27:04 +02:00
Christopher Faulet e3c92209f7 MEDIUM: spoe: Remove fragmentation support
It is the first patch of a long series to refactor the SPOE filter. The idea
is to rely on a dedicated multiplexer instead of hakcing HAProxy with a list
of applets processing a message queue.

First of all, optionnal features will be removed. Some will be restored at
the end, some others will just be removed. It is the case here. The frame
fragmentation support is removed. The only purpose of this feature is to be
able to support the streaming. Because it is out of the scope of this
refactoring, the fragmentation is removed.

The related issue is #2502.
2024-07-12 15:27:04 +02:00
Christopher Faulet 249a547f37 CLEANUP: stconn: Fix a typo in comments for SE_ABRT_SRC_*
Just a little typo: s/set bu/ set by/
2024-07-12 15:27:04 +02:00
Christopher Faulet 0764445505 BUG/MINOR: session: Eval L4/L5 rules defined in the default section
It is possible to define TCP/HTTP rules in a named default section to
inherit from it in a proxy. However, there is an issue with L4/L5 rules.
Only the lists of the current frontend are checked to know if an eval must
be performed. Nothing is done for an empty list. Of course, the lists of the
default proxy must also be checked to be sure to not ignored default L4/L5
rules. It is now fixed.

This patch should fix the issue #2637. It must be backported as far as 2.6.
2024-07-12 15:27:04 +02:00
Valentine Krasnobaeva 9302869c95 BUG/MINOR: limits: fix license type in limits.h
Need to use LGPL-2.1-or-later in headers since our hedaers default
to LGPL.
2024-07-11 18:15:48 +02:00
Amaury Denoyelle 3be58fc720 CLEANUP: quic: rename TID affinity elements
This commit is the renaming counterpart of the previous one, this time
for quic_conn module. Several elements related to TID affinity update
from quic_conn has been renamed : public functions, but also flag
renamed to QUIC_FL_CONN_TID_REBIND and trace event to
QUIC_EV_CONN_BIND_TID.

This should be backported with the same instruction as the previous
commit.
2024-07-11 15:14:06 +02:00
Amaury Denoyelle 9fbe8b0334 CLEANUP: proto: rename TID affinity callbacks
Since the following patch, protocol API to update a connection TID
affinity has been extended.
  commit 1a43b9f32c
  MINOR: proto: extend connection thread rebind API

The single callback set_affinity has been splitted in 3 different
functions which are called at different stages during listener_accept(),
depending on accept queue push success or not. However, the naming was
rendered confusing by the usage of function prefix 1 and 2.

Rename proto callback related to TID affinity update and use the
following names :

* bind_tid_prep
* bind_tid_commit
* bind_tid_reset

This commit should probably be backported at least up to 3.0 with the
above patch. This is because the fix was recently backported and it
would allow to keep changes minimal between the two versions. It could
even be backported up to 2.8 if there is no major conflict.
2024-07-11 15:14:06 +02:00
Christopher Faulet 2cb5b7dca6 BUG/MEDIUM: bwlim: Be sure to never set the analyze expiration date in past
Every time a bandwidth limitation is evaluated on a channel, the analyze
expiration date is renewed, mainly based on the internal bandwidth
limitation filter expiration date. However, when the filter is called while
there is no data to filter, we skip all limitation computations to jump at
the end of the function. At this stage, the analyze expiration date is
renewed before exiting. But here the internal expiration date may be expired
and not reset.

To sum up, it is possible to set the analyze expiration date of a channel in
the past. It is unexpected and this could lead to a loop in process_stream.

To fix the issue, we just now take care to reset the internal expiration
date, if needed, before exiting.

This patch should fix the issue #2634. It must be backported as far as 2.8.
2024-07-11 14:51:23 +02:00
Amaury Denoyelle b0990b38f8 MINOR: quic: add counters of sent bytes with and without GSO
Add a sent bytes counter for each quic_conn instance. A secondary field
which only account bytes sent via GSO which is useful to ensure if this
is activated.

For the moment, these counters are reported on "show quic" but not
aggregated on proxy quic module stats.
2024-07-11 11:02:44 +02:00
Amaury Denoyelle d0ea173e35 MEDIUM: quic: implement GSO fallback mechanism
UDP GSO on Linux is not implemented in every network devices. For
example, this is not available for veth devices frequently used in
container environment. In such case, EIO is reported on send()
invocation.

It is impossible to test at startup for proper GSO support in this case
as a listener may be bound on multiple network interfaces. Furthermore,
network interfaces may change during haproxy lifetime.

As such, the only option is to react on send syscall error when GSO is
used. The purpose of this patch is to implement a fallback when
encountering such conditions. Emission can be retried immediately by
trying to send each prepared datagrams individually.

To support this, qc_send_ppkts() is able to iterate over each datagram
in a so-called non-GSO fallback mode. Between each emission, a datagram
header is rewritten in front of the buffer which allows the sending loop
to proceed until last datagram is emitted.

To complement this, quic_conn listener is flagged on first GSO send
error with value LI_F_UDP_GSO_NOTSUPP. This completely disables GSO for
all future emission with QUIC connections using this listener.

For the moment, non-GSO fallback mode is activated when EIO is reported
after GSO has been set. This is the error reported for the veth usage
described above.
2024-07-11 11:02:44 +02:00
Amaury Denoyelle af22792a43 MAJOR: quic: support GSO when encoding datagrams
QUIC datagrams are encoded during emission via the function
qc_prep_pkts(). By default, if GSO is not used, each datagram is
prefixed by a metadata header which specify its length and address of
its first quic_tx_packet instance.

If GSO is activated, metadata header won't be inserted for datagrams
following the first one sent in a single syscall. Length field will
contain the total size of these datagrams. This allows to support both
GSO and non-GSO prepared datagram in the same Tx buffer.

qc_send_ppkts() is invoked just after datagrams encoding. It iterates
over each metadata header in Tx buffer to sent each datagram
individually. If length field is bigger than network MTU, GSO usage is
assumed and qc_snd_buf() GSO parameter will be set.

Another important point to note regarding GSO implementation is that
during datagram encoding, packets from the same datagram instance are
attached together. However, if using GSO, consecutive packets from
different datagrams are also linked, but without
QUIC_FL_TX_PACKET_COALESCED flag. This allows to properly update
quic_conn status with all sent packets in qc_send_ppkts(). Packets from
different datagrams are then unlinked to treat them separately when
receiving corresponding ACK frames.
2024-07-11 11:02:44 +02:00
Amaury Denoyelle 448d3d388a MINOR: quic: add GSO parameter on quic_sock send API
Add <gso_size> parameter to qc_snd_buf(). When non-null, this specifies
the value for socket option SOL_UDP/UDP_SEGMENT. This allows to send
several datagrams in a single call by splitting data multiple times at
<gso_size> boundary.

For now, <gso_size> remains set to 0 by caller, as such there should not
be any functional change.
2024-07-11 11:02:44 +02:00
Amaury Denoyelle 96a34d79d9 MINOR: quic: define quic_cc_path MTU as constant
Future commits will implement GSO support to be able to emit multiple
datagrams in a single syscall invocation. This will be used every time
there is more data to sent than the UDP network MTU.

No change will be done for Tx buffer encoding, in particular when using
extra metadata datagram header. When GSO will be used, length field will
contain the total length of all datagrams to emit in a single GSO
syscall send. As such, QUIC send functions will detect that GSO is in
use if total length is greater than MTU.

This last assumption forces to ensure that MTU is constant. Indeed, in
case qc_send() is interrupted, Tx buffer will be left with prepared
datagrams. These datagrams will be emitted at the next qc_send()
invocation. If MTU would change during these two calls, it would be
impossible to know if GSO was used or not. To prevent this, mark <mtu>
field of quic_cc_path as constant.
2024-07-11 11:02:44 +02:00
Amaury Denoyelle 35470d5185 MINOR: quic: activate UDP GSO for QUIC if supported
Add a startup test for GSO support in quic_test_socketopts() and
automatically activate it in qc_prep_pkts() when building datagrams as
big as MTU.

Also define a new config option tune.quic.disable-udp-gso. This is
useful to prevent warning on older platform or to debug an issue which
may be related to GSO.
2024-07-11 11:02:44 +02:00
Amaury Denoyelle 5bddf39fb2 MINOR: quic: extend detection of UDP API OS features
QUIC haproxy implementation relies on specific OS features to activate
some UDP optimization. One of these is the ability to bind multiple
sockets on the same address, which is necessary to have a dedicated
socket for each QUIC connections. This feature support is tested during
startup via an internal proto-quic function. It automatically deactivate
socket per connection if OS is not compatible.

The purpose of this patch is to render this QUIC feature detection code
more generic. Function is renamed quic_test_socketopts() and is still
invoked on startup. Its internal code has been refactored to be able to
implement other features support test in it.

Return value has also been changed and is now taken into account. In
case of ERR_FATAL, haproxy startup will be interrupted. This happens on
socket() syscall failure used to duplicate a QUIC listener FD.

This commit will become necessary to detect GSO support on startup.
2024-07-11 11:02:44 +02:00
Amaury Denoyelle cac47d19bd CLEANUP: quic: remove obsolete comment on send
Remove comment on send which is now obsolete since the introduction of
per-connection socket.
2024-07-11 11:02:44 +02:00
Valentine Krasnobaeva 3a0b44b122 MINOR: limits: add is_any_limit_configured
Let's encapsulate the check of all supported for now process internal limits in
a separate function. This will help in cases, when we need to simply check if
we have even only one limit set in the configuration file. It's important, as
the default value for a one limit (fd-hard-limits, for example) sometimes must
not affect the computation of the others.
2024-07-10 18:05:48 +02:00
Valentine Krasnobaeva 1f8addfdc2 REORG: haproxy: move limits handlers to limits
This patch moves handlers to compute process related limits in 'limits'
compilation unit.
2024-07-10 18:05:48 +02:00
Valentine Krasnobaeva 22db643648 MINOR: haproxy: prepare to move limits-related code
This patch is done in order to prepare the move of handlers to compute and to
check process related limits as maxconn, maxsock, maxpipes.

So, these handlers become no longer static due to the future move.

We add the handlers declarations in limits.h in this patch as well, in order to
keep the next patch, dedicated to code replacement, without any additional
modifications.

Such split also assures that this patch can be compiled separately from the
next one, where we moving the handlers. This  is important in case of
git-bisect.
2024-07-10 18:05:48 +02:00
Valentine Krasnobaeva b8dc783eb9 REORG: global: move rlim_fd_*_at_boot in limits
Let's move in 'limits' compilation unit global variables to keep the initial
process fd limits.
2024-07-10 18:05:48 +02:00
Valentine Krasnobaeva 47f2afb436 CLEANUP: fd: rm struct rlimit definition
As raise_rlim_nofile() was moved to limits compilation unit, limits.h includes
the system <sys/resource.h>. So, this definition of rlimit system type
structure is no longer need for compilation of fd unit.
2024-07-10 18:05:48 +02:00
Valentine Krasnobaeva 3759674047 REORG: fd: move raise_rlim_nofile to limits
Let's move raise_rlim_nofile() from 'fd' compilation unit to 'limits', as it
wraps setrlimit to change process RLIMIT_NOFILE.
2024-07-10 18:05:48 +02:00
Valentine Krasnobaeva 1517bcb5e3 MINOR: limits: prepare to keep limits in one place
The code which gets, sets and checks initial and current fd limits and process
related limits (maxconn, maxsock, ulimit-n, fd-hard-limit) is spread around
different functions in haproxy.c and in fd.c. Let's group it together in
dedicated limits.c and limits.h.

This patch is done in order to prepare the moving of limits-related functions
from different places to the new 'limits' compilation unit. It helps to keep
clean the next patch, which will do only the move without any additional
modifications.

Such detailed split is needed in order to be sure not to break accidentally
limits logic and in order to be able to compile each commit separately in case
of git-bisect.
2024-07-10 18:05:48 +02:00
Willy Tarreau a4bc71a1a3 [RELEASE] Released version 3.1-dev3
Released version 3.1-dev3 with the following main changes :
    - BUG/MINOR: quic: Wrong datagram building when probing.
    - BUG/MEDIUM: quic: fix possible exit from qc_check_dcid() without unlocking
    - BUG/MINOR: promex: Remove Help prefix repeated twice for each metric
    - DOC: configuration: add details about crt-store in bind "crt" keyword
    - BUG/MEDIUM: hlua/cli: Fix lua CLI commands to work with applet's buffers
    - DOC: configuration: more details about the master-worker mode
    - BUG/MEDIUM: server: fix race on server_atomic_sync()
    - BUG/MINOR: jwt: don't try to load files with HMAC algorithm
    - CLEANUP: quic: cleanup prototypes related to CIDs handling
    - CLEANUP: quic: remove non-existing quic_cid_tree definition
    - MINOR: quic: remove access to CID global tree outside of quic_cid module
    - REORG: quic: remove quic_cid_trees reference from proto_quic
    - MINOR: quic: add 2 BUG_ON() on datagram dispatch
    - MINOR: quic: ensure quic_conn is never removed on thread affinity rebind
    - MEDIUM: init: set default for fd_hard_limit via DEFAULT_MAXFD
    - DOC: configuration: update maxconn description
    - MINOR: proto: extend connection thread rebind API
    - BUG/MEDIUM: quic: prevent crash on accept queue full
    - BUG/MEDIUM: peers: Fix crash when syncing learn state of a peer without appctx
    - CI: add weekly QUIC Interop regression against LibreSSL
    - DEV: flags/quic: decode quic_conn flags
    - MINOR: quic: rename "ssl error" trace
    - BUG/MEDIUM: init: fix fd_hard_limit default in compute_ideal_maxconn
    - BUG/MINOR: jwt: fix variable initialisation
    - MINOR: ssl/sample: ssl_c_san returns a comma separated list of SAN
    - OPTIM: pool: improve needed_avg cache line access pattern
    - MAJOR: import: update mt_list to support exponential back-off (try #2)
    - CI: weekly QUIC Interop: try to fix private image
    - BUG/MINOR: h1: Fail to parse empty transfer coding names
    - BUG/MINOR: h1: Reject empty coding name as last transfer-encoding value
    - BUG/MEDIUM: h1: Reject empty Transfer-encoding header
    - BUG/MEDIUM: spoe: Be sure to create a SPOE applet if none on the current thread
    - BUILD: listener: silence a build warning about unused value without threads
    - DOC: architecture: remove the totally outdated architecture manual
    - SCRIPTS: create-release: no more need to skip architecture.txt
2024-07-10 15:39:36 +02:00
Willy Tarreau d96b9f4249 SCRIPTS: create-release: no more need to skip architecture.txt
Now that it's gone we won't stumble upon it by accident anymore.
2024-07-10 15:38:45 +02:00
Willy Tarreau 95b9d8abee DOC: architecture: remove the totally outdated architecture manual
We've discussed about removing it many times and I thought it had been
removed long ago, but apparently not as William proved me. Let's get
rid of it now. It's totally outdated (last updated 18 years ago, when
laptop processors were still 32 bits), mentions keywords and external
products that don't exist anymore. It's not even on docs.haproxy.org.
At some point, old stuff must really die.
2024-07-10 15:38:20 +02:00
Willy Tarreau 0cb8743209 BUILD: listener: silence a build warning about unused value without threads
A variable introduced in commit 1a43b9f32c ("MINOR: proto: extend
connection thread rebind API") is not used without threads and causes a
build warning. Let's just mark it maybe_unused.

Since the commit above is tagged for backporting, this one will need to
be backported along with it.
2024-07-10 15:17:04 +02:00
Christopher Faulet 5e84f13a0b BUG/MEDIUM: spoe: Be sure to create a SPOE applet if none on the current thread
When a message is queued, waiting to be processed by a SPOE applet, there
are some heuristic to know if a new applet must be created or not. There are
2 conditions to skip the applet creation:

  1 - if there are enough idle applets on the current thread, or,

  2 - if the processing rate on the current thread is high enough to handle
      this new message

In the 2nd case, there is a flaw when the number of processed messages falls
to zero while the processing rate is still greater than zero. In that case,
we will skip the SPOE applet creation without taking care to check there is
at least one applet on the current thread.

So now, the conditions above to skip the SPOE applet creation are only
evaluated if there is at least one applet on the current thread.

This patch must be backported to every stable versions.
2024-07-10 10:52:20 +02:00
Christopher Faulet 4a2dd6f377 BUG/MEDIUM: h1: Reject empty Transfer-encoding header
The Transfer-Encoding headers list the transfer coding that have been
applied to the content in order to form the message body. It is a list of
tokens. And as specified by RFC 9110, a token cannot be empty. When several
coding names are specify as a comma-separated value, this case is properly
handled and an error is triggered. However, an empty header value will just
be skipped and no error is triggered. This could be an issue with some buggy
servers.

Now, empty Transfer-Encoding header are rejected too.

This patch must be backported as far as 2.6.
2024-07-10 10:52:20 +02:00
Christopher Faulet 428451fe96 BUG/MINOR: h1: Reject empty coding name as last transfer-encoding value
The following Transfer-Encoding header is now rejected with a
400-bad-request:

  Transfer-Encoding: chunked,\r\n

This case was not properly handled and the last empty value was just
ignored.

This patch must be backported as far as 2.6.
2024-07-10 10:52:20 +02:00
Christopher Faulet b8b0102760 BUG/MINOR: h1: Fail to parse empty transfer coding names
Empty transfer coding names, inside a comma-separated list, are already
rejected. But it is only by chance. Today, it is detected as an unknown
coding names (not "chunked" concretly). Then, it is handled by the H1
multiplexer as an error and a 422-Unprocessable-Content response is
returned.

So, the error is properly detected in this case, but it is not accurate. A
400-bad-request response must be returned instead. Then, it is better to
catch the error during the header parsing. It is the purpose of this patch.

This patch should be backported as far as 2.6.
2024-07-10 10:52:20 +02:00
Ilia Shipitsin 89bdd8b62a CI: weekly QUIC Interop: try to fix private image
for some reason image built in HAProxy workflow is "private", it
is succesfully built, but fails to pull. Let's try explicit docker login
for run job as well
2024-07-10 09:43:02 +02:00
Willy Tarreau 4e65fc66f6 MAJOR: import: update mt_list to support exponential back-off (try #2)
This is the second attempt at importing the updated mt_list code (commit
59459ea3). The previous one was attempted with commit c618ed5ff4 ("MAJOR:
import: update mt_list to support exponential back-off") but revealed
problems with QUIC connections and was reverted.

The problem that was faced was that elements deleted inside an iterator
were no longer reset, and that if they were to be recycled in this form,
they could appear as busy to the next user. This was trivially reproduced
with this:

  $ cat quic-repro.cfg
  global
          stats socket /tmp/sock1 level admin
          stats timeout 1h
          limited-quic

  frontend stats
          mode http
          bind quic4@:8443 ssl crt rsa+dh2048.pem alpn h3
          timeout client 5s
          stats uri /

  $ ./haproxy -db -f quic-repro.cfg  &

  $ h2load -c 10 -n 100000 --npn h3 https://127.0.0.1:8443/
  => hang

This was purely an API issue caused by the simplified usage of the macros
for the iterator. The original version had two backups (one full element
and one pointer) that the user had to take care of, while the new one only
uses one that is transparent for the user. But during removal, the element
still has to be unlocked if it's going to be reused.

All of this sparked discussions with Fred and Aurlien regarding the still
unclear state of locking. It was found that the lock API does too much at
once and is lacking granularity. The new version offers a much more fine-
grained control allowing to selectively lock/unlock an element, a link,
the rest of the list etc.

It was also found that plenty of places just want to free the current
element, or delete it to do anything with it, hence don't need to reset
its pointers (e.g. event_hdl). Finally it appeared obvious that the
root cause of the problem was the unclear usage of the list iterators
themselves because one does not necessarily expect the element to be
presented locked when not needed, which makes the unlock easy to overlook
during reviews.

The updated version of the list presents explicit lock status in the
macro name (_LOCKED or _UNLOCKED suffixes). When using the _LOCKED
suffix, the caller is expected to unlock the element if it intends to
reuse it. At least the status is advertised. The _UNLOCKED variant,
instead, always unlocks it before starting the loop block. This means
it's not necessary to think about unlocking it, though it's obviously
not usable with everything. A few _UNLOCKED were used at obvious places
(i.e. where the element is deleted and freed without any prior check).

Interestingly, the tests performed last year on QUIC forwarding, that
resulted in limited traffic for the original version and higher bit
rate for the new one couldn't be reproduced because since then the QUIC
stack has gaind in efficiency, and the 100 Gbps barrier is now reached
with or without the mt_list update. However the unit tests definitely
show a huge difference, particularly on EPYC platforms where the EBO
provides tremendous CPU savings.

Overall, the following changes are visible from the application code:

  - mt_list_for_each_entry_safe() + 1 back elem + 1 back ptr
    => MT_LIST_FOR_EACH_ENTRY_LOCKED() or MT_LIST_FOR_EACH_ENTRY_UNLOCKED()
       + 1 back elem

  - MT_LIST_DELETE_SAFE() no longer needed in MT_LIST_FOR_EACH_ENTRY_UNLOCKED()
      => just manually set iterator to NULL however.
    For MT_LIST_FOR_EACH_ENTRY_LOCKED()
      => mt_list_unlock_self() (if element going to be reused) + NULL

  - MT_LIST_LOCK_ELT => mt_list_lock_full()
  - MT_LIST_UNLOCK_ELT => mt_list_unlock_full()

  - l = MT_LIST_APPEND_LOCKED(h, e);  MT_LIST_UNLOCK_ELT();
    => l=mt_list_lock_prev(h); mt_list_lock_elem(e); mt_list_unlock_full(e, l)
2024-07-09 16:46:38 +02:00
Willy Tarreau 87d269707b OPTIM: pool: improve needed_avg cache line access pattern
On an AMD EPYC 3rd gen, 20% of the CPU is spent calculating the amount
of pools needed when using QUIC, because pool allocations/releases are
quite frequent and the inter-CCX communication is super slow. Still,
there's a way to save between 0.5 and 1% CPU by using fetch-add and
sub-fetch that are converted to XADD so that the result is directly
fed into the swrate_add argument without having to re-read the memory
area. That's what this patch does.
2024-07-09 16:46:38 +02:00
William Lallemand 9797a7718c MINOR: ssl/sample: ssl_c_san returns a comma separated list of SAN
The ssl_c_san sample fetch returns a list of Subject Alt Name which was
presented by the client certificate.

The format is the same as the "openssl x509 -text" command, it's a
Description: Value list separated by commas.
The format is directly generated by the GENERAL_NAME_print() openssl
function.

https://github.com/openssl/openssl/blob/openssl-3.0/crypto/x509/v3_san.c#L207

Example:
    IP Address:127.0.0.1, IP Address:127.0.0.2, IP Address:127.0.0.3, URI:http://docs.haproxy.org/2.7/, DNS:ca.tests.haproxy.com
2024-07-09 13:57:18 +02:00
William Lallemand 0a1b251c1a BUG/MINOR: jwt: fix variable initialisation
Set the alg variable from sample_conv_jwt_verify_check() to
JWT_ALG_DEFAULT.

This was reported by coverity in #2630, but since you need to use the
first argument to use the 2nd, this has no real impact.

Mut be backported with 883f1bd (as far as 2.6).
2024-07-08 14:23:14 +02:00
Valentine Krasnobaeva 16a5fac4bb BUG/MEDIUM: init: fix fd_hard_limit default in compute_ideal_maxconn
This commit fixes 41275a691 ("MEDIUM: init: set default for fd_hard_limit via
DEFAULT_MAXFD").

fd_hard_limit is taken in account implicitly via 'ideal_maxconn' value in
all maxconn adjustements, when global.rlimit_memmax is set:

	MIN(global.maxconn, capped by global.rlimit_memmax, ideal_maxconn);

It also caps provided global.rlimit_nofile, if it couldn't be set as a current
process fd limit (see more details in the main() code).

So, lets set the default value for fd_hard_limit only, when there is no any
other haproxy-specific limit provided, i.e. rlimit_memmax, maxconn,
rlimit_nofile. Otherwise we may break users configs.

Please, note, that in master-worker mode, master does not need the
DEFAULT_MAXFD (1048576) as well, as we explicitly limit its maxconn to 100.

Must be backported in all stable versions until v2.6.0, including v2.6.0,
like the commit above.
2024-07-08 11:26:16 +02:00
Amaury Denoyelle 3d4baa3c7b MINOR: quic: rename "ssl error" trace
SSL status is reported each time quic_conn_io_cb() is finished via a
trace. Change the trace label from "ssl error" to "ssl status". This
allows to search for errors easier without being distracted by this
trace.
2024-07-08 09:38:35 +02:00