Commit Graph

20653 Commits

Author SHA1 Message Date
Andrew Hopkins
88988bb06c REGTESTS: ssl: skip ssl_dh test with AWS-LC
skip ssl_dh test when HAProxy is built with AWS-LC which does not support FFDH ciphersuites.
2023-09-04 18:21:01 +02:00
Andrew Hopkins
b3f94f8b3b BUILD: ssl: Build with new cryptographic library AWS-LC
This adds a new option for the Makefile USE_OPENSSL_AWSLC, and
update the documentation with instructions to use HAProxy with
AWS-LC.

Update the type of the OCSP callback retrieved with
SSL_CTX_get_tlsext_status_cb with the actual type for
libcrypto versions greater than 1.0.2. This doesn't affect
OpenSSL which casts the callback to void* in SSL_CTX_ctrl.
2023-09-04 18:19:18 +02:00
Miroslav Zagorac
3cfc30416c MINOR: properly mark the end of the CLI command in error messages
In several places in the file src/ssl_ckch.c, in the message about the
incorrect use of the CLI command, the end of that CLI command is not
correctly marked with the sign ' .
2023-09-04 18:13:43 +02:00
William Lallemand
637306c86d DOC: configuration: update examples for req.ver
Update the documentation for the req.ver sample fetch.

Could be backported as far as 2.6.
2023-09-04 18:12:58 +02:00
Willy Tarreau
8547f5cfa2 BUG/MINOR: stream: further protect stream_dump() against incomplete sessions
As found by Coverity in issue #2273, the fix in commit e64bccab2 ("BUG/MINOR:
stream: protect stream_dump() against incomplete streams") was still not
enough, as scf/scb are still dereferenced to dump their flags and states.

This should be backported to 2.8.
2023-09-04 15:32:17 +02:00
Chris Staite
3939e39479 BUG/MEDIUM: h1-htx: Ensure chunked parsing with full output buffer
A previous fix to ensure that there is sufficient space on the output buffer
to place parsed data (#2053) introduced an issue that if the output buffer is
filled on a chunk boundary no data is parsed but the congested flag is not set
due to the state not being H1_MSG_DATA.

The check to ensure that there is sufficient space in the output buffer is
actually already performed in all downstream functions before it is used.
This makes the early optimisation that avoids the state transition to
H1_MSG_DATA needless.  Therefore, in order to allow the chunk parser to
continue in this edge case we can simply remove the early check.  This
ensures that the state can progress and set the congested flag correctly
in the caller.

This patch fixes #2262. The upstream change that caused this logic error was
backported as far as 2.5, therefore it makes sense to backport this fix back
that far also.
2023-09-04 12:15:36 +02:00
Willy Tarreau
135c66f6cb BUG/MEDIUM: connection: fix pool free regression with recent ppv2 TLV patches
In commit fecc573da ("MEDIUM: connection: Generic, list-based allocation
and look-up of PPv2 TLVs") there was a tiny mistake, elements of length
<= 128 are allocated from pool_pp_128 but only those of length < 128 are
released to this pool, other ones go to pool_pp_256. Because of this,
elements of size exactly 128 are allocated from 128 and released to 256.
It can be reproduced a few times by running sample_fetches/tlvs.vtc 1000
times with -DDEBUG_DONT_SHARE_POOLS -DDEBUG_MEMORY_POOLS -DDEBUG_EXPR
-DDEBUG_STRICT=2 -DDEBUG_POOL_INTEGRITY -DDEBUG_POOL_TRACING
-DDEBUG_NO_POOLS. Not sure why it doesn't reproduce more often though.

No backport is needed. This should address github issues #2275 and #2274.
2023-09-04 11:45:37 +02:00
Frédéric Lécaille
d52466726f BUG/MINOR: quic: Unchecked pointer to packet number space dereferenced
It is possible that there are still Initial crypto data in flight without
Handshake crypto data in flight. This is very rare but possible.

This issue was reported by long-rtt interop test with quic-go as client
and @chipitsine in GH #2276.

No need to backport.
2023-09-04 11:29:35 +02:00
Frédéric Lécaille
9077f20251 BUG/MAJOR: quic: Really ignore malformed ACK frames.
If not correctly parsed, an ACK frame must be ignored without any more
treatment. Before this patch an ACK frame could be partially correctly
parsed, then some errors could be detected which leaded newly acknowledged
packets to be released in a wrong way calling free_quic_tx_pkts() called
by qc_parse_ack_frm(). But there is no reason to release such packets because
of a malformed ACK frame.

This patch modifies qc_parse_ack_frm(). The newly acknowledged TX packets is done
in two steps. It first collects the newly acknowledged packet calling
qc_newly_acked_pkts(). Then proceed the same way as before for the treatments of
haproxy TX packets acknowledged by the peer. If the ACK frame could not be fully
parsed, the newly ackowledged packets are replaced back from where they were
detached: the tree of TX packets for their encryption level.

Must be backported as far as 2.6.
2023-09-04 11:29:35 +02:00
Frédéric Lécaille
7dad52bdbd MINOR: quic: Add a trace to quic_release_frm()
Display the address of the frame to be released as soon as entering into
quic_release_frm() whose job is obviously to released the memory allocated
for the frame <frm> passed as parameter.
2023-09-04 11:29:35 +02:00
Frédéric Lécaille
3c90c1ce6b BUG/MINOR: quic: Possible skipped RTT sampling
There are very few chances this bug may occur. Furthermore the consequences
are not dramatic: an RTT sampling may be ignored. I guess this may happen
when the now_ms global value wraps.

Do not rely on the time variable value a packet was sent to decide if it
is a newly acknowledged packet but on its presence or not in the tx packet
ebtree.

Must be backported as far as 2.6.
2023-09-04 11:29:35 +02:00
Christopher Faulet
b50a471adb BUG/MEDIUM: stconn: Don't block sends if there is a pending shutdown
For the same reason than the previous patch, we must not block the sends
when there is a pending shutdown. In other words, we must consider the sends
are allowed when there is a pending shutdown.

This patch must slowly be backported as far as 2.2. It should partially fix
issue #2249.
2023-09-01 14:18:26 +02:00
Christopher Faulet
0b93ff8c87 BUG/MEDIUM: stconn: Wake applets on sending path if there is a pending shutdown
An applet is not woken up on sending path if it is not waiting for data or
if it states it will not consume data. However, it is important to still
wake it up if there is a pending shutdown. Otherwise, the event may be
missed and some data may remain blocked in the channel's buffer.

Because of this bug, it is possible to have a stream stuck if data are also
blocked on the opposite channel. It is for instance possible to hit the buf
with the stats applet and a client not consuming data.

This patch must slowly be backported as far as 2.2. It should partially fix
issue #2249.
2023-09-01 14:18:26 +02:00
Christopher Faulet
9e394d34e0 BUG/MINOR: stconn: Don't report blocked sends during connection establishment
The server timeout must not be handled during the connection establishment
to not superseed the connect timeout. To do so, we must not consider
outgoing data are blocked during this stage. Concretly, it means the fsb
time must not be updated during connection establishment.

It is not an issue with regular clients because the server timeout is only
defined when the connection is estalished. However, it may be an issue for the
HTTP client, when the server timeout is lower than the connect timeout. In this
case, an early 502 may be reported with no connection retries.

This patch must be backported to 2.8.
2023-09-01 14:18:26 +02:00
Christopher Faulet
3479d99d5f BUG/MEDIUM: stconn: Update stream expiration date on blocked sends
When outgoing data are blocked, we must update the stream expiration date
and requeue the task. It is important to be sure to properly handle write
timeout, expecially if the stream cannot expire on reads. This bug was
introduced when handling of channel's timeouts was refactored to be managed
by the stream-connectors.

It is an issue if there is no server timeout and the client does not consume
the response (or the opposite but it is less common). It is also possible to
trigger the same scenario with applets on server side because, most of time,
there is no server timeout.

This patch must be backported to 2.8.
2023-09-01 14:18:26 +02:00
Christopher Faulet
49ed83e948 DEBUG: applet: Properly report opposite SC expiration dates in traces
The wrong label was used in trace to report expiration dates of the opposite
SC. "sc" was used instead of "sco".

This patch should be backported to 2.8.
2023-09-01 14:18:26 +02:00
Willy Tarreau
b0031d9679 MINOR: checks: also consider the thread's queue for rebalancing
Let's also check for other threads when the current one is queueing,
let's not wait for the load to be high. Now this totally eliminates
differences between threads.
2023-09-01 14:00:04 +02:00
Willy Tarreau
844a3bc25b MEDIUM: checks: implement a queue in order to limit concurrent checks
The progressive adoption of OpenSSL 3 and its abysmal handshake
performance has started to reveal situations where it simply isn't
possible anymore to succesfully run health checks on many servers,
because between the moment all the checks are started and the moment
the handshake finally completes, the timeout has expired!

This also has consequences on production traffic which gets
significantly delayed as well, all that for lots of checks. While it's
possible to increase the check delays, it doesn't solve everything as
checks still take a huge amount of time to converge in such conditions.

Here we take a different approach by permitting to enforce the maximum
concurrent checks per thread limitation and implementing an ordered
queue. Thanks to this, if a thread about to start a check has reached
its limit, it will add the check at the end of a queue and it will be
processed once another check is finished. This proves to be extremely
efficient, with all checks completing in a reasonable amount of time
and not being disturbed by the rest of the traffic from other checks.
They're just cycling slower, but at the speed the machine can handle.

One must understand however that if some complex checks perform multiple
exchanges, they will take a check slot for all the required duration.
This is why the limit is not enforced by default.

Tests on SSL show that a limit of 5-50 checks per thread on local
servers gives excellent results already, so that could be a good starting
point.
2023-09-01 14:00:04 +02:00
Willy Tarreau
cfc0bceeb5 MEDIUM: checks: search more aggressively for another thread on overload
When the current check is overloaded (more running checks than the
configured limit), we'll try more aggressively to find another thread.
Instead of just opportunistically looking for one half as loaded, now if
the current thread has more than 1% more active checks than another one,
or has more than a configured limit of concurrent running checks, it will
search for a more suitable thread among 3 other random ones in order to
migrate the check there. The number of migrations remains very low (~1%)
and the checks load very fair across all threads (~1% as well). The new
parameter is called tune.max-checks-per-thread.
2023-09-01 08:26:06 +02:00
Willy Tarreau
016e189ea3 MINOR: check: also consider the random other thread's active checks
When checking if it's worth transferring a sleeping thread to another
random thread, let's also check if that random other thread has less
checks than the current one, which is another reason for transferring
the load there.

This commit adds a function "check_thread_cmp_load()" to compare two
threads' loads in order to simplify the decision taking.

The minimum active check count before starting to consider rebalancing
the load was now raised from 2 to 3, because tests show that at 15k
concurrent checks, at 2, 50% are evaluated for rebalancing and 30%
are rebalanced, while at 3, this is cut in half.
2023-09-01 08:26:06 +02:00
Willy Tarreau
00de9e0804 MINOR: checks: maintain counters of active checks per thread
Let's keep two check counters per thread:
  - one for "active" checks, i.e. checks that are no more sleeping
    and are assigned to the thread. These include sleeping and
    running checks ;

  - one for "running" checks, i.e. those which are currently
    executing on the thread.

By doing so, we'll be able to spread the health checks load a bit better
and refrain from sending too many at once per thread. The counters are
atomic since a migration increments the target thread's active counter.
These numbers are reported in "show activity", which allows to check
per thread and globally how many checks are currently pending and running
on the system.

Ideally, we should only consider checks in the process of establishing
a connection since that's really the expensive part (particularly with
OpenSSL 3.0). But the inner layers are really not suitable to doing
this. However knowing the number of active checks is already a good
enough hint.
2023-09-01 08:26:06 +02:00
Willy Tarreau
3b7942a1c9 MINOR: check/activity: collect some per-thread check activity stats
We now count the number of times a check was started on each thread
and the number of times a check was adopted. This helps understand
better what is observed regarding checks.
2023-09-01 08:26:06 +02:00
Willy Tarreau
e03d05c6ce MINOR: check: remember when we migrate a check
The goal here is to explicitly mark that a check was migrated so that
we don't do it again. This will allow us to perform other actions on
the target thread while still knowing that we don't want to be migrated
again. The new READY bit combine with SLEEPING to form 4 possible states:

 SLP  RDY   State      Description
  0    0    -          (reserved)
  0    1    RUNNING    Check is bound to current thread and running
  1    0    SLEEPING   Check is sleeping, not bound to a thread
  1    1    MIGRATING  Check is migrating to another thread

Thus we set READY upon migration, and check for it before migrating, this
is sufficient to prevent a second migration. To make things a bit clearer,
the SLEEPING bit was switched with FASTINTER so that SLEEPING and READY are
adjacent.
2023-09-01 08:26:06 +02:00
Willy Tarreau
3544c9f8a0 MINOR: checks: pin the check to its thread upon wakeup
When a check leaves the sleeping state, we must pin it to the thread that
is processing it. It's normally always the case after the first execution,
but initial checks that start assigned to any thread (-1) could be assigned
much later, causing problems with planned changes involving queuing. Thus
better do it early, so that all threads start properly pinned.
2023-09-01 08:26:06 +02:00
Willy Tarreau
7163f95b43 MINOR: checks: start the checks in sleeping state
The CHK_ST_SLEEPING state was introduced by commit d114f4a68 ("MEDIUM:
checks: spread the checks load over random threads") to indicate that
a check was not currently bound to a thread and that it could easily
be migrated to any other thread. However it did not start the checks
in this state, meaning that they were not redispatchable on startup.

Sometimes under heavy load (e.g. when using SSL checks with OpenSSL 3.0)
the cost of setting up new connections is so high that some threads may
experience connection timeouts on startup. In this case it's better if
they can transfer their excess load to other idle threads. By just
marking the check as sleeping upon startup, we can do this and
significantly reduce the number of failed initial checks.
2023-09-01 08:26:06 +02:00
Willy Tarreau
48442b8b15 BUG/MINOR: checks: do not queue/wake a bounced check
A small issue was introduced with commit d114f4a68 ("MEDIUM: checks:
spread the checks load over random threads"): when a check is bounced
to another thread, its expiration time is set to TICK_ETERNITY. This
makes it show as not expired upon first wakeup on the next thread,
thus being detected as "woke up too early" and being instantly
rescheduled. Only this after this next wakeup it will be properly
considered.

Several approaches were attempted to fix this. The best one seems to
consist in resetting t->expire and expired upon wakeup, and changing
the !expired test for !tick_is_expired() so that we don't trigger on
this case.

This needs to be backported to 2.7.
2023-09-01 08:26:06 +02:00
Willy Tarreau
338431ecb6 MINOR: activity: report the current run queue size
While troubleshooting the causes of load spikes, it appeared that the
length of individual run queues was missing, let's add it to "show
activity".
2023-09-01 08:26:06 +02:00
Willy Tarreau
2cb896c4b0 MEDIUM: server/ssl: pick another thread's session when we have none yet
The per-thread SSL context in servers causes a burst of connection
renegotiations on startup, both for the forwarded traffic and for the
health checks. Health checks have been seen to continue to cause SSL
rekeying for several minutes after a restart on large thread-count
machines. The reason is that the context is exlusively per-thread
and that the more threads there are, the more likely it is for a new
connection to start on a thread that doesn't have such a context yet.

In order to improve this situation, this commit ensures that a thread
starting an SSL connection to a server without a session will first
look at the last session that was updated by another thread, and will
try to use it. In order to minimize the contention, we're using a read
lock here to protect the data, and the first-level index is an integer
containing the thread number, that is always valid and may always be
dereferenced. This way the session retrieval algorithm becomes quite
simple:

  - if the last thread index is valid, then try to use the same session
    under a read lock ;
  - if any error happens, then atomically nuke the index so that other
    threads don't use it and the next one to update a connection updates
    it again

And for the ssl_sess_new_srv_cb(), we have this:

  - update the entry under a write lock if the new session is valid,
    otherwise kill it if the session is not valid;
  - atomically update the index if it was 0 and the new one is valid,
    otherwise atomically nuke it if the session failed.

Note that even if only the pointer is destroyed, the element will be
re-allocated by the next thread during the sess_new_srv_sb().

Right now a session is picked even if the SNI doesn't match, because
we don't know the SNI yet during ssl_sock_init(), but that's essentially
a matter of API, since connect_server() figures the SNI very early, then
calls conn_prepare() which calls ssl_sock_init(). Thus in the future we
could easily imaging storing a number of SNI-based contexts instead of
storing contexts per thread.

It could be worth backporting this to one LTS version after some
observation, though this is not strictly necessary. the current commit
depends on the following ones:

  BUG/MINOR: ssl_sock: fix possible memory leak on OOM
  MINOR: ssl_sock: avoid iterating realloc(+1) on stored context
  DOC: ssl: add some comments about the non-obvious session allocation stuff
  CLEANUP: ssl: keep a pointer to the server in ssl_sock_init()
  MEDIUM: ssl_sock: always use the SSL's server name, not the one from the tid
  MEDIUM: server/ssl: place an rwlock in the per-thread ssl server session
  MINOR: server/ssl: maintain an index of the last known valid SSL session
  MINOR: server/ssl: clear the shared good session index on failure
  MEDIUM: server/ssl: pick another thread's session when we have none yet
2023-08-31 09:27:14 +02:00
Willy Tarreau
777f62cfb7 MINOR: server/ssl: clear the shared good session index on failure
If we fail to set the session using SSL_set_session(), we want to quickly
erase our index from the shared one so that any other thread with a valid
session replaces it.
2023-08-31 08:50:01 +02:00
Willy Tarreau
52b260bae4 MINOR: server/ssl: maintain an index of the last known valid SSL session
When a thread creates a new session for a server, if none was known yet,
we assign the thread id (hence the reused_sess index) to a shared variable
so that other threads will later be able to find it when they don't have
one yet. For now we only set and clear the pointer upon session creation,
we do not yet pick it.

Note that we could have done it per thread-group, so as to avoid any
cross-thread exchanges, but it's anticipated that this is essentially
used during startup, at a moment where the cost of inter-thread contention
is very low compared to the ability to restart at full speed, which
explains why instead we store a single entry.
2023-08-31 08:50:01 +02:00
Willy Tarreau
607041dec3 MEDIUM: server/ssl: place an rwlock in the per-thread ssl server session
The goal will be to permit a thread to update its session while having
it shared with other threads. For now we only place the lock and arrange
the code around it so that this is quite light. For now only the owner
thread uses this lock so there is no contention.

Note that there is a subtlety in the openssl API regarding
i2s_SSL_SESSION() in that it fills the area pointed to by its argument
with a dump of the session and returns a size that's equal to the
previously allocated one. As such, it does modify the shared area even
if that's not obvious at first glance.
2023-08-31 08:50:01 +02:00
Willy Tarreau
95ac5fe4a8 MEDIUM: ssl_sock: always use the SSL's server name, not the one from the tid
In ssl_sock_set_servername(), we're retrieving the current server name
from the current thread, hoping it will not have changed. This is a
bit dangerous as strictly speaking it's not easy to prove that no other
connection had to use one between the moment it was retrieved in
ssl_sock_init() and the moment it's being read here. In addition, this
forces us to maintain one session per thread while this is not the real
need, in practice we only need one session per SNI. And the current model
prevents us from sharing sessions between threads.

This had been done in 2.5 via commit e18d4e828 ("BUG/MEDIUM: ssl: backend
TLS resumption with sni and TLSv1.3"), but as analyzed with William, it
turns out that a saner approach consists in keeping the call to
SSL_get_servername() there and instead to always assign the SNI to the
current SSL context via SSL_set_tlsext_host_name() immediately when the
session is retreived. This way the session and SNI are consulted atomically
and the host name is only checked from the session and not from possibly
changing elements.

As a bonus the rdlock that was added by that commit could now be removed,
though it didn't cost much.
2023-08-31 08:49:15 +02:00
Willy Tarreau
335b5adf2c CLEANUP: ssl: keep a pointer to the server in ssl_sock_init()
We're using about 6 times "__objt_server(conn->target)" there, it's not
quite easy to read, let's keep a pointer to the server.
2023-08-30 18:58:40 +02:00
Willy Tarreau
bc31ef0896 DOC: ssl: add some comments about the non-obvious session allocation stuff
The SSL session allocation/reuse part is far from being trivial, and
there are some necessary tricks such as allocating then immediately
freeing that are required by the API due to internal refcount. All of
this is particularly hard to grasp, even with the scarce man pages.

Let's document a little bit what's granted and expected along this path
to help the reader later.
2023-08-30 11:43:06 +02:00
Willy Tarreau
2c6fe24001 MINOR: ssl_sock: avoid iterating realloc(+1) on stored context
The SSL context storage in servers is per-thread, and the contents are
allocated for a length that is determined from the session. It turns out
that placing some traces there revealed that the realloc() that is called
to grow the area can be called multiple times in a row even for just
health checks, to grow the area by just one or two bytes. Given that
malloc() allocates in multiples of 8 or 16 anyway, let's round the
allocated size up to the nearest multiple of 8 to avoid this unneeded
operation.
2023-08-30 11:43:06 +02:00
Alexander Stephan
2cc53ecc8f MINOR: sample: Add common TLV types as constants for fc_pp_tlv
This patch adds common TLV types as specified in the PPv2 spec.
We will use the suffix of the type, e.g., PP2_TYPE_AUTHORITY becomes AUTHORITY.
2023-08-29 15:32:02 +02:00
Alexander Stephan
0a4f6992e0 MINOR: sample: Refactor fc_pp_unique_id by wrapping the generic TLV fetch
The fetch logic is redundant and can be simplified by simply
calling the generic fetch with the correct TLV ID set as an
argument, similar to fc_pp_authority.
2023-08-29 15:32:01 +02:00
Alexander Stephan
ece0d1ab49 MINOR: sample: Refactor fc_pp_authority by wrapping the generic TLV fetch
We already have a call that can retreive an TLV with any value.
Therefore, the fetch logic is redundant and can be simplified
by simply calling the generic fetch with the correct TLV ID
set as an argument.
2023-08-29 15:31:51 +02:00
Alexander Stephan
f773ef721c MEDIUM: sample: Add fetch for arbitrary TLVs
Based on the new, generic allocation infrastructure, a new sample
fetch fc_pp_tlv is introduced. It is an abstraction for existing
PPv2 TLV sample fetches. It takes any valid TLV ID as argument and
returns the value as a string, similar to fc_pp_authority and
fc_pp_unique_id.
2023-08-29 15:31:28 +02:00
Alexander Stephan
fecc573da1 MEDIUM: connection: Generic, list-based allocation and look-up of PPv2 TLVs
In order to be able to implement fetches in the future that allow
retrieval of any TLVs, a new generic data structure for TLVs is introduced.

Existing TLV fetches for PP2_TYPE_AUTHORITY and PP2_TYPE_UNIQUE_ID are
migrated to use this new data structure. TLV related pools are updated
to not rely on type, but only on size. Pools accomodate the TLV list
element with their associated value. For now, two pools for 128 B and
256 B values are introduced. More fine-grained solutions are possible
in the future, if necessary.
2023-08-29 15:15:47 +02:00
Alexander Stephan
c9d47652d2 CLEANUP/MINOR: connection: Improve consistency of PPv2 related constants
This patch improves readability by scoping HA proxy related PPv2 constants
with a 'HA" prefix. Besides, a new constant for the length of a CRC32C
TLV is introduced. The length is derived from the PPv2 spec, so 32 Bit.
2023-08-29 15:15:47 +02:00
Willy Tarreau
bd84387beb MEDIUM: capabilities: enable support for Linux capabilities
For a while there has been the constraint of having to run as root for
transparent proxying, and we're starting to see some cases where QUIC is
not running in socket-per-connection mode due to the missing capability
that would be needed to bind a privileged port. It's not realistic to
ask all QUIC users on port 443 to run as root, so instead let's provide
a basic support for capabilities at least on linux. The ones currently
supported are cap_net_raw, cap_net_admin and cap_net_bind_service. The
mechanism was made OS-specific with a dedicated file because it really
is. It can be easily refined later for other OSes if needed.

A new keyword "setcaps" is added to the global section, to enumerate the
capabilities that must be kept when switching from root to non-root. This
is ignored in other situations though. HAProxy has to be built with
USE_LINUX_CAP=1 for this to be supported, which is enabled by default
for linux-glibc, linux-glibc-legacy and linux-musl.

A good way to test this is to start haproxy with such a config:

    global
        uid 1000
        setcap cap_net_bind_service

    frontend test
        mode http
        timeout client 3s
        bind quic4@:443 ssl crt rsa+dh2048.pem allow-0rtt

and run it under "sudo strace -e trace=bind,setuid", then connecting
there from an H3 client. The bind() syscall must succeed despite the
user id having been switched.
2023-08-29 11:11:50 +02:00
Willy Tarreau
4d5f7d94b9 DOC: config: mention uid dependency on the tune.quic.socket-owner option
This option defaults to "connection" but is also dependent on the user
being allowed to bind the specified port. Since QUIC can easily run on
non-privileged ports, usually this is not a problem, but if bound to port
443 it will usually fail. Let's mention this.
2023-08-29 11:11:50 +02:00
Willy Tarreau
e64bccab20 BUG/MINOR: stream: protect stream_dump() against incomplete streams
If a stream is interrupted during its initialization by a panic signal
and tries to dump itself, it may cause a crash during the dump due to
scf and/or scb not being fully initialized. This may also happen while
releasing an endpoint to attach a new one. The effect is that instead
of dying on an abort, the process dies on a segv. This race is ultra-
rare but totally possible. E.g:

  #0  se_fl_test (test=1, se=0x0) at include/haproxy/stconn.h:98
  #1  sc_ep_test (test=1, sc=0x7ff8d5cbd560) at include/haproxy/stconn.h:148
  #2  sc_conn (sc=0x7ff8d5cbd560) at include/haproxy/stconn.h:223
  #3  stream_dump (buf=buf@entry=0x7ff9507e7678, s=0x7ff4c40c8800, pfx=pfx@entry=0x55996c558cb3 ' ' <repeats 13 times>, eol=eol@entry=10 '\n') at src/stream.c:2840
  #4  0x000055996c493b42 in ha_task_dump (buf=buf@entry=0x7ff9507e7678, task=<optimized out>, pfx=pfx@entry=0x55996c558cb3 ' ' <repeats 13 times>) at src/debug.c:328
  #5  0x000055996c493edb in ha_thread_dump_one (thr=thr@entry=18, from_signal=from_signal@entry=0) at src/debug.c:227
  #6  0x000055996c493ff1 in ha_thread_dump (buf=buf@entry=0x7ff9507e7678, thr=thr@entry=18) at src/debug.c:270
  #7  0x000055996c494257 in ha_panic () at src/debug.c:430
  #8  ha_panic () at src/debug.c:411
  (...)
  #23 0x000055996c341fe8 in ssl_sock_close (conn=<optimized out>, xprt_ctx=0x7ff8dcae3880) at src/ssl_sock.c:6699
  #24 0x000055996c397648 in conn_xprt_close (conn=0x7ff8c297b0c0) at include/haproxy/connection.h:148
  #25 conn_full_close (conn=0x7ff8c297b0c0) at include/haproxy/connection.h:192
  #26 h1_release (h1c=0x7ff8c297b3c0) at src/mux_h1.c:1074
  #27 0x000055996c39c9f0 in h1_detach (sd=<optimized out>) at src/mux_h1.c:3502
  #28 0x000055996c474de4 in sc_detach_endp (scp=scp@entry=0x7ff9507e3148) at src/stconn.c:375
  #29 0x000055996c4752a5 in sc_reset_endp (sc=<optimized out>, sc@entry=0x7ff8d5cbd560) at src/stconn.c:475

Note that this cannot happen on "show sess" since a stream never leaves
process_stream in such an uninitialized state, thus it's really only the
crash dump that may cause this.

It should be backported to 2.8.
2023-08-29 11:11:50 +02:00
William Lallemand
e7d9082315 BUG/MINOR: ssl/cli: can't find ".crt" files when replacing a certificate
Bug was introduced by commit 26654 ("MINOR: ssl: add "crt" in the
cert_exts array").

When looking for a .crt directly in the cert_exts array, the
ssl_sock_load_pem_into_ckch() function will be called with a argument
which does not have its ".crt" extensions anymore.

If "ssl-load-extra-del-ext" is used this is not a problem since we try
to add the ".crt" when doing the lookup in the tree.

However when using directly a ".crt" without this option it will failed
looking for the file in the tree.

The fix removes the "crt" entry from the array since it does not seem to
be really useful without a rework of all the lookups.

Should fix issue #2265

Must be backported as far as 2.6.
2023-08-28 18:20:39 +02:00
Willy Tarreau
0074c36dd2 BUILD: pools: import plock.h to build even without thread support
In 2.9-dev4, commit 544c2f2d9 ("MINOR: pools: use EBO to wait for
unlock during pool_flush()") broke the thread-less build by calling
pl_wait_new_long() without explicitly including plock.h which is
normally included by thread.h when threads are enabled.
2023-08-26 17:28:08 +02:00
Willy Tarreau
892d04733f BUILD: import: guard plock.h against multiple inclusion
Surprisingly there's no include guard in plock.h though there is one in
atomic-ops.h. Let's add one, or we cannot risk including the file multiple
times.
2023-08-26 17:28:08 +02:00
Willy Tarreau
a7b9baa2cc BUG/MEDIUM: mux-h2: fix crash when checking for reverse connection after error
If the connection is closed in h2_release(), which is indicated by ret<0, we
must not dereference conn anymore. This was introduced in 2.9-dev4 by commit
5053e8914 ("MEDIUM: h2: prevent stream opening before connection reverse
completed") and detected after a few hours of runtime thanks to running with
pool integrity checks and caller enabled. No backport is needed.
2023-08-26 17:05:19 +02:00
Willy Tarreau
518349f08a [RELEASE] Released version 2.9-dev4
Released version 2.9-dev4 with the following main changes :
    - DEV: flags/show-sess-to-flags: properly decode fd.state
    - BUG/MINOR: stktable: allow sc-set-gpt(0) from tcp-request connection
    - BUG/MINOR: stktable: allow sc-add-gpc from tcp-request connection
    - DOC: typo: fix sc-set-gpt references
    - SCRIPTS: git-show-backports: automatic ref and base detection with -m
    - REGTESTS: Do not use REQUIRE_VERSION for HAProxy 2.5+ (3)
    - DOC: jwt: Add explicit list of supported algorithms
    - BUILD: Makefile: add the USE_QUIC option to make help
    - BUILD: Makefile: add USE_QUIC_OPENSSL_COMPAT to make help
    - BUILD: Makefile: realigned USE_* options in make help
    - DEV: makefile: fix POSIX compatibility for "range" target
    - IMPORT: plock: also support inlining the int code
    - IMPORT: plock: always expose the inline version of the lock wait function
    - IMPORT: lorw: support inlining the wait call
    - MINOR: threads: inline the wait function for pthread_rwlock emulation
    - MINOR: atomic: make sure to always relax after a failed CAS
    - MINOR: pools: use EBO to wait for unlock during pool_flush()
    - BUILD/IMPORT: fix compilation with PLOCK_DISABLE_EBO=1
    - MINOR: quic+openssl_compat: Do not start without "limited-quic"
    - MINOR: quic+openssl_compat: Emit an alert for "allow-0rtt" option
    - BUG/MINOR: quic: allow-0rtt warning must only be emitted with quic bind
    - BUG/MINOR: quic: ssl_quic_initial_ctx() uses error count not error code
    - MINOR: pattern: do not needlessly lookup the LRU cache for empty lists
    - IMPORT: xxhash: update xxHash to version 0.8.2
    - MINOR: proxy: simplify parsing 'backend/server'
    - MINOR: connection: centralize init/deinit of backend elements
    - MEDIUM: connection: implement passive reverse
    - MEDIUM: h2: reverse connection after SETTINGS reception
    - MINOR: server: define reverse-connect server
    - MINOR: backend: only allow reuse for reverse server
    - MINOR: tcp-act: parse 'tcp-request attach-srv' session rule
    - REGTESTS: provide a reverse-server test
    - MINOR: tcp-act: define optional arg name for attach-srv
    - MINOR: connection: use attach-srv name as SNI reuse parameter on reverse
    - REGTESTS: provide a reverse-server test with name argument
    - MINOR: proto: define dedicated protocol for active reverse connect
    - MINOR: connection: extend conn_reverse() for active reverse
    - MINOR: proto_reverse_connect: parse rev@ addresses for bind
    - MINOR: connection: prepare init code paths for active reverse
    - MEDIUM: proto_reverse_connect: bootstrap active reverse connection
    - MINOR: proto_reverse_connect: handle early error before reversal
    - MEDIUM: h2: implement active connection reversal
    - MEDIUM: h2: prevent stream opening before connection reverse completed
    - REGTESTS: write a full reverse regtest
    - BUG/MINOR: h2: fix reverse if no timeout defined
    - CI: fedora: fix "dnf" invocation syntax
    - BUG/MINOR: hlua_fcn: potentially unsafe stktable_data_ptr usage
    - DOC: lua: fix Sphinx warning from core.get_var()
    - DOC: lua: fix core.register_action typo
    - BUG/MINOR: ssl_sock: fix possible memory leak on OOM
    - MEDIUM: map/acl: Improve pat_ref_set() efficiency (for "set-map", "add-acl" action perfs)
    - MEDIUM: map/acl: Improve pat_ref_set_elt() efficiency (for "set-map", "add-acl"action perfs)
    - MEDIUM: map/acl: Accelerate several functions using pat_ref_elt struct ->head list
    - MEDIUM: map/acl: Replace map/acl spin lock by a read/write lock.
    - DOC: map/acl: Remove the comments about map/acl performance issue
    - DOC: Explanation of be_name and be_id fetches
    - MINOR: connection: simplify removal of idle conns from their trees
    - MINOR: server: move idle tree insert in a dedicated function
    - MAJOR: connection: purge idle conn by last usage
2023-08-25 17:57:22 +02:00
Amaury Denoyelle
5afcb686b9 MAJOR: connection: purge idle conn by last usage
Backend idle connections are purged on a recurring occurence during the
process lifetime. An estimated number of needed connections is
calculated and the excess is removed periodically.

Before this patch, purge was done directly using the idle then the safe
connection tree of a server instance. This has a major drawback to take
no account of a specific ordre and it may removed functional connections
while leaving ones which will fail on the next reuse.

The problem can be worse when using criteria to differentiate idle
connections such as the SSL SNI. In this case, purge may remove
connections with a high rate of reusing while leaving connections with
criteria never matched once, thus reducing drastically the reuse rate.

To improve this, introduce an alternative storage for idle connection
used in parallel of the idle/safe trees. Now, each connection inserted
in one of this tree is also inserted in the new list at
`srv_per_thread.idle_conn_list`. This guarantees that recently used
connection is present at the end of the list.

During the purge, use this list instead of idle/safe trees. Remove first
connection in front of the list which were not reused recently. This
will ensure that connection that are frequently reused are not purged
and should increase the reuse rate, particularily if distinct idle
connection criterias are in used.
2023-08-25 15:57:48 +02:00