Commit Graph

21830 Commits

Author SHA1 Message Date
Ilia Shipitsin
3a0fc8641b CI: temporarily adjust kernel entropy to work with ASAN/clang
clang runtime (shipped with clang14) is not compatible with recent
Ubuntu kernels

more details: https://github.com/actions/runner-images/issues/9491
2024-03-18 19:54:33 +01:00
Ilia Shipitsin
5fe02c33bc CLEANUP: assorted typo fixes in the code and comments
This is the 40th iteration of typo fixes.
2024-03-18 19:54:33 +01:00
Christopher Faulet
885e40494c MINOR: spoe: Add SPOE filters in the exposed deprecated directives
It is the first deprecated directive exposed via the
'expose-deprecated-directives' global option. This way, it is possible to
silence the warning about SPOE usage.
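
As an illustration, silencing it could look like this (the engine name and
SPOE config path below are placeholders):

    global
        expose-deprecated-directives

    frontend fe
        bind :8080
        filter spoe engine my-engine config /etc/haproxy/spoe.conf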
2024-03-15 11:31:48 +01:00
Christopher Faulet
189f74d4ff MINOR: cfgparse: Add a global option to expose deprecated directives
Similarly to "expose-exprimental-directives" option, there is no a global
option to expose some deprecated directives. Idea is to have a way to silent
warnings about deprecated directives when there is no alternative solution.

Of course, deprecated directives covered by this option are not listed and
may change. It is only a best effort to let users upgrade smoothly.
2024-03-15 11:31:48 +01:00
Christopher Faulet
dff9807188 MAJOR: spoe: Deprecate the SPOE filter
As announced on the ML a few weeks (months?) ago and in several GH issues,
the SPOE is now deprecated. Sadly, this filter should be refactored to work
properly. It was implemented as a functional PoC for 1.7 and since then, no
time was invested to improve it and make it truly maintainable over time.
Worse, other parts of HAProxy evolved, especially the applets part, making
maintenance ever more expensive.

Instead of keeping the SPOE filter in this state and always replying to users
encountering issues or limitations that it is far from perfect but that we
cannot work on it for now, we decided to deprecate it.

We can still change our mind before the 3.0.0 release if the situation
evolves. Otherwise the filter will be removed or marked as unmaintained in
3.1. If the situation does not change, it means 3.0 will be the last
version with true SPOE support.
2024-03-15 11:29:39 +01:00
Christopher Faulet
6547b14292 BUG/MINOR: spoe: Be sure to be able to quickly close IDLE applets on soft-stop
On soft-stop, we try, as far as possible, to process all pending messages
before closing SPOE applets. However, in sync mode, when an applet waiting
for a response receives the ACK frame, it is switched to the IDLE state without
checking if it may be closed. In this case, we will wait for the idle timeout
before closing the applet, delaying the soft-stop.

To reduce this delay, on soft-stop, IDLE applets are woken up. On the next
wakeup, the applet will try to process pending messages or will be
closed.

This patch should be backported to all stable versions.
2024-03-15 09:09:22 +01:00
Christopher Faulet
3c066b1e34 BUG/MEDIUM: spoe: Don't rely on stream's expiration to detect processing timeout
On stream side, the SPOE filter relied on the stream's expiration date to be
woken up and be able to detect processing timeout. However, the stream
expiration date must not be updated this way. Mainly because it may be
overwritten at the end of process_stream(). In the worst case, it is set to
TICK_ETERNITY for any reason. In this case, it is impossible to detect that
the SPOE filter must time out and abort the processing.

The right way to do it is to set an analysis expiration date on the
corresponding channel, depending on the direction. This expiration date will
be used to compute the stream's expiration date at the end of
process_stream().

This patch may be related to issue #2478. It must be backported to all
stable versions.
2024-03-15 09:09:22 +01:00
Amaury Denoyelle
7dae3ceaa0 BUG/MAJOR: server: do not delete srv referenced by session
A server can only be deleted if there are no elements referencing it.
This is taken care of via srv_check_for_deletion(), most notably for active
and idle connections.

A special case occurs for connections directly managed by a session.
This is for so-called private connections, when using http-reuse never
or H2 + http-reuse safe for example. In this case, the server does not
account these connections into its idle lists. This caused a bug as the
server could be deleted despite the session still being able to access it.

To properly fix this, add a new referencing element into the server for
these session connections. An mt_list has been chosen for this. With the
default http-reuse setting, private connections are typically not used so it
won't make any difference. If using H2 servers, or more generally when
dealing with private connections, insert/delete should typically occur
only once per session lifetime so impact on performance should be
minimal.
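
A rough sketch of the idea, using hypothetical member names (the real
fields and structures differ):

    /* hypothetical: thread-safe list added to struct server */
    struct mt_list sess_conns;

    /* session side: attach the private-connection element to the server */
    MT_LIST_APPEND(&srv->sess_conns, &pconns->srv_el);

    /* srv_check_for_deletion(): refuse deletion while still referenced */
    if (!MT_LIST_ISEMPTY(&srv->sess_conns))
        return 0;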

This should be backported up to 2.4. Note that srv_check_for_deletion()
was introduced in 3.0 dev tree. On backport, the extra condition in it
should be placed in cli_parse_delete_server() instead.
2024-03-14 15:21:07 +01:00
Amaury Denoyelle
5ad801c058 MINOR: session: rename private conns elements
By default, backend connections are attached to a server instance. This
allows connection reuse to be implemented. However, in some particular cases,
connections cannot be shared across several clients. These connections
are considered private and are attached to the session instance
instead.

These private connections are also indexed by the target server to not
mix them. All of this is implemented via a dedicated structure
previously named struct sess_srv_list.

Rename it to better reflect its usage to struct sess_priv_conns. Also
rename its internal members and all of the associated functions.

This commit is only a renaming, thus no functional impact is expected.
2024-03-14 15:21:02 +01:00
Christopher Faulet
f31a4e302e BUG/MINOR: listener: Don't schedule frontend without task in listener_release()
A null pointer dereference was reported by Coverity in the listener_release()
function. Indeed, we must not try to schedule a frontend without a task when a
limit is still blocking the frontend. This issue was introduced by commit
65ae1347c7 ("BUG/MINOR: listener: Wake proxy's mngmt task up if necessary on
session release")

This patch should fix issue #2488. It must be backported to all stable
versions together with the commit above.
2024-03-14 09:34:36 +01:00
Christopher Faulet
65ae1347c7 BUG/MINOR: listener: Wake proxy's mngmt task up if necessary on session release
When a session is released, listener_release() function is called to notify
the listener. It is an opportunity to resume limited/full listeners. We
first try to resume the listener owning the released session, then all
limited listeners in the global queue and finally all limited listeners in
the frontend's waiting queue. This last step is only performed if there is
no limit applied on the frontend. Nothing is performed if the session rate is
still limited. And it is an issue because if this happens for the last
listener's session, there is no other event to wake the frontend's management
task up and the listener remains in the limited state.

To fix the issue, when a limit is still applied on the frontend, we must
compute the new wake-up date from the session rate and schedule the
frontend's management task.

It is easy to reproduce the issue in SSL by setting a maxconn and a rate
limit on sessions.
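
For example, a minimal reproduction could look like this (certificate path
is only illustrative):

    frontend fe
        mode http
        bind :8443 ssl crt /etc/haproxy/site.pem
        maxconn 100
        rate-limit sessions 10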

This patch should fix the issue #2476. It must be backported to all stable
versions.
2024-03-13 15:20:06 +01:00
William Lallemand
9c2e900a9b CI: github: add -dI to haproxy arguments
-dI is useful when running with ASAN; it allows forking addr2line.
2024-03-13 11:23:14 +01:00
William Lallemand
70be894e41 MINOR: debug: enable insecure fork on the command line
-dI allows enabling "insecure-fork-wanted" directly from the command line,
which is useful when you want to run ASAN with addr2line on a lot of
configuration files without editing them.
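
For example (the config path is just illustrative):

    $ ./haproxy -dI -f haproxy.cfg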
2024-03-13 11:23:14 +01:00
Aurelien DARRAGON
07b2e84bce BUG/MEDIUM: hlua: streams don't support mixing lua-load with lua-load-per-thread (2nd try)
While trying to reproduce another crash case involving lua filters
reported by @bgrooot on GH #2467, we found out that mixing filters loaded
from different contexts ('lua-load' vs 'lua-load-per-thread') for the same
stream isn't supported and may even cause the process to crash.

Historically, mixing lua-load and lua-load-per-threads for a stream wasn't
supported, but this changed thanks to 0913386 ("BUG/MEDIUM: hlua: streams
don't support mixing lua-load with lua-load-per-thread").

However, the above fix didn't consider the lua filters' use-case properly:
unlike lua fetches, actions or even services, lua filters don't simply
use the stream hlua context as a "temporary" hlua running context to
process some hlua code. For fetches, actions and so on, hlua executions are
processed sequentially, so we simply reuse the hlua context from the
previous action/fetch to run the next one (this bypasses memory
allocations and initialization, thus increasing performance), unless
we need to run on a different hlua state-id, in which case we perform a
reset of the hlua context.

But this cannot work with filters: indeed, once registered, a filter will
last for the whole stream duration. It means that the filter will rely
on the stream hlua context from ->attach() to ->detach(). And here is the
catch: if for the same stream we register 2 lua filters from different
contexts ('lua-load' + 'lua-load-per-thread'), then we have an issue,
because the hlua stream will be re-created each time we switch between
runtime contexts, which means each time we switch between the filters (may
happen for each stream processing step), and since lua filters rely on the
stream hlua to carry context between filtering steps, this context will be
lost upon a switch. Given that lua filters code was not designed with that
in mind, it would confuse the code and cause unexpected behaviors ranging
from lua errors to a crashing process.

So here we take another approach: instead of re-creating the stream hlua
context each time we switch between "global" and "per-thread" runtime
context, let's have both of them inside the stream directly as initially
suggested by Christopher back when we talked about the original issue.

For this we leverage hlua_stream_ctx_prepare() and hlua_stream_ctx_get()
helper functions which return the proper hlua context for a given stream
and state_id combination.

As for the debugging info reported after ha_panic(), we now look at both hlua
runtime contexts to check if one of them was active when the panic occurred
(only 1 runtime ctx per stream may be active at a given time).

This should be backported to all stable versions with 0913386
("BUG/MEDIUM: hlua: streams don't support mixing lua-load with lua-load-per-thread")

This commit depends on:
 - "DEBUG: lua: precisely identify if stream is stuck inside lua or not"
   [for versions < 2.9 the ha_thread_dump_one() part should be skipped]
 - "MINOR: hlua: use accessors for stream hlua ctx"

For 2.4, the filters API didn't exist. However it may be a good idea to
backport it anyway because ->set_priv()/->get_priv() from tcp/http lua
applets may also be affected by this bug, plus it will ease code
maintenance. Of course, filters-related parts should be skipped in this
case.
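
For reference, the kind of configuration that triggers this mixing looks
roughly like this (script paths and filter names are placeholders):

    global
        lua-load            /etc/haproxy/filter_a.lua
        lua-load-per-thread /etc/haproxy/filter_b.lua

    frontend fe
        bind :8080
        filter lua filter_a
        filter lua filter_b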
2024-03-13 09:24:46 +01:00
Aurelien DARRAGON
aa554be69c MINOR: hlua: use accessors for stream hlua ctx
Change hlua_stream_ctx_prepare() prototype so that it now returns the
proper hlua ctx on success instead of returning a boolean.

Add hlua_stream_ctx_get() to retrieve hlua ctx out of a given stream.

This way we may easily change the storage mechanism for hlua stream in
the future without extensive code changes.
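
The resulting API looks roughly like this (exact signatures are an
assumption based on the description above):

    /* return the hlua ctx for <s>/<state_id>, allocating it if needed,
     * or NULL on allocation failure */
    struct hlua *hlua_stream_ctx_prepare(struct stream *s, int state_id);

    /* return the hlua ctx previously prepared for <s>/<state_id>,
     * or NULL if none */
    struct hlua *hlua_stream_ctx_get(struct stream *s, int state_id);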

No backport needed unless a commit depends on it.
2024-03-13 09:24:46 +01:00
Aurelien DARRAGON
1a2cdf64c9 DEBUG: lua: precisely identify if stream is stuck inside lua or not
When ha_panic() is called by the watchdog, we try to guess from
ha_task_dump() and ha_thread_dump_one() if the thread was stuck while
executing lua from the stream context. However we consider this is the
case by simply checking if the stream hlua context was set, but this is
not very precise because if the hlua context is set, then it simply means
that at least one lua instruction was executed at the stream level, not
that the stream was currently executing lua when the panic occurred.

This is especially true with filters, one could simply register a lua
filter that does nothing but this will still end up initializing the
stream hlua context for each stream. If the thread ends up being stuck
during the stream handling, then debug dumping functions will report
that the stream was stuck while handling lua, which is not necessarily
true, and could in fact confuse us even more.

So here we take another approach: we add the BUSY flag to the hlua context.
This flag is set by hlua_ctx_resume() around the lua_resume() call; this way
we can precisely tell if the thread was handling lua when it was
interrupted, and we rely on this flag in debug functions to check if the
thread was effectively stuck inside lua or not while processing the stream.

No backport needed unless a commit depends on it.
2024-03-13 09:24:46 +01:00
Aurelien DARRAGON
85d81e4d0a BUG/MINOR: hlua: fix missing lock in hlua_filter_delete()
hlua_filter_delete() calls hlua_unref() on the stream hlua stack, but
we should own the lock prior to manipulating the stack.

This should be backported up to 2.6.
2024-03-13 09:24:46 +01:00
Aurelien DARRAGON
ecd8f3bfd7 BUG/MINOR: hlua: missing lock in hlua_filter_new()
This is a complementary patch to 8670db7 ("BUG/MAJOR: hlua: improper lock
usage with hlua_ctx_resume()") for hlua_filter_new().

Indeed, the HLUA_E_ERRMSG case still relies on the lua stack but didn't
take the lock to do so.

This should be backported up to 2.6.
2024-03-13 09:24:46 +01:00
Aurelien DARRAGON
4aefffc38c BUG/MINOR: hlua: segfault when loading the same filter from different contexts
Trying to register the same lua filter from global and per-thread context
(using 'lua-load' + 'lua-load-per-thread') causes a segmentation fault in
hlua_post_init().

This is due to a simple copy-paste error as we try to print the function
name in the error message (like we do when loading the same lua function
from different contexts) instead of the filter name.

This should be backported up to 2.6.
2024-03-13 09:24:46 +01:00
William Lallemand
bb25ee7b26 CI: github: add -DDEBUG_LIST to the default builds
Add the -DDEBUG_LIST flag, which allows checking whether a list element was
removed twice.
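
For local builds, the same check can presumably be enabled through the
Makefile's DEBUG variable (the target below is only an example):

    $ make -j$(nproc) TARGET=linux-glibc DEBUG="-DDEBUG_LIST"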
2024-03-13 09:01:11 +01:00
William Lallemand
bbc215d3bd CLEANUP: ssl: remove useless #ifdef in openssl-compat.h
Remove a useless #ifdef in openssl-compat.h
2024-03-13 08:51:04 +01:00
William Lallemand
501d9fdb86 MEDIUM: ssl: allow to change the OpenSSL security level from global section
The new "ssl-security-level" option allows one to change the OpenSSL
security level without having to change the openssl.cnf global file of
your distribution. This directive applies to every SSL_CTX context.

People sometimes change their security level directly in the ciphers
directive, however there are some cases when the security level change
is not applied in the right order (for example when applying a DH
param).

Before this patch, it was possible to work around this by using a specific
openssl.cnf file and starting haproxy this way:

    OPENSSL_CONF=./openssl.cnf ./haproxy -f bug-2468.cfg

Values for the security level can be found here:

https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_set_security_level.html

This was discussed in github issue #2468.
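
With this patch, the same can now be set directly in the configuration,
for example:

    global
        ssl-security-level 2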
2024-03-12 17:37:11 +01:00
William Lallemand
7e9e4a8f50 MEDIUM: ssl: initialize the SSL stack explicitly
In issue #2448, users are complaining that FIPS is not working correctly
since the removal of SSL_library_init().

This was removed because SSL_library_init() is deprecated with OpenSSL
3.x and emits a warning. But the initialization was not needed anymore
because it is done at the first openssl API call.

However, in some cases it is needed. SSL_library_init() is now a define
for OPENSSL_init_ssl(0, NULL). This patch adds OPENSSL_init_ssl(0, NULL)
to the init.

This could be backported to every stable branch, however let's wait
before backporting it.
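
The added initialization essentially boils down to something like this
(simplified sketch, not the exact patch):

    #include <openssl/ssl.h>

    /* explicit SSL stack initialization; OPENSSL_init_ssl(0, NULL) is what
     * SSL_library_init() expands to on OpenSSL >= 1.1.0 */
    static int init_ssl_stack(void)
    {
        /* returns 1 on success, 0 on error */
        return OPENSSL_init_ssl(0, NULL);
    }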
2024-03-12 12:03:07 +01:00
Willy Tarreau
7223296092 BUG/MINOR: server: fix first server template not being indexed
3.0-dev1 introduced a small regression with commit b4db3be86e ("BUG/MINOR:
server: fix server_find_by_name() usage during parsing"). By changing the
way servers are indexed and moving it into the server template loop, the
first one is no longer indexed because the loop starts at low+1 since it
focuses on duplication. Let's index the first one explicitly now.

This should not be backported, unless the commit above is backported.
2024-03-12 08:23:03 +01:00
Dragan Dosen
0091692d97 BUG/MINOR: ssl: do not set the aead_tag flags in sample_conv_aes_gcm()
This was not useful and was using an uninitialized value. Introduced with
the commit 08ac28237 ("MINOR: Add aes_gcm_enc converter").

Must be backported wherever the commit 08ac28237 was backported.
2024-03-11 19:20:44 +01:00
Dragan Dosen
d7610e6dde BUG/MINOR: ssl: fix possible ctx memory leak in sample_conv_aes_gcm()
The issue was introduced with the commit c31499d74 ("MINOR: ssl: Add
aes_gcm_dec converter").

This must be backported to all stable branches where the above converter
is present, but it may need to be adjusted for older branches because of
code refactoring.
2024-03-11 19:20:31 +01:00
Brooks Davis
c03a023882 MINOR: tools: use public interface for FreeBSD get_exec_path()
Where possible (FreeBSD 13+), use the public, documented interface to
the ELF auxiliary argument vector: elf_aux_info().

__elf_aux_vector is a private interface, exported so that the runtime
linker can set its value during process startup, and is not intended for
public consumption. In FreeBSD 15 it has been removed from libc and
moved to libsys.
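
A simplified sketch of the new approach (assuming AT_EXECPATH is visible
through these headers on FreeBSD 13+):

    #include <sys/auxv.h>
    #include <limits.h>

    /* return the running executable's path, or NULL on failure */
    static const char *get_exec_path_freebsd(void)
    {
        static char path[PATH_MAX];

        if (elf_aux_info(AT_EXECPATH, path, sizeof(path)) != 0)
            return NULL;
        return path;
    }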
2024-03-11 19:00:37 +01:00
William Lallemand
3262c2ddcd DOC: configuration: clarify ciphersuites usage (V2)
The previous attempt removed the TLSv1.3 mention for the "ciphersuites"
keyword. However, it looks like the TLSv1.2 support in
SSL_CTX_set_ciphersuites() is a bug and can have inconsistent behavior.

This patch reverts the previous attempt and adds an explanation of this
problem as well as clear examples of how to configure TLSv1.2 ciphers +
TLSv1.3 ciphersuites.

Revert "DOC: configuration: clarify ciphersuites usage"
This reverts commit e2a44d6c94.

This must be backported to all stable branches.

Fixes issue #2459.
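
In short, the documented approach is to keep the two lists separate, along
these lines (the cipher lists shown are only examples):

    global
        # TLSv1.2 and below
        ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256
        # TLSv1.3 only
        ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384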
2024-03-11 17:58:16 +01:00
Amaury Denoyelle
c499d66f37 MINOR: quic: remove qc_treat_rx_crypto_frms()
This commit removes qc_treat_rx_crypto_frms(). This function was used in
a single place inside qc_ssl_provide_all_quic_data(). Besides, its
naming was confusing as conceptually it is directly linked to the quic_ssl
module instead of quic_rx.

Thus, the body of qc_treat_rx_crypto_frms() is inlined directly inside
qc_ssl_provide_all_quic_data(). Also, qc_ssl_provide_quic_data() is now
only used inside quic_ssl, so its scope is set to static. Overall, the API
for CRYPTO frame handling is now cleaner.
2024-03-11 14:27:51 +01:00
Amaury Denoyelle
b068e758fb MINOR: quic: simplify rescheduling for handshake
On CRYPTO frame reception, the tasklet is rescheduled with TASK_HEAVY to
limit CPU consumption. This commit slightly simplifies this by regrouping
the TASK_HEAVY setting and tasklet_wakeup() instructions in a single
location in qc_handle_crypto_frm(). All other unnecessary
tasklet_wakeup() calls are removed.
2024-03-11 14:15:36 +01:00
Willy Tarreau
6770259083 MEDIUM: mux-h2: allow to set the glitches threshold to kill a connection
Till now it was still needed to write rules to eliminate badly behaving
H2 clients, while most of the time it would be desirable to just be able
to set a threshold on the level of anomalies on a connection.

This is what this patch does. By setting a glitches threshold for frontend
and backend, it allows a connection to be automatically turned to the error
state when the threshold is reached, so that the connection dies by itself
without having to write possibly complex rules.

One subtlety is that we still have the error state being exclusive to the
parser's state so this requires the h2c_report_glitches() function to return
a status indicating if the threshold was reached or not so that processing
can instantly stop and bypass the state update, otherwise the state could
be turned back to a valid one (e.g. after parsing CONTINUATION); we should
really contemplate the possibility of using H2_CF_ERROR for this. Fortunately
there were very few places where a glitch was reported outside of an error
path so the changes are quite minor.

Now by setting the front value to 1000, a client flooding with short
CONTINUATION frames is instantly stopped.
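
The global tunables could then look like this (the directive names below
are assumptions based on this description, the commit does not spell them
out; the frontend value matches the example above):

    global
        tune.h2.fe.glitches-threshold 1000
        tune.h2.be.glitches-threshold 1000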
2024-03-11 08:25:08 +01:00
Willy Tarreau
e6e7e1587e MINOR: mux-h2: always use h2c_report_glitch()
The function aims at centralizing counter measures but, due to the fact
that it only increments the counter by one unit, sometimes it was not
used and the value was calculated directly. Let's pass the increment as an
argument so that it can be used everywhere.
2024-03-11 07:36:56 +01:00
Willy Tarreau
db1a7513b7 [RELEASE] Released version 3.0-dev5
Released version 3.0-dev5 with the following main changes :
    - BUG/MEDIUM: applet: Fix HTX .rcv_buf callback function to release outbuf buffer
    - BUG/MAJOR: ssl/ocsp: crash with ocsp when old process exit or using ocsp CLI
    - BUG/MEDIUM: server: fix dynamic servers initial settings
    - BUG/MINOR: ssl/cli: duplicate cleaning code in cli_parse_del_crtlist
    - LICENSE: event_hdl: fix GPL license version
    - LICENSE: http_ext: fix GPL license version
    - BUG/MEDIUM: mux-h1: Fix again 0-copy forwarding of chunks with an unknown size
    - BUG/MINOR: mux-h1: Properly report when mux is blocked during a nego
    - MINOR: mux-h1: Move checks performed before a shutdown in a dedicated function
    - MINOR: mux-h1: Move all stuff to detach a stream in an internal function
    - MAJOR: mux-h1: Drain requests on client side before shut a stream down
    - MEDIUM: htx/http-ana: No longer close connection on early HAProxy response
    - MINOR: quic: filter show quic by address
    - MINOR: quic: specify show quic output fields
    - MINOR: quic: add MUX output for show quic
    - CLEANUP: mux-h2: Fix h2s_make_data() comment about the return value
    - DOC: configuration: clarify ciphersuites usage
    - BUG/MINOR: config/quic: Alert about PROXY protocol use on a QUIC listener
    - BUG/MINOR: hlua: Fix log level to the right value when set via TXN:set_loglevel
    - MINOR: hlua: Be able to disable logging from lua
    - BUG/MINOR: tools: seed the statistical PRNG slightly better
    - BUG/MINOR: hlua: fix unsafe lua_tostring() usage with empty stack
    - BUG/MINOR: hlua: don't use lua_tostring() from unprotected contexts
    - BUG/MINOR: hlua: fix possible crash in hlua_filter_new() under load
    - BUG/MINOR: hlua: improper lock usage in hlua_filter_callback()
    - BUG/MINOR: hlua: improper lock usage in hlua_filter_new()
    - BUG/MEDIUM: hlua: improper lock usage with SET_SAFE_LJMP()
    - BUG/MAJOR: hlua: improper lock usage with hlua_ctx_resume()
    - BUG/MINOR: hlua: don't call ha_alert() in hlua_event_subscribe()
    - MINOR: hlua: use SEND_ERR to report errors in hlua_event_runner()
    - CLEANUP: hlua: txn class functions may LJMP
    - BUG/MINOR: sink: fix a race condition in the TCP log forwarding code
    - BUILD: thread: move lock label definitions to thread-t.h
    - BUILD: tree-wide: fix a few missing includes in a few files
    - BUILD: buf: make b_ncat() take a const for the source
    - CLEANUP: assorted typo fixes in the code and comments
    - CLEANUP: fix typo in naming for variable "unused"
    - CI: run more smoke tests on config syntax to check memory related issues
    - CI: enable monthly build only test on netbsd-9.3
    - CI: skip scheduled builds on forks
    - BUG/MINOR: ssl/cli: typo in new ssl crl-file CLI description
    - BUG/MEDIUM: quic: fix connection freeze on post handshake
    - BUG/MINOR: mux-quic: fix crash on aborting uni remote stream
    - CLEANUP: log: fix obsolete comment for add_sample_to_logformat_list()
    - CLEANUP: tree-wide: use proper ERR_* return values for PRE_CHECK fcts
    - BUG/MINOR: cfgparse: report proper location for log-format-sd errors
    - MINOR: vars: export var_set and var_unset functions
    - MINOR: Add aes_gcm_enc converter
    - BUG/MEDIUM: quic: fix handshake freeze under high traffic
    - MINOR: quic: always use ncbuf for rx CRYPTO
    - BUILD: ssl: define EVP_CTRL_AEAD_GET_TAG for older versions
    - DOC: design: write first notes about ring-v2
    - OPTIM: sink: try to merge "dropped" messages faster
    - OPTIM: sink: drop the sink lock used to count drops
    - DEV: haring: make haring not depend on the struct ring itself
    - DEV: haring: split the code between ring and buffer
    - DEV: haring: automatically use the advertised ring header size
    - BUILD: solaris: fix compilation errors
2024-03-09 16:50:15 +01:00
matthias sweertvaegher
062ea3a3d4 BUILD: solaris: fix compilation errors
Compilation on Solaris fails because of the usage of names reserved on that
platform, i.e. 'queue' and 's_addr'.

This patch redefines 'queue' as '_queue' and renames 's_addr' to
'srv_addr' which fixes compilation for now.

Future plan: rename 'queue' in the code base so the define can be removed again.

Backporting: 2.9, 2.8
2024-03-09 11:24:54 +01:00
Willy Tarreau
88e141b823 DEV: haring: automatically use the advertised ring header size
Instead of emitting a warning, since we don't need the ring struct
anymore, we can just read what we need, parse the buffer and use the
advertised offset. Thus for now -f is simply ignored.
2024-03-09 11:23:52 +01:00
Willy Tarreau
77d7c35243 DEV: haring: split the code between ring and buffer
By splitting the initialization and the parsing of the ring, we'll ease
the support for multiple ring sizes and get rid of the annoyances of the
optional lock.
2024-03-09 11:23:52 +01:00
Willy Tarreau
4dddbb63a0 DEV: haring: make haring not depend on the struct ring itself
haring needs to be self-sufficient about the ring format so that it continues
to build when the ring API changes. Let's import the struct ring definition
and call it "ring_v1".
2024-03-09 11:23:52 +01:00
Willy Tarreau
758cb450a2 OPTIM: sink: drop the sink lock used to count drops
The sink lock was made to prevent event producers from passing while
there were other threads trying to print a "dropped" message, in order
to guarantee the absence of reordering. It has a serious impact however,
which is that all threads need to take the read lock when producing a
regular trace even when there's no reader.

This patch takes a different approach. The drop counter is shifted left
by one so that the lowest bit is used to indicate that one thread is
already taking care of trying to dump the counter. Threads only read
this value normally, and will only try to change it if it's non-null,
in which case they'll first check if they are the first ones trying to
dump it, otherwise will simply count another drop and leave. This has
a large benefit. First, it will avoid the locking that causes stalls
as soon as a slow reader is present. Second, it avoids any write on the
fast path as long as there's no drop. And it remains very lightweight
since we just need to add +2 or subtract 2*dropped in operations, while
offering the guarantee that the sink_write() has succeeded before
unlocking the counter.

While a reader was previously limiting the traffic to 11k RPS under
4C/8T, now we reach 36k RPS vs 14k with no reader, so readers will no
longer slow the traffic down and will instead even speed it up due to
avoiding the contention down the chain in the ring. The locking cost
dropped from ~75% to ~60% now (it's in ring_write now).
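
A standalone C sketch of the counting scheme described above (plain C11
atomics rather than HAProxy's own macros, and heavily simplified):

    #include <stdatomic.h>

    static _Atomic unsigned long dropped; /* bit 0 = "a dumper is active" */

    /* a producer that had to drop an event */
    static void count_drop(void)
    {
        atomic_fetch_add(&dropped, 2); /* counter is shifted left by one */
    }

    /* normal path: try to become the single thread dumping the counter */
    static void maybe_dump_drops(void)
    {
        unsigned long curr = atomic_load(&dropped);

        while (curr >= 2 && !(curr & 1)) {
            /* try to grab the "dumping" bit */
            if (atomic_compare_exchange_weak(&dropped, &curr, curr | 1)) {
                unsigned long count = curr >> 1;

                /* emit the "<count> events dropped" message here, then
                 * release the bit and subtract what was reported; drops
                 * added meanwhile remain accounted for */
                atomic_fetch_sub(&dropped, (count << 1) | 1);
                break;
            }
        }
    }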
2024-03-09 11:23:52 +01:00
Willy Tarreau
eb7b2ec83a OPTIM: sink: try to merge "dropped" messages faster
When a reader doesn't read fast enough and causes drops, subsequent
threads try to produce a "dropped" message. But it takes time to
produce and emit this message, in part due to the use of chunk_printf()
that relies on vfprintf() which has to parse the printf format, and
during this time other threads may continue to increment the counter.
This is the reason why this is currently performed in a loop. When
reading what is received, it's common to see a large count followed
by one or two single-digit counts, indicating that we could possibly
have improved that by writing faster.

Let's improve the situation a little bit. First we're now using a
static message prefixed with enough space to write the digits, and a
call to ultoa_r() fills these digits from right to left so that we
don't have to process a format string nor perform a copy of the message.

Second, we now re-check the counter immediately after having prepared
the message so that we still get an opportunity for updating it. In
order to avoid too long loops, this is limited to 10 iterations.

Tests show that the number of single-digit "dropped" counters on output
now dropped roughly by 15-30%. Also, it was observed that with 8 threads,
there's almost never more than one retry.
2024-03-09 11:23:52 +01:00
Willy Tarreau
571232535a DOC: design: write first notes about ring-v2
This explains the observed limitations of the current ring applied to
traces and proposes a multi-step, more scalable, improvement.
2024-03-09 11:23:52 +01:00
Willy Tarreau
26cd248feb BUILD: ssl: define EVP_CTRL_AEAD_GET_TAG for older versions
Amaury reported that previous commit 08ac282375 ("MINOR: Add aes_gcm_enc
converter") broke the CI on OpenSSL 1.0.2 due to the define above not
existing there. Let's just map it to its older name when not existing.
For reference, these were renamed when switching to 1.1.0:

    https://marc.info/?l=openssl-cvs&m=142244190907706&w=2

No backport is needed.
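
The compatibility mapping is most likely along these lines (sketch;
EVP_CTRL_GCM_GET_TAG being the pre-1.1.0 name):

    #ifndef EVP_CTRL_AEAD_GET_TAG
    #define EVP_CTRL_AEAD_GET_TAG EVP_CTRL_GCM_GET_TAG
    #endif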
2024-03-08 18:23:34 +01:00
Amaury Denoyelle
1ee7bf5bd9 MINOR: quic: always use ncbuf for rx CRYPTO
The previous patch fixed the handling of in-order CRYPTO frames, which
requires the usage of a new buffer for these data as their handling is
delayed to run under TASK_HEAVY.

In fact, as all CRYPTO frame handling must now be delayed, it can be
unified. This is the purpose of this commit, which removes
the just introduced new buffer. Now, all CRYPTO frames are buffered
inside the ncbuf. Unused elements such as crypto_frms member for
encryption level are also removed.

This commit is not a bugfix but is a direct follow-up to the last one.
As such, it can probably be backported with it to 2.9 to reduce code
differences between these versions.
2024-03-08 17:22:48 +01:00
Amaury Denoyelle
81f118cec0 BUG/MEDIUM: quic: fix handshake freeze under high traffic
QUIC relies on SSL_do_handshake() to be able to validate the handshake. As
this function is computation heavy, it is since 2.9 called only under
TASK_HEAVY. This has been implemented by the following patch :
  94d20be138
  MEDIUM: quic: Heavy task mode during handshake

Instead of handling CRYPTO frames immediately during reception, this
patch delays the process to run under TASK_HEAVY tasklet. A frame copy
is stored in the qel.rx.crypto_frms list. However, this frame still
references the receive buffer. If the receive buffer is cleared before
the tasklet is rescheduled, it will point to garbage data, resulting in
haproxy decryption error. This happens if a fair amount of data is
received constantly to preempt the quic_conn tasklet execution.

This bug can be reproduced with a fair amount of clients. It is
exhibited by 'show quic full' which can report connections blocked on
handshake. Using the following command results in h2load not being able to
complete the last connections.

$ h2load --alpn-list h3 -t 8 -c 800 -m 10 -w 10 -n 8000 "https://127.0.0.1:20443/?s=10k"

Also, haproxy QUIC listener socket mode was active to trigger the issue.
This forces several connections to share the same reception buffer,
making the bug more likely to occur. It should be possible to
reproduce it with connection socket mode by increasing the number of clients.

To fix this bug, define a new buffer under quic_cstream. It is used
exclusively to copy CRYPTO data for in-order frame if ncbuf is empty.
This ensures data remains accessible even if receive buffer is cleared.

Note that this fix is only a temporary step. Indeed, a ncbuf is also
already used for out-of-order data. It should be possible to unify its
usage for both in and out-of-order data, rendering this new buffer
instance unnecessary. In this case, several unneeded elements will
become obsolete such as qel.rx.crypto_frms list. This will be done in a
future refactoring patch.

This must be backported up to 2.9.
2024-03-08 17:22:48 +01:00
Nenad Merdanovic
08ac282375 MINOR: Add aes_gcm_enc converter
The converter can be used to encrypt the raw byte input using the
AES-GCM algorithm, with the provided nonce and key.
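
Assuming it mirrors the existing aes_gcm_dec converter's argument layout
(bits, nonce, key, AEAD tag variable), usage could look like this (all
names and values below are purely illustrative):

    http-request set-var(txn.nonce) str(aGVsbG9oZWxsbzEy)
    http-request set-var(txn.key)   str(Zm9vb2Zvb29mb29vZm9vbw==)
    http-request set-var(txn.enc)   req.hdr(X-Plain),aes_gcm_enc(128,txn.nonce,txn.key,txn.tag)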

Co-authored-by: Dragan Dosen (ddosen@haproxy.com)
2024-03-08 17:20:43 +01:00
Nenad Merdanovic
e225e04ba7 MINOR: vars: export var_set and var_unset functions
Co-authored-by: Dragan Dosen <ddosen@haproxy.com>
2024-03-08 17:20:43 +01:00
Aurelien DARRAGON
cf37e4cc1b BUG/MINOR: cfgparse: report proper location for log-format-sd errors
When a parsing error occurs inside a log-format-sd expression, we report
the location of the log-format directive (which may not be set) instead
of reporting the proper log-format-sd directive location where the parsing
error occurred.

 1|listen test
 2|  log-format "%B"      # no error
 3|  log-format-sd "%bad" # error

 | [ALERT]    (322261) : config : Parsing [empty.conf:2]: failed to parse log-format-sd : no such format variable 'bad'. If you wanted to emit the '%' character verbatim, you need to use '%%'.

The fix consists in using the config hints dedicated to log-format-sd
directive instead of the log-format one.

The bug was introduced in 8a4e4420 ("MEDIUM: log-format: Use standard
HAProxy log system to report errors").

This should be backported to every stable versions.
2024-03-07 11:48:17 +01:00
Aurelien DARRAGON
59f08f65fd CLEANUP: tree-wide: use proper ERR_* return values for PRE_CHECK fcts
The httpclient_precheck(), ssl_ocsp_update_precheck(), and
resolvers_create_default() functions are registered through the
REGISTER_PRE_CHECK() macro to be called by haproxy during init from the
pre_check_list list. When calling functions registered in pre_check_list,
haproxy expects ERR_* return values. However those 3 functions currently
use raw return values, so we'd better use explicit ERR_* macros to prevent
breakage in the future if the ERR_* values mapping were to change.
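
In practice the change amounts to something like this (simplified sketch):

    /* pre-check callbacks are expected to return ERR_* values */
    static int httpclient_precheck(void)
    {
        /* ... checks and initialization ... */
        return ERR_NONE; /* rather than an ad-hoc numeric value */
    }
    REGISTER_PRE_CHECK(httpclient_precheck);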
2024-03-07 11:48:08 +01:00
Aurelien DARRAGON
2df7e077c7 CLEANUP: log: fix obsolete comment for add_sample_to_logformat_list()
Since 833cc794 ("MEDIUM: sample: handle comma-delimited converter list")
logformat expressions now support having a comma-delimited converter list
right after the fetch. Let's remove a leftover comment from the initial
implementation that says otherwise.
2024-03-07 11:47:56 +01:00
Amaury Denoyelle
b0dd4810e7 BUG/MINOR: mux-quic: fix crash on aborting uni remote stream
A remote unidirectional stream can be aborted prematurely if application
layers cannot identify its type. In this case, a STOP_SENDING frame is
emitted.

Since QUIC MUX refactoring, a crash would occur in this scenario due to
2 specific characteristics of remote uni streams:
* qcs.tx.fctl was not initialized completely. This caused a crash due to
  a BUG_ON() statement inside qcs_destroy().
* qcs.stream is never allocated. This caused qcs_prep_bytes() to crash
  inside qcc_io_send().

This bug is considered minor as it happens only on very specific QUIC
clients. It was detected when using s2n-quic over interop.

This does not need to be backported.
2024-03-06 10:41:01 +01:00
Amaury Denoyelle
d8f1ff8648 BUG/MEDIUM: quic: fix connection freeze on post handshake
After handshake completion, QUIC server is responsible to emit
HANDSHAKE_DONE frame. Some clients wait for it to begin STREAM
transfers.

Previously, there was no explicit tasklet_wakeup() after handshake
completion, which is necessary to emit post-handshake frames. In most
cases, this was undetected as most client continue emission which will
reschedule the tasklet. However, as there is no tasklet_wakeup(), this
is not a consistent behavior. If this bug occurs, it causes a connection
freeze, preventing the client to emit any request. The connection is
finally closed on idle timeout.

To fix this, add an explicit tasklet_wakeup() after handshake
completion. It sounds simple enough but in fact it's difficult to find
the correct location for the tasklet_wakeup() invocation, as post-handshake
is directly linked to connection accept, with different orderings.
Notably, if 0-RTT is used, the connection can be accepted prior to handshake
completion. Another major point is that along with the HANDSHAKE_DONE frame, a
series of NEW_CONNECTION_ID frames are emitted. However, these new CIDs
allocation must occur after connection is migrated to its new thread as
these CIDs are tied to it. A BUG_ON() is present to check this in
qc_set_tid_affinity().

With all this in mind, 2 locations were selected for the necessary
tasklet_wakeup():
* on qc_xprt_start(): this is useful for the standard case without 0-RTT.
  This ensures that this is done only after connection thread migration.
* on qc_ssl_provide_all_quic_data(): this is done on handshake
  completion with 0-RTT used. In this case only, the connection is already
  accepted and migrated, so tasklet_wakeup() is safe.

Note that as a side-change, quic_accept_push_qc() API has evolved to
better reflect differences between standard and 0-RTT usages. It is now
forbidden to call it multiple times on a single quic_conn instance. A
BUG_ON() has been added.

This issue is labelled as medium even though it seems pretty rare. It
was only reproducible using QUIC interop runner, with haproxy compiled
with LibreSSL and quic-go as client. However, the affected code parts are
pretty sensitive, which justifies the chosen severity.

This should fix github issue #2418.

It should be backported up to 2.6, after a brief period of observation.
Note that the extra comment added in qc_set_tid_affinity() can be
removed in 2.6 as thread migration is not implemented for this version.
Other parts should apply without conflict.
2024-03-06 10:39:57 +01:00