Commit Graph

24596 Commits

Author SHA1 Message Date
Amaury Denoyelle
f3b9676416 MINOR: quic: display stream age
Add a field to save the creation date of qc_stream_desc instance. This
is useful to display QUIC stream age in "show quic stream" output.
2025-05-13 15:44:22 +02:00
Amaury Denoyelle
dbf07c754e MINOR: quic: display QCS info on "show quic stream"
Complete stream output for "show quic" by displaying information from
its upper QCS. Note that QCS may be NULL if already released, so a
default output is also provided.
2025-05-13 15:43:28 +02:00
Amaury Denoyelle
cbadfa0163 MINOR: quic: add stream format for "show quic"
Add a new format for "show quic" command labelled as "stream". This is
an equivalent of "show sess", dedicated to the QUIC stack. Each active
QUIC streams are listed on a line with their related infos.

The main objective of this command is to ensure there is no freeze
streams remaining after a transfer.
2025-05-13 15:41:51 +02:00
Amaury Denoyelle
1ccede211c MINOR: mux-quic: account Rx data per stream
Add counters to measure Rx buffers usage per QCS. This reused the newly
defined bdata_ctr type already used for Tx accounting.

Note that for now, <tot> value of bdata_ctr is not used. This is because
it is not easy to account for data accross contiguous buffers.

These values are displayed both on log/traces and "show quic" output.
2025-05-13 15:41:51 +02:00
Amaury Denoyelle
a1dc9070e7 MINOR: quic: account Tx data per stream
Add accounting at qc_stream_desc level to be able to report the number
of allocated Tx buffers and the sum of their data. This represents data
ready for emission or already emitted and waiting on ACK.

To simplify this accounting, a new counter type bdata_ctr is defined in
quic_utils.h. This regroups both buffers and data counter, plus a
maximum on the buffer value.

These values are now displayed on QCS info used both on logline and
traces, and also on "show quic" output.
2025-05-13 15:41:41 +02:00
Willy Tarreau
9a05c1f574 BUG/MEDIUM: h2/h3: reject some forbidden chars in :authority before reassembly
As discussed here:
   https://github.com/httpwg/http2-spec/pull/936
   https://github.com/haproxy/haproxy/issues/2941

It's important to take care of some special characters in the :authority
pseudo header before reassembling a complete URI, because after assembly
it's too late (e.g. the '/'). This patch does this, both for h2 and h3.

The impact on H2 was measured in the worst case at 0.3% of the request
rate, while the impact on H3 is around 1%, but H3 was about 1% faster
than H2 before and is now on par.

It may be backported after a period of observation, and in this case it
relies on this previous commit:

   MINOR: http: add a function to validate characters of :authority

Thanks to @DemiMarie for reviving this topic in issue #2941 and bringing
new potential interesting cases.
2025-05-12 18:02:47 +02:00
Willy Tarreau
ebab479cdf MINOR: http: add a function to validate characters of :authority
As discussed here:
  https://github.com/httpwg/http2-spec/pull/936
  https://github.com/haproxy/haproxy/issues/2941

It's important to take care of some special characters in the :authority
pseudo header before reassembling a complete URI, because after assembly
it's too late (e.g. the '/').

This patch adds a specific function which was checks all such characters
and their ranges on an ist, and benefits from modern compilers
optimizations that arrange the comparisons into an evaluation tree for
faster match. That's the version that gave the most consistent performance
across various compilers, though some hand-crafted versions using bitmaps
stored in register could be slightly faster but super sensitive to code
ordering, suggesting that the results might vary with future compilers.
This one takes on average 1.2ns per character at 3 GHz (3.6 cycles per
char on avg). The resulting impact on H2 request processing time (small
requests) was measured around 0.3%, from 6.60 to 6.618us per request,
which is a bit high but remains acceptable given that the test only
focused on req rate.

The code was made usable both for H2 and H3.
2025-05-12 18:02:47 +02:00
Aurelien DARRAGON
c40d6ac840 BUG/MINOR: server: perform lbprm deinit for dynamic servers
Last commit 7361515 ("BUG/MINOR: server: dont depend on proxy for server
cleanup in srv_drop()") introduced a regression because the lbprm
server_deinit is not evaluated anymore with dynamic servers, possibly
resulting in a memory leak.

To fix the issue, in addition to free_proxy(), the server deinit check
should be manually performed in cli_parse_delete_server() as well.

No backport needed.
2025-05-12 16:29:36 +02:00
Aurelien DARRAGON
736151556c BUG/MINOR: server: dont depend on proxy for server cleanup in srv_drop()
In commit b5ee8bebfc ("MINOR: server: always call ssl->destroy_srv when
available"), we made it so srv_drop() doesn't depend on proxy to perform
server cleanup.

It turns out this is now mandatory, because during deinit, free_proxy()
can occur before the final srv_drop(). This is the case when using Lua
scripts for instance.

In 2a9436f96 ("MINOR: lbprm: Add method to deinit server and proxy") we
added a freeing check under srv_drop() that depends on the proxy.
Because of that UAF may occur during deinit when using a Lua script that
manipulate server objects.

To fix the issue, let's perform the lbprm server deinit logic under
free_proxy() directly, where the DEINIT server hooks are evaluated.

Also, to prevent similar bugs in the future, let's explicitly document
in srv_drop() that server cleanups should assume that the proxy may
already be freed.

No backport needed unless 2a9436f96 is.
2025-05-12 16:17:26 +02:00
Willy Tarreau
be4d816be2 BUG/MINOR: cfgparse: improve the empty arg position report's robustness
OSS Fuzz found that the previous fix ebb19fb367 ("BUG/MINOR: cfgparse:
consider the special case of empty arg caused by \x00") was incomplete,
as the output can sometimes be larger than the input (due to variables
expansion) in which case the work around to try to report a bad arg will
fail. While the parse_line() function has been made more robust now in
order to avoid this condition, let's fix the handling of this special
case anyway by just pointing to the beginning of the line if the supposed
error location is out of the line's buffer.

All details here:
   https://oss-fuzz.com/testcase-detail/5202563081502720

No backport is needed unless the fix above is backported.
2025-05-12 16:11:15 +02:00
Willy Tarreau
2b60e54fb1 BUG/MINOR: tools: improve parse_line()'s robustness against empty args
The fix in 10e6d0bd57 ("BUG/MINOR: tools: only fill first empty arg when
not out of range") was not that good. It focused on protecting against
<arg> becoming out of range to detect we haven't emitted anything, but
it's not the right way to detect this. We're always maintaining arg_start
as a copy of outpos, and that later one is incremented when emitting a
char, so instead of testing args[arg] against out+arg_start, we should
instead check outpos against arg_start, thereby eliminating the <out>
offset and the need to access args[]. This way we now always know if
we've emitted an empty arg without dereferencing args[].

There's no need to backport this unless the fix above is also backported.
2025-05-12 16:11:15 +02:00
Aurelien DARRAGON
7d057e56af BUG/MINOR: threads: fix soft-stop without multithreading support
When thread support is disabled ("USE_THREAD=" or "USE_THREAD=0" when
building), soft-stop doesn't work as haproxy never ends after stopping
the proxies.

This used to work fine in the past but suddenly stopped working with
ef422ced91 ("MEDIUM: thread: make stopping_threads per-group and add
stopping_tgroups") because the "break;" instruction under the stopping
condition is never executed when support for multithreading is disabled.

To fix the issue, let's add an "else" block to run the "break;"
instruction when USE_THREAD is not defined.

It should be backported up to 2.8
2025-05-12 14:18:39 +02:00
William Lallemand
8b0d1a4113 MINOR: ssl/ckch: warn when the same keyword was used twice
When using a crt-list or a crt-store, keywords mentionned twice on the
same line overwritte the previous value.

This patch emits a warning when the same keyword is found another time
on the same line.
2025-05-09 19:18:38 +02:00
William Lallemand
9c0c05b7ba BUG/MINOR: ssl/ckch: always ha_freearray() the previous entry during parsing
The ckch_conf_parse() function is the generic function which parses
crt-store keywords from the crt-store section, and also from a
crt-list.

When having multiple time the same keyword, a leak of the previous
value happens. This patch ensure that the previous value is always
freed before overwriting it.

This is the same problem as the previous "BUG/MINOR: ssl/ckch: always
free() the previous entry during parsing" patch, however this one
applies on PARSE_TYPE_ARRAY_SUBSTR.

No backport needed.
2025-05-09 19:16:02 +02:00
William Lallemand
96b1f1fd26 MINOR: tools: ha_freearray() frees an array of string
ha_freearray() is a new function which free() an array of strings
terminated by a NULL entry.

The pointer to the array will be free and set to NULL.
2025-05-09 19:12:05 +02:00
William Lallemand
311e0aa5c7 BUG/MINOR: ssl/ckch: always free() the previous entry during parsing
The ckch_conf_parse() function is the generic function which parses
crt-store keywords from the crt-store section, and also from a crt-list.

When having multiple time the same keyword, a leak of the previous value
happens. This patch ensure that the previous value is always freed
before overwriting it.

This patch should be backported as far as 3.0.
2025-05-09 19:01:28 +02:00
William Lallemand
9ce3fb35a2 BUG/MINOR: ssl: prevent multiple 'crt' on the same ssl-f-use line
The 'ssl-f-use' implementation doesn't prevent to have multiple time the
'crt' keyword, which overwrite the previous value. Letting users think
that is it possible to use multiple certificates on the same line, which
is not the case.

This patch emits an alert when setting the 'crt' keyword multiple times
on the same ssl-f-use line.

Should fix issue #2966.

No backport needed.
2025-05-09 18:52:09 +02:00
William Lallemand
0c4abf5a22 BUG/MINOR: ssl: doesn't fill conf->crt with first arg
Commit c7f29afc ("MEDIUM: ssl: replace "crt" lines by "ssl-f-use"
lines") forgot to remove an the allocation of the crt field which was
done with the first argument.

Since ssl-f-use takes keywords, this would put the first keyword in
"crt" instead of the certificate name.
2025-05-09 18:23:06 +02:00
Willy Tarreau
8a96216847 MEDIUM: sock-inet: re-check IPv6 connectivity every 30s
IPv6 connectivity might start off (e.g. network not fully up when
haproxy starts), so for features like resolvers, it would be nice to
periodically recheck.

With this change, instead of having the resolvers code rely on a variable
indicating connectivity, it will now call a function that will check for
how long a connectivity check hasn't been run, and will perform a new one
if needed. The age was set to 30s which seems reasonable considering that
the DNS will cache results anyway. There's no saving in spacing it more
since the syscall is very check (just a connect() without any packet being
emitted).

The variables remain exported so that we could present them in show info
or anywhere else.

This way, "dns-accept-family auto" will now stay up to date. Warning
though, it does perform some caching so even with a refreshed IPv6
connectivity, an older record may be returned anyway.
2025-05-09 15:45:44 +02:00
Willy Tarreau
1404f6fb7b DEBUG: pools: add a new integrity mode "backup" to copy the released area
This way we can preserve the entire contents of the released area for
later inspection. This automatically enables comparison at reallocation
time as well (like "integrity" does). If used in combination with
integrity, the comparison is disabled but the check of non-corruption
of the area mangled by integrity is still operated.
2025-05-09 14:57:00 +02:00
William Lallemand
e7574cd5f0 MINOR: acme: add the global option 'acme.scheduler'
The automatic scheduler is useful but sometimes you don't want to use,
or schedule manually.

This patch adds an 'acme.scheduler' option in the global section, which
can be set to either 'auto' or 'off'. (auto is the default value)

This also change the ouput of the 'acme status' command so it does not
shows scheduled values. The state will be 'Stopped' instead of
'Scheduled'.
2025-05-09 14:00:39 +02:00
Willy Tarreau
0ae14beb2a DEBUG: pool: permit per-pool UAF configuration
The new MEM_F_UAF flag can be set just after a pool's creation to make
this pool UAF for debugging purposes. This allows to maintain a better
overall performance required to reproduce issues while still having a
chance to catch UAF. It will only be used by developers who will manually
add it to areas worth being inspected, though.
2025-05-09 13:59:02 +02:00
Amaury Denoyelle
14e4f2b811 BUG/MEDIUM: mux-quic: fix crash on invalid fctl frame dereference
Emission of flow-control frames have been recently modified. Now, each
frame is sent one by one, via a single entry list. If a failure occurs,
emission is interrupted and frame is reinserted into the original
<qcc.lfctl.frms> list.

This code is incorrect as it only checks if qcc_send_frames() returns an
error code to perform the reinsert operation. However, an error here
does not always mean that the frame was not properly emitted by lower
quic-conn layer. As such, an extra test LIST_ISEMPTY() must be performed
prior to reinsert the frame.

This bug would cause a heap overflow. Indeed, the reinsert frame would
be a random value. A crash would occur as soon as it would be
dereferenced via <qcc.lfctl.frms> list.

This was reproduced by issuing a POST with a big file and interrupt it
after just a few seconds. This results in a crash in about a third of
the tests. Here is an example command using ngtcp2 :

 $ ngtcp2-client -q --no-quic-dump --no-http-dump \
   -m POST -d ~/infra/html/1g 127.0.0.1 20443 "http://127.0.0.1:20443/post"

Heap overflow was detected via a BUG_ON() statement from qc_frm_free()
via qcc_release() caller :

  FATAL: bug condition "!((&((*frm)->reflist))->n == (&((*frm)->reflist)))" matched at src/quic_frame.c:1270

This does not need to be backported.
2025-05-09 11:07:11 +02:00
Willy Tarreau
3f9194bfc9 [RELEASE] Released version 3.2-dev15
Released version 3.2-dev15 with the following main changes :
    - BUG/MEDIUM: stktable: fix sc_*(<ctr>) BUG_ON() regression with ctx > 9
    - BUG/MINOR: acme/cli: don't output error on success
    - BUG/MINOR: tools: do not create an empty arg from trailing spaces
    - MEDIUM: config: warn about the consequences of empty arguments on a config line
    - MINOR: tools: make parse_line() provide hints about empty args
    - MINOR: cfgparse: visually show the input line on empty args
    - BUG/MINOR: tools: always terminate empty lines
    - BUG/MINOR: tools: make parseline report the required space for the trailing 0
    - DEBUG: threads: don't keep lock label "OTHER" in the per-thread history
    - DEBUG: threads: merge successive idempotent lock operations in history
    - DEBUG: threads: display held locks in threads dumps
    - BUG/MINOR: proxy: only use proxy_inc_fe_cum_sess_ver_ctr() with frontends
    - Revert "BUG/MEDIUM: mux-spop: Handle CLOSING state and wait for AGENT DISCONNECT frame"
    - MINOR: acme/cli: 'acme status' show the status acme-configured certificates
    - MEDIUM: acme/ssl: remove 'acme ps' in favor of 'acme status'
    - DOC: configuration: add "acme" section to the keywords list
    - DOC: configuration: add the "crt-store" keyword
    - BUG/MAJOR: queue: lock around the call to pendconn_process_next_strm()
    - MINOR: ssl: add filename and linenum for ssl-f-use errors
    - BUG/MINOR: ssl: can't use crt-store some certificates in ssl-f-use
    - BUG/MINOR: tools: only fill first empty arg when not out of range
    - MINOR: debug: bump the dump buffer to 8kB
    - MINOR: stick-tables: add "ipv4" as an alias for the "ip" type
    - MINOR: quic: extend return value during TP parsing
    - BUG/MINOR: quic: use proper error code on missing CID in TPs
    - BUG/MINOR: quic: use proper error code on invalid server TP
    - BUG/MINOR: quic: reject retry_source_cid TP on server side
    - BUG/MINOR: quic: use proper error code on invalid received TP value
    - BUG/MINOR: quic: fix TP reject on invalid max-ack-delay
    - BUG/MINOR: quic: reject invalid max_udp_payload size
    - BUG/MEDIUM: peers: hold the refcnt until updating ts->seen
    - BUG/MEDIUM: stick-tables: close a tiny race in __stksess_kill()
    - BUG/MINOR: cli: fix too many args detection for commands
    - MINOR: server: ensure server postparse tasks are run for dynamic servers
    - BUG/MEDIUM: stick-table: always remove update before adding a new one
    - BUG/MEDIUM: quic: free stream_desc on all data acked
    - BUG/MINOR: cfgparse: consider the special case of empty arg caused by \x00
    - DOC: config: recommend disabling libc-based resolution with resolvers
2025-05-09 10:51:30 +02:00
Willy Tarreau
4e20fab7ac DOC: config: recommend disabling libc-based resolution with resolvers
Using both libc and haproxy resolvers can lead to hard to diagnose issues
when their bevahiour diverges; recommend using only one type of resolver.

Should be backported to stable versions.

Link: https://www.mail-archive.com/haproxy@formilux.org/msg45663.html
Co-authored-by: Lukas Tribus <lukas@ltri.eu>
2025-05-09 10:31:39 +02:00
Willy Tarreau
ebb19fb367 BUG/MINOR: cfgparse: consider the special case of empty arg caused by \x00
The reporting of the empty arg location added with commit 08d3caf30
("MINOR: cfgparse: visually show the input line on empty args") falls
victim of a special case detected by OSS Fuzz:

     https://issues.oss-fuzz.com/issues/415850462

In short, making an argument start with "\x00" doesn't make it empty for
the parser, but still emits an empty string which is detected and
displayed. Unfortunately in this case the error pointer is not set so
the sanitization function crashes.

What we're doing in this case is that we fall back to the position of
the output argument as an estimate of where it was located in the input.
It's clearly inexact (quoting etc) but will still help the user locate
the problem.

No backport is needed unless the commit above is backported.
2025-05-09 10:01:44 +02:00
Amaury Denoyelle
3fdb039a99 BUG/MEDIUM: quic: free stream_desc on all data acked
The following patch simplifies qc_stream_desc_ack(). The qc_stream_desc
instance is not freed anymore, even if all data were acknowledged. As
implies by the commit message, the caller is responsible to perform this
cleaning operation.
  f4a83fbb14
  MINOR: quic: do not remove qc_stream_desc automatically on ACK handling

However, despite the commit instruction, qc_stream_desc_free()
invokation was not moved in the caller. This commit fixes this by adding
it after stream ACK handling. This is performed only when a transfer is
completed : all data is acknowledged and qc_stream_desc has been
released by its MUX stream instance counterpart.

This bug may cause a significant increase in memory usage when dealing
with long running connection. However, there is no memory leak, as every
qc_stream_desc attached to a connection are finally freed when quic_conn
instance is released.

This must be backported up to 3.1.
2025-05-09 09:25:47 +02:00
Willy Tarreau
576e47fb9a BUG/MEDIUM: stick-table: always remove update before adding a new one
Since commit 388539faa ("MEDIUM: stick-tables: defer adding updates to a
tasklet"), between the entry creation and its arrival in the updates tree,
there is time for scheduling, and it now becomes possible for an stksess
entry to be requeued into the list while it's still in the tree as a remote
one. Only local updates were removed prior to being inserted. In this case
we would re-insert the entry, causing it to appear as the parent of two
distinct nodes or leaves, and to be visited from the first leaf during a
delete() after having already been removed and freed, causing a crash,
as Christian reported in issue #2959.

There's no reason to backport this as this appeared with the commit above
in 3.2-dev13.
2025-05-08 23:32:25 +02:00
Aurelien DARRAGON
f03e999912 MINOR: server: ensure server postparse tasks are run for dynamic servers
commit 29b76cae4 ("BUG/MEDIUM: server/log: "mode log" after server
keyword causes crash") introduced some postparsing checks/tasks for
server

Initially they were mainly meant for "mode log" servers postparsing, but
we already have a check dedicated to "tcp/http" servers (ie: only tcp
proto supported)

However when dynamic servers are added they bypass _srv_postparse() since
the REGISTER_POST_SERVER_CHECK() is only executed for servers defined in
the configuration.

To ensure consistency between dynamic and static servers, and ensure no
post-check init routine is missed, let's manually invoke _srv_postparse()
after creating a dynamic server added via the cli.
2025-05-08 02:03:50 +02:00
Aurelien DARRAGON
976e0bd32f BUG/MINOR: cli: fix too many args detection for commands
d3f928944 ("BUG/MINOR: cli: Issue an error when too many args are passed
for a command") added a new check to prevent the command to run when
too many arguments are provided. In this case an error is reported.

However it turns out this check (despite marked for backports) was
ineffective prior to 20ec1de21 ("MAJOR: cli: Refacor parsing and
execution of pipelined commands") as 'p' pointer was reset to the end of
the buffer before the check was executed.

Now since 20ec1de21, the check works, but we have another issue: we may
read past initialized bytes in the buffer because 'p' pointer is always
incremented in a while loop without checking if we increment it past 'end'
(This was detected using valgrind)

To fix the issue introduced by 20ec1de21, let's only increment 'p' pointer
if p < end.

For 3.2 this is it, now for older versions, since d3f928944 was marked for
backport, a sligthly different approach is needed:

 - conditional p increment must be done in the loop (as in this patch)
 - max arg check must moved above "fill unused slots" comment where p is
   assigned to the end of the buffer

This patch should be backported with d3f928944.
2025-05-08 02:03:43 +02:00
Willy Tarreau
0cee7b5b8d BUG/MEDIUM: stick-tables: close a tiny race in __stksess_kill()
It might be possible not to see the element in the tree, then not to
see it in the update list, thus not to take the lock before deleting.
But an element in the list could have moved to the tree during the
check, and be removed later without the updt_lock.

Let's delete prior to checking the presence in the tree to avoid
this situation. No backport is needed since this arrived in -dev13
with the update list.
2025-05-07 18:49:21 +02:00
Willy Tarreau
006a3acbde BUG/MEDIUM: peers: hold the refcnt until updating ts->seen
In peer_treat_updatemsg(), we call stktable_touch_remote() after
releasing the write lock on the TS, asking it to decrement the
refcnt, then we update ts->seen. Unfortunately this is racy and
causes the issue that Christian reported in issue #2959.

The sequence of events is very hard to trigger manually, but what happens
is the following:

 T1.  stktable_touch_remote(table, ts, 1);
      -> at this point the entry is in the mt_list, and the refcnt is zero.

      T2.  stktable_trash_oldest() or process_table_expire()
           -> these can run, because the refcnt is now zero.
              The entry is cleanly deleted and freed.

 T1.  HA_ATOMIC_STORE(&ts->seen, 1)
      -> we dereference freed memory.

A first attempt at a fix was made by keeping the refcnt held during
all the time the entry is in the mt_list, but this is expensive as
such entries cannot be purged, causing lots of skips during
trash_oldest_data(). This managed to trigger watchdogs, and was only
hiding the real cause of the problem.

The correct approach clearly is to maintain the ref_cnt until we
touch ->seen. That's what this patch does. It does not decrement
the refcnt, while calling stktable_touch_remote(), and does it
manually after touching ->seen. With this the problem is gone.

Note that a reproducer involves the following:
  - a config with 10 stick-ctr tracking the same table with a
    random key between 10M and 100M depending on the machine.
  - the expiration should be between 10 and 20s. http_req_cnt
    is stored and shared with the peers.
  - 4 total processes with such a config on the local machine,
    each corresponding to a different peer. 3 of the peers are
    bound to half of the cores (all threads) and share the same
    threads; the last process is bound to the other half with
    its own threads.
  - injecting at full load, ~256 conn, on the shared listening
    port. After ~2x expiration time to 1 minute the lone process
    should segfault in pools code due to a corrupted by_lru list.

This problem already exists in earlier versions but the race looks
narrower. Given how difficult it is to trigger on a given machine
in its current form, it's likely that it only happens once in a
while on stable branches. The fix must be backported wherever the
code is similar, and there's no hope to reproduce it to validate
the backport.

Thanks again to Christian for his amazing help!
2025-05-07 18:49:21 +02:00
Amaury Denoyelle
4bc7aa548a BUG/MINOR: quic: reject invalid max_udp_payload size
Add a checks on received max_udp_payload transport parameters. As
defined per RFC 9000, values below 1200 are invalid, and thus the
connection must be closed with TRANSPORT_PARAMETER_ERROR code.

Prior to this patch, an invalid value was silently ignored.

This should be backported up to 2.6. Note that is relies on previous
patch "MINOR: quic: extend return value on TP parsing".
2025-05-07 15:21:30 +02:00
Amaury Denoyelle
ffabfb0fc3 BUG/MINOR: quic: fix TP reject on invalid max-ack-delay
Checks are implemented on some received transport parameter values,
to reject invalid ones defined per RFC 9000. This is the case for
max_ack_delay parameter.

The check was not properly implemented as it only reject values strictly
greater than the limit set to 2^14. Fix this by rejecting values of 2^14
and above. Also, the proper error code TRANSPORT_PARAMETER_ERROR is now
set.

This should be backported up to 2.6. Note that is relies on previous
patch "MINOR: quic: extend return value on TP parsing".
2025-05-07 15:21:30 +02:00
Amaury Denoyelle
b60a17aad7 BUG/MINOR: quic: use proper error code on invalid received TP value
As per RFC 9000, checks must be implemented to reject invalid values for
received transport parameters. Such values are dependent on the
parameter type.

Checks were already implemented for ack_delay_exponent and
active_connection_id_limit, accordingly with the QUIC specification.
However, the connection was closed with an incorrect error code. Fix
this to ensure that TRANSPORT_PARAMETER_ERROR code is used as expected.

This should be backported up to 2.6. Note that is relies on previous
patch "MINOR: quic: extend return value on TP parsing".
2025-05-07 15:21:30 +02:00
Amaury Denoyelle
10f1f1adce BUG/MINOR: quic: reject retry_source_cid TP on server side
Close the connection on error if retry_source_connection_id transport
parameter is received. This is specified by RFC 9000 as this parameter
must not be emitted by a client. Previously, it was silently ignored.

This should be backported up to 2.6. Note that is relies on previous
patch "MINOR: quic: extend return value on TP parsing".
2025-05-07 15:21:30 +02:00
Amaury Denoyelle
a54fdd3d92 BUG/MINOR: quic: use proper error code on invalid server TP
This commit is similar to the previous one. It fixes the error code
reported when dealing with invalid received transport parameters. This
time, it handles reception of original_destination_connection_id,
preferred_address and stateless_reset_token which must not be emitted by
the client.

This should be backported up to 2.6. Note that is relies on previous
patch "MINOR: quic: extend return value on TP parsing".
2025-05-07 15:20:06 +02:00
Amaury Denoyelle
df6bd4909e BUG/MINOR: quic: use proper error code on missing CID in TPs
Handle missing received transport parameter value
initial_source_connection_id / original_destination_connection_id.
Previously, such case would result in an error reported via
quic_transport_params_store(), which triggers a TLS alert converted as
expected as a CONNECTION_CLOSE. The issue is that the error code
reported in the frame was incorrect.

Fix this by returning QUIC_TP_DEC_ERR_INVAL for such conditions. This is
directly handled via quic_transport_params_store() which set the proper
TRANSPORT_PARAMETER_ERROR code for the CONNECTION_CLOSE. However, no
error is reported so the SSL handshake is properly terminated without a
TLS alert. This is enough to ensure that the CONNECTION_CLOSE frame will
be emitted as expected.

This should be backported up to 2.6. Note that is relies on previous
patch "MINOR: quic: extend return value on TP parsing".
2025-05-07 15:20:06 +02:00
Amaury Denoyelle
294bf26c06 MINOR: quic: extend return value during TP parsing
Extend API used for QUIC transport parameter decoding. This is done via
the introduction of a dedicated enum to report the various error
condition detected. No functional change should occur with this patch,
as the only returned code is QUIC_TP_DEC_ERR_TRUNC, which results in the
connection closure via a TLS alert.

This patch will be necessary to properly reject transport parameters
with the proper CONNECTION_CLOSE error code. As such, it should be
backported up to 2.6 with the following series.
2025-05-07 15:19:52 +02:00
Willy Tarreau
46b5dcad99 MINOR: stick-tables: add "ipv4" as an alias for the "ip" type
However the doc purposely says the opposite, to encourage migrating away
from "ip". The goal is that in the future we change "ip" to mean "ipv6",
which seems to be what most users naturally expect. But we cannot break
configurations in the LTS version so for now "ipv4" is the alias.

The reason for not changing it in the table is that the type name is
used at a few places (look for "].kw"):
  - dumps
  - promex

We'd rather not change that output for 3.2, but only do it in 3.3.
This way, 3.2 can be made future-proof by using "ipv4" in the config
without any other side effect.

Please see github issue #2962 for updates on this transition.
2025-05-07 10:11:55 +02:00
Willy Tarreau
697a531516 MINOR: debug: bump the dump buffer to 8kB
Now with the improved backtraces, the lock history and details in the
mux layers, some dumps appear truncated or with some chars alone at
the beginning of the line. The issue is in fact caused by the limited
dump buffer size (2kB for stderr, 4kB for warning), that cannot hold
a complete trace anymore.

Let's jump bump them to 8kB, this will be plenty for a long time.
2025-05-07 10:02:58 +02:00
Willy Tarreau
10e6d0bd57 BUG/MINOR: tools: only fill first empty arg when not out of range
In commit 3f2c8af313 ("MINOR: tools: make parse_line() provide hints
about empty args") we've added the ability to record the position of
the first empty arg in parse_line(), but that check requires to
access the args[] array for the current arg, which is not valid in
case we stopped on too large an argument count. Let's just check the
arg's validity before doing so.

This was reported by OSS Fuzz:
  https://issues.oss-fuzz.com/issues/415850462

No backport is needed since this was in the latest dev branch.
2025-05-07 07:25:29 +02:00
William Lallemand
fbceabbccf BUG/MINOR: ssl: can't use crt-store some certificates in ssl-f-use
When declaring a certificate via the crt-store section, this certificate
can then be used 2 ways in a crt-list:
- only by using its name, without any crt-store options
- or by using the exact set of crt-list option that was defined in the
  crt-store

Since ssl-f-use is generating a crt-list, this is suppose to behave the
same. To achieve this, ckch_conf_parse() will parse the keywords related
to the ckch_conf on the ssl-f-use line and use ckch_conf_cmp() to
compare it to the previous declaration from the crt-store. This
comparaison is only done when any ckch_conf keyword are present.

However, ckch_conf_parse() was done for the crt-list, and the crt-list
does not use the "crt" parameter to declare the name of the certificate,
since it's the first element of the line. So when used with ssl-f-use,
ckch_conf_parse() will always see a "crt" keyword which is a ckch_conf
one, and consider that it will always need to have the exact same set of
paremeters when using the same crt in a crt-store and an ssl-f-use line.

So a simple configuration like this:

   crt-store web
     load crt "foo.com.crt" key "foo.com.key" alias "foo"

   frontend mysite
     bind :443 ssl
     ssl-f-use crt "@web/foo" ssl-min-ver TLSv1.2

Would lead to an error like this:

    config : '@web/foo' in crt-list '(null)' line 0, is already defined with incompatible parameters:
    - different parameter 'key' : previously 'foo.com.key' vs '(null)'

In order to fix the issue, this patch parses the "crt" parameter itself
for ssl-f-use instead of using ckch_conf_parse(), so the keyword would
never be considered as a ckch_conf keyword to compare.

This patch also take care of setting the CKCH_CONF_SET_CRTLIST flag only
if a ckch_conf keyword was found. This flag is used by ckch_conf_cmp()
to know if it has to compare or not.

No backport needed.
2025-05-06 21:36:29 +02:00
William Lallemand
b3b282d2ee MINOR: ssl: add filename and linenum for ssl-f-use errors
Fill cfg_crt_node with a filename and linenum so the post_section
callback can use it to emit errors.

This way the errors are emitted with the right filename and linenum
where ssl-f-use is used instead of (null):0
2025-05-06 21:36:29 +02:00
Willy Tarreau
99f5be5631 BUG/MAJOR: queue: lock around the call to pendconn_process_next_strm()
The extra call to pendconn_process_next_strm() made in commit cda7275ef5
("MEDIUM: queue: Handle the race condition between queue and dequeue
differently") was performed after releasing the server queue's lock,
which is incompatible with the calling convention for this function.
The result is random corruption of the server's streams list likely
due to picking old or incorrect pendconns from the queue, and in the
end infinitely looping on apparently already locked mt_list objects.
Just adding the lock fixes the problem.

It's very difficult to reproduce, it requires low maxconn values on
servers, stickiness on the servers (cookie), a long enough slowstart
(e.g. 10s), and regularly flipping servers up/down to re-trigger the
slowstart.

No backport is needed as this was only in 3.2.
2025-05-06 18:59:54 +02:00
William Lallemand
e035f0c48e DOC: configuration: add the "crt-store" keyword
Add the "crt-store" keyword with its argument in the "3.12" section, so
this could be detected by haproxy-dconv has a keyword and put in the
keywords list.

Must be backported as far as 3.0
2025-05-06 16:07:29 +02:00
William Lallemand
e516b14d36 DOC: configuration: add "acme" section to the keywords list
Add the "acme" keyword with its argument in the "3.13" section, so this
could be detected by haproxy-dconv has a keyword and put in the keywords
list.
2025-05-06 15:34:39 +02:00
William Lallemand
b7c4a68ecf MEDIUM: acme/ssl: remove 'acme ps' in favor of 'acme status'
Remove the 'acme ps' command which does not seem useful anymore with the
'acme status' command.

The big difference with the 'acme status' command is that it was only
displaying the running tasks instead of the status of all certificate.
2025-05-06 15:27:29 +02:00
William Lallemand
48f1ce77b7 MINOR: acme/cli: 'acme status' show the status acme-configured certificates
The "acme status" command, shows the status of every certificates
configured with ACME, not only the running task like "acme ps".

The IO handler loops on the ckch_store tree and outputs a line for each
ckch_store which has an acme section set. This is still done under the
ckch_store lock and doesn't support resuming when the buffer is full,
but we need to change that in the future.
2025-05-06 15:27:29 +02:00
Christopher Faulet
a3ce7d7772 Revert "BUG/MEDIUM: mux-spop: Handle CLOSING state and wait for AGENT DISCONNECT frame"
This reverts commit 53c3046898.

This patch introduced a regression leading to a loop on the frames
demultiplexing because a frame may be ignore but not consumed.

But outside this regression that can be fixed, there is a design issue that
was not totally fixed by the patch above. The SPOP connection state is mixed
with the status of the frames demultiplexer and this needlessly complexify
the connection management. Instead of fixing the fix, a better solution is
to revert it to work a a proper solution.

For the record, the idea is to deal with the spop connection state onlu
using 'state' field and to introduce a new field to handle the frames
demultiplexer state. This should ease the closing state management.

Another issue that must be fixed. We must take care to not abort a SPOP
stream when an error is detected on a SPOP connection or when the connection
is closed, if the ACK frame was already received for this stream. It is not
a common case, but it can be solved by saving the last known stream ID that
recieved a ACK.

This patch must be backported if the commit above is backported.
2025-05-06 13:43:59 +02:00