Commit Graph

22424 Commits

Author SHA1 Message Date
Willy Tarreau
5590ada473 [RELEASE] Released version 3.0.0
Released version 3.0.0 with the following main changes :
    - MINOR: sample: implement the uptime sample fetch
    - CI: scripts: fix build of vtest regarding option -C
    - CI: scripts: build vtest using multiple CPUs
    - MINOR: log: rename 'log-format tag' to 'log-format alias'
    - DOC: config: document logformat item naming and typecasting features
    - BUILD: makefile: yearly reordering of objects by build time
    - BUILD: fd: errno is also needed without poll()
    - DOC: config: fix two typos "RST_STEAM" vs "RST_STREAM"
    - DOC: config: refer to the non-deprecated keywords in ocsp-update on/off
    - DOC: streamline http-reuse and connection naming definition
    - REGTESTS: complete http-reuse test with pool-conn-name
    - DOC: config: add %ID logformat alias alternative
    - CLEANUP: ssl/ocsp: readable ifdef in ssl_sock_load_ocsp
    - BUG/MINOR: ssl/ocsp: init callback func ptr as NULL
    - CLEANUP: ssl_sock: move dirty openssl-1.0.2 wrapper to openssl-compat
    - BUG/MINOR: activity: fix Delta_calls and Delta_bytes count
    - CI: github: upgrade the WolfSSL job to 5.7.0
    - DOC: install: update quick build reminders with some missing options
    - DOC: install: update the range of tested openssl version to cover 3.3
    - DEV: patchbot: prepare for new version 3.1-dev
    - MINOR: version: mention that it's 3.0 LTS now.
2024-05-29 14:43:38 +02:00
Willy Tarreau
2e42a19cde MINOR: version: mention that it's 3.0 LTS now.
The version will be maintained up to around Q2 2029. Let's
also update the INSTALL file to mention this.
2024-05-29 14:40:26 +02:00
Willy Tarreau
bb7e62b98a DEV: patchbot: prepare for new version 3.1-dev
The bot will now load the prompt for the upcoming 3.1 version so we have
to rename the files and update their contents to match the current version.
2024-05-29 14:38:21 +02:00
Willy Tarreau
8452a3f7c9 DOC: install: update the range of tested openssl version to cover 3.3
OpenSSL 3.3 is known to work since it's tested on the CI, to let's add
it to the list of known good versions.
2024-05-29 10:23:59 +02:00
Willy Tarreau
2a949be18d DOC: install: update quick build reminders with some missing options
The quick build reminders claimed to present "all options" but were
still missing QUIC. It was also the moment to split FreeBSD and
OpenBSD apart since the latter uses LibreSSL and does not require
the openssl compatibility wrapper. We also replace the hard-coded
number of cpus for the parallel build, by the real number reported
by the system.
2024-05-29 08:43:01 +02:00
William Lallemand
40cd5cc0e2 CI: github: upgrade the WolfSSL job to 5.7.0
WolfSSL 5.70 was released in March 2024,  let's upgrade our CI job to
this version.
2024-05-28 19:26:52 +02:00
Valentine Krasnobaeva
d5e43caaf5 BUG/MINOR: activity: fix Delta_calls and Delta_bytes count
Thanks to the commit 5714aff4a6
"DEBUG: pool: store the memprof bin on alloc() and update it on free()", the
amount of memory allocations and memory "frees" is shown now on the same line,
corresponded to the caller name. This is very convenient to debug memory leaks
(haproxy should run with -dMcaller option).

The implicit drawback of this solution is that we count twice same free_calls
and same free_tot (bytes) values in cli_io_handler_show_profiling(), when
we've calculed tot_free_calls and tot_free_bytes, by adding them to the these
totalizators for p_alloc, malloc and calloc allocator types. See the details
about why this happens in a such way in __pool_free() implementation and
also in the commit message for 5714aff4a6.

This double addition of free counters falses 'Delta_calls' and 'Delta_bytes',
sometimes we even noticed that they show negative values.

Same problem was with the calculation of average allocated buffer size for
lines, where we show simultaneously the number of allocated and freed bytes.
2024-05-28 19:25:08 +02:00
Willy Tarreau
decb7c90df CLEANUP: ssl_sock: move dirty openssl-1.0.2 wrapper to openssl-compat
Valentine noticed this ugly SSL_CTX_get_tlsext_status_cb() macro
definition inside ssl_sock.c that is dedicated to openssl-1.0.2 only.
It would be better placed in openssl-compat.h, which is what this
patch does. It also addresses a missing pair of parenthesis and
removes an invalid extra semicolon.
2024-05-28 19:17:57 +02:00
Valentine Krasnobaeva
84380965a5 BUG/MINOR: ssl/ocsp: init callback func ptr as NULL
In ssl_sock_load_ocsp() it is better to initialize local scope variable
'callback' function pointer as NULL, while we are declaring it. According to
SSL_CTX_get_tlsext_status_cb() API, then we will provide a pointer to this
'on stack' variable in order to check, if the callback was already set before:

OpenSSL 1.x.x and 3.x.x:
  long SSL_CTX_get_tlsext_status_cb(SSL_CTX *ctx, int (**callback)(SSL *, void *));
  long SSL_CTX_set_tlsext_status_cb(SSL_CTX *ctx, int (*callback)(SSL *, void *));

WolfSSL 5.7.0:
  typedef int(*tlsextStatusCb)(WOLFSSL* ssl, void*);
  WOLFSSL_API int wolfSSL_CTX_get_tlsext_status_cb(WOLFSSL_CTX* ctx, tlsextStatusCb* cb);
  WOLFSSL_API int wolfSSL_CTX_set_tlsext_status_cb(WOLFSSL_CTX* ctx, tlsextStatusCb cb);

When this func ptr variable stays uninitialized, haproxy comipled with ASAN
crushes in ssl_sock_load_ocsp():

  ./haproxy -d -f haproxy.cfg
  ...
  AddressSanitizer:DEADLYSIGNAL
  =================================================================
  ==114919==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000008 (pc 0x5eab8951bb32 bp 0x7ffcdd6d8410 sp 0x7ffcdd6d82e0 T0)
  ==114919==The signal is caused by a READ memory access.
  ==114919==Hint: address points to the zero page.
    #0 0x5eab8951bb32 in ssl_sock_load_ocsp /home/vk/projects/haproxy/src/ssl_sock.c:1248:22
    #1 0x5eab89510d65 in ssl_sock_put_ckch_into_ctx /home/vk/projects/haproxy/src/ssl_sock.c:3389:6
  ...

This happens, because callback variable is allocated on the stack. As not
being explicitly initialized, it may contain some garbage value at runtime,
due to the linked crypto library update or recompilation.

So, following ssl_sock_load_ocsp code, SSL_CTX_get_tlsext_status_cb() may
fail, callback will still contain its initial garbage value,
'if (!callback) {...' test will put us on the wrong path to access some
ocsp_cbk_arg properties via its pointer, which won't be set and like this
we will finish with segmentation fault.

Must be backported in all stable versions. All versions does not have
the ifdef, the previous cleanup patch is useful starting from the 2.7
version.
2024-05-28 18:14:26 +02:00
Valentine Krasnobaeva
fb7b46d267 CLEANUP: ssl/ocsp: readable ifdef in ssl_sock_load_ocsp
Due to the support of different TLS/SSL libraries and its different versions,
sometimes we are forced to use different internal typedefs and callback
functions. We strive to avoid this, but time to time "#ifdef... #endif"
become inevitable.

In particular, in ssl_sock_load_ocsp() we define a 'callback' variable, which
will contain a function pointer to our OCSP stapling callback, assigned
further via SSL_CTX_set_tlsext_status_cb() to the intenal SSL context
struct in a linked crypto library.

If this linked crypto library is OpenSSL 1.x.x/3.x.x, for setting and
getting this callback we have the following API signatures
(see doc/man3/SSL_CTX_set_tlsext_status_cb.pod):

  long SSL_CTX_get_tlsext_status_cb(SSL_CTX *ctx, int (**callback)(SSL *, void *));
  long SSL_CTX_set_tlsext_status_cb(SSL_CTX *ctx, int (*callback)(SSL *, void *));

If we are using WolfSSL, same APIs expect tlsextStatusCb function prototype,
provided via the typedef below (see wolfssl/wolfssl/ssl.h):

  typedef int(*tlsextStatusCb)(WOLFSSL* ssl, void*);
  WOLFSSL_API int wolfSSL_CTX_get_tlsext_status_cb(WOLFSSL_CTX* ctx, tlsextStatusCb* cb);
  WOLFSSL_API int wolfSSL_CTX_set_tlsext_status_cb(WOLFSSL_CTX* ctx, tlsextStatusCb cb);

It seems, that in OpenSSL < 1.0.0, there was no support for OCSP extention, so
no need to set this callback.

Let's avoid #ifndef... #endif for this 'callback' variable definition to keep
things clear. #ifndef... #endif are usually less readable, than
straightforward "#ifdef... #endif".
2024-05-28 18:00:44 +02:00
Aurelien DARRAGON
f9740230fc DOC: config: add %ID logformat alias alternative
unique-id sample fetch may be used instead of %ID alias but it wasn't
mentioned explicitly in the doc.
2024-05-28 15:45:03 +02:00
Amaury Denoyelle
b0e1f77fea REGTESTS: complete http-reuse test with pool-conn-name
Add new test cases in http_reuse_conn_hash vtest. Ensure new server
parameter "pool-conn-name" is used as expected for idle connection name,
both alone and mixed with a SNI.
2024-05-28 15:00:54 +02:00
Amaury Denoyelle
8c09c7f39f DOC: streamline http-reuse and connection naming definition
With the introduction of "pool-conn-name", documentation related to
http-reuse was rendered more complex than already, notably with multiple
cross-references between "pool-conn-name" and "sni" server keywords.

Took the opportunity to improve all http-reuse related documentation.
First, "http-reuse" keyword general purpose has been greatly expanded
and reordered.

Then, "pool-conn-name" and "sni" have been clarified, in particular the
relation between them, with the foremost being an advanced usage to the
default SSL SNI case in the context of http-reuse. Also update
attach-srv rule documentation as its name parameter is directly linked
to both "pool-conn-name" and "sni".
2024-05-28 13:58:08 +02:00
Willy Tarreau
652a6f18b2 DOC: config: refer to the non-deprecated keywords in ocsp-update on/off
The doc for "ocsp-update [ off | on ]" was still referring to
"tune.ssl.ocsp-update.*" instead of "ocsp-update.*". No backport
needed.
2024-05-27 20:13:42 +02:00
Willy Tarreau
2ed3531619 DOC: config: fix two typos "RST_STEAM" vs "RST_STREAM"
These were added in 3.0-dev11 by commit 068ce2d5d2 ("MINOR: stconn:
Add samples to retrieve about stream aborts"), no backport needed.
2024-05-27 19:51:19 +02:00
Willy Tarreau
725fa0ecd2 BUILD: fd: errno is also needed without poll()
When building without USE_POLL, fd.c fails on errno because that one is
only included when USE_POLL is set. Let's move it outside of the ifdef.
2024-05-27 19:14:14 +02:00
Willy Tarreau
35e9826c13 BUILD: makefile: yearly reordering of objects by build time
Some large files have been split since 2.9 (e.g. stats) and build times
have moved and become less smooth, causing a less even parallel build.
As usual, a small reordering cleans all this up. The effect was less
visible than previous years though.
2024-05-27 19:14:14 +02:00
Aurelien DARRAGON
141bc5ba0d DOC: config: document logformat item naming and typecasting features
The ability to give a name to a logformat_node (known as logformat item in
the documentation) implemented in 2ed6068f2a ("MINOR: log: custom name for
logformat node") wasn't documented.

The same goes for the ability to force the logformat_node's output type to
a specific type implemented in 1448478d62 ("MINOR: log: explicit
typecasting for logformat nodes")

Let's quickly describe such new usages at the start of the custom log
format section.
2024-05-27 17:04:16 +02:00
Aurelien DARRAGON
435a9da267 MINOR: log: rename 'log-format tag' to 'log-format alias'
In 2.9 we started to introduce an ambiguity in the documentation by
referring to historical log-format variables ('%var') as log-format
tags in 739c4e5b1e ("MINOR: sample: accept_date / request_date return
%Ts / %tr timestamp values") and 454c372b60 ("DOC: configuration: add
sample fetches for timing events").

In fact, we've had this confusion between log-format tag and log-format
var for more than 10 years now, but in 2.9 it was the first time the
confusion was exposed in the documentation.

Indeed, both 'log-format variable' and 'log-format tag' actually refer
to the same feature (that is: '%B' and friends that can be used for
direct access to some log-oriented predefined fetches instead of using
%[expr] with generic sample expressions).

This feature was first implemented in 723b73ad75 ("MINOR: config: Parse
the string of the log-format config keyword") and later documented in
4894040fa ("DOC: log-format documentation"). At that time, it was clear
that we used to name it 'log-format variable'.

But later the same year, 'log-format tag' naming started to appear in
some commit messages (while still referring to the same feature), for
instance with ffc3fcd6d ("MEDIUM: log: report SSL ciphers and version
in logs using logformat %sslc/%sslv").

Unfortunately in 2.9 when we added (and documented) new log-format
variables we officially started drifting to the misleading 'log-format
tag' naming (perhaps because it was the most recent naming found for
this feature in git log history, or because the confusion has always
been there)

Even worse, in 3.0 this confusion led us to rename all 'var' occurrences
to 'tag' in log-format related code to unify the code with the doc.

Hopefully William quickly noticed that we made a mistake there, but
instead of reverting to historical naming (log-format variable), it was
decided that we must use a different name that is less confusing than
'tags' or 'variables' (tags and variables are keywords that are already
used to designate other features in the code and that are not very
explicit under log-format context today).

Now we refer to '%B' and friends as a logformat alias, which is
essentially a handy way to print some log oriented information in the
log string instead of leveraging '%[expr]' with generic sample expressions
made of fetches and converters. Of course, there are some subtelties, such
as a few log-format aliases that still don't have sample fetch equivalent
for historical reasons, and some aliases that may be a little faster than
their generic sample expression equivalents because most aliases are
pretty much hardcoded in the log building function. But in general
logformat aliases should be simply considered as an alternative to using
expressions (with '%[expr']')

Also, under log-format context, when we want to refer to either an alias
('%alias') or an expression ('%[expr]'), we should use the generic term
'logformat item', which in fact designates a single item within the
logformat string provided by the user. Indeed, a logformat item (whether
is is an alias or an expression) always starts with '%' and may accept
optional flags / arguments

Both the code and the documentation were updated in that sense, hopefully
this will clarify things and prevent future confusions.
2024-05-27 17:03:48 +02:00
Willy Tarreau
7e943cdf27 CI: scripts: build vtest using multiple CPUs
Now that vtest supports make -j, let's use it to save a bit of time
(the build time is ~6s per test by default).
2024-05-27 12:15:50 +02:00
Willy Tarreau
01843c47a1 CI: scripts: fix build of vtest regarding option -C
On Linux, GNU make emits "w" at the beginning of the MAKEFLAGS
variable if -C is passed, which happens since vtest d6d228bcb3.
In fact it emits any of the command line flags without the leading
'-' in this case. gmake doesn't do that on BSD apparently. It's
documented under Options/Recursion in the GNU make doc. There's
also MFLAGS that could work but it does not contain the variables
definitions. So let's just avoid the -C that we don't really need.

This needs to be backported to stable versions.
2024-05-27 12:15:50 +02:00
William Lallemand
0a00302fab MINOR: sample: implement the uptime sample fetch
'uptime' returns the uptime of the current HAProxy worker in seconds.
2024-05-27 11:06:40 +02:00
Willy Tarreau
f76e73511a [RELEASE] Released version 3.0-dev13
Released version 3.0-dev13 with the following main changes :
    - CLEANUP: ssl/cli: remove unused code in dump_crtlist_conf
    - MINOR: ssl: check parameter in ckch_conf_cmp()
    - BUG/MINOR: ring: free ring's allocated area not ring's usable area when using maps
    - DOC: configuration: rework the crt-store load documentation
    - DEBUG: tools: add vma_set_name() helper
    - DEBUG: shctx: name shared memory using vma_set_name()
    - DEBUG: sink: add name hint for memory area used by memory-backed sinks
    - DEBUG: pollers: add name hint for large memory areas used by pollers
    - DEBUG: errors: add name hint for startup-logs memory area
    - DEBUG: fd: add name hint for large memory areas
    - MEDIUM: ssl: don't load file by discovering them in crt-store
    - DOC: configuration: update the crt-list documentation
    - DOC: configuration: add the supported crt-store options in crt-list
    - BUG/MEDIUM: proto: fix fd leak in <proto>_connect_server
    - MINOR: sock: set conn->err_code in case of EPERM
    - BUG/MINOR: http-ana: Don't crush stream termination condition on internal error
    - MAJOR: spoe: Let the SPOE back into the game
    - BUG/MINOR: connection: parse PROXY TLV for LOCAL mode
    - BUG/MINOR: server: free PROXY v2 TLVs on srv drop
    - MINOR: rhttp: add log on connection allocation failure
    - BUG/MEDIUM: rhttp: fix preconnect on single-thread
    - BUG/MINOR: rhttp: prevent listener suspend
    - BUG/MINOR: rhttp: fix task_wakeup state
    - MINOR: session: define flag to explicitely release listener on free
    - MEDIUM: rhttp: create session for active preconnect
    - MINOR: rhttp: support PROXY emission on preconnect
    - MINOR: connection: support PROXY v2 TLV emission without stream
    - MINOR: traces: enumerate the list of levels/verbosities when not found
    - BUG/MINOR: sock: fix sock_create_server_socket
    - MINOR: proto: fix coding style
    - BUG/MAJOR: quic: Crash with TLS_AES_128_CCM_SHA256 (libressl only)
    - REGTESTS: scripts: allow to change the vtest timeout
    - BUG/MEDIUM: quic_tls: prevent LibreSSL < 4.0 from negotiating CHACHA20_POLY1305
    - CI: scripts/build-ssl.sh: loudly fail on unsupported platforms
    - BUG/MEDIUM: mux-quic: Create sedesc in same time of the QUIC stream
    - MINOR: mux-quic: Set abort info for SC-less QCS on STOP_SENDING frame
    - CI: scripts/build-ssl: add a DESTDIR and TMPDIR variable
    - CI: scripts/buil-ssl: cleanup the boringssl and quictls build
    - MINOR: config: add thread-hard-limit to set an upper bound to nbthread
    - BUILD: quic: fix unused variable warning when threads are disabled
    - BUG/MEDIUM: stick-tables: Fix race with peers when trashing oldest entries
    - BUG/MEDIUM: stick-tables: Fix race with peers when killing a sticky session
    - BUG/MEDIUM: stick-tables: make sure never to create two same remote entries
    - CLEANUP: stick-tables: remove a few unneeded tests for use_wrlock
    - MINOR: stick-tables: remove the uneeded read lock in stksess_free()
    - CLEANUP: tools: fix vma_set_name() function comment
    - DEBUG: tools: add vma_set_name_id() helper
    - DEBUG: pollers/fd: add thread id suffix to per-thread memory areas name hints
    - DOC: config: fix aes_gcm_enc() description text
    - BUILD: trace: fix warning on null dereference
    - MEDIUM: config: prevent communication with privileged ports
    - MAJOR: config: prevent QUIC with clients privileged port by default
    - BUG/MINOR: quic: adjust restriction for stateless reset emission
    - MINOR: quic: clarify doc for quic_recv()
    - MINOR: server: generalize sni expr parsing
    - MINOR: server: define pool-conn-name keyword
    - MEDIUM: connection: use pool-conn-name instead of sni on reuse
    - BUG/MINOR: rhttp: initialize session origin after preconnect reversal
    - BUG/MEDIUM: server/dns: preserve server's port upon resolution timeout or error
    - BUG/MINOR: http-htx: Support default path during scheme based normalization
    - BUG/MINOR: server: Don't reset resolver options on a new default-server line
    - DOC: quic: specify that connection migration is not supported
    - DOC: config: fix incorrect section reference about custom log format
    - DOC: config: uniformize the naming and description of custom log format args
    - DOC: config: clarify the fact that custom log format is not just for logging
    - REGTESTS: acl_cli_spaces: avoid a warning caused by undefined logs
2024-05-24 17:57:29 +02:00
Willy Tarreau
45a187304e REGTESTS: acl_cli_spaces: avoid a warning caused by undefined logs
There's a warning being reported in this reg test in the detailed startup
logs because of "log global" and "option httplog" while there's no global
section hence no logger. Let's just drop both options since they're not
relevant to this test.
2024-05-24 17:50:19 +02:00
Willy Tarreau
0af9bfcbc5 DOC: config: clarify the fact that custom log format is not just for logging
The wording in the Custom log format section was still extremely centered
on logging, but it's about time to mention that these are usable for other
actions as well, otherwise it's very confusing for newcomers who try to
define a variable or header. The updated text also reminds about the risks
of safe encodings that may (rarely) mangle an output string, and encourages
to migrate away from the unquoted definition which is full of backslashes.
It would definitely deserve further improvements and refinements.
2024-05-24 17:32:59 +02:00
Willy Tarreau
c02cefce23 DOC: config: uniformize the naming and description of custom log format args
A significant number of actions now take arguments that are evaluated as
log-format expressions. Some of them are called "fmt", others "string".
The description of the argument sometimes just says "the log-format
string" or "log format" or "custom log format" etc. Most of them do not
mention the section to visit, and section 8.2 speaking about log-format
is very centric on logs usage (the primary use case), making all of this
very confusing for newcomers.

Since section 8.2.6 is titled "Custom log format" and describes the syntax
to be used with the "log-format" (and other) directives, let's call this
"Custom log format" everywhere and mention section 8.2.6. When the field
was called "string", it was also renamed to "fmt".

It doesn't seem worth backporting this, unless it applies fine.
2024-05-24 17:32:59 +02:00
Willy Tarreau
474cbcf842 DOC: config: fix incorrect section reference about custom log format
Since 2.5 with commit 98b930d043 ("MINOR: ssl: Define a default https
log format"), some log-format sections were shifted a bit without having
been renumberred, causing 8.2.4 to be referenced as the custom log
format while it's in fact 8.2.6. This patch fixes the affected
locations.

In addition two places mentioned 8.2.6 instead of 8.2.5 for the error
log format.

This can be backported to 2.6.
2024-05-24 17:32:59 +02:00
Amaury Denoyelle
59b69aafae DOC: quic: specify that connection migration is not supported
Currently haproxy does not support QUIC connection migration. This is
advertized to clients on their connections. Document this in the first
QUIC related paragraph.

This should be backported up to 2.6.
2024-05-24 17:32:37 +02:00
Christopher Faulet
0d7c1bc6ab BUG/MINOR: server: Don't reset resolver options on a new default-server line
When a new "default-server" line is parsed, some resolver options are reset.
Thus previously defined default options cannot be inherited. There is no
reason to do so. First because other server options are inherited. And then
because not all resolver options are reset. It is not consistent.

This patch should fix issue #2559. It should be backported to all stable
versions.
2024-05-24 16:31:01 +02:00
Christopher Faulet
8d2514e087 BUG/MINOR: http-htx: Support default path during scheme based normalization
As stated in RFC3986, for an absolute-form URI, an empty path should be
normalized to a path of "/". This is part of scheme based normalization
rules. This kind of normalization is already performed for default ports. So
we might as well deal with the case of empty path.

The associated reg-tests was updated accordingly.

This patch should fix the issue #2573. It may be backported as far as 2.4 if
necessary.
2024-05-24 16:17:24 +02:00
Aurelien DARRAGON
c16eba8183 BUG/MEDIUM: server/dns: preserve server's port upon resolution timeout or error
@boi4 reported in GH #2578 that since 3.0-dev1 for servers with address
learned from A/AAAA records after a DNS flap server would be put out of
maintenance with proper address but with invalid port (== 0), making it
unusable and causing tcp checks to fail:

[NOTICE]   (1) : Loading success.
[WARNING]  (8) : Server mybackend/myserver1 is going DOWN for maintenance (DNS refused status). 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT]    (8) : backend 'mybackend' has no server available!
[WARNING]  (8) : mybackend/myserver1: IP changed from '(none)' to '127.0.0.1' by 'myresolver/ns1'.
[WARNING]  (8) : Server mybackend/myserver1 ('myhost') is UP/READY (resolves again).
[WARNING]  (8) : Server mybackend/myserver1 administratively READY thanks to valid DNS answer.
[WARNING]  (8) : Server mybackend/myserver1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.

@boi4 also mentioned that this used to work fine before.

Willy suggested that this regression may have been introduced by 64c9c8e
("BUG/MINOR: server/dns: use server_set_inetaddr() to unset srv addr from DNS")

Turns out he was right! Indeed, in 64c9c8e we systematically memset the
whole server_inetaddr struct (which contains both the requested server's
addr and port planned for atomic update) instead of only memsetting the
addr part of the structure: except when SRV records are involved (SRV
records provide both the address and the port unlike A or AAAA records),
we must not reset the server's port upon DNS errors because the port may
have been provided at config time and we don't want to lose its value.

Big thanks to @boi4 for his well-documented issue that really helped us to
pinpoint the bug right on time for the dev-13 release.

No backport needed (unless 64c9c8e gets backported).
2024-05-24 15:29:48 +02:00
Amaury Denoyelle
98ed11b0c5 BUG/MINOR: rhttp: initialize session origin after preconnect reversal
Since the following commit, session is initialized early for rhttp
preconnect.

  12c40c25a9
  MEDIUM: rhttp: create session for active preconnect

Session origin member was not set. However, this prevents several
session fetches to not work as expected. Worst, this caused a regression
as previously session was created after reversal with origin member
defined. This was reported by user William Manley on the mailing-list
which rely on set-dst.

One possible fix would be to set origin on session_new(). However, as
this is done before reversal, some session members may be incorrectly
initialized, in particular source and destination address.

Thus, session origin is only set after reversal is completed. This
ensures that session fetches have the same behavior on standard
connections and reversable ones.

This does not need to be backported.
2024-05-24 14:47:21 +02:00
Amaury Denoyelle
47168e217a MEDIUM: connection: use pool-conn-name instead of sni on reuse
Implement pool-conn-name support for idle connection reuse. It replaces
SNI as arbitrary identifier for connections in the idle pool. Thus,
every SNI reference in this context have been replaced.

Main change occurs in connect_server() where pool-conn-name sample fetch
is now prehash to generate idle connection identifier. SNI is now solely
used in the context of SSL for ssl_sock_set_servername().
2024-05-24 14:47:21 +02:00
Amaury Denoyelle
be4f89f2b2 MINOR: server: define pool-conn-name keyword
Define a new server keyword pool-conn-name. The purpose of this keyword
will be to identify connections inside the idle connections pool,
replacing SNI in case SSL is not wanted.

This keyword uses a sample expression argument. It thus can reuse
existing function parse_srv_expr() for parsing. In the future, it may be
necessary to define a keyword variant which uses a logformat for
extensability.

This patch only implement parsing. Argument is stored inside new server
field <pool_conn_name> and expression is generated in
_srv_parse_finalize() into <pool_conn_name_expr>.

If pool-conn-name is not set but SNI is, the latter is reused
automatically as pool-conn-name via _srv_parse_finalize(). This ensures
current reuse behavior remains compatible and idle connection reuse will
not mix connections with different SNIs by mistake.

Main usage will be for rhttp when SSL is not wanted between the two
haproxy instances. Previously, it was possible to use "sni" keyword even
without SSL on a server line which have a similar effect. However,
having a dedicated "pool-conn-name" keyword is deemed clearer. Besides,
it would allow for more complex configuration where pool-conn-name and
SNI are use in parallel with different values.
2024-05-24 14:36:31 +02:00
Amaury Denoyelle
91001422b4 MINOR: server: generalize sni expr parsing
Two functions exists for server sni sample expression parsing. This is
confusing so this commit aims at clarifying this.

Functions are renamed with the following identifiers. First function is
named parse_srv_expr() and can be used during parsing. Besides
expression parsing, it has ensure sample fetch validity in the context
of a server line.

Second function is renamed _parse_srv_expr() and is used internally by
parse_srv_expr(). It only implements sample parsing without extra
checks. It is already use for server instantiation derived from
server-template as checks were already performed. Also, it is now used
in http-client code as SNI is a fixed string.

Finally, both functions are generalized to remove any reference to SNI.
This will allow to reuse it to parse other server keywords which use an
expression. This will be the case for the future keyword pool-conn-name.
2024-05-24 14:36:31 +02:00
Amaury Denoyelle
b9f67a46a2 MINOR: quic: clarify doc for quic_recv()
Just highlight the fact that quic_recv() only receive a single datagram.
2024-05-24 14:36:31 +02:00
Amaury Denoyelle
5764bc50b5 BUG/MINOR: quic: adjust restriction for stateless reset emission
Review RFC 9000 and ensure restriction on Stateless reset are properly
enforced. After careful examination, several changes are introduced.

First, redefine minimal Stateless Reset emitted packet length to 21
bytes (5 random bytes + a token). This is the new default length used in
every case, unless received packet which triggered it is 43 bytes or
smaller.

Ensure every Stateless Reset packets emitted are at 1 byte shorter than
the received packet which triggered it. No Stateless reset will be
emitted if this falls under the above limit of 21 bytes. Thus this
should prevent looping issues.

This should be backported up to 2.6.
2024-05-24 14:36:31 +02:00
Amaury Denoyelle
f55748a422 MAJOR: config: prevent QUIC with clients privileged port by default
Previous commit introduce new protection mechanism to forbid
communications with clients which use a privileged source port. By
default, this mechanism is disabled for every protocols.

This patch changes the default value and activate the protection
mechanism for QUIC protocol. This is justified as it is a probable sign
of DNS/NTP amplification attack.

This is labelled as major as it can be a breaking change with some
network environments.
2024-05-24 14:36:31 +02:00
Amaury Denoyelle
45f40bac4c MEDIUM: config: prevent communication with privileged ports
This commit introduces a new global setting named
harden.reject_privileged_ports.{tcp|quic}. When active, communications
with clients which use privileged source ports are forbidden. Such
behavior is considered suspicious as it can be used as spoofing or
DNS/NTP amplication attack.

Value is configured per transport protocol. For each TCP and QUIC
distinct code locations are impacted by this setting. The first one is
in sock_accept_conn() which acts as a filter for all TCP based
communications just after accept() returns a new connection. The second
one is dedicated for QUIC communication in quic_recv(). In both cases,
if a privileged source port is used and setting is disabled, received
message is silently dropped.

By default, protection are disabled for both protocols. This is to be
able to backport it without breaking changes on stable release.

This should be backported as it is an interesting security feature yet
relatively simple to implement.
2024-05-24 14:36:31 +02:00
Amaury Denoyelle
4e632545f7 BUILD: trace: fix warning on null dereference
Since a recent change on trace, the following compilation warning may
occur :
  src/trace.c: In function ‘trace_parse_cmd’:
  src/trace.c:865:33: error: potential null pointer dereference [-Werror=null-dereference]
    865 |                         for (nd = src->decoding; nd->name && nd->desc; nd++)
        |                              ~~~^~~~~~~~~~~~~~~

Fix this by rearranging code path to better highlight that only "quiet"
verbosity is allowed if no trace source is specified.

This was detected with GCC 14.1.
2024-05-24 14:36:03 +02:00
Willy Tarreau
77c228f04f DOC: config: fix aes_gcm_enc() description text
As reported by Nick Ramirez, it was written "decrypts" instead of
"encrypts". No backport needed.
2024-05-24 12:09:25 +02:00
Aurelien DARRAGON
c9af6d5414 DEBUG: pollers/fd: add thread id suffix to per-thread memory areas name hints
Willy reported that since abb8412d2 ("DEBUG: pollers: add name hint for
large memory areas used by pollers") and 22ec2ad8b ("DEBUG: fd: add name
hint for large memory areas") multiple maps with the same name could be
found in /proc/<pid>/maps when haproxy process is started with multiple
threads, which can be annoying.

In fact this happens because some poller and fd-created memory areas are
being created for each available thread, and since the naming was done
using vma_set_name() with the same <type> and <name> inputs, the resulting
name was the same for all threads.

Thanks to the previous commit, we now use vma_set_name_id() for naming
per-thread memory areas so that "-id" prefix is appended after the name
name, where "id" equals to 'tid+1' (to match the thread numbering logic
found in config file or in ha_panic() report),  allowing to easily
identify which haproxy thread owns the map in /proc/<pid>/maps:

7d3b26200000-7d3b26a01000 rw-p 00000000 00:00 0                          [anon:ev_poll:poll_events-2]
7d3b26c00000-7d3b27001000 rw-p 00000000 00:00 0                          [anon:fd:fd_updt-2]
7d3b27200000-7d3b27a01000 rw-p 00000000 00:00 0                          [anon:ev_poll:poll_events-1]
7d3b34200000-7d3b34601000 rw-p 00000000 00:00 0                          [anon:fd:fd_updt-1]
2024-05-24 12:07:18 +02:00
Aurelien DARRAGON
9d37c4b989 DEBUG: tools: add vma_set_name_id() helper
Just like vma_set_name() from 51a8f134e ("DEBUG: tools: add vma_set_name()
helper"), but also takes <id> as parameter to append "-$id" suffix after
the name in order to differentiate 2 areas that were named using the same
<type> and <name> combination.

example, using mmap + MAP_SHARED|MAP_ANONYMOUS:
  7364c4fff000-736508000000 rw-s 00000000 00:01 3540  [anon_shmem:type:name-id]
Another example, using mmap + MAP_PRIVATE|MAP_ANONYMOUS or using
glibc/malloc() above MMAP_THRESHOLD:
  7364c4fff000-736508000000 rw-s 00000000 00:01 3540  [anon:type:name-id]
2024-05-24 12:07:13 +02:00
Aurelien DARRAGON
23814a44e5 CLEANUP: tools: fix vma_set_name() function comment
There was a typo in the example provided in vma_set_name(): maps named
using the function will show up as "type:name", not "type.name", updating
the comment to reflect the current behavior.
2024-05-24 12:07:07 +02:00
Willy Tarreau
0bda33a3ec MINOR: stick-tables: remove the uneeded read lock in stksess_free()
During changes made in 2.7 by commits 8d3c3336f9 ("MEDIUM: stick-table:
make stksess_kill_if_expired() avoid the exclusive lock") and 996f1a5124
("MEDIUM: stick-table: do not take a lock to update t->current anymore."),
the operation was done cautiously one baby step at a time and the final
cleanup was not done, as we're keeping a read lock under an atomic dec.
Furthermore there's a pool_free() call under that lock, and we try to
avoid pool_alloc() and pool_free() under locks for their nasty side
effects (e.g. when memory gets recompacted), so let's really drop it
now.

Note that the performance gain is not really perceptible here, it's
essentially for code clarity reasons that this has to be done.
2024-05-24 11:52:57 +02:00
Willy Tarreau
8580f9db20 CLEANUP: stick-tables: remove a few unneeded tests for use_wrlock
Due to the code in stktable_touch_with_exp() being the same as in other
functions previously made around a loop trying first to upgrade a read
lock then to fall back to a direct write lock, there remains a confusing
construct with multiple tests on use_wrlock that is obviously zero when
tested. Let's remove them since the value is known and the loop does not
exist anymore.
2024-05-24 11:52:19 +02:00
Willy Tarreau
77f286e8bc BUG/MEDIUM: stick-tables: make sure never to create two same remote entries
In GH issue #2552, Christian Ruppert reported an increase in crashes
with recent 3.0-dev versions, always related with stick-tables and peers.
One particularity of his config is that it has a lot of peers.

While trying to reproduce, it empirically was found that firing 10 load
generators at 10 different haproxy instances tracking a random key among
100k against a table of max 5k entries, on 8 threads and between a total
of 50 parallel peers managed to reproduce the crashes in seconds, very
often in ebtree deletion or insertion code, but not only.

The debugging revealed that the crashes are often caused by a parent node
being corrupted while delete/insert tries to update it regarding a recently
inserted/removed node, and that that corrupted node had always been proven
to be deleted, then immediately freed, so it ought not be visited in the
tree from functions enclosed between a pair of lock/unlock. As such the
only possibility was that it had experienced unexpected inserts. Also,
running with pool integrity checking would 90% of the time cause crashes
during allocation based on corrupted contents in the node, likely because
it was found at two places in the same tree and still present as a parent
of a node being deleted or inserted (hence the __stksess_free and
stktable_trash_oldest callers being visible on these items).

Indeed the issue is in fact related to the test set (occasionally redundant
keys, many peers). What happens is that sometimes, a same key is learned
from two different peers. When it is learned for the first time, we end up
in stktable_touch_with_exp() in the "else" branch, where the test for
existence is made before taking the lock (since commit cfeca3a3a3
("MEDIUM: stick-table: touch updates under an upgradable read lock") that
was merged in 2.9), and from there the entry is added. But is one of the
threads manages to insert it before the other thread takes the lock, then
the second thread will try to insert this node again. And inserting an
already inserted node will corrupt the tree (note that we never switched
to enforcing a check in insertion code on this due to API history that
would break various code parts).

Here the solution is simple, it requires to recheck leaf_p after getting
the lock, to avoid touching anything if the entry has already been
inserted in the mean time.

Many thanks to Christian Ruppert for testing this and for his invaluable
help on this hard-to-trigger issue.

This fix needs to be backported to 2.9.
2024-05-24 11:52:11 +02:00
Christopher Faulet
9938fb9c7a BUG/MEDIUM: stick-tables: Fix race with peers when killing a sticky session
When a sticky session is killed, we must be sure no other entity is still
referencing it. The session's ref_cnt must be 0. However, there is a race
with peers, as decribed in 21447b1dd4 ("BUG/MAJOR: stick-tables: fix race
with peers in entry expiration"). When the update lock is acquire, we must
recheck the ref_cnt value.

This patch is part of a debugging session about issue #2552. It must be
backported to 2.9.
2024-05-24 11:52:11 +02:00
Christopher Faulet
dfd938bad6 BUG/MEDIUM: stick-tables: Fix race with peers when trashing oldest entries
It is the same that the one fixed in process_table_expire() (21447b1dd4
["BUG/MAJOR: stick-tables: fix race with peers in entry expiration"]). In
stktable_trash_oldest(), when the update lock is acquired, we must take care
to check again the ref_cnt because some peers may increment it (See commit
above for details).

This patch fixes a crash mentionned in 2552#issuecomment-2110532706. It must
be backported to 2.9.
2024-05-24 11:52:11 +02:00
Willy Tarreau
51f9f6cfd4 BUILD: quic: fix unused variable warning when threads are disabled
The tree variable was introduced in 3.0 by commit dd58dff1e6
("BUG/MEDIUM: quic: QUIC CID removed from tree without locking") which
was marked for backport. The variable is only used for locks.
Let's just mark the variable __maybe_unused for when the code is
built without threads.

The patch above was marked for backport to 2.7 so this should be
backported wherever the fix was backported.
2024-05-24 11:51:41 +02:00