haproxy

Commit Graph

Author	SHA1	Message	Date
Willy Tarreau	35e9826c13	BUILD: makefile: yearly reordering of objects by build time Some large files have been split since 2.9 (e.g. stats) and build times have moved and become less smooth, causing a less even parallel build. As usual, a small reordering cleans all this up. The effect was less visible than previous years though.	2024-05-27 19:14:14 +02:00
Aurelien DARRAGON	141bc5ba0d	DOC: config: document logformat item naming and typecasting features The ability to give a name to a logformat_node (known as logformat item in the documentation) implemented in `2ed6068f2a` ("MINOR: log: custom name for logformat node") wasn't documented. The same goes for the ability to force the logformat_node's output type to a specific type implemented in `1448478d62` ("MINOR: log: explicit typecasting for logformat nodes") Let's quickly describe such new usages at the start of the custom log format section.	2024-05-27 17:04:16 +02:00
Aurelien DARRAGON	435a9da267	MINOR: log: rename 'log-format tag' to 'log-format alias' In 2.9 we started to introduce an ambiguity in the documentation by referring to historical log-format variables ('%var') as log-format tags in `739c4e5b1e` ("MINOR: sample: accept_date / request_date return %Ts / %tr timestamp values") and `454c372b60` ("DOC: configuration: add sample fetches for timing events"). In fact, we've had this confusion between log-format tag and log-format var for more than 10 years now, but in 2.9 it was the first time the confusion was exposed in the documentation. Indeed, both 'log-format variable' and 'log-format tag' actually refer to the same feature (that is: '%B' and friends that can be used for direct access to some log-oriented predefined fetches instead of using %[expr] with generic sample expressions). This feature was first implemented in `723b73ad75` ("MINOR: config: Parse the string of the log-format config keyword") and later documented in `4894040fa` ("DOC: log-format documentation"). At that time, it was clear that we used to name it 'log-format variable'. But later the same year, 'log-format tag' naming started to appear in some commit messages (while still referring to the same feature), for instance with `ffc3fcd6d` ("MEDIUM: log: report SSL ciphers and version in logs using logformat %sslc/%sslv"). Unfortunately in 2.9 when we added (and documented) new log-format variables we officially started drifting to the misleading 'log-format tag' naming (perhaps because it was the most recent naming found for this feature in git log history, or because the confusion has always been there) Even worse, in 3.0 this confusion led us to rename all 'var' occurrences to 'tag' in log-format related code to unify the code with the doc. 
Fortunately, William quickly noticed that we made a mistake there, but instead of reverting to historical naming (log-format variable), it was decided that we must use a different name that is less confusing than 'tags' or 'variables' (tags and variables are keywords that are already used to designate other features in the code and that are not very explicit under log-format context today). Now we refer to '%B' and friends as a logformat alias, which is essentially a handy way to print some log oriented information in the log string instead of leveraging '%[expr]' with generic sample expressions made of fetches and converters. Of course, there are some subtleties, such as a few log-format aliases that still don't have a sample fetch equivalent for historical reasons, and some aliases that may be a little faster than their generic sample expression equivalents because most aliases are pretty much hardcoded in the log building function. But in general logformat aliases should be simply considered as an alternative to using expressions (with '%[expr]'). Also, under log-format context, when we want to refer to either an alias ('%alias') or an expression ('%[expr]'), we should use the generic term 'logformat item', which in fact designates a single item within the logformat string provided by the user. Indeed, a logformat item (whether it is an alias or an expression) always starts with '%' and may accept optional flags / arguments. Both the code and the documentation were updated in that sense, hopefully this will clarify things and prevent future confusions.	2024-05-27 17:03:48 +02:00
Willy Tarreau	7e943cdf27	CI: scripts: build vtest using multiple CPUs Now that vtest supports make -j, let's use it to save a bit of time (the build time is ~6s per test by default).	2024-05-27 12:15:50 +02:00
Willy Tarreau	01843c47a1	CI: scripts: fix build of vtest regarding option -C On Linux, GNU make emits "w" at the beginning of the MAKEFLAGS variable if -C is passed, which happens since vtest d6d228bcb3. In fact it emits any of the command line flags without the leading '-' in this case. gmake doesn't do that on BSD apparently. It's documented under Options/Recursion in the GNU make doc. There's also MFLAGS that could work but it does not contain the variables definitions. So let's just avoid the -C that we don't really need. This needs to be backported to stable versions.	2024-05-27 12:15:50 +02:00
William Lallemand	0a00302fab	MINOR: sample: implement the uptime sample fetch 'uptime' returns the uptime of the current HAProxy worker in seconds.	2024-05-27 11:06:40 +02:00
Willy Tarreau	f76e73511a	[RELEASE] Released version 3.0-dev13 Released version 3.0-dev13 with the following main changes : - CLEANUP: ssl/cli: remove unused code in dump_crtlist_conf - MINOR: ssl: check parameter in ckch_conf_cmp() - BUG/MINOR: ring: free ring's allocated area not ring's usable area when using maps - DOC: configuration: rework the crt-store load documentation - DEBUG: tools: add vma_set_name() helper - DEBUG: shctx: name shared memory using vma_set_name() - DEBUG: sink: add name hint for memory area used by memory-backed sinks - DEBUG: pollers: add name hint for large memory areas used by pollers - DEBUG: errors: add name hint for startup-logs memory area - DEBUG: fd: add name hint for large memory areas - MEDIUM: ssl: don't load file by discovering them in crt-store - DOC: configuration: update the crt-list documentation - DOC: configuration: add the supported crt-store options in crt-list - BUG/MEDIUM: proto: fix fd leak in <proto>_connect_server - MINOR: sock: set conn->err_code in case of EPERM - BUG/MINOR: http-ana: Don't crush stream termination condition on internal error - MAJOR: spoe: Let the SPOE back into the game - BUG/MINOR: connection: parse PROXY TLV for LOCAL mode - BUG/MINOR: server: free PROXY v2 TLVs on srv drop - MINOR: rhttp: add log on connection allocation failure - BUG/MEDIUM: rhttp: fix preconnect on single-thread - BUG/MINOR: rhttp: prevent listener suspend - BUG/MINOR: rhttp: fix task_wakeup state - MINOR: session: define flag to explicitely release listener on free - MEDIUM: rhttp: create session for active preconnect - MINOR: rhttp: support PROXY emission on preconnect - MINOR: connection: support PROXY v2 TLV emission without stream - MINOR: traces: enumerate the list of levels/verbosities when not found - BUG/MINOR: sock: fix sock_create_server_socket - MINOR: proto: fix coding style - BUG/MAJOR: quic: Crash with TLS_AES_128_CCM_SHA256 (libressl only) - REGTESTS: scripts: allow to change the vtest timeout - 
BUG/MEDIUM: quic_tls: prevent LibreSSL < 4.0 from negotiating CHACHA20_POLY1305 - CI: scripts/build-ssl.sh: loudly fail on unsupported platforms - BUG/MEDIUM: mux-quic: Create sedesc in same time of the QUIC stream - MINOR: mux-quic: Set abort info for SC-less QCS on STOP_SENDING frame - CI: scripts/build-ssl: add a DESTDIR and TMPDIR variable - CI: scripts/buil-ssl: cleanup the boringssl and quictls build - MINOR: config: add thread-hard-limit to set an upper bound to nbthread - BUILD: quic: fix unused variable warning when threads are disabled - BUG/MEDIUM: stick-tables: Fix race with peers when trashing oldest entries - BUG/MEDIUM: stick-tables: Fix race with peers when killing a sticky session - BUG/MEDIUM: stick-tables: make sure never to create two same remote entries - CLEANUP: stick-tables: remove a few unneeded tests for use_wrlock - MINOR: stick-tables: remove the uneeded read lock in stksess_free() - CLEANUP: tools: fix vma_set_name() function comment - DEBUG: tools: add vma_set_name_id() helper - DEBUG: pollers/fd: add thread id suffix to per-thread memory areas name hints - DOC: config: fix aes_gcm_enc() description text - BUILD: trace: fix warning on null dereference - MEDIUM: config: prevent communication with privileged ports - MAJOR: config: prevent QUIC with clients privileged port by default - BUG/MINOR: quic: adjust restriction for stateless reset emission - MINOR: quic: clarify doc for quic_recv() - MINOR: server: generalize sni expr parsing - MINOR: server: define pool-conn-name keyword - MEDIUM: connection: use pool-conn-name instead of sni on reuse - BUG/MINOR: rhttp: initialize session origin after preconnect reversal - BUG/MEDIUM: server/dns: preserve server's port upon resolution timeout or error - BUG/MINOR: http-htx: Support default path during scheme based normalization - BUG/MINOR: server: Don't reset resolver options on a new default-server line - DOC: quic: specify that connection migration is not supported - DOC: config: fix 
incorrect section reference about custom log format - DOC: config: uniformize the naming and description of custom log format args - DOC: config: clarify the fact that custom log format is not just for logging - REGTESTS: acl_cli_spaces: avoid a warning caused by undefined logs	2024-05-24 17:57:29 +02:00
Willy Tarreau	45a187304e	REGTESTS: acl_cli_spaces: avoid a warning caused by undefined logs There's a warning being reported in this reg test in the detailed startup logs because of "log global" and "option httplog" while there's no global section hence no logger. Let's just drop both options since they're not relevant to this test.	2024-05-24 17:50:19 +02:00
Willy Tarreau	0af9bfcbc5	DOC: config: clarify the fact that custom log format is not just for logging The wording in the Custom log format section was still extremely centered on logging, but it's about time to mention that these are usable for other actions as well, otherwise it's very confusing for newcomers who try to define a variable or header. The updated text also reminds about the risks of safe encodings that may (rarely) mangle an output string, and encourages to migrate away from the unquoted definition which is full of backslashes. It would definitely deserve further improvements and refinements.	2024-05-24 17:32:59 +02:00
Willy Tarreau	c02cefce23	DOC: config: uniformize the naming and description of custom log format args A significant number of actions now take arguments that are evaluated as log-format expressions. Some of them are called "fmt", others "string". The description of the argument sometimes just says "the log-format string" or "log format" or "custom log format" etc. Most of them do not mention the section to visit, and section 8.2 speaking about log-format is very centric on logs usage (the primary use case), making all of this very confusing for newcomers. Since section 8.2.6 is titled "Custom log format" and describes the syntax to be used with the "log-format" (and other) directives, let's call this "Custom log format" everywhere and mention section 8.2.6. When the field was called "string", it was also renamed to "fmt". It doesn't seem worth backporting this, unless it applies fine.	2024-05-24 17:32:59 +02:00
Willy Tarreau	474cbcf842	DOC: config: fix incorrect section reference about custom log format Since 2.5 with commit `98b930d043` ("MINOR: ssl: Define a default https log format"), some log-format sections were shifted a bit without having been renumbered, causing 8.2.4 to be referenced as the custom log format while it's in fact 8.2.6. This patch fixes the affected locations. In addition two places mentioned 8.2.6 instead of 8.2.5 for the error log format. This can be backported to 2.6.	2024-05-24 17:32:59 +02:00
Amaury Denoyelle	59b69aafae	DOC: quic: specify that connection migration is not supported Currently haproxy does not support QUIC connection migration. This is advertised to clients on their connections. Document this in the first QUIC related paragraph. This should be backported up to 2.6.	2024-05-24 17:32:37 +02:00
Christopher Faulet	0d7c1bc6ab	BUG/MINOR: server: Don't reset resolver options on a new default-server line When a new "default-server" line is parsed, some resolver options are reset. Thus previously defined default options cannot be inherited. There is no reason to do so. First because other server options are inherited. And then because not all resolver options are reset. It is not consistent. This patch should fix issue #2559. It should be backported to all stable versions.	2024-05-24 16:31:01 +02:00
Christopher Faulet	8d2514e087	BUG/MINOR: http-htx: Support default path during scheme based normalization As stated in RFC3986, for an absolute-form URI, an empty path should be normalized to a path of "/". This is part of scheme based normalization rules. This kind of normalization is already performed for default ports. So we might as well deal with the case of empty path. The associated reg-tests was updated accordingly. This patch should fix the issue #2573. It may be backported as far as 2.4 if necessary.	2024-05-24 16:17:24 +02:00
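The empty-path rule described in the entry above can be illustrated with a minimal Python sketch. This is not HAProxy's C implementation: the helper name `normalize_empty_path` is hypothetical, it relies only on the standard `urllib.parse` module, and it deliberately leaves out the default-port normalization that the commit message says is already performed.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_empty_path(uri: str) -> str:
    """Replace an empty path with "/" for absolute-form URIs.

    Hypothetical helper for illustration only; default-port
    normalization is intentionally not handled here.
    """
    parts = urlsplit(uri)
    # Absolute-form means both a scheme and an authority are present.
    if parts.scheme and parts.netloc and parts.path == "":
        parts = parts._replace(path="/")
    return urlunsplit(parts)


print(normalize_empty_path("http://example.com"))
print(normalize_empty_path("http://example.com?x=1"))
print(normalize_empty_path("/already/relative"))
```

Origin-form targets such as `/already/relative` are returned unchanged, since the rule only applies when a scheme and an authority are present.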
Aurelien DARRAGON	c16eba8183	BUG/MEDIUM: server/dns: preserve server's port upon resolution timeout or error @boi4 reported in GH #2578 that since 3.0-dev1 for servers with address learned from A/AAAA records after a DNS flap server would be put out of maintenance with proper address but with invalid port (== 0), making it unusable and causing tcp checks to fail: [NOTICE] (1) : Loading success. [WARNING] (8) : Server mybackend/myserver1 is going DOWN for maintenance (DNS refused status). 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue. [ALERT] (8) : backend 'mybackend' has no server available! [WARNING] (8) : mybackend/myserver1: IP changed from '(none)' to '127.0.0.1' by 'myresolver/ns1'. [WARNING] (8) : Server mybackend/myserver1 ('myhost') is UP/READY (resolves again). [WARNING] (8) : Server mybackend/myserver1 administratively READY thanks to valid DNS answer. [WARNING] (8) : Server mybackend/myserver1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue. @boi4 also mentioned that this used to work fine before. Willy suggested that this regression may have been introduced by `64c9c8e` ("BUG/MINOR: server/dns: use server_set_inetaddr() to unset srv addr from DNS") Turns out he was right! Indeed, in `64c9c8e` we systematically memset the whole server_inetaddr struct (which contains both the requested server's addr and port planned for atomic update) instead of only memsetting the addr part of the structure: except when SRV records are involved (SRV records provide both the address and the port unlike A or AAAA records), we must not reset the server's port upon DNS errors because the port may have been provided at config time and we don't want to lose its value. Big thanks to @boi4 for his well-documented issue that really helped us to pinpoint the bug right on time for the dev-13 release. 
No backport needed (unless `64c9c8e` gets backported).	2024-05-24 15:29:48 +02:00
Amaury Denoyelle	98ed11b0c5	BUG/MINOR: rhttp: initialize session origin after preconnect reversal Since the following commit, session is initialized early for rhttp preconnect. `12c40c25a9` MEDIUM: rhttp: create session for active preconnect Session origin member was not set, which prevents several session fetches from working as expected. Worse, this caused a regression as previously session was created after reversal with origin member defined. This was reported on the mailing-list by user William Manley, who relies on set-dst. One possible fix would be to set origin on session_new(). However, as this is done before reversal, some session members may be incorrectly initialized, in particular source and destination address. Thus, session origin is only set after reversal is completed. This ensures that session fetches have the same behavior on standard connections and reversible ones. This does not need to be backported.	2024-05-24 14:47:21 +02:00
Amaury Denoyelle	47168e217a	MEDIUM: connection: use pool-conn-name instead of sni on reuse Implement pool-conn-name support for idle connection reuse. It replaces SNI as arbitrary identifier for connections in the idle pool. Thus, every SNI reference in this context has been replaced. Main change occurs in connect_server() where pool-conn-name sample fetch is now prehashed to generate idle connection identifier. SNI is now solely used in the context of SSL for ssl_sock_set_servername().	2024-05-24 14:47:21 +02:00
Amaury Denoyelle	be4f89f2b2	MINOR: server: define pool-conn-name keyword Define a new server keyword pool-conn-name. The purpose of this keyword will be to identify connections inside the idle connections pool, replacing SNI in case SSL is not wanted. This keyword uses a sample expression argument. It thus can reuse existing function parse_srv_expr() for parsing. In the future, it may be necessary to define a keyword variant which uses a logformat for extensibility. This patch only implements parsing. Argument is stored inside new server field <pool_conn_name> and expression is generated in _srv_parse_finalize() into <pool_conn_name_expr>. If pool-conn-name is not set but SNI is, the latter is reused automatically as pool-conn-name via _srv_parse_finalize(). This ensures current reuse behavior remains compatible and idle connection reuse will not mix connections with different SNIs by mistake. Main usage will be for rhttp when SSL is not wanted between the two haproxy instances. Previously, it was possible to use "sni" keyword even without SSL on a server line, which has a similar effect. However, having a dedicated "pool-conn-name" keyword is deemed clearer. Besides, it would allow for more complex configuration where pool-conn-name and SNI are used in parallel with different values.	2024-05-24 14:36:31 +02:00
Amaury Denoyelle	91001422b4	MINOR: server: generalize sni expr parsing Two functions exist for server sni sample expression parsing. This is confusing so this commit aims at clarifying this. Functions are renamed with the following identifiers. First function is named parse_srv_expr() and can be used during parsing. Besides expression parsing, it ensures sample fetch validity in the context of a server line. Second function is renamed _parse_srv_expr() and is used internally by parse_srv_expr(). It only implements sample parsing without extra checks. It is already used for server instantiation derived from server-template as checks were already performed. Also, it is now used in http-client code as SNI is a fixed string. Finally, both functions are generalized to remove any reference to SNI. This will allow to reuse it to parse other server keywords which use an expression. This will be the case for the future keyword pool-conn-name.	2024-05-24 14:36:31 +02:00
Amaury Denoyelle	b9f67a46a2	MINOR: quic: clarify doc for quic_recv() Just highlight the fact that quic_recv() only receives a single datagram.	2024-05-24 14:36:31 +02:00
Amaury Denoyelle	5764bc50b5	BUG/MINOR: quic: adjust restriction for stateless reset emission Review RFC 9000 and ensure restrictions on Stateless Reset are properly enforced. After careful examination, several changes are introduced. First, redefine minimal Stateless Reset emitted packet length to 21 bytes (5 random bytes + a token). This is the new default length used in every case, unless received packet which triggered it is 43 bytes or smaller. Ensure every Stateless Reset packet emitted is at least 1 byte shorter than the received packet which triggered it. No Stateless Reset will be emitted if this falls under the above limit of 21 bytes. Thus this should prevent looping issues. This should be backported up to 2.6.	2024-05-24 14:36:31 +02:00
Amaury Denoyelle	f55748a422	MAJOR: config: prevent QUIC with clients privileged port by default Previous commit introduced a new protection mechanism to forbid communications with clients which use a privileged source port. By default, this mechanism is disabled for every protocol. This patch changes the default value and activates the protection mechanism for the QUIC protocol. This is justified as it is a probable sign of DNS/NTP amplification attack. This is labelled as major as it can be a breaking change with some network environments.	2024-05-24 14:36:31 +02:00
Amaury Denoyelle	45f40bac4c	MEDIUM: config: prevent communication with privileged ports This commit introduces a new global setting named harden.reject_privileged_ports.{tcp|quic}. When active, communications with clients which use privileged source ports are forbidden. Such behavior is considered suspicious as it can be used for spoofing or DNS/NTP amplification attacks. Value is configured per transport protocol. For TCP and QUIC, distinct code locations are impacted by this setting. The first one is in sock_accept_conn() which acts as a filter for all TCP based communications just after accept() returns a new connection. The second one is dedicated to QUIC communication in quic_recv(). In both cases, if a privileged source port is used and the setting is enabled, the received message is silently dropped. By default, protection is disabled for both protocols. This is to be able to backport it without breaking changes on stable release. This should be backported as it is an interesting security feature yet relatively simple to implement.	2024-05-24 14:36:31 +02:00
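The filtering decision described in the entry above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `should_drop` is hypothetical, and the 1024 boundary is the conventional definition of a privileged port, which the commit message itself does not spell out.

```python
# Assumption: "privileged" follows the usual convention of ports below 1024.
PRIVILEGED_PORT_LIMIT = 1024


def should_drop(src_port: int, reject_privileged: bool) -> bool:
    """Return True when traffic from src_port should be silently dropped.

    reject_privileged stands in for the per-protocol setting; when it is
    False the source port is never a reason to drop.
    """
    return reject_privileged and 0 <= src_port < PRIVILEGED_PORT_LIMIT


# A client sending from source port 53 with the protection active.
print(should_drop(53, reject_privileged=True))
# The same client when the protection is left off.
print(should_drop(53, reject_privileged=False))
```

Keeping the check as a pure function of the source port and the setting mirrors the idea of one switch per transport protocol, applied right after a connection or datagram is received.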
Amaury Denoyelle	4e632545f7	BUILD: trace: fix warning on null dereference Since a recent change on trace, the following compilation warning may occur : src/trace.c: In function ‘trace_parse_cmd’: src/trace.c:865:33: error: potential null pointer dereference [-Werror=null-dereference] 865 | for (nd = src->decoding; nd->name && nd->desc; nd++) | ~~~^~~~~~~~~~~~~~~ Fix this by rearranging code path to better highlight that only "quiet" verbosity is allowed if no trace source is specified. This was detected with GCC 14.1.	2024-05-24 14:36:03 +02:00
Willy Tarreau	77c228f04f	DOC: config: fix aes_gcm_enc() description text As reported by Nick Ramirez, it was written "decrypts" instead of "encrypts". No backport needed.	2024-05-24 12:09:25 +02:00
Aurelien DARRAGON	c9af6d5414	DEBUG: pollers/fd: add thread id suffix to per-thread memory areas name hints Willy reported that since `abb8412d2` ("DEBUG: pollers: add name hint for large memory areas used by pollers") and `22ec2ad8b` ("DEBUG: fd: add name hint for large memory areas") multiple maps with the same name could be found in /proc/<pid>/maps when haproxy process is started with multiple threads, which can be annoying. In fact this happens because some poller and fd-created memory areas are being created for each available thread, and since the naming was done using vma_set_name() with the same <type> and <name> inputs, the resulting name was the same for all threads. Thanks to the previous commit, we now use vma_set_name_id() for naming per-thread memory areas so that a "-id" suffix is appended after the name, where "id" equals 'tid+1' (to match the thread numbering logic found in config file or in ha_panic() report), allowing to easily identify which haproxy thread owns the map in /proc/<pid>/maps: 7d3b26200000-7d3b26a01000 rw-p 00000000 00:00 0 [anon:ev_poll:poll_events-2] 7d3b26c00000-7d3b27001000 rw-p 00000000 00:00 0 [anon:fd:fd_updt-2] 7d3b27200000-7d3b27a01000 rw-p 00000000 00:00 0 [anon:ev_poll:poll_events-1] 7d3b34200000-7d3b34601000 rw-p 00000000 00:00 0 [anon:fd:fd_updt-1]	2024-05-24 12:07:18 +02:00
Aurelien DARRAGON	9d37c4b989	DEBUG: tools: add vma_set_name_id() helper Just like vma_set_name() from `51a8f134e` ("DEBUG: tools: add vma_set_name() helper"), but also takes <id> as parameter to append "-$id" suffix after the name in order to differentiate 2 areas that were named using the same <type> and <name> combination. example, using mmap + MAP_SHARED|MAP_ANONYMOUS: 7364c4fff000-736508000000 rw-s 00000000 00:01 3540 [anon_shmem:type:name-id] Another example, using mmap + MAP_PRIVATE|MAP_ANONYMOUS or using glibc/malloc() above MMAP_THRESHOLD: 7364c4fff000-736508000000 rw-s 00000000 00:01 3540 [anon:type:name-id]	2024-05-24 12:07:13 +02:00
Aurelien DARRAGON	23814a44e5	CLEANUP: tools: fix vma_set_name() function comment There was a typo in the example provided in vma_set_name(): maps named using the function will show up as "type:name", not "type.name", updating the comment to reflect the current behavior.	2024-05-24 12:07:07 +02:00
Willy Tarreau	0bda33a3ec	MINOR: stick-tables: remove the uneeded read lock in stksess_free() During changes made in 2.7 by commits `8d3c3336f9` ("MEDIUM: stick-table: make stksess_kill_if_expired() avoid the exclusive lock") and `996f1a5124` ("MEDIUM: stick-table: do not take a lock to update t->current anymore."), the operation was done cautiously one baby step at a time and the final cleanup was not done, as we're keeping a read lock under an atomic dec. Furthermore there's a pool_free() call under that lock, and we try to avoid pool_alloc() and pool_free() under locks for their nasty side effects (e.g. when memory gets recompacted), so let's really drop it now. Note that the performance gain is not really perceptible here, it's essentially for code clarity reasons that this has to be done.	2024-05-24 11:52:57 +02:00
Willy Tarreau	8580f9db20	CLEANUP: stick-tables: remove a few unneeded tests for use_wrlock Due to the code in stktable_touch_with_exp() being the same as in other functions previously made around a loop trying first to upgrade a read lock then to fall back to a direct write lock, there remains a confusing construct with multiple tests on use_wrlock that is obviously zero when tested. Let's remove them since the value is known and the loop does not exist anymore.	2024-05-24 11:52:19 +02:00
Willy Tarreau	77f286e8bc	BUG/MEDIUM: stick-tables: make sure never to create two same remote entries In GH issue #2552, Christian Ruppert reported an increase in crashes with recent 3.0-dev versions, always related with stick-tables and peers. One particularity of his config is that it has a lot of peers. While trying to reproduce, it empirically was found that firing 10 load generators at 10 different haproxy instances tracking a random key among 100k against a table of max 5k entries, on 8 threads and between a total of 50 parallel peers managed to reproduce the crashes in seconds, very often in ebtree deletion or insertion code, but not only. The debugging revealed that the crashes are often caused by a parent node being corrupted while delete/insert tries to update it regarding a recently inserted/removed node, and that that corrupted node had always been proven to be deleted, then immediately freed, so it ought not be visited in the tree from functions enclosed between a pair of lock/unlock. As such the only possibility was that it had experienced unexpected inserts. Also, running with pool integrity checking would 90% of the time cause crashes during allocation based on corrupted contents in the node, likely because it was found at two places in the same tree and still present as a parent of a node being deleted or inserted (hence the __stksess_free and stktable_trash_oldest callers being visible on these items). Indeed the issue is in fact related to the test set (occasionally redundant keys, many peers). What happens is that sometimes, a same key is learned from two different peers. When it is learned for the first time, we end up in stktable_touch_with_exp() in the "else" branch, where the test for existence is made before taking the lock (since commit `cfeca3a3a3` ("MEDIUM: stick-table: touch updates under an upgradable read lock") that was merged in 2.9), and from there the entry is added. 
But if one of the threads manages to insert it before the other thread takes the lock, then the second thread will try to insert this node again. And inserting an already inserted node will corrupt the tree (note that we never switched to enforcing a check in insertion code on this due to API history that would break various code parts). Here the solution is simple, it requires to recheck leaf_p after getting the lock, to avoid touching anything if the entry has already been inserted in the mean time. Many thanks to Christian Ruppert for testing this and for his invaluable help on this hard-to-trigger issue. This fix needs to be backported to 2.9.	2024-05-24 11:52:11 +02:00
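The fix described in the entry above follows a general "test, lock, test again" pattern. The Python sketch below illustrates that pattern only; it is not the stick-table code. The `Table` class and `insert_once` method are hypothetical names, a dict stands in for the ebtree, and membership in the dict stands in for the leaf_p check.

```python
import threading


class Table:
    """Toy keyed table; a dict stands in for the tree used by stick-tables."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}
        self.insert_count = 0  # number of real insertions performed

    def insert_once(self, key, value):
        """Insert key unless present; return True only for the inserting caller."""
        # Optimistic existence test made before taking the lock.
        if key in self._entries:
            return False
        with self._lock:
            # Recheck under the lock: another thread may have inserted the
            # same key between the test above and the lock acquisition.
            if key in self._entries:
                return False
            self._entries[key] = value
            self.insert_count += 1
            return True


table = Table()
workers = [threading.Thread(target=table.insert_once, args=("k", i)) for i in range(8)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(table.insert_count)
```

Without the second test inside the locked region, two callers that both passed the first test would each perform an insertion, which is the double-insert situation the commit message describes.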
Christopher Faulet	9938fb9c7a	BUG/MEDIUM: stick-tables: Fix race with peers when killing a sticky session When a sticky session is killed, we must be sure no other entity is still referencing it. The session's ref_cnt must be 0. However, there is a race with peers, as described in `21447b1dd4` ("BUG/MAJOR: stick-tables: fix race with peers in entry expiration"). When the update lock is acquired, we must recheck the ref_cnt value. This patch is part of a debugging session about issue #2552. It must be backported to 2.9.	2024-05-24 11:52:11 +02:00
Christopher Faulet	dfd938bad6	BUG/MEDIUM: stick-tables: Fix race with peers when trashing oldest entries It is the same race as the one fixed in process_table_expire() (`21447b1dd4` ["BUG/MAJOR: stick-tables: fix race with peers in entry expiration"]). In stktable_trash_oldest(), when the update lock is acquired, we must take care to check again the ref_cnt because some peers may increment it (See commit above for details). This patch fixes a crash mentioned in 2552#issuecomment-2110532706. It must be backported to 2.9.	2024-05-24 11:52:11 +02:00
Willy Tarreau	51f9f6cfd4	BUILD: quic: fix unused variable warning when threads are disabled The tree variable was introduced in 3.0 by commit `dd58dff1e6` ("BUG/MEDIUM: quic: QUIC CID removed from tree without locking") which was marked for backport. The variable is only used for locks. Let's just mark the variable __maybe_unused for when the code is built without threads. The patch above was marked for backport to 2.7 so this should be backported wherever the fix was backported.	2024-05-24 11:51:41 +02:00
Willy Tarreau	381ed2a4dd	MINOR: config: add thread-hard-limit to set an upper bound to nbthread On today's large systems, it's not always desired to run on all threads for light loads, and usually users enforce nbthread to a lower value (e.g. 8). The problem is that this is a fixed value, and moving such configs to smaller machines continues to enforce the value and this becomes extremely unproductive due to having more threads than CPUs. This also happens quite a bit in VMs, containers, or cloud instances of various sizes. This commit introduces the thread-hard-limit setting that allows to only set an upper bound to the number of threads without raising a lower value. This means that using "thread-hard-limit 8" will make sure that no more than 8 threads will be used when available, but it will remain two when run on a dual-core machine.	2024-05-24 09:46:49 +02:00
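The difference between a fixed thread count and an upper bound, as described in the entry above, comes down to taking a minimum. The Python sketch below is an illustration only; the function name `effective_threads` and its parameters are hypothetical, and any interaction with an explicit nbthread value is left out because the commit message does not describe it.

```python
from typing import Optional


def effective_threads(available_cpus: int, thread_hard_limit: Optional[int] = None) -> int:
    """Return the thread count under an optional upper bound.

    The limit only caps the value; it never raises it above what the
    machine offers.
    """
    if thread_hard_limit is None:
        return available_cpus
    return min(available_cpus, thread_hard_limit)


# A large machine is capped at the limit.
print(effective_threads(64, 8))
# A dual-core machine keeps its two threads.
print(effective_threads(2, 8))
```

The two calls correspond to the examples in the commit message: no more than 8 threads on a large system, and still two on a dual-core machine.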
William Lallemand	9c1fa3e411	CI: scripts/buil-ssl: cleanup the boringssl and quictls build Put the quictls and boringssl build in their own function instead of keeping it in the main part of the script.	2024-05-23 16:54:30 +02:00
William Lallemand	5d73643ca3	CI: scripts/build-ssl: add a DESTDIR and TMPDIR variable Add DESTDIR and TMPDIR variables so the build-ssl.sh script can be used as a generic SSL lib installer outside the CI. The variables are prefixed with BUILDSSL so they don't collide with the makefile ones. Ex: OPENSSL_VERSION=3.2.0 BUILDSSL_DESTDIR=/opt/openssl-3.2.0/ ./scripts/build-ssl.sh WOLFSSL_VERSION=5.7.0 BUILDSSL_DESTDIR=/opt/wolfssl-5.7.0/ ./scripts/build-ssl.sh	2024-05-23 15:34:59 +02:00
Christopher Faulet	d11249f292	MINOR: mux-quic: Set abort info for SC-less QCS on STOP_SENDING frame It is a revert of `cc9827bb09` ("BUG/MEDIUM: mux-quic: fix crash on STOP_SENDING received without SD"). This fix was based on a wrong assumption about QUIC streams that may have no stream-endpoint descriptor. However, it must never happen. And this was fixed. So we can now safely revert the commit above. However, it is not a bugfix because, for now, abort info are only used by the upper layer. So it is not a big deal to not set it when there is no SC.	2024-05-23 11:18:19 +02:00
Christopher Faulet	086e51017e	BUG/MEDIUM: mux-quic: Create sedesc in same time of the QUIC stream Recent changes to save abort reason revealed an issue during the QUIC stream creation. Indeed, by design, when a mux stream is created, it must always have a valid stream-endpoint descriptor and it must remain valid till the mux stream destruction. On frontend side, it is the multiplexer responsibility to create it and set it as orphan. On the backend side, the sedesc is provided by the upper layer. It is the sedesc of the back stream-connector. For the QUIC multiplexer, the stream-endpoint descriptor was only created when the stream-connector was created and attached on it. It is unexpected and some bugs may be introduced because there is no valid sedesc on a QUIC stream. And a recent bug was introduced for this reason. This patch must be backported as far as 2.6.	2024-05-23 11:18:06 +02:00
Ilia Shipitsin	4a968d9d27	CI: scripts/build-ssl.sh: loudly fail on unsupported platforms	2024-05-22 16:52:43 +02:00
Willy Tarreau	c7335d55f8	BUG/MEDIUM: quic_tls: prevent LibreSSL < 4.0 from negotiating CHACHA20_POLY1305 As diagnosed in GH issue #2569, there's currently an issue in LibreSSL's CHACHA20 in-place implementation that makes haproxy discard incoming QUIC packets encrypted with it. It's not very easy to observe the issue because: - QUIC recommends that CHACHA20 is used in priority - on x86 with AES-NI, LibreSSL prefers AES-GCM for performance reasons, so the problem is only observed there if a client explicitly forces TLS_CHACHA20_POLY1305_SHA256 only. - discarded packets cause retransmits showing some apparent activity, and the handshake succeeds so it's not easy to analyze from the client which thinks that the server is slow to respond. Thus in practice, on non-x86 machines running LibreSSL, requests made over QUIC freeze for a long time, unless the client explicitly forces algos excluding TLS_CHACHA20_POLY1305_SHA256. That's typically the case by default on modern OpenBSD systems, and was reported in the issue above for an arm64 machine running OpenBSD -current, and was also observed on a mips64 one running OpenBSD 7.5. There is no simple solution to this problem due to some of the protocol's constraints without digging too low into the stack (and risking to break more). Here we're taking a pragmatic approach consisting in making the connection fail hard when TLS_CHACHA20_POLY1305_SHA256 is selected, regardless of the availability of other ciphers. This means that every time a connection would have hung, instead it will fail fast, allowing the client to retry over TLS/TCP. Theo Buehler recommends that we limit this protection to all LibreSSL versions before 4.0 since it's where the fix will be implemented. Older stable versions will just see TLS_CHACHA20_POLY1305_SHA256 disabled, which should be sufficient to make QUIC work there again as well. 
The following config is sufficient to reproduce the issue (on a non-x86 machine, both arm64 & mips64 were confirmed to reproduce it): global limited-quic frontend stats mode http #bind :8181 #bind :8443 ssl crt rsa+dh2048.pem bind quic4@:8443 ssl crt rsa+dh2048.pem alpn h3 timeout client 5s stats uri / And the following commands will trigger the problem on affected LibreSSL versions: curl --tls13-ciphers TLS_CHACHA20_POLY1305_SHA256 -v --http3 -k https://127.0.0.1:8443/ curl -v --http3 -k https://127.0.0.1:8443/ while these ones must work: curl --tls13-ciphers TLS_AES_128_GCM_SHA256 -v --http3 -k https://127.0.0.1:8443/ curl --tls13-ciphers TLS_AES_256_GCM_SHA384 -v --http3 -k https://127.0.0.1:8443/ Normally all of them will work with LibreSSL 4, and only the first one should fail with stable LibreSSL versions higher than 3.9.2. An haproxy version without this workaround will show an unresponsive command after the GET is sent, while a version with the workaround will close the connection on error. On a version with this workaround, if TCP listeners are uncommented, curl will automatically fall back to TCP and attempt the request again over HTTP/2. Finally, on OpenSSL 1.1.1 in compat mode (hence the limited-quic option above) all of them must work. Many thanks to github user @lgv5 for the detailed report, tests, and for spotting the issue, and to @botovq (Theo Buehler) for the quick analysis, patch and help on this workaround. This needs to be backported to versions 2.6 and above.	2024-05-22 16:22:22 +02:00
William Lallemand	0182f6bbb6	REGTESTS: scripts: allow to change the vtest timeout $ make reg-tests VTEST_TIMEOUT=5 Allow to change the timeout of the regtests with the VTEST_TIMEOUT variable. The default value is still 10.	2024-05-22 15:43:53 +02:00
Frederic Lecaille	169fc0b771	BUG/MAJOR: quic: Crash with TLS_AES_128_CCM_SHA256 (libressl only) At least version 3.9.0 of the libressl TLS stack does not behave like other stacks such as quictls, which make SSL_do_handshake() return an error when no cipher could be negotiated, in addition to emitting a TLS alert (0x28). This is the case when TLS_AES_128_CCM_SHA256 is forced as the TLS1.3 cipher from the client side. This makes haproxy enter a code path which leads to a crash as follows: [Switching to Thread 0x7ffff76b9640 (LWP 23902)] 0x0000000000487627 in quic_tls_key_update (qc=qc@entry=0x7ffff00371f0) at src/quic_tls.c:910 910 struct quic_kp_trace kp_trace = { (gdb) list 905 { 906 struct quic_tls_ctx *tls_ctx = &qc->ael->tls_ctx; 907 struct quic_tls_secrets *rx = &tls_ctx->rx; 908 struct quic_tls_secrets *tx = &tls_ctx->tx; 909 /* Used only for the traces */ 910 struct quic_kp_trace kp_trace = { 911 .rx_sec = rx->secret, 912 .rx_seclen = rx->secretlen, 913 .tx_sec = tx->secret, 914 .tx_seclen = tx->secretlen, (gdb) p qc $1 = (struct quic_conn *) 0x7ffff00371f0 (gdb) p qc->ael $2 = (struct quic_enc_level *) 0x0 (gdb) bt #0 0x0000000000487627 in quic_tls_key_update (qc=qc@entry=0x7ffff00371f0) at src/quic_tls.c:910 #1 0x000000000049bca9 in qc_ssl_provide_quic_data (len=268, data=<optimized out>, ctx=0x7ffff0047f80, level=<optimized out>, ncbuf=<optimized out>) at src/quic_ssl.c:617 #2 qc_ssl_provide_all_quic_data (qc=qc@entry=0x7ffff00371f0, ctx=0x7ffff0047f80) at src/quic_ssl.c:688 #3 0x00000000004683a7 in quic_conn_io_cb (t=0x7ffff0047f30, context=0x7ffff00371f0, state=<optimized out>) at src/quic_conn.c:760 #4 0x000000000063cd9c in run_tasks_from_lists (budgets=budgets@entry=0x7ffff76961f0) at src/task.c:596 #5 0x000000000063d934 in process_runnable_tasks () at src/task.c:876 #6 0x0000000000600508 in run_poll_loop () at src/haproxy.c:3073 #7 0x0000000000600b67 in run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:3287 #8 0x00007ffff7f6ae45 in start_thread () from 
/lib64/libpthread.so.0 #9 0x00007ffff78254af in clone () from /lib64/libc.so.6 When a TLS alert is emitted, haproxy calls quic_set_connection_close() which sets the QUIC_FL_CONN_IMMEDIATE_CLOSE connection flag. It is this flag which is tested by this patch to make the handshake fail even if SSL_do_handshake() does not return an error. This test is specific to libressl and is never run with other TLS stacks. Thank you to @lgv5 and @botovq for having reported this issue in GH #2569. Must be backported as far as 2.6.	2024-05-22 15:21:55 +02:00
Valentine Krasnobaeva	0e93549d2a	MINOR: proto: fix coding style Remove redundant brackets for 'if' statements that contain only one instruction.	2024-05-22 12:00:11 +02:00
Valentine Krasnobaeva	83ab1479d0	BUG/MINOR: sock: fix sock_create_server_socket Set stream_err value as SF_ERR_NONE, if the obtained socket fd has passed all common runtime and configuration related checks. The '.connect()' method implementation in higher protocol layers requires the Stream Error Flag as the return value. So, at the socket layer, we need to pass to sock_create_server_socket() a variable to set this flag, because syscalls and some socket option checks are convenient to perform at the socket layer.	2024-05-22 11:59:55 +02:00
Willy Tarreau	5b9503ed33	MINOR: traces: enumerate the list of levels/verbosities when not found It's quite frustrating, particularly on the command line, not to have access to the list of available levels and verbosities when one does not exist for a given source, because there's no easy way to find them except by starting without and connecting to the CLI. Let's enumerate the list of supported levels and verbosities when a name does not match. For example: $ ./haproxy -db -f quic-repro.cfg -dt h2:help [NOTICE] (9602) : haproxy version is 3.0-dev12-60496e-27 [NOTICE] (9602) : path to executable is ./haproxy [ALERT] (9602) : -dt: no such trace level 'help', available levels are 'error', 'user', 'proto', 'state', 'data', and 'developer'. $ ./haproxy -db -f quic-repro.cfg -dt h2:user:help [NOTICE] (9604) : haproxy version is 3.0-dev12-60496e-27 [NOTICE] (9604) : path to executable is ./haproxy [ALERT] (9604) : -dt: no such trace verbosity 'help' for source 'h2', available verbosities for this source are: 'quiet', 'clean', 'minimal', 'simple', 'advanced', and 'complete'. The same is done for the CLI where the existing help message is always displayed when entering an invalid verbosity or level.	2024-05-22 11:17:57 +02:00
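The kind of enumeration shown in the error messages of 5b9503ed33 can be sketched as below. This is a minimal sketch under stated assumptions, not the code from the commit: `list_names` is a hypothetical helper, and only the output format (quoted names, comma-separated, with "and" before the last one) is taken from the example messages above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of HAProxy: writes a quoted,
 * comma-separated list of names into <buf>, e.g. 'a', 'b', and 'c'.
 * Returns <buf>. The output is truncated if <buf> is too small.
 */
char *list_names(char *buf, size_t size, const char *const *names, size_t count)
{
	size_t off = 0;
	size_t i;

	if (size == 0)
		return buf;
	buf[0] = '\0';

	for (i = 0; i < count && off < size - 1; i++) {
		const char *sep = "";
		int ret;

		/* pick the separator placed before this name */
		if (i > 0) {
			if (i == count - 1)
				sep = (count > 2) ? ", and " : " and ";
			else
				sep = ", ";
		}

		ret = snprintf(buf + off, size - off, "%s'%s'", sep, names[i]);
		if (ret < 0)
			break;
		/* on truncation <off> exceeds the size and the loop stops */
		off += (size_t)ret;
	}
	return buf;
}
```

Called with the six level names from the first message ("error", "user", "proto", "state", "data", "developer"), this helper produces the same list as the one printed after "available levels are".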
Amaury Denoyelle	60496e884e	MINOR: connection: support PROXY v2 TLV emission without stream Update API for PROXY protocol header encoding. Previously, it requires stream parameter to be set. Change make_proxy_line() and associated functions to add an extra session parameter. This is useful in context where no stream is instantiated. For example, this is the case for rhttp preconnect. This change allows to extend PROXY v2 TLV encoding. Replace build_logline() which requires a stream instance and call directly sess_build_logline(). Note that stream parameter is kept as it is necessary for unique ID encoding. This change has no functional change for standard connections. However, it is necessary to support TLV encoding on rhttp preconnect.	2024-05-22 10:01:57 +02:00
Amaury Denoyelle	7a81bfc8d2	MINOR: rhttp: support PROXY emission on preconnect Extend preconnect to support PROXY protocol emission. Code is duplicated from connect_server() into new_reverse_conn(). This is necessary to support send-proxy on server line used as rhttp.	2024-05-22 10:01:57 +02:00
Amaury Denoyelle	12c40c25a9	MEDIUM: rhttp: create session for active preconnect Modify rhttp preconnect by instantiating a new session for each connection attempt. Connection is thus linked to a session directly on its instantiation contrary to previously where no session existed until listener_accept(). This patch will allow to extend rhttp usage. Most notably, it will be useful to use various sample fetches on the server line and extend logging capabilities. Changes are minimal, yet consequences are considered not trivial as for the first time a FE connection session is instantiated before listener_accept(). This requires an extra explicit check in session_accept_fd() to not overwrite an existing session. Also, flag SESS_FL_RELEASE_LI is not set immediately as listener counters must not be decremented if connection and its session are freed before reversal is completed, or else listener counters will be invalid. conn_session_free() is used as connection destroy callback to ensure the session will be freed automatically on connection release.	2024-05-22 10:01:57 +02:00
Amaury Denoyelle	45b80aed70	MINOR: session: define flag to explicitly release listener on free When a session is allocated for a FE connection, session_free() is responsible for calling listener_release() to decrement listener connection counters and resume listening. Until now, the <listener> member of session was tested inside session_free() before invoking listener_release(). To highlight more explicitly the relation between sessions and listeners, introduce a new flag SESS_FL_RELEASE_LI. Only sessions with this flag set will invoke listener_release() on their cleanup. The flag is set inside session_accept_fd() on success. This patch has no functional change. However, it will be useful to implement session creation for rHTTP preconnect.	2024-05-22 10:01:57 +02:00
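The flag-gated release described in 45b80aed70 can be illustrated with a reduced sketch. Only the flag name SESS_FL_RELEASE_LI comes from the commit message; the flag value, the simplified structures and the `_sketch` function names are assumptions made for the example and do not reflect HAProxy's real definitions.

```c
/* Reduced sketch, not HAProxy's code. The structures are simplified
 * stand-ins and the flag value is an assumption.
 */
#define SESS_FL_RELEASE_LI 0x00000001u

struct listener {
	int nbconn;             /* connections accounted on this listener */
};

struct session {
	unsigned int flags;
	struct listener *listener;
};

/* stand-in for listener_release(): gives the connection slot back */
void listener_release_sketch(struct listener *li)
{
	li->nbconn--;
}

/* stand-in for session_free(): the listener counters are only touched
 * when the session was explicitly flagged, not merely because a
 * listener pointer is set.
 */
void session_free_sketch(struct session *sess)
{
	if (sess->flags & SESS_FL_RELEASE_LI)
		listener_release_sketch(sess->listener);
	sess->listener = 0;
}
```

Under these assumptions, a session freed before the flag is set leaves the listener counter untouched, which is the property 12c40c25a9 relies on for connections released before reversal completes.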


22408 Commits All Branches Search
