Commit Graph

17649 Commits

Author SHA1 Message Date
Willy Tarreau
decb7c90df CLEANUP: ssl_sock: move dirty openssl-1.0.2 wrapper to openssl-compat
Valentine noticed this ugly SSL_CTX_get_tlsext_status_cb() macro
definition inside ssl_sock.c that is dedicated to openssl-1.0.2 only.
It would be better placed in openssl-compat.h, which is what this
patch does. It also addresses a missing pair of parentheses and
removes an invalid extra semicolon.
2024-05-28 19:17:57 +02:00
Valentine Krasnobaeva
84380965a5 BUG/MINOR: ssl/ocsp: init callback func ptr as NULL
In ssl_sock_load_ocsp() it is better to initialize the local scope variable
'callback' (a function pointer) to NULL when declaring it. According to the
SSL_CTX_get_tlsext_status_cb() API, we then provide a pointer to this
on-stack variable in order to check whether the callback was already set before:

OpenSSL 1.x.x and 3.x.x:
  long SSL_CTX_get_tlsext_status_cb(SSL_CTX *ctx, int (**callback)(SSL *, void *));
  long SSL_CTX_set_tlsext_status_cb(SSL_CTX *ctx, int (*callback)(SSL *, void *));

WolfSSL 5.7.0:
  typedef int(*tlsextStatusCb)(WOLFSSL* ssl, void*);
  WOLFSSL_API int wolfSSL_CTX_get_tlsext_status_cb(WOLFSSL_CTX* ctx, tlsextStatusCb* cb);
  WOLFSSL_API int wolfSSL_CTX_set_tlsext_status_cb(WOLFSSL_CTX* ctx, tlsextStatusCb cb);

When this function pointer variable stays uninitialized, haproxy compiled with
ASAN crashes in ssl_sock_load_ocsp():

  ./haproxy -d -f haproxy.cfg
  ...
  AddressSanitizer:DEADLYSIGNAL
  =================================================================
  ==114919==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000008 (pc 0x5eab8951bb32 bp 0x7ffcdd6d8410 sp 0x7ffcdd6d82e0 T0)
  ==114919==The signal is caused by a READ memory access.
  ==114919==Hint: address points to the zero page.
    #0 0x5eab8951bb32 in ssl_sock_load_ocsp /home/vk/projects/haproxy/src/ssl_sock.c:1248:22
    #1 0x5eab89510d65 in ssl_sock_put_ckch_into_ctx /home/vk/projects/haproxy/src/ssl_sock.c:3389:6
  ...

This happens because the callback variable is allocated on the stack. Since it
is not explicitly initialized, it may contain a garbage value at runtime, for
instance after an update of the linked crypto library or a recompilation.

So, following the ssl_sock_load_ocsp() code, SSL_CTX_get_tlsext_status_cb()
may fail while callback still contains its initial garbage value. The
'if (!callback) {...' test then puts us on the wrong path, where some
ocsp_cbk_arg properties are accessed via a pointer that was never set, and we
end up with a segmentation fault.
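For illustration, a minimal sketch of the resulting pattern (simplified and
not the exact ssl_sock.c code; the stapling callback name is just an example):

  int (*callback) (SSL *ssl, void *arg) = NULL;   /* no garbage left on stack */

  SSL_CTX_get_tlsext_status_cb(ctx, &callback);   /* may leave it untouched */
  if (!callback) {
          /* no callback registered yet: install ours */
          SSL_CTX_set_tlsext_status_cb(ctx, ssl_sock_ocsp_stapling_cbk);
  } else {
          /* a callback is already set: only update its argument */
  }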

Must be backported to all stable versions. Not all versions have the ifdef;
the previous cleanup patch is useful starting from the 2.7 version.
2024-05-28 18:14:26 +02:00
Valentine Krasnobaeva
fb7b46d267 CLEANUP: ssl/ocsp: readable ifdef in ssl_sock_load_ocsp
Due to the support of different TLS/SSL libraries and their different versions,
we are sometimes forced to use different internal typedefs and callback
functions. We strive to avoid this, but from time to time "#ifdef... #endif"
becomes inevitable.

In particular, in ssl_sock_load_ocsp() we define a 'callback' variable, which
will contain a function pointer to our OCSP stapling callback, later assigned
via SSL_CTX_set_tlsext_status_cb() to the internal SSL context
struct of the linked crypto library.

If this linked crypto library is OpenSSL 1.x.x/3.x.x, for setting and
getting this callback we have the following API signatures
(see doc/man3/SSL_CTX_set_tlsext_status_cb.pod):

  long SSL_CTX_get_tlsext_status_cb(SSL_CTX *ctx, int (**callback)(SSL *, void *));
  long SSL_CTX_set_tlsext_status_cb(SSL_CTX *ctx, int (*callback)(SSL *, void *));

If we are using WolfSSL, the same APIs expect the tlsextStatusCb function
prototype, provided via the typedef below (see wolfssl/wolfssl/ssl.h):

  typedef int(*tlsextStatusCb)(WOLFSSL* ssl, void*);
  WOLFSSL_API int wolfSSL_CTX_get_tlsext_status_cb(WOLFSSL_CTX* ctx, tlsextStatusCb* cb);
  WOLFSSL_API int wolfSSL_CTX_set_tlsext_status_cb(WOLFSSL_CTX* ctx, tlsextStatusCb cb);

It seems that in OpenSSL < 1.0.0 there was no support for the OCSP extension,
so there is no need to set this callback.

Let's avoid #ifndef... #endif for this 'callback' variable definition to keep
things clear. #ifndef... #endif is usually less readable than a
straightforward "#ifdef... #endif".
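As a purely illustrative example of the readability argument (the guard macro
name below is hypothetical, not the one used in the code):

  /* positive form: the supported case is stated directly */
  #ifdef HAVE_OCSP_STAPLING_CB
          int (*callback) (SSL *ssl, void *arg) = NULL;
  #endif

is generally easier to follow than excluding the unsupported case with
"#ifndef ... #endif".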
2024-05-28 18:00:44 +02:00
Willy Tarreau
725fa0ecd2 BUILD: fd: errno is also needed without poll()
When building without USE_POLL, fd.c fails on errno because its include is
only present when USE_POLL is set. Let's move it outside of the ifdef.
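Illustratively, the resulting shape is (simplified):

  #include <errno.h>      /* needed regardless of the polling system */
  #ifdef USE_POLL
  #include <poll.h>
  #endif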
2024-05-27 19:14:14 +02:00
Aurelien DARRAGON
435a9da267 MINOR: log: rename 'log-format tag' to 'log-format alias'
In 2.9 we started to introduce an ambiguity in the documentation by
referring to historical log-format variables ('%var') as log-format
tags in 739c4e5b1e ("MINOR: sample: accept_date / request_date return
%Ts / %tr timestamp values") and 454c372b60 ("DOC: configuration: add
sample fetches for timing events").

In fact, we've had this confusion between log-format tag and log-format
var for more than 10 years now, but in 2.9 it was the first time the
confusion was exposed in the documentation.

Indeed, both 'log-format variable' and 'log-format tag' actually refer
to the same feature (that is: '%B' and friends that can be used for
direct access to some log-oriented predefined fetches instead of using
%[expr] with generic sample expressions).

This feature was first implemented in 723b73ad75 ("MINOR: config: Parse
the string of the log-format config keyword") and later documented in
4894040fa ("DOC: log-format documentation"). At that time, it was clear
that we used to name it 'log-format variable'.

But later the same year, 'log-format tag' naming started to appear in
some commit messages (while still referring to the same feature), for
instance with ffc3fcd6d ("MEDIUM: log: report SSL ciphers and version
in logs using logformat %sslc/%sslv").

Unfortunately in 2.9 when we added (and documented) new log-format
variables we officially started drifting to the misleading 'log-format
tag' naming (perhaps because it was the most recent naming found for
this feature in git log history, or because the confusion has always
been there).

Even worse, in 3.0 this confusion led us to rename all 'var' occurrences
to 'tag' in log-format related code to unify the code with the doc.

Hopefully William quickly noticed that we made a mistake there, but
instead of reverting to historical naming (log-format variable), it was
decided that we must use a different name that is less confusing than
'tags' or 'variables' (tags and variables are keywords that are already
used to designate other features in the code and that are not very
explicit under log-format context today).

Now we refer to '%B' and friends as a logformat alias, which is
essentially a handy way to print some log oriented information in the
log string instead of leveraging '%[expr]' with generic sample expressions
made of fetches and converters. Of course, there are some subtleties, such
as a few log-format aliases that still don't have a sample fetch equivalent
for historical reasons, and some aliases that may be a little faster than
their generic sample expression equivalents because most aliases are
pretty much hardcoded in the log building function. But in general
logformat aliases should simply be considered as an alternative to using
expressions (with '%[expr]').

Also, under log-format context, when we want to refer to either an alias
('%alias') or an expression ('%[expr]'), we should use the generic term
'logformat item', which in fact designates a single item within the
logformat string provided by the user. Indeed, a logformat item (whether
it is an alias or an expression) always starts with '%' and may accept
optional flags / arguments.
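For instance, a hypothetical log-format line mixing both kinds of items
(purely illustrative, not taken from this patch):

  log-format "%ci:%cp [%tr] %ST %B %[fc_err]"

where '%ci', '%cp', '%tr', '%ST' and '%B' are logformat aliases and
'%[fc_err]' is a logformat expression built on a sample fetch.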

Both the code and the documentation were updated in that sense, hopefully
this will clarify things and prevent future confusions.
2024-05-27 17:03:48 +02:00
William Lallemand
0a00302fab MINOR: sample: implement the uptime sample fetch
'uptime' returns the uptime of the current HAProxy worker in seconds.
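For example, it could be used like this (illustrative configuration, not part
of the patch):

  http-response set-header X-Worker-Uptime %[uptime]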
2024-05-27 11:06:40 +02:00
Christopher Faulet
0d7c1bc6ab BUG/MINOR: server: Don't reset resolver options on a new default-server line
When a new "default-server" line is parsed, some resolver options are reset.
Thus previously defined default options cannot be inherited. There is no
reason to do so. First because other server options are inherited. And then
because not all resolver options are reset. It is not consistent.

This patch should fix issue #2559. It should be backported to all stable
versions.
2024-05-24 16:31:01 +02:00
Christopher Faulet
8d2514e087 BUG/MINOR: http-htx: Support default path during scheme based normalization
As stated in RFC3986, for an absolute-form URI, an empty path should be
normalized to a path of "/". This is part of the scheme based normalization
rules. This kind of normalization is already performed for default ports, so
we might as well deal with the case of an empty path. For instance, a request
such as "GET http://example.com HTTP/1.1" is now processed as if its path
were "/".

The associated reg-test was updated accordingly.

This patch should fix the issue #2573. It may be backported as far as 2.4 if
necessary.
2024-05-24 16:17:24 +02:00
Aurelien DARRAGON
c16eba8183 BUG/MEDIUM: server/dns: preserve server's port upon resolution timeout or error
@boi4 reported in GH #2578 that since 3.0-dev1, for servers with an address
learned from A/AAAA records, after a DNS flap the server would be put out of
maintenance with the proper address but with an invalid port (== 0), making it
unusable and causing tcp checks to fail:

[NOTICE]   (1) : Loading success.
[WARNING]  (8) : Server mybackend/myserver1 is going DOWN for maintenance (DNS refused status). 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT]    (8) : backend 'mybackend' has no server available!
[WARNING]  (8) : mybackend/myserver1: IP changed from '(none)' to '127.0.0.1' by 'myresolver/ns1'.
[WARNING]  (8) : Server mybackend/myserver1 ('myhost') is UP/READY (resolves again).
[WARNING]  (8) : Server mybackend/myserver1 administratively READY thanks to valid DNS answer.
[WARNING]  (8) : Server mybackend/myserver1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.

@boi4 also mentioned that this used to work fine before.

Willy suggested that this regression may have been introduced by 64c9c8e
("BUG/MINOR: server/dns: use server_set_inetaddr() to unset srv addr from DNS")

Turns out he was right! Indeed, in 64c9c8e we systematically memset the
whole server_inetaddr struct (which contains both the requested server's
addr and port planned for atomic update) instead of only memsetting the
addr part of the structure: except when SRV records are involved (SRV
records provide both the address and the port unlike A or AAAA records),
we must not reset the server's port upon DNS errors because the port may
have been provided at config time and we don't want to lose its value.

Big thanks to @boi4 for his well-documented issue that really helped us to
pinpoint the bug right on time for the dev-13 release.

No backport needed (unless 64c9c8e gets backported).
2024-05-24 15:29:48 +02:00
Amaury Denoyelle
98ed11b0c5 BUG/MINOR: rhttp: initialize session origin after preconnect reversal
Since the following commit, the session is initialized early for rhttp
preconnect:

  12c40c25a9520fe3365950184fe724a1f4e91d03
  MEDIUM: rhttp: create session for active preconnect

The session origin member was not set. However, this prevents several
session fetches from working as expected. Worse, this caused a regression
because previously the session was created after reversal, with the origin
member defined. This was reported on the mailing-list by user William Manley,
whose setup relies on set-dst.

One possible fix would be to set the origin in session_new(). However, as
this is done before reversal, some session members may be incorrectly
initialized, in particular the source and destination addresses.

Thus, the session origin is only set after reversal is completed. This
ensures that session fetches have the same behavior on standard
connections and reversible ones.

This does not need to be backported.
2024-05-24 14:47:21 +02:00
Amaury Denoyelle
47168e217a MEDIUM: connection: use pool-conn-name instead of sni on reuse
Implement pool-conn-name support for idle connection reuse. It replaces
SNI as the arbitrary identifier for connections in the idle pool. Thus,
every SNI reference in this context has been replaced.

The main change occurs in connect_server(), where the pool-conn-name sample
fetch is now prehashed to generate the idle connection identifier. SNI is now
solely used in the context of SSL for ssl_sock_set_servername().
2024-05-24 14:47:21 +02:00
Amaury Denoyelle
be4f89f2b2 MINOR: server: define pool-conn-name keyword
Define a new server keyword pool-conn-name. The purpose of this keyword
is to identify connections inside the idle connections pool,
replacing SNI in case SSL is not wanted.

This keyword uses a sample expression argument. It can thus reuse the
existing function parse_srv_expr() for parsing. In the future, it may be
necessary to define a keyword variant which uses a logformat for
extensibility.

This patch only implements parsing. The argument is stored inside the new
server field <pool_conn_name> and the expression is generated in
_srv_parse_finalize() into <pool_conn_name_expr>.

If pool-conn-name is not set but SNI is, the latter is reused
automatically as pool-conn-name via _srv_parse_finalize(). This ensures
current reuse behavior remains compatible and idle connection reuse will
not mix connections with different SNIs by mistake.

The main usage will be for rhttp when SSL is not wanted between the two
haproxy instances. Previously, it was possible to use the "sni" keyword even
without SSL on a server line, which had a similar effect. However,
having a dedicated "pool-conn-name" keyword is deemed clearer. Besides,
it allows for more complex configurations where pool-conn-name and
SNI are used in parallel with different values.
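For example, a hypothetical server line using it (illustrative only):

  server edge1 192.0.2.10:443 pool-conn-name str(edge1)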
2024-05-24 14:36:31 +02:00
Amaury Denoyelle
91001422b4 MINOR: server: generalize sni expr parsing
Two functions exist for server sni sample expression parsing. This is
confusing, so this commit aims at clarifying it.

The functions are renamed with the following identifiers. The first function is
named parse_srv_expr() and can be used during parsing. Besides
expression parsing, it also ensures sample fetch validity in the context
of a server line.

The second function is renamed _parse_srv_expr() and is used internally by
parse_srv_expr(). It only implements sample parsing without extra
checks. It is already used for server instantiation derived from
server-template, as checks were already performed. Also, it is now used
in http-client code as SNI is a fixed string.

Finally, both functions are generalized to remove any reference to SNI.
This will allow reusing them to parse other server keywords which use an
expression. This will be the case for the future keyword pool-conn-name.
2024-05-24 14:36:31 +02:00
Amaury Denoyelle
b9f67a46a2 MINOR: quic: clarify doc for quic_recv()
Just highlight the fact that quic_recv() only receives a single datagram.
2024-05-24 14:36:31 +02:00
Amaury Denoyelle
5764bc50b5 BUG/MINOR: quic: adjust restriction for stateless reset emission
Review RFC 9000 and ensure restrictions on Stateless Reset are properly
enforced. After careful examination, several changes are introduced.

First, redefine the minimal Stateless Reset emitted packet length to 21
bytes (5 random bytes + a token). This is the new default length used in
every case, unless the received packet which triggered it is 43 bytes or
smaller.

Ensure every Stateless Reset packet emitted is at least 1 byte shorter than
the received packet which triggered it. No Stateless Reset will be
emitted if this falls under the above limit of 21 bytes. Thus this
should prevent looping issues.
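As a sketch, one way these rules combine (this is an interpretation of the
description above, not the actual HAProxy code):

  #define RESET_MINLEN 21   /* 5 random bytes + a 16-byte token */

  /* returns 0 when no Stateless Reset must be emitted, else sets <tx_len> */
  static int stateless_reset_len(size_t rx_len, size_t *tx_len)
  {
          if (rx_len > 43) {
                  *tx_len = RESET_MINLEN;       /* default length */
                  return 1;
          }
          if (rx_len - 1 < RESET_MINLEN)
                  return 0;                     /* too short: do not emit */
          *tx_len = rx_len - 1;                 /* 1 byte shorter than rx */
          return 1;
  }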

This should be backported up to 2.6.
2024-05-24 14:36:31 +02:00
Amaury Denoyelle
f55748a422 MAJOR: config: prevent QUIC with clients privileged port by default
The previous commit introduced a new protection mechanism to forbid
communications with clients which use a privileged source port. By
default, this mechanism is disabled for every protocol.

This patch changes the default value and activates the protection
mechanism for the QUIC protocol. This is justified as such traffic is a
probable sign of a DNS/NTP amplification attack.

This is labelled as major as it can be a breaking change in some
network environments.
2024-05-24 14:36:31 +02:00
Amaury Denoyelle
45f40bac4c MEDIUM: config: prevent communication with privileged ports
This commit introduces a new global setting named
harden.reject_privileged_ports.{tcp|quic}. When active, communications
with clients which use privileged source ports are forbidden. Such
behavior is considered suspicious as it can be used in spoofing or
DNS/NTP amplification attacks.

The value is configured per transport protocol. For TCP and QUIC,
distinct code locations are impacted by this setting. The first one is
in sock_accept_conn(), which acts as a filter for all TCP based
communications just after accept() returns a new connection. The second
one is dedicated to QUIC communication in quic_recv(). In both cases,
if a privileged source port is used and the protection is enabled, the
received message is silently dropped.

By default, the protection is disabled for both protocols. This is to be
able to backport it without breaking changes on stable releases.
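An illustrative configuration (keyword names taken from this description, the
on/off argument syntax is assumed):

  global
      harden.reject_privileged_ports.tcp  off
      harden.reject_privileged_ports.quic on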

This should be backported as it is an interesting security feature yet
relatively simple to implement.
2024-05-24 14:36:31 +02:00
Amaury Denoyelle
4e632545f7 BUILD: trace: fix warning on null dereference
Since a recent change on trace, the following compilation warning may
occur:
  src/trace.c: In function ‘trace_parse_cmd’:
  src/trace.c:865:33: error: potential null pointer dereference [-Werror=null-dereference]
    865 |                         for (nd = src->decoding; nd->name && nd->desc; nd++)
        |                              ~~~^~~~~~~~~~~~~~~

Fix this by rearranging code path to better highlight that only "quiet"
verbosity is allowed if no trace source is specified.

This was detected with GCC 14.1.
2024-05-24 14:36:03 +02:00
Aurelien DARRAGON
c9af6d5414 DEBUG: pollers/fd: add thread id suffix to per-thread memory areas name hints
Willy reported that since abb8412d2 ("DEBUG: pollers: add name hint for
large memory areas used by pollers") and 22ec2ad8b ("DEBUG: fd: add name
hint for large memory areas") multiple maps with the same name could be
found in /proc/<pid>/maps when the haproxy process is started with multiple
threads, which can be annoying.

In fact this happens because some poller and fd-created memory areas are
being created for each available thread, and since the naming was done
using vma_set_name() with the same <type> and <name> inputs, the resulting
name was the same for all threads.

Thanks to the previous commit, we now use vma_set_name_id() for naming
per-thread memory areas so that a "-id" suffix is appended after the
name, where "id" equals 'tid+1' (to match the thread numbering logic
found in the config file or in ha_panic() reports), allowing to easily
identify which haproxy thread owns the map in /proc/<pid>/maps:

7d3b26200000-7d3b26a01000 rw-p 00000000 00:00 0                          [anon:ev_poll:poll_events-2]
7d3b26c00000-7d3b27001000 rw-p 00000000 00:00 0                          [anon:fd:fd_updt-2]
7d3b27200000-7d3b27a01000 rw-p 00000000 00:00 0                          [anon:ev_poll:poll_events-1]
7d3b34200000-7d3b34601000 rw-p 00000000 00:00 0                          [anon:fd:fd_updt-1]
2024-05-24 12:07:18 +02:00
Aurelien DARRAGON
9d37c4b989 DEBUG: tools: add vma_set_name_id() helper
Just like vma_set_name() from 51a8f134e ("DEBUG: tools: add vma_set_name()
helper"), but also takes <id> as parameter to append "-$id" suffix after
the name in order to differentiate 2 areas that were named using the same
<type> and <name> combination.

example, using mmap + MAP_SHARED|MAP_ANONYMOUS:
  7364c4fff000-736508000000 rw-s 00000000 00:01 3540  [anon_shmem:type:name-id]
Another example, using mmap + MAP_PRIVATE|MAP_ANONYMOUS or using
glibc/malloc() above MMAP_THRESHOLD:
  7364c4fff000-736508000000 rw-s 00000000 00:01 3540  [anon:type:name-id]
2024-05-24 12:07:13 +02:00
Aurelien DARRAGON
23814a44e5 CLEANUP: tools: fix vma_set_name() function comment
There was a typo in the example provided in vma_set_name(): maps named
using the function will show up as "type:name", not "type.name". Update
the comment to reflect the current behavior.
2024-05-24 12:07:07 +02:00
Willy Tarreau
0bda33a3ec MINOR: stick-tables: remove the unneeded read lock in stksess_free()
During changes made in 2.7 by commits 8d3c3336f9 ("MEDIUM: stick-table:
make stksess_kill_if_expired() avoid the exclusive lock") and 996f1a5124
("MEDIUM: stick-table: do not take a lock to update t->current anymore."),
the operation was done cautiously one baby step at a time and the final
cleanup was not done, as we're keeping a read lock under an atomic dec.
Furthermore there's a pool_free() call under that lock, and we try to
avoid pool_alloc() and pool_free() under locks for their nasty side
effects (e.g. when memory gets recompacted), so let's really drop it
now.

Note that the performance gain is not really perceptible here, it's
essentially for code clarity reasons that this has to be done.
2024-05-24 11:52:57 +02:00
Willy Tarreau
8580f9db20 CLEANUP: stick-tables: remove a few unneeded tests for use_wrlock
The code in stktable_touch_with_exp() is the same as in other functions that
were previously built around a loop trying first to upgrade a read lock and
then to fall back to a direct write lock. As a result, there remains a
confusing construct with multiple tests on use_wrlock, which is obviously zero
when tested. Let's remove them since the value is known and the loop does not
exist anymore.
2024-05-24 11:52:19 +02:00
Willy Tarreau
77f286e8bc BUG/MEDIUM: stick-tables: make sure never to create two same remote entries
In GH issue #2552, Christian Ruppert reported an increase in crashes
with recent 3.0-dev versions, always related with stick-tables and peers.
One particularity of his config is that it has a lot of peers.

While trying to reproduce, it empirically was found that firing 10 load
generators at 10 different haproxy instances tracking a random key among
100k against a table of max 5k entries, on 8 threads and between a total
of 50 parallel peers managed to reproduce the crashes in seconds, very
often in ebtree deletion or insertion code, but not only.

The debugging revealed that the crashes are often caused by a parent node
being corrupted while delete/insert tries to update it regarding a recently
inserted/removed node, and that that corrupted node had always been proven
to be deleted, then immediately freed, so it ought not be visited in the
tree from functions enclosed between a pair of lock/unlock. As such the
only possibility was that it had experienced unexpected inserts. Also,
running with pool integrity checking would 90% of the time cause crashes
during allocation based on corrupted contents in the node, likely because
it was found at two places in the same tree and still present as a parent
of a node being deleted or inserted (hence the __stksess_free and
stktable_trash_oldest callers being visible on these items).

Indeed the issue is in fact related to the test set (occasionally redundant
keys, many peers). What happens is that sometimes the same key is learned
from two different peers. When it is learned for the first time, we end up
in stktable_touch_with_exp() in the "else" branch, where the test for
existence is made before taking the lock (since commit cfeca3a3a3
("MEDIUM: stick-table: touch updates under an upgradable read lock") that
was merged in 2.9), and from there the entry is added. But if one of the
threads manages to insert it before the other thread takes the lock, then
the second thread will try to insert this node again. And inserting an
already inserted node will corrupt the tree (note that we never switched
to enforcing a check in insertion code on this due to API history that
would break various code parts).

Here the solution is simple: recheck leaf_p after getting
the lock, to avoid touching anything if the entry has already been
inserted in the meantime.
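A minimal sketch of the resulting pattern (field, macro and lock names are
illustrative, not the exact code):

  HA_RWLOCK_WRLOCK(STK_TABLE_LOCK, &t->lock);
  if (!ts->upd.node.leaf_p) {
          /* not yet in the update tree: safe to insert it now */
          eb32_insert(&t->updates, &ts->upd);
  }
  /* else: another thread inserted it between our check and the lock */
  HA_RWLOCK_WRUNLOCK(STK_TABLE_LOCK, &t->lock);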

Many thanks to Christian Ruppert for testing this and for his invaluable
help on this hard-to-trigger issue.

This fix needs to be backported to 2.9.
2024-05-24 11:52:11 +02:00
Christopher Faulet
9938fb9c7a BUG/MEDIUM: stick-tables: Fix race with peers when killing a sticky session
When a sticky session is killed, we must be sure no other entity is still
referencing it. The session's ref_cnt must be 0. However, there is a race
with peers, as described in 21447b1dd4 ("BUG/MAJOR: stick-tables: fix race
with peers in entry expiration"). When the update lock is acquired, we must
recheck the ref_cnt value.

This patch is part of a debugging session about issue #2552. It must be
backported to 2.9.
2024-05-24 11:52:11 +02:00
Christopher Faulet
dfd938bad6 BUG/MEDIUM: stick-tables: Fix race with peers when trashing oldest entries
It is the same issue as the one fixed in process_table_expire() (21447b1dd4
["BUG/MAJOR: stick-tables: fix race with peers in entry expiration"]). In
stktable_trash_oldest(), when the update lock is acquired, we must take care
to check the ref_cnt again because some peers may increment it (see the commit
above for details).

This patch fixes a crash mentioned in 2552#issuecomment-2110532706. It must
be backported to 2.9.
2024-05-24 11:52:11 +02:00
Willy Tarreau
51f9f6cfd4 BUILD: quic: fix unused variable warning when threads are disabled
The tree variable was introduced in 3.0 by commit dd58dff1e6
("BUG/MEDIUM: quic: QUIC CID removed from tree without locking") which
was marked for backport. The variable is only used for locks.
Let's just mark the variable __maybe_unused for when the code is
built without threads.

The patch above was marked for backport to 2.7 so this should be
backported wherever the fix was backported.
2024-05-24 11:51:41 +02:00
Willy Tarreau
381ed2a4dd MINOR: config: add thread-hard-limit to set an upper bound to nbthread
On today's large systems, it's not always desired to run on all threads
for light loads, and usually users enforce nbthread to a lower value
(e.g. 8). The problem is that this is a fixed value, and moving such
configs to smaller machines continues to enforce the value, which
becomes extremely unproductive due to having more threads than CPUs.
This also happens quite a bit in VMs, containers, or cloud instances
of various sizes.

This commit introduces the thread-hard-limit setting that allows setting only
an upper bound to the number of threads without raising a lower value.
This means that using "thread-hard-limit 8" will make sure that no more
than 8 threads will be used when available, but only two will be used when
run on a dual-core machine.
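For example (illustrative, assuming the setting belongs to the global
section like nbthread):

  global
      thread-hard-limit 8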
2024-05-24 09:46:49 +02:00
Christopher Faulet
d11249f292 MINOR: mux-quic: Set abort info for SC-less QCS on STOP_SENDING frame
This is a revert of cc9827bb09 ("BUG/MEDIUM: mux-quic: fix crash on
STOP_SENDING received without SD"). This fix was based on the wrong assumption
that QUIC streams may have no stream-endpoint descriptor. However, this
must never happen, and it was fixed. So we can now safely revert the
commit above. However, this is not a bugfix because, for now, abort info is
only used by the upper layer. So it is not a big deal to not set it when
there is no SC.
2024-05-23 11:18:19 +02:00
Christopher Faulet
086e51017e BUG/MEDIUM: mux-quic: Create sedesc in same time of the QUIC stream
Recent changes to save abort reason revealed an issue during the QUIC stream
creation. Indeed, by design, when a mux stream is created, it must always
have a valid stream-endpoint descriptor and it must remain valid till the
mux stream destruction. On frontend side, it is the multiplexer
responsibility to create it and set it as orphan. On the backend side, the
sedesc is provided by the upper layer. It is the sedesc of the back
stream-connector.

For the QUIC multiplexer, the stream-endpoint descriptor was only created
when the stream-connector was created and attached to it. This is unexpected,
and some bugs may be introduced because there is no valid sedesc on a QUIC
stream. A recent bug was indeed introduced for this reason.

This patch must be backported as far as 2.6.
2024-05-23 11:18:06 +02:00
Frederic Lecaille
169fc0b771 BUG/MAJOR: quic: Crash with TLS_AES_128_CCM_SHA256 (libressl only)
At least version 3.9.0 of the libressl TLS stack does not behave like other
stacks such as quictls, which make SSL_do_handshake() return an error when no
cipher could be negotiated, in addition to emitting a TLS alert (0x28). This is
the case when TLS_AES_128_CCM_SHA256 is forced as the TLS1.3 cipher from the
client side. This makes haproxy enter a code path which leads to a crash as
follows:

[Switching to Thread 0x7ffff76b9640 (LWP 23902)]
0x0000000000487627 in quic_tls_key_update (qc=qc@entry=0x7ffff00371f0) at src/quic_tls.c:910
910             struct quic_kp_trace kp_trace = {
(gdb) list
905     {
906             struct quic_tls_ctx *tls_ctx = &qc->ael->tls_ctx;
907             struct quic_tls_secrets *rx = &tls_ctx->rx;
908             struct quic_tls_secrets *tx = &tls_ctx->tx;
909             /* Used only for the traces */
910             struct quic_kp_trace kp_trace = {
911                     .rx_sec = rx->secret,
912                     .rx_seclen = rx->secretlen,
913                     .tx_sec = tx->secret,
914                     .tx_seclen = tx->secretlen,
(gdb) p qc
$1 = (struct quic_conn *) 0x7ffff00371f0
(gdb) p qc->ael
$2 = (struct quic_enc_level *) 0x0
(gdb) bt
 #0  0x0000000000487627 in quic_tls_key_update (qc=qc@entry=0x7ffff00371f0) at src/quic_tls.c:910
 #1  0x000000000049bca9 in qc_ssl_provide_quic_data (len=268, data=<optimized out>, ctx=0x7ffff0047f80, level=<optimized out>, ncbuf=<optimized out>) at src/quic_ssl.c:617
 #2  qc_ssl_provide_all_quic_data (qc=qc@entry=0x7ffff00371f0, ctx=0x7ffff0047f80) at src/quic_ssl.c:688
 #3  0x00000000004683a7 in quic_conn_io_cb (t=0x7ffff0047f30, context=0x7ffff00371f0, state=<optimized out>) at src/quic_conn.c:760
 #4  0x000000000063cd9c in run_tasks_from_lists (budgets=budgets@entry=0x7ffff76961f0) at src/task.c:596
 #5  0x000000000063d934 in process_runnable_tasks () at src/task.c:876
 #6  0x0000000000600508 in run_poll_loop () at src/haproxy.c:3073
 #7  0x0000000000600b67 in run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:3287
 #8  0x00007ffff7f6ae45 in start_thread () from /lib64/libpthread.so.0
 #9  0x00007ffff78254af in clone () from /lib64/libc.so.6

When a TLS alert is emitted, haproxy calls quic_set_connection_close() which
sets the QUIC_FL_CONN_IMMEDIATE_CLOSE connection flag. This is the flag tested
by this patch to make the handshake fail even if SSL_do_handshake() does not
return an error. This test is specific to libressl and never runs with
other TLS stacks.
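A simplified sketch of the added check (illustrative; the exact context and
exit label in the SSL handshake code differ):

  #if defined(LIBRESSL_VERSION_NUMBER)
          /* libressl only emits the TLS alert without failing the handshake,
           * so rely on the flag set by quic_set_connection_close() */
          if (qc->flags & QUIC_FL_CONN_IMMEDIATE_CLOSE)
                  goto leave;   /* abort the handshake */
  #endif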

Thank you to @lgv5 and @botovq for having reported this issue in GH #2569.

Must be backported as far as 2.6.
2024-05-22 15:21:55 +02:00
Valentine Krasnobaeva
0e93549d2a MINOR: proto: fix coding style
Remove redundant brackets for 'if' statements that contain only one
instruction.
2024-05-22 12:00:11 +02:00
Valentine Krasnobaeva
83ab1479d0 BUG/MINOR: sock: fix sock_create_server_socket
Set stream_err value as SF_ERR_NONE, if obtained socket fd has passed all
common runtime and configuration related checks.

'.connect()' method implementation in higher protocol layers requires Stream
Error Flag as the return value. So, at the socket layer, we need to pass to
sock_create_server_socket() a variable to set this flag, because syscalls and
some socket options checks are convenient to performe at the socket layer.
2024-05-22 11:59:55 +02:00
Willy Tarreau
5b9503ed33 MINOR: traces: enumerate the list of levels/verbosities when not found
It's quite frustrating, particularly on the command line, not to have
access to the list of available levels and verbosities when one does
not exist for a given source, because there's no easy way to find them
except by starting without and connecting to the CLI. Let's enumerate
the list of supported levels and verbosities when a name does not match.

For example:

  $ ./haproxy -db -f quic-repro.cfg -dt h2:help
  [NOTICE]   (9602) : haproxy version is 3.0-dev12-60496e-27
  [NOTICE]   (9602) : path to executable is ./haproxy
  [ALERT]    (9602) : -dt: no such trace level 'help', available levels are 'error', 'user', 'proto', 'state', 'data', and 'developer'.

  $ ./haproxy -db -f quic-repro.cfg -dt h2:user:help
  [NOTICE]   (9604) : haproxy version is 3.0-dev12-60496e-27
  [NOTICE]   (9604) : path to executable is ./haproxy
  [ALERT]    (9604) : -dt: no such trace verbosity 'help' for source 'h2', available verbosities for this source are: 'quiet', 'clean', 'minimal', 'simple', 'advanced', and 'complete'.

The same is done for the CLI where the existing help message is always
displayed when entering an invalid verbosity or level.
2024-05-22 11:17:57 +02:00
Amaury Denoyelle
60496e884e MINOR: connection: support PROXY v2 TLV emission without stream
Update the API for PROXY protocol header encoding. Previously, it required the
stream parameter to be set. Change make_proxy_line() and associated
functions to add an extra session parameter. This is useful in contexts
where no stream is instantiated. For example, this is the case for rhttp
preconnect.

This change allows extending PROXY v2 TLV encoding. Replace
build_logline(), which requires a stream instance, with a direct call to
sess_build_logline().

Note that stream parameter is kept as it is necessary for unique ID
encoding.

This change has no functional impact for standard connections. However,
it is necessary to support TLV encoding on rhttp preconnect.
2024-05-22 10:01:57 +02:00
Amaury Denoyelle
7a81bfc8d2 MINOR: rhttp: support PROXY emission on preconnect
Extend preconnect to support PROXY protocol emission. Code is duplicated
from connect_server() into new_reverse_conn(). This is necessary to
support send-proxy on a server line used for rhttp.
2024-05-22 10:01:57 +02:00
Amaury Denoyelle
12c40c25a9 MEDIUM: rhttp: create session for active preconnect
Modify rhttp preconnect by instantiating a new session for each
connection attempt. A connection is thus linked to a session directly on
its instantiation, contrary to previously where no session existed until
listener_accept().

This patch will allow extending rhttp usage. Most notably, it will be
useful to use various sample fetches on the server line and to extend
logging capabilities.

Changes are minimal, yet the consequences are considered non-trivial as, for
the first time, a FE connection session is instantiated before
listener_accept(). This requires an extra explicit check in
session_accept_fd() to not overwrite an existing session. Also, the flag
SESS_FL_RELEASE_LI is not set immediately as listener counters must not
be decremented if the connection and its session are freed before reversal
is completed, or else listener counters will be invalid.

conn_session_free() is used as connection destroy callback to ensure the
session will be freed automatically on connection release.
2024-05-22 10:01:57 +02:00
Amaury Denoyelle
45b80aed70 MINOR: session: define flag to explicitely release listener on free
When a session is allocated for a FE connection, session_free() is
responsible for calling listener_release() to decrement listener connection
counters and resume listening.

Until now, the <listener> member of the session was tested inside
session_free() before invoking listener_release(). To highlight more
explicitly the relation between sessions and listeners, introduce a new flag
SESS_FL_RELEASE_LI. Only sessions with this flag set will invoke
listener_release() on their cleanup. The flag is set inside
session_accept_fd() on success.

This patch has no functional change. However, it will be useful to
implement session creation for rHTTP preconnect.
2024-05-22 10:01:57 +02:00
Amaury Denoyelle
808daa7cfb BUG/MINOR: rhttp: fix task_wakeup state
TASK_WOKEN_ANY was incorrectly used as argument to task_wakeup() for
rhttp preconnect task. This value is used as a flag. Replace it by
proper individual values. This is labelled as a bug but it has no known
impact.

This should be backported up to 2.9.
2024-05-22 10:01:57 +02:00
Amaury Denoyelle
2770ef352e BUG/MINOR: rhttp: prevent listener suspend
Ensure "disable frontend" on a reverse HTTP listener is forbidden by
returing -1 on suspend callback. Suspending such a listener has unknown
effect and so is not properly implemented for now.

This should be backported up to 2.9.
2024-05-22 10:01:57 +02:00
Amaury Denoyelle
ceebb09744 BUG/MEDIUM: rhttp: fix preconnect on single-thread
On initialization of a rhttp bind, the first thread available on the
listener is selected to execute the first occurrence of the preconnect
task.

This thread selection was incorrect as it used my_ffsl(), which returns a
value indexed from 1, contrary to tids which are indexed from 0. This
caused the first listener thread to be skipped in favor of the second
one. Worse, if haproxy runs in single-thread mode, the calculated thread ID
will be invalid and the task will never run, which prevents any
preconnect execution.

Fix this by subtracting 1 from the result of my_ffsl() to have a value
indexed from 0.
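Illustratively, the fix has the following shape (variable names are
hypothetical):

  /* my_ffsl() returns a 1-based bit position while thread IDs start at 0 */
  first_tid = my_ffsl(bind_thread_mask) - 1;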

This must be backported up to 2.9.
2024-05-22 10:01:57 +02:00
Amaury Denoyelle
4f80543220 MINOR: rhttp: add log on connection allocation failure
Add an error log when new_reverse_conn() fails. This may help to
diagnose future issues on reverse HTTP.
2024-05-22 10:01:57 +02:00
Amaury Denoyelle
3efd9f3925 BUG/MINOR: server: free PROXY v2 TLVs on srv drop
Dynamically allocated server PROXY TLVs were not freed on server
release. This patch fixes this leak by extending srv_free_params().
Every server line with the set-proxy-v2-tlv-fmt keyword is impacted.

For static servers, the issue is minimal as it will only cause a leak on
deinit(). However, it could be aggravated when performing multiple
removals of dynamic servers.

This should be backported up to 2.9.
2024-05-22 10:01:57 +02:00
Amaury Denoyelle
8b72270e95 BUG/MINOR: connection: parse PROXY TLV for LOCAL mode
conn_recv_proxy() is responsible for parsing the PROXY protocol header. For v2
of the protocol, TLV parsing is implemented. However, this step was
only done under the 'PROXY' command label. TLVs were never extracted for
the 'LOCAL' command mode.

Fix this by moving the TLV parsing loop outside of the switch case. Of
notable importance, tlv_offset is updated on the LOCAL label to point to the
first TLV location.

This bug should be backported up to 2.9 at least. It should even
probably be backported to every stable versions. Note however that this
code has changed much over time. It may be useful to use option
'--ignore-all-space' to have a clearer overview of the git diff.
2024-05-22 10:01:57 +02:00
Christopher Faulet
eb89a7da33 MAJOR: spoe: Let the SPOE back into the game
This reverts commits 885e40494c and
dff9807188.

We decided to spend some time to refactor and rationalize the SPOE for
3.1. Thus there is no reason to still consider it deprecated for
3.0. Compatibility between both versions will be maintained.

See #2502 for more info.
2024-05-22 09:04:38 +02:00
Christopher Faulet
746e6f8597 BUG/MINOR: http-ana: Don't crush stream termination condition on internal error
When an internal error is reported from an HTTP analyzer, we must take care
not to set the stream termination condition if it was already set. For
instance, it happens when a message rewrite fails. In this case
SF_ERR_PXCOND is set by the rule. The HTTP analyzer must not crush it with
SF_ERR_INTERNAL.
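A sketch of the usual pattern for such fixes (simplified):

  if (!(s->flags & SF_ERR_MASK))
          s->flags |= SF_ERR_INTERNAL;   /* only if nothing was set before */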

The regression was introduced with the commit 0fd25514d6 ("MEDIUM: http-ana:
Set termination state before returning haproxy response").

The bug was discovered while working on issue #2568. It must be backported to
2.9.
2024-05-22 09:04:38 +02:00
Valentine Krasnobaeva
39caa20b3c MINOR: sock: set conn->err_code in case of EPERM
To improve the readability of sock_handle_system_err(), let's
explicitly set conn->err_code to CO_ER_SOCK_ERR in case of EPERM
(which could be returned by the setns syscall).
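Roughly (simplified sketch):

  if (errno == EPERM)
          conn->err_code = CO_ER_SOCK_ERR;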
2024-05-21 20:14:31 +02:00
Valentine Krasnobaeva
5f713c03be BUG/MEDIUM: proto: fix fd leak in <proto>_connect_server
This fixes the fd leak introduced in commit d3fc982cd7
("MEDIUM: proto: make common fd checks in sock_create_server_socket").

Initially sock_create_server_socket() was designed to return only the created
socket FD or -1. Its callers from upper protocol layers were required to test
the returned errno and then to apply different configuration related checks
to the obtained positive sock_fd. A lot of this code was duplicated
among protocol implementations.

The new refactored version of sock_create_server_socket() gathers in one place
all duplicated checks, but in order to be compliant with upper protocol
layers, it needs a 3rd parameter: 'stream_err', in which it sets the
Stream Error Flag for upper levels, if the obtained sock_fd has passed all
additional checks.

No backport needed since this was introduced in 3.0-dev10.
2024-05-21 20:14:05 +02:00
William Lallemand
e732de7db2 DOC: configuration: update the crt-list documentation
Update the crt-list documentation with the supported keywords.

Also format it in a clearer way.

Must be backported to 2.8.
2024-05-21 18:30:45 +02:00
William Lallemand
e6657fd108 MEDIUM: ssl: don't load file by discovering them in crt-store
In commit 55e9e9591 ("MEDIUM: ssl: temporarily load files by detecting
their presence in crt-store"), ssl_sock_load_pem_into_ckch() was
replaced by ssl_sock_load_files_into_ckch() in the crt-store loading.

But the side effect was that we always try to autodetect, and this is
not what we want. This patch reverse this, and add specific code in the
crt-list loading, so we could autodetect in crt-list like it was done
before, but still try to load files when a crt-store filename keyword is
specified.

Example:

These crt-list lines won't autodetect files:

    foobar.crt [key foobar.key issuer foobar.issuer ocsp-update on] *.foo.bar
    foobar.crt [key foobar.key] *.foo.bar

These crt-list lines will autodetect files:

    foobar.pem [ocsp-update on] *.foo.bar
    foobar.pem
2024-05-21 18:30:45 +02:00
Aurelien DARRAGON
22ec2ad8b0 DEBUG: fd: add name hint for large memory areas
Thanks to ("MINOR: tools: add vma_set_name() helper"), set a name hint
for large arrays created by fd api (fdtab arrays and so on) so that
that they can be easily identified in /proc/<pid>/maps.

Depending on malloc() implementation, such memory areas will normally be
merged on the heap under MMAP_THRESHOLD (128 kB by default) and will
have a dedicated memory area once the threshold is exceeded. As such, when
large enough, they will appear like this in /proc/<pid>/maps:

7b8e83200000-7b8e84201000 rw-p 00000000 00:00 0                          [anon:fd:fdinfo]
7b8e84400000-7b8e85401000 rw-p 00000000 00:00 0                          [anon:fd:polled_mask]
7b8e85600000-7b8e89601000 rw-p 00000000 00:00 0                          [anon:fd:fdtab_addr]
7b8e90a00000-7b8e90e01000 rw-p 00000000 00:00 0                          [anon:fd:fd_updt]
2024-05-21 17:55:29 +02:00
Aurelien DARRAGON
9424e5a06f DEBUG: errors: add name hint for startup-logs memory area
Thanks to ("MINOR: tools: add vma_set_name() helper"), set a name hint
for startup-logs ring's memory area created using mmap() so it can be
easily indentified in /proc/<pid>/maps.

7b8e91cce000-7b8e91cde000 rw-s 00000000 00:19 46                         [anon_shmem:errors:startup_logs]
2024-05-21 17:55:20 +02:00
Aurelien DARRAGON
abb8412d20 DEBUG: pollers: add name hint for large memory areas used by pollers
Thanks to ("MINOR: tools: add vma_set_name() helper"), set a name hint
for large memory areas allocated by pollers upon init so that they can
be easily indentified in /proc/<pid>/maps.

For now, only linux-compatible pollers are considered since vma_set_name()
requires a recent linux kernel (>= 5.17).

Depending on malloc() implementation, such memory areas will normally be
merged on the heap under MMAP_THRESHOLD (128 kB by default) and will
have a dedicated memory area once the threshold is exceeded. As such, when
large enough, they will appear like this in /proc/<pid>/maps:

7ec6b2d40000-7ec6b2d61000 rw-p 00000000 00:00 0                          [anon:ev_poll:fd_evts_wr]
7ec6b2d61000-7ec6b2d82000 rw-p 00000000 00:00 0                          [anon:ev_poll:fd_evts_rd]
2024-05-21 17:55:14 +02:00
Aurelien DARRAGON
6c5869f846 DEBUG: sink: add name hint for memory area used by memory-backed sinks
Thanks to ("MINOR: tools: add vma_set_name() helper"), set a name hint
for user created memory-backed sinks (ring sections without backing-file)
so that they can be easily indentified in /proc/<pid>/maps.

Depending on malloc() implementation, such memory areas will normally be
merged on the heap under MMAP_THRESHOLD (128 kB by default) and will
have a dedicated memory area once the threshold is exceeded. As such, when
large enough, they will appear like this in /proc/<pid>/maps:

7b8e8ac00000-7b8e8bf13000 rw-p 00000000 00:00 0                          [anon:ring:myring]
2024-05-21 17:55:09 +02:00
Aurelien DARRAGON
6de0da1b54 DEBUG: shctx: name shared memory using vma_set_name()
In 98d22f212 ("MEDIUM: shctx: Naming shared memory context"), David
implemented prctl/PR_SET_VMA support to give a name to shctx maps when
supported. Maps were named after "HAProxy $name". It turns out that it
is not relevant to include "HAProxy" in the map name, given that we're
already looking at maps for a given PID (and here it's HAProxy's pid).

Instead, let's name shctx maps by making use of the new vma_set_name()
helper introduced by previous commit. Resulting maps will be named
"shctx:$name", e.g.: "shctx:globalCache", they will appear like this in
/proc/<pid>/maps:

7ec6aab0f000-7ec6ac000000 rw-s 00000000 00:01 405                        [anon_shmem:shctx:custom_name]
2024-05-21 17:55:03 +02:00
Aurelien DARRAGON
51a8f134ef DEBUG: tools: add vma_set_name() helper
Following David Carlier's work in 98d22f21 ("MEDIUM: shctx: Naming shared
memory context"), let's provide a helper function to set a name hint on
a virtual memory area (ie: an anonymous map created using mmap(), or a memory
area returned by malloc()).

Naming will only occur if available, and naming errors will be ignored.
The function takes mandatory <type> and <name> parameters to build the
map name as follows: "type:name". When looking at /proc/<pid>/maps, VMAs
named using this helper function will show up this way (provided that
the kernel has prctl support for PR_SET_VMA_ANON_NAME):

example, using mmap + MAP_SHARED|MAP_ANONYMOUS:
  7364c4fff000-736508000000 rw-s 00000000 00:01 3540  [anon_shmem:type:name]
Another example, using mmap + MAP_PRIVATE|MAP_ANONYMOUS or using
glibc/malloc() above MMAP_THRESHOLD:
  7364c4fff000-736508000000 rw-s 00000000 00:01 3540  [anon:type:name]
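A minimal sketch of what such a helper may do (not the exact HAProxy
implementation):

  #include <stdio.h>
  #include <sys/prctl.h>

  /* best-effort naming of an anonymous VMA; errors are deliberately ignored */
  static void vma_set_name_sketch(void *addr, size_t size,
                                  const char *type, const char *name)
  {
  #if defined(PR_SET_VMA) && defined(PR_SET_VMA_ANON_NAME)
          char fullname[80];

          snprintf(fullname, sizeof(fullname), "%s:%s", type, name);
          (void)prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME,
                      (unsigned long)addr, size, (unsigned long)fullname);
  #endif
  }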
2024-05-21 17:54:58 +02:00
Aurelien DARRAGON
0cfbeb1ae8 BUG/MINOR: ring: free ring's allocated area not ring's usable area when using maps
Since 40d1c84bf0 ("BUG/MAJOR: ring: free the ring storage not the ring
itself when using maps"), munmap() call for startup_logs's ring and
file-backed rings fails to work (EINVAL) and causes memory leaks during
process cleanup.

munmap() fails because it is called with the ring's usable area pointer
which is an offset from the underlying original memory block allocated
using mmap(). Indeed, ring_area() helper function was misused because
it didn't explicitly mention that the returned address corresponds to
the usable storage's area, not the allocated one.

To fix the issue, we add an explicit ring_allocated_area() helper to
return the allocated area for the ring, just like we already have
ring_allocated_size() for the allocated size, and we properly use both
the allocated size and allocated area to manipulate them using munmap()
and msync().
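Illustratively, the cleanup now looks roughly like this (simplified):

  msync(ring_allocated_area(ring), ring_allocated_size(ring), MS_SYNC);
  munmap(ring_allocated_area(ring), ring_allocated_size(ring));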

No backport needed.
2024-05-21 11:42:35 +02:00
William Lallemand
d74ba7cc24 MINOR: ssl: check parameter in ckch_conf_cmp()
Check the prev and new parameters in ckch_conf_cmp() so we don't dereference
a NULL ptr. There is no risk since it's not used with a NULL ptr yet.

Also remove the checks that are done later, and do them at the beginning of
the function.

Should fix issue #2572.
2024-05-21 11:09:59 +02:00
William Lallemand
140078c19d CLEANUP: ssl/cli: remove unused code in dump_crtlist_conf
This code was never used because space is never defined before:

    if (space) chunk_appendf(buf, " ");

Should fix issue #2571.
2024-05-21 10:58:09 +02:00
William Lallemand
58103bc8e6 MINOR: ssl: ckch_conf_cmp() compare multiple ckch_conf structures
The ckch_conf_cmp() function allows comparing multiple ckch_conf
structures in order to check that multiple usages of the same crt in the
configuration use the same ckch_conf definition.

A crt-list allows using "crt-store" keywords that define a ckch_store,
which can lead to inconsistencies when a crt is used multiple times with
different parameters.

This function compares the structures and dumps a list of differences into the
err variable to be output as an error.

The variant ckch_conf_cmp_empty() compares the ckch_conf structure to an
empty one, which is useful for bind lines, which are not able to have
crt-store keywords.

These functions are used when a crt-store is already initialized and we
need to verify if the parameters are compatible.

ckch_conf_cmp() handles multiple cases:

- When the previous ckch_conf was declared with CKCH_CONF_SET_EMPTY, we
  can't define any new keyword in the next initialisation
- When the previous ckch_conf was declared with keywords in a crtlist
  (CKCH_CONF_SET_CRTLIST), the next initialisation must have the exact
  same keywords.
- When the previous ckch_conf was declared in a "crt-store"
  (CKCH_CONF_SET_CRTSTORE), the next initialisation could use no keyword
  at all or the exact same keywords.
2024-05-17 17:35:51 +02:00
William Lallemand
1bc6e990f2 MEDIUM: ssl/cli: handle crt-store keywords in crt-list over the CLI
This patch adds crt-store keywords from the crt-list on the CLI.

- keywords from crt-store can be used over the CLI when inserting
  certificate in a crt-list
- keywords from crt-store are dumped when showing a crt-list content
  over the CLI

The ckch_conf_kws.func function pointer needed a new "cli" parameter, in
order to differentiate loading that comes from the CLI from loading done at
startup, as they don't behave the same. For example it must not try to
load a file from the filesystem when loading a crt-list line from the CLI.

dump_crtlist_sslconf() was renamed to dump_crtlist_conf() and takes a
new ckch_conf parameter in order to dump the relevant crt-store keywords.
2024-05-17 17:35:51 +02:00
William Lallemand
2bcf38c7c8 MEDIUM: ssl: add ocsp-update.disable global option
This option allows disabling the ocsp-update completely.

To achieve this, the ocsp-update.mode global keyword doesn't rely anymore
on SSL_SOCK_OCSP_UPDATE_OFF during parsing to call
ssl_create_ocsp_update_task().

Instead, we will inherit the SSL_SOCK_OCSP_UPDATE_* value from
ocsp-update.mode for each certificate which does not specify its own
mode.

To completely disable the ocsp update without editing all crt entries,
ocsp-update.disable is used instead of "ocsp-update.mode", which is now
only used as the default value for crt.
2024-05-17 17:35:51 +02:00
William Lallemand
2e6615b282 MINOR: ssl: ckch_conf_clean() utility function for ckch_conf
- ckch_conf_clean() to free() the content of a ckch_conf structure,
  mostly the strings that were strdup()'ed
2024-05-17 17:35:51 +02:00
William Lallemand
2b6b7fea58 MINOR: ssl/ocsp: use 'ocsp-update' in crt-store
Use the ocsp-update keyword in the crt-store section. This is not used
as an exception in the crtlist code anymore.

This patch introduces the "ocsp_update_mode" variable in the ckch_conf
structure.

The SSL_SOCK_OCSP_UPDATE_* enum was changed to a define to match the
ckch_conf on/off parser so we can map off to -1.
2024-05-17 17:35:51 +02:00
William Lallemand
462e5b0098 MINOR: ssl: handle PARSE_TYPE_INT and PARSE_TYPE_ONOFF in ckch_store_load_files()
The callback used by ckch_store_load_files() only works with
PARSE_TYPE_STR.

This change allows using a callback which will use an integer type for
PARSE_TYPE_INT and PARSE_TYPE_ONOFF.

This requires changing the type of the callback argument to void * to pass
either a char * or an int depending on the parsing type.

The ssl_sock_load_* functions were encapsulated in ckch_conf_load_*
functions just to match the type.

This will allow handling crt-store keywords that are ONOFF or INT
types.
2024-05-17 17:35:51 +02:00
William Lallemand
c5a665f5d8 MEDIUM: ssl: ckch_conf_parse() uses -1/0/1 for off/default/on
ckch_conf_parse() now sets -1 for an off value and 1 for an on value.
This allows detecting when a value is the default since the structs are memset
to 0.
2024-05-17 17:35:51 +02:00
William Lallemand
2b8880e395 MINOR: ssl: pass ckch_store instead of ckch_data to ssl_sock_load_ocsp()
ssl_sock_put_ckch_into_ctx() and ssl_sock_load_ocsp() need to take a
ckch_store as an argument. Indeed the ocsp_update_mode is not stored
anymore in ckch_data, but in ckch_conf, which is part of the ckch_store.

This is a minor change, but the function definition had to change.
2024-05-17 17:35:51 +02:00
William Lallemand
db09c2168f CLEANUP: ssl/ocsp: remove the deprecated parsing code for "ocsp-update"
Remove the "ocsp-update" keyword handling from the crt-list.

The code was made as an exception everywhere so we could activate the
ocsp-update for an individual certificate.

The feature will still exist but will be parsed as a "crt-store"
keyword which will still be usable in a "crt-list". This will appear in
future commits.

This commit also disables the reg-tests for now.
2024-05-17 17:35:51 +02:00
William Lallemand
d616932076 MEDIUM: ssl/crtlist: loading crt-store keywords from a crt-list
This patch allows the usage of "crt-store" keywords from a "crt-list".

The crtstore_parse_load() function was split into 2 functions, so the
keyword parsing is done in ckch_conf_parse().

With this patch, crts are loaded with ckch_store_new_load_files_conf() or
ckch_store_new_load_files_path() depending on whether or not there is a
"crt-store" keyword.

More checks need to be done on "crt" bind keywords to ensure that
keywords are compatible.

This patch does not introduce the feature on the CLI.
2024-05-17 17:35:51 +02:00
William Lallemand
8526d666d2 MINOR: ssl: ckch_store_new_load_files_conf() loads filenames from ckch_conf
ckch_store_new_load_files_conf() is the equivalent of
new_ckch_store_load_files_path() but instead of trying to find the files
using a base filename, it will load them from a list of files.
2024-05-17 17:35:51 +02:00
Christopher Faulet
2fc9e6fa39 MEDIUM: mux-h1: Support C-L/T-E header suppressions when sending messages
During the 2.9 dev cycle, to be able to support zero-copy data forwarding, a
change on the H1 mux was performed to ignore header modifications about
the payload representation (Content-Length and Transfer-Encoding headers).

It appears there are some use-cases where it could be handy to change values
of these headers or just remove them. For instance, we can imagine removing
these headers on a server response to force the old HTTP/1.0 close mode
behavior. So thanks to this patch, the rules are relaxed. It is now possible
to remove these headers. When this happens, the following rules are applied:

 * If "Content-Length" header is removed but a "Transfer-Encoding: chunked"
   header is found, no special processing is performed. The message remains
   chunked. However the close mode is not forced.

 * If "Transfer-Encoding" header is removed but a "Content-Length" header is
   found, no special processing is performed. The payload length must comply
   to the specified content length.

 * If one of them is removed and the other one is not found, a response is
   switched to close mode and a "Content-Length: 0" header is forced on a
   request.

With these rules, we best fit user expectations.
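For instance, a hypothetical configuration taking advantage of the relaxed
rules (illustrative only, not part of the patch):

  # strip both headers on the server response to force the old
  # HTTP/1.0 close mode behavior
  http-response del-header content-length
  http-response del-header transfer-encoding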

This patch depends on the following commit:

  * MINOR: mux-h1: Add a flag to ignore the request payload

This patch should fix the issue #2536. It should be backported to 2.9
with the commit above.
2024-05-17 16:33:53 +02:00
Christopher Faulet
8e55d29109 MINOR: mux-h1: Add a flag to ignore the request payload
There was a flag to skip the response payload on output, if any, by stating
it is bodyless. It is used for responses to HEAD requests or for 204/304
responses. This allows rewrites during analysis. For instance a HEAD request
can be rewritten to a GET request for any reason (e.g., a server not supporting
HEAD requests). In this case, the server will send a response with a
payload. On the frontend side, the payload will be skipped and a valid response
(without payload) will be sent to the client.

With this patch we introduce the corresponding flag for the request. It will
be used to skip the request payload. In addition, when the payload must be
skipped for a request or a response, the zero-copy data forwarding is now
disabled.
2024-05-17 16:33:53 +02:00
Christopher Faulet
45a45c917a BUG/MINOR: stats: Don't state the 303 redirect response is chunked
Start-line flags for the 303-See-Other response returned by the stats applet
are not properly set. Indeed, the response has a "content-length" header but
both HTX_SL_F_CHNK and HTX_SL_F_CLEN flags are set. Because of this bug, the
response is considered as chunked. So, let's remove the HTX_SL_F_CHNK flag.

And also add the HTX_SL_F_BODYLESS flag because there is no payload
(the "content-length" header is always set to 0).

This patch must be backported to all stable versions. On 2.8 and lower
versions, the commit d0b04920d1 ("BUG/MINOR: htpp-ana/stats: Specify that
HTX redirect messages have a C-L header") must be backported first.
2024-05-17 16:33:53 +02:00
Willy Tarreau
e362b076b1 Revert "MEDIUM: evports: permit to report multiple events at once"
Tests have shown that switching nevlist to global.tune.maxpollevents
is totally unreliable when using evports, and that events seem to be
missed. A good reproducer seems to be QUIC. There are not enough
users of Solaris to warrant spending more time trying to get down to
this, and even the few that remain are by definition not interested
in performance, so let's just revert the commit that tried to lift the
value: e6662bf706 ("MEDIUM: evports: permit to report multiple events
at once").

No backport is needed.
2024-05-17 15:57:18 +02:00
Aurelien DARRAGON
b9915a745e BUG/MEDIUM: fd: prevent memory waste in fdtab array
In 97ea9c49f1 ("BUG/MEDIUM: fd: always align fdtab[] to 64 bytes"), the
patch doesn't do what the message says. The intent was only to align the
base fdtab addr on 64 bytes so that all fdtab entries are aligned and thus
don't share the same cache line. For that, the fdtab pointer is adjusted from
the fdtab_addr (unaligned) address after it is allocated. Thus, all we need
is an extra 64 bytes in the fdtab_addr array for the alignment. Because
we use calloc() to perform the allocation, a dumb mistake was made: the
'+64' was added to the <size> calloc argument, which means EACH fdtab entry
is allocated with 64 extra bytes.

Given that a single fdtab entry is 64 bytes, since 97ea9c49f1 each fdtab
entry now takes 128 bytes! We doubled fdtab memory consumption.

To give you an idea, on my laptop, when looking at memory consumption
using 'ps -p `pidof haproxy` -o size' right after starting haproxy
process with default settings (no maxsock enforced):

before 97ea9c49f1:
  -> 118440 (KB, ~= 118MB)

after 97ea9c49f1:
  -> 183976 (KB, ~= 184MB)

To fix this, use calloc() with 1 as <nmemb> and manually provide the total
size as <size>, as we would do if we used malloc(). With this patch, we're
back to pre-97ea9c49f1 fdtab memory consumption (with 64 extra bytes for the
whole array, which is insignificant).

It should be backported to all stable versions.
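
A minimal standalone sketch of the allocation difference (entry size, count
and alignment math are illustrative stand-ins, not the actual fdtab code):

  #include <stdint.h>
  #include <stdio.h>
  #include <stdlib.h>

  struct fdtab_entry { char pad[64]; };   /* stand-in for one 64-byte fdtab entry */

  int main(void)
  {
      size_t count = 1024;                /* stand-in for the number of FDs */
      struct fdtab_entry *addr, *aligned;

      /* Buggy form: calloc(count, sizeof(*addr) + 64) -- the "+64" on the
       * <size> argument is multiplied by <nmemb>, so every entry silently
       * grows from 64 to 128 bytes and the array doubles in size.          */

      /* Fixed form: one chunk of count * sizeof(entry) plus a single 64-byte
       * slack used only to realign the base address.                        */
      addr = calloc(1, count * sizeof(*addr) + 64);
      if (!addr)
          return 1;

      /* round the usable pointer up to the next 64-byte boundary */
      aligned = (struct fdtab_entry *)(((uintptr_t)addr + 63) & ~(uintptr_t)63);
      printf("base=%p aligned=%p\n", (void *)addr, (void *)aligned);

      free(addr);
      return 0;
  }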
2024-05-17 15:25:03 +02:00
Aurelien DARRAGON
e84c8dee1a BUILD: log: get rid of non-portable strnlen() func
In c614fd3b9 ("MINOR: log: add +cbor encoding option"), I wrongly used
strnlen() without noticing that the function is not portable (requires
_POSIX_C_SOURCE >= 2008) and that it was the first occurrence in the
entire project. In fact it is not a hard requirement since it's a pretty
simple function. Thus to restore build compatibility with minimal/older
build systems, let's actually get rid of it and use equivalent portable
code where needed (we cannot simply rely on strlen() because the string
might not be NUL-terminated, so we must take the upstream len into account).
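
As an illustration only (the helper name is made up, and this is not the
exact code that landed in log.c), a portable equivalent fits in a few lines:

  #include <stddef.h>
  #include <stdio.h>

  /* Illustrative portable equivalent of strnlen(): scan at most <maxlen>
   * bytes and stop at the first NUL, so a buffer that is not guaranteed to
   * be NUL-terminated can be measured safely. */
  static size_t my_strnlen(const char *str, size_t maxlen)
  {
      size_t len = 0;

      while (len < maxlen && str[len])
          len++;
      return len;
  }

  int main(void)
  {
      char buf[4] = { 'a', 'b', 'c', 'd' };  /* no trailing NUL */

      printf("%zu\n", my_strnlen(buf, sizeof(buf)));  /* prints 4 */
      return 0;
  }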

No backport needed (unless c614fd3b9 gets backported)
2024-05-17 15:24:53 +02:00
William Lallemand
f18ed8d07e MEDIUM: ssl: add ocsp-update.mindelay and ocsp-update.maxdelay
This patch deprecates tune.ssl.ocsp-update.* in favor of
"ocsp-update.*", since ocsp-update is not really a tunable of the SSL
connections.
2024-05-17 15:00:11 +02:00
Amaury Denoyelle
fbc3d46b9f BUILD: stats: remove non portable getline() usage
getline() was used to read the stats-file. However, this function is not
portable and may cause build issues on some systems. Replace it with the
standard fgets().
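
A rough sketch of the portable pattern (the file name and buffer size are
illustrative, not the actual stats-file parser):

  #include <stdio.h>
  #include <string.h>

  int main(void)
  {
      char line[1024];                       /* fixed buffer instead of getline()'s malloc'd one */
      FILE *f = fopen("stats-file", "r");    /* illustrative path */

      if (!f)
          return 1;

      /* fgets() is standard C, unlike getline() which is POSIX.1-2008 only */
      while (fgets(line, sizeof(line), f)) {
          line[strcspn(line, "\n")] = '\0';  /* strip the trailing newline if any */
          /* ... parse the line ... */
      }
      fclose(f);
      return 0;
  }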

No need to backport.
2024-05-17 14:53:19 +02:00
William Lallemand
ee58fac1b4 MINOR: ssl: rename tune.ssl.ocsp-update.mode in ocsp-update.mode
Since the ocsp-update is not strictly a tuning of the SSL stack, but a
feature of its own, let's rename the option.

The option was also missing from the index.
2024-05-17 14:50:00 +02:00
Amaury Denoyelle
0d35f8d918 MINOR: h3: report glitch on RFC violation
Increment the connection's glitch counter on every HTTP/3 or QPACK error
that is a violation of the specification. This could be useful to get
rid of bogus clients early.
2024-05-16 10:58:54 +02:00
Amaury Denoyelle
216f70f989 MINOR: mux-quic: support glitches
Implement basic support for glitches on the QUIC multiplexer. This is mostly
identical to glitches for HTTP/2.

A new configuration option named tune.quic.frontend.glitches-threshold
is defined to limit the number of glitches on a connection before
closing it.

The glitches counter is incremented via qcc_report_glitch(). A new
qcc_app_ops callback <report_susp> is defined. When the threshold is reached,
it allows an application error code to be set to close the connection. For
HTTP/3, the value H3_EXCESSIVE_LOAD is returned. If the callback is not
defined, the default code INTERNAL_ERROR is used.
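
A simplified sketch of the counting logic (the struct and function below are
stand-ins, not the real qcc_report_glitch() implementation):

  #include <stdio.h>

  struct qcc_sketch {
      unsigned int glitches;
      unsigned int threshold;      /* tune.quic.frontend.glitches-threshold; 0 means "no limit" */
      int closing;
  };

  static void report_glitch(struct qcc_sketch *qcc)
  {
      qcc->glitches++;
      if (qcc->threshold && qcc->glitches >= qcc->threshold && !qcc->closing) {
          qcc->closing = 1;
          /* the app layer would pick the error code here: H3_EXCESSIVE_LOAD
           * for HTTP/3, INTERNAL_ERROR when no <report_susp> callback is set */
          printf("closing connection after %u glitches\n", qcc->glitches);
      }
  }

  int main(void)
  {
      struct qcc_sketch qcc = { 0, 3, 0 };

      for (int i = 0; i < 5; i++)
          report_glitch(&qcc);
      return 0;
  }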

For the moment, no glitches are reported for QUIC or HTTP/3 usage. This
will be added in future patches as needed.
2024-05-16 10:58:20 +02:00
Amaury Denoyelle
a6993a669b MINOR: h3: adjust error reporting on receive
This commit is the second step to simplify HTTP/3 error management. This
time it deals with the receive side in h3_rcv_buf().

The various internal HTTP/3 to HTX conversion functions do not set
H3_INTERNAL_ERROR on the h3c err anymore. Only standard error codes are set.
For every error, both internal and protocol ones, a negative value is
returned. This ensures that the h3_rcv_buf() loop is interrupted. The
function will then set H3_INTERNAL_ERROR only if no standard error is
registered via h3c or h3s.

Along with the previous commit, this should better distinguish internal
errors from protocol ones caused by a faulty client.
2024-05-16 10:31:17 +02:00
Amaury Denoyelle
079d13f73f MINOR: h3: adjust error reporting on sending
It's currently difficult to differentiate an HTTP/3 standard protocol
violation from internal issues, which solely use the H3_INTERNAL_ERROR code.
This patch is the first step to simplify this. The objective is to reduce
the use of H3_INTERNAL_ERROR; the <err> field of h3c should be reserved
exclusively for other values.

Simplify error management in sending via h3_snd_buf(). The sending side is
straightforward as only internal errors can be encountered. Do not
manually set h3c.err to H3_INTERNAL_ERROR in the various HTX to HTTP/3
conversion functions. Instead, just return a negative value, which is
enough to break the h3_snd_buf() loop. H3_INTERNAL_ERROR is thus positioned
in a single location in this function for all sending operations.
2024-05-16 10:31:17 +02:00
Amaury Denoyelle
e094412337 MINOR: h3/qpack: adjust naming for errors
Rename the enum values used for HTTP/3 and QPACK RFC-defined codes. The
HTTP/3 ones now use an H3_ERR_* prefix, which serves as an identifier
between them. Also separate the QPACK values into a new dedicated enum
qpack_err. This is deemed cleaner.
2024-05-16 10:31:17 +02:00
Amaury Denoyelle
2dabcf30be MINOR: qpack: prepare error renaming
There are two distinct enums, both related to QPACK error management. The
first one is dedicated to RFC-defined codes. The other one is a set of
internal values returned by qpack_decode_fs(). Issues have been discovered
recently due to the confusion between them.

Rename internal values with the prefix QPACK_RET_*. The older name
QPACK_ERR_* will be used in a future commit for the first enum.
2024-05-16 10:31:17 +02:00
Christopher Faulet
25bcdb1d95 BUG/MAJOR: h1: Be stricter on request target validation during message parsing
As stated in issue #2565, checks on the request target during H1 message
parsing are not good enough. Invalid paths, not starting with a slash, are in
fact parsed as authorities. The same error is repeated at the sample fetch
level. This last point is annoying because routing rules may be fooled. It
is also an issue when the URI or the Host header is updated.

Because the error is repeated at different places, it must be fixed. We
cannot be lax by arguing it is the server's job to accept or reject invalid
request targets. With this patch, we strengthen the checks performed on the
request target during H1 parsing. The idea is to reject invalid requests at
this step to be sure it is safe to manipulate the path or the authority at
other places.

So now, the asterisk-form is only allowed for OPTIONS and OTHER methods.
This last point was added to avoid rejecting the H2 preface. In addition, we
take care to have only one asterisk and nothing more. For the CONNECT method,
we take care to have a valid authority-form. All other forms are rejected.
The authority-form is now only supported for the CONNECT method. No specific
check is performed on the origin-form (except for the CONNECT method). For
the absolute-form, we take care to have a scheme and a valid authority.

These checks are not perfect but should be good enough to properly identify
each part of the request target for a relatively small cost. But it is a
breaking change. Some requests are now rejected while they were not on
older versions. However, nowadays, it is most probably not an issue. If it
turns out it's really an issue for legitimate use-cases, an option would be
to support these kinds of requests when the "accept-invalid-http-request"
option is set, with the consequence of seeing some sample fetches having an
unexpected behavior.

This patch should fix the issue #2665. It MUST NOT be backported. First
because it is a breaking change. And then because by avoiding backporting
it, it remains possible to relax the parsing with the
"accept-invalid-http-request" option.
2024-05-15 21:20:37 +02:00
Christopher Faulet
d3d9d83f03 BUG/MEDIUM: h1: Reject CONNECT request if the target has a scheme
The target of a CONNECT request must not have a scheme. However, this was not
checked during the message parsing. It is now rejected.

This patch may be backported as far as 2.4.
2024-05-15 21:20:37 +02:00
Christopher Faulet
d724b0d147 BUG/MINOR: h1: Check authority for non-CONNECT methods only if a scheme is found
When a non-CONNECT H1 request is parsed, the authority is compared to the
host header value, to validate that they are the same. However, there is an
issue here when a relative path is used (not beginning with a '/'). In this
case, the path is considered as the authority and will be erroneously
compared to the host header value. It is observable with this kind of
request:

  GET admin HTTP/1.1
  Host: www.mysite.com

In this case "admin" is parsed as an authority while it is in fact a path.
At this step, it is not a big deal because it just happens on the very first
checks on the message during the parsing. However, the same happens when the
authority is updated. This will be fixed in another commit

Note this kind of request is invalid because the path does not start with a
'/'. But, till now, HAProxy does not reject it.

This patch is related to issue #2565. It must be backported as far as 2.4.
2024-05-15 21:20:37 +02:00
Willy Tarreau
821a04377d BUG/MEDIUM: muxes: enforce buf_wait check in takeover()
The ->takeover() is quite tricky. It didn't take care of the possibility
that the original thread's connection handler had been woken up to handle
an event (e.g. read0), failed to get a buffer, registered against its own
thread's buffer_wait queue and left the connection in an idle state.

A new thread could then come by, perform a takeover(), and when a buffer
was available, the new thread's tasklet would be woken up by the old one
via *_buf_available(), causing all sorts of problems. These problems are
easy to reproduce, by running with shared backend connections and few
buffers (tune.buffers.limit=20, 8 threads, 500 connections, transfer
64kB objects and wait 2-5s for a crash to appear).

A first envisioned solution consisted in removing the connection from the
idle list, but it turned out that it would be worse for the delete stuff
(the connection no longer appearing as idle, making it impossible to find
it in order to close it). Also, idle counts wouldn't match anymore the
list's state, and the special case of private connections could be
difficult to handle as the connection could be forcefully re-added to the
idle list after allocation despite being private.

After multiple attempts to address the problem in various ways, it appears
that the only reliable solution for now (without starting to turn many
lists to mt_lists) is to have the takeover() function handle the buf_wait
detection or unregistration itself:

  - when doing a regular takeover aiming at finding an idle connection
    for a new request, connections that are blocked in a buffer_wait
    queue are quite rare and not interesting at all (since not immediately
    usable), so skipping them is sufficient. For this we detect that the
    desired connection belongs to a buffer_wait list by checking its
    buf_wait.list element. Note that this check is *not* thread-safe! The
    LIST_DEL_INIT() is performed by __offer_buffers() after the callback
    was called. But this is sufficient as it is now because the only way
    for the element to be seen as not in a list is after the element was
    last touched by __offer_buffers(), so the situation for this connection
    will not change in a different way later.

  - when doing a server delete, we're running under thread isolation.
    The connection might get taken over to be killed. The only trick is
    that private connections not belonging to any idle list may also
    experience this, and in this case even the idle_conns lock will not
    offer any protection against anything. But since we're run under
    thread isolation, we're certain not to compete with the other thread,
    so it's safe to directly unregister the connection from its owner
    thread. Normally this is already handled by conn_release() in
    cli_parse_delete_server(), which calls mux->destroy(), but this would
    actually update the current thread's queue instead of the origin
    thread's, thus we do need to perform an explicit dequeue before
    completing the takeover.

With this, the problem now looks solved for HTTP/1, HTTP/2 and FCGI,
though extensive tests were essentially run on HTTP/1 and HTTP/2.
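
As a rough illustration of the skip logic in the first case (the structures
below are reduced stand-ins, not HAProxy's real list or connection types):

  #include <stdio.h>

  /* reduced stand-ins for HAProxy's struct list / struct buffer_wait */
  struct list { struct list *n, *p; };
  struct buffer_wait { struct list list; };
  struct connection { struct buffer_wait buf_wait; };

  /* a list element points to itself when it is not queued anywhere */
  static int in_buf_wait_queue(const struct connection *conn)
  {
      return conn->buf_wait.list.n != &conn->buf_wait.list;
  }

  int main(void)
  {
      /* initialized as "not queued": the element points to itself */
      struct connection conn = { { { &conn.buf_wait.list, &conn.buf_wait.list } } };

      /* takeover-style decision: a connection still waiting for a buffer is
       * simply skipped since it is not immediately usable anyway */
      printf(in_buf_wait_queue(&conn) ? "skip it\n" : "takeover ok\n");
      return 0;
  }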

While the problem has been there for a very long time, there should be
no reason to backport it since buffer_wait didn't practically work
before 3.0-dev and the process used to freeze hard very quickly before
we'd even have a chance to meet that race.
2024-05-15 19:37:12 +02:00
Willy Tarreau
edb99e296d BUG/MINOR: ssl_sock: fix xprt_set_used() to properly clear the TASK_F_USR1 bit
In 2.4-dev8 with commit 5c7086f6b0 ("MEDIUM: connection: protect idle
conn lists with locks"), the idle conns list started to be protected
using the lock for takeover, and the SSL layer used to always take
that lock. Later in 2.4-dev11, with commit 4149168255 ("MEDIUM: ssl:
implement xprt_set_used and xprt_set_idle to relax context checks"), we
decided to relax this lock using TASK_F_USR1 just as is done in muxes.

However the xprt_set_used() call, which is supposed to clear the flag,
visibly suffered from a copy-paste error and kept the OR operation instead of
the AND, resulting in the flag never being released, so that SSL on the
backend continues to take the lock on each and every I/O access even when
the connection is not idle.
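
A minimal sketch of the one-character difference (the flag value is
illustrative, not the real TASK_F_USR1 definition):

  #include <stdio.h>

  #define TASK_F_USR1  0x0010   /* illustrative value, not the real one */

  int main(void)
  {
      unsigned int state = TASK_F_USR1;

      /* buggy copy-paste: OR keeps the bit set forever */
      /* state |= TASK_F_USR1; */

      /* intended operation: AND with the complement clears the bit */
      state &= ~TASK_F_USR1;

      printf("state=%#x\n", state);   /* prints 0 once the flag is cleared */
      return 0;
  }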

The effect is only a reduced performance. This could be backported, but
given the non-zero risk of triggering another bug somewhere, it would
be prudent to wait for this fix to be sufficiently tested in new
versions first.
2024-05-15 19:37:12 +02:00
Amaury Denoyelle
86aafd0236 BUG/MINOR: qpack: fix error code reported on QPACK decoding failure
qpack_decode_fs() is used to decode a QPACK field section during HTTP/3
headers parsing. Its return value is incoherent as it returns either
QPACK_DECOMPRESSION_FAILED, defined in RFC 9204, or other internal
values defined in qpack-dec.h. On failure, such a return code is reused by
the HTTP/3 layer to be reported via a CONNECTION_CLOSE frame. This is
incorrect if an internal error value was reported, as it is not defined
by any specification.

Fix return values of qpack_decode_fs() in two ways. Firstly, replace invalid
usages of QPACK_DECOMPRESSION_FAILED when the decoded content is too large
with the correct internal error QPACK_ERR_TOO_LARGE.

Secondly, adjust the qpack_decode_fs() API to only return internal code
values. A new internal enum QPACK_ERR_DECOMP is defined to replace
QPACK_DECOMPRESSION_FAILED. The caller is responsible for converting it to a
suitable error value. For other internal values, H3_INTERNAL_ERROR is
used. This is done through a set of conversion functions.
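
As an illustration of the idea (a minimal sketch with stand-in names, not the
actual helpers from qpack-dec.c; only the wire values come from the RFCs):

  #include <stdio.h>

  /* stand-in internal decoder results (the real identifiers live in qpack-dec.h) */
  enum qpack_ret { QPACK_RET_OK, QPACK_RET_DECOMP, QPACK_RET_TOO_LARGE };

  #define QPACK_DECOMPRESSION_FAILED  0x0200   /* RFC 9204 wire error code */
  #define H3_INTERNAL_ERROR           0x0102   /* RFC 9114 wire error code */

  /* conversion done by the caller: only a genuine decompression failure maps
   * to the RFC-defined code, every other internal value is reported to the
   * peer as an internal error */
  static int qpack_ret_to_wire(enum qpack_ret ret)
  {
      if (ret == QPACK_RET_DECOMP)
          return QPACK_DECOMPRESSION_FAILED;
      return H3_INTERNAL_ERROR;
  }

  int main(void)
  {
      printf("%#x\n", qpack_ret_to_wire(QPACK_RET_TOO_LARGE));  /* 0x102 */
      return 0;
  }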

This should be backported up to 2.6. Note that trailers are not
supported in 2.6 so chunk related to h3_trailers_to_htx() can be safely
skipped.
2024-05-15 16:07:15 +02:00
Amaury Denoyelle
4295dd21bd BUG/MINOR: mux-quic: fix error code on shutdown for non HTTP/3
qcc_shutdown() is called whenever the connection must be closed. If the
application protocol defined its own shutdown callback, it is invoked
to use the correct error code. Otherwise, the transport error code NO_ERROR
is used.

A bug occurs in the latter case as NO_ERROR is used with quic_err_app(),
which is reserved for application error codes. This will trigger the
emission of a CONNECTION_CLOSE of type 0x1d (Application) instead of
0x1c (Transport).

This bug is considered minor as it does not impact QUIC with HTTP/3. It
may only be visible when using experimental HTTP/0.9 protocol.

This should be backported up to 2.6. For 2.6, the patch must be completely
rewritten due to code differences. Here is the change to apply:

  diff --git a/src/mux_quic.c b/src/mux_quic.c
  index 26fb70ddf..c48f82e27 100644
  --- a/src/mux_quic.c
  +++ b/src/mux_quic.c
  @@ -1918,7 +1918,9 @@ static void qc_release(struct qcc *qcc)
                          qc_send(qcc);
                  }
                  else {
  -                       qcc_emit_cc_app(qcc, QC_ERR_NO_ERROR, 0);
  +                       /* Duplicate from qcc_emit_cc_app() for Transport error code. */
  +                       if (!(qcc->conn->handle.qc->flags & QUIC_FL_CONN_IMMEDIATE_CLOSE))
  +                               qcc->conn->handle.qc->err = quic_err_transport(QC_ERR_NO_ERROR);
                  }
          }
2024-05-15 16:03:01 +02:00
Amaury Denoyelle
412f1eeb89 BUG/MEDIUM: server: clear purgeable conns before server deletion
Since the following commit, idle connections are cleared before a server
is deleted. This is better than blocking server deletion due to inactive
connections:

  6e0afb2e27
  MEDIUM: server: close idle conn on server deletion

A BUG_ON() has been added to ensure that the server idle conn counter is null
after these connections are removed. However, Willy managed to trigger
it easily by repeatedly and randomly deleting servers across a
single-thread haproxy using a server-template with 1000 instances. In
parallel, an h1load client is executed to generate traffic.

This BUG_ON() reflected that some connections referencing the server
targeted for deletion remained, even though the server idle list was empty.
In fact, this is caused by connections scheduled for purging. These
connections are moved from the server idle list to a global toremove_list
while still being accounted for by the server.

A first approach could be to decrement the server idle counter while moving
a connection to the purge list. However, this is functionally incorrect as
these purgeable connections still reference the server and it could
cause a crash if cleared after it.

The correct fix for this issue is simply to remove every purgeable
connection before a server is deleted. This patch implements it by
extending cli_parse_delete_server(). It could be enough to only
remove connections targeting the deleted server, but as these
connections will be purged anyway, it is justified to clear the whole
list.

This must not be backported, unless the above-mentioned patch is.
2024-05-15 15:01:55 +02:00
Aurelien DARRAGON
231d3d32be MEDIUM: hlua: take nbthread into account in hlua_get_nb_instruction()
Based on Willy's idea (from 3.0-dev6 announcement message): in this patch
we try to reduce the max latency that can be caused by running lua scripts
with default settings.

Indeed, by default, the hlua engine is allowed to process up to 10k
instructions per batch. While this value was found to be the optimal one
for a single thread, it turns out that keeping a thread busy for 10k lua
instructions could increase thread contention. This is especially true
when the script is loaded with 'lua-load', because in that case the
current thread owns the main lua lock and prevents other threads from
making any progress if they're also waiting on the main lock.

Thanks to Thierry Fournier's work, we know that we can reach optimal
performance by sticking between 500 and 10k instructions per batch. Given
that, when the script is loaded using 'lua-load' and no
"tune.lua.forced-yield" was set by the user, we automatically divide the
default value (10K) by the number of threads haproxy can use, to reduce
thread contention (given that all threads could compete for the main lua
lock). However, we make sure not to return a value below 500, because
Thierry's work showed that this would come with a significant performance
loss.
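
A hedged sketch of the resulting computation (names and structure are
simplified stand-ins, not the actual hlua.c getter):

  /* Illustrative only: divide the default budget by the number of threads
   * when the script runs under the shared 'lua-load' lock, but never go
   * below the 500-instruction floor measured by Thierry Fournier.       */
  #include <stdio.h>

  #define HLUA_DEFAULT_INSTR  10000   /* default instructions per batch    */
  #define HLUA_MIN_INSTR        500   /* below this, performance degrades  */

  static unsigned int nb_instructions(unsigned int forced_yield, unsigned int nbthread)
  {
      unsigned int n;

      if (forced_yield)                 /* user explicitly set the tunable */
          return forced_yield;
      n = HLUA_DEFAULT_INSTR / nbthread;
      return n < HLUA_MIN_INSTR ? HLUA_MIN_INSTR : n;
  }

  int main(void)
  {
      printf("%u\n", nb_instructions(0, 8));    /* 1250 on 8 threads  */
      printf("%u\n", nb_instructions(0, 64));   /* clamped to 500     */
      return 0;
  }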

The historical behavior may still be enforced by setting
"tune.lua.forced-yield" to 10000 in the global config section.
2024-05-15 11:59:44 +02:00
Aurelien DARRAGON
e60d9dddf8 MINOR: hlua: add hlua_nb_instruction getter
No functional behavior change, but this will ease the work of dynamically
computing hlua_nb_instruction value depending on various inputs.
2024-05-15 11:59:37 +02:00
Tim Duesterhus
6610f656ea DOC: Update UUID references to RFC 9562
When support for UUIDv7 was added in commit
aab6477b67
the specification was still a draft.

It has since been published as RFC 9562.

This patch updates all UUID references from the obsoleted RFC 4122 and the
draft for RFC 9562 to the published RFC 9562.
2024-05-15 11:40:08 +02:00
William Manley
366b722f7e MINOR: rhttp: Don't require SSL when attach-srv name parsing
An attach-srv config line usually looks like this:
    tcp-request session attach-srv be/srv name ssl_c_s_dn(CN)

while a rhttp server line usually looks like this:
    server srv rhttp@ sni req.hdr(host)

The server sni argument is used as a key for looking up connections in
the connection pool. The attach-srv name argument is used as a key for
inserting connections into the pool. For it to work correctly they must
match. There was a check that either both the attach-srv and server
provide that key or neither does.

It also checked that SSL and SNI were activated on the server. However,
thanks to the current connect_server() implementation, it appears that SNI
is usable even without SSL to identify a connection in the pool. Thus,
it can be diverted from its original intent in the reverse HTTP case to
serve even without SSL activated. For example, this could be useful to
use `fc_pp_unique_id` as a name expression (DISCLAIMER: note that for
now the PROXY protocol is not compatible with rhttp).

An error is still reported if either SNI or name is used without the other.
This patch adjusts the message to a more helpful one.

Arguably it would be easier to understand if instead of using `name` and
`sni` for `attach-srv` and `server` rules it used the same term in both
places - like "conn-pool-key" or something. That would make it clear
that the two must match.
2024-05-14 16:39:07 +02:00
Aurelien DARRAGON
32f0cd3242 BUG/MINOR: log: smp_rgs array issues with inherited global log directives
When a log directive is defined in the global section, each time we use
"log global" in a proxy section, the global log directives are duplicated
for the current proxy. This works by creating a new proxy logger struct
and duplicating every member for each global one.

However, smp_rgs logger member is a special pointer member that is
allocated when "range" is used on a log directive. Currently, we simply
copy the array pointer (from the global one), instead of creating our own
copy. Because of that, range log sampling may not work properly in some
situations prior to 3f1284560 ("MINOR: log: remove the unused curr_idx in
struct smp_log_range") when used in global log directives, for instance:

  global
    log 127.0.0.1:5114 format raw sample 1-2,3:4 local0 info # should receive 75% of all proxy logs
    log 127.0.0.1:5115 format raw sample 4:4 local0 info     # should receive 25% of all proxy logs

  listen proxy1
    log global

  listen proxy2
    log global

This may not work as expected, because curr_idx was stored within the
smp_rgs array member prior to 3f1284560, and due to this bug, it happens to
be shared between every log directive inherited from a "global" one. The
result is that the curr_idx counter will not behave properly because the
index will be increased globally instead of per log directive, and it could
even suffer from concurrent thread accesses under load since we don't own
the global log directive's lock when manipulating it.

Another issue that was revealed because of this bug is that the smp_rgs
array allocated during config parsing is never freed in free_logger(),
resulting in a small memory leak on clean exit.

To fix these issues all at once, let's properly duplicate the smp_rgs logger
struct member in dup_logger() like we already do for other special members,
so that every log directive has its own smp_rgs copy, and then
systematically free it in free_logger().
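
A simplified sketch of the deep copy (struct members are reduced stand-ins,
not the real logger/smp_log_range definitions):

  #include <stdlib.h>
  #include <string.h>

  struct smp_log_range { unsigned int low, high, sz; };   /* reduced stand-in */
  struct logger {
      struct smp_log_range *smp_rgs;   /* allocated when "range" is used     */
      size_t nb_rgs;
  };

  /* instead of copying the smp_rgs pointer, give the duplicated logger its
   * own array so that per-directive state is never shared across proxies    */
  static int dup_smp_rgs(struct logger *dst, const struct logger *src)
  {
      if (!src->smp_rgs)
          return 0;
      dst->smp_rgs = malloc(src->nb_rgs * sizeof(*dst->smp_rgs));
      if (!dst->smp_rgs)
          return -1;
      memcpy(dst->smp_rgs, src->smp_rgs, src->nb_rgs * sizeof(*dst->smp_rgs));
      dst->nb_rgs = src->nb_rgs;
      return 0;
  }

  int main(void)
  {
      struct smp_log_range rgs[1] = { { 1, 2, 4 } };
      struct logger global_log = { rgs, 1 };
      struct logger proxy_log  = { NULL, 0 };

      if (dup_smp_rgs(&proxy_log, &global_log) == 0)
          free(proxy_log.smp_rgs);   /* free_logger() must release the copy too */
      return 0;
  }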

While this bug affects all stable versions (including 2.4), it's probably
best to not backport this beyond 2.6 because of 211ea252d
("BUG/MINOR: logs: fix logsrv leaks on clean exit") prerequisite that
first appears in 2.6.

[ada: for versions prior to 2.9, 969e212
 ("MINOR: log: add dup_logsrv() helper function") and 76acde91
 ("BUG/MINOR: log: keep the ref in dup_logger()") must be backported
 first.
 Note: Some ctx adjustments should be performed because 'logger' struct
 used to be named 'logsrv' in the past and 2.9 introduced logger target
 struct member. Thus it's probably easier to manually apply 76acde91 and
 the current bugfix by hand directly on top of 969e212.
]
2024-05-14 12:00:23 +02:00
Aurelien DARRAGON
9d4a44e713 BUG/MINOR: log: fix leak in add_sample_to_logformat_list() error path
If add_sample_to_logformat_list() fails to allocate a new logformat_node,
then we directly jump to the error_free label to clean up the node using
free_logformat_node() before returning an error.

However, if the node failed to allocate, then the sample expression that
was allocated just before (not yet assigned) isn't released
(free_logformat_node() is a no-op when NULL is provided). Thus, if expr
wasn't assigned to the node before the early failure, it must be manually
released.
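
A self-contained sketch of the fixed error path (allocators and struct names
are stand-ins, not the actual log.c code):

  #include <stdlib.h>

  struct sample_expr { int dummy; };                     /* stand-in */
  struct logformat_node { struct sample_expr *expr; };   /* stand-in */

  static struct sample_expr *parse_expr(void) { return calloc(1, sizeof(struct sample_expr)); }
  static void release_expr(struct sample_expr *e) { free(e); }

  /* pattern of the fix: if the node allocation fails, the freshly parsed
   * expression has no owner yet, so it must be released explicitly instead
   * of relying on the node cleanup path, which is a no-op on NULL          */
  static int add_sample_node(void)
  {
      struct sample_expr *expr = parse_expr();
      struct logformat_node *node;

      if (!expr)
          return -1;
      node = calloc(1, sizeof(*node));
      if (!node) {
          release_expr(expr);   /* was leaked before the fix */
          return -1;
      }
      node->expr = expr;        /* from here the node owns the expression */

      release_expr(expr);       /* cleanup for this standalone sketch */
      free(node);
      return 0;
  }

  int main(void)
  {
      return add_sample_node();
  }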

This bug was introduced by 2462e5bcc ("BUG/MINOR: log: fix potential
lf->name memory leak") which wasn't marked for backports. It only
affects 3.0.
2024-05-13 16:44:27 +02:00
Willy Tarreau
0ce51dc93b MEDIUM: dynbuf: implement emergency buffers
The buffer reserve set by tune.buffers.reserve has long been unused, and
in order to deal gracefully with failed memory allocations we'll need to
resort to a few emergency buffers that are pre-allocated per thread.

These buffers are only for emergency use, so every time their count is
below the configured number a b_free() will refill them. For this reason
their count can remain pretty low. We changed the default number from 2
to 4 per thread, and the minimum value is now zero (e.g. for low-memory
systems). The tune.buffers.limit setting has always been a problem when
trying to deal with the reserve but now we could simplify it by simply
pushing the limit (if set) to match the reserve. That was already done in
the past with a static value, but now with threads it was a bit trickier,
which is why the per-thread allocators increment the limit on the fly
before allocating their own buffers. This also means that the configured
limit is saner and now corresponds to the regular buffers that can be
allocated on top of emergency buffers.
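
A rough per-thread sketch of the refill-on-free idea (sizes and names are
illustrative, not the actual dynbuf.c code):

  #include <stdlib.h>

  #define EMERGENCY_BUFS  4          /* new default: 4 per thread          */
  #define BUF_SIZE        16384      /* illustrative buffer size           */

  static void *emergency[EMERGENCY_BUFS];
  static int   emergency_cnt;

  /* on release, top up the emergency pool first, only then really free */
  static void sketch_b_free(void *area)
  {
      if (emergency_cnt < EMERGENCY_BUFS) {
          emergency[emergency_cnt++] = area;
          return;
      }
      free(area);
  }

  int main(void)
  {
      void *b = malloc(BUF_SIZE);

      if (b)
          sketch_b_free(b);          /* the first frees refill the reserve */
      return 0;
  }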

At the moment these emergency buffers are not used upon allocation
failure. The only reason is to ease bisecting later if needed, since
this commit only has to deal with resource management.
2024-05-10 17:18:13 +02:00