For some reason the image built in the HAProxy workflow is "private": it
is successfully built, but fails to pull. Let's try an explicit docker
login for the run job as well.
This is the second attempt at importing the updated mt_list code (commit
59459ea3). The previous one was attempted with commit c618ed5ff4 ("MAJOR:
import: update mt_list to support exponential back-off") but revealed
problems with QUIC connections and was reverted.
The problem encountered was that elements deleted inside an iterator
were no longer reset, so that if they were recycled in this state, they
could appear as busy to the next user. This was trivially reproduced
with this:
  $ cat quic-repro.cfg
  global
        stats socket /tmp/sock1 level admin
        stats timeout 1h
        limited-quic

  frontend stats
        mode http
        bind quic4@:8443 ssl crt rsa+dh2048.pem alpn h3
        timeout client 5s
        stats uri /

  $ ./haproxy -db -f quic-repro.cfg &
  $ h2load -c 10 -n 100000 --npn h3 https://127.0.0.1:8443/
  => hang
This was purely an API issue caused by the simplified usage of the macros
for the iterator. The original version had two backups (one full element
and one pointer) that the user had to take care of, while the new one only
uses one that is transparent for the user. But during removal, the element
still has to be unlocked if it's going to be reused.
All of this sparked discussions with Fred and Aurélien regarding the still
unclear state of locking. It was found that the lock API does too much at
once and lacks granularity. The new version offers much more fine-grained
control, allowing one to selectively lock/unlock an element, a link, the
rest of the list, etc.
It was also found that plenty of places just want to free the current
element, or delete it without doing anything else with it, hence don't
need to reset its pointers (e.g. event_hdl). Finally it appeared obvious
that the root cause of the problem was the unclear usage of the list
iterators themselves, because one does not necessarily expect the element
to be presented locked when not needed, which makes the unlock easy to
overlook during reviews.
The updated version of the list presents explicit lock status in the
macro name (_LOCKED or _UNLOCKED suffixes). When using the _LOCKED
suffix, the caller is expected to unlock the element if it intends to
reuse it. At least the status is advertised. The _UNLOCKED variant,
instead, always unlocks it before starting the loop block. This means
it's not necessary to think about unlocking it, though it's obviously
not usable with everything. A few _UNLOCKED were used at obvious places
(i.e. where the element is deleted and freed without any prior check).
Interestingly, the tests performed last year on QUIC forwarding, which
resulted in limited traffic with the original version and a higher bit
rate with the new one, couldn't be reproduced because the QUIC stack has
since gained in efficiency, and the 100 Gbps barrier is now reached with
or without the mt_list update. However, the unit tests definitely show a
huge difference, particularly on EPYC platforms where the EBO provides
tremendous CPU savings.
Overall, the following changes are visible from the application code:
  - mt_list_for_each_entry_safe() + 1 back elem + 1 back ptr
    => MT_LIST_FOR_EACH_ENTRY_LOCKED() or MT_LIST_FOR_EACH_ENTRY_UNLOCKED()
       + 1 back elem

  - MT_LIST_DELETE_SAFE() no longer needed in MT_LIST_FOR_EACH_ENTRY_UNLOCKED()
    => just set the iterator to NULL manually instead.
    For MT_LIST_FOR_EACH_ENTRY_LOCKED()
    => mt_list_unlock_self() (if the element is going to be reused) + NULL

  - MT_LIST_LOCK_ELT   => mt_list_lock_full()
  - MT_LIST_UNLOCK_ELT => mt_list_unlock_full()

  - l = MT_LIST_APPEND_LOCKED(h, e); MT_LIST_UNLOCK_ELT();
    => l = mt_list_lock_prev(h); mt_list_lock_elem(e); mt_list_unlock_full(e, l)
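Below is a minimal, non-authoritative usage sketch of the two new
iterators. The element type, list head and include path are made up for
the example, and the macro argument order is assumed from the old
iterator:

    #include <stdlib.h>
    #include <haproxy/list.h>       /* mt_list API (include path assumed) */

    struct item {
            struct mt_list list;
            int done;
    };

    static struct mt_list head = MT_LIST_HEAD_INIT(head);

    void purge_done(void)
    {
            struct item *it;
            struct mt_list back;    /* the single backup element */

            /* _UNLOCKED: the element is presented already unlocked; to
             * drop it, just free it and set the iterator to NULL, no
             * MT_LIST_DELETE_SAFE() anymore.
             */
            MT_LIST_FOR_EACH_ENTRY_UNLOCKED(it, &head, list, back) {
                    if (it->done) {
                            free(it);
                            it = NULL;
                    }
            }
    }

    void requeue_done(struct mt_list *other)
    {
            struct item *it;
            struct mt_list back;

            /* _LOCKED: the element is presented locked; it must be
             * unlocked explicitly before being reused (e.g. appended
             * elsewhere).
             */
            MT_LIST_FOR_EACH_ENTRY_LOCKED(it, &head, list, back) {
                    if (it->done) {
                            mt_list_unlock_self(&it->list);
                            MT_LIST_APPEND(other, &it->list);
                            it = NULL;
                    }
            }
    }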
On an AMD EPYC 3rd gen, 20% of the CPU is spent calculating the amount
of pools needed when using QUIC, because pool allocations/releases are
quite frequent and the inter-CCX communication is super slow. Still,
there's a way to save between 0.5 and 1% CPU by using fetch-add and
sub-fetch that are converted to XADD so that the result is directly
fed into the swrate_add argument without having to re-read the memory
area. That's what this patch does.
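For illustration, here is a hedged sketch of the idea using GCC atomic
builtins (counter names and the averaging constant are placeholders, and
the swrate_add() signature is assumed): the add-and-fetch form compiles
to XADD and returns the updated value in a register, so it can be fed
straight to the averaging function instead of re-reading the shared
counter:

    #define AVG_SAMPLES 1024    /* placeholder sample count */

    /* signature assumed from the existing swrate_add() helper */
    unsigned int swrate_add(unsigned int *sum, unsigned int n, unsigned int v);

    static unsigned int pool_used;          /* shared counter (placeholder) */
    static unsigned int pool_needed_avg;    /* sliding average (placeholder) */

    void account_alloc_before(void)
    {
            /* before: write the shared counter, then re-read it, which
             * costs an extra access to the contended cache line
             */
            __atomic_fetch_add(&pool_used, 1, __ATOMIC_RELAXED);
            swrate_add(&pool_needed_avg, AVG_SAMPLES,
                       __atomic_load_n(&pool_used, __ATOMIC_RELAXED));
    }

    void account_alloc_after(void)
    {
            /* after: the returned value is used directly, no re-read of
             * the memory area
             */
            unsigned int used = __atomic_add_fetch(&pool_used, 1, __ATOMIC_RELAXED);
            swrate_add(&pool_needed_avg, AVG_SAMPLES, used);
    }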
The ssl_c_san sample fetch returns the list of Subject Alt Names
presented by the client certificate.
The format is the same as with the "openssl x509 -text" command: a
"Description: Value" list separated by commas.
The format is directly generated by the GENERAL_NAME_print() openssl
function.
https://github.com/openssl/openssl/blob/openssl-3.0/crypto/x509/v3_san.c#L207
Example:
IP Address:127.0.0.1, IP Address:127.0.0.2, IP Address:127.0.0.3, URI:http://docs.haproxy.org/2.7/, DNS:ca.tests.haproxy.com
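A hedged usage example (the header name is arbitrary), forwarding the
client's SANs to the backend only when a client certificate was used:

    http-request set-header X-Client-SAN %[ssl_c_san] if { ssl_c_used }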
Initialize the alg variable in sample_conv_jwt_verify_check() to
JWT_ALG_DEFAULT.
This was reported by coverity in #2630, but since you need to use the
first argument to use the 2nd, this has no real impact.
Must be backported with 883f1bd (as far as 2.6).
This commit fixes 41275a691 ("MEDIUM: init: set default for fd_hard_limit via
DEFAULT_MAXFD").
fd_hard_limit is taken into account implicitly via the 'ideal_maxconn'
value in all maxconn adjustments, when global.rlimit_memmax is set:
MIN(global.maxconn (capped by global.rlimit_memmax), ideal_maxconn);
It also caps the provided global.rlimit_nofile, if it couldn't be set as
the current process fd limit (see more details in the main() code).
So, let's set the default value for fd_hard_limit only when no other
haproxy-specific limit is provided, i.e. rlimit_memmax, maxconn,
rlimit_nofile. Otherwise we may break users' configs.
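A hedged sketch of the resulting condition (variable names taken from the
global struct, the exact test is assumed):

    /* apply the compile-time default only when the user provided no
     * related limit at all
     */
    if (!global.fd_hard_limit && !global.rlimit_memmax &&
        !global.maxconn && !global.rlimit_nofile)
            global.fd_hard_limit = DEFAULT_MAXFD;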
Please note that in master-worker mode the master does not need
DEFAULT_MAXFD (1048576) either, as we explicitly limit its maxconn to 100.
Must be backported in all stable versions until v2.6.0, including v2.6.0,
like the commit above.
The SSL status is reported via a trace each time quic_conn_io_cb()
finishes. Change the trace label from "ssl error" to "ssl status". This
makes it easier to search for errors without being distracted by this
trace.
For a given peer, the synchronization of the learn state is no longer
performed in the peer appctx. It is delayed to be handled by the peers
sync task. This means that, for a given peer, the learning may be
finished but only handled after the appctx release. So the
synchronization may happen on a peer without an appctx.
This case was not tested and an unconditional wakeup of the appctx could
lead to a crash because of a NULL dereference. It may be experienced by
running the reg-tests/peers/tls_basic_sync.vtc script in a loop. The fix
is obvious: in sync_peer_learn_state(), we must not wake up the appctx
if it was already released.
This patch should fix issue #2629. It must be backported to 3.0.
The handshake for quic_conn instances runs on a single, non-chosen
thread. On completion, listener_accept() is performed to select the
least loaded thread before initializing the connection instance. As
such, the quic_conn instance is migrated to that thread along with its
upper connection.
In case the accept queue is full, listener_accept() falls back to local
accept mode, which causes the connection to be assigned to the current
thread. However, this is not supported by QUIC as the quic_conn instance
is left on the previously selected thread. In most cases, this will
cause a BUG_ON() due to a task manipulation from an outside thread.
To fix this, handle quic_conn thread rebind in multiple steps using the
new extended protocol API. Several operations have been moved from
qc_set_tid_affinity1() to newly defined qc_set_tid_affinity2(), in
particular CID TID update. This ensures that quic_conn instance is not
prematurely accessed on the new thread until accept queue push is
guaranteed to succeed.
qc_reset_tid_affinity() is also newly defined to reassign the newly
created tasks and tasklets to the current thread. This is necessary to
prevent the BUG_ON() crash described above.
This must be backported up to 2.8 after a period of observation. Note
that it depends on the following patches:
MINOR: proto: extend connection thread rebind API
MINOR: listener: define callback for accept queue push
Extend the connection thread rebind API by replacing the single
set_affinity callback with three different ones. Each of them is used at
a different stage of the operation (see the sketch after the list):
* set_affinity1 is used similarly to the previous set_affinity
* set_affinity2 is called directly from accept_queue_push_mp() when an
  entry has been found in the accept ring. This operation cannot fail.
* reset_affinity is called after set_affinity1 in case of failure from
  accept_queue_push_mp() due to no space left in the accept ring. This
  is necessary for protocols which must reconfigure resources before
  falling back to the current tid.
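A hedged sketch of the three callbacks (struct name and signatures are
assumed for illustration, not haproxy's actual definition):

    struct connection;

    struct rebind_ops {
            /* step 1: prepare the rebind on the original thread (may fail) */
            int  (*set_affinity1)(struct connection *conn, int new_tid);
            /* step 2: finalize, called once the accept ring push is
             * guaranteed to succeed; cannot fail
             */
            void (*set_affinity2)(struct connection *conn);
            /* rollback of step 1 when the accept ring is full and the
             * connection falls back to the current tid
             */
            void (*reset_affinity)(struct connection *conn);
    };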
This patch does not introduce any functional change. However, it will be
required to fix crashes for QUIC connections when the accept queue ring
is full. As such, it must be backported with that fix.
Let's update the maxconn keyword description in order to make it clear
which setting takes precedence over global.maxconn and SYSTEM_MAXCONN,
if set.
Let's provide a default value for fd_hard_limit, if it's not set in the
configuration. With this patch a specific default can also be set via
the compile-time variable DEFAULT_MAXFD. Hopefully this will be helpful
for haproxy package maintainers.
make -j 8 TARGET=linux-glibc DEBUG=-DDEFAULT_MAXFD=50000
If haproxy is compiled without DEFAULT_MAXFD defined, the default will
be set to 1048576.
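A hedged sketch of the compile-time guard (exact header location
assumed); a build-time -D value simply overrides the default:

    #ifndef DEFAULT_MAXFD
    #define DEFAULT_MAXFD 1048576
    #endif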
This is done to avoid the process being killed by its watchdog when it
was started without any limit in its configuration or on the command
line and the hard RLIMIT_NOFILE is extremely huge (~1000000000). In this
case compute_ideal_maxconn() is used to calculate maxconn and maxsock;
maxsock defines the size of the internal fdtab, which becomes very large
as well. When the process starts to simply loop over this fdtab (O(n)),
this takes a lot of time, so the watchdog does its job.
To avoid this, maxconn is now always reduced to some reasonable value,
either by an explicit global.fd-hard-limit from the configuration, or by
its default. The default may be changed at build time and then
overridden by global.fd-hard-limit at runtime. An explicit
global.fd-hard-limit from the configuration always takes precedence over
DEFAULT_MAXFD, if set.
Must be backported in all stable versions until v2.6.0, including v2.6.0.
On accept, the quic_conn instance is migrated from its original thread
to a new one. This operation is conducted in two steps, first on the
original thread and then on the new one. During this interval, the
quic_conn is artificially rendered inactive. It must never be accessed
nor removed until the migration is completed via
qc_finalize_affinity_rebind(). This new BUG_ON() will enforce that
removal is never conducted until the migration is completed.
QUIC datagram dispatch is an error-prone operation as it must always
ensure the correct thread is used before accessing the recipient
quic_conn instance. Strengthen this code part by adding two BUG_ON_HOT()
to ensure thread safety.
Previous commit removed access/manipulation to QUIC CID global tree
outside of quic_cid module. This ensures that proper locking is always
performed.
This commit finalizes this cleanup by making the CID global tree static,
private to the quic_cid source file. Initialization of this tree is
removed from proto_quic and now performed by a dedicated initcall,
quic_alloc_global_cid_tree().
As a side change, complete CID global tree documentation, in particular
to explain CID global tree artificial splitting and ODCID handling.
Overall, the code is now clearer and safer.
haproxy generates a set of CIDs for each QUIC connection. The peer must
reuse them as DCID for its emitted packets. On datagram reception, the
DCID field serves as identifier to dispatch them to their correct
thread.
These CIDs are stored in a global CID tree. Access to this data
structure must always be protected with CID_LOCK. This commit is a
refactoring to regroup all CID tree accesses in the quic_cid module.
Several code parts are adjusted:
* quic_cid_insert() is extended to check for insertion race-condition.
This is useful on quic_conn instantiation. Code where such race cannot
happen can use unsafe _quic_cid_insert() instead.
* on RETIRE_CONNECTION_ID frame reception, existing quic_cid_delete()
function is used.
* remove tree lookup from qc_check_dcid(), extracted in the new
quic_cmp_cid_conn() function. Ultimately, the latter should be removed
as CID lookup could be conducted on quic_conn owned tree without
locking.
When trying to use an HMAC algorithm (HS256, HS384, HS512), the
sample_conv_jwt_verify_check() function of the converter tries to load a
file, even though the argument is only supposed to contain a secret
rather than a path.
When using lua, the check function is called at runtime, so it even
tries to load the file at each call... This fixes the issue for HMAC
algorithms but this is still a problem with the other algorithms, since
we don't have a way of pre-loading files before the call.
Another solution must be found to prevent disk IO with lua using other
algorithms.
Must be backported as far as 2.6.
The following patch fixed a race condition during server addr/port
updates:
cd994407a9
BUG/MAJOR: server/addr: fix a race during server addr:svc_port updates
The new update mechanism is implemented via an event update. It uses
thread isolation to guarantee that no other thread is accessing the
server addr/port. Furthermore, to ensure the server instance is not
deleted just before the event handler runs, the server instance is
looked up via its ID in the proxy tree.
However, thread isolation is only entered after the server lookup. This
leaves a tiny race condition, as the thread will be marked as harmless
and a concurrent thread can delete the server in the meantime. This
causes server_atomic_sync() to manipulate a deleted server instance to
reinsert it in the used_server_addr backend tree. This can cause a
segfault during this operation, or possibly on a future used_server_addr
tree access.
This issue was detected by Criteo. Several backtraces were retrieved,
each related to a server addr_node insert or delete operation, either in
srv_set_addr_desc() or in the add/delete dynamic server handlers.
To fix this, simply extend the thread isolation section so that it
starts before the server lookup. This ensures that, once retrieved, the
server cannot be deleted until its addr/port are updated. To ensure this
issue won't happen anymore, a new BUG_ON() is added in
srv_set_addr_desc().
Also note that ebpt_delete() is now called every time in the delete
handler, as this is a safe idempotent operation.
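A hedged sketch of the fixed ordering (the lookup and update helpers are
placeholders; only thread_isolate()/thread_release() are the real
isolation primitives):

    thread_isolate();

    srv = server_find_by_id(px, srv_id);    /* lookup name assumed */
    if (srv) {
            /* safe: no other thread can delete <srv> here */
            update_addr_port(srv, &new_addr, new_port);   /* placeholder */
    }

    thread_release();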
To reproduce these crashes, a script was executed to add then remove
different servers every second. In parallel, the following CLI command
was issued repeatedly, without any delay, to force multiple updates of
the servers' ports:
set server <srv> addr 0.0.0.0 port $((1024 + RANDOM % 1024))
This must be backported at least up to 3.0. If the above-mentioned patch
has been selected for previous versions, this commit must also be
backported to them.
In 3.0, the CLI applet was rewritten to use its own buffers. However, the
lua part, used to register CLI commands at runtime, was not updated
accordingly. This means the lua CLI commands still try to write into the
channel buffers. This is of course totally unexpected and not supported.
Because of this bug, the applet hangs instead of returning the command
result.
The registration of lua CLI commands relies on the lua TCP applets. So
the send and receive functions were fixed to use the applet's buffers
when required and to still use the channel buffers otherwise. This way,
other lua TCP applets can still run in the legacy mode, without the
applet's buffers.
This patch must be backported to 3.0.
When the support for modules was added, the function producing the #HELP
line of each metric was refactored. Since then, the prefix "#HELP
<metric-name>" is printed twice because a code block was not removed. It is
now fixed.
This patch must be backported to 3.0.
Locking of the CID tree was extended in qc_check_dcid() by recent commit
05f59a5 ("BUG/MINOR: quic: fix race condition in qc_check_dcid()") but
there was a direct return from the middle of the function which was not
covered by the unlock, resulting in the function keeping the lock on
success return.
Let's just remove this return and replace it with a variable to merge all
exit paths.
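A hedged sketch of the pattern (helper and lock names are placeholders,
not haproxy's actual ones): a single exit path guarantees the lock is
always released, even on the success path:

    static int check_dcid_sketch(const void *qc,
                                 const unsigned char *dcid, size_t len)
    {
            int ret = 0;

            cid_tree_read_lock();                    /* placeholder */
            if (dcid_matches_conn(qc, dcid, len))    /* placeholder lookup */
                    ret = 1;
            cid_tree_read_unlock();                  /* placeholder */

            return ret;    /* no early return while the lock is held */
    }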
This must be backported wherever the fix above is backported.
This issue was revealed by the chacha20 interop test, which very often
fails with ngtcp2 as the client. This was due to the fact that two
application-level packets could be coalesced into the same datagram, as
revealed by such a capture:
Frame 380: 255 bytes on wire (2040 bits), 255 bytes captured (2040 bits)
Point-to-Point Protocol
Internet Protocol Version 4, Src: 193.167.100.100, Dst: 193.167.0.100
User Datagram Protocol
QUIC IETF
QUIC Connection information
[Connection Number: 0]
[Packet Length: 187]
QUIC Short Header DCID=ec523fe99840f9c17c868a88d649147814 PKN=333
0... .... = Header Form: Short Header (0)
.1.. .... = Fixed Bit: True
..0. .... = Spin Bit: False
[...0 0... = Reserved: 0]
[.... .0.. = Key Phase Bit: False]
[.... ..00 = Packet Number Length: 1 bytes (0)]
Destination Connection ID: ec523fe99840f9c17c868a88d649147814
[Packet Number: 333]
Protected Payload […]: 43537d43a3c83e47db6891bd6a4fd7d7fa31941badcb87a540e843341d6a5e493ed4c3f6e6bbff094804ee0ab06830dc1a1bbf52ace4323d2e4f6e0bd4eea73df0721d2949d05a058d3afb974e814494ebf44d1375b0e7f1fd5bcf634cf32ef9a9b4018758a49d39a24c40
STREAM id=0 fin=0 off=294768 len=144 dir=Bidirectional origin=Client-initiated
Frame Type: STREAM (0x000000000000000e)
.... ...0 = Fin: False
.... ..1. = Len(gth): True
.... .1.. = Off(set): True
Stream ID: 0
.... .... .... .... .... .... .... .... .... .... .... .... .... .... .... ...0 = Stream initiator: Client-initiated (0)
.... .... .... .... .... .... .... .... .... .... .... .... .... .... .... ..0. = Stream direction: Bidirectional (0)
Offset: 294768
Length: 144
Stream Data […]: 63eef6ccee0d2ab602db3682d0e7cc09b72db6adc307d7699a211144b4b6c029cbed9beae1491c10a5fe0678d815a5303843d33c0593fedc9b64068fd0207e280d05aac2c0054fe9ab30857bc3669ee51d34756cfd2e098eb1ab31a03911f6a103f0a16f8f984d9861efdcf4433c
QUIC IETF
[Packet Length: 38]
QUIC Short Header DCID=ec523fe99840f9c17c868a88d649147814 PKN=334
0... .... = Header Form: Short Header (0)
.1.. .... = Fixed Bit: True
..0. .... = Spin Bit: False
[...0 0... = Reserved: 0]
[.... .0.. = Key Phase Bit: False]
[.... ..00 = Packet Number Length: 1 bytes (0)]
Destination Connection ID: ec523fe99840f9c17c868a88d649147814
[Packet Number: 334]
Protected Payload: b9c0e6dc3fc523574f8164c31b6cd156496212
PING
Frame Type: PING (0x0000000000000001)
PADDING Length: 2
Frame Type: PADDING (0x0000000000000000)
[Padding Length: 2]
On the peer side these two packets are considered as a single one,
because there may be only one packet per datagram at the application
encryption level, and this is reported as a STREAM frame encoding error:
I00000332 0xec523fe99840f9c17c868a88d649147814 con recv packet len=225
mask=b2c69c7827 sample=43a3c83e47db6891bd6a4fd7d7fa3194
I00000332 0xec523fe99840f9c17c868a88d649147814 pkt rx pkn=333 dcid=0xec523fe99840f9c17c868a88d649147814 type=1RTT k=0
I00000332 0xec523fe99840f9c17c868a88d649147814 frm rx 333 1RTT STREAM(0x0e) id=0x0 fin=0 offset=294768 len=144 uni=0
ngtcp2_conn_read_pkt: ERR_FRAME_ENCODING
I00000332 0xec523fe99840f9c17c868a88d649147814 pkt tx pkn=1531039643 dcid=0xae79dfc99d6c65d6 type=1RTT k=0
I00000332 0xec523fe99840f9c17c868a88d649147814 frm tx 1531039643 1RTT CONNECTION_CLOSE(0x1c) error_code=FRAME_ENCODING_ERROR(0x7) frame_type=0 reason_len=0 reason=[]
I00000332 0xec523fe99840f9c17c868a88d649147814 frm tx 1531039643 1RTT PADDING(0x00) len=9
Note here that the sum of the two packet sizes (from the capture) is the
same as the packet length reported by ngtcp2: 187+38 = 225. It also
seems that wireshark tries to parse as many packets as possible from the
same datagram, regardless of the QUIC protocol rules.
Haproxy traces revealed that this could happen at least when probing the peer.
The aim of the recent low-level packet building modifications was to
build as many datagrams as possible into the same buffer. But it seems
that the probing packet case handling has been broken. That said, I have
not identified the offending commit. This issue could only be reproduced
inside the interop test environment (no git bisection possible).
To fix this, rely on the <probe> variable value to identify whether the
last packet built by qc_prep_pkts() was a probing one, then try to
coalesce some other packets into the same datagram only if this was not
the case. Of course the test on the <probe> value has to be done before
setting it for the next packet.
Must be backported to 3.0.
Released version 3.1-dev2 with the following main changes :
- BUG/MINOR: log: fix broken '+bin' logformat node option
- DEBUG: hlua: distinguish burst timeout errors from exec timeout errors
- REGTESTS: ssl: fix some regtests 'feature cmd' start condition
- BUG/MEDIUM: ssl: AWS-LC + TLSv1.3 won't do ECDSA in RSA+ECDSA configuration
- MINOR: ssl: activate sigalgs feature for AWS-LC
- REGTESTS: ssl: activate new SSL reg-tests with AWS-LC
- BUG/MEDIUM: proxy: fix email-alert invalid free
- REORG: mailers: move free_email_alert() to mailers.c
- BUG/MINOR: proxy: fix email-alert leak on deinit() (2nd try)
- DOC: configuration: fix alphabetical order of bind options
- DOC: management: document ptr lookup for table commands
- BUG/MAJOR: quic: fix padding with short packets
- BUG/MAJOR: quic: do not loop on emission on closing/draining state
- MINOR: sample: date converter takes HTTP date and output an UNIX timestamp
- SCRIPTS: git-show-backports: do not truncate git-show output
- DOC: api/event_hdl: small updates, fix an example and add some precisions
- BUG/MINOR: h3: fix crash on STOP_SENDING receive after GOAWAY emission
- BUG/MINOR: mux-quic: fix crash on qcs SD alloc failure
- BUG/MINOR: h3: fix BUG_ON() crash on control stream alloc failure
- BUG/MINOR: quic: fix BUG_ON() on Tx pkt alloc failure
- DEV: flags/show-fd-to-flags: adapt to recent versions
- MINOR: capabilities: export capget and __user_cap_header_struct
- MINOR: capabilities: prepare support for version 3
- MINOR: capabilities: use _LINUX_CAPABILITY_VERSION_3
- MINOR: cli/debug: show dev: add cmdline and version
- MINOR: cli/debug: show dev: show capabilities
- MINOR: debug: print gdb hints when crashing
- BUILD: debug: also declare strlen() in __ABORT_NOW()
- BUILD: Missing inclusion header for ssize_t type
- BUG/MINOR: hlua: report proper context upon error in hlua_cli_io_handler_fct()
- MINOR: cfgparse/log: remove leftover dead code
- BUG/MEDIUM: stick-table: Decrement the ref count inside lock to kill a session
- MINOR: stick-table: Always decrement ref count before killing a session
- REORG: init: do MODE_CHECK_CONDITION logic first
- REORG: init: encapsulate CHECK_CONDITION logic in a func
- REORG: init: encapsulate 'reload' sockpair and master CLI listeners creation
- REORG: init: encapsulate code that reads cfg files
- BUG/MINOR: server: fix first server template name lookup UAF
- MINOR: activity: make the memory profiling hash size configurable at build time
- BUG/MEDIUM: server/dns: prevent DOWN/UP flap upon resolution timeout or error
- BUG/MEDIUM: h3: ensure the ":method" pseudo header is totally valid
- BUG/MEDIUM: h3: ensure the ":scheme" pseudo header is totally valid
- BUG/MEDIUM: quic: fix race-condition in quic_get_cid_tid()
- BUG/MINOR: quic: fix race condition in qc_check_dcid()
- BUG/MINOR: quic: fix race-condition on trace for CID retrieval
quic_rx_pkt_retrieve_conn() is used when parsing a datagram received on
the listener socket. It returns the quic_conn instance corresponding to
the first packet's DCID, unless it is mapped to another thread.
As expected, global CID tree access is protected by a lock in the
function. However, there is a race condition due to the final trace
where the qc instance is dereferenced outside of the lock. Fix this by
adding a new trace under lock protection and removing the qc dereference
at the end of the function.
This may fix first crash of github issue #2607.
This must be backported up to 2.8.
qc_check_dcid() is a function which checks that a DCID is associated
with the expected quic_conn instance. This is used by the quic_conn
socket receive handler, as there is a tiny risk that a datagram for
another connection was received on this socket.
As with other operations on the global CID tree, a lock must be used to
protect against race conditions. However, as in the previous commit, the
lock was not held long enough as the CID tree node is accessed outside
of the lock region. To fix this, extend the critical section until the
CID dereference is done.
The impact of this bug should be similar to the previous one. However,
the risk of crash is even lower as it should be extremely rare to
receive a datagram for another connection on a quic_conn socket. As
such, most of the time the first check condition of qc_check_dcid() is
enough.
This may fix the first crash of github issue #2607.
This must be backported up to 2.8.
haproxy generates CIDs for clients, which reuse them as DCID on their
packets. These CIDs are stored in the global tree quic_cid_trees. Each
operation on this tree must be done under lock protection.
quic_get_cid_tid() is a function which looks up a CID in the global tree
and returns the associated thread ID. This is used on datagram reception
on the listener socket, before redispatching the datagram to the correct
thread.
This function uses a lock to protect quic_cid_trees access. However, the
lock region is too small as the CID tree node is accessed outside of it.
Fix this by extending the lock protection around the CID dereference
until the thread ID is retrieved.
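A hedged sketch of the pattern (lock and lookup helpers are
placeholders): the node must be dereferenced while the tree lock is
still held, otherwise a concurrent removal can free it in between:

    int tid = -1;

    cid_tree_read_lock();                     /* placeholder */
    node = cid_tree_lookup(dcid, dcid_len);   /* placeholder */
    if (node)
            tid = node->tid;                  /* dereference under the lock */
    cid_tree_read_unlock();                   /* placeholder */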
The impact of this bug is unknown, but it may possibly cause crashes.
However, it is probably rare as most datagram reception is done on the
quic_conn socket, which does not use quic_get_cid_tid().
This may fix first crash of github issue #2607.
This must be backported up to 2.8.
Ensure the scheme pseudo-header is only constituted of valid characters
according to RFC 9110. If an invalid value is found, the request is
rejected and the stream is reset.
It's the same as for previous commit "BUG/MEDIUM: h3: ensure the
":method" pseudo header is totally valid" except that this time it
applies to the ":scheme" pseudo header.
This must be backported up to 2.6.
Ensure the method pseudo-header is only constituted of valid characters
according to RFC 9110. If an invalid value is found, the request is
rejected and the stream is reset.
Previously only characters forbidden in headers were rejected (NUL/CR/LF),
but this is insufficient for :method, where some other forbidden chars
might be used to trick a non-compliant backend server into seeing a
different path from the one seen by haproxy. Note that header injection
is not possible though.
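A standalone, hedged sketch of the kind of check involved (not haproxy's
actual helper): RFC 9110 defines a method as a token, i.e. one or more
"tchar" characters:

    #include <string.h>

    static int is_tchar(unsigned char c)
    {
            if ((c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') ||
                (c >= 'A' && c <= 'Z'))
                    return 1;
            /* NUL must be excluded explicitly: strchr() would otherwise
             * match the pattern's terminating zero
             */
            return c && strchr("!#$%&'*+-.^_`|~", c) != NULL;
    }

    static int method_is_valid(const unsigned char *p, size_t len)
    {
            size_t i;

            if (!len)
                    return 0;
            for (i = 0; i < len; i++)
                    if (!is_tchar(p[i]))
                            return 0;   /* reject the request, reset the stream */
            return 1;
    }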
This must be backported up to 2.6.
Many thanks to Yuki Mogi of FFRI Security Inc for the detailed report
that allowed to quickly spot, confirm and fix the problem.
This is a complementary patch to c16eba818 ("BUG/MEDIUM: server/dns:
preserve server's port upon resolution timeout or error").
Indeed, since c16eba818, the port is properly preserved, but unsetting
the server's address this way results in the server_atomic_sync()
function thinking that we're actually setting a new address rather than
unsetting the previous one, because the addr family is != AF_UNSPEC.
Upon DNS timeout, this could be observed:
[WARNING] (2588257) : Server http/s1 is going DOWN for maintenance (DNS timeout status). 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] (2588257) : Server http/s1 ('test1.localhost') is UP/READY (resolves again).
Notice that the server times out and then immediately resolves again. Of
course in this case the server's address was properly set to 0, meaning
that the server will not receive any traffic, but it is confusing and
could result in haproxy temporarily thinking that the server is actually
available while it's not.
To properly fix the issue and restore historical behavior, let's
explicitly set inetaddr's family to AF_UNSPEC after fetching original
server's address.
It should be backported in 3.0 with c16eba818.
The MEMPROF_HASH_BITS variable was set to 10 without a possibility to
change it (beyond patching the code). After seeing a few reports already
with "other" being listed and a list with close to 1024 entries, it looks
like it's about time to either increase the hash size, or at least make
it configurable for special cases. As a reminder, in order to remain
fast, the algorithm searches no more than 16 places after the hash, so
when a table is almost full, searches are long and new places are rare.
The present patch just makes it possible to redefine it by passing
"-DMEMPROF_HASH_BITS=11" or "-DMEMPROF_HASH_BITS=12" in CFLAGS, and
moves the definition to defaults.h to make it easier to find. Such
values should be way sufficient for the vast majority of use cases.
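A hedged sketch of what the defaults.h guard looks like (exact contents
assumed); a build-time -D value simply overrides the default:

    #ifndef MEMPROF_HASH_BITS
    #define MEMPROF_HASH_BITS 10
    #endif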
Maybe in the future we'd change the default. At least this version
should be backported to ease rebuilds, say, till 2.8 or so.
This is a follow-up for 7223296 ("BUG/MINOR: server: fix first server
template not being indexed").
Indeed, in 7223296 we added a new call to _srv_parse_set_id_from_prefix()
for the first server before handling additional ones. But we actually
overlooked the fact that _srv_parse_set_id_from_prefix() was already
performed at the end of _srv_parse_tmpl_init() for the same server.
Since _srv_parse_set_id_from_prefix() frees srv->id, it results in UAF
when performing name lookups on the first server, because used_server_name
node key still uses the freed string pointer.
The early _srv_parse_set_id_from_prefix() call (added in 7223296) and
the original one perform the same task, except that the new one is
followed by name node insertion logic required for name lookups to work
properly. So let's simply get rid of the old one at the end of the
function.
The _srv_parse_set_id_from_prefix() call in the 'err:' label was also
removed, since it is now useless as well starting with 7223296 and would
trigger the same bug on error paths. Thanks to Amaury for noticing it.
This bug was discovered while trying to address GH issue #2620.
Thanks to @x-yuri for his detailed report (with working repro).
It should be backported in 3.0 with 7223296.
The haproxy master process should not read its configuration a second
time after performing reexec and switching to MODE_MWORKER_WAIT. So, to
make this part of the init() function more readable and to better mark
the point where configs have been read, let's encapsulate it in a
separate function.
Let's encapsulate the logic of the 'reload' sockpair and master CLI
listeners creation, used by the master CLI, into a separate function, as
we need this only in master-worker runtime mode. This makes the code of
init() more readable.
As the MODE_CHECK_CONDITION logic terminates the process anyway, no
matter whether the test for the provided condition was successful or
not, let's encapsulate it in a separate function. This makes the code of
init() more readable.
In MODE_CHECK_CONDITION we only parse the check_condition string,
provided by '-cc', and then we evaluate it. The haproxy process
terminates at the end of the {if..else} block anyway, whether the test
failed or passed. So it is more appropriate to perform the
MODE_CHECK_CONDITION test first and then do all the other process
runtime mode verifications.
The guarded functions to kill a sticky session, stksess_kill() and
stksess_kill_if_expired(), may or may not decrement and test its
reference counter before really killing it. This depends on a parameter.
If it is set to a non-zero value, the ref count is decremented and, if
it falls to zero, the session is killed. Otherwise, if this parameter is
equal to zero, the session is killed regardless of the ref count value.
In the code, these functions are always called with a non-zero parameter
and the ref count is always decremented and tested. So, there is no
reason to still have a special case. Especially because it is not really
easy to say whether it is supported or not. Does it mean it is possible
to kill a sticky session while it is still referenced somewhere?
Probably not. So, does it mean it is possible to kill an unreferenced
session? This case may be problematic because the session is accessed
outside of any lock and thus may be released by another thread because
it is unreferenced. Enlarging the scope of the lock to avoid any issue
is possible, but it is a bit of a shame to do so because there is no
usage for now.
The best is to simplify the API and remove this case. Now, stksess_kill()
and stksess_kill_if_expired() functions always decrement and test the ref
count before killing a sticky session.
When we try to kill a session, the shard must be locked before decrementing
the ref count on the session. Otherwise, the ref count can fall to 0 and a
purge task (stktable_trash_oldest or process_table_expire) may release the
session before we have the opportunity to acquire the lock on the shard to
effectively kill the session. This could lead to a double free.
Here is the scenario:

    Thread 1                                Thread 2

    stksess_kill(ts)
      if (ATOMIC_DEC(&ts->ref_cnt) != 0)
          return
      /* here the ref count is 0 */

                                            stktable_trash_oldest()
                                              LOCK(&sh_lock)
                                              if (!ATOMIC_LOAD(&ts->ref_cnt))
                                                  __stksess_free(ts)
                                              UNLOCK(&sh_lock)

      /* here the session was released */
      LOCK(&sh_lock)
      __stksess_free(ts)        <--- double free
      UNLOCK(&sh_lock)
The bug was introduced in 2.9 by commit 7968fe3889 ("MEDIUM:
stick-table: change the ref_cnt atomically"). The ref count must be
decremented inside the lock for the stksess_kill() and
stksess_kill_if_expired() functions.
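A hedged sketch of the fixed ordering, reusing the pseudo-names from the
scenario above: the decrement happens under the shard lock, so a
concurrent purge cannot free the session in between:

    LOCK(&sh_lock);
    if (ATOMIC_DEC(&ts->ref_cnt) == 0)
            __stksess_free(ts);
    UNLOCK(&sh_lock);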
This patch should fix issue #2611. It must be backported as far as 2.9.
In 2.9 there is no sharding and the whole table is locked, so the patch
will have to be adapted.
Remove development leftover introduced by commit 15e9c7da6 ("MINOR: log:
add log-profile parsing logic").
Indeed, since "log-profile" section keyword is registered via
REGISTER_CONFIG_SECTION() macro, it is not relevant to declare it in
common_kw_list[] from cfgparse-global.c. All it does is that it could
confuse the user by suggesting him to use "log-profile" inside a global
section when trying to find a best match in cfg_parse_global().
As a result of copy pasting, hlua_cli_io_handler_fct() used to report lua
exceptions like E_ETMOUT as "Lua converter" instead of "Lua cli".
Let's fix that.
It could be backported to all stable versions.
[ada: for older versions, HLUA_E_BTMOUT case didn't exist so it has to be
skipped]
Compilation issue detected as follows by gcc:
In file included from src/ncbuf.c:19:
src/ncbuf.c: In function 'ncb_write_off':
include/haproxy/bug.h:144:10: error: unknown type name 'ssize_t'
144 | extern ssize_t write(int, const void *, size_t); \
Previous commit 8f204fa8ae ("MINOR: debug: print gdb hints when
crashing") broke the build on the CI where strlen() isn't known. Let's
forward-declare it in the __ABORT_NOW() functions, just like write(). No
backport is needed.
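A minimal sketch of the resulting forward declarations inside
__ABORT_NOW() (write() was already declared there, as shown by the build
error above):

    extern ssize_t write(int, const void *, size_t);
    extern size_t strlen(const char *);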
To make bug reporting easier for users, when crashing, let's suggest
what to do. Typically when a BUG_ON() matches, only the current thread
is useful the vast majority of the time, while when the watchdog
triggers, all threads are interesting.
The messages are printed at the end, after the dump. We may adjust these
with wiki links in the future if more detailed instructions are
relevant.
If haproxy is compiled with Linux capabilities support, let's show the
process capabilities before applying the configuration and at runtime in
the 'show dev' command output. This may be useful for debugging
purposes, especially in cases when the process changes its UID and GID
to non-privileged ones, or when it is started and run under a
non-privileged UID and the needed capabilities are set by the admin on
the haproxy binary.
The 'show dev' command is very convenient to obtain haproxy debugging
information while the process runs in a container. Let's extend its
output with the version and cmdline. cmdline is useful as it shows the
absolute binary path and its arguments, because sometimes the person
debugging a failing container is not the one who created and deployed
it.
argc and argv are stored in the exported global structure, because
feed_post_mortem() is added as a post-check function callback in the
post_check_list. So we can't simply change the signature of
feed_post_mortem() without breaking the other post-check callbacks' API.
Parsers are not supposed to modify argv, so we can safely pass its
pointer to debug_parse_cli_show_dev() without copying all the argument
strings somewhere in the heap or on the stack.