haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2025-01-03 18:52:04 +00:00

Author	SHA1	Message	Date
Andrew Hopkins	88988bb06c	REGTESTS: ssl: skip ssl_dh test with AWS-LC skip ssl_dh test when HAProxy is built with AWS-LC which does not support FFDH ciphersuites.	2023-09-04 18:21:01 +02:00
Andrew Hopkins	b3f94f8b3b	BUILD: ssl: Build with new cryptographic library AWS-LC This adds a new option for the Makefile USE_OPENSSL_AWSLC, and update the documentation with instructions to use HAProxy with AWS-LC. Update the type of the OCSP callback retrieved with SSL_CTX_get_tlsext_status_cb with the actual type for libcrypto versions greater than 1.0.2. This doesn't affect OpenSSL which casts the callback to void* in SSL_CTX_ctrl.	2023-09-04 18:19:18 +02:00
Miroslav Zagorac	3cfc30416c	MINOR: properly mark the end of the CLI command in error messages In several places in the file src/ssl_ckch.c, in the message about the incorrect use of the CLI command, the end of that CLI command is not correctly marked with the sign ' .	2023-09-04 18:13:43 +02:00
William Lallemand	637306c86d	DOC: configuration: update examples for req.ver Update the documentation for the req.ver sample fetch. Could be backported as far as 2.6.	2023-09-04 18:12:58 +02:00
Willy Tarreau	8547f5cfa2	BUG/MINOR: stream: further protect stream_dump() against incomplete sessions As found by Coverity in issue #2273, the fix in commit `e64bccab2` ("BUG/MINOR: stream: protect stream_dump() against incomplete streams") was still not enough, as scf/scb are still dereferenced to dump their flags and states. This should be backported to 2.8.	2023-09-04 15:32:17 +02:00
Chris Staite	3939e39479	BUG/MEDIUM: h1-htx: Ensure chunked parsing with full output buffer A previous fix to ensure that there is sufficient space on the output buffer to place parsed data (#2053) introduced an issue that if the output buffer is filled on a chunk boundary no data is parsed but the congested flag is not set due to the state not being H1_MSG_DATA. The check to ensure that there is sufficient space in the output buffer is actually already performed in all downstream functions before it is used. This makes the early optimisation that avoids the state transition to H1_MSG_DATA needless. Therefore, in order to allow the chunk parser to continue in this edge case we can simply remove the early check. This ensures that the state can progress and set the congested flag correctly in the caller. This patch fixes #2262. The upstream change that caused this logic error was backported as far as 2.5, therefore it makes sense to backport this fix back that far also.	2023-09-04 12:15:36 +02:00
Willy Tarreau	135c66f6cb	BUG/MEDIUM: connection: fix pool free regression with recent ppv2 TLV patches In commit `fecc573da` ("MEDIUM: connection: Generic, list-based allocation and look-up of PPv2 TLVs") there was a tiny mistake, elements of length <= 128 are allocated from pool_pp_128 but only those of length < 128 are released to this pool, other ones go to pool_pp_256. Because of this, elements of size exactly 128 are allocated from 128 and released to 256. It can be reproduced a few times by running sample_fetches/tlvs.vtc 1000 times with -DDEBUG_DONT_SHARE_POOLS -DDEBUG_MEMORY_POOLS -DDEBUG_EXPR -DDEBUG_STRICT=2 -DDEBUG_POOL_INTEGRITY -DDEBUG_POOL_TRACING -DDEBUG_NO_POOLS. Not sure why it doesn't reproduce more often though. No backport is needed. This should address github issues #2275 and #2274.	2023-09-04 11:45:37 +02:00
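The allocation/release mismatch described in commit 135c66f6cb can be avoided by deriving the pool from the length in exactly one place. The following is only a minimal sketch under assumptions: the enum, the helper names and the handling of lengths above 256 are hypothetical and are not taken from the HAProxy sources, which use pool_pp_128 and pool_pp_256.

```c
#include <stddef.h>

/* Hypothetical pool identifiers standing in for pool_pp_128 / pool_pp_256. */
enum tlv_pool { TLV_POOL_128, TLV_POOL_256, TLV_POOL_NONE };

/* Single source of truth for the length-to-pool mapping.
 * Assumption: values longer than 256 bytes are served by neither pool. */
static enum tlv_pool tlv_pool_for_len(size_t len)
{
    if (len <= 128)
        return TLV_POOL_128;
    if (len <= 256)
        return TLV_POOL_256;
    return TLV_POOL_NONE;
}

/* Allocation and release both go through the same helper, so an element of
 * length exactly 128 can no longer be taken from one pool and returned to
 * the other. */
static enum tlv_pool tlv_alloc_pool(size_t len)
{
    return tlv_pool_for_len(len);
}

static enum tlv_pool tlv_free_pool(size_t len)
{
    return tlv_pool_for_len(len);
}
```

With a "<= 128" test on one path and a "< 128" test on the other, the two functions would disagree only for a length of exactly 128, which is the case the commit message describes.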
Frédéric Lécaille	d52466726f	BUG/MINOR: quic: Unchecked pointer to packet number space dereferenced It is possible that there are still Initial crypto data in flight without Handshake crypto data in flight. This is very rare but possible. This issue was reported by long-rtt interop test with quic-go as client and @chipitsine in GH #2276. No need to backport.	2023-09-04 11:29:35 +02:00
Frédéric Lécaille	9077f20251	BUG/MAJOR: quic: Really ignore malformed ACK frames. If not correctly parsed, an ACK frame must be ignored without any further processing. Before this patch, an ACK frame could be partially parsed before an error was detected, which led newly acknowledged packets to be wrongly released by free_quic_tx_pkts(), called from qc_parse_ack_frm(). But there is no reason to release such packets because of a malformed ACK frame. This patch modifies qc_parse_ack_frm() so that newly acknowledged TX packets are handled in two steps. It first collects the newly acknowledged packets by calling qc_newly_acked_pkts(), then proceeds the same way as before to process the haproxy TX packets acknowledged by the peer. If the ACK frame could not be fully parsed, the newly acknowledged packets are put back where they were detached from: the tree of TX packets for their encryption level. Must be backported as far as 2.6.	2023-09-04 11:29:35 +02:00
Frédéric Lécaille	7dad52bdbd	MINOR: quic: Add a trace to quic_release_frm() Display the address of the frame to be released as soon as entering into quic_release_frm() whose job is obviously to release the memory allocated for the frame <frm> passed as parameter.	2023-09-04 11:29:35 +02:00
Frédéric Lécaille	3c90c1ce6b	BUG/MINOR: quic: Possible skipped RTT sampling There are very few chances this bug may occur. Furthermore the consequences are not dramatic: an RTT sampling may be ignored. I guess this may happen when the now_ms global value wraps. Do not rely on the time variable value a packet was sent to decide if it is a newly acknowledged packet but on its presence or not in the tx packet ebtree. Must be backported as far as 2.6.	2023-09-04 11:29:35 +02:00
Christopher Faulet	b50a471adb	BUG/MEDIUM: stconn: Don't block sends if there is a pending shutdown For the same reason as in the previous patch, we must not block sends when there is a pending shutdown. In other words, we must consider that sends are allowed when there is a pending shutdown. This patch must slowly be backported as far as 2.2. It should partially fix issue #2249.	2023-09-01 14:18:26 +02:00
Christopher Faulet	0b93ff8c87	BUG/MEDIUM: stconn: Wake applets on sending path if there is a pending shutdown An applet is not woken up on the sending path if it is not waiting for data or if it states it will not consume data. However, it is important to still wake it up if there is a pending shutdown. Otherwise, the event may be missed and some data may remain blocked in the channel's buffer. Because of this bug, it is possible to have a stream stuck if data are also blocked on the opposite channel. It is for instance possible to hit the bug with the stats applet and a client not consuming data. This patch must slowly be backported as far as 2.2. It should partially fix issue #2249.	2023-09-01 14:18:26 +02:00
Christopher Faulet	9e394d34e0	BUG/MINOR: stconn: Don't report blocked sends during connection establishment The server timeout must not be handled during the connection establishment, so as not to supersede the connect timeout. To do so, we must not consider outgoing data are blocked during this stage. Concretely, it means the fsb time must not be updated during connection establishment. It is not an issue with regular clients because the server timeout is only defined when the connection is established. However, it may be an issue for the HTTP client, when the server timeout is lower than the connect timeout. In this case, an early 502 may be reported with no connection retries. This patch must be backported to 2.8.	2023-09-01 14:18:26 +02:00
Christopher Faulet	3479d99d5f	BUG/MEDIUM: stconn: Update stream expiration date on blocked sends When outgoing data are blocked, we must update the stream expiration date and requeue the task. It is important to be sure to properly handle write timeout, especially if the stream cannot expire on reads. This bug was introduced when handling of channel's timeouts was refactored to be managed by the stream-connectors. It is an issue if there is no server timeout and the client does not consume the response (or the opposite but it is less common). It is also possible to trigger the same scenario with applets on server side because, most of the time, there is no server timeout. This patch must be backported to 2.8.	2023-09-01 14:18:26 +02:00
Christopher Faulet	49ed83e948	DEBUG: applet: Properly report opposite SC expiration dates in traces The wrong label was used in trace to report expiration dates of the opposite SC. "sc" was used instead of "sco". This patch should be backported to 2.8.	2023-09-01 14:18:26 +02:00
Willy Tarreau	b0031d9679	MINOR: checks: also consider the thread's queue for rebalancing Let's also check for other threads when the current one is queueing, let's not wait for the load to be high. Now this totally eliminates differences between threads.	2023-09-01 14:00:04 +02:00
Willy Tarreau	844a3bc25b	MEDIUM: checks: implement a queue in order to limit concurrent checks The progressive adoption of OpenSSL 3 and its abysmal handshake performance has started to reveal situations where it simply isn't possible anymore to successfully run health checks on many servers, because between the moment all the checks are started and the moment the handshake finally completes, the timeout has expired! This also has consequences on production traffic which gets significantly delayed as well, all that for lots of checks. While it's possible to increase the check delays, it doesn't solve everything as checks still take a huge amount of time to converge in such conditions. Here we take a different approach by permitting to enforce the maximum concurrent checks per thread limitation and implementing an ordered queue. Thanks to this, if a thread about to start a check has reached its limit, it will add the check at the end of a queue and it will be processed once another check is finished. This proves to be extremely efficient, with all checks completing in a reasonable amount of time and not being disturbed by the rest of the traffic from other checks. They're just cycling slower, but at the speed the machine can handle. One must understand however that if some complex checks perform multiple exchanges, they will take a check slot for all the required duration. This is why the limit is not enforced by default. Tests on SSL show that a limit of 5-50 checks per thread on local servers gives excellent results already, so that could be a good starting point.	2023-09-01 14:00:04 +02:00
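The queueing behaviour described in commit 844a3bc25b (start a check if the thread is below its limit, otherwise append it to an ordered queue and start it when another check finishes) can be sketched roughly as follows. This is a simplified illustration under assumptions, not the HAProxy implementation: the structure, the fixed-size ring buffer and the function names are all hypothetical, and a limit of 0 is assumed to mean "not enforced", matching the statement that the limit is off by default.

```c
#include <stddef.h>

#define MAX_QUEUED 64 /* hypothetical queue capacity for this sketch */

/* Hypothetical per-thread scheduling state. */
struct check_sched {
    unsigned int limit;    /* max concurrent checks, 0 = not enforced */
    unsigned int running;  /* checks currently holding a slot */
    int queue[MAX_QUEUED]; /* FIFO ring of waiting check ids */
    size_t head;           /* index of the oldest queued check */
    size_t count;          /* number of queued checks */
};

/* Returns 1 if the check starts now, 0 if it was queued,
 * -1 if the (sketch-only) queue is full. */
static int check_start(struct check_sched *s, int id)
{
    if (!s->limit || s->running < s->limit) {
        s->running++;
        return 1;
    }
    if (s->count == MAX_QUEUED)
        return -1;
    s->queue[(s->head + s->count) % MAX_QUEUED] = id;
    s->count++;
    return 0;
}

/* Called when a running check finishes: releases its slot and, if a check
 * is waiting, starts the oldest one. Returns the id of the check that was
 * started, or -1 if the queue was empty. */
static int check_done(struct check_sched *s)
{
    int id;

    s->running--;
    if (!s->count)
        return -1;
    id = s->queue[s->head];
    s->head = (s->head + 1) % MAX_QUEUED;
    s->count--;
    s->running++;
    return id;
}
```

Because a queued check only starts when a slot is released, a check that performs several exchanges keeps its slot for its whole duration, which is the trade-off the commit message points out.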
Willy Tarreau	cfc0bceeb5	MEDIUM: checks: search more aggressively for another thread on overload When the current check is overloaded (more running checks than the configured limit), we'll try more aggressively to find another thread. Instead of just opportunistically looking for one half as loaded, now if the current thread has more than 1% more active checks than another one, or has more than a configured limit of concurrent running checks, it will search for a more suitable thread among 3 other random ones in order to migrate the check there. The number of migrations remains very low (~1%) and the checks load very fair across all threads (~1% as well). The new parameter is called tune.max-checks-per-thread.	2023-09-01 08:26:06 +02:00
Willy Tarreau	016e189ea3	MINOR: check: also consider the random other thread's active checks When checking if it's worth transferring a sleeping check to another random thread, let's also check if that random other thread has fewer checks than the current one, which is another reason for transferring the load there. This commit adds a function "check_thread_cmp_load()" to compare two threads' loads in order to simplify the decision making. The minimum active check count before starting to consider rebalancing the load is now raised from 2 to 3, because tests show that at 15k concurrent checks, at 2, 50% are evaluated for rebalancing and 30% are rebalanced, while at 3, this is cut in half.	2023-09-01 08:26:06 +02:00
Willy Tarreau	00de9e0804	MINOR: checks: maintain counters of active checks per thread Let's keep two check counters per thread: - one for "active" checks, i.e. checks that are no more sleeping and are assigned to the thread. These include sleeping and running checks ; - one for "running" checks, i.e. those which are currently executing on the thread. By doing so, we'll be able to spread the health checks load a bit better and refrain from sending too many at once per thread. The counters are atomic since a migration increments the target thread's active counter. These numbers are reported in "show activity", which allows to check per thread and globally how many checks are currently pending and running on the system. Ideally, we should only consider checks in the process of establishing a connection since that's really the expensive part (particularly with OpenSSL 3.0). But the inner layers are really not suitable to doing this. However knowing the number of active checks is already a good enough hint.	2023-09-01 08:26:06 +02:00
Willy Tarreau	3b7942a1c9	MINOR: check/activity: collect some per-thread check activity stats We now count the number of times a check was started on each thread and the number of times a check was adopted. This helps understand better what is observed regarding checks.	2023-09-01 08:26:06 +02:00
Willy Tarreau	e03d05c6ce	MINOR: check: remember when we migrate a check The goal here is to explicitly mark that a check was migrated so that we don't do it again. This will allow us to perform other actions on the target thread while still knowing that we don't want to be migrated again. The new READY bit combine with SLEEPING to form 4 possible states: SLP RDY State Description 0 0 - (reserved) 0 1 RUNNING Check is bound to current thread and running 1 0 SLEEPING Check is sleeping, not bound to a thread 1 1 MIGRATING Check is migrating to another thread Thus we set READY upon migration, and check for it before migrating, this is sufficient to prevent a second migration. To make things a bit clearer, the SLEEPING bit was switched with FASTINTER so that SLEEPING and READY are adjacent.	2023-09-01 08:26:06 +02:00
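The four SLP/RDY combinations listed in commit e03d05c6ce can be read as a small two-bit state decoder. The sketch below is illustrative only: the bit values, enum and function names are hypothetical and do not reproduce the actual CHK_ST_* definitions.

```c
/* Hypothetical bit values; the commit only states that the SLEEPING and
 * READY bits are adjacent. */
#define CHK_SLEEPING 0x2u
#define CHK_READY    0x1u

enum chk_phase {
    CHK_PH_RESERVED,  /* SLP=0 RDY=0 */
    CHK_PH_RUNNING,   /* SLP=0 RDY=1: bound to current thread and running */
    CHK_PH_SLEEPING,  /* SLP=1 RDY=0: sleeping, not bound to a thread */
    CHK_PH_MIGRATING  /* SLP=1 RDY=1: migrating to another thread */
};

/* Decode the two bits into one of the four states from the table. */
static enum chk_phase chk_get_phase(unsigned int state)
{
    switch (state & (CHK_SLEEPING | CHK_READY)) {
    case CHK_READY:
        return CHK_PH_RUNNING;
    case CHK_SLEEPING:
        return CHK_PH_SLEEPING;
    case CHK_SLEEPING | CHK_READY:
        return CHK_PH_MIGRATING;
    default:
        return CHK_PH_RESERVED;
    }
}

/* Set READY upon migration and refuse to migrate a check that already has
 * it set. Returns 1 if the check was marked as migrating, 0 otherwise. */
static int chk_try_migrate(unsigned int *state)
{
    if (chk_get_phase(*state) != CHK_PH_SLEEPING)
        return 0;
    *state |= CHK_READY;
    return 1;
}
```

Checking for READY before migrating is what prevents a second migration in this sketch: once the bit is set, the check is no longer in the plain SLEEPING state.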
Willy Tarreau	3544c9f8a0	MINOR: checks: pin the check to its thread upon wakeup When a check leaves the sleeping state, we must pin it to the thread that is processing it. It's normally always the case after the first execution, but initial checks that start assigned to any thread (-1) could be assigned much later, causing problems with planned changes involving queuing. Thus better do it early, so that all threads start properly pinned.	2023-09-01 08:26:06 +02:00
Willy Tarreau	7163f95b43	MINOR: checks: start the checks in sleeping state The CHK_ST_SLEEPING state was introduced by commit `d114f4a68` ("MEDIUM: checks: spread the checks load over random threads") to indicate that a check was not currently bound to a thread and that it could easily be migrated to any other thread. However it did not start the checks in this state, meaning that they were not redispatchable on startup. Sometimes under heavy load (e.g. when using SSL checks with OpenSSL 3.0) the cost of setting up new connections is so high that some threads may experience connection timeouts on startup. In this case it's better if they can transfer their excess load to other idle threads. By just marking the check as sleeping upon startup, we can do this and significantly reduce the number of failed initial checks.	2023-09-01 08:26:06 +02:00
Willy Tarreau	48442b8b15	BUG/MINOR: checks: do not queue/wake a bounced check A small issue was introduced with commit `d114f4a68` ("MEDIUM: checks: spread the checks load over random threads"): when a check is bounced to another thread, its expiration time is set to TICK_ETERNITY. This makes it show as not expired upon first wakeup on the next thread, thus being detected as "woke up too early" and being instantly rescheduled. Only after this next wakeup will it be properly considered. Several approaches were attempted to fix this. The best one seems to consist in resetting t->expire and expired upon wakeup, and changing the !expired test for !tick_is_expired() so that we don't trigger on this case. This needs to be backported to 2.7.	2023-09-01 08:26:06 +02:00
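To illustrate why a timer set to "eternity" shows up as not expired, here is a minimal sketch of a wrap-safe expiration test on 32-bit millisecond ticks. It is an assumption-laden illustration, not HAProxy's tick_is_expired(): the helper name, the use of 0 as the "eternity" value and the exact comparison are choices made for this example.

```c
#include <stdint.h>

/* Assumption for this sketch: 0 is reserved to mean "never expires". */
#define MY_TICK_ETERNITY 0u

/* Returns 1 if <timer> is set and is at or before <now>, 0 otherwise.
 * The signed difference keeps the comparison correct when the 32-bit
 * millisecond counter wraps around. */
static int my_tick_is_expired(uint32_t timer, uint32_t now)
{
    if (timer == MY_TICK_ETERNITY)
        return 0;
    return (int32_t)(timer - now) <= 0;
}
```

With such a test, a task woken up while its timer is still set to eternity is always reported as not expired, which matches the "woke up too early" symptom described in the commit message.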
Willy Tarreau	338431ecb6	MINOR: activity: report the current run queue size While troubleshooting the causes of load spikes, it appeared that the length of individual run queues was missing, let's add it to "show activity".	2023-09-01 08:26:06 +02:00
Willy Tarreau	2cb896c4b0	MEDIUM: server/ssl: pick another thread's session when we have none yet The per-thread SSL context in servers causes a burst of connection renegotiations on startup, both for the forwarded traffic and for the health checks. Health checks have been seen to continue to cause SSL rekeying for several minutes after a restart on large thread-count machines. The reason is that the context is exclusively per-thread and that the more threads there are, the more likely it is for a new connection to start on a thread that doesn't have such a context yet. In order to improve this situation, this commit ensures that a thread starting an SSL connection to a server without a session will first look at the last session that was updated by another thread, and will try to use it. In order to minimize the contention, we're using a read lock here to protect the data, and the first-level index is an integer containing the thread number, that is always valid and may always be dereferenced. This way the session retrieval algorithm becomes quite simple: - if the last thread index is valid, then try to use the same session under a read lock ; - if any error happens, then atomically nuke the index so that other threads don't use it and the next one to update a connection updates it again And for the ssl_sess_new_srv_cb(), we have this: - update the entry under a write lock if the new session is valid, otherwise kill it if the session is not valid; - atomically update the index if it was 0 and the new one is valid, otherwise atomically nuke it if the session failed. Note that even if only the pointer is destroyed, the element will be re-allocated by the next thread during the ssl_sess_new_srv_cb(). Right now a session is picked even if the SNI doesn't match, because we don't know the SNI yet during ssl_sock_init(), but that's essentially a matter of API, since connect_server() figures the SNI very early, then calls conn_prepare() which calls ssl_sock_init().
Thus in the future we could easily imagine storing a number of SNI-based contexts instead of storing contexts per thread. It could be worth backporting this to one LTS version after some observation, though this is not strictly necessary. The current commit depends on the following ones: BUG/MINOR: ssl_sock: fix possible memory leak on OOM MINOR: ssl_sock: avoid iterating realloc(+1) on stored context DOC: ssl: add some comments about the non-obvious session allocation stuff CLEANUP: ssl: keep a pointer to the server in ssl_sock_init() MEDIUM: ssl_sock: always use the SSL's server name, not the one from the tid MEDIUM: server/ssl: place an rwlock in the per-thread ssl server session MINOR: server/ssl: maintain an index of the last known valid SSL session MINOR: server/ssl: clear the shared good session index on failure MEDIUM: server/ssl: pick another thread's session when we have none yet	2023-08-31 09:27:14 +02:00
Willy Tarreau	777f62cfb7	MINOR: server/ssl: clear the shared good session index on failure If we fail to set the session using SSL_set_session(), we want to quickly erase our index from the shared one so that any other thread with a valid session replaces it.	2023-08-31 08:50:01 +02:00
Willy Tarreau	52b260bae4	MINOR: server/ssl: maintain an index of the last known valid SSL session When a thread creates a new session for a server, if none was known yet, we assign the thread id (hence the reused_sess index) to a shared variable so that other threads will later be able to find it when they don't have one yet. For now we only set and clear the pointer upon session creation, we do not yet pick it. Note that we could have done it per thread-group, so as to avoid any cross-thread exchanges, but it's anticipated that this is essentially used during startup, at a moment where the cost of inter-thread contention is very low compared to the ability to restart at full speed, which explains why instead we store a single entry.	2023-08-31 08:50:01 +02:00
Willy Tarreau	607041dec3	MEDIUM: server/ssl: place an rwlock in the per-thread ssl server session The goal will be to permit a thread to update its session while having it shared with other threads. For now we only place the lock and arrange the code around it so that this is quite light. For now only the owner thread uses this lock so there is no contention. Note that there is a subtlety in the openssl API regarding i2s_SSL_SESSION() in that it fills the area pointed to by its argument with a dump of the session and returns a size that's equal to the previously allocated one. As such, it does modify the shared area even if that's not obvious at first glance.	2023-08-31 08:50:01 +02:00
Willy Tarreau	95ac5fe4a8	MEDIUM: ssl_sock: always use the SSL's server name, not the one from the tid In ssl_sock_set_servername(), we're retrieving the current server name from the current thread, hoping it will not have changed. This is a bit dangerous as strictly speaking it's not easy to prove that no other connection had to use one between the moment it was retrieved in ssl_sock_init() and the moment it's being read here. In addition, this forces us to maintain one session per thread while this is not the real need, in practice we only need one session per SNI. And the current model prevents us from sharing sessions between threads. This had been done in 2.5 via commit `e18d4e828` ("BUG/MEDIUM: ssl: backend TLS resumption with sni and TLSv1.3"), but as analyzed with William, it turns out that a saner approach consists in keeping the call to SSL_get_servername() there and instead to always assign the SNI to the current SSL context via SSL_set_tlsext_host_name() immediately when the session is retrieved. This way the session and SNI are consulted atomically and the host name is only checked from the session and not from possibly changing elements. As a bonus the rdlock that was added by that commit could now be removed, though it didn't cost much.	2023-08-31 08:49:15 +02:00
Willy Tarreau	335b5adf2c	CLEANUP: ssl: keep a pointer to the server in ssl_sock_init() We're using about 6 times "__objt_server(conn->target)" there, it's not quite easy to read, let's keep a pointer to the server.	2023-08-30 18:58:40 +02:00
Willy Tarreau	bc31ef0896	DOC: ssl: add some comments about the non-obvious session allocation stuff The SSL session allocation/reuse part is far from being trivial, and there are some necessary tricks such as allocating then immediately freeing that are required by the API due to internal refcount. All of this is particularly hard to grasp, even with the scarce man pages. Let's document a little bit what's granted and expected along this path to help the reader later.	2023-08-30 11:43:06 +02:00
Willy Tarreau	2c6fe24001	MINOR: ssl_sock: avoid iterating realloc(+1) on stored context The SSL context storage in servers is per-thread, and the contents are allocated for a length that is determined from the session. It turns out that placing some traces there revealed that the realloc() that is called to grow the area can be called multiple times in a row even for just health checks, to grow the area by just one or two bytes. Given that malloc() allocates in multiples of 8 or 16 anyway, let's round the allocated size up to the nearest multiple of 8 to avoid this unneeded operation.	2023-08-30 11:43:06 +02:00
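The rounding described in commit 2c6fe24001 can be sketched with a simple bit mask. This is a generic illustration, not the code from the commit: the helper names are hypothetical, and the "needs realloc" test is only an assumption about how the rounded size might be used.

```c
#include <stddef.h>

/* Round <len> up to the nearest multiple of 8. */
static size_t round_up8(size_t len)
{
    return (len + 7) & ~(size_t)7;
}

/* Hypothetical helper: returns 1 if an area of <allocated> bytes must be
 * grown to hold <needed> bytes once the request is rounded up, else 0. */
static int needs_realloc(size_t allocated, size_t needed)
{
    return round_up8(needed) > allocated;
}
```

With an area already rounded to 16 bytes, a session that grows from 10 to 11 or 12 bytes no longer triggers a reallocation, which is the one-or-two-byte growth pattern the commit message mentions.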
Alexander Stephan	2cc53ecc8f	MINOR: sample: Add common TLV types as constants for fc_pp_tlv This patch adds common TLV types as specified in the PPv2 spec. We will use the suffix of the type, e.g., PP2_TYPE_AUTHORITY becomes AUTHORITY.	2023-08-29 15:32:02 +02:00
Alexander Stephan	0a4f6992e0	MINOR: sample: Refactor fc_pp_unique_id by wrapping the generic TLV fetch The fetch logic is redundant and can be simplified by simply calling the generic fetch with the correct TLV ID set as an argument, similar to fc_pp_authority.	2023-08-29 15:32:01 +02:00
Alexander Stephan	ece0d1ab49	MINOR: sample: Refactor fc_pp_authority by wrapping the generic TLV fetch We already have a call that can retrieve a TLV with any value. Therefore, the fetch logic is redundant and can be simplified by simply calling the generic fetch with the correct TLV ID set as an argument.	2023-08-29 15:31:51 +02:00
Alexander Stephan	f773ef721c	MEDIUM: sample: Add fetch for arbitrary TLVs Based on the new, generic allocation infrastructure, a new sample fetch fc_pp_tlv is introduced. It is an abstraction for existing PPv2 TLV sample fetches. It takes any valid TLV ID as argument and returns the value as a string, similar to fc_pp_authority and fc_pp_unique_id.	2023-08-29 15:31:28 +02:00
Alexander Stephan	fecc573da1	MEDIUM: connection: Generic, list-based allocation and look-up of PPv2 TLVs In order to be able to implement fetches in the future that allow retrieval of any TLVs, a new generic data structure for TLVs is introduced. Existing TLV fetches for PP2_TYPE_AUTHORITY and PP2_TYPE_UNIQUE_ID are migrated to use this new data structure. TLV related pools are updated to not rely on type, but only on size. Pools accommodate the TLV list element with their associated value. For now, two pools for 128 B and 256 B values are introduced. More fine-grained solutions are possible in the future, if necessary.	2023-08-29 15:15:47 +02:00
Alexander Stephan	c9d47652d2	CLEANUP/MINOR: connection: Improve consistency of PPv2 related constants This patch improves readability by scoping HAProxy-related PPv2 constants with an "HA" prefix. Besides, a new constant for the length of a CRC32C TLV is introduced. The length is derived from the PPv2 spec, i.e. 32 bits.	2023-08-29 15:15:47 +02:00
Willy Tarreau	bd84387beb	MEDIUM: capabilities: enable support for Linux capabilities For a while there has been the constraint of having to run as root for transparent proxying, and we're starting to see some cases where QUIC is not running in socket-per-connection mode due to the missing capability that would be needed to bind a privileged port. It's not realistic to ask all QUIC users on port 443 to run as root, so instead let's provide a basic support for capabilities at least on linux. The ones currently supported are cap_net_raw, cap_net_admin and cap_net_bind_service. The mechanism was made OS-specific with a dedicated file because it really is. It can be easily refined later for other OSes if needed. A new keyword "setcap" is added to the global section, to enumerate the capabilities that must be kept when switching from root to non-root. This is ignored in other situations though. HAProxy has to be built with USE_LINUX_CAP=1 for this to be supported, which is enabled by default for linux-glibc, linux-glibc-legacy and linux-musl. A good way to test this is to start haproxy with such a config: global uid 1000 setcap cap_net_bind_service frontend test mode http timeout client 3s bind quic4@:443 ssl crt rsa+dh2048.pem allow-0rtt and run it under "sudo strace -e trace=bind,setuid", then connecting there from an H3 client. The bind() syscall must succeed despite the user id having been switched.	2023-08-29 11:11:50 +02:00
Willy Tarreau	4d5f7d94b9	DOC: config: mention uid dependency on the tune.quic.socket-owner option This option defaults to "connection" but is also dependent on the user being allowed to bind the specified port. Since QUIC can easily run on non-privileged ports, usually this is not a problem, but if bound to port 443 it will usually fail. Let's mention this.	2023-08-29 11:11:50 +02:00
Willy Tarreau	e64bccab20	BUG/MINOR: stream: protect stream_dump() against incomplete streams If a stream is interrupted during its initialization by a panic signal and tries to dump itself, it may cause a crash during the dump due to scf and/or scb not being fully initialized. This may also happen while releasing an endpoint to attach a new one. The effect is that instead of dying on an abort, the process dies on a segv. This race is ultra-rare but totally possible. E.g: #0 se_fl_test (test=1, se=0x0) at include/haproxy/stconn.h:98 #1 sc_ep_test (test=1, sc=0x7ff8d5cbd560) at include/haproxy/stconn.h:148 #2 sc_conn (sc=0x7ff8d5cbd560) at include/haproxy/stconn.h:223 #3 stream_dump (buf=buf@entry=0x7ff9507e7678, s=0x7ff4c40c8800, pfx=pfx@entry=0x55996c558cb3 ' ' <repeats 13 times>, eol=eol@entry=10 '\n') at src/stream.c:2840 #4 0x000055996c493b42 in ha_task_dump (buf=buf@entry=0x7ff9507e7678, task=<optimized out>, pfx=pfx@entry=0x55996c558cb3 ' ' <repeats 13 times>) at src/debug.c:328 #5 0x000055996c493edb in ha_thread_dump_one (thr=thr@entry=18, from_signal=from_signal@entry=0) at src/debug.c:227 #6 0x000055996c493ff1 in ha_thread_dump (buf=buf@entry=0x7ff9507e7678, thr=thr@entry=18) at src/debug.c:270 #7 0x000055996c494257 in ha_panic () at src/debug.c:430 #8 ha_panic () at src/debug.c:411 (...)
#23 0x000055996c341fe8 in ssl_sock_close (conn=<optimized out>, xprt_ctx=0x7ff8dcae3880) at src/ssl_sock.c:6699 #24 0x000055996c397648 in conn_xprt_close (conn=0x7ff8c297b0c0) at include/haproxy/connection.h:148 #25 conn_full_close (conn=0x7ff8c297b0c0) at include/haproxy/connection.h:192 #26 h1_release (h1c=0x7ff8c297b3c0) at src/mux_h1.c:1074 #27 0x000055996c39c9f0 in h1_detach (sd=<optimized out>) at src/mux_h1.c:3502 #28 0x000055996c474de4 in sc_detach_endp (scp=scp@entry=0x7ff9507e3148) at src/stconn.c:375 #29 0x000055996c4752a5 in sc_reset_endp (sc=<optimized out>, sc@entry=0x7ff8d5cbd560) at src/stconn.c:475 Note that this cannot happen on "show sess" since a stream never leaves process_stream in such an uninitialized state, thus it's really only the crash dump that may cause this. It should be backported to 2.8.	2023-08-29 11:11:50 +02:00
William Lallemand	e7d9082315	BUG/MINOR: ssl/cli: can't find ".crt" files when replacing a certificate Bug was introduced by commit 26654 ("MINOR: ssl: add "crt" in the cert_exts array"). When looking for a .crt directly in the cert_exts array, the ssl_sock_load_pem_into_ckch() function will be called with an argument which does not have its ".crt" extension anymore. If "ssl-load-extra-del-ext" is used this is not a problem since we try to add the ".crt" when doing the lookup in the tree. However when directly using a ".crt" without this option, looking for the file in the tree will fail. The fix removes the "crt" entry from the array since it does not seem to be really useful without a rework of all the lookups. Should fix issue #2265. Must be backported as far as 2.6.	2023-08-28 18:20:39 +02:00
Willy Tarreau	0074c36dd2	BUILD: pools: import plock.h to build even without thread support In 2.9-dev4, commit `544c2f2d9` ("MINOR: pools: use EBO to wait for unlock during pool_flush()") broke the thread-less build by calling pl_wait_new_long() without explicitly including plock.h which is normally included by thread.h when threads are enabled.	2023-08-26 17:28:08 +02:00
Willy Tarreau	892d04733f	BUILD: import: guard plock.h against multiple inclusion Surprisingly there's no include guard in plock.h though there is one in atomic-ops.h. Let's add one, or we cannot risk including the file multiple times.	2023-08-26 17:28:08 +02:00
Willy Tarreau	a7b9baa2cc	BUG/MEDIUM: mux-h2: fix crash when checking for reverse connection after error If the connection is closed in h2_release(), which is indicated by ret<0, we must not dereference conn anymore. This was introduced in 2.9-dev4 by commit `5053e8914` ("MEDIUM: h2: prevent stream opening before connection reverse completed") and detected after a few hours of runtime thanks to running with pool integrity checks and caller enabled. No backport is needed.	2023-08-26 17:05:19 +02:00
Willy Tarreau	518349f08a	[RELEASE] Released version 2.9-dev4 Released version 2.9-dev4 with the following main changes : - DEV: flags/show-sess-to-flags: properly decode fd.state - BUG/MINOR: stktable: allow sc-set-gpt(0) from tcp-request connection - BUG/MINOR: stktable: allow sc-add-gpc from tcp-request connection - DOC: typo: fix sc-set-gpt references - SCRIPTS: git-show-backports: automatic ref and base detection with -m - REGTESTS: Do not use REQUIRE_VERSION for HAProxy 2.5+ (3) - DOC: jwt: Add explicit list of supported algorithms - BUILD: Makefile: add the USE_QUIC option to make help - BUILD: Makefile: add USE_QUIC_OPENSSL_COMPAT to make help - BUILD: Makefile: realigned USE_* options in make help - DEV: makefile: fix POSIX compatibility for "range" target - IMPORT: plock: also support inlining the int code - IMPORT: plock: always expose the inline version of the lock wait function - IMPORT: lorw: support inlining the wait call - MINOR: threads: inline the wait function for pthread_rwlock emulation - MINOR: atomic: make sure to always relax after a failed CAS - MINOR: pools: use EBO to wait for unlock during pool_flush() - BUILD/IMPORT: fix compilation with PLOCK_DISABLE_EBO=1 - MINOR: quic+openssl_compat: Do not start without "limited-quic" - MINOR: quic+openssl_compat: Emit an alert for "allow-0rtt" option - BUG/MINOR: quic: allow-0rtt warning must only be emitted with quic bind - BUG/MINOR: quic: ssl_quic_initial_ctx() uses error count not error code - MINOR: pattern: do not needlessly lookup the LRU cache for empty lists - IMPORT: xxhash: update xxHash to version 0.8.2 - MINOR: proxy: simplify parsing 'backend/server' - MINOR: connection: centralize init/deinit of backend elements - MEDIUM: connection: implement passive reverse - MEDIUM: h2: reverse connection after SETTINGS reception - MINOR: server: define reverse-connect server - MINOR: backend: only allow reuse for reverse server - MINOR: tcp-act: parse 'tcp-request attach-srv' session rule - 
REGTESTS: provide a reverse-server test - MINOR: tcp-act: define optional arg name for attach-srv - MINOR: connection: use attach-srv name as SNI reuse parameter on reverse - REGTESTS: provide a reverse-server test with name argument - MINOR: proto: define dedicated protocol for active reverse connect - MINOR: connection: extend conn_reverse() for active reverse - MINOR: proto_reverse_connect: parse rev@ addresses for bind - MINOR: connection: prepare init code paths for active reverse - MEDIUM: proto_reverse_connect: bootstrap active reverse connection - MINOR: proto_reverse_connect: handle early error before reversal - MEDIUM: h2: implement active connection reversal - MEDIUM: h2: prevent stream opening before connection reverse completed - REGTESTS: write a full reverse regtest - BUG/MINOR: h2: fix reverse if no timeout defined - CI: fedora: fix "dnf" invocation syntax - BUG/MINOR: hlua_fcn: potentially unsafe stktable_data_ptr usage - DOC: lua: fix Sphinx warning from core.get_var() - DOC: lua: fix core.register_action typo - BUG/MINOR: ssl_sock: fix possible memory leak on OOM - MEDIUM: map/acl: Improve pat_ref_set() efficiency (for "set-map", "add-acl" action perfs) - MEDIUM: map/acl: Improve pat_ref_set_elt() efficiency (for "set-map", "add-acl"action perfs) - MEDIUM: map/acl: Accelerate several functions using pat_ref_elt struct ->head list - MEDIUM: map/acl: Replace map/acl spin lock by a read/write lock. - DOC: map/acl: Remove the comments about map/acl performance issue - DOC: Explanation of be_name and be_id fetches - MINOR: connection: simplify removal of idle conns from their trees - MINOR: server: move idle tree insert in a dedicated function - MAJOR: connection: purge idle conn by last usage	2023-08-25 17:57:22 +02:00
Amaury Denoyelle	5afcb686b9	MAJOR: connection: purge idle conn by last usage Backend idle connections are purged on a recurring basis during the process lifetime. An estimated number of needed connections is calculated and the excess is removed periodically. Before this patch, purge was done directly using the idle then the safe connection tree of a server instance. This has the major drawback of taking no account of any specific order, so it may remove functional connections while leaving ones which will fail on the next reuse. The problem can be worse when using criteria to differentiate idle connections such as the SSL SNI. In this case, purge may remove connections with a high reuse rate while leaving connections with criteria never matched once, thus drastically reducing the reuse rate. To improve this, introduce an alternative storage for idle connections, used in parallel with the idle/safe trees. Now, each connection inserted in one of these trees is also inserted in the new list at `srv_per_thread.idle_conn_list`. This guarantees that recently used connections are present at the end of the list. During the purge, use this list instead of the idle/safe trees. Remove first the connections at the front of the list, which were not reused recently. This will ensure that connections that are frequently reused are not purged and should increase the reuse rate, particularly if distinct idle connection criteria are in use.	2023-08-25 15:57:48 +02:00
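
The purge ordering described in commit 5afcb686b9 can be illustrated with a minimal sketch. This is not HAProxy's code: `demo_conn`, `demo_idle_list` and the `idle_list_*` helpers are hypothetical names invented for the illustration, and the plain doubly linked list below only stands in for the per-thread list that the commit names `srv_per_thread.idle_conn_list`. It assumes a connection is appended at the tail whenever it becomes idle and unlinked when it is reused, so the front of the list always holds the connections that have been idle the longest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified connection (not HAProxy's struct connection). */
struct demo_conn {
    int id;
    struct demo_conn *prev, *next;   /* links in the idle list */
};

/* Hypothetical stand-in for a per-thread idle connection list. */
struct demo_idle_list {
    struct demo_conn head;   /* sentinel: head.next is oldest, head.prev newest */
    size_t count;
};

static void idle_list_init(struct demo_idle_list *l)
{
    l->head.prev = l->head.next = &l->head;
    l->count = 0;
}

/* A connection becomes idle: append it at the tail (most recently used). */
static void idle_list_add(struct demo_idle_list *l, struct demo_conn *c)
{
    c->prev = l->head.prev;
    c->next = &l->head;
    l->head.prev->next = c;
    l->head.prev = c;
    l->count++;
}

/* A connection is reused (or purged): unlink it from the list. */
static void idle_list_del(struct demo_idle_list *l, struct demo_conn *c)
{
    assert(l->count > 0);
    c->prev->next = c->next;
    c->next->prev = c->prev;
    c->prev = c->next = NULL;
    l->count--;
}

/* Purge from the front until at most 'keep' connections remain.
 * Returns the number of connections removed. */
static size_t idle_list_purge(struct demo_idle_list *l, size_t keep)
{
    size_t removed = 0;

    while (l->count > keep) {
        struct demo_conn *oldest = l->head.next;

        idle_list_del(l, oldest);
        /* a real implementation would also close the connection here */
        removed++;
    }
    return removed;
}
```

In this sketch, a connection that is reused and then released again moves to the tail, so a purge that keeps the `keep` newest entries never touches it before older, unused connections.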


20653 Commits