Commit Graph

Willy Tarreau
8c855f6cff MINOR: leastconn: take the queue length into account when queuing servers
When servers are queued into the leastconn tree, it's important to also
consider their queue length. There could be some servers with lots of
queued requests that we don't want to hammer with extra connections. In
order not to add extra stress to the LB algorithm, we don't update the
value when adding to the queue, only when updating the connection count
(i.e. picking from the queue or releasing a connection). This will be
sufficient to significantly improve the fairness in such situations.
2020-10-22 18:30:18 +02:00
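
An illustrative sketch of the idea, not the actual fwlc code: the sort key counts queued requests as load, so heavily queued servers sort later in the tree. The field names and scaling factor below are assumptions.

  /* Illustrative only: a leastconn-style key that folds the queue length
   * into the load, so servers with long queues are picked less often. */
  struct lc_server {
      unsigned int served;   /* connections currently being served */
      unsigned int queued;   /* requests waiting in this server's queue */
      unsigned int eweight;  /* effective weight, always > 0 */
  };

  /* smaller key = picked first; queued requests now count as load */
  static unsigned long long lc_key(const struct lc_server *s)
  {
      return ((unsigned long long)s->served + s->queued) * 65536ULL
              / s->eweight;
  }
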
Willy Tarreau
96bca33d75 OPTIM: queue: decrement the nbpend and totpend counters outside of the lock
We don't need to do that inside the lock. However, since the operation
used to be done in deep functions, we have to make it resurface closer
to visible parts. It remains reasonably self-contained in queue.c so
that's not that big of a deal. Some places (redistribute) could benefit
from a single operation for all counts at once. Others like
pendconn_process_next_strm() are still called with both locks held but
now it will be possible to change this.
2020-10-22 17:32:28 +02:00
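
A hedged sketch of the pattern, using C11 atomics and a POSIX mutex rather than HAProxy's own primitives; the names are stand-ins for the nbpend/totpend handling in queue.c.

  #include <pthread.h>
  #include <stdatomic.h>

  struct elem;                       /* opaque queue element */

  struct queue {
      pthread_mutex_t lock;          /* protects the linked structure only */
      struct elem *head;
      atomic_uint nbpend;            /* per-queue pending counter */
  };

  /* Unlink under the lock, but update the counters after releasing it:
   * being atomic, they no longer need the lock's protection. */
  void dequeue_one(struct queue *q, atomic_uint *totpend)
  {
      pthread_mutex_lock(&q->lock);
      /* ... unlink one element from q->head ... */
      pthread_mutex_unlock(&q->lock);

      atomic_fetch_sub_explicit(&q->nbpend, 1, memory_order_relaxed);
      atomic_fetch_sub_explicit(totpend, 1, memory_order_relaxed);
  }
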
Willy Tarreau
56c1cfb179 OPTIM: queue: make the nbpend counters atomic
Instead of incrementing, decrementing them and updating their max under
the lock, make them atomic and keep them out of the lock as much as
possible. For __pendconn_unlink_*, it would be wise to move
these counters outside of the function, into the callers, so that a
single atomic op can be done per counter even for groups of operations.
2020-10-22 17:32:28 +02:00
Willy Tarreau
c7eedf7a5a MINOR: queue: reduce the locked area in pendconn_add()
Similarly to previous changes, we know whether we're dealing with a
server or a proxy lock, so let's lock directly at the finest possible
places. It's worth noting that the part of the operation consisting of
an increment and an update of a max could be done outside of the lock
using atomic ops and a CAS.
2020-10-22 17:32:28 +02:00
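
The increment-plus-max update mentioned above might look like this with C11 atomics; a sketch, not the actual queue code.

  #include <stdatomic.h>

  /* Atomically increment a counter and raise its recorded maximum with
   * no lock: losers of the CAS reload the max and retry only while
   * their value is still above it. */
  static void inc_and_track_max(atomic_uint *count, atomic_uint *max)
  {
      unsigned int cur = atomic_fetch_add(count, 1) + 1;
      unsigned int old = atomic_load(max);

      while (cur > old &&
             !atomic_compare_exchange_weak(max, &old, cur))
          ;   /* <old> was refreshed by the failed CAS */
  }
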
Willy Tarreau
3e3ae2524d MINOR: queue: split __pendconn_unlink() in per-srv and per-prx
The function is called with the lock held and does too many tests for
things that are already known from its callers. Let's split it in two
so that its callers call either the per-server or per-proxy function
depending on where the element is (since they had to determine it
prior to taking the lock).
2020-10-22 17:32:28 +02:00
Willy Tarreau
b7ba1d9011 OPTIM: queue: don't call pendconn_unlink() when the pendconn is not queued
On connection error processing, we can see massive storms of calls to
pendconn_cond_unlink() to release a possible place in the queue. For
example, in issue #908, on average half of the threads are caught in
this function via back_try_conn_req() consecutive to a synchronous
error. However we wait until grabbing the lock to know if the pendconn
is effectively in a queue, which is expensive for many cases. We know
the transition may only happen from in-queue to out-of-queue so it's safe
to first run a preliminary check to see if it's worth going further. This
will allow us to avoid the cost of locking for most requests. This should
not change anything for those completing correctly as they're already
run through pendconn_free() which doesn't call pendconn_cond_unlink()
unless deemed necessary.
2020-10-22 17:32:28 +02:00
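
A sketch of that double-check, with a plain flag and a POSIX mutex standing in for the pendconn's tree node and the queue lock.

  #include <pthread.h>
  #include <stdatomic.h>

  static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;

  struct pconn {
      atomic_int queued;        /* set once at queue time, cleared once */
  };

  /* The queued->unqueued transition is one-way, so a lock-free peek can
   * reject the common "not queued" case without taking the lock. */
  void cond_unlink(struct pconn *p)
  {
      if (!atomic_load(&p->queued))
          return;                       /* fast path: nothing to do */

      pthread_mutex_lock(&queue_lock);
      if (atomic_load(&p->queued)) {    /* recheck now that we own the lock */
          /* ... unlink from the queue ... */
          atomic_store(&p->queued, 0);
      }
      pthread_mutex_unlock(&queue_lock);
  }
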
Willy Tarreau
5503908bdc MINOR: proxy/cli: only take a read lock in "show errors"
There's no point having an exclusive lock here, nothing is modified.
2020-10-22 17:32:28 +02:00
Willy Tarreau
595e767030 MINOR: server: read-lock the cookie during srv_set_dyncookie()
No need to use an exclusive lock on the proxy anymore when reading its
settings; a read lock is enough. A few other places continue to use a
write-lock when modifying simple flags only in order to let this
function see a consistent value all along. This might be changed in
the future using barriers and local copies.
2020-10-22 17:32:28 +02:00
Willy Tarreau
ac66d6bafb MINOR: proxy: replace the spinlock with an rwlock
This is an anticipation of finer grained locking for the queues. For now
all lock places take a write lock so that there is no difference at all
with previous code.
2020-10-22 17:32:28 +02:00
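
The conversion, sketched with POSIX locks: taking the rwlock in write mode everywhere reproduces the spinlock's exclusivity exactly while leaving read paths free to be relaxed later. Names are stand-ins.

  #include <pthread.h>

  static pthread_rwlock_t px_lock = PTHREAD_RWLOCK_INITIALIZER;

  /* write side: exactly as exclusive as the former spinlock */
  void proxy_modify(void)
  {
      pthread_rwlock_wrlock(&px_lock);
      /* ... update proxy state ... */
      pthread_rwlock_unlock(&px_lock);
  }

  /* future read side: can move to rdlock without touching the writers */
  void proxy_read(void)
  {
      pthread_rwlock_rdlock(&px_lock);
      /* ... read proxy settings ... */
      pthread_rwlock_unlock(&px_lock);
  }
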
Willy Tarreau
de785f04e1 MINOR: threads/debug: only report lock stats for used operations
In addition to the previous simplification, most locks don't use the
seek or read lock (e.g. spinlocks etc) so let's split the dump into
distinct operations (write/seek/read) and only report those which
were used. Now the output size is roughly one fifth of what it
used to be.
2020-10-22 17:32:28 +02:00
Willy Tarreau
23d3b00bdd MINOR: threads/debug: only report used lock stats
The lock stats are very verbose and more than half of them are used in
a typical test, making it hard to spot the sought values. Let's simply
report "not used" for those which have not been called at all.
2020-10-22 17:32:28 +02:00
Christopher Faulet
9a3d3fcb5d BUG/MAJOR: mux-h2: Don't try to send data if we know it is no longer possible
In h2_send(), if we are in a state where we know it is no longer possible to
send data, we must exit the sending loop to avoid any possibility of looping
forever. It may happen if the mbuf ring is released while the H2_CF_MUX_MFULL
flag is still set. Here is a possible scenario to trigger the bug:

  1) The mbuf ring is full because we are unable to send data. The
     H2_CF_MUX_MFULL flag is set on the H2 connection.

  2) At this stage, the task timeout expires because the H2 connection is
     blocked. We enter the h2_timeout_task() function. Because the mbuf ring is
     full, we cannot send the GOAWAY frame. Thus the H2_CF_GOAWAY_FAILED flag is
     set. The H2 connection is not released yet because there is still a stream
     attached. Here we leave the h2_timeout_task() function.

  3) A bit later, the H2 connection is woken up. In h2_process(), nothing is
     performed by the first attempt to send data, in h2_send(). Then, because
     the H2_CF_GOAWAY_FAILED flag is set, the mbuf ring is released. But the
     H2_CF_MUX_MFULL flag is still there. At this step a second attempt to send
     data is performed.

  4) In h2_send(), we try to send data in a loop. To exit this loop, the done
     variable must be set to 1. Because the H2_CF_MUX_MFULL flag is set, we
     don't call h2_process_mux() and done is not updated. Because the mbuf ring
     is now empty, nothing is sent and the H2_CF_MUX_MFULL flag is never
     removed. Now, we loop forever... waiting for the watchdog.

To fix the bug, we now exit the loop if one of these conditions is true:

  - The H2_CF_GOAWAY_FAILED flag is set on the H2 connection
  - The CO_FL_SOCK_WR_SH flag is set on the underlying connection
  - The H2 connection is in the H2_CS_ERROR2 state

This patch should fix issue #912 and most probably #875. It must be
backported as far as 1.8.
2020-10-22 17:13:22 +02:00
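
A simplified sketch of the fixed loop shape; the flag and state names come from the message above, while their values and the surrounding types are made up for illustration.

  #include <stdbool.h>

  #define H2_CF_MUX_MFULL      0x0001   /* illustrative bit values */
  #define H2_CF_GOAWAY_FAILED  0x0002
  #define CO_FL_SOCK_WR_SH     0x0004
  enum { H2_CS_ERROR2 = 6 };            /* illustrative state value */

  struct h2c_sk  { unsigned int flags; int st0; };
  struct conn_sk { unsigned int flags; };

  static void h2_send_sketch(struct h2c_sk *h2c, struct conn_sk *conn)
  {
      bool done = false;

      while (!done) {
          /* the fix: states from which no send can ever succeed again
           * must break out instead of waiting for <done> to be set */
          if ((h2c->flags & H2_CF_GOAWAY_FAILED) ||
              (conn->flags & CO_FL_SOCK_WR_SH) ||
              h2c->st0 == H2_CS_ERROR2)
              break;

          /* ... fill and flush the mbuf ring; set done = true when all
           * data was sent or no further progress is possible ... */
          done = true;   /* placeholder so this sketch terminates */
      }
  }
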
Christopher Faulet
d6c48366b8 BUG/MINOR: http-ana: Don't send payload for internal responses to HEAD requests
When an internal response is returned to a client, the message payload must be
skipped if it is a reply to a HEAD request. The payload is removed from the HTX
message just before the message forwarding.

This bug has been around for a long time. It was already there in the pre-HTX
versions. In legacy HTTP mode, internal errors are not parsed, so this bug
cannot be easily fixed there. Thus, this patch should only be backported to HTX
versions, as far as 2.0. However, the code has significantly changed in
2.2; for 2.1 and 2.0, the patch must be entirely reworked.
2020-10-22 17:13:22 +02:00
Tim Duesterhus
6414cd1fc0 CLEANUP: compression: Make use of http_get_etag_type()
This commit makes the compressor use http_get_etag_type to validate the
ETag instead of using an ad-hoc condition.
2020-10-22 16:59:36 +02:00
Remi Tricot-Le Breton
5bbdc81cf1 REGTEST: cache: Add if-none-match test case
Test that the if-none-match header is properly taken into account and that
when the conditions are fulfilled, a "304 Not Modified" response can be
sent to the client.

Co-authored-by: Tim Duesterhus <tim@bastelstu.be>
2020-10-22 16:10:20 +02:00
Remi Tricot-Le Breton
6cb10384a3 MEDIUM: cache: Add support for 'If-None-Match' request header
Partial support of conditional HTTP requests. This commit adds the
support of the 'If-None-Match' header (see RFC 7232#3.2).
When a client specifies a list of ETags through one or more
'If-None-Match' headers, they are all compared to the one that might have
been stored in the corresponding http cache entry until one of them
matches.
If a match happens, a specific "304 Not Modified" response is
sent instead of the cached data. This response has all the stored
headers but no other data (see RFC 7232#4.1). Otherwise, the whole cached data
is sent.
Although unlikely in a GET/HEAD request, the "If-None-Match: *" syntax is
valid and also receives a "304 Not Modified" response (RFC 7234#4.3.2).

This resolves a part of GitHub issue #821.
2020-10-22 16:10:20 +02:00
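
A self-contained sketch of the matching logic, with plain C strings standing in for HAProxy's ist/htx types; the weak comparison follows the helper described further down this log.

  #include <stdbool.h>
  #include <string.h>

  /* Weak comparison (RFC 7232#2.3.2): drop any "W/" prefix, then compare. */
  static bool etag_weak_eq(const char *a, const char *b)
  {
      if (!strncmp(a, "W/", 2)) a += 2;
      if (!strncmp(b, "W/", 2)) b += 2;
      return strcmp(a, b) == 0;
  }

  /* Return true when a 304 must be sent: the client listed "*" or one of
   * its ETags matches the one stored with the cache entry. */
  static bool if_none_match_hits(const char *stored,
                                 const char *const *req_etags, size_t n)
  {
      for (size_t i = 0; i < n; i++) {
          if (!strcmp(req_etags[i], "*") || etag_weak_eq(req_etags[i], stored))
              return true;
      }
      return false;   /* no match: serve the full cached response */
  }
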
Remi Tricot-Le Breton
dbb65b5a7a MEDIUM: cache: Store the ETag information in the cache_entry
When sent by a server for a given resource, the ETag header is
stored in the corresponding cache entry (as any other header). So in
order to perform future ETag comparisons (for subsequent conditional
HTTP requests), we keep the length of the ETag and its offset
relative to the start of the cache_entry.
If no ETag header exists, the length and offset are zero.
2020-10-22 16:10:20 +02:00
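
The bookkeeping can be as small as two fields; the names below are assumptions, since the actual cache_entry layout isn't shown here.

  /* Sketch: the ETag text already lives among the stored headers, so
   * the entry only records where to find it. */
  struct cache_entry_sk {
      /* ... node, hash, expiry, stored headers and body follow ... */
      unsigned int etag_length;   /* 0 when the response had no ETag */
      unsigned int etag_offset;   /* from the start of the cache_entry */
  };
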
Remi Tricot-Le Breton
bcced09b91 MINOR: http: Add etag comparison function
Add a function that compares two etags that might be of different types.
If any of them is weak, the 'W/' prefix is discarded and a strict string
comparison is performed.

Co-authored-by: Tim Duesterhus <tim@bastelstu.be>
2020-10-22 16:06:20 +02:00
Tim Duesterhus
2493ee81d4 MINOR: http: Add enum etag_type http_get_etag_type(const struct ist)
http_get_etag_type returns whether a given `etag` is a strong, weak, or invalid
ETag.
2020-10-22 16:02:29 +02:00
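
A stand-alone analogue of the classifier operating on plain C strings (the real function takes an ist); the enum constant names are illustrative.

  #include <string.h>

  enum etag_type_sk { ETAG_INVALID_SK, ETAG_STRONG_SK, ETAG_WEAK_SK };

  /* Strong ETags look like "xyz", weak ones like W/"xyz"; anything
   * else is invalid. */
  static enum etag_type_sk get_etag_type_sk(const char *etag)
  {
      size_t len = strlen(etag);

      if (len >= 4 && !strncmp(etag, "W/\"", 3) && etag[len - 1] == '"')
          return ETAG_WEAK_SK;
      if (len >= 2 && etag[0] == '"' && etag[len - 1] == '"')
          return ETAG_STRONG_SK;
      return ETAG_INVALID_SK;
  }
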
Willy Tarreau
1e690bb6c4 BUG/MEDIUM: server: support changing the slowstart value from state-file
If the slowstart value in a state file implies the latest state change
is within the slowstart period, we end up calling srv_update_status()
to reschedule the server's state change but its task is not yet
allocated and remains null, causing a crash on startup.

Make sure srv_update_status() supports being called with partially
initialized servers which do not yet have a task. If the task has to
be scheduled, it will necessarily happen after initialization since
it will result from a state change.

This should be backported wherever server-state is present.
2020-10-22 12:07:07 +02:00
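
A sketch of the guard; the real srv_update_status() is much larger, and these names are simplified stand-ins.

  struct task_sk;
  static void task_schedule_sk(struct task_sk *t, long long when)
  { (void)t; (void)when; /* stub */ }

  struct server_sk {
      struct task_sk *task;    /* still NULL while parsing a state file */
  };

  void srv_update_status_sk(struct server_sk *srv, long long next_change)
  {
      /* ... recompute the server state ... */

      /* during startup the task may not be allocated yet; a later state
       * change will reschedule once initialization has completed */
      if (srv->task)
          task_schedule_sk(srv->task, next_change);
  }
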
Willy Tarreau
5c643f37d0 BUILD: makefile: add entries to build common debugging tools
A few tools in contrib/ such as halog, flags, poll and tcploop are
occasionally useful at least to developers, and some of them such as
halog or flags can occasionally break due to some changes in the include
files. As reported in issue #907, their inability to inherit the global
build options also causes some warnings related to some specificities
of the main include files. Let's just add entries in the main makefile
to build them.
2020-10-22 05:17:08 +02:00
Willy Tarreau
9018ca9655 CONTRIB: tcploop: remove unused local variables in tcp_pause()
Building with -Wall shows that "pollfd" and "ret" are not used. Silly
copy-paste...
2020-10-22 05:17:08 +02:00
Willy Tarreau
ef71f0194c BUG/MINOR: queue: properly report redistributed connections
In commit 5cd4bbd7a ("BUG/MAJOR: threads/queue: Fix thread-safety issues
on the queues management") the counter of transferred connections was
accidentally lost, so that when a server goes down with connections in its
queue, it will always be reported that 0 connections were transferred.

This should be backported as far as 1.8 since the patch above was
backported there.
2020-10-21 12:04:53 +02:00
William Lallemand
8e8581e242 MINOR: ssl: 'ssl-load-extra-del-ext' removes the certificate extension
In issue #785, users are reporting that it's not convenient to load a
".crt.key" when the configuration contains a ".crt".

This option allows removing the extension of the certificate before
trying to load any extra SSL file (.key, .ocsp, .sctl, .issuer etc.)

The patch slightly changes the way ssl_sock_load_files_into_ckch()
looks for the file.
2020-10-20 18:25:46 +02:00
William Dauchy
835712ad90 BUG/MINOR: listener: close before free in listener_accept
safer to close the handle before the object is put back in the global pool.

this was introduced by commit 9378bbe0be ("MEDIUM: listener:
use protocol->accept_conn() to accept a connection")

this should fix github issue #902

no backport needed.

Signed-off-by: William Dauchy <wdauchy@gmail.com>
2020-10-20 15:40:36 +02:00
Willy Tarreau
f42d794d96 MEDIUM: config: report that "nbproc" is deprecated
As previously discussed, nbproc usage is bad, deprecated, and scheduled
for removal in 2.5.

If "nbproc" is found with more than one process while nbthread is not
set, a warning will be emitted encouraging the user to remove it or to
migrate to nbthread instead. This makes sure the user has an opportunity to
both see the message and silence it.
2020-10-20 11:54:49 +02:00
Christopher Faulet
96ddc8ab43 BUG/MEDIUM: connection: Never cleanup server lists when freeing private conns
When a connection is released, depending on its state, it may be detached from
the session and it may be removed from the server lists. The first case may
happen for private or unsharable active connections. The second one should only
be performed for idle or available connections. We never try to remove a
connection from the server list if it is attached to a session. But it is also
important to never try to remove a private connection from the server lists, even
if it is not attached to a session. Otherwise, the curr_used_conn server counter
is decremented once too often.

This bug was introduced by the commit 04a24c5ea ("MINOR: connection: don't check
priv flag on free"). It is related to the issue #881. It only affects the 2.3,
no backport is needed.
2020-10-19 17:19:10 +02:00
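
A sketch of the tightened condition; CO_FL_PRIVATE is the real flag name from the message, while the bit value and surrounding types are stand-ins.

  #define CO_FL_PRIVATE_SK 0x01      /* stands in for the real CO_FL_PRIVATE */

  struct conn_sk2 {
      unsigned int flags;
      void *owner;                   /* owning session, NULL when detached */
  };

  static void srv_list_del_sk(struct conn_sk2 *c) { (void)c; /* stub */ }

  /* Only shared, session-less connections ever sit in the server's idle
   * or available lists; removing a private one would decrement
   * curr_used_conn once too often, as described above. */
  void conn_release_sk(struct conn_sk2 *conn)
  {
      if (!conn->owner && !(conn->flags & CO_FL_PRIVATE_SK))
          srv_list_del_sk(conn);
  }
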
Willy Tarreau
69a7b8fc6c CLEANUP: task: remove the unused and mishandled global_rqueue_size
This counter is only updated and never used, and in addition it's done
without any atomicity so it's very unlikely to be correct on multi-CPU
systems! Let's just remove it since it's not used.
2020-10-19 14:08:13 +02:00
Willy Tarreau
e72a3f4489 CLEANUP: tree-wide: reorder a few structures to plug some holes around locks
A few structures were slightly rearranged in order to plug some holes
left around the locks. Sizes ranging from 8 to 32 bytes could be saved
depending on the structures. No performance difference was noticed (none
was expected there), though memory usage might be slightly reduced in
some rare cases.
2020-10-19 14:08:13 +02:00
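
An illustration of the kind of padding hole being plugged on an LP64 target; these are not the actual structures.

  #include <stdio.h>

  struct before {                /* 32 bytes on LP64 */
      void *a;                   /* 8 */
      unsigned int lock;         /* 4 + 4 padding to realign the pointer */
      void *b;                   /* 8 */
      unsigned int count;        /* 4 + 4 tail padding */
  };

  struct after {                 /* 24 bytes: the 4-byte fields pair up */
      void *a;                   /* 8 */
      void *b;                   /* 8 */
      unsigned int lock;         /* 4 */
      unsigned int count;        /* 4 */
  };

  int main(void)
  {
      printf("before: %zu bytes, after: %zu bytes\n",
             sizeof(struct before), sizeof(struct after));
      return 0;
  }
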
Willy Tarreau
8f1f177ed0 MINOR: threads: change lock_t to an unsigned int
We don't need to waste the size of a long for the locks: with the plocks,
even an unsigned short would offer enough room for up to 126 threads! Let's
use an unsigned int which will be easier to place in certain structures
and will more conveniently plug some holes, and atomic ops are at least
as fast on 32-bit as on 64-bit. This will not change anything for 32-bit
platforms.
2020-10-19 14:08:13 +02:00
Willy Tarreau
3d18498645 CLEANUP: threads: don't register an initcall when not debugging
It's a bit overkill to register an initcall to call a function to set
a lock to zero when not debugging, let's just declare the lock as
pre-initialized to zero.
2020-10-19 14:08:13 +02:00
Ilya Shipitsin
fcb69d768b BUILD: ssl: make BoringSSL use its own version numbers
BoringSSL is a fork of OpenSSL 1.1.0; however, in commit
49e9f67d8b7cbeb3953b5548ad1009d15947a523 it changed its version to 1.1.1.

Should fix issue #895.

This must be backported to 2.2, 2.1, 2.0, 1.8
2020-10-19 11:34:37 +02:00
Ilya Shipitsin
b3201a3e07 BUG/MINOR: disable dynamic OCSP load with BoringSSL
it was accidentally enabled on BoringSSL while
it is actually not supported

wla: Fix part of the issue mentioned in #895.
It fixes build of boringSSL versions prior to commit
https://boringssl.googlesource.com/boringssl/+/49e9f67d8b7cbeb3953b5548ad1009d15947a523

Must be backported in 2.2.

Signed-off-by: William Lallemand <wlallemand@haproxy.org>
2020-10-19 11:00:51 +02:00
Willy Tarreau
4b6e3c284a MINOR: lb/chash: use a read lock in chash_get_server_hash()
When using a low hash-balance-factor value, it's possible to loop
many times trying to find the best server. Figures on the order of
100-300 iterations were observed for 1000 servers with a factor of 101
(which seems a bit excessive for such a large farm). Given that
there's nothing in that function that prevents multiple threads
from working in parallel, let's switch to a read lock. Tests on
8 threads show roughly a 2% performance increase with this.
2020-10-17 20:15:49 +02:00
Willy Tarreau
f76a21f78c MINOR: lb/first: use a read lock in fas_get_next_server()
The "first" algorithm creates a lot of contention because all threads
focus on the same server by definition (the first available one). By
turning the exclusive lock to a read lock in fas_get_next_server(),
the request rate increases by 16% for 8 threads when many servers are
getting close to their maxconn.
2020-10-17 19:49:49 +02:00
Willy Tarreau
58bc9c1ced MINOR: lb/leastconn: only take a read lock in fwlc_get_next_server()
This function doesn't change the tree, it only looks for the first
usable server, so let's do that under a read lock to limit the
situations like the ones described in issue #881 where finding a
usable server when dealing with lots of saturated ones can be
expensive. At least threads will now be able to look up in
parallel.

It's interesting to note that s->served is not incremented during the
server choice, nor is the server repositioned. So right now already,
nothing prevents multiple threads from picking the same server. This
will not cause a significant imbalance anyway given that the server
will automatically be repositioned at the right place, but this might
be something to improve in the future if it doesn't come with too high
a cost.

It also looks like the way a server's weight is updated could be
revisited so that the write lock gets tighter at the expense of a
short period of inconsistency between the weights and the servers still
present in the tree.
2020-10-17 19:37:40 +02:00
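
The shape of the change, sketched with a POSIX rwlock; the tree walk is elided and the types are stand-ins.

  #include <pthread.h>
  #include <stddef.h>

  struct server_sk3;
  struct fwlc_tree_sk {
      pthread_rwlock_t lock;
      /* ... root of the servers sorted by load ... */
  };

  /* The lookup only reads the tree, so threads may now search in
   * parallel; repositioning the chosen server still happens later,
   * under a write lock. */
  struct server_sk3 *fwlc_next_server_sk(struct fwlc_tree_sk *t)
  {
      struct server_sk3 *srv = NULL;

      pthread_rwlock_rdlock(&t->lock);
      /* ... walk from the leftmost node to the first usable server ... */
      pthread_rwlock_unlock(&t->lock);
      return srv;
  }
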
Willy Tarreau
ae99aeb135 MINOR: lb/map: use seek lock and read locks where appropriate
- map_get_server_hash() doesn't need a write lock since it only
  reads the array, let's only use a read lock here.

- map_get_server_rr() only needs exclusivity to adjust the rr_idx
  while looking for its entry. Since this one is not used by
  map_get_server_hash(), let's turn this lock to a seek lock that
  doesn't block reads.

With 8 threads, no significant performance difference was noticed
given that lookups are usually instant with this LB algo so the
lock contention is rare.
2020-10-17 19:04:27 +02:00
Willy Tarreau
cd10def825 MINOR: backend: replace the lbprm lock with an rwlock
It was previously a spinlock, and it happens that a number of LB algos
only lock it for lookups, without performing any modification. Let's
first turn it to an rwlock and w-lock it everywhere. This is strictly
identical.

It was carefully checked that every HA_SPIN_LOCK() was turned to
HA_RWLOCK_WRLOCK() and that HA_SPIN_UNLOCK() was turned to
HA_RWLOCK_WRUNLOCK() on this lock. _INIT and _DESTROY were updated too.
2020-10-17 18:51:41 +02:00
Willy Tarreau
9d58c9b251 [RELEASE] Released version 2.3-dev7
Released version 2.3-dev7 with the following main changes :
    - CI: travis-ci: replace not defined SSL_LIB, SSL_INC for BoringSSL builds
    - BUG/MINOR: init: only keep rlim_fd_cur if max is unlimited
    - BUG/MINOR: mux-h2: do not stop outgoing connections on stopping
    - MINOR: fd: report an error message when failing initial allocations
    - MINOR: proto-tcp: make use of connect(AF_UNSPEC) for the pause
    - MINOR: sock: add sock_accept_conn() to test a listening socket
    - MINOR: protocol: make proto_tcp & proto_uxst report listening sockets
    - MINOR: sockpair: implement the .rx_listening function
    - CLEANUP: tcp: make use of sock_accept_conn() where relevant
    - CLEANUP: unix: make use of sock_accept_conn() where relevant
    - BUG/MINOR: listener: detect and handle shared sockets stopped in other processes
    - CONTRIB: tcploop: implement a disconnect operation 'D'
    - CLEANUP: protocol: initialize all of the sockaddr when disconnecting
    - BUG/MEDIUM: deinit: check fdtab before fdtab[fd].owner
    - BUG/MINOR: connection: fix loop iter on connection takeover
    - BUG/MEDIUM: connection: fix srv idle count on conn takeover
    - MINOR: connection: improve list api usage
    - MINOR: mux/connection: add a new mux flag for HOL risk
    - MINOR: connection: don't check priv flag on free
    - MEDIUM: backend: add new conn to session if mux marked as HOL blocking
    - MEDIUM: backend: add reused conn to sess if mux marked as HOL blocking
    - MEDIUM: h2: remove conn from session on detach
    - MEDIUM: fcgi: remove conn from session on detach
    - DOC: Describe reuse safe for HOL handling
    - MEDIUM: proxy: remove obsolete "mode health"
    - MEDIUM: proxy: remove obsolete "monitor-net"
    - CLEANUP: protocol: remove the ->drain() function
    - CLEANUP: fd: finally get rid of fd_done_recv()
    - MINOR: connection: make sockaddr_alloc() take the address to be copied
    - MEDIUM: listener: allocate the connection before queuing a new connection
    - MINOR: session: simplify error path in session_accept_fd()
    - MINOR: connection: add new error codes for accept_conn()
    - MINOR: sock: rename sock_accept_conn() to sock_accepting_conn()
    - MINOR: protocol: add a new function accept_conn()
    - MINOR: sock: implement sock_accept_conn() to accept a connection
    - MINOR: sockpair: implement sockpair_accept_conn() to accept a connection
    - MEDIUM: listener: use protocol->accept_conn() to accept a connection
    - MEDIUM: listener: remove the second pass of fd manipulation at the end
    - MINOR: protocol: add a default I/O callback and put it into the receiver
    - MINOR: log: set the UDP receiver's I/O handler in the receiver
    - MINOR: protocol: register the receiver's I/O handler and not the protocol's
    - CLEANUP: protocol: remove the now unused <handler> field of proto_fam->bind()
    - DOC: improve the documentation for "option nolinger"
    - BUG/MEDIUM: proxy: properly stop backends
    - BUG/MEDIUM: task: bound the number of tasks picked from the wait queue at once
    - MINOR: threads: augment rwlock debugging stats to report seek lock stats
    - MINOR: threads: add the transitions to/from the seek state
    - MEDIUM: task: use an upgradable seek lock when scanning the wait queue
    - BUILD: listener: avoid a build warning when threads are disabled
    - BUG/MINOR: peers: Possible unexpected peer session reset after collisions.
    - MINOR: ssl: add volatile flags to ssl samples
    - MEDIUM: backend: reuse connection if using a static sni
    - BUG/MEDIUM: spoe: Unset variable instead of set it if no data provided
    - BUG/MEDIUM: mux-h1: Get the session from the H1S when capturing bad messages
    - BUG/MEDIUM: lb: Always lock the server when calling server_{take,drop}_conn
    - DOC: fix typo in MAX_SESS_STKCTR
2020-10-17 10:31:50 +02:00
Matteo Contrini
1857b8cf4d DOC: fix typo in MAX_SESS_STKCTR
MAX_SESS_STKCTR is spelled wrongly a couple of times
in the configuration docs (K and C are swapped). This patch
fixes the typos.
2020-10-17 09:37:25 +02:00
Christopher Faulet
26a52af642 BUG/MEDIUM: lb: Always lock the server when calling server_{take,drop}_conn
The server lock must be held when server_take_conn() and server_drop_conn()
lbprm callback functions are called. It is a documented prerequisite but it is
not always performed. It only affects the leastconn and fas lb algorithms. Others
don't use these callback functions.

A race condition on the next pending effective weight (next_eweight) may be
encountered with the leastconn lb algorithm. An agent check may set it to 0
while fwlc_srv_reposition() is called. The server is locked during the
next_eweight update. But because the server lock is not acquired when
fwlc_srv_reposition() is called, we may use it to recompute the server key,
leading to a division by 0.

This patch must be backported as far as 1.8.
2020-10-17 09:29:43 +02:00
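
A sketch of the prerequisite being enforced, with a POSIX mutex standing in for the server lock.

  #include <pthread.h>

  struct server_sk4 {
      pthread_mutex_t lock;
      unsigned int next_eweight;   /* an agent check may set this to 0 */
  };

  /* The caller must hold the server lock across the lbprm callback so
   * that next_eweight cannot change (or reach 0) while the key is
   * recomputed, avoiding the division by 0 described above. */
  void take_conn_sk(struct server_sk4 *srv)
  {
      pthread_mutex_lock(&srv->lock);
      /* ... fwlc_srv_reposition(): recompute the leastconn key from the
       * weight and reinsert the server in the tree ... */
      pthread_mutex_unlock(&srv->lock);
  }
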
Christopher Faulet
db2c17da60 BUG/MEDIUM: mux-h1: Get the session from the H1S when capturing bad messages
It is not guaranteed that the backend connection has an owner. It is set when
the connection is created. But when the connection is moved in a server idle
list, the connection owner is set to NULL and may never be set again. On the
other hand, when a mux is created or when a CS is attached, the session is
always defined. The H1 stream always keeps a reference to it when it is
created. Thus, when a bad message is captured we should not rely on the
connection owner to retrieve the session. Instead we should get it from the H1
stream.
2020-10-16 19:53:17 +02:00
Christopher Faulet
2469eba20f BUG/MEDIUM: spoe: Unset variable instead of set it if no data provided
If an agent tries to set a variable with the NULL data type, an unset is
performed instead to avoid undefined behavior. Once decoded, such data are
translated to a sample with the type SMP_T_ANY, which is unexpected in
HAProxy. When a variable is set with such a sample, no data are attached
to the variable. Thus, when the
variable is retrieved later in the transaction, the sample data are
uninitialized, leading to undefined behaviors depending on how it is used. For
instance, it leads to a crash if the debug converter is used on such variable.

This patch should fix the issue #855. It must be backported as far as 1.8.
2020-10-16 19:53:17 +02:00
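
A sketch of the decision; SMP_T_ANY is the real type name from the message, while the helpers and constant value are assumed.

  enum { SMP_T_ANY_SK };              /* stands in for the real SMP_T_ANY */

  struct sample_sk { int type; /* ... value union ... */ };

  static void var_set_sk(const char *name, const struct sample_sk *smp)
  { (void)name; (void)smp; /* stub */ }
  static void var_unset_sk(const char *name)
  { (void)name; /* stub */ }

  /* A NULL-typed value decoded from the agent carries no data: storing
   * it would leave the variable's sample uninitialized, so unset the
   * variable instead. */
  void spoe_set_var_sk(const char *name, struct sample_sk *smp)
  {
      if (smp->type == SMP_T_ANY_SK)
          var_unset_sk(name);
      else
          var_set_sk(name, smp);
  }
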
Amaury Denoyelle
7239c24986 MEDIUM: backend: reuse connection if using a static sni
Detect whether the SNI uses a constant value and, if so, allow this
connection to be reused for later sessions. Use a combination of
SMP_USE_INTRN + !SMP_F_VOLATILE to consider a sample as a constant value.

This feature has been requested in GitHub issue #371.
2020-10-16 17:48:01 +02:00
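
The constancy test, sketched; the flag names come from the message above, while the bit values and types are made up.

  #define SMP_USE_INTRN_SK   0x01    /* illustrative bits for the real flags */
  #define SMP_F_VOLATILE_SK  0x02

  struct sni_expr_sk { unsigned int use; unsigned int flags; };

  /* A sample built only from internal state and not marked volatile
   * yields the same SNI for every session, so the connection keyed on
   * it can be shared. */
  int sni_is_constant_sk(const struct sni_expr_sk *e)
  {
      return (e->use & SMP_USE_INTRN_SK) && !(e->flags & SMP_F_VOLATILE_SK);
  }
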
Amaury Denoyelle
2f0a797631 MINOR: ssl: add volatile flags to ssl samples
The ssl samples are not constant over time and change according to the
session. Add the flag SMP_F_VOL_SESS to indicate this.
2020-10-16 17:47:29 +02:00
Frédéric Lécaille
baeb919177 BUG/MINOR: peers: Possible unexpected peer session reset after collisions.
During a peers session collision (two peer sessions opened on both sides), we
must mark the peer whose session will be shut down as alive; otherwise the
->reconnect timer will be set to a wrong value if the synchro task expires
after the peer has been reconnected. This possibly leads to unexpected
disconnections during handshakes.
Furthermore, this patch cancels any heartbeat transmission when a
reconnection is prepared.
2020-10-16 17:45:58 +02:00
Willy Tarreau
0aa5a5b175 BUILD: listener: avoid a build warning when threads are disabled
It's just a __decl_thread() that appeared before the last variable.
2020-10-16 17:43:04 +02:00
Willy Tarreau
d48ed6643b MEDIUM: task: use an upgradable seek lock when scanning the wait queue
Right now when running a configuration with many global timers (e.g. many
health checks), there is a lot of contention on the global wait queue
lock because all threads queue up in front of it to scan it.

With 2000 servers checked every 10 milliseconds (200k checks per second),
after 23 seconds running on 8 threads, the lock stats were this high:

  Stats about Lock TASK_WQ:
      write lock  : 9872564
      write unlock: 9872564 (0)
      wait time for write     : 9208.409 msec
      wait time for write/lock: 932.727 nsec
      read lock   : 240367
      read unlock : 240367 (0)
      wait time for read      : 149.025 msec
      wait time for read/lock : 619.991 nsec

i.e. ~5% of the total runtime spent waiting on this specific lock.

With upgradable locks we don't need to work like this anymore. We
can just try to upgrade the read lock to a seek lock before scanning
the queue, then upgrade the seek lock to a write lock for each element
we want to delete there and immediately downgrade it to a seek lock.

The benefit is twofold:
  - all other threads which need to call next_expired_task() before
    polling won't wait anymore since the seek lock is compatible with
    the read lock ;

  - all other threads competing on trying to grab this lock will fail
    on the upgrade attempt from read to seek, and will let the current
    lock owner finish collecting expired entries.

Doing only this has reduced the wake_expired_tasks() CPU usage in a
very large server test from 2.15% to 1.04% as reported by perf top,
and increased the health check rate by 3% (all threads being saturated).

This is expected to help against (and possibly solve) the problem
described in issue #875.
2020-10-16 17:15:54 +02:00
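
A sketch of the scanning pattern, with stub lock macros so it stands alone; the real HA_RWLOCK_* seek operations are listed in the next commit below, and the lock label, lock word, and tree lookup are simplified.

  /* Stub lock ops so this sketch compiles on its own; the real macros
   * are introduced in the commit just below. */
  #define HA_RWLOCK_TRYRDTOSK(lbl, l)  ((void)(l), 0)  /* 0 = upgrade granted */
  #define HA_RWLOCK_SKTOWR(lbl, l)     ((void)(l))
  #define HA_RWLOCK_WRTOSK(lbl, l)     ((void)(l))
  #define HA_RWLOCK_SKTORD(lbl, l)     ((void)(l))
  #define TASK_WQ 0                                    /* stand-in label */

  static unsigned int wq_lock;                         /* stand-in lock word */

  /* Entered under the read lock shared with the pollers. Only one thread
   * gets the seek lock and collects expired entries; deletions take the
   * write lock just long enough to unlink, then downgrade so readers can
   * keep computing their next expiration date in parallel. */
  void wake_expired_sketch(void)
  {
      if (HA_RWLOCK_TRYRDTOSK(TASK_WQ, &wq_lock))
          return;                     /* someone else is already scanning */

      for (;;) {
          int found = 0;              /* placeholder: tree lookup elided */
          if (!found)
              break;
          HA_RWLOCK_SKTOWR(TASK_WQ, &wq_lock);    /* exclusive unlink */
          /* ... delete the expired entry, wake or requeue the task ... */
          HA_RWLOCK_WRTOSK(TASK_WQ, &wq_lock);    /* readers may resume */
      }
      HA_RWLOCK_SKTORD(TASK_WQ, &wq_lock);        /* back to plain read */
  }
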
Willy Tarreau
61f799b8da MINOR: threads: add the transitions to/from the seek state
Since our locks are based on progressive locks, we support the upgradable
seek lock that is compatible with readers and upgradable to a write lock.
The main purpose is to take it while seeking down a tree for modification
while other threads may search the same tree for input (e.g. to compute
the next event date).

The newly supported operations are:

  HA_RWLOCK_SKLOCK(lbl,l)         pl_take_s(l)      /* N --> S */
  HA_RWLOCK_SKTOWR(lbl,l)         pl_stow(l)        /* S --> W */
  HA_RWLOCK_WRTOSK(lbl,l)         pl_wtos(l)        /* W --> S */
  HA_RWLOCK_SKTORD(lbl,l)         pl_stor(l)        /* S --> R */
  HA_RWLOCK_WRTORD(lbl,l)         pl_wtor(l)        /* W --> R */
  HA_RWLOCK_SKUNLOCK(lbl,l)       pl_drop_s(l)      /* S --> N */
  HA_RWLOCK_TRYSKLOCK(lbl,l)      (!pl_try_s(l))    /* N -?> S */
  HA_RWLOCK_TRYRDTOSK(lbl,l)      (!pl_try_rtos(l)) /* R -?> S */

Existing code paths are left unaffected so this patch doesn't affect
any running code.
2020-10-16 16:53:46 +02:00
Willy Tarreau
8d5360ca7f MINOR: threads: augment rwlock debugging stats to report seek lock stats
We currently use only read and write lock operations with rwlocks, but
ours also support upgradable seek locks for which we do not report any
stats. Let's add them now when DEBUG_THREAD is enabled.
2020-10-16 16:51:49 +02:00