haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2025-03-05 19:10:45 +00:00

Author	SHA1	Message	Date
Willy Tarreau	a698eb6739	MINOR: streams: use one list per stream instead of a global one The global streams list is exclusively used for "show sess", to look up a stream to shut down, and for the hard-stop. Having all of them in a single list is extremely expensive in terms of locking when using threads, with performance losses as high as 7% having been observed just due to this. This patch makes the list per-thread, since there's no need to have a global one in this situation. All call places just iterate over all threads. The most "invasive" change was in "show sess" where the end of list needs to go back to the beginning of next thread's list until the last thread is seen. For now the lock was maintained to keep the code auditable but a next commit should get rid of it. The observed performance gain here with only 4 threads is already 7% (350krps -> 374krps).	2021-02-24 13:53:20 +01:00
Willy Tarreau	5d533e2bad	MINOR: cli/streams: make "show sess" dump all streams till the new epoch Instead of placing the current stream at the end of the stream list when issuing a "show sess" on the CLI as was done in 2.2 with commit `c6e7a1b8e` ("MINOR: cli: make "show sess" stop at the last known session"), now we compare the listed stream's epoch with the dumping stream's and stop on more recent ones. This way we're certain to always only dump known streams at the moment we issue the dump command without having to modify the list. In theory we could miss some streams if more than 2^31 "show sess" requests are issued while an old stream remains present, but that's 68 years at 1 "show sess" per second and it's unlikely we'll keep a process, let alone a stream, that long. It could be verified that the count of dumped streams still matches the one before this change.	2021-02-24 12:12:51 +01:00
Willy Tarreau	b981318c11	MINOR: stream: add an "epoch" to figure which streams appeared when The "show sess" CLI command currently lists all streams and needs to stop at a given position to avoid dumping forever. Since 2.2 with commit `c6e7a1b8e` ("MINOR: cli: make "show sess" stop at the last known session"), a hack consists in unlinking the stream running the applet and linking it again at the current end of the list, in order to serve as a delimiter. But this forces the stream list to be global, which affects scalability. This patch introduces an epoch, which is a global 32-bit counter that is incremented by the "show sess" command, and which is copied by newly created streams. This way any stream can know whether any other one is newer or older than itself. For now it's only stored and not exploited.	2021-02-24 12:12:51 +01:00
Willy Tarreau	0d03825b93	BUG/MINOR: proxy: wake up all threads when sending the hard-stop signal The hard-stop event didn't wake threads up. In the past it wasn't an issue as the poll timeout was limited to 1 second, but since commit `4f59d3861` ("MINOR: time: increase the minimum wakeup interval to 60s") it has become a problem because old processes can remain live for up to one minute after the hard-stop-after delay. Let's just wake them up. This may be backported to older releases, though before 2.4 the extra delay was only one second.	2021-02-24 12:12:46 +01:00
Willy Tarreau	3f5dd2945c	BUG/MEDIUM: cli/shutdown sessions: make it thread-safe There's no locking around the lookup of a stream nor its shutdown when issuing "shutdown sessions" over the CLI so the risk of crashing the process is particularly high. Let's use a thread_isolate() there which is suitable for this task, and there are not that many alternatives. This must be backported to 1.8.	2021-02-24 11:11:06 +01:00
Willy Tarreau	92b887e20a	BUG/MEDIUM: proxy: use thread-safe stream killing on hard-stop When setting hard-stop-after, hard_stop() is called at the end to kill last pending streams. Unfortunately there's no locking there while walking over the streams list nor when shutting them down, so it's very likely that some old processes have been crashing or gone wild due to this. Let's use a thread_isolate() call for this as we don't have much other choice (and it happens once in the process' life, that's OK). This must be backported to 1.8.	2021-02-24 11:08:56 +01:00
Willy Tarreau	61d095ed37	DOC: muxes: add a diagram of the exchanges between muxes and outer world Since the muxes API is far from being obvious, let's show a stream being forwarded between two sides through muxes with their buffers and the transport layers. The diagram is provided in .fig, .svg, .png, and .pdf.	2021-02-24 09:13:00 +01:00
Dragan Dosen	ec0a604f27	CLEANUP: vars: make smp_fetch_var() to reuse vars_get_by_desc() They both do the same thing, so let's remove unneeded code duplication.	2021-02-23 17:23:53 +01:00
Dragan Dosen	14518f2305	BUG/MEDIUM: vars: make functions vars_get_by_{name,desc} thread-safe This patch adds a lock to functions vars_get_by_name() and vars_get_by_desc() to protect accesses to the list of variables. After the variable is fetched, a sample data is duplicated by using smp_dup() because the variable may be modified by another thread. This should be backported to all versions supporting vars along with "BUG/MINOR: sample: secure convs that accept base64 string and var name as args" which this patch depends on.	2021-02-23 17:22:46 +01:00
Dragan Dosen	9e8db138c9	BUG/MINOR: sample: secure convs that accept base64 string and var name as args This patch adds a few improvements in order to secure the use of converters that accept base64 string and variable name as arguments. The first change is within related function sample_conv_var2smp_str() which now flags the sample as SMP_F_CONST if the argument is of type ARGT_STR. This makes the sample more safe for later use. A new function sample_check_arg_base64() is added. It checks an argument and fills it with a variable type if the argument string contains a valid variable name. If failed, it tries to perform a base64 decode operation on a non-empty string, and fills the argument with the decoded content which can be used later, without any additional base64dec() function calls during runtime. This means that haproxy configuration check may fail if variable lookup fails and an invalid base64 encoded string is specified as an argument for such converters. Both converters, "aes_gcm_dec" and "hmac", now use alloc_trash_chunk() in order to allocate additional buffers for various conversions, and avoid the use of a pre-allocated trash chunks directly (usually returned by get_trash_chunk()). The function sample_check_arg_base64() is used for both converters in order to check their arguments specified within the haproxy configuration. This patch should be backported as far as 2.0. However, it is important to keep in mind a few things. The "hmac" converter is only available starting with 2.2. In versions prior to 2.2, the "aes_gcm_dec" converter and sample_conv_var2smp_str() are implemented in src/ssl_sock.c. Thus the patch will have to be adapted on these versions. Note that this patch is required for a subsequent, more important fix.	2021-02-23 17:21:46 +01:00
William Lallemand	6c0961442c	BUG/MINOR: ssl/cli: potential null pointer dereference in "set ssl cert" A potential null pointer dereference was reported with an old gcc version (6.5) src/ssl_ckch.c: In function 'cli_parse_set_cert': src/ssl_ckch.c:838:7: error: potential null pointer dereference [-Werror=null-dereference] if (!ssl_sock_copy_cert_key_and_chain(src->ckch, dst->ckch)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ src/ssl_ckch.c:838:7: error: potential null pointer dereference [-Werror=null-dereference] src/ssl_ckch.c: In function 'ckchs_dup': src/ssl_ckch.c:838:7: error: potential null pointer dereference [-Werror=null-dereference] if (!ssl_sock_copy_cert_key_and_chain(src->ckch, dst->ckch)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ src/ssl_ckch.c:838:7: error: potential null pointer dereference [-Werror=null-dereference] cc1: all warnings being treated as errors This case does not actually happen but it's better to fix the ckch API with a NULL check. Could be backported as far as 2.1.	2021-02-23 14:58:21 +01:00
Tim Duesterhus	6bcdc6530a	MINOR: Configure the `cpp` userdiff driver for *.[ch] in .gitattributes This might improve the output of `git diff` in certain cases. Especially `git diff --word-diff` will be much more useful. Does not affect the generated code, may be backported for consistency if desired.	2021-02-22 18:17:57 +01:00
Ilya Shipitsin	98a9e1b873	BUILD: SSL: introduce fine guard for RAND_keep_random_devices_open RAND_keep_random_devices_open is an OpenSSL-specific function, not implemented in LibreSSL and BoringSSL. Let us define the guard HAVE_SSL_RAND_KEEP_RANDOM_DEVICES_OPEN in include/haproxy/openssl-compat.h. That guard no longer depends on HA_OPENSSL_VERSION.	2021-02-22 10:35:23 +01:00
Willy Tarreau	31dd393da0	[RELEASE] Released version 2.4-dev9 Released version 2.4-dev9 with the following main changes : - BUG/MINOR: server: Remove RMAINT from admin state when loading server state - CLEANUP: check: fix get_check_status_info declaration - CLEANUP: contrib/prometheus-exporter: align for with srv status case - MEDIUM: stats: allow to select one field in `stats_fill_li_stats` - MINOR: stats: add helper to get status string - MEDIUM: contrib/prometheus-exporter: add listen stats - BUG/MINOR: dns: add test on result getting value from buffer into ring. - BUG/MINOR: dns: dns_connect_server must return -1 unsupported nameserver's type - BUG/MINOR: dns: missing test writing in output channel in session handler - BUG/MINOR: dns: fix ring attach control on dns_session_new - BUG/MEDIUM: dns: fix multiple double close on fd in dns.c - BUG/MAJOR: connection: prevent double free if conn selected for removal - BUG/MINOR: session: atomically increment the tracked sessions counter - REGTESTS: fix http_reuse_conn_hash proxy test - BUG/MINOR: backend: do not call smp_make_safe for sni conn hash - MINOR: connection: remove pointers for prehash in conn_hash_params - BUG/MINOR: checks: properly handle wrapping time in __health_adjust() - BUG/MEDIUM: checks: don't needlessly take the server lock in health_adjust() - DEBUG: thread: add 5 extra lock labels for statistics and debugging - OPTIM: server: switch the actconn list to an mt-list - Revert "MINOR: threads: change lock_t to an unsigned int" - MINOR: lb/api: let callers of take_conn/drop_conn tell if they have the lock - OPTIM: lb-first: do not take the server lock on take_conn/drop_conn - OPTIM: lb-leastconn: do not take the server lock on take_conn/drop_conn - OPTIM: lb-leastconn: do not unlink the server if it did not change - MINOR: tasks: add DEBUG_TASK to report caller info in a task - MINOR: tasks/debug: add some extra controls of use-after-free in DEBUG_TASK - BUG/MINOR: sample: Always consider zero size string samples as unsafe - MINOR: cli: add missing agent commands for set server - BUILD/MEDIUM: da Adding pcre2 support. - BUILD: ssl: introduce fine guard for OpenSSL specific SCTL functions - REGTESTS: reorder reuse conn proxy protocol test - DOC: explain the relation between pool-low-conn and tune.idle-pool.shared - MINOR: tasks: refine the default run queue depth - MINOR: listener: refine the default MAX_ACCEPT from 64 to 4 - MINOR: mux_h2: do not try to remove front conn from idle trees - REGTESTS: workaround for a crash with recent libressl on http-reuse sni - BUG/MEDIUM: lists: Avoid an infinite loop in MT_LIST_TRY_ADDQ(). - MINOR: connection: allocate dynamically hash node for backend conns - DOC: DeviceAtlas documentation typo fix. - BUG/MEDIUM: spoe: Resolve the sink if a SPOE logs in a ring buffer - BUG/MINOR: http-rules: Always replace the response status on a return action - BUG/MINOR: server: Init params before parsing a new server-state line - BUG/MINOR: server: Be sure to cut the last parsed field of a server-state line - MEDIUM: server: Don't introduce a new server-state file version - DOC: contrib/prometheus-exporter: remove htx reference - REGTESTS: contrib/prometheus-exporter: test NaN values - REGTESTS: contrib/prometheus-exporter: test well known labels - CI: github actions: switch to stable LibreSSL release - BUG/MINOR: server: Fix test on number of fields allowed in a server-state line - MINOR: dynbuf: make the buffer wait queue per thread - MINOR: dynbuf: use regular lists instead of mt_lists for buffer_wait - MINOR: dynbuf: pass offer_buffers() the number of buffers instead of a threshold - MINOR: sched: have one runqueue ticks counter per thread	2021-02-20 13:30:31 +01:00
Willy Tarreau	c6ba9a0b9b	MINOR: sched: have one runqueue ticks counter per thread The runqueue_ticks counts the number of task wakeups and is used to position new tasks in the run queue, but since we've had per-thread run queues, the values there are not very relevant anymore and the nice value doesn't apply well if some threads are more loaded than others. In addition, letting all threads compete over a shared counter is not smart as this may cause some excessive contention. Let's move this index close to the run queues themselves, i.e. one per thread and a global one. In addition to improving fairness, this has increased global performance by 2% on 16 threads thanks to the lower contention on rqueue_ticks. Fairness issues were not observed, but if any were to be, this patch could be backported as far as 2.0 to address them.	2021-02-20 13:03:37 +01:00
Willy Tarreau	4d77bbf856	MINOR: dynbuf: pass offer_buffers() the number of buffers instead of a threshold Historically this function would try to wake the most accurate number of process_stream() waiters. But since the introduction of filters which could also require buffers (e.g. for compression), things started not to be as accurate anymore. Nowadays muxes and transport layers also use buffers, so the runqueue size has nothing to do anymore with the number of supposed users to come. In addition to this, the threshold was compared to the number of free buffer calculated as allocated minus used, but this didn't work anymore with local pools since these counts are not updated upon alloc/free! Let's clean this up and pass the number of released buffers instead, and consider that each waiter successfully called counts as one buffer. This is not rocket science and will not suddenly fix everything, but at least it cannot be as wrong as it is today. This could have been marked as a bug given that the current situation is totally broken regarding this, but this probably doesn't completely fix it, it only goes in a better direction. It is possible however that it makes sense in the future to backport this as part of a larger series if the situation significantly improves.	2021-02-20 12:38:18 +01:00
Willy Tarreau	90f366b595	MINOR: dynbuf: use regular lists instead of mt_lists for buffer_wait There's no point anymore in keeping mt_lists for the buffer_wait and buffer_wq since it's thread-local now.	2021-02-20 12:38:18 +01:00
Willy Tarreau	e8e5091510	MINOR: dynbuf: make the buffer wait queue per thread The buffer wait queue used to be global historically but this does not make any sense anymore given that the most common use case is to have thread-local pools. Thus there's no point waking up waiters of other threads after releasing an entry, as they won't benefit from it. Let's move the queue head to the thread_info structure and use ti->buffer_wq from now on.	2021-02-20 12:38:18 +01:00
Christopher Faulet	28d7876a0c	BUG/MINOR: server: Fix test on number of fields allowed in a server-state line When a server-state line is parsed, a test is performed to be sure there are enough but not too many fields. However the test is buggy. The bug was introduced in the commit `ea2cdf55e` ("MEDIUM: server: Don't introduce a new server-state file version"). No backport needed.	2021-02-20 12:24:12 +01:00
Ilya Shipitsin	938e85b228	CI: github actions: switch to stable LibreSSL release LibreSSL-3.3.0 appeared to have its own bugs and is a development release, so let us switch to the stable LibreSSL-3.2.4 instead.	2021-02-19 18:08:06 +01:00
William Dauchy	98ad35f826	REGTESTS: contrib/prometheus-exporter: test well known labels as we previously briefly broke labels handling, test them to make sure we don't introduce regressions in the future. see also commit `040b1195f7` ("BUG/MINOR: contrib/prometheus-exporter: Restart labels dump at the right pos") for reference Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-19 18:03:59 +01:00
William Dauchy	b45674433b	REGTESTS: contrib/prometheus-exporter: test NaN values In order to make sure we detect when we change default behaviour for some metrics, test the NaN value when it is expected. Those metrics were listed since our last rework as their default value changed, unless the appropriate config is set. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-19 18:03:59 +01:00
William Dauchy	04e90df7cb	DOC: contrib/prometheus-exporter: remove htx reference now that htx is the default everywhere, we can remove the need to put htx as a mandatory option to setup prometheus. Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-19 18:03:59 +01:00
Christopher Faulet	ea2cdf55e3	MEDIUM: server: Don't introduce a new server-state file version This reverts the commit `63e6cba12` ("MEDIUM: server: add server-states version 2"), but keeps all recent features added to the server-state file. Instead of adding a 2nd version for the server-state file format to handle the 5 new fields added during the 2.4 development, these fields are considered as optional during the parsing. So it is possible to load a server-state file from HAProxy 2.3. However, from 2.4, these new fields are always dumped in the server-state file. But it should not be a problem to load it on 2.3. This patch seems a bit huge but the diff ignoring the space is much smaller. The version 2 of the server-state file format is reserved for a real refactoring to address all issues of the current format.	2021-02-19 18:03:59 +01:00
Christopher Faulet	868a5757e5	BUG/MINOR: server: Be sure to cut the last parsed field of a server-state line If a line of a server-state file has too many fields, the last one is not cut on the first following space, as all other fields are. It contains the whole end of the line. It is not the expected behavior. So, now, we cut it on the next following space, if any. The parsing loop was slightly rewritten. Note that for now there is no error reported if the line is too long. This patch may be backported at least as far as 2.1. On 2.0 and prior the code is not the same. The line parsing is inlined in apply_server_state() function.	2021-02-19 18:03:59 +01:00
Christopher Faulet	06cd256978	BUG/MINOR: server: Init params before parsing a new server-state line Same static arrays of parameters are used to parse all server-state lines. Thus it is important to reinit them to be sure to not get params from the previous line, eventually from the previous loaded file. This patch should be backported to all stable branches. However, in 2.0 and prior, the parsing of server-state lines are inlined in apply_server_state() function. Thus the patch will have to be adapted on these versions.	2021-02-19 18:03:59 +01:00
Christopher Faulet	2d36df275b	BUG/MINOR: http-rules: Always replace the response status on a return action When a HTTP return action is triggered, HAProxy is responsible for returning the response, based on the configured status code. On the request side, there is no problem because there is no server response to replace. But on the response side, we must take care to override the server response status code, if any, to be sure to use the right status code to get the http reply message. In short, we must always set the configured status code of the HTTP return action before returning the http reply to be sure to get the right reply, the one based on the http return action status code and not a reply based on the server response status code. This patch should fix the issue #1139. It must be backported as far as 2.2.	2021-02-19 18:03:59 +01:00
Christopher Faulet	1d7d0f86b8	BUG/MEDIUM: spoe: Resolve the sink if a SPOE logs in a ring buffer If a SPOE filter is configured to send its logs to a ring buffer, the corresponding sink must be resolved during the configuration post parsing. Otherwise, the sink is undefined when a log message is emitted, crashing HAProxy. This patch must be backported as far as 2.2.	2021-02-19 18:03:59 +01:00
David Carlier	e0724580d3	DOC: DeviceAtlas documentation typo fix. The USE_PCRE syntax was incorrect. No backport needed.	2021-02-19 18:02:25 +01:00
Amaury Denoyelle	8990b010a0	MINOR: connection: allocate dynamically hash node for backend conns Remove ebmb_node entry from struct connection and create a dedicated struct conn_hash_node. struct connection now contains only a pointer to a conn_hash_node, allocated only for connections where target is of type OBJ_TYPE_SERVER. This will reduce the memory footprint of all connections that do not need http-reuse, such as frontend connections.	2021-02-19 16:59:18 +01:00
Olivier Houchard	5567f41d0a	BUG/MEDIUM: lists: Avoid an infinite loop in MT_LIST_TRY_ADDQ(). In MT_LIST_TRY_ADDQ(), deal with the "prev" field of the element before the "next". If the element is the first in the list, then its next will already have been locked when we locked list->prev->next, so locking it again will fail, and we'll start over and over. This should be backported to 2.3.	2021-02-19 16:47:20 +01:00
Amaury Denoyelle	d8ea188058	REGTESTS: workaround for a crash with recent libressl on http-reuse sni Disable the ssl-reuse for the sni test on http_reuse_conn_hash vtc. This seems to be the origin of a crash with libressl environment from 3.2.2 up to 3.3.1 included. For now, it is not determined if the root cause is in haproxy or libressl. Please look for the github issue #1115 for all the details.	2021-02-19 16:47:20 +01:00
Amaury Denoyelle	3d752a8f97	MINOR: mux_h2: do not try to remove front conn from idle trees In h2_process there were two places where the connection was removed from the idle trees, without first checking if the connection is on the backend side. This should not produce a crash as the node is properly zeroed on conn_init. However, it is better to make the test explicit as it is done in all other places. Besides it will be mandatory if the node part is dynamically allocated only for backend connections.	2021-02-19 16:35:13 +01:00
Willy Tarreau	66161326fd	MINOR: listener: refine the default MAX_ACCEPT from 64 to 4 The maximum number of connections accepted at once by a thread for a single listener used to default to 64 divided by the number of processes but the tasklet-based model is much more scalable and benefits from smaller values. Experimentation has shown that 4 gives the highest accept rate for all thread values, and that 3 and 5 come very close, as shown below (HTTP/1 connections forwarded per second at multi-accept 4 and 64): ac\thr| 1 2 4 8 16 ------+------------------------------ 4| 80k 106k 168k 270k 336k 64| 63k 89k 145k 230k 274k Some tests were also conducted on SSL and absolutely no change was observed. The value was placed into a define because it used to be spread all over the code. It might be useful at some point to backport this to 2.3 and 2.2 to help those who observed some performance regressions from 1.6.	2021-02-19 16:02:04 +01:00
Willy Tarreau	4327d0ac00	MINOR: tasks: refine the default run queue depth Since a lot of internal callbacks were turned to tasklets, the runqueue depth had not been readjusted from the default 200 which was initially used to favor batched processing. But nowadays it appears too large already based on the following tests conducted on a 8c16t machine with a simple config involving "balance leastconn" and one server. The setup always involved the two threads of a same CPU core except for 1 thread, and the client was running over 1000 concurrent H1 connections. The number of requests per second is reported for each (runqueue-depth, nbthread) couple: rq\thr| 1 2 4 8 16 ------+------------------------------ 32| 120k 159k 276k 477k 698k 40| 122k 160k 276k 478k 722k 48| 121k 159k 274k 482k 720k 64| 121k 160k 274k 469k 710k 200| 114k 150k 247k 415k 613k <-- default It's possible to save up to about 18% performance by lowering the default value to 40. One possible explanation to this is that checking I/Os more frequently allows to flush buffers faster and to smooth the I/O wait time over multiple operations instead of alternating phases of processing, waiting for locks and waiting for new I/Os. The total round trip time also fell from 1.62ms to 1.40ms on average, among which at least 0.5ms is attributed to the testing tools since this is the minimum attainable on the loopback. After some observation it would be nice to backport this to 2.3 and 2.2 which observe similar improvements, since some users have already observed some perf regressions between 1.6 and 2.2.	2021-02-19 16:01:55 +01:00
Willy Tarreau	0784db8566	DOC: explain the relation between pool-low-conn and tune.idle-pool.shared Disabling idle-pool sharing can result in awful performance in presence of a not so high number of threads, because the number of available idle connections will be shared among threads, resulting in most of them abandoning their connections after a request is done if there are already enough total available. This is a case where pool-low-conn ought to be used to preserve a number of connections for each thread, but this relation isn't obvious as is. Let's add mentions about this with both keywords.	2021-02-19 11:49:04 +01:00
Amaury Denoyelle	4cce7088d1	REGTESTS: reorder reuse conn proxy protocol test Try to fix http_reuse_conn_hash proxy protocol for both single and multi-thread environment. Schedule a new set of requests to be sure that takeover will be functional even with pool-low-count set to 2.	2021-02-18 16:07:17 +01:00
Ilya Shipitsin	c47d676bd7	BUILD: ssl: introduce fine guard for OpenSSL specific SCTL functions SCTL (signed certificate timestamp list) specified in RFC6962 was implemented in c74ce24cd22e8c683ba0e5353c0762f8616e597d, let us introduce macro HAVE_SSL_SCTL for the HAVE_SSL_SCTL sake, which in turn is based on SN_ct_cert_scts, which comes in the same commit	2021-02-18 15:55:50 +01:00
David Carlier	019dbd7884	BUILD/MEDIUM: da Adding pcre2 support. The DeviceAtlas Detection API now also supports the pcre2 library, and some users wish to have exclusively this version in their environment. Also, there is no longer new development happening in the legacy pcre(1) counterpart. Simple check in the build process as the mutual exclusivity check between the two is already taken care of early on. Moving the check to the part only when we build haproxy + the API from source, as in the other case the API is already built with the chosen regex library separately.	2021-02-18 14:58:43 +01:00
William Dauchy	3f4ec7d9fb	MINOR: cli: add missing agent commands for set server we previously forgot to add `agent-*` commands. Take this opportunity to rewrite the help string in a simpler way for readability (mainly removing simple quotes) Signed-off-by: William Dauchy <wdauchy@gmail.com>	2021-02-18 14:58:43 +01:00
Christopher Faulet	8dd40fbde9	BUG/MINOR: sample: Always consider zero size string samples as unsafe smp_is_safe() function is used to be sure a sample may be safely modified. For string samples, a test is performed to verify if there is a null-terminated byte. If not, one is added, if possible. It means if the sample is not const and if there is some free space in the buffer, after data. However, we must not try to read the null-terminated byte if the string sample is too long (data >= size) or if the size is equal to zero. This last test was not performed. Thus it was possible to consider a string sample as safe by testing a byte outside the buffer. Now, a zero size string sample is always considered as unsafe and is duplicated when smp_make_safe() is called. This patch must be backported in all stable versions.	2021-02-18 14:58:43 +01:00
Willy Tarreau	ca9f60c1ac	MINOR: tasks/debug: add some extra controls of use-after-free in DEBUG_TASK It's pretty easy to pre-initialize the index, change it on free() and check it during the wakeup, so let's do this to ease detection of any accidental task_wakeup() after a task_free() or tasklet_wakeup() after a tasklet_free(). If this would ever happen we'd then get a backtrace and a core now. The index's parity is respected so that the call history remains exploitable.	2021-02-18 14:38:49 +01:00
Willy Tarreau	b23f04260b	MINOR: tasks: add DEBUG_TASK to report caller info in a task The idea is to know who woke a task up, by recording the last two callers in a rotating mode. For now it's trivial with task_wakeup() but tasklet_wakeup_on() will require quite some more changes. This typically gives this from the debugger: (gdb) p t->debug $2 = { caller_file = {0x0, 0x8c0d80 "src/task.c"}, caller_line = {0, 260}, caller_idx = 1 } or this: (gdb) p t->debug $6 = { caller_file = {0x7fffe40329e0 "", 0x885feb "src/stream.c"}, caller_line = {284, 284}, caller_idx = 1 } But it also provides a trivial macro allowing to simply place a call in a task/tasklet handler that needs to be observed: DEBUG_TASK_PRINT_CALLER(t); Then starting haproxy this way would trivially yield such info: $ ./haproxy -db -f test.cfg | sort | uniq -c | sort -nr 199992 h1_io_cb woken up from src/sock.c:797 51764 h1_io_cb woken up from src/mux_h1.c:3634 65 h1_io_cb woken up from src/connection.c:169 45 h1_io_cb woken up from src/sock.c:777	2021-02-18 10:42:07 +01:00
Willy Tarreau	5064ab6a98	OPTIM: lb-leastconn: do not unlink the server if it did not change Due to the two-phase server reservation, there are 3 calls to fwlc_srv_reposition() per request, one during assign_server() to reserve the slot, one in connect_server() to commit it, and one in process_stream() to release it. However only one of the first two will change the key, so it's needlessly costly to take the lock, remove a server and insert it again at the same place when we can already figure we ought not to even have taken the lock. Furthermore, even when the server needs to move, there can be quite some contention on the lbprm lock forcing the thread to wait. During this time the served and nbpend server values might have changed, just like the lb_node.key itself. Thus we measure the values again under the lock before updating the tree. Measurements have shown that under contention with 16 servers and 16 threads, 50% of the updates can be avoided there. This patch makes the function compute the new key and compare it to the current one before deciding to move the entry (and does it again under the lock for the second test). This removes between 40 and 50% of the tree updates depending on the thread contention and the number of servers. The performance gain due to this (on 16 threads) was: 16 servers: 415 krps -> 440 krps (+6%, contention on lbprm) 4 servers: 554 krps -> 714 krps (+29%, little contention) One point worth thinking about is that it's not logical to update the tree 2-3 times per request while it's only read once. Half to 2/3 of these updates are not needed. An experiment consisting in subscribing the server to a list and letting the readers reinsert them on the fly showed further degradation instead of an improvement. A better approach would probably consist in avoiding writes to shared cache lines by having the leastconn nodes distinct from the servers, with one node per value, and having them hold an mt-list with all the servers having that number of connections. The connection count tree would then be read-mostly instead of facing heavy writes, and most write operations would be performed on 1-3 list heads which are way cheaper to migrate than a tree node, and do not require updating the last two updated neighbors' cache lines.	2021-02-18 10:06:45 +01:00
Willy Tarreau	85b2fb0358	OPTIM: lb-leastconn: do not take the server lock on take_conn/drop_conn The operations are only an insert and a delete into the LB tree, which doesn't require the server's lock at all as the lbprm lock is already held. Let's drop it. Just for the sake of cleanness, given that the served and nbpend values used to be atomically updated, we'll use an atomic load to read them.	2021-02-18 10:06:45 +01:00
Willy Tarreau	6b96e0e9d2	OPTIM: lb-first: do not take the server lock on take_conn/drop_conn The operations are only an insert and a delete into the LB tree, which doesn't require the server's lock at all as the lbprm lock is already held. Let's drop it.	2021-02-18 10:06:45 +01:00
Willy Tarreau	59b0fecfd9	MINOR: lb/api: let callers of take_conn/drop_conn tell if they have the lock The two algos defining these functions (first and leastconn) do not need the server's lock. However it's already present in pendconn_process_next_strm() so the API must be updated so that the functions may take it if needed and that the callers indicate whether they already own it. As such, the call places (backend.c and stream.c) now do not take it anymore, queue.c was unchanged since it's already held, and both "first" and "leastconn" were updated to take it if not already held. A quick test on the "first" algo showed a jump from 432 to 565k rps by just dropping the lock in stream.c!	2021-02-18 10:06:45 +01:00
Willy Tarreau	b9ad30a8ad	Revert "MINOR: threads: change lock_t to an unsigned int" This reverts commit `8f1f177ed0`. Repeated tests have shown a small performance degradation of ~1.8% caused by this patch at high request rates on 16 threads. The exact cause is not yet perfectly known but it probably stems from slower accesses for non-64-bit aligned atomic accesses.	2021-02-18 10:06:45 +01:00
Willy Tarreau	751153e0f1	OPTIM: server: switch the actconn list to an mt-list The remaining contention on the server lock solely comes from sess_change_server() which takes the lock to add and remove a stream from the server's actconn list. This is both expensive and pointless since we have mt-lists, and this list is only used by the CLI's "shutdown server sessions" command! Let's migrate to an mt-list and remove the need for this costly lock. By doing so, the request rate increased by ~1.8%.	2021-02-18 10:06:45 +01:00
Willy Tarreau	ccea3c54f4	DEBUG: thread: add 5 extra lock labels for statistics and debugging Since OTHER_LOCK is commonly used it's become much more difficult to profile lock contention by temporarily changing a lock label. Let's add DEBUG1..5 to serve only for debugging. These ones must not be used in committed code. We could decide to only define them when DEBUG_THREAD is set but that would complicate attempts at measuring performance with debugging turned off.	2021-02-18 10:06:45 +01:00
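
Commits `b981318c11` and `5d533e2bad` above describe a global 32-bit epoch that "show sess" increments and that each new stream copies, so that the dump can stop on streams more recent than the dumping one. The commit messages do not show the comparison itself. The following is only a minimal sketch of how a wrap-tolerant "is newer" test on such a counter could look; the name `epoch_is_newer` is hypothetical and is not taken from the HAProxy sources. It treats two epochs as ordered as long as they are fewer than 2^31 increments apart, which matches the limit mentioned in `5d533e2bad` (2^31 seconds is roughly 68 years at one "show sess" per second).

```c
#include <stdint.h>

/* Hypothetical helper (not HAProxy's actual code): returns non-zero when
 * epoch <a> is strictly more recent than epoch <b>. The distance is computed
 * modulo 2^32, so the result stays correct across a wrap of the counter as
 * long as both epochs are fewer than 2^31 increments apart.
 */
static inline int epoch_is_newer(uint32_t a, uint32_t b)
{
	/* (a - b) in [1, 2^31 - 1] means <a> is ahead of <b> */
	return (uint32_t)(a - b - 1u) < 0x7FFFFFFFu;
}
```

Under these assumptions, a dump loop would skip or stop on any stream for which `epoch_is_newer(stream_epoch, dumper_epoch)` is true, without having to move the dumping stream inside a list.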

13933 Commits