haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2025-01-20 20:50:44 +00:00

Author	SHA1	Message	Date
Amaury Denoyelle	7afa5c1843	MINOR: global: define MODE_STOPPING Define a new mode MODE_STOPPING. It is used to indicate that the process is in the stopping stage and no event loop runs anymore.	2021-08-09 17:51:55 +02:00
Amaury Denoyelle	9ba34ae710	BUG/MINOR: check: test if server is not null in purge Test if server is not null before using free_server in the check purge operation. Currently, the null server scenario should not occur as purge is used with refcounted dynamic servers. However, this might not always be the case if purge is used in the future in other cases; thus the test is useful for extensibility. No need to backport, unless dynamic server checks are backported. This has been reported through a coverity report in github issue #1343.	2021-08-09 17:48:34 +02:00
Ilya Shipitsin	811ce5598e	CI: travis-ci: temporarily disable arm64 builds A few recent builds failed with "gcc: fatal error: Killed signal terminated program cc1"; let us disable arm64 builds until this is investigated.	2021-08-07 07:28:15 +02:00
Amaury Denoyelle	a026783bd7	REGTESTS: server: fix dynamic server with checks test Add a missing 'rxreq' statement in first server. Without it the test is unstable. The issue is frequent when running with one thread only. This should fix github issue #1342.	2021-08-06 15:34:04 +02:00
Amaury Denoyelle	414a612bb3	MINOR: doc: specify ulimit-n usage for dynamic servers Complete the documentation of the dynamic servers to warn about a possible fd resource exhaustion when using a large number of them.	2021-08-06 11:22:01 +02:00
Amaury Denoyelle	3e7d468e80	REGTESTS: server: add dynamic check server test Write a regtest to validate check support by dynamic servers. Three different servers are added with various configurations: - server OK - server DOWN - agent-check	2021-08-06 11:22:01 +02:00
Amaury Denoyelle	b65f4cab6a	MEDIUM: server: implement agent check for dynamic servers This commit is the counterpart for agent check of "MEDIUM: server: implement check for dynamic servers". The "agent-check" keyword is enabled for dynamic servers. The agent check must manually be activated via "enable agent" CLI. This can enable the dynamic server if the agent response is "ready" without an explicit "enable server" CLI.	2021-08-06 11:09:48 +02:00
Amaury Denoyelle	2fc4d39577	MEDIUM: server: implement check for dynamic servers Implement check support for dynamic servers. The "check" keyword is now enabled for dynamic servers. If used, the server check is initialized and the check task started in the "add server" CLI handler. The check is explicitly disabled and must be manually activated via "enable health" CLI handler. The dynamic server refcount is incremented if a check is configured. On "delete server" handler, the check is purged, which decrements the refcount.	2021-08-06 11:09:48 +02:00
Amaury Denoyelle	9ecee0fa36	MINOR: check: enable safe keywords for dynamic servers Implement a collection of keywords deemed safe and useful to dynamic servers. The list of the supported keywords is : - addr - check-proto - check-send-proxy - check-via-socks4 - rise - fall - fastinter - downinter - port - agent-addr - agent-inter - agent-port - agent-send	2021-08-06 11:09:48 +02:00
Amaury Denoyelle	b33a0abc0b	MEDIUM: check: implement check deletion for dynamic servers Implement a mechanism to free a started check at runtime for dynamic servers. A new function check_purge is created for this. The check task will be marked for deletion and scheduled to properly close connection elements and free the task/tasklet/buf_wait elements. This function will be useful to delete a dynamic server with checks.	2021-08-06 11:09:48 +02:00
Amaury Denoyelle	d6b7080cec	MINOR: server: implement a refcount for dynamic servers It is necessary to have a refcount mechanism on dynamic servers to be able to enable check support. Indeed, when deleting a dynamic server with check activated, the check will be asynchronously removed. This is mandatory to properly free the check resources in a thread-safe manner. The server instance must be kept alive for this.	2021-08-06 11:09:48 +02:00
Amaury Denoyelle	403dce8e5a	MINOR: check: do not increment global maxsock at runtime global maxsock is used to estimate a number of fd to reserve for internal use, such as checks. It is incremented at startup with the info from the config file. Disable this incrementation in checks functions at runtime. First, it currently serves no purpose to increment it after startup. Worse, it may lead to out-of-bound accesses on the fdtab. This will be useful to initiate checks for dynamic servers.	2021-08-06 11:08:24 +02:00
Amaury Denoyelle	3c2ab1a0d4	MINOR: check: export check init functions Remove static qualifier on init_srv_check, init_srv_agent_check and start_check_task. These functions will be called in server.c for dynamic servers with checks.	2021-08-06 11:08:04 +02:00
Amaury Denoyelle	f2c27a5c67	MINOR: check: allocate default check ruleset for every backends Allocate default tcp ruleset for every backend without explicit rules defined, even if no server in the backend use check. This change is required to implement checks for dynamic servers. This allocation is done on check_config_validity. It must absolutely be called before check_proxy_tcpcheck (called via post proxy check) which allocate the implicit tcp connect rule.	2021-08-06 11:08:04 +02:00
Amaury Denoyelle	fca18172d9	MINOR: server: initialize fields for dynamic server check Set default inter/rise/fall values for dynamic servers check/agent. This is required because dynamic servers do not inherit from a default-server.	2021-08-06 11:08:04 +02:00
Amaury Denoyelle	7b368339af	MEDIUM: task: implement tasklet kill Implement an equivalent of task_kill for tasklets. This function can be used to request a tasklet deletion in a thread-safe way. Currently this function is unused.	2021-08-06 11:07:48 +02:00
Amaury Denoyelle	c755efd5c6	MINOR: server: unmark deprecated on enable health/agent cli Remove the "DEPRECATED" marker on "enable/disable health/agent" commands. Their purpose is to toggle the check/agent on a server. These commands are still useful because their purpose is not covered by the "set server" command. Mostly, there was confusion with the commands 'set server health/agent', which in fact serve another goal. Note that the indication "use 'set server' instead" has been present since 2016, added by the commits `2c04eda8b5` REORG: cli: move "{enable\|disable} health" to server.c and `58d9cb7d22` REORG: cli: move "{enable\|disable} agent" to server.c. Besides, these commands will become required to enable check/agent on dynamic servers which will be created with check disabled. This should be backported up to 2.4.	2021-08-06 10:09:50 +02:00
Christopher Faulet	d7da3dd928	BUG/MEDIUM: spoe: Fix policy to close applets when SPOE connections are queued It is the second part of the fix that should solve fairness issues with the connections management inside the SPOE filter. Indeed, in multithreaded mode, when the SPOE detects there are some connections in queue on a server, it closes existing connections by releasing SPOE applets. It is mandatory when a maxconn is set because few connections on a thread may prevent new connections establishment. The first attempt to fix this bug (`9e647e5af` "BUG/MEDIUM: spoe: Kill applets if there are pending connections and nbthread > 1") introduced a bug. In pipelining mode, SPOE applets might be closed while some frames are pending for the ACK reply. To fix the bug, in the processing stage, if there are some connections in queue, only truly idle applets may process pending requests. In this case, only one request at a time is processed. And at the end of the processing stage, only truly idle applets may be released. It is an empirical workaround, but it should be good enough to solve contention issues when a low maxconn is set. This patch should partly fix the issue #1340. It must be backported as far as 2.0.	2021-08-05 10:07:43 +02:00
Christopher Faulet	6f1296b5c7	BUG/MEDIUM: spoe: Create a SPOE applet if necessary when the last one is released On a thread, when the last SPOE applet is released, if there are still pending streams, a new one is created. Of course, HAProxy must not be stopping. It is important to start a new applet in this case to not abort in-progress jobs, especially when a maxconn is set, because applets may be closed to be fair with connections waiting for a free slot. This patch should partly fix the issue #1340. It depends on the commit "MINOR: spoe: Create a SPOE applet if necessary when the last one on a thread is closed". Both must be backported as far as 2.0.	2021-08-05 10:07:43 +02:00
Christopher Faulet	434b8525ee	MINOR: spoe: Add a pointer on the filter config in the spoe_agent structure There was no way to access the SPOE filter configuration from the agent object. However it could be handy to have it. And in fact, this will be required to fix a bug.	2021-08-05 10:07:43 +02:00
Willy Tarreau	d332f1396b	BUG/MINOR: server: update last_change on maint->ready transitions too Nenad noticed that when leaving maintenance, the servers' last_change field was not updated. This is visible in the Status column of the stats page in front of the state, as the cumulative time spent in the current state is wrong, it starts from the last transition (typically ready->maint). In addition, the backend's state was not updated either, because the down transition is performed by set_backend_down() which also emits a log, and it is this function which was extended to update the backend's last_change, but it's not called for down->up transitions so that was not done. The most visible (and unpleasant) effect of this bug is that it affects slowstart so such a server could immediately restart with a significant load ratio. This should likely be backported to all stable releases.	2021-08-04 19:41:01 +02:00
Willy Tarreau	7b2ac29a92	CLEANUP: fd: remove the now unneeded fd_mig_lock This is not needed anymore since we don't use it when setting the running mask anymore.	2021-08-04 16:03:36 +02:00
Willy Tarreau	b201b1dab1	CLEANUP: fd: remove the now unused fd_set_running() It was inlined inside fd_update_events() since it relies on a loop that may return immediate failure codes.	2021-08-04 16:03:36 +02:00
Willy Tarreau	f69fea64e0	MAJOR: fd: get rid of the DWCAS when setting the running_mask Right now we're using a DWCAS to atomically set the running_mask while being constrained by the thread_mask. This DWCAS is annoying because we may seriously need it later when adding support for thread groups, for checking that the running_mask applies to the correct group. It turns out that the DWCAS is not strictly necessary because we never need it to set the thread_mask based on the running_mask, only the other way around. And in fact, the running_mask is always cleared alone, and the thread_mask is changed alone as well. The running_mask is only relevant to indicate a takeover when the thread_mask matches it. Any bit set in running and not present in thread_mask indicates a transition in progress. As such, it is possible to re-arrange this by using a regular CAS around a consistency check between running_mask and thread_mask in fd_update_events and by making a CAS on running_mask then an atomic store on the thread_mask in fd_takeover(). The only other case is fd_delete() but that one already sets the running_mask before clearing the thread_mask, which is compatible with the consistency check above. This change has happily survived 10 billion takeovers on a 16-thread machine at 800k requests/s. The fd-migration doc was updated to reflect this change.	2021-08-04 16:03:36 +02:00
Willy Tarreau	b1f29bc625	MINOR: activity/fd: remove the dead_fd counter This one is set whenever an FD is reported by a poller with a null owner, regardless of the thread_mask. It has become totally meaningless because it only indicates a migrated FD that was not yet reassigned to a thread, but as soon as a thread uses it, the status will change to skip_fd. Thus there is no reason to distinguish between the two, it adds more confusion than it helps. Let's simply drop it.	2021-08-04 16:03:36 +02:00
Amaury Denoyelle	bd8dd841e5	BUG/MINOR: server: remove srv from px list on CLI 'add server' error If an error occurred during the CLI 'add server' handler, the newly created server must be removed from the proxy list if already inserted. Currently, this can happen on the extremely rare error during server id generation if there is no id left. The removal operation is not thread-safe, it must be conducted before releasing the thread isolation. This can be backported up to 2.4. Please note that dynamic server track is not implemented in 2.4, so the release_server_track invocation must be removed for the backport to prevent a compilation error.	2021-08-04 14:57:06 +02:00
Willy Tarreau	ba3ab7907a	MEDIUM: servers: make the server deletion code run under full thread isolation In 2.4, runtime server deletion was brought by commit `e558043e1` ("MINOR: server: implement delete server cli command"). A comment remained in the code about a theoretical race between the thread_isolate() call and another thread being in the process of allocating memory before accessing the server via a reference that was grabbed before the memory allocation, since the thread_harmless_now()/thread_harmless_end() pair around mmap() may have the effect of allowing cli_parse_delete_server() to proceed. Now that the full thread isolation is available, let's update the code to rely on this. Now it is guaranteed that competing threads will either be in the poller or queued in front of thread_isolate_full(). This may be backported to 2.4 if any report of breakage suggests the bug really exists, in which case the two following patches will also be needed: MINOR: threads: make thread_release() not wait for other ones to complete MEDIUM: threads: add a stronger thread_isolate_full() call	2021-08-04 14:49:36 +02:00
Willy Tarreau	88d1c5d3fb	MEDIUM: threads: add a stronger thread_isolate_full() call The current principle of running under isolation was made to access sensitive data while being certain that no other thread was using them in parallel, without necessarily having to place locks everywhere. The main use case are "show sess" and "show fd" which run over long chains of pointers. The thread_isolate() call relies on the "harmless" bit that indicates for a given thread that it's not currently doing such sensitive things, which is advertised using thread_harmless_now() and which ends using thread_harmless_end(), which also waits for possibly concurrent threads to complete their work if they took this opportunity for starting something tricky. As some system calls were notoriously slow (e.g. mmap()), a bunch of thread_harmless_now() / thread_harmless_end() were placed around them to let waiting threads do their work while such other threads were not able to modify memory contents. But this is not sufficient for performing memory modifications. One such example is the server deletion code. By modifying memory, it not only requires that other threads are not playing with it, but are not either in the process of touching it. The fact that a pool_alloc() or pool_free() on some structure may call thread_harmless_now() and let another thread start to release the same object's memory is not acceptable. This patch introduces the concept of "idle threads". Threads entering the polling loop are idle, as well as those that are waiting for all others to become idle via the new function thread_isolate_full(). Once thread_isolate_full() is granted, the thread is not idle anymore, and it is released using thread_release() just like regular isolation. Its users have to keep in mind that across this call nothing is granted as another thread might have performed shared memory modifications. But such users are extremely rare and are actually expecting this from their peers as well. Note that in case of backport, this patch depends on previous patch: MINOR: threads: make thread_release() not wait for other ones to complete	2021-08-04 14:49:36 +02:00
Willy Tarreau	f519cfaa63	MINOR: threads: make thread_release() not wait for other ones to complete The original intent of making thread_release() wait for other requesters to proceed was more of a fairness trade, guaranteeing that a thread that was granted an access to the CPU would be in turn giving back once its job is done. But this is counter-productive as it forces such threads to spin instead of going back to the poller, and it prevents us from implementing multiple levels of guarantees, as a thread_release() call could spin waiting for another requester to pass while that requester expects stronger guarantees than the current thread may be able to offer. Let's just remove that wait period and let the thread go back to the poller, a-la "race to idle". While in theory it could possibly slightly increase the perceived latency of concurrent slow operations like "show fd" or "show sess", it is not the case at all in tests, probably because the time needed to reach the poller remains extremely low anyway.	2021-08-04 14:49:36 +02:00
Willy Tarreau	286363be08	CLEANUP: thread: fix fanciful indentation of thread_harmless_till_end() Probably due to a copy-paste, there were two indent levels in this function since its introduction in 1.9 by commit `60b639ccb` ("MEDIUM: hathreads: implement a more flexible rendez-vous point"). Let's fix this.	2021-08-04 14:49:36 +02:00
Amaury Denoyelle	08be72b827	BUG/MINOR: server: fix race on error path of 'add server' CLI if track If an error occurs during a dynamic server creation with tracking, it must be removed from the tracked list. This operation is not thread-safe and thus must be conducted under the thread isolation. Track support for dynamic servers has been introduced in this release. This does not need to be backported.	2021-08-04 09:18:12 +02:00
William Lallemand	85a16b2ba2	MINOR: stats: shows proxy in a stopped state Previous patch b5c0d65 ("MINOR: proxy: disabled takes a stopping and a disabled state") allows us to set 2 states for a stopped or a disabled proxy. With this patch we are now able to show the stats of all proxies when the process is in a stopping state, not only when there is some activity on a proxy. This patch should fix issue #1307.	2021-08-03 14:17:45 +02:00
William Lallemand	8e765b86fd	MINOR: proxy: disabled takes a stopping and a disabled state This patch splits the disabled state of a proxy into a PR_DISABLED and a PR_STOPPED state. The first one is set when the proxy is disabled in the configuration file, and the second one is set upon a stop_proxy().	2021-08-03 14:17:45 +02:00
William Lallemand	fdc3faf654	MINOR: doc: rename conn_status in `option httpslog` Rename the conn_status field by the real name of the sample fetch in the `option httpslog` documentation.	2021-08-02 10:57:49 +02:00
William Lallemand	56f1f75715	MINOR: log: rename 'dontloglegacyconnerr' to 'log-error-via-logformat' Rename the 'dontloglegacyconnerr' option to 'log-error-via-logformat' which is much more self-explanatory and readable. Note: only legacy keywords don't use hyphens, it is recommended to separate words with them in new keywords.	2021-08-02 10:42:42 +02:00
Willy Tarreau	8441deb1e2	[RELEASE] Released version 2.5-dev3 Released version 2.5-dev3 with the following main changes : - BUG/MINOR: arg: free all args on make_arg_list()'s error path - BUG/MINOR: cfgcond: revisit the condition freeing mechanism to avoid a leak - MEDIUM: proxy: remove long-broken 'option http_proxy' - CLEANUP: http_ana: Remove now unused label from http_process_request() - MINOR: deinit: always deinit the init_mutex on failed initialization - BUG/MEDIUM: cfgcond: limit recursion level in the condition expression parser - BUG/MEDIUM: mworker: do not register an exit handler if exit is expected - BUG/MINOR: mworker: do not export HAPROXY_MWORKER_REEXEC across programs - BUILD/MINOR: memprof fix macOs build. - BUG/MEDIUM: ssl_sample: fix segfault for srv samples on invalid request - BUG/MINOR: stats: Add missing agent stats on servers - BUG/MINOR: check: fix the condition to validate a port-less server - BUILD: threads: fix pthread_mutex_unlock when !USE_THREAD - BUG/MINOR: resolvers: Use a null-terminated string to lookup in servers tree - MINOR: ssl: use __objt_* variant when retrieving counters - BUG/MINOR: systemd: must check the configuration using -Ws - BUG/MINOR: mux-h1: Obey dontlognull option for empty requests - BUG/MINOR: mux-h2: Obey dontlognull option during the preface - BUG/MINOR: mux-h1: Be sure to swap H1C to splice mode when rcv_pipe() is called - BUG/MEDIUM: mux-h2: Handle remaining read0 cases on partial frames - MINOR: proxy: rename PR_CAP_LUA to PR_CAP_INT - MINOR: mworker: the mworker CLI proxy is internal - MINOR: stats: don't output internal proxies (PR_CAP_INT) - CLEANUP: mworker: use the proxy helper functions in mworker_cli_proxy_create() - CLEANUP: mworker: PR_CAP already initialized with alloc_new_proxy() - BUG/MINOR: connection: Add missing error labels to conn_err_code_str - MINOR: connection: Add a connection error code sample fetch - MINOR: ssl: Enable error fetches in case of handshake error - MINOR: ssl: Add new ssl_fc_hsk_err sample fetch - MINOR: ssl: Define a default https log format - MEDIUM: connection: Add option to disable legacy error log - REGTESTS: ssl: Add tests for the connection and SSL error fetches - REGTESTS: ssl: ssl_errors.vtc does not work with old openssl version - BUG/MEDIUM: connection: close a rare race between idle conn close and takeover - BUG/MEDIUM: pollers: clear the sleeping bit after waking up, not before - BUG/MINOR: select: fix excess number of dead/skip reported - BUG/MINOR: poll: fix abnormally high skip_fd counter - BUG/MINOR: pollers: always program an update for migrated FDs - BUG/MINOR: fd: protect fd state harder against a concurrent takeover - DOC: internals: document the FD takeover process - MINOR: fd: update flags only once in fd_update_events() - MINOR: poll/epoll: move detection of RDHUP support earlier - REORG: fd: uninline fd_update_events() - MEDIUM: fd: rely more on fd_update_events() to detect changes - BUG/MINOR: freq_ctr: use stricter barriers between updates and readings - MEDIUM: atomic: simplify the atomic load/store/exchange operations - MEDIUM: atomic: relax the load/store barriers on x86_64 - BUILD: opentracing: fixed build when using pkg-config utility	2021-08-01 18:19:51 +02:00
Miroslav Zagorac	3571111de3	BUILD: opentracing: fixed build when using pkg-config utility In case the OpenTracing C Wrapper library was installed as part of the system (ie in a directory that pkg-config considers part of the system), HAProxy building was not possible because in that case pkg-config did not set the value of the OT_CFLAGS variable in the addon Makefile. This resolves GitHub issue #1323.	2021-08-01 18:18:29 +02:00
Willy Tarreau	99198546f6	MEDIUM: atomic: relax the load/store barriers on x86_64 The x86-tso model makes the load and store barriers unneeded for our usage as long as they perform at least a compiler barrier: the CPU will respect store ordering and store vs load ordering. It's thus safe to remove the lfence and sfence which are normally needed only to communicate with external devices. Let's keep the mfence though, to make sure that reads of same memory location after writes report the value from memory and not the one snooped from the write buffer for too long. An in-depth review of all use cases tends to indicate that this is okay in the rest of the code. Some parts could be cleaned up to use atomic stores and atomic loads instead of explicit barriers though. Doing this reliably increases the overall performance by about 2-2.5% on a 8c-16t Xeon thanks to less frequent flushes (it's likely that the biggest gain is in the MT lists which use them a lot, and that this results in less cache line flushes).	2021-08-01 17:34:06 +02:00
Willy Tarreau	cb0451146f	MEDIUM: atomic: simplify the atomic load/store/exchange operations The atomic_load/atomic_store/atomic_xchg operations were all forced to __ATOMIC_SEQ_CST, which results in explicit store or even full barriers even on x86-tso while we do not need them: we're not communicating with external devices for example and are only interested in respecting the proper ordering of loads and stores between each other. These ones being rarely used, the emitted code on x86 remains almost the same (barring a handful of locations). However they will allow to place correct barriers at other places where atomics are accessed a bit lightly. The patch is marked medium because we can never rule out the risk of some bugs on more relaxed platforms due to the rest of the code.	2021-08-01 17:34:06 +02:00
Willy Tarreau	55a0975b1e	BUG/MINOR: freq_ctr: use stricter barriers between updates and readings update_freq_ctr_period() was using relaxed atomics without using barriers, which usually works fine on x86 but not everywhere else. In addition, some values were read without being enclosed by barriers, allowing the compiler to possibly prefetch them a bit earlier. Finally, freq_ctr_total() was also reading these without enough barriers. Let's make explicit use of atomic loads and atomic stores to get rid of this situation. This required to slightly rearrange the freq_ctr_total() loop, which could possibly slightly improve performance under extreme contention by avoiding to reread all fields. A backport may be done to 2.4 if a problem is encountered, but last tests on arm64 with LSE didn't show any issue so this can possibly stay as-is.	2021-08-01 17:34:06 +02:00
Willy Tarreau	200bd50b73	MEDIUM: fd: rely more on fd_update_events() to detect changes This function already performs a number of checks prior to calling the IOCB, and detects the change of thread (FD migration). Half of the controls are still in each poller, and these pollers also maintain activity counters for various cases. Note that the unreliable test on thread_mask was removed so that only the one performed by fd_set_running() is now used, since this one is reliable. Let's centralize all that fd-specific logic into the function and make it return a status among: FD_UPDT_DONE, // update done, nothing else to be done FD_UPDT_DEAD, // FD was already dead, ignore it FD_UPDT_CLOSED, // FD was closed FD_UPDT_MIGRATED, // FD was migrated, ignore it now Some pollers already used to call it last and have nothing to do after it, regardless of the result. epoll has to delete the FD in case a migration is detected. Overall this removes more code than it adds.	2021-07-30 17:45:18 +02:00
Willy Tarreau	84c7922c52	REORG: fd: uninline fd_update_events() This function has become a monster (80 lines and 2/3 of a kB), it doesn't benefit from being static nor inline anymore, let's move it to fd.c.	2021-07-30 17:41:55 +02:00
Willy Tarreau	53a16187fd	MINOR: poll/epoll: move detection of RDHUP support earlier Let's move the detection of support for RDHUP earlier and out of the FD update chain, as it complicates its simplification.	2021-07-30 17:41:55 +02:00
Willy Tarreau	a199a17d72	MINOR: fd: update flags only once in fd_update_events() Since 2.4 with commit `f50906519` ("MEDIUM: fd: merge fdtab[].ev and state for FD_EV_* and FD_POLL_* into state") we can merge all flag updates at once in fd_update_events(). Previously this was performed in 1 to 3 steps, setting the polling state, then setting READY_R if in/err/hup, and setting READY_W if out/err. But since the commit above, all flags are stored together in the same structure field that is being updated with the new flags, thus we can simply update the flags altogether and avoid multiple atomic operations. This even removes the need for atomic ops for FDs that are not shared.	2021-07-30 17:41:55 +02:00
Willy Tarreau	88babd9944	DOC: internals: document the FD takeover process This explains the traps to avoid and the sequence that leads to consistent use of an FD known by multiple threads at once. This was co-authored with Olivier.	2021-07-30 17:41:55 +02:00
Willy Tarreau	d5402b8df8	BUG/MINOR: fd: protect fd state harder against a concurrent takeover There's a theoretical race (that we failed to trigger) in function fd_update_events(), which could strike on idle connections. The "locked" variable will most often be 0 as the FD is bound to the current thread only. Another thread could take it over once "locked" is set, change the thread and running masks. Then the first thread updates the FD's state non-atomically and possibly overwrites what the other thread was preparing. It still looks like the FD's state will ultimately converge though. The solution against this is to set the running flag earlier so that a takeover() attempt cannot succeed, or that the fd_set_running() attempt fails, indicating that nothing needs to be done on this FD. While this is sufficient for a simple fix to be backported, it leaves the FD actively polled in the calling thread, this will trigger a second wakeup which will notice the absence of tid_bit in the thread_mask, getting rid of it. A more elaborate solution would consist in calling fd_set_running() directly from the pollers before calling fd_update_events(), getting rid of the thread_mask test and letting the caller eliminate that FD from its list if needed. Interestingly, this code also proves to be suboptimal in that it sets the FD state twice instead of calculating the new state at once and always using a CAS to set it. This is a leftover of a simplification that went into 2.4 and which should be explored in a future patch. This may be backported as far as 2.2.	2021-07-30 14:54:19 +02:00
Willy Tarreau	79e90b9615	BUG/MINOR: pollers: always program an update for migrated FDs If an MT-aware poller reports that a file descriptor was migrated, it must stop reporting it. The simplest way to do this is to program an update if not done yet. This will automatically mark the FD for update on next round. Otherwise there's a risk that some events are reported a bit too often and cause extra CPU usage with these pollers. Note that epoll is currently OK regarding this. Select does not need this because it uses a single shared events table, so in case of migration no FD change is expected. This should be backported as far as 2.2.	2021-07-30 14:21:43 +02:00
Willy Tarreau	177119bb11	BUG/MINOR: poll: fix abnormally high skip_fd counter The skip_fd counter that is incremented when a migrated FD is reported was abnormally high with poll. The reason is that it was accounted for before preparing the polled events instead of being measured from the reported events. This mistake was made when the counters were introduced in 1.9 with commit `d80cb4ee1` ("MINOR: global: add some global activity counters to help debugging"). It may be backported as far as 2.0.	2021-07-30 14:04:28 +02:00
Willy Tarreau	fcc5281513	BUG/MINOR: select: fix excess number of dead/skip reported In 1.8, commit `ab62f5195` ("MINOR: polling: Use fd_update_events to update events seen for a fd") updated the pollers to rely on fd_update_events(), but the modification delayed the test of presence of the FD in the report, resulting in owner/thread_mask and possibly event updates being performed for each FD appearing in a block of 32 FDs around an active one. This caused the request rate to be ~3 times lower with select() than poll() under 6 threads. This can be backported as far as 1.8.	2021-07-30 13:55:36 +02:00
Willy Tarreau	c37ccd70b4	BUG/MEDIUM: pollers: clear the sleeping bit after waking up, not before A bug was introduced in 2.1-dev2 by commit `305d5ab46` ("MAJOR: fd: Get rid of the fd cache."). Pollers "poll" and "evport" had the sleeping bit accidentally removed before the syscall instead of after. This results in them not being woken up by inter-thread wakeups, which is particularly visible with the multi-queue accept() and with queues. As a work-around, when these pollers are used, "nbthread 1" should be used. The fact that it has remained broken for 2 years is a great indication that threads are definitely not enabled outside of epoll and kqueue, hence why this patch is only tagged medium. This must be backported as far as 2.2.	2021-07-30 10:57:09 +02:00
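
Commits d6b7080cec and 2fc4d39577 above describe a refcount that keeps a dynamic server alive while its check is purged asynchronously. The following is only a minimal sketch of that idea using C11 atomics; the `demo_server` type and the `demo_server_*` helpers are hypothetical names invented for illustration and are not HAProxy's actual structures or functions.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a dynamic server (NOT HAProxy's struct server). */
struct demo_server {
    atomic_uint refcount; /* number of users keeping the instance alive */
    int freed;            /* set where a real implementation would free memory */
};

/* A freshly added server starts with one reference, held by its owner. */
static void demo_server_init(struct demo_server *srv)
{
    atomic_init(&srv->refcount, 1);
    srv->freed = 0;
}

/* Take an extra reference, e.g. when a check is configured on the server. */
static void demo_server_take(struct demo_server *srv)
{
    atomic_fetch_add(&srv->refcount, 1);
}

/* Drop one reference. Returns 1 if this was the last one and the server
 * was released, 0 if another user (e.g. a pending check purge) still
 * holds it.
 */
static int demo_server_drop(struct demo_server *srv)
{
    unsigned int prev = atomic_fetch_sub(&srv->refcount, 1);

    assert(prev > 0); /* dropping more references than were taken is a bug */
    if (prev == 1) {
        srv->freed = 1;
        return 1;
    }
    return 0;
}
```

In this sketch, a "delete server" request would call `demo_server_drop()` once for the owner, and the asynchronous check purge would call it again when it completes; whichever call comes last performs the release.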


15202 Commits