mirror of
http://git.haproxy.org/git/haproxy.git/
synced 2024-12-26 14:42:21 +00:00
BUG/MEDIUM: listener: read-lock the listener during accept()
Listeners might be disabled by other threads while running in listener_accept() due to a stopping condition or possibly a rebinding error after a failed stop/start. When this happens, the listener's FD is -1 and accesses made by the lower layers to fdtab[-1] do not end up well. This can occasionally be noticed if running at high connection rates in master-worker mode when compiled with ASAN and hammered with 10 reloads per second. From time to time an out-of-bounds error will be reported. One approach could consist in keeping a copy of critical information such as the FD before proceeding but that's not correct since in case of close() the FD might be reassigned to another connection for example. In fact what is needed is to read-lock the listener during this operation so that it cannot change while we're touching it. Tests have shown that using a spinlock only does generally work well but it doesn't scale much with threads and we can see listener_accept() eat 10-15% CPU on a 24 thread machine at 300k conn/s. For this reason the lock was turned to an rwlock by previous commit and this patch only takes the read lock to make sure other operations do not change the listener's state while threads are accepting connections. With this approach, no performance loss was noticed at all and listener_accept() doesn't appear in perf top. This ought to be backported to about all branches that make use of the unlocked listeners, but in practice it seems to mostly concern 2.3 and above, since 2.2 and older will take the FD in the argument (and the race exists there, this FD could end up being reassigned in parallel but there's not much that can be done there to prevent that race; at least a permanent error will be reported). For backports, the current approach is preferred, with a preliminary backport of previous commit "MINOR: listener: replace the listener's spinlock with an rwlock". However if for any reason this commit cannot be backported, the current patch can be modified to simply take a spinlock (tested and works), it will just impact high performance workloads (like DDoS protection).
This commit is contained in:
parent
08b6f96452
commit
fed93d367c
@ -915,7 +915,18 @@ void listener_accept(struct listener *l)
|
|||||||
} while (!_HA_ATOMIC_CAS(&actconn, (int *)(&count), next_actconn));
|
} while (!_HA_ATOMIC_CAS(&actconn, (int *)(&count), next_actconn));
|
||||||
}
|
}
|
||||||
|
|
||||||
cli_conn = l->rx.proto->accept_conn(l, &status);
|
/* be careful below, the listener might be shutting down in
|
||||||
|
* another thread on error and we must not dereference its
|
||||||
|
* FD without a bit of protection.
|
||||||
|
*/
|
||||||
|
cli_conn = NULL;
|
||||||
|
status = CO_AC_PERMERR;
|
||||||
|
|
||||||
|
HA_RWLOCK_RDLOCK(LISTENER_LOCK, &l->lock);
|
||||||
|
if (l->rx.flags & RX_F_BOUND)
|
||||||
|
cli_conn = l->rx.proto->accept_conn(l, &status);
|
||||||
|
HA_RWLOCK_RDUNLOCK(LISTENER_LOCK, &l->lock);
|
||||||
|
|
||||||
if (!cli_conn) {
|
if (!cli_conn) {
|
||||||
switch (status) {
|
switch (status) {
|
||||||
case CO_AC_DONE:
|
case CO_AC_DONE:
|
||||||
|
Loading…
Reference in New Issue
Block a user