BUG/MINOR: haproxy/threads: try to make all threads leave together

There's a small issue with soft stop combined with the incoming
connection load balancing. A thread may dispatch a connection to
another one at the moment stopping=1 is set, and the second one could
stop by seeing (jobs - unstoppable_jobs) == 0 in run_poll_loop(),
without ever picking these connections from the queue. This is
visible in that it may occasionally cause a connection drop on
reload since no remaining thread will ever pick that connection
anymore.

In order to address this, this patch adds a stopping_thread_mask
variable in which each thread sets its bit to acknowledge that it is
willing to stop once its run queue is empty. Threads only leave the
poll loop once all of them have acknowledged, so that if some late
work arrives in a thread's queue, that thread still has a chance to
process it.
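
The acknowledgement scheme described above can be sketched in isolation
as follows. This is only an illustration under stated assumptions, not
the code from the patch: every demo_* name is hypothetical, C11 atomics
stand in for the _HA_ATOMIC_OR() macro, and the run queue length is
passed as a parameter instead of being read from a global.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the stopping_thread_mask global: one bit
 * per thread, set once that thread has acknowledged stopping. */
static atomic_ulong demo_stopping_thread_mask;

/* Build a mask with one bit set for each of nbthread threads.
 * Limited to fewer than 32 threads to keep the shift well defined. */
static unsigned long demo_all_threads_mask(unsigned int nbthread)
{
	assert(nbthread >= 1 && nbthread < 32);
	return (1UL << nbthread) - 1;
}

/* A thread acknowledges only when a stop was requested and it has
 * nothing left in its run queue. */
static void demo_ack_stopping(unsigned int tid, bool stopping, int run_queue)
{
	if (stopping && run_queue == 0)
		atomic_fetch_or(&demo_stopping_thread_mask, 1UL << tid);
}

/* A thread may leave its loop only when no stoppable job remains, its
 * run queue is empty, and every thread has acknowledged. */
static bool demo_may_leave(int jobs, int unstoppable_jobs, int run_queue,
                           unsigned long all_mask)
{
	unsigned long acked = atomic_load(&demo_stopping_thread_mask);

	return (jobs - unstoppable_jobs) == 0 && run_queue == 0 &&
	       (acked & all_mask) == all_mask;
}
```

With two threads, demo_may_leave() keeps returning false after thread 0
alone has acknowledged, so thread 0 stays around while thread 1 may
still hand it work. It only returns true once thread 1 has emptied its
queue and set its bit as well.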

This should be backported to 2.1 and 2.0.
Willy Tarreau 2020-03-12 17:28:01 +01:00
parent a7da5e8dd0
commit 4b3f27b67f

@@ -146,6 +146,7 @@ unsigned long pid_bit = 1; /* bit corresponding to the process id */
 unsigned long all_proc_mask = 1; /* mask of all processes */
 volatile unsigned long sleeping_thread_mask = 0; /* Threads that are about to sleep in poll() */
+volatile unsigned long stopping_thread_mask = 0; /* Threads acknowledged stopping */
 /* global options */
 struct global global = {
@@ -2810,8 +2811,12 @@ void run_poll_loop()
 		if (tid == 0)
 			signal_process_queue();
+		if (stopping && tasks_run_queue == 0)
+			_HA_ATOMIC_OR(&stopping_thread_mask, tid_bit);
 		/* stop when there's nothing left to do */
-		if ((jobs - unstoppable_jobs) == 0)
+		if ((jobs - unstoppable_jobs) == 0 && tasks_run_queue == 0 &&
+		    (stopping_thread_mask & all_threads_mask) == all_threads_mask)
 			break;
 		/* also stop if we failed to cleanly stop all tasks */