BUG/MEDIUM: listener: mark the thread as not stuck inside the loop

We tried hard to make sure we report threads as not stuck at various
crucial places, but one of them is special, it's the listener_accept()
function. The reason it is special is because it will loop a certain
number of times (default: 64) accepting incoming connections, allocating
resources, dispatching them to other threads or running L4 rules on them,
and while all of this is supposed to be extremely fast, when the machine
slows down or runs low on memory, the expectedly small delays in malloc()
caused by contention with other threads can quickly accumulate and suddenly
become critical to the point of triggering the watchdog. Furthermore, it
is technically possible to trigger this by pure configuration by setting
a huge tune.maxaccept value, which should not be possible.

Given that each operation isn't related to the same task but to a different
one each time, it is appropriate to mark the thread as not stuck each time
it accepts new work that possibly gets dispatched to other threads which
execute it.

This looks like this could be a good reason for the issue reported in
issue #388.

This fix must be backported to 2.0.
This commit is contained in:
Willy Tarreau 2020-05-01 09:51:11 +02:00
parent a6cd078f75
commit 8d2c98b76c

View File

@ -1036,6 +1036,7 @@ void listener_accept(int fd)
}
#endif
ti->flags &= ~TI_FL_STUCK; // this thread is still running
} /* end of for (max_accept--) */
end: