mirror of
http://git.haproxy.org/git/haproxy.git/
synced 2025-02-21 05:06:56 +00:00
BUG/MEDIUM: debug: don't set the STUCK flag from debug_handler()
Since 2.0 with commite6a02fa65a
("MINOR: threads: add a "stuck" flag to the thread_info struct"), the TH_FL_STUCK flag was set by the debugger to flag that a thread was stuck and report it in the output. However, two commits later (2bfefdbaef
"MAJOR: watchdog: implement a thread lockup detection mechanism"), this flag was used to detect that a thread had already been reported as stuck. The problem is that it seldom happens that a "show threads" command instantly crashes because it calls debug_handler(), which sets the flag, and if the watchdog timer was about to trigger before going back to the scheduler, the watchdog believes that the thread has been stuck for a while and will kill the process. The issue was magnified in 3.1 with the lower-delay warning, because it's possible for a thread to die on the next wakeup after the first warning (which calls debug_handler() hence sets the STUCK flag). One good approach would have been to use two distinct flags, one for "stuck" as reported by the debug handler, and one for "stuck" as seen by the watchdog. However, one could also argue that since the second commit, given that the wdt monitors the threads, there's no point any more for the debug handler to set the flag itself. Removing this code means that two consecutive "show threads" will not report "stuck" until the watchdog sets it, which aligns better with expectations. This can be backported to all stable releases. This code has changed a bit over time, the "if" block and the harmless variables just need to be removed.
This commit is contained in:
parent
332839eb9d
commit
1151fe6818
@ -2347,7 +2347,6 @@ static int debug_iohandler_counters(struct appctx *appctx)
|
||||
void debug_handler(int sig, siginfo_t *si, void *arg)
|
||||
{
|
||||
struct buffer *buf = HA_ATOMIC_LOAD(&th_ctx->thread_dump_buffer);
|
||||
int harmless = is_thread_harmless();
|
||||
int no_return = 0;
|
||||
|
||||
/* first, let's check it's really for us and that we didn't just get
|
||||
@ -2371,13 +2370,6 @@ void debug_handler(int sig, siginfo_t *si, void *arg)
|
||||
*/
|
||||
ha_thread_dump_one(tid, 1);
|
||||
|
||||
/* mark the current thread as stuck to detect it upon next invocation
|
||||
* if it didn't move.
|
||||
*/
|
||||
if (!harmless &&
|
||||
!(_HA_ATOMIC_LOAD(&th_ctx->flags) & TH_FL_SLEEPING))
|
||||
_HA_ATOMIC_OR(&th_ctx->flags, TH_FL_STUCK);
|
||||
|
||||
/* in case of panic, no return is planned so that we don't destroy
|
||||
* the buffer's contents and we make sure not to trigger in loops.
|
||||
*/
|
||||
|
Loading…
Reference in New Issue
Block a user