BUG/MEDIUM: wdt/clock: properly handle early task hangs

In ae053b30 - BUG/MEDIUM: wdt: don't trigger the watchdog when p is unitialized:
	wdt is not triggering until prev_cpu_time
	is initialized to prevent unexpected process
	termination.

Unfortunately this is not enough: some tasks can start
immediately after process startup, and in such cases
prev_cpu_time may still be uninitialized, because
prev_cpu_time is set after the polling loop while
process_runnable_tasks() is executed before the polling loop.

This happens to be the case with Lua tasks registered using
the register_task function from a Lua script.

Those tasks are registered in the early init stage of haproxy and
they are scheduled to run before the first polling loop,
leaving prev_cpu_time uninitialized (equal to 0)
on the thread when the task is first executed.

Because of this, if such a task gets stuck right away
(e.g. on blocking I/O), the watchdog won't behave as expected
and the thread will remain stuck indefinitely
(the polling loop for the thread won't run at all, as
the thread is already stuck).

To solve this, we now make sure that prev_cpu_time is set
before any task is processed on the thread.
This is done by setting the initial prev_cpu_time value directly
in clock_init_thread_date().
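
The effect of the initialization order can be illustrated with a minimal,
self-contained sketch. It is not the haproxy code: struct thread_ctx_sketch,
wdt_should_trigger(), both init helpers and the limit parameter are
hypothetical names, and the check assumes a watchdog that stays silent while
prev_cpu_time is 0, as described for ae053b30.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-thread context. */
struct thread_ctx_sketch {
	uint64_t prev_cpu_time; /* 0 means "not sampled yet" */
};

/* Assumed watchdog decision: never report a thread whose prev_cpu_time
 * is still 0, otherwise report it once it has consumed at least <limit>
 * units of CPU time since the last sample.
 */
static bool wdt_should_trigger(const struct thread_ctx_sketch *ctx,
                               uint64_t now_cpu, uint64_t limit)
{
	if (ctx->prev_cpu_time == 0)
		return false; /* uninitialized: the watchdog stays silent */
	return now_cpu - ctx->prev_cpu_time >= limit;
}

/* Before the fix: the first sample is only taken after the first polling
 * loop, so an early task runs with prev_cpu_time still at 0.
 */
static void init_thread_before_fix(struct thread_ctx_sketch *ctx)
{
	ctx->prev_cpu_time = 0;
}

/* After the fix: the first sample is taken during thread clock init,
 * before any task can run on the thread.
 */
static void init_thread_after_fix(struct thread_ctx_sketch *ctx,
                                  uint64_t now_cpu)
{
	ctx->prev_cpu_time = now_cpu;
}
```

In this sketch, a task that hangs before the first polling loop is never
reported after init_thread_before_fix(), whatever CPU time it burns, whereas
after init_thread_after_fix() it is reported once the limit is exceeded.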

Thanks to Abhijeet Rastogi for reporting this unexpected behavior.

It could be backported to every stable version
(everywhere ae053b30 is, because both are related).
Aurelien DARRAGON 2022-11-10 11:47:47 +01:00 committed by Willy Tarreau
parent aa1909edf7
commit 16d6c0cb09

@@ -284,6 +284,7 @@ void clock_init_thread_date(void)
 	now.tv_sec = old_now >> 32;
 	now.tv_usec = (uint)old_now;
 	th_ctx->idle_pct = 100;
+	th_ctx->prev_cpu_time = now_cpu_time();
 	clock_update_date(0, 1);
 }