BUG/MINOR: polling: fix time reporting when using busy polling

Since commit beb859abce ("MINOR: polling: add an option to support
busy polling") the time and status passed to clock_update_local_date()
were incorrect. Indeed, what is considered is the before_poll date
related to the configured timeout which does not correspond to what
is passed to the poller. That's not correct because before_poll+the
syscall's timeout will be crossed by the current date 100 ms after
the start of the poller. In practice it didn't happen when the poller
was limited to 1s timeout but at one minute it happens all the time.

That's particularly visible when running a multi-threaded setup with
busy polling and only half of the threads working (bind ... thread even).
In this case, the fixup code of clock_update_local_date() is executed
for each round of busy polling. The issue was made really visible
starting with recent commit e8b1ad4c2b ("BUG/MEDIUM: clock: also
update the date offset on time jumps") because upon a jump, the
shared offset is reset, while it should not be in this specific
case.

What needs to be done instead is to pass the configured timeout of
the poller (and not of the syscall), and always pass "interrupted"
set so as to claim we got an event (which is sort of true as it just
means the poller returned instantly). In this case we can still
detect backwards/forward jumps and will use a correct boundary
for the maximum date that covers the whole loop.

This can be backported to all versions since the issue was introduced
with busy-polling in 1.9-dev8.
This commit is contained in:
Willy Tarreau 2024-09-12 17:47:13 +02:00
parent 1900ca475f
commit ad98edd00a
3 changed files with 3 additions and 3 deletions

View File

@ -230,7 +230,7 @@ static void _do_poll(struct poller *p, int exp, int wake)
int timeout = (global.tune.options & GTUNE_BUSY_POLLING) ? 0 : wait_time;
status = epoll_wait(epoll_fd[tid], epoll_events, global.tune.maxpollevents, timeout);
clock_update_local_date(timeout, status);
clock_update_local_date(wait_time, (global.tune.options & GTUNE_BUSY_POLLING) ? 1 : status);
if (status) {
activity[tid].poll_io++;

View File

@ -225,7 +225,7 @@ static void _do_poll(struct poller *p, int exp, int wake)
break;
}
}
clock_update_local_date(timeout, nevlist);
clock_update_local_date(wait_time, (global.tune.options & GTUNE_BUSY_POLLING) ? 1 : nevlist);
if (nevlist || interrupted)
break;

View File

@ -183,7 +183,7 @@ static void _do_poll(struct poller *p, int exp, int wake)
kev, // struct kevent *eventlist
fd, // int nevents
&timeout_ts); // const struct timespec *timeout
clock_update_local_date(timeout, status);
clock_update_local_date(wait_time, (global.tune.options & GTUNE_BUSY_POLLING) ? 1 : status);
if (status) {
activity[tid].poll_io++;