MEDIUM: tasks: only base the nice offset on the run queue depth

The offset calculated for the nice value has been wrong for a long
time and got even worse when the improved multi-thread scheduler was
implemented, because it continued to rely on the run queue size, which
became irrelevant given that we extract tasks in batches, so the run
queue size follows a sawtooth pattern.
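
To illustrate the dependency on the run queue size, here is a minimal
stand-alone sketch of the computation removed by this patch. The function
name old_nice_offset() and its rq_size parameter are hypothetical and only
stand in for the *rq_size dereference of the real code:

```c
/* Hypothetical stand-alone copy of the pre-patch computation: the
 * offset scales with the instantaneous run queue size, so the same
 * nice value yields very different offsets depending on when the
 * task is woken up.
 */
int old_nice_offset(int nice, int rq_size)
{
	if (nice > 0)
		return (unsigned)(((rq_size + 1) * (unsigned int)nice) / 32U);
	else
		return -(unsigned)(((rq_size + 1) * (unsigned int)-nice) / 32U);
}
```

With nice 1024, this gives an offset of 32 when the queue is empty and
1024 when it holds 31 entries, so the result depends on where the wakeup
falls in the sawtooth.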

However the offset reflects insertion positions in the queue much
better, so it's worth dropping this rq_size component of the equation.
Finally, since tasks are extracted in batches of runqueue-depth entries
at once, the higher the depth, the lower the effect of the nice setting,
because entries are picked together in batches and placed into a list.
An intuitive approach is to multiply the nice value by the batch size
so that tasks can take part in a different batch. Experimentation shows
that this works pretty well.
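
As a rough sketch of the new computation, assuming a hypothetical helper
new_nice_offset() whose runqueue_depth argument stands in for
global.tune.runqueue_depth, the offset now only depends on the nice
value and the configured batch size:

```c
/* Hypothetical stand-alone version of the post-patch computation: the
 * offset is the nice value scaled by the batch size, independently of
 * the current run queue size.
 */
int new_nice_offset(int nice, unsigned int runqueue_depth)
{
	return nice * (int)runqueue_depth;
}
```

For example, nice 1024 gives an offset of 16384 with a depth of 16 and
5120 with a depth of 5, while nice 0 always leaves the key unchanged.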

With a runqueue-depth of 16 and a parasitic load of 16000 requests
per second on 100 streams, measurements show 16000 requests per second
for the default nice of 0, 22000 for nice -1024 and 10000 for nice 1024.

The difference is even bigger with a runqueue depth of 5. At 200
however it's much smoother (16000-22000).
Willy Tarreau 2019-04-15 09:18:31 +02:00
parent cde7902ac9
commit 2d1fd0a0d2

@@ -67,18 +67,12 @@ struct task_per_thread task_per_thread[MAX_THREADS];
 void __task_wakeup(struct task *t, struct eb_root *root)
 {
 	void *expected = NULL;
-	int *rq_size;
 #ifdef USE_THREAD
 	if (root == &rqueue) {
-		rq_size = &global_rqueue_size;
 		HA_SPIN_LOCK(TASK_RQ_LOCK, &rq_lock);
-	} else
-#endif
-	{
-		int nb = ((void *)root - (void *)&task_per_thread[0].rqueue) / sizeof(task_per_thread[0]);
-		rq_size = &task_per_thread[nb].rqueue_size;
 	}
+#endif
 	/* Make sure if the task isn't in the runqueue, nobody inserts it
 	 * in the meanwhile.
 	 */
@@ -130,10 +124,7 @@ redo:
 		int offset;
 		_HA_ATOMIC_ADD(&niced_tasks, 1);
-		if (likely(t->nice > 0))
-			offset = (unsigned)(((*rq_size + 1) * (unsigned int)t->nice) / 32U);
-		else
-			offset = -(unsigned)(((*rq_size + 1) * (unsigned int)-t->nice) / 32U);
+		offset = t->nice * (int)global.tune.runqueue_depth;
 		t->rq.key += offset;
 	}