BUG/MAJOR: threads/task: dequeue expired tasks under the WQ lock

There is a small unprotected window for a task between the wait queue
and the run queue where a task could be woken up and destroyed at the
same time. What typically happens is that a timeout is reached at the
same time an I/O completes and wakes it up, and the I/O terminates the
task, causing a use after free in wake_expired_tasks() possibly causing
a crash and/or memory corruption :

       thread 1                             thread 2
  (wake_expired_tasks)                (stream_int_notify)

 HA_SPIN_UNLOCK(TASK_WQ_LOCK, &wq_lock);
                              task_wakeup(task, TASK_WOKEN_IO);
                              ...
                              process_stream()
                                stream_free()
                                   task_free()
                                      pool_free(task)
 task_wakeup(task, TASK_WOKEN_TIMER);

This case is reasonably easy to reproduce with a config using very short
server timeouts (100ms) and client timeouts (10ms), while injecting on
httpterm requesting medium sized objects (5kB) over SSL. All this is
easier done with more threads than allocated CPUs so that pauses can
happen anywhere and last long enough for process_stream() to kill the
task.

This patch inverts the lock and the wakeup(), but requires some changes
in process_runnable_tasks() to ensure we never try to grab the WQ lock
while having the RQ lock held. This means we have to release the RQ lock
before calling task_queue(), so we can't hold the RQ lock during the
loop and must take and drop it.

It seems that a different approach with the scope-aware trees could be
easier, but it would possibly not cover situations where a task is
allowed to run on multiple threads. The current solution covers it and
doesn't seem to have any measurable performance impact.
This commit is contained in:
Willy Tarreau 2017-11-23 18:36:50 +01:00
parent 541dd82879
commit 51753458c4
1 changed files with 11 additions and 5 deletions

View File

@ -162,8 +162,8 @@ int wake_expired_tasks()
__task_queue(task);
goto lookup_next;
}
HA_SPIN_UNLOCK(TASK_WQ_LOCK, &wq_lock);
task_wakeup(task, TASK_WOKEN_TIMER);
HA_SPIN_UNLOCK(TASK_WQ_LOCK, &wq_lock);
}
HA_SPIN_UNLOCK(TASK_WQ_LOCK, &wq_lock);
@ -306,21 +306,27 @@ void process_runnable_tasks()
local_tasks[final_tasks_count++] = t;
}
HA_SPIN_LOCK(TASK_RQ_LOCK, &rq_lock);
for (i = 0; i < final_tasks_count ; i++) {
t = local_tasks[i];
t->state &= ~TASK_RUNNING;
/* If there is a pending state
* we have to wake up the task
* immediatly, else we defer
* it into wait queue
*/
if (t->pending_state)
HA_SPIN_LOCK(TASK_RQ_LOCK, &rq_lock);
t->state &= ~TASK_RUNNING;
if (t->pending_state) {
__task_wakeup(t);
else
HA_SPIN_UNLOCK(TASK_RQ_LOCK, &rq_lock);
}
else {
/* we must never hold the RQ lock before the WQ lock */
HA_SPIN_UNLOCK(TASK_RQ_LOCK, &rq_lock);
task_queue(t);
}
}
HA_SPIN_LOCK(TASK_RQ_LOCK, &rq_lock);
if (max_processed <= 0) {
active_tasks_mask |= tid_bit;
break;