msg/SimpleMessenger: drop msgr lock when joining a Pipe

Avoid this deadlock:

- a fault
- delay thread entry gets a fast dispatch message
 - drops delay_lock
 - calls into fast_dispatch
- reaper tries to reap the pipe
 - pipe->join()
  - delay_thread->join()
   - blocks waiting for delay_thread to exit
- delay thread / fast dispatch blocks on msgr->lock trying to mark_down
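
The cycle above can be reproduced in miniature. The sketch below is not Ceph code: `msgr_lock`, `delay_thread_can_lock()`, and the use of `std::mutex` and `std::thread` are stand-ins chosen for illustration. The calling thread plays the reaper and holds the lock across join(). The worker plays the delay thread, but only probes the lock with try_lock() so that the example terminates. With a blocking lock() in its place, the join() would never return.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Returns whether a worker thread could take the lock while the
// calling thread (the "reaper") holds it across join().
bool delay_thread_can_lock() {
  std::mutex msgr_lock;            // stand-in for msgr->lock
  bool got_lock = false;

  msgr_lock.lock();                // reaper holds the lock
  std::thread delay_thread([&] {
    // A real delay thread would block here in mark_down(); try_lock()
    // is used only so that this sketch terminates.
    if (msgr_lock.try_lock()) {
      got_lock = true;
      msgr_lock.unlock();
    }
  });
  delay_thread.join();             // reaper joins while holding the lock
  msgr_lock.unlock();

  assert(!got_lock);               // the lock was held for the worker's whole lifetime
  return got_lock;
}
```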

The solution is to drop the msgr lock while joining the thread.  The delay
thread can then take the lock, finish mark_down, and exit, which lets the
join() complete.  Because reaper() may now drop and retake the lock, the
reaper thread rechecks its exit condition after each call.  The other two
callers of reaper() do not care.
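
The locking pattern of the fix can be sketched outside of Ceph as follows. This is a minimal illustration under stated assumptions, not the SimpleMessenger implementation: `Messenger`, `worker_entry()`, `reap()`, and `join_with_lock_dropped()` are hypothetical names, and `std::mutex` with `std::unique_lock` stands in for the msgr lock. The worker needs the lock before it can exit, so the caller releases the lock around join() and retakes it afterwards.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the messenger; not the Ceph class.
struct Messenger {
  std::mutex lock;           // plays the role of the msgr lock
  bool marked_down = false;  // protected by lock
};

// Worker that must take the messenger lock before it can exit,
// like a delay thread that ends up in mark_down.
void worker_entry(Messenger& m) {
  std::lock_guard<std::mutex> g(m.lock);
  m.marked_down = true;
}

// Join the worker. The caller holds the lock through `held`; it is
// released around join() and retaken before returning.
void reap(std::thread& worker, std::unique_lock<std::mutex>& held) {
  held.unlock();
  worker.join();
  held.lock();
}

bool join_with_lock_dropped() {
  Messenger m;
  std::unique_lock<std::mutex> held(m.lock);      // caller holds the lock
  std::thread worker(worker_entry, std::ref(m));  // blocks until the lock is free
  reap(worker, held);
  assert(held.owns_lock());                       // lock is held again after reap()
  return m.marked_down;                           // read under the lock
}
```

Because state guarded by the lock can change while it is released, a caller that loops on a flag, as the reaper thread does with its stop flag, has to re-read that flag after such a call returns.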

Fixes: #8891
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil 2014-08-03 18:26:34 -07:00
parent e36babc825
commit 98997f3b22


@@ -204,7 +204,9 @@ void SimpleMessenger::reaper_entry()
   ldout(cct,10) << "reaper_entry start" << dendl;
   lock.Lock();
   while (!reaper_stop) {
-    reaper();
+    reaper();  // may drop and retake the lock
+    if (reaper_stop)
+      break;
     reaper_cond.Wait(lock);
   }
   lock.Unlock();
@@ -236,7 +238,14 @@ void SimpleMessenger::reaper()
     p->unregister_pipe();
     assert(pipes.count(p));
     pipes.erase(p);
+    // drop msgr lock while joining thread; the delay through could be
+    // trying to fast dispatch, preventing it from joining without
+    // blocking and deadlocking.
+    lock.Unlock();
     p->join();
+    lock.Lock();
     if (p->sd >= 0)
       ::close(p->sd);
     ldout(cct,10) << "reaper reaped pipe " << p << " " << p->get_peer_addr() << dendl;