client: fix session open vs mdsmap race with request kicking

The race shows up in a sequence like:

 - ceph-fuse starts, make_request on getattr
 - waits for mds to be active
 - tries to open a session
 - mds restarts, recovers
 - eventually gets session open reply
 - sends first getattr (even though the mds is still in the reconnect state)
 - gets mdsmap update that mds is now active
 - kicks request, resends getattr
 - gets first reply
 - ignores second reply, caps get out of sync

The bug is that we send the first request when the MDS is still in
the reconnect state.  The fix is to loop in make_request so that we
ensure all conditions are satisfied before sending the request.  Any
time we wait, we loop, so that we know all conditions (still) pass if
we make it to the end.
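
The re-check loop described above can be sketched as a small single-threaded simulation. This is only an illustration under stated assumptions, not the actual Client code: FakeClient, wait(), replay() and the scripted events are hypothetical names invented here, and "waiting" is modelled by applying the next scripted event instead of blocking on a condition variable.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the MDS and session states.
enum class MdsState { Reconnect, Active };
enum class SessionState { Closed, Opening, Open };

struct FakeClient {
  MdsState mds = MdsState::Reconnect;
  SessionState session = SessionState::Closed;
  std::deque<std::function<void(FakeClient&)>> events;  // scripted wakeups
  std::vector<std::string> log;

  // Stand-in for blocking: record why we wait, then apply the next event.
  void wait(const std::string& why) {
    log.push_back("wait: " + why);
    assert(!events.empty());
    auto ev = events.front();
    events.pop_front();
    ev(*this);
  }

  // recheck_after_wait == true models the fix: every wait is followed by
  // `continue`, so all conditions are re-evaluated from the top.
  // recheck_after_wait == false models the old fall-through behaviour.
  void make_request(bool recheck_after_wait) {
    while (true) {
      if (mds != MdsState::Active) {
        wait("mds active");
        continue;
      }
      if (session == SessionState::Closed) {
        session = SessionState::Opening;
        log.push_back("open session");
      }
      if (session == SessionState::Opening) {
        wait("session open");
        if (recheck_after_wait)
          continue;
      }
      log.push_back(mds == MdsState::Active ? "send while active"
                                            : "send while reconnect");
      return;
    }
  }
};

// Replays the sequence from the commit message: the mds becomes active,
// restarts while the session is opening, then becomes active again.
std::vector<std::string> replay(bool recheck_after_wait) {
  FakeClient c;
  c.events.push_back([](FakeClient& f) { f.mds = MdsState::Active; });
  c.events.push_back([](FakeClient& f) {
    f.mds = MdsState::Reconnect;         // mds restarted and is recovering
    f.session = SessionState::Open;      // session open reply arrives
  });
  c.events.push_back([](FakeClient& f) { f.mds = MdsState::Active; });
  c.make_request(recheck_after_wait);
  return c.log;
}
```

Calling replay(false) ends with a send while the simulated mds is in reconnect, which is the premature getattr from the sequence. Calling replay(true) waits once more for the mds to become active before sending.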

Fixes: #4853
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil 2013-04-29 10:44:28 -07:00
parent cea2ff8615
commit ee553ac279


@@ -1299,13 +1299,9 @@ int Client::make_request(MetaRequest *request,
       // wait
       if (session->state == MetaSession::STATE_OPENING) {
-        Cond session_cond;
-        session->waiting_for_open.push_back(&session_cond);
-        while (session->state == MetaSession::STATE_OPENING) {
-          ldout(cct, 10) << "waiting for session to mds." << mds << " to open" << dendl;
-          session_cond.Wait(client_lock);
-        }
-        session->waiting_for_open.remove(&session_cond);
+        ldout(cct, 10) << "waiting for session to mds." << mds << " to open" << dendl;
+        wait_on_list(session->waiting_for_open);
+        continue;
       }
       if (!have_open_session(mds))