mds: fix race in connection accept; fix con replacement

We solve two problems with this patch.  The first is that the messenger
will now reuse an existing session's Connection with a new connection,
which means that we don't want to change session->connection when we
are validating an authorizer.  Instead, set (but do not change) it.

We also want to avoid a race where:

 - mds recovers, replays Sessions with no con's
 - multiple connection attempts for the same session race in the msgr
 - both are authorized, but out of order
 - Session->connection gets set to the losing attempt's Connection*

Instead, we take advantage of an accept event that is called only for
accepted winners.

Signed-off-by: Sage Weil <sage@inktank.com>

fixup
This commit is contained in:
Sage Weil 2012-07-10 13:24:51 -07:00
parent 68bad03b2c
commit a542d89ee5
2 changed files with 29 additions and 4 deletions

View File

@ -2032,15 +2032,27 @@ bool MDS::ms_verify_authorizer(Connection *con, int peer_type,
s = new Session;
s->inst.addr = con->get_peer_addr();
s->inst.name = n;
dout(10) << " new session " << s << " for " << s->inst << dendl;
dout(10) << " new session " << s << " for " << s->inst << " con " << con << dendl;
con->set_priv(s);
s->connection = con;
sessionmap.add_session(s);
} else {
dout(10) << " existing session " << s << " for " << s->inst << dendl;
dout(10) << " existing session " << s << " for " << s->inst << " existing con " << s->connection
<< ", new/authorizing con " << con << dendl;
con->set_priv(s->get());
s->connection = con;
}
// Wait until we fully accept the connection before setting
// s->connection. In particular, if there are multiple incoming
// connection attempts, they will all get their authorizer
// validated, but some of them may "lose the race" and get
// dropped. We only want to consider the winner(s). See
// ms_handle_accept(). This is important for Sessions we replay
// from the journal on recovery that don't have established
// messenger state; we want the con from only the winning
// connect attempt(s). (Normal reconnects that don't follow MDS
// recovery are reconnected to the existing con by the
// messenger.)
}
/*
s->caps.set_allow_all(caps_info.allow_all);
@ -2057,3 +2069,15 @@ bool MDS::ms_verify_authorizer(Connection *con, int peer_type,
};
void MDS::ms_handle_accept(Connection *con)
{
Session *s = (Session *)con->get_priv();
dout(10) << "ms_handle_accept " << con->get_peer_addr() << " con " << con << " session " << s << dendl;
if (s) {
s->put();
if (s->connection != con) {
dout(10) << " session connection " << s->connection << " -> " << con << dendl;
s->connection = con;
}
}
}

View File

@ -315,6 +315,7 @@ class MDS : public Dispatcher {
bool ms_verify_authorizer(Connection *con, int peer_type,
int protocol, bufferlist& authorizer_data, bufferlist& authorizer_reply,
bool& isvalid);
void ms_handle_accept(Connection *con);
void ms_handle_connect(Connection *con);
bool ms_handle_reset(Connection *con);
void ms_handle_remote_reset(Connection *con);