Commit Graph

20064 Commits

Sage Weil
d58df35f88 msg/Pipe: simplify Pipe::tcp_read() return value
Return 0 for success; there is no reason to return the length (it is
always == len).

Signed-off-by: Sage Weil <sage@inktank.com>
2012-08-13 08:46:14 -07:00
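The simplified convention above (return 0 on success rather than echoing the length back) can be sketched roughly as follows. This is a hypothetical stand-in, not the actual Pipe::tcp_read(); the helper name and signature are illustrative only:

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <unistd.h>

// Illustrative sketch: read exactly `len` bytes or fail.  Returns 0 on
// success and -1 on error/EOF -- never a byte count, since a successful
// call always transfers exactly `len` bytes anyway.
static int tcp_read_sketch(int sd, char *buf, int len) {
  while (len > 0) {
    ssize_t got = ::read(sd, buf, len);
    if (got < 0) {
      if (errno == EINTR)
        continue;        // interrupted by a signal: retry
      return -1;         // real error
    }
    if (got == 0)
      return -1;         // EOF before we got everything
    buf += got;
    len -= got;
  }
  return 0;              // success: the caller already knows the length
}
```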
Sage Weil
76954c13c1 msg/Pipe: document tcp_*()
Signed-off-by: Sage Weil <sage@inktank.com>
2012-08-13 08:46:14 -07:00
Sage Weil
5d5045d31a msg/Accepter: use learned_addr() from Accepter::bind()
Normally we never go from need_addr == false to need_addr == true.
It always starts out as true, so this else is useless on the first
call to Accepter::bind().

The only exception is rebind().  Add an unlearn_addr() that will set
need_addr again.  This is almost unnecessary, but doing so fixes a small
bug where the local_connection->peer_addr doesn't get updated when we do a
rebind().

Drop now-unused set_need_addr().  We keep get_need_addr() only because
it is useful in the debug output and for the assert.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-08-13 08:46:14 -07:00
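The need_addr lifecycle described above can be sketched like this; the names echo the commit but this is not the real Accepter/SimpleMessenger code:

```cpp
#include <cassert>

// Illustrative sketch of the need_addr lifecycle: it always starts out
// true, learned_addr() clears it, and only the rebind() path ever needs
// to flip it back on via unlearn_addr().
struct AddrState {
  bool need_addr = true;                        // always starts out true
  void learned_addr() { need_addr = false; }    // bind() learns the addr
  void unlearn_addr() { need_addr = true; }     // only rebind() needs this
};
```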
Sage Weil
1b8f2e0599 msg/SimpleMessenger: push need_addr check into learned_addr()
This puts all of the do/do not lock logic in one place, and documents
it.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-08-13 08:46:14 -07:00
Sage Weil
8453a8198c msg/Accepter: pass nonce on start
This lets us drop the otherwise awkward SimpleMessenger::get_nonce()
accessor.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-08-13 08:46:14 -07:00
Sage Weil
a0017fcc25 msgr: protect set_myaddr()
This is used by Messenger implementations (and their constituent
components).

Signed-off-by: Sage Weil <sage@inktank.com>
2012-08-13 08:46:14 -07:00
Sage Weil
c84b7289c1 msg/Accepter: make members private
Nobody uses these.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-08-13 08:46:14 -07:00
Sage Weil
608c776bf9 msgr: remove useless SimpleMessenger::msgr
Signed-off-by: Sage Weil <sage@inktank.com>
2012-08-13 08:46:14 -07:00
Sage Weil
b97f6e3544 msgr: some SimpleMessenger docs
Document basic modules and the lock ordering.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-08-13 08:46:14 -07:00
Sage Weil
8c1632ba50 cephtool: send keepalive to tell target
If we 'ceph tell <foo> ...' to a non-monitor, we need to send keepalives to
ensure we detect a tcp drop.  (Not so for monitors; monclient already does
its own keepalive thing.)

Signed-off-by: Sage Weil <sage@inktank.com>
2012-08-13 08:46:14 -07:00
Sage Weil
90e0ef907f cephtool: retry 'ceph tell <who> ...' command if connection fails
It was easy to reproduce a hang with 'ceph osd tell osd.0 foo' and
messenger failure injection.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-08-13 08:45:48 -07:00
Sage Weil
ee206a52b6 cephtool: set messenger policy
Signed-off-by: Sage Weil <sage@inktank.com>
2012-08-13 08:45:48 -07:00
Sage Weil
b30ad9a1c9 cephtool: fix deadlock on fault when waiting for osdmap
send_command() was blocking for the osdmap, and also called from the
connect callback.  Instead, re-call it from the handle_osd_map() callback
so that it never blocks.

This was easy to trigger with 'ceph osd tell osd.0 foo' and ms failure
injection.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-08-13 08:45:48 -07:00
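The non-blocking pattern the commit describes can be sketched as follows: send_command() never waits for the osdmap inside a callback; if the map isn't there yet it simply returns, and handle_osd_map() re-calls it once the map arrives. Names here are illustrative, not the actual cephtool code:

```cpp
#include <cassert>

// Sketch of the deadlock fix: bail out instead of blocking, and retry
// from the map callback.
struct ToolClient {
  bool have_osdmap = false;
  int  commands_sent = 0;
  void send_command() {
    if (!have_osdmap)
      return;            // don't block inside a messenger callback
    ++commands_sent;
  }
  void handle_osd_map() {
    have_osdmap = true;
    send_command();      // re-call the command that was waiting for the map
  }
};
```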
Sage Weil
bbc49179a9 msg/Pipe: if we send a wait, make sure we follow through
Mark our outgoing connection attempt if we send a WAIT in accept().  This
ensures we don't go to standby or closed in fault() on the outgoing
connection for any reason.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-27 10:45:21 -07:00
Sage Weil
6c01d46ee6 client: handle fault during session teardown
We may have a sequence like:

 - client does REQUEST_CLOSE
 - mds sends reply
 - connection faults, client does not get the reply
 - mds closes out its connection
 - client tries to reconnect/resend, gets RESET_SESSION
   -> continues lamely waiting

If we get a session reset and we were asking to close the connection,
we are happy--it was closed.

This was exposed with ceph-fuse start/stop tests with socket failure
injection.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-27 10:44:13 -07:00
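The fix above amounts to: a session RESET that arrives while we were asking to close the session counts as a successful close, instead of leaving the client waiting forever. A minimal sketch with illustrative names, not the actual Client code:

```cpp
#include <cassert>

// Sketch of the teardown fix: track whether we were closing when the
// reset arrives.
struct ClientSession {
  bool closing = false;
  bool closed  = false;
  void request_close() { closing = true; }
  void handle_session_reset() {
    if (closing) {
      closed = true;     // peer tore the session down: that's what we wanted
      return;
    }
    // otherwise: normal reset handling (reconnect/replay logic)
  }
};
```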
Sage Weil
a879425b37 msg/Pipe: make STANDBY behavior optional
In particular, lossless_peers should use STANDBY, but lossless_clients
should reconnect immediately since they are already doing their own session
management.

Specifically, this fixes the problem where the Client tries to open a
connection to the MDS and faults after delivering its OPEN_SESSION message
but before it gets the reply: the session isn't open yet, so it isn't
pinging.  It could, but it is simpler and faster to make the msgr layer
keep the connection open instead of waiting for a periodic keepalive.

Fixes: #2824
Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-27 10:44:09 -07:00
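The policy distinction above can be sketched like this: lossless peers park a faulted connection in STANDBY and wait, while lossless clients reconnect immediately because they run their own session management. Field and function names are hypothetical, not the real msgr Policy layout:

```cpp
#include <cassert>

// Hypothetical Policy sketch for optional STANDBY behavior.
struct Policy {
  bool lossy;
  bool standby;   // on fault, wait for the peer instead of reconnecting
  static Policy lossless_peer()   { return {false, true};  }
  static Policy lossless_client() { return {false, false}; }
};

// What fault() would do under this sketch.
enum Action { RECONNECT, STANDBY_WAIT };
inline Action on_fault(const Policy &p) {
  return p.standby ? STANDBY_WAIT : RECONNECT;
}
```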
Sage Weil
7cf1f1fb7f msg/Pipe: go to STANDBY on lossless accept fault
Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:34 -07:00
Sage Weil
ef3fd1c39d msg/Pipe: go to standby on lossless server connection faults
Go directly to the STANDBY state, and print a more accurate message.
Otherwise, we do the same check in writer() and go to STANDBY then.  This
is less confusing.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:33 -07:00
Sage Weil
9348bb388b osd: reopen heartbeat connections when they fail
If we have an active peer whose Connection fails, open a new one.  This
is necessary now that a lossy client connection does not automatically
reopen on its own (which is necessary to avoid races with session-based
lossy clients and the ms_handle_reset callback).

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:33 -07:00
Sage Weil
ea7511b83b msg/Pipe: fix leak of Connection in ctor
Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:33 -07:00
Sage Weil
60eb36ef9d msgr: close get_connection() race
This could null deref if the Pipe is registered but failed.

We need to loop here because the Pipe vs Connection stuff sucks; hopefully
this gets fixed up soonish.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:33 -07:00
Sage Weil
04fde5180e msgr: drop CLOSED checks during queueing
AFAICS these checks are pointless.  There should be no harm in queueing
messages on a closed connection; they'll get cleaned up when it is
deregistered.  Moreover, the *queuer* shouldn't be the one who has to
unregister a Pipe.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:33 -07:00
Sage Weil
adce6df207 msgr: simplify submit_message()
Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:33 -07:00
Sage Weil
2e67b7a383 msgr: do not reopen failed lossy Connections
There was a race where:

 - sending stuff to a lossy Connection
 - it fails, and queues itself for reap, queues a RESET event
 - reaper clears the Pipe
 - some thread queues new messages and the Pipe is reopened, messages sent
 - RESET event delivered to dispatch, connection is closed and reopened.

The result was that messages got sent to the OSD out of order during the
window between the fault() and ms_handle_reset() getting called.  This will
prevent that.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:33 -07:00
Sage Weil
9a4e702795 msg/Pipe: unregister pipe immediately on fault; fix mark_down
This fixes a problem where:

 - pipe faults, con->pipe is cleared
 - ms_handle_reset tries to mark_down, but it doesn't know the pipe

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:32 -07:00
Sage Weil
541694f768 msg/Pipe: disconnect Pipe from lossy Connection immediately on failure
When we have a lossy connection failure, immediately disconnect the Pipe
and set the Connection failed flag.  There is no reason to wait until the
reaper comes along.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:32 -07:00
Sage Weil
cef8510560 msg/Connection: add failed flag for lossy Connections
If a lossy Connection fails and we disconnect the Pipe, set a failed flag.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:32 -07:00
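Taken together, the last few commits add a failed flag that is set the moment a lossy Connection faults, so the submit path refuses to transparently open a new Pipe before ms_handle_reset() has run. A rough sketch with illustrative names, not the real Connection/Pipe code:

```cpp
#include <cassert>

// Sketch of the failed-flag idea: mark the lossy Connection failed
// immediately on fault, and check that flag before reopening.
struct Connection {
  bool lossy  = false;
  bool failed = false;   // set when the lossy Pipe is torn down
};

inline void on_lossy_fault(Connection &con) {
  con.failed = true;     // disconnect now; don't wait for the reaper
}

inline bool may_open_new_pipe(const Connection &con) {
  if (con.lossy && con.failed)
    return false;        // don't resurrect; queued messages are dropped
  return true;
}
```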
Sage Weil
472d14f717 msg/DispatchQueue: fix locking in dispatch thread
The locking was awkward with locally delivered messages: we dropped the
dq lock, took the inq lock, re-took the dq lock, and so on.  We would
also take + drop + retake + drop the dq lock when queueing events.
Blech!

Instead:

 * simplify the queueing of cons for the local_queue
 * dequeue the con under the original dq lock
 * queue events under a single dq lock interval, by telling
   local_queue.queue() we already have it.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:32 -07:00
Sage Weil
9d94ed1caa test_stress_watch: verify that the watch operation isn't slow
Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:32 -07:00
Sage Weil
7b398a5d9c msgr: indicate whether clients are lossy
We need to know whether the client is lossy before we connect to the peer
in order to know whether to deliver a RESET event or not on connection
failure.  Lossy clients get one, lossless do not.

And in any case, we know ahead of time, so we may as well indicate as much
in the Policy.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:32 -07:00
Sage Weil
525830cd0b msgr: do not discard_queue in Pipe reaper
The IncomingQueue can live beyond the Pipe.  In particular, there is no
reason not to deliver messages we've received on this connection even
though the socket has errored out.

Separate incoming queue discard from outgoing, and only do the latter in
the reaper.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:32 -07:00
Sage Weil
8966f71ae5 msg/IncomingQueue: make the pipe parent informational only
Use this pointer only for debug output prefix; do not dereference, as we
may live beyond the original parent.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:32 -07:00
Sage Weil
999c506d5e msg/DispatchQueue: give IncomingQueue ref to queue
We want to be able to queue an event (e.g., RESET) and deliver it even
after the Pipe is destroyed.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:31 -07:00
Sage Weil
5a62dfef3d msg/DispatchQueue: hold lock in IncomingQueue::discard_queue()
This prevents races with the dispatch thread, among other things.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:31 -07:00
Sage Weil
35b7bca357 msg: kill tcp.{cc,h}
Move the remaining comparator into msg_types.h and kill this off.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:36:31 -07:00
Sage Weil
5ecc5bce18 msg/DispatchQueue: cleanup debug prefix
Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:22:27 -07:00
Sage Weil
89b07f4703 msg/Pipe: move tcp_* functions into Pipe class
This lets us print nice debug prefixes.  It also calls BS on the
Pipe vs tcp.cc separation.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:22:24 -07:00
Sage Weil
d034e46dd8 msgr: move Accepter into separate .cc
Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:22:21 -07:00
Sage Weil
3e98617c3a msg/Pipe: get_state_name()
Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:22:15 -07:00
Sage Weil
f78a4010f1 msgr: rework accept() connect_seq/race handling
We change a couple of key things here:

 * If there is a matching connect_seq and the existing connection is in OPEN (or
   STANDBY; same thing + a failure), we send a RETRY_SESSION and ask the peer to
   bump their connect_seq.  This handles the case where there was a race, our
   end successfully opened, but the peer's racing attempt was slowly processed.
 * We always reply with connect_seq + 1.  This handles the above case
   more cleanly, and lets us use the same code path.

Also avoid duplicating the RETRY_SESSION path with a goto.  Beautiful!

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-20 18:22:10 -07:00
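The two accept()-side rules above can be sketched as a small decision function. The states and reply tags here are simplified stand-ins for the real wire-protocol constants, not the actual Pipe::accept() logic:

```cpp
#include <cassert>

// Rough sketch of the connect_seq race handling described above.
enum State { STATE_OPEN, STATE_STANDBY, STATE_CONNECTING };
enum Tag   { TAG_RETRY_SESSION, TAG_READY };

struct Reply { Tag tag; unsigned cseq; };

// existing: state of the already-registered pipe to this peer.
inline Reply accept_decision(State existing, unsigned peer_cseq,
                             unsigned our_cseq) {
  if (peer_cseq == our_cseq &&
      (existing == STATE_OPEN || existing == STATE_STANDBY)) {
    // Race: our outgoing attempt already won; ask the peer to bump its
    // connect_seq and retry.
    return {TAG_RETRY_SESSION, our_cseq + 1};
  }
  // Always reply with connect_seq + 1 so both paths share the code.
  return {TAG_READY, peer_cseq + 1};
}
```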
Sage Weil
a542d89ee5 mds: fix race in connection accept; fix con replacement
We solve two problems with this patch.  The first is that the messenger
will now reuse an existing session's Connection for a new connection
attempt, which means that we don't want to replace session->connection
when we are validating an authorizer.  Instead, set it only if it is not
already set.
We also want to avoid a race where:

 - mds recovers, replays Sessions with no con's
 - multiple connection attempts for the same session race in the msgr
 - both are authorized, but out of order
 - Session->connection gets set to the losing attempt's Connection*

Instead, we take advantage of an accept event that is called only for
accepted winners.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-10 19:04:42 -07:00
Sage Weil
68bad03b2c msgr: queue accept event when pipe is accepted
Queue an event when an incoming connection is accepted.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-10 13:59:24 -07:00
Sage Weil
fab6e824c4 msg/DispatchQueue: queue and deliver accept events
Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-10 13:59:21 -07:00
Sage Weil
d4ef004e64 dispatcher: new 'accept' event type
Create a new event type when we successfully accept a connection.  This is
distinct from the authorizer verification, which may happen for multiple
racing connection attempts.  In contrast, this will only happen on those
that win the race(s).  I don't think this is that important for stateless
servers (OSD, MON), but it is important for the MDS to ensure that it keeps
its Session con reference pointing to the most recently-successful
connection attempt.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-10 13:30:44 -07:00
Sage Weil
1a4a133071 msgr: drop unnecessary (un)locking on queuing connection events
This used to be necessary because the pipe_lock was used when queueing
the pipe in the dispatch queue.  Now that is handled by IncomingQueue's
own lock, so these can be removed.

By no longer dropping the lock, we eliminate a whole category of potential
hard-to-debug races.  (Not that any were observed, but now we don't need to
worry about them.)

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-10 13:30:33 -07:00
Sage Weil
e84565d9e8 msgr: move dispatch thread into DispatchQueue
The DispatchQueue class now completely owns message delivery.  This is
cleaner and lets us drop the redundant destination_stopped flag from
msgr (DQ has its own stop flag).

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-10 13:30:33 -07:00
Sage Weil
9e291bae96 msgr: simplify checks for queueing connection events
Looking through git history it is not clear exactly how these checks
came to be.  They seem to have grown during the multiple-entity-per-rank
transition a few years back.  I'm not fully convinced they are necessary,
but we will keep them regardless.

Push checks into DispatchQueue and look at the local stop flag to
determine whether these events should be queued.  This moves us away from
the kludgey SimpleMessenger::destination_stopped flag (which will soon
be removed).

Also move the refcount futzing into the DispatchQueue methods.  This makes
the callers much simpler.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-10 13:30:33 -07:00
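The idea of pushing the check into DispatchQueue can be sketched as follows: queueing an event consults only the queue's own stop flag, replacing the old SimpleMessenger::destination_stopped kludge. Names here are illustrative, not the real DispatchQueue API:

```cpp
#include <cassert>

// Sketch: the queue itself decides whether a connection event may be
// queued, based on its local stop flag.
struct DispatchQueue {
  bool stop = false;
  int  queued = 0;
  bool queue_connect_event() {
    if (stop)
      return false;      // shutting down: drop the event
    ++queued;            // the real code also handles refcounts here
    return true;
  }
};
```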
Sage Weil
bafcbdeb74 msgr: remove unnecessary accept check
We don't need to worry about racing with shutdown here; the cleanup
procedure will stop the accepter thread before cleaning up all the
pipes.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-10 13:30:33 -07:00
Sage Weil
bffd46c56a msgr: remove obsolete dead path
This hasn't triggered in years.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-10 13:30:33 -07:00
Sage Weil
34908140bf msgr: uninline ctor and dtor
Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-10 13:30:33 -07:00