957 Commits

Author SHA1 Message Date
Samuel Just
fc9b8ef06b MOSDOp: drop ops vector in clear_data()
Otherwise, clear_data on MOSDOp will leave essentially
all of the buffers intact.  This is a problem since the
OpTracker mechanism relies on being able to keep the message
around without keeping around the data.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-03-01 14:53:52 -08:00
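A minimal sketch of the idea in this commit, not the actual MOSDOp code: if a message's ops each hold their own buffers, clearing only the top-level payload leaves most of the memory pinned. The types `SketchOp` and `SketchMessage` and the helper `payload_bytes()` are hypothetical stand-ins, with `std::string` used in place of Ceph's buffer types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a single OSD op and the buffers it carries.
struct SketchOp {
  int opcode;
  std::string indata;   // stands in for an input buffer
  std::string outdata;  // stands in for an output buffer
};

// Hypothetical stand-in for a message that carries a vector of ops.
struct SketchMessage {
  std::string description;    // cheap metadata a tracker may still want
  std::string data;           // top-level payload
  std::vector<SketchOp> ops;  // each op holds its own buffers

  // Total payload bytes still referenced by this message.
  std::size_t payload_bytes() const {
    std::size_t n = data.size();
    for (const SketchOp& op : ops)
      n += op.indata.size() + op.outdata.size();
    return n;
  }

  // Release the payload but keep the message object and its metadata.
  void clear_data() {
    data.clear();
    ops.clear();  // without this, every op's buffers would stay alive
    assert(payload_bytes() == 0);
  }
};
```

After `clear_data()`, the message can still be kept around for tracking (here, via `description`) without holding any payload.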
Samuel Just
6cd64a507d messages,osd: add EC messages and associated types
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-02-17 20:12:15 -08:00
Samuel Just
4c1338f457 SimpleMessenger: init_local_connection whenever my_inst changes
This is necessary to correctly handle messages to self.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-02-17 14:24:55 -08:00
Josh Durgin
d389e617c1 msg/Pipe: add option to restrict delay injection to specific msg type
This makes it possible to test timeouts reliably by delaying certain
messages effectively forever, but still being able to e.g. connect and
authenticate to the monitors.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2014-02-10 12:53:12 -08:00
Noah Watkins
4c4e1d0d47 libc++: use ceph:: namespaced data types
Switches the implementation of smart pointers and unordered map/set to
use the ceph:: versions.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2014-01-18 14:03:20 -08:00
Noah Watkins
6342d05195 pipe: handle missing MSG_MORE and MSG_NOSIGNAL
On OSX (and currently any platform missing the MSG_MORE
macro) the MSG_MORE optimization is disabled. The MSG_NOSIGNAL flag is
available on OSX but is called SO_NOSIGPIPE and must be set via
setsockopt.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2014-01-04 09:18:04 -08:00
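The portability approach described above might look roughly like the following sketch. The `SKETCH_*` constants and the two helper functions are hypothetical names; only the `MSG_MORE`, `MSG_NOSIGNAL`, and `SO_NOSIGPIPE` macros and the `send`/`setsockopt` calls are real. Where `MSG_MORE` is missing the flag degrades to 0, and where `MSG_NOSIGNAL` is missing the socket option is set once per socket instead.

```cpp
#include <sys/types.h>
#include <sys/socket.h>
#include <cassert>
#include <cstddef>

// Fall back to 0 where a flag is missing, so callers can always OR it in.
#ifdef MSG_MORE
static const int SKETCH_MSG_MORE = MSG_MORE;
#else
static const int SKETCH_MSG_MORE = 0;      // optimization disabled
#endif

#ifdef MSG_NOSIGNAL
static const int SKETCH_MSG_NOSIGNAL = MSG_NOSIGNAL;
#else
static const int SKETCH_MSG_NOSIGNAL = 0;  // rely on SO_NOSIGPIPE instead
#endif

// Call once per socket. Returns 0 on success, -1 on error.
inline int sketch_suppress_sigpipe(int fd) {
  assert(fd >= 0);
#if !defined(MSG_NOSIGNAL) && defined(SO_NOSIGPIPE)
  int on = 1;
  return ::setsockopt(fd, SOL_SOCKET, SO_NOSIGPIPE, &on, sizeof(on));
#else
  return 0;  // the per-call MSG_NOSIGNAL flag covers this platform
#endif
}

// Send a buffer, optionally hinting that more data follows.
inline ssize_t sketch_send(int fd, const void* buf, std::size_t len, bool more) {
  assert(fd >= 0);
  const int flags = SKETCH_MSG_NOSIGNAL | (more ? SKETCH_MSG_MORE : 0);
  return ::send(fd, buf, len, flags);
}
```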
Sage Weil
d77101ccf8 Merge pull request #1016 from bydsky/bugfix
Fix Issue #6992: stop the accepter and mark all pipes down before rebind

Backport: emperor, dumpling
Reviewed-by: Sage Weil <sage@inktank.com>
2013-12-31 08:18:15 -08:00
Xihui He
f8e413f9c7 msgr: fix rebind() race
stop the accepter and mark all pipes down before rebind to avoid race

Fixes: #6992

Signed-off-by: Xihui He <xihuihe@gmail.com>
2013-12-31 10:57:57 +08:00
Noah Watkins
1fec818f7c spinlock: add generic spinlock implementation
Adds a ceph_spinlock_t implementation that will use pthread_spinlock_t
if available, and otherwise reverts to pthread_mutex_t. Note that this
spinlock is not intended to be used in process-shared memory.

Switches implementation in:

  ceph_context
  SimpleMessenger
  atomic_t

Only ceph_context initialized its spinlock with PTHREAD_PROCESS_SHARED.
However, there does not appear to be any instance in which CephContext
is allocated in shared memory, and thus can use the default private
memory space behavior.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
2013-12-28 14:43:14 -08:00
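A rough sketch of such a fallback wrapper, under the assumption that the build system defines a macro (here the hypothetical `HAVE_PTHREAD_SPINLOCK`) when `pthread_spinlock_t` is available. The `sketch_spin_*` names are illustrative and are not the `ceph_spinlock_t` API itself. The lock is process-private in both branches, matching the note above.

```cpp
#include <pthread.h>
#include <cassert>

// HAVE_PTHREAD_SPINLOCK is an assumed, build-system-provided macro.
struct sketch_spinlock_t {
#ifdef HAVE_PTHREAD_SPINLOCK
  pthread_spinlock_t lock;
#else
  pthread_mutex_t lock;  // fallback where spinlocks are unavailable
#endif
};

inline int sketch_spin_init(sketch_spinlock_t* l) {
  assert(l != nullptr);
#ifdef HAVE_PTHREAD_SPINLOCK
  return pthread_spin_init(&l->lock, PTHREAD_PROCESS_PRIVATE);
#else
  return pthread_mutex_init(&l->lock, nullptr);  // private by default
#endif
}

inline int sketch_spin_lock(sketch_spinlock_t* l) {
#ifdef HAVE_PTHREAD_SPINLOCK
  return pthread_spin_lock(&l->lock);
#else
  return pthread_mutex_lock(&l->lock);
#endif
}

inline int sketch_spin_unlock(sketch_spinlock_t* l) {
#ifdef HAVE_PTHREAD_SPINLOCK
  return pthread_spin_unlock(&l->lock);
#else
  return pthread_mutex_unlock(&l->lock);
#endif
}

inline int sketch_spin_destroy(sketch_spinlock_t* l) {
#ifdef HAVE_PTHREAD_SPINLOCK
  return pthread_spin_destroy(&l->lock);
#else
  return pthread_mutex_destroy(&l->lock);
#endif
}
```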
Sage Weil
a9df335b12 msgr: debug delay_thread join
Signed-off-by: Sage Weil <sage@inktank.com>
2013-10-01 12:04:42 -07:00
Sage Weil
f45675c2a2 msg/msg_types: use proper NI_MAXSERV when formatting an IP address
May as well be pedantic about it, even though we are leaving the port
in numeric form.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
2013-09-26 09:51:56 -07:00
Roald J. van Loon
6949d221ad automake cleanup: implementing non-recursive make
- Enabling subdir objects
- Created a Makefile-env.am with basic automake init
- Created .am files per subdir, included from src/Makefile.am

Signed-off-by: Roald J. van Loon <roaldvanloon@gmail.com>
2013-09-08 00:11:09 +02:00
Sage Weil
a286090602 common/crc32c: refactor a bit
- the generic function without the _le suffix (useless)
- use a static global so that detection only happens once
- make the structure a bit cleaner to plug in new implementations

Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-20 16:42:53 -07:00
Christophe Courtaut
e1666d0400 Fix compilation -Wmismatched-tags warnings
Keep consistency in the code to not generate warnings of this type.

Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
2013-08-09 11:58:58 +02:00
Sage Weil
053659d05e msg/Pipe: work around incorrect features reported by earlier versions
If we see a peer reporting features ~0ull, we know they are deluded in a
particular way and should infer what features they *actually* have.  Do
this right when the features come over the wire to catch all users.

Fixes: #5655
Signed-off-by: Samuel Just <sam.just@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-19 08:09:09 -07:00
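As an illustration only, the fix-up at the point where features arrive off the wire could be sketched as below. The function name and both constants are hypothetical; in particular, `SKETCH_INFERRED_FEATURES` is a placeholder mask and does not reflect the set of features the real code infers.

```cpp
#include <cassert>
#include <cstdint>

// Value an affected peer is assumed to advertise: every bit set.
static const uint64_t SKETCH_BOGUS_FEATURES = ~0ull;
// Placeholder for what such a peer really supports (hypothetical value).
static const uint64_t SKETCH_INFERRED_FEATURES = (1ull << 20) - 1;

// Apply once, right where the features are read, so all users see the result.
inline uint64_t sketch_fixup_features(uint64_t wire_features) {
  if (wire_features == SKETCH_BOGUS_FEATURES)
    return SKETCH_INFERRED_FEATURES;
  return wire_features;
}

inline void sketch_fixup_example() {
  assert(sketch_fixup_features(~0ull) == SKETCH_INFERRED_FEATURES);
  assert(sketch_fixup_features(0x5) == 0x5);  // sane values pass through
}
```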
Sage Weil
f0feabe81f Message,OSD,PG: make Connection::features private
Use has_feature() method too.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-19 08:09:09 -07:00
Sage Weil
fd53d53a42 msgr: mark addr-based [lazy_]send_message and get_connection deprecated
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-18 15:05:22 -07:00
Sage Weil
f7d0403f87 msg/SimpleMessenger: remove duplicated interface docs
Document these in the interface, not the implementation; having two copies
clutters the header and invites them to get out of sync.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-18 15:02:08 -07:00
Sage Weil
5e7241792f msgr: update docs for mark_down, mark_down_all semantics
* RESET events
* note that the reset detection only happens if it is enabled in the
  policy.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-18 15:02:08 -07:00
Sage Weil
e1e0d5056d msgr: generate reset event on mark_down to addr (not con)
If the caller is marking down an addr, they presumably don't have the
Connection* handy, so we should generate a reset event to help them
clean up con <-> session ref cycles.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-18 15:02:08 -07:00
Sage Weil
723d691f7a msg/Pipe: do not hold pipe_lock for verify_authorizer()
We shouldn't hold the pipe_lock while doing the ms_verify_authorizer
upcalls.

Fix by unlocking a bit earlier, and verifying our state is still correct
in the failure path.

This regression was introduced by ecab4bb9513385bd765cca23e4e2fadb7ac4bac2.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-07-18 12:17:08 -07:00
Greg Farnum
1a84411209 msgr: fix a typo/goto-cross from dd4addef2d
We didn't build or review carefully enough!

Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-07-17 15:23:12 -07:00
Sage Weil
16568d9e1f msg/Pipe: a bit of additional debug output
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-17 14:39:04 -07:00
Sage Weil
ecab4bb951 msg/Pipe: hold pipe_lock during important parts of accept()
Previously we did not bother with locking for accept() because we were
not visible to any other threads.  However, we need to close accepting
Pipes from mark_down_all(), which means we need to handle interference.

Fix up the locking so that we hold pipe_lock when looking at Pipe state
and verify that we are still in the ACCEPTING state any time we retake
the lock.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-17 14:39:04 -07:00
Sage Weil
687fe888b3 msgr: close accepting_pipes from mark_down_all()
We need to catch these pipes too, particularly when doing a rebind(),
to avoid them leaking through.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-17 14:39:04 -07:00
Sage Weil
dd4addef2d msgr: maintain list of accepting pipes
New pipes exist in a sort of limbo before we know who the peer is and
add them to rank_pipe.  Keep a list of them in accepting_pipes for that
period.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-17 14:39:04 -07:00
Sage Weil
994e2bf224 msgr: adjust nonce on rebind()
We can have a situation where:

 - we have a pipe to a peer
 - pipe goes to standby (on peer)
 - we rebind to a new port
 - ....
 - we rebind again to the same old port
 - we connect to peer

and get reattached to the ancient pipe from two instances back.  Avoid that
by picking a new nonce each time we rebind.

Add 1,000,000 each time so that the port is still legible in the printed
output.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-17 14:38:57 -07:00
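The nonce adjustment can be sketched as follows. The function and constant names are hypothetical, and the example assumes the starting nonce is below one million so that its original digits stay readable after each bump.

```cpp
#include <cassert>
#include <cstdint>

// Step added on every rebind, per the commit message above.
static const uint64_t SKETCH_NONCE_STEP = 1000000;

// Return the nonce to use after a rebind.
inline uint64_t sketch_next_nonce(uint64_t nonce) {
  return nonce + SKETCH_NONCE_STEP;
}

inline void sketch_nonce_example() {
  const uint64_t first = 6789;  // arbitrary starting nonce
  const uint64_t second = sketch_next_nonce(first);
  const uint64_t third = sketch_next_nonce(second);
  assert(second == 1006789);    // low digits unchanged
  assert(third == 2006789);
  assert(third != first);       // the old instance is never matched again
}
```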
Sage Weil
07a0860a18 msgr: mark_down_all() after, not before, rebind
If we are shutting down all old connections and binding to new ports,
we want to avoid a sequence like:

 - close all previous connections
 - new connection comes in on old port
 - rebind to new ports
 -> connection from old port leaks through

As a first step, close all connections after we shut down the old
accepter and before we start the new one.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-17 14:36:37 -07:00
Sage Weil
ad548e72fd msg/Pipe: unlock msgr->lock earlier in accept()
Small cleanup.  Nothing needs msgr->lock for the previously larger
window.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-17 14:34:40 -07:00
Sage Weil
9f1c272618 msg/Pipe: avoid creating empty out_q entry
We need to maintain the invariant that all sub queues in out_q are never
empty.  Fix discard_requeued_up_to() to avoid creating an entry unless we
know it is already present.

This bug leads to an incorrect reconnect attempt when

 - we accept a pipe (lossless peer)
 - they send some stuff, maybe
 - fault
 - we initiate reconnect, even though we have nothing queued

In particular, we shouldn't reconnect because we aren't checking for
resets, and the fact that our out_seq is 0 while the peer's might be
something else entirely will trigger asserts later.

This fixes at least one source of #5626, and possibly #5517.

Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-17 14:34:40 -07:00
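The invariant is easy to break with a map of queues, because `operator[]` inserts an empty entry on lookup. The sketch below assumes `out_q` is a `std::map` from priority to a list of messages; `SketchMsg`, `SketchOutQ`, and the function name are hypothetical stand-ins rather than the real Pipe code. It uses `find()` so that a missing priority never creates an entry.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <map>

struct SketchMsg { uint64_t seq; };
typedef std::map<int, std::list<SketchMsg> > SketchOutQ;

// Drop messages at `priority` whose seq is <= `seq`.
inline void sketch_discard_requeued_up_to(SketchOutQ& out_q, int priority,
                                          uint64_t seq) {
  SketchOutQ::iterator it = out_q.find(priority);  // find(), not operator[]
  if (it == out_q.end())
    return;                        // nothing queued here; create nothing
  std::list<SketchMsg>& q = it->second;
  assert(!q.empty());              // invariant: sub queues are never empty
  while (!q.empty() && q.front().seq <= seq)
    q.pop_front();
  if (q.empty())
    out_q.erase(it);               // restore the invariant
}

// "Do we have anything to send?" is then just a check on the map itself.
inline bool sketch_has_queued(const SketchOutQ& out_q) {
  return !out_q.empty();
}
```

With `out_q[priority]` in place of `find()`, a discard on an empty queue would leave an empty sub queue behind, and `sketch_has_queued()` would wrongly report pending data.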
Sage Weil
579d858aab msg/Pipe: assert lock is held in various helpers
These all require that we hold pipe_lock.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-17 14:34:39 -07:00
Sage Weil
4282971d47 msg/Pipe: be a bit more explicit about encoding outgoing messages
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-15 13:48:07 -07:00
Sage Weil
495ee108db msg/Pipe: fix RECONNECT_SEQ behavior
Calling handle_ack() here has no effect because we have already
spliced sent messages back into our out queue.  Instead, pull them out
of there and discard.  Add a few assertions along the way.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-07-12 16:21:24 -07:00
Sage Weil
e390f44b4a Merge remote-tracking branch 'gh/wip-corpus' into next
Rgw bits Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
2013-07-12 13:06:35 -07:00
Sage Weil
b7c549db3e msgr: add 'ms die on old message' to help catch reconnect seq issues
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-12 11:38:40 -07:00
Sage Weil
93b3e63f43 msg/Message: use old footer for encoded message dump
This avoids the need for a conditional decoding check on ceph-dencoder,
and makes it match up with what encode_message() is doing.  The new(ish)
fields in the footer (the signature) are not useful for the object
corpus.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-11 11:42:02 -07:00
Samuel Just
0e93dd93e5 Merge branch 'wip-small-object-recovery'
Conflicts:
	src/include/ceph_features.h

Reviewed-by: Sage Weil <sage@inktank.com>
Fixes: #5278
2013-07-08 16:53:17 -07:00
Samuel Just
264dbf3f9e messages/,osd_types: add messages for Push, PushReply, Pull
Signed-off-by: Samuel Just <sam.just@inktank.com>
2013-07-08 16:43:31 -07:00
Sage Weil
a9906641a1 mon: implement simple 'scrub' command
Compare all keys within the sync'ed prefixes across members of the quorum
and compare the key counts and CRC for inconsistencies.

Currently this is a one-shot inefficient hammer.  We'll want to make this
work in chunks before it is usable in production environments.

Protect with a feature bit to avoid sending MMonScrub to mons who can't
decode it.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2013-07-08 15:34:32 -07:00
Samuel Just
bc3e2f09f8 Pipe: use uint64_t not unsigned when setting features
Fixes: #5497
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Luis <joao.luis@inktank.com>
2013-07-03 13:21:28 -07:00
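A small sketch of why the type matters, assuming a platform where `unsigned` is 32 bits wide: any feature bit above bit 31 is silently dropped when a 64-bit mask passes through an `unsigned`. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Buggy pattern: the 64-bit mask is narrowed on the way through.
inline uint64_t sketch_store_narrow(uint64_t features) {
  unsigned tmp = static_cast<unsigned>(features);
  return tmp;
}

// Fixed pattern: keep the full width end to end.
inline uint64_t sketch_store_wide(uint64_t features) {
  uint64_t tmp = features;
  return tmp;
}

inline void sketch_width_example() {
  const uint64_t features = (1ull << 40) | 1ull;
  assert(sketch_store_wide(features) == features);
  // Where unsigned is narrower than 64 bits, bit 40 is lost.
  assert(sizeof(unsigned) >= sizeof(uint64_t) ||
         sketch_store_narrow(features) != features);
}
```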
Sage Weil
57dc73627e msgr: clear_pipe+queue reset when replacing lossy connections
We already handle the lossless replacement and lossy fault paths, but
not the lossy replacement.  This fixes an assert(!cleared) in the
reaper.  Adjust comments appropriately.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-23 18:09:55 -07:00
Sage Weil
9586305a23 msgr: reaper: make sure pipe has been cleared (under pipe_lock)
All paths to pipe shutdown should have cleared the con->pipe reference
already.  Assert as much.

Also, do it under pipe_lock!

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-23 15:10:24 -07:00
Sage Weil
ec612a5bda msg/Pipe: goto fail_unlocked on early failures in accept()
Instead of duplicating an incomplete cleanup sequence (that does not
clear_pipe()), goto fail_unlocked and do the cleanup in a generic way.
s/rc/r/ while we are here.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-23 15:10:24 -07:00
Sage Weil
afafb87e84 msgr: clear con->pipe inside pipe_lock on mark_down
We need to do this under protection of the pipe_lock.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-23 15:10:24 -07:00
Sage Weil
5fc1dabfb3 msgr: clear_pipe inside pipe_lock on mark_down_all
Observed a segfault in rebind -> mark_down_all -> clear_pipe -> put that
may have been due to a racing thread clearing the connection_state pointer.
Do the clear_pipe() call under the protection of pipe_lock, as we do in
all other contexts.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-23 15:10:24 -07:00
Sage Weil
597e4398b5 msgr: queue reset when marking down pipes on shutdown
This lets the callbacks clean up ref cycles.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-13 10:53:06 -07:00
Sage Weil
ea6880f8a2 msg/DispatchQueue: do not discard queued events on stop
When the shutdown/stop flag is set, continue to work through the queue.
Process events, but discard messages.  This avoids the loss of reset events
on shutdown that are necessary to clean up ref cycles.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-13 10:53:06 -07:00
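The stop behavior described above can be sketched as a drain loop that keeps running after the stop flag is set. `SketchItem` and `sketch_drain()` are hypothetical; the real DispatchQueue has its own item types and threading, which this single-threaded sketch leaves out.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// A queued item is either an event (such as a reset) or a message.
struct SketchItem {
  bool is_event;
  std::string name;
};

// Work through the whole queue. Once `stop` is set, messages are dropped
// but events are still delivered. Returns the names of delivered items.
inline std::vector<std::string> sketch_drain(std::deque<SketchItem>& q,
                                             bool stop) {
  std::vector<std::string> delivered;
  while (!q.empty()) {
    SketchItem item = q.front();
    q.pop_front();
    if (stop && !item.is_event)
      continue;  // discard the message, keep draining
    delivered.push_back(item.name);
  }
  assert(q.empty());
  return delivered;
}
```

Because the loop does not exit early on stop, a reset event queued behind messages is still delivered.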
Sage Weil
de64bc50f2 msgr: queue reset exactly once on any connection
Use the atomic pipe link removal as a signal that we are the one failing
the con and use that to queue the reset event.

This fixes the case where we have an open, the session gets set up via the
handle_accept callback, and then race with another connection and go into
wait + close, or just close.  In that case, fault() needs to queue a reset
event to match the accept.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-13 10:52:18 -07:00
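One way to picture the "exactly once" rule is an atomic unlink whose result tells the caller whether it was the one that removed the link. This sketch uses `std::atomic` as a stand-in; the real code may well achieve atomicity differently (for example under a lock), and all names here are hypothetical.

```cpp
#include <atomic>
#include <cassert>

struct SketchPipe {};

struct SketchConnection {
  std::atomic<SketchPipe*> pipe;   // link from connection to its pipe
  std::atomic<int> resets_queued;  // counts queued reset events

  explicit SketchConnection(SketchPipe* p) : pipe(p), resets_queued(0) {}

  // Returns true only for the caller that actually removed the link.
  bool clear_pipe() {
    return pipe.exchange(nullptr) != nullptr;
  }

  // Any failure path may call this; only the first one queues a reset.
  void fail() {
    if (clear_pipe())
      ++resets_queued;
    assert(resets_queued.load() <= 1);
  }
};
```

If two failure paths race on the same connection, both call `fail()`, but only one sees a non-null previous link and queues the reset.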
Sage Weil
26e16c008d msg/Pipe: include con ref in debug prestring
Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-13 10:52:18 -07:00
Sage Weil
eea73ab88f msg/Pipe: reset replaced pipes
This gives the ms_handle_reset call a chance to clean up (for example, by
breaking a con->priv <-> session reference cycle).

Signed-off-by: Sage Weil <sage@inktank.com>
2013-06-13 10:52:18 -07:00