33239 Commits

Author SHA1 Message Date
Greg Farnum
5268e51b79 OSD: don't share_map_incoming() directly from handle_replica_op()
Let the op_tp handle it, or our C_SendMap callback in the op_gen_wq.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:19 -07:00
Greg Farnum
ebdc097047 OSD: use the async workqueue to send OSDMap updates on dropped ops
Check whether we actually want to send a map in-line, and if we do, create
a GenContext which does so and put that in the op_gen_wq.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:19 -07:00
Greg Farnum
6c98e36f89 OSD: add an op threadpool GenContext workqueue
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:19 -07:00
Greg Farnum
9fba69a11a OSD: allow build_incremental_map_msg to fail on lookups
Since we're now building incremental map messages out-of-band from
other map updates, we need to tolerate lookup failures at the bottom
end. Do so by returning a NULL message in that case.
Handle that in send_incremental_map by looping until we get a
message back -- if we fail on the first attempt, we'll get
the OSDSuperblock again and deal with it.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:19 -07:00
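The retry behavior described in 9fba69a11a can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in rather than the actual Ceph code: `MapStore`, `Superblock`, and `MapMessage` are invented types, and the bounded `max_attempts` is an assumption added so the sketch always terminates (the commit message describes looping until a message comes back).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>

using epoch_t = uint32_t;

// Hypothetical stand-ins; none of these are the real Ceph types.
struct Superblock { epoch_t oldest_map; epoch_t newest_map; };
struct MapMessage { epoch_t first; epoch_t last; };

struct MapStore {
  std::map<epoch_t, std::string> maps;  // epoch -> encoded map
  // Derive the current map bounds from whatever is stored right now.
  Superblock superblock() const {
    assert(!maps.empty());
    return Superblock{maps.begin()->first, maps.rbegin()->first};
  }
};

// Returns nullptr when any map in (since, to] is missing from the store,
// instead of asserting on the failed lookup.
std::unique_ptr<MapMessage> build_incremental_map_msg(const MapStore& store,
                                                      epoch_t since,
                                                      epoch_t to) {
  for (epoch_t e = since + 1; e <= to; ++e) {
    if (store.maps.count(e) == 0)
      return nullptr;
  }
  return std::make_unique<MapMessage>(MapMessage{since + 1, to});
}

// Re-reads the superblock bounds on every attempt and retries when the
// build fails. The attempt limit is an assumption made for this sketch.
std::unique_ptr<MapMessage> send_incremental_map(const MapStore& store,
                                                 epoch_t since,
                                                 int max_attempts = 8) {
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    Superblock sb = store.superblock();
    if (since + 1 < sb.oldest_map)
      since = sb.oldest_map - 1;  // older maps were trimmed; start at the oldest
    if (since >= sb.newest_map)
      return nullptr;             // the peer is already up to date
    auto msg = build_incremental_map_msg(store, since, sb.newest_map);
    if (msg)
      return msg;
  }
  return nullptr;
}
```

With a store holding epochs 5 through 10, `send_incremental_map(store, 2)` would clamp to the oldest stored map and produce a message covering 5 through 10.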
Greg Farnum
0ffdeab900 OSD: fix a few map sharing bugs
1) do not share OSD maps with peers that already have them
2) do not share maps with oneself

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:19 -07:00
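The two checks listed in 0ffdeab900 amount to a small predicate. The sketch below is an assumption about the shape of such a check, not the real `should_share_map`; the `Peer` struct and the parameter names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

using epoch_t = uint32_t;

// Hypothetical view of a peer: its OSD id and the newest epoch we know it has.
struct Peer {
  int osd_id;
  epoch_t last_sent_epoch;
};

// Decide whether sending our map to this peer is worthwhile.
bool should_share_map(int my_osd_id, epoch_t my_epoch, const Peer& peer) {
  if (peer.osd_id == my_osd_id)
    return false;  // 2) do not share maps with oneself
  if (peer.last_sent_epoch >= my_epoch)
    return false;  // 1) the peer already has this map
  return true;
}
```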
Greg Farnum
0fbaa160c1 OSD: move should_share_map and share_map_incoming to OSDService
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:19 -07:00
Greg Farnum
399e67f884 OSD: pass a pointer to last_sent_epoch instead of the whole Session
We don't use any other part of the Session, and this interface will
be easier to move out of the OSD class.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:19 -07:00
Greg Farnum
c97f96837a OSD: share map updates in the op_tp threads instead of the main dispatch thread
Sharing maps can require disk accesses and other slow work. We don't want to do that
in our fast path, so do it in OSD::dequeue_op instead of OSD::handle_op. We're
cheating slightly and still doing it in handle_op if no op actually gets queued,
but we're going to put those into a separate work queue next. We'll also be
moving all the functions necessary for this into OSDService so that our completion
struct doesn't need to be a friend to OSD.
To make this easier, we're adding send_map_update and sent_epoch members to
OpRequest.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:18 -07:00
Greg Farnum
d78988bf41 OSD: refactor handle_op error handling cases
We move our map version-checking code earlier (to dispatch_op) and refactor
our other fail-to-dispatch cases. This is friendlier for the no-lock
message processing we'll use with fast dispatch.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:18 -07:00
Greg Farnum
276a4fe422 OSD: change Session handling around _share_map_incoming
Move responsibility for the reference up to _share_map_incoming's caller,
and start using the Session::sent_epoch_lock. This looks a little silly
now, but we're going to split up the decide-to-send-maps and send-maps steps
and don't want to block in the decide-to-send step, so we need some
pretty flexible locking up at this level.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:18 -07:00
Greg Farnum
1e3c4959a9 OSD: add a Session::sent_epoch_lock
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:18 -07:00
Greg Farnum
667769c624 OSD: simplify _share_map_incoming based on _should_share_map()
Also, remove the bool return code since nobody looks at it.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:18 -07:00
Greg Farnum
b53cec43d1 OSD: add _should_share_map function
Just copy _share_map_incoming and rip out all the parts that actually
update data structures.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:18 -07:00
Greg Farnum
b2187ac935 OSD: use an OSDMapRef& and require the Session* in _share_map_incoming
You can pass in a NULL Session*, but both callers do that; and using
an OSDMapRef& reduces shared_ptr copies.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:18 -07:00
Greg Farnum
9835866e8e OSD: use safe params in map-sharing functions
We were previously using unprotected access to OSD members.

Unfortunately, this does not make them completely safe: we are looking up
maps asynchronously from when we got access to the cached map bounds, and
so the OSD could delete a map out from underneath us. Fixing that will
require some kind of map bounds lock. :/

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:17 -07:00
Samuel Just
b199194db1 OSD::send_incremental_map: use service superblock so we can avoid locking osd_lock
TODO: make it actually safe by dealing with build_incremental_map_msg()

Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-05-05 15:29:17 -07:00
Samuel Just
812c67236d OSD::_share_map_incoming: pass osdmap in explicitly
We'll want to be able to use this method without the osd_lock. Note
that we can't do so yet -- we call send_incremental_map, which is not
safe to call unlocked.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:17 -07:00
Greg Farnum
2f97f4776f OSD: protect state member with a Spinlock
This member was previously protected by the osd_lock (although setting
SHUTDOWN was synchronized with the heartbeat lock, too), but we need
to read it for fast dispatch, so protect it under its own lock at all times.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:17 -07:00
Greg Farnum
a94a64d9d0 OSD: protect access to boot_epoch, up_epoch, bind_epoch
We need to access these members in some call chains via fast_dispatch,
where they're otherwise unprotected.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:17 -07:00
Greg Farnum
767e94ac3d OSD: shard heartbeat_lock
heartbeat_need_update must be protected independently in order to avoid
a loop with the pg_map_lock and the PG::_lock.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:17 -07:00
Greg Farnum
9d8c797e65 OSD: Push responsibility for grabbing pg_map_lock up to callers of _remove_pg()
The atomicity requirements of other systems prevent us dropping the PG lock
inside that function, and the PG lock is ordered underneath the pg_map_lock.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:17 -07:00
Samuel Just
00d36f6e8a OSD: wake_pg_waiters atomically with pg_map update
Also, call enqueue_op directly rather than going back
through the entire dispatch machinery.
Be sure to grab the pg lock under the pg_map_lock in _open_lock_pg() to
preserve necessary lock ordering.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:16 -07:00
Samuel Just
3755318342 OSD: remove wake_all_pg_waiters
We shouldn't need this -- we check the pg waiters list on each
map.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:16 -07:00
Samuel Just
eb30f88c94 OSD: add session waiting_for_map mechanisms
This will replace the existing waiting_for_osdmap mechanism
with a per-session wait list.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:16 -07:00
Samuel Just
6d53349282 OSD: pass osdmap to handle_op and handle_replica_op
We need a map to process them, and we don't want to
take the OSD lock to access one. (And we can't just
use the service because we need all processing of
a message to be done with the same map.)

Signed-off-by: Samuel Just <sam.just@inktank.com>
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:16 -07:00
Greg Farnum
475d8319dd OSD: add a RWLock pg_map_lock
If we're going to dispatch ops without grabbing the osd lock, we need
something else to protect the pg map (and it'll be a little
contended, so use a read-write lock).

We repurpose the (previously oddly-named) _lookup_lock_pg_with_map_lock_held()
function to refer to the pg_map_lock. handle_pg_query and handle_pg_remove
switch to use that version, because they're holding pg_map_lock already and
we know the PG they're referring to exists.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:16 -07:00
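The idea behind 475d8319dd, many concurrent lookups with occasional exclusive updates, can be sketched with the standard library. This is an illustration only: it uses C++17 `std::shared_mutex` rather than Ceph's own RWLock class, and the `PG` and `PGMap` types here are hypothetical simplifications.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <utility>

// Hypothetical, heavily simplified placement group.
struct PG {
  std::mutex lock;   // per-PG lock, ordered underneath pg_map_lock
  std::string name;
};

class PGMap {
  mutable std::shared_mutex pg_map_lock;   // read-write lock guarding pgs
  std::map<int, std::shared_ptr<PG>> pgs;

public:
  // Readers take the lock in shared mode, so lookups do not block each other.
  std::shared_ptr<PG> lookup(int pgid) const {
    std::shared_lock<std::shared_mutex> l(pg_map_lock);
    auto it = pgs.find(pgid);
    if (it == pgs.end())
      return nullptr;
    return it->second;
  }

  // Writers take the lock exclusively. Returns false if pgid already exists.
  bool add(int pgid, std::string name) {
    std::unique_lock<std::shared_mutex> l(pg_map_lock);
    auto pg = std::make_shared<PG>();
    pg->name = std::move(name);
    return pgs.emplace(pgid, std::move(pg)).second;
  }

  // Returns true if an entry was removed.
  bool remove(int pgid) {
    std::unique_lock<std::shared_mutex> l(pg_map_lock);
    return pgs.erase(pgid) > 0;
  }
};
```

Returning a `shared_ptr` from `lookup` keeps the PG alive after the map lock is dropped, which is one way to avoid holding the map lock while working on a single PG.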
Samuel Just
5abbbfeb37 OSDService: add osdmap reservation mechanism
The goal here is to be able to get "reserved" refs
to next_map, and ensure that pgs won't see a newer
map until the ref is "released".  I haven't done
a cute RAII trick here yet...probably not worth
the effort.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>

Conflicts:
	src/osd/OSD.h
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:16 -07:00
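A rough sketch of the reservation idea in 5abbbfeb37 follows. It is an assumption-laden simplification: `MapReservations`, `try_publish`, and the RAII wrapper `ReservedMap` are hypothetical names, the commit explicitly says no RAII wrapper was written, and publishing here fails fast instead of waiting for outstanding reservations to drain.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>

using epoch_t = uint32_t;

// Hypothetical reservation tracker keyed by map epoch.
class MapReservations {
  std::mutex lock;
  epoch_t next_epoch;
  std::map<epoch_t, int> reserved;  // epoch -> outstanding reservation count

public:
  explicit MapReservations(epoch_t start) : next_epoch(start) {}

  // Take a "reserved" reference to the current next map.
  epoch_t reserve() {
    std::lock_guard<std::mutex> l(lock);
    ++reserved[next_epoch];
    return next_epoch;
  }

  // Drop a reservation previously returned by reserve().
  void release(epoch_t e) {
    std::lock_guard<std::mutex> l(lock);
    auto it = reserved.find(e);
    assert(it != reserved.end() && it->second > 0);
    if (--it->second == 0)
      reserved.erase(it);
  }

  // Advance to a newer epoch only if nothing older is still reserved.
  // Simplification: a blocking variant would wait here instead of failing.
  bool try_publish(epoch_t e) {
    std::lock_guard<std::mutex> l(lock);
    if (e <= next_epoch)
      return false;
    if (!reserved.empty())
      return false;  // every reserved epoch is <= next_epoch, hence older than e
    next_epoch = e;
    return true;
  }
};

// Optional RAII wrapper (an addition for illustration only).
class ReservedMap {
  MapReservations& res;
  epoch_t e;

public:
  explicit ReservedMap(MapReservations& r) : res(r), e(r.reserve()) {}
  ~ReservedMap() { res.release(e); }
  ReservedMap(const ReservedMap&) = delete;
  ReservedMap& operator=(const ReservedMap&) = delete;
  epoch_t epoch() const { return e; }
};
```

While a `ReservedMap` is alive, `try_publish` on a newer epoch returns false; once it goes out of scope the publish can proceed.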
Greg Farnum
09bf5e80d5 msgr: change the delay queue flushing semantics
Since we're doing fast_dispatch out of the delay queue, we don't want to
flush while holding the pipe lock. Instead, make flush set it up for instant
delivery, and steal the delay queue when replacing pipes. If we're shutting
down a pipe, wait until flushing has completed before doing so.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:16 -07:00
Greg Farnum
69fc6b2b66 msgr: enable fast_dispatch on local connections
We do two things:
1) Call ms_handle_fast_connect() when setting up the local connection, so
the Dispatcher can set up any state it needs
2) Move local_delivery into a separate thread from the sender's. fast_dispatch
makes this entirely necessary since otherwise we're dipping back into the
Dispatcher while holding whatever locks it held when it sent the Message.

Implementation starts with a thread and a list of messages to process and
proceeds as you'd expect from that.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:15 -07:00
Samuel Just
4e20ce1961 Messenger,DispatchQueue: add ms_fast_dispatch mechanism
This adds a Dispatcher interface allowing the implementation
to accept ms_fast_dispatch calls for some messages without
going through the DispatchQueue. To support that, we also add
1) new synchronous notifications on connect and accept events
2) a fast_preprocess mechanism

Signed-off-by: Samuel Just <sam.just@inktank.com>
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:15 -07:00
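The routing decision described in 4e20ce1961 can be sketched as follows. This is a simplified, single-threaded illustration and not the real Messenger or Dispatcher API: the method names `can_fast_dispatch`, `fast_dispatch`, and `dispatch`, and the `RecordingDispatcher` helper, are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message type.
struct Message {
  int type;
  std::string payload;
};

// Hypothetical, simplified dispatcher interface.
struct Dispatcher {
  virtual ~Dispatcher() = default;
  virtual bool can_fast_dispatch(const Message& m) const = 0;
  virtual void fast_dispatch(const Message& m) = 0;  // called inline by the receiver
  virtual void dispatch(const Message& m) = 0;       // called from the queue
};

class DispatchQueue {
  Dispatcher& d;
  std::deque<Message> queue;

public:
  explicit DispatchQueue(Dispatcher& disp) : d(disp) {}

  // Messages the dispatcher accepts for fast dispatch bypass the queue.
  void deliver(Message m) {
    if (d.can_fast_dispatch(m))
      d.fast_dispatch(m);
    else
      queue.push_back(std::move(m));
  }

  // Stand-in for the dispatch thread: hand queued messages over in order.
  void drain() {
    while (!queue.empty()) {
      Message m = std::move(queue.front());
      queue.pop_front();
      d.dispatch(m);
    }
  }

  std::size_t queued() const { return queue.size(); }
};

// Example dispatcher that fast-dispatches only messages of type 1.
struct RecordingDispatcher : Dispatcher {
  std::vector<std::string> fast, slow;
  bool can_fast_dispatch(const Message& m) const override { return m.type == 1; }
  void fast_dispatch(const Message& m) override { fast.push_back(m.payload); }
  void dispatch(const Message& m) override { slow.push_back(m.payload); }
};
```

Because `fast_dispatch` runs in the delivering thread, an implementation of it should avoid blocking or taking heavily contended locks, which is the motivation for the locking changes in the later commits of this series.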
Samuel Just
a62db614a6 DispatchQueue: factor out pre_dispatch and post_dispatch
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:15 -07:00
Greg Farnum
1379c031d5 OSD: remove never-activated while loop from send_incremental_map
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:15 -07:00
Greg Farnum
dd3d023a74 OSD: rename gen_wq, schedule_work, and PG_QueueAsync to include "recovery"
These all hook into the recovery thread pool and need to make that obvious.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:15 -07:00
Greg Farnum
e0ac34a08a OSD: remove unused push_wq
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:15 -07:00
Samuel Just
ec163579a2 OSD: replace handle_pg_scan, handle_pg_backfill with handle_replica_op
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:15 -07:00
Greg Farnum
63cc1ec193 OSD: add handle_osd_map debug output
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:14 -07:00
Samuel Just
37fac29ca0 OSD::_share_map_incoming: line wrap debug output
Formatting only.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:14 -07:00
Greg Farnum
78f310d853 PG: constify the init() function params
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:14 -07:00
Samuel Just
816b10ed8c RWLock: assert pthread function return values
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
2014-05-05 15:29:14 -07:00
Yan, Zheng
24c5ea8df0 osd: check blacklisted clients in ReplicatedPG::do_op()
The OSD checks whether a client is blacklisted only when it receives an
OSD request. It's possible for the request's sender to get blacklisted
while the request is in some waiting list.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
(cherry picked from commit f92677c5b2)
2014-05-03 15:14:39 -07:00
Sage Weil
b4dfd3d578 Merge pull request #1740 from ceph/wip-8155
mon: OSDMonitor: disallow nonsensical cache-mode transitions

Reviewed-by: Sage Weil <sage@inktank.com>
2014-05-03 15:13:37 -07:00
Sage Weil
c64b67b56c ceph-object-corpus: rebase onto firefly corpus
Signed-off-by: Sage Weil <sage@inktank.com>
2014-05-03 07:59:28 -07:00
Sage Weil
af8a5298d3 Merge pull request #1762 from yuyuyu101/wip-8282
Fix clone problem

Backport: firefly
Reviewed-by: Sage Weil <sage@inktank.com>
2014-05-03 07:41:48 -07:00
Sage Weil
ca116a3e31 Merge pull request #1752 from ceph/wip-da-SCA-fixes-20140501
Various fixes from SCA

Reviewed-by: Sage Weil <sage@inktank.com>
2014-05-03 06:49:56 -07:00
Sage Weil
00868514ec Merge pull request #1755 from eile/master
Fix out of source builds

Reviewed-by: Sage Weil <sage@inktank.com>
2014-05-03 06:42:04 -07:00
Stefan Eilemann
8bd4e58242 Fix out of source builds
Signed-off-by: Stefan Eilemann <Stefan.Eilemann@epfl.ch>
2014-05-03 12:02:02 +02:00
Haomai Wang
3aee1e0ffe Fix clone problem
When a clone happens, the original header is also updated in GenericObjectMap,
so the new header wrapper (StripObjectHeader) should be updated too.

Fix #8282
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
2014-05-03 12:55:17 +08:00
Joao Eduardo Luis
fd970bbc95 mon: OSDMonitor: disallow nonsensical cache-mode transitions
Fixes: 8155

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
2014-05-03 03:42:19 +01:00
David Zafman
3e31458444 Merge pull request #1735 from ceph/wip-8113
Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-05-02 19:26:30 -07:00
Samuel Just
4aa93dd12c Merge pull request #1698 from ceph/wip-snapmapper-debug
osd/SnapMapper: debug

Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-05-02 16:52:36 -07:00