Commit Graph

20247 Commits

Author SHA1 Message Date
Sage Weil
6e3fb20dec Merge remote-tracking branch 'gh/wip_osd_threading' 2012-07-05 17:20:14 -07:00
Samuel Just
09af670b1d PG,ReplicatedPG: on_removal must handle repop and watcher state
on_removal is now in ReplicatedPG in order to handle watcher state
and repop state.  Additionally, workqueue dequeues are handled already
in OSD::_remove_pg.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 17:18:55 -07:00
Samuel Just
691741985a OSDMonitor: disable cluster snapshot
The map handling changes broke cluster snapshot support.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 17:18:55 -07:00
Samuel Just
8e93e8b00a OSD: ensure that OpSequencer lives through on_commit callback
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 17:18:55 -07:00
Samuel Just
816d424727 ReplicatedPG.cc: C_OSD_CommittedPushedObject move pg->put() to finish
This should clarify the ownership of the pg ref.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 17:18:55 -07:00
Samuel Just
fe14c181da OSD::PeeringWQ::_dequeue(PG*) drop pg refs
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 17:18:55 -07:00
Samuel Just
0475ee45ca OSD,PG::replica_scrub: move msg->put() into queue process
This clarifies the ownership of the reference.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 17:18:55 -07:00
Samuel Just
bdf09f2007 OSD,ReplicatedPG::snap_trimmer: pg->put() in process, not snap_trimmer()
This clarifies responsibility for the reference.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 17:18:55 -07:00
Samuel Just
cab7b75d30 OSD: drop pg refcounts in OpWQ::_dequeue(PG*)
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 17:18:54 -07:00
Samuel Just
868168a5fb OSD: clean up recovery_wq queueing and ref counting
Previously, we tended to explicitly remove the pg from the queue using
remove_myself on the xlist::item.  This causes us to drop a reference
count.  Manipulating the recovery_wq is now accomplished through the
recovery_wq interface, which also handles pg ref counting.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 17:18:51 -07:00
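
The idea in the recovery_wq commit above, letting the queue interface own the reference bookkeeping instead of each caller, can be illustrated with a minimal sketch. It is not Ceph code: `RefCountedSketch` and `RefQueueSketch` are hypothetical names, the real PG and work queue types are far richer, and locking is omitted entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for a ref-counted PG; only the counter is modeled.
struct RefCountedSketch {
  int refs = 1;  // the creator holds the first reference
  void get() { ++refs; }
  void put() {
    assert(refs > 0);
    --refs;
  }
};

// Hypothetical queue wrapper (an assumption, not the recovery_wq API):
// the queue holds exactly one reference for each item it contains.
class RefQueueSketch {
public:
  // Take a reference on behalf of the queue, then store the item.
  void queue(RefCountedSketch* pg) {
    pg->get();
    items_.push_back(pg);
  }

  // Remove a specific item if present and drop the queue's reference.
  // Returns false, and leaves the count untouched, if it was not queued.
  bool dequeue(RefCountedSketch* pg) {
    for (auto it = items_.begin(); it != items_.end(); ++it) {
      if (*it == pg) {
        items_.erase(it);
        pg->put();
        return true;
      }
    }
    return false;
  }

  std::size_t size() const { return items_.size(); }

private:
  std::deque<RefCountedSketch*> items_;
};
```

Because every removal goes through `dequeue`, the get/put pairs stay balanced, which is much harder to guarantee when callers unlink list items directly.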
Ross Turk
c70392a80d doc: minor typo
Signed-off-by: Ross Turk <ross@inktank.com>
2012-07-05 15:29:23 -07:00
Ross Turk
4d7bb07561 doc: update copyright notice in footer
Signed-off-by: Ross Turk <ross@inktank.com>
2012-07-05 15:24:42 -07:00
John Wilkins
57bc8da9c7 doc: minor updates to the reStructuredText file.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-07-05 14:01:45 -07:00
John Wilkins
0659f7c594 doc: minor cleanup.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-07-05 14:00:22 -07:00
John Wilkins
1c9e1c614c doc: Publishing as described. Still requires some verification and QA.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
2012-07-05 13:47:45 -07:00
Samuel Just
7e26d6df10 PG: C_PG_MarkUnfoundLost put pg in finish
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:02 -07:00
Samuel Just
31db8ed08d OSD::activate_map: don't publish map until pgs in deleted pools have been removed
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:02 -07:00
Samuel Just
7f2354c76d doc/scripts/gen_state_diagram.py: make parser a bit more forgiving
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:02 -07:00
Samuel Just
9fc5db8c96 ReplicatedPG::op_applied: update last_update_applied iff !aborted
scrub state and last_update_applied will have been reset during
the interval change.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:02 -07:00
Samuel Just
4ce17cca2e test/encoding/types.h: disable pg_query_t encoding test
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:02 -07:00
Samuel Just
99c23b693f OSD: split notify|info|query messages for old clients
Old clients do not expect mixed epoch compound messages.  Thus, we
send each sub-message independently.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:02 -07:00
Samuel Just
193f18f2a9 FileStore: delete source collection if not replaying collection_rename
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:02 -07:00
Samuel Just
f0b2310f84 ReplicatedPG: RepModify track epoch_started and bail on interval change
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:02 -07:00
Samuel Just
7b5d8e8c37 ReplicatedPG: on_activate for a peer might happen before flush
We don't ensure for a peer that the flush completes before activation,
merely that we don't serve any ops until flush completes.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:02 -07:00
Samuel Just
87d1cdb5f4 OSD: _remove_pg must not ruin iterator consistency
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:02 -07:00
Samuel Just
311a061e0d OSD: move watch into OSDService
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:01 -07:00
Samuel Just
442b5583bf PG: pass activate epoch with Activate event
This allows us to pass into activate() the epoch in which the
message triggering activation occurred, allowing us to mark
the activate committed callback with the right query_epoch.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:01 -07:00
Samuel Just
f9282e6c6d Revert "osd: check against last_peering_reset in _activate_committed"
This reverts commit 86aa07d7a9.
2012-07-05 10:15:01 -07:00
Samuel Just
392df3b722 Revert "osd: reset last_peering_interval on replica activate"
This reverts commit 17114f266a.
2012-07-05 10:15:01 -07:00
Samuel Just
1b558fba0e OSD: write_info/log during process_peering_events, do_recovery
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:01 -07:00
Samuel Just
c6db1b2ee2 PG: delay ops in do_request, not queue_op
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:01 -07:00
Samuel Just
9b182d20e6 OSD: maybe_update_heartbeat_peers, don't print pg
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:01 -07:00
Samuel Just
0ee3d87f4f OSD: process_peering_event check for new map on each pg
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:01 -07:00
Samuel Just
c1f2a8026f OSD: peering_wq is now a BatchWorkQueue
process_peering_events now handles multiple pgs at once to better
batch up notifies, etc.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:01 -07:00
Samuel Just
d8a68e76ea OSD: do_(notifies|infos|queries) must now be passed a map
This removes the need to call them from within the osd lock.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:01 -07:00
Samuel Just
3ca6359ce5 common/WorkQueue.h: add BatchWorkQueue
Rather than dispatching one item at a time to process, etc,
BatchWorkQueue dispatches up to a configurable number of
items.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:01 -07:00
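
The batching behaviour described in the BatchWorkQueue commit above can be sketched as follows. This is an illustration under stated assumptions, not the class added to common/WorkQueue.h: `BatchQueueSketch`, `enqueue`, and `dequeue_batch` are hypothetical names, and the thread pool and locking of a real work queue are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical, simplified batching queue; not Ceph's BatchWorkQueue.
template <typename T>
class BatchQueueSketch {
public:
  // max_batch is the configurable upper bound on items per dispatch.
  explicit BatchQueueSketch(std::size_t max_batch) : max_batch_(max_batch) {
    assert(max_batch > 0);
  }

  void enqueue(const T& item) { items_.push_back(item); }

  // Remove and return up to max_batch_ items in FIFO order, so a worker
  // can process several items per call instead of one at a time.
  std::vector<T> dequeue_batch() {
    std::vector<T> batch;
    while (!items_.empty() && batch.size() < max_batch_) {
      batch.push_back(items_.front());
      items_.pop_front();
    }
    return batch;
  }

  bool empty() const { return items_.empty(); }

private:
  std::size_t max_batch_;
  std::deque<T> items_;
};
```

With a limit of 3 and five queued items, the first `dequeue_batch()` call returns three items and the second returns the remaining two.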
Samuel Just
5c0e8b465f OSD: bail out of do_recovery if no longer primary and active
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:01 -07:00
Samuel Just
5dc45f7728 PG: PG now stores its own PGPool
Otherwise, we need to synchronize access to the shared PGPool objects.
The wasted memory is probably preferable to synchronization overhead.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:01 -07:00
Samuel Just
b242c565c0 OSD: on pg_removal, project_pg_history to get current interval
First, we don't really want to remove the pg if we can use it.  Second,
there might be messages in the pg peering queue for the next interval.
If one of those happens to be an info request or notify, we would lose
the peering message.

If the message falls in the current interval as determined by the
current osdmap, then we know that any messages currently queued must be
obsolete and can safely be discarded.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:00 -07:00
Samuel Just
a67a874b15 CrushWrapper: add locking around crush_do_rule
crush_do_rule uses a cache on the bucket objects.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:00 -07:00
Samuel Just
c7581b69bd CrushWrapper: rmaps don't need to be mutable
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:00 -07:00
Samuel Just
73f5ce9481 OSD,PG: issue pg removals in line, remove remove_list
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:00 -07:00
Samuel Just
7c1dc90a60 OSD: don't advance_pg() if pg is up-to-date
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:00 -07:00
Samuel Just
8079a489bd OSD,PG: clean up _get_or_create_pg and set interval based on msg
Previously, we set last_peering_reset based on the epoch in which the pg
is created.  We now pass the map from the query_epoch to the creation
methods to set based on that.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:00 -07:00
Samuel Just
a5bf3d71a7 OSD: lock recovery_wq before debug output on finish_recovery_op
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:00 -07:00
Samuel Just
3dcce50e2a OSD: only do_(notify|info|query) for up osd
pg may have an older map and attempt to notify|info|query on a down
osd.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:00 -07:00
Samuel Just
040a22b692 OSD: map_cache should contain const OSDMap
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:00 -07:00
Samuel Just
4fec85f26d OSD: activate_map() in handle_osd_map only when active
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:00 -07:00
Samuel Just
2552a7f430 OSD,PG: _share_map_outgoing must not require osd_lock
Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:00 -07:00
Samuel Just
35949c541c ReplicatedPG: explicitly block on not active for certain ops
Ops and some subops need to wait for active to ensure correct ordering
with respect to peering operations.

Signed-off-by: Samuel Just <sam.just@inktank.com>
2012-07-05 10:15:00 -07:00