We have a specific PGRecoveryContext type/event, even though it does nothing
more than invoke a GenContext, so that we can distinguish the event type properly.
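A rough sketch of the idea, using stand-in types (GenericContext/PGRecoveryCtx
are illustrative, not the real GenContext/PGRecoveryContext definitions): the
wrapper adds no behaviour of its own, but its distinct type is what lets the
dispatch code recognize the event kind.

  #include <functional>
  #include <iostream>

  // Stand-in for a generic completion context.
  struct GenericContext {
    virtual ~GenericContext() = default;
    virtual void finish() = 0;
  };

  // Hypothetical analogue of a dedicated recovery-context event: same work,
  // distinct type.
  struct PGRecoveryCtx : GenericContext {
    std::function<void()> fn;
    void finish() override { fn(); }
  };

  int main() {
    PGRecoveryCtx c;
    c.fn = [] { std::cout << "run recovery callback\n"; };
    GenericContext* evt = &c;
    // The queue can tell this apart from other contexts before running it.
    if (dynamic_cast<PGRecoveryCtx*>(evt))
      evt->finish();
  }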
Signed-off-by: Sage Weil <sage@redhat.com>
The mgr code is updated to send spg_t's instead of pg_t's (and is slightly
refactored/cleaned).
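For context, a simplified sketch of the distinction (not the real pg_t/spg_t
definitions): spg_t carries a shard id in addition to the pg id, which is what
matters for erasure-coded pools where each OSD holds a specific shard.

  #include <cstdint>
  #include <iostream>

  struct pg_t {
    uint64_t pool;        // pool id
    uint32_t seed;        // pg number within the pool
  };

  constexpr int8_t NO_SHARD = -1;

  struct spg_t {
    pg_t pgid;
    int8_t shard;         // EC shard held by this OSD; NO_SHARD for replicated pools
  };

  int main() {
    spg_t s{{3, 0x1a}, NO_SHARD};               // a replicated-pool PG
    std::cout << s.pgid.pool << "." << std::hex << s.pgid.seed
              << " shard=" << std::dec << int(s.shard) << "\n";
  }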
The PG events are added to the Primary state, unless we're also in the
Clean substate, in which case they are ignored.
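A minimal sketch of the intended behaviour, using a flag-based stand-in rather
than the real boost::statechart machine:

  #include <iostream>

  struct PeeringSketch {
    bool primary = true;
    bool clean = false;

    void deliver(const char* evt) {
      if (primary && !clean)
        std::cout << "reacting to " << evt << "\n";    // handled while Primary
      else
        std::cout << "ignoring " << evt << "\n";       // dropped once Clean
    }
  };

  int main() {
    PeeringSketch s;
    s.deliver("recovery event");   // handled
    s.clean = true;
    s.deliver("recovery event");   // ignored in the Clean substate
  }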
Signed-off-by: Sage Weil <sage@redhat.com>
Once we know which PGs are about to be created, we instantiate their
pg_slot and mark them waiting_pg, which blocks all incoming events until
the split completes, the PG is installed, and we call wake_pg_waiters().
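Roughly the waiting pattern, sketched with hypothetical types (not the real
OSDShard/pg_slot code): while the slot is marked waiting, incoming events are
parked on it; once the PG is installed, they are replayed in arrival order.

  #include <deque>
  #include <functional>
  #include <iostream>
  #include <utility>

  struct PGSlotSketch {
    bool waiting_for_pg = false;
    std::deque<std::function<void()>> waiting;   // parked events

    void enqueue(std::function<void()> evt) {
      if (waiting_for_pg)
        waiting.push_back(std::move(evt));       // block until the PG exists
      else
        evt();
    }

    void wake_pg_waiters() {                     // called once the PG is installed
      waiting_for_pg = false;
      while (!waiting.empty()) {
        waiting.front()();
        waiting.pop_front();
      }
    }
  };

  int main() {
    PGSlotSketch slot;
    slot.waiting_for_pg = true;                  // split target known in advance
    slot.enqueue([]{ std::cout << "event A\n"; });
    slot.enqueue([]{ std::cout << "event B\n"; });
    slot.wake_pg_waiters();                      // replay in arrival order
  }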
Signed-off-by: Sage Weil <sage@redhat.com>
Add a new MOSDPGCreate2 message that sends the spg_t (not just pg_t) and
includes only the info we need. Fast dispatch it.
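A sketch of the shape of such a message, with hypothetical field names (not the
actual MOSDPGCreate2 layout): the point is that PGs are named by spg_t and the
message carries only what the OSD needs to instantiate them.

  #include <cstdint>
  #include <iostream>
  #include <map>
  #include <tuple>

  struct spg_t {                                  // simplified stand-in
    uint64_t pool; uint32_t seed; int8_t shard;
    bool operator<(const spg_t& o) const {
      return std::tie(pool, seed, shard) < std::tie(o.pool, o.seed, o.shard);
    }
  };

  // Hypothetical stand-in for the new creation message.
  struct PGCreate2Sketch {
    uint32_t epoch = 0;                           // map epoch the message was sent at
    std::map<spg_t, uint32_t> pgs;                // pgid -> epoch it was created at
  };

  int main() {
    PGCreate2Sketch m;
    m.epoch = 120;
    m.pgs[{5, 0x3, -1}] = 118;
    for (const auto& [pgid, created] : m.pgs)
      std::cout << pgid.pool << "." << std::hex << pgid.seed << std::dec
                << " create at epoch " << created << "\n";
  }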
Signed-off-by: Sage Weil <sage@redhat.com>
Queue a null event tagged with create_info, eliminating the special
legacy path.
These are still not fast dispatch because we need an spg (not a pg) to queue
an event, and we need a current osdmap in order to calculate that. That
isn't possible (or a good idea) in fast dispatch. In a subsequent patch we'll
create a new pg create message that includes the correct information and
can be fast dispatched, allowing this path to die off post-nautilus.
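To illustrate why that lookup needs a current osdmap, here is a heavily
simplified, hypothetical sketch (resolve() and its fields are stand-ins; the
real mapping goes through the pool type and acting set in the OSDMap):

  #include <cstdint>
  #include <iostream>
  #include <optional>
  #include <vector>

  struct pg_t { uint64_t pool; uint32_t seed; };
  struct spg_t { pg_t pgid; int8_t shard; };
  constexpr int8_t NO_SHARD = -1;

  // Hypothetical, heavily simplified stand-in for the OSDMap lookup.
  struct OSDMapSketch {
    bool pool_is_ec = false;
    std::vector<int> acting;                 // acting set for this PG, by position

    std::optional<spg_t> resolve(pg_t pgid, int whoami) const {
      if (!pool_is_ec)
        return spg_t{pgid, NO_SHARD};        // replicated: no shard needed
      for (size_t i = 0; i < acting.size(); ++i)
        if (acting[i] == whoami)
          return spg_t{pgid, int8_t(i)};     // shard == position in the acting set
      return std::nullopt;                   // this OSD doesn't hold the PG
    }
  };

  int main() {
    OSDMapSketch osdmap{true, {7, 3, 11}};
    if (auto s = osdmap.resolve({5, 0x1f}, 3))
      std::cout << "queue as shard " << int(s->shard) << "\n";
  }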
Also, improve things so that we ack the pg creation only after the PG has
gone active, meaning it is fully replicated (to at least min_size OSDs).
Signed-off-by: Sage Weil <sage@redhat.com>
This actually puts the remaining peering events into fast dispatch. The
only event still handled outside fast dispatch is the pg create from the mon.
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/pull/16779/head:
mds: cleanup MDCache::open_snaprealms()
mds: make sure snaptable version > 0
mds: don't consider CEPH_INO_LOST_AND_FOUND as base inode
mds: replace MAX() with std::max()
tools/cephfs: make cephfs-data-scan create snaprealm for base inodes
qa/cephfs: don't run TestSnapshots.test_kill_mdstable on kclient
qa/cephfs: adjust check of 'cephfs-table-tool all show snap' output
mds: don't warn unconnected snaprealms in cluster log
mds: update CInode/CDentry's first according to global snapshot seq
qa/cephfs: add tests for snapclient cache
qa/cephfs: add tests for snaptable transaction
mds: add asok command that dumps cached snap infos
qa/cephfs: add tests for multimds snapshot
client: don't mark snap directory complete when its dirstat is empty
qa/workunits/snaps: add snaprealm split test
mds: make sure mds has uptodate mdsmap before checking 'allows_snaps'
client: fix incorrect snaprealm when adding caps
qa/workunits/snaps: add hardlink snapshot test
mds: add incompat feature and bump protocol for snapshot changes
mds: detach inode with single hardlink from global snaprealm
mds: record hardlink snaps in inode's snaprealm
mds: attach inode with multiple hardlinks to dummy global snaprealm
mds: cleanup rename code
mds: ensure xlocker has uptodate lock state
mds: simplify SnapRealm::build_snap_{set,trace}
mds: record global last_created/last_destroyed in snaptable
mds: pop projected snaprealm before inode's parent changes
mds: keep isnap lock in sync state
mds: handle mksnap vs resolve_snapname race
mds: cleanup snaprealm past parents open check
mds: rollback snaprealms when rolling back slave request
mds: send updated snaprealms along with slave requests
mds: explicit notification for snap update
mds: send snap related messages centrally during mds recovery
mds: synchronize snaptable caches when mds recovers
mds: introduce MDCache::maybe_finish_slave_resolve()
mds: notify all mds about prepared snaptable update
mds: record snaps in old snaprealm when moving inode into new snaprealm
mds: cache snaptable in snapclient
mds: recover snaptable client when mds enters resolve state
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/21003/head:
client: lookup . on non-directory inode
client: avoid may_lookup for lookup . and lookup ..
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Zheng Yan <zyan@redhat.com>
This is a big commit that lays out the infrastructure changes to fast
dispatch the remaining peering events. It's hard to separate it all out
so this probably doesn't quite build; it's just easier to review as a
separate patch.
- lock ordering for pg_map has changed (see the first sketch below):
    before:
      OSD::pg_map_lock
      PG::lock
      ShardData::lock
    after:
      PG::lock
      ShardData::lock
      OSD::pg_map_lock
- queue items are now annotated with whether they can proceed without a
  pg at all (e.g., a query) or can instantiate a pg (e.g., notify, log,
  etc.); see the second sketch below.
- There is some wonkiness around getting the initial Initialize event to
a newly-created PG. I don't love it but it gets the job done for now.
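First sketch: the new lock acquisition order, shown with plain std::mutex
stand-ins (not the real lock types); the only point is the order.

  #include <mutex>

  std::mutex pg_lock;        // PG::lock
  std::mutex shard_lock;     // ShardData::lock
  std::mutex pg_map_lock;    // OSD::pg_map_lock

  void install_pg() {
    // New order: PG::lock, then ShardData::lock, then OSD::pg_map_lock.
    std::lock_guard g1(pg_lock);
    std::lock_guard g2(shard_lock);
    std::lock_guard g3(pg_map_lock);   // now innermost
    // ... register the PG in the map while all three are held ...
  }

  int main() { install_pg(); }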
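Second sketch: the queue-item annotation idea, with hypothetical names; each
queued peering item declares what to do when the target PG doesn't exist yet.

  #include <iostream>

  enum class MissingPGPolicy {
    CanProceedWithoutPG,   // e.g., a query: can be answered with no PG at all
    CanInstantiatePG,      // e.g., a notify/log: may create the PG and deliver
    MustWait               // otherwise: park on the pg_slot until the PG exists
  };

  struct QueueItemSketch {
    const char* name;
    MissingPGPolicy on_missing;
  };

  int main() {
    QueueItemSketch items[] = {
      {"pg_query",  MissingPGPolicy::CanProceedWithoutPG},
      {"pg_notify", MissingPGPolicy::CanInstantiatePG},
    };
    for (auto& it : items)
      std::cout << it.name << " -> policy " << int(it.on_missing) << "\n";
  }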
Signed-off-by: Sage Weil <sage@redhat.com>