82978 Commits

Author SHA1 Message Date
Yan, Zheng
d46dbbebac qa/cephfs: add tests for multimds snapshot
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:42:29 +08:00
Yan, Zheng
81adc4ebe6 client: don't mark snap directory complete when its dirstat is empty
MDS has trouble tracking dirstat for snap inodes. A snap directory
inode's dirstat can be inaccurate.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:42:29 +08:00
Yan, Zheng
8da3283dda qa/workunits/snaps: add snaprealm split test
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:42:29 +08:00
Yan, Zheng
b6f5f343eb mds: make sure mds has uptodate mdsmap before checking 'allows_snaps'
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:42:29 +08:00
Yan, Zheng
c00ee8b743 client: fix incorrect snaprealm when adding caps
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:42:29 +08:00
Yan, Zheng
35af091a04 qa/workunits/snaps: add hardlink snapshot test
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:42:29 +08:00
Yan, Zheng
838bd091b5 mds: add incompat feature and bump protocol for snapshot changes
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:42:29 +08:00
Yan, Zheng
a5fdda678b mds: detach inode with single hardlink from global snaprealm
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:42:28 +08:00
Yan, Zheng
46bb0f448a mds: record hardlink snaps in inode's snaprealm
An inode with multiple hardlinks is attached to the global snaprealm.
Before modifying a hardlink, record the snaps that reference the
hardlink. When all hardlinks are removed, the stray inode gets
moved into a normal snaprealm. By checking the recorded snaps,
mds knows whether any snaps still reference the stray inode.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:42:28 +08:00
Yan, Zheng
0f31d49de2 mds: attach inode with multiple hardlinks to dummy global snaprealm
The dummy global snaprealm includes all snapshots in the filesystem.
For any later snapshot, mds will COW the inode and preserve snap data.
These snap data will cover any possible snapshot on remote linkages
of the inode.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:42:28 +08:00
Yan, Zheng
a9c3664304 mds: cleanup rename code
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:42:28 +08:00
Yan, Zheng
bdcbc8a5ad mds: ensure xlocker has uptodate lock state
This simplifies trans-authority rename. Master can prepare the new
snaprealm for the source inode even if it is not the auth mds.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:42:27 +08:00
Yan, Zheng
813ba65ecf mds: simplify SnapRealm::build_snap_{set,trace}
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:41:28 +08:00
Yan, Zheng
be75beadc9 mds: record global last_created/last_destroyed in snaptable
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:41:28 +08:00
Yan, Zheng
7b8c844c3b mds: pop projected snaprealm before inode's parent changes
When creating a new snaprealm, we need to split its parent snaprealm's
inodes_with_caps. If the new snaprealm is created during rename, the
inode's original snaprealm's inodes_with_caps should be split. So in
rename/rmdir cases, we should pop the projected snaprealm before the
inode's parent changes.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:41:28 +08:00
Yan, Zheng
2d60956c8b mds: keep isnap lock in sync state
Unlike locks of other types, isnap locks and dentry locks in unreadable
state can block path traversal, so the isnap lock should be in sync
state as much as possible.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:41:27 +08:00
Yan, Zheng
280ab1c8cf mds: handle mksnap vs resolve_snapname race
In a multimds setup, it's possible that an mds receives the snap update
message after receiving client requests that look up the newly created
snapshot.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:41:27 +08:00
Yan, Zheng
471046f449 mds: cleanup snaprealm past parents open check
For new format snaprealms, there is no need to open past parents;
SnapRealm::have_past_parents_open() always returns true. In a multimds
setup, mds may use a snaprealm without opening past parents, so the
assertion in SnapRealm::check_cache() is wrong.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:41:27 +08:00
Yan, Zheng
34682e4475 mds: rollback snaprealms when rolling back slave request
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:41:27 +08:00
Yan, Zheng
853e00a64e mds: send updated snaprealms along with slave requests
rmdir and rename may create/update snaprealms. If snaprealms are
created/updated, encode the updated snaprealms in slave requests
and dentry unlink messages, so that when rmdir or rename finishes,
snaprealms in different mds are in sync.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 18:41:26 +08:00
Yan, Zheng
f842c95d92 mds: explicit notification for snap update
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 17:46:55 +08:00
Yan, Zheng
93e7267757 mds: send snap related messages centrally during mds recovery
Send CEPH_SNAP_OP_SPLIT and CEPH_SNAP_OP_UPDATE messages to
clients centrally in MDCache::open_snaprealms().

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 17:46:55 +08:00
Yan, Zheng
3b5da9c613 mds: synchronize snaptable caches when mds recovers
The basic idea is:

1. For recovering mds:
   Learn other mds' pending snaptable commits from resolve messages.
   Load snaptable cache from snapserver when resolve is done.

2. For survivor mds:
   Refresh snaptable cache from snapserver when cluster is in resolving
   state.
   Learn recovering mds' pending snaptable commits from resolve messages.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 17:46:55 +08:00
Yan, Zheng
02889cf8d3 mds: introduce MDCache::maybe_finish_slave_resolve()
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 17:46:55 +08:00
Yan, Zheng
30301f7fa5 mds: notify all mds about prepared snaptable update
After a snaptable update is prepared, push the update preparation to
all active snaptable clients, then send the reply to the update
initiator. This way, the initiator knows that all mds have recorded the
update preparation in their caches. When committing the snaptable
update, the initiator notifies all mds about the commit. A bystander
mds' snaptable cache gets synchronized when it receives the notification.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 17:46:55 +08:00
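
The prepare path described in 30301f7fa5 can be sketched as a toy model. This is not Ceph code: the class `SnapServerSketch`, the message names `notify_prepare` and `agree`, the `outbox` list, and the choice to skip notifying the initiator itself are all assumptions made for illustration. The only behaviour taken from the commit message is that the initiator gets its reply after every other active snaptable client has been told about the preparation.

```python
class SnapServerSketch:
    """Toy model of the prepare path: notify bystanders, then agree."""

    def __init__(self, active_clients):
        self.active_clients = set(active_clients)
        self.pending = {}    # tid -> bookkeeping for one prepared update
        self.next_tid = 1
        self.outbox = []     # (destination, message type, tid), in send order

    def handle_prepare(self, initiator, update):
        tid = self.next_tid
        self.next_tid += 1
        # Assumption: the initiator already knows about its own update.
        waiting = self.active_clients - {initiator}
        self.pending[tid] = {"initiator": initiator, "update": update,
                             "waiting": set(waiting), "agreed": False}
        for mds in sorted(waiting):
            self.outbox.append((mds, "notify_prepare", tid))
        self._maybe_agree(tid)
        return tid

    def handle_ack(self, mds, tid):
        self.pending[tid]["waiting"].discard(mds)
        self._maybe_agree(tid)

    def _maybe_agree(self, tid):
        entry = self.pending[tid]
        # Reply to the initiator only once every bystander has acknowledged.
        if not entry["waiting"] and not entry["agreed"]:
            entry["agreed"] = True
            self.outbox.append((entry["initiator"], "agree", tid))


server = SnapServerSketch(active_clients=["mds.a", "mds.b", "mds.c"])
tid = server.handle_prepare("mds.a", ("create", "snap1"))
server.handle_ack("mds.b", tid)
agreed_early = ("mds.a", "agree", tid) in server.outbox
server.handle_ack("mds.c", tid)
agreed_late = ("mds.a", "agree", tid) in server.outbox
print(agreed_early, agreed_late)  # False True
```

In this sketch the 'agree' reply is withheld until the last acknowledgement arrives, which is what lets the initiator assume every mds has recorded the preparation.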
Yan, Zheng
a810e9f1aa mds: record snaps in old snaprealm when moving inode into new snaprealm
To get effective snaps in past snaprealms, we just need to filter out
deleted snaps by using global snap infos. This avoids the complexity
of opening 'past parents'

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 17:46:55 +08:00
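
The filtering idea in a810e9f1aa can be illustrated with a small sketch. It is not taken from the Ceph source: the function name `effective_past_snaps`, the snap ids, and the set-based representation are hypothetical. It only shows that intersecting the snaps recorded at move time with the globally known live snaps gives the effective set without opening any past parent.

```python
def effective_past_snaps(recorded_snaps, global_live_snaps):
    """Return the recorded snap ids that still exist globally, sorted."""
    return sorted(set(recorded_snaps) & set(global_live_snaps))


# Snaps recorded in the old snaprealm when the inode was moved (made-up ids).
recorded = {2, 5, 9}
# Snaps the global snap infos still list; snap 5 has since been deleted.
live = {2, 7, 9, 11}

print(effective_past_snaps(recorded, live))  # [2, 9]
```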
Yan, Zheng
ca1126fdb0 mds: cache snaptable in snapclient
The idea is to cache both snap infos and pending updates in snapclient.
The snapclient also tracks updates that are being committed; it applies
these commits to its cached snap infos. Steps to update snaptable are:

 - mds.x acquires locks (xlock on snaplock of affected snaprealm inode)
 - mds.x prepares snaptable update. (sends prepare to snapserver and
   waits for 'agree' reply)
 - snapserver sends notification about the update to all mds and waits
   for ACKs. (not implemented by this patch)
 - snapserver sends 'agree' reply to mds.x
 - mds.x journals corresponding
 - mds.x commits the snaptable update and notifies all mds that it
   commits that update. Then mds drops locks.

When receiving a commit notification, mds applies the committing
update to its cached snap infos. This way, cached snap infos get
synchronized before the snaplock becomes readable.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 17:46:55 +08:00
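
The client-side cache described in ca1126fdb0 can be modelled roughly as follows. This is an illustrative sketch, not the actual snapclient: the class `SnapClientCacheSketch`, its method names, and the tuple encoding of updates are all assumptions. It shows only the behaviour stated in the commit message, namely that a prepared update is remembered as pending and is applied to the cached snap infos once the commit notification arrives.

```python
class SnapClientCacheSketch:
    """Toy cache of snap infos plus updates that are prepared but not committed."""

    def __init__(self, snaps=None):
        self.cached_snaps = dict(snaps or {})  # snapid -> snapshot name
        self.pending = {}                      # tid -> update tuple

    def notify_prepare(self, tid, update):
        # Remember the update; it must not be visible in cached_snaps yet.
        self.pending[tid] = update

    def notify_commit(self, tid):
        # Apply the committed update to the cached snap infos.
        update = self.pending.pop(tid)
        kind = update[0]
        if kind == "create":
            _, snapid, name = update
            self.cached_snaps[snapid] = name
        elif kind == "destroy":
            self.cached_snaps.pop(update[1], None)
        else:
            raise ValueError(f"unknown update kind: {kind!r}")


cache = SnapClientCacheSketch({2: "weekly"})
cache.notify_prepare(1, ("create", 4, "daily"))
visible_before_commit = 4 in cache.cached_snaps
cache.notify_commit(1)
print(visible_before_commit, cache.cached_snaps)  # False {2: 'weekly', 4: 'daily'}
```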
Yan, Zheng
eca532278c mds: recover snaptable client when mds enters resolve state
This is preparation for a later change that caches the snaptable in
snapclient and syncs the cached snaptable between mds.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2018-02-09 17:46:55 +08:00
Kefu Chai
01d350f388
Merge pull request #20382 from tchaikov/wip-fix-ftbfs-store-test
test/store_test: fix FTBFS as Sequencer is removed

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2018-02-09 14:39:49 +08:00
Kefu Chai
3a7cac0684 test/store_test: fix FTBFS as Sequencer is removed
Signed-off-by: Kefu Chai <kchai@redhat.com>
2018-02-09 13:05:25 +08:00
Kefu Chai
5757d6f118
Merge pull request #20294 from rzarzynski/wip-bs-drop-std_function
os/bluestore: avoid overhead of std::function in blob_t.

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-02-09 10:45:24 +08:00
Kefu Chai
24d1b2edb4
Merge pull request #19232 from socketpair/precision
mgr: increase time resolution of Commit/Apply OSD latencies.

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-02-09 10:44:27 +08:00
Kefu Chai
34c3af45a5
Merge pull request #18804 from majianpeng/bluestore-collection-prealloc
os/bluestore: Prealloc memory avoid realloc in list_collection.

Reviewed-by: Sage Weil <sage@redhat.com>
2018-02-09 10:43:24 +08:00
Kefu Chai
7894961e9f
Merge pull request #18494 from ifed01/wip-stupidalloc-fix2
os/bluestore: do not assert if BlueFS rebalance is unable to allocate sufficient space

Reviewed-by: Sage Weil <sage@redhat.com>
2018-02-09 10:42:43 +08:00
Kefu Chai
a82cad5839
Merge pull request #18343 from shinobu-x/sk-remove-osdmap
mon/PGMap: Remove unnecessary header

Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-02-09 10:41:50 +08:00
Patrick Donnelly
8c95cc5144
Merge PR #19954 into master
* refs/pull/19954/head:
	test/encoding: refactor to avoid escaping shell magic
	mds: minor refactor of SimpleLock
	mds: track Capability in mempool
	mds: move CInode container members to mempool
	mds: move CDentry container members to mempool
	mds: move CDir container members to mempool
	mds: put MDSCacheObject compact_map in mempool
	common: use size_t for object size
	mds: convert to allocator agnostic string_view
	mds: simplify initialization
	compact_*: support mempool allocated containers

Reviewed-by: Zheng Yan <zyan@redhat.com>
2018-02-08 18:17:15 -08:00
Jason Dillaman
7b7a7d55b1
Merge pull request #20007 from mogeb/steady-clock-librbd
librbd: use steady clock to measure elapsed time in AioCompletion

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2018-02-08 19:20:21 -05:00
Jason Dillaman
5029ee8a60
Merge pull request #20008 from mogeb/steady-clock-tools-rbd
tools/rbd: use steady clock in bencher

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2018-02-08 19:20:07 -05:00
Jason Dillaman
7b79305f60
Merge pull request #20218 from shun-s/wip-speedup-diskusage-resize
librbd: speed up object map disk usage and resize

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2018-02-08 19:19:52 -05:00
Jason Dillaman
6d5652125c
Merge pull request #20311 from Songweibin/wip-group-snap-ls
rbd: do not show title if there is no group snapshot

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2018-02-08 19:19:35 -05:00
Jason Dillaman
6c3d5aa149
Merge pull request #20349 from trociny/wip-22932
rbd-mirror: fix potential infinite loop when formatting status message

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2018-02-08 19:19:12 -05:00
Patrick Donnelly
84ae0ce5fc
Merge PR #20310 into master
* refs/pull/20310/head:
	qa: adjust cephfs full test for kclient

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2018-02-08 13:30:04 -08:00
Patrick Donnelly
65217e5363
Merge PR #20148 into master
* refs/pull/20148/head:
	mds: reset connection's priv when marking down connection
	mds: fix session reference leak

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2018-02-08 13:05:26 -08:00
Patrick Donnelly
b0afc33811
Merge PR #20155 into master
* refs/pull/20155/head:
	osdc/Journaler: make sure flush() writes enough data

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2018-02-08 13:05:19 -08:00
Patrick Donnelly
e8e72570fe
Merge PR #20190 into master
* refs/pull/20190/head:
	mon: allow removal of tier of ec overwritable pool

Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: João Eduardo Luís <joao@suse.de>
2018-02-08 13:05:12 -08:00
Patrick Donnelly
06b176b362
Merge PR #20200 into master
* refs/pull/20200/head:
	client: add cap_dirtier_uid/gid to CapSnap

Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
Reviewed-by: Zheng Yan <zyan@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2018-02-08 13:05:05 -08:00
Patrick Donnelly
cb03b5e7f5
Merge PR #20246 into master
* refs/pull/20246/head:
	mds: remove extra 0x in ino prints
	mds: print inode number not CInode ptr
2018-02-08 13:04:54 -08:00
Alfredo Deza
7a5777183e
Merge pull request #20367 from ceph/simple-custom-cluster
ceph-volume: adds custom cluster name support to simple

Reviewed-by: Alfredo Deza <adeza@redhat.com>
2018-02-08 08:48:41 -05:00
Andrew Schoen
7f1dc6b3ab ceph-volume: use a custom cluster name in simple functional tests
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-02-08 07:09:57 -06:00
Xie Xingguo
f1cb504b2a
Merge pull request #20305 from xiexingguo/wip-more-balancer-fixes
pybind/mgr/balancer: more specific command outputs

Reviewed-by: Sage Weil <sage@redhat.com>
2018-02-08 12:30:46 +08:00