RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-01-09 20:52:09 +00:00

Author	SHA1	Message	Date
Yan, Zheng	eeb68eb33d	mds: open inode by ino This patch adds an "open-by-ino" helper. It uses the backtrace to find the inode's path and open the inode. The algorithm is: 1. Check MDS peers. If any MDS has the inode in its cache, goto step 6. 2. Fetch the backtrace. If a backtrace was previously fetched and the same backtrace is returned again, return -EIO. 3. Traverse the path in the backtrace. If the inode is found, goto step 6; if a non-auth dirfrag is encountered, goto the next step. If the inode cannot be found in its parent dir, goto step 1. 4. Request MDS peers to traverse the path in the backtrace. If the inode is found, goto step 6. If an MDS peer encounters a non-auth dirfrag, it stops traversing. If any MDS peer fails to find the inode in its parent dir, goto step 1. 5. Use the same algorithm to open the inode's parent. Goto step 3 if this succeeds; goto step 1 if it fails. 6. Return the inode's auth MDS ID. The algorithm has two main assumptions: 1. If an inode is in its auth MDS's cache, its on-disk backtrace can be out of date. 2. If an inode is not in any MDS's cache, its on-disk backtrace must be up to date. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:22 +08:00
Yan, Zheng	617f70d216	mds: move fetch_backtrace() to class MDCache We may want to fetch backtrace while corresponding inode isn't instantiated. MDCache::fetch_backtrace() will be used by later patch. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:22 +08:00
Yan, Zheng	05a7588d37	mds: remove old backtrace handling Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:22 +08:00
Yan, Zheng	39b5e76ca4	mds: update backtraces when unlinking inodes unlink moves inodes to stray dir, it's a special form of rename. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:22 +08:00
Yan, Zheng	b88c49b751	mds: bring back old style backtrace handling To queue a backtrace update, the current code allocates a BacktraceInfo structure and adds it to the log segment's update_backtraces list. The main issue with this approach is that BacktraceInfo is independent of the inode, so it is very inconvenient to find pending backtrace updates for given inodes. When exporting inodes from one MDS to another MDS, we need to find and cancel all pending backtrace updates on the source MDS. This patch brings back the old backtrace handling code and adapts it for the current backtrace format. The basic idea behind the old code is: when an inode's backtrace becomes dirty, add the inode to the log segment's dirty_parent_inodes list. Compared to the current backtrace handling, another difference is that the backtrace update is journalled in EMetaBlob::full_bit Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:22 +08:00
Yan, Zheng	c9d2e25641	mds: rename last_renamed_version to backtrace_version Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:22 +08:00
Yan, Zheng	6c721116fc	mds: journal backtrace update in EMetaBlob::fullbit The current way to journal a backtrace update is to set EMetaBlob::update_bt to true. The problem is that an EMetaBlob can include several inodes. If an EMetaBlob's update_bt is true, the journal replay code has to queue backtrace updates for all inodes in the EMetaBlob. This patch adds two new flags to class EMetaBlob::fullbit so that it can journal a backtrace update. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:22 +08:00
Yan, Zheng	03c0fe937d	mds: reorder EMetaBlob::add_primary_dentry's parameters prepare for adding new state parameter such as 'dirty_parent' Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:22 +08:00
Yan, Zheng	26effc0e58	mds: warn on unconnected snap realms When there is more than one active MDS, restarting an MDS triggers the assertion "reconnected_snaprealms.empty()" quite often. If there is no snapshot in the FS, the items left in reconnected_snaprealms should be other MDS' mdsdir. I think it's harmless. If there are snapshots in the FS, the assertion can probably catch real bugs. But at present the snapshot feature is broken, and fixing it is non-trivial. So replace the assertion with a warning. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:22 +08:00
Yan, Zheng	f3a9f4746d	mds: silence MDCache::trim_non_auth() No need to output the function's debug message to the console. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	9424298f27	mds: fix check for base inode discovery If a MDiscover message is for discovering base inode, want_base_dir should be false, path should be empty. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	c9707f636c	mds: Fix replica's allowed caps for filelock in SYNC_LOCK state For replica, filelock in LOCK_LOCK state doesn't allow Fc cap. So filelock in LOCK_SYNC_LOCK/LOCK_EXCL_LOCK state shouldn't allow Fc cap either. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	3962a7510f	mds: defer releasing cap if necessary When an inode is freezing or frozen, we defer processing MClientCaps messages and cap releases embedded in requests. The same deferral logic should also cover MClientCapRelease messages.	2013-05-28 13:57:21 +08:00
Yan, Zheng	a918e611e2	mds: fix Locker::request_inode_file_caps() After sending a cache rejoin message, the replica needs to notify the auth MDS when cap_wanted changes. But it can send the MInodeFileCaps message only after receiving the auth MDS' rejoin ack. Locker::request_inode_file_caps() has correct wait logic, but it skips sending the MInodeFileCaps message if the auth MDS is still in rejoin state. The fix is to defer sending the MInodeFileCaps message until the auth MDS is active. This makes the function's wait logic less tricky. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	2b1b6cae2d	mds: notify auth MDS when cap_wanted changes So the auth MDS can choose locks' states based on our cap_wanted. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	fc94f47b8b	mds: export CInode:mds_caps_wanted CInode:mds_caps_wanted is used to keep track of caps wanted by non-auth MDS. The auth MDS checks it when choosing locks' states. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	e21f328f1a	mds: export CInode::STATE_NEEDSRECOVER Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	882be6b1d7	mds: send slave request after target MDS is active When failure of a peer is detected, MDCache::handle_mds_failure() checks if there are requests waiting for slave replies from the failed peer, and adds them to the "wait for active peer" list. The "retry request" logic only covers slave requests sent before MDCache::handle_mds_failure() is called. If a slave request was sent while the peer isn't up, we wait for its reply forever. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	38fb2ec78b	mds: unfreeze inode after rename rollback finishes we should not wake up the unfreeze waiter while the inode is still linked to a non-auth dirfrag. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	8a1114cead	mds: remove buggy cache rejoin code I previously added code to handle a corner case of cache rejoin: an entire subtree, together with the inode the subtree root belongs to, was trimmed between sending the cache rejoin and receiving the rejoin ack. In this case, we should send a cache expire message to the subtree's auth MDS. But the code is completely broken, so remove it temporarily. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	30c68218f7	mds: fix typo in Server::do_rename_rollback Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	e8497f8087	mds: fix import cancel race Current code uses the import state to detect obsolete import discover/prep messages. It does not work for this case: cancel a subtree import, import the same subtree again, then the discover/prep message for the first import gets dispatched. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	0708d44f12	mds: fix straydn race For an unlink/rename request, the target dentry's linkage may change before all locks are acquired. So we need to check whether the existing stray dentry is valid. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	7a6ec35367	mds: fix slave commit tracking An MDS may crash after journalling a slave commit, but before sending the commit ack to the master. Later, when the MDS restarts, it will not send the commit ack to the master, so the master waits for the commit ack forever. The fix is to remove the failed MDS from requests' uncommitted slave lists. When the failed MDS recovers, its resolve message will tell the master which slave requests are not committed. The master will re-add the recovering MDS to requests' uncommitted slave lists if necessary. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	0c1ca8edda	mds: fix uncommitted master wait We may add a new waiter while the master is committing, so we should take the waiters and wake them up when the master is committed. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	5426c75d7b	mds: adjust subtree auth if import aborts in PREPPED state Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	d7b999be1b	mds: don't stop at export bounds when journaling dir context We only journal the finish of exporting subtree, so we shouldn't consider export bounds as subtree root. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	81d073fecb	mds: fix underwater dentry cleanup If the underwater dentry is a remote link, we shouldn't mark the inode clean Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:20 +08:00
Yan, Zheng	8b4e9911a4	mds: journal new subtrees created by rename this avoids creating bare dirfrags during journal replay. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:20 +08:00
Sage Weil	a6df7644b6	PendingReleaseNotes: notes about enabling HASHPSPOOL Signed-off-by: Sage Weil <sage@inktank.com>	2013-05-27 21:17:06 -07:00
Sage Weil	aa0649c66b	osdmaptool: fix cli tests Now that the default pool flags have changed. Signed-off-by: Sage Weil <sage@inktank.com>	2013-05-27 21:17:04 -07:00
Sage Weil	f0958c36fd	Merge pull request #321 from dalgaaf/wip-da-CID-727981 kv_flat_btree_async.cc: fix AioCompletion resource leak	2013-05-27 13:55:54 -07:00
Sage Weil	35a8c6160c	Merge pull request #320 from dalgaaf/wip-da-CID-727983 kv_flat_btree_async.cc: fix resource leak	2013-05-27 13:55:24 -07:00
John Wilkins	615b54c6e4	doc: Updated rgw.conf example. fixes: #4608 Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2013-05-25 15:13:01 -07:00
John Wilkins	6f935419e6	doc: Updated RGW Quickstart. Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2013-05-25 15:11:49 -07:00
John Wilkins	e59897c8b2	doc: Updated index for newer terms. Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2013-05-25 15:11:06 -07:00
Samuel Just	6d1e14e045	pg_pool_t: enable FLAG_HASHPSPOOL by default Fixes: #5160 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-05-24 16:21:48 -07:00
Danny Al-Gaaf	0f5474834a	kv_flat_btree_async.cc: fix AioCompletion resource leak Call AioCompletion::release() if the completion is no longer needed to free the resources. CID 727981 (#3 of 3): Resource leak (RESOURCE_LEAK) leaked_storage: Variable "top_aioc" going out of scope leaks the storage it points to. Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2013-05-24 14:50:34 +02:00
Danny Al-Gaaf	7b438e131b	kv_flat_btree_async.cc: fix resource leak Call AioCompletion::release() if the completion is no longer needed to free the resources. CID 727983 : Resource leak (RESOURCE_LEAK) leaked_storage: Variable "aioc" going out of scope leaks the storage it points to. Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2013-05-24 14:43:17 +02:00
Danny Al-Gaaf	9785478a2a	ceph-disk: remove unnecessary semicolons Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2013-05-24 12:46:15 +02:00
Danny Al-Gaaf	16ecae153d	ceph-disk: cast output of _check_output() Cast output of _check_output() to str() to be able to use str.split(). Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2013-05-24 12:41:11 +02:00
Danny Al-Gaaf	9429ff90a0	ceph-disk: fix undefined variable Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2013-05-24 12:33:16 +02:00
Danny Al-Gaaf	c127745cc0	ceph-disk: add missing spaces around operator Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2013-05-24 12:29:07 +02:00
Samuel Just	8c1c2d98c6	Merge branch 'wip_scrub_tphandle' into next Fixes: #5159 Reviewed-by: Sage Weil <sage@inktank.com>	2013-05-23 20:08:54 -07:00
Samuel Just	86822485e5	PG: ping tphandle during omap loop as well Signed-off-by: Samuel Just <sam.just@inktank.com>	2013-05-23 19:42:32 -07:00
Samuel Just	d62716dd4c	PG: reset timeout in _scan_list for each object, read chunk Signed-off-by: Samuel Just <sam.just@inktank.com>	2013-05-23 19:42:32 -07:00
Samuel Just	b8a25e08a6	OSD,PG: pass tphandle down to _scan_list Signed-off-by: Samuel Just <sam.just@inktank.com>	2013-05-23 19:42:32 -07:00
John Wilkins	bb407bfd10	doc: Updated Ceph FS Quick Start. Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2013-05-23 17:02:17 -07:00
John Wilkins	7c497d95db	doc: Added troubleshooting to Ceph FS index. Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2013-05-23 17:01:51 -07:00
John Wilkins	3dda794a66	doc: Added separate troubleshooting for MDS and Ceph FS. Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2013-05-23 17:01:29 -07:00
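
The retry structure described in commit eeb68eb33d (steps 1, 2, 3 and 6 of the "open-by-ino" message) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the MDS implementation: `ToyCluster`, its fields, and `open_by_ino` are hypothetical names invented here, peer traversal (step 4) and recursive parent opening (step 5) are left out, and treating a missing backtrace as -EIO is an assumption the commit message does not state.

```python
import errno


class ToyCluster:
    """Hypothetical in-memory stand-in for a set of MDS ranks (not Ceph code)."""

    def __init__(self, caches, auth, backtraces, tree):
        self.caches = caches          # rank -> set of cached inode numbers
        self.auth = auth              # ino -> rank that is authoritative for it
        self.backtraces = backtraces  # ino -> on-disk backtrace (tuple of names)
        self.tree = tree              # path tuple -> ino actually stored there


def open_by_ino(cluster, ino):
    """Return the auth rank for `ino`, or -EIO if it cannot be located."""
    last_backtrace = None
    while True:
        # Step 1: if any rank already has the inode cached, we are done.
        if any(ino in cached for cached in cluster.caches.values()):
            return cluster.auth[ino]              # step 6

        # Step 2: fetch the backtrace; seeing the same one twice means
        # the previous traversal already failed with it, so give up.
        backtrace = cluster.backtraces.get(ino)
        if backtrace is None:
            return -errno.EIO                     # assumption: no backtrace
        if backtrace == last_backtrace:
            return -errno.EIO
        last_backtrace = backtrace

        # Step 3: traverse the path recorded in the backtrace.
        if cluster.tree.get(backtrace) == ino:
            rank = cluster.auth[ino]
            cluster.caches[rank].add(ino)         # traversal loads the inode
            return rank                           # step 6
        # Lookup in the parent dir failed: loop back to step 1.


# Usage: an inode that is in no cache but has an up-to-date backtrace.
fresh = ToyCluster({0: set(), 1: set()}, {200: 1},
                   {200: ("a", "b")}, {("a", "b"): 200})
result = open_by_ino(fresh, 200)
```

The second fetch of an unchanged backtrace is what bounds the loop: under the commit's second assumption, an inode that is in no cache must have an up-to-date backtrace, so failing twice with the same one is reported as an error rather than retried forever.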


26319 Commits