Introduce a new class Capability::Import and use it to send information
about imported caps back to the exporter. This is preparation for
including the counterpart's information in cap import/export messages.
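A minimal standalone sketch of the idea; the type aliases and field
names below are illustrative stand-ins, not the actual Ceph definitions:

#include <cstdint>
#include <map>

// Illustrative stand-ins, not the real Ceph types.
using inodeno_t = uint64_t;
using ceph_seq_t = uint32_t;

// What an Import record could carry: the identity the importing MDS
// assigned to the cap, so the exporter can correlate it with its own state.
struct CapabilityImport {
  uint64_t   cap_id = 0;     // cap id chosen by the importing MDS
  ceph_seq_t issue_seq = 0;  // issue sequence at import time
  ceph_seq_t mseq = 0;       // migration sequence
};

// The importer could reply with one record per imported inode.
using ImportMap = std::map<inodeno_t, CapabilityImport>;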
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
The following sequence of events can happen when exporting inodes:
- client sends open file request to mds.0
- mds.0 handles the request and sends inode stat back to the client
- mds.0 exports the inode to mds.1
- mds.1 sends cap import message to the client
- mds.0 sends cap export message to the client
- client receives the cap import message from mds.1, but it does not
  yet have the corresponding inode in its cache, so it releases the
  imported caps
- client receives the open file reply from mds.0
- client receives the cap export message from mds.0.
At the end of this sequence, the client does not hold any caps for the
opened file.
To fix this message ordering issue, this patch introduces a new session
operation FLUSHMSG. Before exporting caps, we send a FLUSHMSG session
message to the client and wait for the acknowledgment. When receiving
the FLUSHMSG_ACK message from the client, we are sure the client has
received all messages sent previously.
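A rough sketch of that barrier, assuming simplified session and
callback types rather than the real MDS code paths:

#include <functional>
#include <utility>
#include <vector>

struct Session {
  std::vector<std::function<void()>> waiting_for_flush;  // parked continuations
};

// Hypothetical stand-in for sending the FLUSHMSG session message.
void send_flushmsg(Session&) { /* messenger->send_message(...) in the MDS */ }

// Before exporting caps: send FLUSHMSG and park the export continuation
// until the client acknowledges it.
void flush_session_messages(Session& s, std::function<void()> on_ack) {
  send_flushmsg(s);  // ordered after all earlier cap messages to this client
  s.waiting_for_flush.push_back(std::move(on_ack));
}

// FLUSHMSG_ACK means the client has received every message sent before
// the FLUSHMSG, so it is now safe to continue with the cap export.
void handle_flushmsg_ack(Session& s) {
  for (auto& cb : s.waiting_for_flush)
    cb();
  s.waiting_for_flush.clear();
}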
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Consider the following case:
- client voluntarily releases some caps through a cap update message
- mds shares the new max size by sending a cap grant message
- mds receives the cap update message
If the mds does not increase the cap sequence when sharing the max
size, it cannot determine whether the cap update message was sent
before or after the client received the cap grant message that updates
the max size.
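A toy illustration of why the sequence bump matters; the Cap struct and
function names below are assumptions, not Ceph's Capability API:

#include <cstdint>

// last_sent is the sequence stamped on the most recent message the MDS
// sent to the client for this cap.
struct Cap {
  uint32_t last_sent = 0;
};

// Sharing a new max_size now bumps the sequence and the grant carries it.
uint32_t share_max_size(Cap& cap /*, uint64_t new_max_size */) {
  return ++cap.last_sent;
}

// A cap update from the client echoes the sequence it last saw; anything
// older than last_sent was sent before the client processed the grant.
bool update_predates_grant(const Cap& cap, uint32_t seq_in_update) {
  return seq_in_update < cap.last_sent;
}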
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Encode the inode version in the auth MDS's lock messages so that the
version of replica inodes gets updated. This is important because the
client uses the inode version in the MDS reply to check whether its
cached inode is already up-to-date; it skips updating the inode if it
thinks the inode is already up-to-date.
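A simplified view of the client-side check this enables (illustrative
names, not the real client code):

#include <cstdint>

struct CachedInode {
  uint64_t version = 0;  // version of the inode the client has cached
};

// The client applies the inode stat from an MDS reply only if it carries
// a newer version; if replica inodes lag behind, a reply from a replica
// can look stale and the update is wrongly skipped.
bool should_update_inode(const CachedInode& in, uint64_t reply_version) {
  return reply_version > in.version;
}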
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
If the MDS receives a client request but finds an existing slave
request, it's possible that another MDS forwarded the request to us and
the MMDSSlaveRequest::OP_FINISH message arrives after the client
request.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Unlike locks of other types, a dentry lock in the unreadable state can
block path traversal, so it should be in the sync state as much as
possible.
This patch makes Locker::try_eval() change a dentry lock's state to
sync even when the dentry is freezing. It also makes the migrator check
imported dentries' lock states and change them to sync if necessary.
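Roughly, the policy looks like this; the enum and function below are
illustrative, not the real Locker interface:

// Whether to skip re-evaluating a lock toward sync when its container
// is freezing.
enum class LockType { DENTRY, OTHER };

bool skip_eval_while_freezing(LockType type, bool container_freezing) {
  if (!container_freezing)
    return false;
  // Dentry locks are still evaluated so they can return to sync and
  // unblock path traversal; other lock types wait for the freeze.
  return type != LockType::DENTRY;
}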
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
After importing an inode, the issued caps can be fewer than the caps
the client wants, so always re-issue caps after importing an inode.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Commit 1d86f77edf (mds: fix cross-authority rename race) introduced
rename notify, but it put the code inside the wrong bracket.
This patch also fixes a rename-notify-related bug in
MDCache::handle_mds_failure().
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
If want_xlocked becomes true, we cannot rely on a previously sent
discover, because that discover is likely blocked on the xlocked
dentry.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Since commit 310032ee81 (fix mds scatter_writebehind starvation),
rdlocking a scatter lock does not always propagate dirty fragstats to
the corresponding inode. So Server::_dir_is_nonempty() needs to check
each dirfrag's stat instead of checking the inode's dirstat.
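A sketch of the changed check with simplified stand-in types (FragStat
and dir_is_nonempty here are illustrative, not the real CDir/Server
code):

#include <cstdint>
#include <vector>

struct FragStat {
  int64_t nfiles = 0;    // entries accounted in this directory fragment
  int64_t nsubdirs = 0;
};

// Check every fragment's own stat instead of trusting the (possibly
// stale) aggregated dirstat on the inode.
bool dir_is_nonempty(const std::vector<FragStat>& dirfrags) {
  for (const auto& fg : dirfrags) {
    if (fg.nfiles > 0 || fg.nsubdirs > 0)
      return true;
  }
  return false;
}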
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
A recovering MDS may receive a strong cache rejoin from a survivor,
then the survivor restarts and the recovering MDS receives a weak cache
rejoin from the same MDS. Before processing the weak cache rejoin, we
should scour replicas added by the obsolete strong cache rejoin.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
MDSCacheObject::replica_nonce is defined as __s16, but the nonce type
in MDSCacheObject::replica_map is int. This mismatch may confuse
MDCache::handle_cache_expire().
This patch unifies the nonce type as uint32.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
The current code uses import state to detect obsolete import/export
messages. That does not work for the case where a subtree export is
cancelled, the same subtree is exported again, and then the messages
for the first export get dispatched.
This patch introduces a "transaction ID" for subtree exports. Each
subtree export has a unique TID, which is recorded in all import/export
related messages. By comparing TIDs, we can reliably detect stale
messages.
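A minimal sketch of how per-export TIDs can reject stale messages; the
type names and layout below are assumptions, not the actual Migrator:

#include <cstdint>
#include <map>

using dirfrag_t = uint64_t;  // stand-in for the real dirfrag id type

struct ExportAttempt {
  uint64_t tid = 0;  // unique id of the current export attempt
};

std::map<dirfrag_t, ExportAttempt> exports;
uint64_t last_tid = 0;

// Starting an export assigns a fresh TID; it is carried in every
// import/export message for this subtree.
uint64_t start_export(dirfrag_t df) {
  exports[df].tid = ++last_tid;
  return exports[df].tid;
}

// A message whose TID does not match the current attempt belongs to a
// cancelled or earlier attempt and is dropped.
bool message_is_stale(dirfrag_t df, uint64_t msg_tid) {
  auto it = exports.find(df);
  return it == exports.end() || it->second.tid != msg_tid;
}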
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
The current code uses several STL maps to record import/export related
state. A map lookup is required for each state access, which is not
efficient. It's better to put the import/export related state together.
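For example, the scattered maps could be folded into one record per
exporting dirfrag (illustrative layout, not the actual Migrator
fields):

#include <cstdint>
#include <map>
#include <set>

using dirfrag_t = uint64_t;   // stand-in types
using mds_rank_t = int32_t;

// One record holds everything about an in-flight export, so a single
// map lookup fetches all of it instead of consulting parallel maps.
struct export_state_t {
  int state = 0;                             // export state machine value
  mds_rank_t peer = -1;                      // target MDS
  uint64_t tid = 0;                          // export transaction id
  std::set<mds_rank_t> warning_ack_waiting;
  std::set<mds_rank_t> notify_ack_waiting;
};

std::map<dirfrag_t, export_state_t> export_state;  // replaces several maps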
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
There are two situations that result in freeze-tree deadlock. In the
first:
- mds.0 authpins an item in subtree A
- mds.0 sends request to mds.1 to authpin an item in subtree B
- mds.0 freezes subtree A
- mds.1 authpins an item in subtree B
- mds.1 sends request to mds.0 to authpin an item in subtree A
- mds.1 freezes subtree B
- mds.1 receives the remote authpin request from mds.0
  (waits because subtree B is freezing)
- mds.0 receives the remote authpin request from mds.1
  (waits because subtree A is freezing)
In the second:
- client request authpins items in subtree B
- freeze subtree B
- import subtree A which is parent of subtree B
  (authpins parent inode of subtree B, see CDir::set_dir_auth())
- freeze subtree A
- client request tries authpinning items in subtree A
  (waits because subtree A is freezing)
Enforcing an authpinning order could avoid the deadlock, but it's very
expensive. The deadlock is rare, so I think deadlock detection is more
suitable for this case.
This patch introduces freeze-tree deadlock detection. We record the
time at which we start freezing the tree; if we fail to freeze the tree
within a given duration, we cancel the freeze.
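A simplified sketch of the timeout check; the 30-second duration below
is a placeholder, not the value the MDS actually uses:

#include <chrono>

using Clock = std::chrono::steady_clock;

struct FreezingTree {
  Clock::time_point freeze_start;  // recorded when we begin freezing
  bool frozen = false;
};

void start_freeze(FreezingTree& tree) {
  tree.freeze_start = Clock::now();  // remember when freezing began
}

// Called periodically (e.g. from the MDS tick): if the tree still has
// not frozen after the allowed duration, cancel the freeze to break a
// possible authpin deadlock; the export can be retried later.
bool should_cancel_freeze(const FreezingTree& tree,
                          std::chrono::seconds timeout = std::chrono::seconds(30)) {
  return !tree.frozen && (Clock::now() - tree.freeze_start) > timeout;
}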
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Instead of assuming the pool size is 2, query it and increment it to
test "pool set data size". This allows running the test from vstart.sh
without knowing the required pool size in advance:
rm -fr dev out ; mkdir -p dev ; \
MON=1 OSD=3 ./vstart.sh -n -X -l mon osd
LC_ALL=C PATH=:$PATH CEPH_CONF=ceph.conf \
../qa/workunits/cephtool/test.sh
Signed-off-by: Loic Dachary <loic@dachary.org>
The file removal that is set up to be triggered when the script stops
must not fail if the file does not exist.
Signed-off-by: Loic Dachary <loic@dachary.org>
Instead of always returning true, the error code is set if at least one
operation fails:
EINVAL if the OSD id is invalid (osd.foobar for instance).
EBUSY if trying to remove an OSD that is up.
When used with the ceph command line, it looks like this:
ceph -c ceph.conf osd rm osd.0
Error EBUSY: osd.0 is still up; must be down before removal.
kill PID_OF_osd.0
ceph -c ceph.conf osd down osd.0
marked down osd.0.
ceph -c ceph.conf osd rm osd.0 osd.1
Error EBUSY: removed osd.0, osd.1 is still up; must be down before removal.
http://tracker.ceph.com/issues/6824 fixes #6824
Signed-off-by: Loic Dachary <loic@dachary.org>
If given no argument, ceph-disk zap should display the usage instead of
silently doing nothing. Silence can be confused with "I zapped all the
disks".
http://tracker.ceph.com/issues/6981 fixes #6981
Signed-off-by: Loic Dachary <loic@dachary.org>
The thread created to test Throttle race conditions updates a value
(throttle.get_current()) that is checked by the main gtest thread but
is not protected by a lock. Instead of adding a lock, the main thread
now checks the value after pthread_join() on the child thread.
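The resulting pattern looks roughly like this (a self-contained sketch
with a stand-in FakeThrottle, not the actual gtest):

#include <pthread.h>
#include <cassert>
#include <cstdint>

struct FakeThrottle {
  int64_t current = 0;
  int64_t get_current() const { return current; }
};

static void* worker(void* arg) {
  // The child thread updates the throttle; reading get_current() from
  // the main thread while this runs would be an unprotected data race.
  static_cast<FakeThrottle*>(arg)->current = 42;
  return nullptr;
}

int main() {
  FakeThrottle throttle;
  pthread_t tid;
  pthread_create(&tid, nullptr, worker, &throttle);
  // Join first: pthread_join() synchronizes with the child's termination,
  // so the unlocked read below sees its writes and the race is avoided.
  pthread_join(tid, nullptr);
  assert(throttle.get_current() == 42);
  return 0;
}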
http://tracker.ceph.com/issues/6679 fixes #6679
Signed-off-by: Loic Dachary <loic@dachary.org>
Reverted the Emperor versionadded to Dumpling, as the feature gets
backported.
Added default index and bucket pools to pool creation.
Added a default default_placement setting.
Added placement_pools key/value pair examples.
Added comments for re-running the procedure for the secondary region.
Signed-off-by: John Wilkins <john.wilkins@inktank.com>