RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-01-10 13:10:46 +00:00

Author	SHA1	Message	Date
Yan, Zheng	c9707f636c	mds: Fix replica's allowed caps for filelock in SYNC_LOCK state For replica, filelock in LOCK_LOCK state doesn't allow Fc cap. So filelock in LOCK_SYNC_LOCK/LOCK_EXCL_LOCK state shouldn't allow Fc cap either. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	3962a7510f	mds: defer releasing cap if necessary When inode is freezing or frozen, we defer processing MClientCaps messages and cap release embedded in requests. The same deferral logic should also cover MClientCapRelease messages.	2013-05-28 13:57:21 +08:00
Yan, Zheng	a918e611e2	mds: fix Locker::request_inode_file_caps() After sending cache rejoin message, replica needs to notify auth MDS when cap_wanted changes. But it can send MInodeFileCaps message only after receiving auth MDS' rejoin ack. Locker::request_inode_file_caps() has correct wait logic, but it skips sending MInodeFileCaps message if the auth MDS is still in rejoin state. The fix is to defer sending MInodeFileCaps message until the auth MDS is active. It makes the function's wait logic less tricky. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	2b1b6cae2d	mds: notify auth MDS when cap_wanted changes So the auth MDS can choose locks' states based on our cap_wanted. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	fc94f47b8b	mds: export CInode:mds_caps_wanted CInode:mds_caps_wanted is used to keep track of caps wanted by non-auth MDS. The auth MDS checks it when choosing locks' states. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	e21f328f1a	mds: export CInode::STATE_NEEDSRECOVER Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	882be6b1d7	mds: send slave request after target MDS is active When failure of peer is detected, MDCache::handle_mds_failure() checks if there are requests waiting for slave replies from the failed peer, and adds them to the "wait for active peer" list. The "retry request" logic only covers slave requests sent before MDCache::handle_mds_failure() is called. If a slave request was sent while peer isn't up, we wait for its reply forever. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	38fb2ec78b	mds: unfreeze inode after rename rollback finishes we should not wake up the unfreeze waiter while the inode is still linked to a non-auth dirfrag. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	8a1114cead	mds: remove buggy cache rejoin code I previously added code to handle a corner case of cache rejoin: entire subtree, together with the inode subtree root belongs to, were trimmed between sending cache rejoin and receiving rejoin ack. In this case, we should send cache expire message to the subtree's auth MDS. But the code is completely broken, remove it temporarily. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	30c68218f7	mds: fix typo in Server::do_rename_rollback Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	e8497f8087	mds: fix import cancel race Current code uses import state to detect obsolete import discover/prep message. It does not work for the case: cancel a subtree import, import the same subtree again, the discover/prep message for the first import gets dispatched. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	0708d44f12	mds: fix straydn race For unlink/rename request, the target dentry's linkage may change before all locks are acquired. So we need check if the existing stray dentry is valid. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	7a6ec35367	mds: fix slave commit tracking MDS may crash after journalling a slave commit, but before sending commit ack to the master. Later when the MDS restarts, it will not send commit ack to the master. So the master waits for the commit ack forever. The fix is remove failed MDS from requests' uncommitted slave list. When failed MDS recovers, its resolve message will tell the master which slave requests are not committed. The master will re-add the recovering MDS to requests' uncommitted slave list if necessary. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	0c1ca8edda	mds: fix uncommitted master wait We may add new waiter while the master is committing. So we should take the waiters and wake them up when the master is committed. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	5426c75d7b	mds: adjust subtree auth if import aborts in PREPPED state Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	d7b999be1b	mds: don't stop at export bounds when journaling dir context We only journal the finish of exporting subtree, so we shouldn't consider export bounds as subtree root. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:21 +08:00
Yan, Zheng	81d073fecb	mds: fix underwater dentry cleanup If the underwater dentry is a remote link, we shouldn't mark the inode clean. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:20 +08:00
Yan, Zheng	8b4e9911a4	mds: journal new subtrees created by rename this avoids creating bare dirfrags during journal replay. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-05-28 13:57:20 +08:00
Samuel Just	8c1c2d98c6	Merge branch 'wip_scrub_tphandle' into next Fixes: #5159 Reviewed-by: Sage Weil <sage@inktank.com>	2013-05-23 20:08:54 -07:00
Samuel Just	86822485e5	PG: ping tphandle during omap loop as well Signed-off-by: Samuel Just <sam.just@inktank.com>	2013-05-23 19:42:32 -07:00
Samuel Just	d62716dd4c	PG: reset timeout in _scan_list for each object, read chunk Signed-off-by: Samuel Just <sam.just@inktank.com>	2013-05-23 19:42:32 -07:00
Samuel Just	b8a25e08a6	OSD,PG: pass tphandle down to _scan_list Signed-off-by: Samuel Just <sam.just@inktank.com>	2013-05-23 19:42:32 -07:00
Yehuda Sadeh	8b3a04dec8	rgw: iterate usage entries from correct entry Fixes: #5152 When iterating through usage entries, and when user id was provided, we started at the user's first entry and not from the entry indexed by the request start time. This commit fixes the issue. Backport: bobtail Signed-off-by: Yehuda Sadeh <yehuda@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>	2013-05-23 13:11:01 -07:00
Sage Weil	87cef3d5c3	mon: drop unnecessary conditionals Signed-off-by: Sage Weil <sage@inktank.com>	2013-05-23 10:23:43 -07:00
Sage Weil	6af640517f	Merge pull request #311 from ceph/wip-5102 Reviewed-by: Sage Weil <sage@inktank.com>	2013-05-23 10:21:51 -07:00
Xiaoxi Chen	e09e94424b	modified: src/init-ceph.in fixed bug in init script, the "df" should be run on remote host by do_cmd, and use $host instead of "hostname -s" Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com> (cherry picked from commit `1dd99f0fc9`) Conflicts: src/init-ceph.in	2013-05-23 08:48:24 -07:00
Sage Weil	c2e262fc94	osd: skip mark-me-down message if osd is not up Fixes crash when the OSD has not successfully booted and gets a SIGINT or SIGTERM. Signed-off-by: Sage Weil <sage@inktank.com>	2013-05-22 15:03:50 -07:00
Sage Weil	32dc463ad4	osd, mds: shut down async signal handler on exit Signed-off-by: Sage Weil <sage@inktank.com>	2013-05-22 14:56:24 -07:00
Sage Weil	eb91f41042	messages/MOSDMarkMeDown: fix uninit field Fixes valgrind warning: ==14803== Use of uninitialised value of size 8 ==14803== at 0x12E7614: sctp_crc32c_sb8_64_bit (sctp_crc32.c:567) ==14803== by 0x12E76F8: update_crc32 (sctp_crc32.c:609) ==14803== by 0x12E7720: ceph_crc32c_le (sctp_crc32.c:733) ==14803== by 0x105085F: ceph::buffer::list::crc32c(unsigned int) (buffer.h:427) ==14803== by 0x115D7B2: Message::calc_front_crc() (Message.h:441) ==14803== by 0x1159BB0: Message::encode(unsigned long, bool) (Message.cc:170) ==14803== by 0x1323934: Pipe::writer() (Pipe.cc:1524) ==14803== by 0x13293D9: Pipe::Writer::entry() (Pipe.h:59) ==14803== by 0x120A398: Thread::_entry_func(void*) (Thread.cc:41) ==14803== by 0x503BE99: start_thread (pthread_create.c:308) ==14803== by 0x6C6E4BC: clone (clone.S:112) Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com>	2013-05-22 14:29:37 -07:00
Sage Weil	b0d64de484	Merge pull request #316 from ceph/wip-sysvinit Reviewed-by: Dan Mick <dan.mick@inktank.com>	2013-05-22 13:25:42 -07:00
Sage Weil	d81d0ea5c4	sysvinit: fix osd weight calculation on remote hosts We need to do df on the remote host, not locally. Similarly, the ceph command uses the osd key, which exists remotely; run it there. Signed-off-by: Sage Weil <sage@inktank.com>	2013-05-22 12:39:11 -07:00
Sage Weil	caa15a34cb	sysvinit: use known hostname $host instead of (incorrectly) recalculating We would need to do hostname -s on the remote node, not the local one. But we already have $host; use it! Reported-by: Xiaoxi Chen <xiaoxi.chen@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>	2013-05-22 12:39:10 -07:00
Samuel Just	0289c445be	OSDMonitor: skip new pools in update_pools_status() and get_pools_health() New pools won't be full. mon->pgmon()->pg_map.pg_pool_sum[poolid] will implicitly create an entry for poolid causing register_new_pgs() to assume that the newly created pgs in the new pool are in fact a result of a split preventing MOSDPGCreate messages from being sent out. Fixes: #4813 Backport: cuttlefish Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-05-22 10:23:25 -07:00
Joao Eduardo Luis	e15d290945	mon: Paxos: get rid of the 'prepare_bootstrap()' mechanism We don't need it after all. If we are in the middle of some proposal, then we guarantee that said proposal is likely to be retried. If we haven't yet proposed, then it's forever more likely that a client will eventually retry the message that triggered this proposal. Basically, this mechanism attempted at fixing a non-problem, and was in fact triggering some unforeseen issues that would have required increasing the code complexity for no good reason. Fixes: #5102 Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>	2013-05-22 17:12:38 +01:00
Joao Eduardo Luis	586e8c2075	mon: Paxos: finish queued proposals instead of clearing the list By finishing these Contexts, we make sure the Contexts they enclose (to be called once the proposal goes through) will behave as they were initially planned: for instance, a C_Command() may retry the command if a -EAGAIN is passed to 'finish_contexts', while a C_Trimmed() will simply set 'going_to_trim' to false. This aims at fixing at least a bug in which Paxos will stop trimming if an election is triggered while a trim is queued but not yet finished. This happens because it is the C_Trimmed() context that is responsible for resetting 'going_to_trim' back to false. By clearing all the contexts on the proposal list instead of finishing them, we stay forever unable to trim Paxos again as 'going_to_trim' will stay True till the end of time as we know it. Fixes: #4895 Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>	2013-05-22 17:10:42 +01:00
Joao Eduardo Luis	2ff23fe784	mon: Paxos: finish_proposal() when we're finished recovering Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>	2013-05-22 13:33:34 +01:00
Sage Weil	e9d20ffe19	mon: implement --extract-monmap <filename> This will make for a simpler process for http://ceph.com/docs/master/rados/operations/add-or-rm-mons/#removing-monitors-from-an-unhealthy-cluster Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit `c0268e2749`)	2013-05-21 15:14:47 -07:00
Yehuda Sadeh	d48f1edb07	rgw: protect ops log socket formatter Fixes: #4905 Ops log (through the unix domain socket) uses a formatter, which wasn't protected. Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>	2013-05-21 13:05:22 -07:00
Sage Weil	1c7b9c3505	os/LevelDBStore: fix compression selection We were always disabling compression. Fixes: #5131 Reported-by: Sylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: Sage Weil <sage@inktank.com>	2013-05-21 08:16:56 -07:00
Sage Weil	2f193fb931	debian: stop sysvinit on ceph.prerm Signed-off-by: Sage Weil <sage@inktank.com>	2013-05-20 14:39:16 -07:00
Mike Kelly	d05a4e5574	ceph df: fix si units for 'global' stats si_t expects bytes, but it was being given kilobytes. Signed-off-by: Mike Kelly <pioto@pioto.org> (cherry picked from commit `0c2b738d8d`)	2013-05-20 09:06:09 -07:00
Sage Weil	d0a5d3a7f4	Merge pull request #295 from ceph/wip-5077 Reviewed-by: Joao Luis <joao.luis@inktank.com>	2013-05-17 09:26:25 -07:00
Sage Weil	c80c6a032c	sysvinit: fix enumeration of local daemons when specifying type only - prepend $local to the $allconf list at the top - remove $local special case for all case - fix the type prefix checks to explicitly check for prefixes Fugly bash, but works! Backport: cuttlefish, bobtail Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>	2013-05-16 20:39:32 -07:00
Sage Weil	d8d7113c35	udev: install disk/by-partuuid rules Wheezy's udev (175-7.2) has broken rules for the /dev/disk/by-partuuid/ symlinks that ceph-disk relies on. Install parallel rules that work. On new udev, this is harmless; on older udev, this will make life better. Fixes: #4865 Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com>	2013-05-16 18:40:29 -07:00
Sage Weil	65072f2e43	mon: clear pg delta after some period If we have no pg_map updates, the delta doesn't update, and can get stuck with the velocity right before activity stopped. This is confusing, and can cause incorrect health warnings about in-progress recovery. To fix this, zero the delta if there is no activity for 'mon delta reset interval' seconds. Fixes: #5077 Signed-off-by: Sage Weil <sage@inktank.com>	2013-05-16 17:58:48 -07:00
Samuel Just	9b9d322c20	test_filestore_idempotent_sequence: unmount prior to deleting store FileStoreDiff umounts the stores in its destructor. Also, DeterministicOpSequence deletes its passed object store. Fixes: #5076 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: David Zafman <david.zafman@inktank.com>	2013-05-16 15:45:11 -07:00
Samuel Just	5a27e85cf1	Revert "test_filejournal.cc: cleanup memory in destructor" The finish() method for Contexts calls delete this. This reverts commit `36028916c4`. Fixes: #5075 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: David Zafman <david.zafman@inktank.com>	2013-05-16 15:45:42 -07:00
Sage Weil	604c83ff18	debian: make radosgw require matching version of librados2 ...indirectly via ceph-common. We get bad behavior when they diverge, I think because of libcommon.la being linked both statically and dynamically. Fixes: #4997 Backport: cuttlefish, bobtail Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Gary Lowell <gary.lowell@inktank.com>	2013-05-16 13:17:45 -07:00
Samuel Just	eaf3abf3f9	FileJournal: adjust write_pos prior to unlocking write_lock In committed_thru, we use write_pos to reset the header.start value in cases where seq is past the end of our journalq. It is therefore important that the journalq be updated atomically with write_pos (that is, under the write_lock). The call to align_bl() is moved into do_write in order to ensure that write_pos is adjusted correctly prior to write_bl(). Also, we adjust pos at the end of write_bl() such that pos \in [get_top(), header.max_size) after write_bl(). Fixes: #5020 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-05-16 11:14:37 -07:00
Sage Weil	64871e0931	mds: avoid assert after suicide() Fixes: #5079 Signed-off-by: Sage Weil <sage@inktank.com>	2013-05-16 09:42:29 -07:00
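
Commit 586e8c2075 above explains why queued Paxos proposal contexts should be finished rather than dropped: the callback itself is what resets 'going_to_trim'. The following is only a minimal Python sketch of that idea, not Ceph's C++ code. `TrimContext`, `finish_contexts`, `simulate`, and the `state` dictionary are hypothetical stand-ins invented for illustration; only the names C_Trimmed, 'going_to_trim', and -EAGAIN come from the commit message.

```python
import errno


class TrimContext:
    """Hypothetical stand-in for C_Trimmed: resets the flag when finished."""

    def __init__(self, state):
        self.state = state

    def finish(self, result):
        # Whatever the result code, the trim is no longer in flight.
        self.state["going_to_trim"] = False


def finish_contexts(contexts, result):
    """Run every queued callback with `result`, then leave the queue empty."""
    pending = list(contexts)
    contexts.clear()
    for ctx in pending:
        ctx.finish(result)


def simulate(finish_on_election):
    """Queue a trim, then model an election that interrupts it."""
    state = {"going_to_trim": True}
    queue = [TrimContext(state)]
    if finish_on_election:
        # Behaviour described by the fix: callbacks run with -EAGAIN.
        finish_contexts(queue, -errno.EAGAIN)
    else:
        # Behaviour described as buggy: callbacks are silently dropped.
        queue.clear()
    return state["going_to_trim"]


stuck_after_clear = simulate(finish_on_election=False)
stuck_after_finish = simulate(finish_on_election=True)
print(stuck_after_clear, stuck_after_finish)
```

In this sketch, dropping the queue leaves the flag set, while finishing the queue clears it, which mirrors the stuck-trim symptom the commit message describes.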


26086 Commits