RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2024-12-16 00:15:35 +00:00

Author	SHA1	Message	Date
Sage Weil	988a521735	osd: special case CALL op to not have RD bit effects In commit `20496b8d2b` we treat a CALL as different from a normal "read", but we did not adjust the behavior determined by the RD bit in the op. We tried to fix that in `91e941aef9`, but changing the op code breaks compatibility, so that was reverted. Instead, special-case CALL in the helper--the only point in the code that actually checks for the RD bit. (And fix one lingering user to use that helper appropriately.) Fixes: #3731 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>	2013-01-04 20:46:56 -08:00
Sage Weil	d3abd0fe0b	Revert "OSD: remove RD flag from CALL ops" This reverts commit `91e941aef9`. We cannot change this op code without breaking compatibility with old code (client and server). We'll have to special case this op code instead. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>	2013-01-04 20:46:48 -08:00
Noah Watkins	3a9408742a	libcephfs: delete client after messenger shutdown Prevents race between messages being dispatched to the client after the client has been free'd. Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-01-04 19:51:52 -08:00
Dan Mick	0978dc4963	rbd: Don't call ProgressContext's finish() if there's an error. do_copy was different from the others; call pc.fail() on error and do not call pc.finish(). Fixes: #3729 Signed-off-by: Dan Mick <dan.mick@inktank.com>	2013-01-04 18:02:55 -08:00
Samuel Just	e89b6ade63	ReplicatedPG: remove old-head optization from push_to_replica This optimization allowed the primary to push a clone as a single push in the case that the head object on the replica is old and happens to be at the same version as the clone. In general, using head in clone_subsets is tricky since we might be writing to head during the push. calc_clone_subsets does not consider head (probably for this reason). Handling the clone from head case properly would require blocking writes on head in the interim which is probably a bad trade off anyway. Because the old-head optimization only comes into play if the replica's state happens to fall on the last write to head prior to the snap that caused the clone in question, it's not worth the complexity. Fixes: #3698 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-01-04 13:44:18 -08:00
Josh Durgin	6a3d475cf0	Merge remote branch 'origin/wip-rbd-watch' Reviewed-by: Dan Mick <dan.mick@inktank.com>	2013-01-04 13:37:36 -08:00
Yan, Zheng	acfa0c9a4a	mds: optimize C_MDC_RetryOpenRemoteIno When opening remote inode, C_MDC_RetryOpenRemoteIno is used as onfinish context for discovering remote inode. When it is called, the MDS may already have the inode. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-01-04 11:00:05 +08:00
Yan, Zheng	acbe6d97fd	mds: don't issue caps while inode is exporting caps If issue caps while inode is exporting caps, the client will drop the caps soon when it receives the CAP_OP_EXPORT message, but the client will not receive corresponding CAP_OP_IMPORT message. Except open file request, it's OK to not issue caps for client requests. If an non-auth MDS receives open file request but it can't issue caps, forward the request to auth MDS. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-01-04 10:45:40 +08:00
Yan, Zheng	ca4dc4dbc6	mds: check if stray dentry is needed The necessity of stray dentry can change before the request acquires all locks. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-01-04 10:45:40 +08:00
Yan, Zheng	3705c7ca9c	mds: drop locks when opening remote dentry Opening remote dentry while holding locks may cause dead lock. For example, 'discover' is blocked by a xlocked dentry, the request holding the xlock is blocked by the locks hold by the readdir request. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-01-04 10:45:40 +08:00
Yan, Zheng	ea2fd1276b	mds: check null context in CDir::fetch() Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-01-04 10:45:40 +08:00
Yan, Zheng	420f335566	mds: rdlock prepended dest trace when handling rename rdlock prepended dest trace to prevent them from being xlocked by someone else. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-01-04 10:45:40 +08:00
Yan, Zheng	248e4ab8e8	mds: fix cap mask for ifile lock ifile lock has 8 cap bits, should its cap mask should be 0xff Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-01-04 10:45:40 +08:00
Yan, Zheng	f9280cb694	mds: fix replica state for LOCK_MIX_LOCK LOCK_MIX_LOCK state is for gathering local locks and caps, so replica state should be LOCK_MIX. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-01-04 10:45:40 +08:00
Yan, Zheng	59953257e2	mds: keep dentry lock in sync state as much as possible Unlike locks of other types, dentry lock in unreadable state can block path traverse, so it should be in sync state as much as possible. there are two rare cases that dentry lock is not set to sync state: the dentry becomes replicated; finishing xlock but the dentry is freezing. In commit `efbca31d`, I tried fixing the issue that unreadable replica dentry blocks path traverse by modifying MDCache::path_traverse(), but it does not work. This patch also reverts that change. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-01-04 10:45:40 +08:00
Yan, Zheng	b03eab22e4	mds: forbid creating file in deleted directory Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-01-04 10:45:40 +08:00
Yan, Zheng	d379ac8e0b	mds: disable concurrent remote locking Current code allows multiple MDRequests to concurrently acquire a remote lock. But a lock ACK message wakes all requests because they were all put to the same waiting queue. One request gets the lock, the rest requests will re-send the OP_WRLOCK/OPWRLOCK slave requests and trigger assertion on remote MDS. The fix is disable concurrently acquiring remote lock, send OP_WRLOCK/OPWRLOCK slave request only if there is no on-going slave request. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-01-04 10:45:08 +08:00
Sage Weil	28d59d374b	os/FileStore: fix non-btrfs op_seq commit order The op_seq file is the starting point for journal replay. For stable btrfs commit mode, which is using a snapshot as a reference, we should write this file before we take the snap. We normally ignore current/ contents anyway. On non-btrfs file systems, however, we should only write this file after we do a full sync, and we should then fsync(2) it before we continue (and potentially trim anything from the journal). This fixes a serious bug that could cause data loss and corruption after a power loss event. For a 'kill -9' or crash, however, there was little risk, since the writes were still captured by the host's cache. Fixes: #3721 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Samuel Just <sam.just@inktank.com>	2013-01-03 17:15:07 -08:00
John Wilkins	f1e0305f0d	doc: Removed the --without-tcmalloc flag until further advised. Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2013-01-03 16:13:13 -08:00
Sage Weil	19df20867d	Merge pull request #30 from rca/master Minor clarification in docs.	2013-01-03 16:07:59 -08:00
John Wilkins	88af7d182a	doc: Added defaults for PGs, links to recommended settings, and updated note on splitting. Fixes: #3555 Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2013-01-03 14:51:33 -08:00
Samuel Just	4ae4dce5c5	OSD: for old osds, dispatch peering messages immediately Normally, we batch up peering messages until the end of process_peering_events to allow us to combine many notifies, etc to the same osd into the same message. However, old osds assume that the actiavtion message (log or info) will be _dispatched before the first sub_op_modify of the interval. Thus, for those peers, we need to send the peering messages before we drop the pg lock, lest we issue a client repop from another thread before activation message is sent. Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-01-03 14:18:00 -08:00
John Wilkins	73bc8ffc90	doc: Added comments on --without-tcmalloc option when building Ceph. Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2013-01-03 13:30:14 -08:00
rca	37b57cdf0f	Update doc/rados/configuration/filesystem-recommendations.rst Clarified when it's necessary to use the setting: filestore xattr use omap = true	2013-01-03 13:30:01 -08:00
John Wilkins	43ef6772eb	doc: Added some packages to the copyable line. Fixes: #3686 Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2013-01-03 13:29:20 -08:00
John Wilkins	333ae82c61	doc: Fixed syntax error. Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2013-01-03 13:28:06 -08:00
Sage Weil	7e94f6f1a7	Merge remote-tracking branch 'gh/wip-3714-b' into next Signed-off-by: Samuel Just <sam.just@inktank.com>	2013-01-03 12:53:07 -08:00
David Zafman	224a33bb3b	qa/workunit: Add dbench-short.sh for nfs suite A multi-client dbench run doesn't work over NFS, see bug #3718. Make single client dbench available. Signed-off-by: David Zafman <david.zafman@inktank.com>	2013-01-03 12:44:19 -08:00
Sage Weil	a32d6c5dca	osd: move common active vs booting code into consume_map Push osdmaps to PGs in separate method from activate_map() (whose name is becoming less and less accurate). Signed-off-by: Sage Weil <sage@inktank.com>	2013-01-02 22:39:10 -08:00
Sage Weil	0bfad8ef20	osd: let pgs process map advances before booting The OSD deliberate consumes and processes most OSDMaps from while it was down before it marks itself up, as this is can be slow. The new threading code does this asynchronously in peering_wq, though, and does not let it drain before booting the OSD. The OSD can get into a situation where it marks itself up but is not responsive or useful because of the backlog, and only makes the situation works by generating more osdmaps as result. Fix this by calling activate_map() even when booting, and when booting draining the peering_wq on each call. This is harmless since we are not yet processing actual ops; we only need to be async when active. Fixes: #3714 Signed-off-by: Sage Weil <sage@inktank.com>	2013-01-02 22:20:06 -08:00
Sage Weil	5fc94e89a9	osd: drop oldest_last_clean from activate_map Signed-off-by: Sage Weil <sage@inktank.com>	2013-01-02 22:04:34 -08:00
Sage Weil	67f7ee6799	osd: drop unused variables from activate_map Signed-off-by: Sage Weil <sage@inktank.com>	2013-01-02 22:04:08 -08:00
Sage Weil	a14a36ed78	OSDMap: fix modifed -> modified typo Signed-off-by: Sage Weil <sage@inktank.com>	2013-01-02 21:09:07 -08:00
Sage Weil	6b5a89d237	Merge remote-tracking branch 'gh/next'	2013-01-02 18:13:25 -08:00
Sage Weil	43cba617aa	log: fix locking typo/stupid for dump_recent() We weren't locking m_flush_mutex properly, which in turn was leading to racing threads calling dump_recent() and garbling the crash dump output. Backport: bobtail, argonaut Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>	2013-01-02 17:01:32 -08:00
John Wilkins	29ff87a573	Merge branch 'master' of https://github.com/ceph/ceph	2013-01-02 15:59:59 -08:00
John Wilkins	64d2760a49	doc: Added a memory profiling section. Ported from the wiki. Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2013-01-02 15:58:03 -08:00
John Wilkins	5066abf189	doc: Added memory profiling to the index. Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2013-01-02 15:57:22 -08:00
Sam Lang	0e9a0cd7b8	qa/workunit: Update pjd script to use new tarball The pjd script now uses the latest version of pjd with an additional test for opening a non-existent file. Signed-off-by: Sam Lang <sam.lang@inktank.com>	2013-01-02 17:08:37 -06:00
Sam Lang	d8940d15c3	fuse: Fix cleanup code path on init failure With the changes from `856f32ab`, the cfuse.init call returns a _positive_ errno, which was getting ignored. Also, if an error occurs during cfuse.init(), we need to teardown the client mount. Signed-off-by: Sam Lang <sam.lang@inktank.com>	2013-01-02 16:38:28 -06:00
Josh Durgin	c4370ff03f	librbd: establish watch before reading header This eliminates a window in which a race could occur when we have an image open but no watch established. The previous fix (using assert_version) did not work well with resend operations. Signed-off-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-02 14:15:34 -08:00
Sage Weil	9a1cf51888	Merge branch 'wip-journal-aio' into next Reviewed-by: Samuel Just <sam.just@inktank.com> Backport: bobtail	2013-01-02 13:42:22 -08:00
Sage Weil	483c6f76ad	test_filejournal: optionally specify journal filename as an argument Signed-off-by: Sage Weil <sage@inktank.com>	2013-01-02 13:39:05 -08:00
Sage Weil	c461e7fc1e	test_filejournal: test journaling bl with >IOV_MAX segments Signed-off-by: Sage Weil <sage@inktank.com>	2013-01-02 13:39:05 -08:00
Sage Weil	dda7b65189	os/FileJournal: limit size of aio submission Limit size of each aio submission to IOV_MAX-1 (to be safe). Take care to only mark the last aio with the seq to signal completion. Signed-off-by: Sage Weil <sage@inktank.com>	2013-01-02 13:39:05 -08:00
Josh Durgin	e0858fa899	Revert "librbd: ensure header is up to date after initial read" Using assert version for linger ops doesn't work with retries, since the version will change after the first send. This reverts commit `e177680903`. Conflicts: qa/workunits/rbd/watch_correct_version.sh	2013-01-02 12:32:33 -08:00
John Wilkins	82297706da	doc: Minor edits. Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2013-01-02 11:24:39 -08:00
John Wilkins	d3b9803eab	doc: Fixed typo, clarified usage. Signed-off-by: John Wilkins <john.wilkins@inktank.com>	2013-01-02 11:15:16 -08:00
Yan, Zheng	8422474320	mds: fix rename inode exportor check Use "srcdn->is_auth() && destdnl->is_primary()" to check if the MDS is inode exportor of rename operation is not reliable, This is because OP_FINISH slave request may race with subtree import. The fix is use a variable in MDRequest to indicate if the MDS is inode exportor. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-01-02 19:25:40 +08:00
Yan, Zheng	5e8642a82e	mds: call maybe_eval_stray after removing a replica dentry MDCache::handle_cache_expire() processes dentries after inodes, so the MDCache::maybe_eval_stray() in MDCache::inode_remove_replica() always fails to remove stray inode because MDCache::eval_stray() checks if the stray inode's dentry is replicated. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2013-01-02 19:25:40 +08:00

... 2 3 4 5 6 ...

23342 Commits