These are OpRequest properties, calculated/enforced at the OSD. They don't
belong in the MOSDOp or MOSDOpReply messages.
Signed-off-by: Sage Weil <sage@inktank.com>
It was very sloppy to put server-side processing state inside the
message. Move it to the OpRequestRef instead.
Note that the client was filling in bogus data that was then lost during
encoding/decoding; clean that up.
Signed-off-by: Sage Weil <sage@inktank.com>
This file mostly duplicated the existing release documentation. Differences
have been merged into the primary file.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
Having this too large means that queues get too deep on the OSDs during
backfill and latency is very high. In my tests, it also meant we generated
a lot of slow recovery messages just from the recovery ops themselves (no
client io).
Keeping this at the old default means we are no worse in this respect than
argonaut, which is a safe position to start from.
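For reference, a ceph.conf sketch of the kind of cap this describes; the
option name is the usual recovery throttle this commit appears to adjust,
and the value is only an assumed illustration of the old default:

    [osd]
    ; keep the number of in-flight recovery ops per OSD small so backfill
    ; does not drive OSD queue depth and latency up
    osd recovery max active = 5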
Signed-off-by: Sage Weil <sage@inktank.com>
Keep the journal queue size smaller than the filestore queue size.
Keeping this small also means that we can lower the latency for new
high priority ops that come into the op queue.
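A ceph.conf sketch of the relationship described above; the option names
are the usual queue knobs and the values are placeholders, not the
defaults set by this commit:

    [osd]
    ; keep the journal queue shallower than the filestore queue so new
    ; high priority ops spend less time waiting behind already-queued work
    journal queue max ops   = 300
    filestore queue max ops = 500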
Signed-off-by: Sage Weil <sage@inktank.com>
The client may have a newer map than we do; make sure we wait for it lest
we inadvertently reply because we think the pool doesn't exist.
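A minimal C++ sketch of the guard this implies, using made-up names and
stub hooks rather than the actual OSD code:

    #include <cstdint>
    #include <deque>

    struct Op { uint32_t map_epoch = 0; int64_t pool_id = -1; };

    struct OsdSketch {
      uint32_t my_epoch = 0;
      std::deque<Op> waiting_for_map;          // ops parked until we catch up

      bool pool_exists(int64_t pool) const { return pool >= 0; }  // stand-in
      void reply_enoent(const Op&) {}                             // stand-in
      void enqueue(const Op&) {}                                  // stand-in
      void request_map(uint32_t) {}                               // stand-in

      void handle_op(const Op& op) {
        if (op.map_epoch > my_epoch) {
          // The client has a newer map; the pool may exist in it.  Park the
          // op and fetch the newer map instead of replying ENOENT too early.
          waiting_for_map.push_back(op);
          request_map(op.map_epoch);
          return;
        }
        if (!pool_exists(op.pool_id)) {
          reply_enoent(op);     // safe: our map is at least as new as the client's
          return;
        }
        enqueue(op);
      }
    };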
Signed-off-by: Sage Weil <sage@inktank.com>
If a connection comes in and there is a closed session attached, remove it.
This is probably a failure of an old session to get cleaned up properly,
and in certain cases it may even be from a different client (if the addr
nonce is reused). In that case this prevents further damage, although
a complete solution would also clean up the closed connection state if
there is a fault. See #3630.
This fixes a hang that is reproduced by running the libcephfs
Caps.ReadZero test in a loop; eventually the client addr is reused and
we are linked to an ancient Session with a different client id.
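A small sketch of the idea with made-up Session/Connection types (the real
fix is in the MDS connection handling; none of these names are the actual
API):

    #include <memory>

    struct Session {
      bool closed = false;
      long client_id = -1;
    };

    struct Connection {
      std::shared_ptr<Session> session;   // session attached to the link, if any
    };

    // On an incoming connection, never reuse a leftover closed session; it
    // may even belong to a different client whose addr nonce was reused.
    std::shared_ptr<Session> session_for(Connection& con, long client_id) {
      if (con.session && con.session->closed)
        con.session.reset();                        // drop the stale session
      if (!con.session) {
        con.session = std::make_shared<Session>();
        con.session->client_id = client_id;
      }
      return con.session;
    }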
Backport: bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
The fullbit sets it now. For multiversion inodes, its "first" can be in
the future, since this dentry may not have changed when the inode was
cowed in place. (OTOH, the dentry cannot have changed without the inode
also having changed.)
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Otherwise we may wrongly increase mds->sessionmap.version, which
will confuse future journal replays that involve the sessionmap.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
The MDentryLink message can race with cache expire. When it arrives at
the target MDS, it's possible there is no corresponding dentry in the
cache. If this race happens, we should expire the replica inode encoded
in the MDentryLink message. But to expire an inode, the MDS needs to
know which subtree the inode belongs to, so modify the MDentryLink
message to include this information.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Creating a new file needs to be handled by the directory fragment's auth
MDS, while opening an existing file in write mode needs to be handled by
the corresponding inode's auth MDS. If a file is a remote link, its
parent directory fragment's auth MDS can be different from the
corresponding inode's auth MDS. So which MDS should handle a create
request depends on whether the file already exists.
handle_client_openc() calls rdlock_path_xlock_dentry() at the very
beginning, which always assumes the request needs to be handled by the
directory fragment's auth MDS. When handling a create request, if the
file already exists and is remotely linked to a non-auth inode,
handle_client_openc() falls back to handle_client_open().
handle_client_open() forwards the request because this MDS is not the
inode's auth MDS. Then, when the request arrives at the inode's auth
MDS, rdlock_path_xlock_dentry() is called and forwards the request
back.
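A tiny sketch of the routing rule this describes, with made-up helper
names (not the MDS API):

    // Which MDS should take an open-for-create request for a path?
    //  - the file does not exist          -> auth MDS of the parent dirfrag
    //  - the file exists as a remote link -> auth MDS of the linked inode
    // Routing on the dirfrag alone in the second case is what produces the
    // forward ping-pong described above.
    int target_mds_for_openc(bool dentry_exists,
                             int dirfrag_auth_mds,
                             int inode_auth_mds) {
      if (!dentry_exists)
        return dirfrag_auth_mds;   // creation is a dirfrag operation
      return inode_auth_mds;       // opening an existing file follows the inode
    }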
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
If remote linkage without an inode is encountered after some caps have
been issued, Server::handle_client_readdir() should send the reply to
the client immediately instead of retrying the request after opening
the remote dentry. This is because the MDS may want to revoke these
caps before it succeeds in opening the remote dentry.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Migrator::export_dir() only checks whether it can lock the export lock
set; it does not actually take the locks. So someone else can change the
path to the exporting dir and confuse Migrator::handle_export_discover().
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
The imported caps may prevent unstable locks from entering stable
states. So we should call Locker::eval_gather() with parameter
"first" set to true after caps are imported.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
If want_xlocked is true, we cannot rely on a previously sent discover
because it's likely that the previous discover is blocked on the xlocked
dentry.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
The error handling code in MDCache::handle_discover_reply() has two main
issues. First, it does not wake waiters if the dir_auth_hint in the
reply message is equal to the MDS's own nodeid; this can happen if a
discover races with subtree importing. Second, it checks the existence
of the cached directory fragment to decide whether it should take
waiters from the inode or from the directory fragment. This check is
unreliable because subtree importing can add directory fragments to the
cache.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
When a frozen inode is encountered, MDCache::handle_discover() sends the
reply immediately if the reply message is not empty. When handling
"discover ino" requests, the reply message always contains the base
directory fragment. But the requestor already has the base directory
fragment, so the only effect of the reply is to wake the requestor and
make it send the same "discover ino" request again. The requestor
therefore keeps sending "discover ino" requests but can't make any
progress. The fix is to set want_base_dir to false for
MDCache::discover_ino(). After setting want_base_dir to false, we also
need to update the code that handles "discover ino" errors.
This patch also removes unused error handling code for flag_error_dn.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
We should delete a dir fragment's bloom filter after exporting the dir
fragment to another MDS. Otherwise the residual bloom filter may cause
problems if the MDS imports the dir fragment later.
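A sketch of the cleanup with made-up types (the real change is in the
CDir/Migrator code; these names are illustrative):

    #include <memory>
    #include <vector>

    struct BloomFilter { std::vector<unsigned char> bits; };

    struct DirFragmentSketch {
      std::unique_ptr<BloomFilter> bloom;   // answers "is this dentry surely absent?"
    };

    // After the fragment is exported, drop the filter: it describes dentries
    // we no longer own, and a stale filter would give wrong answers if this
    // MDS imports the fragment again later.
    void finish_export(DirFragmentSketch& dir) {
      dir.bloom.reset();
    }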
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
If predirty_journal_parents() does not propagate changes in a dir's
fragstat into the corresponding inode's dirstat, it should mark the
inode as dirfrag dirty. This happens when we modify dir fragments
that are auth subtree roots.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
At that point, the request has already auth-pinned and locked some
objects. So CDir::fetch() should ignore the can_auth_pin check and
continue to fetch the freezing dir.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
The functional tests for the create operations should add and specify non-default
pools, but we don't have a set of library methods to do that yet (to interact with
the monitor).
Reuse the old preferred_pg field. Only use it if the new CREATEPOOLID
feature is present and the value is >= 0.
Verify that the data pool is allowed, or return EINVAL to the client.
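A sketch of the feature-gated reuse (the feature bit and field handling
here are assumptions for illustration, not the real wire format):

    #include <cstdint>

    constexpr uint64_t FEATURE_CREATEPOOLID = 1ull << 20;   // assumed bit value

    // The old preferred_pg slot carries a data pool id only when the peer
    // speaks the new feature and sent a non-negative value; otherwise fall
    // back to the default pool.  The server must still verify the chosen
    // pool is allowed and return EINVAL if it is not (not shown here).
    int64_t data_pool_from_request(uint64_t peer_features,
                                   int64_t preferred_field,
                                   int64_t default_pool) {
      if ((peer_features & FEATURE_CREATEPOOLID) && preferred_field >= 0)
        return preferred_field;
      return default_pool;
    }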
Signed-off-by: Sage Weil <sage@inktank.com>
This is a poor interface. The Hadoop stuff is shifting to specifying this
information at file creation instead.
Signed-off-by: Sage Weil <sage@inktank.com>
If we have a pending failure report and send a cancellation, take it out
of our pending list so that we don't keep resending cancellations.
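A sketch of the bookkeeping with a plain std::map and made-up types (the
real OSD structure differs):

    #include <cstdint>
    #include <iostream>
    #include <map>

    struct FailureReport { uint32_t failed_since = 0; };

    std::map<int, FailureReport> pending_failures;   // keyed by reported osd id

    void send_cancellation(int osd) {                // stand-in for the real send
      std::cout << "cancel failure report for osd." << osd << "\n";
    }

    // Once the cancellation has been sent, forget the report; otherwise every
    // later pass over pending_failures would resend the cancellation.
    void cancel_failure(int osd) {
      auto it = pending_failures.find(osd);
      if (it == pending_failures.end())
        return;
      send_cancellation(osd);
      pending_failures.erase(it);
    }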
Signed-off-by: Sage Weil <sage@inktank.com>