RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2024-12-21 02:42:48 +00:00

Author	SHA1	Message	Date
Sam Lang	2678db532a	qa/workunits/restart: Add test to check backtrace This script uses the python bindings to libcephfs and rados to create files and check the correctness of the backtrace written to the 'parent' xattr on the first object (if it's a file) or inode (if it's a dir). The script includes test cases that kill the mds at specific kill points and restart it through teuthology using the teuthology restart task. Signed-off-by: Sam Lang <sam.lang@inktank.com>	2013-03-16 11:45:36 -05:00
Sam Lang	85057e962b	mds: Add kill points for backtrace testing To test the mds journal and replay behavior, and the functionality for storing backtraces on inodes, we add kill points to the MDS in the openc, journal replay, and journal expire paths. Signed-off-by: Sam Lang <sam.lang@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>	2013-03-16 11:45:36 -05:00
Sam Lang	acae3d02b9	pybind/cephfs: Add initial py wrappers for cephfs. Initial support for python bindings to libcephfs for testing MDS backtraces with the python script test-backtraces.py. Signed-off-by: Sam Lang <sam.lang@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-03-16 11:45:36 -05:00
Sam Lang	0c7447576a	mds: Cleanup new segment conditionals The second conditional for adding a new segment is always true when the first conditional is true. Clean this up to simply create a new segment when we've reached the end of the current segment. Signed-off-by: Sam Lang <sam.lang@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>	2013-03-16 11:45:36 -05:00
Sam Lang	8b79886773	mds: Backtrace for create,rename,mkdir,setlayout Design info: http://www.spinics.net/lists/ceph-devel/msg11872.html Adds a backtrace to the data pool for supporting lookup-by-ino, storing the backtrace on the first object in the data pool or the metadata pool for a directory, as the 'parent' xattr on the object (named by inode) in that pool. For create, rename, mkdir, and setlayout operations, the backtrace is queued (with the current log segment) after the journal is committed and the safe reply is returned to the client, but the backtrace operation itself isn't started until the log segment is expired. For journal replay, we queue the backtrace so that it gets written out on journal expire. Inodes get added to the EMetaBlob in the fullbits list, so we queue backtraces while iterating through the fullbits during replay. Using setlayout or setxattr('ceph.file.layout.pool'), the data pool for a file can be changed after it is created but before anything is written to the file. A forwarding backtrace is written to the old pool on a setlayout, to ensure we can always find the latest backtrace. We store a list of old pools with the backtrace for cleaning up all forwarding pointers of an inode. Signed-off-by: Sam Lang <sam.lang@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>	2013-03-16 11:45:36 -05:00
Sam Lang	4d0448f876	mds: New backtrace handling Add unified backtrace handling for storing a backtrace on file objects (the first data object) and dirs. The backtrace store operation is queued on the LogSegment (for performing the store on log segment expire). We encode the backtrace on queue to avoid keeping a reference around to the CInode, which may get dropped from the cache by the time the log segment is expired (and the backtrace is written out). Fetching the backtrace is implemented on the CInode. Also allow incrementing/decrementing the DIRTYPARENT pin ref as needed, instead of using a state semaphore to keep track of whether it's set or not. This allows us to remove the STATE_DIRTYPARENT field on CInode. Signed-off-by: Sam Lang <sam.lang@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>	2013-03-16 11:45:36 -05:00
Sam Lang	62d12d8ae9	message/mds: Fix client reconnect decode Flip the conditional so that snap realms are decoded, otherwise this results in an assertion failure of the mds when a client attempts to reconnect. Signed-off-by: Sam Lang <sam.lang@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>	2013-03-16 11:45:36 -05:00
Sam Lang	fb07745c75	include/elist: Fix clear() to use pop_front() elist<T>::clear() is calling remove(), which isn't a method defined on elist<T> (it was never defined according to git). Because elist is templated and no references to clear() are ever made, the compiler matches remove(T) to the remove(const char *) system call defined in stdio.h. Once clear is invoked on an instance of elist<T>, we get the compile error shown below. The fix here is to use pop_front() instead of remove(). Compile error is: In file included from ../../src/mds/CInode.h:22:0, from ../../src/mds/CInode.cc:19: ../../src/include/elist.h: In instantiation of ‘void elist<T>::clear() [with T = cinode_backtrace_info_t]’: ../../src/mds/CInode.cc:1129:20: required from here ../../src/include/elist.h:101:7: error: no matching function for call to ‘remove(cinode_backtrace_info_t)’ ../../src/include/elist.h:101:7: note: candidates are: In file included from ../../src/mds/CInode.cc:17:0: /usr/include/stdio.h:179:12: note: int remove(const char*) /usr/include/stdio.h:179:12: note: no known conversion for argument 1 from ‘cinode_backtrace_info_t’ to ‘const char*’ In file included from /usr/include/c++/4.7/algorithm:63:0, from /usr/include/c++/4.7/backward/hashtable.h:65, from /usr/include/c++/4.7/ext/hash_map:65, from ../../src/include/encoding.h:292, from ../../src/common/entity_name.h:22, from ../../src/common/config.h:26, from ../../src/mds/CInode.h:20, from ../../src/mds/CInode.cc:19: /usr/include/c++/4.7/bits/stl_algo.h:1117:5: note: template<class _FIter, class _Tp> _FIter std::remove(_FIter, _FIter, const _Tp&) /usr/include/c++/4.7/bits/stl_algo.h:1117:5: note: template argument deduction/substitution failed: In file included from ../../src/mds/CInode.h:22:0, from ../../src/mds/CInode.cc:19: ../../src/include/elist.h:101:7: note: candidate expects 3 arguments, 1 provided Signed-off-by: Sam Lang <sam.lang@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>	2013-03-16 11:45:36 -05:00
Sam Lang	b8fb2ee0f7	dencoder: Add inode_backtrace_t to types To test the backtrace attributes on objects, we need to be able to decode the backtrace using ceph-dencoder. Signed-off-by: Sam Lang <sam.lang@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>	2013-03-16 11:45:36 -05:00
Sam Lang	9e1388626c	mds: Use map for CInode pinrefs Implements pin refs on the inode as a map instead of a multiset, allowing individual ref counts to act as real references with values that can be >1. The pin refs are only used for debugging, but allowing them to be >1 avoids the need for a separate state field for things like DIRTYPARENT. Signed-off-by: Sam Lang <sam.lang@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>	2013-03-16 11:45:36 -05:00
Sam Lang	fc80c1dc6e	client: Ensure inode/dentries are ref counted The MetaRequest holds onto inodes and dentries for retrying unsafe requests, but those objects might be removed from the cache (unlink for example) causing the inode/dentry to be freed. Ensure that the inode/dentry is never freed while the MetaRequest holds onto it by putting/getting the refs using set/get interfaces. Signed-off-by: Sam Lang <sam.lang@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>	2013-03-16 11:45:36 -05:00
Samuel Just	f8d66e87a5	OSD: split temp collection as well Otherwise, when we eventually remove the temp collection, there might be objects in the temp collection which were independently pulled into the child pg collection. Thus, removing the old stale parent link from its temp collection also blasts the omap entries and snap mappings for the real child object. Backport: bobtail Fixes: #4452 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-03-15 17:05:54 -07:00
Samuel Just	5b022e8b15	hobject: fix snprintf args for 32 bit Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-03-15 17:05:54 -07:00
Samuel Just	9ea02b8410	ceph_features: fix CEPH_FEATURE_OSD_SNAPMAPPER definition Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-03-15 17:05:54 -07:00
Gary Lowell	ee178fba49	ceph.spec.in: Additional clean-up on package removal When removing the last instance of ceph, also remove the files created by ceph during operation. These consist of the files under /var/lib/ceph, /etc/ceph, and /var/log/ceph. Bug #4415. Signed-off-by: Gary Lowell <gary.lowell@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-03-15 16:25:39 -07:00
Sage Weil	65c31e1b97	ceph-fuse: invalidate cache by default Closes: #2215 Signed-off-by: Sage Weil <sage@inktank.com>	2013-03-15 13:35:37 -07:00
Samuel Just	9edc87a54f	Merge branch 'wip_4196' Reviewed-by: Sage Weil <sage@inktank.com> Fixes: #4196	2013-03-15 11:21:44 -07:00
Samuel Just	39a66b8637	Merge branch 'next'	2013-03-15 11:21:35 -07:00
Samuel Just	f3ad12eab3	test_filejournal: add tests for footer, header, payload corruption Signed-off-by: Samuel Just <sam.just@inktank.com>	2013-03-15 11:21:07 -07:00
Samuel Just	a22cdc67b0	FileJournal: add testing methods to corrupt entries Signed-off-by: Samuel Just <sam.just@inktank.com>	2013-03-15 11:21:07 -07:00
Samuel Just	3b767fa63f	FileJournal,Journal: detect some corrupt journal scenarios When the checksum or footer are invalid, we will now try to look at the next entry. If we find a valid entry, it is likely that the journal is corrupt. Signed-off-by: Samuel Just <sam.just@inktank.com>	2013-03-15 11:21:07 -07:00
Samuel Just	cf00930021	FileJournal::wrap_read_bl: convert arguments to explicit in/out arguments Signed-off-by: Samuel Just <sam.just@inktank.com>	2013-03-15 11:21:07 -07:00
Samuel Just	c3725e92ec	FileJournal: add committed_up_to to header header_t::committed_up_to provides a lower bound for safely committed journal entries. If read_entry fails prior to committed_up_to, we know we have a corrupt journal entry. Furthermore, if journal_write_header_frequency is not 0, we will write out the journal header once every journal_write_header_frequency journal writes. Signed-off-by: Samuel Just <sam.just@inktank.com>	2013-03-15 11:21:07 -07:00
Samuel Just	4aa0f8a274	FileStore: add more debugging for remove and split Signed-off-by: Samuel Just <sam.just@inktank.com>	2013-03-15 11:21:01 -07:00
Samuel Just	de8edb732e	FileJournal: queue_pos \in [get_top(), header.max_size) If queue_pos == header.max_size when we create the entry header magic, the entry will be rejected at get_top() on replay. Fixes: #4436 Backport: bobtail Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-03-15 11:07:12 -07:00
Samuel Just	f1b031b3cf	OSD: expand_pg_num after pg removes Otherwise: 1) expand_pg_num removes a splitting pg entry 2) peering thread grabs pg lock and starts split 3) OSD::consume_map grabs pg lock and starts removal At step 2), we run afoul of the assert(is_splitting) check in split_pgs. This way, the would be splitting pg is marked as removed prior to the splitting state being updated. Backport: bobtail Fixes: #4449 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-03-15 11:07:10 -07:00
Samuel Just	8222cbc8f3	PG: ignore non MISSING pg query in ReplicaActive 1) Replica sends notify 2) Prior to processing notify, primary queues query to replica 3) Primary processes notify and activates sending MOSDPGLog to replica. 4) Primary does do_notifies at end of process_peering_events and sends to Query. 5) Replica sees MOSDPGLog and activates 6) Replica sees Query and asserts. In the above case, the Replica should simply ignore the old Query. Fixes: #4050 Backport: bobtail Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-03-15 11:07:07 -07:00
Sage Weil	632f7200bb	Merge branch 'next' Conflicts: src/mon/AuthMonitor.cc	2013-03-14 21:11:35 -07:00
Sage Weil	11650c5a8c	mon: only try to bump max if leader I broke this in `4637752db6` when I restructured this function. Only try to increase the max if we are the leader. Signed-off-by: Sage Weil <sage@inktank.com>	2013-03-14 21:10:46 -07:00
Sage Weil	80af5fb887	ceph-disk-activate: identify cluster .conf by fsid Determine what cluster the disk belongs to by checking the fsid defined in /etc/ceph/*.conf. Previously we hard-coded 'ceph'. Note that this has the nice side-effect that if we have a disk with a bad/different fsid, we now fail to activate it. Previously, we would mount and start ceph-osd, but the daemon would fail to authenticate because it was part of the wrong cluster. Fixes: #3253 Signed-off-by: Sage Weil <sage@inktank.com>	2013-03-14 21:05:07 -07:00
Gary Lowell	6f15dba931	debian/control: Fix for moved file The ceph-mds.conf file moved from the ceph package to the ceph-mds package. Add replaces/breaks statements to the control file to handle this on upgrade. Signed-off-by: Gary Lowell <gary.lowell@inktank.com>	2013-03-14 17:16:24 -07:00
Sage Weil	7370b55646	ceph-disk-activate: abort if target position is already mounted If the target position is already a mount point, fail to move our mount over to it. This usually indicates that a different osd.N from a different cluster instance is in that position. Signed-off-by: Sage Weil <sage@inktank.com>	2013-03-14 17:05:56 -07:00
David Zafman	18525ebf55	rados/test.sh fails in the nightly run Make test more robust by using my_snaps vector for snap IDs Signed-off-by: David Zafman <david.zafman@inktank.com>	2013-03-14 13:45:01 -07:00
Noah Watkins	1cbcc04467	Merge remote-tracking branch 'origin/wip-osd-addr-api' Reviewed-by: Sage Weil <sage@inktank.com>	2013-03-14 13:27:56 -07:00
Sage Weil	efd153e9e2	debian: add start ceph-mds-all on ceph-mds install This ensures that when we then start individual mds instances, we can stop ceph-mds-all and they will get stopped. We do the same already for ceph-all. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit `41897fcba1`)	2013-03-14 12:33:57 -07:00
Sage Weil	41897fcba1	debian: add start ceph-mds-all on ceph-mds install This ensures that when we then start individual mds instances, we can stop ceph-mds-all and they will get stopped. We do the same already for ceph-all. Signed-off-by: Sage Weil <sage@inktank.com>	2013-03-14 12:33:08 -07:00
Noah Watkins	47378d69ed	libcephfs: add ceph_get_osd_addr interface Return the network address for an OSD by ID. Signed-off-by: Noah Watkins <noahwatkins@gmail.com>	2013-03-14 12:25:17 -07:00
Sage Weil	b6102c0945	Revert "ceph-disk-activate: rely on default/configured keyring path" This reverts commit `936b8f20af`. This is necessary because we mount the osd in a temporary location. Signed-off-by: Sage Weil <sage@inktank.com>	2013-03-14 12:05:52 -07:00
Sage Weil	3e628eee77	Revert "ceph-disk-activate: don't override default or configured osd journal path" This reverts commit `813e9fe2b4`. We run --mkfs with the osd disk mounted in a temporary location, so it is necessary to explicitly pass in these paths. If we want to support journals in a different location, we need to make ceph-disk-prepare update the journal symlink accordingly.. not control it via the config option. Signed-off-by: Sage Weil <sage@inktank.com>	2013-03-14 12:04:44 -07:00
Sage Weil	8f462d476a	Merge pull request #109 from dalgaaf/wip-da-performance-1-v2 prefer prefix ++/--operator for e.g. iterators for performance reasons Reviewed-by: Sage Weil <sage@inktank.com>	2013-03-14 11:41:04 -07:00
Sage Weil	ab54d67f8e	Merge pull request #108 from ceph/wip-refuse-last-mon-remove mon: refuse "mon remove" if only one mon left Reviewed-by: Sage Weil <sage@inktank.com>	2013-03-14 11:38:21 -07:00
Danny Al-Gaaf	282d3aa478	monmaptool.cc: prefer prefix ++operator for iterators Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2013-03-14 19:24:02 +01:00
Danny Al-Gaaf	0c5532cfce	mon/PGMonitor.cc: prefer prefix ++operator for iterators Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2013-03-14 19:24:02 +01:00
Danny Al-Gaaf	01c6a7e50f	mon/MonmapMonitor.cc: prefer prefix ++operator for iterators Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2013-03-14 19:24:02 +01:00
Danny Al-Gaaf	45a3d0cc56	mon/Monitor.cc: prefer prefix ++operator for iterators Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2013-03-14 19:24:02 +01:00
Danny Al-Gaaf	a26a9f7ca8	mon/MonMap.cc: prefer prefix ++operator for iterators Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2013-03-14 19:24:02 +01:00
Danny Al-Gaaf	a6c454320d	mon/MDSMonitor.cc: prefer prefix ++operator for iterators Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2013-03-14 19:24:01 +01:00
Danny Al-Gaaf	a66170df11	mon/LogMonitor.cc: prefer prefix ++operator for iterators Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2013-03-14 19:24:01 +01:00
Danny Al-Gaaf	23ce79ff13	mon/AuthMonitor.cc: prefer prefix ++operator for iterators Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2013-03-14 19:24:01 +01:00
Danny Al-Gaaf	ab0dac1559	mds/mdstypes.cc: prefer prefix ++operator for iterators Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>	2013-03-14 19:24:01 +01:00
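
The run of "prefer prefix ++operator for iterators" commits above cites performance reasons (see the merge of pull request #109). As a rough illustration only, and not code from the Ceph tree, the sketch below uses a hypothetical `CountingIter` type that counts its own copies: the postfix form has to produce a copy of the old value, while the prefix form does not. For simple iterators an optimizing compiler will often remove the difference, so this shows the principle rather than a measured gain.

```cpp
#include <cstddef>

// Hypothetical iterator-like type (not from Ceph) that records how many
// times it is copy-constructed.
struct CountingIter {
  int pos;
  std::size_t* copies;

  CountingIter(int p, std::size_t* c) : pos(p), copies(c) {}
  CountingIter(const CountingIter& o) : pos(o.pos), copies(o.copies) {
    ++*copies;  // every copy construction is counted
  }
  CountingIter& operator=(const CountingIter&) = default;

  // Prefix: advance in place and return a reference, no copy needed.
  CountingIter& operator++() {
    ++pos;
    return *this;
  }

  // Postfix: must keep the old value around to return it, so it copies.
  CountingIter operator++(int) {
    CountingIter old(*this);
    ++pos;
    return old;
  }

  bool operator!=(const CountingIter& o) const { return pos != o.pos; }
};

// Walk n steps with ++it and report how many copies were made.
std::size_t copies_with_prefix(int n) {
  std::size_t copies = 0;
  CountingIter end(n, &copies);
  for (CountingIter it(0, &copies); it != end; ++it) {
  }
  return copies;
}

// Walk n steps with it++ and report how many copies were made.
std::size_t copies_with_postfix(int n) {
  std::size_t copies = 0;
  CountingIter end(n, &copies);
  for (CountingIter it(0, &copies); it != end; it++) {
  }
  return copies;
}
```

Calling `copies_with_prefix(5)` reports no copies, while `copies_with_postfix(5)` reports at least one copy per step; the exact postfix count depends on whether the compiler elides the copy of the returned value.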

24967 Commits