RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-01-18 09:02:08 +00:00

Author	SHA1	Message	Date
Kefu Chai	4cf28de4c9	qa/tasks/workunit: use the suite repo for cloning workunit as "workunits" reside in ceph/qa/workunits, it's more intuitive to respect suite-repo option when cloning workunits. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-24 16:47:47 +08:00
John Spray	de5249436c	Merge pull request #13359 from jcsp/wip-logrotate-sshexception qa: handle SSHException in logrotate Reviewed-by: Kefu Chai <kchai@redhat.com>	2017-02-22 10:05:07 +00:00
Kefu Chai	b3e516fc38	Merge pull request #13518 from tchaikov/wip-fix-pgp-num test: Thrasher: do not update pools_to_fix_pgp_num if nothing happens Reviewed-by: Sage Weil <sage@redhat.com>	2017-02-21 00:46:26 +08:00
Kefu Chai	c0f0cde399	test: Thrasher: do not update pools_to_fix_pgp_num if nothing happens We should not update pools_to_fix_pgp_num if the pool is not expanded or the pg_num is not increased due to pgs being created. This prevents us from fixing the pgp_num after we are done with thrashing if we actually did nothing when fixing the pgp_num while thrashing, but we removed the pool from pools_to_fix_pgp_num after set_pool_pgpnum() returns. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-19 13:10:46 +08:00
Sage Weil	86c0d07e32	qa/tasks/ceph.py: fix timing of wait-for-* and osd markdown Mark down osds, then wait for them to come up or for the cluster to be healthy! Signed-off-by: Sage Weil <sage@redhat.com>	2017-02-18 21:12:23 -05:00
Sage Weil	96bc86b537	Revert "qa/tasks/workunit: use the suite repo for cloning workunit"	2017-02-17 11:54:27 -06:00
Kefu Chai	1f82b9b944	qa/tasks/workunit: use the suite repo for cloning workunit as "workunits" reside in ceph/qa/workunits, it's more intuitive to respect suite-repo option when cloning workunits. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-16 15:05:51 +08:00
Samuel Just	4aebf59d90	rados: check that pool is done trimming before removing it Signed-off-by: Samuel Just <sjust@redhat.com>	2017-02-13 09:47:02 -08:00
Kefu Chai	de59b5102c	test: Thrasher: restore changed options after done with thrash Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-13 09:25:51 +08:00
Kefu Chai	761a1dc391	tests: Thrasher: extract _set_config() method Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-13 09:25:50 +08:00
Kefu Chai	995e144e3e	tests: CephManager: add get_config() method Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-13 09:25:50 +08:00
Kefu Chai	136483a8f9	test: Thrasher: update pgp_num of all expanded pools if not yet otherwise wait_until_healthy will fail after timeout as seeing warning like: HEALTH_WARN pool cephfs_data pg_num 182 > pgp_num 172 Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-13 09:25:50 +08:00
John Spray	880cbf09aa	Merge pull request #13137 from jcsp/wip-18661 qa: fix race in Mount.open_background Reviewed-by: Yan, Zheng <zyan@redhat.com>	2017-02-10 17:48:05 +00:00
John Spray	a3fd3f225c	Merge pull request #13099 from jcsp/wip-18663 qa/tasks: force umount during kclient teardown	2017-02-10 17:42:37 +00:00
John Spray	6f9e11f03d	qa: handle SSHException in logrotate Yet another different type of exception we may get when orchestra.run can't talk to a remote host. Signed-off-by: John Spray <john.spray@redhat.com>	2017-02-10 17:16:24 +00:00
Nathan Cutler	6b7443fb50	tests: drop buildpackages.py The buildpackages suite has been moved to teuthology. This cleans up a file that was left behind by https://github.com/ceph/ceph/pull/13297 Fixes: http://tracker.ceph.com/issues/18846 Signed-off-by: Nathan Cutler <ncutler@suse.com>	2017-02-08 21:23:54 +01:00
Loic Dachary	5a43f8d579	buildpackages: remove because it does not belong It should live in teuthology, not in Ceph. And it is currently broken: there is no need to keep it around. Fixes: http://tracker.ceph.com/issues/18846 Signed-off-by: Loic Dachary <loic@dachary.org>	2017-02-07 18:37:26 +01:00
John Spray	6203f33df4	tasks/cephfs: tear down on mount() failure There were some cases where we would leave a mountpoint that would cause the teuthology teardown to get hung up when it tried to look inside cephtest/ Signed-off-by: John Spray <john.spray@redhat.com>	2017-02-06 22:53:21 +00:00
Patrick Donnelly	d748226f00	qa: add DaemonWatchdog to stop tests on failure Thrashing MDS will often result in failures which often do not stop the test. The failure may also cause the test to stall which will force the machines to needlessly be locked until a timeout is reached. This watchdog will unmount mounts and kill daemons when a failure is detected. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:14 -05:00
Patrick Donnelly	f005e8af6b	qa: disable max_mds changes during thrashing While the thrasher supports the behavior desired by issue 10792 [1], the bugs uncovered due to deactivating MDS (and sometimes killing deactivating MDS) are presently a distraction from addressing issues during normal failures. So now thrashing max_mds is turned off by default. I have added a TODO to deactivate ranks in order (configurably) as random deactivation causes a lot of other problems. This also fixes a bug: random.randrange(0.0, 1.0) always returns 0. Oops. [1] http://tracker.ceph.com/issues/10792 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:14 -05:00
Patrick Donnelly	82662edd7f	qa: do not pretty the json to shorten stdout log Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:14 -05:00
Patrick Donnelly	a0052fc2d6	qa: use gevent.sleep so greenlet yields Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:14 -05:00
Patrick Donnelly	cf9e0da078	qa: use fs methods for setting configs Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:13 -05:00
Patrick Donnelly	0098873fb7	qa: remove old comment Filesystem is now cluster aware. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:13 -05:00
Patrick Donnelly	fd4b61890d	qa: allow revived MDS to be up:active Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:13 -05:00
Patrick Donnelly	884215d933	qa: timeout waiting for thrashed MDS to revive Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:13 -05:00
Patrick Donnelly	8e9ea7b6ac	qa: configure thrashing while MDS are stopping Currently multimds is prone to many failures when killing an active or stopping MDS when there are MDS in the cluster which have been deactivated (stopping). Have this turned off by default for now. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:13 -05:00
Patrick Donnelly	6304b6ed5d	qa: add deactivation log message Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:13 -05:00
Patrick Donnelly	1185326c45	qa: avoid infinite wait if no repl. can be made The thrasher can enter an infinite loop waiting for an MDS to take a certain rank when a replacement may not be possible. For example, max_mds actives are already running. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:12 -05:00
Patrick Donnelly	638bccb2bb	qa: timeout thrasher if fs does not stabilize After 5 minutes of waiting, it's reasonable to stop as the cluster is probably stuck. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:12 -05:00
Patrick Donnelly	8f3e745344	qa: check replacement MDS is active in thrasher Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:12 -05:00
Patrick Donnelly	19289725c8	qa: handle thrashing ranks with holes During the course of thrashing max_mds, the ranks assigned to MDSs may develop holes. This causes the thrasher to try to wrongly deactivate ranks that are not assigned. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:12 -05:00
Nathan Cutler	db2582e25e	tests: fix regression in qa/tasks/ceph_master.py https://github.com/ceph/ceph/pull/13194 introduced a regression: 2017-02-06T16:14:23.162 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last): File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 722, in wrapper return func(self) File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 839, in do_thrash self.choose_action()() File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 305, in kill_osd output = proc.stderr.getvalue() AttributeError: 'NoneType' object has no attribute 'getvalue' This is because the original patch failed to pass "stderr=StringIO()" to run(). Fixes: http://tracker.ceph.com/issues/16263 Signed-off-by: Nathan Cutler <ncutler@suse.com> Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-06 19:37:38 +01:00
Sage Weil	5fc3dd36e2	Merge pull request #13237 from smithfarm/wip-18799 tests: Thrasher: eliminate a race between kill_osd and __init__ Reviewed-by: Sage Weil <sage@redhat.com>	2017-02-05 12:49:30 -06:00
Josh Durgin	21cdcfcc66	Merge pull request #13194 from smithfarm/wip-16263 tests: ignore bogus ceph-objectstore-tool error in ceph_manager Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: David Zafman <dzafman@redhat.com>	2017-02-02 15:31:29 -08:00
Nathan Cutler	b519d38fb1	tests: Thrasher: eliminate a race between kill_osd and __init__ If Thrasher.__init__() spawns the do_thrash thread before initializing the ceph_objectstore_tool property, do_thrash races with the rest of Thrasher.__init__() and in some cases do_thrash can call kill_osd() before Thrasher.__init__() progresses much further. This can lead to an exception ("AttributeError: Thrasher instance has no attribute 'ceph_objectstore_tool'") being thrown in kill_osd(). This commit eliminates the race by making sure the ceph_objectstore_tool attribute is initialized before the do_thrash thread is spawned. Fixes: http://tracker.ceph.com/issues/18799 Signed-off-by: Nathan Cutler <ncutler@suse.com>	2017-02-02 23:23:54 +01:00
John Spray	3c9f16d8ab	tasks/kclient: apply timeout to umount The umount process can get stuck, in which case we want to fail the test rather than waiting around for it. During teardown of the kclient task catch this timeout explicitly so that we will powercycle the node if needed. Signed-off-by: John Spray <john.spray@redhat.com>	2017-02-02 15:09:48 +00:00
Mykola Golub	93f7b5ef3f	Merge pull request #13158 from dillaman/wip-18594 qa: integrate OpenStack 'gate-tempest-dsvm-full-devstack-plugin-ceph' Reviewed-by: Mykola Golub <mgolub@mirantis.com>	2017-02-02 08:27:49 +02:00
John Spray	a027dba78f	tasks/cephfs: switch open vs. write in test_open_inode Do the write after opening the file, so that we get good behaviour wrt the change in Mount.open_background that uses file existence to confirm that the open happened. Signed-off-by: John Spray <john.spray@redhat.com>	2017-02-01 00:38:08 +00:00
John Spray	7f7f44ea5c	qa/tasks: force umount during kclient teardown Previously we could readily end up hanging on teardown when something had gone wrong with umount. Forcing is a big hammer (umount_wait will power cycle the node if umount isn't working), so if we had to do that then raise an exception to indicate that something was wrong with the test. Fixes: http://tracker.ceph.com/issues/18663 Signed-off-by: John Spray <john.spray@redhat.com>	2017-02-01 00:26:59 +00:00
John Spray	d4f6385b85	Merge pull request #12800 from jcsp/wip-vstart-qasuite Improve vstart_runner to (optionally) create its own cluster Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2017-01-31 02:02:49 +01:00
Nathan Cutler	046e873026	tests: ignore bogus ceph-objectstore-tool error in ceph_manager Fixes: http://tracker.ceph.com/issues/16263 Signed-off-by: Nathan Cutler <ncutler@suse.com>	2017-01-31 00:49:05 +01:00
Jason Dillaman	ce675383b3	qa/tasks/qemu: allow tests to customize the number of CPUs Signed-off-by: Jason Dillaman <dillaman@redhat.com>	2017-01-26 14:18:48 -05:00
Jason Dillaman	42e967f0bb	qa/tasks/qemu: copy ceph configuration to VM image Signed-off-by: Jason Dillaman <dillaman@redhat.com>	2017-01-26 14:17:43 -05:00
Jason Dillaman	d98aa1a39a	qa/tasks/qemu: attach all disks as rbd block devices Signed-off-by: Jason Dillaman <dillaman@redhat.com>	2017-01-26 14:17:30 -05:00
Jason Dillaman	67a4a6c519	qa/tasks/qemu: support overriding the cloud image Signed-off-by: Jason Dillaman <dillaman@redhat.com>	2017-01-26 14:16:16 -05:00
Jason Dillaman	454348004b	qa/tasks/qemu: support arbitrary additions to cloud-init-archive Signed-off-by: Jason Dillaman <dillaman@redhat.com>	2017-01-26 14:16:10 -05:00
John Spray	c6d91dd912	qa: fix race in Mount.open_background Previously a later remote call could end up executing before the remote python program in open_background had actually got as far as opening the file. Fixes: http://tracker.ceph.com/issues/18661 Signed-off-by: John Spray <john.spray@redhat.com>	2017-01-26 16:48:58 +00:00
Michal Jarzabek	052c3d3f68	mon/MDSMonitor.cc:refuse fs new on pools with obj Fixes: http://tracker.ceph.com/issues/11124 Signed-off-by: Michal Jarzabek <stiopa@gmail.com>	2017-01-23 19:48:53 +00:00
John Spray	fe219df2a2	qa: update vstart_runner docstring ...to use paths pointing to ceph tree, not ceph-qa-suite tree. Signed-off-by: John Spray <john.spray@redhat.com>	2017-01-19 06:30:20 +01:00
John Spray	549d993d3f	qa: update remaining ceph.com to download.ceph.com Fixes: http://tracker.ceph.com/issues/18574 Signed-off-by: John Spray <john.spray@redhat.com>	2017-01-17 17:14:50 +01:00
Jason Dillaman	6d17befb3b	qa/tasks/qemu: update default image url after ceph.com redesign Fixes: http://tracker.ceph.com/issues/18542 Signed-off-by: Jason Dillaman <dillaman@redhat.com>	2017-01-16 22:12:51 -05:00
Alfredo Deza	7172b55ad9	Merge pull request #12892 from ceph/wip-cd-fs-fix qa/tasks/ceph-deploy: use the new create option during instantiation Reviewed-by: Alfredo Deza <adeza@redhat.com>	2017-01-13 16:06:24 -05:00
John Spray	1e62467d09	Merge pull request #12833 from ukernel/wip-18396 tasks/cephfs: fix kernel force umount Reviewed-by: Greg Farnum <gfarnum@redhat.com>	2017-01-13 11:20:00 +00:00
John Spray	2076cda04a	Merge pull request #12749 from ukernel/wip-18179 mds: propagate error encountered during opening inode by number Reviewed-by: John Spray <john.spray@redhat.com>	2017-01-13 11:18:59 +00:00
Yan, Zheng	6526ecc084	qa/tasks: add test_open_ino_errors Validate that errors encountered during opening inos are properly propagated Signed-off-by: Yan, Zheng <zyan@redhat.com>	2017-01-12 20:15:53 +08:00
Vasu Kulkarni	2d4ed95f2b	use the create option during instantiation Signed-off-by: Vasu Kulkarni <vasu@redhat.com>	2017-01-10 15:43:12 -08:00
Alfredo Deza	ebb02c8ef5	Merge pull request #12867 from ceph/wip-ceph-deploy-workaround qa/tasks/ceph-deploy: create-keys explicitly Reviewed-by: Alfredo Deza <adeza@redhat.com>	2017-01-10 15:47:26 -05:00
Vasu Kulkarni	2d6c3fa8b2	Add ceph-create-keys to explicitly create admin/bootstrap keys Signed-off-by: Vasu Kulkarni <vasu@redhat.com>	2017-01-09 17:14:33 -08:00
Yan, Zheng	4cdeeaac10	qa/tasks/cephfs: fix kernel force umount Fixes: http://tracker.ceph.com/issues/18396 Signed-off-by: Yan, Zheng <zyan@redhat.com>	2017-01-10 08:31:25 +08:00
John Spray	6542a2e0d0	Merge pull request #12588 from jcsp/wip-18311 mds: check for errors decoding backtraces Reviewed-by: Greg Farnum <gfarnum@redhat.com>	2017-01-09 11:02:32 +00:00
Nathan Cutler	74689df754	tests: subst branch and repo in qa/tasks/qemu.py References: http://tracker.ceph.com/issues/18440 Signed-off-by: Nathan Cutler <ncutler@suse.com>	2017-01-07 22:49:54 +01:00
Nathan Cutler	56e37e41f4	tests: subst repo name in qa/tasks/cram.py Inspired by `bcbe45d948` Fixes: http://tracker.ceph.com/issues/18440 Signed-off-by: Nathan Cutler <ncutler@suse.com>	2017-01-07 13:40:06 +01:00
John Spray	aa01f44022	qa: enable cluster creation in vstart_runner Convenient when you want to create a fresh cluster each test run: just pass --create and you'll get a cluster with the right number of daemons for the tests you're running. Signed-off-by: John Spray <john.spray@redhat.com>	2017-01-05 13:43:40 +00:00
John Spray	5d945fb71e	qa/vstart_runner: more robust stop() on daemons Previously this could get hung up if we killed one PID and then the daemon reappears with a different one (perhaps because we caught it during daemonization?) Signed-off-by: John Spray <john.spray@redhat.com>	2017-01-05 13:43:39 +00:00
John Spray	081038ef53	qa: fix vstart_runner tasks import Instead of hunting around the filesystem for ceph-qa-suite, get it from our own location. Signed-off-by: John Spray <john.spray@redhat.com>	2017-01-05 13:43:39 +00:00
John Spray	5f6cdab80f	qa/tasks: add test_corrupt_backtrace Validate that we get EIO and a damage table entry when seeing a decode error on a backtrace. Signed-off-by: John Spray <john.spray@redhat.com>	2017-01-05 13:41:59 +00:00
Sage Weil	2861a2188a	Merge pull request #12630 from liewegas/wip-workunit-retry qa/tasks/workunit: clear clone dir before retrying checkout Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2016-12-23 08:12:35 -06:00
Sage Weil	2a7013cd5a	qa/tasks/workunit: clear clone dir before retrying checkout If we checkout ceph-ci.git, and don't find a branch, we'll try again from ceph.git. But the checkout will already exist and the clone will fail, so we'll still fail to find the branch. The same can happen if a previous workunit task already checked out the repo. Fix by removing the repo before checkout (the first and second times). Note that this may break if there are multiple workunit tasks running in parallel on the same role. That is already racy, so if it's happening, we'll want to switch to using a truly unique clonedir for each instantiation. Fixes: http://tracker.ceph.com/issues/18336 Signed-off-by: Sage Weil <sage@redhat.com>	2016-12-22 13:05:22 -05:00
Sage Weil	e1781dd573	qa/tasks/peer: update task based on current peering behavior This changed in `0be3f5f72e`. Fixes: http://tracker.ceph.com/issues/18330 Signed-off-by: Sage Weil <sage@redhat.com>	2016-12-22 08:40:45 -05:00
Sage Weil	c922404a03	qa/tasks/osd_backfill.py: wait for osd.[12] to start ...before sending a tell command. Otherwise osd.2 might start without 1, the io unblocks, and the tell fails because osd.1 is still down. Fixes: http://tracker.ceph.com/issues/18303 Signed-off-by: Sage Weil <sage@redhat.com>	2016-12-19 21:56:11 -05:00
Sage Weil	72d73b8c88	qa/tasks/workunit: retry on ceph.git if checkout fails Signed-off-by: Sage Weil <sage@redhat.com>	2016-12-16 15:06:16 -05:00
Vasu Kulkarni	9f04a7b32e	use dev option instead of dev-commit Signed-off-by: Vasu Kulkarni <vasu@redhat.com>	2016-12-15 14:11:00 -08:00
Sage Weil	6bb3a037e5	Merge pull request #12511 from liewegas/wip-workunits qa/workunits/rbd: fix Reviewed-by: Jason Dillaman <dillaman@redhat.com> Reviewed-by: Mykola Golub <mgolub@mirantis.com>	2016-12-15 14:15:31 -06:00
Sage Weil	c6698c95b8	Merge pull request #12508 from liewegas/wip-qa-admin-socket qa/tasks/admin_socket: subst in repo name	2016-12-15 13:53:10 -06:00
Sage Weil	27b8eac249	qa/tasks/workunit.py: add CEPH_BASE env var Root of git checkout Signed-off-by: Sage Weil <sage@redhat.com>	2016-12-15 13:52:03 -05:00
Sage Weil	4602884ab8	qa/tasks/workunit: leave workunits inside git checkout Signed-off-by: Sage Weil <sage@redhat.com>	2016-12-15 13:52:03 -05:00
Sage Weil	bcbe45d948	qa/tasks/admin_socket: subst in repo name It is either ceph.git or ceph-ci.git. Signed-off-by: Sage Weil <sage@redhat.com>	2016-12-15 13:35:02 -05:00
Samuel Just	ae40602c14	Merge remote-tracking branch 'ceph-qa-suite/master' into wip-18113-qa	2016-12-14 16:05:35 -08:00
Sage Weil	c01f2ee0e2	move ceph-qa-suite dirs into qa/	2016-12-14 11:29:55 -06:00

180 Commits