RepoMirrors/ceph

mirror of https://github.com/ceph/ceph synced 2025-02-24 11:37:37 +00:00

Author	SHA1	Message	Date
Orit Wasserman	c320fbd9f8	Merge pull request #15753 from pritha-srivastava/wip-rgw-s3tests-conf rgw: Changes for s3test config file, to add user under a tenant. Reviewed-by: Casey Bodley <cbodley@redhat.com> Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2017-06-22 11:00:26 +03:00
Patrick Donnelly	d4870a093c	qa: wait for healthy cluster before testing pins Fixes: http://tracker.ceph.com/issues/20318 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-06-21 13:21:32 -07:00
Vasu Kulkarni	14b6267cba	s3a task to test radosgw compatibility with hadoop s3a interface Signed-off-by: Vasu Kulkarni <vasu@redhat.com>	2017-06-21 11:52:10 -07:00
Sage Weil	6a00ba0e26	qa/tasks/ceph_manager: get osds all in after thrashing Otherwise we might end up with some PGs remapped, which means they won't get scrubbed. Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-20 12:07:25 -04:00
Yan, Zheng	57e82edc9c	qa/cephfs: use ceph.dir.pin to trigger migration Signed-off-by: "Yan, Zheng" <zyan@redhat.com>	2017-06-20 17:39:46 +08:00
Pritha Srivastava	5e94a9852c	rgw: Changes for s3test config file, to add user under a tenant. Signed-off-by: Pritha Srivastava <prsrivas@redhat.com>	2017-06-20 12:57:24 +05:30
Sage Weil	04969eff23	qa/tasks/resolve_stuck_peering: start osd at end Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-19 14:28:28 -04:00
Sage Weil	cc902a1f6b	qa/tasks/ceph: osd_scrub_pgs: reissue scrub requests in loop The scrub commands are not reliable: if the OSD doesn't happen to be connected at the time the command is issued it may not get delivered. Re-request scrubs for each PG that has not yet been scrubbed so that we don't wait forever when the original request is dropped. Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-19 12:00:12 -04:00
Sage Weil	32361a798f	qa/tasks/ceph: osd_scrub_pgs: tolerate down osd at initial scrub time Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-19 12:00:12 -04:00
Sage Weil	bdf40c546d	Merge pull request #15717 from liewegas/wip-20326 qa/tasks/ceph.py: tolerate active+clean+something Reviewed-by: Kefu Chai <kchai@redhat.com>	2017-06-16 16:12:20 -05:00
Sage Weil	1565b86dc0	qa/tasks/ceph.py: tolerate active+clean+something where something is, say, snaptrim. or maybe scrubbing. or whatever. Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-15 22:29:28 -04:00
Sage Weil	f870cc5f28	qa/tasks/thrashosds: wait before wait_for_recovery Make sure OSDs are up and they have flushed their PG stats before waiting for recovery to ensure that we do not see a stale 'clean' state. Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-15 12:14:24 -04:00
Sage Weil	200abcee6d	qa/tasks/ceph: raise exception if scrubs time out Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-15 11:23:18 -04:00
Sage Weil	0d80c88667	qa/tasks/ceph: raise an exception if pgs are not clean If this happens the preceding test should have cleaned up (e.g., ceph.healthy:). Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-15 11:23:18 -04:00
Sage Weil	6fa9d32407	qa/tasks/ceph: osd_scrub_pgs: try a bit longer I just saw a test fail that was still waiting for scrubs to complete. Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-15 11:23:18 -04:00
John Spray	18fbf24c7a	Merge pull request #15308 from jcsp/wip-19706 mon: don't kill MDSs unless some beacons are getting through Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2017-06-15 10:50:44 -04:00
John Spray	4a1fe14bc6	Merge pull request #15411 from jcsp/wip-fs-suite qa: misc cephfs test improvements Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2017-06-15 10:50:07 -04:00
Yan, Zheng	5e1d8879ee	qa/cephfs: update stray reintegration test case Signed-off-by: "Yan, Zheng" <zyan@redhat.com>	2017-06-12 09:46:06 +08:00
Sage Weil	554cf8394a	Merge pull request #15073 from liewegas/wip-mgr-stats mon,mgr: extricate PGmap from monitor Reviewed-by: Greg Farnum <gfarnum@redhat.com>	2017-06-04 13:36:01 -05:00
Kefu Chai	e8b23d6852	qa/tasks: add a blacklist for flush_pg_stats() so we don't wait for marked out osds. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-06-02 13:06:50 -04:00
Sage Weil	ab1b78ae00	qa/tasks: use new reliable flush_pg_stats helper The helper gets a sequence number from the osd (or osds), and then polls the mon until that seq is reflected there. This is overkill in some cases, since many tests only require that the stats be reflected on the mgr (not the mon), but waiting for it to also reach the mon is sufficient! Signed-off-by: Sage Weil <sage@redhat.com>	2017-06-02 13:02:45 -04:00
Yehuda Sadeh	ea911b7f48	Merge pull request #14351 from yehudasa/wip-rgw-mdsearch rgw: metadata search part 2 Reviewed-by: Casey Bodley <cbodley@redhat.com> Reviewed-by: Abhishek Lekshmanan <abhishek@suse.com>	2017-06-02 09:16:07 -07:00
Yehuda Sadeh	6594d972f2	qa/tasks/rgw_multisite.py: adjust zone init zone is now a ZoneConn object. Also, change import to make it relative so that qa task can locate it. Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>	2017-06-01 13:32:00 -07:00
John Spray	7e1be30b9a	qa: clean up test_exports.py Mainly just using the setfattr helper instead of run_shell. Signed-off-by: John Spray <john.spray@redhat.com>	2017-06-01 07:18:03 -04:00
John Spray	6ef30d1ed3	qa: explicitly set up standby replay in test_journal_migration Previously this relied on being run in a special cluster configuration that set up standby replay daemons. This change will allow it to live alongside all the 'normal' functional tests. Signed-off-by: John Spray <john.spray@redhat.com>	2017-06-01 07:18:03 -04:00
John Spray	01c46bf832	Merge pull request #15205 from batrick/i20039 mds: check export pin during replay Reviewed-by: Yan, Zheng <zyan@redhat.com>	2017-06-01 11:23:02 +01:00
John Spray	3326321858	qa: fix daemon restart between tests Previously, calling mds_stop without mds_fail meant that if the filesystem creation was not quick, then we would see those daemons go laggy. This starts to trigger failures now that we have cluster log messages that fire when a daemon gets failed out due to being laggy. Signed-off-by: John Spray <john.spray@redhat.com>	2017-05-31 18:00:43 -04:00
Yehuda Sadeh	760c5e4f86	Merge pull request #15184 from cbodley/wip-qa-rgw-cleanup qa/rgw: remove apache/fastcgi and radosgw-agent tests Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>	2017-05-30 13:09:31 -07:00
Patrick Donnelly	76335b0e0f	qa: improve debug message for subtree wait Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-05-30 09:08:27 -07:00
Sage Weil	8554158574	Merge pull request #15325 from liewegas/wip-redirect osd,librados: add manifest, redirect Reviewed-by: Sage Weil <sage@redhat.com> Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2017-05-29 14:48:33 -05:00
Sage Weil	a9a728fe4d	Merge pull request #15296 from liewegas/wip-fix-at-end qa/tasks/repair_test: unset flags we set Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2017-05-27 22:11:31 -05:00
Kefu Chai	8abc6e1bea	qa/tasks/rebuild_mondb: update to address ceph-mgr changes - revive ceph-mgr after updating the keyring cap - grant "mgr:allow *" to client.admin - minor refactors Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-05-28 09:59:50 +08:00
Sage Weil	a4247dd594	Merge branch 'wip-extensible_tier-redirect' of git://github.com/myoungwon/ceph into wip-redirect	2017-05-26 22:50:14 -04:00
Sage Weil	d292b5419f	qa/tasks/repair_test: unset flags we set In particular, noscrub and nodeepscrub leave a health warning, which prevents shutdown with at-end.yaml. Signed-off-by: Sage Weil <sage@redhat.com>	2017-05-25 18:05:42 -04:00
John Spray	f80e0973f5	Merge pull request #15062 from ukernel/wip-19912 qa/tasks/cephfs: use getattr to guarantee inode is in client cache Reviewed-by: John Spray <john.spray@redhat.com>	2017-05-25 18:44:54 +01:00
Sage Weil	5d80c74e63	Merge pull request #15252 from liewegas/wip-cleanup-tell qa/tasks/ceph_manager: 'ceph $service tell ...' is obsolete Reviewed-by: Kefu Chai <kchai@redhat.com>	2017-05-24 17:05:32 -05:00
John Spray	ef9d555916	Merge pull request #15105 from ukernel/wip-19892 qa/cephfs: disable mds_bal_frag for TestStrays.test_purge_queue_op_rate Reviewed-by: John Spray <john.spray@redhat.com>	2017-05-24 16:41:45 +01:00
John Spray	ee75318807	Merge pull request #15122 from batrick/test-fragment-error qa: fix float parse error in test_fragment Reviewed-by: John Spray <john.spray@redhat.com> Reviewed-by: Kefu Chai <kchai@redhat.com>	2017-05-24 16:40:50 +01:00
Sage Weil	5ab996ab3c	qa/tasks/ceph_manager: 'ceph $service tell ...' is obsolete This died forever ago; no need for the fallback here. Signed-off-by: Sage Weil <sage@redhat.com>	2017-05-23 22:53:53 -04:00
John Spray	3913ed0ba6	qa: refine assert_session_count (don't count killing) Signed-off-by: John Spray <john.spray@redhat.com>	2017-05-23 05:22:18 -04:00
John Spray	ee2683c804	qa: update TestVolumeClient for new blacklisting Blacklisted clients will now proactively fail outstanding operations, rather than blocking. Signed-off-by: John Spray <john.spray@redhat.com>	2017-05-23 05:22:18 -04:00
John Spray	ab8e328c80	qa: clean up whitespace in test_misc.py Signed-off-by: John Spray <john.spray@redhat.com>	2017-05-23 05:22:18 -04:00
John Spray	c91ccac6f6	qa: remove outdated TODO in TestVolumeClient Signed-off-by: John Spray <john.spray@redhat.com>	2017-05-23 05:22:17 -04:00
John Spray	47a9c9ba67	qa: add test_filelock_eviction To check that eviction is releasing flocks. Signed-off-by: John Spray <john.spray@redhat.com>	2017-05-23 05:22:17 -04:00
Casey Bodley	8c74c8a639	qa/rgw: remove apache/fastcgi Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-05-19 16:05:36 -04:00
Casey Bodley	0fb3e76eae	qa/rgw: more cleanup in rgw.py Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-05-19 15:53:37 -04:00
Casey Bodley	c8d8b9cae1	qa/rgw: remove unused helpers in util/rgw.py Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-05-19 15:53:37 -04:00
Casey Bodley	a05b3bb409	qa/rgw: remove radosgw_agent task Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-05-19 15:53:37 -04:00
Casey Bodley	762e15fbb3	qa/rgw: remove radosgw-agent config from s3tests task Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-05-19 15:53:37 -04:00
Casey Bodley	9d82486d0e	qa/rgw: remove radosgw-agent tests from radosgw_admin task Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-05-19 15:53:37 -04:00
Casey Bodley	898ab4bb0f	qa/rgw: remove multisite configuration from rgw task Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-05-19 15:53:36 -04:00
Casey Bodley	cff53b246f	Merge pull request #14688 from cbodley/wip-rgw-multi-suite qa/rgw: add multisite suite to configure and run multisite tests Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2017-05-19 14:30:57 -04:00
Sage Weil	590fd5362a	Merge pull request #15071 from cbodley/wip-qa-dnsmasq qa: add task for dnsmasq configuration Reviewed-by: Vasu Kulkarni <vasu@redhat.com> Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@gmail.com>	2017-05-19 13:25:12 -05:00
Casey Bodley	de836ee684	qa/rgw: add test config to rgw_multisite_tests task Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-05-18 13:38:44 -04:00
Casey Bodley	efb3b181fd	qa/rgw: add log_level argument to rgwadmin() changes default level from info to debug Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-05-18 13:37:35 -04:00
Casey Bodley	4722d1d920	qa/rgw: add rgw_multisite_tests task to run tests Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-05-17 14:48:55 -04:00
Casey Bodley	b6d86be2c5	qa/rgw: add rgw_multisite task based on rgw_multi Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-05-17 14:48:55 -04:00
Casey Bodley	a86ce77155	qa/rgw: add symlink to qa/tasks/rgw_multi Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-05-17 14:48:55 -04:00
Casey Bodley	746c630999	qa/rgw: move startup polling logic to util/rgw.py Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-05-17 14:48:55 -04:00
Casey Bodley	76e147614f	qa/rgw: fixes for cluster name on cleanup Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-05-17 14:48:55 -04:00
Casey Bodley	4c59d343c3	qa/rgw: move compression type out of ceph.conf this makes the 'compression type' setting global to all gateways, and makes the setting visible to other tasks in ctx.rgw.compression_type Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-05-17 14:48:55 -04:00
Patrick Donnelly	6c34a2c673	qa: silence upgrade test failure The new fs setting standby_count_wanted is only available in luminous. Upgrade tests were tripping on this. Fixes: http://tracker.ceph.com/issues/19934 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-05-16 18:43:57 -04:00
Patrick Donnelly	4b72940d02	qa: fix float parse error in test_fragment 2017-05-16 17:45:30,663.663 INFO:__main__:run args=['./bin/ceph', 'daemon', 'mds.b', 'perf', 'dump', 'mds'] 2017-05-16 17:45:30,664.664 INFO:__main__:Running ['./bin/ceph', 'daemon', 'mds.b', 'perf', 'dump', 'mds'] Can't get admin socket path: unable to get conf option admin_socket for mds.b: parse error setting 'mds_bal_fragment_size_max' to '152.0' 2017-05-16 17:45:30,781.781 INFO:__main__:test_rapid_creation (tasks.cephfs.test_fragment.TestFragmentation) ... ERROR 2017-05-16 17:45:30,782.782 ERROR:__main__:Traceback (most recent call last): File "/home/pdonnell/ceph/qa/tasks/cephfs/test_fragment.py", line 114, in test_rapid_creation self.assertEqual(self.get_splits(), 0) File "/home/pdonnell/ceph/qa/tasks/cephfs/test_fragment.py", line 15, in get_splits return self.fs.mds_asok(['perf', 'dump', 'mds'])['mds']['dir_split'] File "/home/pdonnell/ceph/qa/tasks/cephfs/filesystem.py", line 788, in mds_asok return self.json_asok(command, 'mds', mds_id) File "/home/pdonnell/ceph/qa/tasks/cephfs/filesystem.py", line 174, in json_asok proc = self.mon_manager.admin_socket(service_type, service_id, command) File "../qa/tasks/vstart_runner.py", line 561, in admin_socket args=[os.path.join(BIN_PREFIX, "ceph"), "daemon", "{0}.{1}".format(daemon_type, daemon_id)] + command, check_status=check_status File "../qa/tasks/vstart_runner.py", line 296, in run proc.wait() File "../qa/tasks/vstart_runner.py", line 174, in wait raise CommandFailedError(self.args, self.exitstatus) CommandFailedError: Command failed with status 22: ['./bin/ceph', 'daemon', 'mds.b', 'perf', 'dump', 'mds'] Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-05-16 18:02:18 -04:00
myoungwon oh	a07ad9fe80	qa/suites/rados/thrash: add redirect test cases Signed-off-by: Myoungwon Oh omwmw@sk.com	2017-05-17 05:47:12 +09:00
John Spray	60f904615f	Merge pull request #15096 from jcsp/wip-journalrepair-test qa: simplify TestJournalRepair Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2017-05-16 16:11:57 +01:00
Yan, Zheng	6473b79337	qa/cephfs: disable mds_bal_frag for TestStrays.test_purge_queue_op_rate directory fragmentation generates extra osd ops, which affects checks in the test. Fixes: http://tracker.ceph.com/issues/19892 Signed-off-by: "Yan, Zheng" <zyan@redhat.com>	2017-05-16 16:43:29 +08:00
John Spray	2350555fe5	qa: simplify TestJournalRepair This was sending lots of metadata ops to MDSs to persuade them to migrate some subtrees, but that was flaky. Use the shiny new rank pinning functionality instead. Signed-off-by: John Spray <john.spray@redhat.com>	2017-05-15 17:27:07 -04:00
Douglas Fuller	7f659e104d	qa/cephfs: Fix for test_data_scan Don't assume that test_data_scan will be run on exactly 2 MDS nodes. Fixes: http://tracker.ceph.com/issues/19893 Signed-off-by: Douglas Fuller <dfuller@redhat.com>	2017-05-15 16:01:02 -04:00
John Spray	17f669a868	Merge pull request #15026 from ukernel/wip-19891 qa/suites/fs: reserve more space for mds in full tests Reviewed-by: John Spray <john.spray@redhat.com>	2017-05-15 13:21:52 +01:00
John Spray	897b5f5bbe	Merge pull request #15035 from batrick/quiet-mds-grow-shrink qa: silence spurious insufficient standby health warnings Reviewed-by: Yan, Zheng <zyan@redhat.com>	2017-05-15 13:17:38 +01:00
Casey Bodley	062923515c	qa: add task for dnsmasq configuration Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-05-12 16:53:14 -04:00
Yan, Zheng	1a48359f34	qa/tasks/cephfs: use getattr to guarantee inode is in client cache When selinux is enabled, the kernel client may release inodes (without up-to-date xattr) in a readdir reply immediately after processing the reply. The reason is that linking the inode to a dentry causes deadlock if the xattr is not up to date. We can use the stat(2) syscall to guarantee that the kernel client caches an inode. Fixes: http://tracker.ceph.com/issues/19912 Signed-off-by: "Yan, Zheng" <zyan@redhat.com>	2017-05-12 16:42:25 +08:00
Yan, Zheng	b67a599ebe	Merge pull request #14598 from batrick/mds-balancer-pin mds: support export pinning on directories	2017-05-11 11:56:34 +08:00
Yan, Zheng	bbb3369b50	qa/suites/fs: fix write size calculation in full tests 'max_avail' has already taken full_ratio into account Signed-off-by: "Yan, Zheng" <zyan@redhat.com>	2017-05-11 11:18:22 +08:00
Patrick Donnelly	02c41f683d	qa: add health warning test for insufficient standbys Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-05-10 11:05:09 -04:00
Patrick Donnelly	a4cb10900d	qa: turn off spurious standby health warning Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-05-10 10:21:28 -04:00
Patrick Donnelly	9552efde4a	qa: improve time handling for test_exports test Also catches corner-case found by Zheng where an unjournaled directory will cause export pinning to fail because it cannot be made a subtree until its parent is stable. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-05-05 19:07:05 -04:00
Sage Weil	99928c9e0d	Merge pull request #14931 from tchaikov/wip-19771 qa/tasks/ceph_manager: always fix pgp_num when done with thrashosd task Reviewed-by: Sage Weil <sage@redhat.com>	2017-05-05 08:53:38 -05:00
Tamilarasi Muthamizhan	a189b61095	Merge pull request #14400 from ceph/wip-cd-1node qa/tasks: few fixes to get ceph-deploy 1node to working state	2017-05-04 10:42:50 -07:00
Vasu Kulkarni	e58dd3938a	install mgr on the node Signed-off-by: Vasu Kulkarni <vasu@redhat.com>	2017-05-03 16:47:14 -07:00
Kefu Chai	da1161cbd8	qa/tasks/ceph_manager: always fix pgp_num when done with thrashosd task Fixes: http://tracker.ceph.com/issues/19771 Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-05-03 18:28:27 +08:00
Patrick Donnelly	63cbe330b7	qa: remove errant mount requirement Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-05-02 18:29:08 -04:00
Patrick Donnelly	6bd58fefb7	mds: use aux subtrees for export pinned inodes Idea here is that a pinned inode should not be exported when its parent is. Setting the pinned inode's dirfrags to aux subtrees prevents them from being merged with a parent subtree. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-05-02 00:30:35 -04:00
Casey Bodley	0e30e3ef01	Merge pull request #14845 from cbodley/wip-rgw-qa-s3tests qa/rgw: add cluster name to path when s3tests scans rgw log Reviewed-by: Daniel Gryniewicz <dang@redhat.com>	2017-05-01 10:49:12 -04:00
Kefu Chai	7424345c77	qa/erasure-code: override min_size to 2 so isa(k=2,m=1) can survive with 1 down OSD. Fixes: http://tracker.ceph.com/issues/19770 Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-04-29 10:43:17 +08:00
Kefu Chai	5f50298025	qa/tasks/rados: add optional setting of "min_size" this setting only affects the newly created pool Fixes: http://tracker.ceph.com/issues/19770 Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-04-29 10:39:02 +08:00
Casey Bodley	88b6a142bc	qa/rgw: fix assertions in radosgw_admin task Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-04-27 19:38:10 -04:00
Casey Bodley	a31aa6f65c	qa/rgw: add cluster name to path when s3tests scans rgw log Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-04-27 14:48:40 -04:00
John Spray	d0d3a4a02e	Merge pull request #12935 from stiopaa1/17855_evictClient mds/Server.cc: Don't evict a slow client if... Reviewed-by: John Spray <john.spray@redhat.com>	2017-04-24 22:10:01 +01:00
John Spray	837a71c0af	qa/tasks/cephfs: clean up mount point setup Previously we were sometimes trying to maintain a mounted client across a filesystem destroy/create. Signed-off-by: John Spray <john.spray@redhat.com>	2017-04-24 11:19:55 +01:00
John Spray	16702ff13d	Merge pull request #14018 from jcsp/wip-17939 client: getattr before returning quota/layout xattrs Reviewed-by: Yan, Zheng <zyan@redhat.com>	2017-04-24 11:12:26 +01:00
Michal Jarzabek	1a5cb534d9	mds/Server.cc: Don't evict a slow client if... ... it's the only client Fixes: http://tracker.ceph.com/issues/17855 Signed-off-by: Michal Jarzabek <stiopa@gmail.com>	2017-04-23 13:31:47 +01:00
Sage Weil	27dd6530a2	Merge pull request #14559 from liewegas/wip-pg-map mon: move 'pg map' to OSDMonitor Reviewed-by: Kefu Chai <kchai@redhat.com>	2017-04-21 18:53:17 -05:00
Kefu Chai	c237e7ed29	Merge pull request #14232 from jcsp/wip-19412 mgr: fix python module teardown & add tests Reviewed-by: Kefu Chai <kchai@redhat.com>	2017-04-21 22:57:44 +08:00
Sage Weil	069182f91f	qa/tasks/ceph_manager: use 'pg map' for get_pg_{primary,replica} Pulling this out of the 'pg dump' heap is inefficient. Also, pg dump data comes from the mgr and may be stale. Signed-off-by: Sage Weil <sage@redhat.com>	2017-04-21 10:56:28 -04:00
Kefu Chai	6fa16c4477	Merge pull request #14584 from tchaikov/wip-19631 qa/suites: Revert "qa/suites: add mon-reweight-min-pgs-per-osd = 4" Reviewed-by: Sage Weil <sage@redhat.com>	2017-04-21 22:56:21 +08:00
Casey Bodley	a4fc5c38e5	qa/rgw: don't scan radosgw logs for encryption keys on jewel upgrade test Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-04-20 14:49:04 -04:00
John Spray	f695a0e30f	qa: s/REQUIRE_MGRS/MGRS_REQUIRED/ for consistency Signed-off-by: John Spray <john.spray@redhat.com>	2017-04-20 15:00:31 +01:00
John Spray	636fc40d90	qa: additions to mgr.test_failover Reproducers for recent fixes: http://tracker.ceph.com/issues/19407 http://tracker.ceph.com/issues/19258 Signed-off-by: John Spray <john.spray@redhat.com>	2017-04-20 15:00:31 +01:00
John Spray	8ea98b4cbf	qa: fix vstart_runner --create for mgr tests Signed-off-by: John Spray <john.spray@redhat.com>	2017-04-20 15:00:31 +01:00
Kefu Chai	e6a436bb27	qa/tasks/ceph_manager: be able to store options with service type so we are able to change options for services other than mon while thrashing. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-04-20 14:18:21 +08:00
Kefu Chai	ee653ba87c	Merge pull request #14608 from tchaikov/wip-19594 qa/tasks: assert on pg status with a timeout Reviewed-by: Sage Weil <sage@redhat.com>	2017-04-20 10:49:12 +08:00
Kefu Chai	960032e513	qa/tasks: update tests with helper to wait for pg-stats and remove unused helpers Fixes: http://tracker.ceph.com/issues/19594 Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-04-20 09:35:05 +08:00
Kefu Chai	1207caf3a2	qa/tasks/ceph_manager: add a "wait_for_pg_stats()" decorator and accompany it with two helpers to access the pg stats in a more natural way Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-04-20 09:35:04 +08:00
Josh Durgin	a219319137	qa/tasks/rados: test sparse reads with ec overwrites Signed-off-by: Josh Durgin <jdurgin@redhat.com>	2017-04-19 17:45:43 -07:00
Josh Durgin	6fba80c1fa	osd, OSDMonitor, qa: mark ec overwrites non-experimental Keep the pool flag around so we can distinguish between a pool that should maintain hashes for each chunk, and a missing one is a bug, vs an overwrites pool where we rely on bluestore checksums for detecting corruption. Signed-off-by: Josh Durgin <jdurgin@redhat.com>	2017-04-19 17:45:43 -07:00
Patrick Donnelly	0b420be7e9	mds: add export_pin feature This allows the client/admin to pin a directory tree to a particular rank, preventing its export by the dynamic balancer. Fixes: http://tracker.ceph.com/issues/17834 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-04-19 18:21:19 -04:00
Sage Weil	ee1bb01a54	Merge pull request #14556 from liewegas/wip-pgupmap osd: pg-remap -> pg-upmap Reviewed-by: David Zafman <dzafman@redhat.com>	2017-04-19 17:07:01 -05:00
Zack Cerza	28d746bff3	Merge pull request #14464 from ceph/wip-systemd qa/tasks: use sudo to check ceph health for systemd test	2017-04-18 11:34:27 -06:00
Sage Weil	ce188e8fdf	osd: pg-remap -> pg-upmap 'remap' is too non-specific a name. In particular, it sounds like it is related to the 'remapped' PG state but in reality it is not related. 'upmap' or 'pg-upmap' is more specific: it maps a pgid to the 'up' set value (or item) Signed-off-by: Sage Weil <sage@redhat.com>	2017-04-18 12:59:40 -04:00
Casey Bodley	da7acc4211	Merge pull request #13597 from cbodley/wip-s3tests-crypto qa/rgw: add configuration for server-side encryption tests Reviewed-by: Orit Wasserman <owasserm@redhat.com>	2017-04-18 12:28:37 -04:00
Kefu Chai	1b54b5f3f1	Merge pull request #14415 from smithfarm/wip-19556 tests: Thrasher: handle "OSD has the store locked" gracefully Reviewed-by: Kefu Chai <kchai@redhat.com>	2017-04-18 23:18:35 +08:00
John Spray	033ee6bd1f	Merge pull request #14396 from jcsp/wip-19550 qa: re-enable ENOSPC tests for kclient	2017-04-18 12:59:14 +01:00
John Spray	d98e19fdbd	Merge pull request #14589 from jcsp/wip-19640 client: refine fsync/close writeback error handling Reviewed-by: Jeff Layton <jlayton@redhat.com>	2017-04-18 12:58:37 +01:00
John Spray	a2a100dc13	Merge pull request #14272 from jcsp/wip-vstart-fixup qa: fix test_standby_for_invalid_fscid with vstart_runner Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>	2017-04-18 12:50:20 +01:00
John Spray	1a69bec52f	client: refine fsync/close writeback error handling Previously, errors stuck indelibly to the inode, which meant that a close call would see an error even if the user already dutifully fsync()'d and handled it. We should emit each error only once per file handle. Signed-off-by: John Spray <john.spray@redhat.com>	2017-04-18 07:47:10 -04:00
Orit Wasserman	cb94e5ad3f	Merge pull request #12535 from ceph/wip-rgw-multisite-teuthology rgw: multisite enabled over multiple clusters Reviewed-by: Orit Wasserman <owasserm@redhat.com> Reviewed-by: Casey Bodley <cbodley@redhat.com>	2017-04-18 11:47:48 +03:00
David Zafman	a5731076ad	osd: Handle backfillfull_ratio just like nearfull and full Add BACKFILLFULL as a local OSD cur_state Notify monitor of this new fullness state Signed-off-by: David Zafman <dzafman@redhat.com>	2017-04-17 08:00:24 -07:00
John Spray	dd43d3bc64	qa/cephfs: use getfattr/setfattr helpers Signed-off-by: John Spray <john.spray@redhat.com>	2017-04-14 06:38:48 -04:00
John Spray	61617f8f10	qa: add test for reading quotas from different clients Fixes: http://tracker.ceph.com/issues/17939 Signed-off-by: John Spray <john.spray@redhat.com>	2017-04-14 06:38:48 -04:00
Sage Weil	5ca72c1193	qa/tasks/exec_on_cleanup.py: add Signed-off-by: Sage Weil <sage@redhat.com>	2017-04-13 17:11:19 -04:00
Ali Maredia	b31b84529e	rgw multisite: use get_config_master_client for radosgw_admin task Signed-off-by: Ali Maredia <amaredia@redhat.com>	2017-04-13 12:15:50 -04:00
Ali Maredia	c5956790e6	rgw: multisite enabled over multiple clusters Added '--cluster' to all necessary commands ex: radosgw-admin, rados, ceph, made sure necessary checks were in place so that clients can be read with or without a cluster_name preceding them Made master_client defined in the config for radosgw-admin task Signed-off-by: Ali Maredia <amaredia@redhat.com>	2017-04-13 12:15:50 -04:00
Vasu Kulkarni	7af157ad4c	use sudo to check health Signed-off-by: Vasu Kulkarni <vasu@redhat.com>	2017-04-11 13:52:26 -07:00
Nathan Cutler	a5b19d2d73	tests: Thrasher: handle "OSD has the store locked" gracefully On slower machines (VPS, OVH) it takes time for the OSD to go down. Fixes: http://tracker.ceph.com/issues/19556 Signed-off-by: Nathan Cutler <ncutler@suse.com>	2017-04-11 16:09:45 +02:00
John Spray	d529121b60	Merge pull request #10636 from fullerdj/wip-djf-15069 cephfs: Permit recovering metadata into a new RADOS pool Reviewed-by: John Spray <john.spray@redhat.com>	2017-04-10 13:52:04 +01:00
John Spray	fb046b9730	qa/tasks/cephfs: update kernel_mount for debugfs format Signed-off-by: John Spray <john.spray@redhat.com>	2017-04-09 18:13:29 +01:00
Vasu Kulkarni	73cccd4115	push keys on node using admin command will test admin command and is now needed due to create-keys change Signed-off-by: Vasu Kulkarni <vasu@redhat.com>	2017-04-07 12:39:19 -07:00
John Spray	e0833965b6	qa: re-enable ENOSPC tests for kclient Fixes: http://tracker.ceph.com/issues/19550 Signed-off-by: John Spray <john.spray@redhat.com>	2017-04-07 14:45:30 +01:00
Kefu Chai	24e69d79e7	Merge pull request #14281 from tchaikov/wip-19429 qa/tasks/workunit.py: use "overrides" as the default settings of workunit Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2017-04-05 10:01:27 +08:00
Douglas Fuller	37bafff9f4	qa/cephfs: Add test for rebuilding into an alternate metadata pool Add a test to validate the ability of cephfs_data_scan and friends to recover metadata from a damaged CephFS installation into a fresh metadata pool. cf: http://tracker.ceph.com/issues/15068 cf: http://tracker.ceph.com/issues/15069 Signed-off-by: Douglas Fuller <dfuller@redhat.com>	2017-04-04 12:29:01 -07:00
Casey Bodley	9730fec922	qa: s3test task scans radosgw logs for leaked encryption keys Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-04-03 10:44:58 -04:00
John Spray	13e8315d1a	Merge pull request #13862 from jcsp/wip-16523 qa, mds: add checks for fragmentation, and enable it by default	2017-04-03 11:56:37 +01:00
Kefu Chai	47080150a1	qa/tasks/workunit.py: use "overrides" as the default settings of workunit otherwise the settings in "workunit" tasks are always overridden by the settings in template config. so we'd better follow the way of how "install" task updates itself with the "overrides" settings: it uses the "overrides" as the defaults. Fixes: http://tracker.ceph.com/issues/19429 Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-04-02 12:26:30 +08:00
vasukulkarni	574049a90b	Merge pull request #14229 from ceph/wip-systemd qa: Add reboot case for systemd test	2017-03-31 09:15:53 -07:00
John Spray	992b8499d0	Merge pull request #14254 from idryomov/wip-vstart-runner-ps qa/vstart_runner: amend ps invocation Reviewed-by: John Spray <john.spray@redhat.com>	2017-03-31 17:15:30 +01:00
John Spray	bf39f561e9	qa: fix test_standby_for_invalid_fscid with vstart_runner Signed-off-by: John Spray <john.spray@redhat.com>	2017-03-31 12:13:57 -04:00
Kefu Chai	9ca7ccf5f1	tasks/workunit.py: specify the branch name when cloning a branch `c1309fb` failed to specify a branch when cloning using --depth=1, which by default clones the HEAD. and we can not "git checkout" a specific sha1 if it is not HEAD, after cloning using '--depth=1', so in this change, we dispatch "tag", "branch", "HEAD" using three Refspec classes. Signed-off-by: Kefu Chai <kchai@redhat.com> Signed-off-by: Dan Mick <dan.mick@redhat.com>	2017-03-30 20:30:09 -07:00
Sage Weil	578b0f7cfc	Merge pull request #13617 from liewegas/wip-mgr-commands mon,mgr: tag some commands for ceph-mgr Reviewed-by: Kefu Chai <kchai@redhat.com>	2017-03-30 17:12:00 -05:00
Ilya Dryomov	8d8cd4e4d5	qa/vstart_runner: amend ps invocation "ps -xwwu<id>" is parsed as BSD, because -x is not a UNIX option. "u" is a BSD option for user-oriented format, so the <id> ends up being parsed as an old-style "select by pid". The only reason this command doesn't dump other user's processes is that the BSD "only yourself" restriction is in effect. I'm not sure what's wrong with a simple "ps xww", but if we want to select by euid, let's do it right. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-03-30 19:36:43 +02:00
Vasu Kulkarni	7b587304a5	Add reboot case for systemd test test systemd units restart after reboot Signed-off-by: Vasu Kulkarni <vasu@redhat.com>	2017-03-29 10:30:49 -07:00
Sage Weil	5dc9b8d026	qa/tasks/dump_stuck.py: stop making assertions about 'health' report Health comes from the mon, while the pg stats come from the mgr, so they may be out of sync. Signed-off-by: Sage Weil <sage@redhat.com>	2017-03-29 11:39:27 -04:00
Sage Weil	fa0b2164ad	qa/tasks/ceph.py: add 'skip_mgr_daemons' option For upgrades Signed-off-by: Sage Weil <sage@redhat.com>	2017-03-29 11:39:26 -04:00
Sage Weil	7edca203d8	qa/tasks/ceph.py: give everyone mgr caps Signed-off-by: Sage Weil <sage@redhat.com>	2017-03-29 11:39:26 -04:00
Dan Mick	c1309fbef3	tasks/workunit.py: when cloning, use --depth=1 Help avoid killing git.ceph.com. A depth 1 clone takes about 7 seconds, whereas a full one takes about 3:40 (much of it waiting for the server to create a huge compressed pack) Signed-off-by: Dan Mick <dan.mick@redhat.com>	2017-03-28 20:09:44 -07:00
John Spray	e90e37690a	qa/tasks: add check_counter.py We need this for CephFS, to verify that workloads we expect to do a particular thing (like directory fragmentation or metadata exports) are really doing it. This is for giving us confidence in our coverage of these features rather than testing them per se. Fixes: http://tracker.ceph.com/issues/16523 Signed-off-by: John Spray <john.spray@redhat.com>	2017-03-28 23:26:34 +01:00
Sage Weil	2a08cbbed5	qa/tasks/thrashosds,ceph_manager: thrash pg_remap[_items] Signed-off-by: Sage Weil <sage@redhat.com>	2017-03-28 10:12:10 -04:00
Casey Bodley	e3e3a71d1f	qa: rgw task uses period instead of region-map Signed-off-by: Casey Bodley <cbodley@redhat.com>	2017-03-20 11:50:03 -04:00
Kefu Chai	bd36f13163	doc: fix the links to http://ceph.com/docs they should point to http://docs.ceph.com/docs/master/.. instead Fixes: http://tracker.ceph.com/issues/19090 Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-03-15 16:40:07 +08:00
Yehuda Sadeh	515db13970	qa/tasks/radosgw_admin: adjust test to new bucket structure Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>	2017-03-09 09:18:56 -08:00
John Spray	41f8ded3e7	qa: update TestDamage for PurgeQueue Signed-off-by: John Spray <john.spray@redhat.com>	2017-03-08 10:27:03 +00:00
John Spray	1a1951002d	qa: update TestFlush for changed stray perf counters Signed-off-by: John Spray <john.spray@redhat.com>	2017-03-08 10:27:03 +00:00
John Spray	6cf9c2956c	qa: add TestStrays.test_purge_queue_op_rate For ensuring that the PurgeQueue code is not generating too many extra IOs. Signed-off-by: John Spray <john.spray@redhat.com>	2017-03-08 10:27:02 +00:00
John Spray	3e66de2182	mds: create purge queue if it's not found Signed-off-by: John Spray <john.spray@redhat.com>	2017-03-08 10:26:59 +00:00
John Spray	f826c7e8aa	qa/cephfs: add TestStrays.test_purge_on_shutdown ...and change test_migration_on_shutdown to specifically target non-purgeable strays (i.e. hardlink-ish things). Signed-off-by: John Spray <john.spray@redhat.com>	2017-03-08 10:26:55 +00:00
John Spray	3970502c9b	qa: update test_strays for purgequeue Signed-off-by: John Spray <john.spray@redhat.com>	2017-03-08 10:20:59 +00:00
Sage Weil	7fbe8fb085	Merge pull request #13759 from liewegas/wip-19133 osdc/Objecter: resend RWORDERED ops on full Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: Greg Farnum <gfarnum@redhat.com>	2017-03-07 21:31:50 -06:00
Sage Weil	296708091c	qa/tasks/ceph_manager: use new luminous set-full-ratio etc Signed-off-by: Sage Weil <sage@redhat.com>	2017-03-07 16:39:09 -05:00
Sage Weil	a202b68d18	qa/tasks/thrashosds: chance_thrash_cluster_full Induce a momentarily full cluster. Signed-off-by: Sage Weil <sage@redhat.com>	2017-03-07 13:33:44 -05:00
Radoslaw Zarzynski	6440750f53	qa/tasks/rgw.py: start Apache before RadosGW. At the end of start_rgw() we wait till establishing HTTP connections with RadosGW become possible. However, if RadosGW uses the FastCGI, the condition can't be fulfilled without spawning HTTP server first. Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>	2017-03-07 17:31:52 +01:00
John Spray	73100305e5	Merge pull request #13262 from batrick/multimds-thrasher Add multimds:thrash sub-suite and fix bugs in thrasher for multimds Reviewed-by: John Spray <john.spray@redhat.com>	2017-03-07 14:29:18 +00:00
John Spray	39204abeda	Merge pull request #13282 from jcsp/wip-fuse-mount-teardown tasks/cephfs: tear down on mount() failure Reviewed-by: Yan, Zheng <zyan@redhat.com>	2017-02-28 15:04:59 +00:00
Kefu Chai	edceabbd47	qa/tasks/workunit: use ceph.git as an alternative of ceph-ci.git for workunit repo if we run upgrade test, where, for example, "jewel" is not in ceph-ci.git repo, we should check ceph.git to clone the workunits. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-27 17:36:05 +08:00
Sage Weil	af5dab0613	Merge pull request #13649 from liewegas/wip-ceph-scrub-debug qa/tasks/ceph.py: debug which pgs aren't scrubbing Reviewed-by: Brad Hubbard <bhubbard@redhat.com>	2017-02-25 13:15:06 -06:00
Sage Weil	f777d849e7	qa/tasks/ceph.py: debug which pgs aren't scrubbing Signed-off-by: Sage Weil <sage@redhat.com>	2017-02-24 23:07:34 -05:00
Samuel Just	44b26f6ab4	Merge pull request #13594 from athanatos/wip-snap-trim-sleep osd: add snap trim reservation and re-implement osd_snap_trim_sleep Reviewed-by: Josh Durgin <jdurgin@redhat.com>	2017-02-24 14:09:17 -08:00
Kefu Chai	4cf28de4c9	qa/tasks/workunit: use the suite repo for cloning workunit as "workunits" reside in ceph/qa/workunits, it's more intuitive to respect suite-repo option when cloning workunits. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-24 16:47:47 +08:00
John Spray	de5249436c	Merge pull request #13359 from jcsp/wip-logrotate-sshexception qa: handle SSHException in logrotate Reviewed-by: Kefu Chai <kchai@redhat.com>	2017-02-22 10:05:07 +00:00
Kefu Chai	b3e516fc38	Merge pull request #13518 from tchaikov/wip-fix-pgp-num test: Thrasher: do not update pools_to_fix_pgp_num if nothing happens Reviewed-by: Sage Weil <sage@redhat.com>	2017-02-21 00:46:26 +08:00
Kefu Chai	c0f0cde399	test: Thrasher: do not update pools_to_fix_pgp_num if nothing happens we should not update pools_to_fix_pgp_num if the pool is not expanded or the pg_num is not increased due to pgs being created. this prevents us from fixing the pgp_num after done with thrashing if we actually did nothing when fixing the pgp_num when thrashing, but we removed the pool from pools_to_fix_pgp_num after set_pool_pgpnum() returns. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-19 13:10:46 +08:00
Sage Weil	86c0d07e32	qa/tasks/ceph.py: fix timing of wait-for-* and osd markdown Mark down osds, then wait for them to come up or for the cluster to be healthy! Signed-off-by: Sage Weil <sage@redhat.com>	2017-02-18 21:12:23 -05:00
Sage Weil	96bc86b537	Revert "qa/tasks/workunit: use the suite repo for cloning workunit"	2017-02-17 11:54:27 -06:00
Kefu Chai	1f82b9b944	qa/tasks/workunit: use the suite repo for cloning workunit as "workunits" reside in ceph/qa/workunits, it's more intuitive to respect suite-repo option when cloning workunits. Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-16 15:05:51 +08:00
Samuel Just	4aebf59d90	rados: check that pool is done trimming before removing it Signed-off-by: Samuel Just <sjust@redhat.com>	2017-02-13 09:47:02 -08:00
Kefu Chai	de59b5102c	test: Thrasher: restore changed options after done with thrash Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-13 09:25:51 +08:00
Kefu Chai	761a1dc391	tests: Thrasher: extract _set_config() method Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-13 09:25:50 +08:00
Kefu Chai	995e144e3e	tests: CephManager: add get_config() method Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-13 09:25:50 +08:00
Kefu Chai	136483a8f9	test: Thrasher: update pgp_num of all expanded pools if not yet otherwise wait_until_healthy will fail after timeout as seeing warning like: HEALTH_WARN pool cephfs_data pg_num 182 > pgp_num 172 Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-13 09:25:50 +08:00
John Spray	880cbf09aa	Merge pull request #13137 from jcsp/wip-18661 qa: fix race in Mount.open_background Reviewed-by: Yan, Zheng <zyan@redhat.com>	2017-02-10 17:48:05 +00:00
John Spray	a3fd3f225c	Merge pull request #13099 from jcsp/wip-18663 qa/tasks: force umount during kclient teardown	2017-02-10 17:42:37 +00:00
John Spray	6f9e11f03d	qa: handle SSHException in logrotate Yet another different type of exception we may get when orchestra.run can't talk to a remote host. Signed-off-by: John Spray <john.spray@redhat.com>	2017-02-10 17:16:24 +00:00
Nathan Cutler	6b7443fb50	tests: drop buildpackages.py The buildpackages suite has been moved to teuthology. This cleans up a file that was left behind by https://github.com/ceph/ceph/pull/13297 Fixes: http://tracker.ceph.com/issues/18846 Signed-off-by: Nathan Cutler <ncutler@suse.com>	2017-02-08 21:23:54 +01:00
Loic Dachary	5a43f8d579	buildpackages: remove because it does not belong It should live in teuthology, not in Ceph. And it is currently broken: there is no need to keep it around. Fixes: http://tracker.ceph.com/issues/18846 Signed-off-by: Loic Dachary <loic@dachary.org>	2017-02-07 18:37:26 +01:00
John Spray	6203f33df4	tasks/cephfs: tear down on mount() failure There were some cases where we would leave a mountpoint that would cause the teuthology teardown to get hung up when it tried to look inside cephtest/ Signed-off-by: John Spray <john.spray@redhat.com>	2017-02-06 22:53:21 +00:00
Patrick Donnelly	d748226f00	qa: add DaemonWatchdog to stop tests on failure Thrashing MDS will often result in failures which often do not stop the test. The failure may also cause the test to stall which will force the machines to needlessly be locked until a timeout is reached. This watchdog will unmount mounts and kill daemons when a failure is detected. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:14 -05:00
Patrick Donnelly	f005e8af6b	qa: disable max_mds changes during thrashing While the thrasher supports the behavior desired by issue 10792 [1], the bugs uncovered due to deactivating MDS (and sometimes killing deactivating MDS) are presently a distraction from addressing issues during normal failures. So now thrashing max_mds is turned off by default. I have added a TODO to deactivate ranks in order (configurably) as random deactivation causes a lot of other problems. This also fixes a bug: random.randrange(0.0, 1.0) always returns 0. Oops. [1] http://tracker.ceph.com/issues/10792 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:14 -05:00
Patrick Donnelly	82662edd7f	qa: do not pretty the json to shorten stdout log Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:14 -05:00
Patrick Donnelly	a0052fc2d6	qa: use gevent.sleep so greenlet yields Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:14 -05:00
Patrick Donnelly	cf9e0da078	qa: use fs methods for setting configs Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:13 -05:00
Patrick Donnelly	0098873fb7	qa: remove old comment Filesystem is now cluster aware. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:13 -05:00
Patrick Donnelly	fd4b61890d	qa: allow revived MDS to be up:active Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:13 -05:00
Patrick Donnelly	884215d933	qa: timeout waiting for thrashed MDS to revive Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:13 -05:00
Patrick Donnelly	8e9ea7b6ac	qa: configure thrashing while MDS are stopping Currently multimds is prone to many failures when killing an active or stopping MDS when there are MDS in the cluster which have been deactivated (stopping). Have this turned off by default for now. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:13 -05:00
Patrick Donnelly	6304b6ed5d	qa: add deactivation log message Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:13 -05:00
Patrick Donnelly	1185326c45	qa: avoid infinite wait if no repl. can be made The thrasher can enter an infinite loop waiting for an MDS to take a certain rank when a replacement may not be possible. For example, max_mds actives are already running. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:12 -05:00
Patrick Donnelly	638bccb2bb	qa: timeout thrasher if fs does not stabilize After 5 minutes of waiting, it's reasonable to stop as the cluster is probably stuck. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:12 -05:00
Patrick Donnelly	8f3e745344	qa: check replacement MDS is active in thrasher Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:12 -05:00
Patrick Donnelly	19289725c8	qa: handle thrashing ranks with holes During the course of thrashing max_mds, the ranks assigned to MDSs may develop holes. This causes the thrasher to try to wrongly deactivate ranks that are not assigned. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>	2017-02-06 14:07:12 -05:00
Nathan Cutler	db2582e25e	tests: fix regression in qa/tasks/ceph_master.py https://github.com/ceph/ceph/pull/13194 introduced a regression: 2017-02-06T16:14:23.162 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last): File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 722, in wrapper return func(self) File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 839, in do_thrash self.choose_action()() File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 305, in kill_osd output = proc.stderr.getvalue() AttributeError: 'NoneType' object has no attribute 'getvalue' This is because the original patch failed to pass "stderr=StringIO()" to run(). Fixes: http://tracker.ceph.com/issues/16263 Signed-off-by: Nathan Cutler <ncutler@suse.com> Signed-off-by: Kefu Chai <kchai@redhat.com>	2017-02-06 19:37:38 +01:00
Sage Weil	5fc3dd36e2	Merge pull request #13237 from smithfarm/wip-18799 tests: Thrasher: eliminate a race between kill_osd and __init__ Reviewed-by: Sage Weil <sage@redhat.com>	2017-02-05 12:49:30 -06:00

396 Commits