351 Commits

Author SHA1 Message Date
Casey Bodley
0fb3e76eae qa/rgw: more cleanup in rgw.py
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-05-19 15:53:37 -04:00
Casey Bodley
c8d8b9cae1 qa/rgw: remove unused helpers in util/rgw.py
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-05-19 15:53:37 -04:00
Casey Bodley
a05b3bb409 qa/rgw: remove radosgw_agent task
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-05-19 15:53:37 -04:00
Casey Bodley
762e15fbb3 qa/rgw: remove radosgw-agent config from s3tests task
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-05-19 15:53:37 -04:00
Casey Bodley
9d82486d0e qa/rgw: remove radosgw-agent tests from radosgw_admin task
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-05-19 15:53:37 -04:00
Casey Bodley
898ab4bb0f qa/rgw: remove multisite configuration from rgw task
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-05-19 15:53:36 -04:00
Casey Bodley
cff53b246f Merge pull request #14688 from cbodley/wip-rgw-multi-suite
qa/rgw: add multisite suite to configure and run multisite tests

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2017-05-19 14:30:57 -04:00
Sage Weil
590fd5362a Merge pull request #15071 from cbodley/wip-qa-dnsmasq
qa: add task for dnsmasq configuration

Reviewed-by: Vasu Kulkarni <vasu@redhat.com>
Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@gmail.com>
2017-05-19 13:25:12 -05:00
Casey Bodley
de836ee684 qa/rgw: add test config to rgw_multisite_tests task
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-05-18 13:38:44 -04:00
Casey Bodley
efb3b181fd qa/rgw: add log_level argument to rgwadmin()
changes default level from info to debug

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-05-18 13:37:35 -04:00
Casey Bodley
4722d1d920 qa/rgw: add rgw_multisite_tests task to run tests
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-05-17 14:48:55 -04:00
Casey Bodley
b6d86be2c5 qa/rgw: add rgw_multisite task based on rgw_multi
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-05-17 14:48:55 -04:00
Casey Bodley
a86ce77155 qa/rgw: add symlink to qa/tasks/rgw_multi
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-05-17 14:48:55 -04:00
Casey Bodley
746c630999 qa/rgw: move startup polling logic to util/rgw.py
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-05-17 14:48:55 -04:00
Casey Bodley
76e147614f qa/rgw: fixes for cluster name on cleanup
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-05-17 14:48:55 -04:00
Casey Bodley
4c59d343c3 qa/rgw: move compression type out of ceph.conf
this makes the 'compression type' setting global to all gateways, and
makes the setting visible to other tasks in ctx.rgw.compression_type

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-05-17 14:48:55 -04:00
Patrick Donnelly
6c34a2c673 qa: silence upgrade test failure
The new fs setting standby_count_wanted is only available in luminous. Upgrade
tests were tripping on this.

Fixes: http://tracker.ceph.com/issues/19934

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-05-16 18:43:57 -04:00
Patrick Donnelly
4b72940d02 qa: fix float parse error in test_fragment
    2017-05-16 17:45:30,663.663 INFO:__main__:run args=['./bin/ceph', 'daemon', 'mds.b', 'perf', 'dump', 'mds']
    2017-05-16 17:45:30,664.664 INFO:__main__:Running ['./bin/ceph', 'daemon', 'mds.b', 'perf', 'dump', 'mds']
    Can't get admin socket path: unable to get conf option admin_socket for mds.b: parse error setting 'mds_bal_fragment_size_max' to '152.0'

    2017-05-16 17:45:30,781.781 INFO:__main__:test_rapid_creation (tasks.cephfs.test_fragment.TestFragmentation) ... ERROR
    2017-05-16 17:45:30,782.782 ERROR:__main__:Traceback (most recent call last):
      File "/home/pdonnell/ceph/qa/tasks/cephfs/test_fragment.py", line 114, in test_rapid_creation
        self.assertEqual(self.get_splits(), 0)
      File "/home/pdonnell/ceph/qa/tasks/cephfs/test_fragment.py", line 15, in get_splits
        return self.fs.mds_asok(['perf', 'dump', 'mds'])['mds']['dir_split']
      File "/home/pdonnell/ceph/qa/tasks/cephfs/filesystem.py", line 788, in mds_asok
        return self.json_asok(command, 'mds', mds_id)
      File "/home/pdonnell/ceph/qa/tasks/cephfs/filesystem.py", line 174, in json_asok
        proc = self.mon_manager.admin_socket(service_type, service_id, command)
      File "../qa/tasks/vstart_runner.py", line 561, in admin_socket
        args=[os.path.join(BIN_PREFIX, "ceph"), "daemon", "{0}.{1}".format(daemon_type, daemon_id)] + command, check_status=check_status
      File "../qa/tasks/vstart_runner.py", line 296, in run
        proc.wait()
      File "../qa/tasks/vstart_runner.py", line 174, in wait
        raise CommandFailedError(self.args, self.exitstatus)
    CommandFailedError: Command failed with status 22: ['./bin/ceph', 'daemon', 'mds.b', 'perf', 'dump', 'mds']

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-05-16 18:02:18 -04:00
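
The parse error in 4b72940d02 comes from an integer option receiving the string '152.0'. A minimal sketch of that failure mode and one way to avoid it follows. The helper name as_int_option and the 304 / 2 computation are hypothetical; the log does not show how test_fragment derived the value or what the fix was.

```python
def as_int_option(value):
    """Return value as the decimal integer string an int-typed option accepts.

    Hypothetical helper: int() truncates any fractional part toward zero.
    """
    return str(int(value))


# Assumed stand-in for the test's arithmetic: true division yields a float.
computed = 304 / 2

float_form = str(computed)           # carries a trailing ".0"
int_form = as_int_option(computed)   # plain integer string

print(float_form, int_form)
```

Casting at the point where the option string is built keeps the rest of the test free to use float arithmetic.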
myoungwon oh
a07ad9fe80 qa/suites/rados/thrash: add redirect test cases
Signed-off-by: Myoungwon Oh omwmw@sk.com
2017-05-17 05:47:12 +09:00
John Spray
60f904615f Merge pull request #15096 from jcsp/wip-journalrepair-test
qa: simplify TestJournalRepair

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-05-16 16:11:57 +01:00
Yan, Zheng
6473b79337 qa/cephfs: disable mds_bal_frag for TestStrays.test_purge_queue_op_rate
directory fragmentation generates extra osd ops, which affects checks
in the test.

Fixes: http://tracker.ceph.com/issues/19892
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-05-16 16:43:29 +08:00
John Spray
2350555fe5 qa: simplify TestJournalRepair
This was sending lots of metadata ops to MDSs to persuade
them to migrate some subtrees, but that was flaky.  Use
the shiny new rank pinning functionality instead.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-05-15 17:27:07 -04:00
Douglas Fuller
7f659e104d qa/cephfs: Fix for test_data_scan
Don't assume that test_data_scan will be run on exactly 2 MDS nodes.

Fixes: http://tracker.ceph.com/issues/19893
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-05-15 16:01:02 -04:00
John Spray
17f669a868 Merge pull request #15026 from ukernel/wip-19891
qa/suites/fs: reserve more space for mds in full tests

Reviewed-by: John Spray <john.spray@redhat.com>
2017-05-15 13:21:52 +01:00
John Spray
897b5f5bbe Merge pull request #15035 from batrick/quiet-mds-grow-shrink
qa: silence spurious insufficient standby health warnings

Reviewed-by: Yan, Zheng <zyan@redhat.com>
2017-05-15 13:17:38 +01:00
Casey Bodley
062923515c qa: add task for dnsmasq configuration
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-05-12 16:53:14 -04:00
Yan, Zheng
1a48359f34 qa/tasks/cephfs: use getattr to guarantee inode is in client cache
When selinux is enabled, the kernel client may release inodes (without
up-to-date xattrs) in a readdir reply immediately after processing the reply.
The reason is that linking the inode to a dentry causes a deadlock if the xattr
is not up-to-date.

We can use stat(2) syscall to guarantee that kernel client caches an
inode.

Fixes: http://tracker.ceph.com/issues/19912
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-05-12 16:42:25 +08:00
Yan, Zheng
b67a599ebe Merge pull request #14598 from batrick/mds-balancer-pin
mds: support export pinning on directories
2017-05-11 11:56:34 +08:00
Yan, Zheng
bbb3369b50 qa/suites/fs: fix write size calculation in full tests
'max_avail' has already taken full_ratio into account

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
2017-05-11 11:18:22 +08:00
Patrick Donnelly
02c41f683d qa: add health warning test for insufficient standbys
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-05-10 11:05:09 -04:00
Patrick Donnelly
a4cb10900d qa: turn off spurious standby health warning
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-05-10 10:21:28 -04:00
Patrick Donnelly
9552efde4a qa: improve time handling for test_exports test
Also catches corner-case found by Zheng where an unjournaled directory will
cause export pinning to fail because it cannot be made a subtree until its
parent is stable.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-05-05 19:07:05 -04:00
Sage Weil
99928c9e0d Merge pull request #14931 from tchaikov/wip-19771
qa/tasks/ceph_manager: always fix pgp_num when done with thrashosd task

Reviewed-by: Sage Weil <sage@redhat.com>
2017-05-05 08:53:38 -05:00
Tamilarasi Muthamizhan
a189b61095 Merge pull request #14400 from ceph/wip-cd-1node
qa/tasks: few fixes to get ceph-deploy 1node to working state
2017-05-04 10:42:50 -07:00
Vasu Kulkarni
e58dd3938a install mgr on the node
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-05-03 16:47:14 -07:00
Kefu Chai
da1161cbd8 qa/tasks/ceph_manager: always fix pgp_num when done with thrashosd task
Fixes: http://tracker.ceph.com/issues/19771
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-05-03 18:28:27 +08:00
Patrick Donnelly
63cbe330b7 qa: remove errant mount requirement
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-05-02 18:29:08 -04:00
Patrick Donnelly
6bd58fefb7 mds: use aux subtrees for export pinned inodes
Idea here is that a pinned inode should not be exported when its parent is.
Setting the pinned inode's dirfrags to aux subtrees prevents them from being
merged with a parent subtree.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-05-02 00:30:35 -04:00
Casey Bodley
0e30e3ef01 Merge pull request #14845 from cbodley/wip-rgw-qa-s3tests
qa/rgw: add cluster name to path when s3tests scans rgw log

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2017-05-01 10:49:12 -04:00
Kefu Chai
7424345c77 qa/erasure-code: override min_size to 2
so isa(k=2,m=1) can survive with 1 down OSD.

Fixes: http://tracker.ceph.com/issues/19770
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-04-29 10:43:17 +08:00
Kefu Chai
5f50298025 qa/tasks/rados: add optional setting of "min_size"
this setting only affects the newly created pool

Fixes: http://tracker.ceph.com/issues/19770
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-04-29 10:39:02 +08:00
Casey Bodley
88b6a142bc qa/rgw: fix assertions in radosgw_admin task
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-04-27 19:38:10 -04:00
Casey Bodley
a31aa6f65c qa/rgw: add cluster name to path when s3tests scans rgw log
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-04-27 14:48:40 -04:00
John Spray
d0d3a4a02e Merge pull request #12935 from stiopaa1/17855_evictClient
mds/Server.cc: Don't evict a slow client if...

Reviewed-by: John Spray <john.spray@redhat.com>
2017-04-24 22:10:01 +01:00
John Spray
837a71c0af qa/tasks/cephfs: clean up mount point setup
Previously we were sometimes trying to maintain a mounted
client across a filesystem destroy/create.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-04-24 11:19:55 +01:00
John Spray
16702ff13d Merge pull request #14018 from jcsp/wip-17939
client: getattr before returning quota/layout xattrs

Reviewed-by: Yan, Zheng <zyan@redhat.com>
2017-04-24 11:12:26 +01:00
Michal Jarzabek
1a5cb534d9 mds/Server.cc: Don't evict a slow client if...
... it's the only client

Fixes: http://tracker.ceph.com/issues/17855
Signed-off-by: Michal Jarzabek <stiopa@gmail.com>
2017-04-23 13:31:47 +01:00
Sage Weil
27dd6530a2 Merge pull request #14559 from liewegas/wip-pg-map
mon: move 'pg map' to OSDMonitor

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-04-21 18:53:17 -05:00
Kefu Chai
c237e7ed29 Merge pull request #14232 from jcsp/wip-19412
mgr: fix python module teardown & add tests

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-04-21 22:57:44 +08:00
Sage Weil
069182f91f qa/tasks/ceph_manager: use 'pg map' for get_pg_{primary,replica}
Pulling this out of the 'pg dump' heap is inefficient.
Also, pg dump data comes from the mgr and may be stale.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-04-21 10:56:28 -04:00
Kefu Chai
6fa16c4477 Merge pull request #14584 from tchaikov/wip-19631
qa/suites: Revert "qa/suites: add mon-reweight-min-pgs-per-osd = 4"

Reviewed-by: Sage Weil <sage@redhat.com>
2017-04-21 22:56:21 +08:00
Casey Bodley
a4fc5c38e5 qa/rgw: don't scan radosgw logs for encryption keys on jewel upgrade test
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-04-20 14:49:04 -04:00
John Spray
f695a0e30f qa: s/REQUIRE_MGRS/MGRS_REQUIRED/ for consistency
Signed-off-by: John Spray <john.spray@redhat.com>
2017-04-20 15:00:31 +01:00
John Spray
636fc40d90 qa: additions to mgr.test_failover
Reproducers for recent fixes:
http://tracker.ceph.com/issues/19407
http://tracker.ceph.com/issues/19258

Signed-off-by: John Spray <john.spray@redhat.com>
2017-04-20 15:00:31 +01:00
John Spray
8ea98b4cbf qa: fix vstart_runner --create for mgr tests
Signed-off-by: John Spray <john.spray@redhat.com>
2017-04-20 15:00:31 +01:00
Kefu Chai
e6a436bb27 qa/tasks/ceph_manager: be able to store options with service type
so we are able to change options for services other than mon while
thrashing.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-04-20 14:18:21 +08:00
Kefu Chai
ee653ba87c Merge pull request #14608 from tchaikov/wip-19594
qa/tasks: assert on pg status with a timeout

Reviewed-by: Sage Weil <sage@redhat.com>
2017-04-20 10:49:12 +08:00
Kefu Chai
960032e513 qa/tasks: update tests with helper to wait for pg-stats
and remove unused helpers

Fixes: http://tracker.ceph.com/issues/19594
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-04-20 09:35:05 +08:00
Kefu Chai
1207caf3a2 qa/tasks/ceph_manager: add a "wait_for_pg_stats()" decorator
and accompany it with two helpers to access the pg stats in a more
natural way

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-04-20 09:35:04 +08:00
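
The idea behind 1207caf3a2 and 960032e513, asserting on pg status with a timeout rather than once, can be sketched as a retrying decorator. This is an illustration only: the name retry_on_assertion, its delays and sleep parameters, and the fake check below are assumptions, not the signature of the real wait_for_pg_stats() in qa/tasks/ceph_manager.

```python
import functools
import time


def retry_on_assertion(delays=(1, 2, 4), sleep=time.sleep):
    """Re-run the wrapped check while it raises AssertionError.

    Hypothetical stand-in for a wait-for-stats decorator: after each
    failed attempt it sleeps for the next delay, then makes one final
    attempt whose AssertionError is allowed to propagate.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for delay in delays:
                try:
                    return func(*args, **kwargs)
                except AssertionError:
                    sleep(delay)
            return func(*args, **kwargs)
        return wrapper
    return decorator


calls = {"n": 0}


@retry_on_assertion(delays=(0, 0, 0), sleep=lambda seconds: None)
def check_pgs_clean():
    # Fake stats source: reports the expected state on the third poll.
    calls["n"] += 1
    assert calls["n"] >= 3, "pg stats not yet updated"
    return "active+clean"


result = check_pgs_clean()
print(result, calls["n"])
```

Injecting the sleep function keeps the sketch fast to run; a real test would leave the default and pick delays that match how often stats are reported.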
Josh Durgin
a219319137 qa/tasks/rados: test sparse reads with ec overwrites
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-04-19 17:45:43 -07:00
Josh Durgin
6fba80c1fa osd, OSDMonitor, qa: mark ec overwrites non-experimental
Keep the pool flag around so we can distinguish between a pool that
should maintain hashes for each chunk, and a missing one is a bug, vs
an overwrites pool where we rely on bluestore checksums for detecting
corruption.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-04-19 17:45:43 -07:00
Patrick Donnelly
0b420be7e9 mds: add export_pin feature
This allows the client/admin to pin a directory tree to a particular rank,
preventing its export by the dynamic balancer.

Fixes: http://tracker.ceph.com/issues/17834

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-04-19 18:21:19 -04:00
Sage Weil
ee1bb01a54 Merge pull request #14556 from liewegas/wip-pgupmap
osd: pg-remap -> pg-upmap

Reviewed-by: David Zafman <dzafman@redhat.com>
2017-04-19 17:07:01 -05:00
Zack Cerza
28d746bff3 Merge pull request #14464 from ceph/wip-systemd
qa/tasks: use sudo to check ceph health for systemd test
2017-04-18 11:34:27 -06:00
Sage Weil
ce188e8fdf osd: pg-remap -> pg-upmap
'remap' is too non-specific a name.  In particular, it
sounds like it is related to the 'remapped' PG state
but in reality it is not related.

'upmap' or 'pg-upmap' is more specific: it maps a pgid
to the 'up' set value (or item)

Signed-off-by: Sage Weil <sage@redhat.com>
2017-04-18 12:59:40 -04:00
Casey Bodley
da7acc4211 Merge pull request #13597 from cbodley/wip-s3tests-crypto
qa/rgw: add configuration for server-side encryption tests

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2017-04-18 12:28:37 -04:00
Kefu Chai
1b54b5f3f1 Merge pull request #14415 from smithfarm/wip-19556
tests: Thrasher: handle "OSD has the store locked" gracefully

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-04-18 23:18:35 +08:00
John Spray
033ee6bd1f Merge pull request #14396 from jcsp/wip-19550
qa: re-enable ENOSPC tests for kclient
2017-04-18 12:59:14 +01:00
John Spray
d98e19fdbd Merge pull request #14589 from jcsp/wip-19640
client: refine fsync/close writeback error handling

Reviewed-by: Jeff Layton <jlayton@redhat.com>
2017-04-18 12:58:37 +01:00
John Spray
a2a100dc13 Merge pull request #14272 from jcsp/wip-vstart-fixup
qa: fix test_standby_for_invalid_fscid with vstart_runner

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-04-18 12:50:20 +01:00
John Spray
1a69bec52f client: refine fsync/close writeback error handling
Previously, errors stuck indelibly to the inode, which
meant that a close call would see an error even if the
user already dutifully fsync()'d and handled it.

We should emit each error only once per file handle.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-04-18 07:47:10 -04:00
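
The "once per file handle" rule in 1a69bec52f can be modelled in a few lines. This is a conceptual Python sketch, not the client's C++ code: the Inode and FileHandle classes, the error list, and the choice to ignore errors recorded before a handle was opened are all assumptions made for illustration.

```python
import errno


class Inode:
    """Holds asynchronous writeback errors in arrival order."""

    def __init__(self):
        self.errors = []

    def record_error(self, err):
        self.errors.append(err)


class FileHandle:
    """Reports each inode error at most once through this handle."""

    def __init__(self, inode):
        self.inode = inode
        # Assumption: errors that predate the open are not reported here.
        self.seen = len(inode.errors)

    def _take_error(self):
        pending = self.inode.errors[self.seen:]
        self.seen = len(self.inode.errors)
        return pending[-1] if pending else 0

    def fsync(self):
        return self._take_error()

    def close(self):
        return self._take_error()


inode = Inode()
first = FileHandle(inode)
second = FileHandle(inode)
inode.record_error(-errno.ENOSPC)

first_fsync = first.fsync()    # the error is reported here
first_close = first.close()    # already handled, so no error
second_close = second.close()  # a different handle still sees it once
print(first_fsync, first_close, second_close)
```

Tracking a per-handle position instead of clearing the inode's error means one handle consuming an error does not hide it from another.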
Orit Wasserman
cb94e5ad3f Merge pull request #12535 from ceph/wip-rgw-multisite-teuthology
rgw: multisite enabled over multiple clusters
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
2017-04-18 11:47:48 +03:00
David Zafman
a5731076ad osd: Handle backfillfull_ratio just like nearfull and full
Add BACKFILLFULL as a local OSD cur_state
Notify monitor of this new fullness state

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-04-17 08:00:24 -07:00
John Spray
dd43d3bc64 qa/cephfs: use getfattr/setfattr helpers
Signed-off-by: John Spray <john.spray@redhat.com>
2017-04-14 06:38:48 -04:00
John Spray
61617f8f10 qa: add test for reading quotas from different clients
Fixes: http://tracker.ceph.com/issues/17939
Signed-off-by: John Spray <john.spray@redhat.com>
2017-04-14 06:38:48 -04:00
Sage Weil
5ca72c1193 qa/tasks/exec_on_cleanup.py: add
Signed-off-by: Sage Weil <sage@redhat.com>
2017-04-13 17:11:19 -04:00
Ali Maredia
b31b84529e rgw multisite: use get_config_master_client for radosgw_admin task
Signed-off-by: Ali Maredia <amaredia@redhat.com>
2017-04-13 12:15:50 -04:00
Ali Maredia
c5956790e6 rgw: multisite enabled over multiple clusters
Added '--cluster' to all necessary commands
(e.g. radosgw-admin, rados, ceph) and made sure
the necessary checks were in place so that clients
can be read with or without a cluster_name
preceding them

Made master_client defined in the config for
radosgw-admin task

Signed-off-by: Ali Maredia <amaredia@redhat.com>
2017-04-13 12:15:50 -04:00
Vasu Kulkarni
7af157ad4c use sudo to check health
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-04-11 13:52:26 -07:00
Nathan Cutler
a5b19d2d73 tests: Thrasher: handle "OSD has the store locked" gracefully
On slower machines (VPS, OVH) it takes time for the OSD to go down.

Fixes: http://tracker.ceph.com/issues/19556
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-04-11 16:09:45 +02:00
John Spray
d529121b60 Merge pull request #10636 from fullerdj/wip-djf-15069
cephfs: Permit recovering metadata into a new RADOS pool

Reviewed-by: John Spray <john.spray@redhat.com>
2017-04-10 13:52:04 +01:00
John Spray
fb046b9730 qa/tasks/cephfs: update kernel_mount for debugfs format
Signed-off-by: John Spray <john.spray@redhat.com>
2017-04-09 18:13:29 +01:00
Vasu Kulkarni
73cccd4115 push keys on node using admin command
will test admin command and is now needed due to create-keys change

Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-04-07 12:39:19 -07:00
John Spray
e0833965b6 qa: re-enable ENOSPC tests for kclient
Fixes: http://tracker.ceph.com/issues/19550
Signed-off-by: John Spray <john.spray@redhat.com>
2017-04-07 14:45:30 +01:00
Kefu Chai
24e69d79e7 Merge pull request #14281 from tchaikov/wip-19429
qa/tasks/workunit.py: use "overrides" as the default settings of workunit

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-04-05 10:01:27 +08:00
Douglas Fuller
37bafff9f4 qa/cephfs: Add test for rebuilding into an alternate metadata pool
Add a test to validate the ability of cephfs_data_scan and friends to
recover metadata from a damaged CephFS installation into a fresh metadata
pool.

cf: http://tracker.ceph.com/issues/15068
cf: http://tracker.ceph.com/issues/15069
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2017-04-04 12:29:01 -07:00
Casey Bodley
9730fec922 qa: s3test task scans radosgw logs for leaked encryption keys
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-04-03 10:44:58 -04:00
John Spray
13e8315d1a Merge pull request #13862 from jcsp/wip-16523
qa, mds: add checks for fragmentation, and enable it by default
2017-04-03 11:56:37 +01:00
Kefu Chai
47080150a1 qa/tasks/workunit.py: use "overrides" as the default settings of workunit
otherwise the settings in "workunit" tasks are always overridden by the
settings in template config. so we'd better follow the way of how
"install" task updates itself with the "overrides" settings: it uses the
"overrides" as the *defaults*.

Fixes: http://tracker.ceph.com/issues/19429
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-04-02 12:26:30 +08:00
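
The precedence described in 47080150a1, with "overrides" acting as defaults that the task's own settings are layered over, can be sketched with a recursive merge. The helper names deep_merge and effective_config and the sample keys are hypothetical; the actual workunit task may merge differently.

```python
import copy


def deep_merge(base, update):
    """Return a copy of base with update merged in; update wins on conflicts."""
    merged = copy.deepcopy(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def effective_config(task_config, overrides):
    # Overrides supply the defaults; explicit task settings take precedence.
    return deep_merge(overrides, task_config)


# Hypothetical sample data for illustration.
overrides = {"branch": "master", "env": {"DEBUG": "1"}}
task_config = {"branch": "jewel", "clients": {"all": ["rados/test.sh"]}}

config = effective_config(task_config, overrides)
print(config)
```

Swapping the argument order in effective_config reproduces the behavior the commit message complains about, where the template's settings always win.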
vasukulkarni
574049a90b Merge pull request #14229 from ceph/wip-systemd
qa: Add reboot case for systemd test
2017-03-31 09:15:53 -07:00
John Spray
992b8499d0 Merge pull request #14254 from idryomov/wip-vstart-runner-ps
qa/vstart_runner: amend ps invocation

Reviewed-by: John Spray <john.spray@redhat.com>
2017-03-31 17:15:30 +01:00
John Spray
bf39f561e9 qa: fix test_standby_for_invalid_fscid with vstart_runner
Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-31 12:13:57 -04:00
Kefu Chai
9ca7ccf5f1 tasks/workunit.py: specify the branch name when cloning a branch
c1309fb failed to specify a branch when cloning using --depth=1, which
by default clones the HEAD. and we can not "git checkout" a specific
sha1 if it is not HEAD, after cloning using '--depth=1', so in this
change, we dispatch "tag", "branch", "HEAD" using three Refspec classes.

Signed-off-by: Kefu Chai <kchai@redhat.com>
Signed-off-by: Dan Mick <dan.mick@redhat.com>
2017-03-30 20:30:09 -07:00
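
The dispatch described in 9ca7ccf5f1 can be sketched as three small classes that each build a shallow-clone command. The class layout, the refspec_for helper, and the "tag"/"branch" config keys are assumptions for illustration; only the git options (clone, --depth=1, --branch) are real, and --branch accepts tag names as well as branch names.

```python
class Refspec:
    """Base class: a named ref that a shallow clone can target directly."""

    def __init__(self, refspec):
        self.refspec = refspec

    def clone(self, git_url, clonedir):
        return ["git", "clone", "--depth=1", "--branch", self.refspec,
                git_url, clonedir]


class Branch(Refspec):
    pass


class Tag(Refspec):
    pass


class Head(Refspec):
    """No ref requested: a shallow clone of the remote's default HEAD."""

    def __init__(self):
        super().__init__("HEAD")

    def clone(self, git_url, clonedir):
        return ["git", "clone", "--depth=1", git_url, clonedir]


def refspec_for(config):
    # Hypothetical config keys; a tag takes priority over a branch here.
    if "tag" in config:
        return Tag(config["tag"])
    if "branch" in config:
        return Branch(config["branch"])
    return Head()


url = "https://example.invalid/ceph.git"  # placeholder URL
branch_cmd = refspec_for({"branch": "jewel"}).clone(url, "clone")
tag_cmd = refspec_for({"tag": "v10.2.0"}).clone(url, "clone")
head_cmd = refspec_for({}).clone(url, "clone")
print(branch_cmd, tag_cmd, head_cmd, sep="\n")
```

Naming the ref at clone time matters because, as the commit message notes, a depth-1 clone of HEAD does not contain other commits to check out afterwards.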
Sage Weil
578b0f7cfc Merge pull request #13617 from liewegas/wip-mgr-commands
mon,mgr: tag some commands for ceph-mgr

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-03-30 17:12:00 -05:00
Ilya Dryomov
8d8cd4e4d5 qa/vstart_runner: amend ps invocation
"ps -xwwu<id>" is parsed as BSD, because -x is not a UNIX option.
"u" is a BSD option for user-oriented format, so the <id> ends up being
parsed as an old-style "select by pid".  The only reason this command
doesn't dump other user's processes is that the BSD "only yourself"
restriction is in effect.

I'm not sure what's wrong with a simple "ps xww", but if we want to
select by euid, let's do it right.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-03-30 19:36:43 +02:00
Vasu Kulkarni
7b587304a5 Add reboot case for systemd test
test systemd units restart after reboot

Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-03-29 10:30:49 -07:00
Sage Weil
5dc9b8d026 qa/tasks/dump_stuck.py: stop making assertions about 'health' report
Health comes from the mon, while the pg stats come from the mgr, so they
may be out of sync.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-03-29 11:39:27 -04:00
Sage Weil
fa0b2164ad qa/tasks/ceph.py: add 'skip_mgr_daemons' option
For upgrades

Signed-off-by: Sage Weil <sage@redhat.com>
2017-03-29 11:39:26 -04:00
Sage Weil
7edca203d8 qa/tasks/ceph.py: give everyone mgr caps
Signed-off-by: Sage Weil <sage@redhat.com>
2017-03-29 11:39:26 -04:00
Dan Mick
c1309fbef3 tasks/workunit.py: when cloning, use --depth=1
Help avoid killing git.ceph.com.  A depth 1 clone takes about
7 seconds, whereas a full one takes about 3:40 (much of it
waiting for the server to create a huge compressed pack)

Signed-off-by: Dan Mick <dan.mick@redhat.com>
2017-03-28 20:09:44 -07:00
John Spray
e90e37690a qa/tasks: add check_counter.py
We need this for CephFS, to verify that workloads
we expect to do a particular thing (like directory fragmentation
or metadata exports) are really doing it.

This is for giving us confidence in our coverage of these
features rather than testing them per se.

Fixes: http://tracker.ceph.com/issues/16523
Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-28 23:26:34 +01:00
Sage Weil
2a08cbbed5 qa/tasks/thrashosds,ceph_manager: thrash pg_remap[_items]
Signed-off-by: Sage Weil <sage@redhat.com>
2017-03-28 10:12:10 -04:00
Casey Bodley
e3e3a71d1f qa: rgw task uses period instead of region-map
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2017-03-20 11:50:03 -04:00
Kefu Chai
bd36f13163 doc: fix the links to http://ceph.com/docs
they should point to http://docs.ceph.com/docs/master/.. instead

Fixes: http://tracker.ceph.com/issues/19090
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-03-15 16:40:07 +08:00
Yehuda Sadeh
515db13970 qa/tasks/radosgw_admin: adjust test to new bucket structure
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2017-03-09 09:18:56 -08:00
John Spray
41f8ded3e7 qa: update TestDamage for PurgeQueue
Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:27:03 +00:00
John Spray
1a1951002d qa: update TestFlush for changed stray perf counters
Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:27:03 +00:00
John Spray
6cf9c2956c qa: add TestStrays.test_purge_queue_op_rate
For ensuring that the PurgeQueue code is not generating
too many extra IOs.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:27:02 +00:00
John Spray
3e66de2182 mds: create purge queue if it's not found
Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:26:59 +00:00
John Spray
f826c7e8aa qa/cephfs: add TestStrays.test_purge_on_shutdown
...and change test_migration_on_shutdown to
specifically target non-purgeable strays (i.e.
hardlink-ish things).

Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:26:55 +00:00
John Spray
3970502c9b qa: update test_strays for purgequeue
Signed-off-by: John Spray <john.spray@redhat.com>
2017-03-08 10:20:59 +00:00
Sage Weil
7fbe8fb085 Merge pull request #13759 from liewegas/wip-19133
osdc/Objecter: resend RWORDERED ops on full

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2017-03-07 21:31:50 -06:00
Sage Weil
296708091c qa/tasks/ceph_manager: use new luminous set-full-ratio etc
Signed-off-by: Sage Weil <sage@redhat.com>
2017-03-07 16:39:09 -05:00
Sage Weil
a202b68d18 qa/tasks/thrashosds: chance_thrash_cluster_full
Induce a momentarily full cluster.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-03-07 13:33:44 -05:00
Radoslaw Zarzynski
6440750f53 qa/tasks/rgw.py: start Apache before RadosGW.
At the end of start_rgw() we wait till establishing HTTP connections
with RadosGW become possible. However, if RadosGW uses the FastCGI,
the condition can't be fulfilled without spawning HTTP server first.

Signed-off-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
2017-03-07 17:31:52 +01:00
John Spray
73100305e5 Merge pull request #13262 from batrick/multimds-thrasher
Add multimds:thrash sub-suite and fix bugs in thrasher for multimds

Reviewed-by: John Spray <john.spray@redhat.com>
2017-03-07 14:29:18 +00:00
John Spray
39204abeda Merge pull request #13282 from jcsp/wip-fuse-mount-teardown
tasks/cephfs: tear down on mount() failure

Reviewed-by: Yan, Zheng <zyan@redhat.com>
2017-02-28 15:04:59 +00:00
Kefu Chai
edceabbd47 qa/tasks/workunit: use ceph.git as an alternative of ceph-ci.git for workunit repo
if we run upgrade test, where, for example, "jewel" is not in
ceph-ci.git repo, we should check ceph.git to clone the workunits.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-27 17:36:05 +08:00
Sage Weil
af5dab0613 Merge pull request #13649 from liewegas/wip-ceph-scrub-debug
qa/tasks/ceph.py: debug which pgs aren't scrubbing

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
2017-02-25 13:15:06 -06:00
Sage Weil
f777d849e7 qa/tasks/ceph.py: debug which pgs aren't scrubbing
Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-24 23:07:34 -05:00
Samuel Just
44b26f6ab4 Merge pull request #13594 from athanatos/wip-snap-trim-sleep
osd: add snap trim reservation and re-implement osd_snap_trim_sleep

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-02-24 14:09:17 -08:00
Kefu Chai
4cf28de4c9 qa/tasks/workunit: use the suite repo for cloning workunit
as "workunits" reside in ceph/qa/workunits, it's more intuitive to
respect suite-repo option when cloning workunits.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-24 16:47:47 +08:00
John Spray
de5249436c Merge pull request #13359 from jcsp/wip-logrotate-sshexception
qa: handle SSHException in logrotate

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-02-22 10:05:07 +00:00
Kefu Chai
b3e516fc38 Merge pull request #13518 from tchaikov/wip-fix-pgp-num
test: Thrasher: do not update pools_to_fix_pgp_num if nothing happens

Reviewed-by: Sage Weil <sage@redhat.com>
2017-02-21 00:46:26 +08:00
Kefu Chai
c0f0cde399 test: Thrasher: do not update pools_to_fix_pgp_num if nothing happens
we should not update pools_to_fix_pgp_num if the pool is not expanded or
the pg_num is not increased due to pgs being created. this prevents us
from fixing the pgp_num after done with thrashing if we actually did
nothing when fixing the pgp_num when thrashing, but we removed the pool
from pools_to_fix_pgp_num after set_pool_pgpnum() returns.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-19 13:10:46 +08:00
Sage Weil
86c0d07e32 qa/tasks/ceph.py: fix timing of wait-for-* and osd markdown
Mark down osds, *then* wait for them to come up or for the cluster to be
healthy!

Signed-off-by: Sage Weil <sage@redhat.com>
2017-02-18 21:12:23 -05:00
Sage Weil
96bc86b537 Revert "qa/tasks/workunit: use the suite repo for cloning workunit"
2017-02-17 11:54:27 -06:00
Kefu Chai
1f82b9b944 qa/tasks/workunit: use the suite repo for cloning workunit
as "workunits" reside in ceph/qa/workunits, it's more intuitive to
respect suite-repo option when cloning workunits.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-16 15:05:51 +08:00
Samuel Just
4aebf59d90 rados: check that pool is done trimming before removing it
Signed-off-by: Samuel Just <sjust@redhat.com>
2017-02-13 09:47:02 -08:00
Kefu Chai
de59b5102c test: Thrasher: restore changed options after done with thrash
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-13 09:25:51 +08:00
Kefu Chai
761a1dc391 tests: Thrasher: extract _set_config() method
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-13 09:25:50 +08:00
Kefu Chai
995e144e3e tests: CephManager: add get_config() method
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-13 09:25:50 +08:00
Kefu Chai
136483a8f9 test: Thrasher: update pgp_num of all expanded pools if not yet
otherwise wait_until_healthy will fail after timeout as seeing warning
like:

HEALTH_WARN pool cephfs_data pg_num 182 > pgp_num 172

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-13 09:25:50 +08:00
John Spray
880cbf09aa Merge pull request #13137 from jcsp/wip-18661
qa: fix race in Mount.open_background

Reviewed-by: Yan, Zheng <zyan@redhat.com>
2017-02-10 17:48:05 +00:00
John Spray
a3fd3f225c Merge pull request #13099 from jcsp/wip-18663
qa/tasks: force umount during kclient teardown
2017-02-10 17:42:37 +00:00
John Spray
6f9e11f03d qa: handle SSHException in logrotate
Yet another different type of exception we may get when
orchestra.run can't talk to a remote host.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-02-10 17:16:24 +00:00
Nathan Cutler
6b7443fb50 tests: drop buildpackages.py
The buildpackages suite has been moved to teuthology. This cleans up a file
that was left behind by https://github.com/ceph/ceph/pull/13297

Fixes: http://tracker.ceph.com/issues/18846
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-02-08 21:23:54 +01:00
Loic Dachary
5a43f8d579 buildpackages: remove because it does not belong
It should live in teuthology, not in Ceph. And it is currently broken:
there is no need to keep it around.

Fixes: http://tracker.ceph.com/issues/18846

Signed-off-by: Loic Dachary <loic@dachary.org>
2017-02-07 18:37:26 +01:00
John Spray
6203f33df4 tasks/cephfs: tear down on mount() failure
There were some cases where we would leave a mountpoint
that would cause the teuthology teardown to get hung up
when it tried to look inside cephtest/

Signed-off-by: John Spray <john.spray@redhat.com>
2017-02-06 22:53:21 +00:00
Patrick Donnelly
d748226f00 qa: add DaemonWatchdog to stop tests on failure
Thrashing MDS will often result in failures which often do not stop the
test. The failure may also cause the test to stall which will force the
machines to needlessly be locked until a timeout is reached. This
watchdog will unmount mounts and kill daemons when a failure is
detected.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-02-06 14:07:14 -05:00
Patrick Donnelly
f005e8af6b qa: disable max_mds changes during thrashing
While the thrasher supports the behavior desired by issue 10792 [1], the
bugs uncovered due to deactivating MDS (and sometimes killing
deactivating MDS) are presently a distraction from addressing issues
during normal failures. So now thrashing max_mds is turned off by
default. I have added a TODO to deactivate ranks in order (configurably)
as random deactivation causes a lot of other problems.

This also fixes a bug: random.randrange(0.0, 1.0) always returns 0.
Oops.

[1] http://tracker.ceph.com/issues/10792

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-02-06 14:07:14 -05:00
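
A minimal sketch of the random.randrange pitfall mentioned in f005e8af6b. It uses integer bounds, since recent Python versions reject float arguments to randrange. The helper name should_thrash is hypothetical and is not taken from the thrasher code.

```python
import random


def should_thrash(probability, rng=random):
    """Return True with the given probability (0.0 to 1.0)."""
    # random.random() draws a float uniformly from [0.0, 1.0), which is
    # the right primitive for a probability check.
    return rng.random() < probability


# randrange(0, 1) picks from range(0, 1), which contains only 0, so
# comparing its result against a probability is never a random decision.
always_zero = {random.randrange(0, 1) for _ in range(1000)}
```

With random.random(), a probability of 0.0 never fires and a probability of 1.0 always fires, which is the behavior a thrasher option such as a per-iteration chance would need.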
Patrick Donnelly
82662edd7f qa: do not pretty the json to shorten stdout log
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-02-06 14:07:14 -05:00
Patrick Donnelly
a0052fc2d6 qa: use gevent.sleep so greenlet yields
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-02-06 14:07:14 -05:00
Patrick Donnelly
cf9e0da078 qa: use fs methods for setting configs
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-02-06 14:07:13 -05:00
Patrick Donnelly
0098873fb7 qa: remove old comment
Filesystem is now cluster aware.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-02-06 14:07:13 -05:00
Patrick Donnelly
fd4b61890d qa: allow revived MDS to be up:active
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-02-06 14:07:13 -05:00
Patrick Donnelly
884215d933 qa: timeout waiting for thrashed MDS to revive
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-02-06 14:07:13 -05:00
Patrick Donnelly
8e9ea7b6ac qa: configure thrashing while MDS are stopping
Currently multimds is prone to many failures when killing an active or
stopping MDS when there are MDS in the cluster which have been
deactivated (stopping). Have this turned off by default for now.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-02-06 14:07:13 -05:00
Patrick Donnelly
6304b6ed5d qa: add deactivation log message
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-02-06 14:07:13 -05:00
Patrick Donnelly
1185326c45 qa: avoid infinite wait if no repl. can be made
The thrasher can enter an infinite loop waiting for an MDS to take a
certain rank when a replacement may not be possible. For example,
max_mds actives are already running.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-02-06 14:07:12 -05:00
Patrick Donnelly
638bccb2bb qa: timeout thrasher if fs does not stabilize
After 5 minutes of waiting, it's reasonable to stop as the cluster is
probably stuck.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-02-06 14:07:12 -05:00
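
The bounded wait described in 638bccb2bb could look roughly like the sketch below. The function name wait_until and its parameters are assumptions made for illustration; only the five-minute limit comes from the commit message.

```python
import time


def wait_until(condition, timeout=300, interval=5,
               sleep=time.sleep, clock=time.monotonic):
    """Poll condition() until it is true or timeout seconds have passed."""
    deadline = clock() + timeout
    while not condition():
        if clock() >= deadline:
            # Give up instead of looping forever on a stuck cluster.
            raise RuntimeError("timed out after %s seconds" % timeout)
        sleep(interval)
    return True
```

Passing sleep and clock as parameters is a design choice of this sketch, so that the timeout path can be exercised without actually waiting five minutes.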
Patrick Donnelly
8f3e745344 qa: check replacement MDS is active in thrasher
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-02-06 14:07:12 -05:00
Patrick Donnelly
19289725c8 qa: handle thrashing ranks with holes
During the course of thrashing max_mds, the ranks assigned to MDSs may
develop holes. This causes the thrasher to try to wrongly deactivate
ranks that are not assigned.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2017-02-06 14:07:12 -05:00
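
A small sketch of the "ranks with holes" problem from 19289725c8, assuming the set of assigned ranks is available as plain integers. The function pick_assigned_rank is hypothetical; the real thrasher reads ranks from the MDS map.

```python
import random


def pick_assigned_rank(assigned_ranks, rng=random):
    """Choose a rank to act on from the ranks that are actually assigned."""
    ranks = sorted(assigned_ranks)
    if not ranks:
        return None
    return rng.choice(ranks)


# After some thrashing, rank 1 may be unassigned while rank 3 is in use.
assigned = {0, 2, 3}

# Assuming ranks are contiguous yields 0, 1, 2: it includes the hole
# (rank 1) and misses rank 3.
naive = set(range(len(assigned)))
```

Selecting from the assigned set rather than from a contiguous range avoids targeting a rank that no MDS holds.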
Nathan Cutler
db2582e25e tests: fix regression in qa/tasks/ceph_manager.py
https://github.com/ceph/ceph/pull/13194 introduced a regression:

2017-02-06T16:14:23.162 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 722, in wrapper
    return func(self)
  File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 839, in do_thrash
    self.choose_action()()
  File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 305, in kill_osd
    output = proc.stderr.getvalue()
AttributeError: 'NoneType' object has no attribute 'getvalue'

This is because the original patch failed to pass "stderr=StringIO()" to run().

Fixes: http://tracker.ceph.com/issues/16263
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-06 19:37:38 +01:00
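
The failure mode in db2582e25e (reading stderr from a process whose stderr was never captured) can be reproduced with the standard library. This sketch uses subprocess as a stand-in; it is not the teuthology run() API, which takes stderr=StringIO() as described above.

```python
import subprocess
import sys

# A child process that writes a short message to stderr.
cmd = [sys.executable, "-c", "import sys; sys.stderr.write('oops')"]

# Without asking for capture, the result carries no stderr object,
# so calling a method on it would raise AttributeError on None.
uncaptured = subprocess.run(cmd)

# Requesting capture explicitly makes the output available afterwards.
captured = subprocess.run(cmd, stderr=subprocess.PIPE)
```

Here uncaptured.stderr is None while captured.stderr holds the bytes written by the child, which mirrors the 'NoneType' object has no attribute 'getvalue' error in the traceback.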
Sage Weil
5fc3dd36e2 Merge pull request #13237 from smithfarm/wip-18799
tests: Thrasher: eliminate a race between kill_osd and __init__

Reviewed-by: Sage Weil <sage@redhat.com>
2017-02-05 12:49:30 -06:00
Josh Durgin
21cdcfcc66 Merge pull request #13194 from smithfarm/wip-16263
tests: ignore bogus ceph-objectstore-tool error in ceph_manager

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: David Zafman <dzafman@redhat.com>
2017-02-02 15:31:29 -08:00
Nathan Cutler
b519d38fb1 tests: Thrasher: eliminate a race between kill_osd and __init__
If Thrasher.__init__() spawns the do_thrash thread before initializing the
ceph_objectstore_tool property, do_thrash races with the rest
of Thrasher.__init__() and in some cases do_thrash can call kill_osd() before
Thrasher.__init__() progresses much further. This can lead to an exception
("AttributeError: Thrasher instance has no attribute 'ceph_objectstore_tool'")
being thrown in kill_osd().

This commit eliminates the race by making sure the ceph_objectstore_tool
attribute is initialized before the do_thrash thread is spawned.

Fixes: http://tracker.ceph.com/issues/18799
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-02-02 23:23:54 +01:00
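
The ordering fix in b519d38fb1 follows a general pattern: set every attribute the background worker reads before spawning it. The sketch below uses threading and a hypothetical Worker class; the real Thrasher is structured differently and only the ordering idea is taken from the commit message.

```python
import threading


class Worker:
    def __init__(self):
        # Initialize all state that run() depends on first...
        self.tool_enabled = True
        self.results = []
        # ...and only then start the thread that uses it. Starting the
        # thread earlier would let run() see a half-built object.
        self.thread = threading.Thread(target=self.run)
        self.thread.start()

    def run(self):
        self.results.append(self.tool_enabled)

    def join(self):
        self.thread.join()
```

If the thread were started at the top of __init__ instead, run() could execute before tool_enabled exists and fail with an AttributeError like the one quoted above.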
John Spray
3c9f16d8ab tasks/kclient: apply timeout to umount
The umount process can get stuck, in which case
we want to fail the test rather than waiting around for it.

During teardown of the kclient task catch this
timeout explicitly so that we will powercycle the node if
needed.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-02-02 15:09:48 +00:00
Mykola Golub
93f7b5ef3f Merge pull request #13158 from dillaman/wip-18594
qa: integrate OpenStack 'gate-tempest-dsvm-full-devstack-plugin-ceph'

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
2017-02-02 08:27:49 +02:00
John Spray
a027dba78f tasks/cephfs: switch open vs. write in test_open_inode
Do the write after opening the file, so that we get good
behaviour wrt the change in Mount.open_background that uses
file existence to confirm that the open happened.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-02-01 00:38:08 +00:00
John Spray
7f7f44ea5c qa/tasks: force umount during kclient teardown
Previously we could readily end up hanging on teardown
when something had gone wrong with umount.  Forcing
is a big hammer (umount_wait will power cycle the node
if umount isn't working), so if we had to do that
then raise an exception to indicate that something
was wrong with the test.

Fixes: http://tracker.ceph.com/issues/18663
Signed-off-by: John Spray <john.spray@redhat.com>
2017-02-01 00:26:59 +00:00
John Spray
d4f6385b85 Merge pull request #12800 from jcsp/wip-vstart-qasuite
Improve vstart_runner to (optionally) create its own cluster

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2017-01-31 02:02:49 +01:00
Nathan Cutler
046e873026 tests: ignore bogus ceph-objectstore-tool error in ceph_manager
Fixes: http://tracker.ceph.com/issues/16263
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-01-31 00:49:05 +01:00
Jason Dillaman
ce675383b3 qa/tasks/qemu: allow tests to customize the number of CPUs
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-01-26 14:18:48 -05:00
Jason Dillaman
42e967f0bb qa/tasks/qemu: copy ceph configuration to VM image
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-01-26 14:17:43 -05:00
Jason Dillaman
d98aa1a39a qa/tasks/qemu: attach all disks as rbd block devices
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-01-26 14:17:30 -05:00
Jason Dillaman
67a4a6c519 qa/tasks/qemu: support overriding the cloud image
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-01-26 14:16:16 -05:00
Jason Dillaman
454348004b qa/tasks/qemu: support arbitrary additions to cloud-init-archive
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-01-26 14:16:10 -05:00
John Spray
c6d91dd912 qa: fix race in Mount.open_background
Previously a later remote call could end up executing
before the remote python program in open_background
had actually got as far as opening the file.

Fixes: http://tracker.ceph.com/issues/18661
Signed-off-by: John Spray <john.spray@redhat.com>
2017-01-26 16:48:58 +00:00
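
One way to close the race described in c6d91dd912 is to wait until the background program has visibly opened the file before returning, using file existence as the signal (as a027dba78f mentions). The following is a local sketch with subprocess; the function body, timeout and polling interval are assumptions, and the real Mount.open_background runs the program on a remote host.

```python
import os
import subprocess
import sys
import time


def open_background(path, timeout=30):
    """Start a process that holds path open; return once the file exists."""
    script = "import sys, time; f = open(sys.argv[1], 'w'); time.sleep(3600)"
    proc = subprocess.Popen([sys.executable, "-c", script, path])
    deadline = time.monotonic() + timeout
    # Opening with mode 'w' creates the file, so its existence confirms
    # that the child has reached the open() call.
    while not os.path.exists(path):
        if proc.poll() is not None or time.monotonic() > deadline:
            proc.kill()
            proc.wait()
            raise RuntimeError("background open of %s did not happen" % path)
        time.sleep(0.1)
    return proc
```

This only works if the file does not exist beforehand, which is why a test using this approach has to do its write after the open rather than before it.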
Michal Jarzabek
052c3d3f68 mon/MDSMonitor.cc:refuse fs new on pools with obj
Fixes: http://tracker.ceph.com/issues/11124
Signed-off-by: Michal Jarzabek <stiopa@gmail.com>
2017-01-23 19:48:53 +00:00
John Spray
fe219df2a2 qa: update vstart_runner docstring
...to use paths pointing to ceph tree, not
ceph-qa-suite tree.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-01-19 06:30:20 +01:00
John Spray
549d993d3f qa: update remaining ceph.com to download.ceph.com
Fixes: http://tracker.ceph.com/issues/18574
Signed-off-by: John Spray <john.spray@redhat.com>
2017-01-17 17:14:50 +01:00
Jason Dillaman
6d17befb3b qa/tasks/qemu: update default image url after ceph.com redesign
Fixes: http://tracker.ceph.com/issues/18542
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-01-16 22:12:51 -05:00
Alfredo Deza
7172b55ad9 Merge pull request #12892 from ceph/wip-cd-fs-fix
qa/tasks/ceph-deploy: use the new create option during instantiation

Reviewed-by: Alfredo Deza <adeza@redhat.com>
2017-01-13 16:06:24 -05:00
John Spray
1e62467d09 Merge pull request #12833 from ukernel/wip-18396
tasks/cephfs: fix kernel force umount

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2017-01-13 11:20:00 +00:00
John Spray
2076cda04a Merge pull request #12749 from ukernel/wip-18179
mds: propagate error encountered during opening inode by number

Reviewed-by: John Spray <john.spray@redhat.com>
2017-01-13 11:18:59 +00:00
Yan, Zheng
6526ecc084 qa/tasks: add test_open_ino_errors
Validate that errors encountered during opening inos are properly
propagated

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2017-01-12 20:15:53 +08:00
Vasu Kulkarni
2d4ed95f2b use the create option during instantiation
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-01-10 15:43:12 -08:00
Alfredo Deza
ebb02c8ef5 Merge pull request #12867 from ceph/wip-ceph-deploy-workaround
qa/tasks/ceph-deploy: create-keys explicitly

Reviewed-by: Alfredo Deza <adeza@redhat.com>
2017-01-10 15:47:26 -05:00
Vasu Kulkarni
2d6c3fa8b2 Add ceph-create-keys to explicitly create admin/bootstrap keys
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-01-09 17:14:33 -08:00
Yan, Zheng
4cdeeaac10 qa/tasks/cephfs: fix kernel force umount
Fixes: http://tracker.ceph.com/issues/18396
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2017-01-10 08:31:25 +08:00
John Spray
6542a2e0d0 Merge pull request #12588 from jcsp/wip-18311
mds: check for errors decoding backtraces

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2017-01-09 11:02:32 +00:00
Nathan Cutler
74689df754 tests: subst branch and repo in qa/tasks/qemu.py
References: http://tracker.ceph.com/issues/18440
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-01-07 22:49:54 +01:00
Nathan Cutler
56e37e41f4 tests: subst repo name in qa/tasks/cram.py
Inspired by bcbe45d948

Fixes: http://tracker.ceph.com/issues/18440
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-01-07 13:40:06 +01:00
John Spray
aa01f44022 qa: enable cluster creation in vstart_runner
Convenient when you want to create a fresh cluster
each test run: just pass --create and you'll get
a cluster with the right number of daemons for
the tests you're running.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-01-05 13:43:40 +00:00
John Spray
5d945fb71e qa/vstart_runner: more robust stop() on daemons
Previously this could get hung up if we killed one
PID and then the daemon reappears with a different
one (perhaps because we caught it during
daemonization?)

Signed-off-by: John Spray <john.spray@redhat.com>
2017-01-05 13:43:39 +00:00
John Spray
081038ef53 qa: fix vstart_runner tasks import
Instead of hunting around the filesystem for
ceph-qa-suite, get it from our own location.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-01-05 13:43:39 +00:00
John Spray
5f6cdab80f qa/tasks: add test_corrupt_backtrace
Validate that we get EIO and a damage table entry
when seeing a decode error on a backtrace.

Signed-off-by: John Spray <john.spray@redhat.com>
2017-01-05 13:41:59 +00:00
Sage Weil
2861a2188a Merge pull request #12630 from liewegas/wip-workunit-retry
qa/tasks/workunit: clear clone dir before retrying checkout

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2016-12-23 08:12:35 -06:00
Sage Weil
2a7013cd5a qa/tasks/workunit: clear clone dir before retrying checkout
If we checkout ceph-ci.git, and don't find a branch,
we'll try again from ceph.git. But the checkout will
already exist and the clone will fail, so we'll still
fail to find the branch.

The same can happen if a previous workunit task already
checked out the repo.

Fix by removing the repo before checkout (the first and
second times).  Note that this may break if there are
multiple workunit tasks running in parallel on the same
role.  That is already racy, so if it's happening, we'll
want to switch to using a truly unique clonedir for each
instantiation.

Fixes: http://tracker.ceph.com/issues/18336
Signed-off-by: Sage Weil <sage@redhat.com>
2016-12-22 13:05:22 -05:00
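
The retry logic described in 2a7013cd5a can be sketched as follows. The function checkout_with_fallback and the clone callable are hypothetical stand-ins for the workunit task's git handling; the point shown is only that the clone directory is removed before every attempt.

```python
import shutil


def checkout_with_fallback(clone, repos, branch, clonedir):
    """Try each repo in order, clearing clonedir before every attempt.

    clone(repo, branch, clonedir) is a caller-supplied function that
    raises RuntimeError when the branch cannot be checked out.
    """
    for repo in repos:
        # A leftover directory from a failed attempt, or from an earlier
        # task, would make the next clone fail, so remove it first.
        shutil.rmtree(clonedir, ignore_errors=True)
        try:
            clone(repo, branch, clonedir)
            return repo
        except RuntimeError:
            continue
    raise RuntimeError("branch %s not found in %s" % (branch, repos))
```

As the commit message notes, removing a shared directory is not safe when several tasks use the same path in parallel; a unique clonedir per task would avoid that.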
Sage Weil
e1781dd573 qa/tasks/peer: update task based on current peering behavior
This changed in 0be3f5f72e.

Fixes: http://tracker.ceph.com/issues/18330
Signed-off-by: Sage Weil <sage@redhat.com>
2016-12-22 08:40:45 -05:00
Sage Weil
c922404a03 qa/tasks/osd_backfill.py: wait for osd.[12] to start
...before sending a tell command.  Otherwise osd.2 might
start without 1, the io unblocks, and the tell fails
because osd.1 is still down.

Fixes: http://tracker.ceph.com/issues/18303
Signed-off-by: Sage Weil <sage@redhat.com>
2016-12-19 21:56:11 -05:00
Sage Weil
72d73b8c88 qa/tasks/workunit: retry on ceph.git if checkout fails
Signed-off-by: Sage Weil <sage@redhat.com>
2016-12-16 15:06:16 -05:00
Vasu Kulkarni
9f04a7b32e use dev option instead of dev-commit
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2016-12-15 14:11:00 -08:00
Sage Weil
6bb3a037e5 Merge pull request #12511 from liewegas/wip-workunits
qa/workunits/rbd: fix

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Mykola Golub <mgolub@mirantis.com>
2016-12-15 14:15:31 -06:00
Sage Weil
c6698c95b8 Merge pull request #12508 from liewegas/wip-qa-admin-socket
qa/tasks/admin_socket: subst in repo name
2016-12-15 13:53:10 -06:00
Sage Weil
27b8eac249 qa/tasks/workunit.py: add CEPH_BASE env var
Root of git checkout

Signed-off-by: Sage Weil <sage@redhat.com>
2016-12-15 13:52:03 -05:00
Sage Weil
4602884ab8 qa/tasks/workunit: leave workunits inside git checkout
Signed-off-by: Sage Weil <sage@redhat.com>
2016-12-15 13:52:03 -05:00
Sage Weil
bcbe45d948 qa/tasks/admin_socket: subst in repo name
It is either ceph.git or ceph-ci.git.

Signed-off-by: Sage Weil <sage@redhat.com>
2016-12-15 13:35:02 -05:00
Samuel Just
ae40602c14 Merge remote-tracking branch 'ceph-qa-suite/master' into wip-18113-qa
2016-12-14 16:05:35 -08:00