31 Commits

Author SHA1 Message Date
Kefu Chai
8abc6e1bea qa/tasks/rebuild_mondb: update to address ceph-mgr changes
- revive ceph-mgr after updating the keyring cap
- grant "mgr:allow *" to client.admin
- minor refactors

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-05-28 09:59:50 +08:00
Sage Weil
5ab996ab3c qa/tasks/ceph_manager: 'ceph $service tell ...' is obsolete
This died forever ago; no need for the fallback here.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-05-23 22:53:53 -04:00
Kefu Chai
da1161cbd8 qa/tasks/ceph_manager: always fix pgp_num when done with thrashosd task
Fixes: http://tracker.ceph.com/issues/19771
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-05-03 18:28:27 +08:00
Sage Weil
27dd6530a2 Merge pull request #14559 from liewegas/wip-pg-map
mon: move 'pg map' to OSDMonitor

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-04-21 18:53:17 -05:00
Sage Weil
069182f91f qa/tasks/ceph_manager: use 'pg map' for get_pg_{primary,replica}
Pulling this out of the 'pg dump' heap is inefficient.
Also, pg dump data comes from the mgr and may be stale.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-04-21 10:56:28 -04:00
Kefu Chai
6fa16c4477 Merge pull request #14584 from tchaikov/wip-19631
qa/suites: Revert "qa/suites: add mon-reweight-min-pgs-per-osd = 4"

Reviewed-by: Sage Weil <sage@redhat.com>
2017-04-21 22:56:21 +08:00
Kefu Chai
e6a436bb27 qa/tasks/ceph_manager: be able to store options with service type
so we are able to change options for services other than mon while
thrashing.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-04-20 14:18:21 +08:00
Kefu Chai
ee653ba87c Merge pull request #14608 from tchaikov/wip-19594
qa/tasks: assert on pg status with a timeout

Reviewed-by: Sage Weil <sage@redhat.com>
2017-04-20 10:49:12 +08:00
Kefu Chai
960032e513 qa/tasks: update tests with helper to wait for pg-stats
and remove unused helpers

Fixes: http://tracker.ceph.com/issues/19594
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-04-20 09:35:05 +08:00
Kefu Chai
1207caf3a2 qa/tasks/ceph_manager: add a "wait_for_pg_stats()" decorator
and accompany it with two helpers to access the pg stats in a more
natural way

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-04-20 09:35:04 +08:00
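
The idea of a decorator that waits for pg stats can be illustrated with a minimal sketch. This is not the code from commit 1207caf3a2: the name `wait_for_pg_stats_sketch`, the `delays` and `sleep` parameters, and the choice to retry on `AssertionError` are all assumptions made for illustration. It retries a check that asserts on possibly stale stats, pausing between attempts.

```python
import functools
import time


def wait_for_pg_stats_sketch(delays=(1, 1, 2, 3, 5), sleep=time.sleep):
    """Hypothetical retry decorator: re-run a check until its assertions pass.

    `delays` and `sleep` are illustrative parameters, not part of any
    real ceph_manager API.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for delay in delays:
                try:
                    return func(*args, **kwargs)
                except AssertionError:
                    # Stats may not have been refreshed yet; wait and retry.
                    sleep(delay)
            # Final attempt: let any AssertionError propagate to the caller.
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

A check decorated this way only fails once every delay has been used up, which is one way to tolerate stats that lag behind the cluster state.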
Josh Durgin
6fba80c1fa osd, OSDMonitor, qa: mark ec overwrites non-experimental
Keep the pool flag around so we can distinguish between a pool that
should maintain hashes for each chunk, and a missing one is a bug, vs
an overwrites pool where we rely on bluestore checksums for detecting
corruption.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2017-04-19 17:45:43 -07:00
Sage Weil
ee1bb01a54 Merge pull request #14556 from liewegas/wip-pgupmap
osd: pg-remap -> pg-upmap

Reviewed-by: David Zafman <dzafman@redhat.com>
2017-04-19 17:07:01 -05:00
Sage Weil
ce188e8fdf osd: pg-remap -> pg-upmap
'remap' is too non-specific a name.  In particular, it
sounds like it is related to the 'remapped' PG state
but in reality it is not related.

'upmap' or 'pg-upmap' is more specific: it maps a pgid
to the 'up' set value (or item)

Signed-off-by: Sage Weil <sage@redhat.com>
2017-04-18 12:59:40 -04:00
Kefu Chai
1b54b5f3f1 Merge pull request #14415 from smithfarm/wip-19556
tests: Thrasher: handle "OSD has the store locked" gracefully

Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-04-18 23:18:35 +08:00
David Zafman
a5731076ad osd: Handle backfillfull_ratio just like nearfull and full
Add BACKFILLFULL as a local OSD cur_state
Notify monitor of this new fullness state

Signed-off-by: David Zafman <dzafman@redhat.com>
2017-04-17 08:00:24 -07:00
Nathan Cutler
a5b19d2d73 tests: Thrasher: handle "OSD has the store locked" gracefully
On slower machines (VPS, OVH) it takes time for the OSD to go down.

Fixes: http://tracker.ceph.com/issues/19556
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-04-11 16:09:45 +02:00
Sage Weil
2a08cbbed5 qa/tasks/thrashosds,ceph_manager: thrash pg_remap[_items]
Signed-off-by: Sage Weil <sage@redhat.com>
2017-03-28 10:12:10 -04:00
Sage Weil
296708091c qa/tasks/ceph_manager: use new luminous set-full-ratio etc
Signed-off-by: Sage Weil <sage@redhat.com>
2017-03-07 16:39:09 -05:00
Sage Weil
a202b68d18 qa/tasks/thrashosds: chance_thrash_cluster_full
Induce a momentarily full cluster.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-03-07 13:33:44 -05:00
Samuel Just
44b26f6ab4 Merge pull request #13594 from athanatos/wip-snap-trim-sleep
osd: add snap trim reservation and re-implement osd_snap_trim_sleep

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-02-24 14:09:17 -08:00
Kefu Chai
c0f0cde399 test: Thrasher: do not update pools_to_fix_pgp_num if nothing happens
we should not update pools_to_fix_pgp_num if the pool is not expanded or
the pg_num is not increased due to pgs being created. this prevents us
from fixing the pgp_num after done with thrashing if we actually did
nothing when fixing the pgp_num when thrashing, but we removed the pool
from pools_to_fix_pgp_num after set_pool_pgpnum() returns.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-19 13:10:46 +08:00
Samuel Just
4aebf59d90 rados: check that pool is done trimming before removing it
Signed-off-by: Samuel Just <sjust@redhat.com>
2017-02-13 09:47:02 -08:00
Kefu Chai
de59b5102c test: Thrasher: restore changed options after done with thrash
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-13 09:25:51 +08:00
Kefu Chai
761a1dc391 tests: Thrasher: extract _set_config() method
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-13 09:25:50 +08:00
Kefu Chai
995e144e3e tests: CephManager: add get_config() method
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-13 09:25:50 +08:00
Kefu Chai
136483a8f9 test: Thrasher: update pgp_num of all expanded pools if not yet
otherwise wait_until_healthy will fail after the timeout on seeing a
warning like:

HEALTH_WARN pool cephfs_data pg_num 182 > pgp_num 172

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-13 09:25:50 +08:00
Nathan Cutler
db2582e25e tests: fix regression in qa/tasks/ceph_master.py
https://github.com/ceph/ceph/pull/13194 introduced a regression:

2017-02-06T16:14:23.162 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 722, in wrapper
    return func(self)
  File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 839, in do_thrash
    self.choose_action()()
  File "/home/teuthworker/src/github.com_ceph_ceph_master/qa/tasks/ceph_manager.py", line 305, in kill_osd
    output = proc.stderr.getvalue()
AttributeError: 'NoneType' object has no attribute 'getvalue'

This is because the original patch failed to pass "stderr=StringIO()" to run().

Fixes: http://tracker.ceph.com/issues/16263
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-02-06 19:37:38 +01:00
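
The failure mode in the traceback above can be reproduced with a small sketch. `fake_run` and `FakeProc` are hypothetical stand-ins, not the real run() used by the test harness. They assume only what the commit message states: the returned object's `stderr` is `None` unless a buffer such as `StringIO()` is passed in.

```python
from io import StringIO


class FakeProc:
    """Hypothetical process handle; stderr is whatever buffer the caller supplied."""
    def __init__(self, stderr):
        self.stderr = stderr


def fake_run(args, stderr=None):
    """Hypothetical stand-in for run(): writes to stderr only if a buffer is given."""
    if stderr is not None:
        stderr.write("simulated error output\n")
    return FakeProc(stderr)


# Without a buffer, proc.stderr is None and getvalue() raises AttributeError.
broken = fake_run(["some-tool"])
try:
    broken.stderr.getvalue()
    failed = False
except AttributeError:
    failed = True

# Passing stderr=StringIO() gives the caller a buffer it can read back.
fixed = fake_run(["some-tool"], stderr=StringIO())
output = fixed.stderr.getvalue()
```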
Sage Weil
5fc3dd36e2 Merge pull request #13237 from smithfarm/wip-18799
tests: Thrasher: eliminate a race between kill_osd and __init__

Reviewed-by: Sage Weil <sage@redhat.com>
2017-02-05 12:49:30 -06:00
Nathan Cutler
b519d38fb1 tests: Thrasher: eliminate a race between kill_osd and __init__
If Thrasher.__init__() spawns the do_thrash thread before initializing the
ceph_objectstore_tool property, do_thrash races with the rest
of Thrasher.__init__() and in some cases do_thrash can call kill_osd() before
Thrasher.__init__() progresses much further. This can lead to an exception
("AttributeError: Thrasher instance has no attribute 'ceph_objectstore_tool'")
being thrown in kill_osd().

This commit eliminates the race by making sure the ceph_objectstore_tool
attribute is initialized before the do_thrash thread is spawned.

Fixes: http://tracker.ceph.com/issues/18799
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-02-02 23:23:54 +01:00
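
The ordering fix described above can be sketched as follows. `ThrasherSketch` is a hypothetical, stripped-down class, and the standard library `threading` module stands in for whatever mechanism the real Thrasher uses to spawn do_thrash. The only point shown is that every attribute the worker reads is assigned before the worker starts.

```python
import threading


class ThrasherSketch:
    """Hypothetical minimal class illustrating initialization order."""

    def __init__(self):
        # Assign everything the worker thread reads BEFORE spawning it.
        self.ceph_objectstore_tool = True
        self.result = None
        # Only now start the worker; it can no longer observe a
        # partially initialized object.
        self.thread = threading.Thread(target=self.do_thrash)
        self.thread.start()

    def do_thrash(self):
        self.result = self.kill_osd()

    def kill_osd(self):
        # Would raise AttributeError if the attribute were set after start().
        return "used tool" if self.ceph_objectstore_tool else "skipped tool"


thrasher = ThrasherSketch()
thrasher.thread.join()
```

If the `self.ceph_objectstore_tool` assignment were moved below `self.thread.start()`, the worker could reach `kill_osd()` first and hit the AttributeError quoted in the commit message.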
Nathan Cutler
046e873026 tests: ignore bogus ceph-objectstore-tool error in ceph_manager
Fixes: http://tracker.ceph.com/issues/16263
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2017-01-31 00:49:05 +01:00
Sage Weil
c01f2ee0e2 move ceph-qa-suite dirs into qa/
2016-12-14 11:29:55 -06:00