64 Commits

Author SHA1 Message Date
Sage Weil
afd0b508c2 qa/suites/rados/thrash: force normal pg log length with cache tiering
When we are doing cache tiering, we are more sensitive to short PG logs
because the dup op entries are not perfectly promoted from the base to
the cache.

See:
 http://tracker.ceph.com/issues/38358
 http://tracker.ceph.com/issues/24320

This works around the problem by not testing short pg logs in combination
with cache tiering.  This works because the short_pg_log.yaml fragment
sets the short log in the [global] section but the cache workloads override
it (back to a large/default value) in the [osd] section.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-06-19 11:15:25 -05:00
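
The override described in this commit can be illustrated with a minimal sketch. It assumes a simplified model of Ceph config precedence in which a daemon-type section such as [osd] takes priority over [global]. The numeric values and the helper names (`merge`, `effective_option`) are hypothetical and are not taken from short_pg_log.yaml or the cache workloads.

```python
# Simplified model: a daemon-type section ([osd]) wins over [global].
# Values and helper names are illustrative assumptions, not suite contents.

def merge(*fragments):
    """Merge per-section option dicts; later fragments win within a section."""
    merged = {}
    for fragment in fragments:
        for section, options in fragment.items():
            merged.setdefault(section, {}).update(options)
    return merged

def effective_option(conf, daemon_type, option):
    """Return the value a daemon of the given type would see for an option."""
    for section in (daemon_type, "global"):
        value = conf.get(section, {}).get(option)
        if value is not None:
            return value
    return None

# Fragment that shortens the PG log for every daemon via [global].
short_log = {"global": {"osd_max_pg_log_entries": 2}}

# Cache-tiering workload that restores a larger value in [osd].
cache_workload = {"osd": {"osd_max_pg_log_entries": 3000}}

combined = merge(short_log, cache_workload)
print(effective_option(combined, "osd", "osd_max_pg_log_entries"))   # 3000
print(effective_option(short_log, "osd", "osd_max_pg_log_entries"))  # 2
```

Because the two fragments write to different sections, neither clobbers the other when merged; the OSD simply reads the more specific section first.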
Josh Durgin
d45f18119b qa/suites: remove mon kv backend options
rocksdb is the default, leveldb is not recommended at this point, so drop it.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2019-02-08 16:58:44 -05:00
Sage Weil
ee59743a1a qa/suites/rados/workloads/rados_api_tests.yaml: debug mgrc = 20 on mon
Seeing some hangs when the mon is forwarding mgr commands (pg deep-scrub)
to the mgr.  This is a buggy test (it should send it to the mgr directly)
but it is helpful to verify the mon forwarding behavior works.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-02-07 12:10:34 -06:00
Neha Ojha
4ef94e89c8 qa/suites/rados/thrash: change crush_tunables to jewel in rados_api_tests
Fixes: http://tracker.ceph.com/issues/38042
Signed-off-by: Neha Ojha <nojha@redhat.com>
2019-01-24 16:54:29 -08:00
xie xingguo
c7356c66b0 mgr/balancer: blame if upmap won't actually work
With automatic balancing on, and if mode is set to upmap,
balancer will fail silently if min_compat_client is lower than
luminous.
You can't figure that out unless you take a closer look at the
mgr log, which is super annoying.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2019-01-16 09:19:54 +08:00
Sage Weil
d518eb6cac qa/msgr: move msgr facet into generic re-usable dir
Signed-off-by: Sage Weil <sage@redhat.com>
2019-01-03 11:17:38 -06:00
David Zafman
02964703de
Merge pull request #24749 from dzafman/wip-36474
Add support for osd_delete_sleep configuration value

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-11-07 14:05:55 -08:00
David Zafman
3f621a1190 test: Set any value for osd_delete_sleep to guarantee we are testing it even on SSD
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-10-30 11:42:05 -07:00
Sage Weil
86ae8fb6b8 qa/suites/rados/thrash*/thrashers/careful.yaml: thrash with mgr controller
Thrash such that we still exercise the careful throttling in the mgr.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-10-20 15:21:58 -05:00
Sage Weil
44de03d5e6 qa/suites: test pg merging
Signed-off-by: Sage Weil <sage@redhat.com>
2018-09-07 12:09:05 -05:00
Patrick Donnelly
b39f9d06dc
qa: fix symlinks indirectly pointing at qa to .qa
Building on the previous commit.

Command used:

$ find suites/ -type l -and -not -name .qa -execdir ~/fix.sh {} \;

fix.sh:
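    #!/bin/bash

    # Target of the symlink passed as $1.
    link="$(readlink "$1")"

    echo "$link"
    dirlink="$(dirname "$link")"
    baselink="$(basename "$link")"

    # Walk up the target's directory chain until it resolves to the qa root.
    while true; do
        echo "$dirlink"
        if [ "$dirlink" -ef ~/ceph/qa ]; then
            # Re-point the link through the position-independent .qa link.
            ln -nsf ".qa/$baselink" "$1"
            exit
        else
            baselink="$(basename "$dirlink")/$baselink"
            dirlink="$(dirname "$dirlink")"
            # Give up once the walk reaches the current directory.
            if [ "$dirlink" -ef . ]; then
                break
            fi
        fi
    done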

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-06-26 11:48:38 -07:00
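
The path walk in fix.sh can be sketched in Python as a function that only computes the new link target instead of rewriting the link. This is a loose illustration under stated assumptions, not the script the commit used: the function name `rewrite_target`, the temporary directory layout, and the file names are hypothetical, and the termination checks differ slightly from the shell version.

```python
import os
import tempfile

def rewrite_target(link_path, qa_root):
    """Return a '.qa/...' target for link_path, or None if its target
    never passes through qa_root on the way up."""
    link_dir = os.path.dirname(os.path.abspath(link_path))
    dirlink, suffix = os.path.split(os.readlink(link_path))
    while True:
        resolved = os.path.join(link_dir, dirlink)
        # Same check as `[ "$dirlink" -ef ~/ceph/qa ]` in the shell script.
        if os.path.isdir(resolved) and os.path.samefile(resolved, qa_root):
            return os.path.join(".qa", suffix)
        parent = os.path.dirname(dirlink)
        # Stop when there is nothing left to strip from the path.
        if dirlink in ("", ".") or parent == dirlink:
            return None
        suffix = os.path.join(os.path.basename(dirlink), suffix)
        dirlink = parent

# Demo on a throwaway tree (layout is an assumption for illustration).
with tempfile.TemporaryDirectory() as tmp:
    qa_root = os.path.join(tmp, "qa")
    os.makedirs(os.path.join(qa_root, "clusters"))
    os.makedirs(os.path.join(qa_root, "suites", "rados"))
    open(os.path.join(qa_root, "clusters", "fixed-2.yaml"), "w").close()

    # A link that climbs out of the suite and back down into qa/clusters.
    link = os.path.join(qa_root, "suites", "rados", "fixed-2.yaml")
    os.symlink("../../clusters/fixed-2.yaml", link)
    new_target = rewrite_target(link, qa_root)

    # A link that stays inside its own directory is left alone.
    local_link = os.path.join(qa_root, "suites", "rados", "local.yaml")
    os.symlink("fixed-2.yaml", local_link)
    local_target = rewrite_target(local_link, qa_root)

print(new_target)
print(local_target)
```

Returning the target rather than calling `ln -nsf` keeps the sketch free of side effects, so the result can be inspected before any link is changed.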
Patrick Donnelly
716db6e2fd
qa: add .qa helper link
This utilizes the recent feature in teuthology [1] to skip hidden files in
suites when building the job matrix.

The idea of this change is to enable referring to the top-level qa directory in a
position-independent way such that copies of a suite to another location do not
break any symlinks.

[1] https://github.com/ceph/teuthology/pull/1185

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-06-26 11:33:48 -07:00
Yuri Weinstein
9f2c485942 tests/qa: adding rados/.. dirs
Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
2018-05-11 14:03:15 -07:00
Sage Weil
27e91a99f5
Merge pull request #21273 from jdurgin/wip-23195
osd/ECBackend: only check required shards when finishing recovery reads

Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2018-04-24 17:20:25 -05:00
Kefu Chai
cdcbd47e1e qa/suite: whitelist PG_AVAILABILITY in rados_api_tests.yaml
PGs are created when pgp-num and pg-num are increased, so PG_AVAILABILITY is
reported at that moment. Whitelist it in all tests that run rados/test.sh;
that script exercises ceph_test_rados_api_list.

Fixes: http://tracker.ceph.com/issues/23763
Signed-off-by: Kefu Chai <kchai@redhat.com>
2018-04-24 10:16:12 +08:00
Josh Durgin
234d652317 qa/suites/rados: add coverage for osd_recovery_max_single_start > 1
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2018-04-20 19:42:15 -04:00
Gregory Farnum
6d2e4c9b7b
Merge pull request #19973 from liewegas/wip-peering-fast-dispatch
osd: fast dispatch of peering events and pg_map + osd sharded wq refactor

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2018-04-06 11:48:11 -07:00
Joao Eduardo Luis
3997eed4db qa: enable mon osdmap pruning on 'rados/' suites
Signed-off-by: Joao Eduardo Luis <joao@suse.de>
2018-04-06 04:18:23 +01:00
Sage Weil
26f00dd67c qa/suites: mon warn on pool no app = false for api tests
Among other things, the list.cc tests set pg_num which waits for cluster
healthy.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-04-04 08:26:58 -05:00
myoungwon oh
a1d6304442 qa/suites/rados/thrash/workloads: add parameters to support two pools
Signed-off-by: Myoungwon Oh <omwmw@sk.com>
2018-01-26 16:20:01 +09:00
myoungwon oh
93e986c064 qa/suites/rados/thrash: add tier_promote op
1. Add tier_promote op for redirect and chunked cases.
2. Rename set-chunk.yaml, because the current chunked object
   supports only the read case.

Signed-off-by: Myoungwon Oh <omwmw@sk.com>
2018-01-12 14:38:57 +09:00
myoungwon oh
56462d5ee8 qa/suites/rados/thrash: remove write op
The current chunked object and ChunkReadOp are
only for the read case.
The write op and promote_object() are still tested without ChunkReadOp
by another ceph_test_rados in the same test suite (with --set_chunk).

Signed-off-by: Myoungwon Oh <omwmw@sk.com>
2017-12-14 01:27:02 +09:00
Sage Weil
dda79ad1fa
Merge pull request #15482 from myoungwon/wip-chunked-manifest
osd,librados: add manifest, operations for chunked object

Reviewed-by: Sage Weil <sage@redhat.com>
2017-11-29 21:13:43 -06:00
Sage Weil
6455954d29 qa/suites/rados: stop testing firefly tunables
We can't mix the balancer compat-set testing with firefly tunables because
it requires that all buckets be straw2.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-11-10 07:10:11 -06:00
myoungwon oh
93be6f79e0 qa/suites/rados/thrash: add set_chunk test case
Signed-off-by: Myoungwon Oh <omwmw@sk.com>
2017-11-06 15:53:46 +09:00
Sage Weil
26710f0a9b mgr/balancer: enable module by default
It will still be "off".

Signed-off-by: Sage Weil <sage@redhat.com>
2017-11-02 16:11:26 -05:00
Sage Weil
2c9c18d1ec qa/suites/rados/thrash/d-balancer: enable balancer in various modes
Signed-off-by: Sage Weil <sage@redhat.com>
2017-11-01 07:28:49 -05:00
Vasu Kulkarni
30dbbfe4ae Remove unsupported 2-size-1-min-size config
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
2017-09-07 09:47:15 -07:00
Sage Weil
d8dead1aaf qa/suites/rados: remove luminous tests
- snapdir conversion (at-end) stuff
- merge luminous-specific collections that avoided the above back
into their normal locations

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-28 23:10:32 -04:00
Sage Weil
b5fae9a9ca Merge pull request #16873 from liewegas/wip-4-nodes
qa/suites: change fixed-2.yaml users to get 4 openstack disks

Reviewed-by: Zack Cerza <zcerza@redhat.com>
2017-08-07 11:27:40 -05:00
Sage Weil
f683d2d374 qa/suites: change fixed-2.yaml users to get 4 openstack disks
Follow-up for 4203c4f887

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-07 11:56:33 -04:00
Sage Weil
6307e03c6d qa/suites/rados/thrash/workloads/cache-agent-big: m=2
...because we do the test_map_discontinuity thing.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-05 14:33:13 -04:00
Kefu Chai
d12c51ca91 qa/suites: escape the parenthesis of the whitelist text
so we can avoid the warnings like

grep: Unmatched ( or \(

because we pass the whitelisted string to `egrep -v "$1"` directly.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-08-01 21:54:44 +08:00
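
The effect of escaping can be sketched with Python's `re` module. This is an analogy only: Python's regex dialect is not the one used by `egrep`, and the log line and whitelist strings below are hypothetical examples rather than entries from the suite.

```python
import re

# Hypothetical log line and whitelist entries, for illustration only.
log_line = "cluster [WRN] Health check failed: 1 osds down (OSD_DOWN)"

raw = "osds down (OSD_DOWN"        # unbalanced "(": not a valid pattern
escaped = r"osds down \(OSD_DOWN"  # "(" is matched as a literal character

# An unescaped, unbalanced parenthesis is rejected by the regex compiler.
try:
    re.compile(raw)
    raw_is_valid = True
except re.error:
    raw_is_valid = False

# The escaped form compiles and matches the literal text.
matches_escaped = re.search(escaped, log_line) is not None

# re.escape() applies the same kind of escaping programmatically.
matches_auto = re.search(re.escape(raw), log_line) is not None

print(raw_is_valid, matches_escaped, matches_auto)
```

Escaping in the whitelist text itself, as the commit does, keeps the consumer (`egrep -v "$1"`) unchanged.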
Sage Weil
c3c2b31c87 Merge pull request #16568 from liewegas/wip-application-warn
qa,doc: document and fix tests for pool application warnings
2017-07-28 09:00:46 -05:00
Sage Weil
e398fd4ee4 qa/suites: more whitelisting
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-27 09:31:24 -04:00
Sage Weil
3683cdf496 qa/suites/rados: at-end: ignore PG_{AVAILABILITY,DEGRADED}
With the peering deletes change, setting luminous sets the osdmap flag
which triggers a new peering interval.  That can lead to health warnings
about PG_AVAILABILITY or PG_DEGRADED.  Ignore those!

Fixes: http://tracker.ceph.com/issues/20693
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-25 18:29:07 -04:00
John Spray
343e1a4281 qa: update whitelist for "wrongly marked me down"
Signed-off-by: John Spray <john.spray@redhat.com>
2017-07-24 14:54:46 +01:00
Jason Dillaman
fa90be842e test: enable pool applications for new pools
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-07-19 13:13:01 -04:00
Sage Weil
63f97ddcf6 qa/suites/rados: whitelist health warnings
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-12 12:52:02 -04:00
Sage Weil
f2b837578a Merge pull request #16244 from liewegas/wip-11793
qa/suites/rados/thrash/workload/*: enable rados.py cache tiering ops

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-07-11 13:01:42 -05:00
Sage Weil
2afbc60be7 qa/suites/: enable rados.py cache tiering ops
These weren't being exercised!

See http://tracker.ceph.com/issues/11793

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-11 14:01:15 -04:00
Sage Weil
dc7a2aaf7a erasure-code: ruleset-* -> crush-*
1) ruleset is an obsolete term, and
2) crush-{rule,failure-domain,...} is more descriptive.

Note that we are changing the names of the erasure code profile keys
from ruleset-* to crush-*.  We will update this on upgrade when the
luminous flag is set, but that means that during mon upgrade you cannot
create EC pools that use these fields.

When the upgrade completes (user sets require_osd_release = luminous)
existing ec profiles are updated automatically.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-06 15:01:03 -04:00
Sage Weil
bfbe9fdd86 qa/suites/rados/thrash/workloads/radosbench: use less disk
Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-15 12:14:24 -04:00
Sage Weil
3118d9a154 osd: replace require_*_osds flags with require_osd_release field
- OSDMap encode and decode translate between the flags and int
representations.
- OSDMap::Incremental only does decode; we do not expect to ever encode
an incremental osdmap for an old osd that sets any of these flags.
- the 'osd set' command still lets you set the jewel and kraken flags,
but not luminous.
- OSDMap::apply_incremental handles the conversion of legacy require flags
to the new field if the jewel or kraken flags have to be set before
starting the osd upgrade.
- clear out the legacy flags when we make the luminous transition only;
until then we keep using the old flag in the encoded and decoded version
(although the require_osd_release field will be accurate in memory in all
cases).

Signed-off-by: Sage Weil <sage@redhat.com>
2017-05-29 21:33:17 -04:00
Sage Weil
ce654c5133 qa/suites/rados/*/at-end: wait for healthy before scrubbing
The scrub_pgs command also waits for healthy for a while, but fails
silently if it times out, which means the subsequent scrubs will also
fail to clean up.

This forces an earlier failure that does not obscure the root cause.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-05-23 14:12:24 -04:00
Sage Weil
e57ecb64f0 qa/suites/rados/thrash: make sure osds have map before legacy scrub
The OSDs must have a map reflecting the require_luminous flag in order
for the legacy conversion to happen.  A quick rados bench should ensure
that.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-05-16 12:28:40 -04:00
Sage Weil
d0a73ec955 Merge pull request #13610 from liewegas/wip-snapset
osd: eliminate snapdir objects and move clone snaps vector into SnapSet

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2017-05-11 21:18:11 -05:00
Sage Weil
1de9c90776 qa/suites: set initial require_min_compat_client
For cases where we are selecting crush tunables beyond the default
min of hammer.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-05-09 11:32:56 -05:00
Sage Weil
8bd54abc74 qa/suites/rados: at end, scrub pgs, verify no legacy snapsets
Signed-off-by: Sage Weil <sage@redhat.com>
2017-05-05 13:39:14 -04:00
Sage Weil
5f5f370925 qa/suites/rados: switch require-luminous facet to use full_sequential_finally
This lets us run multiple cleanup steps right before ceph
teardown.

Note that we drop the facet from multimon/ because it
doesn't factor out cluster creation before this step
properly.  That's fine because the require_luminous
cleanup shouldn't be related to the multimon tests.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-05-05 13:39:14 -04:00