Commit Graph

59 Commits

Author SHA1 Message Date
Mykola Golub
7311f6656f qa/suites/rados: add crushdiff test
Signed-off-by: Mykola Golub <mykola.golub@clyso.com>
2021-08-27 17:45:40 +03:00
Patrick Donnelly
d6c66f3fa6 qa,pybind/mgr: allow disabling .mgr pool
This is mostly for testing: a lot of tests assume that there are no
existing pools. These tests relied on a config to turn off creating the
"device_health_metrics" pool which generally exists for any new Ceph
cluster. It would be better to make these tests tolerant of the new .mgr
pool but clearly there are a lot of these. So just convert the config to
make it work.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2021-06-11 19:35:17 -07:00
Greg Farnum
d02625331c Merge remote-tracking branch 'origin/master' into wip-stretch-mode
2020-09-14 02:32:19 +00:00
Sage Weil
2ee9365d0b qa: log-whitelist -> log-ignorelist
Signed-off-by: Sage Weil <sage@newdream.net>
2020-08-24 19:53:08 +00:00
Greg Farnum
39d71f7841 test: add a mon_election directory to the rados and upgrade suites
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2020-07-08 04:26:03 +00:00
Sridhar Seshasayee
e527067666 qa: Whitelist 'slow request' within a bunch of tests
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2020-02-24 19:59:56 +05:30
Sage Weil
f8d0e3d73a qa/suites/rados: disable device scraping
We need no pools to avoid breaking some tests.

Signed-off-by: Sage Weil <sage@redhat.com>
2020-02-19 15:31:26 -06:00
Kefu Chai
44fb077978 qa: whitelist FS_DEGRADED
`admin_socket_output --all` sends "respawn" to the mds, so FS_DEGRADED is
raised while the mds restarts.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-01-08 22:20:35 +08:00
Sage Weil
47350be466 qa/suites/rados: test cephadm on centos and ubuntu both
Signed-off-by: Sage Weil <sage@redhat.com>
2019-12-21 09:03:55 -06:00
Sage Weil
137fa64e12 qa: rename ceph-daemon tests -> cephadm
Also move the workunit to a better location.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-12-11 19:14:09 -06:00
Sage Weil
82c2320fbb qa/suites/rados/singleton-nomsgr/all/balancer: whitelist PG_AVAILABILITY
Balancer triggers peering, which may make PGs briefly go inactive--when
they possibly haven't been active yet.  E.g.,

    "PG_AVAILABILITY": {
        "severity": "HEALTH_WARN",
        "summary": {
            "message": "Reduced data availability: 3 pgs inactive, 3 pgs peering",
            "count": 6
        },
        "detail": [
            {
                "message": "pg 2.6 is stuck peering since forever, current state peering, last acting [2,0]"
            },
            {
                "message": "pg 2.1c is stuck peering since forever, current state peering, last acting [2,1]"
            },
            {
                "message": "pg 2.7a is stuck peering since forever, current state peering, last acting [2,0]"
            }
        ]
    }

Signed-off-by: Sage Weil <sage@redhat.com>
2019-11-19 20:38:08 -06:00
Sage Weil
9fe9653c8c qa/suites/rados/singleton-nomsgr/ceph-daemon: make sure python3 is installed
Centos7 doesn't have it by default.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-28 12:15:47 -05:00
Sage Weil
47777b9c0d qa/suites/rados/singleton-nomsgr/ceph-daemon: run test_ceph_daemon.sh
Signed-off-by: Sage Weil <sage@redhat.com>
2019-10-23 15:08:55 -05:00
Kefu Chai
df8bb8b8f6 Merge pull request #30646 from shyukri/wip-qa-mgr-balancer
qa/mgr/balancer: Add cram based test for altering target_max_misplaced_ratio setting

Reviewed-by: Jan Fajerski <jfajerski@suse.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>
2019-10-18 15:43:33 +08:00
Shyukri Shyukriev
37a45deb5b qa/mgr/balancer: Add cram based test for altering target_max_misplaced_ratio setting
Signed-off-by: Shyukri Shyukriev <shshyukriev@suse.com>
2019-09-30 14:17:16 +03:00
Sage Weil
379bf4b423 qa/suites/rados/singleton-nomsg/osd_stale_reads.yaml
Signed-off-by: Sage Weil <sage@redhat.com>
2019-09-28 11:51:18 -05:00
Kefu Chai
037daf5982 qa/suites/rados: whitelist POOL_APP_NOT_ENABLED warning
Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-08-20 21:05:21 +08:00
Brad Hubbard
88e9ca58a0 tests: Add test for lazy omap stat collection
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
2019-07-10 07:53:37 +10:00
Nathan Cutler
f9f824448a qa: add version number sanity singleton to rados suite
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2019-05-08 11:31:34 +02:00
Kefu Chai
0e1ec8dc20 qa: install libradospp-dev for librados_hello_world.yaml
libradospp-{dev,devel} is necessary for compiling sources in
examples/librados/hello_world.cc

Signed-off-by: Kefu Chai <kchai@redhat.com>
2018-11-07 14:13:48 -08:00
Nathan Cutler
c46c890d02 qa: add test that builds example librados programs
Fixes: http://tracker.ceph.com/issues/15100
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2018-09-25 13:18:04 +02:00
Patrick Donnelly
b39f9d06dc qa: fix symlinks indirectly pointing at qa to .qa
Building on the previous commit.

Command used:

$ find suites/ -type l -and -not -name .qa -execdir ~/fix.sh {} \;

fix.sh:
    #!/bin/bash

    link="$(readlink "$1")"

    echo $link
    dirlink="$(dirname "$link")"
    baselink="$(basename "$link")"

    while true; do
        echo $dirlink
        if [ "$dirlink" -ef ~/ceph/qa ]; then
            ln -nsf ".qa/$baselink" "$1"
            exit
        else
            baselink="$(basename "$dirlink")/$baselink"
            dirlink="$(dirname "$dirlink")"
            if [ "$dirlink" -ef . ]; then
                break
            fi
        fi
    done

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-06-26 11:48:38 -07:00
Patrick Donnelly
716db6e2fd qa: add .qa helper link
This utilizes the recent feature in teuthology [1] to skip hidden files in
suites when building the job matrix.

The idea of this change is to allow referring to the top-level qa directory in a
position-independent way, so that copying a suite to another location does not
break any symlinks.

[1] https://github.com/ceph/teuthology/pull/1185

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-06-26 11:33:48 -07:00
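
Note on the .qa helper link: the position-independent lookup described in the message above can be sketched as follows. This is a minimal illustration, not the layout of the actual tree. It assumes a top-level `.qa` link pointing at `.` and a `.qa -> ../.qa` link in every directory below it; the directory names `a`, `b`, `b2` and the file `distros/centos.yaml` are hypothetical.

```shell
# Build a throwaway tree (all names below are made up for illustration).
tmp="$(mktemp -d)"
mkdir -p "$tmp/qa/distros" "$tmp/qa/suites/a/b"
echo "os_type: centos" > "$tmp/qa/distros/centos.yaml"

# Assumed chain of helper links: each directory's .qa resolves, hop by hop,
# to the top-level qa directory.
ln -s . "$tmp/qa/.qa"
ln -s ../.qa "$tmp/qa/suites/.qa"
ln -s ../.qa "$tmp/qa/suites/a/.qa"
ln -s ../.qa "$tmp/qa/suites/a/b/.qa"

# A suite file that refers to the qa directory through .qa instead of ../../..
ln -s .qa/distros/centos.yaml "$tmp/qa/suites/a/b/distro.yaml"

# Copy the suite to a different depth, keeping symlinks as symlinks.
cp -R -P "$tmp/qa/suites/a/b" "$tmp/qa/suites/b2"

# Both the original and the copy still resolve to the same file.
original="$(cat "$tmp/qa/suites/a/b/distro.yaml")"
copied="$(cat "$tmp/qa/suites/b2/distro.yaml")"
echo "original: $original"
echo "copied:   $copied"

rm -rf "$tmp"
```

A link written as `../../../distros/centos.yaml` would stop resolving after the same copy, because `b2` sits one level higher than `a/b`.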
Sage Weil
4f769a3cc9 qa/suites/rados: move valgrind test to singleton-flat
No distro facet (or anything else) since we require centos for this test.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-05-27 10:07:45 -05:00
David Zafman
918921ab2f test: Need to escape parens in log-whitelist for grep
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-05-21 09:47:59 -07:00
Yuri Weinstein
9f2c485942 tests/qa: adding rados/.. dirs
Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
2018-05-11 14:03:15 -07:00
Brad Hubbard
eeeed6497b qa/suites/rados: Disable scrub backoff
A long run of lost coin flips can lead to a timeout in
test_large_omap_detection.py.

Fixes: http://tracker.ceph.com/issues/23578

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
2018-04-09 17:21:01 +10:00
David Zafman
9f103f013c tests: recovery-unfound-found test needs to account for correct misplaced calculations
The test expected HEALTH_OK while in a state with misplaced objects, which is therefore HEALTH_WARN.

Signed-off-by: David Zafman <dzafman@redhat.com>
2018-01-16 10:52:21 -08:00
Sage Weil
a5eb976cb3 qa/suites/rados: add missing openstack volumes
Signed-off-by: Sage Weil <sage@redhat.com>
2017-12-09 10:20:19 -06:00
Sage Weil
25b7965f88 qa/suites/rados: test for recovery_unfound bug
See http://tracker.ceph.com/issues/22145

Signed-off-by: Sage Weil <sage@redhat.com>
2017-11-19 21:32:57 -06:00
Kefu Chai
3ceab4ca43 Merge pull request #16332 from badone/wip-warn-about-objects-with-too-many-omap-entries
osd: Warn about objects with too many omap entries

Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-10-25 19:20:00 +08:00
Brad Hubbard
71bf04775b osd: Warn about objects with too many omap entries
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
2017-10-24 17:27:57 +10:00
Chang Liu
8d5d9c6a62 test: new test case for ceph-kvstore-tool
Signed-off-by: Chang Liu <liuchang0812@gmail.com>
2017-10-16 22:52:10 +08:00
xie xingguo
b4ca5ae462 mon, osd: per pool space-full flag support
The newly introduced 'device-class' can be used to separate
different types of devices into different pools, e.g., hdd-pool
for backup data and all-flash-pool for DB applications.

However, if any osd of the cluster is currently running out
of space (exceeding the predefined 'full' threshold), Ceph
will mark the whole cluster as full and prevent writes to all pools,
which turns out to be very wrong.

This patch instead applies the space 'full' control at pool granularity,
which leverages the existing pool quota logic and should solve
the above problem.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
2017-09-08 10:03:17 +08:00
Sage Weil
c8af364699 Merge pull request #16739 from liewegas/wip-multi-backfill-reject
qa/suites/rados/singleton-nomsgr/all/multi-backfill-reject: sleep longer
2017-08-04 08:41:06 -05:00
Kefu Chai
a70be4e00c qa/suites: more whitelisting
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-08-02 10:00:57 +08:00
Sage Weil
c955bf528f qa/suites/rados/singleton-nomsgr/all/multi-backfill-reject: sleep longer
I saw a failure where the 30% backfill probability was enough that we
just didn't manage to backfill all of the pgs during the 5 minute recovery
timeout during ceph.py shutdown.  Build in some additional time for the
test to recover.

http://pulpito.ceph.com/sage-2017-08-01_15:32:10-rados-wip-sage-testing-distro-basic-smithi/1469184

Signed-off-by: Sage Weil <sage@redhat.com>
2017-08-01 15:50:47 -04:00
Kefu Chai
d12c51ca91 qa/suites: escape the parenthesis of the whitelist text
so we can avoid the warnings like

grep: Unmatched ( or \(

because we pass the whitelisted string to `egrep -v "$1"` directly.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-08-01 21:54:44 +08:00
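
Note on the escaping above: a minimal sketch of why the parentheses need escaping, assuming the whitelisted string is used unchanged as an extended regular expression (`grep -E -v` here, equivalent to the `egrep -v "$1"` quoted in the message). The helper name `escape_parens` and the sample log lines are hypothetical and not taken from the Ceph tree.

```shell
# Hypothetical helper: backslash-escape literal parentheses so that the
# string matches them literally when used as an extended regular expression.
escape_parens() {
    printf '%s\n' "$1" | sed -e 's/(/\\(/g' -e 's/)/\\)/g'
}

# Made-up whitelist entry containing literal parentheses.
entry='overall HEALTH_WARN (example)'
pattern="$(escape_parens "$entry")"

# Drop log lines matching the whitelisted pattern and keep the rest.
filtered="$(printf '%s\n' 'overall HEALTH_WARN (example)' 'real failure' \
    | grep -E -v "$pattern")"
echo "$filtered"   # prints: real failure
```

Without the escaping, a balanced pair of parentheses is treated as a group, so the literal characters are never matched, and an unbalanced one produces the "Unmatched ( or \(" error shown in the message.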
John Spray
343e1a4281 qa: update whitelist for "wrongly marked me down"
Signed-off-by: John Spray <john.spray@redhat.com>
2017-07-24 14:54:46 +01:00
Jason Dillaman
fa90be842e test: enable pool applications for new pools
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2017-07-19 13:13:01 -04:00
Sage Weil
960f00071f qa/suites: disable mon crush smoke test with valgrind
Valgrind runs itself on forked children, and does its cleanup when they
complete, and this is slow... slow enough that it frequently makes the
test time out.

Valgrind lets you ignore child *processes* that you exec, but I can't
find a way to skip forked children in the same address space.

Work around this by skipping this validation when running under valgrind.

Fixes: http://tracker.ceph.com/issues/20602
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-14 11:51:47 -04:00
Sage Weil
93de19adcf qa: whitelist health warnings
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-12 12:52:03 -04:00
Sage Weil
63f97ddcf6 qa/suites/rados: whitelist health warnings
Signed-off-by: Sage Weil <sage@redhat.com>
2017-07-12 12:52:02 -04:00
Sage Weil
c7893283cd do all valgrind runs on centos
We are fighting two issues with valgrind on ubuntu (xenial, yakkety,
and z):

	http://tracker.ceph.com/issues/18126
	http://tracker.ceph.com/issues/20360

Revert this when it is fixed.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-30 09:33:18 -04:00
Greg Farnum
7d33e98bd3 qa: do not restrict valgrind runs to centos
This reverts 693bd23851, which was
added in response to http://tracker.ceph.com/issues/18126. But
we updated the Ubuntu packages in sepia so it should be good to go.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2017-06-23 16:25:16 -04:00
Sage Weil
288f623878 Merge pull request #15354 from badone/wip-rados-ls-auth-fix
osd: Reverse order of op_has_sufficient_caps and do_pg_op

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
2017-06-21 21:14:01 -05:00
Sage Weil
aa76cf7488 Revert "qa: do not restrict valgrind runs to centos"
This reverts commit 5923961465.

See http://tracker.ceph.com/issues/20360

Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-20 17:14:52 -04:00
Sage Weil
8d10e5fc29 qa/suites/rados/singleton-nomsgr/multi-backfill-reject: clean up
Set pool size back to 2 so we don't have to have backfill
complete (despite rejection probability) in order to get back to
healthy.  This way we scrub on cleanup.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-19 14:49:56 -04:00
Sage Weil
15efccab70 qa/suites/rados/singleton-nomsgr/full-tiering: unset quota at end
If we leave the quota set, the proxied ops will block
indefinitely, which will block scrubbing on the cache tier pgs
indefinitely.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-06-19 14:37:57 -04:00
Brad Hubbard
a921882e7c osd: Reverse order of op_has_sufficient_caps and do_pg_op
Fixes: http://tracker.ceph.com/issues/19790

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
2017-06-19 15:23:17 +10:00