commit 023524a26d (xie xingguo, 2020-02-21): osd/PeeringState: restart peering on any previous down acting member coming back
One of our customers wants to verify the data safety of Ceph while scaling
the cluster up, and the test case looks like this (sketched in shell form below):
- keep checking the status of a specified pg, whose up set is [1, 2, 3]
- add more osds: up [1, 2, 3] -> up [1, 4, 5], acting = [1, 2, 3], backfill_targets = [4, 5],
  pg is remapped
- stop osd.2: up [1, 4, 5], acting = [1, 3], backfill_targets = [4, 5], pg is undersized
- restart osd.2: acting stays unchanged because 2 belongs to neither the current up nor the
  current acting set, leaving the pg stuck undersized for a long time until all backfill
  targets complete
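
A minimal sketch of that sequence in the qa/standalone style is below; the pool name,
pg id (1.0), osd ids and the exact remapping are illustrative only (the real mapping
depends on CRUSH), the usual run()/main boilerplate is omitted, and the helpers
(run_mon, run_mgr, run_osd, create_pool, wait_for_clean, kill_daemons, activate_osd)
are the ones provided by qa/standalone/ceph-helpers.sh:

    #!/usr/bin/env bash
    source $CEPH_ROOT/qa/standalone/ceph-helpers.sh

    function TEST_restart_down_acting_member() {
        local dir=$1

        # 1. small cluster: the single pg of the test pool maps to the
        #    first three osds, e.g. up = acting = [0,1,2]
        run_mon $dir a || return 1
        run_mgr $dir x || return 1
        for id in 0 1 2 ; do
            run_osd $dir $id || return 1
        done
        create_pool test 1 1 || return 1
        wait_for_clean || return 1
        ceph pg map 1.0

        # 2. scale up: the pg is remapped to the new osds and starts
        #    backfilling while the old acting set keeps serving I/O,
        #    e.g. up [0,3,4], acting [0,1,2]
        run_osd $dir 3 || return 1
        run_osd $dir 4 || return 1
        ceph pg map 1.0

        # 3. stop one of the old acting members: the pg goes undersized,
        #    e.g. up [0,3,4], acting [0,2]
        kill_daemons $dir TERM osd.1 || return 1
        ceph pg map 1.0

        # 4. bring it back: before this patch the acting set stays [0,2]
        #    (osd.1 is in neither up nor acting), so the DEGRADED warning
        #    persists until backfill to [3,4] finishes; with the patch,
        #    peering restarts and osd.1 rejoins the acting set
        activate_osd $dir 1 || return 1
        ceph pg map 1.0
        ceph health detail
    }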

This does not pose any critical problem -- the pg eventually gets back to active+clean --
but the long-lived DEGRADED warning keeps bothering our customer, who cares about data
safety more than anything else.

The right way to fix this is for:

	boost::statechart::result PeeringState::Active::react(const MNotifyRec& notevt)

to check whether the newly booted node could be validly chosen for the acting set and
request a new temp mapping. The new temp mapping would then trigger a real interval change
that will get rid of the DEGRADED warning.
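
For illustration only -- this is not the code path the patch touches -- the effect of
such a temp mapping can be approximated by hand with the developer command
'ceph osd pg-temp'; the pgid and osd ids below are made up:

    # hypothetical pgid/osds: install an explicit pg_temp so the returned
    # osd is part of the acting set again, forcing a new interval
    ceph osd pg-temp 1.0 0 1 2

    # the mapping shows up in the osdmap and the pg repeers; the entry
    # goes away again once backfill completes and pg_temp is not needed
    ceph osd dump | grep pg_temp
    ceph pg map 1.0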

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
Signed-off-by: Yan Jun <yan.jun8@zte.com.cn>

qa/standalone
=============

These scripts run standalone clusters, but not in a normal way.  They make
use of the functions in ceph-helpers.sh to quickly start/stop daemons against
toy clusters in a single directory.
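
A typical script has roughly the shape sketched below.  The TEST_example body and
the monitor port are made up for illustration; setup, teardown, run_mon, run_osd
and main are real helpers from ceph-helpers.sh:

    #!/usr/bin/env bash
    source $CEPH_ROOT/qa/standalone/ceph-helpers.sh

    function run() {
        local dir=$1
        shift

        export CEPH_MON="127.0.0.1:7199"  # pick a port no other script uses
        export CEPH_ARGS
        CEPH_ARGS+="--fsid=$(uuidgen) --auth-supported=none "
        CEPH_ARGS+="--mon-host=$CEPH_MON "

        # run every TEST_* function (or only those passed as arguments),
        # each against a fresh toy cluster under $dir
        local funcs=${@:-$(set | sed -n -e 's/^\(TEST_[0-9a-z_]*\) .*/\1/p')}
        for func in $funcs ; do
            setup $dir || return 1
            $func $dir || return 1
            teardown $dir || return 1
        done
    }

    function TEST_example() {
        local dir=$1
        run_mon $dir a || return 1
        run_osd $dir 0 || return 1
        ceph osd dump | grep "osd.0 up" || return 1
    }

    main example "$@"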

They are normally run via teuthology based on qa/suites/rados/standalone/*.yaml.

You can run them in a git checkout + build directory as well:

  * The qa/run-standalone.sh script runs all of them in sequence.  This is slow
    since there is no parallelism.

  * You can run individual script(s) by specifying the basename or path below
    qa/standalone as arguments to qa/run-standalone.sh.

../qa/run-standalone.sh misc.sh osd/osd-dup.sh

  * You can pass arguments to selected tests by quoting the script name together
    with the test names as a single argument.

../qa/run-standalone.sh "test-ceph-helpers.sh test_get_last_scrub_stamp"