62 Commits

Author SHA1 Message Date
Sage Weil
30fc7f5e97 qa/standalone/ceph-helpers: fix test_wait_for_clean
Signed-off-by: Sage Weil <sage@redhat.com>
2019-03-08 18:07:10 -06:00
Sage Weil
1e2b0c7252 qa/standalone/ceph-helpers.sh: fix test_run_mon
- Only create each osd once
- forget the first osdmap dump test; it's pointless

Signed-off-by: Sage Weil <sage@redhat.com>
2019-03-08 17:43:00 -06:00
Sage Weil
cba0483b09 qa/standalone: make sure an osd is running before create_rbd_pool
'rbd pool init' now does IO.  Drop the pool, or change the pool size to 1.

Fixes: http://tracker.ceph.com/issues/38585
Signed-off-by: Sage Weil <sage@redhat.com>
2019-03-06 16:27:56 -06:00
David Zafman
690ff9a21f
Merge pull request #26213 from dzafman/wip-38041
osd: Fix recovery and backfill priority handling

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2019-02-07 17:26:34 -08:00
Sage Weil
dcdca44aa4 qa/standalone/ceph-helpers: fix health_ok test
Stopping the osd daemon won't reliably get you HEALTH_WARN or ERR; you have
to make sure it is also marked down.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-02-07 12:10:34 -06:00
David Zafman
bca4fe98b1 test: Fix kill_daemon() to check after last large sleep
Signed-off-by: David Zafman <dzafman@redhat.com>
2019-02-05 11:30:04 -08:00
David Zafman
70b5136208 test: Add option to wait_for_clean() to execute at every sleep
Signed-off-by: David Zafman <dzafman@redhat.com>
2019-01-30 09:35:51 -08:00
Kefu Chai
94a84b6f5a test: listen on random port in tests which start ceph-mon
See-also: http://tracker.ceph.com/issues/36737
Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-01-27 21:16:54 +08:00
David Zafman
3b8f86c8b0 test: Add testing for backfill out of space detection
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-12-18 09:30:44 -08:00
Igor Fedotov
79fd227639 qa: replace raw_bytes_used field access in QA test cases
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
2018-12-06 18:54:21 +03:00
John Spray
67d147c00d
Merge pull request #23622 from renhwztetecs/renhw-wip-25103
mgr: fixup pgs show in unknown state

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: John Spray <john.spray@redhat.com>
2018-10-10 13:28:33 +01:00
huanwen ren
ed442447c0 qa: modify the format for add pgmap_ready.
Signed-off-by: huanwen ren <ren.huanwen@zte.com.cn>
2018-09-27 23:22:50 +08:00
Kefu Chai
f46523e464
Merge pull request #23955 from wjwithagen/wjw-fix-ceph-helpers.sh
test: Start using GNU awk and fix archiving directory

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-09-17 15:44:06 +08:00
David Zafman
6e3f04365f test: Trap termination so we can capture logs on teuthology timeout
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-09-10 12:23:07 -07:00
Willem Jan Withagen
bfe7a2afaa test: Start using GNU awk and fix archiving directory
The awk scripts use some expressions that the native FreeBSD awk does not support,
    such as: BEGIN{print 0 < 90}

Also, TESTDIR is not set when ceph-helpers is called from smoke.sh,
    so fix this by keeping the archive in /tmp

Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
2018-09-06 15:50:20 +02:00
David Zafman
d0b260c272 test: Fix test to use -gt instead of creating an empty file "0"
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-08-17 19:33:44 -07:00
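The pitfall named in this commit title can be reproduced in isolation. The sketch below is illustrative only and is not taken from ceph-helpers.sh; the variable names (`count`, `scratch`, `stray_file_created`, `fixed_result`) are hypothetical. Inside `[ ]`, an unquoted `>` is parsed by the shell as an output redirection, so the test succeeds for any non-empty value and leaves an empty file named "0" behind, whereas `-gt` performs an integer comparison.

```shell
#!/usr/bin/env bash
# Illustrative sketch (hypothetical names, not the ceph-helpers.sh code).
# Work in a scratch directory so the stray file is easy to spot and remove.
orig_dir=$PWD
scratch=$(mktemp -d)
cd "$scratch" || exit 1

count=3

# Buggy form: ">" is a redirection here, so the shell runs `[ "$count" ]`
# (true for any non-empty string) and creates an empty file named "0".
if [ "$count" > 0 ]; then
    buggy_result="true"
fi

stray_file_created=no
if [ -e 0 ]; then
    stray_file_created=yes
fi

# Fixed form: -gt compares integers and touches no files.
fixed_result="false"
if [ "$count" -gt 0 ]; then
    fixed_result="true"
fi

echo "stray file created by '>': $stray_file_created"
echo "result with -gt: $fixed_result"

# Clean up the scratch directory.
cd "$orig_dir" || exit 1
rm -rf "$scratch"
```

Within bash's `[[ ]]`, `>` is a string comparison rather than a redirection, so `-gt` remains the right operator for numbers in both forms.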
Noah Watkins
7d3fa9bda3 qa/standalone/ceph-helpers.sh: fix mgr module path
callers of get_python_path were not passing in a $1 parameter, so
ceph_lib was an empty string resulting in an invalid path to the built
cython modules. assume this is called from the `lib` parent directory.

pass path to the manager modules when starting ceph-mgr.

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
2018-08-17 15:21:57 -07:00
David Zafman
fbc8bcfe05 test: test_get_timeout_delays() fix
Caused by: 7b0d1c8b8a

Signed-off-by: David Zafman <dzafman@redhat.com>
2018-07-03 14:01:36 -07:00
David Zafman
663d96e934
Merge pull request #22727 from dzafman/wip-21664
qa/standalone/scrub: When possible show side-by-side diff in addition to regular diff

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-06-28 19:59:21 -04:00
David Zafman
3ff56a82a4
Merge pull request #22763 from dzafman/wip-remove-sudo
qa: Don't use sudo when moving logs

Reviewed-by: Neha Ojha <nojha@redhat.com>
2018-06-28 18:37:24 -04:00
David Zafman
23ed63e15f
Merge pull request #22441 from ErwanAliasr1/evelu-makecheck
Improving make check reliability

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: David Zafman <dzafman@redhat.com>
2018-06-28 14:55:12 -04:00
David Zafman
808c628304 qa: Don't use sudo when moving logs
Caused by: f0964beac5

Signed-off-by: David Zafman <dzafman@redhat.com>
2018-06-28 09:17:06 -07:00
David Zafman
ebb05b2542 test: When possible show side-by-side diff in addition to regular diff
Fixes: https://tracker.ceph.com/issues/21664

Signed-off-by: David Zafman <dzafman@redhat.com>
2018-06-26 18:23:07 -07:00
David Zafman
f0964beac5 qa: For teuthology copy logs to teuthology expected location
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-06-25 18:06:01 -07:00
Erwan Velu
57df91380b qa/standalone/ceph-helpers.sh: Setup ulimit in setup()
If ulimit is set to 1024, ceph-osd will segfault with the
following error:
    filestore(td/smoke/0)  error (24) Too many open files not handled on operation 0x55565d1fd004 (2182.1.0, or op 0, counting from 0)

This patch ensures that a valid ulimit value is set before the ceph daemons are started in tests.

Signed-off-by: Erwan Velu <erwan@redhat.com>
2018-06-25 22:09:14 +02:00
Erwan Velu
7b0d1c8b8a qa/standalone/ceph-helpers.sh: Thinner resolution in get_timeout_delays()
get_timeout_delays() is a generic function to compute delays for a long
period of time without saturating the CPU in busy loops.

It works well when the timeout is short. For example, requesting a
20-second timeout yields the series: "0.1 0.2 0.4 0.8 1.6 3.2 6.4 7.3 ".
Here the maximum delay between two loops is 7.3 seconds, which is perfectly fine.

When the timeout reaches 300 seconds, the same code produces the following
series: "0.1 0.2 0.4 0.8 1.6 3.2 6.4 12.8 25.6 51.2 102.4 95.3 "
In this example some of the delays are nearly 2 minutes long!

That is inefficient: the expected event could arrive just after such a
long sleep begins, so the test sleeps for more than a minute for
nothing. On a local system that may be acceptable, but on a CI where all
jobs behave this way the overall result is quite inefficient, as it
generates useless waits.

This patch adds a maximum acceptable delay between two
loops while keeping the same ramp-up behavior.

On the same 300-second example, with MAX_TIMEOUT set to 10, we
now have the following series: "0.1 0.2 0.4 0.8 1.6 3.2 6.4 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 7.3"
The long 12.8/25.6/51.2/102.4/95.3 values have vanished and are
replaced by a series of 10-second delays. It is up to each test to
estimate how soon its event is likely to complete.

MAX_TIMEOUT is set to 15 seconds.
Signed-off-by: Erwan Velu <erwan@redhat.com>
2018-06-25 22:09:14 +02:00
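The capped ramp-up described in the commit message above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in ceph-helpers.sh: the function name `capped_timeout_delays` and its argument order are hypothetical, and the 15-second default cap is taken from the message. Each delay doubles until it reaches the cap, and the last value is whatever remains so that the delays add up to the requested timeout.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a capped exponential back-off series.
# Usage: capped_timeout_delays <timeout> [first_step] [max_delay]
capped_timeout_delays() {
    local timeout=$1
    local first_step=${2:-0.1}
    local max_delay=${3:-15}
    awk -v timeout="$timeout" -v step="$first_step" -v max="$max_delay" '
    BEGIN {
        total = 0
        sep = ""
        while (1) {
            # Clamp the doubled step to the maximum allowed delay.
            delay = (step < max) ? step : max
            # Stop ramping once the next delay would reach the timeout.
            if (total + delay >= timeout)
                break
            printf "%s%g", sep, delay
            sep = " "
            total += delay
            step *= 2
        }
        # Spend the remainder so the series sums to the timeout.
        if (timeout - total > 0)
            printf "%s%g", sep, timeout - total
        printf "\n"
    }'
}

# A 20-second timeout never reaches the cap; a 300-second one does.
capped_timeout_delays 20 0.1 15
capped_timeout_delays 300 0.1 10
```

A caller would typically loop over the printed values, sleeping for each one and re-checking its condition in between, so that no single sleep exceeds the cap.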
Sage Weil
3cd7d5eb22 Merge PR #22343 into master
* refs/pull/22343/head:
	qa/standalone remove ceph-disk from activate_osd helper
	cmake: remove subman.sh tests
	test remove ceph-disk directory
	debian: remove ceph_detect_init python files from base
	qa/standalone remove virtualenv paths for ceph-disk and ceph-detect-init
	debian: remove ceph-disk ceph-detect-init python files
	rpm: remove ceph-disk ceph-detect-init python files
	alpine: remove ceph-disk ceph-detect-init python files
	alpine: remove ceph-osd and parttypeuuid udev rules
	debian: remove ceph-osd and parttypeuuid udev rules
	rpm: remove ceph-osd and parttypeuuid udev rules
	ceph-helpers.sh: remove ceph-disk, set up osds directly
	CMakeLists.txt: add back CEPH_BUILD_VIRTUALENV
	alpine: remove ceph-disk, add ceph-volume in APKBUILD.in
	upstart: remove ceph-disk activation call
	doc/install add anchor for manual osd deployment in freebsd guide
	doc/dev remove ceph-disk from freebsd guide, link to manual reference
	doc/dev/config-key remove ceph-disk references
	doc/dev remove ceph-disk.rst
	doc/dev: change ceph-disk suite examples for ceph-deploy
	doc/man_index: remove ceph-disk, ceph-detect-init refs
	doc/install: remove ceph-disk from freebsd examples
	doc/rados remove ceph-disk from man references
	doc/man remove ceph-disk ref from ceph-volume-systemd
	doc/man: update reference from ceph-disk to ceph-volume
	doc/man: remove ceph-disk, ceph-detect-init from cmake
	doc/man/ceph-volume remove doc reference to ceph-disk
	doc/man: remove ceph-disk, ceph-detect-init
	qa/suites: remove ceph-disk
	qa/run-standalone.sh: remove requirement for ceph-detect-init virtualenv
	qa/workunits: remove ceph-detect-init from rbdmapfile test
	qa/workunits: remove ceph-detect-init from ceph-helpers-root.sh
	qa/workunits: remove ceph-disk
	build: remove ceph-disk from freebsd script
	cmake: remove ceph-disk, ceph-detect-init tox tests
	init-ceph: remove ceph-disk
	cmake: remove top-level entries for ceph-disk, ceph-detect-init
	debian: remove ceph-detect-init references
	debian: remove ceph-disk references
	src: remove ceph-detect-init tool
	rpm: remove ceph-disk, ceph-detect-init from spec file
	test: remove subman script
	script: remove subman script
	udev: remove parttypeuuid rules for ceph-disk
	tool remove ceph-disk from ps-ceph.pl
	upstart: remove ceph-disk conf file
	systemd: remove ceph-disk from CMakeLists
	systemd: remove ceph-disk service
	udev: remove ceph-disk rules
	src: remove ceph-disk tool
2018-06-19 07:07:55 -05:00
David Zafman
f886ebba08 test: Fix some function descriptions
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-06-18 14:09:14 -07:00
Erwan Velu
2ce480b8fd qa/standalone/ceph-helpers.sh: Fixing comment for wait_for_health()
wait_for_health doesn't check if the cluster is making progress. So
let's adjust the comment accordingly.

Signed-off-by: Erwan Velu <erwan@redhat.com>
2018-06-14 11:06:52 +02:00
Erwan Velu
62d2646c30 qa/standalone/ceph-helpers.sh: Defining custom timeout for wait_for_clean()
wait_for_clean() uses the default timeout of 300 seconds (5 minutes).

wait_for_clean() tries to reach a clean status within that timeout
_or_ resets its counter if any progress was made between loops.

When the cluster is sane, recovery should take less than 5 minutes,
but if the cluster has died, waiting 5 minutes for nothing is
inefficient.

This patch defines a custom timeout for wait_for_clean() so that it
does not wait much more than 1m30 (90 seconds). If no progress is made
in that period, there is very little chance that it will reach a valid
state anyhow.

Signed-off-by: Erwan Velu <erwan@redhat.com>
2018-06-14 11:06:52 +02:00
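The "time out unless progress is being made" behavior described in the commit message above can be sketched generically. This is not the wait_for_clean() implementation: `wait_with_progress`, `check_done`, and `progress_marker` are hypothetical names, and the two callbacks are assumed to be commands or functions supplied by the caller. The idle counter only grows while the progress marker stays unchanged, and it resets whenever the marker moves.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the ceph-helpers.sh wait_for_clean().
# Usage: wait_with_progress <timeout> <check_done> <progress_marker> [interval]
#   check_done       command that returns 0 once the desired state is reached
#   progress_marker  command that prints a value which changes on progress
#   interval         whole seconds to sleep between checks (default 1)
wait_with_progress() {
    local timeout=$1 check_done=$2 progress_marker=$3
    local interval=${4:-1}
    local idle=0 last current

    last=$("$progress_marker")
    while ! "$check_done"; do
        # Give up only after <timeout> seconds without any progress.
        if [ "$idle" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        current=$("$progress_marker")
        if [ "$current" != "$last" ]; then
            # Progress was made: restart the idle clock.
            idle=0
            last=$current
        else
            idle=$((idle + interval))
        fi
    done
    return 0
}
```

With a 90-second timeout, a caller would give up on a stalled cluster after about a minute and a half, while a slow but advancing recovery keeps the loop alive.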
Alfredo Deza
5b3a540045 qa/standalone remove ceph-disk from activate_osd helper
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2018-06-13 15:16:27 -04:00
Alfredo Deza
aa4f5569c3 qa/standalone remove virtualenv paths for ceph-disk and ceph-detect-init
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2018-06-13 15:16:27 -04:00
Dan Mick
50f2b72f2f ceph-helpers.sh: remove ceph-disk, set up osds directly
Signed-off-by: Dan Mick <dan.mick@redhat.com>
2018-06-13 15:16:26 -04:00
Nathan Cutler
f03b9028f5 qa/standalone/ceph-helpers.sh: provide argument to dirname
Fixes: http://tracker.ceph.com/issues/23805
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2018-04-20 10:10:15 +02:00
David Zafman
ce9c029858 test: Eliminate use of bc (use awk) in get_timeout_delays()
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-03-28 10:24:33 -07:00
David Zafman
51b740ad41 test: Fail upon flush_pg_stats timeout
Signed-off-by: David Zafman <dzafman@redhat.com>
2018-03-11 16:26:11 -07:00
Sage Weil
5ee5bbace1 qa/standalone: drop CEPH_LIB hacks
Signed-off-by: Sage Weil <sage@redhat.com>
2018-03-06 14:44:49 -06:00
Kefu Chai
ac56a202fd qa/standalone: extract delete_pool()
some tests, like osd-backfill-stats.sh, are using delete_pool(), but
they don't have this function defined. other standalone tests each
define this function separately, so it would be simpler to
consolidate the definitions in ceph-helpers.sh.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2018-02-28 15:40:28 +08:00
Patrick Donnelly
46c25abd1c
test/encoding: refactor to avoid escaping shell magic
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-02-07 18:03:05 -08:00
David Zafman
aeba36a660 ceph-helpers.sh: Add flush_pg_stats() to wait_for_clean() to make it reliable
osd-scrub-repair.sh: Fixes for omap keys landing on different OSDs due to flush

Signed-off-by: David Zafman <dzafman@redhat.com>
2018-01-14 18:17:23 -08:00
Sage Weil
f33ab7e03a Merge remote-tracking branch 'gh/mimic-dev1'
2017-12-20 15:08:30 -06:00
Sage Weil
06b7707cee
Merge pull request #19456 from liewegas/wip-22373
qa/standalone/ceph-helpers: pass --verbose to ceph-disk
2017-12-19 11:55:07 -06:00
Kefu Chai
2ceff9eb4e qa/standalone: pass options using --<option-name>=<value>
not `--<option-name> <value>`, otherwise `ceph-authtool` would error
out:

$ CEPH_ARGS='--osd-map-max-advance 1000' bin/ceph-authtool --gen-print-key
bin/ceph-authtool: unexpected '1000'
usage: ceph-authtool keyringfile [OPTIONS]...
....

but using the syntax of `--<option-name>=<value>`, it works:

$ CEPH_ARGS='--osd-map-max-advance=1000' bin/ceph-authtool --gen-print-key
AQBAhTNamf5+ABAASkAp/6IGq7LkUTEOMp/fgw==

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-12-15 16:19:15 +08:00
Kefu Chai
4e621762ed qa/standalone/ceph-helpers.sh: silence ceph-disk DEPRECATION_WARNING
Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-12-13 19:42:50 +08:00
Sage Weil
86dc162686 qa/standalone/ceph-helpers: pass --verbose to ceph-disk
Signed-off-by: Sage Weil <sage@redhat.com>
2017-12-12 12:56:45 -06:00
Sage Weil
c6529ad93e qa/standalone/ceph-helpers.sh: fix full ratio ordering
Signed-off-by: Sage Weil <sage@redhat.com>
2017-11-29 16:07:12 -06:00
Sage Weil
15b63d6795 qa/standalone/scrub/osd-scrub-repair: no -y to diff
With -y you can't see the entire line when it is long, which is
needed to identify the diff failure in
http://tracker.ceph.com/issues/21618

Instead, let the interactive user specify the option if they want it.

Signed-off-by: Sage Weil <sage@redhat.com>
2017-10-03 14:35:35 -05:00
Kefu Chai
279d2980fa qa/standalone/ceph-helpers.sh: pass btrfs subvolume options the right way
with the latest btrfs-progs, it complains with

$ sudo btrfs subvolume list . -t
btrfs subvolume list: too many arguments

so, we need to pass `-t` right after `list` subcommand.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-09-15 12:19:50 +08:00
Kefu Chai
0c47aa8217 qa: respect $TEMPDIR
ceph-disk and ceph-detect-init are built in $TEMPDIR if it's defined.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2017-09-15 12:19:50 +08:00
Kefu Chai
30b5b4627c Merge pull request #16494 from asomers/bin_bash
misc: Fix bash path in shebangs

Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2017-08-27 10:14:14 +08:00