Only do the failure injection 50% of the time; otherwise, just
kill as usual.
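A minimal sketch of the coin flip, not the actual task code. The names kill_osd, inject_failure and plain_kill are hypothetical and only stand in for the two kill paths:

```python
import random


def kill_osd(osd_id, inject_failure, plain_kill, rng=random):
    """Kill an OSD, taking the failure-injection path about half the time.

    inject_failure and plain_kill are hypothetical callables that stand in
    for the two kill paths; rng only needs a random() method.
    """
    if rng.random() < 0.5:
        inject_failure(osd_id)
        return "inject"
    # Otherwise, just kill as usual.
    plain_kill(osd_id)
    return "kill"
```

Passing rng explicitly lets a test force either branch.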
Signed-off-by: Sage Weil <sage@redhat.com>
# Conflicts:
# tasks/ceph_manager.py
* tasks/rebuild_mondb.py: this task
1. removes the store.db on all monitors
2. rebuilds the store.db for the first mon
3. starts the first mon
4. runs mkfs on the other mons
5. revives them
* suites/rados/singleton/all/rebuild-mon-db.yaml
1. runs rados/test.sh
2. runs the rebuild_mondb task
Fixes: http://tracker.ceph.com/issues/17179
Signed-off-by: Kefu Chai <kchai@redhat.com>
While the rebuild_mondb task runs, no monitor is alive to serve the OSDs, so
self.manager.revive_osd() will always time out after calling cot.
Signed-off-by: Kefu Chai <kchai@redhat.com>
The lsb_release binary is deprecated and requires installation of packages.
The /etc/os-release file is guaranteed to be present.
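A minimal sketch of reading the distro ID from os-release instead of calling lsb_release. The function names are hypothetical, and this is not necessarily how the change itself does it:

```python
import shlex


def parse_os_release(text):
    """Parse os-release content into a dict mapping KEY to its value."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # Values may be shell-quoted; shlex.split strips the quotes.
        parts = shlex.split(raw)
        info[key] = parts[0] if parts else ""
    return info


def read_os_release(path="/etc/os-release"):
    """Read and parse the os-release file at the given path."""
    with open(path) as f:
        return parse_os_release(f.read())
```

For example, read_os_release().get("ID") gives the lowercase distro identifier where the file defines it.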
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Not doing so leads to issues and can interfere with subsequent jobs.
One example is the invocation of vgs(8) during the initial test setup:
it will issue a read to the left-behind rbd device(s) whose backing
cluster is long gone, locking up the job.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
The ceph-test package is required for teuthology. It is disabled to speed up
the build in OBS, but here we need it enabled unconditionally.
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Previously, if errors occurred during healthy(), then
the finally block would invoke osd_scrub_pgs, which relies
on CephManager being constructed, and it would die, hiding
the original exception.
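A minimal sketch of the guard, assuming the manager is constructed only after healthy() succeeds. run_task, make_manager and scrub_pgs are hypothetical stand-ins, not the real task code:

```python
def run_task(make_manager, healthy, scrub_pgs):
    """Run setup, then scrub in cleanup only if the manager exists."""
    manager = None
    try:
        healthy()  # may raise before the manager exists
        manager = make_manager()
    finally:
        # Only scrub if setup got far enough to construct the manager;
        # otherwise the cleanup would raise and mask the original error.
        if manager is not None:
            scrub_pgs(manager)
```

With the guard in place, an error raised by healthy() propagates unchanged.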
Signed-off-by: John Spray <john.spray@redhat.com>
When killing an osd, split all pools with a low threshold.
This will slow down tests, but should not impact correctness.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
When multiple clients run in parallel on the same machine and
try to get workunits from a repository that is not github, each must
git clone into a directory that is suffixed like srcdir. Otherwise they
will conflict with each other.
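A minimal sketch of per-client clone directories. It assumes the suffix is the client role, which this message does not spell out; clone_dir and the "clone." prefix are hypothetical:

```python
def clone_dir(testdir, role):
    """Return a clone directory that is unique per client role.

    The "clone.<role>" layout is an assumption made for illustration.
    """
    return "{}/clone.{}".format(testdir, role)
```

Two clients on the same machine then clone into different directories, for example clone.client.0 and clone.client.1.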
Fixes: http://tracker.ceph.com/issues/17116
Signed-off-by: Loic Dachary <loic@dachary.org>
Otherwise the monitor could reject the command:
```
Refusing to reweight: we only used 588084 kb used across all osds!
```
if the average used space is smaller than
`mon_reweight_min_bytes_per_osd`.
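A minimal sketch of the check described above, not the monitor's actual code. reweight_allowed and its parameters are hypothetical, and the OSD count and threshold in the example are assumed values:

```python
def reweight_allowed(used_kb_total, num_osds, min_bytes_per_osd):
    """Return True if the average used space per OSD meets the threshold.

    used_kb_total is the used space summed across all OSDs, in KiB.
    """
    avg_used_bytes = used_kb_total * 1024 // num_osds
    return avg_used_bytes >= min_bytes_per_osd
```

With 588084 kb used across an assumed 6 OSDs and an assumed 100 MiB threshold, reweight_allowed(588084, 6, 100 * 1024 * 1024) is False; a threshold of 0 lets the command through.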
Fixes: http://tracker.ceph.com/issues/16805
Signed-off-by: Kefu Chai <kchai@redhat.com>
vstart_runner identifies ceph daemons by their arguments in ps -x output,
but it can't find them because command lines are cut off at terminal
width. Add -ww for wide output.
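A minimal sketch of matching daemons against untruncated ps output. The helper names are hypothetical, and the parsing assumes the usual five columns (PID, TTY, STAT, TIME, COMMAND):

```python
import subprocess


def parse_ps(output):
    """Turn ps output into a list of (pid, full command line) pairs."""
    procs = []
    for line in output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 4)  # PID TTY STAT TIME COMMAND
        if len(fields) == 5 and fields[0].isdigit():
            procs.append((int(fields[0]), fields[4]))
    return procs


def list_processes():
    """Run ps with -ww so command lines are not cut at terminal width."""
    out = subprocess.check_output(["ps", "-xww"], universal_newlines=True)
    return parse_ps(out)


def find_pids(procs, *needles):
    """Return pids whose command line contains every given substring."""
    return [pid for pid, cmd in procs if all(n in cmd for n in needles)]
```

The substring match is deliberately naive: "-i 1" would also match "-i 10", so a real lookup needs a stricter comparison.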
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
Fortunately we already have a test that creates the condition,
so just tweak it to exceed the 150% threshold for the health warning,
and check that the health message appears.
Signed-off-by: John Spray <john.spray@redhat.com>
The rest of the test is still valuable to ensure that we detect missing
items which are not in the log, but now that the missing set is
explicitly persisted, the divergent priors set isn't a special case
and won't have special log lines to check for.
Signed-off-by: Samuel Just <sjust@redhat.com>
Test the use cases for the authentication metadata stored
by the volume client:
* Obtain the list of auth IDs having access to a volume.
* Restrict volume access to auth IDs of a single (OpenStack)
tenant to enforce strong tenant isolation of volumes.
Signed-off-by: Ramana Raja <rraja@redhat.com>
We changed the default to k+1 instead of k. Adjust test to compensate.
Fixes: http://tracker.ceph.com/issues/16416
Signed-off-by: Samuel Just <sjust@redhat.com>