This forces them to be unclean, *then* stale. This ensures
that after they are both down they are both *always* unclean,
whereas previously it would be possible for them to be only
stale and not unclean.
Signed-off-by: Sage Weil <sage@redhat.com>
So that for folks with sources in typical locations
(or typical on my workstation at least!) invoking
vstart_runner is less of a mouthful.
Signed-off-by: John Spray <john.spray@redhat.com>
Only do the failure injection 50% of the time; otherwise, just
kill as usual.
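
A minimal sketch of the coin flip described above. The `inject_failure` and
`kill` callables are hypothetical stand-ins, not names from the task code, and
the `rng` parameter exists only so the choice can be made deterministic in a
test:

```python
import random


def kill_or_inject(inject_failure, kill, rng=random, probability=0.5):
    """Run the failure injection with the given probability, else a plain kill.

    `inject_failure` and `kill` are hypothetical callables standing in for
    whatever the task actually invokes.
    """
    if rng.random() < probability:
        inject_failure()
        return "inject"
    kill()
    return "kill"
```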
Signed-off-by: Sage Weil <sage@redhat.com>
* tasks/rebuild_mondb.py: this task
  1. removes store.db on all monitors
  2. rebuilds the store.db for the first mon
  3. starts the first mon
  4. runs mkfs on the other mons
  5. revives them
* suites/rados/singleton/all/rebuild-mon-db.yaml
  1. runs rados/test.sh
  2. runs the rebuild_mondb task
Fixes: http://tracker.ceph.com/issues/17179
Signed-off-by: Kefu Chai <kchai@redhat.com>
The rebuild_mondb task cannot provide the OSDs with a live monitor, so
self.manager.revive_osd() will always time out after calling cot.
Signed-off-by: Kefu Chai <kchai@redhat.com>
The lsb_release binary is deprecated and requires installation of packages.
The /etc/os-release file is guaranteed to be present.
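
A minimal sketch of reading the distribution from os-release instead of
shelling out to lsb_release. The function names are illustrative and not taken
from this change; the parsing assumes the shell-style `KEY=value` format that
os-release files use:

```python
import shlex


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, raw = line.partition("=")
        # shlex strips the optional shell-style quotes around the value
        parts = shlex.split(raw)
        info[key] = parts[0] if parts else ""
    return info


def read_os_release(path="/etc/os-release"):
    """Read and parse the os-release file at `path`."""
    with open(path) as f:
        return parse_os_release(f.read())
```

A caller would then look at keys such as `ID` and `VERSION_ID` in the returned
dict.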
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Not doing so leads to issues and can interfere with subsequent jobs.
One example is the invocation of vgs(8) during the initial test setup:
it will issue a read to the left-behind rbd device(s) whose backing
cluster is long gone, locking up the job.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
The ceph-test package is required for teuthology. It is disabled to speed up
the build in OBS, but here we need it enabled unconditionally.
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Previously, if errors occurred during healthy(), then
the finally block would invoke osd_scrub_pgs, which relies
on CephManager being constructed, and it would die, hiding
the original exception.
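
The masking problem can be illustrated with a small sketch. It shows one way
to guard a finally block and is not necessarily how this change does it;
`setup` and `scrub` are hypothetical stand-ins for the healthy() and
osd_scrub_pgs steps:

```python
def run_task(setup, scrub):
    """Run `setup`, then `scrub` in a finally block only if setup succeeded.

    `setup` returns the manager object that `scrub` needs. Both callables
    are hypothetical stand-ins used for illustration.
    """
    manager = None
    try:
        manager = setup()
        return manager
    finally:
        # Without this guard, scrub(None) would raise here and replace the
        # exception from setup() as the one the caller sees.
        if manager is not None:
            scrub(manager)
```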
Signed-off-by: John Spray <john.spray@redhat.com>
When killing an osd, split all pools with a low threshold.
This will slow down tests, but should not impact correctness.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
When multiple clients run in parallel on the same machine and they
try to get workunits from a repository that is not GitHub, they must
git clone in a directory that is suffixed as srcdir. Otherwise they
will conflict with each other.
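
A minimal sketch of a per-client clone directory. The `clone.<role>` layout
and the `clone_dir` name are assumptions for illustration; the point is only
that the path includes the client role, so parallel clients never share a
checkout:

```python
import os


def clone_dir(testdir, role):
    """Return a clone directory unique to one client role.

    The "clone.<role>" layout is an assumption made for illustration.
    """
    return os.path.join(testdir, "clone.{role}".format(role=role))
```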
Fixes: http://tracker.ceph.com/issues/17116
Signed-off-by: Loic Dachary <loic@dachary.org>
otherwise the monitor could reject the command:
```
Refusing to reweight: we only used 588084 kb used across all osds!
```
if the average used space is smaller than
`mon_reweight_min_bytes_per_osd`.
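
The condition can be sketched as a simple average check. This is an
illustration of the rule as stated above, not the monitor's code; the function
name is made up, and the OSD count and threshold in the usage note are
hypothetical:

```python
def reweight_would_be_refused(total_kb_used, num_osds, min_bytes_per_osd):
    """Return True if the average used bytes per OSD is below the threshold."""
    if num_osds <= 0:
        raise ValueError("num_osds must be positive")
    average_bytes = total_kb_used * 1024 / num_osds
    return average_bytes < min_bytes_per_osd
```

For example, with the 588084 kb from the message spread over a hypothetical
6 OSDs and an assumed threshold of 100 MiB, the average is about 95.7 MiB per
OSD, so the check returns True.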
Fixes: http://tracker.ceph.com/issues/16805
Signed-off-by: Kefu Chai <kchai@redhat.com>
vstart_runner identifies ceph daemons by their arguments in `ps -x`
output, but cannot find those arguments when commands are cut off at
terminal width. Add -ww for wide output.
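
A sketch of why truncation matters when matching daemons in ps output. It
parses text that has already been captured from ps. The `-i <id>` argument
convention and the `find_daemon_pids` name are assumptions for illustration,
not vstart_runner's actual matching code:

```python
def find_daemon_pids(ps_output, daemon_type, daemon_id):
    """Return PIDs of lines that run ceph-<type> with "-i <id>" arguments."""
    name = "ceph-" + daemon_type
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # header or blank line
        args = fields[1:]
        has_binary = any(a == name or a.endswith("/" + name) for a in args)
        # "-i" must be immediately followed by the daemon id
        has_id = any(a == "-i" and b == daemon_id
                     for a, b in zip(args, args[1:]))
        if has_binary and has_id:
            pids.append(int(fields[0]))
    return pids
```

If a line is cut off before the `-i <id>` pair, the daemon is silently
missed, which is the failure wide output avoids.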
Signed-off-by: Douglas Fuller <dfuller@redhat.com>