Added ability to implement scrubbing while thrashing
(scrub_interval in config can be set to an interval
similar to how clean_interval is set). Defaults to 0,
which indicates that no scrubbing will take place.
Add scrub_interval description to thrashosds docstring.
Fixes: 7199
Signed-off-by: Warren Usui <warren.usui@inktank.com>
and '(remote,) = ctx.cluster.only(role).remotes.iterkeys()' would fail with
ValueError and no message if there were less than 0 or more than 1 key.
Now a new function, get_single_remote_value() is called which prints out
more understandable messages.
Fixes: 7510
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Warren Usui <warren.usui@inktank.com>
See #7171. In rare cases CRUSH can't handle it when only 2/6 of
the OSDs are marked in. Avoid those situations for now.
Signed-off-by: Sage Weil <sage@inktank.com>
This included:
A). changes made so that full path names on some files were used
(scheduled tasks started in different home directories).
B.) Changes to insure tasks come up on the beanstalkc queue properly,
C.) Finding and inserting the libvirt eqivalent code for vm machines
in order to simulate ipmi actions,
D.) Fix host key code, report valgrind issue more clearly.
E.) Some message and downburst call changes.
Fix#4988Fix#5122
Signed-off-by: Warren Usui <warren.usui@inktank.com>
We don't need to setup the ipmi console on runs that
don't use powercycling, so delay setup of the RemoteConsole
with ipmi to the thrashosd task and only then if the powercycle
config is set. This avoids spurious test failures from flaky
ipmi.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
This patch defines a RemoteConsole class associated
with each Remote class instance, allowing
power cycling a target through ipmi.
Fixes/Implements #3782.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Thrasher can now with configurable frequency test min_size by
taking down all but one osd, waiting, killing that osd and bringing
back the others, and verifying that the cluster goes clean.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Clean now also means the correct number of replicas, whereas recovered
means we have done all the work we can do given the replicas/osds we have.
For example, degraded and clean are now mutually exclusive.
Also move away from 'till'.
ctx.daemons will now be an instance of CephState.
ctx.daemons.get_daemon(role, id).stop() to stop daemon, retart() to
restart the daemon, etc.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>