ceph/tasks/thrashosds.py

"""
Thrash -- Simulate random osd failures.
"""
import contextlib
import logging
import ceph_manager
from teuthology import misc as teuthology


log = logging.getLogger(__name__)
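

# Illustrative sketch only; task() below does not call this helper.
# The task() docstring says chance_down may be given either as an integer
# (e.g., 75) or as a float probability (e.g., 0.75). This shows one way the
# two forms could be reconciled. ASSUMPTION: an integer is a percentage.
# The real conversion is done inside ceph_manager.Thrasher and may differ;
# the name _normalize_chance is hypothetical and applies to chance_down only.
def _normalize_chance(value):
    """Return value as a probability in [0, 1]."""
    if isinstance(value, int) and not isinstance(value, bool):
        value = value / 100.0  # integer percentage, e.g. 75 -> 0.75
    value = float(value)
    if not 0.0 <= value <= 1.0:
        raise ValueError('chance must be within [0, 1], got %r' % value)
    return value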

@contextlib.contextmanager
def task(ctx, config):
    """
    "Thrash" the OSDs by randomly marking them out/down (and then back
    in) until the task is ended. This loops, and every op_delay
    seconds it randomly chooses to add or remove an OSD (even odds).
    It will not add an OSD when no more than min_out OSDs are out of
    the cluster, and it will not remove one when no more than min_in
    OSDs are in the cluster.

    All commands are run on mon0, and thrashing stops when __exit__ is
    called.

    The config is optional, and is a dict containing some or all of:

    min_in: (default 3) the minimum number of OSDs to keep in the
       cluster

    min_out: (default 0) the minimum number of OSDs to keep out of the
       cluster

    op_delay: (5) the length of time, in seconds, to sleep between
       changing an OSD's status

    min_dead: (0) minimum number of osds to leave down/dead.

    max_dead: (0) maximum number of osds to leave down/dead before waiting
       for clean.  This should probably be num_replicas - 1.

    clean_interval: (60) the approximate length of time to loop before
       waiting until the cluster goes clean. (In reality this is used
       to probabilistically choose when to wait, and the method used
       makes it closer to -- but not identical to -- the half-life.)

    scrub_interval: (-1) the approximate length of time to loop before
       waiting until a scrub is performed while cleaning. (In reality
       this is used to probabilistically choose when to wait, and it
       only applies to the cases where cleaning is being performed.)
       -1 is used to indicate that no scrubbing will be done.

    chance_down: (0.4) the probability that the thrasher will mark an
       OSD down rather than marking it out. (The thrasher will not
       consider that OSD out of the cluster, since presently an OSD
       wrongly marked down will mark itself back up again.) This value
       can be either an integer percentage (e.g., 75) or a float
       probability (e.g., 0.75).

    chance_test_min_size: (0) chance to run test_pool_min_size,
       which:
       - kills all but one osd
       - waits
       - kills that osd
       - revives all other osds
       - verifies that the osds fully recover

    timeout: (360) the number of seconds to wait for the cluster
       to become clean after each cluster change. If this doesn't
       happen within the timeout, an exception will be raised.

    revive_timeout: (150) number of seconds to wait for an osd asok to
       appear after attempting to revive the osd

    thrash_primary_affinity: (true) randomly adjust primary-affinity

    chance_pgnum_grow: (0) chance to increase a pool's pg_num
    chance_pgpnum_fix: (0) chance to adjust a pool's pgp_num to match
       its pg_num
    pool_grow_by: (10) amount to increase pg_num by
    max_pgs_per_pool_osd: (1200) don't expand pools past this many pgs
       per osd

    pause_short: (3) duration of short pause
    pause_long: (80) duration of long pause
    pause_check_after: (50) assert osd down after this long
    chance_inject_pause_short: (1) chance of injecting short stall
    chance_inject_pause_long: (0) chance of injecting long stall

    clean_wait: (0) duration to wait before resuming thrashing once clean

    sighup_delay: (0.1) duration to delay between sending signal.SIGHUP
       to a random live osd

    powercycle: (false) whether to power cycle the node instead
        of just the osd process. Note that this assumes that a single
        osd is the only important process on the node.

    chance_test_backfill_full: (0) chance to simulate full disks stopping
        backfill

    chance_test_map_discontinuity: (0) chance to test map discontinuity
    map_discontinuity_sleep_time: (40) time to wait for map trims

    ceph_objectstore_tool: (true) whether to export/import a pg while
        an osd is down
    chance_move_pg: (1.0) chance of moving a pg if more than 1 osd is
        down (default 100%)

    example:

    tasks:
    - ceph:
    - thrashosds:
        chance_down: 10
        op_delay: 3
        min_in: 1
        timeout: 600
    - interactive:
    """
    if config is None:
        config = {}
    assert isinstance(config, dict), \
        'thrashosds task only accepts a dict for configuration'
    # add default value for sighup_delay (applied before overrides merge)
    config.setdefault('sighup_delay', 0.1)
    overrides = ctx.config.get('overrides', {})
    teuthology.deep_merge(config, overrides.get('thrashosds', {}))

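    # Test the value, not just the key: "powercycle: false" must not
    # trigger the IPMI console setup below.
    if config.get('powercycle'):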

        # sync everyone first to avoid collateral damage to / etc.
        log.info('Doing preliminary sync to avoid collateral damage...')
        ctx.cluster.run(args=['sync'])

        if 'ipmi_user' in ctx.teuthology_config:
            # only the target names are needed here, not their values
            for t in ctx.config['targets']:
                host = t.split('@')[-1]
                shortname = host.split('.')[0]
                from teuthology.orchestra import remote as oremote
                console = oremote.getRemoteConsole(
                    name=host,
                    ipmiuser=ctx.teuthology_config['ipmi_user'],
                    ipmipass=ctx.teuthology_config['ipmi_password'],
                    ipmidomain=ctx.teuthology_config['ipmi_domain'])
                cname = '{host}.{domain}'.format(
                    host=shortname,
                    domain=ctx.teuthology_config['ipmi_domain'])
                log.debug('checking console status of %s', cname)
                if not console.check_status():
                    log.info(
                        'Failed to get console status for '
                        '%s, disabling console...',
                        cname)
                    console = None
                else:
                    # find the remote for this console and add it
                    remotes = [
                        r for r in ctx.cluster.remotes.keys() if r.name == t]
                    if len(remotes) != 1:
                        raise Exception(
                            'Too many (or too few) remotes '
                            'found for target {t}'.format(t=t))
                    remotes[0].console = console
                    log.debug('console ready on %s', cname)

            # check that all osd remotes have a valid console
            osds = ctx.cluster.only(teuthology.is_type('osd'))
            for remote in osds.remotes:
                if not remote.console:
                    raise Exception(
                        'IPMI console required for powercycling, '
                        'but not available on osd role: {r}'.format(
                            r=remote.name))

    log.info('Beginning thrashosds...')
    thrash_proc = ceph_manager.Thrasher(
        ctx.manager,
        config,
        logger=log.getChild('thrasher')
        )
    try:
        yield
    finally:
        log.info('joining thrashosds')
        thrash_proc.do_join()
        ctx.manager.wait_for_recovery(config.get('timeout', 360))
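

# Illustrative sketch only; nothing above calls these helpers.
# The task() docstring says clean_interval is used to choose
# probabilistically when to wait for the cluster to go clean. One way such
# a scheme could work is sketched here. ASSUMPTION: on each loop iteration
# the thrasher waits with probability op_delay / clean_interval. The real
# decision is made inside ceph_manager.Thrasher and may differ; both helper
# names are hypothetical. Under this assumption the mean time between waits
# is roughly clean_interval, and the half-life is somewhat shorter.
import math
import random


def _should_wait_for_clean(op_delay, clean_interval, rng=random):
    """Return True with probability op_delay / clean_interval (max 1)."""
    if clean_interval <= 0:
        return False
    return rng.random() < min(1.0, float(op_delay) / clean_interval)


def _clean_wait_half_life(op_delay, clean_interval):
    """Seconds after which a wait has occurred with probability 0.5."""
    if clean_interval <= 0:
        raise ValueError('clean_interval must be positive')
    p = min(1.0, float(op_delay) / clean_interval)
    if p >= 1.0:
        # a wait happens on every iteration
        return float(op_delay)
    # solve (1 - p) ** n == 0.5 for n iterations of op_delay seconds each
    return op_delay * math.log(0.5) / math.log(1.0 - p)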