ceph/teuthology/task/thrashosds.py

import contextlib
import logging
import ceph_manager
from teuthology import misc as teuthology


log = logging.getLogger(__name__)

@contextlib.contextmanager
def task(ctx, config):
    """
    "Thrash" the OSDs by randomly marking them out/down (and then back
    in) until the task is ended. This loops, and every op_delay
    seconds it randomly chooses to add or remove an OSD (even odds)
    unless there are fewer than min_out OSDs out of the cluster, or
    more than min_in OSDs in the cluster.

    All commands are run on mon0 and it stops when __exit__ is called.

    The config is optional, and is a dict containing some or all of:

    min_in: (default 2) the minimum number of OSDs to keep in the
       cluster

    min_out: (default 0) the minimum number of OSDs to keep out of the
       cluster

    op_delay: (5) the length of time to sleep between changing an
       OSD's status

    min_dead: (0) minimum number of osds to leave down/dead.

    max_dead: (0) maximum number of osds to leave down/dead before waiting
       for clean.  This should probably be num_replicas - 1.

    clean_interval: (60) the approximate length of time to loop before
       waiting until the cluster goes clean. (In reality this is used
       to probabilistically choose when to wait, and the method used
       makes it closer to -- but not identical to -- the half-life.)

    chance_down: (0.4) the probability that the thrasher will mark an
       OSD down rather than marking it out. (The thrasher will not
       consider that OSD out of the cluster, since presently an OSD
       wrongly marked down will mark itself back up again.) This value
       can be either an integer (eg, 75) or a float probability (eg
       0.75).

    chance_test_min_size: (0) chance to run test_pool_min_size,
       which:
       - kills all but one osd
       - waits
       - kills that osd
       - revives all other osds
       - verifies that the osds fully recover

    timeout: (360) the number of seconds to wait for the cluster
       to become clean after each cluster change. If this doesn't
       happen within the timeout, an exception will be raised.

    revive_timeout: (75) number of seconds to wait for an osd asok to
       appear after attempting to revive the osd

    chance_pgnum_grow: (0) chance to increase a pool's size
    chance_pgpnum_fix: (0) chance to adjust pgpnum to pg for a pool
    pool_grow_by: (10) amount to increase pgnum by
    max_pgs_per_pool_osd: (1200) don't expand pools past this size per osd

    pause_short: (3) duration of short pause
    pause_long: (80) duration of long pause
    pause_check_after: (50) assert osd down after this long
    chance_inject_pause_short: (1) chance of injecting short stall
    chance_inject_pause_long: (0) chance of injecting long stall

    powercycle: (false) whether to power cycle the node instead
        of just the osd process. Note that this assumes that a single
        osd is the only important process on the node.

    chance_test_backfill_full: (0) chance to simulate full disks stopping
        backfill

    example:

    tasks:
    - ceph:
    - thrashosds:
        chance_down: 10
        op_delay: 3
        min_in: 1
        timeout: 600
    - interactive:
    """
    if config is None:
        config = {}
    assert isinstance(config, dict), \
        'thrashosds task only accepts a dict for configuration'

    if 'powercycle' in config:

        # sync everyone first to avoid collateral damage to / etc.
        log.info('Doing preliminary sync to avoid collateral damage...')
        ctx.cluster.run(args=['sync'])

        if 'ipmi_user' in ctx.teuthology_config:
            for t, key in ctx.config['targets'].iteritems():
                host = t.split('@')[-1]
                shortname = host.split('.')[0]
                from ..orchestra import remote as oremote
                console = oremote.getRemoteConsole(
                    name=host,
                    ipmiuser=ctx.teuthology_config['ipmi_user'],
                    ipmipass=ctx.teuthology_config['ipmi_password'],
                    ipmidomain=ctx.teuthology_config['ipmi_domain'])
                cname = '{host}.{domain}'.format(
                    host=shortname,
                    domain=ctx.teuthology_config['ipmi_domain'])
                log.debug('checking console status of %s' % cname)
                if not console.check_status():
                    log.info(
                        'Failed to get console status for '
                        '%s, disabling console...'
                        % cname)
                    console=None
                else:
                    # find the remote for this console and add it
                    remotes = [
                        r for r in ctx.cluster.remotes.keys() if r.name == t]
                    if len(remotes) != 1:
                        raise Exception(
                            'Too many (or too few) remotes '
                            'found for target {t}'.format(t=t))
                    remotes[0].console = console
                    log.debug('console ready on %s' % cname)

            # check that all osd remotes have a valid console
            osds = ctx.cluster.only(teuthology.is_type('osd'))
            for remote, _ in osds.remotes.iteritems():
                if not remote.console:
                    raise Exception(
                        'IPMI console required for powercycling, '
                        'but not available on osd role: {r}'.format(
                            r=remote.name))

    log.info('Beginning thrashosds...')
    first_mon = teuthology.get_first_mon(ctx, config)
    (mon,) = ctx.cluster.only(first_mon).remotes.iterkeys()
    manager = ceph_manager.CephManager(
        mon,
        ctx=ctx,
        config=config,
        logger=log.getChild('ceph_manager'),
        )
    ctx.manager = manager
    thrash_proc = ceph_manager.Thrasher(
        manager,
        config,
        logger=log.getChild('thrasher')
        )
    try:
        yield
    finally:
        log.info('joining thrashosds')
        thrash_proc.do_join()
        manager.wait_for_recovery(config.get('timeout', 360))
added thrashosds Signed-off-by: Samuel Just <samuel.just@dreamhost.com> 2011-06-13 23:36:21 +00:00			`import contextlib`
			`import logging`
			`import ceph_manager`
thrashosds: make it work when first mon isn't mon.0 2011-08-31 20:56:42 +00:00			`from teuthology import misc as teuthology`

added thrashosds Signed-off-by: Samuel Just <samuel.just@dreamhost.com> 2011-06-13 23:36:21 +00:00
			`log = logging.getLogger(__name__)`

			`@contextlib.contextmanager`
			`def task(ctx, config):`
			`"""`
thrasher: allow a config to set values Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> 2011-08-25 22:18:42 +00:00			`"Thrash" the OSDs by randomly marking them out/down (and then back`
thrashosds: no camelcaps, add some whitespace 2011-08-31 20:21:30 +00:00			`in) until the task is ended. This loops, and every op_delay`
			`seconds it randomly chooses to add or remove an OSD (even odds)`
			`unless there are fewer than min_out OSDs out of the cluster, or`
			`more than min_in OSDs in the cluster.`

thrasher: allow a config to set values Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> 2011-08-25 22:18:42 +00:00			`All commands are run on mon0 and it stops when __exit__ is called.`
thrasher: improve documentation a little Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> 2011-08-25 22:27:30 +00:00
thrasher: allow a config to set values Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> 2011-08-25 22:18:42 +00:00			`The config is optional, and is a dict containing some or all of:`
added thrashosds Signed-off-by: Samuel Just <samuel.just@dreamhost.com> 2011-06-13 23:36:21 +00:00
thrashosds: no camelcaps, add some whitespace 2011-08-31 20:21:30 +00:00			`min_in: (default 2) the minimum number of OSDs to keep in the`
			`cluster`

			`min_out: (default 0) the minimum number of OSDs to keep out of the`
			`cluster`

			`op_delay: (5) the length of time to sleep between changing an`
			`OSD's status`

thrasher: adjust min_dead default Make this 1, not 2. That's a bit more friendly. It doesn't strictly matter, tho, since we revive osds before waiting for clean. 2012-01-11 00:20:50 +00:00			`min_dead: (0) minimum number of osds to leave down/dead.`

thrashosds: maxdead default to 0 This avoids any possibility of blocking peering. 2012-01-17 17:24:54 +00:00			`max_dead: (0) maximum number of osds to leave down/dead before waiting`
thrasher: add max_dead Add max_dead, and revive osds prior to waiting for clean. Otherwise we can leave too many OSDs down and the cluster will never go clean. 2012-01-10 21:57:55 +00:00			`for clean. This should probably be num_replicas - 1.`

thrashosds: no camelcaps, add some whitespace 2011-08-31 20:21:30 +00:00			`clean_interval: (60) the approximate length of time to loop before`
			`waiting until the cluster goes clean. (In reality this is used`
			`to probabilistically choose when to wait, and the method used`
			`makes it closer to -- but not identical to -- the half-life.)`

ceph_manager: default chance_down to 0.4 Signed-off-by: Samuel Just <sam.just@inktank.com> 2013-01-24 01:44:05 +00:00			`chance_down: (0.4) the probability that the thrasher will mark an`
thrashosds: no camelcaps, add some whitespace 2011-08-31 20:21:30 +00:00			`OSD down rather than marking it out. (The thrasher will not`
			`consider that OSD out of the cluster, since presently an OSD`
			`wrongly marked down will mark itself back up again.) This value`
			`can be either an integer (eg, 75) or a float probability (eg`
			`0.75).`
thrashosds: fail if cluster doesn't finally become clean in 5 minutes 2011-09-09 01:09:11 +00:00
ceph_manager: add test_min_size action Thrasher can now with configurable frequency test min_size by taking down all but one osd, waiting, killing that osd and bringing back the others, and verifying that the cluster goes clean. Signed-off-by: Samuel Just <sam.just@inktank.com> 2012-11-07 20:36:37 +00:00			`chance_test_min_size: (0) chance to run test_pool_min_size,`
			`which:`
			`- kills all but one osd`
			`- waits`
			`- kills that osd`
			`- revives all other osds`
			`- verifies that the osds fully recover`

thrashosds: fail if cluster doesn't finally become clean in 5 minutes 2011-09-09 01:09:11 +00:00			`timeout: (360) the number of seconds to wait for the cluster`
thrashosds: timeout for every clean check, not just the last one 2011-11-17 19:11:33 +00:00			`to become clean after each cluster change. If this doesn't`
			`happen within the timeout, an exception will be raised.`
thrashosds: fail if cluster doesn't finally become clean in 5 minutes 2011-09-09 01:09:11 +00:00
ceph_manager: add timeout option to revive, increase for power_cycle Signed-off-by: Samuel Just <sam.just@inktank.com> 2013-05-06 21:10:11 +00:00			`revive_timeout: (75) number of seconds to wait for an osd asok to`
			`appear after attempting to revive the osd`

CephManager: add ability to test split Signed-off-by: Samuel Just <sam.just@inktank.com> 2012-12-11 22:21:48 +00:00			`chance_pgnum_grow: (0) chance to increase a pool's size`
			`chance_pgpnum_fix: (0) chance to adjust pgpnum to pg for a pool`
			`pool_grow_by: (10) amount to increase pgnum by`
			`max_pgs_per_pool_osd: (1200) don't expand pools past this size per osd`

ceph_manager: add filestore and heartbeat stalls Signed-off-by: Samuel Just <sam.just@inktank.com> 2013-01-24 00:13:22 +00:00			`pause_short: (3) duration of short pause`
ceph_manager: use 80/70 as pause_long, pause_check_after defaults OSD::op_tp suicides after 150. Signed-off-by: Samuel Just <sam.just@inktank.com> 2013-01-24 20:50:24 +00:00			`pause_long: (80) duration of long pause`
			`pause_check_after: (50) assert osd down after this long`
ceph_manager: add filestore and heartbeat stalls Signed-off-by: Samuel Just <sam.just@inktank.com> 2013-01-24 00:13:22 +00:00			`chance_inject_pause_short: (1) chance of injecting short stall`
ceph_manager: turn long stall injection off by default Signed-off-by: Samuel Just <sam.just@inktank.com> 2013-01-25 01:31:38 +00:00			`chance_inject_pause_long: (0) chance of injecting long stall`
ceph_manager: add filestore and heartbeat stalls Signed-off-by: Samuel Just <sam.just@inktank.com> 2013-01-24 00:13:22 +00:00
Support power cycling osds/nodes through ipmi This patch defines a RemoteConsole class associated with each Remote class instance, allowing power cycling a target through ipmi. Fixes/Implements #3782. Signed-off-by: Sam Lang <sam.lang@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com> 2013-01-23 02:13:19 +00:00			`powercycle: (false) whether to power cycle the node instead`
thrashosds: note assumption for powercycling 2013-01-31 17:14:06 +00:00			`of just the osd process. Note that this assumes that a single`
			`osd is the only important process on the node.`
Support power cycling osds/nodes through ipmi This patch defines a RemoteConsole class associated with each Remote class instance, allowing power cycling a target through ipmi. Fixes/Implements #3782. Signed-off-by: Sam Lang <sam.lang@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com> 2013-01-23 02:13:19 +00:00
thrashosds: add test_backfill_full Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> 2013-03-21 21:37:38 +00:00			`chance_test_backfill_full: (0) chance to simulate full disks stopping`
			`backfill`

added thrashosds Signed-off-by: Samuel Just <samuel.just@dreamhost.com> 2011-06-13 23:36:21 +00:00			`example:`

			`tasks:`
			`- ceph:`
Whitespace and style cleanup. 2011-07-12 01:00:03 +00:00			`- thrashosds:`
thrashosds: no camelcaps, add some whitespace 2011-08-31 20:21:30 +00:00			`chance_down: 10`
			`op_delay: 3`
			`min_in: 1`
thrashosds: fail if cluster doesn't finally become clean in 5 minutes 2011-09-09 01:09:11 +00:00			`timeout: 600`
added thrashosds Signed-off-by: Samuel Just <samuel.just@dreamhost.com> 2011-06-13 23:36:21 +00:00			`- interactive:`
			`"""`
thrashosds: fix timeout when no options are specified 2011-09-09 17:31:08 +00:00			`if config is None:`
			`config = {}`
			`assert isinstance(config, dict), \`
			`'thrashosds task only accepts a dict for configuration'`
task/thrashosds: Ipmi checking/setup in thrashosds We don't need to setup the ipmi console on runs that don't use powercycling, so delay setup of the RemoteConsole with ipmi to the thrashosd task and only then if the powercycle config is set. This avoids spurious test failures from flaky ipmi. Signed-off-by: Sam Lang <sam.lang@inktank.com> 2013-03-13 15:11:06 +00:00
			`if 'powercycle' in config:`

thrashosds: sync before doing powercycle testing Hopefully fixes #5112 2013-05-20 19:26:49 +00:00			`# sync everyone first to avoid collateral damage to / etc.`
			`log.info('Doing preliminary sync to avoid collateral damage...')`
			`ctx.cluster.run(args=['sync'])`

task/thrashosds: Ipmi checking/setup in thrashosds We don't need to setup the ipmi console on runs that don't use powercycling, so delay setup of the RemoteConsole with ipmi to the thrashosd task and only then if the powercycle config is set. This avoids spurious test failures from flaky ipmi. Signed-off-by: Sam Lang <sam.lang@inktank.com> 2013-03-13 15:11:06 +00:00			`if 'ipmi_user' in ctx.teuthology_config:`
			`for t, key in ctx.config['targets'].iteritems():`
			`host = t.split('@')[-1]`
			`shortname = host.split('.')[0]`
			`from ..orchestra import remote as oremote`
Support added for running scheduled tasks on virtual machines. This included: A). changes made so that full path names on some files were used (scheduled tasks started in different home directories). B.) Changes to insure tasks come up on the beanstalkc queue properly, C.) Finding and inserting the libvirt eqivalent code for vm machines in order to simulate ipmi actions, D.) Fix host key code, report valgrind issue more clearly. E.) Some message and downburst call changes. Fix #4988 Fix #5122 Signed-off-by: Warren Usui <warren.usui@inktank.com> 2013-06-07 01:43:43 +00:00			`console = oremote.getRemoteConsole(`
thrashosds.py: fix line length Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> 2013-03-21 21:10:13 +00:00			`name=host,`
			`ipmiuser=ctx.teuthology_config['ipmi_user'],`
			`ipmipass=ctx.teuthology_config['ipmi_password'],`
			`ipmidomain=ctx.teuthology_config['ipmi_domain'])`
			`cname = '{host}.{domain}'.format(`
			`host=shortname,`
			`domain=ctx.teuthology_config['ipmi_domain'])`
task/thrashosds: Ipmi checking/setup in thrashosds We don't need to setup the ipmi console on runs that don't use powercycling, so delay setup of the RemoteConsole with ipmi to the thrashosd task and only then if the powercycle config is set. This avoids spurious test failures from flaky ipmi. Signed-off-by: Sam Lang <sam.lang@inktank.com> 2013-03-13 15:11:06 +00:00			`log.debug('checking console status of %s' % cname)`
			`if not console.check_status():`
thrashosds.py: fix line length Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> 2013-03-21 21:10:13 +00:00			`log.info(`
			`'Failed to get console status for '`
			`'%s, disabling console...'`
			`% cname)`
task/thrashosds: Ipmi checking/setup in thrashosds We don't need to setup the ipmi console on runs that don't use powercycling, so delay setup of the RemoteConsole with ipmi to the thrashosd task and only then if the powercycle config is set. This avoids spurious test failures from flaky ipmi. Signed-off-by: Sam Lang <sam.lang@inktank.com> 2013-03-13 15:11:06 +00:00			`console=None`
			`else:`
			`# find the remote for this console and add it`
thrashosds.py: fix line length Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> 2013-03-21 21:10:13 +00:00			`remotes = [`
			`r for r in ctx.cluster.remotes.keys() if r.name == t]`
task/thrashosds: Ipmi checking/setup in thrashosds We don't need to setup the ipmi console on runs that don't use powercycling, so delay setup of the RemoteConsole with ipmi to the thrashosd task and only then if the powercycle config is set. This avoids spurious test failures from flaky ipmi. Signed-off-by: Sam Lang <sam.lang@inktank.com> 2013-03-13 15:11:06 +00:00			`if len(remotes) != 1:`
thrashosds.py: fix line length Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> 2013-03-21 21:10:13 +00:00			`raise Exception(`
			`'Too many (or too few) remotes '`
			`'found for target {t}'.format(t=t))`
task/thrashosds: Ipmi checking/setup in thrashosds We don't need to setup the ipmi console on runs that don't use powercycling, so delay setup of the RemoteConsole with ipmi to the thrashosd task and only then if the powercycle config is set. This avoids spurious test failures from flaky ipmi. Signed-off-by: Sam Lang <sam.lang@inktank.com> 2013-03-13 15:11:06 +00:00			`remotes[0].console = console`
			`log.debug('console ready on %s' % cname)`

			`# check that all osd remotes have a valid console`
			`osds = ctx.cluster.only(teuthology.is_type('osd'))`
			`for remote, _ in osds.remotes.iteritems():`
			`if not remote.console:`
thrashosds.py: fix line length Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> 2013-03-21 21:10:13 +00:00			`raise Exception(`
			`'IPMI console required for powercycling, '`
			`'but not available on osd role: {r}'.format(`
			`r=remote.name))`
task/thrashosds: Ipmi checking/setup in thrashosds We don't need to setup the ipmi console on runs that don't use powercycling, so delay setup of the RemoteConsole with ipmi to the thrashosd task and only then if the powercycle config is set. This avoids spurious test failures from flaky ipmi. Signed-off-by: Sam Lang <sam.lang@inktank.com> 2013-03-13 15:11:06 +00:00
added thrashosds Signed-off-by: Samuel Just <samuel.just@dreamhost.com> 2011-06-13 23:36:21 +00:00			`log.info('Beginning thrashosds...')`
thrashosds: make it work when first mon isn't mon.0 2011-08-31 20:56:42 +00:00			`first_mon = teuthology.get_first_mon(ctx, config)`
			`(mon,) = ctx.cluster.only(first_mon).remotes.iterkeys()`
added thrashosds Signed-off-by: Samuel Just <samuel.just@dreamhost.com> 2011-06-13 23:36:21 +00:00			`manager = ceph_manager.CephManager(`
Whitespace and style cleanup. 2011-07-12 01:00:03 +00:00			`mon,`
ceph.py/cephmanager.py: add ctx.daemons for restarting daemons ctx.daemons will now be an instance of CephState. ctx.daemons.get_daemon(role, id).stop() to stop daemon, retart() to restart the daemon, etc. Signed-off-by: Samuel Just <samuel.just@dreamhost.com> 2011-09-14 23:31:58 +00:00			`ctx=ctx,`
Support power cycling osds/nodes through ipmi This patch defines a RemoteConsole class associated with each Remote class instance, allowing power cycling a target through ipmi. Fixes/Implements #3782. Signed-off-by: Sam Lang <sam.lang@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com> 2013-01-23 02:13:19 +00:00			`config=config,`
Whitespace and style cleanup. 2011-07-12 01:00:03 +00:00			`logger=log.getChild('ceph_manager'),`
added thrashosds Signed-off-by: Samuel Just <samuel.just@dreamhost.com> 2011-06-13 23:36:21 +00:00			`)`
CephManager: add ability to test split Signed-off-by: Samuel Just <sam.just@inktank.com> 2012-12-11 22:21:48 +00:00			`ctx.manager = manager`
added thrashosds Signed-off-by: Samuel Just <samuel.just@dreamhost.com> 2011-06-13 23:36:21 +00:00			`thrash_proc = ceph_manager.Thrasher(`
			`manager,`
thrasher: allow a config to set values Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> 2011-08-25 22:18:42 +00:00			`config,`
			`logger=log.getChild('thrasher')`
added thrashosds Signed-off-by: Samuel Just <samuel.just@dreamhost.com> 2011-06-13 23:36:21 +00:00			`)`
			`try:`
			`yield`
			`finally:`
			`log.info('joining thrashosds')`
			`thrash_proc.do_join()`
wait_till_clean -> wait_for_clean and wait_for_recovery Clean now also means the correct number of replicas, whereas recovered means we have done all the work we can do given the replicas/osds we have. For example, degraded and clean are now mutually exclusive. Also move away from 'till'. 2012-02-18 05:53:25 +00:00			`manager.wait_for_recovery(config.get('timeout', 360))`