Commit Graph

66 Commits

Author SHA1 Message Date
Sage Weil
7a5fd05edd misc: replace : with - in testdir name
The :'s break the list in $PATH.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-02-17 22:13:45 -08:00
Sam Lang
100e9056ed misc: Close connections on reboot
When nodes are rebooted, the connections remain open
even after calling reconnect and setting up new ssh
sessions to the rebooted nodes.  This causes ECONNRESET
errors to show up in the teuthology output.

Close the existing connections before trying to reconnect.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-05 16:20:52 -06:00
Josh Durgin
2f41f81dfa misc: don't use colon in default run name
LD_LIBRARY_PATH does not work with colons (and backslash does not escape them.)
2013-02-04 10:39:40 -08:00
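A minimal sketch of the idea behind this change, not the code from the commit: build the default run name with '-' instead of ':' so that it is safe inside colon-separated variables such as PATH and LD_LIBRARY_PATH. The function name default_run_name and the timestamp format are assumptions made for illustration.

```python
import getpass
import time


def default_run_name(user=None, now=None):
    """Hypothetical helper: return '{user}-{timestamp}' with no colons."""
    user = user or getpass.getuser()
    # Use '-' and '_' as separators; ':' would split a PATH-style list.
    stamp = time.strftime('%Y-%m-%d_%H-%M-%S', time.localtime(now))
    return '{user}-{stamp}'.format(user=user, stamp=stamp)
```

Any directory derived from such a name can then be appended to PATH or LD_LIBRARY_PATH without being split into two entries.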
Sam Lang
150a3d7d9e misc: Don't include existing partitions in devs
We don't want to include /dev/sda1, etc. in the
list of devices to use.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 10:53:47 -06:00
Sam Lang
64e3966779 misc: get_wwn_id_map() needs to return dict
If we can't find device ids, we need to return
a dict, not a list.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 09:13:48 -06:00
Sam Lang
ac4ba69d8d misc: Fix get_wwn_id_map() to be optional
Not all plana nodes have symlinks set up when
we check /dev/disk/by-id/wwn-*.  Instead of failing
here, just use the /dev/disk/sd* devices.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-02-01 08:20:43 -06:00
Sam Lang
8f720454cb Assign devices to osds using the device wwn
Linux doesn't guarantee device names (/dev/sdb, etc.)
are always mapped to the same disk.  Instead of assigning
nominal devices to osds, we map devices by their wwn
(/dev/disk/by-id/wwn-*) to an osd (both data and journal).

Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-31 08:23:39 -06:00
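The wwn-based mapping described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: the helper name wwn_to_device_map and its by_id_dir parameter are hypothetical, and partition symlinks (wwn-*-partN) are skipped so that only whole disks are returned.

```python
import glob
import os


def wwn_to_device_map(by_id_dir='/dev/disk/by-id'):
    """Hypothetical helper: map each wwn-* symlink to the device it resolves to."""
    mapping = {}
    for link in sorted(glob.glob(os.path.join(by_id_dir, 'wwn-*'))):
        # Skip partition entries such as wwn-0x5000c500-part1.
        if '-part' in os.path.basename(link):
            continue
        # The symlink name is stable; its target (/dev/sdb, ...) may change per boot.
        mapping[link] = os.path.realpath(link)
    # Always a dict, possibly empty when no wwn symlinks exist.
    return mapping
```

Callers would then hand the stable wwn path, rather than /dev/sdX, to each osd.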
Sam Lang
ace4cb07b2 Replace /tmp/cephtest/ with configurable path
Teuthology uses /tmp/cephtest/ as the scratch test directory for
a run.  This patch replaces /tmp/cephtest/ everywhere with a
per-run directory: {basedir}/{rundir} where {basedir} is a directory
configured in .teuthology.yaml (/tmp/cephtest if not specified),
and {rundir} is the name of the run, as given in --name.  If no name
is specified, {user}-{timestamp} is used.

To get the old behavior (/tmp/cephtest), set test_path: /tmp/cephtest
in .teuthology.yaml.

This change was motivated by #3782, which requires a test dir that
survives across reboots, but also resolves #3767.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-31 08:23:31 -06:00
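The path resolution described in this message can be sketched as below. This is a hedged illustration, not the patch itself: the function name get_testdir, the base_test_dir key, and the timestamp format are assumptions; only test_path, the /tmp/cephtest default, and the {user}-{timestamp} fallback come from the message.

```python
import getpass
import os
import time


def get_testdir(config, name=None, user=None, now=None):
    """Hypothetical helper: resolve the per-run scratch directory."""
    # An explicit test_path restores the old fixed-directory behavior.
    if 'test_path' in config:
        return config['test_path']
    # 'base_test_dir' is an assumed key name for {basedir}.
    basedir = config.get('base_test_dir', '/tmp/cephtest')
    if name is None:
        # Fall back to {user}-{timestamp}; the format here is illustrative.
        user = user or getpass.getuser()
        stamp = time.strftime('%Y%m%d%H%M%S', time.localtime(now))
        name = '{0}-{1}'.format(user, stamp)
    return os.path.join(basedir, name)
```

Here config stands for the parsed contents of .teuthology.yaml as a plain dict.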
Sam Lang
53f22d9493 task/mds_thrasher: New task for thrashing the mds
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2013-01-18 15:48:52 -06:00
Loic Dachary
72db1a59cd When running teuthology with targets provisioned on OpenStack and kvm, the disks will show up under /dev/vda, /dev/vdb, etc. Add them to the list of devices to inspect and use for tests.
Signed-off-by: Loic Dachary <loic@dachary.org>
2013-01-16 20:48:15 -08:00
Sam Lang
25d4f56067 misc: Show url on get failure
Signed-off-by: Sam Lang <sam.lang@inktank.com>
2012-11-12 13:16:49 -06:00
Sage Weil
b4bf14edd5 add exec task
2012-10-22 16:51:54 -07:00
Sage Weil
ee3407fa04 include newpool in osd cap for client.0
This is needed by the kclient_workunit_kclient task.
2012-09-29 08:55:58 -07:00
Josh Durgin
13c91dba67 misc: use new syntax for osd caps
pool=pool1,pool2 is not valid for the new grammar
2012-09-28 10:07:45 -07:00
Josh Durgin
57bb434def Fix errors found by pyflakes
A bunch of unused imports and variables.
2012-09-21 16:46:24 -07:00
Sage Weil
12dc0ad101 ceph: archive mon data to a .tgz
Saves bandwidth, time, and space.
2012-07-17 10:00:59 -07:00
Sage Weil
cff2cfa217 internal: move pulling archive w/ tar to helper
2012-07-11 14:10:00 -07:00
Sage Weil
132dc0066d nuke: be more careful about kill; simplify
If the archive dir is specified, make sure we are killing the right
process.

Also drop the kill_process helper; it's simple enough to open-code.
2012-07-04 14:47:33 -07:00
Sage Weil
45fcca1fea valgrind: add strptime suppressions
Precise's strptime triggers valgrind false positives.

Use ship_utilities to push the valgrind.supp file over, which is a bit
sloppy.
2012-07-04 14:29:55 -07:00
tamil
f3c2451797 nuke - optionally kill the hung process
Added a function kill_process to kill a process that hangs in the nightly runs.
It takes the pid as an optional argument.

Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
2012-07-03 12:23:36 -07:00
Sage Weil
94f0ba1efe run valgrind with cwd set to /tmp/cephtest/archive/coredump
This lets us capture the vgcore.* files, which always go to valgrind's
cwd.

Fixes: #1953
2012-03-18 10:48:51 -07:00
Sage Weil
5c9acbd897 gitbuilder: put flavor last
in case we refine the field later
2012-03-13 10:09:18 -07:00
Sage Weil
1a01ccaafb Pull from new gitbuilder.ceph.com locations.
Simplifies the flavor stuff into a tuple of

<package,type,flavor,dist,arch>

where package is ceph, kernel, etc.
type is tarball, deb
flavor is basic, gcov, notcmalloc
arch is x86_64, i686 (uname -m)
dist is oneiric, etc. (lsb_release -s -c)
2012-03-13 10:02:26 -07:00
Josh Durgin
62bda12711 misc: always return a usable result from get_valgrind_args
2012-02-24 14:56:43 -08:00
Josh Durgin
c93a08eda0 Whitespace and unnecessary formatting fixes
2012-02-24 12:05:35 -08:00
Sage Weil
9ec047226f refactor all valgrind users to use a get_valgrind_args() helper
This avoids much annoying, duplicated code.
2012-02-24 12:05:35 -08:00
Sage Weil
46b612efa4 misc: make get_scratch_devices look for (almost) any disk that's not mounted
2012-02-13 15:28:24 -08:00
Sage Weil
50463ffddd verify all osds start before checking health
Just checking health isn't good enough, since it races with OSD startup:
we can have a healthy cluster with 0 (or something else < total) OSDs.
2012-01-11 12:54:08 -08:00
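One way to close the race described above is to poll the OSD count before asking for cluster health. The sketch below is an assumption-laden illustration, not the commit's code: wait_until_osds_up and the get_up_count callable are hypothetical names, and how the up count is obtained from the cluster is left to the caller.

```python
import time


def wait_until_osds_up(get_up_count, total, timeout=300, interval=3):
    """Poll until `total` OSDs report up, or raise after `timeout` seconds."""
    deadline = time.time() + timeout
    while True:
        up = get_up_count()
        if up == total:
            return
        if time.time() >= deadline:
            raise RuntimeError('only %d of %d OSDs up' % (up, total))
        time.sleep(interval)
```

Only after this returns would the health check run, so a "healthy" cluster with fewer than the expected number of OSDs is no longer accepted.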
Sage Weil
da9210779e ceph: don't skip monitor ports
We can use the same port multiple times if the monitors are on different hosts.
2012-01-06 13:36:54 -08:00
Josh Durgin
2f71f03fdd misc: simplify reconnect logic
Ignore all errors until the timeout expires so we don't have to worry
about whitelisting them.
2011-12-30 14:37:37 -08:00
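The simplified approach, retrying on any error until a deadline instead of whitelisting specific exceptions, might look like the following. This is a generic sketch: retry_until_timeout and the connect callable are hypothetical and stand in for whatever re-establishes the ssh session.

```python
import time


def retry_until_timeout(connect, timeout=300, interval=1):
    """Call `connect` until it succeeds; re-raise the last error after `timeout`."""
    deadline = time.time() + timeout
    while True:
        try:
            return connect()
        except Exception:
            # Ignore every error while time remains; no whitelist needed.
            if time.time() >= deadline:
                raise
            time.sleep(interval)
```

The trade-off is that a genuinely fatal error is only reported once the timeout expires.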
Josh Durgin
a763297685 misc: move deep_merge out of the MergeConfig class - it's generic
2011-11-17 13:06:36 -08:00
Sage Weil
77c977c1cf misc: allow >1 monitor per role in get_mon_names()
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-10 14:13:24 -08:00
Sage Weil
6618a0275c mon_recovery: add task to test monitor cluster failure recovery
Some simple tests to start with.  We still need some sort of mon cluster
thrashing.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-08 22:17:00 -08:00
Josh Durgin
0b451f9475 Keep each ssh connection alive.
With long-running jobs like thrashing, ssh connections were timing
out.
2011-11-03 13:08:49 -07:00
Josh Durgin
c56ab97442 reconnect: ignore SSHExceptions before the timeout expires
Fixes: #1587
2011-10-06 17:18:35 -07:00
Samuel Just
28d60172f6 ceph.py: add btrfs option
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-03 14:26:04 -07:00
Sage Weil
a92fef77dc rename c* -> ceph-*
Leave cfuse task name unchanged for now...
2011-09-23 08:57:18 -07:00
Tommi Virtanen
a2372fce12 Move orchestra to teuthology.orchestra so there's just one top-level package.
2011-09-13 14:53:02 -07:00
Josh Durgin
091b0ae3de autotest: allow tests to be run on all clients
2011-09-07 17:50:12 -07:00
Sage Weil
c502418fca thrashosds: make it work when first mon isn't mon.0
2011-09-01 12:56:29 -07:00
Greg Farnum
0139323e51 Merge branch 'wip-nuke'
Conflicts:
	teuthology/task/kernel.py
2011-08-10 16:16:25 -07:00
Greg Farnum
b5859f877a Move reconnect function from kernel task to misc.py
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-10 14:37:24 -07:00
Sage Weil
b5ba155c17 Revert "fix get_clients"
This reverts commit 83b6678e79.  The bug I was
hitting was actually fixed by 06e3e69c29.
2011-08-09 13:23:58 -07:00
Sage Weil
01fac3e2c6 new gitbuilder ref/branch naming
no origin_ prefix
2011-08-05 14:35:44 -07:00
Sage Weil
83b6678e79 fix get_clients
Only return the clients that are listed (not _all_ clients).  There might
be a combination of cfuse and kclient (or other) clients here!
2011-08-05 14:35:44 -07:00
Sage Weil
07745f8a51 no ++ in python
2011-07-27 11:45:20 -07:00
Sage Weil
573c9ff2b4 configure mds's with -s suffix as standby
2011-07-27 10:04:37 -07:00
Sage Weil
5b0924494a tolerate named (not numbered) mons
2011-07-26 22:07:02 -07:00
Josh Durgin
a55d2eb53a Read lock server from ~/teuthology.yaml.
2011-07-07 12:35:11 -07:00
Josh Durgin
09bee43593 Move username to a utility method.
2011-07-07 12:32:58 -07:00