Josh Durgin
3bfb8d696e
ceph, ceph-fuse: simplify valgrind argument additions
2012-02-24 12:05:35 -08:00
Sage Weil
9ec047226f
refactor all valgrind users to use a get_valgrind_args() helper
...
This avoids much annoying, duplicated code.
2012-02-24 12:05:35 -08:00
Sage Weil
90fdc84086
ceph: always create valgrind logs dir
...
Other tasks use it too. It's more annoying to conditionally create it.
2012-02-24 12:05:35 -08:00
Sage Weil
7af6e46c94
ceph: always try to process valgrind logs
...
Check for errors in valgrind logs even if there is no valgrind option in
the ceph task config stanza. Other tasks can run via valgrind (ceph-fuse,
rgw). If the logs aren't there, this is harmless.
2012-02-24 12:05:35 -08:00
Sage Weil
c5688e6570
ceph: valgrind trumps coverage when picking a flavor
...
valgrind will crash if we don't use notcmalloc; coverage will silently
fail to collect coverage info.
2012-02-20 15:17:52 -08:00
Sage Weil
7ff9f044e7
ceph: allow valgrind per-type (not just per-name)
2012-02-20 07:04:45 -08:00
Sage Weil
af4ce44233
ceph: use any fs, not just btrfs, on scratch devices
...
The
    btrfs: true
syntax is replaced with
    fs: btrfs
or ext4, xfs.
2012-02-13 15:28:24 -08:00
Josh Durgin
0cd16cf03d
ceph: always add logger for daemons
...
The extra log function added redundant info and didn't allow different
levels.
2012-02-02 09:36:04 -08:00
Josh Durgin
7af7c66bd0
ceph: rename type parameter to type_
...
type is a built-in and shouldn't be aliased.
2012-02-02 09:35:58 -08:00
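
A minimal sketch of the problem the rename above avoids. Both function names are hypothetical and are not taken from the ceph task; they only show what happens when a parameter called type hides the built-in.

```python
def describe_shadowed(type, id_):
    # Hypothetical example: the parameter hides the built-in type(),
    # so trying to call it as a function fails when a string is passed in.
    try:
        return type(id_).__name__
    except TypeError as exc:
        return 'error: {0}'.format(exc)


def describe(type_, id_):
    # With the trailing underscore, the built-in type() is still reachable.
    return '{0}.{1} ({2})'.format(type_, id_, type(id_).__name__)
```

Calling describe('osd', 0) works as expected, while describe_shadowed('osd', 0) ends up trying to call the string 'osd'.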
Josh Durgin
7146db9215
ceph: use the correct comparison operator
...
'is' compares identity (i.e. address in CPython), not value.
2012-02-02 09:27:04 -08:00
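
The difference between the two operators in the commit above can be illustrated with a short sketch. The variable names are illustrative only and do not come from the ceph task.

```python
# Two separate list objects holding the same values.
first = ['osd', 0]
second = ['osd', 0]

values_equal = (first == second)  # True: == compares contents
same_object = (first is second)   # False: 'is' compares object identity

# An alias refers to the very same object, so 'is' holds here.
alias = first
alias_is_same = (alias is first)  # True
```

Code that means "has the same value" should therefore use ==, and keep 'is' for checks such as "is None".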
Josh Durgin
e7672b6433
ceph: sync before unmounting btrfs devices
...
There may still be writes in flight, since the osds may not have
shut down cleanly. This should prevent EBUSY when unmounting.
Fixes: #1997
2012-02-02 09:26:45 -08:00
Josh Durgin
1364b8826f
ceph: delay raising exceptions until all daemons are stopped
...
If a daemon crashes, the exception is raised when we stop it. This
caused some daemons to continue running during cleanup, since the rest
of the daemons of the same type would not be shut down. Also log each
daemon that crashed, for easier debugging.
Fixes: #1744
2012-02-02 09:26:25 -08:00
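
The pattern described in the commit above (stop everything first, raise afterwards) might look roughly like the following sketch. FakeDaemon and stop_all are hypothetical names used for illustration; they are not teuthology's actual classes or functions.

```python
import logging

log = logging.getLogger(__name__)


class FakeDaemon(object):
    """Hypothetical stand-in for a daemon handle (not teuthology's class)."""

    def __init__(self, name, crashed=False):
        self.name = name
        self.crashed = crashed
        self.stopped = False

    def stop(self):
        # Mark the daemon as stopped, then report a crash if there was one.
        self.stopped = True
        if self.crashed:
            raise RuntimeError('{0} crashed'.format(self.name))


def stop_all(daemons):
    """Stop every daemon, then re-raise the first failure, if any."""
    first_error = None
    for daemon in daemons:
        try:
            daemon.stop()
        except Exception as exc:
            # Log each crashed daemon, but keep going so none is left running.
            log.exception('Saw exception from %s', daemon.name)
            if first_error is None:
                first_error = exc
    if first_error is not None:
        raise first_error
```

The key design choice is that the loop never exits early: a crash in one daemon is recorded and only surfaces once every other daemon has been asked to stop.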
Tommi Virtanen
09bed16408
Allow user to provide flavor to use.
...
With this, you can use Ubuntu 11.10 machines with teuthology by saying::
    tasks:
    - ceph:
        flavor: oneiric
...
2012-01-31 07:59:43 -08:00
Sage Weil
b58f9560ea
ceph: ignore all leaks
...
unless/until we figure out where the DefinitelyLost records are coming
from.. at first glance they look bogus.
2012-01-16 09:55:47 -08:00
Sage Weil
40fb86ff81
ceph: take single arg or list for valgrind args
2012-01-16 09:22:45 -08:00
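
Accepting either a single argument or a list, as in the commit above, usually comes down to a small normalization step. The helper name normalize_valgrind_args is an assumption made for this sketch, not the function used in the ceph task.

```python
def normalize_valgrind_args(value):
    """Return valgrind arguments as a list (hypothetical helper)."""
    if value is None:
        return []
    if isinstance(value, str):
        # A single argument given as a plain string becomes a one-item list.
        return [value]
    # Anything else is assumed to be an iterable of arguments already.
    return list(value)
```

With this, both normalize_valgrind_args('--tool=memcheck') and normalize_valgrind_args(['--tool=memcheck', '--num-callers=50']) give the caller a list to append to a command line.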
Sage Weil
c88ec5719e
combined mon, osd, mds starter functions
2012-01-15 22:54:09 -08:00
Sage Weil
50463ffddd
verify all osds start before checking health
...
Just checking health isn't good enough, since it races with OSD startup:
we can have a healthy cluster with 0 (or something else < total) OSDs.
2012-01-11 12:54:08 -08:00
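
One way to close the race described in the commit above is to poll the number of running OSDs until it matches the expected total, and only then check health. This is a sketch under assumptions: wait_for_all_osds_up and the get_up_count callable are hypothetical, and the timeout values are arbitrary.

```python
import time


def wait_for_all_osds_up(get_up_count, total, timeout=300, interval=3,
                         sleep=time.sleep):
    """Block until get_up_count() reports `total` OSDs, or time out.

    get_up_count is a caller-supplied function (hypothetical here) that
    returns how many OSDs the cluster currently reports as up.
    """
    waited = 0
    while True:
        up = get_up_count()
        if up == total:
            return
        if waited >= timeout:
            raise RuntimeError(
                'only {0} of {1} osds up after {2}s'.format(up, total, waited))
        sleep(interval)
        waited += interval
```

Only after this returns would a health check be meaningful, since a cluster with fewer OSDs than expected can still report itself as healthy.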
Josh Durgin
f4883ebf09
ceph: let the user running ceph-osd remove subvolumes
...
This will prevent EPERM when using the SNAP_DESTROY ioctl,
so the filestore will use btrfs snaps.
2012-01-10 16:07:04 -08:00
Tommi Virtanen
d8fc151365
Clean up C++isms.
2011-11-17 17:00:44 -08:00
Josh Durgin
f85f5dd7e3
ceph: deep merge overrides, so e.g. log whitelists can be overridden
2011-11-17 13:07:03 -08:00
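
A deep merge like the one named in the commit above can be sketched as a short recursive function. This version is a simplification written for illustration: it assumes that nested dicts are merged key by key and that any other value (including a list) is replaced by the override, which may differ from what teuthology actually does. The config keys in the usage note are illustrative as well.

```python
def deep_merge(base, overrides):
    """Return a new structure with `overrides` merged into `base`."""
    if isinstance(base, dict) and isinstance(overrides, dict):
        merged = dict(base)  # shallow copy, so `base` itself is not modified
        for key, value in overrides.items():
            if key in merged:
                # Recurse so sibling keys in nested dicts survive the merge.
                merged[key] = deep_merge(merged[key], value)
            else:
                merged[key] = value
        return merged
    # Non-dict values: the override wins (an assumption of this sketch).
    return overrides
```

Merging {'ceph': {'log-whitelist': ['b']}} into {'ceph': {'log-whitelist': ['a'], 'fs': 'btrfs'}} keeps the fs setting and swaps in the new whitelist, whereas a shallow update would discard the whole nested ceph dict.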
Sage Weil
6d39cc1146
ceph: keep ceph.conf at ctx.ceph.conf
...
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-08 22:17:00 -08:00
Josh Durgin
bcded7f163
ceph: add whitelist for cluster log errors
...
Some messages are expected when thrashing osds or creating unfound
objects.
Fixes: #1622
2011-10-17 14:42:08 -07:00
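
The whitelist in the commit above presumably filters the cluster log check for ERR, WRN and SEC entries. A rough sketch of that idea follows. The function name first_bad_line, the bracketed [ERR] format, and the sample messages are assumptions made for illustration, not details confirmed by this log.

```python
import re

# Assumed severity markers, written as [ERR], [WRN] or [SEC] in a log line.
BAD_LINE = re.compile(r'\[(ERR|WRN|SEC)\]')


def first_bad_line(lines, whitelist=()):
    """Return the first flagged log line not excused by the whitelist."""
    for line in lines:
        if not BAD_LINE.search(line):
            continue
        # Each whitelist entry is treated as a regular expression.
        if any(re.search(pattern, line) for pattern in whitelist):
            continue
        return line
    return None
```

A test that deliberately thrashes OSDs could then pass a pattern for the messages it expects, and the check would only fail on anything else.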
Josh Durgin
1cad309d65
Add failure_reason to summary for the first failure detected.
...
For now, this is the exception raised during a task, the error found
in the central log, or coredumps found. More specific errors
(e.g. s3-tests had 3 failures) can be added later as exceptions raised
by tasks.
2011-10-03 17:07:41 -07:00
Samuel Just
28d60172f6
ceph.py: add btrfs option
...
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-03 14:26:04 -07:00
Sage Weil
a92fef77dc
rename c* -> ceph-*
...
Leave cfuse task name unchanged for now...
2011-09-23 08:57:18 -07:00
Samuel Just
4a0f8fee54
ceph.py: remove unused variables mds_daemons and mon_daemons
...
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-09-15 17:26:38 -07:00
Samuel Just
a3c886af19
ceph.py/cephmanager.py: add ctx.daemons for restarting daemons
...
ctx.daemons will now be an instance of CephState.
ctx.daemons.get_daemon(role, id).stop() to stop daemon, restart() to
restart the daemon, etc.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-09-15 17:08:34 -07:00
Tommi Virtanen
a2372fce12
Move orchestra to teuthology.orchestra so there's just one top-level package.
2011-09-13 14:53:02 -07:00
Sage Weil
e66dffc3d3
don't eat exceptions for breakfast
...
fixes 0c2bee1514
2011-09-02 11:07:10 -07:00
Sage Weil
c502418fca
thrashosds: make it work when first mon isn't mon.0
2011-09-01 12:56:29 -07:00
Josh Durgin
ec768ba3ca
Fix pyflakes warnings.
2011-08-31 14:36:01 -07:00
Greg Farnum
0c2bee1514
valgrind: don't run valgrind_post if there's no valgrind
...
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-29 16:47:22 -07:00
Greg Farnum
3a3c859f5b
valgrind: scan logs for bad results
...
It's not sophisticated but it will warn you about a node
if at least one node has issues.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-29 14:03:02 -07:00
Greg Farnum
50a648bdfc
valgrind: use xml output for tools that support it
...
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-29 14:03:02 -07:00
Greg Farnum
1130e5fe37
coverage: create dir conditionally
...
We don't need to create the dir if we aren't using coverage.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-24 16:48:14 -07:00
Sage Weil
42318c57cb
check ceph cluster log for badness (ERR, WRN, SEC)
2011-08-23 21:00:26 -07:00
Sage Weil
21d04419b8
ceph: copy cluster log file to archive/ceph.log
2011-08-22 22:04:57 -07:00
Greg Farnum
e20bae2a7f
valgrind: Document!
...
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-17 10:35:37 -07:00
Greg Farnum
4efc95fa57
include log in valgrind log file names
...
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-17 10:30:26 -07:00
Greg Farnum
d5eb2c2b77
ceph task: split up arguments a little more
...
This allows selective daemon kill signal changes. With valgrind
daemons we want term instead of kill, for instance.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-17 10:30:24 -07:00
Greg Farnum
5323e1796f
valgrind: move valgrind logs to log dir
...
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-17 10:29:54 -07:00
Greg Farnum
aa74481728
ceph: split up daemon-running arguments and insert valgrind ones
...
This setup should let us insert other kinds of things too, if we
need them.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-15 15:35:42 -07:00
Greg Farnum
9ec19f13df
ceph: Set up valgrind as a flavor, and create a dir for logging.
...
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-15 15:32:23 -07:00
Greg Farnum
98ac89a54e
ceph task: pass the full config to the daemon startup subs
...
So far as I can tell there is no reason to reduce them to
the coverage config, and I want the full config for my
soon-to-exist valgrind options.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-15 15:31:18 -07:00
Sage Weil
bf7b1dd4a7
ceph: fix max_mds calculation
...
Signed-off-by: Sage Weil <sage@newdream.net>
2011-08-10 12:47:20 -07:00
Sage Weil
b5ba155c17
Revert "fix get_clients"
...
This reverts commit 83b6678e79. The bug I was
hitting was actually fixed by 06e3e69c29.
2011-08-09 13:23:58 -07:00
Sage Weil
83b6678e79
fix get_clients
...
Only return the clients that are listed (not _all_ clients). There might
be a combination of cfuse and kclient (or other) clients here!
2011-08-05 14:35:44 -07:00
Sage Weil
ef2b80910a
use coverage_dir
2011-08-05 14:35:43 -07:00
Greg Farnum
6ac6f7ab38
teuthology: convert from bzip2 to gzip.
...
gzip is much, much faster on large log files. With a 7.7GB client log, gzip
took 2:45 to compress it to 624MB. bzip2 took 34:38 to compress it to
366MB. For our purposes the space savings are not worth the time loss.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-07-29 10:35:02 -07:00
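
The trade-off in the commit above can be worked through with the figures it quotes. This sketch assumes the times 2:45 and 34:38 are minutes and seconds; the variable names are illustrative.

```python
def to_seconds(minutes, seconds):
    # Convert a minutes:seconds duration to seconds.
    return minutes * 60 + seconds


gzip_seconds = to_seconds(2, 45)     # 165
bzip2_seconds = to_seconds(34, 38)   # 2078

# How many times longer bzip2 took than gzip on the same log.
slowdown = bzip2_seconds / gzip_seconds

# Extra space gzip's output needs compared with bzip2's, in MB.
extra_mb = 624 - 366                 # 258
```

Under that reading, bzip2 took roughly 12.6 times as long in order to save 258MB on this log.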
Sage Weil
277c4ff7aa
set max_mds based on non-standbys
2011-07-28 10:25:30 -07:00