104 Commits

Author SHA1 Message Date
Josh Durgin
3bfb8d696e ceph, ceph-fuse: simplify valgrind argument additions 2012-02-24 12:05:35 -08:00
Sage Weil
9ec047226f refactor all valgrind users to use a get_valgrind_args() helper
This avoids much annoying, duplicated code.
2012-02-24 12:05:35 -08:00
Sage Weil
90fdc84086 ceph: always create valgrind logs dir
Other tasks use it too.  It's more annoying to conditionally create it.
2012-02-24 12:05:35 -08:00
Sage Weil
7af6e46c94 ceph: always try to process valgrind logs
Check for errors in valgrind logs even if there is no valgrind option in
the ceph task config stanza.  Other tasks can run via valgrind (ceph-fuse,
rgw).  If the logs aren't there, this is harmless.
2012-02-24 12:05:35 -08:00
Sage Weil
c5688e6570 ceph: valgrind trumps coverage when picking a flavor
valgrind will crash if we don't use notcmalloc; coverage will silently
fail to collect coverage info.
2012-02-20 15:17:52 -08:00
Sage Weil
7ff9f044e7 ceph: allow valgrind per-type (not just per-name) 2012-02-20 07:04:45 -08:00
Sage Weil
af4ce44233 ceph: use any fs, not just btrfs, on scratch devices
The

  btrfs: true

syntax is replaced with

  fs: btrfs

or ext4, xfs.
2012-02-13 15:28:24 -08:00
Josh Durgin
0cd16cf03d ceph: always add logger for daemons
The extra log function added redundant info and didn't allow different
levels.
2012-02-02 09:36:04 -08:00
Josh Durgin
7af7c66bd0 ceph: rename type parameter to type_
type is a built-in and shouldn't be aliased.
2012-02-02 09:35:58 -08:00
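A minimal sketch of why shadowing the built-in is risky. The function names below are hypothetical and are not taken from the ceph task; they only illustrate the renaming.

```python
def describe_shadowed(type, value):
    # The parameter named `type` hides the built-in inside this function,
    # so type(value) tries to call whatever was passed in (here a string).
    try:
        return type(value)
    except TypeError:
        return None


def describe(type_, value):
    # With the trailing underscore, the built-in type() is still reachable.
    return (type_, type(value).__name__)
```

Calling describe_shadowed('osd', 3) hits the TypeError branch, while describe('osd', 3) can still inspect the value's type.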
Josh Durgin
7146db9215 ceph: use the correct comparison operator
'is' compares identity (i.e. address in CPython), not value.
2012-02-02 09:27:04 -08:00
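The difference between the two operators can be shown with two equal but distinct lists. This is a generic illustration, not the code that the commit changed.

```python
a = [1, 2, 3]
b = list(a)  # same contents, but a separate object

same_value = (a == b)    # compares contents
same_object = (a is b)   # compares object identity
```

Identity comparison remains the right choice for singletons such as None.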
Josh Durgin
e7672b6433 ceph: sync before unmounting btrfs devices
There may still be writes in flight, since the osds may not have
shutdown cleanly. This should prevent EBUSY when unmounting.

Fixes: #1997
2012-02-02 09:26:45 -08:00
Josh Durgin
1364b8826f ceph: delay raising exceptions until all daemons are stopped
If a daemon crashes, the exception is raised when we stop it. This
caused some daemons to continue running during cleanup, since the rest
of the daemons of the same type would not be shut down. Also log each
daemon that crashed, for easier debugging.

Fixes: #1744
2012-02-02 09:26:25 -08:00
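The approach described above can be sketched as follows. The stop_all name and the assumption that each daemon object exposes a stop() method that raises if the daemon crashed are illustrative only; the actual teuthology code may be organized differently.

```python
import logging

log = logging.getLogger(__name__)


def stop_all(daemons):
    """Stop every daemon, then re-raise the first failure, if any.

    `daemons` maps a name to an object with a stop() method (an
    assumed interface for this sketch).
    """
    first_error = None
    for name, daemon in daemons.items():
        try:
            daemon.stop()
        except Exception as exc:
            # Log each crashed daemon, but keep stopping the rest.
            log.error('daemon %s crashed: %s', name, exc)
            if first_error is None:
                first_error = exc
    if first_error is not None:
        raise first_error
```

Deferring the raise means one crashed daemon no longer leaves the remaining daemons of the same type running during cleanup.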
Tommi Virtanen
09bed16408 Allow user to provide flavor to use.
With this, you can use Ubuntu 11.10 machines with teuthology by saying::

  tasks:
  - ceph:
      flavor: oneiric
  ...
2012-01-31 07:59:43 -08:00
Sage Weil
b58f9560ea ceph: ignore all leaks
unless/until we figure out where the DefinitelyLost records are coming
from. At first glance they look bogus.
2012-01-16 09:55:47 -08:00
Sage Weil
40fb86ff81 ceph: take single arg or list for valgrind args 2012-01-16 09:22:45 -08:00
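Accepting either a single argument or a list usually comes down to normalizing the value early. A minimal sketch, assuming a hypothetical helper named normalize_args (not a function from the ceph task):

```python
def normalize_args(value):
    """Return a list of arguments for a string, a list, or None."""
    if value is None:
        return []
    if isinstance(value, str):
        # A bare string is treated as one argument, not split into characters.
        return [value]
    return list(value)
```

With this, a config value such as '--tool=memcheck' and a list of several options can be handled by the same downstream code.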
Sage Weil
c88ec5719e combined mon, osd, mds starter functions 2012-01-15 22:54:09 -08:00
Sage Weil
50463ffddd verify all osds start before checking health
Just checking health isn't good enough, since it races with OSD startup:
we can have a healthy cluster with 0 (or something else < total) OSDs.
2012-01-11 12:54:08 -08:00
Josh Durgin
f4883ebf09 ceph: let the user running ceph-osd remove subvolumes
This will prevent EPERM when using the SNAP_DESTROY ioctl,
so the filestore will use btrfs snaps.
2012-01-10 16:07:04 -08:00
Tommi Virtanen
d8fc151365 Clean up C++isms. 2011-11-17 17:00:44 -08:00
Josh Durgin
f85f5dd7e3 ceph: deep merge overrides, so e.g. log whitelists can be overridden 2011-11-17 13:07:03 -08:00
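A deep merge differs from dict.update() in that nested dictionaries are merged key by key instead of being replaced wholesale. The sketch below is one possible implementation; the deep_merge name and the 'log-whitelist' key in the usage note are assumptions for illustration.

```python
def deep_merge(base, overrides):
    """Return a new dict with `overrides` merged recursively into `base`."""
    result = dict(base)  # shallow copy of this level; `base` is not modified
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are dicts: merge them instead of replacing.
            result[key] = deep_merge(result[key], value)
        else:
            # Scalars and lists from the overrides win outright.
            result[key] = value
    return result
```

For example, merging {'ceph': {'log-whitelist': ['b']}} into {'ceph': {'fs': 'btrfs', 'log-whitelist': ['a']}} keeps the fs setting and replaces only the whitelist.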
Sage Weil
6d39cc1146 ceph: keep ceph.conf at ctx.ceph.conf
Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-08 22:17:00 -08:00
Josh Durgin
bcded7f163 ceph: add whitelist for cluster log errors
Some messages are expected when thrashing osds or creating unfound
objects.

Fixes: #1622
2011-10-17 14:42:08 -07:00
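A whitelist check of this kind can be sketched as a scan that reports the first bad line not matched by any whitelisted pattern. The bracketed severity markers, the sample messages, and the first_bad_line name are assumptions for illustration and may not match the real cluster log format.

```python
import re

# Assumed severity markers; the real cluster log format may differ.
BAD = re.compile(r'\[(ERR|WRN|SEC)\]')


def first_bad_line(lines, whitelist=()):
    """Return the first ERR/WRN/SEC line not covered by the whitelist."""
    allowed = [re.compile(pattern) for pattern in whitelist]
    for line in lines:
        if BAD.search(line) and not any(p.search(line) for p in allowed):
            return line
    return None
```

A task that deliberately thrashes osds could pass a pattern for the messages it expects, so only unexpected errors are reported.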
Josh Durgin
1cad309d65 Add failure_reason to summary for the first failure detected.
For now, this is the exception raised during a task, the error found
in the central log, or coredumps found. More specific errors
(i.e. s3-tests had 3 failures) can be added later as exceptions raised
by tasks.
2011-10-03 17:07:41 -07:00
Samuel Just
28d60172f6 ceph.py: add btrfs option
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-10-03 14:26:04 -07:00
Sage Weil
a92fef77dc rename c* -> ceph-*
Leave cfuse task name unchanged for now...
2011-09-23 08:57:18 -07:00
Samuel Just
4a0f8fee54 ceph.py: remove unused variables mds_daemons and mon_daemons
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-09-15 17:26:38 -07:00
Samuel Just
a3c886af19 ceph.py/cephmanager.py: add ctx.daemons for restarting daemons
ctx.daemons will now be an instance of CephState.

Use ctx.daemons.get_daemon(role, id).stop() to stop a daemon, restart() to
restart it, etc.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
2011-09-15 17:08:34 -07:00
Tommi Virtanen
a2372fce12 Move orchestra to teuthology.orchestra so there's just one top-level package. 2011-09-13 14:53:02 -07:00
Sage Weil
e66dffc3d3 don't eat exceptions for breakfast
fixes 0c2bee1514
2011-09-02 11:07:10 -07:00
Sage Weil
c502418fca thrashosds: make it work when first mon isn't mon.0 2011-09-01 12:56:29 -07:00
Josh Durgin
ec768ba3ca Fix pyflakes warnings. 2011-08-31 14:36:01 -07:00
Greg Farnum
0c2bee1514 valgrind: don't run valgrind_post if there's no valgrind
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-29 16:47:22 -07:00
Greg Farnum
3a3c859f5b valgrind: scan logs for bad results
It's not sophisticated but it will warn you about a node
if at least one node has issues.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-29 14:03:02 -07:00
Greg Farnum
50a648bdfc valgrind: use xml output for tools that support it
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-29 14:03:02 -07:00
Greg Farnum
1130e5fe37 coverage: create dir conditionally
We don't need to create the dir if we aren't using coverage.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-24 16:48:14 -07:00
Sage Weil
42318c57cb check ceph cluster log for badness (ERR, WRN, SEC) 2011-08-23 21:00:26 -07:00
Sage Weil
21d04419b8 ceph: copy cluster log file to archive/ceph.log 2011-08-22 22:04:57 -07:00
Greg Farnum
e20bae2a7f valgrind: Document!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-17 10:35:37 -07:00
Greg Farnum
4efc95fa57 include log in valgrind log file names
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-17 10:30:26 -07:00
Greg Farnum
d5eb2c2b77 ceph task: split up arguments a little more
This allows selective daemon kill signal changes. With valgrind
daemons we want term instead of kill, for instance.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-17 10:30:24 -07:00
Greg Farnum
5323e1796f valgrind: move valgrind logs to log dir
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-17 10:29:54 -07:00
Greg Farnum
aa74481728 ceph: split up daemon-running arguments and insert valgrind ones
This setup should let us insert other kinds of things too, if we
need them.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-15 15:35:42 -07:00
Greg Farnum
9ec19f13df ceph: Set up valgrind as a flavor, and create a dir for logging.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-15 15:32:23 -07:00
Greg Farnum
98ac89a54e ceph task: pass the full config to the daemon startup subs
So far as I can tell there is no reason to reduce them to
the coverage config, and I want the full config for my
soon-to-exist valgrind options.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-08-15 15:31:18 -07:00
Sage Weil
bf7b1dd4a7 ceph: fix max_mds calculation
Signed-off-by: Sage Weil <sage@newdream.net>
2011-08-10 12:47:20 -07:00
Sage Weil
b5ba155c17 Revert "fix get_clients"
This reverts commit 83b6678e79.  The bug I was
hitting was actually fixed by 06e3e69c29.
2011-08-09 13:23:58 -07:00
Sage Weil
83b6678e79 fix get_clients
Only return the clients that are listed (not _all_ clients).  There might
be a combination of cfuse and kclient (or other) clients here!
2011-08-05 14:35:44 -07:00
Sage Weil
ef2b80910a use coverage_dir 2011-08-05 14:35:43 -07:00
Greg Farnum
6ac6f7ab38 teuthology: convert from bzip2 to gzip.
gzip is much, much faster on large log files. With a 7.7GB client log, gzip
took 2:45 to compress it to 624MB. bzip2 took 34:38 to compress it to
366MB. For our purposes the space savings are not worth the time loss.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-07-29 10:35:02 -07:00
Sage Weil
277c4ff7aa set max_mds based on non-standbys 2011-07-28 10:25:30 -07:00