Commit Graph

66 Commits

Author SHA1 Message Date
Patrick Donnelly
9832beac85
tasks/mds_thrash: support multimds
This commit amends the MDS thrasher task to also work on multimds
clusters. Main changes:

o New FSStatus class in tasks/cephfs/filesystem.py which gets a snapshot
  of the fsmap (`ceph fs dump`). This allows consecutive operations on
  the same fsmap without repeated fs dumps.

o Only one MDSThrasher is started for each file system.

o The MDSThrasher operates on ranks instead of names (and groups of
  standbys following the initial active).

o The MDSThrasher also will change the max_mds for the cluster to a new
  value [1, current) or (current, starting max_mds]. When reduced,
  randomly selected MDSs other than rank 0 will be deactivated to reach
  the new max_mds. The likelihood of changing max_mds in a given cycle of
  the MDSThrasher is set by the "thrash_max_mds" config.

o The MDSThrasher prints out stats on completion, e.g. number of
  MDSs deactivated or max_mds changes.
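
The max_mds selection and rank deactivation described in the list above can be sketched as follows. This is a minimal illustration, not the thrasher's actual code: `pick_new_max_mds` and `ranks_to_deactivate` are hypothetical names, and the sketch assumes the new value is drawn uniformly from the two ranges.

```python
import random


def pick_new_max_mds(current, starting_max_mds, rng=random):
    """Pick a value from [1, current) or (current, starting_max_mds].

    Hypothetical helper. Returns None when no other value is available.
    """
    candidates = (list(range(1, current)) +
                  list(range(current + 1, starting_max_mds + 1)))
    if not candidates:
        return None
    return rng.choice(candidates)


def ranks_to_deactivate(active_ranks, new_max_mds, rng=random):
    """Choose random ranks other than rank 0 to deactivate.

    Hypothetical helper. Returns an empty list when max_mds is not reduced
    below the number of active ranks.
    """
    extra = len(active_ranks) - new_max_mds
    if extra <= 0:
        return []
    eligible = [r for r in active_ranks if r != 0]
    return rng.sample(eligible, extra)
```

Passing a seeded `random.Random` instance as `rng` makes a thrash sequence reproducible, which may help when replaying a failure.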

Pre-requisite for: http://tracker.ceph.com/issues/10792
Partially fixes: http://tracker.ceph.com/issues/15134

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2016-11-07 21:24:09 -05:00
John Spray
326a33b4fa tasks: update to run ceph-mgr daemons
Signed-off-by: John Spray <john.spray@redhat.com>
2016-11-01 12:21:51 +01:00
John Spray
298cc8f932 tasks/ceph: move generate_caps from teuthology
This was only used in this task, and it is much too
ceph-specific to belong in teuthology.

Fixes: http://tracker.ceph.com/issues/17614
Signed-off-by: John Spray <john.spray@redhat.com>
2016-10-19 13:05:36 +01:00
John Spray
cc8198d8eb tasks/ceph: enable dirfrags in cephfs
Otherwise places we set mds_bal_frag have
no effect.

Signed-off-by: John Spray <john.spray@redhat.com>
2016-09-23 11:04:54 +01:00
John Spray
c444db12d4 tasks/ceph: construct CephManager earlier
Previously, if errors occurred during healthy(), then
the finally block would invoke osd_scrub_pgs, which relies
on CephManager being constructed, and it would die, hiding
the original exception.
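
The ordering problem can be sketched as follows. This is a simplified illustration under assumed names: `run_with_cleanup`, `make_manager`, `wait_healthy`, and `scrub` are hypothetical stand-ins, not the task's real functions.

```python
def run_with_cleanup(make_manager, wait_healthy):
    """Run a health check and always scrub afterwards.

    Hypothetical sketch: the manager is built before the try block so the
    finally block can rely on it.
    """
    manager = make_manager()
    try:
        wait_healthy(manager)
    finally:
        # If the manager were constructed after wait_healthy(), a failure
        # there would leave it undefined here, and the error raised by this
        # cleanup step would obscure the original exception.
        manager.scrub()
```

With this ordering, an exception from the health check propagates unchanged after the cleanup step has run.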

Signed-off-by: John Spray <john.spray@redhat.com>
2016-08-31 15:32:33 +01:00
Warren Usui
1b7552c9cb tasks/ceph.restart osd fix
ceph.restart should mark restarted osds down in order to avoid a
race condition with ceph_manager.wait_for_clean

Fixes: http://tracker.ceph.com/issues/15778
Signed-off-by: Warren Usui <wusui@redhat.com>
2016-05-25 16:59:05 -07:00
Josh Durgin
a67a123f0c tasks/ceph.healthy: allow None as config
This is used by the pg-removal-interruption test.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-11 12:07:16 -07:00
Josh Durgin
bccdef6650 tasks/ceph: ignore EEXIST for the archive data dir creation
With multiple clusters this will be called multiple times.
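
A minimal sketch of the pattern, assuming the directory is created on the local filesystem: `ensure_dir` is a hypothetical helper name, and only the EEXIST error is swallowed so that other failures still surface.

```python
import errno
import os


def ensure_dir(path):
    """Create path, tolerating the case where it already exists."""
    try:
        os.makedirs(path)
    except OSError as e:
        # A second call for another cluster finds the directory in place.
        if e.errno != errno.EEXIST:
            raise
```

Calling `ensure_dir` once per cluster is then safe regardless of which cluster's setup runs first.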

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 14:55:27 -07:00
Josh Durgin
4c2e7309db tasks/ceph: pull each mon dir only once
No need to pull all mon dirs for the host for each mon on the host.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 14:55:26 -07:00
Josh Durgin
3203b76792 tasks/ceph: only run ceph_log and valgrind_post once
These set up and parse logs on all hosts, so they should be run only
for the first cluster setup. This cluster will be torn down last, so
the cleanup happens after all clusters are shut down as well.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 14:55:26 -07:00
Josh Durgin
96e7724e23 tasks/ceph: make scrubbing cluster-aware
Simplify implementation by using manager and teuthology.misc methods
instead of reinventing them here.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 14:55:26 -07:00
Josh Durgin
3948f108a8 tasks/ceph: make restart subtask cluster-aware
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 11:51:59 -07:00
Josh Durgin
9ad65769c7 tasks/ceph: make wait_for_mon_quorum cluster-aware
Accept a 'daemons' list like other ceph subtasks, so it can get an
optional 'cluster' setting too.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 11:51:59 -07:00
Josh Durgin
bb76deaf57 tasks/ceph: make wait_for_osds_up cluster-aware
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 11:51:59 -07:00
Josh Durgin
ef619062be tasks/ceph: update ctx.manager usage to ctx.managers
Not sure this function is ever used (no users in ceph-qa-suite yamls
or tasks).

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 11:51:59 -07:00
Josh Durgin
524e6d7a5e tasks/ceph_manager: add cluster param to write_conf()
Only used by cephfs right now, so don't bother changing callers.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 11:51:59 -07:00
Josh Durgin
141c73d399 tasks/ceph_manager: parameterize CephManager with cluster
Add --cluster arguments, pass cluster to get_daemon() and
iter_daemons_of_role, replace 'ceph' with cluster in paths, and use
ctx.ceph[cluster] instead of ctx.ceph.
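
A rough sketch of what this parameterization looks like. The helper names `ceph_cmd` and `conf_path` and the `per_cluster` dict are hypothetical; the sketch assumes the usual `/etc/ceph/<cluster>.conf` layout and is not the CephManager code itself.

```python
def ceph_cmd(cluster, *args):
    """Build a ceph CLI invocation that targets the named cluster."""
    return ['sudo', 'ceph', '--cluster', cluster] + list(args)


def conf_path(cluster):
    """Return the config path with the cluster name in place of 'ceph'."""
    return '/etc/ceph/{cluster}.conf'.format(cluster=cluster)


# Per-cluster state keyed by cluster name, analogous to ctx.ceph[cluster].
per_cluster = {
    'ceph': {'conf': conf_path('ceph')},
    'backup': {'conf': conf_path('backup')},
}
```

Keeping the cluster name as an explicit argument means the default cluster named `ceph` goes through the same code path as any other cluster.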

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 11:51:58 -07:00
Josh Durgin
0acbafe371 tasks/ceph: store cluster config in a per-cluster dict
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 11:51:58 -07:00
Josh Durgin
25ff14af74 tasks/ceph: create a CephManager per cluster
Thrashing tasks will be updated to use ctx.managers indexed by
cluster later.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 11:51:58 -07:00
Josh Durgin
d135470305 tasks/ceph: make healthy() cluster-aware
ceph.healthy may be used as a standalone task, so it may not always
have the cluster name in its configuration.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 11:51:58 -07:00
Josh Durgin
b0dd04736e tasks/ceph: make cephfs_setup() cluster-aware
Note that cephfs tests using the Filesystem abstractions will need to
be converted to understand multiple clusters later. This just updates
the ceph task portion.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 11:51:58 -07:00
Josh Durgin
4409710102 tasks/ceph: make crush_setup() cluster-aware
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 11:51:58 -07:00
Josh Durgin
e19e9e2fa3 tasks/ceph: adapt to cluster-aware daemon.resolve_role_list
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 11:51:57 -07:00
Josh Durgin
26b8a1f3ac tasks/ceph: make run_daemon() cluster-aware
Pass --cluster where appropriate and include the full role in file
names.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 11:51:57 -07:00
Josh Durgin
a54ff597fa tasks/ceph: convert cluster creation to work with multiple clusters
Add a cluster option to the ceph task, and pass that through to
cluster(). Make sure monitors and clients don't collide by adding
their cluster to paths they use.

This assumes there is one ceph task per cluster, and osds from
multiple clusters do not share hosts (or else the block device
assignment won't work).

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2016-05-09 11:51:57 -07:00
Sage Weil
52b13e82a6 tasks/ceph: allow set allow_multiple to fail
This will fail on upgrade tests.

Signed-off-by: Sage Weil <sage@redhat.com>
2016-04-16 09:10:51 -04:00
Greg Farnum
5e7e017d7d cephfs: update tests to enable multimds when needed
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2016-04-06 16:32:44 -07:00
Dan Mick
228f71e176 tasks/ceph.py: Remove *.pid at end of run
http://tracker.ceph.com/issues/15162
Fixes: #15162
Signed-off-by: Dan Mick <dan.mick@redhat.com>
2016-03-16 15:50:30 -07:00
John Spray
f05d977628 tasks/ceph: fix up whitespace
...because otherwise it lights up like a Christmas
tree in PyCharm.

Signed-off-by: John Spray <john.spray@redhat.com>
2015-11-09 13:20:49 +00:00
Loic Dachary
8f9de175e0 ceph: log which ceph.conf file is written
Signed-off-by: Loic Dachary <ldachary@redhat.com>
2015-10-21 15:49:48 +02:00
John Spray
eab70197a6 tasks/ceph: wait for MDS to be active when creating a cluster
This is the correct implementation of 685d76a77c,
merged while broken in ff1655cb57 and
reverted in 4cccde634f.

Signed-off-by: John Spray <john.spray@redhat.com>
2015-09-21 16:05:51 -07:00
Greg Farnum
4cccde634f Revert "Merge pull request #567 from ceph/ceph_fuse-timeout"
This reverts commit ff1655cb57, reversing
changes made to 2b25080d4f.

Since we haven't actually started the MDS daemons yet, this code is broken.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2015-09-17 07:29:28 -07:00
Greg Farnum
685d76a77c ceph: wait for CephFS to be healthy before proceeding
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2015-09-16 13:53:36 -07:00
Zack Cerza
e9847570de Merge pull request #561 from ceph/wip-sudo
sudo ceph
2015-09-11 10:20:50 -06:00
Sage Weil
dad981d339 tasks: sudo ceph for cli
/var/run/ceph is 770.  This is mainly necessary for any
interaction with the daemon sockets, but it is what users do
and it may avoid log noise.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-11 12:15:01 -04:00
Sage Weil
9b3f36f91f ceph: add option to expect valgrind errors and fail if there are none
See #10328
Signed-off-by: Sage Weil <sage@redhat.com>
2015-09-11 11:53:37 -04:00
John Spray
9f530092e2 tasks: fix syntax error in ceph.py
From e195f9fa.

Signed-off-by: John Spray <john.spray@redhat.com>
2015-08-06 10:25:03 +01:00
Zack Cerza
e195f9fa1c Set the SELinux context of the logrotate config
Signed-off-by: Zack Cerza <zack@redhat.com>
2015-08-05 10:08:35 -06:00
Greg Farnum
451c5ca79d ceph: fix up log rotation stopper
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2015-06-12 11:19:24 -07:00
Greg Farnum
6573e92fb3 ceph: update log rotation for review comments
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2015-06-09 16:16:50 -07:00
Greg Farnum
96f3eb9dbe ceph: support arbitrarily-named daemons in logrotate
And make it more configurable in terms of sizes.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2015-06-08 17:36:49 -07:00
Greg Farnum
5935f86e49 ceph: enable mds log rotation
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
2015-06-08 17:36:49 -07:00
Jason Dillaman
8392d7f213 tasks: add support for running fsx under valgrind
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2015-04-28 09:44:05 -04:00
John Spray
ea7c39222a tasks/ceph: refactor legacy FS configuration check
Move up into Filesystem so that this can be used from
the ceph_deploy task as well.

Signed-off-by: John Spray <john.spray@redhat.com>
2015-04-19 17:46:18 -07:00
Andrew Schoen
8cb28ddb8e Revert "ceph: be less weird about passing -f to mkfs"
2015-04-02 15:08:13 -05:00
Sage Weil
182cb63034 ceph: fix mkfs -f bug
Pass -f by default to btrfs instead of first trying without and *then*
trying with.

Among other things, this avoids a confusing failure where we try mkfs.ext4
device (no -f), fail for some reason, and then try again with -f and get
a usage error (-f does not mean force for mke2fs).

Signed-off-by: Sage Weil <sage@redhat.com>
2015-03-31 07:56:53 -07:00
Sage Weil
1922c61bbf ceph: ugh fix syntax
Signed-off-by: Sage Weil <sage@redhat.com>
2015-02-25 11:37:44 -08:00
Sage Weil
18307be0ca ceph: fix ps axuf lsof line
Signed-off-by: Sage Weil <sage@redhat.com>
2015-02-25 11:05:13 -08:00
Sage Weil
a68281e147 ceph: ps axf too before lsof
Specifically, I want to know *who* is running the ceph-osd that is
holding the files open.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-02-24 06:59:16 -08:00
Sage Weil
077e917af1 ceph: lsof if umount fails
Signed-off-by: Sage Weil <sage@redhat.com>
2015-02-23 13:52:48 -08:00