Commit Graph

4302 Commits

Author SHA1 Message Date
John Spray
097ccbb5be tasks: update journal_repair test for 'damaged' state
To track recent change in master where instead of
crashing on missing MDSTable object we'll go
into damaged state.

Instead of catching a crash, handle the rank's
transition to the damaged state.  Leave the crash
handling code (unused for the moment) in the
Filesystem class in case it's needed elsewhere
soon.

Signed-off-by: John Spray <john.spray@redhat.com>
2015-04-14 14:13:39 +01:00
John Spray
bd3ae1eeba tasks/cephfs: add test_strays
This tests the new purge file/ops throttling
in the MDS, via the new perf counters for
strays/purging.

Fixes: #10390
Signed-off-by: John Spray <john.spray@redhat.com>
2015-04-14 14:13:39 +01:00
John Spray
abb635588f tasks/cephfs: add test_sessionmap
Tests for the persistence behaviour of SessionMap.

Signed-off-by: John Spray <john.spray@redhat.com>
2015-04-14 14:13:39 +01:00
John Spray
2b5137bf06 tasks: generalise cephfs test runner
...to avoid having boilerplate in each test module,
and gain the ability to run them all in one go
with a nice test-by-test pass/fail report.

Signed-off-by: John Spray <john.spray@redhat.com>
2015-04-14 14:13:39 +01:00
John Spray
f54e5414f9 tasks/ceph_fuse: populate ctx.mounts earlier
...so that if an error happens during mount, I can
use the interactive console to access ctx.mounts.

Signed-off-by: John Spray <john.spray@redhat.com>
2015-04-14 14:13:39 +01:00
John Spray
2b39fe5951 tasks/mds_flush: be more careful monitoring stats
We were previously taking the baseline from just after the
client did a delete, which was racy: should have taken
it from before, to get a steady state.

Also update the perf dump calls to take advantage of
the new filtering syntax.

Signed-off-by: John Spray <john.spray@redhat.com>
2015-04-14 14:13:39 +01:00
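The baseline fix described in 2b39fe5951 can be sketched roughly as follows. This is an illustration only, not the code from tasks/mds_flush: `count_new_events`, `read_counter` and `trigger` are hypothetical names, and the counter is assumed to be a monotonically increasing integer such as a perf counter value.

```python
def count_new_events(read_counter, trigger):
    """Return how much a counter grew because of `trigger`.

    read_counter: hypothetical callable returning the current counter value.
    trigger: hypothetical callable performing the operation under test
             (for example, a client-side delete).
    """
    # Sample the baseline BEFORE triggering, while the system is in a
    # steady state. Sampling afterwards races with the work being measured.
    baseline = read_counter()
    trigger()
    return read_counter() - baseline
```

Sampling after the trigger would miss any events that were processed between the trigger and the sample, which is the race the commit message describes.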
John Spray
3d3b095bb1 tasks: lots of s/mds_restart/mds_fail_restart/
Wherever we are subsequently waiting for daemons
to be healthy, we should be doing a fail during the restart.

Also catch some places that were doing this longhand and use
the handy fail_restart version instead.

Signed-off-by: John Spray <john.spray@redhat.com>
2015-04-14 14:13:39 +01:00
John Spray
79906e3d07 tasks/cephfs: better multiple-mds handling
Signed-off-by: John Spray <john.spray@redhat.com>
2015-04-14 14:13:39 +01:00
John Spray
0de712f42a tasks/ceph_manager: DRY in mds_status
Signed-off-by: John Spray <john.spray@redhat.com>
2015-04-14 14:13:38 +01:00
John Spray
5c1071b103 ceph_manager: fix bad type assertions
In python, isinstance(foo, str) will fail if
a unicode string is passed in.  The correct check
is basestring.

Signed-off-by: John Spray <john.spray@redhat.com>
2015-04-14 14:13:38 +01:00
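The type check described in 5c1071b103 might look like the sketch below. The helper name `is_string` is hypothetical and does not come from ceph_manager. `basestring` exists only in Python 2, so the fallback to `str` is an added assumption that lets the same code run on Python 3.

```python
try:
    # Python 2: basestring is the common base class of str and unicode.
    string_types = basestring
except NameError:
    # Python 3: basestring is gone and str is already the text type.
    string_types = str


def is_string(value):
    """Return True for any text string, including Python 2 unicode objects."""
    return isinstance(value, string_types)
```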
John Spray
ce1196d62f tasks/cephfs: be tolerant of multiple MDSs
...as long as only one is active, all the ops
that default to talking to a single MDS should
be happy to talk to the active MDS, even if there
happens to be a standby lying around too.

Signed-off-by: John Spray <john.spray@redhat.com>
2015-04-14 14:13:38 +01:00
Yan, Zheng
86bd6bc377 task/samba: use SIGTERM to stop samba server
man samba(8) contains sentences:

To shut down a user's smbd process it is recommended that SIGKILL (-9)
NOT be used, except as a last resort, as this may leave the shared
memory area in an inconsistent state. The safe way to terminate an smbd
is to send it a SIGTERM (-15) signal and wait for it to die on its own.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-14 17:07:33 +08:00
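The shutdown approach quoted in 86bd6bc377 (send SIGTERM, then wait for the process to exit) can be sketched for a locally started process as below. This is not the code from the samba task, which runs daemons on remote hosts; `stop_gracefully` and its `timeout` parameter are hypothetical, and the fallback to SIGKILL after the timeout is an assumption that mirrors the "last resort" wording of the man page.

```python
import signal
import subprocess
import sys


def stop_gracefully(proc, timeout=30):
    """Ask a subprocess.Popen process to exit via SIGTERM and wait for it.

    Falls back to SIGKILL only if the process has not exited after
    `timeout` seconds. Returns the process's return code.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Last resort: the process ignored SIGTERM.
        proc.kill()
        return proc.wait()
```

On POSIX systems a process terminated by a signal reports a negative return code equal to the signal number.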
Josh Durgin
25aa9a3cf9 Merge pull request #395 from fullerdj/wip-rbd-xfstests-201504
RBD: add YAML variables to override locations for ceph-qa-chef and xfstests

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-04-13 17:11:06 -07:00
Josh Durgin
f720903d37 Merge pull request #399 from ceph/wip-11043
Removed per #11070 and resolves #11043

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-04-09 14:53:57 -07:00
Yuri Weinstein
8a0de9a8ba Removed per #11070 and resolves #11043
Signed-off-by: Yuri Weinstein <yuri.weinstein@inktank.com>
2015-04-09 14:47:25 -07:00
Douglas Fuller
7b855dea09 RBD: added optional YAML parameters to test xfstests from different repos
These variables are needed because ceph-qa-suite bootstraps ceph-qa-chef via
http download of solo-from/scratch/run. This adds a variable to override the
default script. It also adds variables to the rbd task to override the versions
of run_xfstests_krbd.sh and run_xfstests.sh downloaded by the default task.

variables added
======
tasks:
-chef
  script_url: # override default location for solo-from-scratch for Chef
  chef_repo: # override default Chef repo used by solo-from-scratch
  chef_branch: # to choose a different git upstream branch for ceph-qa-chef
-rbd.xfstests:
  client.0:
   xfstests_branch: # to choose a different git upstream branch for xfstests
   xfstests_url: # override git base URL for run_xfstests{_krbd}.sh

Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2015-04-07 17:45:20 -07:00
Douglas Fuller
b0f5cb1bf1 Increased default test RBD size to 10G to help tests pass
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
2015-04-07 16:49:11 -07:00
Loic Dachary
6d23f3551a Merge pull request #367 from dachary/wip-ec-cache-agent
add erasure code and cache agent upgrade tests

Reviewed-by: Yuri Weinstein <yuriw@redhat.com>
2015-04-03 18:38:44 +02:00
Loic Dachary
1414ca9291 erasure-code: ec-cache-agent in firefly-x/stress-split-erasure-code
Immediately after the Firefly installation, create an erasure code pool
behind a replicated cache pool. Run deep-scrub on all OSD while a rados
task runs. After upgrading half of the cluster (MON and OSD), run a
rados task again, with deep-scrub running in parallel.

http://tracker.ceph.com/issues/11054 Fixes: #11054

Signed-off-by: Loic Dachary <loic@dachary.org>
2015-04-03 11:10:26 +02:00
Loic Dachary
12af9b7292 erasure-code: rename firefly-x/stress-split-erasure-code
Rename sub-directories of firefly-x/stress-split-erasure-code to make room
for a workload immediately after the installation, i.e. number 2.

Signed-off-by: Loic Dachary <loic@dachary.org>
2015-04-03 11:10:26 +02:00
Loic Dachary
f13eb91e51 rados: explain that the task is asynchronous by default
Signed-off-by: Loic Dachary <loic@dachary.org>
2015-04-03 11:10:26 +02:00
Andrew Schoen
ff227ad82b Merge pull request #393 from ceph/revert-385-wip-mkfs
Revert "ceph: be less weird about passing -f to mkfs"
2015-04-02 15:18:03 -05:00
Andrew Schoen
8cb28ddb8e Revert "ceph: be less weird about passing -f to mkfs"
2015-04-02 15:08:13 -05:00
Yuri Weinstein
720f4c78bd Fixed #11306 Whitelist WRN "failed to encode map"
Signed-off-by: Yuri Weinstein <yuri.weinstein@inktank.com>
(cherry picked from commit 9b9d27b25e)
2015-04-01 16:04:06 -07:00
Sage Weil
b6184a3924 Merge pull request #388 from athanatos/wip-wrongly
rados_python: whitelist wrongly marked me down
2015-03-31 10:06:41 -07:00
Sage Weil
5381115be3 Merge pull request #387 from athanatos/wip-scrub-repair-test
(scrub|rados)_test: tolerate best guess digest errors as well
2015-03-31 10:06:10 -07:00
Sage Weil
3a85be7049 Merge pull request #386 from athanatos/wip-11156
rados/thrash*: make scrubs happen a lot
2015-03-31 10:05:37 -07:00
Andrew Schoen
855d3a7623 Merge pull request #385 from ceph/wip-mkfs
ceph: be less weird about passing -f to mkfs
2015-03-31 11:03:00 -05:00
Sage Weil
182cb63034 ceph: fix mkfs -f bug
Pass -f by default to btrfs instead of first trying without and *then*
trying with.

Among other things, this avoids a confusing failure where we try mkfs.ext4
device (no -f), fail for some reason, and then try again with -f and get
a usage error (-f does not mean force for mke2fs).

Signed-off-by: Sage Weil <sage@redhat.com>
2015-03-31 07:56:53 -07:00
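The behaviour described in 182cb63034 can be sketched as building the mkfs command once per filesystem type, rather than retrying with an extra flag. The function `mkfs_args` is hypothetical and is not the code from the ceph task. It assumes only what the commit message states: -f is passed to btrfs by default and is not added for ext4.

```python
def mkfs_args(fs_type, device):
    """Build an mkfs command line for `device` (illustrative only)."""
    args = ['mkfs.' + fs_type]
    if fs_type == 'btrfs':
        # mkfs.btrfs treats -f as "force", so pass it up front instead of
        # failing once without it and retrying with it.
        args.append('-f')
    # No -f for other types: for mke2fs it does not mean "force", so a
    # blind retry with -f ends in a usage error.
    args.append(device)
    return args
```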
Samuel Just
90393886fc rados_python: whitelist wrongly marked me down
Signed-off-by: Samuel Just <sjust@redhat.com>
2015-03-30 13:47:05 -07:00
Samuel Just
9175f88943 (scrub|rados)_test: tolerate best guess digest errors as well
Signed-off-by: Samuel Just <sjust@redhat.com>
2015-03-30 08:28:23 -07:00
Samuel Just
4d8bc2127f rados/thrash*: make scrubs happen a lot
Signed-off-by: Samuel Just <sjust@redhat.com>
2015-03-30 08:21:11 -07:00
Yuri Weinstein
581fcf192f Merge pull request #380 from ceph/wip-11204
Make sure that ulimits are adjusted for ceph-objectstore-tool
2015-03-27 12:23:37 -07:00
Andrew Schoen
25db1f64ab Merge pull request #382 from dmick/master
calamari_setup: der.  Use dict.update() correctly
2015-03-26 18:03:50 -07:00
Dan Mick
58174f05e9 calamari_setup: der. Use dict.update() correctly
Signed-off-by: Dan Mick <dan.mick@redhat.com>
2015-03-26 17:54:36 -07:00
Andrew Schoen
85e7be5a48 Merge pull request #381 from dmick/master
calamari_setup fixes
2015-03-26 17:21:18 -07:00
Sage Weil
dcb5e8da9d Merge remote-tracking branch 'gh/hammer'
Conflicts:
	.gitignore
2015-03-26 17:09:33 -07:00
Dan Mick
e3ec2fc7c4 calamari_setup: Require test_image to be set
Signed-off-by: Dan Mick <dan.mick@redhat.com>
2015-03-26 17:01:48 -07:00
Dan Mick
dee010163b calamari_setup: centralize config defaults
Make a DEFAULTS dict that is updated by any user parms, so that
defaults are documented centrally and so config.get(key, defval) is
no longer necessary everywhere.

Signed-off-by: Dan Mick <dan.mick@redhat.com>
2015-03-26 17:01:48 -07:00
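The pattern described in dee010163b, a central DEFAULTS dict overlaid with user parameters, can be sketched as below. The keys and values are placeholders rather than the real calamari_setup defaults, and `merge_config` is a hypothetical helper name.

```python
# Placeholder defaults for illustration; the real task defines its own keys.
DEFAULTS = {
    'test_image': None,
    'start_browser': False,
}


def merge_config(user_config):
    """Return DEFAULTS overlaid with any user-supplied parameters."""
    config = dict(DEFAULTS)           # copy, so DEFAULTS itself is not mutated
    config.update(user_config or {})  # update() works in place and returns None
    return config
```

With the merged dict in hand, callers can read `config['test_image']` directly instead of repeating `config.get(key, defval)` at each use. Note that `dict.update()` returns None, so its result should not be assigned.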
Dan Mick
3a69c3f494 calamari_setup: remove "build test image" code; add 'test_image' cfgvar
Stop trying to build test images inside this test; presume the test
image is available built externally (in a file path or an http URL).
Config vars ice_tool_dir, ice_version, iceball_location, and
ice_git_location go away in favor of 'test_image', the path to the
testable image (which can still be a tar.gz or an .iso).

Signed-off-by: Dan Mick <dan.mick@redhat.com>
2015-03-26 17:01:48 -07:00
Dan Mick
5644bb5a8d calamari_setup: mounting iso on older distros requires -o loop
Ubuntu's mount/kernel support "mount <file> <mntpnt>" directly;
apparently Centos 6 (and presumably RHEL6) require specifying at
least '-o loop' (a /dev/loopN will be dynamically allocated and removed
on unmount).

Signed-off-by: Dan Mick <dan.mick@redhat.com>
2015-03-26 17:01:48 -07:00
Andrew Schoen
55cb0a5e05 Merge pull request #379 from ceph/wip-wn
fix watch-notify test
2015-03-26 17:00:46 -07:00
Sage Weil
bafe87a8e5 tasks/watch_notify_same_primary: wait for watch before notify
Make sure watch is done registering and ready before sending the
notifies.

Fixes: #10634
Signed-off-by: Sage Weil <sage@redhat.com>
2015-03-26 16:51:56 -07:00
David Zafman
e6ce90fdb1 Make sure that ulimits are adjusted for ceph-objectstore-tool
Fixes: #11204

Signed-off-by: David Zafman <dzafman@redhat.com>
2015-03-26 15:18:47 -07:00
Yuri Weinstein
aa3ff92587 Merge pull request #378 from dachary/wip-11221-erasure-code-not-parallel
erasure-code: enable ec-rados-default.yaml
2015-03-25 16:35:09 -07:00
Loic Dachary
1dddc118d3 erasure-code: enable ec-rados-default.yaml
The ec-rados-default.yaml started with:

  workload:
    sequential:

which is only suitable for suites/upgrade/giant-x/parallel/2-workload/sequential_run/ec-rados-default.yaml

because suites/upgrade/giant-x/parallel/1-giant-install/giant.yaml has

   - parallel:
      - workload
      - upgrade-sequence

The same file was included in contexts where the parallel task was not
used and the workload did not run:

  ./suites/upgrade/firefly-x/stress-split-erasure-code/5-workload/ec-rados-default.yaml
  ./suites/upgrade/giant-x/stress-split-erasure-code/5-workload/ec-rados-default.yaml
  ./suites/upgrade/giant-x/stress-split-erasure-code-x86_64/5-workload/ec-rados-default.yaml

The ec-rados-default.yaml is modified to be a task instead of a
sequential task within a parallel task. The ec-rados-sequential.yaml is
added and is linked in
suites/upgrade/giant-x/parallel/2-workload/sequential_run instead of ec-rados-default.yaml.

http://tracker.ceph.com/issues/11221 Fixes: #11221

Signed-off-by: Loic Dachary <loic@dachary.org>
2015-03-25 17:33:31 +01:00
Loic Dachary
fc39550933 ensure summary is looked for the user we need (part 2)
Move the get_user_summary(out, user) logic to util.rgw so that it can be
shared between radosgw_admin_rest.py and radosgw_admin.py and modify
them accordingly.

http://tracker.ceph.com/issues/11180 Fixes: #11180

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 97e6d808f0)
2015-03-24 13:37:41 -04:00
Josh Durgin
fde3075f6f Merge pull request #377 from ceph/wip-11166
Fixes #11166, whitelisted 'Missing health data for MDS'

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2015-03-23 11:25:34 -07:00
Yuri Weinstein
e0ff0864e3 Fixes #11166, whitelisted 'Missing health data for MDS'
Signed-off-by: Yuri Weinstein <yuri.weinstein@inktank.com>
2015-03-23 11:17:27 -07:00
Loic Dachary
cfc0d0784b Merge pull request #375 from dachary/wip-rgw-regional-summary
ensure summary is looked for the user we need (part 2)

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2015-03-23 01:17:07 +01:00