Commit Graph

991 Commits

Author SHA1 Message Date
Warren Usui
357fd22f04 Add calamari_setup
Calamari_setup can be used to set up a calamari gui for manual testing,
or be run in a suite to test the calamari setup and calamari ceph
installation code.

Fixes: 9759
Signed-off-by: Warren Usui <warren.usui@inktank.com>
2014-11-18 19:06:24 -08:00
Loic Dachary
e2d6ce7e9d Merge pull request #234 from ceph/wip-dzaddscrub
Add scrub_test and repair_test to rados basic suite

Reviewed-by: Loic Dachary <ldachary@redhat.com>
2014-11-11 03:42:33 +01:00
Sage Weil
36055e143c Merge remote-tracking branch 'gh/next' 2014-11-09 20:48:45 -08:00
Sage Weil
5f19ef7116 tasks/radosbench: no log to stderr
Signed-off-by: Sage Weil <sage@redhat.com>
2014-11-09 20:48:35 -08:00
Gregory Farnum
e8cd3f10d6 Merge pull request #221 from ceph/wip-forward-scrub
Wip forward scrub

Reviewed-by: John Spray <john.spray@redhat.com>
2014-11-07 16:16:38 -08:00
Greg Farnum
6c26c073de mds_scrub_checks: Run scrub and flush commands against the MDS.
We mostly do a variety of successful ones, but we also corrupt the store
using the rados tool and make sure we get the expected error codes. Includes
a yaml fragment so the task gets run as part of the fs/basic suite.

Signed-off-by: Greg Farnum <greg@inktank.com>
2014-11-07 13:06:13 -08:00
David Zafman
74e776139b repair_test: Wait for OSDs to come up before proceeding with test
Signed-off-by: David Zafman <dzafman@redhat.com>
2014-11-06 22:23:34 -08:00
Yan, Zheng
edb780a3c5 tasks/cephfs/mount: use separate for testing flock and posix lock
Old versions of libfuse treat both flock and posix lock requests as posix
lock requests. This is a workaround for the bug.

Fixes: #9995
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-11-07 09:07:27 +08:00
Sage Weil
8a18d8baaf Merge remote-tracking branch 'gh/giant' into m
Conflicts:
	tasks/ceph_manager.py
2014-10-29 14:31:26 -07:00
Yehuda Sadeh
dd6194f637 Merge branch 'wip-apache-worker' 2014-10-23 16:05:44 -07:00
Yehuda Sadeh
c3b53c3265 apache: switch to use the apache worker mpm
Fixes: #9169

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2014-10-23 16:05:03 -07:00
Yehuda Sadeh
35c9cae84c apache: change template to load mpm worker module
in apache 2.4

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2014-10-23 16:04:58 -07:00
David Zafman
4ddadf0698 Thrasher: Disable ceph_objectstore_tool tests if old release missing command
Leaving disabled until merge of import/export fixes

Fixes: #9805

Signed-off-by: David Zafman <dzafman@redhat.com>
2014-10-22 23:30:29 -07:00
David Zafman
523cb63b5f ceph_manager: ceph_objectstore_tool testing off by default
Signed-off-by: David Zafman <dzafman@redhat.com>
2014-10-22 10:32:46 -07:00
Zack Cerza
1b8d31986a Smarter s3tests branch selection
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-10-22 09:12:43 -06:00
Sage Weil
ecfcb2e04c Merge pull request #189 from ceph/wip-apache-max-requests
apache: set MaxRequestsPerChild to 0
2014-10-21 10:57:29 -07:00
Yehuda Sadeh
1fd89f4e43 apache: switch to use the apache worker mpm
Fixes: #9169

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit c3b53c3265)
2014-10-23 16:08:22 -07:00
Yehuda Sadeh
14b5a9afdd apache: change template to load mpm worker module
in apache 2.4

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit 35c9cae84c)
2014-10-23 16:08:14 -07:00
David Zafman
a295c18a80 Thrasher: Disable ceph_objectstore_tool tests if old release missing command
Don't need to explicitly turn off the test during some upgrades
Leaving disabled until merge of import/export fixes

Fixes: #9805

Signed-off-by: David Zafman <dzafman@redhat.com>
2014-10-22 19:21:04 -07:00
Sage Weil
7e41c93ed8 tasks/thrashosds: support overrides
e.g.,

overrides:
  thrashosds:
    thrash_primary_affinity: false
...
tasks:
- install:
- ceph:
- thrashosds:
- workunit:
...

Needed for #9865

Signed-off-by: Sage Weil <sage@redhat.com>
2014-10-22 11:19:01 -07:00
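A minimal sketch of how an `overrides` section like the one in the commit above could be layered onto a task's own config. The helper names `deep_merge` and `task_config` are hypothetical and written for illustration only; they are not taken from the thrashosds task itself.

```python
def deep_merge(base, override):
    """Return a new dict with `override` layered on top of `base`."""
    merged = dict(base or {})          # a bare "- thrashosds:" entry yields None
    for key, value in (override or {}).items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def task_config(job, task_name):
    """Find `task_name` in job['tasks'] and apply job['overrides'][task_name]."""
    for entry in job.get("tasks", []):
        if task_name in entry:
            overrides = job.get("overrides", {}).get(task_name)
            return deep_merge(entry[task_name], overrides)
    raise KeyError(task_name)


# The job description from the commit message, expressed as parsed YAML.
job = {
    "overrides": {"thrashosds": {"thrash_primary_affinity": False}},
    "tasks": [{"install": None}, {"ceph": None},
              {"thrashosds": None}, {"workunit": None}],
}
effective = task_config(job, "thrashosds")
```

Here `effective` carries the override even though the task entry itself was empty.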
David Zafman
bdbcf760d9 ceph_manager: ceph_objectstore_tool testing off by default
Signed-off-by: David Zafman <dzafman@redhat.com>
2014-10-22 10:34:26 -07:00
Zack Cerza
01b556afc1 Smarter s3tests branch selection
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
(cherry picked from commit 1b8d31986a)
2014-10-22 09:13:11 -06:00
Loic Dachary
e48a0a3924 erasure-code: unfound test needs a non empty file
Otherwise rados put will fail as follows

$ touch /tmp/bar
$ ./rados -p rbd put existing_3 /tmp/bar
$ ./rados -p rbd put existing_3 /tmp/bar
WARNING: could not create object: existing_3
error putting rbd/existing_3: (17) File exists

It should be considered a bug in the rados command line, but it needs to be
addressed separately.

http://tracker.ceph.com/issues/9387 Fixes: #9387

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-10-20 14:41:10 -07:00
Gregory Farnum
28761d8bfd Merge pull request #181 from ceph/wip-client-flock
tasks/mds_client_recovery: file lock test

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-10-16 06:57:54 -07:00
Yehuda Sadeh
f4432e6386 apache: set MaxRequestsPerChild to 0
Otherwise the default is 10k.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2014-10-14 14:17:41 -07:00
Yehuda Sadeh
8a87a08477 tasks/s3tests: add slow backend configurable
Adding this so that we can modify the clients' conf file as needed with slow backend.
This can be achieved by:

overrides:
  s3tests:
    slow_backend: true

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit 61409179df)
2014-10-14 11:36:24 -07:00
Yehuda Sadeh
61409179df tasks/s3tests: add slow backend configurable
Adding this so that we can modify the clients' conf file as needed with slow backend.
This can be achieved by:

overrides:
  s3tests:
    slow_backend: true

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2014-10-13 15:07:06 -07:00
Greg Farnum
4db95170e6 document 'command' requirements on admin_socket method
Signed-off-by: Greg Farnum <greg@inktank.com>
2014-10-13 12:37:52 -07:00
Yan, Zheng
88133719b7 tasks/mds_client_recovery: file lock test
check that file lock doesn't get lost after an MDS restart

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-10-10 16:54:29 +08:00
John Spray
6ac9efef9c tasks/cephfs: say which test failed in exception
Example:
Was: 'Test failure'
Now: Test failure: test_full_caps (tasks.mds_full.TestClusterFull)

Signed-off-by: John Spray <john.spray@redhat.com>
2014-10-08 16:27:44 +01:00
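A rough sketch of the idea in the commit above: run a unittest-style suite and name the failing tests in the raised exception instead of reporting a bare 'Test failure'. The names `make_demo_case` and `run_and_report` are hypothetical, and the exact text produced by `str(test)` varies between Python versions.

```python
import unittest


def make_demo_case():
    """Build a small TestCase with one passing and one failing test."""
    class DemoCase(unittest.TestCase):
        def test_ok(self):
            self.assertTrue(True)

        def test_bad(self):
            self.fail("deliberate failure for the demo")

    return DemoCase


def run_and_report(test_case_class):
    """Run the tests; raise an error that lists every failed test."""
    suite = unittest.TestLoader().loadTestsFromTestCase(test_case_class)
    result = unittest.TestResult()
    suite.run(result)
    bad = result.failures + result.errors   # lists of (test, traceback) pairs
    if bad:
        names = ", ".join(str(test) for test, _trace in bad)
        raise RuntimeError("Test failure: " + names)
    return result.testsRun
```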
Sage Weil
7ba50e0c89 tasks/ceph_manager: enable log for ceph_objectstore_tool
Signed-off-by: Sage Weil <sage@redhat.com>
2014-10-07 08:37:53 -07:00
Loic Dachary
c598b8e962 erasure-code: unfound test needs a non empty file
Otherwise rados put will fail as follows

$ touch /tmp/bar
$ ./rados -p rbd put existing_3 /tmp/bar
$ ./rados -p rbd put existing_3 /tmp/bar
WARNING: could not create object: existing_3
error putting rbd/existing_3: (17) File exists

It should be considered a bug in the rados command line, but it needs to be
addressed separately.

http://tracker.ceph.com/issues/9387 Fixes: #9387

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-10-02 08:06:42 +02:00
John Spray
48a0b75928 Merge remote-tracking branch 'origin/giant' into wip-merge
Conflicts:
	erasure-code/ec-rados-default.yaml
	tasks/mds_client_limits.py
	tasks/mds_client_recovery.py
	tasks/mds_journal_migration.py
2014-10-01 18:17:01 +01:00
Yan, Zheng
ff03b46509 tasks/mds_client_recovery: client trim its cache on reconnect
Make sure the CephFS client trims its cache before reconnecting to the MDS.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-09-29 20:59:43 +01:00
John Spray
c2d298a43c tasks: wait for mds active before mounting clients
To make the logs clearer when trying to work out
if/when something went wrong, rather than always
having client logs start with some failures.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-29 15:04:33 +01:00
John Spray
0073e25d77 tasks: rename FuseMount.get_client_id to get_global_id
'client_id' was ambiguous because in other places it
meant the '0' in client.0, whereas here it means
the runtime-generated global ID of the client.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-29 15:04:25 +01:00
John Spray
b77b3bec72 tasks: add mds_client_limits
New CephFS tests for the behaviour of the system while
enforcing its resource limits.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-29 15:04:18 +01:00
John Spray
1fa15011a3 tasks: generalise CephFSTestCase
Some of this stuff could be even more general for embedding
unittest-style suites, but for the moment let's keep the cephfs
stuff in a walled garden.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-29 15:04:10 +01:00
John Spray
b6ccf0d414 tasks: generalize config writing for Filesystem
Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-29 15:03:17 +01:00
John Spray
8f49a7d86a tasks: wait for active after mds restart
May have been causing spurious failures on
trying to read session state after MDS restart (
session list isn't populated until recovery is
complete)

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-25 11:28:53 +01:00
tamil
a5a1cce3c7 included an option to ceph_objectstore_tool, whenever we have keyvaluestore_backend as a configurable parameter
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
2014-09-23 07:50:49 -07:00
John Spray
d9ec7f2f7a tasks: wait for mds active before mounting clients
To make the logs clearer when trying to work out
if/when something went wrong, rather than always
having client logs start with some failures.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-19 14:16:14 +01:00
John Spray
3e07bd1aaa tasks: rename FuseMount.get_client_id to get_global_id
'client_id' was ambiguous because in other places it
meant the '0' in client.0, whereas here it means
the runtime-generated global ID of the client.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-19 14:16:13 +01:00
John Spray
7274289542 tasks: add mds_client_limits
New CephFS tests for the behaviour of the system while
enforcing its resource limits.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-19 14:15:41 +01:00
John Spray
d777d7123b tasks: generalise CephFSTestCase
Some of this stuff could be even more general for embedding
unittest-style suites, but for the moment let's keep the cephfs
stuff in a walled garden.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-19 14:13:53 +01:00
John Spray
6f36269d24 tasks: generalize config writing for Filesystem
Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-19 14:13:51 +01:00
Gregory Farnum
278f4dc77a Merge pull request #143 from ceph/wip-migration-test
tasks: more substantial IO for journal migration

Reviewed-by: Greg Farnum <greg@inktank.com>
2014-09-18 15:03:35 -07:00
John Spray
65a4141e22 Merge remote-tracking branch 'origin/giant' 2014-09-17 13:50:55 +01:00
John Spray
7d086403d4 tasks: escaping '*' when deleting files
Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-17 13:37:08 +01:00
John Spray
366ee00554 tasks: more substantial IO for journal migration
...so that there will at least be multiple segments
in the log during the rewrite.

Also make the test stricter by checking that
cephfs-journal-tool can happily read the resulting
journal.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-16 15:14:54 +01:00
John Spray
1d9101cf31 tasks: fix race in test_stale_caps
Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-15 14:32:20 +01:00
John Spray
4daf2ddc39 tasks: typo in mds_client_recovery
Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-15 13:44:27 +01:00
John Spray
bc257677de tasks: handle failure cleanly in test_stale_caps
Previously would fail because the cap waiter
completed too soon, without noticing that the
reason it completed quickly was because it failed.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-15 13:44:27 +01:00
Samuel Just
f6582f8961 tasks: add watch_notify_same_primary
Reproduces: #9220
Signed-off-by: Samuel Just <sam.just@inktank.com>
2014-09-09 15:31:13 -07:00
Samuel Just
79989de8b0 Merge pull request #112 from ceph/wip-8231-forreview
Wip 8231 forreview

Reviewed-by: Samuel Just <sam.just@inktank.com>
2014-09-02 13:43:38 -07:00
John Spray
bbf569de74 tasks: fix mount race in mds_client_recovery
Signed-off-by: John Spray <john.spray@redhat.com>
2014-09-01 16:38:25 +01:00
David Zafman
05eee9fa79 ceph_manager: Add test code to use export/import to move a pg
Check for more than 1 osd down and randomize on chance_move_pg (100%)
For now only export from older down osd to newly down osd to avoid missing map

Signed-off-by: David Zafman <david.zafman@inktank.com>
2014-08-30 16:20:22 -07:00
David Zafman
0cdf6e813d ceph_manager: Implement export/import when thrasher kills an osd
Use list-pgs to avoid races by seeing actual pgs present

Signed-off-by: David Zafman <david.zafman@inktank.com>
2014-08-30 16:20:22 -07:00
David Zafman
9ade22dd34 ceph_objectstore_tool: Add task for testing of tool of the same name
Based on ceph/src/test/ceph_objectstore_tool.py but only does
replicated pool testing and doesn't test argument validation.

Signed-off-by: David Zafman <david.zafman@inktank.com>
2014-08-30 16:20:22 -07:00
tamil
b3dfe47589 Added dmcrypt option and ability to choose same or different disk for ceph journal
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
2014-08-28 18:21:30 -07:00
John Spray
322e2498de Merge pull request #101 from ceph/wip-7810
Wip 7810
2014-08-27 22:22:13 +01:00
Zack Cerza
7baeb8043c Merge pull request #105 from ceph/wip-boto
tasks/s3tests: push boto config with idle_timeout setting
2014-08-26 09:58:39 -06:00
Sage Weil
4f8436bf5d Merge pull request #106 from ceph/wip-9091-wusui
Implement ceph.created_pool

Reviewed-by: Sage Weil <sage@redhat.com>
2014-08-26 06:34:45 -07:00
Sage Weil
12a391ea01 thrashosds: increase osd revive timeout (75s -> 150s)
This is needed when running valgrind.

Signed-off-by: Sage Weil <sage@redhat.com>
2014-08-25 08:52:02 -07:00
Warren Usui
0ec5bd1a63 Implement ceph.created_pool
ceph.created_pool allows the user (via yaml lines) to add pools
that the ceph_manager knows about.

Fixes: 9091
Signed-off-by: Warren Usui <warren.usui@inktank.com>
2014-08-22 17:39:38 -07:00
Sage Weil
9d466aa110 tasks/s3tests: push boto config with idle_timeout setting
Signed-off-by: Sage Weil <sage@redhat.com>
2014-08-22 15:28:33 -07:00
John Spray
1855e094e5 suites/fs: add client recovery
Signed-off-by: John Spray <john.spray@redhat.com>
2014-08-21 23:09:00 +01:00
John Spray
d001cc27bc tasks/mds_client_recovery: use existing clients
This will enable using .yaml changes to switch this
guy over to use kcephfs client once the teuthology
code around it supports all the same hooks as I've added
for fuse.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-08-21 23:09:00 +01:00
John Spray
bb52a9733a tasks/mds_client_recovery: network freeze test
This is about testing the CephFS client's handling
of losing connectivity to the MDS.

Fixes: #7810

Signed-off-by: John Spray <john.spray@redhat.com>
2014-08-21 23:09:00 +01:00
John Spray
8211d83dde tasks/ceph_fuse: enable umounting from config
This is for any test config that needs to run
some workunit with clients unmounted.  It allows
you to toggle the mountedness of a client as
you go up and down the stack, like this:

- ceph-fuse:
    client.0:
        mounted: true
- workunit:
    clients:
        client.0:
        - fs/misc/trivial_sync.sh
- ceph-fuse:
    client.0:
        mounted: false

The initial use case for this is running the
cephfs_journal_tool_smoke.sh workunit, which
tests administrative operations that are meant
to be run on an unmounted filesystem.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-08-21 23:09:00 +01:00
John Spray
1e7bfb842a tasks/workunit: fix log message
Signed-off-by: John Spray <john.spray@redhat.com>
2014-08-21 23:09:00 +01:00
John Spray
5c29ae6bd1 tasks/ceph: add ceph.stop task
So that we can explicitly stop daemons on demand.  Useful
for MDS tool tests that want the MDS daemons not to be running,
as this is more solid and explicit than doing e.g. "ceph mds
stop" from within workunits.

Signed-off-by: John Spray <john.spray@redhat.com>
2014-08-21 23:09:00 +01:00
Zack Cerza
5765fde1aa Merge pull request #102 from ceph/9171
ignore errors on informational service status
2014-08-21 09:16:49 -06:00
Alfredo Deza
70a1f18adf use 'mon create-initial' always
But don't error if it fails, as this would mean that the monitors
are just taking longer to form quorum. Go and try the next block which will
wait up to 15 minutes for a successful gatherkeys to happen (that only works
if monitors have formed quorum).

Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
2014-08-21 10:03:28 -04:00
Alfredo Deza
5b946e1a6d ignore errors on informational service status
Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
2014-08-21 09:44:45 -04:00
Loic Dachary
e5c5bcf92d rgw: add erasure_code_profile configuration
If erasure_code_profile is present at the same level as ec-data-pool, it
is used to override the default hard coded profile.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
2014-08-22 01:27:17 +02:00
Alfredo Deza
4b15d0118e use the right syntax for RHEL/CentOS distros to check for ceph status
Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
2014-08-18 12:40:43 -04:00
Sage Weil
f7b32bcc31 rgw: httpd instead of httpd.worker
httpd exists on rhel 6.5 too ...

Signed-off-by: Sage Weil <sage@redhat.com>
2014-08-16 16:44:32 -07:00
Sage Weil
6392758f1b rgw: need all of mod unixd, version, authz
Signed-off-by: Sage Weil <sage@redhat.com>
2014-08-16 16:44:32 -07:00
Sage Weil
27b7eceeae tasks/rgw: include mod_authz
As per http://www.webhostingtalk.com/showthread.php?t=1173594

Signed-off-by: Sage Weil <sage@redhat.com>
2014-08-16 13:56:15 -07:00
Sage Weil
2aae91929f tasks/rgw: get mpm_event from mods-available, not mods-enabled
Signed-off-by: Sage Weil <sage@redhat.com>
2014-08-16 13:37:39 -07:00
Dan Mick
9de5bd1d23 Add extra conf for Apache 2.4
Inside a conditional to affect only 2.4, set User, Group, and the
module config to load mpm_event.  This is normally done with the
default configuration files, but since this abbreviated conf bypasses
those, we must set them here.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
2014-08-15 22:37:22 -07:00
Loic Dachary
821b2a4397 replace locally instantiated CephManager
Use the ctx.manager instance created by ceph.py instead

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-08-15 15:56:52 +02:00
Loic Dachary
9782465c87 initialize ctx.manager in ceph.py
instead of rados.py because ceph.py is only run once whereas rados.py
could be run multiple times, leading to race conditions

http://tracker.ceph.com/issues/9027 Fixes: #9027

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-08-15 15:56:35 +02:00
Loic Dachary
f53ea258a4 move functions from ceph to ceph_manager
mount_osd_data and make_admin_daemon_dir are only used by
ceph_manager.py although they are defined in ceph.py

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-08-15 15:56:35 +02:00
Loic Dachary
da00662191 rgw: s/idle_timeout/default_idle_timeout/
Signed-off-by: Loic Dachary <loic@dachary.org>
2014-08-15 12:34:37 +02:00
Zack Cerza
2fc76d6e7f Merge pull request #86 from dachary/wip-9027-create-unique-pool
rados.py: avoid CephManager creation race
2014-08-14 09:28:29 -06:00
Loic Dachary
54a7298cdd rgw: add default_idle_timeout to allow override
Globally overriding the rgw idle_timeout is not possible because it
needs to be done on a per-client (client.0, client.1, etc.) basis. Add the
default_idle_timeout key to the rgw config: it defaults to the
previously hardcoded default (30) and can be changed via the override.

The existing tasks that were previously overriding the idle_timeout on a
per client basis are changed to use the default_idle_timeout instead for
consistency and to allow a global override.

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-08-14 14:53:24 +02:00
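A small sketch of the lookup order the commit above describes: a `default_idle_timeout` key at the rgw config level, falling back to the previously hardcoded 30. The function name `idle_timeout_for` is hypothetical, and the assumption that a per-client `idle_timeout` still takes precedence when present is mine, not stated in the commit.

```python
HARDCODED_DEFAULT = 30  # the previously hardcoded default named in the commit


def idle_timeout_for(rgw_config, client):
    """Resolve the idle timeout for one client from the rgw task config."""
    default = rgw_config.get("default_idle_timeout", HARDCODED_DEFAULT)
    client_conf = rgw_config.get(client) or {}   # "client.0:" may be empty
    # Assumption: an explicit per-client value wins over the shared default.
    return client_conf.get("idle_timeout", default)


# A global override can now be expressed once instead of per client.
rgw_config = {"default_idle_timeout": 300, "client.0": None, "client.1": {}}
timeouts = {c: idle_timeout_for(rgw_config, c) for c in ("client.0", "client.1")}
```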
Loic Dachary
6237acb316 rados.py: avoid CephManager creation race
gevent may hold the rados.py thread when it has an opportunity. The

   if not hasattr(ctx, 'manager'):

must therefore be immediately before the manager creation it is supposed
to protect. If any of the functions called as a side effect of

   first_mon = teuthology.get_first_mon(ctx, config)
   (mon,) = ctx.cluster.only(first_mon).remotes.iterkeys()

give gevent an opportunity to hold the thread, it creates a race
condition.

The other possibility would be to use a ctx lock to protect the code, but
this solution seems simpler.

http://tracker.ceph.com/issues/9027 Fixes: #9027

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-08-14 10:57:35 +02:00
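The check-then-create race described in the commit above can be illustrated without gevent. In this sketch the "yield to another greenlet" is simulated by a lookup that re-enters the setup function once; `Context`, `racy_setup`, `safe_setup` and `count_creations` are hypothetical names, not code from rados.py.

```python
class Context(object):
    """Stand-in for the shared run context (hypothetical)."""


def racy_setup(ctx, lookup, factory):
    # Check first, then do work that may yield: a second caller can slip in
    # between the check and the assignment.
    if not hasattr(ctx, "manager"):
        mon = lookup()
        ctx.manager = factory(mon)
    return ctx.manager


def safe_setup(ctx, lookup, factory):
    # Do the work that may yield first, then check immediately before creating.
    mon = lookup()
    if not hasattr(ctx, "manager"):
        ctx.manager = factory(mon)
    return ctx.manager


def count_creations(setup):
    """Return how many managers get created when the lookup is interrupted."""
    ctx = Context()
    created = []
    state = {"interrupted": False}

    def factory(mon):
        created.append(mon)
        return object()

    def lookup():
        # Simulate a switch to a second caller during the first lookup.
        if not state["interrupted"]:
            state["interrupted"] = True
            setup(ctx, lookup, factory)
        return "mon.a"

    setup(ctx, lookup, factory)
    return len(created)
```

With the check placed before the lookup, the simulated interruption creates the manager twice; with the check placed directly before the creation, it is created once. This only models the ordering issue and assumes the factory itself does not yield.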
Zack Cerza
4e1e929f75 Update module references
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-08-07 08:24:59 -06:00
Zack Cerza
0e1df3cc72 Import teuthology tasks (master branch)
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-08-07 08:24:58 -06:00