Enable the s3readwrite task to have the branch to
download specified, and allow overrides to be
incorporated into the config at run-time.
Code based on the s3tests.py task.
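The merge itself is just a recursive dict update; a minimal sketch of the idea (helper name and usage are illustrative, not the actual task code):

def deep_merge(base, overrides):
    # Fold override values into the base config, recursing into nested dicts.
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_merge(base[key], value)
        else:
            base[key] = value
    return base

# hypothetical usage inside the task:
#   overrides = ctx.config.get('overrides', {}).get('s3readwrite', {})
#   deep_merge(config, overrides)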
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
I saw
2013-08-03T12:56:26.641 DEBUG:teuthology.orchestra.run:Running [10.214.131.28]: 'sudo killall -9 smbd'
2013-08-03T12:56:26.727 DEBUG:teuthology.orchestra.run:Running [10.214.131.28]: 'sudo lsof /home/ubuntu/cephtest/93695/mnt.0'
2013-08-03T12:56:26.830 INFO:teuthology.orchestra.run.out:[10.214.131.28]: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
2013-08-03T12:56:26.830 INFO:teuthology.orchestra.run.out:[10.214.131.28]: smbd 12381 root cwd DIR 0,0 0 1 /home/ubuntu/cephtest/93695/mnt.0
which makes me think we just need to wait a moment before
attempting the umount?
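If so, a small retry loop around the umount should be enough; a sketch (the remote.run call stands in for the real teuthology call):

import time

def umount_with_retries(remote, mnt, attempts=6, delay=5):
    # smbd can hold the mount point briefly after being killed;
    # retry the umount a few times before giving up.
    for i in range(attempts):
        try:
            remote.run(args=['sudo', 'umount', mnt])
            return
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(delay)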
Signed-off-by: Sage Weil <sage@inktank.com>
The rgw task was failing to check for a None object
when parsing user info in the case where config
options were set for a client that did not include
user info (e.g. 'valgrind:').
This corrects a bug where specifying an rgw server
for a client, but not a system user, would throw
an exception.
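The guard amounts to treating a missing per-client section as empty; a sketch (key names are illustrative):

# config.get(client) may be None when the client has options such as
# 'valgrind:' but no user info; treat that the same as an empty dict.
client_config = config.get(client) or {}
system_user = client_config.get('system user')
if system_user is not None:
    # only then parse and create the system user
    pass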
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
The log_data and log_metadata settings are made
configurable via the YAML file and default to false
(meaning neither data nor metadata operations are
logged).
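A sketch of how the flags might be read from the per-client config (key names are assumptions):

conf = config.get(client) or {}                  # per-client YAML section, may be absent
log_data = conf.get('log data', False)           # assumed key name
log_metadata = conf.get('log metadata', False)   # assumed key name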
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
- Read ceph.conf from stored copy that includes overrides
- Get system users and keys from cluster instead of reading other
tasks' yaml, which may not be complete.
- Put zone info extraction from the cluster into utility functions,
  since it'll be useful for other tests later (see the sketch below)
- Work with more than one agent on a single host
- Accept more than one client to run, like almost every other task
- Rename target to dest for consistency with radosgw-agent
- Don't make everything one large function
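As a sketch, one of those cluster-query utilities might look roughly like this (helper name and exact invocation are assumptions):

import json
from io import StringIO

def get_zone_info(remote, client):
    # Query the cluster directly instead of re-reading other tasks' yaml.
    out = StringIO()
    remote.run(args=['radosgw-admin', '--name', client, 'zone', 'get'],
               stdout=out)
    return json.loads(out.getvalue())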
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
This pulls access data out of the rgw task and off disk,
and then downloads, sets up, and runs an rgw sync agent
in test mode.
Signed-off-by: Greg Farnum <greg@inktank.com>
Fixes a bug where an rgw client without
a system user specified would cause teuthology
to error out.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
By separating out the user creation from
generating the region/zone info, we can generate
users for RGW tests that run against the default
pools.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
A 'user create' call was being passed to radosgw-admin
with '--secret-key' instead of the valid '--secret',
causing a random secret to be generated and subsequent
tests to fail.
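The corrected invocation looks roughly like this (surrounding arguments are illustrative):

args = [
    'radosgw-admin', 'user', 'create',
    '--uid', uid,
    '--display-name', display_name,
    '--access-key', access_key,
    '--secret', secret,   # not '--secret-key', which led to a random secret
]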
Signed-off-by: Joe Buck <jbbuck@gmail.com>
The fastcgi_sock dir needs to exist before radosgw starts, and the apache-execed radosgw needs an explicit keyring argument.
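Roughly (paths and variable names are illustrative):

# create the socket dir before radosgw starts
remote.run(args=['mkdir', '-p',
                 '{tdir}/apache/tmp/fastcgi_sock'.format(tdir=testdir)])

# and give the apache-execed radosgw an explicit keyring
keyring = '/etc/ceph/ceph.{client}.keyring'.format(client=client)
radosgw_args = ['radosgw', '-n', client, '-k', keyring]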
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Just a simple change to reconnect to SSH after running
ceph-qa-chef to get around things like ulimit changes.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Treat it as an error when packages are missing (the old code skipped
'Nothing to do' messages, but these cases are still errors).
Fixes #5803
Signed-off-by: Warren Usui <warren.usui@inktank.com>
Reviewed-by: Sandon Van Ness
Only radosgw needs this option, and each one will be different, so
remove it from the ceph.conf template.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
The clients are pretty regularly reporting busy on unmount when
samba runs above them. This will hopefully give us some info about why.
Signed-off-by: Greg Farnum <greg@inktank.com>
Due to bug #5716, pools need to start with a '.' at present.
Updating the examples to follow this convention.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
The post-yield code in create_dirs needed to
be tweaked to correctly delete the {tdir}/apache
directory (if it exists) on each client.
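In outline, the post-yield part looks like this (a sketch; the exact guard in the task may differ):

import contextlib

@contextlib.contextmanager
def create_dirs(ctx, config):
    testdir = '/home/ubuntu/cephtest'   # illustrative; normally taken from ctx
    # ... create the per-client apache dirs here ...
    try:
        yield
    finally:
        # post-yield: remove {tdir}/apache on each client if it exists
        for client in config.keys():
            ctx.cluster.only(client).run(
                args=['rm', '-rf', '{tdir}/apache'.format(tdir=testdir)],
            )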
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Take client<->zone/region and the associated pools from ceph.conf, so
we don't have to invent a new format to specify it.
General region info is added to a new configuration section in the rgw
task. Each client is assumed to be a different zone, and a system user
is created with the key specified in the yaml, so it can be passed to
later task configuration as well. This isn't strictly necessary, but
avoids having to look up this info in later tasks through something
like radosgw-admin.
Ports are allocated automatically because there's no obvious mapping
from host to client in the task configuration. Later tests can get the
endpoints desired by reading the region map.
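For example, the automatic port assignment can be as simple as handing out consecutive ports per host (a sketch; the helper name and starting port are assumptions):

from itertools import count

def assign_ports(ctx, config, start=7280):
    # There is no obvious host-to-client mapping in the task config,
    # so just hand out consecutive ports per host.
    counters = {}
    role_endpoints = {}
    for remote, roles in ctx.cluster.remotes.items():
        for role in roles:
            if role in config:
                ports = counters.setdefault(remote.name, count(start))
                role_endpoints[role] = (remote.name, next(ports))
    return role_endpoints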
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Six copies are replaced with one, with an added option to check status
automatically. This should probably be used in a few places where the
return code is ignored.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Just to allow the create to still work in case the OS
volume is fairly large (it takes a while to resize) and in
case the host machine is bogged down due to disk I/O.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
tasks:
...
- ceph.wait_for_mon_quorum: [a, b]
...
will block until the mon quorum consists of exactly [a, b]. This is
compared directly to the relevant field from 'ceph quorum_status'
which has the alphanumeric names only.
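A minimal sketch of the wait loop (the 'quorum_names' field comes from the quorum_status JSON; the rest is illustrative):

import json
import time
from io import StringIO

def wait_for_mon_quorum(remote, want):
    # Poll 'ceph quorum_status' until the quorum is exactly the
    # requested list of alphanumeric mon names.
    while True:
        out = StringIO()
        remote.run(args=['ceph', 'quorum_status'], stdout=out)
        quorum = json.loads(out.getvalue()).get('quorum_names', [])
        if sorted(quorum) == sorted(want):
            return
        time.sleep(3)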
Signed-off-by: Sage Weil <sage@inktank.com>
Often we want to build a test collection that substitutes different
sequences of tasks into a parallel/sequential construction. However, the
yaml combination that happens when generating jobs is not smart enough to
substitute some fragment into a deeply-nested piece of yaml.
Instead, make these sequences top-level entries in the config dict, and
reference them. For example:
tasks:
- install:
- ceph:
- parallel:
  - workload
  - upgrade-sequence
workload:
  workunit:
  - something
upgrade-sequence:
  install.restart: [osd.0, osd.1]
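Resolving such a reference inside the parallel/sequential tasks can be as simple as (names are illustrative):

def resolve_entry(ctx, entry):
    # A bare string refers to a top-level entry in the job config,
    # e.g. 'workload' or 'upgrade-sequence' in the example above.
    if isinstance(entry, str):
        return ctx.config[entry]
    return entry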
Signed-off-by: Sage Weil <sage@inktank.com>
Instead of relying on hardcoded values, obtain the max-skew default from
'ceph-mon --show-config-value mon_clock_drift_allowed' to match the mon's
expectation.
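Roughly (a sketch; output handling kept minimal):

from io import StringIO

def default_max_skew(remote):
    # Ask the monitor binary for its own default rather than hardcoding it.
    out = StringIO()
    remote.run(
        args=['ceph-mon', '--show-config-value', 'mon_clock_drift_allowed'],
        stdout=out,
    )
    return float(out.getvalue().strip())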
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Sometimes the thing we're talking to is slow to start, or to register the
command we are running. Loop in that case, at least for a while.
Signed-off-by: Sage Weil <sage@inktank.com>
If 'max-skew' is not defined, it defaults to 0.05; if it is
defined, however, it must override whatever is in the config.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
This will make the CLI do every mon command twice and make sure they both
succeed. This catches problems with mon command idempotency faster than
waiting for random failures to trigger.
Added sequential task and parallel task.
Changed _run_one_task to run_one_task (now called by new tasks too).
Fixes #4969
Signed-off-by: Warren Usui <warren.usui@inktank.com>
We already install btrfs-tools and xfsprogs with ceph-qa-chef.
Doing it here was just causing problems on non-Ubuntu
distros, and I see no point in keeping it now.
This is needed so we can set the ceph branch for ceph-deploy
to use via the main yaml, which is created by the suite
scheduler.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Because of issues with package name differences, VPSes are
set up to use repo priorities, and our local repo (which has
some ceph/librados stuff in it) gets high priority, so the
ceph.repo that is created on the machine from ceph-release
basically gets ignored. This change makes ceph.repo the same
priority level as our local repo.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
In some rare cases (mainly CentOS/RHEL), after creating the
guest with downburst it does not come up right; it gets a
kernel panic at boot. Usually just turning it off and then
back on again is enough, but to be on the safe side I figured
it should be re-created instead. This ensures you don't get
hung jobs from a guest that didn't come up correctly.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Fix for #5494, although the issue has a poor description. Instead of
adding a wait, the code used to detect that the guest is back up is
fixed. The previous code appeared to assume only one machine, and broke
when waiting for multiple machines if the guests did not come up within
10 seconds of each other.
Make nuke skip the normal steps if the machine is a VPS, as we
just destroy them when they get unlocked.
Instead of getting downburst options from ~/.teuthology.yaml, get
them from the yaml given to teuthology for the test/task.
Fixed an error that kept all the default downburst values from
taking effect if any of them were set via a yaml.
Reviewed-by: Warren Usui <warren.usui@inktank.com>
Occasionally we don't wait long enough for the osd to start and
mark itself up. Keep trying until flush succeeds.
Fixes: #5431
Signed-off-by: Sage Weil <sage@inktank.com>
A very simple change. Just touch a file first (to create it if it
doesn't yet exist, so the delete doesn't error out) and then delete
it before pushing the keys to the file. This should keep the
id_rsa.pub and id_rsa files from getting messed up by previous
runs that were interrupted or failed (or by those files existing
for some other reason). This appears to be what was causing
breakage in the ceph-deploy nightlies.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
The usage doc string for a task is tedious to write and
hard to keep reconciled with the code as defaults are changed.
args.py includes a helper to put it all in one place.
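The idea, in rough form (names and signature are hypothetical, not the actual args.py API):

def config_with_defaults(config, spec):
    # spec maps each option to (default, description); the same table
    # fills in defaults and renders the usage text, so they cannot drift apart.
    filled = dict(config)
    for key, (default, _desc) in spec.items():
        filled.setdefault(key, default)
    return filled

def usage_from_spec(spec):
    return '\n'.join(
        '    {0}: {1} (default: {2})'.format(key, desc, default)
        for key, (default, desc) in spec.items())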
Signed-off-by: Samuel Just <sam.just@inktank.com>
Instead of going through the trouble of adding and removing lines
in authorized_keys, which has all our normal keys in it, push keys
to the unused authorized_keys2 file. This makes key management
significantly simpler, as that file can simply be wiped out each
time instead of worrying about preserving its contents.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
- use a separate pool for each client
- create pool at start, destroy pool at end
- use all clients, if not explicitly specified
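In outline, the per-client pool handling might look like this (pool naming and pg count are assumptions; newer ceph releases require the confirmation flag shown on delete):

import contextlib

@contextlib.contextmanager
def per_client_pool(remote, client, pg_num=128):
    pool = 'test-{client}'.format(client=client)   # hypothetical naming
    remote.run(args=['ceph', 'osd', 'pool', 'create', pool, str(pg_num)])
    try:
        yield pool
    finally:
        remote.run(args=['ceph', 'osd', 'pool', 'delete', pool, pool,
                         '--yes-i-really-really-mean-it'])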
Signed-off-by: Sage Weil <sage@inktank.com>
This included:
A) Changes made so that full path names on some files were used
(scheduled tasks started in different home directories).
B) Changes to ensure tasks come up on the beanstalkc queue properly.
C) Finding and inserting the libvirt equivalent code for vm machines
in order to simulate ipmi actions.
D) Fix host key code, report valgrind issues more clearly.
E) Some message and downburst call changes.
Fixes #4988, #5122
Signed-off-by: Warren Usui <warren.usui@inktank.com>
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/ceph.py", line 215, in valgrind_post
    (file, kind) = line.split(':')
ValueError: need more than 1 value to unpack