Fixes #5494, although that issue's description is poor. Instead of adding
a wait, the code used to detect whether the guest is back up is fixed. The
previous code appeared to assume only one machine and broke
when it was waiting for multiple machines if the guests did not
come up within 10 seconds of each other.
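A minimal sketch of waiting on several guests at once, tracking each one
separately so that a slow guest does not break the wait. The function name,
the is_up callback and the timeout/interval values are assumptions made for
illustration; this is not the code from the commit.

```python
import time


def wait_for_guests(hosts, is_up, timeout=300.0, interval=10.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Block until is_up(host) is true for every host, or raise on timeout."""
    pending = set(hosts)
    deadline = clock() + timeout
    while True:
        # Re-check only the hosts that have not been seen up yet.
        pending = {host for host in pending if not is_up(host)}
        if not pending:
            return
        if clock() >= deadline:
            raise RuntimeError(
                "guests still down: %s" % ", ".join(sorted(pending)))
        sleep(interval)
```

The clock and sleep parameters exist only so the loop can be exercised
without real delays.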
Make nuke not do the normal stuff if the machine is a VPS as we
just destroy them when they get unlocked.
Get downburst options from the yaml given to teuthology for the
test/task instead of from ~/.teuthology.yaml.
Fixed an error that would make all the default downburst values
not take effect if any of them were set via a yaml.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Warren Usui <warren.usui@inktank.com>
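The defaults problem described above can be sketched as layering yaml
overrides on top of a full set of defaults, so that setting one option does
not discard the others. The option names and values below are hypothetical;
the commit message does not list the real downburst defaults.

```python
# Hypothetical defaults, for illustration only.
DEFAULT_DOWNBURST = {"disk-size": "30G", "ram": "4G", "cpus": 1}


def merge_downburst_options(overrides=None):
    """Return the defaults with any yaml-supplied overrides layered on top."""
    merged = dict(DEFAULT_DOWNBURST)   # copy, so the defaults stay untouched
    merged.update(overrides or {})     # only the keys given are replaced
    return merged
```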
Occasionally we don't wait long enough for the osd to start and
mark itself up. Keep trying until flush succeeds.
Fixes: #5431
Signed-off-by: Sage Weil <sage@inktank.com>
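"Keep trying until flush succeeds" can be sketched as a bounded retry loop.
The helper name, the attempt count and the delay are assumptions for
illustration, and the action callable stands in for whatever issues the
flush.

```python
import time


def retry_until_success(action, attempts=10, delay=1.0, sleep=time.sleep):
    """Call action() until it returns without raising, or attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return action()
        except Exception as err:
            # Remember the failure and give the daemon time to come up.
            last_error = err
            sleep(delay)
    raise last_error
```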
A very simple change. Just touch a file first (to create it if it
doesn't yet exist so the delete doesn't error out) and then delete
it before pushing the keys to the file. This should keep the
id_rsa.pub and id_rsa files from getting messed up by previous
runs which were interrupted or failed (or by those files existing for
some other reason). This appears to be what was breaking the
ceph-deploy nightlies.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
The usage doc string for a task is tedious to write and
hard to keep reconciled with the code as defaults are changed.
args.py includes a helper to put it all in one place.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Instead of going through the trouble of adding/removing lines
from authorized_keys, which has all our normal keys in it,
push keys to the unused authorized_keys2 file. This makes key
management significantly simpler, as that file can just be wiped
out each time instead of worrying about preserving contents.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
- use a separate pool for each client
- create pool at start, destroy pool at end
- use all clients, if not explicitly specified
Signed-off-by: Sage Weil <sage@inktank.com>
This included:
A) Changes so that full path names are used for some files
(scheduled tasks started in different home directories).
B) Changes to ensure tasks come up on the beanstalkc queue properly.
C) Finding and inserting the libvirt equivalent code for VMs
in order to simulate ipmi actions.
D) Fix host key code, report valgrind issue more clearly.
E) Some message and downburst call changes.
Fixes: #4988
Fixes: #5122
Signed-off-by: Warren Usui <warren.usui@inktank.com>
File "/var/lib/teuthworker/teuthology-master/teuthology/task/ceph.py", line 215, in valgrind_post
(file, kind) = line.split(':')
ValueError: need more than 1 value to unpack
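The traceback above comes from unpacking a split that produced only one
field, that is, a line with no ':' in it. One way to guard against that is
sketched below. The function name and the assumed "file:kind" line shape are
illustrative, and this is not necessarily how ceph.py was changed.

```python
def parse_valgrind_line(line):
    """Split 'file:kind'; return None for lines that lack a ':' separator."""
    parts = line.strip().split(":", 1)
    if len(parts) != 2:
        return None          # unexpected line, let the caller skip or log it
    return parts[0], parts[1]
```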
This lets us set different overrides for e.g. ceph vs samba, and makes it
so the schedule_teuthology.sh overrides don't specify a ceph sha1 for
samba installs.
Signed-off-by: Sage Weil <sage@inktank.com>
Add description of yaml file including log-whitelist
Add sudo to dd that corrupts data
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Warren Usui <warren.usui@inktank.com>
This creates and cleans up a local mnt dir that can be consumed
by other tasks (like workunit, samba, etc), just like any
other client (ceph-fuse, kclient), except it is just a dir on
the local fs.
Signed-off-by: Sage Weil <sage@inktank.com>
The cifs-mount task mounts an smb endpoint from the
first available smbd server (the samba.0 role). This
task is similar to the ceph-fuse task: file system
tests can be run on the resulting mount point.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
The samba task sets up samba on all 'samba' roles
with ceph as the backend storage module. The task
creates an smb.conf file that points to ceph, and
starts smbd.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Some daemons (smbd) will try to read from stdin and check if it's a
socket, using that for sending/receiving messages. If /dev/null is
used for stdin, the daemon aborts. This patch adds a 'nostdin' option
to the daemon-helper so that the daemon can be started without /dev/null
as stdin.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Generalizes the install task to specify a "project" which defaults to
'ceph', but can be configured to install different project packages,
for example:
    install:
        project: samba
        extra_packages: samba
The default install task uses 'ceph' as the project, and relies on an
existing set of defined packages to install. For other projects, the
packages to be installed must be specified with the extra_packages
field. Multiple install tasks can be specified:
    install:
    install:
        project: samba
        extra_packages: samba
This installs ceph packages and then samba packages.
Also, clean up nuke.py so that nuke and install use the same list of
packages when doing the remove steps.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Yum installs of packages specify a package number. The initial
install of the yum source is changed to not fail if already done.
Added yum cleans where necessary.
Fixes: #4768
Signed-off-by: Warren Usui <warren.usui@inktank.com>
There's no need for an explicit cleanup function, so move it back
to where it came from (except in s3roundtrip, which did not have it).
Instead, since these use a nested contextmanager, pass through
and yield to the top-level run_tasks after the nested
contextmanager has finished (and thus run all the cleanup steps
in the subtasks for this test).
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Only split once, since radosgw will have client.X after it.
Monitors and MDSs may have names with more .s as well.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
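A small illustration of why the split is limited to one: with a role such as
'rgw.client.0', splitting on every '.' yields three fields, while splitting
once keeps the id intact. The helper name and the sample role strings are
assumptions for this sketch, which also assumes every role contains at least
one '.'.

```python
def split_role(role):
    """Split 'type.id' on the first '.' only, so the id may contain dots."""
    role_type, role_id = role.split(".", 1)
    return role_type, role_id
```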
The upgrade tasks specify 'branch' in the job file, but the
schedule_suite.sh script sets a sha1 in the overrides. Make
the upgrade tests actually test an upgrade by preferring branch
over sha1 when both are specified.
This is fragile, but ought to do the trick for now!
Signed-off-by: Sage Weil <sage@inktank.com>
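The preference described above can be sketched as a lookup that checks
'branch' before 'sha1'. The function name and the shape of the config dict
are assumptions for illustration, not the actual install task code.

```python
def pick_version(config):
    """Return (key, value) for what to install, preferring branch over sha1."""
    for key in ("branch", "sha1"):      # order encodes the preference
        if config.get(key):
            return key, config[key]
    return None                         # neither was specified
```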
Test repair with more than 1 damaged object and with different types of damage
Regression test for bug #4778
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
Purge removes logs, and we want to archive those, so explicitly shut down
all daemons before doing the archiving step.
Signed-off-by: Sage Weil <sage@inktank.com>
This reverts commit 67a616a979.
Sigh. As it turns out, /etc/default/grub being hacked also
causes the same problem. I think there's a way to fix that cleanly
as well, but until then, replacing the "accept installed version"
hack here so jobs can run.
This reverts commit 5995ae7e78.
With the changes to ceph-qa-chef and the teuthology kernel task,
we're no longer touching packaged file /etc/grub.d/10_linux, which
was the reason for this apt forcing. Remove so that we find other
package problems that might be masked by this; we can always
put it back if there are such problems until we can fix those as well.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit c2b0828b19)
Change apt commands to prevent prompts from coming up (forcing
non-interactive mode) so that prompts from grub or other packages
don't break teuthology runs.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
fa2049f caused an import cycle between lock.py and misc.py. Move the
needed functions from lock.py to lockstatus.py so that we can avoid the
import cycle.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Conflicts:
teuthology/lock.py
Purge will uninstall and (in so doing) stop the daemons. This avoids trying
to tar up the mon data or logs while they are being written to, which
avoids errors like
2013-04-16T20:21:47.103 INFO:teuthology.task.ceph-deploy:Archiving mon data...
2013-04-16T20:21:47.545 INFO:teuthology.orchestra.run.err:tar: ./ceph-mira089/store.db/000009.log: file changed as we read it
Also drop the unnecessary uninstall (it is implied by purge).
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 4befae4fbe)
This reverts commit 67a616a97927efdc4fbcc5edb0d0cf4a724d90e2.
Sigh. As it turns out, /etc/default/grub being hacked also
causes the same problem. I think there's a way to fix that cleanly
as well, but until then, replacing the "accept installed version"
hack here so jobs can run.
This reverts commit 5995ae7e78.
With the changes to ceph-qa-chef and the teuthology kernel task,
we're no longer touching packaged file /etc/grub.d/10_linux, which
was the reason for this apt forcing. Remove so that we find other
package problems that might be masked by this; we can always
put it back if there are such problems until we can fix those as well.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit c2b0828b19)
We had been writing 01_ceph_kernel with the kernel title, and
relying on the fact that grub.cfg would never have submenus in it
(implemented by a hack to /etc/grub.d/10_linux which neutered its
submenu creation). However, that hack was modifying a package file,
and got in the way of later apt commands. Rather than doing it
that way, this divines the title of the submenu and sets the
default variable to "submenu>kernel", which works to select the
desired kernel.
It depends on there being only one level of submenu, and on the
format of the menuentry and submenu commands, dictated by grub2.
None of this is likely to work at all outside Ubuntu.
Fixes: #4496
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit 52aec32a7d)
The pg state could easily have changed in the meantime,
for example, from recovery_wait to recovering.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
This reverts commit 5995ae7e78.
With the changes to ceph-qa-chef and the teuthology kernel task,
we're no longer touching packaged file /etc/grub.d/10_linux, which
was the reason for this apt forcing. Remove so that we find other
package problems that might be masked by this; we can always
put it back if there are such problems until we can fix those as well.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
fa2049f caused an import cycle between lock.py and misc.py. Move the
needed functions from lock.py to lockstatus.py so that we can avoid the
import cycle.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
We had been writing 01_ceph_kernel with the kernel title, and
relying on the fact that grub.cfg would never have submenus in it
(implemented by a hack to /etc/grub.d/10_linux which neutered its
submenu creation). However, that hack was modifying a package file,
and got in the way of later apt commands. Rather than doing it
that way, this divines the title of the submenu and sets the
default variable to "submenu>kernel", which works to select the
desired kernel.
It depends on there being only one level of submenu, and on the
format of the menuentry and submenu commands, dictated by grub2.
None of this is likely to work at all outside Ubuntu.
Fixes: #4496
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
This is a fix for issue #4677, which was caused by kdb output being
hard-coded to ttyS1. That is fine for all our hardware except mira
machines. This change just checks to see if mira is in the host's
name and, if so, uses ttyS2 instead (simple fix).
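The hostname check described above amounts to something like the following
sketch. The function name and the sample hostnames are hypothetical.

```python
def kdb_console_tty(hostname):
    """Pick the serial console for kdb output based on the machine type."""
    # mira machines use ttyS2; all other hardware uses ttyS1.
    return "ttyS2" if "mira" in hostname else "ttyS1"
```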
Resolves an issue where we were not properly escaping the generated
public key when doing matches against it.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>
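To illustrate the escaping problem: base64 key material can contain '+' and
'/', and '+' is a regex quantifier, so an unescaped key may fail to match
itself. A minimal sketch using re.escape follows. The helper name and the
sample key are made up, and the commit may have matched keys by other means.

```python
import re


def line_matches_key(line, public_key):
    """Return True if the literal public key text occurs in the line."""
    pattern = re.escape(public_key)   # treat every character literally
    return re.search(pattern, line) is not None
```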
Change apt commands to prevent prompts from coming up (forcing
non-interactive mode) so that prompts from grub or other packages
don't break teuthology runs.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Modify the Hadoop task so that a branch can be
specified for both Apache Hadoop and Inktank
Hadoop.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>
Updated the ssh-keys task to clean up
any left-over keys from previous tasks
(indicated by the user being 'ssh-keys-user').
Also, some of the functions in the ssh_keys task seem
like they could be useful in general, so
this patch refactors them into misc.py.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>
Downburst create is used to reinstall a VM when it is locked.
Downburst destroy is used to remove a VM when it is unlocked.
Host keys are regenerated on each VM instantiation, so the keys
need to be checked prior to use.
If needed, ceph-qa-chef is run on newly installed systems to ensure that
they are fully functional.
Signed-off-by: Warren Usui <warren.usui@inktank.com>
Optional flag makes us skip pulling down the archive (mostly the logs, which
might be huge for some debugging tests) unless the test has failed.
Signed-off-by: Sage Weil <sage@inktank.com>
In cases where the mds thrasher continuously loops
waiting for an mds to be removed from the map, or
for a new mds to become active, we want to start logging
the mds state for debugging.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Verify there is no /var/lib/ceph, just like we do with the cephtest
directory. We will need to change this (or make it optional) when we
allow runs against an existing cluster, but a whole bunch of other
things will need to change then as well.
Signed-off-by: Sage Weil <sage@inktank.com>