This class is meant to quack like a DaemonState,
so it needs to expose a .proc attribute. This
was breaking TestDamageTable.
Signed-off-by: John Spray <john.spray@redhat.com>
The "ps au" output was truncating lines which could cause
processes to get missed in some cases, like where there
is a long --client-mountpoint argument.
Signed-off-by: John Spray <john.spray@redhat.com>
Mostly checking the output from the status asok is awkward
because you need to get the system stuck in a particular
state to do so. However, we already have a test here
that sticks the system in reconnect, so here's some
very light test coverage for that asok.
Signed-off-by: John Spray <john.spray@redhat.com>
test_ops_throttle was catching misbehaviour when
deleting lots of files in a directory, but
we should have a non-throttle test for that case
so that it's clearer what's gone wrong.
Signed-off-by: John Spray <john.spray@redhat.com>
...because new tests that modify the client ID will
get confused about whether they're mounted or not, otherwise.
Signed-off-by: John Spray <john.spray@redhat.com>
Previously teuthology read test path from ctx.teuthology_config,
now it reads it from the global teuthology.config.config object.
This was breaking test_journal_migration because it uses tasks/workunit,
when run with vstart_runner.
Signed-off-by: John Spray <john.spray@redhat.com>
This was getting stressed in new ways by
TestSessionMap.test_session_reject, which
has a mount that fails initially.
Two changes here:
* Raise CommandFailedError instead of RuntimeError when
a mount fails (i.e. catch process termination instead
of timing out on /sys/ population)
* Generalise error handling on umount, so that we only
raise the exception on an umount failure if the mount
appears to really not be unmounted. There is some
EINVAL corner case that was getting triggered by the test.
Signed-off-by: John Spray <john.spray@redhat.com>
Analogous to raw_cluster_command, but instead
of calling blocking CLI command we're invoking
the -w mode.
Signed-off-by: John Spray <john.spray@redhat.com>
Used when configuring clients with dynamically
generated auth keys, and pointing them at mount paths.
Signed-off-by: John Spray <john.spray@redhat.com>
When stray directory inodes are corrupted, MDS may go to damaged state
after becoming active. (MDCache::open_root/populate_mydir is called by
MDSRank::starting_done).
Fixes: #14196
Signed-off-by: Yan, Zheng <zyan@redhat.com>
The packages repo host fails in environments where networking setup is
needed in VMs. Use the same user-data as the buildhosts to ensure this
is the case.
Signed-off-by: Robin H. Johnson <robin.johnson@dreamhost.com>
7b27e1db7: openstack: support /etc/network/intefaces injection
2358562cf: ensure VMs always have /etc/hosts set up
4378a505d: always allow unsigned deb packages
50b2db521: openstack: encode instance name with the full IP
6e828a33b: openstack: add 8.8.8.8 as a last resort resolver
Signed-off-by: Robin H. Johnson <robin.johnson@dreamhost.com>
Split the sleep from the server creation, so we catch 'server create'
failures (eg due to quota):
> Quota exceeded for cores: Requested 16, but already used 10 of 20 cores
> (HTTP 403) (Request-ID: req-6467934e-db50-4479-995c-4d44dedf553a)
Signed-off-by: Robin H. Johnson <robin.johnson@dreamhost.com>
OpenStack could tell us the VM has multiple networks, and offers no
guarantee about the order of addresses either (the old code failed if
the v4 IP was first).
For now, take the first listed network, and the first listed IPv4
address therein. Comments contain more detailed examples of possible
output from openstack tool.
Also remove the need for using jq to parse the output.
Signed-off-by: Robin H. Johnson <robin.johnson@dreamhost.com>
The commit from which workunits are fetched must be retrieved
from --ceph-git-url via teuth_config.get_ceph_git_url() instead of
assuming it is available via git://git.ceph.com/ceph.git.
Using git://git.ceph.com/ceph.git is convenient because it supports git
archive. In the general case, some git servers such as github do not
support git archive and a full git clone must be done instead.
Although it would be possible to
git clone --branch=master --depth=1 --single-branch
to reduce the amount of data being retrieved, it would require a
git fetch origin SHA1
but git version >= 1.7 do not support fetching a commit.
http://tracker.ceph.com/issues/13624Fixes: #13624
Signed-off-by: Loic Dachary <loic@dachary.org>
The sha1 for the workunit task is always set by the suite.py task. The
tag must be checked before the sha1 othewise it cannot be used to
override the sha1.
Signed-off-by: Loic Dachary <loic@dachary.org>
Use the Mount.* wrappers for filesystem operations,
so that changes like making run_shell use sudo just work.
Signed-off-by: John Spray <john.spray@redhat.com>
This was causing permissions issues when
running inside teuthology, as run_python
was using sudo and run_shell wasn't.
Would be nice to get rid of all the rootishness,
but for the moment just make it more uniform.
This tests the forward scrub's ability to traverse
some metadata and tag it, and the corresponding
functionality in cephfs-data-scan to filter based
on tag and inject orphaned items.
Signed-off-by: John Spray <john.spray@redhat.com>
Since buildpackages runs before target provisioning, it is possible that
the desired image does not yet exist on a newly provisionned tenant (or
region).
http://tracker.ceph.com/issues/13910Fixes: #13910
Signed-off-by: Loic Dachary <loic@dachary.org>
Similar to what the teuthology install.py task does, add --force-yes to
the apt-get install so that unsigned packages are successfully
installed. It is needed when the buildpackages task is used to create
packages on the fly.
There is no need to do the same for rpm packages because the
verification is controlled from the ceph-release package instead of from
the command line.
http://tracker.ceph.com/issues/13899Fixes: #13899
Signed-off-by: Loic Dachary <loic@dachary.org>
When the quotas are low, it matters to block until the build machine is
actually deleted. Otherwise target provisionning may fail because the
they exceed the quota. For instance the default on OVH is to have 32
cores and the build machine uses 16. The packages-repository machine
uses two, the teuthology cluster uses one and that leaves only 13 cores
for the targets which may be too low when running jobs that require
large instances.
Signed-off-by: Loic Dachary <loic@dachary.org>
Use named error codes instead of numbers, and
use the helper fn for getting inode number
instead of doing it by hand.
Signed-off-by: John Spray <john.spray@redhat.com>
The ceph pg scrub ... command isn't really guarranteed to
start a scrub, keep reissuing it until the scrub actually
happens.
Related: #12746
Signed-off-by: Samuel Just <sjust@redhat.com>
Most of the flavor, sha1, tag etc. selection logic as implemented in the
packaging module of teuthology relies on remote hosts. This is complex
to tests and inconvenient because hosts must be provisionned even before
trying to figure out which packages need to be installed.
Using remote hosts is necessary when bare metal targets are used because
teuthology must adapt to the operating system already installed. The
selection logic in the context of dynamically provisionned targets is
simpler because it is defined by the job being run.
The buildpackages is refactored to use only the job configuration to
figure out which packages must be built. It makes it specific to targets
that are dynamically provisionned. It would have to be modified to query
the remote host in the case of bare metal targets.
Signed-off-by: Loic Dachary <loic@dachary.org>
Also removes fio-version option from yaml since its redundant and if required can be specified in
overrides
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
This was previously using a bunch of files and a small
MDCache limit to force things out of cache. It is much
simpler to just drop the journal.
Signed-off-by: John Spray <john.spray@redhat.com>
...specifically that we don't have lingering
MDS sessions after running it. This is testing
that Client::shutdown is doing the right thing
and closing sessions.
Signed-off-by: John Spray <john.spray@redhat.com>
A clone of Ceph is not automagically updated with the tags from the
official Ceph repository. For a pull request based on master, git
describe will use whatever tags existed at the time the clone was made,
unless the author pull them from the official Ceph repository and later
git push --tags them.
The output of git describe is used to name the packages and if the
official tags are not present, the packages will be incorrectly
named. For instance instead of 9.0.3-34 the packages could be named
0.87-8433 because the v0.87 tag is the most recent tag in the
repository. That confuses the install task that will fail with:
'ceph version 0.87 was not installed, found 9.0.3.'
Signed-off-by: Loic Dachary <ldachary@redhat.com>
A quick check that clients refuse to mount
when daemons are laggy, and while we're at it,
that the basics of failover work. It's a trivial
test, but it's nice to have this kind of thing
so that we don't have to wait for weird thrasher
failures if something breaks.
Signed-off-by: John Spray <john.spray@redhat.com>
To get the health warning, first we need to make sure requests are
added to session's completed request list. Then we need to send an
extra request to MDS to trigger the code that generates the warning.
Fixes: #13437
Signed-off-by: Yan, Zheng <zyan@redhat.com>
When running on virtual machines, it may take more than one minute for a
daemon to create the admin socket.
http://tracker.ceph.com/issues/13449Fixes: #13449
Signed-off-by: Loic Dachary <loic@dachary.org>
Prior to v0.80.9, autogen.sh did not get submodules. Copy/paste the
submodule initialization from newer autogen.sh in common.sh so that
v0.80.8 and below can be rebuilt from sources. It does not hurt to
update the submodules twice.
Signed-off-by: Loic Dachary <loic@dachary.org>
os_version is from the remote and will be 7.1.23 for CentOS 7
instead of the expected 7.0 for all 7.* CentOS.
Signed-off-by: Loic Dachary <loic@dachary.org>
It is not enough to look for the first install task. In upgrade tests,
the install.upgrade task requires more packages to be built. In more
complicated tests using sequential and parallel tasks, the actual
install or install.upgrade task may be deeper in the config tree.
Signed-off-by: Loic Dachary <loic@dachary.org>
The install config may have contradicting tag/branch and sha1. When
suite.py prepares the jobs, it always overrides the sha1 with whatever
default is provided on the command line with --distro and what is found
in the gitbuilder. If it turns out that the tag or the branch in the
install config task is about another sha1, it will override anyway.
Instead of obtaining the tag, branch and sha1 directly from the
packaging.GitbuilderProject object, compute them from the returned
uri_reference data member. The uri_reference is used by the install task
to fetch packages in the gitbuilders and this is what buildpackages
needs to build.
Signed-off-by: Loic Dachary <loic@dachary.org>
The config['os_type'] and config['os_version'] are not always set for a given
job (for instance, in the rbd suite). When a suite runs, it relies on
default values, depending on the target Operating System and internal,
hard coded values associating ubuntu to 14.04 etc.
Instead of using config['os_{type,version}'] use the GitbuilderProject
equivalent which is set with the appropriate defaults.
Signed-off-by: Loic Dachary <loic@dachary.org>
test rbd or krbd using fio, can also run io on rbd clones if option is specified in yaml
various options like image-size, rbd format/features, fio io size, readwrite options can be provided in yaml.
check the docstring for exact usage.
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
FuseMount only uses the prefix for finding the 'ceph'
executable, which is in ./ for either cmake or
authtools, not ./src for cmake like other binaries.
Signed-off-by: John Spray <john.spray@redhat.com>
It was trying to get the output file from
a different remote than the one used to
run the journal tool.
Signed-off-by: John Spray <john.spray@redhat.com>
This is to allow running CephFSTestCase tests
against a vstart cluster, for much faster turnaround
during development than running teuthology against
built ceph packages.
Not everything will be runnable this way, but for
certain things like filesystem repair scenarios we
have everything we need within a vstart environment.
Signed-off-by: John Spray <john.spray@redhat.com>
For tests to advertise that they need the client
to be able to trim its cache (i.e. currently that
means requiring run as root)
Signed-off-by: John Spray <john.spray@redhat.com>
A means for test cases to mark particular methods
as long running, so that the vstart runner can skip
them when running for developers.
This is not a scientific thing, anything that takes
more than about 2 minutes due to lots of iteration
or sleeps.
Signed-off-by: John Spray <john.spray@redhat.com>
In teuthology this isn't needed because we join the
mds child processes after killing them. In vstart
we're killing them asynchronously, so be a bit more
careful to ensure they can't re-insert themselves
to the mdsmap between our calling fail and our calling
fs rm.
Signed-off-by: John Spray <john.spray@redhat.com>
...into the part that requires a network-isolated
client and the part that doesn't.
This happens to also be the part that won't work with
vstart vs. the part that will. teuthology yaml will
still pick up and run both parts.
Signed-off-by: John Spray <john.spray@redhat.com>
* Instead of creating files in background, create
them in foreground (simpler).
* Instead of creating max_request*2 files, just create
max_requests plus a litle bit.
* Set max_requests to 1000 instead of 5000 to run a bit
faster.
Signed-off-by: John Spray <john.spray@redhat.com>
We weren't waiting for export dir to complete (the asok
just starts the process). This wasn't noticeable when running
remotely due to latency between the test runner and the MDS,
but it shows up when running against a local vstart cluster.
Signed-off-by: John Spray <john.spray@redhat.com>
I am seeing a strange thing where it seems like sometimes
a ls of /sys/fs/fuse/connections is returning empty when
connections do exist. It is pretty easy to make this
a non-issue by waiting for "more conns than we started with"
instead of "list of conns is different", so do that.
Signed-off-by: John Spray <john.spray@redhat.com>
Previously failure to stat mnt dir was interpreted
as being unmounted. For "transport endpoint no connected"
error we do want to recognise that it is mounted, albeit
with no ceph-fuse process.
Signed-off-by: John Spray <john.spray@redhat.com>