- cleanup on a test failure;
- minimize interference with other processes (tests) that are
run concurrently;
- use xmlstarlet when parsing rbd output;
- add exit status test.
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
If the journal is not unmapped, ceph-disk destroy will fail to zap the
corresponding devices because they are still held by devicemapper.
A consequence of this modification is that
ceph-disk activate --dmcrypt --reactivate
no longer works from the command line, because it does not map the
dmcrypted journal. The --reactivate option is added to activate-journal
which will map both the journal and the data devices, if necessary.
http://tracker.ceph.com/issues/14233 Fixes: #14233
Signed-off-by: Loic Dachary <loic@dachary.org>
* not all config items are tracked, so a change made with
"ceph tell <daemon> injectargs --foo-bar 15" does not take effect if
foo-bar is not tracked by any component in <daemon>, even though the
command output suggests that it succeeded:
$daemon: foo_bar = '15'
with this fix, the message
$daemon: foo_bar = '15' (unchangeable)
is returned instead. nevertheless, the config is still updated, as
"ceph daemon <daemon> config show | grep foo_bar" shows:
"foo_bar": "15"
this helps users understand that the setting is not dynamically
changeable.
* update the test accordingly
Fixes: #11692
Signed-off-by: Kefu Chai <kchai@redhat.com>
This is just like 'blacklist rm' except it removes
everything. Useful if you've got a whole bunch of
things in your blacklist and you don't want to wait
for N "blacklist rm" commands to run.
Signed-off-by: John Spray <john.spray@redhat.com>
Add the pool name for a given rbd image when executing rbd admin socket
commands in case there is more than one image with the same name in
different pools.
Signed-off-by: Xiangwei Wu <wuxiangwei@h3c.com>
Extend the librados remove interface with a flags parameter, and use
the extended interface to implement forced removal when the cluster is
full.
Signed-off-by: Xiaowei Chen <chen.xiaowei@h3c.com>
When an OSD id is removed via ceph osd rm, it will be reused by the next
ceph osd create command. Verify that an OSD reusing such an id
successfully comes up.
http://tracker.ceph.com/issues/13988 Refs: #13988
Signed-off-by: Loic Dachary <loic@dachary.org>
When called to tear down a test, kill_daemon should use KILL to ensure
all leftovers are removed as quickly as possible to leave a clean state
for the next test. However, when kill_daemons is called to shut down a
given daemon from within a test, it should use TERM by default so the
daemon has time to notify the MON that it is going down. For instance,
when an OSD is KILLed, the mon will still report it as being up although
the calling function probably expects that it will be marked out.
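The signal choice described above can be sketched as a default
argument. This is a minimal illustration only: stop_pid is a
hypothetical name and not the ceph-helpers.sh implementation.

```shell
# Hypothetical helper for illustration; not the ceph-helpers.sh code.
# Send a signal to a process, defaulting to TERM so that the process
# can shut down gracefully. Teardown code can pass KILL explicitly.
stop_pid() {
    local pid="$1"
    local signal="${2:-TERM}"
    kill -s "$signal" "$pid"
}
```

Under these assumptions a test would call stop_pid "$pid", while
teardown would call stop_pid "$pid" KILL.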
Signed-off-by: Loic Dachary <loic@dachary.org>
In a test environment, consistency is more important than
performance. Effectively disable the test that would postpone a scrub
depending on the load average. It is assumed that a machine with a load
average higher than 2000 won't be usable anyway.
http://tracker.ceph.com/issues/14027 Refs: #14027
Signed-off-by: Loic Dachary <loic@dachary.org>
If /tmp/obj1 happened to exist already, and was not writable by the
testing user, then this test failed!
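The message does not say how the test was changed. One common way to
avoid this kind of collision, shown here only as an assumption, is to
let mktemp pick a unique file name (make_scratch_file is a hypothetical
name):

```shell
# Hypothetical helper: create a scratch file with a unique name so that
# a stale, unwritable /tmp/obj1 left by another user cannot break the
# test. Prints the path of the new file.
make_scratch_file() {
    mktemp "${TMPDIR:-/tmp}/obj1.XXXXXX"
}
```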
Signed-off-by: Robin H. Johnson <robin.johnson@dreamhost.com>
When testing auto scrub, waiting 20 seconds for the scrub to complete is
sometimes not enough and creates false negatives.
Split wait_for_scrub out of the repair helper so that it can be used to
wait for the scrub to happen instead of using a timer.
The scrub timestamp is obtained after removing the object, therefore
there is a chance for the scrub to be finished already. But since auto
scrub is scheduled every 5 seconds, it will only make the test wait an
extra 5 seconds and not hang forever.
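The general idea of polling for a condition instead of sleeping for a
fixed time can be sketched as follows. wait_until is a hypothetical
generic helper; the wait_for_scrub helper mentioned above relies on the
scrub timestamp and is not reproduced here.

```shell
# Hypothetical generic helper, for illustration only.
# Run a command once per second until it succeeds or until the given
# number of attempts is exhausted. Returns 0 on success, 1 on timeout.
wait_until() {
    local timeout="$1"
    shift
    local elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if "$@"; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```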
http://tracker.ceph.com/issues/13592
Signed-off-by: Xinze Chi <xinze@xsky.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
ceph osd pool set $POOL scrub_min_interval N
ceph osd pool set $POOL scrub_max_interval N
ceph osd pool set $POOL deep_scrub_interval N
If N > 0, this value is used for the pool instead of
the corresponding global parameter from the config
(osd_scrub_min_interval, osd_scrub_max_interval or
osd_deep_scrub_interval).
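The override rule can be sketched as below. effective_interval is a
hypothetical name, and the values are treated as integers purely for
illustration.

```shell
# Hypothetical illustration of the rule above, using integer values.
# Print the per-pool value when it is greater than 0, otherwise fall
# back to the global config value.
effective_interval() {
    local pool_value="$1"
    local global_value="$2"
    if [ "$pool_value" -gt 0 ]; then
        echo "$pool_value"
    else
        echo "$global_value"
    fi
}
```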
Fixes: #13077
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
Update the PLUGINS variable that was no longer used. Add the TECHNIQUES
variable to control which techniques are compared.
Signed-off-by: Loic Dachary <loic@dachary.org>
It is used instead of the obsoleted --parameter directory= to specify
the location of the erasure code plugins directory.
Signed-off-by: Loic Dachary <loic@dachary.org>
- use the deactivate/destroy feature to destroy the osd
- test the reactivate option when the osd is deactivated
- add check_osd_status to check the osd status when the osd goes up
Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
test:
see test.sh:test_mon_caps
before the change:
running ../qa/workunits/cephtool/test.sh -t mon_caps --asok-does-not-need-root for the first time hangs.
after the change:
running it again returns Permission denied.
Signed-off-by: Xiaowei Chen <chen.xiaowei@h3c.com>
The rbd merge-diff tool does not support fancy striped
image exports. Corrected the test to reflect this fact.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
It is good for src/test/test_rados_tool.sh to be run by
rados/singleton/all/radostool.yaml because it contains a lot more tests
than qa/workunits/rados/test_rados_tool.sh
http://tracker.ceph.com/issues/13691 Fixes: #13691
Signed-off-by: Loic Dachary <ldachary@redhat.com>
When copy/pasting a test, it is easy to forget (or not know) that the
port used must be unique to allow for multiple tests to run in
parallel (make -j8). Add a reminder next to each port.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
Instead of using augtool to modify the configuration file, use
configobj. It is also used by the install teuthology task. The .ini
lens (puppet lens really) is unable to read ini files created by
configobj.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
The ceph-disk workunit deploys keys that are not deployed by default by
the ceph teuthology task.
The OSDs created by the ceph task are removed from the default
bucket (via osd rm) so they do not interfere with the tests.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
RHEL7 derivatives were failing test 002 since they were using
legacy test cases for now-unsupported OSes.
Fixes: #13483
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
This test required root in order to copy its built
binary into /usr (presumably to avoid rebuilding it).
That's not really a good thing anyway because there's
no guarantee that a binary in that path is the binary
we wanted, so just run the thing straight out of /tmp. The
build is really quick anyway.
Signed-off-by: John Spray <john.spray@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
This reverts commit 30810da4b5.
After some discussion we have decided it is better to build a generic
dictionary in pg_pool_t to store infrequently used per-pool properties.
Signed-off-by: Sage Weil <sage@redhat.com>
If installed on Ubuntu where multipath does not activate properly, it
interferes with the other tests.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
After preparing an OSD, wait for the corresponding OSD to be up
according to ceph osd dump before asserting the devices are in the
expected state. Otherwise the test races with ceph-disk activate which
is run asynchronously via udev / upstart / systemd.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
It turns out it was not CentOS 7 specific. There is no excuse to skip
the tests anymore.
http://tracker.ceph.com/issues/12787 Refs: #12787
Signed-off-by: Loic Dachary <ldachary@redhat.com>
This can race with an actual mdsmap epoch update for some other
reason. We just need to make sure the epoch *increased*, not that
it is exactly old + 1.
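The relaxed check can be sketched as follows. epoch_increased is a
hypothetical name, and how the two epochs are obtained is left out.

```shell
# Hypothetical illustration: succeed when the new epoch is strictly
# greater than the old one, instead of requiring new == old + 1.
epoch_increased() {
    local old="$1"
    local new="$2"
    [ "$new" -gt "$old" ]
}
```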
Fixes: #12991
Signed-off-by: Sage Weil <sage@redhat.com>
* Get rid of the cryptsetup calls that are redundant with what ceph
prepare already does
* Do not use the --dmcrypt-key-dir option. This is less coverage but it
interferes with the udev logic and is expected to be refactored soon.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
This new ceph-disk workunit re-implements the tests that previously were
in the src/test/ceph-disk.sh and src/test/ceph-disk-root.sh scripts and
is meant to run in a virtual machine instead of docker.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
Ignore the profile 'directory' field.
This ensures that we can always find plugins even when the cluster
is installed across a mix of distros.
Rename the option to have no osd_ (or mon_) prefix since anybody
may use the ec factory/plugin code.
We still hard-code .libs in the unit tests... sigh.
Signed-off-by: Sage Weil <sage@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Conflicts:
src/include/ceph_features.h
src/osd/ReplicatedPG.cc
src/osd/ReplicatedPG.h
When an object is first created, it is proxied to the base tier, so the
behavior of the test_tiering test case needs to change accordingly.
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
On some test machines, /usr/lib/ltp/testcases/bin/fsstress is a
dangling symlink. 'cp -f' does not help in this case.
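The message does not spell out the fix. One possible approach, shown
here as an assumption, is to remove the destination before copying so
that a dangling symlink is never written through (replace_file is a
hypothetical name):

```shell
# Hypothetical helper: replace a destination path with a copy of the
# source, even when the destination is a dangling symlink.
replace_file() {
    local src="$1"
    local dst="$2"
    # Remove the destination itself (the symlink, not its target).
    rm -f -- "$dst"
    cp -- "$src" "$dst"
}
```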
Fixes: #12710
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Verify that an object promoted to a cache tier because of a proxy read
is evicted as expected.
http://tracker.ceph.com/issues/12673 Refs: #12673
Signed-off-by: Loic Dachary <ldachary@redhat.com>
the problem breaks `test_mon_deprecated_commands` on ubuntu precise.
with the python shipped with ubuntu precise, errno.errorcode[95]
evaluates to `EOPNOTSUPP` and not `ENOTSUP`, but these two errnos
are equal in glibc.
Signed-off-by: Kefu Chai <kchai@redhat.com>
'ceph mon_metadata' was added during this dev cycle, so there is
no need to deprecate it first.
Fixes: #11545
Signed-off-by: Joao Eduardo Luis <joao@suse.de>
We need to be able to allow the version of ceph_test_* from earlier
versions of ceph to continue to work. This patch also adjusts the
work unit to use a single rados snap to test the condition without
--force-nonempty to ensure that we don't need to be careful about
the config value when running that script.
Signed-off-by: Samuel Just <sjust@redhat.com>
Modify the test traces to include the file name in addition to the
function name and line number. It makes it easier to locate the faulty line
without going back to the test name.
Format the trace lines to be emacs friendly (filename:lineno) so that
C-x ` or C-c C-c jumps to the right file and the right line when running
the test with M-x compile.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
When a test fails, the script returns immediately and the kill_daemon
function is called to clean up. It is quite verbose and requires
scrolling hundreds of lines back to find the actual error
message. Turn off the shell trace to reduce the verbosity and improve
error output readability.
The kill_daemon cannot just turn off set -x because it may be called by
a test, not just at the end of the run. Instead the kill_daemon function
checks if tracing is activated and temporarily disables it.
Also get rid of the find standard error that commonly happens when
kill_daemon is called to verify there are no leftovers and the test
directory does not exist.
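The check-and-restore logic for tracing can be sketched as below.
run_untraced is a hypothetical wrapper used for illustration and is not
the code added to kill_daemon.

```shell
# Hypothetical wrapper, for illustration only.
# Run a command with shell tracing (set -x) temporarily disabled, then
# restore tracing if it was active. Returns the command's exit status.
run_untraced() {
    local trace_was_on=false
    # "$-" lists the active shell options; "x" means tracing is on.
    case "$-" in
        *x*)
            trace_was_on=true
            set +x
            ;;
    esac
    local status=0
    "$@" || status=$?
    if $trace_was_on; then
        set -x
    fi
    return "$status"
}
```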
Signed-off-by: Loic Dachary <ldachary@redhat.com>
osd-class-dir was not set when activating osds in the test environment,
leading to failures with an 'operation not supported' message when
trying to lock objects.
Signed-off-by: Sebastien Ponce <sebastien.ponce@cern.ch>
Wip writeback throttling for cache tiering
This patch implements writeback throttling for cache tiering, similar
to what the Linux kernel does for page cache writeback. A parameter
'cache_target_dirty_high_ratio' (default 0.6) is introduced as the
high speed flushing threshold, while 'cache_target_dirty_ratio'
(default 0.4) is kept to represent the low speed threshold. The flush
speed is controlled by limiting the parallelism of flushing. The
maximum parallelism under low speed is half of the parallelism under
high speed. If there is at least one PG whose dirty ratio is beyond the
high threshold, full speed mode is entered; if no PG has a dirty ratio
beyond the low threshold, idle mode is entered; in all other cases,
slow speed mode is entered.
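The mode selection can be sketched as follows. flush_mode is a
hypothetical name, and the ratios are expressed as integer percentages
(60 and 40 for the defaults) only to keep the shell arithmetic simple;
this is not the OSD implementation.

```shell
# Hypothetical illustration of the mode selection described above.
# Usage: flush_mode HIGH LOW RATIO...
# Ratios are integer percentages, one per PG.
flush_mode() {
    local high="$1"
    local low="$2"
    shift 2
    local mode=idle
    local ratio
    for ratio in "$@"; do
        # Any PG beyond the high threshold forces full speed.
        if [ "$ratio" -gt "$high" ]; then
            echo full
            return 0
        fi
        # A PG beyond the low threshold rules out idle mode.
        if [ "$ratio" -gt "$low" ]; then
            mode=slow
        fi
    done
    echo "$mode"
}
```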
Signed-off-by: Mingxin Liu <mingxinliu@ubuntukylin.com>
Reviewed-by: Li Wang <liwang@ubuntukylin.com>
Suggested-by: Nick Fisk <nick@fisk.me.uk>
Tested-by: Kefu Chai <kchai@redhat.com>
This should make newer gcc releases happier in their default configuration.
kernel.org is now distributing tarballs as .xz files so we change to that
as well when decompressing (it is supported by Ubuntu Precise so we should
be all good).
Fixes: #11758
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
VirtualBox has some files with weird
permissions in its /usr/lib, which was
tripping up this usually-safe operation
when run as an unprivileged user.
Fixes: #11959
Signed-off-by: John Spray <john.spray@redhat.com>
While we're at it, take only /usr/lib instead of all of /usr
to keep the overall file count more modest.
Fixes: #11807
Signed-off-by: John Spray <john.spray@redhat.com>
fix "pg ls" with states of "recovering" and/or "repair"
Reviewed-by: Joao Eduardo Luis <joao@suse.de>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
1. Creating a filesystem using a
readonly tier on an EC pool (should be forbidden)
2. Removing a tier from a replicated base pool (should
be permitted)
Signed-off-by: John Spray <john.spray@redhat.com>
A get/set command may fail with
Error EBUSY: currently creating pgs, wait
if issued before the PGs are clean. Call wait_for_clean after the pool
is created, or after changing a pool setting that remaps the PGs it
contains (size, pg_num...), to ensure the PGs are clean and the set/get
commands that follow will succeed.
http://tracker.ceph.com/issues/11624 Fixes: #11624
Signed-off-by: Loic Dachary <ldachary@redhat.com>
The semantics and interface of get_pg are the same; this avoids
duplication, and the ceph-helpers.sh version is tested and documented.
Make the ceph-test package dependent on xmlstarlet because it is
needed by ceph-helpers.sh.
Signed-off-by: Loic Dachary <ldachary@redhat.com>