Simulate the cases where the activation (via udev running trigger)
sequences are:
* journal then lockbox
* data then lockbox
* lockbox
All of them must end with the OSD verified to be up.
Signed-off-by: Loic Dachary <loic@dachary.org>
Instead of storing the dmcrypt keys in the /etc/ceph/dmcrypt-keys
directory, they are stored in the monitor. If a machine with
OSDs created with ceph-disk prepare --dmcrypt is lost, it does
not contain the key needed to decrypt their content.
The dmcrypt key is retrieved from the monitor using a different keyring
for each OSD. It is stored in a small partition called the lockbox. At
boot time the lockbox is mounted at
/var/lib/ceph/osd-lockbox/$uuid
and used when the $uuid partition is detected by udev to map it with
cryptsetup.
The OSDs that were prepared prior to the lockbox implementation are
supported by looking up the key found in /etc/ceph/dmcrypt-keys before
looking in /var/lib/ceph/osd-lockbox/$uuid.
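The lookup order described above can be sketched as follows. This is a minimal illustration, not the ceph-disk implementation: the function name find_key_source and the assumption that each location holds an entry named after the partition uuid are hypothetical. Only the two directory paths come from the text.

```python
import os


def find_key_source(uuid,
                    legacy_dir="/etc/ceph/dmcrypt-keys",
                    lockbox_dir="/var/lib/ceph/osd-lockbox"):
    """Return (kind, path) for the first location that knows about uuid.

    Hypothetical layout: one entry named after the uuid in each directory.
    The legacy directory is checked first so that OSDs prepared before the
    lockbox existed keep working.
    """
    legacy = os.path.join(legacy_dir, uuid)
    if os.path.exists(legacy):
        return ("legacy", legacy)
    lockbox = os.path.join(lockbox_dir, uuid)
    if os.path.exists(lockbox):
        return ("lockbox", lockbox)
    # Neither location has anything for this partition.
    return (None, None)
```

Returning the kind of source alongside the path lets a caller decide whether the key can be read directly (legacy) or has to be fetched from the monitor with the keyring kept in the lockbox.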
http://tracker.ceph.com/issues/14669 Fixes: #14669
Signed-off-by: Loic Dachary <loic@dachary.org>
"ceph --watch-debug" and "ceph tell mon.foo version" could connect
to different monitors, and there is a chance that "ceph --watch-debug"
is not connected yet when "ceph tell" completes, and hence the former
fails to collect the cluster log including the "ceph tell" related
message. this renders test_mon_tell() unreliable. so, in
ceph_watch_start(), we should wait until the "ceph" cli connects to the
monitor and receives messages from it.
Fixes: #14910
Signed-off-by: Kefu Chai <kchai@redhat.com>
While running make check on a btrfs system, many subvolumes are left behind at
the end of the build. It is common to end up with several hundred of them.
btrfs is sensitive to the path used when requesting a subvolume removal.
The current code used the wrong path and did not delete the remaining
subvolumes.
This patch lists the current subvolumes, filters those created by the
test process, and adjusts the path, because btrfs reports
erwan/chroot/ceph/src/testdir/test-7202/dev/osd1/snap_439
whereas, relative to the current working directory, we want to delete:
testdir/test-7202/dev/osd1/snap_439
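The path adjustment can be illustrated with a short sketch. The function name and the choice of "testdir/" as the marker are assumptions made for the example; the patch itself may filter and trim differently.

```python
def path_relative_to_cwd(reported, marker="testdir/"):
    """Drop everything before the first occurrence of marker.

    Returns None when the marker is absent, which is taken here to mean
    the subvolume was not created by the test process and must be left
    alone.
    """
    index = reported.find(marker)
    if index < 0:
        return None
    return reported[index:]
```

Applied to the path reported by btrfs in the example above, this yields the testdir/test-7202/dev/osd1/snap_439 form that can be deleted from the current working directory.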
Signed-off-by: Erwan Velu <erwan@redhat.com>
client: add option to control how directory size is calculated
This lets you disable rstats if your workload is unhappy about directories
changing size (eg, tar of recently-moved/created/untarred files).
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
http://tracker.ceph.com/issues/13422 made it so ceph-osd won't start
unless the pidfile can be created successfully. The default location
being the current directory, ceph-osd must explicitly be told to write
in a directory where it has write permissions.
Signed-off-by: Loic Dachary <loic@dachary.org>
Only support the block file for now. It is handled the same as the
journal, only with a different name (block) and its own set of ptypes
depending on multipath or dmcrypt.
Signed-off-by: Loic Dachary <loic@dachary.org>
Refactor the test / virtualenv setup in the same way it was done for
ceph-detect-init.
All shell tests use ceph-helpers.sh which is modified to add ceph-disk /
ceph-detect-init virtualenv/bin to the PATH to ensure the source version
is used even if ceph is installed.
See "ceph-detect-init: make all must setup.py install"
Signed-off-by: Loic Dachary <loic@dachary.org>
This patch does cleanup for option "filestore_xattr_use_omap",
as this option was removed in #7408.
Fixes: #14356
Signed-off-by: Vikhyat Umrao <vumrao@redhat.com>
- cleanup on a test failure;
- minimize interference with other processes (tests) that are
run concurrently;
- use xmlstarlet when parsing rbd output;
- add exit status test.
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
If the journal is not unmapped, ceph-disk destroy will fail to zap the
corresponding devices because it is still held by devicemapper.
A consequence of this modification is that
ceph-disk activate --dmcrypt --reactivate
no longer works from the command line, because it does not map the
dmcrypted journal. The --reactivate option is added to activate-journal
which will map both the journal and the data devices, if necessary.
http://tracker.ceph.com/issues/14233 Fixes: #14233
Signed-off-by: Loic Dachary <loic@dachary.org>
* not all config items are tracked, so changing an untracked one does not
take effect even after we successfully changed it using
"ceph tell <daemon> injectargs --foo-bar 15", as shown by the command output:
$daemon: foo_bar = '15'
if foo-bar happens to be one not tracked by any component in <daemon>.
in this fix, the message
$daemon: foo_bar = '15' (unchangeable)
is returned instead. nevertheless, the config is still updated, as
"ceph daemon <daemon> config show | grep foo_bar" shows:
"foo_bar": "15"
this helps the user understand that the setting is not dynamically
changeable.
* update the test accordingly
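The message selection described above can be sketched as follows. This is an illustration only: the function name and the tracked_options set are hypothetical, and the real change lives in the daemon's config code rather than in Python.

```python
def injectargs_message(daemon, name, value, tracked_options):
    """Build the line reported for an injected option.

    tracked_options is a hypothetical set holding the names of options
    that some component of the daemon observes at runtime.
    """
    line = "{}: {} = '{}'".format(daemon, name, value)
    if name not in tracked_options:
        # The value is stored, but nothing reacts to it until restart.
        line += " (unchangeable)"
    return line
```

For example, injectargs_message("osd.0", "foo_bar", 15, set()) produces the "(unchangeable)" form, while passing {"foo_bar"} as the tracked set produces the plain form.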
Fixes: #11692
Signed-off-by: Kefu Chai <kchai@redhat.com>
This is just like 'blacklist rm' except it removes
everything. Useful if you've got a whole bunch of
things in your blacklist and you don't want to wait
for N "blacklist rm" commands to run.
Signed-off-by: John Spray <john.spray@redhat.com>
Add the pool name for a given rbd image when executing rbd admin socket
commands, in case more than one image has the same name in
different pools.
Signed-off-by: Xiangwei Wu <wuxiangwei@h3c.com>
Extend the librados remove interface with a flags parameter, and use
this extended interface to implement forced removal when the cluster is full.
Signed-off-by: Xiaowei Chen <chen.xiaowei@h3c.com>
When an OSD id is removed via ceph osd rm, it will be reused by the next
ceph osd create command. Verify that an OSD reusing such an id
successfully comes up.
http://tracker.ceph.com/issues/13988 Refs: #13988
Signed-off-by: Loic Dachary <loic@dachary.org>
When called to tear down a test, kill_daemons should use KILL to ensure
all leftovers are removed as quickly as possible to leave a clean state
for the next test. However, when kill_daemons is called to shut down a
given daemon from within a test, it should use TERM by default so the
daemon has time to notify the MON that it goes down. For instance, if
KILLing an OSD, the mon will still report it as being up although the
calling function probably expects that it will be marked out.
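The choice of signal can be summarized in a small sketch. The real helper is a shell function; the Python function below, its name, and its parameters are hypothetical and only illustrate the default described above.

```python
import signal


def pick_signal(teardown, requested=None):
    """Choose the signal used to stop a daemon.

    An explicitly requested signal always wins. Otherwise use KILL when
    tearing down a test (fast cleanup) and TERM inside a test, so the
    daemon has time to tell the monitor it is going down.
    """
    if requested is not None:
        return requested
    return signal.SIGKILL if teardown else signal.SIGTERM
```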
Signed-off-by: Loic Dachary <loic@dachary.org>
In a test environment, consistency is more important than
performance. Effectively disable the test that would postpone a scrub
depending on the load average. It is assumed that a machine with a load
average higher than 2000 won't be useable anyway.
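Why a threshold of 2000 effectively disables the check can be seen in a minimal sketch. It assumes the check compares the one-minute load average against a configured threshold; the exact comparison used by the OSD is not described here, and the function name is hypothetical.

```python
import os


def scrub_allowed(load_threshold, loadavg=None):
    """Return True when the load is low enough to schedule a scrub.

    loadavg defaults to the current one-minute load average; it can be
    passed explicitly to make the function deterministic.
    """
    if loadavg is None:
        loadavg = os.getloadavg()[0]
    return loadavg < load_threshold
```

With a threshold of 2000, any load average a working machine can realistically reach passes the check, so the scrub is never postponed for that reason.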
http://tracker.ceph.com/issues/14027 Refs: #14027
Signed-off-by: Loic Dachary <loic@dachary.org>
If /tmp/obj1 happened to exist already, and was not writable by the
testing user, then this test failed!
Signed-off-by: Robin H. Johnson <robin.johnson@dreamhost.com>
xfstests generic/247 exercises XFS DIO and AIO to detect races. Some
races apparently exist in the XFS code as this is a known issue within
XFS. Expunge this test because it is not specifically relevant to krbd
and not a specific krbd issue.
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
When testing auto scrub, waiting 20 seconds for the scrub to complete is
sometimes not enough and creates false negatives.
Split wait_for_scrub out of the repair helper so that it can be used to
wait for the scrub to happen instead of using a timer.
The scrub timestamp is obtained after removing the object, therefore
there is a chance for the scrub to be finished already. But since auto
scrub is scheduled every 5 seconds, it will only make the test wait an
extra 5 seconds and not hang forever.
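Waiting on the scrub timestamp instead of a fixed timer can be sketched as below. The real wait_for_scrub helper is a shell function; the signature used here, including the injectable getter and sleep function, is an assumption made so the example is self-contained.

```python
import time


def wait_for_scrub(get_last_scrub_stamp, previous_stamp,
                   timeout=300, delay=1, sleep=time.sleep):
    """Poll until the scrub stamp moves past previous_stamp.

    get_last_scrub_stamp is a hypothetical callable returning the current
    stamp as a comparable value. Returns False if the timeout expires.
    """
    waited = 0
    while waited < timeout:
        if get_last_scrub_stamp() > previous_stamp:
            return True
        sleep(delay)
        waited += delay
    return False
```

If the scrub already finished before previous_stamp was read, the loop simply waits for the next scheduled scrub, which matches the extra delay mentioned above.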
http://tracker.ceph.com/issues/13592
Signed-off-by: Xinze Chi <xinze@xsky.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
ceph osd pool set $POOL scrub_min_interval N
ceph osd pool set $POOL scrub_max_interval N
ceph osd pool set $POOL deep_scrub_interval N
If N > 0, this value is used for the pool instead of
the corresponding global parameter from the config
(osd_scrub_min_interval, osd_scrub_max_interval or
osd_deep_scrub_interval).
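The override rule amounts to the following sketch (the function name is hypothetical):

```python
def effective_interval(pool_value, global_value):
    """Use the per-pool value when it is set (N > 0), else the global one."""
    return pool_value if pool_value > 0 else global_value
```

A pool value of 0 therefore means "not set", and the global osd_* parameter applies.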
Fixes: #13077
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
Update the PLUGINS variable that was no longer used. Add the TECHNIQUES
variable to control which techniques are compared.
Signed-off-by: Loic Dachary <loic@dachary.org>
It is used instead of the obsoleted --parameter directory= to specify
the location of the erasure code directory plugins.
Signed-off-by: Loic Dachary <loic@dachary.org>
- use the deactivate/destroy feature to destroy the osd
- test the reactivate option when the osd is deactivated
- add check_osd_status to check osd status when osd goes up
Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
test:
see test.sh:test_mon_caps
before the change:
the first run of
../qa/workunits/cephtool/test.sh -t mon_caps --asok-does-not-need-root
gets stuck.
after the change:
the same command returns Permission denied.
Signed-off-by: Xiaowei Chen <chen.xiaowei@h3c.com>
The rbd merge-diff tool does not support fancy striped
image exports. Corrected the test to reflect this fact.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
It is good for src/test/test_rados_tool.sh to be run by
rados/singleton/all/radostool.yaml because it contains a lot more tests
than qa/workunits/rados/test_rados_tool.sh
http://tracker.ceph.com/issues/13691 Fixes: #13691
Signed-off-by: Loic Dachary <ldachary@redhat.com>
When copy/pasting a test, it is easy to forget (or not know) that the
port used must be unique to allow for multiple tests to run in
parallel (make -j8). Add a reminder next to each port.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
Instead of using augtool to modify the configuration file, use
configobj. It is also used by the install teuthology task. The .ini
lens (puppet lens really) is unable to read ini files created by
configobj.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
The ceph-disk workunit deploys keys that are not deployed by default by
the ceph teuthology task.
The OSDs created by the ceph task are removed from the default
bucket (via osd rm) so they do not interfere with the tests.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
RHEL7 derivatives were failing test 002 since they were using
legacy test cases for now unsupported OSes.
Fixes: #13483
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
This test required root in order to copy its built
binary into /usr (presumably to avoid rebuilding it).
That's not really a good thing anyway because there's
no guarantee that a binary in that path is the binary
we wanted, so just run the thing straight out of /tmp. The
build is really quick anyway.
Signed-off-by: John Spray <john.spray@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
This reverts commit 30810da4b5.
After some discussion we have decided it is better to build a generic
dictionary in pg_pool_t to store infrequently used per-pool properties.
Signed-off-by: Sage Weil <sage@redhat.com>
If installed on Ubuntu where multipath does not activate properly, it
interferes with the other tests.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
After preparing an OSD, wait for the corresponding OSD to be up
according to ceph osd dump before asserting the devices are in the
expected state. Otherwise the test races with ceph-disk activate which
is run asynchronously via udev / upstart / system.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
It turns out it was not CentOS 7 specific. There is no excuse to skip
the tests anymore.
http://tracker.ceph.com/issues/12787 Refs: #12787
Signed-off-by: Loic Dachary <ldachary@redhat.com>
ceph osd pool set $POOL scrub_min_interval N
ceph osd pool set $POOL scrub_max_interval N
ceph osd pool set $POOL deep_scrub_interval N
If N > 0, this value is used for the pool instead of
the corresponding global parameter from the config
(osd_scrub_min_interval, osd_scrub_max_interval or
osd_deep_scrub_interval).
Fixes: #13077
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
This can race with an actual mdsmap epoch update for some other
reason. We just need to make sure the epoch *increased*, not that
it is exactly old + 1.
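The relaxed check can be written as a one-line sketch (the function name is hypothetical):

```python
def epoch_advanced(old_epoch, new_epoch):
    """True when the epoch moved forward, by one step or by several."""
    return new_epoch > old_epoch
```

A strict new_epoch == old_epoch + 1 comparison would fail whenever an unrelated update bumps the epoch a second time, which is the race described above.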
Fixes: #12991
Signed-off-by: Sage Weil <sage@redhat.com>
* Get rid of the cryptsetup calls that are redundant with what ceph
prepare already does
* Do not use the --dmcrypt-key-dir option. This is less coverage but it
interferes with the udev logic and is expected to be refactored soon.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
This new ceph-disk workunit re-implements the tests that previously were
in the src/test/ceph-disk.sh src/test/ceph-disk-root.sh scripts and is
meant to run in a virtual machine instead of docker.
Signed-off-by: Loic Dachary <ldachary@redhat.com>
Ignore the profile 'directory' field.
This ensures that we can always find plugins even when the cluster
is installed across a mix of distros.
Rename the option to have no osd_ (or mon_) prefix since anybody
may use the ec factory/plugin code.
We still hard-code .libs in the unit tests... sigh.
Signed-off-by: Sage Weil <sage@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Conflicts:
src/include/ceph_features.h
src/osd/ReplicatedPG.cc
src/osd/ReplicatedPG.h
When an object is first created, it is proxied to the base tier, so the
behavior of the test_tiering test case needs to change accordingly.
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
On some test machines, /usr/lib/ltp/testcases/bin/fsstress is a
dangling symlink. 'cp -f' is ineffective in this case.
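One way to handle a dangling symlink at the destination is to remove it before copying. The sketch below illustrates that idea in Python; it is not the fix applied by this commit, and the function name is hypothetical.

```python
import os
import shutil


def replace_file(src, dst):
    """Copy src to dst, replacing dst even if it is a dangling symlink.

    os.path.exists() follows symlinks and reports False for a dangling
    one, so lexists() is used to detect the link itself. Only meant for
    destinations that are files or symlinks, not directories.
    """
    if os.path.lexists(dst):
        os.remove(dst)
    shutil.copy(src, dst)
```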
Fixes: #12710
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Verify that an object promoted to a cache tier because of a proxy read
is evicted as expected.
http://tracker.ceph.com/issues/12673 Refs: #12673
Signed-off-by: Loic Dachary <ldachary@redhat.com>
the problem breaks `test_mon_deprecated_commands` on ubuntu precise:
with the python shipped with ubuntu precise, errno.errorcode[95]
evaluates to `EOPNOTSUPP` rather than `ENOTSUP`, but these two errnos
are equal in glibc.
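Comparing errno values numerically rather than by name avoids depending on which name errno.errorcode happens to return. A minimal sketch, with a hypothetical helper name:

```python
import errno


def same_errno(name_a, name_b):
    """Compare two errno names by their numeric value on this platform."""
    return getattr(errno, name_a) == getattr(errno, name_b)
```

On a platform where ENOTSUP and EOPNOTSUPP share a value, same_errno("ENOTSUP", "EOPNOTSUPP") is true regardless of which of the two names errno.errorcode maps that value to.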
Signed-off-by: Kefu Chai <kchai@redhat.com>
'ceph mon_metadata' was added during this dev cycle, so there is
no need to deprecate it first.
Fixes: #11545
Signed-off-by: Joao Eduardo Luis <joao@suse.de>
We need to be able to allow the version of ceph_test_* from earlier
versions of ceph to continue to work. This patch also adjusts the
work unit to use a single rados snap to test the condition without
--force-nonempty to ensure that we don't need to be careful about
the config value when running that script.
Signed-off-by: Samuel Just <sjust@redhat.com>