If I have to touch this again I will remove it. Ugh. This time,
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-03-11_02:30:01-rados-firefly-distro-basic-plana/125922
hit NXIO a few lines down because one of the OSDs was still down.
Signed-off-by: Sage Weil <sage@inktank.com>
Added port (fixed value for right now in teuthology) to hostname.
Fixes: 7374
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Signed-off-by: Warren Usui <warren.usui@inktank.com>
(cherry picked from commit 8200b8a025)
- fix the wait check for osds to come back up
- make sure they get marked back in, too
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
'rados cppool' copies the contents but that doesn't make the destination
pool an unmanaged snaps pool. Therefore, we must get an ENOTSUP when
we try to remove an unmanaged snap from a not-unmanaged pool.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
This wreaks havoc on our QA because it marks osds up and down and then
immediately after that we try to scrub and some osds are still down.
Adjust the CLI test to wait for all OSDs to come back up after thrashing.
Signed-off-by: Sage Weil <sage@inktank.com>
Prevent creation of buckets of type 0 ('osd', 'device', etc.), as they
will confusing the mapping algorithm.
Signed-off-by: Sage Weil <sage@inktank.com>
The cache pools will throttle when they reach the target max size, so it
is important to make the administrator aware when they approach that point.
Unfortunately it is not particularly easy to efficiently keep track of
which PGs have hit their limit and use that for reporting. However, it
is easy to raise a flag when we start to approach the target for the
entire pool, and that sort of early warning is arguably more useful
anyway.
Trigger the warning based on the target full ratio. Not when we hit the
target, but when we are 2/3 between it and completely full.
Implements: #7442
Signed-off-by: Sage Weil <sage@inktank.com>
This is a friendlier interface for setting up a cache tier with some
reasonable defaults (defined via config options). This will simplify
the user experience and documentation.
Signed-off-by: Sage Weil <sage@inktank.com>
In general, users should not use non-empty pools as new tiers or else
things can behave strangely:
- the data sets are unrelated behavior will be... strange.
- if the cache pool is not "new" and does not do the OMAP flag, the OSD
will not know not to flush omap objects to an EC base tier
- probably other random stuff I'm forgetting
Allow a user to shoot themselves in the foot with --force-nonempty.
Implements: #7457
Signed-off-by: Sage Weil <sage@inktank.com>
We would like to get the hit set parameters: hit_set_type |
hit_set_period | hit_set_count | hit_set_fpp via OSDMonitor
Signed-off-by: Kai Zhang <zakir.exe@gmail.com>
comment out erasure pool related tests when an OSD is involved because
it does not work yet. See http://tracker.ceph.com/issues/7360.
Signed-off-by: Loic Dachary <loic@dachary.org>
Now ObjectStore support three backend types, so we need to make each backend
share unit test to avoid duplicate codes.
This patch mainly make workload_generator workable for objectstore.
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
On Linux ENOTSUP is remapped
/usr/include/x86_64-linux-gnu/bits/errno.h
/* Linux has no ENOTSUP error code. */
# define ENOTSUP EOPNOTSUPP
and should have different values on other operating systems. On Ubuntu
precise the string returned when translating the error value of ENOTSUP
or EOPNOTSUPP is always EOPNOTSUPP but on Ubuntu saucy it is always
ENOTSUP.
Replace ENOSYS with ENOTSUP because the expected semantic is very
similar and modify the test to not check on the string translation of
the error.
Rework the check_response shell function to optionaly check the return
code. The erasure coded pool size change test verifies the error message
only but not the error code.
Signed-off-by: Loic Dachary <loic@dachary.org>
We had already added this as a flag (set/unset) when I generalized the
'mds set_max_mds' to be 'ceph mds set <var> <val>'. Switch the snaps
flag to be a key/value to with true/false (similar to the hashpspool
pool flag) since there are fewer users and the var/val approach is more
general.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
When run in a shared environment ( as opposed as a machine created for
the purpose of running this test only ), it is important to cleanup
leftovers to avoid poluting the /tmp space. Create a common temporary
directory for all tmp files.
Signed-off-by: Loic Dachary <loic@dachary.org>
First, add the ability to modify max_file_size. While we are at it, move
to a more sensible interface for adjusting max_mds too.
Signed-off-by: Sage Weil <sage@inktank.com>
This will show up on the command line and logs, making it more
clear than EINVAL.
Fixes#6851 and #4045
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
The three rules created by build_simple are identical. They are replaced
by a single rule named replicated_rule which is set to be used by the
data, rbd and metadata pools.
Instead of hardcoding the ruleset number to zero, it is read from
osd_pool_default_crush_ruleset which defaults to zero.
The CEPH_DEFAULT_CRUSH_REPLICATED_RULESET enum is moved from osd_type.h to
config.h because it may be needed when osd_type.h is not included.
Signed-off-by: Loic Dachary <loic@dachary.org>
Creating an erasure pool will crash the OSD because OSD::_make_pg
asserts if the type is not replicated. The tests related to erasure
coded pool creation are removed from qa/workunits/cephtool/test.sh.
The osd-create-pool.sh unit test covers the cases removed from test.sh
more extensively. The intent is to check the interactions with the MON
only, therefore it does not run an OSD and the absence of erasure code
placement group backend implementation is not an issue.
Signed-off-by: Loic Dachary <loic@dachary.org>
When osd create pool is called twice on the same pool, it will succeed
because the pool already exists. However, if a different type is
specified, it must fail.
Signed-off-by: Loic Dachary <loic@dachary.org>
It looked like it worked because the wrapper hide the error. The failing
tests are commented out so that the other tests can be used.
Signed-off-by: Loic Dachary <loic@dachary.org>
Display benchmark results for the default erasure code plugins, in a tab
separated CSV file. The first two column contain the amount of KB
that were coded or decoded, for a given combination of parameters
displayed in the following fields.
seconds KB plugin k m work. iter. size eras.
1.2 10 example 2 1 encode 10 1024 0
0.5 10 example 2 1 decode 10 1024 1
It can be used as input for a human readable report. It is also intented
to be used to show if a given version of an erasure code plugin performs
better than another.
The last column ( not shown above for brievety ) is the exact command
that was run to produce the result so it can be copy / pasted to
reproduce them or to profile.
Only the jerasure techniques mentionned in
https://www.usenix.org/legacy/events/fast09/tech/full_papers/plank/plank_html/
are benchmarked, the others are assumed to be less interesting.
Signed-off-by: Loic Dachary <loic@dachary.org>
crush: make set_chooseleaf_tries work with firstn chooseleaf, too
Reviewed-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Loic Dachary <loic@dachary.com>
instead of assuming the pool size is 2, query it and increment it to
test for pool set data size. It allows to run the test from vstart.sh
without knowing what the required pool size is in advance:
rm -fr dev out ; mkdir -p dev ; \
MON=1 OSD=3 ./vstart.sh -n -X -l mon osd
LC_ALL=C PATH=:$PATH CEPH_CONF=ceph.conf \
../qa/workunits/cephtool/test.sh
Signed-off-by: Loic Dachary <loic@dachary.org>
The file removal installed to be triggered when the script stops must
not fail if the file does not exist.
Signed-off-by: Loic Dachary <loic@dachary.org>
We ran into problems before when we made this a string because a mixed
cluster of mons might forward a client request with the wrong schema.
To make this work, we make the new code understand both the new and
old schema, and also backport a change to emperor and dumpling to
handle the new schema.
For the previous attempt to do this, see:
337195f0462fe0d0d97a
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
feed osd info about os, kernel, memory, arch to the mons
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Loic Dachary <loic@dachary.org>
Basic testing by forcing each monitor out of quorum at a time and making
sure they still reply to ping requests.
Fixes: #6705
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
This test attempts to generate a random number of holes within a
particular range, but may fail because hole placement is random.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Although fsstress was being called with a static path the directory
it was writing to was in the current directory so doing a cd to the
source directory that is made in /tmp and then removing it later
caused it to be unable to write the files in a non-existent dir.
This change gets the current path first and cd's back into it after
it is done compiling fsstress.
Issue #6479.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Alfredo Deza <alfredo.deza@inktank.com>
It's legal to give a CephEntityAddr to osd blacklist with no nonce,
so allow it in the valid() method; also add validation of any nonce
given that it's a long >= 0.
Also fix comment on CephEntityAddr type description in MonCommands.h,
and add tests for invalid nonces (while fixing the existing tests to remove
the () around expect_false args).
Fixes: #6425
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Tests were rearranged upstream; use an old version for the time being
until we can refactor run_xfstests.sh to cope. See #6385
Signed-off-by: Sage Weil <sage@inktank.com>
Test that it works in snaptest-0.sh, and set the flag in
all the snap workunits so they continue to function.
Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Loic Dachary <loic@dachary.org>
The adjust method returns a count of adjusted items.
Add a test.
Fixes: #6382
Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
When using the properties key=value only, the test was inverted
and an attempt to obtain a substring at index string::npos throws
an exception.
Add variations of osd pool create to qa/workunits/mon/pool_ops.sh
to assert the problem has been fixed and all code paths are used.
http://tracker.ceph.com/issues/6357fixes#6357
Signed-off-by: Loic Dachary <loic@dachary.org>
This may need to change since it exploits some of the loose
consistency we currently have with caching pools, but for now
it checks that the Objecter does what we want.
Signed-off-by: Greg Farnum <greg@inktank.com>
Some distro's have a lack of ltp-kernel packages and all we need is
fstress. This just modified the shell script to download/compile
fstress from source and copy it to the right location if it doesn't
currently exist where it is expected. It is a very small/quick
compile and currently only SLES and debian do not have it already.
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Some distro's have a lack of ltp-kernel packages and all we need is
fstress. This just modified the shell script to download/compile
fstress from source and copy it to the right location if it doesn't
currently exist where it is expected. It is a very small/quick
compile and currently only SLES and debian do not have it already.
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
This patch change the fsx.sh to pull better fsx.c from xfstests site
to support hole punching test.
Signed-off-by: Yunchuan Wen <yunchuanwen@ubuntukylin.com>
Signed-off-by: Li Wang <liwang@ubuntukylin.com>
[Also move into a separatate test script; validate result -sage]
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>
requests 0.12.1 handles queryparams in the URL with embedded
spaces; requests 0.8.2 does not. Avoid the issue by quoting the
URL into expect().
Signed-off-by: Dan Mick <dan.mick@inktank.com>
dump_info() got a new field outside the mdsmap section; it's ok for
the overall "report", but not for "mds stat". Add an enclosing section
in "mds stat". Fix test to expect new level.
Fixes: #5718
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
set env var TEST_EXIT_ON_ERROR=0 to obtain all errors instead of exiting
with return 1 on first error found.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
A monc/mon connection fault or the dup command test flag may mean an extra
osd id is created that we isn't actually up; reorder so that doesn't screw
up 'osd ls'.
Signed-off-by: Sage Weil <sage@inktank.com>
Patterned after cephtool/test.sh, with some deeper validation of
output format and contents (because structured output is easier
to validate).
Signed-off-by: Dan Mick <dan.mick@inktank.com>
From: Yan, Zheng <yan.zheng@intel.com>
Simple reproducer for #5453, modified to run for a finite number of
iterations.
Signed-off-by: Sage Weil <sage@inktank.com>
Test case for failure in #5467. Supplying new auth info overwrites.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
We can get random messages to stderror from socket reconnects and such;
discard those if we are looking at stderr in the test.
Signed-off-by: Sage Weil <sage@inktank.com>
If we add an item that already exists in particular position, we should
update instead of inserting it; the CrushWrapper methods are not
idempotent.
Signed-off-by: Sage Weil <sage@inktank.com>
'ceph-conf ...' doesn't give you final/default values, only what is in the
conf file. Use -w output to test this instead.
Fixes: #5327
Signed-off-by: Sage Weil <sage@inktank.com>
Trying to track down this failure:
2013-06-12T06:11:13.430 INFO:teuthology.task.workunit.client.0.err:+ rsync -auv --exclude local/ /usr/ usr.2
2013-06-12T06:11:13.430 INFO:teuthology.task.workunit.client.0.err:+ tee a
2013-06-12T06:11:13.527 INFO:teuthology.task.workunit.client.0.out:sending incremental file list
2013-06-12T06:11:46.206 INFO:teuthology.task.workunit.client.0.out:
2013-06-12T06:11:46.208 INFO:teuthology.task.workunit.client.0.out:sent 1689627 bytes received 8302 bytes 50684.45 bytes/sec
2013-06-12T06:11:46.208 INFO:teuthology.task.workunit.client.0.out:total size is 3274130495 speedup is 1928.31
2013-06-12T06:11:46.209 INFO:teuthology.task.workunit.client.0.err:+ wc -l a
2013-06-12T06:11:46.209 INFO:teuthology.task.workunit.client.0.err:+ grep 4
2013-06-12T06:11:46.211 INFO:teuthology.task.workunit:Stopping misc on client.0...
...and am perplexed!
Signed-off-by: Sage Weil <sage@inktank.com>
The sysfs entries for snapshots went away a while ago, and this
script used them to verify sizes matched what was expected.
Instead, look at the mapped size of the snapshot in the places
that used to look for the image's snapshot sysfs files.
Also, switch over to using "udevadm settle" rather than a delay to
wait for udev to do its thing. Insert them at more appropriate
places--right after "rmd map" commands and before and after the
"rbd unmap" calls.
Stop doing the manual refresh calls as well. The osd will trigger
refreshes whenever the image size or shapshot context changes.
Finally, the cleanup routine is called initially, when there really
isn't expected to be anything to clean up. Change the rbd commands
to run there conditionally, only if the target of the command
already exists.
Signed-off-by: Alex Elder <elder@inktank.com>
There's no guarantee the rbd module is loaded when this script is
run, so add a line that loads it if necessary.
Signed-off-by: Alex Elder <elder@inktank.com>
An rbd clone image can be created with an object order that differs
from that of its parent. This patch adds testing for that in
qa/workunits/rbd/image_read.sh. By default, clone images will be
created with an object size twice as big as that of its parent.
For simplicity, when a clone's object order differs from its parent
the order will be either one more than or one less than that of its
parent image, meaning its object size is either double or half of
the size of objects used in the parent.
Signed-off-by: Alex Elder <elder@inktank.com>
Add testing to verify that a snapshot of a clone and a clone of
that snapshot both produce the correct results when read.
Signed-off-by: Alex Elder <elder@inktank.com>
Move the dd command that touches the last byte in a local file
into create_image() where it belongs (out of fill_original()).
Signed-off-by: Alex Elder <elder@inktank.com>
The function boolean_toggle() in qa/workunits/rbd/image_read.sh is
defined but never used. My intentions were good though. Fix it and
use it for argument parsing.
Change the minimum supported object order so it matches what the
command line interface enforces.
Assign the initial value of TEST_CLONES from the environment if it's
available.
Change defaults to use format 2 and test clones.
Output details about the parameters of the run even if not being
verbose.
Make the order of assignment of argument variables consistent.
And fix a typo unmap_image().
Signed-off-by: Alex Elder <elder@inktank.com>
Since snapshots never change, it's safe to read from replicas for them.
A common use for this would be reading from a parent snapshot shared by
many clones.
Convert LibrbdWriteback and AioRead to use the ObjectOperation api
so we can set flags. Fortunately the external wrapper holds no data,
so its lifecycle doesn't need to be managed.
Include a simple workunit that sets the flags in various combinations
and looks for their presence in the logs from 'rbd export'.
Fixes: #3064
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Some plana have non-world-readable crap in /usr/local/samba. Avoid
/usr/local entirely for that and any similar landmines.
Signed-off-by: Sage Weil <sage@inktank.com>
This uses the old stand-alone qemu-iotests repo so it works with the
version of qemu in Ubuntu 12.04. The tests depend tightly on qemu
version, so to use later tests we'd need to install corresponding
versions of qemu.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
This adds a new test script for validating data reads from a mapped
rbd image is what it's expected to be.
See the content of the file for a bit more explanation.
Signed-off-by: Alex Elder <elder@inktank.com>
unit tests are added in test/filestore/store_test.cc for the
FileStore::_detect_fs method, when using ext4. It tests the following
situations:
* without user_xattr, ext4 fails
* mounted with user_xattr, ext4 fails if filestore_xattr_use_omap is false
* mounted with user_xattr, ext4 succeeds if filestore_xattr_use_omap is true
The tests are grouped in an ext4 dedicated class
TEST(EXT4StoreTest, _detect_fs)
The qa/workunits/filestore/filestore.sh script is added to prepare the
environment required for the unit tests ( create an image file,
formats it with ext4 etc.). It runs ceph_test_filestore with a sudo
to allow it to mount(2) and umount(2) the ext4 file system. It is
called with
ceph_test_filestore --gtest_filter=EXT4StoreTest.*
to only run the ext4 dependent tests.
The filestore.sh script is meant to be used as part of teuthology
in order to increase the code coverage for src/os/FileStore.cc. It is
self tested and can be checked from the source directory with
CEPH_TEST_FILESTORE=../../../src/ceph_test_filestore filestore.sh TEST
http://tracker.ceph.com/issues/4321 refs #4321
Signed-off-by: Loic Dachary <loic@dachary.org>
Stress test that does io on an image while we are mirroring a diff from
earlier snaps to a second copy. At the end, verify that all snaps have
matching content.
Signed-off-by: Sage Weil <sage@inktank.com>
Export a diff of an image from a previous snapshot to a file (or stdout).
Import a diff and apply it to an image, and then create the ending
snapshot.
Signed-off-by: Sage Weil <sage@inktank.com>
Remove now superfluous directory changes
that are causing tests to fail.
This code should have been removed when we transitioned
from running tests with Ant to using Java to run the tests.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Noah Watkins <noahwatkins@gmail.com>