unit tests are added in test/filestore/store_test.cc for the
FileStore::_detect_fs method, when using ext4. It tests the following
situations:
* without user_xattr, ext4 fails
* mounted with user_xattr, ext4 fails if filestore_xattr_use_omap is false
* mounted with user_xattr, ext4 succeeds if filestore_xattr_use_omap is true
The tests are grouped in an ext4 dedicated class
TEST(EXT4StoreTest, _detect_fs)
The qa/workunits/filestore/filestore.sh script is added to prepare the
environment required for the unit tests ( create an image file,
formats it with ext4 etc.). It runs ceph_test_filestore with a sudo
to allow it to mount(2) and umount(2) the ext4 file system. It is
called with
ceph_test_filestore --gtest_filter=EXT4StoreTest.*
to only run the ext4 dependent tests.
The filestore.sh script is meant to be used as part of teuthology
in order to increase the code coverage for src/os/FileStore.cc. It is
self tested and can be checked from the source directory with
CEPH_TEST_FILESTORE=../../../src/ceph_test_filestore filestore.sh TEST
http://tracker.ceph.com/issues/4321 refs #4321
Signed-off-by: Loic Dachary <loic@dachary.org>
Stress test that does io on an image while we are mirroring a diff from
earlier snaps to a second copy. At the end, verify that all snaps have
matching content.
Signed-off-by: Sage Weil <sage@inktank.com>
Export a diff of an image from a previous snapshot to a file (or stdout).
Import a diff and apply it to an image, and then create the ending
snapshot.
Signed-off-by: Sage Weil <sage@inktank.com>
Remove now superfluous directory changes
that are causing tests to fail.
This code should have been removed when we transitioned
from running tests with Ant to using Java to run the tests.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Noah Watkins <noahwatkins@gmail.com>
This script uses the python bindings to libcephfs and rados
to create files and check the correctness of the backtrace
written to the 'parent' xattr on the first object (if its
a file) or inode (if its a dir). The script includes test cases
that kill the mds at specific kill points and restart it through
teuthology using the teuthology restart task.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
The hadoop-internal-tests workunit needs
to be updated in light of our moving to
using stock Hadoop with our hadoop-cephfs
jars.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Small tweaks to the hadoop-internal test
to better use existing environment varaibles
and in response to the recent teuthology
changes.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Noah Watkins <noahwatkins@gmail.com>
qa/workunits/rbd/test_lock_fence.sh runs using test/rbdrw.py
rbdrw.py creates an image, locks it, and runs an I/O loop;
test_lock_fence.sh runs it, waits, and then blacklists that client,
which causes rbdrw.py to get ESHUTDOWN on operations thereafter.
Currently doesn't work with rbd caching enabled.
rbd.py gets new exception type for ESHUTDOWN
Fixes: #3190
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
So we don't have to figure out which test is running from the output,
which can be difficult with the system tests.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Use qi to parse a strictly formatted set of key/value pairs. Be picky
about whitespace. Any subset of recognized keys is allowed. Parse the
same set of keys as the ceph.*.layout.* vxattrs.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 5551aa5b3b)
Use qi to parse a strictly formatted set of key/value pairs. Be picky
about whitespace. Any subset of recognized keys is allowed. Parse the
same set of keys as the ceph.*.layout.* vxattrs.
Signed-off-by: Sage Weil <sage@inktank.com>
The '! command' doesn't fail properly, even with -e, in bash (wtf!).
Also, the last pool deletion command succeeds because the pool
'--yes-i-really-really-mean-it' doesn't exist. So drop that test.
Signed-off-by: Sage Weil <sage@inktank.com>
Small tweaks to the hadoop-internal test
to better use existing environment varaibles.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Noah Watkins <noahwatkins@gmail.com>
Changing the hadoop-internal tests to use the
newly added $TESTDIR environment variable.
Also, removed unneeded variables.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>
Udev runs blkid on device close, thwarting any rbd unmap that
immediately follows use of the device. Explicitly settle for now.
See #4183.
Signed-off-by: Sage Weil <sage@inktank.com>
This is fugly, but sudo -E doesn't work. Fix this after we are installing
debs and the path doesn't matter anymore!
Signed-off-by: Sage Weil <sage@inktank.com>
Add error handling for open(), posix_memalign() and malloc().
Reuse code for read_* and write_* functions.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Add a test script that tests for creating a pool
and then setting the layout for a (pre-existing)
file to that pool.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
The files from the ceph-test subpackage are installed to /usr/bin,
give them more useful names to make sure that the user know they
belong to ceph. add a 'ceph_' prefix and change some test* binaries
to ceph_test_*.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
A few changes, now that a few rbd problems have been fixed.
First, the more substantive changes:
- Generate a source file, and compare what's read back from rbd
devices with the content of that file.
- Write to the rbd device such that the written data spans
an (assumed 4 MB) rbd object boundary, as well as starting
and ending on non-page-aligned offsets.
- Perform multiple reads on rbd devices: entirely within a range
before any written data; beginning before but ending within
written data; the exact written data (and validating what's
read); beginning within written data but ending after it;
reading after written data but within a written rbd object;
and reading from an unwritten rbd object.
- Have the sleep between iterations provide a non-integer value
to avoid zero (or quantized) delays.
Also, some a little less substantive (but possibly informative):
- Don't run with "set -x". It produces a ton of noise that is
not useful for this test. This is an exerciser, looking
really for system crashes during concurrent activity, and
knowing which commands were (concurrently) active isn't going
to help much in diagnosis.
- Create two more directories, used to track the degree of
concurrency (more or less) and the highest rbd id consumed.
Files whose names are numbers are touched in each, and the
highest at the end is the highest during the run. This gets
around issues passing environment info from sub-shells to the
top-level shell. As a bonus, it offers a better chance of
avoiding problems due to concurrent update.
- NAMESDIR is renamed NAMES_DIR, and it (and the others) is
set up in the setup() function.
- Increase the concurrency and iteration counts.
- Move the default definitions before the ceph secrets stuff
Signed-off-by: Alex Elder <elder@inktank.com>
This tests for the behavior reported in #3964. It passes on the current
code, but fails on 3.2 in squeeze (and 32-bit?).
Signed-off-by: Sage Weil <sage@inktank.com>
Test virtual xattrs for file and directory layouts.
TODO: create a data pool, add it to the fs, and make sure we can use it.
Signed-off-by: Sage Weil <sage@inktank.com>
This defines a new workunit shell script that performs a bunch of
rbd operations concurrently in order to exercise code paths and
catch reference count and bad pointer problems.
Signed-off-by: Alex Elder <elder@inktank.com>
Use ! for clarity when commands are supposed to fail.
Check a few other cases that should fail, and correct deleting
non-existent pools.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Require that the pool name be passed twice along with an force option
before we irreversibly delete an entire pool of objects.
Signed-off-by: Sage Weil <sage@inktank.com>
This workunit runs the internal tests for our local branch of hadoop-common.
Requires ant be installed on the host running the test.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Check the integer (fixed-point) value to avoid any worries
about floating-point rounding. Add tests for reweight < 0.
Fixes: #3872
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage.weil@inktank.com>
A multi-client dbench run doesn't work over NFS,
see bug #3718. Make single client dbench available.
Signed-off-by: David Zafman <david.zafman@inktank.com>
The pjd script now uses the latest version of pjd
with an additional test for opening a non-existent
file.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Using assert version for linger ops doesn't work with retries,
since the version will change after the first send.
This reverts commit e1776809031c6dad441cfb2b9fac9612720b9083.
Conflicts:
qa/workunits/rbd/watch_correct_version.sh
Add tests for:
- sparse import makes expected sparse images
- sparse export makes expected sparse files
- sparse import from stdin also creates sparse images
- import from partially-sparse file leads to partially-sparse image
- import from stdin with zeros leads to sparse
- export from zeros-image to file leads to sparse file
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
It turns out that our suites don't exercise fsync, at least not very much
(I couldn't find it in all the places I looked for it). This tester
was written by Ted T'so and updated by Chris Mason; I just made it
work on a smaller dataset (256MB) because 8GB against a small cluster takes
more time than we want to wait.
Signed-off-by: Greg Farnum <greg@inktank.com>
This script was heuristically using short sleep commands in order to
give udev activity time to complete.
There's a command "udevadm settle" which actually looks at the udev
queue and waits until its processing is done. Much, much better.
This rearranges the get_id function a bit too, breaking it into one
function that gets the id and another that loops back and tries
again after a short delay in the event the get_id fails.
Signed-off-by: Alex Elder <elder@inktank.com>
This adds a bash script that creates an rbd image, then repeatedly
maps and unmaps it for a specified duration (5 minutes by default).
Signed-off-by: Alex Elder <elder@inktank.com>
Make import work; do I/O in image native block size.
Note: creating sparse images is not currently attempted; could
scan for runs of zeros and write discontiguous chunks to image.
Fixes: #3503
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
(cherry picked from commit c99d9c3ae782597984f0c67dd1488fb95bd2ce54)
Make import work; do I/O in image native block size.
Note: creating sparse images is not currently attempted; could
scan for runs of zeros and write discontiguous chunks to image.
Fixes: #3503
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Validate change to not assume dest pool == src pool
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
(cherry picked from commit 39180430b9be14a94502b7e5bef30dd582ce3ad8)
These tests are showing intermittent failures so we'll drop them
from the default list for the time being.
Signed-off-by: Alex Elder <elder@inktank.com>
I've gone through the set of xfstests that were previously found to
not work. Some of those now do work, and with the addition of an
option to pass to "mkfs.xfs" a large number of other tests now
produce expected output as well.
This patch updates the default list of tests to run to reflect
the result of this exercise. The following 50 additional tests
are now run by default:
029 074 078 084-087 100 105 117 121 124 126 129-134
164 165 167 174 181 184 186 187 192 214-216 227 236
237 241 243 245-249 257-259 261 277 278 280 285 286
Test 127 completed without error, but it took from 1-3 hours so I
kept that out of the list.
Signed-off-by: Alex Elder <elder@inktank.com>
The default of 8 is virtually never the right answer. Require the initial
pg count to be explicitly provided.
Signed-off-by: Sage Weil <sage@inktank.com>
rbd ls of format-2 images was looping on the first 64 (when more than 64
were present). The key name passed to the omap layer needs to always
contain the prefix, and the "inside-the-loop next-chunk" statement
was missing the "add the prefix" call.
Also, add a test for listing 100 images, format 1 and 2.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Users have been seeing failures where rbd rm is half-done; could be
because of outstanding watches on the rbd_header object. The state
is that rbd_children no longer contains the child, but other pieces
remain; remove considers this a failure.
Fix: test for ENOENT from remove_child, and treat that as an ignorable
error and drive on. Simulate this in copy.sh by removing the
rbd_children object altogether, which also results in ENOENT return
from remove_child.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Users have been seeing failures where rbd rm is half-done; could be
because of outstanding watches on the rbd_header object. The state
is that rbd_children no longer contains the child, but other pieces
remain; remove considers this a failure.
Fix: test for ENOENT from remove_child, and treat that as an ignorable
error and drive on. Simulate this in copy.sh by removing the
rbd_children object altogether, which also results in ENOENT return
from remove_child.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
This adds a "-c <count>" option to the run_xfstests.sh script so
the full set of tests can be repeated more than once without having
to go through the setup process each time.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
We can't check the initial permissions of the
file because the umask may be set to something
other than 0022. The check isn't needed to check
for chmod correctness anyway.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
This fixes up the chmod test to use a unique
filename to test with, and avoid clobbering of
other tests and commonly named files.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
This test still verifies that the race is handled correctly if it
occurs, but will no longer clutter test results with spurious failures
when the race is not reproduced.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
This is to handle TextTable output, which doesn't use tabs
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Use an assert version op in combination with our watch, and re-read
the header until it's not stale. Header updates are infrequent, so
this should not cause any delay with normal use.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Or maybe it was a spello, or a thinko, or something. In any case
I'm pretty sure Josh intended to call the function he added in
commit 78d6a60ca, and not the non-existent "test_import_args".
Signed-off-by: Alex Elder <elder@inktank.com>
The locker (entity_name_t) will be different each time the rbd
command line tool is run, so 'lock remove' is always breaking a lock.
Fixes: #2556
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
* no longer need to wait for watch timeout since #2948 was fixed
* use --format 2 instead of --new-format
* add test_cls_rbd to run-rbd-tests script
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
These check that removing an image still works if an rbd rm
command was interrupted partway through.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Now it's not the caller's responsibility to specify the format,
and we can eliminate a job from the qa suite.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Stop the qa noise we fix#2410. Looks like a freeze/thaw thing.
Maybe Jan's new freeze/thaw code will address this? That's probably
wishful thinking.
Signed-off-by: Sage Weil <sage@inktank.com>
These's are comprehensive because a lot of the startup logic is about
picking a local address, and it's difficult to do test that on a single
host. They cover the other variables surrounding mon bringing up, though:
- part of initial monmap, or not
- new nodes given all prior nodes, or not
- new nodes have self included in monmap seed, or not
- initial quorum members
Signed-off-by: Sage Weil <sage@inktank.com>
Test 232 in the xfstests suite produces an XFS error in the log
when run over an RBD device. This is most likely an XFS problem
that will be tracked separately (in tracker 2302).
My original plan with getting this checked in was to have it run a
baseline set of the tests--all known to pass on rbd devices--with
the intention of doing ongoing work to add back missing tests (at
least from the "auto" group) as we understand and fix whatever
makes them produce failures.
So just comment out test 232 so the xfstests script is able to
run to completion without error.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Because we exit on any error (due to 'set -e'), the cleanup call was
never getting made in the event of an error. The net effect of that
was that a filesystem could be left mounted, and rbd cleanup then
couldn't complete because the module was in use.
Fix the trap call so it calls cleanup on exit as well as error.
Switch to using the capitalized signal names in the call.
Signed-off-by: Alex Elder <elder@dreamhost.com>
It turns out that xfstests *does* exit with non-zero status
when a test fails. Its exit status is the number of tests
that failed (which, now that we have over 255 tests could be
an issue...)
Save the exit status and make it be the result of the run.
Signed-off-by: Alex Elder <elder@dreamhost.com
Add a script that runs xfstests over a pair of devices that are
specified using command line arguments. The tests are run using
a specified filesystem type (xfs, ext4, or btrfs).
A default set of tests is run if none is specified on the command
line. Normally there's an "auto" group used for this purpose, but
for now I've laid out a (large) subset of them that I know pass on
rbd devices. These can be updated as we find they work reliably.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Attempt to reproduce btrfs bug when rmdirs race with an async snap.
Unsuccessful. Best guess is that we need multiple threads to trigger.
Signed-off-by: Sage Weil <sage@newdream.net>
This was mixed up with min/max_op_len. And max_ops wasn't being used
the initial object creation stage, flooding the OSDs. Or during run().
Signed-off-by: Sage Weil <sage@newdream.net>
Capture Alexandre's script for reproducing #1774 here for posterity, until
we write a properly harnessed test for this. Currently, workunits can't
mount/unmount, and we don't have a way to make ceph-fuse drop it's cache.
Signed-off-by: Sage Weil <sage@newdream.net>
We don't have a great way to guarantee mdsmap updates, but they
should happen on their own and we can loop. Closes#1518.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
We end up needing _GNU_SOURCE in a bunch of places-- to get direct i/o,
pipe2, and some other Linux-specific interfaces.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>