Some extra coverage of the dashboard, including its standby
redirect mode and the publishing of URIs.
Also invoking the command_spam mode of the selftest module.
Signed-off-by: John Spray <john.spray@redhat.com>
Added a qa/rgw_frontend directory for civetweb.yaml and the new
beast.yaml. The rgw suites for multifs and singleton now symlink
rgw_frontend/civetweb.yaml. The multisite, tempest and verify suites
symlink the rgw_frontend directory to test both frontends. This doubles
the number of jobs in those suites.
Signed-off-by: Casey Bodley <cbodley@redhat.com>
rbd.xfstests task allows spawning xfstests runs on multiple nodes.
Don't unwind task contexts if one of the runs fails -- let the other
runs finish.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
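A minimal sketch of the intended behaviour, with hypothetical names (not the actual teuthology code): collect per-run failures instead of unwinding on the first one, so the remaining xfstests runs can finish.

    def run_all(runs):
        # runs: list of (name, callable) pairs, one per node
        errors = []
        for name, fn in runs:
            try:
                fn()
            except Exception as exc:
                # don't unwind the task context; remember the failure
                # and let the remaining runs finish
                errors.append((name, exc))
        if errors:
            raise RuntimeError("xfstests runs failed: %r" % (errors,))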
* refs/pull/18192/head:
qa/cephfs: test ec data pool
qa/suites/fs/basic_functional/clusters: more osds
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
The OSD will refuse to create new PGs until its PG count is lower
than the max-pg-per-osd upper bound setting.
Signed-off-by: Kefu Chai <kchai@redhat.com>
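Back-of-the-envelope arithmetic for the limit, with illustrative numbers (assumptions, not defaults taken from the code):

    max_pg_per_osd = 200      # assumed upper bound setting
    num_osds = 3
    pool_pg_num = 256
    pool_size = 3             # replicas; each PG lands on 'size' OSDs

    avg_pgs_per_osd = pool_pg_num * pool_size / num_osds   # 256.0
    print(avg_pgs_per_osd > max_pg_per_osd)                # True -> new PG creations refused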
mgr: common interface for TSDB modules
Reviewed-by: My Do <mhdo@umich.edu>
Reviewed-by: Jan Fajerski <jfajerski@suse.com>
Reviewed-by: John Spray <john.spray@redhat.com>
... that have empty OSD and MDS caps. Don't add a ',' at the
start of OSD and MDS caps.
Fixes: http://tracker.ceph.com/issues/21501
Signed-off-by: Ramana Raja <rraja@redhat.com>
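A minimal sketch of the fix's intent, using a hypothetical helper: only insert the ',' separator when there is an existing cap string to append to.

    def append_cap(existing, addition):
        # With empty existing OSD/MDS caps, return the addition as-is
        # instead of producing a string that starts with ','.
        if not existing:
            return addition
        return existing + "," + addition

    assert append_cap("", "allow rw path=/volumes/foo") == "allow rw path=/volumes/foo"
    assert append_cap("allow r", "allow rw path=/volumes/foo") == "allow r,allow rw path=/volumes/foo"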
* refs/remotes/upstream/pull/17697/head:
pybind/ceph_volume_client: add get, put, and delete object interfaces
pybind/ceph_volume_client: remove 'compat_version'
pybind/ceph_volume_client: set the version
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/remotes/upstream/pull/16036/head:
mds: improve cap min/max ratio descriptions
mds: fix whitespace
mds: cap client recall to min caps per client
mds: fix conf types
mds: fix whitespace
doc/cephfs: add client min cache and max cache ratio describe
mds: adding tunable features for caps_per_client
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Zheng Yan <zyan@redhat.com>
The module self-test commands give us a chance to
catch any other Ceph changes that alter something
that a module was relying on reading.
Signed-off-by: John Spray <john.spray@redhat.com>
Wrap low-level rados APIs to allow ceph_volume_client to get, put, and
delete objects. The interfaces allow OpenStack Manila's
cephfs driver to store config data in shared storage to implement
highly available Manila deployments. Restrict write (put) and
read (get) object sizes to the 'osd_max_write_size' config setting.
Signed-off-by: Ramana Raja <rraja@redhat.com>
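An illustrative usage sketch; the method and argument names below are assumptions based on the description above, not a definitive API reference.

    from ceph_volume_client import CephFSVolumeClient

    vc = CephFSVolumeClient("manila", "/etc/ceph/ceph.conf", "ceph")
    vc.connect()
    try:
        # store, read back and remove a small piece of driver config data
        vc.put_object("manila_metadata", "driver_config", b'{"ha": true}')
        data = vc.get_object("manila_metadata", "driver_config")
        vc.delete_object("manila_metadata", "driver_config")
    finally:
        vc.disconnect()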
bluestore_fsck_on_mount and bluestore_fsck_on_mount_deep are enabled by
default, and bluestore is used as the default objectstore backend. It takes
longer to perform the deep fsck with verbose logging, so prolong
revive_osd()'s timeout from 150 sec to 360 sec.
Fixes: http://tracker.ceph.com/issues/21474
Signed-off-by: Kefu Chai <kchai@redhat.com>
This reverts commit f95798b3ad.
The config_path method wasn't available through inheritance as I thought. Oops.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
PG states may all be active+clean when no recovery is going on,
so check again before timing out.
Fixes: http://tracker.ceph.com/issues/21294
Signed-off-by: huangjun <huangjun@xsky.com>
* refs/remotes/upstream/pull/17694/head:
qa/cephfs: kill mount if it gets evicted by mds
qa/cephfs: fix test_evict_client
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/remotes/upstream/pull/17657/head:
mds: optimize MDCache::rejoin_scour_survivor_replicas()
mds: fix MDSCacheObject::clear_replica_map
mds: support limiting cache by memory
common: refactor of lru
mds: resolve unsigned coercion compiler warning
common: use safer uint64_t for list size
common: add bytes2str pretty print function
mds: check if waiting is allocated before use
mds: go back to compact_map for replicas
mds: use mempool for cache objects
mds: cleanup replica_map access
common: add alloc_ptr smart pointer
common: add warning on base class use of mempool
common: use atomic uin64_t for counter
Reviewed-by: Zheng Yan <zyan@redhat.com>
ceph df accounts for pool size, so there is no need to do it in the test.
Fixes: http://tracker.ceph.com/issues/21381
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
This introduces two config parameters:
mds_cache_memory_limit: Sets the soft maximum of the cache to the given
byte count. (Like mds_cache_size, this doesn't actually limit the maximum
size of the cache. It just dictates the steady-state size.)
mds_cache_reservation: This replaces mds_health_cache_threshold everywhere
except the Beacon heartbeat sent to the mons. The idea here is to specify a
reservation of memory (5% by default) for operations and the MDS tries to
always maintain that reservation. So, the MDS will recall caps from clients
when it begins dipping into its reservation of memory.
mds_cache_size still limits the cache by inode count but is now 0 by default
(i.e. unlimited). The new preferred way of specifying cache limits is by memory
size. The default is 1GB.
Fixes: http://tracker.ceph.com/issues/20594
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1464976
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
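Illustrative arithmetic for how the two settings interact, using the stated defaults (1GB limit, 5% reservation); a sketch of the semantics, not code from the MDS.

    mds_cache_memory_limit = 1 * 1024**3    # 1 GiB soft limit on cache memory
    mds_cache_reservation = 0.05            # 5% kept free for operations

    # the MDS starts recalling client caps once it dips into the reservation,
    # i.e. once cache memory use crosses roughly this threshold:
    recall_threshold = mds_cache_memory_limit * (1 - mds_cache_reservation)
    print(int(recall_threshold))            # 1020054732 bytes, ~0.95 GiB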
* refs/remotes/upstream/pull/17679/head:
qa: get asok path from ceph.conf
qa: use config_path property instead of literal
Reviewed-by: John Spray <john.spray@redhat.com>
are mapped, and use the new mapped role for upgrades during a later
stage.
E.g.: mon.a is mapped to mon.mira002 during install; store this mapping,
and during upgrade map it back to the appropriate name to find the hostname
with that role.
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
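A hypothetical sketch of the mapping idea (names invented for illustration): remember which concrete daemon each generic role became during install, then translate back at upgrade time.

    # recorded while the cluster is installed
    role_map = {"mon.a": "mon.mira002"}

    def resolve(role):
        # map the generic role back to its installed name so the
        # upgrade step can find the host carrying it
        return role_map.get(role, role)

    print(resolve("mon.a"))   # mon.mira002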
jewel needs neither filestore nor bluestore as an option, so provide none
when running with the jewel branch.
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
This is to test customer-like upgrade scenarios and to find
any issues that may be related to systemd, packaging, etc.
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
The rbd pool should exist for many rbd tests to work properly, so create
the pool right after the install is successful.
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
We assume below that rerrosd is up, but it may not be when we exit the
loop.
Fixes: http://tracker.ceph.com/issues/21206
Signed-off-by: Sage Weil <sage@redhat.com>
Add support for testing recovery of CephFS metadata into an alternate
RADOS pool, useful as a disaster recovery mechanism that avoids
modifying the metadata in-place.
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
Remove the alternate pool recovery test from test_data_scan. Newer
commits will place the test in its own file.
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
Different filesystems (and further, different configurations of the
same filesystem) need different exclude lists. Hard coding the list in
a wrapper script is inflexible.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
If we start the osd process first and then mark it in, the
pg state may remain all active+clean during the
wait_for_clean() check, which may fail the subsequent
osd_scrub_pgs() step.
So speed up the pg state change by marking the osd in first.
Signed-off-by: huangjun <huangjun@xsky.com>
Rearrange logic to make it easier to measure accumulation.
Instrument the boto request/response loop to count bytes in and out.
Accumulate byte counts in a usage-like structure.
Compare the actual usage reported by ceph against the locally measured usage.
Report and assert if there are any shortcomings.
Remove the zone placement rule that was newly added at the end: tests should be rerunnable.
Nit: the logic to wait for "delete_obj" is not quite right.
Fixes: http://tracker.ceph.com/issues/19870
Signed-off-by: Marcus Watts <mwatts@redhat.com>
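A sketch of the accumulation structure (field names are illustrative, loosely mirroring radosgw usage output): count bytes in and out per bucket while driving boto, then compare against what the cluster reports.

    from collections import defaultdict

    usage = defaultdict(lambda: {"bytes_sent": 0, "bytes_received": 0, "ops": 0})

    def record(bucket, sent, received):
        entry = usage[bucket]
        entry["bytes_sent"] += sent
        entry["bytes_received"] += received
        entry["ops"] += 1

    record("mybucket", sent=512, received=4096)
    assert usage["mybucket"]["ops"] == 1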
* refs/remotes/upstream/pull/16378/head:
doc: remove accidental additions to release notes
qa/cephfs: Fix race in test_volume_client
qa/cephfs: Test filtered df
PendingReleaseNotes: add note about df filtering
client: Support new, filtered MStatfs
objecter: Support new, filtered MStatfs
mon/PGMap stats: Support new, filtered MStatfs
messages: Add optional data pool to MStatfs
Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
If the OSD doesn't see IO, it won't flush more pg/osd stats when the
luminous flag is not yet set (legacy pgmonitor mode).
Signed-off-by: Sage Weil <sage@redhat.com>
* refs/remotes/upstream/pull/16714/head:
qa: test export_pin is correct in dumped subtree
mds: print export_pin for dumped subtree
Reviewed-by: Douglas Fuller <dfuller@redhat.com>
Reviewed-by: huanwen ren <ren.huanwen@zte.com.cn>
Lifecycle expiration tests are too reliant on timing and have been
failing consistently for a long time.
Signed-off-by: Casey Bodley <cbodley@redhat.com>
so we can avoid warnings like
grep: Unmatched ( or \(
because we pass the whitelisted string to `egrep -v "$1"` directly.
Signed-off-by: Kefu Chai <kchai@redhat.com>
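A conceptual illustration in Python terms (the actual change is in the shell helpers): an unescaped '(' in a whitelist entry is parsed as regex syntax by egrep, which is what triggers the warning; escaping the entry first keeps it literal.

    import re

    whitelist_entry = "FAILED assert(p != obs.end())"
    pattern = re.escape(whitelist_entry)     # '(' and ')' become literal \( \)
    assert re.search(pattern, "log line: FAILED assert(p != obs.end())")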
/bin/bash is a Linuxism. Other operating systems install bash to
different paths. Use /usr/bin/env in shebangs to find bash.
Signed-off-by: Alan Somers <asomers@gmail.com>
I'm seeing sporadic single thread deadlocks on fio stat_mutex during krbd
thrash runs:
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7f89ee730740 (LWP 15604) 0x00007f89ed9f41bd in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0 0x00007f89ed9f41bd in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007f89ed9f17b2 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#2 0x00000000004429b9 in fio_mutex_down (mutex=0x7f89ee72d000) at mutex.c:170
#3 0x0000000000459704 in thread_main (data=<optimized out>) at backend.c:1639
#4 0x000000000045b013 in fork_main (offset=0, shmid=<optimized out>, sk_out=0x0) at backend.c:1778
#5 run_threads (sk_out=sk_out@entry=0x0) at backend.c:2195
#6 0x000000000045b47f in fio_backend (sk_out=sk_out@entry=0x0) at backend.c:2400
#7 0x000000000040cb0c in main (argc=2, argv=0x7fffad3e3888, envp=<optimized out>) at fio.c:63
(gdb) up 2
170 pthread_cond_wait(&mutex->cond, &mutex->lock);
(gdb) p mutex.lock.__data.__owner
$1 = 15604
Upgrading fio to 2.21 seems to make these go away.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Review current log messages for consistency, accuracy and necessity as
part of the usability initiative. First in a series.
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
Test cluster with 2 osds: stop osd.0; if osd.1
reports the pg stats during pg peering, the mon will
record the pg state as 'peering'. Then stop osd.1, and
finally the pg state will get stuck in 'stale+peering',
which is unexpected.
Let's wait_for_active() after stopping osd.0.
Signed-off-by: huangjun <huangjun@xsky.com>
This randomly issues pg force-recovery/force-backfill and
pg cancel-force-recovery/cancel-force-backfill during QA
testing. Disabled for upgrades from hammer, jewel and kraken.
Signed-off-by: Piotr Dałek <piotr.dalek@corp.ovh.com>
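A hypothetical sketch of what one thrashing step could look like (helper names are assumptions, loosely in the style of the qa CephManager wrapper):

    import random

    COMMANDS = [
        "force-recovery", "force-backfill",
        "cancel-force-recovery", "cancel-force-backfill",
    ]

    def thrash_pg_once(manager, pgid):
        # manager.raw_cluster_cmd is assumed to wrap the 'ceph' CLI,
        # e.g. 'ceph pg force-recovery 1.0'
        cmd = random.choice(COMMANDS)
        manager.raw_cluster_cmd("pg", cmd, pgid)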
* refs/remotes/upstream/pull/16288/head:
qa/cephfs: don't use int() to convert string of float point number
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
This isn't recognized by hammer, and we don't need it for jewel.
Fixes: http://tracker.ceph.com/issues/20548
Signed-off-by: Sage Weil <sage@redhat.com>
If the repo ends in "/ceph-ci" it's the same as if it ended in "/ceph-ci.git"
Before this change, the following command was broken if the workunit
specified a branch (e.g., "branch: jewel"):
teuthology-suite --ceph-repo https://github.com/ceph/ceph --ceph master
--suite-repo https://github.com/ceph/ceph-ci --suite-branch wip-foo . . .
Fixes: http://tracker.ceph.com/issues/20554
Signed-off-by: Nathan Cutler <ncutler@suse.com>
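An illustrative normalization, assuming a helper along these lines (not the actual teuthology code):

    def is_ceph_ci_repo(repo_url):
        # treat ".../ceph-ci" and ".../ceph-ci.git" the same
        url = repo_url.rstrip("/")
        if url.endswith(".git"):
            url = url[:-len(".git")]
        return url.endswith("/ceph-ci")

    assert is_ceph_ci_repo("https://github.com/ceph/ceph-ci")
    assert is_ceph_ci_repo("https://github.com/ceph/ceph-ci.git")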
Note: unmounting the client is not necessary for purging snapshots.
Fixes: http://tracker.ceph.com/issues/20072
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>