* refs/pull/24585/head:
doc: add developer documentation on new cephfs reclaim interfaces
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Zheng Yan <zyan@redhat.com>
In CephReleaseNamePipe, we used to blindly return the "release name"
portion of the version string. Right now that means returning, e.g.,
'nautilus' for master, which causes us to link to nonexistent
documentation on ceph.com. This change causes builds marked as 'dev'
(as opposed to 'stable') to report 'master' as their release name.
Fixes: https://tracker.ceph.com/issues/36416
Signed-off-by: Zack Cerza <zack@redhat.com>
In case a reshard attempt is left in an incomplete state, i.e., flags
still show resharding even though the bucket reshard lock isn't being
held, try to recover by taking the bucket reshard lock and clearing
flags associated with resharding.
This change requires access to an RGWBucketInfo object, so callers of
this function should provide one to prevent unnecessary work; the
relevant call stacks were changed to pass this object through.
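A minimal self-contained sketch of the recovery idea, using a
std::mutex as a stand-in for the cls-based bucket reshard lock; the
BucketInfo struct and function name here are hypothetical, not the
actual RGW code:

    #include <mutex>

    struct BucketInfo {            // stand-in for RGWBucketInfo
      bool reshard_in_progress = false;
    };

    std::mutex reshard_lock;       // stand-in for the bucket reshard lock

    // If the flag claims a reshard is underway but the lock is free,
    // the previous reshard died; take the lock and clear stale state.
    bool recover_from_incomplete_reshard(BucketInfo& info) {
      if (!info.reshard_in_progress)
        return true;               // nothing to recover
      if (!reshard_lock.try_lock())
        return false;              // a live reshard really holds the lock
      info.reshard_in_progress = false;
      reshard_lock.unlock();
      return true;
    }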
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
When we open a connection, there is a short window before we attach
the session. If a fault happens quickly, we won't get the reset, and
will persistently fail to send osd pings.
Move the lock up to avoid this. Note that we should rarely see
connections without sessions here anyway (except when this specific
race happens), so taking the lock where we previously did not should
have no negative impact.
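A self-contained illustration of the window being closed (Session and
Connection here are simplified stand-ins, not the actual messenger
types): holding the connection lock across both the open and the
session attach means a concurrent fault can no longer observe the
half-initialized state and silently drop the reset.

    #include <memory>
    #include <mutex>

    struct Session {};

    struct Connection {
      std::mutex lock;
      std::shared_ptr<Session> session;
    };

    // Lock taken before the open/attach window, not after it.
    void open_and_attach(Connection& con) {
      std::lock_guard<std::mutex> l(con.lock);
      // ... open the connection ...
      con.session = std::make_shared<Session>();
    }

    // The fault path takes the same lock, so it either runs before the
    // open (no session yet, nothing to reset) or after the attach
    // (session present, reset is delivered and pings recover).
    void fault(Connection& con) {
      std::lock_guard<std::mutex> l(con.lock);
      if (con.session) {
        // ... deliver reset to the session ...
      }
    }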
Fixes: http://tracker.ceph.com/issues/36602
Signed-off-by: Sage Weil <sage@redhat.com>
There are other processes beyond resharding that need to take a
bucket reshard lock (e.g., correcting bucket resharding flags in the
event of a crash, or tools that remove bucket shard information left
by earlier versions of ceph). Pulling this logic out of
RGWBucketReshard allows the code to be re-used.
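A hypothetical sketch of the factored-out helper's shape (the real
class wraps a cls lock held on a RADOS object; the name and methods
here are illustrative only):

    #include <string>

    class BucketReshardLock {          // hypothetical name
      std::string oid;                 // object the lock lives on
      bool locked = false;
    public:
      explicit BucketReshardLock(const std::string& bucket_oid)
        : oid(bucket_oid) {}
      int lock()    { /* take the cls lock on oid */ locked = true; return 0; }
      void unlock() { /* release the cls lock */ locked = false; }
    };

Any caller that needs the lock (resharding itself, crash cleanup,
legacy-shard removal tools) can construct one of these instead of
reaching into the resharding class.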
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
The race is an interleaving of two threads:

ThreadA:

    sdata->shard_lock.Lock();
    if (sdata->pqueue->empty() &&
        !(is_smallest_thread_index && !sdata->context_queue.empty())) {

ThreadB (runs here, after ThreadA's emptiness check but before it
begins waiting):

    void queue(list<Context *>& ls) {
      bool empty = false;
      {
        std::scoped_lock l(q_mutex);
        if (q.empty()) {
          q.swap(ls);
          empty = true;
        } else {
          q.insert(q.end(), ls.begin(), ls.end());
        }
      }
      if (empty) {
        mutex.Lock();
        cond.Signal();
        mutex.Unlock();
      }
    }

ThreadA (continues; ThreadB's signal has already been missed):

    sdata->sdata_wait_lock.Lock();
    if (!sdata->stop_waiting) {
Fix by simply rechecking that context_queue is empty after taking the
wait lock. We still check it without taking that lock to keep the hot/busy
path fast (we avoid the wait lock in general) at the expense of taking
the context_queue qlock twice in the idle/wait path (where we don't care
so much about additional latency/cycles).
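The shape of the fix, reusing the names from the snippet above (a
sketch, not necessarily the exact committed code):

    sdata->sdata_wait_lock.Lock();
    if (is_smallest_thread_index && !sdata->context_queue.empty()) {
      // raced with a queue() between the first check and this lock;
      // don't wait, drop the lock and go process the contexts
      sdata->sdata_wait_lock.Unlock();
    } else if (!sdata->stop_waiting) {
      // ... wait on the condition variable as before ...
    }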
Fixes: http://tracker.ceph.com/issues/36473
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Previously, when resharding failed, we restored the shard status on
the bucket info object. However, the status on each of the shards was
left indicating a reshard was underway. This prevented some write
operations from taking place, as they would wait for resharding to
complete. This adds the missing functionality. It also makes the
functionality available to other classes via static functions in
RGWBucketReshard.
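A self-contained sketch of the added cleanup using stand-in types (the
real code is static members of RGWBucketReshard operating on the
bucket index shard objects):

    #include <vector>

    struct ShardStatus {               // stand-in for per-shard state
      bool reshard_in_progress = false;
    };

    // On reshard failure, clear the in-progress marker on every shard,
    // not just on the bucket info object, so writes stop waiting.
    static void clear_reshard_status(std::vector<ShardStatus>& shards) {
      for (auto& s : shards) {
        s.reshard_in_progress = false;
      }
    }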
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
The bucket reshard lock was simply an exclusive lock that existed on
an object solely for the purpose of representing the lock. This is now
changed to an exclusive-ephemeral lock, so as not to leave these objects
behind.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
Add a new type of cls lock -- exclusive ephemeral -- for which the
object exists only to represent the lock and should be deleted at
unlock. This prevents the accumulation of unneeded objects in the
cluster by automatically cleaning them up.
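A hedged usage sketch via the cls lock client (assuming the new call
is exposed as lock_exclusive_ephemeral alongside the existing
lock_exclusive; exact signatures may differ):

    #include "cls/lock/cls_lock_client.h"

    int take_ephemeral_lock(librados::IoCtx& ioctx,
                            const std::string& oid)
    {
      rados::cls::lock::Lock l("reshard_lock");
      l.set_duration(utime_t(120, 0));  // expire as a crash safety net
      // Creates the object if needed and locks it in one step.
      int ret = l.lock_exclusive_ephemeral(&ioctx, oid);
      if (ret < 0)
        return ret;
      // ... do work under the lock ...
      // Ephemeral semantics: unlocking also removes the object.
      return l.unlock(&ioctx, oid);
    }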
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
Using "#!/usr/bin/env python" is not recommended as it's not portable.
setuptools provides an console_scripts entry_point that generates
scripts that always have the good sheban whatever the target operating
system and python version/distribution.
http://tracker.ceph.com/issues/36601
Signed-off-by: Mehdi Abaakouk <sileht@sileht.net>
mgr/dashboard: Add support for managing individual OSD settings/characteristics in the frontend
Reviewed-by: Tiago Melo <tmelo@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Ricardo Marques <rimarques@suse.com>
With the current number of frontend unit tests, the output from Jest
is too big and exceeds the limit imposed by Jenkins.
This results in the omission of linting results.
With this change, Jest will only output the results of failing tests and will no
longer show the coverage in the logs.
Jenkins will keep tracking the coverage via the generated cobertura file.
Signed-off-by: Tiago Melo <tmelo@suse.com>
* refs/pull/24716/head:
test/log: drop redundant test case
common/StackStringStream: don't reserve before every insert
Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/24283/head:
client: support for exporting multiple subdirectories in faked mode
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Zheng Yan <zyan@redhat.com>
Ceph recently gained the ability to create alternate named filesystems
within a cluster. We want to allow nfs-ganesha to export them, but it
currently generates filehandles using an inode number+snapid tuple, and
that will not be unique across multiple filesystems.
Add a new field to hold the fscid in the Client, as that is generated
on a per-filesystem basis via a monotonic counter. When we mount, fetch
the fscid and store it in that field. Add a new Client accessor, and a
libcephfs function that returns that value.
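A hedged example of calling the new function from a libcephfs client
(assuming it is exposed as ceph_get_fs_cid, returning the fscid of the
mounted filesystem):

    #include <cephfs/libcephfs.h>
    #include <stdio.h>

    int main(void)
    {
      struct ceph_mount_info *cmount;
      ceph_create(&cmount, NULL);         /* default client id */
      ceph_conf_read_file(cmount, NULL);  /* default ceph.conf search */
      if (ceph_mount(cmount, "/") == 0) {
        /* fscid + inode number + snapid now uniquely identifies a
         * file across filesystems, e.g. for NFS filehandles */
        printf("fscid: %lld\n", (long long)ceph_get_fs_cid(cmount));
        ceph_unmount(cmount);
      }
      ceph_shutdown(cmount);
      return 0;
    }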
Tracker: http://tracker.ceph.com/issues/36585
Signed-off-by: Jeff Layton <jlayton@redhat.com>
We currently host the grafana dashboards in our repo, but we do not
install them. This patch adds the cmake support to do so.
Signed-off-by: Boris Ranto <branto@redhat.com>