* Removing tiers from a base pool in use by CephFS is forbidden.
* Using CephFS pools as tiers is forbidden.
Signed-off-by: John Spray <john.spray@redhat.com>
Fixes two things:
* EC pools are now permissible if they have a cache overlay
* Pools are not permissible if they are a cache tier.
Fixes: #9435
Signed-off-by: John Spray <john.spray@redhat.com>
There was a race condition (and hence an OSD crash) between lfn_unlink
and lfn_open: lfn_open called the FDCache lookup without taking the
index lock. The lookup increases the reference count, so clear() cannot
delete those FDs and they are leaked. This is what triggered the assert
within FDCache::clear().
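A minimal self-contained sketch of the corrected locking pattern (the
names here are illustrative, not the actual FileStore code):

    #include <mutex>

    struct FDCache {
      int ref = 0;                     // stands in for a cached fd's refcount
      void lookup() { ++ref; }         // a lookup pins the fd
      void clear() { /* asserts ref == 0 before dropping fds */ }
    };

    std::mutex index_lock;             // the collection index lock
    FDCache fdcache;

    void lfn_open() {
      std::lock_guard<std::mutex> l(index_lock);  // the fix: look up under the lock
      fdcache.lookup();                // can no longer race with lfn_unlink
    }

    void lfn_unlink() {
      std::lock_guard<std::mutex> l(index_lock);
      fdcache.clear();                 // no concurrent lookup can pin fds now
    }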
Fixes: #9480
Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
The implicit creation of a ruleset when creating a pool is convenient
when nothing is specified. However, if the caller sets a ruleset name,
it should not implicitly create it but return ENOENT instead. Silently
creating a ruleset when there is a typo in the ruleset name is
confusing.
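Hypothetical session showing the intended behaviour (pool name, sizes
and exact error text are illustrative):

    $ ceph osd pool create mypool 128 128 replicated no-such-ruleset
    Error ENOENT: specified ruleset no-such-ruleset doesn't exist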
http://tracker.ceph.com/issues/9304
Fixes: #9304
Signed-off-by: Loic Dachary <loic-201408@dachary.org>
Otherwise the FDCache will keep a file descriptor to a file that was
removed from the file system. This may create various types of errors,
because an OSD checking the FDCache will assume the file that contains
information for an object exists although it does not. For instance, in
the following sequence:
* rados put object file
* rm the file from the primary
* repair the PG to which the object is mapped
if the FDCache is not cleared, repair will incorrectly pull a copy from
a replica and write it to the now-unlinked file. Later on, the OSD will
assume the file exists on the primary and be only partially correct:
the data can still be accessed via the file descriptor, but any
operation using the path name will fail.
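The underlying POSIX behaviour can be reproduced with a few lines of
standalone C++ (an illustration, not Ceph code): the data stays
reachable through the open descriptor after unlink, while any
path-based access fails.

    #include <fcntl.h>
    #include <unistd.h>
    #include <cassert>
    #include <cerrno>

    int main() {
      int fd = open("object-file", O_CREAT | O_RDWR, 0600);
      write(fd, "data", 4);
      unlink("object-file");               // like 'rm file' on the primary
      char buf[4];
      assert(pread(fd, buf, 4, 0) == 4);   // the cached fd still works
      assert(open("object-file", O_RDONLY) == -1 && errno == ENOENT);
      close(fd);
      return 0;
    }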
http://tracker.ceph.com/issues/8914
Fixes: #8914
Signed-off-by: Loic Dachary <loic-201408@dachary.org>
To get the ordered list of OSDs to which an object is mapped and the name
of the corresponding PG.
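The same mapping can be inspected with the existing `ceph osd map`
command; the output below is illustrative:

    $ ceph osd map rbd myobject
    osdmap e42 pool 'rbd' (0) object 'myobject' -> pg 0.c5034eb8 (0.0) -> up [1,2,0] acting [1,2,0]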
Signed-off-by: Loic Dachary <loic-201408@dachary.org>
This is mostly relevant in testing clusters, but it ensures that an OSD
disconnecting from the monitor at the wrong time will still see any recent
map updates, and prevents maps injected into the OSD cluster from being
accidentally lost.
Fixes: #9219
Signed-off-by: Greg Farnum <greg@inktank.com>
Cancel the command op timeout event before we clear out the op from the
session struct. This isn't strictly necessary because command_op_cancel
will "gracefully" handle the case where the tid is no longer present, but
this avoids that noise and is cleaner.
Signed-off-by: Sage Weil <sage@redhat.com>
The C_CancelOp path assumes op->session != NULL. Cancel that op before
we clear it. This fixes a crash like
#0 pthread_rwlock_wrlock () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S:39
#1 0x00007fc82690a4b1 in RWLock::get_write (this=0x18, lockdep=<optimized out>) at ./common/RWLock.h:88
#2 0x00007fc8268f4d79 in Objecter::op_cancel (this=0x1f61830, s=0x0, tid=0, r=-110) at osdc/Objecter.cc:1850
#3 0x00007fc8268ba449 in Context::complete (this=0x1f68c20, r=<optimized out>) at ./include/Context.h:64
#4 0x00007fc8269769aa in RWTimer::timer_thread (this=0x1f61950) at common/Timer.cc:268
#5 0x00007fc82697a85d in RWTimerThread::entry (this=<optimized out>) at common/Timer.cc:200
#6 0x00007fc82651ce9a in start_thread (arg=0x7fc7e3fff700) at pthread_create.c:308
Signed-off-by: Sage Weil <sage@redhat.com>
We did this forever ago with mkcephfs, but ceph-disk didn't. Note that for
modern XFS this option is obsolete, but for older kernels it was not the
default.
Backport: firefly
Signed-off-by: Sage Weil <sage@redhat.com>
This is to avoid hitting an O(caps) loop in the worst-case
scenario. This mechanism is a little crude, but it should be superseded
at some point by admin socket functionality to inspect session caps, so
that we don't need to spit out this level of detail in the logs.
Signed-off-by: John Spray <john.spray@redhat.com>
...to avoid doing an O(caps) scan to find out
which clients are responsible for any late-revoking
caps during health checks.
Signed-off-by: John Spray <john.spray@redhat.com>
Follow up on Yan Zheng's "mds: warn clients which
aren't revoking cap" to include a health metric
for this condition as well as the clog messages.
Signed-off-by: John Spray <john.spray@redhat.com>
To be used later for generating health metrics
for clients which are failing to promptly service
CEPH_SESSION_RECALL_STATE messages.
Signed-off-by: John Spray <john.spray@redhat.com>
Previously the client would fail to release caps for files
in the root directory in response to CEPH_SESSION_RECALL_STATE
messages.
Signed-off-by: John Spray <john.spray@redhat.com>
Used for simulating a buggy client that trips
the error detection in #9282 (warn clients
which aren't revoking caps)
Signed-off-by: John Spray <john.spray@redhat.com>
trim_dentry can potentially free an inode, so get/put
it around the block where we use the inode's dn_set.
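A generic sketch of the pattern (self-contained; trim_dentry here is a
stub, not the actual Client code):

    struct Inode {
      int ref = 1;
      void get() { ++ref; }
      void put() { if (--ref == 0) delete this; }
    };

    void trim_dentry(Inode *in) { in->put(); }  // stub: may drop the last ref

    void trim(Inode *in) {
      in->get();       // pin: trim_dentry() could otherwise free the inode
      trim_dentry(in);
      // ... safe to look at the inode's dn_set here: our ref keeps it alive
      in->put();       // release; the inode may be freed at this point
    }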
Signed-off-by: John Spray <john.spray@redhat.com>
Two fixes:
* Client would unlink everything it could, instead of just
meeting its goal, because caps.size() doesn't change until
dentries are cleaned up later. Take account of the trimmed
count in the while() condition to fix that.
* Don't count the root ino as trimmed, as although it has no
dentries (of course), we will never give up the cap.
With this change, the client will now precisely achieve the number
of caps requested in CEPH_SESSION_RECALL_STATE messages.
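A self-contained illustration of the first fix (the numbers are made
up): since caps.size() does not shrink until the deferred cleanup runs,
the loop has to count its own progress.

    #include <iostream>
    #include <set>

    int main() {
      std::set<int> caps = {1, 2, 3, 4, 5, 6, 7, 8};
      const size_t max_caps = 5;   // goal from the RECALL_STATE message
      size_t trimmed = 0;
      // Buggy form: while (caps.size() > max_caps) would trim every cap,
      // because caps.size() is unchanged until dentries are cleaned up.
      for (auto it = caps.begin();
           it != caps.end() && caps.size() - trimmed > max_caps; ++it)
        ++trimmed;                 // mark one cap for deferred removal
      std::cout << "trimmed " << trimmed << " caps\n";  // prints 3
    }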
Signed-off-by: John Spray <john.spray@redhat.com>
In a75af4c2, a procedure was added to invalidate root's dentries
if the trimming failed to free enough caps. This would sometimes
crash because root->dir wasn't necessarily open.
Fix by only doing it if root dir is open, though I suspect this
may not be the end of it...
Signed-off-by: John Spray <john.spray@redhat.com>
osd_backfill_scan_min and osd_backfill_scan_max set the number of
items grabbed during a single backfill scan, not an interval in
seconds. Correct the doc.
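For example, in ceph.conf (the values shown are the defaults, stated
here as an assumption):

    [osd]
    osd backfill scan min = 64     ; objects per scan, not seconds
    osd backfill scan max = 512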
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
When using RHEL7, the radosgw daemon needs to start under systemd.
Check whether systemd is running as PID 1; if it is, start the daemon
using: systemd-run -r <cmd>. pidof returns nothing because it is
executed too quickly, so a one-second sleep is added and the script
reports startup correctly.
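A sketch of the resulting logic (the PID 1 check and variable names
are illustrative, not the exact script):

    if [ "$(ps -p 1 -o comm=)" = "systemd" ]; then
        systemd-run -r "$cmd"    # run the daemon in a transient scope
        sleep 1                  # without this, pidof races with startup
    else
        $cmd
    fi
    pidof radosgw >/dev/null && echo "radosgw started"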
Signed-off-by: JuanJose 'JJ' Galvez <jgalvez@redhat.com>
Describes the CLI for adding and removing buckets, in addition to the
'moving' instructions which were already present.
Signed-off-by: Stephen Jahl <stephenjahl@gmail.com>