After commit 419800fe (client: re-send request when MDS enters reconnecting
stage), cephfs client can send both unsafe requests and normal requests when
the MDS is in the reconnecting stage. Normal requests can have embedded cap
releases; the client code encodes these embedded cap releases after composing
the cap reconnect message. This causes the client to silently drop some caps.
The fix is to re-send requests (which add embedded cap releases) before
composing the cap reconnect message.
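A pseudocode sketch of the intended ordering; the type and function names
below are illustrative stand-ins, not the actual cephfs client code:

    struct MetaSession;  // stand-in for the client's per-MDS session type

    // Stand-in for the step that re-sends outstanding requests; encoding a
    // request may append its embedded cap releases.
    void resend_requests(MetaSession *session);

    // Stand-in for the step that composes the cap reconnect message from the
    // caps the client still holds.
    void compose_cap_reconnect(MetaSession *session);

    void on_mds_reconnecting(MetaSession *session) {
      resend_requests(session);        // releases are encoded into requests first
      compose_cap_reconnect(session);  // reconnect message reflects remaining caps
    }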
Fixes: #10912
Signed-off-by: Yan, Zheng <zyan@redhat.com>
We should ship the RBD udev rules in the same package that ships
/usr/bin/rbd. This package happens to be ceph-common, so move the udev
rules there.
The udev rules rely on the ceph-rbdnamer utility, so move that utility
and its man page as well.
http://tracker.ceph.com/issues/10864 Refs: #10864
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
Since this is often looked up by snap_id anyway, snap_lock
is easy to use for this.
This lets us avoid taking md_lock in many places.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
There's no need to explicitly close the ioctx. Doing so may cause
problems when the Images using it are destroyed afterwards. Just let
normal cleanup at the end of the block take care of it in the correct
order.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
This simplifies locking by obviating the NULL checks. We no longer
need md_lock to protect these accesses. We can use object_map_lock
instead, to make sure no one reads an object map while it is being
updated.
Keep track of whether the object map is enabled for a given snapshot
internally. In each public method, check this state, and automatically
set it correctly when refreshing the object map. During snapshot
removal, unconditionally try to remove the object map object, to
protect against bugs leaking objects, and to be consistent with image
removal.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Detect the case of a crashed lock owner by waiting for up to 30 seconds
for an async request progress message from the leader. If a progress
message isn't received, restart the request (and possibly take ownership
of the lock).
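A standalone sketch of the timeout idea using standard primitives rather than
the actual ImageWatcher machinery (the type and method names are illustrative):

    #include <chrono>
    #include <condition_variable>
    #include <mutex>

    // Illustrative stand-in, not the librbd implementation.
    struct AsyncRequestWatchdog {
      std::mutex lock;
      std::condition_variable cond;
      bool progress_seen = false;

      // Called when a progress notification arrives from the lock owner.
      void on_progress() {
        std::lock_guard<std::mutex> l(lock);
        progress_seen = true;
        cond.notify_all();
      }

      // Returns true if no progress arrived within 30 seconds, i.e. the lock
      // owner is presumed crashed and the request should be restarted.
      bool owner_presumed_crashed() {
        std::unique_lock<std::mutex> l(lock);
        return !cond.wait_for(l, std::chrono::seconds(30),
                              [this] { return progress_seen; });
      }
    };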
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Replace the two Context threading classes used within
ImageWatcher with a facade to orchestrate the scheduling
and canceling of Context task callbacks.
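A rough sketch of the facade's shape: schedule and cancel keyed callbacks
through one object instead of two separate threading classes (the class name
and interface below are illustrative, not the actual librbd code):

    #include <functional>
    #include <map>
    #include <string>

    // Illustrative stand-in for the facade, not the actual librbd class.
    class TaskScheduler {
      // at most one pending callback per task key; re-scheduling replaces it
      std::map<std::string, std::function<void()>> pending;
    public:
      void schedule(const std::string &task, std::function<void()> callback) {
        pending[task] = std::move(callback);
      }
      bool cancel(const std::string &task) {
        return pending.erase(task) > 0;   // true if a pending task was cancelled
      }
      void run_pending() {                // stand-in for the timer/finisher firing
        auto tasks = std::move(pending);
        pending.clear();
        for (auto &t : tasks)
          t.second();
      }
    };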
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Ensure that no maintenance operations (resize, flatten) are still in
flight when the exclusive lock is released. The lock will be
released when transitioning to a snapshot, closing the image, or
cooperatively when another client requests the lock.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
If the async operation associated with a flush request completes,
only complete the flush contexts if no previous operations are
still in flight. Otherwise, move the flush contexts to an older
in-flight async operation.
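A self-contained sketch of the hand-off (the container and callback types are
illustrative, not librbd's): when an op finishes, its flush contexts either
complete or migrate to the next-older op that is still in flight.

    #include <algorithm>
    #include <functional>
    #include <list>
    #include <vector>

    // Illustrative types; librbd tracks this with its own async op structures.
    struct AsyncOp {
      std::vector<std::function<void()>> flush_contexts;  // flushes gated on this op
    };

    std::list<AsyncOp *> in_flight_ops;  // ordered oldest -> newest

    void finish_op(AsyncOp *op) {
      auto it = std::find(in_flight_ops.begin(), in_flight_ops.end(), op);
      bool have_older = (it != in_flight_ops.begin());
      auto older = have_older ? std::prev(it) : in_flight_ops.end();
      in_flight_ops.erase(it);

      if (!have_older) {
        // No earlier operation is still in flight: safe to complete the flushes.
        for (auto &ctx : op->flush_contexts)
          ctx();
      } else {
        // An older operation is still running; the flush must also wait for it,
        // so move the contexts onto that op and let it repeat this check later.
        auto &dst = (*older)->flush_contexts;
        dst.insert(dst.end(), op->flush_contexts.begin(), op->flush_contexts.end());
      }
      op->flush_contexts.clear();
    }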
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
add_snap() updates the ImageCtx snapshot metadata in memory and reads the
flags as part of the object map snapshot. Both of these
require holding snap_lock.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
This is another step towards eliminating md_lock from the writeback
path. Almost all the places that use ImageCtx->flags already use
snap_lock, so there's no need to create a new lock. For the others,
add a helper, test_flags(), that acquires the lock, similar to
test_features().
This also makes sure we look up flags of the snapshot we're operating
on, instead of those for head.
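A minimal sketch of the helper's shape using standard primitives (the real
ImageCtx uses ceph's RWLock and a per-snapshot flag lookup; the names here are
simplified stand-ins):

    #include <cstdint>
    #include <mutex>
    #include <shared_mutex>

    // Simplified stand-in for ImageCtx, not the actual librbd type.
    struct ImageCtxSketch {
      mutable std::shared_mutex snap_lock;  // plays the role of ImageCtx::snap_lock
      uint64_t snap_flags = 0;              // flags of the currently open snapshot

      // Like test_features(): take the reader lock, then test the flag bits of
      // the snapshot being operated on rather than those of head.
      bool test_flags(uint64_t flags) const {
        std::shared_lock<std::shared_mutex> l(snap_lock);
        return (snap_flags & flags) == flags;
      }
    };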
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
A bunch of these used to be here, but were removed when converting to
RWLocks, before RWLocks had is_[w]locked() methods.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
This gets the appropriate locks, and checks the currently open
snapshot instead of head. Looking up features by snap_id prepares us
for future addition or removal of e.g. an object map throughout the
life of an image.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
This was being protected by md_lock, but that has become too coarse
since it is used to prevent writes from proceeding while flushing
caches for a snapshot. With the addition of ObjectMap and
ImageWatcher, writeback could try to acquire md_lock again, leading to
a deadlock.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
We were passing in a NULL data structure, probably in an attempt to
let things clean up -- but our implementation just returns immediately when
passed a NULL value, so drop it for clarity.
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
These two functions traverse the whole projected_nodes list if there
is no projected xattrs/srnode. A busy directory inode can have a large
projected_nodes list.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Define some rarely used containers as compact_map/compact_set. Each
replacement can save 40 bytes in a 64-bit program.
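The saving comes from keeping only a pointer in the containing object and
allocating the real container lazily; a minimal sketch of the idea (not the
actual compact_map implementation):

    #include <map>
    #include <memory>

    // Minimal sketch of the idea, not ceph's compact_map. An empty member costs
    // one pointer (8 bytes) instead of a full std::map (about 48 bytes on a
    // typical 64-bit build), roughly the 40 bytes mentioned above.
    template <typename K, typename V>
    class compact_map_sketch {
      std::unique_ptr<std::map<K, V>> m;  // allocated only on first insert
    public:
      bool empty() const { return !m || m->empty(); }
      V &operator[](const K &k) {
        if (!m)
          m = std::make_unique<std::map<K, V>>();
        return (*m)[k];
      }
    };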
Signed-off-by: Yan, Zheng <zyan@redhat.com>
The size of ceph_lock_state_t is about 200 bytes, and CInode contains two
of them. Dynamically allocating them can save about 400 bytes.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Define some rarely used containers as compact_map/compact_set. Each
replacement can save 40 bytes in a 64-bit program.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
inode_t::old_pools is rarely used. Defining it as a compact_set can save
40 bytes. inline_data is also rarely used; dynamically allocating the
bufferlist for inline_data can save another 72 bytes.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
InodeStore::old_inodes and InodeStore::snap_blob are only used for snapshotted
inodes. Their sizes are 48 bytes and 80 bytes respectively. Defining
InodeStore::old_inodes as a compact_map can save 40 bytes, and allocating the
bufferlist for snap_blob dynamically can save another 72 bytes.
Signed-off-by: Yan, Zheng <zyan@redhat.com>