Rather than assuming that any necessary inodes are in the cache, split up
MDCache::scrub_dentry into setup and work phases. Add an internal_op_finisher()
to MDRequest. Dispatch any CEPH_MDS_OP_VALIDATE internal operations to
scrub_dentry_work(). Taken together, these make everything work properly when
path_traverse() (by way of rdlock_path_pin_ref()) needs to go to disk before
satisfying the lookup.
Signed-off-by: Greg Farnum <greg@inktank.com>
For now, just return -EXDEV ("Cross-device link") on internal ops that
require forwarding, since forwarding internal ops will require a great deal
more infrastructure. Push the issue down to this level instead of handling
it in path_traverse, and allow for an MDRequest that has no client_request
wrapped inside it.
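
A minimal sketch of the check described above, assuming an internal op is
recognizable by a null client_request. Request, ClientRequest and
check_forwardable() are hypothetical stand-ins, not the actual MDS types:

```cpp
#include <cerrno>

// Hypothetical stand-ins; the real MDRequest and MClientRequest carry far
// more state than this.
struct ClientRequest {};

struct Request {
  const ClientRequest* client_request = nullptr;  // null for internal ops
};

// Returns 0 when the request wraps a client request and may be forwarded,
// or -EXDEV when it is an internal op that cannot be forwarded yet.
int check_forwardable(const Request& req) {
  if (req.client_request == nullptr) {
    return -EXDEV;  // internal op: no forwarding infrastructure yet
  }
  return 0;
}
```
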
Signed-off-by: Greg Farnum <greg@inktank.com>
The generic reply_request(MDRequest, int) is now the only caller. It's still
just building an MClientReply to pass along, but we can change it a lot more
easily now to support responding to non-client requests.
Signed-off-by: Greg Farnum <greg@inktank.com>
Set the MClientReply::extra_bl from reply_extra_bl unconditionally in
reply_request(), instead of only in early_reply(). Further isolate
the reply_request() callers from the use of MClientReply this way.
Signed-off-by: Greg Farnum <greg@inktank.com>
We have members for these two parameters in the MDRequestImpl already, so
make use of them. This helps us move towards dropping the expectation of an
MClientRequest from functions like rdlock_path_pin_ref().
Signed-off-by: Greg Farnum <greg@inktank.com>
scrub_dentry() is passed a string path, and it validates it before replying. We
hook up an admin socket command "scrub_path" to call it and dump the output.
Signed-off-by: Greg Farnum <greg@inktank.com>
Add a function that will validate the on-disk state of the CInode. We currently
check that the on-disk backtrace matches (or is older) and compare rstats on
dirfrags against the parent dir's inode (for directories only).
TODO: validate that the on-disk Inode object matches what the parent
directory holds.
It's using a sort-of new programming model, trying to stuff stack data into
a Continuation object and write everything sequentially instead of having
a function and Context per IO.
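
To illustrate the model, here is a rough sketch of a staged continuation:
state lives in one object, and each IO completion re-enters the object at a
numbered stage. StagedOp and its methods are hypothetical names for this
sketch and do not mirror the real Continuation interface:

```cpp
#include <functional>
#include <map>
#include <utility>

// Illustrative only: keeps per-operation state in one object and resumes at
// a numbered stage when an IO completes.
class StagedOp {
 public:
  using Stage = std::function<bool(int)>;  // return true when the op is done

  explicit StagedOp(std::function<void(int)> on_finish)
      : on_finish_(std::move(on_finish)) {}

  void set_stage(int id, Stage fn) { stages_[id] = std::move(fn); }

  // Build a completion callback that re-enters the op at `stage`.
  std::function<void(int)> callback(int stage) {
    return [this, stage](int rval) { run(stage, rval); };
  }

  // Start the operation at stage 0.
  void begin() { run(0, 0); }

 private:
  void run(int stage, int rval) {
    // A stage that returns true ends the op and fires the finisher.
    if (stages_.at(stage)(rval)) {
      on_finish_(rval);
    }
  }

  std::map<int, Stage> stages_;
  std::function<void(int)> on_finish_;
};
```

Each stage reads sequentially, and the only callback type needed is the one
produced by callback().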
Signed-off-by: Greg Farnum <greg@inktank.com>
Signed-off-by: John Spray <john.spray@redhat.com>
Unlike the regular Continuation, this one works in terms of an MDRequest
and has wrappers to provide Context callbacks that are either
internal MDS or IO appropriate.
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
This way we can create duplicate CInodes without actually linking them
into the cache. It'll be helpful for comparing different versions of
disk states and in-memory state, etc.
Signed-off-by: Greg Farnum <greg@inktank.com>
Use this passthru in the Server path locking functions so that we can get
locks or auth pins without an associated MClientRequest.
Signed-off-by: Greg Farnum <greg@inktank.com>
operator== is checking equality of the version as well, but I want
something I can use to check that the internal sums match. This is useful
for, e.g., comparing the sums of a set of dirfrags to the tally stored in
the inode.
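
A small sketch of the distinction, assuming a simplified stat record. RStat,
its fields and the same_sums() name are illustrative here and are not the
real Ceph structure:

```cpp
#include <cstdint>

// Hypothetical, simplified stat record; field names are illustrative.
struct RStat {
  uint64_t version = 0;  // bumped on every update
  int64_t rbytes = 0;
  int64_t rfiles = 0;

  // Compare only the accumulated sums, ignoring the version.
  bool same_sums(const RStat& o) const {
    return rbytes == o.rbytes && rfiles == o.rfiles;
  }

  // Fold another record's sums into this one (version is left alone).
  void add(const RStat& o) {
    rbytes += o.rbytes;
    rfiles += o.rfiles;
  }
};

// Strict equality also requires matching versions.
inline bool operator==(const RStat& l, const RStat& r) {
  return l.version == r.version && l.same_sums(r);
}
```

Summing the dirfrag records into a fresh RStat and calling same_sums()
against the inode's record checks the tally without tripping over version
differences.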
Signed-off-by: Greg Farnum <greg@inktank.com>
This compares one inode_t against another, seeing which version is newer
and checking that differences in the data members make sense given that.
Signed-off-by: Greg Farnum <greg@inktank.com>
The compare() function checks one backtrace against another and indicates
if they're equivalent (or divergent!) and the relative freshness.
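
As a rough illustration of that contract (not the actual implementation),
the sketch below compares two ancestor lists. Backpointer, CompareResult and
compare_backtraces() are hypothetical names, and the divergence rule used
here is an assumption: versions that disagree on which side is newer, or
equal versions that point at different links.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified ancestor entry.
struct Backpointer {
  uint64_t dirino;
  std::string dname;
  uint64_t version;
};

struct CompareResult {
  int freshness;    // >0: left is newer, <0: right is newer, 0: same age
  bool equivalent;  // both describe the same chain of links
  bool divergent;   // the two cannot be ordered consistently
};

CompareResult compare_backtraces(const std::vector<Backpointer>& a,
                                 const std::vector<Backpointer>& b) {
  CompareResult r{0, true, false};
  const std::size_t n = std::min(a.size(), b.size());
  for (std::size_t i = 0; i < n; ++i) {
    // -1, 0 or 1 depending on which side has the newer entry.
    const int c = (a[i].version > b[i].version) - (a[i].version < b[i].version);
    const bool same_link =
        a[i].dirino == b[i].dirino && a[i].dname == b[i].dname;
    if (c != 0) {
      if (r.freshness != 0 && c != r.freshness) r.divergent = true;
      if (r.freshness == 0) r.freshness = c;
    }
    if (!same_link) {
      r.equivalent = false;
      if (c == 0) r.divergent = true;  // same version, different link
    }
  }
  if (r.divergent) r.equivalent = false;
  return r;
}
```
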
Signed-off-by: Greg Farnum <greg@inktank.com>
These let us wrap generic function tooling up inside of the appropriate
type-checking, and verify we haven't done anything too stupid.
Signed-off-by: Greg Farnum <greg@inktank.com>
The || instead of && had it always installed. That was fixed in EPEL
already.
http://tracker.ceph.com/issues/9747
Fixes: #9747
Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit 5ff4a850a0)
The MClientCaps* is allowed to be NULL, so we can't deref it unless
the dirty param is non-zero. So don't do the ahead-of-time lookup;
just call it explicitly in the if block.
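
The shape of the fix, sketched with hypothetical stand-ins (CapsMsg and
flush_tid_for() are not real Ceph names; only the pattern is the point):

```cpp
#include <cstdint>

// Hypothetical stand-in for the message type; the pointer may be null.
struct CapsMsg {
  uint64_t tid;
};

// Returns the tid to record for a flush, or 0 when nothing is dirty.
uint64_t flush_tid_for(const CapsMsg* m, int dirty) {
  // Buggy pattern: reading m->tid up here dereferences m even when it is
  // null and dirty == 0.
  if (dirty) {
    return m->tid;  // m is only required to be non-null when dirty != 0
  }
  return 0;
}
```
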
Signed-off-by: Greg Farnum <greg@inktank.com>
Add CEPH_FEATURE_OSD_SET_ALLOC_HINT feature bit
Collect the intersection of all peer feature bits during peering
When handling CEPH_OSD_OP_SETALLOCHINT check that all OSDs support it
by checking for CEPH_FEATURE_OSD_SET_ALLOC_HINT feature bit.
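
A sketch of the intersection-and-check idea, assuming features are carried
as 64-bit masks. The bit position and the helper names are placeholders for
illustration; the real CEPH_FEATURE_OSD_SET_ALLOC_HINT value lives in Ceph's
feature definitions:

```cpp
#include <cstdint>
#include <vector>

// Placeholder bit position, chosen for this sketch only.
constexpr uint64_t FEATURE_SET_ALLOC_HINT = 1ULL << 5;

// Intersect our own feature mask with every peer's mask.
uint64_t common_features(uint64_t own, const std::vector<uint64_t>& peers) {
  uint64_t common = own;
  for (uint64_t f : peers) {
    common &= f;
  }
  return common;
}

// True only if every bit in `feature` survived the intersection.
bool all_support(uint64_t common, uint64_t feature) {
  return (common & feature) == feature;
}
```

The op handler would consult all_support() before acting on the hint.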
Fixes: #9419
Backport: firefly
Signed-off-by: David Zafman <dzafman@redhat.com>