coll_t is now unstructured; list all dirs besides '.' and '..'.
The old coll_t::parse() was broken. Remove it. Fixes
a4138c9050.
Signed-off-by: Sage Weil <sage@newdream.net>
Manually mark an mds rank as failed. The daemon should kill itself when
it finds out.
Note that this doesn't do any sanity checks, so it can also be used to
adjust state in an mdsmap left inconsistent by other bugs (for example,
one where an mds is in the up set but has no info, or is not up but is
also not in the failed set).
Signed-off-by: Sage Weil <sage@newdream.net>
We need to push all parents of the straydn to the target. This changed
a while back with the mdsdir stuff but this bit of code wasn't updated.
Updated to mirror send_dentry_unlink().
This fixes a crash like:
mds/MDCache.cc: In function 'void MDCache::adjust_subtree_auth(CDir*, std::pair<int, int>, bool)':
mds/MDCache.cc:644: FAILED assert(root)
ceph version 0.22~rc (0e67718a36)
1: (MDCache::add_replica_dir(ceph::buffer::list::iterator&, CInode*, int, std::list<Context*, std::allocator<Context*> >&)+0x1c1) [0x536a91]
2: (MDCache::add_replica_stray(ceph::buffer::list&, int)+0xdb) [0x536fab]
3: (Server::handle_slave_rename_prep(MDRequest*)+0x1113) [0x4d5c33]
4: (Server::dispatch_slave_request(MDRequest*)+0x21b) [0x4de80b]
5: (Server::handle_slave_request(MMDSSlaveRequest*)+0x145) [0x4e1955]
6: (MDS::_dispatch(Message*)+0x2598) [0x49e038]
...
Signed-off-by: Sage Weil <sage@newdream.net>
The old "forget lost objects" code rewrote history in the PG log, which
is asking for all kinds of trouble. Instead, add new log events to
indicate that
an object is LOST (deleted) or LOST_REVERTed (reverted to an older
version).
The LOST_REVERT case means we may need to recover the old version from
another node and rewrite the version number. This isn't implemented yet;
for now we just assert.
Signed-off-by: Sage Weil <sage@newdream.net>
It's best not to have data members in PG::Info that are not serialized
and sent over the wire. Cache coll directly inside PG instead.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
When we take the clone branch, we update the missing map. This invalidates
our current iterator, which can cause badness. Instead, increment the
iterator near the top of the loop so we don't have to worry about it.
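
A minimal sketch of the pattern, not the actual recovery code. MissingMap,
process_missing() and the "clone" name prefix are hypothetical stand-ins;
the only point is that the iterator is advanced before the map is modified:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the missing map: object name -> needed version.
using MissingMap = std::map<std::string, int>;

// Visits every entry. Entries whose name starts with "clone" are erased
// inside the loop body (standing in for the clone branch, which updates
// the map). Returns the names in the order they were visited.
std::vector<std::string> process_missing(MissingMap& missing) {
  std::vector<std::string> visited;
  auto p = missing.begin();
  while (p != missing.end()) {
    // Copy what we need and advance *before* touching the map, so that
    // erasing the current entry cannot invalidate the loop iterator.
    const std::string name = p->first;
    ++p;
    visited.push_back(name);
    if (name.rfind("clone", 0) == 0) {
      // std::map::erase only invalidates iterators to the erased element,
      // and p already points past it.
      missing.erase(name);
    }
  }
  return visited;
}
```

With entries "a", "clone_b" and "c", all three are visited and only
"clone_b" is removed.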
Signed-off-by: Sage Weil <sage@newdream.net>
coll_t is now a string. META_COLL and TEMP_COLL are just constants now.
Now there is a constructor that takes pgid_t and snapid_t, rather than
factory methods. It's clear what that constructor does, so wrapping it
in factory methods should be unnecessary.
Bump coll_t serialization version to 3. Implement decoding for the old
versions.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Make interval_set::m and interval_set::_size private data members
instead of public, and add a non-const iterator. Using this iterator,
users can modify the length of an interval, so all users can now go
through the iterators rather than interacting with the class internals
directly.
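
As a rough illustration of the idea (a simplified, hypothetical class, not
the real interval_set): the map and the cached total stay private, and the
iterator's set_len() keeps the two consistent. The names IntervalSet,
get_start(), get_len() and set_len() are assumptions made for this sketch,
and it does no merging or overlap checking.

```cpp
#include <cstdint>
#include <map>

// Simplified stand-in for an interval set: start offset -> length.
class IntervalSet {
 public:
  class iterator {
   public:
    iterator(std::map<uint64_t, uint64_t>::iterator it, uint64_t* total)
        : it_(it), total_(total) {}
    bool operator!=(const iterator& o) const { return it_ != o.it_; }
    iterator& operator++() { ++it_; return *this; }
    uint64_t get_start() const { return it_->first; }
    uint64_t get_len() const { return it_->second; }
    // Change the interval's length and keep the cached total in sync.
    void set_len(uint64_t len) {
      *total_ = *total_ - it_->second + len;
      it_->second = len;
    }
   private:
    std::map<uint64_t, uint64_t>::iterator it_;
    uint64_t* total_;
  };

  iterator begin() { return iterator(m_.begin(), &size_); }
  iterator end() { return iterator(m_.end(), &size_); }
  uint64_t size() const { return size_; }

  // Adds an interval; ignored if one already starts at the same offset.
  void insert(uint64_t start, uint64_t len) {
    if (m_.emplace(start, len).second) size_ += len;
  }

 private:
  std::map<uint64_t, uint64_t> m_;  // not reachable from outside
  uint64_t size_ = 0;               // sum of all interval lengths
};
```

Callers that used to poke at the map directly would instead take an
iterator from begin() and call set_len() on it.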
We were properly falling out of the while loop when we reached end(), but
not checking for it in the following if-else. Now we do!
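
The shape of the bug, as a hedged sketch rather than the code that was
actually fixed (classify() and its return strings are made up for
illustration): after a scan loop that can stop at end(), the branches that
follow must test for end() before dereferencing.

```cpp
#include <map>
#include <string>

// Scans for the first key >= target and reports where the scan stopped.
std::string classify(const std::map<int, int>& m, int target) {
  auto p = m.begin();
  while (p != m.end() && p->first < target) ++p;

  // The loop may have fallen out because we hit end(); check that first,
  // since dereferencing p in that case is undefined behaviour.
  if (p == m.end()) return "end";
  if (p->first == target) return "exact";
  return "after";
}
```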
Reported-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
This avoids stalling out peering, because the peer just responds with
another 'empty' PG::Info in response (which we already have).
Signed-off-by: Sage Weil <sage@newdream.net>
If during recovery we are unable to pull from a replica due to reaching
EOF (e.g., zeroed out object), pull from the next available replica (if
any).
Eventually this should be extended to do the same when a checksum fails.
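
The fallback can be pictured roughly like this. It is only a sketch: the
PullResult enum, pull_from_replicas() and the callback signature are
hypothetical and do not correspond to the OSD's real interfaces.

```cpp
#include <functional>
#include <optional>
#include <vector>

// Hypothetical outcome of trying to pull an object from one replica.
enum class PullResult { Ok, Eof };

// Tries each replica in order and returns the id of the first one that
// served the object, or std::nullopt if every replica hit EOF.
std::optional<int> pull_from_replicas(
    const std::vector<int>& replicas,
    const std::function<PullResult(int)>& pull_from) {
  for (int osd : replicas) {
    if (pull_from(osd) == PullResult::Ok) return osd;
    // EOF (e.g. a zeroed-out object): fall through to the next replica.
  }
  return std::nullopt;
}
```

A checksum failure could later be handled by adding another PullResult
value that also falls through to the next replica.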
Signed-off-by: Sage Weil <sage@newdream.net>
The setup-chroot.sh script is very handy for building the server in a
chroot environment. I thought I would share it here in case anyone else
finds it useful.
This really shouldn't happen (!), but if it does, at least avoid getting
the primary state out of sync with the replicas.
Signed-off-by: Sage Weil <sage@newdream.net>
If we already auth_pinned, we're past the gates; don't stop on freezable.
This screws up xlock: the lock moves to PREXLOCK state, but the request
that would normally xlock it gets deferred because of a racing freezing
of the tree. Then the PREXLOCK gather kicks in and badness happens.
Signed-off-by: Sage Weil <sage@newdream.net>
This makes the interface a bit more adaptable for a situation where it has
a simple string representation instead of the strict structure it has now.
Eventually this function can simply attempt a pg_t parse.
Signed-off-by: Sage Weil <sage@newdream.net>
We can't error out if we don't get everything we want in one go now that
we support pushing objects in pieces. Remove this check entirely, since
we don't have a good error handling case anyway.
We need to preserve the order of processing of cap release and writeback
messages across handle_client_caps() and process_request_cap_release().
Use a helper with the appropriate condition, and defer the release
processing as needed.
Signed-off-by: Sage Weil <sage@newdream.net>