We were dropping gather state on the floor, which resulted in
general confusion and errors like this:
10.03.16 14:17:17.622280 mds0.locker handle_simple_lock lock(a=lockack dn 1000000019c/NEWS1_1B.PRN snap head) on (dn xlock x=1 by 0x307c050) [dentry #1/clients/client2/~dmtmp/WORDPRO/NEWS1_1B.PRN [2,head] auth{1=1} (dn xlock x=1 by 0x307c050) v=90 inode=0x7f2fe90a7980 | nref=5 0x7f2fe0349b80]
mds/Locker.cc: In function 'void Locker::handle_simple_lock(SimpleLock*, MLock*)':
mds/Locker.cc:2424: FAILED assert(lock->get_state() == 7 || lock->get_state() == 12)
1: (Locker::handle_simple_lock(SimpleLock*, MLock*)+0x374) [0x8607bc]
2: (Locker::handle_lock(MLock*)+0x18b) [0x861b91]
3: (Locker::dispatch(Message*)+0x41) [0x86470b]
4: (MDS::_dispatch(Message*)+0x1b42) [0x72ce68]
5: (MDS::ms_dispatch(Message*)+0x2f) [0x72e1e9]
6: (Messenger::ms_deliver_dispatch(Message*)+0x55) [0x72086b]
7: (SimpleMessenger::dispatch_entry()+0x4f4) [0x70d50e]
8: (SimpleMessenger::DispatchThread::entry()+0x29) [0x7095bd]
9: (Thread::_entry_func(void*)+0x20) [0x71a9e1]
10: /lib/libpthread.so.0 [0x7f2fe8d6573a]
11: (clone()+0x6d) [0x7f2fe7f906dd]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
We can get unlinked metadata after replay:
create some auth metadata (possibly a whole tree)
export to another mds
other mds deletes it
reimport tree, original link to root is broken by EImportStart
-> entire tree is unlinked
Find any unlinked metadata (that's not a base inode) and remove
it (recursively) after journal replay.
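
A minimal sketch of the cleanup pass described above, assuming a toy cache keyed by integer id. The `Node` and `Cache` types and the `trim_unlinked` name are hypothetical and are not the MDCache API; they only illustrate "find non-base entries with no parent link, then remove each subtree".

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for a cached metadata object.
struct Node {
  int parent;                 // -1 means "no link to a parent"
  bool is_base;               // base inodes legitimately have no parent
  std::vector<int> children;  // ids of child entries
};
using Cache = std::map<int, Node>;

// Remove one entry and everything beneath it.
void remove_recursive(Cache& cache, int id) {
  auto it = cache.find(id);
  if (it == cache.end()) return;
  std::vector<int> kids = it->second.children;  // copy: erase invalidates it
  for (int k : kids) remove_recursive(cache, k);
  cache.erase(id);
}

// After replay: drop every unlinked, non-base subtree. Returns entries removed.
int trim_unlinked(Cache& cache) {
  std::vector<int> roots;
  for (const auto& kv : cache)
    if (kv.second.parent < 0 && !kv.second.is_base) roots.push_back(kv.first);
  std::size_t before = cache.size();
  for (int r : roots) remove_recursive(cache, r);
  return static_cast<int>(before - cache.size());
}
```

Collecting the unlinked roots first and erasing afterwards avoids mutating the map while iterating over it.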
We don't want to index the Log::Entry reqid for CLONE ops because
each CLONE is followed by a real op with the same reqid; that's
the one that should get indexed. Introduce a helper to keep
the logic consistent.
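
A rough sketch of what such a helper could look like. `OpType`, `LogEntry`, `Log` and `reqid_is_indexed()` are illustrative names assumed here, not the actual types or members in the tree; the point is that the "skip CLONE" rule lives in one place.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-ins; not the real log entry types.
enum class OpType { MODIFY, CLONE, REMOVE };

struct LogEntry {
  OpType op;
  uint64_t reqid;    // shared by a CLONE and the real op that follows it
  uint64_t version;
  // Single place that decides whether this entry's reqid gets indexed.
  bool reqid_is_indexed() const { return op != OpType::CLONE; }
};

struct Log {
  std::vector<LogEntry> entries;
  std::map<uint64_t, uint64_t> reqid_to_version;  // reqid -> version of real op

  void add(const LogEntry& e) {
    entries.push_back(e);
    if (e.reqid_is_indexed())
      reqid_to_version[e.reqid] = e.version;
  }
};
```

Every code path that builds or rebuilds the index would call the same predicate, so the CLONE entry never shadows the op that follows it.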
This reverts commit 0c688f94be.
Revert "mon: Use PaxosServiceMessage::caps instead of Session::caps where applicable"
This reverts commit e33e17ec2f.
This needs to be reworked slightly to handle inter-mon communication better.
Include LOCK, MIX, and associated gather states. An EXCL on the
auth (which is not temporary) is a LOCK on the replica, so forward.
(Otherwise we'd need another state.) MIX should forward too, since
it is also stable and the replica can't rdlock.
Include leading states because rdlock waits for WAIT_RD, not
stable.
Not RESOLVE|REJOIN|CLIENTREPLAY|ACTIVE|STOPPING, according to
the current MDSMap.
Fixes problem where the resolve_set is empty, but we send
resolves out, and got != needed because got is a superset.
Previously we had a broken hack that would drop a replicated
null dentry if it got new linkage. We already get an explicit
message if it was unlinked, that unlinks it cleanly. Do the
same for links, and replicate the newly linked inode as
needed. This is much cleaner and more correct.
Specifically, this fixes a problem where a create (link) and
unlink are pipelined by the same client under the same xlock,
so that the previous hack (in the handle_lock handler) never
triggers because the lock state doesn't toggle between the link
and unlink.
send_dentry_link sends current, not projected, linkage
This is more or less equivalent to the linux kernel list_head:
each embedded item struct has only a next and prev pointer. As
long as the same member item is always used, at a fixed offset
from the containing class, we can go from an item to its containing
class.
The offset can either be passed to the list (head) constructor,
or to the begin(), front(), back() members explicitly.
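
To illustrate the offset idea, here is a minimal sketch of an intrusive list in the same spirit. `Item`, `List` and `Inode` are hypothetical names assumed for the example and do not reproduce the real elist interface; the offset is passed to the list constructor and used to step back from the embedded item to its owner.

```cpp
#include <cassert>
#include <cstddef>

// Embedded link: only prev/next, self-linked when not on any list.
struct Item {
  Item* prev;
  Item* next;
  Item() : prev(this), next(this) {}
  bool empty() const { return next == this; }
  void remove() {
    prev->next = next;
    next->prev = prev;
    prev = next = this;
  }
  void insert_before(Item* other) {
    remove();  // leave any list we are currently on
    prev = other->prev;
    next = other;
    other->prev->next = this;
    other->prev = this;
  }
};

template <typename T>
class List {
  Item head_;
  std::size_t offset_;  // byte offset of the embedded Item inside T
  T* owner(Item* i) const {
    return reinterpret_cast<T*>(reinterpret_cast<char*>(i) - offset_);
  }
public:
  explicit List(std::size_t offset) : offset_(offset) {}
  bool empty() const { return head_.empty(); }
  void push_back(Item* i) { i->insert_before(&head_); }
  T* front() { return owner(head_.next); }
  T* back() { return owner(head_.prev); }
  T* pop_front() { Item* i = head_.next; i->remove(); return owner(i); }
};

// Example owner type with one embedded link member.
struct Inode {
  int ino;
  Item dirty_item;
  explicit Inode(int n) : ino(n) {}
};
```

A list of dirty inodes would then be declared as `List<Inode> dirty(offsetof(Inode, dirty_item));`. The caller must not invoke `front()`, `back()` or `pop_front()` on an empty list, since the head is not embedded in any `Inode`.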
Iterator has 3 modes: current (list_for_each), cache_next
(list_for_each_safe), and magic (uses cached next iff current is
empty). Magic will work most of the time, as long as we don't
re-add ourselves to a different list inside the iterator loop.
(Note that if we do, we will iterate up to the other list's
head, not detect that it is a head, and get an invalid pointer and
crash.)
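
The three modes can be sketched roughly as follows. `Node`, `Cursor`, `Mode` and `visit` are names assumed for this example only, and the list here stores values directly rather than using an owner offset, to keep the focus on how the cursor advances.

```cpp
#include <cassert>

// Circular doubly linked node; self-linked ("empty") when not on a list.
struct Node {
  Node* prev;
  Node* next;
  int value;
  explicit Node(int v) : prev(this), next(this), value(v) {}
  bool empty() const { return next == this; }
  void unlink() { prev->next = next; next->prev = prev; prev = next = this; }
  void link_before(Node* other) {
    unlink();
    prev = other->prev; next = other;
    other->prev->next = this; other->prev = this;
  }
};

enum class Mode { CURRENT, CACHE_NEXT, MAGIC };

struct Cursor {
  Node* head;
  Node* cur;
  Node* cached_next;  // next pointer captured before the loop body runs
  Mode mode;
  Cursor(Node* h, Mode m)
    : head(h), cur(h->next), cached_next(cur->next), mode(m) {}
  bool done() const { return cur == head; }
  void advance() {
    if (mode == Mode::CURRENT) cur = cur->next;            // trust live pointer
    else if (mode == Mode::CACHE_NEXT) cur = cached_next;  // trust saved pointer
    else cur = cur->empty() ? cached_next : cur->next;     // MAGIC: pick one
    cached_next = cur->next;
  }
};

// Sum all values, optionally unlinking each node while iterating.
int visit(Node* head, Mode mode, bool remove_each) {
  int sum = 0;
  for (Cursor c(head, mode); !c.done(); c.advance()) {
    sum += c.cur->value;
    if (remove_each) c.cur->unlink();
  }
  return sum;
}
```

In this sketch, CURRENT with `remove_each` set would never terminate, because an unlinked node points at itself; that is the case the other two modes exist for. MAGIC still follows `cur->next` whenever the node is non-empty, which is why re-adding the node to a different list inside the loop leads the cursor into that other list.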
elist: add to makefile
elist: require offset for constructor
elist: fix pop_front/back