mds: ignore fragment_notify when dft state doesn't match

In particular, if there is a resolve in there somewhere, we may have found
out about this refragment from the src because they send resolve messages
to all nodes (to resolve ambiguous migrations).  If that's the case we
can ignore the message.

Fixes crash like

2011-04-28 14:30:31.179106 7fe72e325710 -- 10.0.1.252:6805/22158 <== mds2 10.0.1.252:6803/25635 548 ==== fragment_notify(300000000b4#00* 1) v1 ==== 17+0+0 (2192211443 0 0) 0x2c9ec00 con 0x323d140
2011-04-28 14:30:31.179116 7fe72e325710 mds1.cache handle_fragment_notify fragment_notify(300000000b4#00* 1) v1 from mds2
2011-04-28 14:30:31.179149 7fe72e325710 mds1.cache adjust_dir_fragments 00* 1 on [inode 300000000b4 [...2,head] /syn.4114.0/dir.0/dir.0/dir.0/dir.0/dir.0/dir.5/dir.5/ auth{0=1,2=1} fragtree_t(*^2 00*^1) v188 ap=1 f(v0 m2011-04-28 14:23:59.074510 7=7+0) n(v2 rc2011-04-28 14:23:59.074510 15=14+1) (idft mix->lock g=0,2 dirty) (inest mix dirty) (ifile excl dirty) (ixattr excl) (iversion lock) caps={4114=pAsLsXsxFsx/-@1},l=4114(-1) | dirtyscattered dirfrag caps replicated dirty authpin 0x32bec70]
2011-04-28 14:30:31.179182 7fe72e325710 mds1.cache adjust_dir_fragments 00* bits 1 srcfrags 0x3080860,0x378da50 on [inode 300000000b4 [...2,head] /syn.4114.0/dir.0/dir.0/dir.0/dir.0/dir.0/dir.5/dir.5/ auth{0=1,2=1} fragtree_t(*^2 00*^1) v188 ap=1 f(v0 m2011-04-28 14:23:59.074510 7=7+0) n(v2 rc2011-04-28 14:23:59.074510 15=14+1) (idft mix->lock g=0,2 dirty) (inest mix dirty) (ifile excl dirty) (ixattr excl) (iversion lock) caps={4114=pAsLsXsxFsx/-@1},l=4114(-1) | dirtyscattered dirfrag caps replicated dirty authpin 0x32bec70]
2011-04-28 14:30:31.179218 7fe72e325710 mds1.cache  new fragtree is fragtree_t(*^2 00*^1)
mds/MDCache.cc: In function 'void MDCache::adjust_dir_fragments(CInode*, std::list<CDir*, std::allocator<CDir*> >&, frag_t, int, std::list<CDir*, std::allocator<CDir*> >&, std::list<Context*, std::allocator<Context*> >&, bool)', in thread '0x7fe72e325710'
mds/MDCache.cc: 9254: FAILED assert(srcfrags.size() == 1)
 ceph version 0.27-165-gaf908f8 (commit:af908f82924a67be3aeb2767eaa05ba04c145f42)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x53) [0xa5775e]
 2: (MDCache::adjust_dir_fragments(CInode*, std::list<CDir*, std::allocator<CDir*> >&, frag_t, int, std::list<CDir*, std::allocator<CDir*> >&, std::list<Context*, std::allocator<Context*> >&, bool)+0x2dd) [0x888bbf]
 3: (MDCache::adjust_dir_fragments(CInode*, frag_t, int, std::list<CDir*, std::allocator<CDir*> >&, std::list<Context*, std::allocator<Context*> >&, bool)+0x13d) [0x88817d]
 4: (MDCache::handle_fragment_notify(MMDSFragmentNotify*)+0x199) [0x88bac5]
 5: (MDCache::dispatch(Message*)+0x124) [0x8765ea]
 6: (MDS::handle_deferrable_message(Message*)+0x1f5) [0x77a607]
 7: (MDS::_dispatch(Message*)+0x784) [0x77ba90]

Signed-off-by: Sage Weil <sage@newdream.net>
This commit is contained in:
Sage Weil 2011-04-28 15:17:18 -07:00
parent 1c58f8069b
commit e9fac67fe1

View File

@ -9710,11 +9710,21 @@ void MDCache::handle_fragment_notify(MMDSFragmentNotify *notify)
CInode *diri = get_inode(notify->get_ino());
if (diri) {
list<Context*> waiters;
frag_t base = notify->get_basefrag();
int bits = notify->get_bits();
if ((bits < 0 && diri->dirfragtree.is_leaf(base)) ||
(bits > 0 && !diri->dirfragtree.is_leaf(base))) {
dout(10) << " dft " << diri->dirfragtree << " state doesn't match " << base << " by " << bits
<< ", must have found out during resolve/rejoin? ignoring. " << *diri << dendl;
notify->put();
return;
}
// refragment
list<Context*> waiters;
list<CDir*> resultfrags;
adjust_dir_fragments(diri, notify->get_basefrag(), notify->get_bits(),
adjust_dir_fragments(diri, base, bits,
resultfrags, waiters, false);
if (g_conf.mds_debug_frag)
diri->verify_dirfrags();