mkcephfs creates the osd data directory automatically, but it doesn't create
a directory for the osd journal file.
If your configuration puts the journal in a directory separate from the osd
data directory, like:
osd data = /osd/osd$id
osd journal = /journal/osd$id
then mkcephfs fails with a "mount failed to open journal
/journal/osd0/journal: No such file or directory" error.
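mkcephfs itself is a shell script, so the fix there amounts to a mkdir -p
before the journal is created. The self-contained C++ sketch below shows the
same idea (the path and error handling are illustrative, not the actual
change):

#include <filesystem>
#include <iostream>
#include <system_error>

int main() {
    // Hypothetical journal path matching the example configuration above.
    std::filesystem::path journal = "/journal/osd0/journal";

    // create_directories() behaves like mkdir -p: it is a no-op when the
    // directory already exists, which is exactly what mkcephfs needs here.
    std::error_code ec;
    std::filesystem::create_directories(journal.parent_path(), ec);
    if (ec) {
        std::cerr << "failed to create " << journal.parent_path()
                  << ": " << ec.message() << "\n";
        return 1;
    }
    return 0;
}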
Signed-off-by: CC Lien <cc_lien@tcloudcomputing.com>
Otherwise get_snaps() can fail when we start recovery:
#0 0x00007fa037625f55 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00007fa037628d90 in *__GI_abort () at abort.c:88
#2 0x00007fa03761f07a in *__GI___assert_fail (assertion=0x9f3d81 "oldparent", file=<value optimized out>, line=170, function=0x9f4680 "void SnapRealm::build_snap_set(std::set<snapid_t, std::less<snapid_t>, std::allocator<snapid_t> >&, snapid_t&, snapid_t&, snapid_t&, snapid_t, snapid_t)") at assert.c:78
#3 0x00000000008f7656 in SnapRealm::build_snap_set (this=0x222a300, s=..., max_seq=..., max_last_created=..., max_last_destroyed=..., first=..., last=...) at mds/snap.cc:170
#4 0x00000000008f7e8c in SnapRealm::check_cache (this=0x222a300) at mds/snap.cc:194
#5 0x00000000008f892a in SnapRealm::get_snaps (this=0x222a300) at mds/snap.cc:209
#6 0x00000000007f2c85 in MDCache::queue_file_recover (this=0x2202a00, in=0x7fa0340f5450) at mds/MDCache.cc:4398
#7 0x0000000000865011 in Locker::file_recover (this=0x21fe850, lock=0x7fa0340f59b0) at mds/Locker.cc:3437
#8 0x00000000007e5899 in MDCache::start_files_to_recover (this=0x2202a00, recover_q=..., check_q=...) at mds/MDCache.cc:4503
#9 0x00000000007e887e in MDCache::rejoin_gather_finish (this=0x2202a00) at mds/MDCache.cc:3904
#10 0x00000000007ed6cf in MDCache::handle_cache_rejoin_strong (this=0x2202a00, strong=0x7fa030025440) at mds/MDCache.cc:3618
#11 0x00000000007ed84a in MDCache::handle_cache_rejoin (this=0x2202a00, m=0x7fa030025440) at mds/MDCache.cc:3063
#12 0x00000000007fade6 in MDCache::dispatch (this=0x2202a00, m=0x7fa030025440) at mds/MDCache.cc:5668
#13 0x0000000000735313 in MDS::_dispatch (this=0x22014d0, m=0x7fa030025440) at mds/MDS.cc:1390
#14 0x00000000007372a3 in MDS::ms_dispatch (this=0x22014d0, m=0x7fa030025440) at mds/MDS.cc:1295
#15 0x0000000000728b97 in Messenger::ms_deliver_dispatch(Message*) ()
#16 0x0000000000716c5e in SimpleMessenger::dispatch_entry (this=0x2202350) at msg/SimpleMessenger.cc:332
#17 0x00000000007119c7 in SimpleMessenger::DispatchThread::entry (this=0x2202760) at msg/SimpleMessenger.h:494
#18 0x000000000071f4e7 in Thread::_entry_func (arg=0x2202760) at ./common/Thread.h:39
#19 0x00007fa03849673a in start_thread (arg=<value optimized out>) at pthread_create.c:300
#20 0x00007fa0376bf6dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
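The assert fires because build_snap_set() expects every past parent realm to
already be open when it walks the ancestry. A self-contained model of that
ordering constraint (simplified types and names, not the actual fix), just
to show why get_snaps() must not run before the parent realms are loaded:

#include <cassert>
#include <set>

struct RealmModel {
    RealmModel* parent = nullptr;
    bool open = false;            // has this realm's metadata been loaded?
    std::set<unsigned> snaps;

    // Analogous to build_snap_set(): merging the ancestry's snaps only
    // works if every parent realm has already been opened.
    std::set<unsigned> get_snaps() const {
        std::set<unsigned> out = snaps;
        for (const RealmModel* p = parent; p; p = p->parent) {
            assert(p->open);      // roughly the 'oldparent' assert above
            out.insert(p->snaps.begin(), p->snaps.end());
        }
        return out;
    }
};

int main() {
    RealmModel root, child;
    root.open = true;
    child.parent = &root;
    child.get_snaps();            // safe only because root is open
    return 0;
}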
Signed-off-by: Sage Weil <sage@newdream.net>
We can't fuss with lock state in the finish method because we already
encoded the old state to the new auth, and we are now just a replica.
We do still want to relax the lock state to be more replica-friendly,
though, so do that in the encode_export_inode method.
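A self-contained sketch of that ordering (the types and names are
illustrative, not the Ceph migrator code): the relaxation has to happen in
the encode step, because by finish time we are only a replica:

#include <cstdint>
#include <vector>

enum class LockState : uint8_t { LOCK, SYNC };  // SYNC: replica-friendly

struct InodeModel {
    LockState lock = LockState::LOCK;
    bool is_auth = true;
};

// Encode the inode for the new auth, then relax our local lock state.
void encode_export_inode(InodeModel& in, std::vector<uint8_t>& bl) {
    bl.push_back(static_cast<uint8_t>(in.lock));  // ship the *old* state
    in.lock = LockState::SYNC;  // relax now, while we are still auth
}

// By the time the export finishes we are a replica: no lock fiddling.
void export_finish(InodeModel& in) {
    in.is_auth = false;
}

int main() {
    InodeModel in;
    std::vector<uint8_t> bl;
    encode_export_inode(in, bl);
    export_finish(in);
    return 0;
}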
This was probably broken by d5574993 (that commit fixed a similar problem).
The rejoin_ack initializes replica state properly, so we can't go changing
it now. I'm not sure why this code was resetting the state to LOCK, since
that's clearly not allowed.
Print a debug message when check_max_size is a no-op so that this is easier
to spot next time.
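A minimal sketch of the kind of message (the function body, names, and
signature are illustrative, not the actual MDS code):

#include <cstdint>
#include <iostream>

// Returns true if a new max size needs to be issued; the change is to
// log the no-op path instead of returning silently.
bool check_max_size(uint64_t wanted, uint64_t current_max) {
    if (wanted <= current_max) {
        std::clog << "check_max_size no-op: wanted " << wanted
                  << " <= current max " << current_max << "\n";
        return false;
    }
    // ... would update and journal the new max size here ...
    return true;
}

int main() {
    check_max_size(100, 4096);  // prints the no-op message
    return 0;
}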
The reconnect will infer some client caps, which will affect what lock
states we want. If we're not replicated, fine; just pick something good.
Otherwise, call try_eval() and go through the proper channels.
This _might_ be the source of #165...
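A sketch of that decision (simplified, illustrative types; try_eval here is
a stub standing in for the real lock evaluation):

struct Lock { int state = 0; };

struct InodeModel {
    Lock filelock;
    bool replicated = false;
};

// Stand-in for the real lock evaluation ("the proper channels").
void try_eval(InodeModel&) {}

void pick_lock_state_after_reconnect(InodeModel& in, int wanted) {
    if (!in.replicated)
        in.filelock.state = wanted;  // no replicas: just pick something good
    else
        try_eval(in);                // replicas exist: let lock logic decide
}

int main() {
    InodeModel in;
    pick_lock_state_after_reconnect(in, 1);
    return 0;
}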
This is intended to mitigate a livelock issue with traversing to snapped
metadata. The client specifies all snap requests relative to a non-snap
inode. The traversal through the snapped portion of the namespace will
normally happen on the auth node, but the actual target may be on another
node that does not have that portion of the namespace. To avoid indefinite
request ping-pong, the mds will begin to discover and replicate the snapped
path components if the request has been retried.
This doesn't perform optimally, but it will at least work.
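A self-contained sketch of the retry rule (all names illustrative): forward
on the first pass, but discover and replicate missing snapped components
once the request comes back, so two nodes can't bounce it forever:

#include <string>
#include <vector>

enum class Traverse { DONE, FORWARD, DISCOVER };

struct Request {
    std::vector<std::string> path;  // snapped path, relative to a non-snap inode
    int retries = 0;
};

// Stubs standing in for real cache lookups and MDS discovery.
bool have_locally(const std::string&) { return false; }
void discover_and_replicate(const std::string&) {}

Traverse traverse_snapped_path(Request& req) {
    for (const std::string& c : req.path) {
        if (have_locally(c))
            continue;
        if (req.retries == 0)
            return Traverse::FORWARD;  // first try: send it to the auth mds
        discover_and_replicate(c);     // retried: pull the component here
        return Traverse::DISCOVER;     // caller re-runs the traversal
    }
    return Traverse::DONE;
}

int main() {
    Request req{{"dir", ".snap", "old"}, 1};
    traverse_snapped_path(req);
    return 0;
}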