From 33a06847adba56c45da8b3af0256199bdf6623b6 Mon Sep 17 00:00:00 2001
From: Jianpeng Ma
Date: Thu, 29 Jun 2017 06:31:09 +0800
Subject: [PATCH] os/bluestore/BlueFS: clear current log entries before
 dumping all fnodes

While running async log compaction, I hit this bug:

2017-06-28 11:51:42.747315 7f193dd70bc0 -1 /root/ceph/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_replay(bool)' thread 7f193dd70bc0 time 2017-06-28 11:51:42.741868
/root/ceph/src/os/bluestore/BlueFS.cc: 714: FAILED assert(r == q->second->file_map.end())

 ceph version 12.0.3-2327-gc74625e (c74625ebf57d603043f414a83b7a6525264fb6ae) luminous (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x10e) [0x5628ee1f8a0e]
 2: (BlueFS::_replay(bool)+0x3bc3) [0x5628ee18cb13]
 3: (BlueFS::mount()+0x1cf) [0x5628ee18cf0f]
 4: (BlueStore::_open_db(bool)+0xd99) [0x5628ee0af7f9]
 5: (BlueStore::_mount(bool)+0x3da) [0x5628ee0e056a]
 6: (OSD::init()+0x28f) [0x5628edce10bf]
 7: (main()+0x29ca) [0x5628edbf116a]
 8: (__libc_start_main()+0xf5) [0x7f193b2c1f45]
 9: (()+0x493306) [0x5628edc8b306]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Consider this case:

Thread1                              Thread2
_compact_log_async
  _flush_and_sync_log
    lock.unlock()
                                     open_for_write(A)
                                       op_file_update
                                       op_dir_link
    lock.lock()
  _compact_log_dump_metadata
    (contains file A)
  flush
  lock.unlock
                                     op_file_update (alloc new extent)
                                     _flush_and_sync_log

As a result, two log entries carry the same information (the op_dir_link
for file A): one from the compacted metadata dump and one from the pending
log_t. When _replay processes the second op_dir_link, the assert above
fires.

Before dumping all metadata into the compacted log, clear the current log
entries (log_t) to avoid this. The compacted log already contains all of
the metadata, so nothing is lost.

Signed-off-by: Jianpeng Ma
---
 src/os/bluestore/BlueFS.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/os/bluestore/BlueFS.cc b/src/os/bluestore/BlueFS.cc
index f73e27cf603..399dc56a1bd 100644
--- a/src/os/bluestore/BlueFS.cc
+++ b/src/os/bluestore/BlueFS.cc
@@ -1206,6 +1206,8 @@ void BlueFS::_compact_log_async(std::unique_lock& l)
 
   // 2. prepare compacted log
   bluefs_transaction_t t;
+  // avoid recording the same ops twice, in log_t and in _compact_log_dump_metadata.
+  log_t.clear();
   _compact_log_dump_metadata(&t);
 
   // conservative estimate for final encoded size
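
The interleaving described in the commit message can be illustrated in miniature. What follows is a simplified, single-threaded sketch and not BlueFS code: `Op`, `Transaction`, `replay_ok`, and `simulate_compaction` are hypothetical stand-ins, and the thread interleaving is replayed as a fixed sequence of steps. It only aims to show why a pending op_dir_link that is also covered by the metadata dump trips replay, and why clearing the pending entries first avoids that.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for BlueFS log ops (hypothetical; not bluefs_transaction_t).
enum class OpType { FileUpdate, DirLink };
struct Op {
  OpType type;
  std::string name;
};
using Transaction = std::vector<Op>;

// Replays the log in order. Returns false when a DirLink names a file that
// is already linked, which is the condition the real _replay asserts on.
bool replay_ok(const std::vector<Transaction>& log) {
  std::set<std::string> file_map;
  for (const Transaction& t : log) {
    for (const Op& op : t) {
      if (op.type == OpType::DirLink && !file_map.insert(op.name).second) {
        return false;  // duplicate link for the same file
      }
    }
  }
  return true;
}

// Walks through the interleaving from the commit message as fixed steps.
bool simulate_compaction(bool clear_pending_before_dump) {
  std::set<std::string> files;        // in-memory metadata
  Transaction log_t;                  // pending, not yet flushed log entries
  std::vector<Transaction> disk_log;  // what replay will later read

  // Thread2: open_for_write(A) while Thread1 has dropped the lock.
  files.insert("A");
  log_t.push_back({OpType::FileUpdate, "A"});
  log_t.push_back({OpType::DirLink, "A"});

  // Thread1: retakes the lock and dumps all metadata into the compacted log.
  if (clear_pending_before_dump) {
    log_t.clear();  // the fix: the dump below already covers these ops
  }
  Transaction compacted;
  for (const std::string& name : files) {
    compacted.push_back({OpType::FileUpdate, name});
    compacted.push_back({OpType::DirLink, name});
  }
  disk_log.push_back(compacted);

  // Thread2: allocates a new extent, then flushes the pending entries.
  log_t.push_back({OpType::FileUpdate, "A"});
  disk_log.push_back(log_t);
  log_t.clear();

  return replay_ok(disk_log);
}
```

In this model, `simulate_compaction(false)` reports a failed replay because the link for A appears in both the dump and the flushed pending entries, while `simulate_compaction(true)` replays cleanly.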