* refs/pull/41860/head:
qa: log messages when falling back to force/lazy umount
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Some time ago we replaced the single, `boost::lockfree`-based queue
in `ThreadPool` with the in-house, lock-based `ShardedWorkQueue` vector.
Unfortunately, pushing into such a queue isn't synchronized with
consuming from it: the former happens without locking the `mutex`.
As the underlying primitive behind `ShardedWorkQueue::pending` is a
plain `std::deque`, it's unsafe to operate that way in a multi-threaded
environment. Indeed, odd-looking crashes have been spotted at Sepia:
```
(virtualenv) rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-06-21_14:49:36-rados-master-distro-basic-smithi/6182668$ less ./remote/smithi196/log/ceph-osd.7.log.gz
...
0# 0x000055862FD67ADF in ceph-osd
1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
3# 0x00007FB22CF36B20 in /lib64/libpthread.so.0
4# 0x00005586357540E4 in ceph-osd
5# 0x00007FB22CF36B20 in /lib64/libpthread.so.0
6# pthread_cond_timedwait in /lib64/libpthread.so.0
7# crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) in ceph-osd
8# 0x00005586313E303B in ceph-osd
9# 0x00007FB22CC51BA3 in /lib64/libstdc++.so.6
10# 0x00007FB22CF2C14A in /lib64/libpthread.so.0
11# clone in /lib64/libc.so.6
Fault at location: 0x18
daemon-helper: command crashed with signal 11
```
This fix adds synchronization to the `push_back()` method of
`ShardedWorkQueue`. The side effect is that it may stall the reactor.
Therefore, a follow-up change that switches to, e.g., `boost::lockfree`
is expected.
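For illustration, the kind of synchronization described above could look roughly like the sketch below. It is not the actual `ShardedWorkQueue` code: the `LockedShard` name, the `pop_front()` signature and the timeout handling are assumptions made for this example; only `mutex`, `pending` and `push_back()` come from the description.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>

// Hypothetical stand-in for one shard of the work queue (illustrative names).
template <typename T>
class LockedShard {
  std::mutex mutex;
  std::condition_variable cond;
  std::deque<T> pending;  // plain std::deque: not safe without the mutex

public:
  // Producer side: take the same mutex the consumer holds before
  // touching the deque, then wake one waiting consumer.
  void push_back(T item) {
    {
      std::lock_guard<std::mutex> lock{mutex};
      pending.push_back(std::move(item));
    }
    cond.notify_one();
  }

  // Consumer side: wait up to `timeout` for an item; empty on timeout.
  std::optional<T> pop_front(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock{mutex};
    if (!cond.wait_for(lock, timeout, [this] { return !pending.empty(); })) {
      return std::nullopt;
    }
    assert(!pending.empty());
    T item = std::move(pending.front());
    pending.pop_front();
    return item;
  }
};
```

Because the producer now blocks on the mutex whenever a consumer holds it, the calling thread can be delayed, which is the reactor stall mentioned above.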
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
so AvlAllocator can switch from the first-fit mode to best-fit mode
without walking through the whole space map tree. In a
highly fragmented system, iterating over the whole tree could hurt the
performance of a fast storage system a lot.
The idea comes from OpenZFS's metaslab allocator.
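As a rough sketch of the idea, assuming the switch is triggered by a cap on how many extents the first-fit scan may visit: the `FreeSpace` type, its two indexes and the `max_search_count` parameter below are hypothetical names for this example and are not taken from AvlAllocator.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <set>
#include <utility>

// Illustrative only: free extents indexed by offset and by size.
struct FreeSpace {
  std::map<std::uint64_t, std::uint64_t> by_offset;           // offset -> length
  std::set<std::pair<std::uint64_t, std::uint64_t>> by_size;  // (length, offset)

  void add(std::uint64_t offset, std::uint64_t length) {
    by_offset.emplace(offset, length);
    by_size.emplace(length, offset);
  }

  // First-fit: scan in offset order, but give up once max_search_count
  // extents (a hypothetical limit) have been examined.
  std::optional<std::uint64_t> first_fit(std::uint64_t want,
                                         unsigned max_search_count) const {
    unsigned searched = 0;
    for (const auto& [offset, length] : by_offset) {
      if (searched++ >= max_search_count) {
        return std::nullopt;
      }
      if (length >= want) {
        return offset;
      }
    }
    return std::nullopt;
  }

  // Best-fit: smallest extent that is large enough, found in O(log n).
  std::optional<std::uint64_t> best_fit(std::uint64_t want) const {
    auto it = by_size.lower_bound(std::make_pair(want, std::uint64_t{0}));
    if (it == by_size.end()) {
      return std::nullopt;
    }
    return it->second;
  }

  // Try the bounded first-fit scan, then fall back to best-fit.
  std::optional<std::uint64_t> find(std::uint64_t want,
                                    unsigned max_search_count) const {
    assert(want > 0);
    if (auto hit = first_fit(want, max_search_count)) {
      return hit;
    }
    return best_fit(want);
  }
};
```

With many small extents at low offsets, the bounded scan gives up early and the size-ordered lookup answers the request without visiting every extent.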
Signed-off-by: Kefu Chai <kchai@redhat.com>
crimson/onode-staged-tree: improve logs to understand inconsistent load from seastore
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Xuehan Xu <xuxuehan@360.cn>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Add logs to detect corruption when loading nodes. assert() alone is not
informative enough to understand the context.
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Dummy backend is used for unit tests without transactions, so there
should be no copy-on-write behavior.
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
When running ceph-osd mkfs, the block data is sometimes not completely
written by the time the ceph-osd process exits. We need to wait for
the write to complete.
Signed-off-by: Chen Fan <fan.chen@easystack.cn>
Instead of ceph::make_message(), because conn::send() in crimson expects
a std::unique_ptr and not a boost::intrusive_ptr.
Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
if `parallel_for_each_state` is defined as a nested class in errorator,
clang fails to compile it:
../src/crimson/common/errorator.h:716:47: error: no class named 'parallel_for_each_state' in 'errorator<AllowedErrors...>'
friend class errorator<AllowedErrors...>::parallel_for_each_state;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
and a forward declaration does not help, so we have to move it
out of the errorator. To speed up compilation, it lives in
errorator-loop.h, whose name mirrors `include/seastar/core/loop.h`.
We could move `errorator<>::parallel_for_each()` out as well,
as its return type can be deduced from the types of Iterator and Func.
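A heavily simplified sketch of the resulting layout follows. The real classes take more template parameters and do far more; the `int`-holding `future`, the error types and `demo_friend_access()` are made up for this example. It only shows a namespace-scope `parallel_for_each_state` being befriended without a dependent nested name.

```cpp
#include <cassert>

// Declared at namespace scope, before errorator, so it can be befriended.
template <class... AllowedErrors>
class parallel_for_each_state;

template <class... AllowedErrors>
struct errorator {
  class future {
    int value;
    explicit future(int v) : value(v) {}
    // Names a namespace-scope template specialization, not a nested class.
    friend class parallel_for_each_state<AllowedErrors...>;

  public:
    int get() const { return value; }
  };
};

template <class... AllowedErrors>
class parallel_for_each_state {
public:
  // Can call the private constructor only thanks to the friend declaration.
  typename errorator<AllowedErrors...>::future make(int v) const {
    return typename errorator<AllowedErrors...>::future{v};
  }
};

// Hypothetical error types, for the demo only.
struct enoent {};
struct eio {};

inline int demo_friend_access() {
  parallel_for_each_state<enoent, eio> state;
  auto fut = state.make(42);
  assert(fut.get() == 42);
  return fut.get();
}
```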
Signed-off-by: Kefu Chai <kchai@redhat.com>
* refs/pull/41892/head:
client: remove unused include from barrier.cc
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/41723/head:
mds: print the unknown type value
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
* refs/pull/40997/head:
test: add test to verify adding an active peer back to source
pybind/mirroring: disallow adding an active peer back to source
pybind/cephfs: interface to fetch file system id
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/36823/head:
qa: add a test for the `dump cache` command
mds: add a timeout to the `dump cache` command to prevent it from running too long and affecting the service
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
mgr/dashboard: bucket details: show lock retention period only in days
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>