mirror of https://github.com/ceph/ceph (synced 2025-02-23 19:17:37 +00:00)
crimson/os: synchronize producers with consumers in AlienStore's queues.
Some time ago we replaced the single, `boost::lockfree`-based queue in `ThreadPool` with the in-house, lockish `ShardedWorkQueue` vector. Unfortunately, pushing into such a queue isn't synchronized with consuming from it -- the former happens without locking the `mutex`. As the underlying primitive behind `ShardedWorkQueue::pending` is a plain `std::deque`, it's unsafe to operate that way in a multi-threaded environment. Indeed, weird-looking crashes have been spotted at Sepia:

```
(virtualenv) rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-06-21_14:49:36-rados-master-distro-basic-smithi/6182668$ less ./remote/smithi196/log/ceph-osd.7.log.gz
...
 0# 0x000055862FD67ADF in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007FB22CF36B20 in /lib64/libpthread.so.0
 4# 0x00005586357540E4 in ceph-osd
 5# 0x00007FB22CF36B20 in /lib64/libpthread.so.0
 6# pthread_cond_timedwait in /lib64/libpthread.so.0
 7# crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) in ceph-osd
 8# 0x00005586313E303B in ceph-osd
 9# 0x00007FB22CC51BA3 in /lib64/libstdc++.so.6
10# 0x00007FB22CF2C14A in /lib64/libpthread.so.0
11# clone in /lib64/libc.so.6
Fault at location: 0x18
daemon-helper: command crashed with signal 11
```

This fix introduces synchronization to the `push_back()` method of `ShardedWorkQueue`. The side effect is that it may stall the reactor. Therefore, a follow-up change that switches to e.g. `boost::lockfree` is expected.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
parent a5fd875665
commit 809c5d10a3
@@ -90,10 +90,14 @@ public:
      return work_item;
    }
    void stop() {
      std::unique_lock lock{mutex};
      stopping = true;
      cond.notify_all();
    }
    void push_back(WorkItem* work_item) {
      // XXX: oops, we can stall the reactor!
      // TODO: switch to boost::lockfree.
      std::unique_lock lock{mutex};
      pending.push_back(work_item);
      cond.notify_one();
    }
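
For illustration only, the locking pattern in the hunk above can be sketched as a small standalone program. This is not the AlienStore code: `SketchQueue`, `run_demo()` and the `payload` field are hypothetical names, and the `pop_front()` body is an assumed shape, since the diff shows only its tail. The point is that the producer takes the same `mutex` as the consumer before touching the `std::deque`.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in; the real WorkItem type is not shown in the diff.
struct WorkItem {
  int payload = 0;
};

// Illustrative analogue of one work-queue shard (name is an assumption).
class SketchQueue {
public:
  // Producer side: lock the shared mutex before mutating the deque.
  // Explicit template argument so this also builds before C++17.
  void push_back(WorkItem* work_item) {
    assert(work_item != nullptr);  // null items are not expected in this sketch
    std::unique_lock<std::mutex> lock{mutex};
    pending.push_back(work_item);
    cond.notify_one();
  }

  // Consumer side: wait up to max_wait; nullptr means timeout or stop.
  WorkItem* pop_front(std::chrono::milliseconds max_wait) {
    std::unique_lock<std::mutex> lock{mutex};
    cond.wait_for(lock, max_wait,
                  [this] { return !pending.empty() || stopping; });
    if (pending.empty()) {
      return nullptr;
    }
    WorkItem* work_item = pending.front();
    pending.pop_front();
    return work_item;
  }

  void stop() {
    std::unique_lock<std::mutex> lock{mutex};
    stopping = true;
    cond.notify_all();
  }

private:
  std::mutex mutex;
  std::condition_variable cond;
  std::deque<WorkItem*> pending;
  bool stopping = false;
};

// Several producer threads push while one consumer drains the queue.
// Returns the number of items the consumer received.
std::size_t run_demo(std::size_t n_producers, std::size_t per_producer) {
  SketchQueue queue;
  std::vector<WorkItem> items(n_producers * per_producer);
  std::size_t consumed = 0;  // written only by the consumer thread

  std::thread consumer([&] {
    while (consumed < items.size()) {
      if (queue.pop_front(std::chrono::milliseconds(10)) != nullptr) {
        ++consumed;
      }
    }
  });

  std::vector<std::thread> producers;
  for (std::size_t p = 0; p < n_producers; ++p) {
    producers.emplace_back([&, p] {
      for (std::size_t i = 0; i < per_producer; ++i) {
        queue.push_back(&items[p * per_producer + i]);
      }
    });
  }
  for (auto& t : producers) {
    t.join();
  }
  consumer.join();
  queue.stop();
  return consumed;
}
```

Calling `run_demo(4, 1000)` returns 4000 once every pushed item has been consumed. The sketch also shows the trade-off named in the commit message: a caller of `push_back()` blocks whenever another thread holds the `mutex`, which is how the reactor can stall.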