* refs/pull/40640/head:
common: send SYSLOG_IDENTIFIER to journald
cephadm: enable log to journald by default
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
* refs/pull/40842/head:
qa: update the ffsb.sh to clone it from git://git.ceph.com/ffsb.git
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
* refs/pull/41239/head:
librbd: use uint64_t instead of size_t for SparseExtent::length
mgr/PyModule: use Py_ssize_t for the PyList index
os/bluestore: print size_t using %xz
client: print int64_t using PRId64
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
* refs/pull/41268/head:
mds: fix possible mds_lock not locked assert
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
libfmt does compile-time format argument validation of the format string
and the argument when the the user-defined literal is used. but the
downside is that the formatter materialize the whole formatted string
into a std::string, before printing them argument into seastar's log buffer
inserter. presumably, the inserter would be more efficient in
comparision to the pre-format approach. so this validation is only
enabled for non NDEBUG build. so it is able to help us to identify
errors like
DEBUG("{} {}", 1, 2, 3)
Signed-off-by: Kefu Chai <kchai@redhat.com>
* refs/pull/41314/head:
qa/tasks/nfs: add test to check if cmds fail on not passing required arguments
mgr/nfs: fix flake8 missing whitespace around parameter equals error
mgr/nfs: annotate _cmd_nfs_* methods return value
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
The `send_message()` method is a high-level facility for
communicating with a monitor. If there is an active conn
available, it sends the message immediately; otherwise
the message is queued. This method assumes the queue is
already drained if the connection is available.
`active_con` is managed by `reopen_session()` where it's
first cleared and then reset after finding new alive mon.
This is followed by draining the `pending_messages` queue
which happens in `on_session_opened()` after the `MAuth`
exchange is finished.
Unfortunately, the path from the `active_con` clearing
to draining the queue is long and divided into multiple
continuations which results in lack of atomicity. When
e.g. `run_command()` interleaves the stages, following
crash happens:
```
INFO 2021-05-07 08:13:43,914 [shard 0] monc - do_auth_single: mon v2:172.21.15.82:6805/34166 => v2:172.21.15.82:3300/0 returns auth_reply(proto 2 0 (0) Success) v1: 0
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3910-g1b18e076/rpm/el8/BUILD/ceph-17.0.0-3910-g1b18e076/src/crimson/mon/MonClient.cc:1034: seastar::future<> crimson::mon::Client::send_message(MessageRef): Assertion `pending_messages.empty()' failed.
Aborting on shard 0.
Backtrace:
0# 0x000055CDE6DB532F in ceph-osd
1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
3# 0x00007FC1BF20BB20 in /lib64/libpthread.so.0
4# gsignal in /lib64/libc.so.6
5# abort in /lib64/libc.so.6
6# 0x00007FC1BD806B09 in /lib64/libc.so.6
7# 0x00007FC1BD814DE6 in /lib64/libc.so.6
8# crimson::mon::Client::send_message(boost::intrusive_ptr<Message>) in ceph-osd
9# crimson::mon::Client::renew_subs() in ceph-osd
10# 0x000055CDE764FB0B in ceph-osd
11# 0x000055CDE10457F0 in ceph-osd
12# 0x000055CDEA0AB88F in ceph-osd
13# 0x000055CDEA0B0DD0 in ceph-osd
14# 0x000055CDEA2689FB in ceph-osd
15# 0x000055CDE9DC0EDA in ceph-osd
16# main in ceph-osd
17# __libc_start_main in /lib64/libc.so.6
18# _start in ceph-osd
```
The problem caused following failure at Sepia:
http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-07_07:41:02-rados-master-distro-basic-smithi/6104549
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
mgr/cephadm: fix missing prometheus alerts
Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Make client requests go to the concurrent pipeline stage "wait_repop" once they
are "submitted" to the underlying objectstore, which means their on-disk order
is guaranteed, so that successive client requests can go into the "process"
pipeline stage.
Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
These two stages are used to provide more parallelism in the pipeline,
while preserving client requests order.
Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
WriteLogCacheEntry gets appended to persist_log_entries before
write_data_pos is updated with the actual media offset. Because
push_back() makes a copy, the updated write_data_pos value never
makes it to media, making recovery impossible.
Fixes: https://tracker.ceph.com/issues/50669
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
first_valid_entry and first_free_entry pointers are read from media
but not actually used: both m_first_valid_entry and m_first_free_entry
get assigned 0 (or garbage). next_log_pos gets the same value as well
meaning that not only no recovery is attempted but the cache also gets
corrupted because DATA_RING_BUFFER_OFFSET is not applied.
Fixes: https://tracker.ceph.com/issues/50669
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
In ssd mode log entries are variable size. Attempting to count and
impose watermarks on the number of log entries is bogus because the
total number of entries it would take to fill the cache to capacity
is also variable and can't be precisely estimated.
Fixes: https://tracker.ceph.com/issues/50669
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
All parameters are integers and none of them are (in-)out, so don't
take them by reference. Additionally num_lanes, num_log_entries and
num_unpublished_reserves don't need to be 64-bit as their respective
fields in AbstractWriteLog are 32-bit.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>