Commit Graph

123173 Commits

Author SHA1 Message Date
Sage Weil
fc6fffc6d5 Merge PR #40640 into master
* refs/pull/40640/head:
	common: send SYSLOG_IDENTIFIER to journald
	cephadm: enable log to journald by default

Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
2021-05-18 09:32:52 -04:00
Rishabh Dave
72478a585c
Merge pull request #39580 from ifed01/wip-ifed-migrate
ceph-volume: implement bluefs volume migration.
2021-05-18 18:20:43 +05:30
Matt Benjamin
1627d6c76a
Merge pull request #41282 from cbodley/wip-rgw-rm-civetweb
rgw: remove the civetweb and fcgi frontends
2021-05-18 08:19:09 -04:00
Kefu Chai
3b653b331b
Merge pull request #41364 from rzarzynski/wip-crimson-monc-pending_messages-assert
crimson/monc: fix send_message() racing with reopen_session().

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-05-18 17:45:28 +08:00
Kefu Chai
49b027da4e
Merge pull request #41366 from tchaikov/wip-crimson-os-debug
crimson/os: use compile-time validation

Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Xuehan Xu <xxhdx1985126@gmail.com>
2021-05-18 17:05:15 +08:00
Ilya Dryomov
5440b9e04b
Merge pull request #41354 from idryomov/wip-rbd-pwl-ssd-recovery
librbd/cache/pwl/ssd: fix some crash recovery issues

Reviewed-by: Yin Congmin <congmin.yin@intel.com>
Reviewed-by: Mahati Chamarthy <mahati.chamarthy@intel.com>
2021-05-18 10:25:42 +02:00
Kefu Chai
0b795da509
Merge pull request #39772 from xxhdx1985126/wip-crimson-client-req-pipeline-parallelism
crimson/osd: optimize crimson-osd's client requests process parallelism

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-05-18 13:44:53 +08:00
Kefu Chai
c270ad48f0
Merge pull request #41362 from Aran85/crimson-diagrams
crimson/seastore: add string_kv_node_layout diagrams

Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
2021-05-18 13:29:13 +08:00
Patrick Donnelly
43b5a39844
Merge PR #40234 into master
* refs/pull/40234/head:
	client: always register callbacks before mount()
	client: move SnapRealm methods to ClientSnapRealm.cc

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-05-17 19:50:45 -07:00
Patrick Donnelly
80855c3c6d
Merge PR #40842 into master
* refs/pull/40842/head:
	qa: update the ffsb.sh to clone it from git://git.ceph.com/ffsb.git

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-05-17 19:49:40 -07:00
Patrick Donnelly
478cbbadc1
Merge PR #41235 into master
* refs/pull/41235/head:
	mds: PurgeQueue.cc fix for 32bit compilation

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-05-17 19:47:57 -07:00
Patrick Donnelly
b15cd57a4e
Merge PR #41239 into master
* refs/pull/41239/head:
	librbd: use uint64_t instead of size_t for SparseExtent::length
	mgr/PyModule: use Py_ssize_t for the PyList index
	os/bluestore: print size_t using %xz
	client: print int64_t using PRId64

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2021-05-17 19:46:56 -07:00
Patrick Donnelly
c9087065f7
Merge PR #41254 into master
* refs/pull/41254/head:
	mds: save the metadata pool id MDSRank class's private member

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-05-17 19:42:54 -07:00
Patrick Donnelly
756a5735c1
Merge PR #41267 into master
* refs/pull/41267/head:
	mds: defer the journal recovered success log

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-05-17 19:40:16 -07:00
Patrick Donnelly
a33f659004
Merge PR #41268 into master
* refs/pull/41268/head:
	mds: fix possible mds_lock not locked assert

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-05-17 19:39:11 -07:00
Kefu Chai
66ee8af4f8
Merge pull request #41341 from tchaikov/wip-dmclock
dmclock: pick up change to fix run_sched_ahead() scheduling issue

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>
2021-05-18 09:20:53 +08:00
Casey Bodley
bf33951219 rgw: building the beast frontend is no longer optional
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2021-05-17 15:00:33 -04:00
Casey Bodley
a335304042 rgw: remove the fcgi frontend
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2021-05-17 15:00:33 -04:00
Casey Bodley
46c0292870 rgw: remove the civetweb frontend from src and qa
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2021-05-17 15:00:31 -04:00
Casey Bodley
a7215d8683
Merge pull request #41262 from cbodley/wip-rgw-civetweb-deprecate
rgw: deprecate the civetweb frontend

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2021-05-17 14:46:50 -04:00
Kefu Chai
c39d64d7bb crimson/os: pass log level to LOG()
instead of passing function name to the underlying macro, pass log
level for better readability.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-05-18 01:38:55 +08:00
Kefu Chai
e56263b057 crimson/os: use compile-time validation
libfmt validates the format string against its arguments at compile
time when the user-defined literal is used. but the downside is that
the formatter materializes the whole formatted string into a
std::string before printing it into seastar's log buffer inserter.
presumably, the inserter would be more efficient in comparison to the
pre-format approach. so this validation is only enabled for non-NDEBUG
builds, where it helps us to identify errors like

DEBUG("{} {}", 1, 2, 3)

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-05-18 01:38:55 +08:00
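
The idea behind the compile-time check in e56263b057 can be illustrated without libfmt. The sketch below is not the crimson macro or libfmt's implementation; `count_placeholders` and `format_matches` are hypothetical helpers that only count plain `{}` placeholders (no `{0}` or `{:x}` forms), assuming a C++17 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: count plain "{}" placeholders in a format string.
// An escaped "{{" is skipped; indexed or formatted fields are not handled.
constexpr std::size_t count_placeholders(std::string_view fmt) {
  std::size_t n = 0;
  for (std::size_t i = 0; i + 1 < fmt.size(); ++i) {
    if (fmt[i] == '{') {
      if (fmt[i + 1] == '{') {
        ++i;  // escaped brace, not a placeholder
      } else if (fmt[i + 1] == '}') {
        ++n;
        ++i;  // step over the closing brace
      }
    }
  }
  return n;
}

// Hypothetical helper: true when the number of placeholders equals the
// number of arguments supplied.
template <typename... Args>
constexpr bool format_matches(std::string_view fmt, const Args&...) {
  return count_placeholders(fmt) == sizeof...(Args);
}

// Both checks are evaluated by the compiler, not at run time.
static_assert(format_matches("{} {}", 1, 2),
              "two placeholders, two arguments");
static_assert(!format_matches("{} {}", 1, 2, 3),
              "the mismatch from the commit message is detected");
```

Because the functions are `constexpr`, a mismatch like the `DEBUG("{} {}", 1, 2, 3)` example can be turned into a build error with `static_assert`, at no run-time cost.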
David Galloway
46cde2ac8a
Merge pull request #41348 from jdurgin/wip-release-notes-fixes
script/ceph-release-notes: work with py3 and remove backport release names from PRs
2021-05-17 12:03:15 -04:00
Patrick Donnelly
ff9de808ab
Merge PR #41314 into master
* refs/pull/41314/head:
	qa/tasks/nfs: add test to check if cmds fail on not passing required arguments
	mgr/nfs: fix flake8 missing whitespace around parameter equals error
	mgr/nfs: annotate _cmd_nfs_* methods return value

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2021-05-17 08:38:41 -07:00
Igor Fedotov
f8def0443d tests/ceph_volume: add UT for bluefs migration stuff
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
2021-05-17 18:26:52 +03:00
Radoslaw Zarzynski
26f205dbea crimson/monc: fix send_message() racing with reopen_session().
The `send_message()` method is a high-level facility for
communicating with a monitor. If there is an active conn
available, it sends the message immediately; otherwise
the message is queued. This method assumes the queue is
already drained if the connection is available.

`active_con` is managed by `reopen_session()` where it's
first cleared and then reset after finding a new alive mon.
This is followed by draining the `pending_messages` queue
which happens in `on_session_opened()` after the `MAuth`
exchange is finished.

Unfortunately, the path from the `active_con` clearing
to draining the queue is long and divided into multiple
continuations, which results in a lack of atomicity. When
e.g. `run_command()` interleaves the stages, the following
crash happens:

```
INFO  2021-05-07 08:13:43,914 [shard 0] monc - do_auth_single: mon v2:172.21.15.82:6805/34166 => v2:172.21.15.82:3300/0 returns auth_reply(proto 2 0 (0) Success) v1: 0
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3910-g1b18e076/rpm/el8/BUILD/ceph-17.0.0-3910-g1b18e076/src/crimson/mon/MonClient.cc:1034: seastar::future<> crimson::mon::Client::send_message(MessageRef): Assertion `pending_messages.empty()' failed.
Aborting on shard 0.
Backtrace:
 0# 0x000055CDE6DB532F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007FC1BF20BB20 in /lib64/libpthread.so.0
 4# gsignal in /lib64/libc.so.6
 5# abort in /lib64/libc.so.6
 6# 0x00007FC1BD806B09 in /lib64/libc.so.6
 7# 0x00007FC1BD814DE6 in /lib64/libc.so.6
 8# crimson::mon::Client::send_message(boost::intrusive_ptr<Message>) in ceph-osd
 9# crimson::mon::Client::renew_subs() in ceph-osd
10# 0x000055CDE764FB0B in ceph-osd
11# 0x000055CDE10457F0 in ceph-osd
12# 0x000055CDEA0AB88F in ceph-osd
13# 0x000055CDEA0B0DD0 in ceph-osd
14# 0x000055CDEA2689FB in ceph-osd
15# 0x000055CDE9DC0EDA in ceph-osd
16# main in ceph-osd
17# __libc_start_main in /lib64/libc.so.6
18# _start in ceph-osd
```

The problem caused the following failure at Sepia:
http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-07_07:41:02-rados-master-distro-basic-smithi/6104549

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-05-17 14:44:36 +00:00
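
The interleaving described in 26f205dbea can be reproduced with a single-threaded toy model. This is a sketch under stated assumptions, not the crimson `MonClient` code: `ToyClient` and its methods are hypothetical, the Seastar continuations are replaced by explicit method calls, and `send_message()` returns `false` at the point where the real method would hit the `pending_messages.empty()` assertion.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical toy model of the monitor client state from the commit message.
struct ToyClient {
  bool active_con = false;                  // stands in for the connection
  std::deque<std::string> pending_messages; // queued while disconnected
  std::vector<std::string> sent;            // messages "sent" to the mon

  // Send immediately if connected, otherwise queue. Returns false where
  // the real code would assert that the queue is already drained.
  bool send_message(const std::string& m) {
    if (active_con) {
      if (!pending_messages.empty()) {
        return false;  // invariant violated
      }
      sent.push_back(m);
    } else {
      pending_messages.push_back(m);
    }
    return true;
  }

  void clear_connection() { active_con = false; }  // reopen: first stage
  void connection_found() { active_con = true; }   // reopen: new mon found
  void on_session_opened() {                       // reopen: drain stage
    while (!pending_messages.empty()) {
      sent.push_back(pending_messages.front());
      pending_messages.pop_front();
    }
  }
};

// A send slips in after the connection is reset but before the drain.
inline bool interleaved_send_breaks_invariant() {
  ToyClient c;
  c.clear_connection();
  c.send_message("queued");   // no connection yet, so it is queued
  c.connection_found();
  return !c.send_message("renew_subs");  // queue not drained yet
}

// The same steps with the drain completed before the next send.
inline bool drained_send_is_safe() {
  ToyClient c;
  c.clear_connection();
  c.send_message("queued");
  c.connection_found();
  c.on_session_opened();
  return c.send_message("renew_subs") && c.sent.size() == 2;
}
```

The model only shows why the window between resetting the connection and draining the queue matters; how the commit closes that window is not stated in the message and is not modelled here.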
Xuehan Xu
0a0848ddfd crimson/osd: make do_osd_ops receive lvalue reference to osd ops vector
otherwise any async execution of lambdas in PG::do_osd_ops_execute() may
reference an outdated osd_op

Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
2021-05-17 21:49:57 +08:00
Sebastian Wagner
4123ef9feb
Merge pull request #40172 from p-se/pse-fix-cephadm-prom-alerts-missing
mgr/cephadm: fix missing prometheus alerts

Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
2021-05-17 14:23:05 +02:00
zdover23
fc17a0836b
Merge pull request #41361 from zdover23/wip-doc-rados-gateway-spelling-embeddding-2021-05-17
doc/radosgw: s/embeddding/embedding/

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-05-17 21:40:42 +10:00
Aran85
f426a11826 crimson/seastore: add string_kv_node_layout diagrams
Signed-off-by: Zengran Zhang <zhangzengran@sangfor.com.cn>
2021-05-17 19:11:17 +08:00
Zac Dover
ebed69f2ad doc/radosgw: s/embeddding/embedding/
res ipsa loquitur

Signed-off-by: Zac Dover <zac.dover@gmail.com>
2021-05-17 21:01:37 +10:00
Kefu Chai
4539cb3c8a
Merge pull request #41360 from mflehmig/patch-1
doc/rados: Fix typo

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-05-17 15:10:35 +08:00
mflehmig
00225dc531 doc/rados: Fix typo
Signed-off-by: Martin Flehmig <martin.flehmig@tu-dresden.de>
2021-05-17 14:42:12 +08:00
Kefu Chai
33e02a9da3
Merge pull request #41315 from adk3798/check-version
mgr/cephadm: check version in upgrade check

Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
2021-05-16 23:55:25 +08:00
Kefu Chai
16ab5ff202
Merge pull request #41328 from jmolmo/osd_replacement_in_fqdn_hosts
mgr/cephadm: Fix OSD replacement in hosts with FQDN host name

Reviewed-by: Sebastian Wagner <swagner@suse.com>
2021-05-16 23:54:14 +08:00
Kefu Chai
3a3f92ff51
Merge pull request #41343 from dsavineau/issue_50717
mgr/cephadm: fix prometheus jinja template

Reviewed-by: Daniel Pivonka <dpivonka@redhat.com>
Reviewed-by: Adam King <adking@redhat.com>
2021-05-16 23:49:18 +08:00
Kefu Chai
cfdcd23bc3
Merge pull request #41306 from liewegas/udpate-isa-l
isa-l: incorporate fix for aarch64 text relocation

Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-05-16 23:47:30 +08:00
Xuehan Xu
f7181ab2f6 crimson/osd: optimize crimson-osd's client requests process parallelism
Make client requests go to the concurrent pipeline stage "wait_repop" once they
are "submitted" to the underlying objectstore, which means their on-disk order
is guaranteed, so that successive client requests can go into the "process"
pipeline stage.

Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
2021-05-16 14:47:56 +08:00
Xuehan Xu
a0eaf67fec crimson/common: add new facilities to interruptible future
Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
2021-05-16 14:47:56 +08:00
Xuehan Xu
0817a7c82d crimson/osd: add two more stages into pg's client request pipeline
These two stages are used to provide more parallelism in the pipeline,
while preserving client requests order.

Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
2021-05-16 14:47:56 +08:00
Yuri Weinstein
68886f16ae
Merge pull request #41342 from yuriw/wip-yuriw-crontab-master
qa/tests: added client-upgrade-nautilus-pacific tests

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
2021-05-15 13:38:17 -07:00
Ilya Dryomov
ab05cc4a8f librbd/cache/pwl/ssd: stronger assert in aio_read_data_blocks()
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-05-15 18:48:22 +02:00
Ilya Dryomov
00363a0f7c librbd/cache/pwl/ssd: rename aio_read_data_block() overload
Rename the overload that deals with multiple data blocks to
aio_read_data_blocks().

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-05-15 18:48:22 +02:00
Ilya Dryomov
fe757401ad librbd/cache/pwl/ssd: persist correct write_data_pos
WriteLogCacheEntry gets appended to persist_log_entries before
write_data_pos is updated with the actual media offset.  Because
push_back() makes a copy, the updated write_data_pos value never
makes it to media, making recovery impossible.

Fixes: https://tracker.ceph.com/issues/50669
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-05-15 18:48:22 +02:00
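
The copy semantics behind fe757401ad can be shown in isolation. The sketch below is a minimal illustration, not the librbd code: `Entry` is a hypothetical stand-in that models only the `write_data_pos` field, and reordering the update before the append is one possible remedy, assumed here for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a log entry; only one field is modelled.
struct Entry {
  std::uint64_t write_data_pos = 0;
};

// Problematic order: push_back() stores a copy, so the later update
// changes only the local object, never the element in the vector.
inline std::uint64_t append_then_update(std::uint64_t media_offset) {
  std::vector<Entry> persist_log_entries;
  Entry e;
  persist_log_entries.push_back(e);
  e.write_data_pos = media_offset;
  return persist_log_entries.back().write_data_pos;
}

// One possible remedy: set the field first, then append the entry.
inline std::uint64_t update_then_append(std::uint64_t media_offset) {
  std::vector<Entry> persist_log_entries;
  Entry e;
  e.write_data_pos = media_offset;
  persist_log_entries.push_back(e);
  return persist_log_entries.back().write_data_pos;
}
```

With an offset of 4096, the first function returns 0 and the second returns 4096, which is the difference between a stale and a correct value reaching the persisted copy.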
Ilya Dryomov
cb9b3afd87 librbd/cache/pwl/ssd: set m_bytes_allocated_cap on recovery
Currently it's set only when a new cache is formatted.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-05-15 18:48:22 +02:00
Ilya Dryomov
ef020b85fb librbd/cache/pwl/ssd: actually use first_{valid,free}_entry on recovery
first_valid_entry and first_free_entry pointers are read from media
but not actually used: both m_first_valid_entry and m_first_free_entry
get assigned 0 (or garbage).  next_log_pos gets the same value as well
meaning that not only no recovery is attempted but the cache also gets
corrupted because DATA_RING_BUFFER_OFFSET is not applied.

Fixes: https://tracker.ceph.com/issues/50669
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-05-15 18:48:22 +02:00
Ilya Dryomov
ea65553b4a librbd/cache/pwl/ssd: don't count log entries
In ssd mode log entries are variable size.  Attempting to count and
impose watermarks on the number of log entries is bogus because the
total number of entries it would take to fill the cache to capacity
is also variable and can't be precisely estimated.

Fixes: https://tracker.ceph.com/issues/50669
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-05-15 18:48:22 +02:00
Ilya Dryomov
74ecc4b76a librbd/cache/pwl: fix AbstractWriteLog::check_allocation() signature
All parameters are integers and none of them are (in-)out, so don't
take them by reference.  Additionally num_lanes, num_log_entries and
num_unpublished_reserves don't need to be 64-bit as their respective
fields in AbstractWriteLog are 32-bit.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-05-15 18:48:22 +02:00
Ilya Dryomov
829ef952d2 librbd/cache/pwl: rename m_log_pool_config_size to m_log_pool_size
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-05-15 18:48:22 +02:00
Ilya Dryomov
820bbecfb1 librbd/cache/pwl: get rid of AbstractWriteLog::m_log_pool_actual_size
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-05-15 18:48:22 +02:00