109605 Commits

Author SHA1 Message Date
Joao Eduardo Luis
3d682c21f6 qa/standalone: exercise osdmon's last epoch clean
Signed-off-by: Joao Eduardo Luis <joao@suse.de>
2020-03-23 14:58:59 +00:00
Joao Eduardo Luis
bd2e5c6275 mon/OSDMonitor: dump last epoch clean info on report
Signed-off-by: Joao Eduardo Luis <joao@suse.de>
2020-03-23 14:58:38 +00:00
Sage Weil
de20c7bc61 Merge PR #34105 into master
* refs/pull/34105/head:
	Merge PR #34042 into octopus
	Merge PR #33959 into octopus
	Merge PR #34067 into octopus
	mgr/DaemonServer: add explicit check that acting matches for merge
	Merge pull request #34040 from dillaman/wip-44396-partial-fix
	Merge PR #34098 into octopus
	mgr/rook: list rgw services
	mgr/rook: tolerate timestamps that are None
	mgr/orch: add 'subcluster' property to RGWSpec
	mgr/rook: do not create radosgw pools
	mgr/rook: refactor apply/add for rgw
	Merge PR #34082 into octopus
	Merge PR #34068 into octopus
	cephadm: relabel /etc/ganesha mount
	Merge PR #34046 into octopus
	Merge PR #34092 into octopus
	Merge pull request #33719 from ukernel/wip-44416
	rbd-mirror: leader watcher should not cancel get locker if locker is invalid
	rbd-mirror: snapshot sync request needs to check for interruption
	librbd: request exclusive lock when moving to trash
	rbd-mirror: basic integration with sync throttling
	rbd-mirror: don't prematurely finish snapshot replay loop
	rbd-mirror: pass InstanceWatcher to snapshot Replayer
	doc/releases/octopus.rst: add note about ec recovery below min_size
	mgr/cephadm: configure rgw_frontends for rgw service
	cephadm: switch grafana image to the ceph repo
	Merge PR #34034 into octopus
	qa/suites/rados/cephadm/upgrade: update starting version
	Merge PR #33540 into octopus
	Merge PR #34023 into octopus
	Merge PR #34044 into octopus
	Merge PR #34030 into octopus
	doc/orchestrator: update rgw creation
	mgr/cephadm: clean up client.crash.* container_image settings after upgrade
	cephadm: make add-repo --release and --version independent
	cephadm: env over last used
	mgr/orch: accept port and ssl flags to 'apply rgw'
	mgr/orch: 'ceph upgrade ...' -> 'ceph orch upgrade ...'
	cephadm: fall back to default for infer_image
	cephadm: remove outdated check
	cephadm: consolidate default image logic
	remove ceph_test_rados_watch_notify
	python-common/ceph/deployment/service_spec: add ssl to RGWSpec
	cephadm: only infer image for shell, run, inspect-image, pull, ceph-volume
	mgr/test_orchestrator: fix service filtering when using dummy data
	mgr/dashboard: fix adding/removing host errors
	mgr/rook: fix 'orch ps' for osds
	qa: fix all the fsx.sh-invoking yaml files to install dependencies
	mds: pass proper MutationImpl::LockOp to Locker::wrlock_start()

Reviewed-by: Kiefer Chang <kiefer.chang@suse.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
2020-03-23 08:24:06 -05:00
Lenz Grimmer
4a4b737cbe
Merge pull request #33741 from ricardoasmarques/iscsi-password-msg
mgr/dashboard: Improve iSCSI CHAP message

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
2020-03-23 13:47:41 +01:00
Lenz Grimmer
b819847556
Merge pull request #34063 from s0nea/wip-dashboard-crush-rule-suite
mgr/dashboard: add crush rule test suite

Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
2020-03-23 13:11:50 +01:00
Lenz Grimmer
e45d96d23e
Merge pull request #34113 from s0nea/wip-dashboard-orch-docu-link
mgr/dashboard: correct Orchestrator documentation link

Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Sebastian Wagner <swagner@suse.com>
2020-03-23 13:07:17 +01:00
Tatjana Dehler
6e91edb287 mgr/dashboard: correct Orchestrator documentation link
Fixes: https://tracker.ceph.com/issues/44708
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
2020-03-23 11:45:53 +01:00
Kefu Chai
5132d9851d
Merge pull request #34104 from tchaikov/crimson-admin-close
crimson/admin: do not reset connected_sock before closing

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
2020-03-23 17:10:40 +08:00
Kefu Chai
5305cfbc43
Merge pull request #33909 from cyx1231st/wip-seastar-msgr-fix-reset
crimson: misc fixes for writes to multiple-osd cluster

Reviewed-by: Kefu Chai <kchai@redhat.com>
2020-03-23 17:05:57 +08:00
Yingxin Cheng
d78cbb3ebd crimson/net: add critical info logs to track and debug racing
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2020-03-23 13:00:23 +08:00
Yingxin Cheng
e202069f09 crimson/net: fix incorrect SocketConnection::print()
The information about SocketConnection::side and
SocketConnection::ephemeral_port is not up-to-date in the log, because
they are not moved with Socket during connection replacement. They are
actually socket-level information.

Also take the chance to reorder Socket members.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2020-03-23 13:00:23 +08:00
Yingxin Cheng
373e16499e crimson/osd: make send_heartbeat() atomic
An item in Heartbeat::peers could be removed or re-added during the
asynchronous operation.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
2020-03-23 13:00:15 +08:00
Kefu Chai
d6ead8b9ad
Merge pull request #32171 from rosinL/wip-ec-isla-aarch64
erasure-code: enable isa-l EC for aarch64 platform

Reviewed-by: Kefu Chai <kchai@redhat.com>
2020-03-23 12:46:39 +08:00
Kefu Chai
1ce152f597 crimson/admin: do not reset connected_sock before closing
* no need to call discard_result(), as `output_stream::close()` already
  returns an empty future<>
* free the connected socket after the background task finishes

We should not free the connected socket before the promise referencing it
is fulfilled. Otherwise ASan reports errors like:

==287182==ERROR: AddressSanitizer: heap-use-after-free on address 0x611000019aa0 at pc 0x55e2ae2de882 bp 0x7fff7e2bf080 sp 0x7fff7e2bf078
READ of size 8 at 0x611000019aa0 thread T0
    #0 0x55e2ae2de881 in seastar::reactor_backend_aio::await_events(int, __sigset_t const*) ../src/seastar/src/core/reactor_backend.cc:396
    #1 0x55e2ae2dfb59 in seastar::reactor_backend_aio::reap_kernel_completions() ../src/seastar/src/core/reactor_backend.cc:428
    #2 0x55e2adbea397 in seastar::reactor::reap_kernel_completions_pollfn::poll() (/var/ssd/ceph/build/bin/crimson-osd+0x155e9397)
    #3 0x55e2adaec6d0 in seastar::reactor::poll_once() ../src/seastar/src/core/reactor.cc:2789
    #4 0x55e2adae7cf7 in operator() ../src/seastar/src/core/reactor.cc:2687
    #5 0x55e2adb7c595 in __invoke_impl<bool, seastar::reactor::run()::<lambda()>&> /usr/include/c++/10/bits/invoke.h:60
    #6 0x55e2adb699b0 in __invoke_r<bool, seastar::reactor::run()::<lambda()>&> /usr/include/c++/10/bits/invoke.h:113
    #7 0x55e2adb50222 in _M_invoke /usr/include/c++/10/bits/std_function.h:291
    #8 0x55e2adc2ba00 in std::function<bool ()>::operator()() const /usr/include/c++/10/bits/std_function.h:622
    #9 0x55e2adaea491 in seastar::reactor::run() ../src/seastar/src/core/reactor.cc:2713
    #10 0x55e2ad98f1c7 in seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) ../src/seastar/src/core/app-template.cc:199
    #11 0x55e2a9e57538 in main ../src/crimson/osd/main.cc:148
    #12 0x7fae7f20de0a in __libc_start_main ../csu/libc-start.c:308
    #13 0x55e2a9d431e9 in _start (/var/ssd/ceph/build/bin/crimson-osd+0x117421e9)

0x611000019aa0 is located 96 bytes inside of 240-byte region [0x611000019a40,0x611000019b30)
freed by thread T0 here:
    #0 0x7fae80a4e487 in operator delete(void*, unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.6+0xac487)
    #1 0x55e2ae302a0a in seastar::aio_pollable_fd_state::~aio_pollable_fd_state() ../src/seastar/src/core/reactor_backend.cc:458
    #2 0x55e2ae2e1059 in seastar::reactor_backend_aio::forget(seastar::pollable_fd_state&) ../src/seastar/src/core/reactor_backend.cc:524
    #3 0x55e2adab9b9a in seastar::pollable_fd_state::forget() ../src/seastar/src/core/reactor.cc:1396
    #4 0x55e2adab9d05 in seastar::intrusive_ptr_release(seastar::pollable_fd_state*) ../src/seastar/src/core/reactor.cc:1401
    #5 0x55e2ace1b72b in boost::intrusive_ptr<seastar::pollable_fd_state>::~intrusive_ptr() /opt/ceph/include/boost/smart_ptr/intrusive_ptr.hpp:98
    #6 0x55e2ace115a5 in seastar::pollable_fd::~pollable_fd() ../src/seastar/include/seastar/core/internal/pollable_fd.hh:109
    #7 0x55e2ae0ed35c in seastar::net::posix_server_socket_impl::~posix_server_socket_impl() ../src/seastar/include/seastar/net/posix-stack.hh:161
    #8 0x55e2ae0ed3cf in seastar::net::posix_server_socket_impl::~posix_server_socket_impl() ../src/seastar/include/seastar/net/posix-stack.hh:161
    #9 0x55e2ae0ed943 in std::default_delete<seastar::net::api_v2::server_socket_impl>::operator()(seastar::net::api_v2::server_socket_impl*) const /usr/include/c++/10/bits/unique_ptr.h:81
    #10 0x55e2ae0db357 in std::unique_ptr<seastar::net::api_v2::server_socket_impl, std::default_delete<seastar::net::api_v2::server_socket_impl> >::~unique_ptr() /usr/include/c++/10/bits/unique_ptr.h:357
    #11 0x55e2ae1438b7 in seastar::api_v2::server_socket::~server_socket() ../src/seastar/src/net/stack.cc:195
    #12 0x55e2aa1c7656 in std::_Optional_payload_base<seastar::api_v2::server_socket>::_M_destroy() /usr/include/c++/10/optional:260
    #13 0x55e2aa16c84b in std::_Optional_payload_base<seastar::api_v2::server_socket>::_M_reset() /usr/include/c++/10/optional:280
    #14 0x55e2ac24b2b7 in std::_Optional_base_impl<seastar::api_v2::server_socket, std::_Optional_base<seastar::api_v2::server_socket, false, false> >::_M_reset() /usr/include/c++/10/optional:432
    #15 0x55e2ac23f37b in std::optional<seastar::api_v2::server_socket>::reset() /usr/include/c++/10/optional:975
    #16 0x55e2ac21a2e7 in crimson::admin::AdminSocket::stop() ../src/crimson/admin/admin_socket.cc:265
    #17 0x55e2aa099825 in operator() ../src/crimson/osd/osd.cc:450
    #18 0x55e2aa0d4e3e in apply ../src/seastar/include/seastar/core/apply.hh:36

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-03-23 10:44:20 +08:00
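
The ordering described in 1ce152f597 (release the socket only after the background task has finished) can be illustrated with a small sketch. This is not the crimson code: it uses Python's asyncio rather than Seastar, and the class AdminSocketSketch, its listener attribute, and its log list are hypothetical names invented for the illustration.

```python
import asyncio


class AdminSocketSketch:
    """Hypothetical stand-in for an admin socket with a background task."""

    def __init__(self):
        self.listener = object()   # stands in for the listening socket
        self.log = []
        self._stop_requested = None
        self._task = None

    async def _serve(self):
        # The background task keeps referencing the listener until it ends.
        await self._stop_requested.wait()
        if self.listener is None:
            raise RuntimeError("listener released while task still running")
        self.log.append("task finished")

    async def start(self):
        self._stop_requested = asyncio.Event()
        self._task = asyncio.create_task(self._serve())

    async def stop(self):
        self._stop_requested.set()
        await self._task           # wait for the background task first
        self.listener = None       # only then release the socket
        self.log.append("socket released")
```

Swapping the last two steps of stop() would release the listener while the task may still be using it, which is the kind of ordering bug the ASan report above points at.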
Sage Weil
243cbd6224 Merge PR #34042 into octopus
* refs/pull/34042/head:
	mgr/rook: list rgw services
	mgr/rook: tolerate timestamps that are None
	mgr/orch: add 'subcluster' property to RGWSpec
	mgr/rook: do not create radosgw pools
	mgr/rook: refactor apply/add for rgw
	mgr/cephadm: configure rgw_frontends for rgw service
	mgr/orch: accept port and ssl flags to 'apply rgw'
	python-common/ceph/deployment/service_spec: add ssl to RGWSpec
	mgr/rook: fix 'orch ps' for osds

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
2020-03-22 18:32:11 -05:00
Sage Weil
2740349122 Merge PR #33959 into octopus
* refs/pull/33959/head:
	qa: fix all the fsx.sh-invoking yaml files to install dependencies

Reviewed-by: Sage Weil <sage@redhat.com>
2020-03-22 10:56:31 -05:00
Sage Weil
3b28477f0d Merge PR #34067 into octopus
* refs/pull/34067/head:
	mgr/DaemonServer: add explicit check that acting matches for merge

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
2020-03-22 10:55:54 -05:00
Kefu Chai
ec362f4499
Merge pull request #34071 from badone/wip-docker-test-helper-use-podman-by-default
tests: Use podman if available

Reviewed-by: Kefu Chai <kchai@redhat.com>
2020-03-22 10:41:23 +08:00
Kefu Chai
961834c3b1
Merge pull request #34048 from tchaikov/wip-test-docker-fc31
tests: update Dockerfile to support fc-31

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
2020-03-22 10:40:16 +08:00
Brad Hubbard
a1e8f61cb7 tests: Use podman if available
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
2020-03-22 09:07:50 +10:00
Sage Weil
1700d18158 mgr/DaemonServer: add explicit check that acting matches for merge
Add an explicit check that the PG acting for the source and target
match before merging.

Fixes: https://tracker.ceph.com/issues/44684
Signed-off-by: Sage Weil <sage@redhat.com>
2020-03-21 14:17:30 -05:00
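
A minimal sketch of the kind of check 1700d18158 describes, assuming each PG is represented as a dict with an 'acting' list of OSD ids. The function names and the data layout are hypothetical, and the order-sensitive comparison is an assumption; the actual check lives in mgr/DaemonServer.

```python
def acting_matches(source_acting, target_acting):
    """True when both PGs list the same OSDs in the same order (assumed)."""
    return list(source_acting) == list(target_acting)


def can_merge(source_pg, target_pg):
    """Refuse to merge unless the acting sets of source and target match.

    source_pg and target_pg are hypothetical dicts with an 'acting' key.
    """
    return acting_matches(source_pg["acting"], target_pg["acting"])
```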
Mykola Golub
5d09c481a4
Merge pull request #34040 from dillaman/wip-44396-partial-fix
rbd-mirror: snapshot-based mirroring should use image sync throttler

Reviewed-by: Mykola Golub <mgolub@suse.com>
2020-03-21 10:22:45 +02:00
Kefu Chai
71f6db5f6b
Merge pull request #34066 from mgfritch/cephadm-mon-b-test
qa/workunits/cephadm/test_cephadm.sh: fix mon.b failure

Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2020-03-21 14:34:28 +08:00
Kefu Chai
a791177764
Merge pull request #34022 from ifed01/wip-ifed-fix-leak-in-expand
os/bluestore: fix extent leak after main device expand.

Reviewed-by: Adam Kupczyk <akucpzyk@redhat.com>
2020-03-21 14:32:15 +08:00
Kefu Chai
d071132987
Merge pull request #33883 from dragonylffly/wip-fix-comments
msg/async: fix log information

Reviewed-by: Kefu Chai <kchai@redhat.com>
2020-03-21 14:31:16 +08:00
Kefu Chai
11b8e974a9
Merge pull request #33869 from mgfritch/cephadm-osd-create-test
qa/workunits/cephadm/test_cephadm.sh: move osd test to ceph-volume

Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
2020-03-21 14:30:23 +08:00
Kefu Chai
fc5a41119d
Merge pull request #34097 from adamemerson/wip-boost-use-valgrind-fix
cmake: Don't enable BOOST_USE_VALGRIND when not requested

Reviewed-by: Kefu Chai <kchai@redhat.com>
2020-03-21 14:29:22 +08:00
Kefu Chai
b0dca75a59
Merge pull request #34056 from xiexingguo/wip-44662
qa/*/osd-markdown.sh: propagate map to osd before testing its reaction

Reviewed-by: Neha Ojha <nojha@redhat.com>
2020-03-21 14:27:51 +08:00
Kefu Chai
25ac152841
Merge pull request #33796 from adamemerson/wip-using-namespace-common
Build the target 'common' without relying on using namespace in headers

Reviewed-by: Kefu Chai <kchai@redhat.com>
2020-03-21 10:33:16 +08:00
Kefu Chai
f617e10612
Merge pull request #33903 from tchaikov/wip-rados-object-locator
tools/rados: use object-locator in user-visible outputs

Reviewed-by: Neha Ojha <nojha@redhat.com>
2020-03-21 10:30:08 +08:00
Sage Weil
e2f3e6062c Merge PR #34098 into octopus
* refs/pull/34098/head:
	cephadm: relabel /etc/ganesha mount

Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Sebastian Wagner <swagner@suse.com>
2020-03-20 21:15:23 -05:00
Kefu Chai
3cac20f31a
Merge pull request #33976 from tchaikov/wip-build-doc-on-darwin
admin/build-doc, pybind/*/setup.py: support Darwin

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
2020-03-21 09:27:09 +08:00
Xie Xingguo
ec404a9a9e
Merge pull request #34070 from bangmingcheng/wip-doc-ceph-chenbm
doc: fix a spelling error at /doc/radosgw/dynamicresharding.rst

Reviewed-by: Kefu Chai <kchai@redhat.com>
2020-03-21 09:01:29 +08:00
Adam C. Emerson
647819c632 cmake: Don't enable BOOST_USE_VALGRIND when not requested
We were adding the define without support in the library if
WITH_BOOST_VALGRIND was turned off.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2020-03-20 20:40:39 -04:00
Sage Weil
818b8583c8 mgr/rook: list rgw services
Signed-off-by: Sage Weil <sage@redhat.com>
2020-03-20 17:11:24 -04:00
Sage Weil
15a106c3b0 mgr/rook: tolerate timestamps that are None
Signed-off-by: Sage Weil <sage@redhat.com>
2020-03-20 17:03:57 -04:00
Sage Weil
66de37d565 mgr/orch: add 'subcluster' property to RGWSpec
Signed-off-by: Sage Weil <sage@redhat.com>
2020-03-20 16:40:12 -04:00
Sage Weil
22cea7eb0e mgr/rook: do not create radosgw pools
First, we don't know how big they should be or what they should look like.
The caller should already know that, and/or radosgw can create the pools
itself.

This depends on https://github.com/rook/rook/pull/5058

Signed-off-by: Sage Weil <sage@redhat.com>
2020-03-20 16:40:12 -04:00
Sage Weil
0580297aff mgr/rook: refactor apply/add for rgw
A few caveats here:

- enforce that realm == zone, since that is all rook does at the moment.
- we force a (bad!) pool configuration, since rook requires that these
  be present (instead of allowing radosgw or the caller to create the pools)

Signed-off-by: Sage Weil <sage@redhat.com>
2020-03-20 16:40:12 -04:00
Sage Weil
74b0212d90 Merge PR #34082 into octopus
* refs/pull/34082/head:
	cephadm: switch grafana image to the ceph repo

Reviewed-by: Michael Fritch <mfritch@suse.com>
2020-03-20 15:35:17 -05:00
Sage Weil
6c9e4e2192 Merge PR #34068 into octopus
* refs/pull/34068/head:
	mgr/cephadm: clean up client.crash.* container_image settings after upgrade

Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Sebastian Wagner <swagner@suse.com>
2020-03-20 15:31:32 -05:00
Sage Weil
b7857818f8 cephadm: relabel /etc/ganesha mount
Fixes: https://tracker.ceph.com/issues/44701
Signed-off-by: Sage Weil <sage@redhat.com>
2020-03-20 15:30:11 -05:00
Sage Weil
1bc2853d2f Merge PR #34046 into octopus
* refs/pull/34046/head:
	qa/suites/rados/cephadm/upgrade: update starting version
	mgr/orch: 'ceph upgrade ...' -> 'ceph orch upgrade ...'

Reviewed-by: Sebastian Wagner <swagner@suse.com>
2020-03-20 14:50:42 -05:00
Sage Weil
7c6defc9f0 Merge PR #34092 into octopus
* refs/pull/34092/head:
	doc/releases/octopus.rst: add note about ec recovery below min_size

Reviewed-by: Sage Weil <sage@redhat.com>
2020-03-20 13:14:25 -05:00
Gregory Farnum
22673102c2
Merge pull request #33719 from ukernel/wip-44416
mds: pass proper MutationImpl::LockOp to Locker::wrlock_start()

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
2020-03-20 10:25:15 -07:00
Jason Dillaman
a7cc4ab05a rbd-mirror: leader watcher should not cancel get locker if locker is invalid
When a new leader acquires the lock, it will send out a lock acquired
notification along with periodic heartbeats. The get locker will attempt to
run immediately, but if a heartbeat arrives before it executes, the heartbeat
will cancel the timer and reschedule it for the future. This process repeats
for each periodic heartbeat and the locker is never re-read from the OSD.

This is an issue only for namespace replayers due to the delayed fashion in
which the leader instance id is retrieved.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2020-03-20 13:18:14 -04:00
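
The starvation described in a7cc4ab05a can be reproduced with a small discrete-time simulation. This is an illustrative sketch, not the rbd-mirror code: the tick-based model, the function name, and all parameters are assumptions, including the assumption that the rescheduling timeout is longer than the heartbeat interval.

```python
def get_locker_run_time(interval, timeout, horizon,
                        locker_valid, cancel_when_invalid):
    """Return the tick at which 'get locker' runs, or None if it never
    runs within the simulated horizon.

    interval: ticks between heartbeats (first heartbeat at tick 0)
    timeout: how far ahead a heartbeat reschedules the timer
    cancel_when_invalid: True models the old behaviour, where a heartbeat
        reschedules the timer even though the locker is invalid
    """
    deadline = 1  # "run immediately": first attempt is due at tick 1
    for t in range(0, horizon, interval):
        if deadline <= t:
            return deadline  # the timer fired before this heartbeat
        if locker_valid or cancel_when_invalid:
            deadline = t + timeout  # heartbeat cancels and reschedules
    return deadline if deadline <= horizon else None
```

With interval=5, timeout=30 and horizon=100, the old behaviour (cancel_when_invalid=True with an invalid locker) keeps pushing the deadline out and returns None, while the fixed behaviour lets the first attempt run at tick 1.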
Jason Dillaman
336a16bcaf rbd-mirror: snapshot sync request needs to check for interruption
If the sync request was locally canceled, we need to resume the paused
shut down logic instead of just notifying the image replayer state
machine of the change -- since it had already requested a shut down and
will not re-request it.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2020-03-20 13:18:14 -04:00
Jason Dillaman
ae726336d2 librbd: request exclusive lock when moving to trash
Even if the image is in-use, moving it to the trash does not
remove any data. This also solves a race between snapshot-based
mirroring shutting down and being able to move a mirrored image
to the trash.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2020-03-20 13:18:14 -04:00
Jason Dillaman
470ce1c0d7 rbd-mirror: basic integration with sync throttling
Snapshot-based mirroring did not have any throttling to prevent
too many concurrent syncs from running. Since each sync might need
to iterate over every object of an image, that could potentially
put an extreme burden on the remote cluster.

A future PR will add a more intelligent throttle based on the actual
number of objects that need to be scanned.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2020-03-20 13:18:14 -04:00
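
The idea behind 470ce1c0d7, capping the number of syncs that run at once, can be sketched with a counting semaphore. This is a generic illustration and not rbd-mirror's image sync throttler: SyncThrottle, sync_all, and the max_concurrent parameter are hypothetical names, and threads stand in for whatever concurrency the real daemon uses.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SyncThrottle:
    """Caps how many syncs may run at the same time (hypothetical helper)."""

    def __init__(self, max_concurrent):
        self._slots = threading.Semaphore(max_concurrent)
        self._lock = threading.Lock()
        self.running = 0
        self.peak = 0  # highest number of syncs observed running at once

    def run(self, sync_fn, image):
        with self._slots:  # blocks while max_concurrent syncs are running
            with self._lock:
                self.running += 1
                self.peak = max(self.peak, self.running)
            try:
                return sync_fn(image)
            finally:
                with self._lock:
                    self.running -= 1


def sync_all(images, sync_fn, max_concurrent=2):
    """Run sync_fn for every image, never more than max_concurrent at once."""
    throttle = SyncThrottle(max_concurrent)
    with ThreadPoolExecutor(max_workers=len(images) or 1) as pool:
        results = list(pool.map(lambda img: throttle.run(sync_fn, img), images))
    return results, throttle.peak
```

A fixed cap like this treats every sync as equally expensive; the commit message notes that a later change is meant to weigh syncs by the number of objects to scan instead.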
Jason Dillaman
9970e49ab5 rbd-mirror: don't prematurely finish snapshot replay loop
The unlink step was being incorrectly skipped if a state machine
shut down was requested.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2020-03-20 13:18:13 -04:00