Commit Graph

121118 Commits

Author SHA1 Message Date
Sage Weil
4bcf9c3422 Merge PR #40218 into master
* refs/pull/40218/head:
	cephadm: make default image the daily master build

Reviewed-by: Michael Fritch <mfritch@suse.com>
2021-03-19 10:21:20 -04:00
Kefu Chai
bb70e94dd7
Merge pull request #40232 from tchaikov/wip-rgw-drop-unused-var
rgw/rgw_zone: drop unused variable

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2021-03-19 22:05:55 +08:00
Kefu Chai
9521e38450
Merge pull request #40205 from tchaikov/wip-promtool-podman-docker
test: run promtool test without docker on focal

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
2021-03-19 22:03:50 +08:00
Kefu Chai
8c28c79856 cmake: define BOOST_ASIO_USE_TS_EXECUTOR_AS_DEFAULT for rgw tests
otherwise unittest_rbd_mirror does not compile with boost v1.75

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-03-19 20:35:52 +08:00
Kefu Chai
f381aa8bf0 test: run promtool test without docker on ubuntu/focal
before this change, we use docker for running promtools offered by
a docker image, but this is not efficient, and quite a few developers
do not want to use docker for running "make check". this change was
introduced by #39246, the reason was that, in Ceph's CI process, we
are using Ubuntu/Bionic for running "make check" jobs, but prometheus
packaged by Bionic does not offer the "test rules" command. so, to
address the problem, we are using the "dnanexus/promtool:2.9.2" docker image
for verifying monitoring/prometheus/alerts/test_alerts.yml.

after this change, we use prometheus packaged by debian derivatives
instead of pulling a docker image.

* debian/control: add prometheus as a "make check" dependency
* install-deps.sh: partially revert
  53a5816ded, as we don't need to
  pull docker or start docker service for using promtool anymore.
* cmake: check if promtool is capable of running "test rules"
  command, bail out if it is not.

see also: https://tracker.ceph.com/issues/49653

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-03-19 20:35:51 +08:00
Kefu Chai
1cc6174a3a install-deps.sh: install boost 1.75 on focal
we bump boost on a regular basis. let's take the opportunity of moving to
focal to use boost v1.75.

v1.73 was used before this change. since both boost 1.75 and boost 1.73
install some files at the same places, we need to remove boost 1.73
before installing boost 1.75.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-03-19 20:35:51 +08:00
Kefu Chai
33d2964ae2 cmake: adapt FindBoost.cmake to our needs
the vanilla FindBoost.cmake pulled from cmake has a couple of assumptions
which do not hold in our environment. so address them case by case.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-03-19 20:35:51 +08:00
Kefu Chai
3aaedc8fdc cmake: add 1.75 to known versions
sync with
507710438d/Modules/FindBoost.cmake

for v1.75 support

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-03-19 20:35:51 +08:00
Kefu Chai
f75a36fe9a install-deps.sh: install libzbd on focal
WITH_ZBD is enabled for testing the build of zbd bluestore backend, and
we plan to migrate to Ubuntu/Focal for testing "make check", so we need to
install libzbd when the distro version is focal.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-03-19 20:35:51 +08:00
Sage Weil
a5ca3aa0f0 Merge PR #40200 into master
* refs/pull/40200/head:
	mgr/cephadm: clean up misc messages
	mgr/cephadm/configcheck: do not spam info every minute

Reviewed-by: Adam King <adking@redhat.com>
2021-03-19 08:31:56 -04:00
Sage Weil
6b43b2bfe6 Merge PR #40223 into master
* refs/pull/40223/head:
	cephadm: prevent podman from breaking socket.getfqdn()

Reviewed-by: Daniel Pivonka <dpivonka@redhat.com>
2021-03-19 08:31:24 -04:00
Kefu Chai
b3805887c4
Merge pull request #40236 from tchaikov/wip-cbt-perf
script/run-cbt.sh: set kernel.perf_event_paranoid for running perf

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-03-19 20:10:33 +08:00
Kefu Chai
b37ef49f22 script/run-make.sh: quote targets with double quote
in
ceph-build/ceph-perf-pull-requests/config/definitions/ceph-perf-pull-requests.yml,
we pass "vstart-base crimson-osd" as the targets argument, but the
build() function in ceph/src/script/run-make.sh fails to quote them, so
they are expanded into two arguments of `test -n`. hence it breaks like

src/script/run-make.sh: line 124: test: vstart-base: binary operator expected
make will run with option(s) -j40
Unknown argument vstart-base
Unknown argument crimson-osd

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-03-19 19:25:25 +08:00
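
The word-splitting problem described in b37ef49f22 can be sketched as follows. This is an illustration in Python rather than the shell code from run-make.sh: shlex.split stands in for the shell's field splitting of an unquoted expansion, and the variable name `targets` is an assumption made for the example.

```python
import shlex

# The value passed as the targets argument, per the commit message.
targets = "vstart-base crimson-osd"

# Unquoted expansion (test -n $targets): the shell splits the value on
# whitespace, so `test` receives two operands after -n.
unquoted_argv = ["test", "-n"] + shlex.split(targets)

# Quoted expansion (test -n "$targets"): the value stays a single operand.
quoted_argv = ["test", "-n", targets]

print(unquoted_argv)
print(quoted_argv)
```

With two operands after -n, `test` tries to read the second one as a binary operator, which matches the "binary operator expected" error quoted in the commit.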
Kefu Chai
151719590d script/run-cbt.sh: set kernel.perf_event_paranoid for running perf
Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-03-19 19:04:03 +08:00
Kefu Chai
053ad3ff5b
Merge pull request #40233 from tchaikov/wip-make-check-aio-max
run-make-check.sh: increase fs.aio-max-nr

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2021-03-19 17:50:16 +08:00
Jenkins Build Slave User
afdafee74c cmake: use --smp 1 --memory 256M to crimson tests
to reduce the resource usage when running tests

there is an exception though, as we want to test test_config.cc with
multiple reactors.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-03-19 16:37:13 +08:00
Kefu Chai
f0510c5fed
Merge pull request #40229 from tchaikov/wip-dashboard-flake8
pybind/mgr/dashboard: bump flake8 to 3.9.0

Reviewed-by: Nizamudeen A <nia@redhat.com>
2021-03-19 16:25:14 +08:00
Kefu Chai
e8fd4b3a13 run-make-check.sh: increase fs.aio-max-nr
without this change the seastar based tests fail on a host with 48 cores,
because the /proc/sys/fs/aio-nr used by the tests is greater than
1048576. if run-make-check.sh is used to launch the test, the default
job number is `$(nproc) / 2`, and the peak number of /proc/sys/fs/aio-nr
when running ctest was 3190848 when testing on the 48-core host.

so we need to increase fs.aio-max-nr according to the available cores
on the host.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-03-19 16:24:33 +08:00
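
A rough sketch of the scaling argument in e8fd4b3a13, using only the figures quoted in the commit message (48 cores, nproc / 2 jobs, a peak aio-nr of 3190848, and the 1048576 value that was exceeded). The function names and the 1.5 headroom factor are assumptions for illustration; this is not the formula used in run-make-check.sh.

```python
import os

# Figures quoted in the commit message for the 48-core host.
OBSERVED_CORES = 48
OBSERVED_PEAK_AIO_NR = 3190848
PREVIOUS_LIMIT = 1048576  # the value the tests exceeded, per the commit


def default_jobs(cores: int) -> int:
    """Default job count described in the commit: nproc / 2 (at least 1)."""
    return max(1, cores // 2)


def suggested_aio_max_nr(cores: int, headroom: float = 1.5) -> int:
    """Scale the observed per-job peak to `cores`, with hypothetical headroom."""
    per_job = OBSERVED_PEAK_AIO_NR / default_jobs(OBSERVED_CORES)
    needed = int(per_job * default_jobs(cores) * headroom)
    # Never suggest a value below the limit that was already in place.
    return max(PREVIOUS_LIMIT, needed)


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    print(f"fs.aio-max-nr={suggested_aio_max_nr(cores)}")
```

The point of the sketch is only that the limit has to grow with the core count, because the number of concurrent test jobs does.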
Kefu Chai
1898990422 rgw/rgw_zone: drop unused variable
Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-03-19 14:52:09 +08:00
Kefu Chai
de9a6a4d6c pybind/mgr/dashboard: remove "python_version >= '3'"
remove "python_version >= '3'" from requirements-lint.txt, as we've
dropped the Python2 support.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-03-19 12:24:28 +08:00
Kefu Chai
152964ca36 pybind/mgr/dashboard: bump flake8 to 3.9.0
to address the failure of

ERROR: Cannot install -r requirements-lint.txt (line 2) and -r requirements-lint.txt (line 8) because these package versions have conflicting dependencies.

The conflict is caused by:
    flake8 3.8.4 depends on pycodestyle<2.7.0 and >=2.6.0a1
    autopep8 1.5.6 depends on pycodestyle>=2.7.0

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

Signed-off-by: Kefu Chai <kchai@redhat.com>
2021-03-19 12:14:45 +08:00
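
The conflict quoted in 152964ca36 comes down to two version ranges for pycodestyle that do not overlap. A minimal sketch, modelling each requirement as a half-open range of version tuples; the 2.6.0a1 pre-release bound is simplified to 2.6.0, and the range shown for flake8 3.9.0 is an assumption that is not stated in the commit.

```python
from typing import Optional, Tuple

Version = Tuple[int, ...]
# A requirement is a half-open range [low, high); high=None means unbounded.
Range = Tuple[Version, Optional[Version]]


def intersect(a: Range, b: Range) -> Optional[Range]:
    """Return the overlap of two version ranges, or None if they are disjoint."""
    low = max(a[0], b[0])
    highs = [h for h in (a[1], b[1]) if h is not None]
    high = min(highs) if highs else None
    if high is not None and low >= high:
        return None
    return (low, high)


# Constraints on pycodestyle quoted in the pip error.
flake8_3_8_4: Range = ((2, 6, 0), (2, 7, 0))  # >=2.6.0a1, <2.7.0 (simplified)
autopep8_1_5_6: Range = ((2, 7, 0), None)     # >=2.7.0

# Assumption: flake8 3.9.0 accepts pycodestyle 2.7.x.
flake8_3_9_0: Range = ((2, 7, 0), (2, 8, 0))

print(intersect(flake8_3_8_4, autopep8_1_5_6))
print(intersect(flake8_3_9_0, autopep8_1_5_6))
```

The first pair has no common version, which is why pip refuses to install; under the stated assumption the second pair does.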
Neha Ojha
da7a6fae9e
Merge pull request #40227 from neha-ojha/wip-message-cap-val
qa/suites/rados/perf: set osd client message cap to 5000

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2021-03-18 17:28:01 -07:00
Neha Ojha
4f746f2655
Merge pull request #40185 from ronen-fr/wip-ronenf-extra-scrub-assert
osd: remove a ceph_assert() from a legitimate path

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Sam Just <sjust@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-03-18 14:54:37 -07:00
Neha Ojha
fb8b4e9727 qa/suites/rados/perf: set osd client message cap to 5000
Related to https://tracker.ceph.com/issues/49894
Signed-off-by: Neha Ojha <nojha@redhat.com>
2021-03-18 21:28:52 +00:00
Sage Weil
0461b061d5 Merge PR #40048 into master
* refs/pull/40048/head:
	mgr/cephadm: stop conflicting daemon when deploying to a specific port
	mgr/cephadm: make DaemonPlacement print nicer
	mgr/cephadm: fix --force remove comment
	mgr/cephadm/schedule: choose an IP from a subnet list
	mgr/cephadm: rgw: clean up config and config-key values on removal
	mgr/cephadm: rgw: drop .crt extension when storing cert in config-key
	mgr/cephadm/services: allow beast/civetweb to bind to a particular IP
	python-common: add 'networks' property to ServiceSpec
	mgr/cephadm/schedule: match placement ip only combination with port

Reviewed-by: Sebastian Wagner <swagner@suse.com>
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
2021-03-18 16:11:38 -04:00
Yuval Lifshitz
b903b2e011
Merge pull request #39139 from TRYTOBE8TME/wip-rgw-bucket-tests-separation-new
Wip rgw bucket tests separation new
2021-03-18 20:33:00 +02:00
Sage Weil
cfc1f914ce cephadm: prevent podman from breaking socket.getfqdn()
socket.getfqdn() will return the reverse lookup for 127.0.1.1, which is
the last item listed for that IP in /etc/hosts.  Podman, by default, will
append the container name (ceph-$fsid-$name) to that line, which is not
a valid hostname, and not what we want the dashboard to use for the URI
it advertises in the service map.

Pass --no-hosts to podman to disable this.

Docker does not appear to modify /etc/hosts by default--or, more
importantly, does not add the container name there.

Explicitly instruct podman (and docker) to add a

Fixes: https://tracker.ceph.com/issues/49890
Signed-off-by: Sage Weil <sage@newdream.net>
2021-03-18 14:26:48 -04:00
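
A sketch of the engine-specific handling described in cfc1f914ce: --no-hosts is passed to podman only, since the commit notes that docker does not add the container name to /etc/hosts. The function name, the --rm flag, and the image argument are illustrative assumptions, not cephadm's actual code.

```python
from typing import List


def container_run_args(engine: str, image: str) -> List[str]:
    """Build a hypothetical `run` argv; only podman gets --no-hosts."""
    args = [engine, "run", "--rm"]
    if engine == "podman":
        # Per the commit: pass --no-hosts so podman does not add the
        # container name to /etc/hosts and confuse socket.getfqdn().
        args.append("--no-hosts")
    args.append(image)
    return args


print(container_run_args("podman", "example/image:tag"))
print(container_run_args("docker", "example/image:tag"))
```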
Mykola Golub
2b21735498
Merge pull request #40199 from dillaman/wip-rbd-lockdep
test: ignore failures to force-enable lockdep

Reviewed-by: Mykola Golub <mgolub@suse.com>
2021-03-18 18:46:13 +02:00
Mykola Golub
b5a8e96b96
Merge pull request #40194 from dillaman/wip-49848
test/pybind/rbd: fixed functional change in encryption API

Reviewed-by: Mykola Golub <mgolub@suse.com>
2021-03-18 18:44:53 +02:00
Neha Ojha
2f1cb79620
Merge pull request #40161 from sseshasa/wip-fix-wait-for-clean
qa/tasks: Add additional wait_for_clean() check in lost_unfound tasks.

Reviewed-by: Neha Ojha <nojha@redhat.com>
2021-03-18 09:24:31 -07:00
Sage Weil
61bdbc2779 cephadm: make default image the daily master build
Signed-off-by: Sage Weil <sage@newdream.net>
2021-03-18 10:26:36 -05:00
Ronen Friedman
437456ecf9 osd: remove a ceph_assert() from a legitimate path
on_replica_init() might be legitimately called twice,
if the replica was waiting for updates to complete
before servicing the request.

Fixes: https://tracker.ceph.com/issues/49867

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2021-03-18 17:13:42 +02:00
Patrick Donnelly
25bc7023f0
Merge PR #40207 into master
* refs/pull/40207/head:
	doc: max_maps -> max_caps

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-18 06:46:03 -07:00
zdover23
f79a9d438e
Merge pull request #40187 from ideepika/wip-tracing-intial-doc
dev/developer_guide: add jaegertracing initial developer documentation

Reviewed-by: Zac Dover <zac.dover@gmail.com>
2021-03-18 21:45:39 +10:00
Deepika Upadhyay
6fd0165610 doc/dev/developer_guide: add jaegertracing initial developer documentation
Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
2021-03-18 16:59:44 +05:30
Sridhar Seshasayee
88df47230b qa/tasks: Add additional wait_for_clean() check in lost_unfound tasks.
At the end of the lost_unfound tests add an additional wait_for_clean()
check to ensure that recoveries get enough time to complete before
proceeding and avoid failures down the line. For example, a failure like
"Scrubbing terminated -- not all pgs were active and clean." is because
recoveries on the PGs did not get sufficient time to complete even though
they were bound to eventually complete.

Fixes: https://tracker.ceph.com/issues/49844
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2021-03-18 13:03:41 +05:30
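
The idea behind the extra wait_for_clean() check in 88df47230b is to give recovery time to finish before the next step runs. A generic polling sketch of that pattern; `wait_until` and its parameters are hypothetical and this is not the wait_for_clean() implementation used by the qa tasks.

```python
import time
from typing import Callable


def wait_until(is_clean: Callable[[], bool], timeout: float,
               interval: float = 1.0) -> bool:
    """Poll is_clean() until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if is_clean():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would pass a predicate that reports whether all PGs are active and clean, and treat a False return as a test failure instead of proceeding to scrub.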
Dan van der Ster
8d5608f695 doc: max_maps -> max_caps
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
2021-03-18 08:05:49 +01:00
Casey Bodley
29f4bbb5ee qa/rgw: notifications suite runs single job
pin to the beast frontend, default bluestore, replicated pools, and run
against a random distro

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2021-03-18 11:06:40 +05:30
Kefu Chai
01a7ecaba2
Merge pull request #40163 from ktdreyer/resource-agents-noarch
rpm: ceph-resource-agents package is noarch

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2021-03-18 11:02:22 +08:00
Patrick Donnelly
822789547e
Merge PR #40058 into master
* refs/pull/40058/head:
	doc: mds cap acquisition readdir throttle documentation

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-17 19:41:09 -07:00
Patrick Donnelly
14a2501f4b
Merge PR #40193 into master
* refs/pull/40193/head:
	ceph-debug-docker: podman build doesn't accept input via stdin

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-17 19:31:43 -07:00
Brad Hubbard
cba65e6ea4
Merge pull request #31514 from simon-rock/simon_work_fou
osd:modify conf, timeout & suicide timeout, of workqueue at runtime to av…

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
2021-03-18 12:30:26 +10:00
Sage Weil
a2b7587e04 mgr/cephadm: stop conflicting daemon when deploying to a specific port
If we are deploying a daemon to bind to a specific port and there is
an existing daemon we are removing that also binds to that port, stop
it first, unless the two daemons bind to different IPs.

This resolves the case where daemons bind to * and we redeploy with a
subnet to bind to.  It would eventually converge before, but would
throw a bind error in the process and take longer.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-03-17 17:45:41 -04:00
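
The rule stated in a2b7587e04 can be sketched as a small predicate: a daemon being removed must stop first when it shares a port with the new daemon, except when both bind to different specific IPs. `Binding` and `must_stop_first` are hypothetical names for this example, with ip=None standing for a bind to *.

```python
from typing import NamedTuple, Optional


class Binding(NamedTuple):
    """Hypothetical model of where a daemon listens; ip=None means '*'."""
    port: int
    ip: Optional[str] = None


def must_stop_first(existing: Binding, new: Binding) -> bool:
    """True if the outgoing daemon has to stop before the new one can bind."""
    if existing.port != new.port:
        return False
    # Same port: only two distinct, specific IPs can coexist.
    both_specific = existing.ip is not None and new.ip is not None
    return not (both_specific and existing.ip != new.ip)


# The case from the commit: an existing daemon bound to * is redeployed
# onto a specific IP from a subnet.
print(must_stop_first(Binding(8080), Binding(8080, "10.0.0.5")))
```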
Sage Weil
98fa727cad mgr/cephadm: make DaemonPlacement print nicer
'host(ip:port)' or 'host(*:port)' so we can show it to a user.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-03-17 17:45:41 -04:00
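
The output format named in 98fa727cad, 'host(ip:port)' or 'host(*:port)', can be sketched with a small type. `Placement` and its field names are assumptions for illustration and do not reproduce the real DaemonPlacement class.

```python
from typing import NamedTuple, Optional


class Placement(NamedTuple):
    """Hypothetical stand-in for DaemonPlacement; field names are assumed."""
    hostname: str
    port: int
    ip: Optional[str] = None

    def __str__(self) -> str:
        # 'host(ip:port)' when an IP is set, 'host(*:port)' otherwise.
        return f"{self.hostname}({self.ip or '*'}:{self.port})"


print(Placement("host1", 80))
print(Placement("host1", 80, "10.0.0.1"))
```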
Samuel Just
1af1a7cf69
Merge pull request #39911 from cyx1231st/wip-seastore-onode-tree-fix-cache
crimson/onode-staged-tree: fix tree_cursor_t::Cursor to be aware of extent duplication

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
2021-03-17 13:43:32 -07:00
Sage Weil
f45f6ee4f6 Merge PR #40160 into master
* refs/pull/40160/head:
	qa/suites/rados/cephadm/orchestrator_cli: random-distro$ -> 0-random-distro$
	qa/suites/rados/cephadm/smoke-roleless: distro -> 0-distro
	qa/distros/podman: install kubic once per host, in parallel
	qa/suites/fs/multiclient: use clients: not all: for pexec

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2021-03-17 15:50:50 -04:00
Sage Weil
f8c32b0fcc mgr/cephadm: clean up misc messages
- join list with ' '
- key, not keyring
- -ing, not ': '

Signed-off-by: Sage Weil <sage@newdream.net>
2021-03-17 15:49:47 -04:00
Sage Weil
b828e627d6 mgr/cephadm/configcheck: do not spam info every minute
It doesn't make sense to spam INF every minute.  Reducing this to DBG means
it'll never be seen.  Just remove it.

Signed-off-by: Sage Weil <sage@newdream.net>
2021-03-17 15:44:08 -04:00
Jason Dillaman
92522c624b
Merge pull request #39915 from CongMinYin/fix-vm-io-hang
librbd/cache/pwl: set max size of continuous data

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Jianpeng Ma <jianpeng.ma@intel.com>
2021-03-17 15:35:58 -04:00
Jason Dillaman
bdc1178bd8 test: ignore failures to force-enable lockdep
PR #40062 tweaked the behavior of lockdep to compile it out
of the code entirely for release builds. This fixes several
gtests where lockdep was force-enabled.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
2021-03-17 15:29:37 -04:00