107397 Commits

Author SHA1 Message Date
xie xingguo
023524a26d osd/PeeringState: restart peering on any previous down acting member coming back
One of our customers wants to verify the data safety of Ceph while scaling
the cluster up, and the test case looks like:
- keep checking the status of a specified pg, whose up set is [1, 2, 3]
- add more osds: up [1, 2, 3] -> up [1, 4, 5], acting = [1, 2, 3], backfill_targets = [4, 5],
  pg is remapped
- stop osd.2: up [1, 4, 5], acting = [1, 3], backfill_targets = [4, 5], pg is undersized
- restart osd.2: acting stays unchanged because osd.2 belongs to neither the current up set nor
  the acting set, leaving the pg pinned as undersized for a long time, until all backfill
  targets complete

It does not pose any critical problem -- we'll end up getting that pg back into active + clean,
except that the long-lived DEGRADED warnings keep bothering our customer, who cares about data
safety more than anything else.

The right way to achieve the above goal is for:

	boost::statechart::result PeeringState::Active::react(const MNotifyRec& notevt)

to check whether the newly booted node could be validly chosen for the acting set and
request a new temp mapping. The new temp mapping would then trigger a real interval change
that will get rid of the DEGRADED warning.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
Signed-off-by: Yan Jun <yan.jun8@zte.com.cn>
2020-02-21 17:52:52 +08:00
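
The decision described in 023524a26d can be illustrated with a small Python model. This is a sketch under stated assumptions, not the PeeringState code: the function name, its parameters, and the "undersized" test are all hypothetical simplifications of Ceph's real acting-set selection.

```python
def needs_new_temp_mapping(notifier, up, acting, prev_down_acting, pool_size):
    """Return True if a rebooted OSD should trigger a new temp mapping request.

    Hypothetical, simplified model; Ceph's real logic weighs much more state.
    """
    if notifier in up or notifier in acting:
        # Already part of the current mapping, nothing extra to request.
        return False
    if notifier not in prev_down_acting:
        # Assumption: only former acting members that went down qualify.
        return False
    # Assumption: the request is only useful while the PG is undersized.
    return len(acting) < pool_size


# Scenario from the commit message: osd.2 restarts while acting = [1, 3].
print(needs_new_temp_mapping(2, up=[1, 4, 5], acting=[1, 3],
                             prev_down_acting={2}, pool_size=3))
```

In this model, only the returning osd.2 qualifies; a backfill target such as osd.4 is already in the up set and is rejected by the first check.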
Jan Fajerski
c406ffa79b
Merge pull request #31978 from jan--f/c-v-batch-no-db-dev-drop
ceph-volume/batch: fail on filtered devices when non-interactive
2020-02-07 14:41:27 +01:00
Sage Weil
f29bfa4f60 Merge PR #33075 into master
* refs/pull/33075/head:
	examples/librados: fix bufferlist::copy() in hello_world.cc.

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2020-02-07 07:16:14 -06:00
Jan Fajerski
3f4065c443
Merge pull request #33112 from jan--f/c-v-lvm-list-regression-31700
ceph-volume: fix regression and improve output in lvm list
2020-02-07 11:30:01 +01:00
Kefu Chai
38d20f69e3
Merge pull request #32985 from sebastian-philipp/mgr-progress-mypy
mgr/progress: Add integration to pybind/mgr/tox.ini

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2020-02-07 15:49:37 +08:00
Patrick Donnelly
7173cae78c
Merge PR #32432 into master
* refs/pull/32432/head:
	mds: Reorganize structure members in snap header

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2020-02-06 19:29:20 -08:00
Sage Weil
33e96ef911 Merge PR #33114 into master
* refs/pull/33114/head:
	cephadm:Fix name argument parsing during image check for non-ceph components

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Michael Fritch <mfritch@suse.com>
2020-02-06 16:12:39 -06:00
Daniel-Pivonka
810b96c910 cephadm:Fix name argument parsing during image check for non-ceph components
Bug in parsing introduced in 97def7c.
args.name may exist but will be None if the flag is not used, so
check the value in addition to checking whether it exists.

Signed-off-by: Daniel-Pivonka <dpivonka@redhat.com>
2020-02-06 15:47:26 -05:00
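
The pitfall described in 810b96c910 can be reproduced with plain argparse. This is a minimal sketch, assuming an optional `--name` flag; the parser and the `name_given` helper are illustrative and are not taken from cephadm.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--name')  # argparse defaults the attribute to None


def name_given(args):
    """True only when the attribute exists and holds a non-empty value."""
    return bool(getattr(args, 'name', None))


without_flag = parser.parse_args([])
with_flag = parser.parse_args(['--name', 'mon.a'])

# The attribute exists in both cases, so an existence check cannot tell
# the two invocations apart; the value check can.
print('name' in without_flag, name_given(without_flag))
print('name' in with_flag, name_given(with_flag))
```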
Sage Weil
24f70b0334 Merge PR #33109 into master
* refs/pull/33109/head:
	cephadm: fix inspect-image

Reviewed-by: Michael Fritch <mfritch@suse.com>
2020-02-06 14:42:46 -06:00
Sage Weil
bd1684c6ad Merge PR #33089 into master
* refs/pull/33089/head:
	cephadm: re-introduce the `podman logs` command

Reviewed-by: Sage Weil <sage@redhat.com>
2020-02-06 14:42:34 -06:00
Sage Weil
0eef82165b Merge PR #33110 into master
* refs/pull/33110/head:
	qa/distros: rhel and centos: whitelist cephadm logrotate selinux denial

Reviewed-by: Boris Ranto <branto@redhat.com>
2020-02-06 13:13:48 -06:00
Jan Fajerski
000bf2ffff ceph-volume: fix various lvm list issues
A single report on a non-lvm device now works.
The format was cleaned up; lvm journal, wal, and db are now reported only once.

Fixes: https://tracker.ceph.com/issues/44009

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
2020-02-06 18:37:31 +01:00
Jan Fajerski
ffe5b5732a ceph-volume: add get_device_lvs to easily retrieve all lvs per device
Also drop the sep argument from get_lvs and siblings, unused.
Introduce LV_CMD_OPTIONS to unify options to lvs.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
2020-02-06 16:47:08 +01:00
Michael Fritch
e939ee1685
cephadm: re-introduce the podman logs command
Fixes: https://tracker.ceph.com/issues/43973
Signed-off-by: Michael Fritch <mfritch@suse.com>
2020-02-06 08:29:44 -07:00
Sage Weil
d64aa0a421 Merge PR #33093 into master
* refs/pull/33093/head:
	build-integration-branch: don't fail on existing branch

Reviewed-by: Neha Ojha <nojha@redhat.com>
2020-02-06 09:22:21 -06:00
Sage Weil
373ca99f83 cephadm: fix inspect-image
This was broken by d8debba782
because the 'images' json output works with podman but not with
docker.  (Also, the inspect command is more explicit and cleaner.)

Signed-off-by: Sage Weil <sage@redhat.com>
2020-02-06 09:18:13 -06:00
Sage Weil
b91636dcfe qa/distros: rhel and centos: whitelist cephadm logrotate selinux denial
This is fixed in RHEL 8.1.1 (and by extension centos/rhel 8.2+).

No fix for el 7 yet

Partially-fixes: https://tracker.ceph.com/issues/43703
Signed-off-by: Sage Weil <sage@redhat.com>
2020-02-06 08:52:52 -06:00
Sage Weil
e6630e6a29 Merge PR #33071 into master
* refs/pull/33071/head:
	mgr/cephadm: remove item from cache when removing

Reviewed-by: Michael Fritch <mfritch@suse.com>
2020-02-06 06:33:36 -06:00
Lenz Grimmer
f4bea66165
Merge pull request #33059 from tspmelo/wip-node-10-18-1
make-dist: Bump Node.js to v10.18.1

Reviewed-by: Nathan Cutler <ncutler@suse.com>
2020-02-06 11:34:56 +00:00
Tatjana Dehler
4515ab32fa
Merge pull request #32546 from votdev/issue_43089_passwd_cmplx_config
mgr/dashboard: Make password policy check configurable

Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
2020-02-06 09:44:48 +01:00
Kefu Chai
9805feeca2
Merge pull request #32881 from tchaikov/wip-43657
mgr/orchestrator: use deepcopy for copying exceptions

Reviewed-by: Sebastian Wagner <swagner@suse.com>
2020-02-06 16:05:31 +08:00
Kefu Chai
edca34ea67 qa/tasks/cephadm: test "orchestrator host ls"
Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-02-06 09:53:17 +08:00
Kefu Chai
062e0365ea qa/tasks: drop test_cephadm_orchestrator.py
this test will end with a failure like

```
2020-01-30T18:15:15.870 INFO:tasks.ceph.mgr.x.smithi042.stderr:Warning: Permanently added 'smithi042.front.sepia.ceph.com,172.21.15.42' (ECDSA) to the list of known hosts.
2020-01-30T18:15:15.925 INFO:tasks.ceph.mgr.x.smithi042.stderr:Permission denied, please try again.
2020-01-30T18:15:15.932 INFO:tasks.ceph.mgr.x.smithi042.stderr:Permission denied, please try again.
2020-01-30T18:15:15.939 INFO:tasks.ceph.mgr.x.smithi042.stderr:root@smithi042.front.sepia.ceph.com: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
```

because mgr is not able to establish an ssh connection to that host as "root".
please note that the teuthology worker runs under the "ubuntu" account on the
test node and that, by default, "root" does not have its pubkey installed. by
contrast, `qa/tasks/cephadm.py` does push the pubkey to all the managed hosts
before testing cephadm.

since `qa/tasks/cephadm.py` is a better test for cephadm, and
suites/rados/cephadm already covers cephadm, let's just drop this one.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-02-06 09:53:17 +08:00
Kefu Chai
ec7b160a55 mgr/orchestrator: use deepcopy for copying exceptions
since rexec module has been removed in python3, we cannot use it
anymore.

Fixes: https://tracker.ceph.com/issues/43657
Signed-off-by: Kefu Chai <kchai@redhat.com>
2020-02-06 09:53:17 +08:00
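
Copying an exception with `copy.deepcopy`, as ec7b160a55 describes, looks roughly like this. This is a minimal sketch using a built-in exception and a made-up `context` attribute; it does not reproduce the orchestrator code.

```python
import copy

original = ValueError('host not found')
original.context = {'host': 'node1'}  # hypothetical extra attribute

clone = copy.deepcopy(original)

# The clone is a separate object of the same type with the same arguments,
# and its attributes are copied rather than shared.
print(type(clone).__name__, clone.args)
print(clone is original, clone.context is original.context)
```

Because the attribute dictionary is copied as well, mutating `clone.context` afterwards leaves the original exception untouched.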
Sage Weil
58d9cb55ba build-integration-branch: don't fail on existing branch
This behavior is too annoying, and you can always get back to something
clobbered with 'git reflog'.

Signed-off-by: Sage Weil <sage@redhat.com>
2020-02-05 16:56:42 -06:00
Sage Weil
5948bd5545 Merge PR #32946 into master
* refs/pull/32946/head:
	qa/suites/rados: improve valgrind leak check
	common/ceph_context: add an asok command to deliberately leak memory

Reviewed-by: Neha Ojha <nojha@redhat.com>
2020-02-05 16:47:21 -06:00
Sage Weil
2611a73d57 Merge PR #33055 into master
* refs/pull/33055/head:
	qa/tasks/mgr/test_orchestrator_cli: support multiple DriveGroups

Reviewed-by: Joshua Schmid <jschmid@suse.de>
Reviewed-by: Sebastian Wagner <swagner@suse.com>
2020-02-05 16:09:12 -06:00
Sage Weil
e7b7e2e36f Merge PR #33056 into master
* refs/pull/33056/head:
	common: fix clang compile errors from cython_modules

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2020-02-05 16:08:41 -06:00
Sage Weil
572425e823 Merge PR #33058 into master
* refs/pull/33058/head:
	mgr/cephadm: enforce that a host is a valid DNS name
	mgr/cephadm: verify host's hostname matches our host name
	cephadm: check-host: add optional --expect-hostname

Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
2020-02-05 16:07:43 -06:00
Sage Weil
1d1778d66e Merge PR #33069 into master
* refs/pull/33069/head:
	cephadm: use appropriate default image for non-ceph components

Reviewed-by: Patrick Seidensal <pseidensal@suse.com>
Reviewed-by: Michael Fritch <mfritch@suse.com>
2020-02-05 16:07:28 -06:00
Sage Weil
e640a95b89 Merge PR #33039 into master
* refs/pull/33039/head:
	osd/OSD: prevent down osds from immediately rejoining the cluster
	osd/OSD: trim osd_markdown_log in tick() thread

Reviewed-by: yanjun <yan.jun8@zte.com.cn>
Reviewed-by: Sage Weil <sage@redhat.com>
2020-02-05 12:55:53 -06:00
Sage Weil
93dba1e711 mgr/cephadm: enforce that a host is a valid DNS name
This combines the hostname restrictions

 * 1-63 chars
 * a-z, A-Z, 0-9, -

and the DNS name restrictions

 * .-delimited
 * no empty components (or leading or trailing .)
 * 250 chars total max

Note that this allows bare IPv4 addresses (which are indistinguishable from
a valid DNS name, AFAICS), but disallows bare IPv6 addresses.

Signed-off-by: Sage Weil <sage@redhat.com>
2020-02-05 12:49:49 -06:00
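
The combined rules listed in 93dba1e711 can be written as a short check. This is a sketch, not the mgr/cephadm implementation: the function name is hypothetical, and it assumes the 1-63 character limit applies to each dot-delimited component.

```python
import re

# One component: 1-63 characters drawn from a-z, A-Z, 0-9 and '-'.
_COMPONENT = re.compile(r'[A-Za-z0-9-]{1,63}')


def is_valid_dns_name(name):
    """Check a host name against the restrictions listed in the commit."""
    if not name or len(name) > 250:
        return False
    # Splitting on '.' turns a leading, trailing or doubled dot into an
    # empty component, which the pattern rejects.
    return all(_COMPONENT.fullmatch(part) for part in name.split('.'))


for candidate in ('node1.example.com', '10.0.0.1', '::1', 'a..b'):
    print(candidate, is_valid_dns_name(candidate))
```

As the commit notes, a bare IPv4 address passes because it is indistinguishable from a dotted name, while a bare IPv6 address fails on the `:` character.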
Sage Weil
d30203a079 mgr/cephadm: remove item from cache when removing
This makes the daemon disappear immediately from 'service ls', and also
avoids a temporary health warning about a stray service.

Signed-off-by: Sage Weil <sage@redhat.com>
2020-02-05 11:41:32 -06:00
Tiago Melo
26ac80a34b mgr/dashboard: Fix fsevents and node-gyp error
Signed-off-by: Tiago Melo <tmelo@suse.com>
2020-02-05 16:21:28 -01:00
Tiago Melo
abcc7bf85e make-dist: Bump Node.js to v10.18.1
This will fix an error caused by the usage of the latest version of Angular CLI
and Node.js v10.16.0.

Fixes: https://tracker.ceph.com/issues/43961

Signed-off-by: Tiago Melo <tmelo@suse.com>
2020-02-05 15:53:00 -01:00
Kefu Chai
7b7826d847
Merge pull request #33085 from rzarzynski/wip-client-bl_iter_advance
client: fix FTBFS due to bl::iterator::advance().

Reviewed-by: Kefu Chai <kchai@redhat.com>
2020-02-06 00:26:12 +08:00
Radoslaw Zarzynski
c7bdf1bcd8 client: fix FTBFS due to bl::iterator::advance().
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2020-02-05 17:13:38 +01:00
Kefu Chai
5c5e1105bb
Merge pull request #33026 from liewegas/wip-el81
qa/distros: add rhel/centos 8.1

Reviewed-by: Kefu Chai <kchai@redhat.com>
2020-02-05 23:44:15 +08:00
Casey Bodley
8dbb92e4a1
Merge pull request #32996 from cbodley/wip-rgw-put-multipart-stripe
rgw: MultipartObjectProcessor supports stripe size > chunk size

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2020-02-05 08:42:50 -05:00
Casey Bodley
18ccc3e5f0
Merge pull request #32811 from multi-arch/master
test/rgw: fix test_rgw_reshard_wait with -DHAVE_BOOST_CONTEXT=OFF

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2020-02-05 08:40:08 -05:00
Sage Weil
6c9131405e Merge PR #33051 into master
* refs/pull/33051/head:
	mgr/telemetry: check get_metadata return val

Reviewed-by: Sage Weil <sage@redhat.com>
2020-02-05 07:36:01 -06:00
Kefu Chai
22da7813ef
Merge pull request #33018 from mgfritch/cephadm-docker-disabled
qa/workunits/cephadm/test_cephadm.sh: skip docker when service is disabled

Reviewed-by: Sage Weil <sage@redhat.com>
2020-02-05 20:46:16 +08:00
Kefu Chai
068d72230d
Merge pull request #32982 from krig/cephadm-fixes
cephadm: Read ceph version from io.ceph.version label if set

Reviewed-by: Sage Weil <sage@redhat.com>
2020-02-05 20:44:58 +08:00
Kefu Chai
07e77aceba
Merge pull request #33025 from neha-ojha/wip-no-mgr
mon/MgrMonitor.cc: warn about missing mgr in a cluster with osds

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2020-02-05 20:36:12 +08:00
Jan Fajerski
1b8963147f
Merge pull request #33077 from guits/guits-cv_fix_listing
ceph-volume: fix lvm list

Reviewed-by: Jan Fajerski <jfajerski@suse.com>
2020-02-05 13:29:43 +01:00
Kefu Chai
bdc3634101
Merge pull request #33004 from matthewoliver/argparge_matchcnt_kwargs
ceph_argparse: increment matchcnt on kwargs

Reviewed-by: Kefu Chai <kchai@redhat.com>
2020-02-05 20:29:35 +08:00
Jan Fajerski
18f23bdf4d
Merge pull request #33074 from guits/guits-cv_quick_fix
ceph-volume: fix has_bluestore_label() function

Reviewed-by: Jan Fajerski <jfajerski@suse.com>
2020-02-05 13:29:03 +01:00
Kefu Chai
3198146d6f
Merge pull request #33015 from rouming/double-unlock-p1-fix
msg/async: open() should be called with connection locked

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>
2020-02-05 20:18:56 +08:00
Kefu Chai
882d1fb28a
Merge pull request #33029 from yaarith/wip-telemetry-show-device
mgr/telemetry: anonymizing smartctl report itself

Reviewed-by: Sage Weil <sage@redhat.com>
2020-02-05 20:11:39 +08:00
Kefu Chai
e1107bb35e
Merge pull request #33003 from tchaikov/wip-buffer-list-advance
include/buffer: add operator+=() for list::iterator

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2020-02-05 20:10:42 +08:00