Commit Graph

88720 Commits

Author SHA1 Message Date
Sage Weil
f09a87f902 doc/mgr/devicehealth: document devicehealth module
Signed-off-by: Sage Weil <sage@redhat.com>
2018-07-31 14:08:53 -05:00
Sage Weil
7ab8675fdf doc/rados/operations/health-checks: document DEVICE_HEALTH* messages
Signed-off-by: Sage Weil <sage@redhat.com>
2018-07-31 14:08:53 -05:00
Sage Weil
ccdfcc7e72 mgr/devicehealth: fix style for returns
Signed-off-by: Sage Weil <sage@redhat.com>
2018-07-31 14:08:53 -05:00
Sage Weil
1f8662a708 mgr/devicehealth: use constants for health warnings
Signed-off-by: Sage Weil <sage@redhat.com>
2018-07-31 14:08:53 -05:00
Sage Weil
b23295dbb9 mgr/devicehealth: deal with as many daemons as we can until limit
Process as many OSDs as we can until we hit the min_in_ratio.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-07-31 14:08:53 -05:00
Sage Weil
4cda89c9e3 mgr/devicehealth: warn if too many daemons are expected to fail soon
Refuse to mark out *all* OSDs.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-07-31 14:08:53 -05:00
Sage Weil
1c9ce2fc56 mgr/devicehealth: set primary-affinity 0 for failing devices
Signed-off-by: Sage Weil <sage@redhat.com>
2018-07-31 14:08:53 -05:00
Sage Weil
cba41b6f7c msg/devicehealth: fix config options
Signed-off-by: Sage Weil <sage@redhat.com>
2018-07-31 14:08:53 -05:00
Sage Weil
abdee9f679 mgr/devicehealth: only fetch osdmap once from check_health
Signed-off-by: Sage Weil <sage@redhat.com>
2018-07-31 14:08:53 -05:00
Sage Weil
c688c81afd mgr/devicehealth: revise health messages
Signed-off-by: Sage Weil <sage@redhat.com>
2018-07-31 14:08:53 -05:00
Sage Weil
8deec7445f mgr/devicehealth: add 'device check-health' command and run periodically
Signed-off-by: Sage Weil <sage@redhat.com>
2018-07-31 14:08:53 -05:00
Sage Weil
b9d547f012 mgr/devicehealth: fix new options
Signed-off-by: Sage Weil <sage@redhat.com>
2018-07-31 14:08:53 -05:00
Yaarit Hatuka
e1552de24b mgr/devicehealth: add helpers to life_expectancy_response()
- if mark_out_threshold is met we write to log.warn instead of raising a
  health warning.
- check that OSD is 'in' before calling mark_out().
- raise a health warning in case OSD is marked 'out' but still has PGs
  attached to it.
- cast thresholds default values to string.
- add SCSI multipath support to health warning message.
- change health warning message.

Signed-off-by: Yaarit Hatuka <yaarithatuka@gmail.com>
2018-07-31 14:08:53 -05:00
Sage Weil
2b86590a66 mgr/devicehealth: simplify setting defaults
Signed-off-by: Sage Weil <sage@redhat.com>
2018-07-31 14:08:53 -05:00
Yaarit Hatuka
8e542033a1 common/blkdev remove debug statements
Signed-off-by: Yaarit Hatuka <yaarithatuka@gmail.com>
2018-07-31 14:08:53 -05:00
Sage Weil
34698a2c62 Merge PR #23334 into master
* refs/pull/23334/head:
	pybind/rados/rados: do not pass prval from stack

Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-07-31 14:08:37 -05:00
David Zafman
9d06ab3da9 Merge pull request #23217 from dzafman/wip-25085
osd: Allow repair of an object with a bad data_digest in object_info on all replicas

Reviewed-by: Sage Weil <sage@redhat.com>
2018-07-31 15:07:22 -04:00
Sage Weil
8e36f18cde pybind/rados/rados: do not pass prval from stack
The prval is a pointer to an int to write the final completion code of
the rados op.  This can't be on the stack since we immediately leave the
current scope after preparing the op (looong before we do the rados op).

We keep the tuple return value to avoid breaking users of this API
(devicehealth module, gnocchi at a minimum).

Fixes: http://tracker.ceph.com/issues/25175
Signed-off-by: Sage Weil <sage@redhat.com>
2018-07-31 09:41:05 -05:00
Alfredo Deza
96e7576400 Merge pull request #23348 from ceph/wip-rm24957
ceph-volume: adds test for `ceph-volume lvm list /dev/sda`

Reviewed-by: Alfredo Deza <adeza@redhat.com>
2018-07-31 09:56:05 -04:00
Andrew Schoen
ef10886f1e ceph-volume: adds a unit test for lvm list /dev/sda
This test is to prove that the issue from
http://tracker.ceph.com/issues/24957 was fixed
by http://tracker.ceph.com/issues/24784

When running lvm list against a raw device it should handle
gracefully the situation where there are multiple PVs with the
name of the given device.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-07-31 08:50:28 -05:00
Andrew Schoen
37ed1be08b ceph-volume: move pvolumes fixture into conftest.py
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
2018-07-31 08:50:27 -05:00
Kefu Chai
cec5a23f69 Merge pull request #23336 from noahdesu/vstart-dashboard-no-rbd
vstart: disable dashboard when rbd not built

Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-07-31 11:05:02 +08:00
Patrick Donnelly
55e60ab17d Merge PR #23297 into master
* refs/pull/23297/head:
	ceph_volume_client: add delay for MDSMap to be distributed

Reviewed-by: Ramana Raja <rraja@redhat.com>
2018-07-30 16:11:21 -07:00
Patrick Donnelly
957bdb4abe Merge PR #23308 into master
* refs/pull/23308/head:
	doc: s/Ceph FS/CephFS

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2018-07-30 16:06:39 -07:00
Noah Watkins
5b9dd4c8a2 vstart: disable dashboard when rbd not built
dashboard doesn't load correctly without the rbd module, which means
vstart commands that interact with dashboard fail and vstart exits.

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
2018-07-30 14:50:28 -07:00
Sage Weil
1ebafdb65f Merge pull request #23292 from yuriw/wip-yuriw-25140-master
qa/tests: added 1st draft of mimic-x suite
2018-07-30 14:55:41 -05:00
Sage Weil
c6dd193f45 Merge pull request #23302 from yuriw/wip-yuriw-crontab-master
qa/tests: added mimic-x to the schedule
2018-07-30 14:55:27 -05:00
Alfredo Deza
81df5d18c3 Merge pull request #23321 from cernceph/dvanders_enable
ceph-volume: enable the ceph-osd during lvm activation

Reviewed-by: Alfredo Deza <adeza@redhat.com>
2018-07-30 12:54:24 -04:00
Andrew Schoen
4a043de4b7 Merge pull request #23332 from alfredodeza/wip-rm25171
ceph-volume add a __release__ string, to help version-conditional calls

Reviewed-by: Andrew Schoen <aschoen@redhat.com>
2018-07-30 16:14:01 +00:00
Yuri Weinstein
baa4d0ea78 Merge pull request #23305 from smithfarm/wip-cleanup-upgrade
qa/upgrade: cleanup for nautilus

Reviewed-by: Yuri Weinstein <yweins@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2018-07-30 09:01:03 -07:00
Yuri Weinstein
e6f21c1aa3 qa/tests: added 1st draft of mimic-x suite
Fixes: https://tracker.ceph.com/issues/25140
Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
2018-07-30 08:41:18 -07:00
Alfredo Deza
5bd0c27f9d ceph-volume add a __release__ string, to help version-conditional calls
Signed-off-by: Alfredo Deza <adeza@redhat.com>
2018-07-30 11:39:43 -04:00
Kefu Chai
df2196dbf8 Merge pull request #23276 from tchaikov/wip-config-diff-lock
common/config: fix the lock in ConfigProxy::diff()

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-07-30 22:24:54 +08:00
Kefu Chai
1bb7be365e Merge pull request #23251 from neha-ojha/wip-25112
osd,mon: increase mon_max_pg_per_osd to 250

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>
2018-07-30 22:23:18 +08:00
Kefu Chai
b88596d93c Merge pull request #23249 from liewegas/wip-mon-cx-nautilus
osd/OSDMap: fix CEPHX_V2 osd requirement to nautilus, not mimic

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2018-07-30 22:22:40 +08:00
Kefu Chai
53951a84b6 Merge pull request #23229 from rjfd/wip-dashboard-query-params-bug
mgr/dashboard: fix query parameters in task annotated endpoints

Reviewed-by: Tiago Melo <tmelo@suse.com>
2018-07-30 22:21:43 +08:00
Sage Weil
c13aa38175 Merge PR #22607 into master
* refs/pull/22607/head:
	common/options: convert many TYPE_[U]INT -> TYPE_SIZE
	common/options: remove journal_max_corrupt_search

Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2018-07-30 08:45:17 -05:00
Dan van der Ster
261d8ac94d ceph-volume: enable ceph-osd during lvm activation
Enable the ceph-osd@<id> unit during lvm activate to link these
units to the ceph-osd.target.

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Fixes: http://tracker.ceph.com/issues/24152
2018-07-30 15:07:54 +02:00
Dan van der Ster
3e6f387be1 ceph-volume: optional systemd enable --runtime
Allow units to be enabled but not persisted across a reboot,
and use this when enabling osds.

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
2018-07-30 14:57:03 +02:00
Lenz Grimmer
a8dc0e593f Merge pull request #23287 from Devp00l/wip-duplicate-error-messages
mgr/dashboard: Fix duplicate error messages

Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
2018-07-30 11:39:05 +02:00
Ricardo Dias
00ec05abbd Merge pull request #22669 from votdev/feature_24574
mgr/dashboard: Cleanup RGW config checks

Reviewed-by: Lenz Grimmer <lgrimmer@suse.com>
Reviewed-by: Sebastian Wagner <swagner@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
2018-07-30 09:53:16 +01:00
Nathan Cutler
17d9b5be4d qa/upgrade: cleanup for nautilus
Drop unused suites, which ATM means all of them except upgrade/luminous-x
which recently got a cleanup in https://github.com/ceph/ceph/pull/23162

Signed-off-by: Nathan Cutler <ncutler@suse.com>
2018-07-29 19:56:53 +02:00
Nathan Cutler
cecbf3e5dd Merge pull request #23162 from smithfarm/wip-upgrade-cleanup
tests: upgrade/luminous-x: fix order of final-workload directory

Reviewed-by: Sage Weil <sage@redhat.com>
2018-07-29 18:08:48 +02:00
Yuri Weinstein
bba79cdb9f qa/tests: added mimic-x to the schedule
Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
2018-07-28 14:42:39 -07:00
Sage Weil
922bfc5f3b common/options: convert many TYPE_[U]INT -> TYPE_SIZE
Note that the _cost options are in fact in units of bytes.

Signed-off-by: Sage Weil <sage@redhat.com>
2018-07-28 15:11:05 -05:00
Jos Collin
152f4fba1e doc: s/Ceph FS/CephFS
Fixes: https://github.com/ceph/ceph/pull/22784#discussion_r200755460
Signed-off-by: Jos Collin <jcollin@redhat.com>
2018-07-28 19:59:33 +05:30
Kefu Chai
d76fbf67a6 Merge pull request #23043 from tchaikov/wip-python-cephfs-dependencies
deb,rpm: fix python-cephfs dependencies

Reviewed-by: Nathan Cutler <ncutler@suse.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2018-07-28 09:53:32 +08:00
Patrick Donnelly
4853aa7d42 ceph_volume_client: add delay for MDSMap to be distributed
Otherwise the setxattr will fail if the mds has not yet received the MDSMap
which adds the new data pool.

Fixes: https://tracker.ceph.com/issues/25141

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2018-07-27 17:02:48 -07:00
Abhishek L
726ca169ee Merge pull request #23288 from theanalyst/doc/releases/13.2.1
doc: releases: mimic 13.2.1 release notes

Reviewed-By: Sage Weil <sweil@redhat.com>
2018-07-28 00:18:46 +02:00
Alfredo Deza
ba755c115f Merge pull request #23278 from b-ranto/wip-volume-selinux
ceph-volume: Restore SELinux context

Reviewed-by: Alfredo Deza <adeza@redhat.com>
2018-07-27 17:57:46 -04:00