100921 Commits

Author SHA1 Message Date
Sage Weil
0eba993fad doc/rados/operations/health-checks: document AUTH_BAD_CAPS
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
7e9ba0a1c1 doc/rados/operations/health-checks: document PG_SLOW_SNAP_TRIMMING
The mitigation steps are weak, but it's not clear what concrete guidance to
provide.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
078ef210d5 doc/rados/operations/health-checks: document MGR_DOWN
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
7385e917bb mon/HealthCheck: check mutes based on count, not parsing the summary string
This is more explicit and robust, and works with the PG warnings, which
don't conform to the "%d ..." form that the other messages do.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
d0eb22f3ba mon/health_checks: associate a count with health_alert_t
0 means this is a singleton.  Otherwise, we can sum this up, either
via merge() or get_or_add().

We always structure this so the count goes toward zero (more healthy), so
if a value is too low, then we count how much too low it is.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
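The counting convention described in d0eb22f3ba can be illustrated with a minimal Python sketch. It is not the Ceph implementation: `HealthAlert`, its fields, and `shortfall()` are hypothetical stand-ins for `health_alert_t` and its helpers, assuming only what the commit message states (0 marks a singleton, counts are summed on merge, and counts shrink toward zero as health improves).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HealthAlert:
    """Hypothetical stand-in for health_alert_t; field names are assumptions."""
    summary: str
    count: int = 0  # 0 means the alert is a singleton
    detail: List[str] = field(default_factory=list)

    def merge(self, other: "HealthAlert") -> None:
        # Summing counts lets several reports roll up into one alert.
        self.count += other.count
        self.detail.extend(other.detail)


def shortfall(actual: int, minimum: int) -> int:
    """Count how far below the minimum a value is, so that 0 means healthy."""
    return max(0, minimum - actual)


a = HealthAlert("2 osds down", count=2, detail=["osd.1 is down", "osd.4 is down"])
b = HealthAlert("1 osds down", count=1, detail=["osd.7 is down"])
a.merge(b)
```

Expressing a "too low" condition as a shortfall keeps every count moving in the same direction, so a smaller number always means a healthier cluster.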
Sage Weil
7834cd8146 mon/HealthMonitor: simplify health alert dump
Use dump() member instead of duplicating!  The only reason we had this
before was because the detail portion was optional.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
164de7d69b mon/PGMap: use nice timespan for PG stuck warnings
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
0acff80f64 mon/HealthMonitor: allow muted alert counts to decrease but not increase
If the summary starts with a digit, parse a count.

If the count goes up, clear the mute.

If the count goes down, update the mute so that we ratchet the threshold
down.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
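The ratchet described in 0acff80f64 can be sketched as follows. This is an illustration under stated assumptions, not the monitor's code: `parse_leading_count()` and `update_mute()` are hypothetical helper names, and returning `None` to mean "clear the mute" is a choice made for this sketch. Note that 7385e917bb, listed above, later replaces summary parsing with an explicit count.

```python
import re
from typing import Optional


def parse_leading_count(summary: str) -> Optional[int]:
    """Return the count if the summary starts with a digit, else None."""
    match = re.match(r"\d+", summary)
    return int(match.group()) if match else None


def update_mute(muted_count: int, current_count: int) -> Optional[int]:
    """Return the new muted count, or None if the mute should be cleared."""
    if current_count > muted_count:
        return None  # the alert got worse: clear the mute
    return current_count  # same or lower: ratchet the threshold down
```

For example, a mute recorded at a count of 3 stays in place when the count drops to 2 (and the threshold becomes 2), but is cleared as soon as the count rises above the recorded value.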
Sage Weil
e4784af4ca mon/PGMap: fix summary form for bluestore health alerts
Count goes first.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
1b6745efb4 doc/rados/operations/health-alerts: document BLUESTORE_NO_COMPRESSION
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
1ae154a433 mon/PGMap: fix summary form for POOL_APP_NOT_ENABLED
Count goes first.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
c09dc4ec45 mon/HealthMonitor: persist summary for non-sticky mutes
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
29d74309aa mon/HealthMonitor: move get_health_status()
This operates exclusively on HealthMonitor members.  Make public member
private again.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
70db3f025a mon/HealthMonitor: automatically clear non-sticky mutes when alert clears
If the alert goes away, drop the mute.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
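The mute lifetime rules from 70db3f025a, together with the sticky flag (2dff777ba9) and TTL expiry (ffc98aa606) listed below, can be sketched roughly like this. The `Mute` class, its fields, and `prune_mutes()` are hypothetical names chosen for illustration; the health check codes are used only as sample data.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class Mute:
    """Hypothetical mute record; field names are assumptions."""
    code: str
    sticky: bool = False
    expires_at: Optional[float] = None  # None means the mute has no TTL


def prune_mutes(mutes: Dict[str, Mute], active: Set[str], now: float) -> Dict[str, Mute]:
    """Return the mutes that survive, given the currently active alert codes."""
    kept = {}
    for code, mute in mutes.items():
        if mute.expires_at is not None and now >= mute.expires_at:
            continue  # TTL has elapsed
        if not mute.sticky and code not in active:
            continue  # alert went away, so a non-sticky mute is dropped
        kept[code] = mute
    return kept


mutes = {
    "OSD_DOWN": Mute("OSD_DOWN"),
    "MGR_DOWN": Mute("MGR_DOWN", sticky=True),
    "PG_DEGRADED": Mute("PG_DEGRADED", expires_at=100.0),
}
early = prune_mutes(mutes, active={"PG_DEGRADED"}, now=50.0)
late = prune_mutes(mutes, active={"PG_DEGRADED"}, now=150.0)
```

In this sketch a sticky mute outlives its alert, while a TTL removes any mute once the deadline passes, whether or not the alert is still active.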
Sage Weil
1600ff60d6 mon/HealthMonitor: add gather_all_health_checks helper
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
2dff777ba9 mon/HealthMonitor: add sticky flag to mutes
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
ffc98aa606 mon/HealthMonitor: expire mutes based on ttl
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00
Sage Weil
553f1c5578 mon: apply mutes to health [detail]
- de-escalate severity
- mark mutes in structured output
- note mutes in summary text output
- mark mutes in detail text output

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00
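The de-escalation step in 553f1c5578 can be pictured with a short sketch. It assumes that muted checks are simply skipped when computing the overall status and that the summary line notes them in parentheses; the function names and the exact text format are assumptions for this example, not taken from the commit.

```python
from typing import Dict, Set

# Severities from least to most severe.
SEVERITY_ORDER = ["HEALTH_OK", "HEALTH_WARN", "HEALTH_ERR"]


def overall_status(checks: Dict[str, str], muted: Set[str]) -> str:
    """Worst severity among checks that are not muted."""
    worst = "HEALTH_OK"
    for code, severity in checks.items():
        if code in muted:
            continue  # muted checks do not escalate the overall status
        if SEVERITY_ORDER.index(severity) > SEVERITY_ORDER.index(worst):
            worst = severity
    return worst


def summary_line(checks: Dict[str, str], muted: Set[str]) -> str:
    """Overall status, followed by a note listing any muted checks."""
    line = overall_status(checks, muted)
    hidden = sorted(code for code in checks if code in muted)
    if hidden:
        line += " (muted: " + " ".join(hidden) + ")"
    return line


checks = {"OSD_DOWN": "HEALTH_WARN", "PG_DAMAGED": "HEALTH_ERR"}
```

Muting only the error-level check lowers the overall status to the warning level, and muting both brings it back to OK while still naming the muted checks.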
Sage Weil
b00d4ca085 mon/HealthMonitor: implement mute and unmute commands
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00
Sage Weil
e2c320dc2b mon/HealthMonitor: maintain list of mutes
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00
Sage Weil
062681a72f mon: refactor/simplify health [detail]
Get rid of single caller helpers.  Instead, assimilate all the checks
together at once, and have two separate blocks, one for formatted, and
one for plaintext output.  Much easier to follow!

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00
Sage Weil
2a3f89fe6b mon/health_checks: format 'health summary' with a colon
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00
Sage Weil
ee30f1b68a mon/health_checks: drop dump_summary_compat
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00
Patrick Donnelly
d1ce58257e
Merge PR #29431 into master
* refs/pull/29431/head:
	qa: fix malformed suite config

Reviewed-by: Zheng Yan <zyan@redhat.com>
2019-08-14 15:21:51 -07:00
Patrick Donnelly
aed88d43a1
Merge PR #28652 into master
* refs/pull/28652/head:
	cephfs-shell: Add error message for invalid ls commands

Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2019-08-14 15:05:29 -07:00
Patrick Donnelly
48d4499b86
Merge PR #29554 into master
* refs/pull/29554/head:
	cephfs-shell: Fix onecmd TypeError

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2019-08-14 15:04:04 -07:00
Patrick Donnelly
a809a9aaf9
Merge PR #29552 into master
* refs/pull/29552/head:
	cephfs-shell: Convert paths type from string to bytes

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
2019-08-14 15:01:35 -07:00
Yuri Weinstein
6f3c0c9641
Merge pull request #29666 from yuriw/wip-yuriw-crontab-master
qa/tests - upped priority for upgrades on master, otherwise they neve…
2019-08-14 09:49:39 -07:00
Yuri Weinstein
c90740427b qa/tests - upped priority for upgrades on master, otherwise they never lock nodes for testing and fail
Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
2019-08-14 09:43:02 -07:00
Lenz Grimmer
2b5f62852b
mgr/dashboard: Fix e2e issue in HACKING.rst (#29640)
mgr/dashboard: Fix e2e issue in HACKING.rst

Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
2019-08-14 09:05:07 +00:00
Lenz Grimmer
5f8f666c46
Merge pull request #29570 from rhcs-dashboard/new-bucket-utilities-adaptation
mgr/dashboard: adapt bucket tenant API tests to new behaviour

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
2019-08-14 08:19:32 +00:00
Lenz Grimmer
70b7b61b66
Merge pull request #29634 from rhcs-dashboard/mgr-module-fixes
mgr/dashboard: fix mgr module API tests

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
2019-08-14 08:18:18 +00:00
Kefu Chai
8e0e2bbadc
Merge pull request #29612 from tchaikov/wip-crimson-perf-test
crimson/test: add CBT based perf tests

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2019-08-14 13:48:21 +08:00
Kefu Chai
aaad7dfc0b
Merge pull request #29644 from anthonyeleven/patch-3
doc: operations: correct 'comma-delimited'

Reviewed-by: Kefu Chai <kchai@redhat.com>
2019-08-14 11:24:45 +08:00
Kefu Chai
1f88eb0298 src/script: add run-cbt.sh
This script will be used by Jenkins to drive the CBT-based tests.

Developers can also use it to test crimson or the classic OSD.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-08-14 11:16:31 +08:00
Kefu Chai
03b34e2eab crimson/test: add perf tests for crimson
Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-08-14 10:37:54 +08:00
Kefu Chai
5896267e2b crimson/test: add script to convert teuthology task config to cbt config
Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-08-14 10:37:54 +08:00
Anthony D'Atri
51fb48b0f7
doc: operations: correct 'comma-delimited'
CIDR blocks are comma-separated, not comma-delimited.

Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>
2019-08-13 12:50:39 -07:00
Casey Bodley
a3039beaba
Merge pull request #29118 from cbodley/wip-rgw-metadata-servicification
rgw: metadata refactoring

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2019-08-13 12:56:52 -04:00
Casey Bodley
75e1ec8a29 Merge branch 'wip-rgw-metadata-servicification'
Conflicts:
	src/rgw/rgw_auth.cc
	src/rgw/rgw_auth_registry.h
	src/rgw/rgw_auth_s3.h
	src/rgw/rgw_bucket.cc
	src/rgw/rgw_bucket.h
	src/rgw/rgw_data_sync.h
	src/rgw/rgw_frontend.h
	src/rgw/rgw_log.h
	src/rgw/rgw_main.cc
	src/rgw/rgw_rados.cc
	src/rgw/rgw_rados.h
	src/rgw/rgw_rest_s3.h
	src/rgw/rgw_rest_sts.h
	src/rgw/rgw_swift_auth.h
	src/rgw/rgw_user.cc
	src/rgw/rgw_user.h
	src/rgw/services/svc_sys_obj_core.h
2019-08-13 10:24:50 -04:00
Volker Theile
3562685ca5 mgr/dashboard: Fix e2e issue in HACKING.rst
Signed-off-by: Volker Theile <vtheile@suse.com>
2019-08-13 15:20:56 +02:00
Casey Bodley
ba9bcf024d
Merge pull request #29633 from hanfengzhe-hi/Fix-decompression-logprint
rgw:Fix rgw decompression log-print

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2019-08-13 08:15:17 -04:00
alfonsomthd
6e6711a1e2 mgr/dashboard: fix mgr module API tests
Signed-off-by: alfonsomthd <almartin@redhat.com>
2019-08-13 12:15:38 +02:00
Han Fengzhe
9d22ccc0c6
rgw:Fix rgw decompression log-print
The zlib compression takes effect in RGW.
When getting an object fails because decompression failed, the "ceph-client.rgw" log printed "Compression failed with exit code......"; it should be "deCompression failed with exit code......".

Signed-off-by: Han Fengzhe  <hanfengzhe@hisilicon.com>
2019-08-13 17:08:23 +08:00
Yuval Lifshitz
76b97cacff
Merge pull request #29587 from yuvalif/wip-yuvali-fix-issue-41169
rgw: don't throw when accept errors are happening on frontend
2019-08-13 17:11:34 +09:00
Kefu Chai
3a27c3c800
Merge pull request #29615 from tchaikov/wip-qa/tasks/mgr/dashboard/test_health
qa/tasks/mgr/dashboard/test_health: add missing field for test_full_health

Reviewed-by: Kiefer Chang <kiefer.chang@suse.com>
2019-08-13 13:59:28 +08:00
Kefu Chai
b1c05009f9 qa/tasks/mgr/dashboard/test_health: add missing field for test_full_health
fix regressions introduced by a076260e and d6ff61ed

Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-08-13 11:30:09 +08:00
Kefu Chai
cc3ae85b05 qa/tasks/mgr/dashboard/test_mgr_module: remove enable/disable test from MgrModuleTelemetryTest
telemetry is always enabled since 2d62d71cd4

Fixes: https://tracker.ceph.com/issues/41186
Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-08-13 09:55:48 +08:00
Kefu Chai
f13c7c83d9
Merge pull request #29342 from Jeegn-Chen/wip-scrub-extended-sleep
osd: support osd_scrub_extended_sleep

Reviewed-by: David Zafman <dzafman@redhat.com>
2019-08-13 09:09:52 +08:00
Kefu Chai
9666fabc67
Merge pull request #29522 from majianpeng/bluestore-optimization
os/bluestore: deferred IO notify and locking optimization

Reviewed-by: Sage Weil <sage@redhat.com>
2019-08-13 09:08:00 +08:00