100969 Commits

Author SHA1 Message Date
Kefu Chai
7aae20cdf2 denc: add fallback for the O(n) legacy of std::list::size().
Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-08-16 07:56:13 -04:00
Radoslaw Zarzynski
f1b61c548b denc: slightly optimize container_base::bound_encode.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2019-08-16 07:56:12 -04:00
Kefu Chai
ebdf419c63
Merge pull request #29597 from tchaikov/wip-qa/tasks/cbt
qa/tasks/cbt.py: use "git --depth 1" for faster clone

Reviewed-by: Neha Ojha <nojha@redhat.com>
2019-08-16 19:17:58 +08:00
Alfredo Deza
3add40c4ee
Merge pull request #29683 from jan--f/c-v-keep-device-list-as-lists
ceph-volume: don't keep device lists as sets

Reviewed-by: Alfredo Deza <adeza@redhat.com>
2019-08-16 07:00:50 -04:00
Sage Weil
8958df969e Merge PR #29676 into master
* refs/pull/29676/head:
	test/unittest_bluefs: always remove temp bdev file

Reviewed-by: Sage Weil <sage@redhat.com>
2019-08-15 14:03:37 -05:00
Sage Weil
81f5b3788d Merge PR #29581 into master
* refs/pull/29581/head:
	os/bluestore: do not set osd_memory_target default from cgroup limit

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Mark Nelson <mnelson@redhat.com>
2019-08-15 14:02:34 -05:00
Sage Weil
83a59884e1 Merge PR #29577 into master
* refs/pull/29577/head:
	os/bluestore/KernelDevice: fix RW_IO_MAX constant
	os/bluestore/KernelDevice: print aio error extent in hex

Reviewed-by: Kefu Chai <kchai@redhat.com>
2019-08-15 14:02:00 -05:00
Casey Bodley
c35f9d6ecd
Merge pull request #29578 from theanalyst/rgw-user-policy-urlencode
rgw: url decode PutUserPolicy params

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Pritha Srivastava <prsrivas@redhat.com>
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
2019-08-15 14:02:03 -04:00
Matt Benjamin
306bc84af1
Merge pull request #29670 from linuxbox2/wip-rgwfile-marker
rgw_file: readdir: do not construct markers w/leading '/'
2019-08-15 13:33:27 -04:00
Sage Weil
09beeb0e03 Merge PR #29422 into master
* refs/pull/29422/head:
	qa/tasks/mgr/dashboard/test_health: update schema
	doc/rados/operations/monitoring: document muting health alerts
	qa/standalone/mon/health-mutes: add tests
	doc/rados/operations/health-checks: document MON_DISK_{LOW,CRIT,BIG}
	doc/rados/operations/health-checks: document OSD_NO_DOWN_OUT_INTERVAL
	doc/rados/operations/health-checks: document AUTH_BAD_CAPS
	doc/rados/operations/health-checks: document PG_SLOW_SNAP_TRIMMING
	doc/rados/operations/health-checks: document MGR_DOWN
	mon/HealthCheck: check mutes based on count, not parsing the summary string
	mon/health_checks: associate a count with health_alert_t
	mon/HealthMonitor: simplify health alert dump
	mon/PGMap: use nice timespan for PG stuck warnings
	mon/HealthMonitor: allow muted alert counts to decrease but not increase
	mon/PGMap: fix summary form for bluestore health alerts
	doc/rados/operations/health-alerts: document BLUESTORE_NO_COMPRESSION
	mon/PGMap: fix summary form for POOL_APP_NOT_ENABLED
	mon/HealthMonitor: persist summary for non-sticky mutes
	mon/HealthMonitor: move get_health_status()
	mon/HealthMonitor: automatically clear non-sticky mutes when alert clears
	mon/HealthMonitor: add gather_all_health_checks helper
	mon/HealthMonitor: add sticky flag to mutes
	mon/HealthMonitor: expire mutes based on ttl
	mon: apply mutes to health [detail]
	mon/HealthMonitor: implement mute and unmute commands
	mon/HealthMonitor: maintain list of mutes
	mon: refactor/simplify health [detail]
	mon/health_checks: format 'health summary' with a colon
	mon/health_checks: drop dump_summary_compat

Reviewed-by: Neha Ojha <nojha@redhat.com>
2019-08-15 12:28:26 -05:00
Sage Weil
2ac53bb645 Merge PR #29537 into master
* refs/pull/29537/head:
	os/bluestore/BlueFS: fix device_migrate_to_* to handle varying alloc sizes
	os/bluestore/BlueFS: apply shared_alloc_size to shared device
	os/bluestore: whitespace
	os/bluestore/BlueFS: add bluefs_shared_alloc_size
	os/bluestore/BlueStore.cc: start should be >= _get_ondisk_reserved()

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
2019-08-15 12:26:29 -05:00
Kefu Chai
441ed26c9b
Merge pull request #29686 from tchaikov/wip-osdc-wait-for-osdmap
osdc: should release the rwlock before waiting

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2019-08-16 00:10:25 +08:00
Lenz Grimmer
97928fb323
Merge pull request #26953 from Exotelis/ceph-dashboard-i18ntool
mgr/dashboard: ceph dashboard i18ntool

Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
2019-08-15 14:45:51 +00:00
Kefu Chai
8432b6a9c1
Merge pull request #29568 from votdev/ignore_dirs
.gitignore: add more stuff

Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
2019-08-15 22:39:05 +08:00
Jan Fajerski
0534cf188a ceph-volume: don't keep device lists as sets
This was introduced by #27754. The explicit device lists were cast to
sets but other parts of the code were not updated accordingly. To avoid
touching all code places, only cast to sets for disjoint test and keep
lists otherwise.

Fixes: https://tracker.ceph.com/issues/41292

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
2019-08-15 15:31:03 +02:00
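
As a rough illustration of the approach described in this commit message, the sketch below keeps the device lists as lists and converts to a set only for the overlap check. The function name `validate_devices` and the device paths are hypothetical and are not taken from the ceph-volume code.

```python
def validate_devices(data_devices, db_devices):
    """Return both lists unchanged if they do not overlap, else raise."""
    # Cast to a set only for the disjoint test; callers keep real lists,
    # so ordering and indexing still work everywhere else.
    if not set(data_devices).isdisjoint(db_devices):
        raise ValueError("a device appears in more than one list")
    return data_devices, db_devices


# Hypothetical device paths, for illustration only.
data, db = validate_devices(["/dev/sdb", "/dev/sdc"], ["/dev/nvme0n1"])
first = data[0]  # indexing works because `data` is still a list
```

Code that indexes or relies on ordering keeps working, which a plain set would not allow.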
Jan Fajerski
362bfac4dc
Merge pull request #29684 from jan--f/c-v-batch-functional-check-stderr
ceph-volume: fix batch functional tests, idempotent test must check s…
2019-08-15 15:19:23 +02:00
Lenz Grimmer
9294989bc2
mgr/dashboard: Daemons Page Tables Test (#29469)
mgr/dashboard: Daemons Page Tables Test

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
2019-08-15 13:18:33 +00:00
Lenz Grimmer
e64198315c
mgr/dashboard: Logs Page E2E Tests (#29434)
mgr/dashboard: Logs Page E2E Tests

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
2019-08-15 13:17:53 +00:00
Lenz Grimmer
8867074b4b
Merge pull request #29420 from ricardoasmarques/fix-default-builder-is-not-a-function
mgr/dashboard: Fixes 'defaultBuilder' is not a function

Reviewed-by: Ricardo Dias <rdias@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
2019-08-15 13:16:48 +00:00
Kefu Chai
1b8df73697 osdc: should release the rwlock before waiting
This addresses a regression introduced by 20b1ac6e

Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-08-15 19:50:35 +08:00
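
The general pattern named in this commit title can be sketched as follows. This is not the osdc code: a plain `threading.Lock` stands in for the rwlock, and the `map_arrived` event, the `state` dictionary, and both function names are assumptions made for illustration.

```python
import threading

lock = threading.Lock()          # stand-in for the rwlock
map_arrived = threading.Event()  # hypothetical "new map published" signal
state = {"epoch": 0}


def updater():
    # The updater needs the lock to publish the new epoch.
    with lock:
        state["epoch"] = 1
    map_arrived.set()


def wait_for_map(timeout=5.0):
    lock.acquire()
    needs_wait = state["epoch"] == 0
    # Drop the lock *before* blocking; holding it while waiting would
    # stop the updater from ever acquiring it.
    lock.release()
    if needs_wait:
        return map_arrived.wait(timeout)
    return True


worker = threading.Thread(target=updater)
worker.start()
ok = wait_for_map()
worker.join()
```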
Jan Fajerski
88807110f3 ceph-volume: fix batch functional tests, idempotent test must check stderr
Fixes: https://tracker.ceph.com/issues/41295

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
2019-08-15 13:28:42 +02:00
Volker Theile
3866fca834 Improve .gitignore
Ignore some Python related caching dirs.

Signed-off-by: Volker Theile <vtheile@suse.com>
2019-08-15 13:16:46 +02:00
Nathan Cutler
e1bbc4e16d
Merge pull request #29536 from batrick/backport-https
scripts: use https url for redmine

Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
Reviewed-by: Nathan Cutler <ncutler@suse.com>
2019-08-15 11:08:01 +02:00
Boris Ranto
d9f8a50a89
Merge pull request #28997 from b-ranto/wip-push-dash
Make ceph-dashboard require grafana dashboards

Reviewed-by: Zack Cerza <zcerza@redhat.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>
2019-08-15 11:02:43 +02:00
Kefu Chai
a38122fdf7 test/unittest_bluefs: always remove temp bdev file
We leave files in the build directory if the test fails. Better to
remove them.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-08-15 13:23:51 +08:00
Sage Weil
6f7179daa6 os/bluestore/KernelDevice: fix RW_IO_MAX constant
This depends on the page size.  See:

6e6d05360b/include/linux/fs.h (L2305)

30d1d92a88/tools/virtio/linux/kernel.h (L23)

Fixes 4d33114a40

Fixes: https://tracker.ceph.com/issues/41188
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:43:43 -05:00
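
To show why the limit depends on the page size: the Linux kernel defines `MAX_RW_COUNT` as `INT_MAX & PAGE_MASK`, that is, `INT_MAX` rounded down to a page boundary. The sketch below reproduces that arithmetic. The helper name `rw_io_max` is an assumption, and this is not the KernelDevice code itself.

```python
import mmap

INT_MAX = 0x7FFFFFFF


def rw_io_max(page_size=mmap.PAGESIZE):
    """Largest single read/write size: INT_MAX rounded down to a page multiple."""
    # ~(page_size - 1) clears the low bits, like the kernel's PAGE_MASK.
    return INT_MAX & ~(page_size - 1)


limit_4k = rw_io_max(4096)    # 0x7ffff000
limit_64k = rw_io_max(65536)  # 0x7fff0000
```

A constant hard-coded for 4 KiB pages would therefore be too large on a system with 64 KiB pages.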
Sage Weil
403f1195b0 qa/tasks/mgr/dashboard/test_health: update schema
Also fix the 'checks' field, which is a list of objects, not strings.  (The
test doesn't notice because it's empty.)

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
2a1b58b5ac doc/rados/operations/monitoring: document muting health alerts
I think someday the docs for how health alerts work (here) and the
enumeration of all actual alerts should be restructured.  For now this
is the simplest place to fit this!

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
710fef96ea qa/standalone/mon/health-mutes: add tests
Make sure mute and unmute work.  Make sure sticky is sticky. Make sure
counts can go down, but if they go up the mute clears.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
95b8e9fa0d doc/rados/operations/health-checks: document MON_DISK_{LOW,CRIT,BIG}
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
dd5e985614 doc/rados/operations/health-checks: document OSD_NO_DOWN_OUT_INTERVAL
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
0eba993fad doc/rados/operations/health-checks: document AUTH_BAD_CAPS
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
7e9ba0a1c1 doc/rados/operations/health-checks: document PG_SLOW_SNAP_TRIMMING
The mitigation steps are weak, but it's not clear what concrete guidance
to provide.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
078ef210d5 doc/rados/operations/health-checks: document MGR_DOWN
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
7385e917bb mon/HealthCheck: check mutes based on count, not parsing the summary string
This is more explicit and robust, and works with the PG warnings, which
don't conform to the "%d ..." form that the other messages do.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
d0eb22f3ba mon/health_checks: associate a count with health_alert_t
0 means this is a singleton.  Otherwise, we can sum this up, either
via merge() or get_or_add().

We always structure this so the count goes toward zero (more healthy), so
if a value is too low, then we count how much too low it is.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
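
A minimal sketch of the counting convention described in this commit message, assuming a simplified alert type. The `HealthAlert` class, the free function `merge`, and the helper `shortfall` are hypothetical stand-ins and do not mirror the real `health_alert_t` interface.

```python
from dataclasses import dataclass


@dataclass
class HealthAlert:
    summary: str
    count: int = 0  # 0 means the alert is a singleton


def merge(a, b):
    """Combine two reports of the same alert by summing their counts."""
    return HealthAlert(a.summary, a.count + b.count)


def shortfall(actual, required):
    """Count how far below the requirement a value is, so 0 means healthy."""
    return max(0, required - actual)


merged = merge(HealthAlert("osds down", 2), HealthAlert("osds down", 3))
missing = shortfall(actual=1, required=3)
```

Expressing a too-low value as a shortfall keeps the convention that a count moving toward zero means a healthier cluster.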
Sage Weil
7834cd8146 mon/HealthMonitor: simplify health alert dump
Use dump() member instead of duplicating!  The only reason we had this
before was because the detail portion was optional.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
164de7d69b mon/PGMap: use nice timespan for PG stuck warnings
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
0acff80f64 mon/HealthMonitor: allow muted alert counts to decrease but not increase
If the summary starts with a digit, parse a count.

If the count goes up, clear the mute.

If the count goes down, update the mute so that we ratchet the threshold
down.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
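
The ratchet rule described above can be sketched as follows. The function names and the use of `None` for a cleared mute are assumptions for illustration. The monitor itself is written in C++ and its data structures are not shown here.

```python
import re


def parse_count(summary):
    """Return the leading integer of a summary string, or None if absent."""
    match = re.match(r"\d+", summary)
    return int(match.group()) if match else None


def update_mute(muted_count, summary):
    """Return the new muted count, or None if the mute should be cleared."""
    count = parse_count(summary)
    if count is None:
        return muted_count  # no leading digit: nothing to compare
    if count > muted_count:
        return None         # count went up: clear the mute
    return count            # count went down (or stayed): ratchet down


after_drop = update_mute(5, "3 osds down")
after_rise = update_mute(3, "4 osds down")
```

Because the threshold only moves down, a muted alert that improves and then worsens again clears its mute.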
Sage Weil
e4784af4ca mon/PGMap: fix summary form for bluestore health alerts
Count goes first.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
1b6745efb4 doc/rados/operations/health-alerts: document BLUESTORE_NO_COMPRESSION
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
1ae154a433 mon/PGMap: fix summary form for POOL_APP_NOT_ENABLED
Count goes first.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
c09dc4ec45 mon/HealthMonitor: persist summary for non-sticky mutes
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
29d74309aa mon/HealthMonitor: move get_health_status()
This operates exclusively on HealthMonitor members.  Make public member
private again.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
70db3f025a mon/HealthMonitor: automatically clear non-sticky mutes when alert clears
If the alert goes away, drop the mute.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
1600ff60d6 mon/HealthMonitor: add gather_all_health_checks helper
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
2dff777ba9 mon/HealthMonitor: add sticky flag to mutes
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
ffc98aa606 mon/HealthMonitor: expire mutes based on ttl
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00
Sage Weil
553f1c5578 mon: apply mutes to health [detail]
- de-escalate severity
- mark mutes in structured output
- note mutes in summary text output
- mark mutes in detail text output

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00
Sage Weil
b00d4ca085 mon/HealthMonitor: implement mute and unmute commands
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00