Commit Graph

101213 Commits

Author SHA1 Message Date
Lenz Grimmer
9294989bc2
mgr/dashboard: Daemons Page Tables Test (#29469)
mgr/dashboard: Daemons Page Tables Test

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
2019-08-15 13:18:33 +00:00
Lenz Grimmer
e64198315c
mgr/dashboard: Logs Page E2E Tests (#29434)
mgr/dashboard: Logs Page E2E Tests

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
2019-08-15 13:17:53 +00:00
Lenz Grimmer
8867074b4b
Merge pull request #29420 from ricardoasmarques/fix-default-builder-is-not-a-function
mgr/dashboard: Fixes 'defaultBuilder' is not a function

Reviewed-by: Ricardo Dias <rdias@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
2019-08-15 13:16:48 +00:00
Daniel Gryniewicz
7e613fdc55 Project Zipper Part 1 - Framework and RGWRadosStore
This is the first part of Project Zipper, the Store Abstraction Layer.
It introduces the basic framework, and wraps RGWRados in RGWRadosStore.
The goal over the next few weeks is to do the same for user, bucket, and
object.  This will make most of the remaining users of RGWRados wrapped
in SAL classes, allowing it to be completely absorbed into the private
RGWRadosStore.  This will also expose all the APIs that need to be
pushed up to higher layers in the SAL.

Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
2019-08-15 08:48:13 -04:00
Kefu Chai
1b8df73697 osdc: should release the rwlock before waiting
this addresses a regression introduced by 20b1ac6e

Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-08-15 19:50:35 +08:00
Nathan Cutler
b9c7443010 rpm: always build ceph-test package
Fixes: https://tracker.ceph.com/issues/41296
Signed-off-by: Nathan Cutler <ncutler@suse.com>
2019-08-15 13:28:51 +02:00
Jan Fajerski
88807110f3 ceph-volume: fix batch functional tests, idempotent test must check stderr
Fixes: https://tracker.ceph.com/issues/41295

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
2019-08-15 13:28:42 +02:00
Volker Theile
3866fca834 Improve .gitignore
Ignore some Python related caching dirs.

Signed-off-by: Volker Theile <vtheile@suse.com>
2019-08-15 13:16:46 +02:00
Nathan Cutler
e1bbc4e16d
Merge pull request #29536 from batrick/backport-https
scripts: use https url for redmine

Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
Reviewed-by: Nathan Cutler <ncutler@suse.com>
2019-08-15 11:08:01 +02:00
Boris Ranto
d9f8a50a89
Merge pull request #28997 from b-ranto/wip-push-dash
Make ceph-dashboard require grafana dashboards

Reviewed-by: Zack Cerza <zcerza@redhat.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>
2019-08-15 11:02:43 +02:00
Kefu Chai
a38122fdf7 test/unittest_bluefs: always remove temp bdev file
we leave files in build directory if the test fails. better off
removing them.

Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-08-15 13:23:51 +08:00
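
One common way to guarantee that a scratch file is removed even when a test bails out early is a scope guard. The following is a minimal sketch of that idea, not the code from the commit above: `TempFileGuard` and `demo_cleanup` are hypothetical names, and the real test may clean up differently.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>
#include <utility>

// Hypothetical RAII guard: removes the named file when it goes out of scope,
// whether the enclosing block finishes normally or returns early.
class TempFileGuard {
 public:
  explicit TempFileGuard(std::string path) : path_(std::move(path)) {}
  ~TempFileGuard() { std::remove(path_.c_str()); }

  // Non-copyable so that exactly one object owns the cleanup.
  TempFileGuard(const TempFileGuard&) = delete;
  TempFileGuard& operator=(const TempFileGuard&) = delete;

  const std::string& path() const { return path_; }

 private:
  std::string path_;
};

// Example use: the file exists inside the scope and is gone after it.
inline bool demo_cleanup(const std::string& path) {
  {
    TempFileGuard guard(path);
    std::ofstream(guard.path()) << "scratch";
    assert(std::ifstream(path).good());
  }
  return !std::ifstream(path).good();
}
```

Because the cleanup lives in a destructor, no exit path from the test body can skip it.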
Radoslaw Zarzynski
f74318e5c6 osd, osdc: drop the unused outdata feature of PGLSFilter.
Before this commit PGLSFilter interface was offering the outdata
parameter in its filter() method:

  filter(..., bufferlist& outdata)

OSD was serializing and appending the bufferlist to response to
CEPH_OSD_OP_PGLS_FILTER and CEPH_OSD_OP_PGNLS_FILTER operations.
At the Objecter's side these extra bits were being parsed and
finally stored in NListContext::extra_info. However, it really
looks like this member is not used anywhere.

The commit removes the outdata handling on multiple layers: from
PGLSFilter implementations, through OSD till Objecter.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2019-08-15 10:47:48 +08:00
Kefu Chai
778409caed osd/PrimaryLogPG: remove unused "parent" pgls-filter
it's implemented using `PGLSParentFilter`, and this filter has never been
used. the only possible user would be `cephfs-data-scan`, but it's using
`PGLSCephFSFilter` which is referenced with "cephfs.inode_tag".

Signed-off-by: Kefu Chai <kchai@redhat.com>
2019-08-15 10:38:35 +08:00
Tao Ning
86d55c1a0d osd/PrimaryLogPG: Avoid accessing destroyed references in finish_degraded_object
As follows:
for (auto i = callbacks_for_degraded_object.begin(); i != callbacks_for_degraded_object.end();) {
    finish_degraded_object((i++)->first);
}

void PrimaryLogPG::finish_degraded_object(const hobject_t oid)
{
  if (callbacks_for_degraded_object.count(oid)) {
    contexts.swap(callbacks_for_degraded_object[oid]);
    callbacks_for_degraded_object.erase(oid);   // Release
  }

  map<hobject_t, snapid_t>::iterator i = objects_blocked_on_degraded_snap.find(
    oid.get_head());  // Access
  ...
}

Fixes: https://tracker.ceph.com/issues/41250
Signed-off-by: Tao Ning <ningtao@sangfor.com.cn>
2019-08-15 10:15:15 +08:00
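
The hazard in the commit above can be reproduced in isolation. This is a minimal sketch, not the Ceph code: `Registry`, `finish_by_value`, and the `std::string` keys are hypothetical stand-ins for `PrimaryLogPG`, `finish_degraded_object`, and `hobject_t`. It assumes the remedy is to take the key by value, as the signature quoted in the commit message suggests.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the PG state: two maps keyed by object id.
struct Registry {
  std::map<std::string, std::vector<int>> callbacks;
  std::map<std::string, int> blocked;

  // Taking `oid` by value gives the function its own copy of the key, so
  // erasing the map node that held the caller's key cannot invalidate it.
  // With `const std::string& oid` bound to `it->first`, the lookup after
  // erase() would read a destroyed string.
  int finish_by_value(const std::string oid) {
    std::vector<int> contexts;
    auto found = callbacks.find(oid);
    if (found != callbacks.end()) {
      contexts.swap(found->second);
      callbacks.erase(found);   // the node holding the caller's key dies here
    }
    auto b = blocked.find(oid); // safe: `oid` is a private copy
    if (b != blocked.end()) {
      blocked.erase(b);
    }
    return static_cast<int>(contexts.size());
  }

  // Mirrors the loop in the commit message: advance the iterator before the
  // call, because the call erases the element it pointed at.
  int finish_all() {
    int total = 0;
    for (auto i = callbacks.begin(); i != callbacks.end();) {
      total += finish_by_value((i++)->first);
    }
    return total;
  }
};
```

Erasing from a `std::map` invalidates only iterators and references to the erased element, which is why the post-incremented iterator in `finish_all` stays valid while a reference to the erased key would not.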
Sage Weil
6f7179daa6 os/bluestore/KernelDevice: fix RW_IO_MAX constant
This depends on the page size.  See:

6e6d05360b/include/linux/fs.h (L2305)

30d1d92a88/tools/virtio/linux/kernel.h (L23)

Fixes 4d33114a40

Fixes: https://tracker.ceph.com/issues/41188
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:43:43 -05:00
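
Why such a limit depends on the page size can be shown with a little arithmetic. This is a minimal sketch, assuming the cap is the largest `int` value rounded down to a page boundary, which is how the kernel header referenced above appears to define its read/write limit. `rw_io_max_for` is a hypothetical helper, not a Ceph function.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical helper: largest transfer size that fits in an int and is a
// whole number of pages. `page_size` must be a non-zero power of two.
inline std::size_t rw_io_max_for(std::size_t page_size) {
  assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
  // Clearing the low bits rounds INT_MAX down to a multiple of page_size.
  return static_cast<std::size_t>(INT_MAX) & ~(page_size - 1);
}
```

With a 32-bit `int`, a 4 KiB page gives 0x7ffff000 and a 64 KiB page gives 0x7fff0000, so a value hard-coded for one page size is wrong on the other. On POSIX systems the page size can be queried at run time with `sysconf(_SC_PAGESIZE)`.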
Sage Weil
403f1195b0 qa/tasks/mgr/dashboard/test_health: update schema
Also fix the 'checks' field, which is a list of objects, not strings.  (The
test doesn't notice because it's empty.)

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
2a1b58b5ac doc/rados/operations/monitoring: document muting health alerts
I think someday the docs for how health alerts work (here) and the
enumeration of all actual alerts should be restructured.  For now this
is the simplest place to fit this!

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
710fef96ea qa/standalone/mon/health-mutes: add tests
Make sure mute and unmute work.  Make sure sticky is sticky.  Make sure
counts can go down, but if they go up the mute clears.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
95b8e9fa0d doc/rados/operations/health-checks: document MON_DISK_{LOW,CRIT,BIG}
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
dd5e985614 doc/rados/operations/health-checks: document OSD_NO_DOWN_OUT_INTERVAL
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
0eba993fad doc/rados/operations/health-checks: document AUTH_BAD_CAPS
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
7e9ba0a1c1 doc/rados/operations/health-checks: document PG_SLOW_SNAP_TRIMMING
The mitigation steps are weak, but it's not clear what concrete guidance
to provide.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
078ef210d5 doc/rados/operations/health-checks: document MGR_DOWN
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
7385e917bb mon/HealthCheck: check mutes based on count, not parsing the summary string
This is more explicit and robust, and works with the PG warnings, which
don't conform to the "%d ..." form that the other messages do.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
d0eb22f3ba mon/health_checks: associate a count with health_alert_t
0 means this is a singleton.  Otherwise, we can sum this up, either
via merge() or get_or_add().

We always structure this so the count goes toward zero (more healthy), so
if a value is too low, then we count how much too low it is.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
7834cd8146 mon/HealthMonitor: simplify health alert dump
Use dump() member instead of duplicating!  The only reason we had this
before was because the detail portion was optional.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
164de7d69b mon/PGMap: use nice timespan for PG stuck warnings
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
0acff80f64 mon/HealthMonitor: allow muted alert counts to decrease but not increase
If the summary starts with a digit, parse a count.

If the count goes up, clear the mute.

If the count goes down, update the mute so that we ratchet the threshold
down.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
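
The ratchet rule in the commit above can be sketched in a few lines. This is an illustration only: `Mute` and `update_mute` are hypothetical names, and the real monitor tracks more state (such as TTL and the sticky flag) than this example does.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical record of a muted alert and the count it was muted at.
struct Mute {
  long count = 0;
};

// Apply a freshly observed count for alert `code`.
// Returns true if the alert is still muted afterwards.
inline bool update_mute(std::map<std::string, Mute>& mutes,
                        const std::string& code, long observed) {
  assert(observed >= 0);
  auto it = mutes.find(code);
  if (it == mutes.end()) {
    return false;               // this alert was never muted
  }
  if (observed > it->second.count) {
    mutes.erase(it);            // got worse: clear the mute
    return false;
  }
  it->second.count = observed;  // same or better: ratchet the threshold down
  return true;
}
```

For example, an alert muted at a count of 3 stays muted when the count drops to 2, but a later rise back to 3 clears the mute because the stored threshold is now 2.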
Sage Weil
e4784af4ca mon/PGMap: fix summary form for bluestore health alerts
Count goes first.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
1b6745efb4 doc/rados/operations/health-alerts: document BLUESTORE_NO_COMPRESSION
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
1ae154a433 mon/PGMap: fix summary form for POOL_APP_NOT_ENABLED
Count goes first.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
c09dc4ec45 mon/HealthMonitor: persist summary for non-sticky mutes
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
29d74309aa mon/HealthMonitor: move get_health_status()
This operates exclusively on HealthMonitor members.  Make public member
private again.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
70db3f025a mon/HealthMonitor: automatically clear non-sticky mutes when alert clears
If the alert goes away, drop the mute.

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
1600ff60d6 mon/HealthMonitor: add gather_all_health_checks helper
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
2dff777ba9 mon/HealthMonitor: add sticky flag to mutes
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:40:08 -05:00
Sage Weil
ffc98aa606 mon/HealthMonitor: expire mutes based on ttl
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00
Sage Weil
553f1c5578 mon: apply mutes to health [detail]
- de-escalate severity
- mark mutes in structured output
- note mutes in summary text output
- mark mutes in detail text output

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00
Sage Weil
b00d4ca085 mon/HealthMonitor: implement mute and unmute commands
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00
Sage Weil
e2c320dc2b mon/HealthMonitor: maintain list of mutes
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00
Sage Weil
062681a72f mon: refactor/simplify health [detail]
Get rid of single caller helpers.  Instead, assimilate all the checks
together at once, and have two separate blocks, one for formatted, and
one for plaintext output.  Much easier to follow!

Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00
Sage Weil
2a3f89fe6b mon/health_checks: format 'health summary' with a colon
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00
Sage Weil
ee30f1b68a mon/health_checks: drop dump_summary_compat
Signed-off-by: Sage Weil <sage@redhat.com>
2019-08-14 20:37:00 -05:00
David Zafman
5928fe8ca0 osd/PG: scrub error when objects are larger than osd_max_object_size
Signed-off-by: David Zafman <dzafman@redhat.com>
2019-08-14 20:25:12 -05:00
Patrick Donnelly
d1ce58257e
Merge PR #29431 into master
* refs/pull/29431/head:
	qa: fix malformed suite config

Reviewed-by: Zheng Yan <zyan@redhat.com>
2019-08-14 15:21:51 -07:00
Patrick Donnelly
aed88d43a1
Merge PR #28652 into master
* refs/pull/28652/head:
	cephfs-shell: Add error message for invalid ls commands

Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2019-08-14 15:05:29 -07:00
Patrick Donnelly
48d4499b86
Merge PR #29554 into master
* refs/pull/29554/head:
	cephfs-shell: Fix onecmd TypeError

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2019-08-14 15:04:04 -07:00
Patrick Donnelly
a809a9aaf9
Merge PR #29552 into master
* refs/pull/29552/head:
	cephfs-shell: Convert paths type from string to bytes

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
2019-08-14 15:01:35 -07:00
Matt Benjamin
ca546fc53e rgw_file: readdir: do not construct markers w/leading '/'
This case arises when listing the top directory of a bucket, and,
with proper continued enumeration, would generate a non-terminating
loop if a directory contained names which sort lexically before '/'.

Fixes: https://tracker.ceph.com/issues/41252

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2019-08-14 14:47:20 -04:00
Yuri Weinstein
6f3c0c9641
Merge pull request #29666 from yuriw/wip-yuriw-crontab-master
qa/tests - upped priority for upgrades on master, otherwise they neve…
2019-08-14 09:49:39 -07:00