15 Commits

Author SHA1 Message Date
Xavier Villaneau 74c89af225 Implement new gauge counting crash reports
New metric: `ceph_crash_reports` which counts the entries returned by
`ceph crash ls` by daemon name and archival status.

This is not the same as `ceph_new_crash_reports`, which reports the
value of the `RECENT_CRASH` health check and only counts non-archived
crashes from the past two weeks. The new metric counts crashes for as
long as they are not purged (which happens after 1 year by default).
2022-06-15 17:04:04 -04:00
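
The counting described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the exporter's actual code: the `crashEntry` type, its fields, and `countCrashReports` are hypothetical names, and the entries are assumed to have already been parsed from the output of `ceph crash ls`.

```go
package main

import "fmt"

// crashEntry is a hypothetical, simplified view of one entry returned by
// `ceph crash ls`: the daemon that crashed and whether the report has
// been archived. The real output carries more fields.
type crashEntry struct {
	Daemon   string
	Archived bool
}

// countCrashReports groups entries by (daemon, archived) and counts them.
// Each map key would correspond to one label combination of the gauge.
func countCrashReports(entries []crashEntry) map[crashEntry]int {
	counts := make(map[crashEntry]int)
	for _, e := range entries {
		counts[e]++
	}
	return counts
}

func main() {
	entries := []crashEntry{
		{Daemon: "osd.0", Archived: false},
		{Daemon: "osd.0", Archived: true},
		{Daemon: "osd.0", Archived: true},
		{Daemon: "mgr.a", Archived: false},
	}
	for key, n := range countCrashReports(entries) {
		fmt.Printf("daemon=%s archived=%t count=%d\n", key.Daemon, key.Archived, n)
	}
}
```

Because archived entries are counted too, a daemon keeps contributing to the metric after its crashes are acknowledged, which is the difference from the `RECENT_CRASH`-based value.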
AKYD 763e5ecd21 Normalize ceph-ansible version format 2022-05-25 11:49:04 +03:00
Joshua Baergen ebd166be2d ceph: Support the Octopus+ mgrmap format. 2022-04-12 08:52:04 -06:00
Joshua Baergen 4e0f8910a4 Add missing tests for Octopus+ osdmap format.
In TestClusterHealthCollector, test all supported versions by default,
and split the osdmap tests for Nautilus vs. Octopus+. There were a
number of tests that included an osdmap that didn't need it, and the
osdmap was removed from them so that version-specific testing would not
be required.
2022-04-12 08:52:01 -06:00
haoyixing 407248ce1d feat: add misplaced ratio metric
The misplaced ratio equals misplaced_objects divided by misplaced_total, not misplaced_objects / num_objects.
So add a separate metric to show the misplaced ratio.

Signed-off-by: haoyixing <haoyixing@kuaishou.com>
2022-03-29 18:38:15 -07:00
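
The ratio described in the commit above amounts to a single division. The sketch below is illustrative only: `misplacedRatio` is a hypothetical helper, and returning 0 when the total is zero is an assumption made here to avoid dividing by zero, not something the commit message states.

```go
package main

import "fmt"

// misplacedRatio returns misplacedObjects / misplacedTotal.
// Assumption: a zero total is reported as a ratio of 0 so that the
// result is never NaN or Inf.
func misplacedRatio(misplacedObjects, misplacedTotal float64) float64 {
	if misplacedTotal == 0 {
		return 0
	}
	return misplacedObjects / misplacedTotal
}

func main() {
	fmt.Println(misplacedRatio(25, 100))
	fmt.Println(misplacedRatio(0, 0))
}
```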
Kyle 917a468065 update deps and reduce a warn to debug 2022-03-29 17:44:50 -07:00
Kyle 1d7bac531d update license headers 2022-03-23 14:02:21 -07:00
Kyle 4d817f487d fix staticcheck errors 2022-03-23 12:24:28 -07:00
Kyle d6b67a77c3 removed down osd duplicate filtering 2022-03-22 12:59:51 -07:00
Kyle 3a0b289eda filter duplicate OSD nodes for down health check and fix health tests 2022-03-21 15:28:20 -07:00
Kyle b806cf51bb remove pre-nautilus health check code 2022-03-21 14:52:34 -07:00
Kyle df7435b259 add DAEMON_OLD_VERSION health check, update readme, remove makefile 2022-03-21 13:56:19 -07:00
Kyle 2122a3331f support flattened osdmap format added in octopus 2022-03-16 14:13:57 -07:00
Xavier Villaneau 6f83fdd300 Restructure so that tests do not depend on go-ceph
- `ceph.Conn` interface no longer depends on go-ceph/rados,
  now defines its own `PoolStat` structure for our use.
- New separate `rados` package that implements the interface
- Merged `mocks` package into `ceph` to avoid circular import
2022-02-24 15:57:00 -05:00
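
The restructuring described in the commit above follows a common Go pattern: the collectors depend on a small interface with locally defined types, and a test double implements it without linking against the real client library. The sketch below only illustrates that pattern. The method set, the `PoolStat` field, `fakeConn`, and `unfoundObjects` are all assumptions and do not reproduce the project's actual definitions.

```go
package main

import (
	"errors"
	"fmt"
)

// PoolStat is a locally defined stats structure, so callers do not need
// to import a client library's type. The single field is illustrative.
type PoolStat struct {
	ObjectsUnfound uint64
}

// Conn is a hypothetical cluster connection interface. Its methods are
// assumptions chosen for this sketch.
type Conn interface {
	MonCommand(args []byte) ([]byte, error)
	GetPoolStats(pool string) (*PoolStat, error)
}

// fakeConn is an in-memory test double that satisfies Conn.
type fakeConn struct {
	monReply []byte
	pools    map[string]PoolStat
}

// MonCommand ignores its input and returns the canned reply.
func (f *fakeConn) MonCommand(args []byte) ([]byte, error) {
	return f.monReply, nil
}

// GetPoolStats looks the pool up in the canned map.
func (f *fakeConn) GetPoolStats(pool string) (*PoolStat, error) {
	st, ok := f.pools[pool]
	if !ok {
		return nil, errors.New("pool not found: " + pool)
	}
	return &st, nil
}

// unfoundObjects stands in for collector logic that depends only on Conn.
func unfoundObjects(c Conn, pool string) (uint64, error) {
	st, err := c.GetPoolStats(pool)
	if err != nil {
		return 0, err
	}
	return st.ObjectsUnfound, nil
}

func main() {
	conn := &fakeConn{pools: map[string]PoolStat{"rbd": {ObjectsUnfound: 3}}}
	n, err := unfoundObjects(conn, "rbd")
	fmt.Println(n, err)
}
```

A production implementation of the same interface would live in its own package and wrap the real client, which keeps the tests buildable on machines without the Ceph development libraries.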
Kyle 566f1fa5d3 a ton of refactoring 2022-02-23 15:43:46 -08:00