Commit Graph

351 Commits

Author SHA1 Message Date
Xavier Villaneau 74c89af225 Implement new gauge counting crash reports
New metric: `ceph_crash_reports` which counts the entries returned by
`ceph crash ls` by daemon name and archival status.

This is not the same as `ceph_new_crash_reports`, which is the value of
the `RECENT_CRASH` health check and only counts the non-archived
errors of the past two weeks. The new metric counts errors as long as
they are not purged (which is done after 1 year by default).
2022-06-15 17:04:04 -04:00
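The grouping described in the commit message above (one count per daemon name and archival status) could be sketched roughly as follows. This is an illustrative sketch only, not the exporter's actual code: `CrashEntry`, `CrashKey`, and `CountCrashReports` are hypothetical names, and the input is assumed to be rows of `ceph crash ls` that have already been parsed.

```go
package main

// CrashEntry is a hypothetical, already-parsed row of `ceph crash ls`.
// How the real exporter parses the command output is not shown here.
type CrashEntry struct {
	Daemon   string // daemon name, e.g. "osd.0"
	Archived bool   // true if the crash report has been archived
}

// CrashKey identifies one gauge series: a daemon name plus archival status.
type CrashKey struct {
	Daemon   string
	Archived bool
}

// CountCrashReports counts entries per (daemon, archival status) pair.
// Unlike a RECENT_CRASH-style count, archived entries are kept and no
// time window is applied: every entry still listed is counted.
func CountCrashReports(entries []CrashEntry) map[CrashKey]int {
	counts := make(map[CrashKey]int)
	for _, e := range entries {
		counts[CrashKey{Daemon: e.Daemon, Archived: e.Archived}]++
	}
	return counts
}
```

Each map entry would then correspond to one labelled value of the gauge, assuming daemon name and archival status are used as its labels.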
Joshua Baergen 56bd79f4be
Merge pull request #216 from AKYD/update_version_validation
Normalize ceph-ansible version format
2022-05-25 11:21:46 -06:00
AKYD 763e5ecd21 Normalize ceph-ansible version format 2022-05-25 11:49:04 +03:00
Joshua Baergen 43df70b181
Merge pull request #215 from digitalocean/fix-pacific-mgrs
ceph: Support the Octopus+ mgrmap format; improve multi-version testing.
2022-04-12 11:20:41 -06:00
Joshua Baergen ebd166be2d ceph: Support the Octopus+ mgrmap format. 2022-04-12 08:52:04 -06:00
Joshua Baergen 4e0f8910a4 Add missing tests for Octopus+ osdmap format.
In TestClusterHealthCollector, test all supported versions by default,
and split the osdmap tests for Nautilus vs. Octopus+. There were a
number of tests that included an osdmap that didn't need it, and the
osdmap was removed from them so that version-specific testing would not
be required.
2022-04-12 08:52:01 -06:00
Kyle e54e159791 add docker build and push action 2022-03-30 11:41:21 -07:00
Kyle 602a178af1
Merge pull request #213 from Rethan/feat-misplaced-ratio
feat: add misplaced ratio metric
2022-03-29 18:44:01 -07:00
haoyixing 407248ce1d feat: add misplaced ratio metric
Misplaced ratio equals misplaced_objects divided by misplaced_total, not misplaced_objects / num_objects.
So add a separate metric to show the misplaced ratio.

Signed-off-by: haoyixing <haoyixing@kuaishou.com>
2022-03-29 18:38:15 -07:00
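The ratio named in the commit message above can be illustrated with a small sketch. The function name `MisplacedRatio` is hypothetical, and returning 0 when `misplaced_total` is zero is an assumption made here to avoid a division by zero; the commit does not say how the exporter handles that case.

```go
package main

// MisplacedRatio returns misplaced_objects / misplaced_total, as described
// in the commit message. It deliberately does not divide by num_objects.
//
// Assumption: when misplacedTotal is zero, 0 is returned instead of
// NaN or +Inf.
func MisplacedRatio(misplacedObjects, misplacedTotal float64) float64 {
	if misplacedTotal == 0 {
		return 0
	}
	return misplacedObjects / misplacedTotal
}
```

For example, 25 misplaced objects out of a misplaced total of 100 would give a ratio of 0.25.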
Kyle 52ecf44451
Merge pull request #211 from digitalocean/4.0-dev
v4.0.0
2022-03-29 18:21:14 -07:00
Kyle 917a468065 update deps and reduce a warn to debug 2022-03-29 17:44:50 -07:00
Kyle ce4e3993c4 update build workflow 2022-03-29 12:41:44 -07:00
Kyle 1d7bac531d update license headers 2022-03-23 14:02:21 -07:00
Kyle 00c0dacc02
Merge pull request #210 from digitalocean/more-new-stuff
4.0-rc1
2022-03-23 13:31:49 -07:00
Kyle cf432402f5 update README 2022-03-23 13:26:33 -07:00
Kyle 4d817f487d fix staticcheck errors 2022-03-23 12:24:28 -07:00
Kyle ef01452f1c reload tls cert whenever it is requested 2022-03-23 11:43:13 -07:00
Kyle d6b67a77c3 removed down osd duplicate filtering 2022-03-22 12:59:51 -07:00
Kyle 5e7fae5d5a add TLS support 2022-03-22 10:40:40 -07:00
Kyle 3a0b289eda filter duplicate OSD nodes for down health check and fix health tests 2022-03-21 15:28:20 -07:00
Kyle b806cf51bb remove pre-nautilus health check code 2022-03-21 14:52:34 -07:00
Kyle df7435b259 add DAEMON_OLD_VERSION health check, update readme, remove makefile 2022-03-21 13:56:19 -07:00
Kyle e0d8ba4d6f
Merge pull request #209 from digitalocean/update-dockerfile
update Go to 1.18 and Docker image to focal
2022-03-18 10:19:18 -07:00
Kyle 64e5410753 target nautilus and update workflows to use Go 1.18 2022-03-18 10:16:08 -07:00
Kyle 11cca676b8 update Go to 1.18 and Docker image to focal 2022-03-18 10:04:51 -07:00
Kyle b5897900cb
Merge pull request #207 from digitalocean/multiple-version-support
* add version parser and IsAtLeast constraint
* refactor package structure
* restructure tests to remove go-ceph requirement
* split CI into build and test
* support flattened osdmap format added in octopus
2022-03-17 12:15:50 -07:00
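As a rough illustration of the "version parser and IsAtLeast constraint" item in the list above, a minimal sketch might look like this. Only the name `IsAtLeast` comes from the commit message; the `Version` type, `ParseVersion`, and the plain `major.minor.patch` input format are assumptions for this example and may differ from the actual implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Version is a hypothetical major.minor.patch triple
// (for Ceph: 14 = Nautilus, 15 = Octopus, 16 = Pacific).
type Version struct {
	Major, Minor, Patch int
}

// ParseVersion parses a plain "major.minor.patch" string.
// Assumption: any surrounding text has already been stripped.
func ParseVersion(s string) (Version, error) {
	parts := strings.Split(strings.TrimSpace(s), ".")
	if len(parts) != 3 {
		return Version{}, fmt.Errorf("expected major.minor.patch, got %q", s)
	}
	var nums [3]int
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return Version{}, fmt.Errorf("invalid version component %q in %q", p, s)
		}
		nums[i] = n
	}
	return Version{Major: nums[0], Minor: nums[1], Patch: nums[2]}, nil
}

// IsAtLeast reports whether v is greater than or equal to other,
// comparing major, then minor, then patch.
func (v Version) IsAtLeast(other Version) bool {
	if v.Major != other.Major {
		return v.Major > other.Major
	}
	if v.Minor != other.Minor {
		return v.Minor > other.Minor
	}
	return v.Patch >= other.Patch
}
```

A collector could then branch on something like `v.IsAtLeast(Version{Major: 15})` to choose between the Nautilus and the Octopus+ osdmap formats, assuming the cluster version has been parsed first.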
Kyle 2122a3331f support flattened osdmap format added in octopus 2022-03-16 14:13:57 -07:00
Xavier Villaneau 94efb30be1 CI: Split build and tests into separate workflows 2022-02-24 15:57:00 -05:00
Xavier Villaneau 6f83fdd300 Restructure so that tests do not depend on go-ceph
- `ceph.Conn` interface no longer depends on go-ceph/rados,
  now defines its own `PoolStat` structure for our use.
- New separate `rados` package that implements the interface
- Merged `mocks` package into `ceph` to avoid circular import
2022-02-24 15:57:00 -05:00
Kyle 566f1fa5d3 a ton of refactoring 2022-02-23 15:43:46 -08:00
Kyle 13e97cd25d introduce version parser and IsAtLeast constraint 2022-02-22 16:00:42 -08:00
Kyle 4e84633fc0 allow different collectors by ceph version 2022-02-16 10:00:05 -08:00
Xavier Villaneau 5ea59b00fd ci: Use GitHub Actions to run tests 2022-02-14 15:17:42 -05:00
Kyle 8fe2bcc648 fix pg states test case 2022-02-14 11:01:25 -08:00
Kyle cfe7dc2df3 use t.Run for table driven health test 2022-02-14 10:46:31 -08:00
Kyle 625f1fe8cf update Go and go-ceph 2022-02-11 13:44:55 -08:00
Matt1360 e8ea7d7e66
Merge pull request #204 from digitalocean/repair-counter
collectors/health: add repair state checking
2021-12-21 15:07:04 -04:00
Matt Vandersomething 47b7ae2ed6
collectors/health: add repair state checking 2021-12-21 14:24:47 -04:00
Alexandre Marangone 8aa2b4127d
Merge pull request #203 from digitalocean/amarangone/STORSYS-347
health: add osds_too_many_repair gauge
2021-12-02 09:45:22 -08:00
Alex Marangone ef8b362842 health: add osds_too_many_repair gauge 2021-12-02 09:34:23 -08:00
Matt1360 d33169f435
Merge pull request #200 from digitalocean/nautilus-snaptrim-fix
collectors/health: fix snaptrim collection, add a test
2021-09-09 14:45:32 -03:00
Matt Vandersomething 71a4783863
collectors/health: fix snaptrim collection, add a test 2021-09-09 14:09:30 -03:00
Matt1360 bf9b28cfaf
Merge pull request #198 from digitalocean/nautilus-snaptrim-collection
nautilus: collectors/health: include `snaptrim{,_wait}` PG states
2021-08-26 12:57:41 -03:00
Matt Vandersomething 836e7c4dc6
collectors/health: include `snaptrim{,_wait}` PG states 2021-08-26 12:07:13 -03:00
Matt1360 712c30902b
Merge pull request #196 from digitalocean/pg-collector
collectors/osd: fix metric name stutter
2021-05-25 11:12:03 -04:00
Matt Vandersomething 45b583d4a8
collectors/osd: fix metric name stutter 2021-05-25 11:03:41 -04:00
Matt1360 2accbad9f9
Merge pull request #195 from digitalocean/pg-collector
Golang update, expose Ceph versions and features, PG inactive tracking
2021-05-25 09:46:12 -04:00
Matt Vandersomething a70eac9a44
collectors/monitors: added tests, and use gabs for dynamic JSON 2021-05-20 16:40:24 -04:00
Joshua Baergen 0b2266af78
Merge pull request #192 from AKYD/nautilus
Use IEC instead of SI units
2021-01-14 10:17:52 -07:00
alin.hrapciuc@gmail.com 3740143393 Update tests for osd and monitors 2021-01-14 17:27:36 +02:00