Commit Graph

275 Commits

Author SHA1 Message Date
Kyle
2122a3331f support flattened osdmap format added in octopus 2022-03-16 14:13:57 -07:00
Xavier Villaneau
94efb30be1 CI: Split build and tests into separate workflows 2022-02-24 15:57:00 -05:00
Xavier Villaneau
6f83fdd300 Restructure so that tests do not depend on go-ceph
- `ceph.Conn` interface no longer depends on go-ceph/rados,
  now defines its own `PoolStat` structure for our use.
- New separate `rados` package that implements the interface
- Merged `mocks` package into `ceph` to avoid circular import
2022-02-24 15:57:00 -05:00
Kyle
566f1fa5d3 a ton of refactoring 2022-02-23 15:43:46 -08:00
Kyle
13e97cd25d introduce version parser and IsAtLeast constraint 2022-02-22 16:00:42 -08:00
Kyle
4e84633fc0 allow different collectors by ceph version 2022-02-16 10:00:05 -08:00
Xavier Villaneau
5ea59b00fd ci: Use GitHub Actions to run tests 2022-02-14 15:17:42 -05:00
Kyle
8fe2bcc648 fix pg states test case 2022-02-14 11:01:25 -08:00
Kyle
cfe7dc2df3 use t.Run for table driven health test 2022-02-14 10:46:31 -08:00
Kyle
625f1fe8cf update Go and go-ceph 2022-02-11 13:44:55 -08:00
Matt1360
e8ea7d7e66
Merge pull request #204 from digitalocean/repair-counter
collectors/health: add repair state checking
2021-12-21 15:07:04 -04:00
Matt Vandersomething
47b7ae2ed6
collectors/health: add repair state checking 2021-12-21 14:24:47 -04:00
Alexandre Marangone
8aa2b4127d
Merge pull request #203 from digitalocean/amarangone/STORSYS-347
health: add osds_too_many_repair gauge
2021-12-02 09:45:22 -08:00
Alex Marangone
ef8b362842 health: add osds_too_many_repair gauge 2021-12-02 09:34:23 -08:00
Matt1360
d33169f435
Merge pull request #200 from digitalocean/nautilus-snaptrim-fix
collectors/health: fix snaptrim collection, add a test
2021-09-09 14:45:32 -03:00
Matt Vandersomething
71a4783863
collectors/health: fix snaptrim collection, add a test 2021-09-09 14:09:30 -03:00
Matt1360
bf9b28cfaf
Merge pull request #198 from digitalocean/nautilus-snaptrim-collection
nautilus: collectors/health: include `snaptrim{,_wait}` PG states
2021-08-26 12:57:41 -03:00
Matt Vandersomething
836e7c4dc6
collectors/health: include snaptrim{,_wait} PG states 2021-08-26 12:07:13 -03:00
Matt1360
712c30902b
Merge pull request #196 from digitalocean/pg-collector
collectors/osd: fix metric name stutter
2021-05-25 11:12:03 -04:00
Matt Vandersomething
45b583d4a8
collectors/osd: fix metric name stutter 2021-05-25 11:03:41 -04:00
Matt1360
2accbad9f9
Merge pull request #195 from digitalocean/pg-collector
Golang update, expose Ceph versions and features, PG inactive tracking
2021-05-25 09:46:12 -04:00
Matt Vandersomething
a70eac9a44
collectors/monitors: added tests, and use gabs for dynamic JSON 2021-05-20 16:40:24 -04:00
Joshua Baergen
0b2266af78
Merge pull request #192 from AKYD/nautilus
Use IEC instead of SI units
2021-01-14 10:17:52 -07:00
alin.hrapciuc@gmail.com
3740143393 Update tests for osd and monitors 2021-01-14 17:27:36 +02:00
alin.hrapciuc@gmail.com
c9e65abd6f Fix units for latency 2021-01-14 16:41:28 +02:00
AKYD
9be0cd69d3 Use IEC instead of SI units 2021-01-14 09:46:59 +02:00
Joshua Baergen
0ee6da60a0
Merge pull request #189 from digitalocean/nautilus-pg-upmap-items
collectors/osd: Export the total number of items in the pg-upmap table.
2020-12-18 06:14:19 -07:00
Joshua Baergen
cb429a2b91 collectors/osd: Export the total number of items in the pg-upmap table. 2020-12-17 16:31:34 -07:00
Joshua Baergen
f7011dbe78
Merge pull request #188 from digitalocean/nautilus-misplaced_objects
collectors/health: Fix ceph_misplaced_objects on Nautilus.
2020-12-11 08:33:27 -07:00
Joshua Baergen
b4b94a0844 collectors/health: Fix ceph_misplaced_objects on Nautilus.
Nautilus no longer reports misplaced objects as a health status, but it
is available in the pgmap data. For consistency, let's get the degraded
object count from there as well.
2020-12-10 14:54:05 -07:00
Joshua Baergen
8a1f51881f
Merge pull request #187 from shminjs/feat-add-mon-down-metric
Add new gauge to show the count of mon in down state.
2020-12-04 06:58:31 -07:00
shimin
b01931d0c4 Add new gauge to show the count of mon in down state.
When a monitor is down, the administrator should be notified urgently.

Signed-off-by: shminjs <shminjs@outlook.com>
2020-12-04 19:58:41 +08:00
Joshua Baergen
38c5cc7360
Merge pull request #185 from Rethan/feat_add_osd_full_ratio
feat: add osd full/nearfull/backfillfull ratio
2020-11-06 07:14:59 -07:00
haoyixing
5ef76dbd19 update osd_test
Signed-off-by: haoyixing <haoyixing@kuaishou.com>
2020-11-06 18:03:16 +08:00
haoyixing
c60444dcef feat: add osd full/nearfull/backfillfull ratio
Add new gauge to show osd full/nearfull/backfillfull ratio.
Not only do we need to know whether an osd is full or not, we
also want to know the exact full ratio for a cluster.
For clusters which have different full ratios set, this
should be meaningful.

Signed-off-by: haoyixing <haoyixing@kuaishou.com>
2020-11-06 16:21:30 +08:00
Yue Zhu
4906d5b866
Merge pull request #184 from digitalocean/yzhu/fix-osd-collector
Use MgrCommand for "osd df", "osd perf" and "pg dump pgs_brief"
2020-10-29 22:44:33 -04:00
Yue Zhu
1778243b17 Remove go get because we use go mod 2020-10-29 21:26:41 -04:00
Yue Zhu
477ce579f7 Update Travis 2020-10-29 15:14:31 -04:00
Yue Zhu
252eb6604a Use MgrCommand for "osd df", "osd perf" and "pg dump pgs_brief" 2020-10-29 13:44:43 -04:00
Yue Zhu
238f39a71b
Refactoring for nautilus branch (#183)
* Run -race for go test

* Upgrade go 1.15.3

* Extract a function to create rados connection

* Convert to go module; update dependencies

* Use environment variables to pass in parameters

* Make rados connection short lived

* Use float64 for JSON number in cluster_usage.go

* Use mocks to replace the NoopConn

* Add "-tags nautilus" for "go test" and "go build"

* Update readme

* Update go mod
2020-10-28 14:42:52 -04:00
Max Kuznetsov
c9552f0f9f
Merge pull request #180 from syhpoon/BLOCK-2615
fix rbd-mirror daemon format in Nautilus
2020-10-07 16:09:02 -04:00
Max Kuznetsov
77d7977809 fix rbd-mirror daemon format in Nautilus 2020-10-07 15:55:58 -04:00
Max Kuznetsov
de55efabeb
Merge pull request #179 from syhpoon/BLOCK-2559
add metric to capture the number of inconsistent pgs
2020-09-16 11:32:38 -04:00
Max Kuznetsov
b75904e25f add metric to capture the number of inconsistent pgs 2020-09-16 11:27:27 -04:00
Max Kuznetsov
21fb2c4d24
Merge pull request #177 from syhpoon/BLOCK-2553
rbd_mirror_up: switch to const metric
2020-09-15 11:46:03 -04:00
Max Kuznetsov
dc6728deb0 rbd_mirror_up: switch to const metric 2020-09-15 11:26:30 -04:00
Max Kuznetsov
a5c61bc5ba
Merge pull request #175 from syhpoon/BLOCK-2553
add new metric: rbd_mirror_up
2020-09-14 13:37:22 -04:00
Max Kuznetsov
6056ab259b add new metric: rbd_mirror_up 2020-09-14 13:06:18 -04:00
Yue Zhu
0cf8d3e787
Merge pull request #173 from digitalocean/pool-unfound-objects-nautilus
nautilus: add pool metric unfound_objects_total
2020-09-10 16:39:57 -04:00
Yue Zhu
0242f191c2 Add pool metric unfound_objects_total
(cherry picked from commit 170a989eed)
2020-09-10 16:29:05 -04:00