ceph/monitoring/grafana/dashboards
Ernesto Puerta cc6b18a92c
Merge pull request #41880 from david-caro/fix_cluster_grafana_dashboard
monitoring/grafana/cluster: use per-unit max and limit values

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: p-se <NOT@FOUND>
2021-08-02 13:03:46 +02:00
ceph-cluster.json Merge pull request #41880 from david-caro/fix_cluster_grafana_dashboard 2021-08-02 13:03:46 +02:00
cephfs-overview.json
CMakeLists.txt
host-details.json Merge pull request #41838 from p-se/grafana-clean-up 2021-06-25 20:45:28 +02:00
hosts-overview.json mgr/dashboard: fix OSDs Host details/overview grafana graphs 2021-05-07 15:38:07 +02:00
osd-device-details.json monitoring: fix Physical Device Latency unit 2021-07-07 17:00:30 +04:30
osds-overview.json mgr/dashboard: Prometheus query error in the metrics of Pools, OSDs and RBD images 2020-04-21 23:03:09 +05:30
pool-detail.json mgr/dashboard: Remove hard-coded timezone off Grafana dashboards 2021-06-16 09:10:32 +02:00
pool-overview.json mgr/dashboard: Remove hard-coded timezone off Grafana dashboards 2021-06-16 09:10:32 +02:00
radosgw-detail.json mgr/dashboard: deprecated variable usage in Grafana dashboards 2021-06-07 14:31:53 +02:00
radosgw-overview.json monitoring: fix RGW grafana chart 'Average GET/PUT Latencies' 2020-03-10 12:05:26 +01:00
radosgw-sync-overview.json mgr/dashboard: grafana panels for rgw multisite sync performance 2020-05-22 13:36:10 +02:00
rbd-details.json monitoring: convert newline character to LF 2021-06-16 09:10:32 +02:00
rbd-overview.json mgr/dashboard: Prometheus query error in the metrics of Pools, OSDs and RBD images 2020-04-21 23:03:09 +05:30
README

Context
These dashboards should be enough to get started on the integration. It's not a complete set, so more will be added in the next week.

Bear in mind that the OSD device details dashboard needs node_exporter to be active. All the other dashboards draw their data from ceph-mgr based metrics.


The cephfs dashboard currently has only 2 panels. The counters available are
a little light at the moment. Patrick/Venky have been addressing this with
https://bugzilla.redhat.com/show_bug.cgi?id=1618523
cephfs-overview.json

Host Information
host-details.json combines generic server metrics that show CPU/memory/network stats (including network errors/drops)
with disk-level stats for OSD hosts. OSD charts show the physical device name together with its corresponding OSD id for correlation.

Ceph Pools
Two dashboards. The overview gives the high-level combined view. pool-detail needs a pool_name variable passed to it (it currently uses a templating variable, which is visible in the dashboard).
pool-overview.json
pool-detail.json
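
One way to pass pool_name to the detail dashboard is through the URL, since Grafana reads template variables from var-<name> query parameters. The sketch below is only an illustration: the base URL, dashboard UID and slug are hypothetical placeholders, and dashboard_url is a helper invented for this example, not part of Ceph or Grafana.

```python
from urllib.parse import quote, urlencode


def dashboard_url(base_url, uid, slug, variables):
    """Build a Grafana dashboard URL that presets template variables.

    Each variable is sent as a var-<name>=<value> query parameter.
    base_url, uid and slug depend on the local Grafana installation
    (the values used below are made up for illustration).
    """
    params = {f"var-{name}": value for name, value in variables.items()}
    # quote_via=quote encodes spaces as %20 instead of '+'
    query = urlencode(params, quote_via=quote)
    return f"{base_url.rstrip('/')}/d/{uid}/{slug}?{query}"


if __name__ == "__main__":
    # Hypothetical host, UID and slug; replace with values from your Grafana.
    print(dashboard_url(
        "http://grafana.example:3000",
        "pool-detail-uid",
        "ceph-pool-detail",
        {"pool_name": "rbd"},
    ))
```

A link built this way from the overview opens the detail dashboard with the pool already selected.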

OSD Device Details. This dashboard needs some further work. It currently shows
OSD-level stats with physical device stats but leaves out some of the counters
that cephmetrics provides for troubleshooting.
osd-device-details.json

Object gateway dashboards, again split into overview and detail. The detail dashboard needs the relevant ceph-daemon name for the rgw instance.
radosgw-overview.json
radosgw-detail.json
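
To see which variables a detail dashboard expects before wiring it up, the JSON file can be inspected directly. The sketch below assumes the usual Grafana dashboard layout, where template variables are listed under templating.list. The helper names and the sample dictionary are made up for illustration; the real variable names should be read from the dashboard files themselves.

```python
import json


def load_dashboard(path):
    """Read a dashboard JSON file into a dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def template_variables(dashboard):
    """Return the names of the template variables a dashboard declares.

    Assumes the variables live under dashboard["templating"]["list"];
    entries without a "name" key are skipped.
    """
    entries = dashboard.get("templating", {}).get("list", [])
    return [entry["name"] for entry in entries if "name" in entry]


if __name__ == "__main__":
    # Hypothetical stand-in for a loaded dashboard, not the real file content.
    sample = {"templating": {"list": [{"name": "example_var", "type": "query"}]}}
    print(template_variables(sample))
```

With a checkout of this directory, template_variables(load_dashboard("radosgw-detail.json")) would list the variables that dashboard declares.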