Metrics Collected
Ceph exporter implements multiple collectors:
Cluster usage
General cluster level data usage.
Labels:
cluster
: cluster name
Metrics:
ceph_cluster_capacity_bytes
: Total capacity of the cluster
ceph_cluster_used_bytes
: Capacity of the cluster currently in use
ceph_cluster_available_bytes
: Available space within the cluster
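As a rough illustration of how these three gauges relate, the sketch below parses a scrape payload and derives the fraction of cluster capacity in use. The sample payload, the cluster label value, and the helper names are hypothetical; the parser assumes plain `name{labels} value` lines with no trailing timestamps.

```python
# Hypothetical scrape output; the values are made up for illustration.
SAMPLE = """\
# HELP ceph_cluster_capacity_bytes Total capacity of the cluster
ceph_cluster_capacity_bytes{cluster="ceph"} 1000
ceph_cluster_used_bytes{cluster="ceph"} 250
ceph_cluster_available_bytes{cluster="ceph"} 750
"""


def parse_samples(text):
    """Parse text-format metric lines into {metric_name: value}, ignoring labels.

    Assumes one sample per metric name and no timestamps after the value.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name_and_labels, _, value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]
        samples[name] = float(value)
    return samples


def cluster_used_fraction(samples):
    """Return used / capacity, or 0.0 when capacity is reported as zero."""
    capacity = samples["ceph_cluster_capacity_bytes"]
    if capacity == 0:
        return 0.0
    return samples["ceph_cluster_used_bytes"] / capacity


if __name__ == "__main__":
    print(cluster_used_fraction(parse_samples(SAMPLE)))
```

In a real deployment this ratio would more likely be computed in the monitoring system's query language; the Python version is only meant to make the arithmetic explicit.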
Pool usage
Per-pool usage data
Labels:
cluster
: cluster name
pool
: pool name
Metrics:
ceph_pool_used_bytes
: Capacity of the pool that is currently under use
ceph_pool_raw_used_bytes
: Raw capacity of the pool that is currently under use; this factors in the size
ceph_pool_available_bytes
: Free space for the pool
ceph_pool_percent_used
: Percentage of the capacity available to this pool that is used by this pool
ceph_pool_objects_total
: Total no. of objects allocated within the pool
ceph_pool_dirty_objects_total
: Total no. of dirty objects in a cache-tier pool
ceph_pool_unfound_objects_total
: Total no. of unfound objects for the pool
ceph_pool_read_total
: Total read I/O calls for the pool
ceph_pool_read_bytes_total
: Total read throughput for the pool
ceph_pool_write_total
: Total write I/O calls for the pool
ceph_pool_write_bytes_total
: Total write throughput for the pool
Pool info
General pool information
Labels:
cluster
: cluster name
pool
: pool name
root
: CRUSH root of the pool
profile
: replicated or EC profile being used
Metrics:
ceph_pool_pg_num
: The total count of PGs allotted to a pool
ceph_pool_pgp_num
: The total count of PGs allotted to a pool and used for placements
ceph_pool_min_size
: Minimum number of copies or chunks of an object that need to be present for active I/O
ceph_pool_size
: Total copies or chunks of an object that need to be present for a healthy cluster
ceph_pool_quota_max_bytes
: Maximum number of bytes of data allowed in a pool
ceph_pool_quota_max_objects
: Maximum number of RADOS objects allowed in a pool
ceph_pool_stripe_width
: Stripe width of a RADOS object in a pool
ceph_pool_expansion_factor
: Data expansion multiplier for a pool
Cluster health
Cluster health metrics
Labels:
cluster
: cluster name
Metrics:
ceph_health_status
: Health status of Cluster, can vary only between 3 states (err:2, warn:1, ok:0)
ceph_health_status_interp
: Health status of Cluster, can vary only between 4 states (err:3, critical_warn:2, soft_warn:1, ok:0)
ceph_mons_down
: Count of Mons that are in DOWN state
ceph_total_pgs
: Total no. of PGs in the cluster
ceph_pg_state
: State of PGs in the cluster
ceph_active_pgs
: No. of active PGs in the cluster
ceph_scrubbing_pgs
: No. of scrubbing PGs in the cluster
ceph_deep_scrubbing_pgs
: No. of deep scrubbing PGs in the cluster
ceph_recovering_pgs
: No. of recovering PGs in the cluster
ceph_recovery_wait_pgs
: No. of PGs in the cluster with recovery_wait state
ceph_backfilling_pgs
: No. of backfilling PGs in the cluster
ceph_backfill_wait_pgs
: No. of PGs in the cluster with backfill_wait state
ceph_forced_recovery_pgs
: No. of PGs in the cluster with forced_recovery state
ceph_forced_backfill_pgs
: No. of PGs in the cluster with forced_backfill state
ceph_down_pgs
: No. of PGs in the cluster in down state
ceph_incomplete_pgs
: No. of PGs in the cluster in incomplete state
ceph_inconsistent_pgs
: No. of PGs in the cluster in inconsistent state
ceph_snaptrim_pgs
: No. of snaptrim PGs in the cluster
ceph_snaptrim_wait_pgs
: No. of PGs in the cluster with snaptrim_wait state
ceph_repairing_pgs
: No. of PGs in the cluster with repair state
ceph_slow_requests
: No. of slow requests/slow ops
ceph_degraded_pgs
: No. of PGs in a degraded state
ceph_stuck_degraded_pgs
: No. of PGs stuck in a degraded state
ceph_unclean_pgs
: No. of PGs in an unclean state
ceph_stuck_unclean_pgs
: No. of PGs stuck in an unclean state
ceph_undersized_pgs
: No. of undersized PGs in the cluster
ceph_stuck_undersized_pgs
: No. of stuck undersized PGs in the cluster
ceph_stale_pgs
: No. of stale PGs in the cluster
ceph_stuck_stale_pgs
: No. of stuck stale PGs in the cluster
ceph_peering_pgs
: No. of peering PGs in the cluster
ceph_degraded_objects
: No. of degraded objects across all PGs, includes replicas
ceph_misplaced_objects
: No. of misplaced objects across all PGs, includes replicas
ceph_misplaced_ratio
: Ratio of misplaced objects to total objects
ceph_new_crash_reports
: Number of new crash reports available
ceph_osds_too_many_repair
: Number of OSDs with too many repaired reads
ceph_cluster_objects
: No. of RADOS objects within the cluster
ceph_osd_map_flags
: A metric for all OSDMap flags
ceph_osds_down
: Count of OSDs that are in DOWN state
ceph_osds_up
: Count of OSDs that are in UP state
ceph_osds_in
: Count of OSDs that are in IN state and available to serve requests
ceph_osds
: Count of total OSDs in the cluster
ceph_pgs_remapped
: No. of PGs that are remapped and incurring cluster-wide movement
ceph_recovery_io_bytes
: Rate of bytes being recovered in cluster per second
ceph_recovery_io_keys
: Rate of keys being recovered in cluster per second
ceph_recovery_io_objects
: Rate of objects being recovered in cluster per second
ceph_client_io_read_bytes
: Rate of bytes being read by all clients per second
ceph_client_io_write_bytes
: Rate of bytes being written by all clients per second
ceph_client_io_ops
: Total client ops on the cluster measured per second
ceph_client_io_read_ops
: Total client read I/O ops on the cluster measured per second
ceph_client_io_write_ops
: Total client write I/O ops on the cluster measured per second
ceph_cache_flush_io_bytes
: Rate of bytes being flushed from the cache pool per second
ceph_cache_evict_io_bytes
: Rate of bytes being evicted from the cache pool per second
ceph_cache_promote_io_ops
: Total cache promote operations measured per second
ceph_mgrs_active
: Count of active mgrs, can be either 0 or 1
ceph_mgrs
: Total number of mgrs, including standbys
ceph_rbd_mirror_up
: Alive rbd-mirror daemons
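The two health gauges in this list encode the cluster state as small integers, using the mappings given in their descriptions. A minimal sketch of decoding them back into state names follows; the table and function names are hypothetical and not part of the exporter.

```python
# Value-to-state tables taken from the metric descriptions above.
HEALTH_STATUS = {0: "ok", 1: "warn", 2: "err"}
HEALTH_STATUS_INTERP = {0: "ok", 1: "soft_warn", 2: "critical_warn", 3: "err"}


def describe_health(value, interp=False):
    """Translate a ceph_health_status (or *_interp) sample into a state name.

    Returns "unknown" for values outside the documented range.
    """
    table = HEALTH_STATUS_INTERP if interp else HEALTH_STATUS
    return table.get(int(value), "unknown")


if __name__ == "__main__":
    print(describe_health(2.0))               # ceph_health_status
    print(describe_health(2.0, interp=True))  # ceph_health_status_interp
```

Note that the same numeric value means different things in the two metrics: 2 is err for ceph_health_status but critical_warn for ceph_health_status_interp, so alert thresholds should not be shared between them.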
Ceph monitor
Ceph Monitor metrics
Labels:
cluster
: cluster name
daemon
: daemon name. ceph_versions and ceph_features only
release, features
: ceph feature name and feature flag. ceph_features only
version_tag, sha1, release_name
: ceph version information. ceph_versions only
Metrics:
ceph_monitor_capacity_bytes
: Total storage capacity of the monitor node
ceph_monitor_used_bytes
: Storage of the monitor node that is currently allocated for use
ceph_monitor_avail_bytes
: Total unused storage capacity that the monitor node has left
ceph_monitor_avail_percent
: Percentage of total unused storage capacity that the monitor node has left
ceph_monitor_store_capacity_bytes
: Total capacity of the FileStore backing the monitor daemon
ceph_monitor_store_sst_bytes
: Capacity of the FileStore used only for raw SSTs
ceph_monitor_store_log_bytes
: Capacity of the FileStore used only for logging
ceph_monitor_store_misc_bytes
: Capacity of the FileStore used only for storing miscellaneous information
ceph_monitor_clock_skew_seconds
: Clock skew the monitor node is incurring
ceph_monitor_latency_seconds
: Latency the monitor node is incurring
ceph_monitor_quorum_count
: The total size of the monitor quorum
ceph_versions
: Counts of current versioned daemons, parsed from ceph versions
ceph_features
: Counts of current client features, parsed from ceph features
OSD collector
OSD level metrics
Labels:
cluster
: cluster name
osd
: OSD id
device_class
: CRUSH device class
host
: CRUSH host the OSD is in
rack
: CRUSH rack the OSD is in
root
: CRUSH root the OSD is in
pgid
: PG id for recovery related metrics
Metrics:
ceph_osd_crush_weight
: OSD Crush Weight
ceph_osd_depth
: OSD Depth
ceph_osd_reweight
: OSD Reweight
ceph_osd_bytes
: OSD Total Bytes
ceph_osd_used_bytes
: OSD Used Storage in Bytes
ceph_osd_avail_bytes
: OSD Available Storage in Bytes
ceph_osd_utilization
: OSD Utilization
ceph_osd_variance
: OSD Variance
ceph_osd_pgs
: OSD Placement Group Count
ceph_osd_pg_upmap_items_total
: OSD PG-Upmap Exception Table Entry Count
ceph_osd_total_bytes
: OSD Total Storage Bytes
ceph_osd_total_used_bytes
: OSD Total Used Storage Bytes
ceph_osd_total_avail_bytes
: OSD Total Available Storage Bytes
ceph_osd_average_utilization
: OSD Average Utilization
ceph_osd_perf_commit_latency_seconds
: OSD Perf Commit Latency
ceph_osd_perf_apply_latency_seconds
: OSD Perf Apply Latency
ceph_osd_in
: OSD In Status
ceph_osd_up
: OSD Up Status
ceph_osd_full_ratio
: OSD Full Ratio Value
ceph_osd_near_full_ratio
: OSD Near Full Ratio Value
ceph_osd_backfill_full_ratio
: OSD Backfill Full Ratio Value
ceph_osd_full
: OSD Full Status
ceph_osd_near_full
: OSD Near Full Status
ceph_osd_backfill_full
: OSD Backfill Full Status
ceph_osd_down
: Number of OSDs down in the cluster
ceph_osd_scrub_state
: State of OSDs involved in a scrub
ceph_pg_objects_recovered
: Number of objects recovered in a PG
ceph_osd_objects_backfilled
: Average number of objects backfilled in an OSD
ceph_pg_oldest_inactive
: The amount of time in seconds that the oldest PG has been inactive for
Crash collector
Ceph crash daemon related metrics
Labels:
cluster
: cluster name
Metrics:
ceph_crash_reports
: Count of crash reports per daemon, according to ceph crash ls
RBD Mirror collector
Ceph RBD mirror health collector
Labels:
cluster
: cluster name
Metrics:
ceph_rbd_mirror_pool_status
: Health status of rbd-mirror, can vary only between 3 states (err:2, warn:1, ok:0)
ceph_rbd_mirror_pool_daemon_status
: Health status of rbd-mirror daemons, can vary only between 3 states (err:2, warn:1, ok:0)
ceph_rbd_mirror_pool_image_status
: Health status of rbd-mirror images, can vary only between 3 states (err:2, warn:1, ok:0)
RGW collector
RGW related metrics. Only enabled if RGW_MODE={1,2} is set.
Labels:
cluster
: cluster name
Metrics:
ceph_rgw_gc_active_tasks
: RGW GC active task count
ceph_rgw_gc_active_objects
: RGW GC active object count
ceph_rgw_gc_pending_tasks
: RGW GC pending task count
ceph_rgw_gc_pending_objects
: RGW GC pending object count