Support collecting local IPMI metrics
This enables the standard `/metrics` endpoint. Scraping it triggers the collection of IPMI metrics from the local machine, i.e. the host the exporter is running on.
parent 49612613b7
commit 2c927eb68e
66
README.md
|
@ -1,12 +1,17 @@
|
||||||
Prometheus IPMI Exporter
|
Prometheus IPMI Exporter
|
||||||
========================
|
========================
|
||||||
|
|
||||||
This is an IPMI over LAN exporter for [Prometheus](https://prometheus.io).
|
This is an IPMI exporter for [Prometheus](https://prometheus.io).
|
||||||
|
|
||||||
An instance running on one host can be used to monitor a large number of IPMI
|
It supports both the regular `/metrics` endpoint, exposing metrics from the
|
||||||
interfaces by passing the `target` parameter to a scrape. It uses tools from
|
host that the exporter is running on, as well as an `/ipmi` endpoint that
|
||||||
the [FreeIPMI](https://www.gnu.org/software/freeipmi/) suite for the actual
|
supports IPMI over RMCP - one exporter running on one host can be used to
|
||||||
IPMI communication.
|
monitor a large number of IPMI interfaces by passing the `target` parameter to
|
||||||
|
a scrape.
|
||||||
|
|
||||||
|
The exporter relies on tools from the
|
||||||
|
[FreeIPMI](https://www.gnu.org/software/freeipmi/) suite for the actual IPMI
|
||||||
|
implementation.
|
||||||
|
|
||||||
## Installation
|
## Installation
|
||||||
|
|
||||||
|
@ -36,31 +41,56 @@ Make sure you have the following tools from the
|
||||||
|
|
||||||
## Configuration
|
## Configuration
|
||||||
|
|
||||||
The general configuration pattern is similar to that of the [blackbox
|
Simply scraping the standard `/metrics` endpoint will make the exporter emit
|
||||||
exporter](https://github.com/prometheus/blackbox_exporter), i.e. Prometheus
|
local IPMI metrics. No special configuration is required.
|
||||||
scrapes a small number (possibly one) of IPMI exporters with a `target` URL
|
|
||||||
parameter to tell the exporter which IPMI device it should use to retrieve the
|
For remote metrics, the general configuration pattern is similar to that of the
|
||||||
IPMI metrics. We have taken this approach as IPMI devices often provide useful
|
[blackbox exporter](https://github.com/prometheus/blackbox_exporter), i.e.
|
||||||
information even while the supervised host is turned off. If you are running
|
Prometheus scrapes a small number (possibly one) of IPMI exporters with a
|
||||||
the exporter on a separate host anyway, it makes more sense to have only a few
|
`target` URL parameter to tell the exporter which IPMI device it should use to
|
||||||
of them, each probing many (possibly thousands of) IPMI devices, rather than
|
retrieve the IPMI metrics. We offer this approach as IPMI devices often provide
|
||||||
one exporter per IPMI device.
|
useful information even while the supervised host is turned off. If you are
|
||||||
|
running the exporter on a separate host anyway, it makes more sense to have
|
||||||
|
only a few of them, each probing many (possibly thousands of) IPMI devices,
|
||||||
|
rather than one exporter per IPMI device.
|
||||||
|
|
||||||
### IPMI exporter
|
### IPMI exporter
|
||||||
|
|
||||||
The exporter requires a configuration file called `ipmi.yml` (can be
|
The exporter requires a configuration file called `ipmi.yml` (can be
|
||||||
overridden, see above). It must contain user names and passwords for IPMI
|
overridden, see above). To collect local metrics, an empty file is technically
|
||||||
access to all targets. It supports a “default” target, which is used as
|
sufficient. For remote metrics, it must contain user names and passwords for
|
||||||
|
IPMI access to all targets. It supports a “default” target, which is used as
|
||||||
fallback if the target is not explicitly listed in the file.
|
fallback if the target is not explicitly listed in the file.
|
||||||
|
|
||||||
The configuration file also supports a blacklist of sensors, useful in case of
|
The configuration file also supports a blacklist of sensors, useful in case of
|
||||||
OEM-specific sensors that FreeIPMI cannot deal with properly or otherwise
|
OEM-specific sensors that FreeIPMI cannot deal with properly or otherwise
|
||||||
misbehaving sensors.
|
misbehaving sensors. This applies to both local and remote metrics.
|
||||||
|
|
||||||
See the included `ipmi.yml` file for an example.
|
See the included `ipmi.yml` file for an example.
|
||||||
|
|
||||||
### Prometheus
|
### Prometheus
|
||||||
|
|
||||||
|
#### Local metrics
|
||||||
|
|
||||||
|
Collecting local IPMI metrics is fairly straightforward. Simply configure your
|
||||||
|
server to scrape the default metrics endpoint on the hosts running the
|
||||||
|
exporter.
|
||||||
|
|
||||||
|
```
|
||||||
|
- job_name: ipmi
|
||||||
|
scrape_interval: 1m
|
||||||
|
scrape_timeout: 30s
|
||||||
|
metrics_path: /metrics
|
||||||
|
scheme: http
|
||||||
|
static_configs:
|
||||||
|
- targets:
|
||||||
|
- 10.1.2.23:9290
|
||||||
|
- 10.1.2.24:9290
|
||||||
|
- 10.1.2.25:9290
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Remote metrics
|
||||||
|
|
||||||
To add your IPMI targets to Prometheus, you can use any of the supported
|
To add your IPMI targets to Prometheus, you can use any of the supported
|
||||||
service discovery mechanism of your choice. The following example uses the
|
service discovery mechanism of your choice. The following example uses the
|
||||||
file-based SD and should be easy to adjust to other scenarios.
|
file-based SD and should be easy to adjust to other scenarios.
|
||||||
|
@ -113,7 +143,7 @@ add the following to your Prometheus config:
|
||||||
- separator: ;
|
- separator: ;
|
||||||
regex: .*
|
regex: .*
|
||||||
target_label: __address__
|
target_label: __address__
|
||||||
replacement: ipmi-exporter.internal.example.com:9198
|
replacement: ipmi-exporter.internal.example.com:9290
|
||||||
action: replace
|
action: replace
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
96
collector.go
|
@ -23,6 +23,8 @@ import (
|
||||||
|
|
||||||
const namespace = "ipmi"
|
const namespace = "ipmi"
|
||||||
|
|
||||||
|
const targetLocal = ""
|
||||||
|
|
||||||
var (
|
var (
|
||||||
ipmiDCMICurrentPowerRegex = regexp.MustCompile(`^Current Power\s*:\s*(?P<value>[0-9.]*)\s*Watts.*`)
|
ipmiDCMICurrentPowerRegex = regexp.MustCompile(`^Current Power\s*:\s*(?P<value>[0-9.]*)\s*Watts.*`)
|
||||||
bmcInfoFirmwareRevisionRegex = regexp.MustCompile(`^Firmware Revision\s*:\s*(?P<value>[0-9.]*).*`)
|
bmcInfoFirmwareRevisionRegex = regexp.MustCompile(`^Firmware Revision\s*:\s*(?P<value>[0-9.]*).*`)
|
||||||
|
@ -44,6 +46,12 @@ type sensorData struct {
|
||||||
Event string
|
Event string
|
||||||
}
|
}
|
||||||
|
|
||||||
|
type rmcpConfig struct {
|
||||||
|
host string
|
||||||
|
user string
|
||||||
|
pass string
|
||||||
|
}
|
||||||
|
|
||||||
var (
|
var (
|
||||||
sensorStateDesc = prometheus.NewDesc(
|
sensorStateDesc = prometheus.NewDesc(
|
||||||
prometheus.BuildFQName(namespace, "sensor", "state"),
|
prometheus.BuildFQName(namespace, "sensor", "state"),
|
||||||
|
@ -194,36 +202,42 @@ func freeipmiConfigPipe(driver, user, password string) (string, error) {
|
||||||
return pipe, nil
|
return pipe, nil
|
||||||
}
|
}
|
||||||
|
|
||||||
func freeipmiOutput(cmd, host, user, password string, arg ...string) ([]byte, error) {
|
func freeipmiOutput(cmd string, rmcp *rmcpConfig, arg ...string) ([]byte, error) {
|
||||||
pipe, err := freeipmiConfigPipe("LAN_2_0", user, password)
|
args := []string{}
|
||||||
if err != nil {
|
|
||||||
return nil, err
|
if rmcp != nil {
|
||||||
|
pipe, err := freeipmiConfigPipe("LAN_2_0", rmcp.user, rmcp.pass)
|
||||||
|
if err != nil {
|
||||||
|
return nil, err
|
||||||
|
}
|
||||||
|
defer os.Remove(pipe)
|
||||||
|
|
||||||
|
rmcpArgs := []string{
|
||||||
|
"--config-file", pipe,
|
||||||
|
"-h", rmcp.host,
|
||||||
|
}
|
||||||
|
args = append(args, rmcpArgs...)
|
||||||
}
|
}
|
||||||
defer os.Remove(pipe)
|
|
||||||
|
|
||||||
fqcmd := path.Join(*executablesPath, cmd)
|
fqcmd := path.Join(*executablesPath, cmd)
|
||||||
args := []string{
|
|
||||||
"--config-file", pipe,
|
|
||||||
"-h", host,
|
|
||||||
}
|
|
||||||
args = append(args, arg...)
|
args = append(args, arg...)
|
||||||
out, err := exec.Command(fqcmd, args...).CombinedOutput()
|
out, err := exec.Command(fqcmd, args...).CombinedOutput()
|
||||||
if err != nil {
|
if err != nil {
|
||||||
log.Errorf("Error while calling %s for %s: %s", cmd, host, out)
|
log.Errorf("Error while calling %s: %s", cmd, out)
|
||||||
}
|
}
|
||||||
return out, err
|
return out, err
|
||||||
}
|
}
|
||||||
|
|
||||||
func ipmiMonitoringOutput(host, user, password string) ([]byte, error) {
|
func ipmiMonitoringOutput(rmcp *rmcpConfig) ([]byte, error) {
|
||||||
return freeipmiOutput("ipmimonitoring", host, user, password, "-Q", "--comma-separated-output", "--no-header-output", "--sdr-cache-recreate")
|
return freeipmiOutput("ipmimonitoring", rmcp, "-Q", "--comma-separated-output", "--no-header-output", "--sdr-cache-recreate")
|
||||||
}
|
}
|
||||||
|
|
||||||
func ipmiDCMIOutput(host, user, password string) ([]byte, error) {
|
func ipmiDCMIOutput(rmcp *rmcpConfig) ([]byte, error) {
|
||||||
return freeipmiOutput("ipmi-dcmi", host, user, password, "--get-system-power-statistics")
|
return freeipmiOutput("ipmi-dcmi", rmcp, "--get-system-power-statistics")
|
||||||
}
|
}
|
||||||
|
|
||||||
func bmcInfoOutput(host, user, password string) ([]byte, error) {
|
func bmcInfoOutput(rmcp *rmcpConfig) ([]byte, error) {
|
||||||
return freeipmiOutput("bmc-info", host, user, password, "--get-device-id")
|
return freeipmiOutput("bmc-info", rmcp, "--get-device-id")
|
||||||
}
|
}
|
||||||
|
|
||||||
func splitMonitoringOutput(impiOutput []byte, excludeSensorIds []int64) ([]sensorData, error) {
|
func splitMonitoringOutput(impiOutput []byte, excludeSensorIds []int64) ([]sensorData, error) {
|
||||||
|
@ -348,8 +362,8 @@ func collectGenericSensor(ch chan<- prometheus.Metric, state float64, data senso
|
||||||
)
|
)
|
||||||
}
|
}
|
||||||
|
|
||||||
func (c collector) collectMonitoring(ch chan<- prometheus.Metric, creds Credentials) (int, error) {
|
func (c collector) collectMonitoring(ch chan<- prometheus.Metric, rmcp *rmcpConfig) (int, error) {
|
||||||
output, err := ipmiMonitoringOutput(c.target, creds.User, creds.Password)
|
output, err := ipmiMonitoringOutput(rmcp)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
log.Errorf("Failed to collect ipmimonitoring data: %s", err)
|
log.Errorf("Failed to collect ipmimonitoring data: %s", err)
|
||||||
return 0, err
|
return 0, err
|
||||||
|
@ -397,8 +411,8 @@ func (c collector) collectMonitoring(ch chan<- prometheus.Metric, creds Credenti
|
||||||
return 1, nil
|
return 1, nil
|
||||||
}
|
}
|
||||||
|
|
||||||
func (c collector) collectDCMI(ch chan<- prometheus.Metric, creds Credentials) (int, error) {
|
func (c collector) collectDCMI(ch chan<- prometheus.Metric, rmcp *rmcpConfig) (int, error) {
|
||||||
output, err := ipmiDCMIOutput(c.target, creds.User, creds.Password)
|
output, err := ipmiDCMIOutput(rmcp)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
log.Debugf("Failed to collect ipmi-dcmi data: %s", err)
|
log.Debugf("Failed to collect ipmi-dcmi data: %s", err)
|
||||||
return 0, err
|
return 0, err
|
||||||
|
@ -416,8 +430,8 @@ func (c collector) collectDCMI(ch chan<- prometheus.Metric, creds Credentials) (
|
||||||
return 1, nil
|
return 1, nil
|
||||||
}
|
}
|
||||||
|
|
||||||
func (c collector) collectBmcInfo(ch chan<- prometheus.Metric, creds Credentials) (int, error) {
|
func (c collector) collectBmcInfo(ch chan<- prometheus.Metric, rmcp *rmcpConfig) (int, error) {
|
||||||
output, err := bmcInfoOutput(c.target, creds.User, creds.Password)
|
output, err := bmcInfoOutput(rmcp)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
log.Debugf("Failed to collect bmc-info data: %s", err)
|
log.Debugf("Failed to collect bmc-info data: %s", err)
|
||||||
return 0, err
|
return 0, err
|
||||||
|
@ -467,7 +481,7 @@ func (c collector) Collect(ch chan<- prometheus.Metric) {
|
||||||
start := time.Now()
|
start := time.Now()
|
||||||
defer func() {
|
defer func() {
|
||||||
duration := time.Since(start).Seconds()
|
duration := time.Since(start).Seconds()
|
||||||
log.Debugf("Scrape of target %s took %f seconds.", c.target, duration)
|
log.Debugf("Scrape of target %s took %f seconds.", targetName(c.target), duration)
|
||||||
ch <- prometheus.MustNewConstMetric(
|
ch <- prometheus.MustNewConstMetric(
|
||||||
durationDesc,
|
durationDesc,
|
||||||
prometheus.GaugeValue,
|
prometheus.GaugeValue,
|
||||||
|
@ -475,16 +489,25 @@ func (c collector) Collect(ch chan<- prometheus.Metric) {
|
||||||
)
|
)
|
||||||
}()
|
}()
|
||||||
|
|
||||||
creds, err := c.config.CredentialsForTarget(c.target)
|
rmcp := (*rmcpConfig)(nil)
|
||||||
if err != nil {
|
|
||||||
log.Errorf("No credentials available for target %s.", c.target)
|
if !targetIsLocal(c.target) {
|
||||||
c.markCollectorsUp(ch, 0, 0, 0)
|
creds, err := c.config.CredentialsForTarget(c.target)
|
||||||
return
|
if err != nil {
|
||||||
|
log.Errorf("No credentials available for target %s.", c.target)
|
||||||
|
c.markCollectorsUp(ch, 0, 0, 0)
|
||||||
|
return
|
||||||
|
}
|
||||||
|
rmcp = &rmcpConfig{
|
||||||
|
host: c.target,
|
||||||
|
user: creds.User,
|
||||||
|
pass: creds.Password,
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
ipmiUp, _ := c.collectMonitoring(ch, creds)
|
ipmiUp, _ := c.collectMonitoring(ch, rmcp)
|
||||||
dcmiUp, _ := c.collectDCMI(ch, creds)
|
dcmiUp, _ := c.collectDCMI(ch, rmcp)
|
||||||
bmcUp, _ := c.collectBmcInfo(ch, creds)
|
bmcUp, _ := c.collectBmcInfo(ch, rmcp)
|
||||||
|
|
||||||
c.markCollectorsUp(ch, bmcUp, dcmiUp, ipmiUp)
|
c.markCollectorsUp(ch, bmcUp, dcmiUp, ipmiUp)
|
||||||
}
|
}
|
||||||
|
@ -497,3 +520,14 @@ func contains(s []int64, elm int64) bool {
|
||||||
}
|
}
|
||||||
return false
|
return false
|
||||||
}
|
}
|
||||||
|
|
||||||
|
func targetName(target string) string {
|
||||||
|
if targetIsLocal(target) {
|
||||||
|
return "[local]"
|
||||||
|
}
|
||||||
|
return target
|
||||||
|
}
|
||||||
|
|
||||||
|
func targetIsLocal(target string) bool {
|
||||||
|
return target == targetLocal
|
||||||
|
}
|
||||||
|
|
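The `collector.go` change above uses a nil `*rmcpConfig` to mean "local machine": the `--config-file` and `-h` arguments are only added when a remote configuration is present. The following is a minimal sketch of that convention in isolation. `buildArgs` is a hypothetical helper that does not exist in the commit (the real `freeipmiOutput` also creates the config pipe and runs the command), and the host, user, password and file path are placeholder values.

```go
package main

import "fmt"

// rmcpConfig mirrors the struct introduced in the diff.
type rmcpConfig struct {
	host string
	user string
	pass string
}

// buildArgs is a hypothetical helper (not part of the commit) that isolates
// the argument handling: connection arguments are added only for remote
// targets, and the tool-specific arguments always come last.
func buildArgs(rmcp *rmcpConfig, configFile string, extra ...string) []string {
	args := []string{}
	if rmcp != nil {
		// Remote scrape: point the tool at the credentials file and the host.
		args = append(args, "--config-file", configFile, "-h", rmcp.host)
	}
	// Local scrape: no connection arguments, the tool talks to the local BMC.
	return append(args, extra...)
}

func main() {
	// Local target: only the tool-specific argument is passed.
	fmt.Println(buildArgs(nil, "", "--get-device-id"))

	// Remote target: placeholder host and config path for illustration.
	remote := &rmcpConfig{host: "10.1.2.23", user: "admin", pass: "secret"}
	fmt.Println(buildArgs(remote, "/tmp/freeipmi.conf", "--get-device-id"))
}
```

Using a nil pointer rather than an empty struct keeps the "no remote connection" case explicit at every call site, which appears to be the intent of passing `rmcp` through the `collect*` functions.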
18
main.go
|
@ -34,7 +34,7 @@ var (
|
||||||
reloadCh chan chan error
|
reloadCh chan chan error
|
||||||
)
|
)
|
||||||
|
|
||||||
func handler(w http.ResponseWriter, r *http.Request) {
|
func remoteIpmiHandler(w http.ResponseWriter, r *http.Request) {
|
||||||
target := r.URL.Query().Get("target")
|
target := r.URL.Query().Get("target")
|
||||||
if target == "" {
|
if target == "" {
|
||||||
http.Error(w, "'target' parameter must be specified", 400)
|
http.Error(w, "'target' parameter must be specified", 400)
|
||||||
|
@ -43,8 +43,8 @@ func handler(w http.ResponseWriter, r *http.Request) {
|
||||||
log.Debugf("Scraping target '%s'", target)
|
log.Debugf("Scraping target '%s'", target)
|
||||||
|
|
||||||
registry := prometheus.NewRegistry()
|
registry := prometheus.NewRegistry()
|
||||||
collector := collector{target: target, config: sc}
|
remoteCollector := collector{target: target, config: sc}
|
||||||
registry.MustRegister(collector)
|
registry.MustRegister(remoteCollector)
|
||||||
h := promhttp.HandlerFor(registry, promhttp.HandlerOpts{})
|
h := promhttp.HandlerFor(registry, promhttp.HandlerOpts{})
|
||||||
h.ServeHTTP(w, r)
|
h.ServeHTTP(w, r)
|
||||||
}
|
}
|
||||||
|
@ -94,8 +94,11 @@ func main() {
|
||||||
}
|
}
|
||||||
}()
|
}()
|
||||||
|
|
||||||
http.Handle("/metrics", promhttp.Handler()) // Normal metrics endpoint for IPMI exporter itself.
|
localCollector := collector{target: targetLocal, config: sc}
|
||||||
http.HandleFunc("/ipmi", handler) // Endpoint to do IPMI scrapes.
|
prometheus.MustRegister(&localCollector)
|
||||||
|
|
||||||
|
http.Handle("/metrics", promhttp.Handler()) // Regular metrics endpoint for local IPMI metrics.
|
||||||
|
http.HandleFunc("/ipmi", remoteIpmiHandler) // Endpoint to do IPMI scrapes.
|
||||||
http.HandleFunc("/-/reload", updateConfiguration) // Endpoint to reload configuration.
|
http.HandleFunc("/-/reload", updateConfiguration) // Endpoint to reload configuration.
|
||||||
|
|
||||||
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
|
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
|
||||||
|
@ -120,8 +123,9 @@ func main() {
|
||||||
<form action="/ipmi">
|
<form action="/ipmi">
|
||||||
<label>Target:</label> <input type="text" name="target" placeholder="X.X.X.X" value="1.2.3.4"><br>
|
<label>Target:</label> <input type="text" name="target" placeholder="X.X.X.X" value="1.2.3.4"><br>
|
||||||
<input type="submit" value="Submit">
|
<input type="submit" value="Submit">
|
||||||
</form>
|
</form>
|
||||||
<p><a href="/config">Config</a></p>
|
<p><a href="/metrics">Local metrics</a></p>
|
||||||
|
<p><a href="/config">Config</a></p>
|
||||||
</body>
|
</body>
|
||||||
</html>`))
|
</html>`))
|
||||||
})
|
})
|
||||||
|
|
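The endpoint layout that the `main.go` change sets up can be exercised without any IPMI hardware. The sketch below is an assumption-laden stand-in that uses only the Go standard library: the handler bodies are placeholders rather than the real Prometheus handlers, and `newMux` and `statusFor` are hypothetical names introduced for illustration. It shows that `/metrics` needs no parameters while `/ipmi` rejects requests without a `target`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMux wires up the two scrape endpoints. The handler bodies are
// placeholders; the real exporter serves Prometheus metrics here.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()

	// Local metrics: no parameters required.
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "# local IPMI metrics would be emitted here")
	})

	// Remote metrics: the device to query is named by the "target" parameter.
	mux.HandleFunc("/ipmi", func(w http.ResponseWriter, r *http.Request) {
		target := r.URL.Query().Get("target")
		if target == "" {
			http.Error(w, "'target' parameter must be specified", http.StatusBadRequest)
			return
		}
		fmt.Fprintf(w, "# IPMI metrics for %s would be emitted here\n", target)
	})

	return mux
}

// statusFor issues an in-memory GET request and returns the status code.
func statusFor(mux *http.ServeMux, url string) int {
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, url, nil))
	return rec.Code
}

func main() {
	mux := newMux()
	fmt.Println(statusFor(mux, "/metrics"))               // local scrape
	fmt.Println(statusFor(mux, "/ipmi"))                  // missing target
	fmt.Println(statusFor(mux, "/ipmi?target=10.1.2.23")) // remote scrape
}
```

In the commit itself, the local collector is registered once with the default Prometheus registry, whereas each `/ipmi` request builds a fresh registry for the requested target.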