Support collecting local IPMI metrics

This enables the standard `/metrics` endpoint. A scrape will trigger the
collection of IPMI metrics from the local machine (that the exporter is
running on).
Conrad Hoffmann 2018-08-03 16:23:35 +02:00
parent 49612613b7
commit 2c927eb68e
3 changed files with 124 additions and 56 deletions


@ -1,12 +1,17 @@
Prometheus IPMI Exporter Prometheus IPMI Exporter
======================== ========================
This is an IPMI over LAN exporter for [Prometheus](https://prometheus.io). This is an IPMI exporter for [Prometheus](https://prometheus.io).
An instance running on one host can be used to monitor a large number of IPMI It supports both the regular `/metrics` endpoint, exposing metrics from the
interfaces by passing the `target` parameter to a scrape. It uses tools from host that the exporter is running on, as well as an `/ipmi` endpoint that
the [FreeIPMI](https://www.gnu.org/software/freeipmi/) suite for the actual supports IPMI over RMCP - one exporter running on one host can be used to
IPMI communication. monitor a large number of IPMI interfaces by passing the `target` parameter to
a scrape.
The exporter relies on tools from the
[FreeIPMI](https://www.gnu.org/software/freeipmi/) suite for the actual IPMI
implementation.
## Installation ## Installation
@ -36,31 +41,56 @@ Make sure you have the following tools from the
## Configuration ## Configuration
The general configuration pattern is similar to that of the [blackbox Simply scraping the standard `/metrics` endpoint will make the exporter emit
exporter](https://github.com/prometheus/blackbox_exporter), i.e. Prometheus local IPMI metrics. No special configuration is required.
scrapes a small number (possibly one) of IPMI exporters with a `target` URL
parameter to tell the exporter which IPMI device it should use to retrieve the For remote metrics, the general configuration pattern is similar to that of the
IPMI metrics. We have taken this approach as IPMI devices often provide useful [blackbox exporter](https://github.com/prometheus/blackbox_exporter), i.e.
information even while the supervised host is turned off. If you are running Prometheus scrapes a small number (possibly one) of IPMI exporters with a
the exporter on a separate host anyway, it makes more sense to have only a few `target` URL parameter to tell the exporter which IPMI device it should use to
of them, each probing many (possibly thousands of) IPMI devices, rather than retrieve the IPMI metrics. We offer this approach as IPMI devices often provide
one exporter per IPMI device. useful information even while the supervised host is turned off. If you are
running the exporter on a separate host anyway, it makes more sense to have
only a few of them, each probing many (possibly thousands of) IPMI devices,
rather than one exporter per IPMI device.
### IPMI exporter ### IPMI exporter
The exporter requires a configuration file called `ipmi.yml` (can be The exporter requires a configuration file called `ipmi.yml` (can be
overridden, see above). It must contain user names and passwords for IPMI overridden, see above). To collect local metrics, an empty file is technically
access to all targets. It supports a “default” target, which is used as sufficient. For remote metrics, it must contain user names and passwords for
IPMI access to all targets. It supports a “default” target, which is used as
fallback if the target is not explicitly listed in the file. fallback if the target is not explicitly listed in the file.
The configuration file also supports a blacklist of sensors, useful in case of The configuration file also supports a blacklist of sensors, useful in case of
OEM-specific sensors that FreeIPMI cannot deal with properly or otherwise OEM-specific sensors that FreeIPMI cannot deal with properly or otherwise
misbehaving sensors. misbehaving sensors. This applies to both local and remote metrics.
See the included `ipmi.yml` file for an example. See the included `ipmi.yml` file for an example.
### Prometheus ### Prometheus
#### Local metrics
Collecting local IPMI metrics is fairly straightforward. Simply configure your
server to scrape the default metrics endpoint on the hosts running the
exporter.
```
- job_name: ipmi
scrape_interval: 1m
scrape_timeout: 30s
metrics_path: /metrics
scheme: http
static_configs:
- targets:
- 10.1.2.23:9290
- 10.1.2.24:9290
- 10.1.2.25:9290
```
#### Remote metrics
To add your IPMI targets to Prometheus, you can use any of the supported To add your IPMI targets to Prometheus, you can use any of the supported
service discovery mechanism of your choice. The following example uses the service discovery mechanism of your choice. The following example uses the
file-based SD and should be easy to adjust to other scenarios. file-based SD and should be easy to adjust to other scenarios.
@ -113,7 +143,7 @@ add the following to your Prometheus config:
- separator: ; - separator: ;
regex: .* regex: .*
target_label: __address__ target_label: __address__
replacement: ipmi-exporter.internal.example.com:9198 replacement: ipmi-exporter.internal.example.com:9290
action: replace action: replace
``` ```
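
As a rough illustration of the two endpoints described in the README changes above, the following Go sketch builds the URL a scraper would request. The helper `scrapeURL` is hypothetical and not part of the exporter, and the target address `10.1.2.30` is a made-up example value. It assumes an empty target means local metrics from `/metrics`, while any other value is sent to `/ipmi` as the `target` query parameter.

```go
package main

import (
	"fmt"
	"net/url"
)

// scrapeURL is a hypothetical helper (not part of the exporter). It returns
// the URL to scrape on the exporter at exporterAddr: the plain /metrics
// endpoint for local metrics (empty target), or /ipmi?target=... for a
// remote IPMI device.
func scrapeURL(exporterAddr, target string) string {
	u := url.URL{Scheme: "http", Host: exporterAddr, Path: "/metrics"}
	if target != "" {
		u.Path = "/ipmi"
		q := url.Values{}
		q.Set("target", target)
		u.RawQuery = q.Encode()
	}
	return u.String()
}

func main() {
	// Local metrics of the host the exporter runs on.
	fmt.Println(scrapeURL("10.1.2.23:9290", ""))
	// Remote metrics fetched by a central exporter (example target address).
	fmt.Println(scrapeURL("ipmi-exporter.internal.example.com:9290", "10.1.2.30"))
}
```

In a real deployment Prometheus builds these URLs itself from `metrics_path` and the relabeling rules; the sketch only makes the difference between the two request shapes explicit.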


@ -23,6 +23,8 @@ import (
const namespace = "ipmi" const namespace = "ipmi"
const targetLocal = ""
var ( var (
ipmiDCMICurrentPowerRegex = regexp.MustCompile(`^Current Power\s*:\s*(?P<value>[0-9.]*)\s*Watts.*`) ipmiDCMICurrentPowerRegex = regexp.MustCompile(`^Current Power\s*:\s*(?P<value>[0-9.]*)\s*Watts.*`)
bmcInfoFirmwareRevisionRegex = regexp.MustCompile(`^Firmware Revision\s*:\s*(?P<value>[0-9.]*).*`) bmcInfoFirmwareRevisionRegex = regexp.MustCompile(`^Firmware Revision\s*:\s*(?P<value>[0-9.]*).*`)
@ -44,6 +46,12 @@ type sensorData struct {
Event string Event string
} }
type rmcpConfig struct {
host string
user string
pass string
}
var ( var (
sensorStateDesc = prometheus.NewDesc( sensorStateDesc = prometheus.NewDesc(
prometheus.BuildFQName(namespace, "sensor", "state"), prometheus.BuildFQName(namespace, "sensor", "state"),
@ -194,36 +202,42 @@ func freeipmiConfigPipe(driver, user, password string) (string, error) {
return pipe, nil return pipe, nil
} }
func freeipmiOutput(cmd, host, user, password string, arg ...string) ([]byte, error) { func freeipmiOutput(cmd string, rmcp *rmcpConfig, arg ...string) ([]byte, error) {
pipe, err := freeipmiConfigPipe("LAN_2_0", user, password) args := []string{}
if err != nil {
return nil, err if rmcp != nil {
pipe, err := freeipmiConfigPipe("LAN_2_0", rmcp.user, rmcp.pass)
if err != nil {
return nil, err
}
defer os.Remove(pipe)
rmcpArgs := []string{
"--config-file", pipe,
"-h", rmcp.host,
}
args = append(args, rmcpArgs...)
} }
defer os.Remove(pipe)
fqcmd := path.Join(*executablesPath, cmd) fqcmd := path.Join(*executablesPath, cmd)
args := []string{
"--config-file", pipe,
"-h", host,
}
args = append(args, arg...) args = append(args, arg...)
out, err := exec.Command(fqcmd, args...).CombinedOutput() out, err := exec.Command(fqcmd, args...).CombinedOutput()
if err != nil { if err != nil {
log.Errorf("Error while calling %s for %s: %s", cmd, host, out) log.Errorf("Error while calling %s: %s", cmd, out)
} }
return out, err return out, err
} }
func ipmiMonitoringOutput(host, user, password string) ([]byte, error) { func ipmiMonitoringOutput(rmcp *rmcpConfig) ([]byte, error) {
return freeipmiOutput("ipmimonitoring", host, user, password, "-Q", "--comma-separated-output", "--no-header-output", "--sdr-cache-recreate") return freeipmiOutput("ipmimonitoring", rmcp, "-Q", "--comma-separated-output", "--no-header-output", "--sdr-cache-recreate")
} }
func ipmiDCMIOutput(host, user, password string) ([]byte, error) { func ipmiDCMIOutput(rmcp *rmcpConfig) ([]byte, error) {
return freeipmiOutput("ipmi-dcmi", host, user, password, "--get-system-power-statistics") return freeipmiOutput("ipmi-dcmi", rmcp, "--get-system-power-statistics")
} }
func bmcInfoOutput(host, user, password string) ([]byte, error) { func bmcInfoOutput(rmcp *rmcpConfig) ([]byte, error) {
return freeipmiOutput("bmc-info", host, user, password, "--get-device-id") return freeipmiOutput("bmc-info", rmcp, "--get-device-id")
} }
func splitMonitoringOutput(impiOutput []byte, excludeSensorIds []int64) ([]sensorData, error) { func splitMonitoringOutput(impiOutput []byte, excludeSensorIds []int64) ([]sensorData, error) {
@ -348,8 +362,8 @@ func collectGenericSensor(ch chan<- prometheus.Metric, state float64, data senso
) )
} }
func (c collector) collectMonitoring(ch chan<- prometheus.Metric, creds Credentials) (int, error) { func (c collector) collectMonitoring(ch chan<- prometheus.Metric, rmcp *rmcpConfig) (int, error) {
output, err := ipmiMonitoringOutput(c.target, creds.User, creds.Password) output, err := ipmiMonitoringOutput(rmcp)
if err != nil { if err != nil {
log.Errorf("Failed to collect ipmimonitoring data: %s", err) log.Errorf("Failed to collect ipmimonitoring data: %s", err)
return 0, err return 0, err
@ -397,8 +411,8 @@ func (c collector) collectMonitoring(ch chan<- prometheus.Metric, creds Credenti
return 1, nil return 1, nil
} }
func (c collector) collectDCMI(ch chan<- prometheus.Metric, creds Credentials) (int, error) { func (c collector) collectDCMI(ch chan<- prometheus.Metric, rmcp *rmcpConfig) (int, error) {
output, err := ipmiDCMIOutput(c.target, creds.User, creds.Password) output, err := ipmiDCMIOutput(rmcp)
if err != nil { if err != nil {
log.Debugf("Failed to collect ipmi-dcmi data: %s", err) log.Debugf("Failed to collect ipmi-dcmi data: %s", err)
return 0, err return 0, err
@ -416,8 +430,8 @@ func (c collector) collectDCMI(ch chan<- prometheus.Metric, creds Credentials) (
return 1, nil return 1, nil
} }
func (c collector) collectBmcInfo(ch chan<- prometheus.Metric, creds Credentials) (int, error) { func (c collector) collectBmcInfo(ch chan<- prometheus.Metric, rmcp *rmcpConfig) (int, error) {
output, err := bmcInfoOutput(c.target, creds.User, creds.Password) output, err := bmcInfoOutput(rmcp)
if err != nil { if err != nil {
log.Debugf("Failed to collect bmc-info data: %s", err) log.Debugf("Failed to collect bmc-info data: %s", err)
return 0, err return 0, err
@ -467,7 +481,7 @@ func (c collector) Collect(ch chan<- prometheus.Metric) {
start := time.Now() start := time.Now()
defer func() { defer func() {
duration := time.Since(start).Seconds() duration := time.Since(start).Seconds()
log.Debugf("Scrape of target %s took %f seconds.", c.target, duration) log.Debugf("Scrape of target %s took %f seconds.", targetName(c.target), duration)
ch <- prometheus.MustNewConstMetric( ch <- prometheus.MustNewConstMetric(
durationDesc, durationDesc,
prometheus.GaugeValue, prometheus.GaugeValue,
@ -475,16 +489,25 @@ func (c collector) Collect(ch chan<- prometheus.Metric) {
) )
}() }()
creds, err := c.config.CredentialsForTarget(c.target) rmcp := (*rmcpConfig)(nil)
if err != nil {
log.Errorf("No credentials available for target %s.", c.target) if !targetIsLocal(c.target) {
c.markCollectorsUp(ch, 0, 0, 0) creds, err := c.config.CredentialsForTarget(c.target)
return if err != nil {
log.Errorf("No credentials available for target %s.", c.target)
c.markCollectorsUp(ch, 0, 0, 0)
return
}
rmcp = &rmcpConfig{
host: c.target,
user: creds.User,
pass: creds.Password,
}
} }
ipmiUp, _ := c.collectMonitoring(ch, creds) ipmiUp, _ := c.collectMonitoring(ch, rmcp)
dcmiUp, _ := c.collectDCMI(ch, creds) dcmiUp, _ := c.collectDCMI(ch, rmcp)
bmcUp, _ := c.collectBmcInfo(ch, creds) bmcUp, _ := c.collectBmcInfo(ch, rmcp)
c.markCollectorsUp(ch, bmcUp, dcmiUp, ipmiUp) c.markCollectorsUp(ch, bmcUp, dcmiUp, ipmiUp)
} }
@ -497,3 +520,14 @@ func contains(s []int64, elm int64) bool {
} }
return false return false
} }
func targetName(target string) string {
if targetIsLocal(target) {
return "[local]"
}
return target
}
func targetIsLocal(target string) bool {
return target == targetLocal
}

main.go

@ -34,7 +34,7 @@ var (
reloadCh chan chan error reloadCh chan chan error
) )
func handler(w http.ResponseWriter, r *http.Request) { func remoteIpmiHandler(w http.ResponseWriter, r *http.Request) {
target := r.URL.Query().Get("target") target := r.URL.Query().Get("target")
if target == "" { if target == "" {
http.Error(w, "'target' parameter must be specified", 400) http.Error(w, "'target' parameter must be specified", 400)
@ -43,8 +43,8 @@ func handler(w http.ResponseWriter, r *http.Request) {
log.Debugf("Scraping target '%s'", target) log.Debugf("Scraping target '%s'", target)
registry := prometheus.NewRegistry() registry := prometheus.NewRegistry()
collector := collector{target: target, config: sc} remoteCollector := collector{target: target, config: sc}
registry.MustRegister(collector) registry.MustRegister(remoteCollector)
h := promhttp.HandlerFor(registry, promhttp.HandlerOpts{}) h := promhttp.HandlerFor(registry, promhttp.HandlerOpts{})
h.ServeHTTP(w, r) h.ServeHTTP(w, r)
} }
@ -94,8 +94,11 @@ func main() {
} }
}() }()
http.Handle("/metrics", promhttp.Handler()) // Normal metrics endpoint for IPMI exporter itself. localCollector := collector{target: targetLocal, config: sc}
http.HandleFunc("/ipmi", handler) // Endpoint to do IPMI scrapes. prometheus.MustRegister(&localCollector)
http.Handle("/metrics", promhttp.Handler()) // Regular metrics endpoint for local IPMI metrics.
http.HandleFunc("/ipmi", remoteIpmiHandler) // Endpoint to do IPMI scrapes.
http.HandleFunc("/-/reload", updateConfiguration) // Endpoint to reload configuration. http.HandleFunc("/-/reload", updateConfiguration) // Endpoint to reload configuration.
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) { http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
@ -120,8 +123,9 @@ func main() {
<form action="/ipmi"> <form action="/ipmi">
<label>Target:</label> <input type="text" name="target" placeholder="X.X.X.X" value="1.2.3.4"><br> <label>Target:</label> <input type="text" name="target" placeholder="X.X.X.X" value="1.2.3.4"><br>
<input type="submit" value="Submit"> <input type="submit" value="Submit">
</form> </form>
<p><a href="/config">Config</a></p> <p><a href="/metrics">Local metrics</a></p>
<p><a href="/config">Config</a></p>
</body> </body>
</html>`)) </html>`))
}) })
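
The routing that results from this change can be approximated with the standard library alone. The sketch below is an assumption-laden stand-in, not the exporter's code: `newMux` and `statusFor` are hypothetical names, and the handlers write placeholder text where the real exporter serves Prometheus metrics. It shows that `/metrics` needs no parameters, while `/ipmi` rejects requests without a `target`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMux is a hypothetical stand-in for the route setup in main().
// The real exporter serves Prometheus metrics; these handlers only
// write placeholder lines.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "# local metrics placeholder")
	})
	mux.HandleFunc("/ipmi", func(w http.ResponseWriter, r *http.Request) {
		target := r.URL.Query().Get("target")
		if target == "" {
			http.Error(w, "'target' parameter must be specified", http.StatusBadRequest)
			return
		}
		fmt.Fprintf(w, "# remote metrics placeholder for %s\n", target)
	})
	return mux
}

// statusFor issues an in-memory GET request and returns the status code.
func statusFor(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	mux := newMux()
	for _, p := range []string{"/metrics", "/ipmi", "/ipmi?target=10.1.2.30"} {
		fmt.Println(p, statusFor(mux, p))
	}
}
```

In the actual commit the local collector is registered once on the default registry, whereas the remote handler creates a fresh registry per request so that each scrape only reports the requested target.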