Commit Graph

27 Commits

Author SHA1 Message Date
Conrad Hoffmann
67041ef633 The great refactoring
- Move every collector into its own file
- Move FreeIPMI code into own package
- Allow more customization of commands executed by collectors
- Split up documentation, so README is a little less overwhelming

A single commit message does not do justice to the amount of changes
here, but hey... :)
2021-06-01 22:38:23 +02:00
Conrad Hoffmann
0b40276f1f Output events in hex (bitmask), not string
The events are currently not used for anything anyway, and this avoids the case
where `ipmimonitoring` outputs several events quoted in a way that is at
best borderline CSV-compatible.

Fixes #62
2021-02-19 16:19:46 +01:00
Conrad Hoffmann
1a7c9d6189 Move error logging to same level as error handling
The `freeipmiOutput` function is used in many places. It already returns
an error if it encounters one, so it should leave the handling as well
as the severity of the logging of the error to the caller.

More specifically, the case has come up where `bmc-info` may fail, but
still provide partial results that the collector can use. This commit
allows callers to do so without littering the log with errors.
2021-02-19 16:08:09 +01:00
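The split described in 1a7c9d6189 (the helper returns the error, the caller chooses the log severity) might look roughly like the sketch below. It is not the exporter's actual code: `runTool`, `collect`, `errExit`, and the simulated output are hypothetical names and data chosen for illustration.

```go
package main

import (
	"errors"
	"log"
)

// errExit simulates a tool exiting with a non-zero status (hypothetical).
var errExit = errors.New("exit status 1")

// runTool is a hypothetical stand-in for a helper such as `freeipmiOutput`.
// It returns whatever output it captured together with the error and does
// not log anything itself.
func runTool(name string) ([]byte, error) {
	if name == "bmc-info" {
		// Simulated partial output plus a failure.
		return []byte("Firmware Revision : 1.23\n"), errExit
	}
	return []byte("ok\n"), nil
}

// collect is the caller: it decides whether a failure is fatal (logged as an
// error) or tolerable (logged quietly while the partial output is used).
func collect(name string, tolerateFailure bool) (string, bool) {
	out, err := runTool(name)
	if err != nil {
		if !tolerateFailure || len(out) == 0 {
			log.Printf("level=error tool=%s err=%v", name, err)
			return "", false
		}
		log.Printf("level=debug tool=%s err=%v (using partial output)", name, err)
	}
	return string(out), true
}

func main() {
	out, ok := collect("bmc-info", true)
	log.Printf("usable=%v output=%q", ok, out)
}
```

Keeping the logging decision in the caller means the same helper can serve both strict collectors and ones that can live with partial results.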
Conrad Hoffmann
27f27b902f Try to parse bmc-info even if command failed
This is a workaround for an issue described in #57. The bmc-info command
might produce usable output minus the system firmware revision, but then
choke on that. Try to recover in that scenario by attempting to parse
the output even if the command failed. Since the system firmware
revision is already optional, this should at least produce all other
values.

It is not pretty, but it avoids both folks having to change their
configs as well as a second round-trip, which can be quite expensive in
IPMI.
2021-02-16 18:36:17 +01:00
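As a rough illustration of the workaround in 27f27b902f, the sketch below parses `bmc-info`-style output even when the command reported an error, and only gives up if the required field is missing. The field labels, the regular expressions, and the names `parseBMCInfo`, `field`, and `errCommandFailed` are assumptions made for this example; the "N/A" fallback mirrors the behaviour described in 57c0f966d0.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// errCommandFailed simulates a failed bmc-info invocation (hypothetical).
var errCommandFailed = errors.New("exit status 1")

// The field labels below are assumed for illustration.
var (
	firmwareRe       = regexp.MustCompile(`(?m)^Firmware Revision[ \t]*:[ \t]*(.+)$`)
	systemFirmwareRe = regexp.MustCompile(`(?m)^System Firmware Version[ \t]*:[ \t]*(.+)$`)
)

// field returns the first captured value for re, if any.
func field(re *regexp.Regexp, output string) (string, bool) {
	m := re.FindStringSubmatch(output)
	if m == nil {
		return "", false
	}
	return strings.TrimSpace(m[1]), true
}

// parseBMCInfo tries to parse the output regardless of cmdErr. The command
// error is only surfaced if the required firmware revision is missing.
func parseBMCInfo(output string, cmdErr error) (string, string, error) {
	firmware, ok := field(firmwareRe, output)
	if !ok {
		if cmdErr != nil {
			return "", "", cmdErr
		}
		return "", "", errors.New("firmware revision not found in output")
	}
	systemFirmware, ok := field(systemFirmwareRe, output)
	if !ok {
		systemFirmware = "N/A" // optional value, see 57c0f966d0
	}
	return firmware, systemFirmware, nil
}

func main() {
	partial := "Firmware Revision : 1.23\n"
	fw, sysFw, err := parseBMCInfo(partial, errCommandFailed)
	fmt.Println(fw, sysFw, err)
}
```

Parsing first and consulting the command error second avoids the extra IPMI round-trip that a retry would cost.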
Conrad Hoffmann
57c0f966d0 Make system firmware version optional in BMC info
There are systems that do not make this available, so don't make the
entire collector fail if only this metric cannot be read. Instead, set
it to "N/A" if it cannot be determined.

This fixes #57.
2020-10-22 18:09:14 +02:00
Conrad Hoffmann
29354b0eb0 Document magic values in metric description
See https://www.supermicro.com/support/faqs/faq.cfm?faq=28159
2020-07-26 17:28:12 +02:00
Conrad Hoffmann
2c6a73b151 Add system firmware version to BMC info metric
Add the "system firmware version" (i.e. the host's "BIOS version", as
opposed to the BMC firmware version) as a label to the `bmc_info`
metric.

This fixes #51.
2020-07-26 16:39:08 +02:00
Jakub Chábek
534775f13a Add new metric config_lan_mode 2020-06-30 10:40:54 +02:00
Conrad Hoffmann
0aa63d4c21 Add SEL collector
It exposes two metrics about the IPMI system event log (SEL), the
current number of entries stored in it and the free space for new
records. The collector is not enabled by default; it has to be
explicitly enabled in the config.

Related to #41.
2020-04-22 22:58:29 +02:00
Conrad Hoffmann
7d7e33dc93 Log error if pipe deletion fails
Hopefully this helps to shed some light on #42.
2020-03-06 09:56:31 +01:00
zliuva
636235f6da Adding config options for workaround_flags. 2019-12-18 08:51:47 -08:00
Michael Sherman
25d1fd0ef8 Check chassis power state
Use ipmi-chassis to check power on/off
report via ipmi_chassis_power_state
enable via "chassis" module
2019-10-14 15:43:30 -04:00
Badreddin
ac6dbe2e02 add --ignore-unrecognized-events flag
to avoid NaN sensor readings when a sensor event contains unrecognized events
2019-10-01 11:29:41 +02:00
Conrad Hoffmann
11f380924f Escape hashes in password for config file
The hash is the comment character in the config file, even if it occurs
in the middle of the password. This can be worked around however by
escaping it.

This fixes #16.
2019-04-15 11:01:59 +02:00
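A minimal sketch of the escaping described in 11f380924f, assuming a backslash is the escape character and that only `#` needs treatment. The function name `escapePassword` and the `password` config key shown in `main` are illustrative, not taken from the exporter.

```go
package main

import (
	"fmt"
	"strings"
)

// escapePassword prefixes every '#' with a backslash so that the config
// parser does not treat the rest of the line as a comment.
// Assumption: backslash is the escape character and no other characters
// need escaping.
func escapePassword(password string) string {
	return strings.ReplaceAll(password, "#", `\#`)
}

func main() {
	// Hypothetical config line as it might be written to the config file.
	fmt.Printf("password %s\n", escapePassword("s3cr#t"))
}
```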
Conrad Hoffmann
ab356b2890 Remove accidentally merged debug output 2019-04-15 10:57:45 +02:00
Conrad Hoffmann
a4a57fe40b Allow setting the session timeout
Now that we have a good config framework in place, this is low-hanging
fruit. Will apply to all collectors used, so total scrape time for
Prometheus could be (timeout * #-of-collectors) milliseconds for a given
module.

Related to #20.
2019-03-22 12:27:32 +01:00
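The worst-case estimate from a4a57fe40b is plain multiplication. The sketch below spells it out, with a hypothetical helper name and example values that are not taken from the exporter's defaults.

```go
package main

import "fmt"

// worstCaseScrapeMillis returns the upper bound described in the commit
// message: every enabled collector runs into the session timeout.
func worstCaseScrapeMillis(timeoutMillis, collectors int) int {
	return timeoutMillis * collectors
}

func main() {
	// Example values only: a 10 s session timeout and three collectors.
	fmt.Println(worstCaseScrapeMillis(10000, 3), "ms")
}
```

A bound like this can be compared against the Prometheus scrape timeout when picking a session timeout for a module.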
Danny Kulchinsky
6aa7866dc1 rebased and adjusted per #17 2019-03-20 11:08:18 -04:00
Danny Kulchinsky
7108701389 one more log 2019-03-20 11:05:18 -04:00
Danny Kulchinsky
a908a34812 typo :) 2019-03-20 11:05:18 -04:00
Danny Kulchinsky
7e5c643b90 adding rcmp.host to error log messages 2019-03-20 11:05:18 -04:00
Conrad Hoffmann
1a99329314 Refactor mapping of target to IPMI settings
Specifically, allow definition of a set of settings as module in the
configuration file, and the ability to use these settings by setting the
`module` URL parameter to the respective module name when scraping.

THIS COMMIT CHANGES THE CONFIG FORMAT IN A NON-BACKWARDS-COMPATIBLE WAY!

Based on this, the following "side effects" are noteworthy:

 - the exporter no longer requires a config file
 - the IPMI "privilege level" can be set in the config file
 - collectors can be enabled/disabled in the config file
 - anonymous IPMI access is now theoretically possible
 - there are now two example configurations (local & remote)

This fixes #10 by allowing the privilege level to be set.
2019-03-16 16:11:32 +01:00
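The module lookup introduced in 1a99329314 could be pictured as follows. This is a sketch under assumptions: the `Module` fields, the example module names, and the fallback to a module called "default" when the parameter is absent are all hypothetical, and the real config format is not reproduced here.

```go
package main

import (
	"fmt"
	"net/url"
)

// Module is a hypothetical set of IPMI settings defined in the config file.
type Module struct {
	Privilege  string
	Collectors []string
}

// exampleModules stands in for the modules parsed from a config file.
var exampleModules = map[string]Module{
	"default": {Privilege: "user", Collectors: []string{"bmc", "ipmi"}},
	"admin":   {Privilege: "admin", Collectors: []string{"bmc", "ipmi", "dcmi"}},
}

// selectModule picks the settings named by the `module` URL parameter.
// Assumption: a missing parameter falls back to the "default" module.
func selectModule(rawQuery string, modules map[string]Module) (string, Module, error) {
	params, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", Module{}, err
	}
	name := params.Get("module")
	if name == "" {
		name = "default"
	}
	m, ok := modules[name]
	if !ok {
		return "", Module{}, fmt.Errorf("unknown module %q", name)
	}
	return name, m, nil
}

func main() {
	name, m, err := selectModule("target=10.0.0.1&module=admin", exampleModules)
	fmt.Println(name, m.Privilege, m.Collectors, err)
}
```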
Conrad Hoffmann
1e16da97c1 Minor style fixes 2018-09-21 11:41:59 +02:00
Conrad Hoffmann
2c927eb68e Support collecting local IPMI metrics
This enables the standard `/metrics` endpoint. A scrape will trigger the
collection of IPMI metrics from the local machine (that the exporter is
running on).
2018-09-20 16:27:03 +02:00
Conrad Hoffmann
ab14984e9a Fix return value in happy path
This fixes #7.
2018-08-03 16:24:49 +02:00
Conrad Hoffmann
9fb5f7296c Handle tool-specific failures more gracefully
Instead of failing hard and not returning any metrics at all if just one
(or two) of the three calls to IPMI tools fail, return whatever data was
properly received and add a `collector` label to the `ipmi_up` metric
indicating which tools failed.

This is only a small step towards the concept of "collectors" like they
exist e.g. in the node exporter, but it should help solve #1. Additional
functionality, like disabling certain collectors, can be built on top of
this.

Currently, an error in the `ipmi` collector is always logged as an error.
In the `dcmi` and `bmc` collectors, an error retrieving the data is only
logged as debug output, but an error processing retrieved data is logged
as an error. This should cover most use cases and will be improved upon
once more work is done to make the collectors selectable per scrape.
2018-07-31 09:24:54 +02:00
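To make the per-collector `ipmi_up` idea from 9fb5f7296c concrete, the sketch below renders one sample per collector in the Prometheus text format. It deliberately avoids the Prometheus client library to stay self-contained; `renderUp` and the example results are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// renderUp emits one `ipmi_up` sample per collector, with value 1 for a
// collector that returned data and 0 for one that failed. Names are sorted
// so the output is deterministic.
func renderUp(succeeded map[string]bool) []string {
	names := make([]string, 0, len(succeeded))
	for name := range succeeded {
		names = append(names, name)
	}
	sort.Strings(names)

	lines := make([]string, 0, len(names))
	for _, name := range names {
		value := 0
		if succeeded[name] {
			value = 1
		}
		lines = append(lines, fmt.Sprintf("ipmi_up{collector=%q} %d", name, value))
	}
	return lines
}

func main() {
	// Example: the dcmi tool failed, the other two returned data.
	results := map[string]bool{"ipmi": true, "dcmi": false, "bmc": true}
	for _, line := range renderUp(results) {
		fmt.Println(line)
	}
}
```

With a label per collector, a partial failure stays visible in the metrics while the data from the working tools is still exported.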
Conrad Hoffmann
e11e76ed5c Use config file instead of command line arguments
Use a named pipe with 0600 permissions to pass the credentials to
FreeIPMI instead of using the command line, which certainly constitutes
bad security practice.

Template the `driver-type` while at it to potentially support local IPMI
at some point.
2018-07-26 16:14:26 +02:00
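The named-pipe approach from e11e76ed5c can be sketched as below. This is an illustration for Unix-like systems, not the exporter's code: `passViaPipe`, the temporary directory, and the pipe's file name are hypothetical, and the reader here is the program itself rather than a FreeIPMI tool.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// passViaPipe creates a FIFO with 0600 permissions, writes config into it
// from a goroutine, and reads it back the way a consuming tool would.
// It returns the content read, the permission bits, and whether the file
// really is a named pipe.
func passViaPipe(config string) (string, uint32, bool, error) {
	dir, err := os.MkdirTemp("", "ipmi-sketch")
	if err != nil {
		return "", 0, false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "freeipmi.conf") // hypothetical file name
	if err := syscall.Mkfifo(path, 0600); err != nil {
		return "", 0, false, err
	}
	info, err := os.Stat(path)
	if err != nil {
		return "", 0, false, err
	}

	// Opening a FIFO for writing blocks until a reader opens it, so the
	// write happens in its own goroutine.
	writeErr := make(chan error, 1)
	go func() {
		f, err := os.OpenFile(path, os.O_WRONLY, 0)
		if err != nil {
			writeErr <- err
			return
		}
		_, err = f.WriteString(config)
		if cerr := f.Close(); err == nil {
			err = cerr
		}
		writeErr <- err
	}()

	data, err := os.ReadFile(path) // stands in for the consuming tool
	if err != nil {
		return "", 0, false, err
	}
	if err := <-writeErr; err != nil {
		return "", 0, false, err
	}
	isPipe := info.Mode()&os.ModeNamedPipe != 0
	return string(data), uint32(info.Mode().Perm()), isPipe, nil
}

func main() {
	content, perm, isPipe, err := passViaPipe("username admin\n")
	fmt.Printf("%q %o %v %v\n", content, perm, isPipe, err)
}
```

Unlike command line arguments, the pipe's content does not show up in the process list, and the 0600 mode limits access to the owning user.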
Conrad Hoffmann
670b92c799 Initial public release 2018-05-24 16:28:06 +02:00