
Alertmanager

Prometheus Alertmanager

WARNING: The Alertmanager is still very experimental and early in its development and design. More than any other Prometheus component, it will still undergo frequent breaking changes, including ones that will affect its architecture as a whole. While we do plan on making it mature and stable eventually, use it at your own risk for now.

Alertmanager receives alerts generated by Prometheus and takes care of the following aspects:

  • manual silencing of specific alerts
  • inhibiting alerts based on alert dependencies
  • aggregating alerts by labelset
  • handling notification repeats
  • sending alert notifications via external services (currently email, generic web hook, PagerDuty, HipChat, Slack, Pushover, or Flowdock)

See config/fixtures/sample.conf.input for an example config. The full configuration schema, including documentation for all possible options, can be found in config/config.proto. Alertmanager automatically reloads the configuration when it changes, so configuration updates do not require a restart.

Building and running

make
./alertmanager -config.file=/path/to/alertmanager.conf

Configuring Prometheus to send alerts

To make Prometheus send alerts to your Alertmanager, set the alertmanager.url command-line flag on the server:

./prometheus -alertmanager.url=http://<alertmanager-host>:<port> <...other flags...>

Prometheus pushes only firing alerts to Alertmanager, and Alertmanager expects those pushes to be repeated regularly for as long as an alert keeps firing. An alert that is not refreshed within -alerts.min-refresh-period (5 minutes by default) expires.

Alertmanager only shows alerts which are currently firing and pushed to Alertmanager.

Running tests

make test

Caveats and roadmap

Alertmanager is still in an experimental state. Known caveats that we plan to address in the future:

  • Alertmanager is run as a single instance and does not provide high availability yet. We plan on clustering multiple replicated Alertmanager instances to ensure reliability in the future.
  • Relatedly, silence information is currently persisted only in a local file, so it is lost if you lose the machine your Alertmanager is running on.
  • Alert aggregation needs to become more flexible. Currently alerts are aggregated based on their full labelsets. In the future, we want to allow grouping alerts based on a subset thereof (for example, grouping all alerts with one alert name and from the same job together).
  • For alert dependencies, we want to support time delays: if alert A inhibits alert B due to a dependency and B begins firing before A, wait for a configurable amount of time for A to start firing as well before sending notifications for B. This is not yet supported.
  • Alertmanager has not been tested or optimized for high alert loads yet.
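To make the aggregation caveat above more concrete, the sketch below contrasts grouping by the full labelset with grouping by a subset of labels (alert name and job). It is an illustration under assumptions, not Alertmanager code: the `labelSet` type, `groupKey`, `groupAlerts`, and the sample label values are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// labelSet is a hypothetical stand-in for the labels attached to an alert.
type labelSet map[string]string

// groupKey builds a deterministic key from the named labels.
// If names is nil, every label of the alert is used (full labelset).
func groupKey(ls labelSet, names []string) string {
	if names == nil {
		for n := range ls {
			names = append(names, n)
		}
	}
	// Sort a copy so the key does not depend on map or argument order.
	sorted := append([]string(nil), names...)
	sort.Strings(sorted)
	parts := make([]string, 0, len(sorted))
	for _, n := range sorted {
		parts = append(parts, fmt.Sprintf("%s=%q", n, ls[n]))
	}
	return strings.Join(parts, ",")
}

// groupAlerts counts how many alerts fall into each group.
func groupAlerts(alerts []labelSet, names []string) map[string]int {
	groups := make(map[string]int)
	for _, a := range alerts {
		groups[groupKey(a, names)]++
	}
	return groups
}

func main() {
	alerts := []labelSet{
		{"alertname": "HighLatency", "job": "api", "instance": "a:80"},
		{"alertname": "HighLatency", "job": "api", "instance": "b:80"},
		{"alertname": "HighLatency", "job": "web", "instance": "c:80"},
	}
	// Full labelsets differ by instance, so each alert is its own group.
	fmt.Println(len(groupAlerts(alerts, nil))) // 3
	// Grouping by alert name and job merges the two "api" alerts.
	fmt.Println(len(groupAlerts(alerts, []string{"alertname", "job"}))) // 2
}
```

With full-labelset grouping, every instance produces its own notification; grouping on a subset collapses them into one group per alert name and job.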