Extend README by config example
parent
3d216d887b
commit
fc2245bb9f
140
README.md
|
@ -1,15 +1,17 @@
|
||||||
# Alertmanager
|
# Alertmanager
|
||||||
|
|
||||||
This is the development version of the Alertmanager. It is a rewrite and
|
This is the development version of the Alertmanager. It is a rewrite and
|
||||||
is only compatible to the present version 0.0.4 in terms of the API endpoint
|
is incompatible with the present version 0.0.4. The only backport was the API endpoint used by Prometheus to push new alerts.
|
||||||
used by Prometheus to push new alerts.
|
|
||||||
|
|
||||||
## Installation
|
## Installation
|
||||||
|
|
||||||
|
The current version has to be run from the repository folder as UI assets and notification templates are not yet statically compiled into the binary.
|
||||||
|
|
||||||
You can either `go get` it:
|
You can either `go get` it:
|
||||||
|
|
||||||
```
|
```
|
||||||
$ GO15VENDOREXPERIMENT=1 go get github.com/prometheus/alertmanager
|
$ GO15VENDOREXPERIMENT=1 go get github.com/prometheus/alertmanager
|
||||||
|
$ cd $GOPATH/src/github.com/prometheus/alertmanager
|
||||||
$ alertmanager -config.file=<your_file>
|
$ alertmanager -config.file=<your_file>
|
||||||
```
|
```
|
||||||
|
|
||||||
|
@ -28,12 +30,16 @@ $ ./alertmanager -config.file=<your_file>
|
||||||
|
|
||||||
This version was written from scratch. Core features enabled by this are more advanced alert routing configurations and grouping/batching of alerts. Thus, squashing expression results through aggregation in alerting rules is no longer required to avoid noisiness.
|
This version was written from scratch. Core features enabled by this are more advanced alert routing configurations and grouping/batching of alerts. Thus, squashing expression results through aggregation in alerting rules is no longer required to avoid noisiness.
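As a rough illustration of what grouping means here, the sketch below buckets alerts by the values of a chosen set of labels, in the spirit of the `group_by` setting in the example configuration. It is not the Alertmanager implementation; the function name `group_alerts` and the `instance` label are assumptions made for this example.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Bucket alerts (label dicts) by their values for the group_by labels."""
    groups = defaultdict(list)
    for labels in alerts:
        # Alerts that agree on every group_by label share one key.
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(labels)
    return dict(groups)


alerts = [
    {"alertname": "LatencyHigh", "cluster": "A", "instance": "host1"},
    {"alertname": "LatencyHigh", "cluster": "A", "instance": "host2"},
    {"alertname": "LatencyHigh", "cluster": "B", "instance": "host3"},
]

# The two cluster=A alerts share the key ("LatencyHigh", "A").
groups = group_alerts(alerts, ["alertname", "cluster"])
```

Each resulting group would then be notified as one batch rather than as one notification per alert.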
|
||||||
|
|
||||||
|
The concepts of alert routing were outlined in [this document](https://docs.google.com/document/d/1-4jefGkFo71jlaLo4lHz40ZBoCv9ycBBBbjzbXifGyY/edit?usp=sharing).
|
||||||
|
|
||||||
The version implements full persistence of alerts, silences, and notification state. On restart it picks up right where it left off.
|
The version implements full persistence of alerts, silences, and notification state. On restart it picks up right where it left off.
|
||||||
|
|
||||||
### Known issues
|
### Known issues
|
||||||
|
|
||||||
This development version still has an extensive list of improvements and changes. This is an incomplete list of things that are still missing or need to be improved.
|
This development version still has an extensive list of improvements and changes. This is an incomplete list of things that are still missing or need to be improved.
|
||||||
|
|
||||||
|
This will happen based on priority and demand. Feel free to ping fabxc about it.
|
||||||
|
|
||||||
* On deleting silences it may take up to one `group_wait` cycle for a notification of a previously silenced alert to be sent.
|
* On deleting silences it may take up to one `group_wait` cycle for a notification of a previously silenced alert to be sent.
|
||||||
* Limiting inhibition rules to routing subtrees to avoid accidental interference.
|
* Limiting inhibition rules to routing subtrees to avoid accidental interference.
|
||||||
* Show silencing inhibition of alerts in the UI
|
* Show silencing inhibition of alerts in the UI
|
||||||
|
@ -43,6 +49,136 @@ This development version still has an extensive list of improvements and changes
|
||||||
* Definition of a minimum data set provided to notification templates
|
* Definition of a minimum data set provided to notification templates
|
||||||
* Best practices around notification templating
|
* Best practices around notification templating
|
||||||
* Various common command line flags like `path-prefix`
|
* Various common command line flags like `path-prefix`
|
||||||
|
* Compiling templates and UI assets into the binary
|
||||||
|
* Allow constraining displayed alerts in UI
|
||||||
|
|
||||||
|
## Example
|
||||||
|
|
||||||
|
This is an example configuration that should cover most relevant aspects of the new YAML configuration format. The authoritative source for now is the [code](https://github.com/prometheus/alertmanager/tree/dev/config).
|
||||||
|
|
||||||
|
```
|
||||||
|
global:
|
||||||
|
# The smarthost and SMTP sender used for mail notifications.
|
||||||
|
smarthost: 'localhost:25'
|
||||||
|
smtp_sender: 'alertmanager@example.org'
|
||||||
|
|
||||||
|
# The directory from which notification templates are read.
|
||||||
|
templates:
|
||||||
|
- 'template/*.tmpl'
|
||||||
|
|
||||||
|
# The root route on which each incoming alert enters.
|
||||||
|
route:
|
||||||
|
# The labels by which incoming alerts are grouped together. For example,
|
||||||
|
# multiple alerts coming in for cluster=A and alertname=LatencyHigh would
|
||||||
|
# be batched into a single group.
|
||||||
|
group_by: ['alertname', 'cluster']
|
||||||
|
|
||||||
|
# When a new group of alerts is created by an incoming alert, wait at
|
||||||
|
# least 'group_wait' to send the initial notification.
|
||||||
|
# This ensures that multiple alerts for the same group that start
|
||||||
|
# firing shortly after one another are batched together in the first
|
||||||
|
# notification.
|
||||||
|
group_wait: 30s
|
||||||
|
|
||||||
|
# Once the first notification has been sent, wait 'group_interval' before
|
||||||
|
# sending a batch of new alerts that started firing for that group.
|
||||||
|
group_interval: 5m
|
||||||
|
|
||||||
|
# If an alert has been sent successfully, wait 'repeat_interval' before
|
||||||
|
# resending it.
|
||||||
|
repeat_interval: 3h
|
||||||
|
|
||||||
|
# If 'continue' is false, the first sub-route that matches this alert will
|
||||||
|
# terminate the search and the alert will be inserted at that routing node.
|
||||||
|
# If true, the alert is also inserted into any sibling nodes that
|
||||||
|
# match.
|
||||||
|
# This allows first-match semantics (=false) in smaller scopes (e.g. team-level),
|
||||||
|
# while avoiding accidental shadowing of alerts (=true) at larger scopes (e.g. company-level).
|
||||||
|
continue: true
|
||||||
|
|
||||||
|
# All the above attributes are inherited by all child routes and can
|
||||||
|
# be overwritten on each.
|
||||||
|
|
||||||
|
# The child route trees.
|
||||||
|
routes:
|
||||||
|
# This route performs a regular expression match on alert labels to
|
||||||
|
# catch alerts that are related to a list of services.
|
||||||
|
- match_re:
|
||||||
|
service: ^(foo1|foo2|baz)$
|
||||||
|
send_to: team-X-mails
|
||||||
|
|
||||||
|
# The service has a sub-route for critical alerts. Any alerts
|
||||||
|
# that do not match, i.e. severity != critical, fall back to the
|
||||||
|
# parent node and are sent to 'team-X-mails'.
|
||||||
|
routes:
|
||||||
|
- match:
|
||||||
|
severity: critical
|
||||||
|
send_to: team-X-pager
|
||||||
|
|
||||||
|
- match:
|
||||||
|
service: files
|
||||||
|
send_to: team-Y-mails
|
||||||
|
|
||||||
|
routes:
|
||||||
|
- match:
|
||||||
|
severity: critical
|
||||||
|
send_to: team-Y-pager
|
||||||
|
|
||||||
|
# This route handles all alerts coming from a database service. If there's
|
||||||
|
# no team to handle it, it defaults to the DB team.
|
||||||
|
- match:
|
||||||
|
service: database
|
||||||
|
|
||||||
|
send_to: team-DB-pager
|
||||||
|
# Also group alerts by affected database.
|
||||||
|
group_by: [alertname, cluster, database]
|
||||||
|
continue: false
|
||||||
|
|
||||||
|
routes:
|
||||||
|
- match:
|
||||||
|
owner: team-X
|
||||||
|
send_to: team-X-pager
|
||||||
|
|
||||||
|
- match:
|
||||||
|
owner: team-Y
|
||||||
|
send_to: team-Y-pager
|
||||||
|
|
||||||
|
|
||||||
|
# Inhibition rules allow muting a set of alerts given that another alert is
|
||||||
|
# firing.
|
||||||
|
# We use this to mute any warning-level notifications if the same alert is
|
||||||
|
# already critical.
|
||||||
|
inhibit_rules:
|
||||||
|
- source_match:
|
||||||
|
severity: 'critical'
|
||||||
|
target_match:
|
||||||
|
severity: 'warning'
|
||||||
|
# Apply inhibition if the alertname is the same.
|
||||||
|
equal: ['alertname']
|
||||||
|
|
||||||
|
|
||||||
|
notification_configs:
|
||||||
|
- name: 'team-X-mails'
|
||||||
|
email_configs:
|
||||||
|
- email: 'team-X+alerts@example.org'
|
||||||
|
|
||||||
|
- name: 'team-X-pager'
|
||||||
|
email_configs:
|
||||||
|
- email: 'team-X+alerts-critical@example.org'
|
||||||
|
pagerduty_configs:
|
||||||
|
- service_key: <team-X-key>
|
||||||
|
|
||||||
|
- name: 'team-Y-mails'
|
||||||
|
email_configs:
|
||||||
|
- email: 'team-Y+alerts@example.org'
|
||||||
|
|
||||||
|
- name: 'team-Y-pager'
|
||||||
|
pagerduty_configs:
|
||||||
|
- service_key: <team-Y-key>
|
||||||
|
|
||||||
|
- name: 'team-DB-pager'
|
||||||
|
pagerduty_configs:
|
||||||
|
- service_key: <team-DB-key>
|
||||||
|
```
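The routing comments in the configuration above can be read as a tree walk. The following is a minimal sketch of that reading, not the Alertmanager code: the `Route` class, the `resolve` function, the `cont` field (standing in for `continue`), and the `default` target on the root are all assumptions made for illustration. Inheritance of attributes is not modelled, so `cont` is set explicitly on each node.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Route:
    send_to: str
    match: dict = field(default_factory=dict)     # exact label matches
    match_re: dict = field(default_factory=dict)  # regular expression matches
    cont: bool = True                             # stands in for 'continue'
    routes: list = field(default_factory=list)    # child routes

    def matches(self, labels):
        exact = all(labels.get(k) == v for k, v in self.match.items())
        regex = all(
            re.fullmatch(pattern, labels.get(k, "")) is not None
            for k, pattern in self.match_re.items()
        )
        return exact and regex


def resolve(route, labels):
    """Return the send_to targets of the deepest routes matching the labels."""
    if not route.matches(labels):
        return []
    targets = []
    for child in route.routes:
        found = resolve(child, labels)
        targets.extend(found)
        # A matching child with continue=false ends the search among siblings.
        if found and not child.cont:
            break
    # If no child matched, the alert stays at this node.
    return targets or [route.send_to]


# A reduced version of the tree above; 'default' is a hypothetical root target.
root = Route(send_to="default", routes=[
    Route(send_to="team-X-mails", match_re={"service": "^(foo1|foo2|baz)$"}, routes=[
        Route(send_to="team-X-pager", match={"severity": "critical"}),
    ]),
    Route(send_to="team-DB-pager", match={"service": "database"}, cont=False, routes=[
        Route(send_to="team-X-pager", match={"owner": "team-X"}, cont=False),
        Route(send_to="team-Y-pager", match={"owner": "team-Y"}, cont=False),
    ]),
])
```

Under these assumptions, `resolve(root, {"service": "foo1", "severity": "warning"})` falls back to `team-X-mails`, while the same alert with `severity: critical` ends up at `team-X-pager`.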
|
||||||
|
|
||||||
|
|
||||||
|
|