Document new alerting rule format.

Brian Brazil 2017-10-31 14:05:58 +00:00
parent efaa8f9ce8
commit 8cf279efb1

@@ -11,31 +11,36 @@ to an external service. Whenever the alert expression results in one or more
 vector elements at a given point in time, the alert counts as active for these
 elements' label sets.
 
+### Defining alerting rules
+
 Alerting rules are configured in Prometheus in the same way as [recording
 rules](recording_rules.md).
 
-### Defining alerting rules
+An example rules file with an alert would be:
 
-Alerting rules are defined in the following syntax:
+```yaml
+groups:
+- name: example
+  rules:
+  - alert: HighErrorRate
+    expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
+    for: 10m
+    labels:
+      severity: page
+    annotations:
+      summary: High request latency
+```
 
-    ALERT <alert name>
-      IF <expression>
-      [ FOR <duration> ]
-      [ LABELS <label set> ]
-      [ ANNOTATIONS <label set> ]
-
-The alert name must be a valid metric name.
-
-The optional `FOR` clause causes Prometheus to wait for a certain duration
+The optional `for` clause causes Prometheus to wait for a certain duration
 between first encountering a new expression output vector element (like an
 instance with a high HTTP error rate) and counting an alert as firing for this
 element. Elements that are active, but not firing yet, are in pending state.
 
-The `LABELS` clause allows specifying a set of additional labels to be attached
+The `labels` clause allows specifying a set of additional labels to be attached
 to the alert. Any existing conflicting labels will be overwritten. The label
 values can be templated.
 
-The `ANNOTATIONS` clause specifies another set of labels that are not
+The `annotations` clause specifies another set of labels that are not
 identifying for an alert instance. They are used to store longer additional
 information such as alert descriptions or runbook links. The annotation values
 can be templated.
@@ -53,24 +58,29 @@ and `$value` holds the evaluated value of an alert instance.
 
 Examples:
 
-    # Alert for any instance that is unreachable for >5 minutes.
-    ALERT InstanceDown
-      IF up == 0
-      FOR 5m
-      LABELS { severity = "page" }
-      ANNOTATIONS {
-        summary = "Instance {{ $labels.instance }} down",
-        description = "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.",
-      }
+```yaml
+groups:
+- name: example
+  rules:
 
-    # Alert for any instance that have a median request latency >1s.
-    ALERT APIHighRequestLatency
-      IF api_http_request_latencies_second{quantile="0.5"} > 1
-      FOR 1m
-      ANNOTATIONS {
-        summary = "High request latency on {{ $labels.instance }}",
-        description = "{{ $labels.instance }} has a median request latency above 1s (current value: {{ $value }}s)",
-      }
+  # Alert for any instance that is unreachable for >5 minutes.
+  - alert: InstanceDown
+    expr: up == 0
+    for: 5m
+    labels:
+      severity: page
+    annotations:
+      summary: "Instance {{ $labels.instance }} down"
+      description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."
+
+  # Alert for any instance that has a median request latency >1s.
+  - alert: APIHighRequestLatency
+    expr: api_http_request_latencies_second{quantile="0.5"} > 1
+    for: 10m
+    annotations:
+      summary: "High request latency on {{ $labels.instance }}"
+      description: "{{ $labels.instance }} has a median request latency above 1s (current value: {{ $value }}s)"
+```
 
 ### Inspecting alerts during runtime
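
Note on the `for` clause documented in this change: the pending and firing states it describes can be modelled as a small state machine per vector element. The sketch below is a simplified illustration only, not Prometheus's implementation. The class `AlertState`, its fields, and the `evaluate` method are hypothetical names, and it assumes an element starts firing once it has been continuously active for at least the `for` duration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertState:
    """Hypothetical model of one vector element (label set) of an alerting rule."""

    for_seconds: float                    # the rule's `for` duration, in seconds
    active_since: Optional[float] = None  # time the element first matched the expression

    def evaluate(self, now: float, expr_matches: bool) -> str:
        """Return 'inactive', 'pending', or 'firing' for this evaluation."""
        if not expr_matches:
            # The element dropped out of the expression result: reset the timer.
            self.active_since = None
            return "inactive"
        if self.active_since is None:
            # First evaluation at which the element is present.
            self.active_since = now
        if now - self.active_since >= self.for_seconds:
            return "firing"
        return "pending"


# Modelled on the InstanceDown example: `for: 5m` is 300 seconds.
state = AlertState(for_seconds=300)
history = [
    state.evaluate(0, True),     # element appears
    state.evaluate(240, True),   # still within the `for` window
    state.evaluate(300, True),   # active for the full duration
    state.evaluate(360, False),  # expression no longer matches
    state.evaluate(420, True),   # element reappears, timer restarts
]
print(history)
```

Under this model, an element that disappears and later reappears has to wait out the full `for` duration again before it counts as firing.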