* notify: notify resolved alerts properly
PR #1205, while fixing an existing issue, introduced another bug that
appears when the send_resolved flag of the integration is set to true.
With send_resolved set to false, the semantics remain the same:
AlertManager generates a notification when new firing alerts are added
to the alert group. The notification only carries firing alerts.
With send_resolved set to true, AlertManager generates a notification
when new firing or resolved alerts are added to the alert group. The
notification carries both the firing and resolved alerts.
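The two modes described above can be sketched roughly as follows. This is a minimal illustration, not the actual notify pipeline code: the `Alert` type and the `selectAlerts` and `shouldNotify` helpers are hypothetical names introduced here.

```go
package main

import "fmt"

// Alert is a simplified, hypothetical stand-in for an alert in a group.
type Alert struct {
	Name     string
	Resolved bool
}

// selectAlerts returns the alerts a notification would carry.
// With sendResolved false, only firing alerts are kept.
// With sendResolved true, both firing and resolved alerts are kept.
func selectAlerts(group []Alert, sendResolved bool) []Alert {
	var out []Alert
	for _, a := range group {
		if a.Resolved && !sendResolved {
			continue
		}
		out = append(out, a)
	}
	return out
}

// shouldNotify reports whether the alerts added to the group since the
// last notification are enough to trigger a new one.
func shouldNotify(added []Alert, sendResolved bool) bool {
	return len(selectAlerts(added, sendResolved)) > 0
}

func main() {
	added := []Alert{{Name: "DiskFull", Resolved: true}}
	fmt.Println("send_resolved=false:", shouldNotify(added, false))
	fmt.Println("send_resolved=true: ", shouldNotify(added, true))
}
```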
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Fix comments
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Improve notification instrumentation
- Add notificationLatencySeconds histogram to
debug duplicate messages. This can help determine
whether duplicate messages are caused by
excessive latency when sending a notification.
Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
* Wait for the gossip to settle before sending notifications
See #1209 for details.
As a heuristic for mesh readiness, check whether
the mesh looks stable (the number of peers isn't changing too much).
This implementation always marks the Alertmanager as ready after a maximum of 60s.
This adds one new flag to control this behavior:
```
--cluster.settle-timeout=60s mesh settling timeout. Do not wait more than this duration on startup.
```
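A rough sketch of such a settling heuristic, under simplifying assumptions: the peer count must stay exactly unchanged for a number of consecutive polls (the text above only says it should not change "too much"), and the names `settleTracker`, `waitSettled`, and `okRequired` are hypothetical, not taken from the cluster package.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// settleTracker counts consecutive polls in which the peer count did not change.
type settleTracker struct {
	last       int
	stable     int
	okRequired int
	seen       bool
}

// observe records one poll and reports whether the mesh looks settled.
func (t *settleTracker) observe(n int) bool {
	if t.seen && n == t.last {
		t.stable++
	} else {
		t.stable = 0
	}
	t.last, t.seen = n, true
	return t.stable >= t.okRequired
}

// waitSettled polls peerCount every interval until the tracker reports a
// stable mesh. It returns false if ctx expires first; a caller would then
// mark the instance as ready anyway, which is the settle-timeout behavior.
func waitSettled(ctx context.Context, peerCount func() int, interval time.Duration, okRequired int) bool {
	tr := &settleTracker{okRequired: okRequired}
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if tr.observe(peerCount()) {
			return true
		}
		select {
		case <-ctx.Done():
			return false
		case <-ticker.C:
		}
	}
}

func main() {
	// The 60s timeout mirrors the default of --cluster.settle-timeout.
	ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
	defer cancel()
	peers := func() int { return 3 } // a mesh whose size never changes
	fmt.Println("settled:", waitSettled(ctx, peers, 10*time.Millisecond, 3))
}
```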
It also adds `/-/ready`, which always returns 200 (in order to make it clear
that we are ready as soon as we can receive requests).
The mesh status is exposed in `/api/v1/status` and visible on `/#/status`.
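For illustration, an endpoint that unconditionally answers 200 could look like the sketch below. The handler, `newMux`, and the in-process `probe` helper are hypothetical names written for this example and are not the project's actual routing code.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// readyHandler always answers 200 OK: being able to serve the request
// is itself the readiness signal.
func readyHandler(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(http.StatusOK)
	fmt.Fprint(w, "OK")
}

// newMux registers the readiness endpoint on a fresh mux.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/-/ready", readyHandler)
	return mux
}

// probe sends an in-process request to h and returns the status code.
func probe(h http.Handler, method, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code
}

func main() {
	fmt.Println("GET /-/ready ->", probe(newMux(), http.MethodGet, "/-/ready"))
}
```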
* cluster: fix typos and base interval on gossipInterval
See #1223: OpsGenie now sometimes returns a 422 when no team is
specified. This change cleans up the JSON output and
adds a few unit tests.
After the initial notification has been sent, AlertManager shouldn't notify the
receiver again when no new alerts have been added to the group during
group_interval.
This change also modifies the acceptance test framework to assert that no
notification has been received in a given interval.
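An assertion of this kind can be sketched with a channel and a timer. This is only an illustration of the idea, not the acceptance framework's API: `expectNoNotification` and `demoInterval` are hypothetical names, and notifications are modeled as plain strings.

```go
package main

import (
	"fmt"
	"time"
)

// demoInterval is a short stand-in for group_interval in this sketch.
const demoInterval = 20 * time.Millisecond

// expectNoNotification returns an error if anything arrives on the
// notifications channel before the interval elapses.
func expectNoNotification(notifications <-chan string, interval time.Duration) error {
	select {
	case n := <-notifications:
		return fmt.Errorf("unexpected notification %q within %s", n, interval)
	case <-time.After(interval):
		return nil
	}
}

func main() {
	quiet := make(chan string)
	fmt.Println("quiet receiver:", expectNoNotification(quiet, demoInterval))

	noisy := make(chan string, 1)
	noisy <- "repeat of the initial notification"
	fmt.Println("noisy receiver:", expectNoNotification(noisy, demoInterval))
}
```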
* WeChat support by ybyang2/berlinsaint
* correct the whitespace
* add some test files and fix some naming errors by ybyang2/berlinsaint
* modify wechat retry test expectation
* template error
* add newline
Signed-off-by: yb_home <berlinsaint@126.com>
* fmt some pr code
* use @stuartnelson3's test-ci-wechat bindata.go
* notify.go: add wechat
The PagerDuty Events API (v1), used by integrations with monitoring tools, will continue to be supported. There are currently no plans to deprecate, end support for or sunset it.
The end-of-support notice cited in the removed log message applies only to the *REST API* version 1, which PagerDuty will no longer support as of February 2018, but which Prometheus does not use.
* Template Source field in pagerduty payload
As a sane default we link to alertmanager, but
leave templating available to the user if
something suits their system better.
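The "default with optional override" idea can be sketched with Go's text/template. The `data` type, the `defaultSource` template, and `renderSource` are assumptions made for this example; the actual default template and payload fields used by the PagerDuty integration may differ.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// data is a hypothetical, reduced view of what a notification template sees.
type data struct {
	ExternalURL string
	Labels      map[string]string
}

// defaultSource is an assumed default: link back to the Alertmanager instance.
const defaultSource = `{{ .ExternalURL }}`

// renderSource expands the user-supplied template, falling back to the
// default when the user configured nothing.
func renderSource(tmplText string, d data) (string, error) {
	if tmplText == "" {
		tmplText = defaultSource
	}
	t, err := template.New("source").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, d); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	d := data{
		ExternalURL: "http://alertmanager.example:9093",
		Labels:      map[string]string{"cluster": "eu-west"},
	}
	for _, tmplText := range []string{"", `{{ .Labels.cluster }}`} {
		src, err := renderSource(tmplText, d)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("template %q -> source %q\n", tmplText, src)
	}
}
```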