alertmanager/notify
Simon Pasquier f7c595c168
notify: improve logs on notification errors (#2273)
* notify: improve logs on notification errors

Alertmanager can experience occasional failures when sending
notifications to an external service. If the operation succeeds after
a few retries, the 'alertmanager_notifications_failed_total' metric
increases but nothing is logged (unless running with log.level=debug).
As a result, an operator might receive an alert about notification
failures without knowing which integration was failing.

With this change, notification failures are logged at the warning level.
To avoid log flooding, similar failures on retries aren't logged.
Additional information on the failing integration has also been added.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Log notify success at info level if it's a retry

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-04 10:38:48 +02:00
Name            Last commit                                                  Date
email           Add email notify Message-Id Header (#2057)                   2019-10-23 15:49:30 +02:00
hipchat         Bump prometheus/client_golang to v1.1.0 (#1989)              2019-08-08 14:36:10 +02:00
opsgenie        Bump prometheus/client_golang to v1.1.0 (#1989)              2019-08-08 14:36:10 +02:00
pagerduty       Enforce 512KB event size limit for Pagerduty events (#2225)  2020-05-15 15:15:18 +02:00
pushover        Bump prometheus/client_golang to v1.1.0 (#1989)              2019-08-08 14:36:10 +02:00
slack           Add slack channel when logging error. (#2177)                2020-02-05 09:17:15 +01:00
test            *: split notify package                                      2019-06-18 15:36:19 +02:00
victorops       Bump prometheus/client_golang to v1.1.0 (#1989)              2019-08-08 14:36:10 +02:00
webhook         Allow limiting maximum number of alerts in webhook (#2274)   2020-06-04 10:07:33 +02:00
wechat          notify/wechat: adjust result check sequence. (#2044)         2019-09-23 09:31:57 +02:00
notify.go       notify: improve logs on notification errors (#2273)          2020-06-04 10:38:48 +02:00
notify_test.go  notify: don't use the global metrics registry (#1977)        2019-08-26 16:37:13 +02:00
util.go         notify: refactor code to retry requests (#1974)              2019-08-02 16:17:40 +02:00
util_test.go    notify: refactor code to retry requests (#1974)              2019-08-02 16:17:40 +02:00