Commit Graph

2278 Commits

Author SHA1 Message Date
ricoberger
9a87f5c113 Populate details from common labels and details
Signed-off-by: ricoberger <mail@ricoberger.de>
2020-06-09 07:40:04 +02:00
Simon Pasquier
4346e1a51c
Cut v0.21.0-rc.0 (#2279)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-05 12:56:17 +02:00
ricoberger
8248c50365 Provide option to use common labels for OpsGenie details
Signed-off-by: ricoberger <mail@ricoberger.de>
2020-06-05 08:07:58 +02:00
ricoberger
dcccf542f1 Adjust Opsgenie config for labels propagation
Signed-off-by: ricoberger <mail@ricoberger.de>
2020-06-04 15:47:18 +02:00
Julius Volz
12da9d6570
Merge pull request #2277 from simonpasquier/bump-orb
.circleci/config.yml: bump Prometheus orb version
2020-06-04 15:24:34 +02:00
Simon Pasquier
8b7816dc9f .circleci/config.yml: bump Prometheus orb version
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-04 14:26:21 +02:00
Simon Pasquier
f7c595c168
notify: improve logs on notification errors (#2273)
* notify: improve logs on notification errors

Alertmanager can experience occasional failures when sending
notifications to an external service. If the operation succeeds after
some retries, the 'alertmanager_notifications_failed_total' metric
increases but nothing is logged (unless running with log.level=debug).
Hence an operator might receive an alert about notification failures but
wouldn't know which integration was failing.

With this change, notification failures are logged at the warning level.
To avoid log flooding, similar failures on retries aren't logged.
Additional information on the failing integration has also been added.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Log notify success at info level if it's a retry

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-04 10:38:48 +02:00
Julius Volz
70b5e00ffc
Allow limiting maximum number of alerts in webhook (#2274)
* Allow limiting maximum number of alerts in webhook

The webhook notifier is the only notifier that does not allow templating
on the Alertmanager side. Users who encounter occasional alert storms
(tens of thousands of alerts firing at once for the same group) have
reported that webhook receiver systems cannot cope with the load caused
by the resulting large webhook notifier messages (the alerting rules also
contained large annotations that can't be stripped away due to the lack
of templating). Since reducing the group size wasn't an option either,
this change allows truncating the list of alerts sent in the webhook body
to a provided maximum length. This assumes that if, for example, a group
receives 20k alerts, you are fine receiving only 10k because you wouldn't
be able to check them all anyway.

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Change max_alerts to uint32

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Add truncatedAlerts field to webhook message

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Fix JSON struct tag

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2020-06-04 10:07:33 +02:00
Simon Pasquier
9c3ee38683
.circleci/config.yml: collect test metadata (#2211)
* .circleci/config.yml: collect test metadata

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Store frontend test results too

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-04 09:49:32 +02:00
ricoberger
117c8ba8f1 Propagate labels to Opsgenie details
Signed-off-by: ricoberger <mail@ricoberger.de>
2020-06-04 09:30:02 +02:00
Simon Pasquier
e0cc523893
api/v2: add path and method to API v2 logs (#2261)
* api/v2: add path and method to API v2 logs

When an API v2 handler logged a message, the log wouldn't include the
path and method. Since different handlers perform the same validations
(e.g. matchers for alerts and silences), it isn't easy to know which
handler was invoked (though the logged filename and line number provide
a hint).

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Capitalize messages + improve logs

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-02 16:13:31 +02:00
LucasBoisserie
97bd078441
Add redirect on / to routePrefix (#2235)
Signed-off-by: LucasBoisserie <lucas.boisserie@gmail.com>
2020-05-28 17:07:55 +02:00
Ben Kochie
08956d1be3
Merge pull request #2269 from simonpasquier/update-orb
Bump Prometheus orb to 0.6.0
2020-05-27 17:15:36 +02:00
Simon Pasquier
6fb343b289 Bump Prometheus orb to 0.6.0
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-05-27 13:51:47 +02:00
PrometheusBot
3313bd6e29
makefile: update Makefile.common with newer version (#2263)
Signed-off-by: prombot <prometheus-team@googlegroups.com>
2020-05-19 09:02:46 +02:00
Fahri YARDIMCI
12db463c7f
Add cluster command to show cluster and peer statuses. (#2256)
Signed-off-by: Fahri Yardımcı <f.yardimci06@gmail.com>
2020-05-18 15:25:15 +02:00
Simon Pasquier
de80d907d1
cluster: log error on reconnect failures (#2260)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-05-18 15:00:51 +02:00
Julien Pivotto
013177e2d0
Update dependencies (#2257)
Update membership

Update common (support HTTP/2 client)

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-05-18 15:00:36 +02:00
shamilpd
d8ad30179a
Enforce 512KB event size limit for Pagerduty events (#2225)
* Enforce 512kb event size limit for Pagerduty

Signed-off-by: Shamil Ishraq <shamil@pagerduty.com>

* Add size limit to error message

Signed-off-by: Shamil Ishraq <shamil@pagerduty.com>

* Replace MaxEventSize setting with a const.

Signed-off-by: Shamil Ishraq <shamil@pagerduty.com>

* Change to package variable

Signed-off-by: Shamil Ishraq <shamil@pagerduty.com>

* Removed recursion in encodeMessage()

Signed-off-by: Shamil Ishraq <shamil@pagerduty.com>

* Unexport maxEventSize

Signed-off-by: Shamil Ishraq <shamil@pagerduty.com>
2020-05-15 15:15:18 +02:00
rmahroua
e7adbea594
Update README.md (#2245)
Move the `alertmanager.url` to a new line.

Signed-off-by: Razique Mahroua <rmahroua@redhat.com>
2020-05-15 14:39:41 +02:00
Pascal Hofmann
7efb78bce9
Improve remark on UDP/TCP for high availability (#2231)
* Improve remark on UDP/TCP for high availability
Signed-off-by: Pascal Hofmann <mail+github@pascalhofmann.de>

* Update README.md

Co-Authored-By: Max Inden <mail@max-inden.de>
Signed-off-by: Pascal Hofmann <mail+github@pascalhofmann.de>

* Update README.md
Signed-off-by: Pascal Hofmann <mail+github@pascalhofmann.de>

Co-authored-by: Max Inden <mail@max-inden.de>
2020-05-14 16:17:03 +02:00
Julien Pivotto
8d050daf51
Bump go version to 1.14 (#2248)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-05-06 10:59:32 +02:00
Julien Pivotto
6867a9b308
Merge pull request #2251 from simonpasquier/github-issue-typo
.github/ISSUE_TEMPLATE.md: fix typo
2020-05-05 23:00:28 +02:00
Simon Pasquier
66bc3a3aff .github/ISSUE_TEMPLATE.md: fix typo
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-05-05 14:18:13 +02:00
Carlos Eduardo
071818bdac
Add image build for ppc64le architecture (#2219)
Signed-off-by: Carlos de Paula <me@carlosedp.com>
2020-03-30 16:56:23 +02:00
Dominik-K
f8ffc2a18a
Add warning that inhibition occurs on missing equal (#2214)
Signed-off-by: Dominik <dominik-k@mailbox.org>
2020-03-27 16:20:19 +01:00
Jacob Lisi
0c0c6bdb01
Fix race condition in dispatcher (#2208)
* fix dispatcher race condition

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

* add test to check for race condition in dispatcher

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

* return when dispatcher Stop has nil receiver

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

* remove unneeded check

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
2020-03-19 15:32:37 +01:00
Simon Pasquier
44af3201fe
notify: add retry field to debug log (#2188)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-03-13 15:39:16 +01:00
Alan
bd53c5ac39
add shebang for test script (#2206)
Signed-off-by: alan <zg.zhu@daocloud.io>
2020-03-11 10:16:53 +01:00
Simon Pasquier
e347c31ab6
api/v2: return empty array of peers when disabled (#2203)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-03-10 15:47:43 +01:00
Julien Pivotto
99e5cbff40
Merge pull request #2200 from pracucci/fix-dispatcher-metrics
Fixed dispatcher metrics registration
2020-03-06 22:43:13 +01:00
Marco Pracucci
1f77f320a7
Fixed dispatcher metrics registration
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-03-06 15:09:30 +01:00
PrometheusBot
255b2073dc
makefile: update Makefile.common with newer version (#2197)
Signed-off-by: prombot <prometheus-team@googlegroups.com>
2020-02-28 10:24:50 +01:00
masataka
443fdb0b36
fix receiver regex (#2090)
Signed-off-by: m-masataka <m.mizukoshi.wakuwaku@gmail.com>
2020-02-18 17:24:05 +01:00
Julien Pivotto
cad963d8a8
Mark pull request as stale after 60d of inactivity (#2185)
During the dev Summit 2019/2, there was a consensus to mark PRs as stale
after 60 days.

This change adds the stale bot configuration required for this.
The stale bot already has access to the Prometheus organization. It
does _not_ comment on and does _not_ close the stale pull request. It just
adds a 'stale' label.

This is already done in the collectd_exporter repository and there it
works as expected.

https://docs.google.com/document/d/1VVxx9DzpJPDgOZpZ5TtSHBRPuG5Fr3Vr6EFh8XuUpgs/edit

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-02-17 10:54:20 +01:00
Simon Pasquier
56e966bc20
api/v2: Fix silence creation error message (#2179)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-02-13 10:07:58 +01:00
stuart nelson
a6d722de6c
remove stuart from MAINTAINERS.md (#2181)
I've been largely inactive for the last 6 months,
and have no plans to become active again.

Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
2020-02-13 10:05:49 +01:00
Célian GARCIA
dcc0b70c7d
[Minor][one line change] Fix an error message about start and end time validation. EOM (#2173)
* Fix an error message about start and end time validation

Signed-off-by: Célian Garcia <celian.garcia@amadeus.com>

* Modified start and end time validation message to be affirmative

Signed-off-by: Célian Garcia <celian.garcia@amadeus.com>
2020-02-05 15:13:46 +01:00
melchiormoulin
e37f769035
Add slack channel when logging error. (#2177)
Signed-off-by: Melchior MOULIN <m.moulin@criteo.com>
2020-02-05 09:17:15 +01:00
Josh Soref
0f2c65d265 Spelling (#2167)
* spelling: inhibition

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: matchers

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: notification

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: nonexistent

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: obfuscated

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: occurred

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: relevant

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: unexpected

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: marshaled

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: marshaling

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-01-23 17:06:16 +01:00
Kevin Hellemun
a2aa0cb5bf [#2160] Removed default assignment of env vars. (#2161)
Signed-off-by: Kevin Hellemun <17928966+OGKevin@users.noreply.github.com>
2020-01-22 14:53:19 +01:00
Sho Okada
04ca507125 Inherit their parent route's grouping when "group_by: [...]" (#2154)
Signed-off-by: Sho Okada <shokada3@gmail.com>
2020-01-10 14:20:03 +01:00
Max Inden
b4ac213809
Merge pull request #2153 from mxinden/remove-mxinden
MAINTAINERS.md: Remove Max Inden (mxinden)
2020-01-06 11:01:32 +01:00
Max Inden
6b01da3a64
MAINTAINERS.md: Remove Max Inden
Signed-off-by: Max Inden <IndenML@gmail.com>
2020-01-02 19:02:58 +01:00
johncming
134c3c0ed9 move walkRoute to dispatch package. (#2136)
Signed-off-by: johncming <johncming@yahoo.com>
2019-12-20 15:27:58 +01:00
Simon Pasquier
b49ebfc683
Merge release 0.20 (#2140)
* Revert "slack: retry 429 errors (#2112)" (#2128)

This reverts commit 26cc96a787.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Revert "config: remove support for JSON marshaling (#2086)" (#2133)

This reverts commit 918f08b66a.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* config: fix JSON unmarshaling for HostPort (#2134)

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Cut 0.20.0 (#2137)

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-12-12 16:35:19 +01:00
Simon Pasquier
3640bb8d55
.circleci/config.yml: publish_release requires test_frontend (#2139)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-12-11 17:06:10 +01:00
Simon Pasquier
f74be0400a
Cut 0.20.0 (#2137)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-12-11 15:03:55 +01:00
Simon Pasquier
06adefbe59
config: fix JSON unmarshaling for HostPort (#2134)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-12-11 08:55:59 +01:00
Simon Pasquier
ed6434c7d4
Revert "config: remove support for JSON marshaling (#2086)" (#2133)
This reverts commit 918f08b66a.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-12-06 11:54:54 +01:00