2126 Commits

Author SHA1 Message Date
Julien Pivotto a7f9fdadbe
docs: Fix dead link to visual route editor (#2298) (#2300)
Relative links are rendered relative to the source repository, so this
link points to GitHub at the moment.

This commit changes the link to an absolute URL, bringing readers to the
Routing tree editor.

Signed-off-by: Manuel Hutter <manuel@hutter.io>

Co-authored-by: Manuel Hutter <mhutter@users.noreply.github.com>
2020-06-17 16:42:32 +02:00
Simon Pasquier 4c6c03ebfe
Cut v0.21.0 (#2297)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-17 10:44:48 +02:00
Simon Pasquier 56f09a62b2
notify: always retry with a back-off (#2290)
By default the library implementing the back-off timer stops the timer
after 15 minutes. Since the code never checked the value returned by the
ticker, notification retries were executed without delay after the 15
minutes had elapsed (e.g. for `group_interval` greater than 15m).

This change ensures that the back-off timer never expires.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-16 09:50:35 +02:00
Julien Pivotto 2f74a34176
Release 0.21 documentation (#2294)
* Release 0.20 docs (#2292)

* Raw docs imports

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Adapt for this repository

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Add max alerts doc

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Remove HipChat from docs

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-12 12:25:00 +02:00
Julien Pivotto 1cba0c7a37
Remove HipChat (#2281)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-11 15:51:10 +02:00
vitt-bagal ce5d523596
Add image build for s390x architecture (#2289)
* Add support for s390x

Signed-off-by: Nayana <nthorat@us.ibm.com>

* Added s390x support to docker image

Signed-off-by: vitthalb@us.ibm.com <vitthalb@us.ibm.com>

Co-authored-by: Nayana <nthorat@us.ibm.com>
2020-06-09 16:30:55 +02:00
Simon Pasquier 4346e1a51c
Cut v0.21.0-rc.0 (#2279)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-05 12:56:17 +02:00
Julius Volz 12da9d6570
Merge pull request #2277 from simonpasquier/bump-orb
.circleci/config.yml: bump Prometheus orb version
2020-06-04 15:24:34 +02:00
Simon Pasquier 8b7816dc9f .circleci/config.yml: bump Prometheus orb version
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-04 14:26:21 +02:00
Simon Pasquier f7c595c168
notify: improve logs on notification errors (#2273)
* notify: improve logs on notification errors

Alertmanager can experience occasional failures when sending
notifications to an external service. If the operation succeeds after
some retry, the 'alertmanager_notifications_failed_total' metric
increases but nothing is logged (unless running with log.level=debug).
Hence an operator might receive an alert about notification failures but
wouldn't know which integration was failing.

With this change, notification failures are logged at the warning level.
To avoid log flooding, similar failures on retries aren't logged.
Additional information on the failing integration has also been added.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Log notify success at info level if it's a retry

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-04 10:38:48 +02:00
Julius Volz 70b5e00ffc
Allow limiting maximum number of alerts in webhook (#2274)
* Allow limiting maximum number of alerts in webhook

The webhook notifier is the only notifier that does not allow templating
on the Alertmanager side. Users who encounter occasional alert storms
(10ks of alerts going off at once for the same group) have reported
webhook receiver systems not being able to cope with the load caused by
the resulting large webhook notifier messages (the alerting rules also
contained large annotations that can't be stripped away due to lack of
templating). Reducing group size also wasn't an option, but this change
proposes to allow truncating the list of alerts sent in the webhook body
to a provided maximum length. This assumes that e.g. if a group receives
20k alerts, you really are fine only receiving 10k because you wouldn't
be able to check them all anyway.

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Change max_alerts to uint32

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Add truncatedAlerts field to webhook message

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Fix JSON struct tag

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2020-06-04 10:07:33 +02:00
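
The truncation proposed in 70b5e00ffc can be sketched in a few lines of Go. This is a minimal illustration under stated assumptions, not the code from the pull request: alerts are modelled as plain strings, the `message` struct and `truncateAlerts` helper are hypothetical, and a limit of zero is assumed to mean "no limit". Only the `max_alerts` setting, its `uint32` type, and the `truncatedAlerts` field name come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message is a hypothetical, heavily reduced webhook body.
type message struct {
	Alerts          []string `json:"alerts"`
	TruncatedAlerts uint64   `json:"truncatedAlerts"`
}

// truncateAlerts keeps at most maxAlerts entries and reports how many were
// dropped. A maxAlerts of 0 is treated as "no limit" (an assumption).
func truncateAlerts(alerts []string, maxAlerts uint32) ([]string, uint64) {
	if maxAlerts != 0 && uint64(len(alerts)) > uint64(maxAlerts) {
		return alerts[:maxAlerts], uint64(len(alerts)) - uint64(maxAlerts)
	}
	return alerts, 0
}

func main() {
	alerts := []string{"a", "b", "c", "d", "e"}

	kept, dropped := truncateAlerts(alerts, 3)
	body, err := json.Marshal(message{Alerts: kept, TruncatedAlerts: dropped})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Reporting the number of dropped alerts alongside the shortened list lets the receiver tell a small group apart from a truncated one.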
Simon Pasquier 9c3ee38683
.circleci/config.yml: collect test metadata (#2211)
* .circleci/config.yml: collect test metadata

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Store frontend test results too

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-04 09:49:32 +02:00
Simon Pasquier e0cc523893
api/v2: add path and method to API v2 logs (#2261)
* api/v2: add path and method to API v2 logs

When an API v2 handler logged a message, the log wouldn't include the
path and method. Since different handlers perform the same validations
(e.g. matchers for alerts and silences), it isn't easy to know which
handler was invoked (though the logged filename + line number provides
a hint).

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Capitalize messages + improve logs

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-02 16:13:31 +02:00
LucasBoisserie 97bd078441
Add redirect on / to routePrefix (#2235)
Signed-off-by: LucasBoisserie <lucas.boisserie@gmail.com>
2020-05-28 17:07:55 +02:00
Ben Kochie 08956d1be3
Merge pull request #2269 from simonpasquier/update-orb
Bump Prometheus orb to 0.6.0
2020-05-27 17:15:36 +02:00
Simon Pasquier 6fb343b289 Bump Prometheus orb to 0.6.0
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-05-27 13:51:47 +02:00
PrometheusBot 3313bd6e29
makefile: update Makefile.common with newer version (#2263)
Signed-off-by: prombot <prometheus-team@googlegroups.com>
2020-05-19 09:02:46 +02:00
Fahri YARDIMCI 12db463c7f
Add cluster command to show cluster and peer statuses. (#2256)
Signed-off-by: Fahri Yardımcı <f.yardimci06@gmail.com>
2020-05-18 15:25:15 +02:00
Simon Pasquier de80d907d1
cluster: log error on reconnect failures (#2260)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-05-18 15:00:51 +02:00
Julien Pivotto 013177e2d0
Update dependencies (#2257)
Update membership

Update common (support HTTP/2 client)

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-05-18 15:00:36 +02:00
shamilpd d8ad30179a
Enforce 512KB event size limit for Pagerduty events (#2225)
* Enforce 512kb event size limit for Pagerduty

Signed-off-by: Shamil Ishraq <shamil@pagerduty.com>

* Add size limit to error message

Signed-off-by: Shamil Ishraq <shamil@pagerduty.com>

* Replace MaxEventSize setting with a const.

Signed-off-by: Shamil Ishraq <shamil@pagerduty.com>

* Change to package variable

Signed-off-by: Shamil Ishraq <shamil@pagerduty.com>

* Removed recursion in encodeMessage()

Signed-off-by: Shamil Ishraq <shamil@pagerduty.com>

* Unexport maxEventSize

Signed-off-by: Shamil Ishraq <shamil@pagerduty.com>
2020-05-15 15:15:18 +02:00
rmahroua e7adbea594
Update README.md (#2245)
Move the `alertmanager.url` to a new line.

Signed-off-by: Razique Mahroua <rmahroua@redhat.com>
2020-05-15 14:39:41 +02:00
Pascal Hofmann 7efb78bce9
Improve remark on UDP/TCP for high availability (#2231)
* Improve remark on UDP/TCP for high availability
Signed-off-by: Pascal Hofmann <mail+github@pascalhofmann.de>

* Update README.md

Co-Authored-By: Max Inden <mail@max-inden.de>
Signed-off-by: Pascal Hofmann <mail+github@pascalhofmann.de>

* Update README.md
Signed-off-by: Pascal Hofmann <mail+github@pascalhofmann.de>

Co-authored-by: Max Inden <mail@max-inden.de>
2020-05-14 16:17:03 +02:00
Julien Pivotto 8d050daf51
Bump go version to 1.14 (#2248)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-05-06 10:59:32 +02:00
Julien Pivotto 6867a9b308
Merge pull request #2251 from simonpasquier/github-issue-typo
.github/ISSUE_TEMPLATE.md: fix typo
2020-05-05 23:00:28 +02:00
Simon Pasquier 66bc3a3aff .github/ISSUE_TEMPLATE.md: fix typo
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-05-05 14:18:13 +02:00
Carlos Eduardo 071818bdac
Add image build for ppc64le architecture (#2219)
Signed-off-by: Carlos de Paula <me@carlosedp.com>
2020-03-30 16:56:23 +02:00
Dominik-K f8ffc2a18a
Add warning that inhibition occurs on missing `equal` (#2214)
Signed-off-by: Dominik <dominik-k@mailbox.org>
2020-03-27 16:20:19 +01:00
Jacob Lisi 0c0c6bdb01
Fix race condition in dispatcher (#2208)
* fix dispatcher race condition

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

* add test to check for race condition in dispatcher

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

* return when dispatcher Stop has nil receiver

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

* remove unneeded check

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
2020-03-19 15:32:37 +01:00
Simon Pasquier 44af3201fe
notify: add retry field to debug log (#2188)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-03-13 15:39:16 +01:00
Alan bd53c5ac39
add shebang for test script (#2206)
add shebang for test script
Signed-off-by: alan <zg.zhu@daocloud.io>
2020-03-11 10:16:53 +01:00
Simon Pasquier e347c31ab6
api/v2: return empty array of peers when disabled (#2203)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-03-10 15:47:43 +01:00
Julien Pivotto 99e5cbff40
Merge pull request #2200 from pracucci/fix-dispatcher-metrics
Fixed dispatcher metrics registration
2020-03-06 22:43:13 +01:00
Marco Pracucci 1f77f320a7
Fixed dispatcher metrics registration
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-03-06 15:09:30 +01:00
PrometheusBot 255b2073dc
makefile: update Makefile.common with newer version (#2197)
Signed-off-by: prombot <prometheus-team@googlegroups.com>
2020-02-28 10:24:50 +01:00
masataka 443fdb0b36
fix receiver regex (#2090)
Signed-off-by: m-masataka <m.mizukoshi.wakuwaku@gmail.com>
2020-02-18 17:24:05 +01:00
Julien Pivotto cad963d8a8
Mark pull request as stale after 60d of inactivity (#2185)
During the dev Summit 2019/2, there was a consensus to mark PRs as stale
after 60 days.

This change adds the stale bot configuration required for this.
The stale bot already has access to the Prometheus organization. It
does _not_ comment and does _not_ close the stale pull request. It just
adds a label 'stale'.

This is already done in the collectd_exporter repository and there it
works as expected.

https://docs.google.com/document/d/1VVxx9DzpJPDgOZpZ5TtSHBRPuG5Fr3Vr6EFh8XuUpgs/edit

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-02-17 10:54:20 +01:00
Simon Pasquier 56e966bc20
api/v2: Fix silence creation error message (#2179)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-02-13 10:07:58 +01:00
stuart nelson a6d722de6c
remove stuart from MAINTAINERS.md (#2181)
I've been largely inactive for the last 6 months,
and have no plans to become active again.

Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
2020-02-13 10:05:49 +01:00
Célian GARCIA dcc0b70c7d
[Minor][one line change] Fix an error message about start and end time validation. EOM (#2173)
* Fix an error message about start and end time validation

Signed-off-by: Célian Garcia <celian.garcia@amadeus.com>

* Modified start and end time validation message to be affirmative

Signed-off-by: Célian Garcia <celian.garcia@amadeus.com>
2020-02-05 15:13:46 +01:00
melchiormoulin e37f769035
Add slack channel when logging error. (#2177)
Signed-off-by: Melchior MOULIN <m.moulin@criteo.com>
2020-02-05 09:17:15 +01:00
Josh Soref 0f2c65d265 Spelling (#2167)
* spelling: inhibition

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: matchers

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: notification

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: nonexistent

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: obfuscated

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: occurred

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: relevant

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: unexpected

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: marshaled

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: marshaling

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-01-23 17:06:16 +01:00
Kevin Hellemun a2aa0cb5bf [#2160] Removed default assignment of env vars. (#2161)
Signed-off-by: Kevin Hellemun <17928966+OGKevin@users.noreply.github.com>
2020-01-22 14:53:19 +01:00
Sho Okada 04ca507125 Inherit their parent route's grouping when "group_by: [...]" (#2154)
Signed-off-by: Sho Okada <shokada3@gmail.com>
2020-01-10 14:20:03 +01:00
Max Inden b4ac213809
Merge pull request #2153 from mxinden/remove-mxinden
MAINTAINERS.md: Remove Max Inden (mxinden)
2020-01-06 11:01:32 +01:00
Max Inden 6b01da3a64
MAINTAINERS.md: Remove Max Inden
Signed-off-by: Max Inden <IndenML@gmail.com>
2020-01-02 19:02:58 +01:00
johncming 134c3c0ed9 move walkRoute to dispatch package. (#2136)
Signed-off-by: johncming <johncming@yahoo.com>
2019-12-20 15:27:58 +01:00
Simon Pasquier b49ebfc683
Merge release 0.20 (#2140)
* Revert "slack: retry 429 errors (#2112)" (#2128)

This reverts commit 26cc96a787.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Revert "config: remove support for JSON marshaling (#2086)" (#2133)

This reverts commit 918f08b66a.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* config: fix JSON unmarshaling for HostPort (#2134)

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Cut 0.20.0 (#2137)

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-12-12 16:35:19 +01:00
Simon Pasquier 3640bb8d55
.circleci/config.yml: publish_release requires test_frontend (#2139)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-12-11 17:06:10 +01:00
meichuntao 5cb556e4b2 api/metrics/metrics.go: Fix returning wrong counter (#2126)
Signed-off-by: meichuntao <mei.chuntao@zte.com.cn>
2019-12-04 11:26:13 +01:00