alertmanager

mirror of https://github.com/prometheus/alertmanager synced 2024-12-25 15:42:18 +00:00

Author	SHA1	Message	Date
Marco Pracucci	f84af78693	Lowered number of alert groups Signed-off-by: Marco Pracucci <marco@pracucci.com>	2021-05-11 16:15:46 +02:00
Marco Pracucci	1ad22c808f	Added unit test Signed-off-by: Marco Pracucci <marco@pracucci.com>	2021-05-11 15:48:02 +02:00
Marco Pracucci	72ef6e04e1	Fix race condition causing 1st alert to not be immediately delivered when group_wait is 0s Signed-off-by: Marco Pracucci <marco@pracucci.com>	2021-05-11 15:15:53 +02:00
Arve Knudsen	87b1cc6637	Unlock at specific points instead of deferring Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2021-04-27 10:44:18 +02:00
Arve Knudsen	bd543f1345	Dispatch: Make sure mutex gets unlocked on call to Stop Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2021-04-27 09:25:16 +02:00
Ben Ridley	5983d2078d	Fix formatting Signed-off-by: Ben Ridley <benridley29@gmail.com>	2021-03-01 08:30:02 +11:00
Ben Ridley	5d4231b001	Use consistent naming for mute time intervals Signed-off-by: Ben Ridley <benridley29@gmail.com>	2021-03-01 08:30:02 +11:00
ben	d1f5e07909	Add mute time stage and pipeline Signed-off-by: Ben Ridley <benridley29@gmail.com>	2021-03-01 08:30:01 +11:00
ben	cbfbf07188	Allow routes to reference time intervals Signed-off-by: Ben Ridley <benridley29@gmail.com>	2021-03-01 08:30:00 +11:00
Atibhi Agrawal	6b36afbbec	Add negative matchers for routing. (#2434 ) Add negative route matchers using label.Matcher Signed-off-by: aSquare14 <atibhi.a@gmail.com> Signed-off-by: beorn7 <beorn@grafana.com> Co-authored-by: Björn Rabenstein <beorn@grafana.com>	2021-01-15 21:11:39 +01:00
Jacob Lisi	0c0c6bdb01	Fix race condition in dispatcher (#2208 ) * fix dispatcher race condition Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> * add test to check for race condition in dispatcher Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> * return when dispatcher Stop has nil receiver Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> * remove unneeded chec Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>	2020-03-19 15:32:37 +01:00
Marco Pracucci	1f77f320a7	Fixed dispatcher metrics registration Signed-off-by: Marco Pracucci <marco@pracucci.com>	2020-03-06 15:09:30 +01:00
Sho Okada	04ca507125	Inherit their parent route's grouping when "group_by: [...]" (#2154 ) Signed-off-by: Sho Okada <shokada3@gmail.com>	2020-01-10 14:20:03 +01:00
johncming	134c3c0ed9	move walkRoute to dispatch package. (#2136 ) Signed-off-by: johncming <johncming@yahoo.com>	2019-12-20 15:27:58 +01:00
Simon Pasquier	b49ebfc683	Merge release 0.20 (#2140 ) * Revert "slack: retry 429 errors (#2112)" (#2128) This reverts commit `26cc96a787`. Signed-off-by: Simon Pasquier <spasquie@redhat.com> * Revert "config: remove support for JSON marshaling (#2086)" (#2133) This reverts commit `918f08b66a`. Signed-off-by: Simon Pasquier <spasquie@redhat.com> * config: fix JSON unmarshaling for HostPort (#2134) Signed-off-by: Simon Pasquier <spasquie@redhat.com> * Cut 0.20.0 (#2137) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-12-12 16:35:19 +01:00
Simon Pasquier	4f45457b9c	dispatch: add metrics (#2113 ) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-11-26 09:04:56 +01:00
Simon Pasquier	918f08b66a	config: remove support for JSON marshaling (#2086 ) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-10-29 10:45:42 +01:00
johncming	bad2e792ca	dispatch: route group labels should contain group common label. (#2055 ) Signed-off-by: johncming <johncming@yahoo.com>	2019-10-02 14:54:34 +02:00
Simon Pasquier	4535311c34	dispatch: don't garbage-collect alerts from store The aggregation group is already responsible for removing the resolved alerts. Running the garbage collection in parallel introduces a race and eventually resolved notifications may be dropped. Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-09-18 11:42:14 +02:00
Simon Pasquier	ab537b5b2f	dispatch: fix missing receivers in Groups() (#1964 ) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-07-24 17:12:37 +02:00
Simon Pasquier	612222b693	dispatch: use strings.Builder instead of []byte Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-07-15 15:27:37 +02:00
bigMacro	5ff6cffa08	fix memory visibility error (#1936 ) Signed-off-by: denghuan <denghuan@actionsky.com>	2019-06-25 10:11:45 +02:00
Simon Pasquier	2ccb4707f1	dispatch: fix flaky test Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-04-23 11:16:48 +02:00
Simon Pasquier	c78b449f4a	provider/mem: fix dropped alerts Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-04-19 15:35:21 +02:00
stuart nelson	2fa210d0e3	add groups endpoint to v2 api Signed-off-by: stuart nelson <stuartnelson3@gmail.com>	2019-04-17 11:32:21 +02:00
Simon Pasquier	a5e26cc721	*: log at debug level when context is canceled Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-04-03 16:41:03 +02:00
JoeWrightss	b926c6935e	Fix some typos in comment (#1750 ) Signed-off-by: zhoulin xie <zhoulin.xie@daocloud.io>	2019-02-08 14:57:08 +01:00
Brian Brazil	7078333202	Make a copy of firing alerts with EndsAt=0 when flushing. (#1686 ) If the original EndsAt is left in place, then as time moves forwards past the EndsAt then firing alerts will be rendered and treated as resolved alerts which can cause confusion and races. This is most likely to happen on retries for a notification. Mitigate race and fix data races in TestAggrGroup. Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>	2019-01-04 16:52:20 +01:00
kirillsablin	32bb289906	dispatch: Add group_by_all support (#1588 ) To aggregate by all possible labels use '...' as the sole label name. This effectively disables aggregation entirely, passing through all alerts as-is. This is unlikely to be what you want, unless you have a very low alert volume or your upstream notification system performs its own grouping. Example: group_by: [...] Signed-off-by: Kyryl Sablin <kyryl.sablin@schibsted.com>	2018-11-29 12:31:14 +01:00
Simon Pasquier	306fd73e32	*: remove use of golang.org/x/net/context Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-11-09 10:00:23 +01:00
stuart nelson	e883ccb9de	pull out shared code for storing alerts (#1507 ) Move the code for storing and GC'ing alerts from being re-implemented in several packages to existing in its own package Signed-off-by: stuart nelson <stuartnelson3@gmail.com>	2018-09-03 14:52:53 +02:00
Simon Pasquier	899226f3ac	*: remove v1/alerts/groups API endpoint (#1525 ) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-08-23 16:03:49 +02:00
bigMacro	f3bc41d256	fix concurrent read and write group error (#1447 ) * fix concurrent read and write group Signed-off-by: denghuan <denghuan@actionsky.com> * make lock more elegant Signed-off-by: denghuan <denghuan@actionsky.com>	2018-07-10 17:13:41 +02:00
Simon Pasquier	6a7c912559	Sort alerts in correct order (#1349 ) * Sort dispatched alerts by job+instance in the correct order (#1178) Signed-off-by: Ted Zlatanov <tzz@lifelogs.com> * dispatch: add unit test for alerts sorting Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-06-14 15:54:33 +02:00
Simon Pasquier	0ebaeccd4b	*: add missing license headers Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-05-14 17:37:13 +02:00
Manos Fokas	300a87e85b	Removed file changes to resolve conflict. (#1318 ) Signed-off-by: manosf <manosf@protonmail.com>	2018-04-17 16:22:46 +02:00
Simon Pasquier	4cba49155d	dispatch: don't reset timer if flush is in-progress (#1301 ) When the aggregation group receives an alert that is past the initial group_wait value, it should reset its timer only if the timer has ever expired. Otherwise it means that the flush is already in-progress.	2018-03-29 12:22:49 +02:00
Ted Zlatanov	099b6a1d43	Sort dispatched alerts by job+instance then rest by default (#1178 ) (#1234 )	2018-03-22 20:06:37 +01:00
Brian Brazil	aa950668bf	The default group_by is meant to be no labels. (#1287 ) This is what the intended default is, and what the documentation says.	2018-03-16 18:39:23 +01:00
pasquier-s	c39a913f8a	test: enable race detection (#1262 ) This change enables race detection when running the tests. It also fixes a couple of existing race conditions.	2018-02-27 18:18:53 +01:00
Brian Brazil	5cb71e1def	Fix spelling and comment style. (#1257 )	2018-02-27 10:07:33 +01:00
pasquier-s	9b10acae68	Don't notify resolved alerts if none were firing (#1198 ) * Don't notify resolved alerts if none were firing * Fix comments	2018-01-18 11:12:17 +01:00
pasquier-s	907ac510f8	Fix flaky TestBatching acceptance test (#1193 ) This change decreases the repeat_interval parameter from 5s to 4.9s to make sure that the alerts are effectively sent after 5 seconds. The workflow is: - The dispatcher flushes the alerts at t0, sends the notification and marks the notification log at t0+epsilon. - The dispatcher flushes the alerts at t1, t2, t3 and t4 and doesn't send the notifications as expected. - At t5, the dispatcher flushes the alerts and sends the notification because current_time - (t0+epsilon) is greater than repeat_interval. If repeat_interval is exactly 5s, there is a small chance that it is greater than current_time - (t0+epsilon).	2018-01-11 22:45:59 +01:00
Julius Volz	b145c51b99	Clarify variable names in Dispatcher.processAlert() A single entry in aggrGroups is just a single group, not plural.	2017-11-01 15:06:23 +01:00
Julius Volz	947970af44	Convert Alertmanager to use non-global go-kit loggers Fixes https://github.com/prometheus/alertmanager/issues/1040	2017-10-22 00:20:40 -07:00
Frederic Branczyk	5328885fe9	dispatch: fix race condition in dispatch test (#1025 )	2017-10-04 18:01:23 +02:00
Łukasz Mierzwa	8e61ebf6c3	Expose alert fingerprint in the API (#786 ) * Expose alert fingerprint in the API Alert fingerprint is already provided as the value of status.inhibitedBy[] attribute that inhibited alerts have, but there's no way to get back to the alert that's inhibiting it as the fingerprint is not exposed. * Expose alert fingerprint as ID in the list endpoint * Rename ID to Fingerprint * Use Fingerprint().String() in the API	2017-08-18 19:30:18 +02:00
stuart nelson	a7009a9db7	Stn/add receiver support (#872 ) Add ability to filter alerts by receiver in UI. This adds changes both in the Elm UI, as well as the Go backend.	2017-06-26 18:20:26 +02:00
Corentin Chary	9b2afbf18b	Make sure Matchers are always ordered This fixes https://github.com/prometheus/alertmanager/issues/881 Also add some unit tests	2017-06-23 15:30:34 +02:00
Fabian Reinartz	8170206070	Fix alert status handling in UI	2017-05-08 12:56:03 +02:00

61 Commits