alertmanager

Commit Graph

Author	SHA1	Message	Date
George Robinson	cc6de9c666	Remove Id return from silences.Set(pb.Silence) This commit removes the Id from the method silences.Set(pb.Silence) as it is redundant. The Id is still set even when creating a silence fails. This will be fixed in a later change. Signed-off-by: George Robinson <george.robinson@grafana.com>	2024-06-20 15:47:49 +01:00
George Robinson	e690fbe250	Rename silence limit to max-silence-size-bytes (#3886 ) * Rename silence limit to max-silence-size-bytes This commit renames an existing (unreleased) limit from max-per-silence-bytes to max-silence-size-bytes. Signed-off-by: George Robinson <george.robinson@grafana.com> * Update help Signed-off-by: George Robinson <george.robinson@grafana.com> --------- Signed-off-by: George Robinson <george.robinson@grafana.com>	2024-06-20 15:20:52 +01:00
George Robinson	124da3462d	Silence limits as functions (#3885 ) * Silence limits as functions This commit changes silence limits from a struct of ints to a struct of functions that return individual limits. This allows limits to be lazy-loaded and updated without having to call silences.New(). Signed-off-by: George Robinson <george.robinson@grafana.com> * Add explicit test for no limits Signed-off-by: George Robinson <george.robinson@grafana.com> * Fix run() Signed-off-by: George Robinson <george.robinson@grafana.com> --------- Signed-off-by: George Robinson <george.robinson@grafana.com>	2024-06-20 14:50:53 +01:00
George Robinson	db32fab612	Replace incorrect use of fmt.Errorf (#3883 )	2024-06-20 12:02:05 +01:00
George Robinson	f9d5a08759	Fix TestSilenceLimits tests (#3866) This commit fixes silence tests that relied on the maintenance function running at a fixed 100ms interval. If the goroutine that runs the maintenance is not scheduled within 150ms then the test will fail. Signed-off-by: George Robinson <george.robinson@grafana.com>	2024-06-05 15:03:00 +01:00
George Robinson	dbe6312f09	Limits should include expired silences (#3862 ) * Limits should include expired silences Signed-off-by: George Robinson <george.robinson@grafana.com> * Fix docs Signed-off-by: George Robinson <george.robinson@grafana.com> --------- Signed-off-by: George Robinson <george.robinson@grafana.com>	2024-06-03 09:12:19 +01:00
George Robinson	b67bde8cf9	Add limits for silences (#3852 ) * Add limits for silences This commit adds limits for silences including the maximum number of active and pending silences, and the maximum size per silence (in bytes). Signed-off-by: George Robinson <george.robinson@grafana.com> * Remove default limits Signed-off-by: George Robinson <george.robinson@grafana.com> * Allow expiration of silences that exceed max size --------- Signed-off-by: George Robinson <george.robinson@grafana.com>	2024-05-31 17:52:44 +01:00
George Robinson	d31a249ffc	#3513: Add GroupMarker interface (#3792) * Add GroupMarker interface This commit adds a new GroupMarker interface that marks the status of groups. For example, whether an alert is muted because of one or more active or mute time intervals. It renames the existing Marker interface to AlertMarker to avoid confusion. Signed-off-by: George Robinson <george.robinson@grafana.com> --------- Signed-off-by: George Robinson <george.robinson@grafana.com>	2024-04-30 15:26:04 +01:00
George Robinson	6c70b5c014	Silences: Add benchmarks for Mutes (#3771 ) * Add benchmarks for Mutes This commit updates the existing benchmarks for silences to also benchmark Mutes. This complements the existing Query benchmarks by also measuring the time taken to mark silenced alerts. --------- Signed-off-by: George Robinson <george.robinson@grafana.com>	2024-03-21 20:54:56 +00:00
George Krajcsovits	d85bef20d9	feature: add native histogram support to latency metrics (#3737 ) Note that this does not stop showing classic metrics, for now it is up to the scrape config to decide whether to keep those instead or both. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	2024-02-29 14:53:47 +00:00
George Robinson	f69a508665	Remove metrics from compat package (#3714 ) This commit removes the metrics from the compat package in favour of the existing logging and the additional tools at hand, such as amtool, to validate Alertmanager configurations. Due to the global nature of the compat package, a consequence of config.Load, these metrics have proven to be less useful in practice than expected, both in Alertmanager and other projects such as Mimir. There are a number of reasons for this: 1. Because the compat package is global, these metrics cannot be reset each time config.Load is called, as in multi-tenant projects like Mimir loading a config for one tenant would reset the metrics for all tenants. This is also the reason the metrics are counters and not gauges. 2. Since the metrics are counters, it is difficult to create meaningful dashboards for Alertmanager as, unlike in Mimir, configurations are not reloaded at fixed intervals, and as such, operators cannot use rate to track configuration changes over time. In Alertmanager, there are much better tools available to validate that an Alertmanager configuration is compatible with the UTF-8 parser, including both the existing logging from Alertmanager server and amtool check-config. In other projects like Mimir, we can track configurations for individual tenants using log aggregation and storage systems such as Loki. This gives operators far more information than what is possible with the metrics, including the timestamp, input and ID of tenant configurations that are incompatible or have disagreement. Signed-off-by: George Robinson <george.robinson@grafana.com>	2024-02-08 09:59:03 +00:00
George Robinson	f92a08d073	Remove unused feature flags (#3676 ) This commit removes some code that should have been removed in #3668. The FeatureFlags in silence.Options are no longer used but were still initialized. These had a no-op effect. Signed-off-by: George Robinson <george.robinson@grafana.com>	2024-01-19 10:43:50 +00:00
George Robinson	fa6a7e6dd6	Fix inconsistent defaults in UTF-8 behavior (#3668 ) This commit fixes inconsistent UTF-8 behavior if the compat package is not initialized and feature flags are not passed to the API. This can happen when Alertmanager is used as a package in software such as Cortex or Mimir. The inconsistent behavior is that Alertmanager will accept UTF-8 alerts but reject UTF-8 configurations. Since feature flags are optional via api.Options, we cannot force them to be passed to api.New at compile time. Instead, it's better to defer back to the compat package which is consistent even when not initialized. Signed-off-by: George Robinson <george.robinson@grafana.com>	2024-01-15 10:03:51 +00:00
Matthieu MOREL	b9e347b9d1	golangci-lint: enable testifylint linter Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2023-12-10 08:50:03 +00:00
Matthieu MOREL	b81bad8711	use Go standard errors Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2023-12-08 16:44:13 +01:00
George Robinson	70bd5dad98	Support UTF-8 label matchers: Use compat package in Alertmanager server (#3567 ) * Support UTF-8 label matchers: Use compat package in Alertmanager server This pull request adds use of the compat package in Alertmanager server that will allow users to switch between the new matchers/parse parser and the old pkg/labels parser. The new matchers/parse parser uses a fallback mechanism where if the input cannot be parsed in the new parser it then attempts to use the old parser. If an input is parsed in the old parser but not the new parser then a warning log is emitted. Signed-off-by: George Robinson <george.robinson@grafana.com> --------- Signed-off-by: George Robinson <george.robinson@grafana.com>	2023-11-24 10:01:40 +00:00
gotjosh	3ee2cd0f12	Metrics: Silence maintenance success and failure (#3285) * Metrics: Silence maintenance success and failure Due to various reasons, we've observed different kinds of errors in this area. From read-only disks to silly code bugs. Errors during maintenance are effectively a _data loss_ and therefore we should encourage proper monitoring of this area. This PR introduces a total and failure metric for silence maintenance. If agreed, I'll do the same for the nflog and fix the flaky test like I did for silences while I'm there. Signed-off-by: gotjosh <josue.abreu@gmail.com>	2023-03-08 12:32:59 +00:00
gotjosh	5318bc3ccb	replace atomic for uber fix atomic Signed-off-by: gotjosh <josue.abreu@gmail.com>	2023-02-24 12:11:50 +00:00
gotjosh	c61ca09246	Fix silences flaky test Today I learned that `runtime.Gosched()` doesn't do what I thought it would. While it allows other goroutines to run, it doesn't guarantee that the main goroutine will be blocked until others are run. Sadly, I had to fall back to the sleep approach. Signed-off-by: gotjosh <josue.abreu@gmail.com>	2023-02-24 12:09:47 +00:00
gotjosh	f59460bfd4	Refactor nflog configuration options to make it similar to Silences. (#3220) * Refactor nflog configuration options to make it similar to Silences. The Notification Log is a similar component to Silences. They're the only two things that are shared between nodes when running in HA and they both hold some sort of internal state that needs to be cleaned up on an interval. To simplify the code and make it a bit more understandable (among other benefits such as improved testability) - I've refactored the notification log configuration and `run` to be similar to the silences.	2023-01-19 16:39:03 +00:00
inosato	791e542100	Remove ioutil Signed-off-by: inosato <si17_21@yahoo.co.jp>	2022-07-18 22:01:02 +09:00
Joe Blubaugh	01d1e49c54	Simplify Silence test to remove unnecessary wait. As noted in #2867, there is an unnecessary require.Eventually in a silence test. This PR addresses that by using a channel to signal that the maintenance loop has completed. Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>	2022-07-06 09:47:52 +08:00
Joe Blubaugh	505f944c6a	Apply suggestions from code review. Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>	2022-07-05 11:22:46 +08:00
Joe Blubaugh	0c3bf4b6ce	Loosen up the timing on an Eventually to avoid CI timeout Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>	2022-07-05 11:22:46 +08:00
Joe Blubaugh	c9249a02bc	Remove a stray line that was breaking the linter. Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>	2022-07-05 11:22:46 +08:00
Joe Blubaugh	bedd3c4175	Clean up linter warnings about unused code and atomic package Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>	2022-07-05 11:22:46 +08:00
Joe Blubaugh	cb00d9259b	Issue #2850 : Add benbjohnson/clock to the silences package. github.com/benbjohnson/clock provides a time interface to programs rather than using the stdlib time package. This allows mocking time in programs and tests. In this commit, the clock is used to speed up and simplify testing of the silences package. Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>	2022-07-05 11:22:46 +08:00
gotjosh	cfb909f419	Marker: Rename `SetSilenced` to `SetActiveOrSilenced` This accurately reflects what the function _actually_ does. If no active silence IDs are provided and the list of inhibitions we have is already empty the alert is actually set to Active. Took me a while to realise this while I was trying to understand how we populate the alert list. Signed-off-by: gotjosh <josue.abreu@gmail.com>	2022-06-17 12:51:23 +01:00
Matthias Loibl	a6d10bd5bc	Update golangci-lint and fix complaints (#2853) * Copy latest golangci-lint files from Prometheus Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * Use grafana/regexp over stdlib regexp Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * Fix typos in comments Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * Fix goimports complaints in import sorting Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * gofumpt all Go files Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * Update naming to comply with revive linter Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * config: Fix error messages to be lower case Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * test/cli: Fix error messages to be lower case Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * .golangci.yaml: Remove obsolete space Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * config: Fix expected victorOps error Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * Use stdlib regexp Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * Clean up Go modules Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>	2022-03-25 17:59:51 +01:00
Simon Pasquier	3f42c5e813	Merge pull request #2816 from prashbnair/update_check Correcting the condition for updating a silence. Earlier was checking…	2022-03-04 15:17:12 +01:00
Soon-Ping	a2d18c93de	Return no error when deleting expired silence (#2817 ) * Changed Silences.expire(id) to not return error for already expired silence Signed-off-by: Soon-Ping Phang <soonping@amazon.com> * Added comment explaining idempotency change for Silences.expire() Signed-off-by: Soon-Ping Phang <soonping@amazon.com> * Trigger build Signed-off-by: Soon-Ping Phang <soonping@amazon.com> * Trigger build Signed-off-by: Soon-Ping Phang <soonping@amazon.com> * Fixed typo in comment Signed-off-by: Soon-Ping Phang <soonping@amazon.com> * Trigger build Signed-off-by: Soon-Ping Phang <soonping@amazon.com> * Trigger build Signed-off-by: Soon-Ping Phang <soonping@amazon.com> * Fixed another typo in comment Signed-off-by: Soon-Ping Phang <soonping@amazon.com> * Promoted comment to function-level Signed-off-by: Soon-Ping Phang <soonping@amazon.com> * Added API v2 test for DeleteSilence, PostSilence Signed-off-by: Soon-Ping Phang <soonping@amazon.com> * Fixed lint errors Signed-off-by: Soon-Ping Phang <soonping@amazon.com> * Trigger build Signed-off-by: Soon-Ping Phang <soonping@amazon.com> * Trigger build Signed-off-by: Soon-Ping Phang <soonping@amazon.com> * Trigger build Signed-off-by: Soon-Ping Phang <soonping@amazon.com>	2022-02-22 13:34:21 +01:00
Prashant Balachandran	66182178d0	Correcting the condition for updating a silence. Earlier it was checking up to nanosecond precision but reduced to second as the UI only sends up to millisecond Signed-off-by: Prashant Balachandran <pnair@redhat.com>	2022-01-31 11:32:48 +05:30
Kyle Brandt	1b8afe7cb5	export ValidateMatcher for DI (#2) (#2716) so third parties, Grafana in particular, can override the validation. Grafana wants to do this because other data sources will have label keys with things like spaces, periods, or other characters - and is looking for a better integration with Alertmanager. goes with grafana/grafana#38629 replaces https://github.com/prometheus/alertmanager/pull/2694 Signed-off-by: Kyle Brandt <kyle@grafana.com>	2021-10-21 09:29:55 +02:00
Yuriy Tseretyan	15f44f4a61	Close file descriptor after snapshot file was read (#2710 ) * close file if it is opened Signed-off-by: Yuriy Tseretyan <yuriy.tseretyan@grafana.com>	2021-10-19 01:12:02 +02:00
Julius Volz	5195460c95	Correctly call default silence maintenance function (#2701 ) https://github.com/prometheus/alertmanager/pull/2689 introduced a regression where the default maintenance function would no longer be called even if no override was specified. The Alertmanager now crashes on any silence maintenance run without this fix. Signed-off-by: Julius Volz <julius.volz@gmail.com>	2021-09-13 19:42:48 +05:30
gotjosh	8da517524a	Enable support for custom callbacks as part of maintenance (#2689 ) * Enable support for custom callbacks as part of maintenance This enables support for custom Maintenance callbacks as part of the periodic maintenance of silences and notification logs. Effectively a no-op for the Alertmanager but allows downstream implementation to inject custom logic as part of it. Signed-off-by: gotjosh <josue.abreu@gmail.com> * Add tests Signed-off-by: gotjosh <josue.abreu@gmail.com> * Fix tests and remove whitespace Signed-off-by: gotjosh <josue.abreu@gmail.com> * Address review comments Signed-off-by: gotjosh <josue.abreu@gmail.com> * run go fmt Signed-off-by: gotjosh <josue.abreu@gmail.com> * Fix import ordering Signed-off-by: gotjosh <josue.abreu@gmail.com>	2021-09-06 16:19:39 +05:30
Julien Pivotto	b2a4cacb95	Update go dependencies & switch to go-kit/log Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-08-02 12:43:23 +02:00
beorn7	e84c265196	Include pending silences for future muting decisions Previously, if a pending silence existed for an alert, and it later became active without any silences getting added in the meantime, we would miss the existence of that newly active silence. Signed-off-by: beorn7 <beorn@grafana.com>	2021-05-27 22:15:57 +02:00
beorn7	f7c8a4b28a	Add test to expose issue #2426 Signed-off-by: beorn7 <beorn@grafana.com>	2021-05-26 19:39:25 +02:00
Ganesh Vernekar	1f946f8a7d	Replace satori/go.uuid with gofrs/uuid Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>	2021-03-15 19:39:15 +05:30
Ganesh Vernekar	406ddd200a	Upgrade github.com/satori/go.uuid Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>	2021-03-10 14:49:07 +05:30
Björn Rabenstein	023937679f	Catch unknown matcher types in gossipped silences (#2484 ) This has been discussed in #2479. Even if the conclusion there was that we don't need this in a bugfix release, it's still better to have this kind of robustness. So this introduces the same check into the main branch. Signed-off-by: beorn7 <beorn@grafana.com>	2021-02-10 12:02:56 +01:00
Simon Pasquier	23a7f89398	Update github.com/gogo/protobuf to v1.3.2 (#2478 ) Fix for CVE-2021-3121 Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2021-02-09 16:49:07 +01:00
Kiril Vladimirov	f5382af591	silence: Add tests for Not(Equal\|Regexp) matchers ... and fix a bug in validating silences with such matchers, caught while writing them. Signed-off-by: Kiril Vladimirov <kiril@vladimiroff.org>	2021-01-22 17:02:50 +02:00
Kiril Vladimirov	7320d83cbc	Replace types.Matcher(s)? with labels.Matcher(s)? Signed-off-by: Kiril Vladimirov <kiril@vladimiroff.org>	2021-01-22 17:02:48 +02:00
Julien Pivotto	013177e2d0	Update dependencies (#2257 ) Update membership Update common (support HTTP/2 client) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-05-18 15:00:36 +02:00
Josh Soref	0f2c65d265	Spelling (#2167 ) * spelling: inhibition Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: matchers Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: notification Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: nonexistent Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: obfuscated Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: occurred Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: relevant Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: unexpected Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: marshaled Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: marshaling Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>	2020-01-23 17:06:16 +01:00
Ilya Gladyshev	196c62f488	At least one non-empty silence matcher (#2081 ) * check if at least one silence matcher doesn't match empty strings Signed-off-by: qoops <ilya.v.gladyshev@gmail.com> * fixed grammar Signed-off-by: qoops <ilya.v.gladyshev@gmail.com>	2019-10-31 15:42:03 +01:00
Ganesh Vernekar	3207e8b300	Vendor prometheus 2.12.0 (#2008 ) * Vendor prometheus 2.12.0 Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in> * Update protos Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>	2019-08-22 15:34:38 +05:30
beorn7	318e006065	Mark some Summaries explicitly as having no objectives With the next release of client_golang, Summaries will not have objectives by default. Interestingly, this will do the right thing for the Summaries affected by this commit. However, right now those summaries do get the old default objectives. They don't really make sense because the affected Summaries receive Observations quite infrequently (far less than once in the 10m max age currently used). To not get surprising changes when moving on to client_golang v1, let's explicitly set the Summaries as objective-less now. Signed-off-by: beorn7 <beorn@grafana.com>	2019-06-12 15:47:56 +02:00


89 Commits