alertmanager

Commit Graph

Author	SHA1	Message	Date
George Robinson	7106bcc1ab	Fix version in APIv1 deprecation notice (#3815) Signed-off-by: George Robinson <george.robinson@grafana.com>	2024-04-23 16:16:01 +01:00
George Robinson	fa6a7e6dd6	Fix inconsistent defaults in UTF-8 behavior (#3668) This commit fixes inconsistent UTF-8 behavior if the compat package is not initialized and feature flags are not passed to the API. This can happen when Alertmanager is used as a package in software such as Cortex or Mimir. The inconsistent behavior is that Alertmanager will accept UTF-8 alerts but reject UTF-8 configurations. Since feature flags are optional via api.Options, we cannot force them to be passed to api.New at compile time. Instead, it's better to defer back to the compat package which is consistent even when not initialized. Signed-off-by: George Robinson <george.robinson@grafana.com>	2024-01-15 10:03:51 +00:00
Matthieu MOREL	b81bad8711	use Go standard errors Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2023-12-08 16:44:13 +01:00
George Robinson	70bd5dad98	Support UTF-8 label matchers: Use compat package in Alertmanager server (#3567) * Support UTF-8 label matchers: Use compat package in Alertmanager server This pull request adds use of the compat package in Alertmanager server that will allow users to switch between the new matchers/parse parser and the old pkg/labels parser. The new matchers/parse parser uses a fallback mechanism where if the input cannot be parsed in the new parser it then attempts to use the old parser. If an input is parsed in the old parser but not the new parser then a warning log is emitted. Signed-off-by: George Robinson <george.robinson@grafana.com> --------- Signed-off-by: George Robinson <george.robinson@grafana.com>	2023-11-24 10:01:40 +00:00
gotjosh	b4f7027908	Deprecate and remove api/v1/ (#2970) * Deprecate and remove api/v1/ Signed-off-by: gotjosh <josue.abreu@gmail.com> --------- Signed-off-by: gotjosh <josue.abreu@gmail.com>	2023-11-24 09:06:04 +00:00
Matthias Loibl	a6d10bd5bc	Update golangci-lint and fix complaints (#2853) * Copy latest golangci-lint files from Prometheus Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * Use grafana/regexp over stdlib regexp Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * Fix typos in comments Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * Fix goimports complaints in import sorting Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * gofumpt all Go files Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * Update naming to comply with revive linter Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * config: Fix error messages to be lower case Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * test/cli: Fix error messages to be lower case Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * .golangci.yaml: Remove obsolete space Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * config: Fix expected victorOps error Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * Use stdlib regexp Signed-off-by: Matthias Loibl <mail@matthiasloibl.com> * Clean up Go modules Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>	2022-03-25 17:59:51 +01:00
Jean-Philippe Quémémer	4fbcae7d05	Remove `/api/v2/` from `StripPrefix` Signed-off-by: Danny Kopping <danny.kopping@grafana.com>	2021-11-16 00:42:43 +01:00
Julien Pivotto	b2a4cacb95	Update go dependencies & switch to go-kit/log Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-08-02 12:43:23 +02:00
gotjosh	9a2ae39430	Clustering: Interface for Peers in other packages A Peer as defined by the `cluster` package represents the node in the cluster. It is used in other packages to know the status of all of the members or how long should we wait to know if a notification has already fired. In Cortex, we'd like to implement a slightly different way of clustering (using gRPC for communication and a hash ring for node discovery). This is a small change to support that by changing the consumer of other packages to an interface. Silences and Notification channels don't need an interface as they take a `func([]byte) error` as a parameter. Signed-off-by: gotjosh <josue@grafana.com>	2021-02-19 19:07:41 +00:00
Simon Pasquier	adcf283d4c	api: add missing metrics for API v2 Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-05-24 14:48:45 +02:00
stuart nelson	2fa210d0e3	add groups endpoint to v2 api Signed-off-by: stuart nelson <stuartnelson3@gmail.com>	2019-04-17 11:32:21 +02:00
beorn7	f3d9c89bbc	Create a `Muter` implementation for silences This encapsulates the logic of querying and marking silenced alerts. It removes the code duplication flagged earlier. I removed the error returned by the setAlertStatus function as we were only logging it, and that's already done anyway when the error is received from the `silence.Query` call (now in the `Mutes` method). Signed-off-by: beorn7 <beorn@soundcloud.com>	2019-02-26 16:42:59 +01:00
Max Leonard Inden	d0cd5a0f08	*: Introduce config coordinator bundling config specific logic Instead of handling all config specific logic inside Alertmanager.main(), this patch introduces the config coordinator component. Tasks of the config coordinator: - Load and parse configuration - Notify subscribers on configuration changes - Register and manage configuration specific metrics Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2019-02-25 11:26:30 +01:00
stuart nelson	51eebbef85	Stn/correctly mark api silences (#1733) * Update alert status on every GET to alerts Signed-off-by: stuart nelson <stuartnelson3@gmail.com>	2019-02-18 17:06:51 +01:00
beorn7	21de9ff88c	Various improvements after code review Most importantly, `api.New` now takes an `Options` struct as an argument, which allows some other things done here as well: - Timeout and concurrency limit are now in the options, streamlining the registration and the implementation of the limiting middleware. - A local registry is used for metrics, and the metrics used so far inside any of the api packages are using it now. The 'in flight' metric now contains the 'get' as a method label. I have also added a TODO to instrument other methods in the same way (otherwise, the label doesn't really make sense, semantically). I have also added an explicit error counter for requests rejected because of the concurrency limit. (They also show up as 503s in the generic HTTP instrumentation (or they would, if v2 were instrumented, too), but those 503s might have a number of reasons, while users might want to alert on concurrency limit problems explicitly). Signed-off-by: beorn7 <beorn@soundcloud.com>	2019-02-12 18:42:08 +01:00
beorn7	3382a0e949	Add HTTP instrumentation for GET requests in flight While the newly added in-flight instrumentation works for all GET requests, the existing HTTP instrumentation omits api/v2 calls. This commit adds a TODO note about that. Signed-off-by: beorn7 <beorn@soundcloud.com>	2019-02-11 19:34:06 +01:00
beorn7	fc4b67ce80	Introduce a timeout and concurrency limit for HTTP requests The default concurrency limit is max(GOMAXPROCS, 8). That should not imply that each GET request eats a whole CPU. It's more to get some reasonable heuristics for the processing power of the hosting machine (while allowing at least 8 concurrent requests even on the smallest machines). As GET requests can easily overload the Alertmanager, rendering it incapable of doing its main task, namely sending alert notifications, we need to limit GET requests by default. In contrast, no timeout is set by default. The http.TimeoutHandler invokes quite a bit of machinery behind the scenes, in particular an additional layer of buffering. Thus, we should first get a bit of experience with it before we consider enforcing a timeout by default, even if setting a timeout is in general the safer setting for resiliency. Signed-off-by: beorn7 <beorn@soundcloud.com>	2019-02-11 19:34:06 +01:00
Max Leonard Inden	c57542127d	api: Combine v1 and v2 into generic api Instead of cmd/alertmanager/main.go instantiating and starting both api v1 and v2, delegate that work to a generic api combining the two. Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2019-02-04 14:31:33 +01:00
Max Leonard Inden	f1b920bcc9	api: Implement OpenAPI generated Alertmanager API V2 The current Alertmanager API v1 is undocumented and written by hand. This patch introduces a new Alertmanager API - v2. The API is fully generated via an OpenAPI 2.0 [1] specification (see `api/v2/openapi.yaml`) with the exception of the http handlers itself. Pros: - Generated server code - Ability to generate clients in all major languages (Go, Java, JS, Python, Ruby, Haskell, elm [3] ...) - Strict contract (OpenAPI spec) between server and clients. - Instant feedback on frontend-breaking changes, due to strictly typed frontend language elm. - Generated documentation (See Alertmanager online Swagger UI [4]) Cons: - Dependency on open api ecosystem including go-swagger [2] In addition this patch includes the following changes. - README.md: Add API section - test: Duplicate acceptance test to API v1 & API v2 version The Alertmanager acceptance test framework has a decent test coverage on the Alertmanager API. Introducing the Alertmanager API v2 does not go hand in hand with deprecating API v1. They should live alongside each other for a couple of minor Alertmanager versions. Instead of porting the acceptance test framework to use the new API v2, this patch duplicates the acceptance tests, one using the API v1, the other API v2. Once API v1 is removed we can simply remove `test/with_api_v1` and bring `test/with_api_v2` to `test/`. [1] https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md [2] https://github.com/go-swagger/go-swagger/ [3] https://github.com/ahultgren/swagger-elm [4] http://petstore.swagger.io/?url=https://raw.githubusercontent.com/mxinden/alertmanager/apiv2/api/v2/openapi.yaml Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2018-09-04 13:38:34 +02:00
Max Inden	b1a8fdd169	Merge pull request #1521 from mxinden/errcheck *.go: Introduce errcheck enforcing error handling	2018-08-30 17:53:49 +02:00
Max Leonard Inden	1219541184	*.go: Introduce errcheck enforcing error handling Errcheck [1] enforces error handling across all go files. Functions can be excluded via `scripts/errcheck_excludes.txt`. This patch adds errcheck to the `test` Make target. [1] https://github.com/kisielk/errcheck Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2018-08-30 15:47:13 +02:00
Simon Pasquier	899226f3ac	*: remove v1/alerts/groups API endpoint (#1525) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-08-23 16:03:49 +02:00
comicmuse	ec263489e9	Add cache control headers to the API responses to avoid IE caching th… (#1500) Add cache control headers to the API responses to avoid IE caching the response.	2018-08-06 18:51:54 +02:00
Julius Volz	6d0edbe630	Fix a bunch of unhandled errors (#1501) ...as discovered by "gosec" (many other ones reported, but not all make a lot of sense to fix). Signed-off-by: Julius Volz <julius.volz@gmail.com>	2018-08-05 15:38:25 +02:00
Simon Pasquier	75900ea62a	api: remove dead code (#1367) This is a follow-up of `f825d97de4`. Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-05-07 18:11:36 +02:00
Simon Pasquier	383024e63d	api: support more query filters (#1366) * api: support more query filters This change adds 2 new query filters to the /api/v1/alerts endpoint. - active, filter out active alerts when set to 'false' (default: 'true'). - unprocessed, filter out unprocessed alerts when set to 'false' (default: 'true'). The default values ensure that the API behavior remains the same as before when the query filters aren't provided. Signed-off-by: Simon Pasquier <spasquie@redhat.com> * api: address comments Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-05-07 18:07:19 +02:00
Max Leonard Inden	f825d97de4	api: Deprecate `api/alerts` endpoint With prometheus/prometheus commit e114ce0ff7a1ae06b24fdc479ffc7422074c1ebe [1] Prometheus switches from using `api/alerts` to `api/v1/alerts`. This commit is included starting from Prometheus v0.17.0. As discussed on the prometheus-developers mailing list [2] the deprecation period is long over. [1] github.com/prometheus/prometheus/commit/e114ce0ff7a1ae06b24fdc479ffc7422074c1ebe [2] https://groups.google.com/d/msg/prometheus-developers/2CCuFTMbmAg/Qg58rvyzAQAJ Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2018-05-04 09:59:14 +02:00
Simon Pasquier	f53b24765d	api: initialize alerts_received_total labels (#1310)	2018-04-04 10:38:17 +02:00
Simon Pasquier	b95b32821f	ui: replace deprecated InstrumentHandler() (#1302) This change replaces the deprecated InstrumentHandler function by the equivalent functions from the promhttp package. The following metrics are removed: * http_request_duration_microseconds (Summary). * http_request_size_bytes (Summary). * http_requests_total (Counter). And the following metrics are added instead: * alertmanager_http_request_duration_seconds (Histogram). * alertmanager_http_response_size_bytes (Histogram). * promhttp_metric_handler_requests_in_flight (Gauge). * promhttp_metric_handler_requests_total (Counter).	2018-03-28 15:28:38 +02:00
Stuart Nelson	319687ab3c	Re-simplify match filters fn	2018-03-20 16:11:01 +01:00
Stuart Nelson	0c026b4387	Remove empty alert labels on ingest The same behavior exists in prometheus. This is a bit superfluous, but in the event people are using old versions of prometheus or a different metric gathering system, it's still valid to check.	2018-03-20 12:06:34 +01:00
Stuart Nelson	4c98f4b4a9	Fix matchLabels logic	2018-03-20 11:47:53 +01:00
Stuart Nelson	f5df55666b	Filter empty matchers correctly	2018-03-20 10:08:58 +01:00
Corentin Chary	dd75201f1c	Add /-/ready based on mesh status (#1209) * Wait for the gossip to settle before sending notifications See #1209 for details. As a heuristic for mesh readiness, try to see if the mesh looks stable (the number of peers isn't changing too much). This implementation always marks the Alertmanager as ready after a maximum of 60s. This adds one new flag to control this behavior: ``` --cluster.settle-timeout=60s mesh settling timeout. Do not wait more than this duration on startup. ``` It also adds `/-/ready` which always returns 200 (in order to make it clear that we are ready as soon as we can receive requests). The mesh status is exposed in `/api/v1/status` and visible on `/#/status`. * cluster: fix typos and base interval on gossipInterval	2018-03-02 15:45:21 +01:00
pasquier-s	e8a92f65ef	Run staticcheck as part of the build process (#1264) This change also fixes potential issues highlighted by running staticcheck.	2018-02-28 17:42:32 +01:00
pasquier-s	29e441f88f	Fix miscellaneous issues revealed by Go 1.10 (#1256) * provider/mem: fix format verbs in tests * api: fix format verb	2018-02-22 14:57:45 +00:00
pasquier-s	382a0d8089	api: support zero StartsAt for alerts (#1238) When the API receives alerts where StartsAt is zero, it updates the value to EndsAt (if not zero itself) or "now". This ensures that the alert validation will not fail since StartsAt has to be less than or equal to EndsAt.	2018-02-13 16:26:34 +01:00
Stuart Nelson	a552afd998	Merge branch 'master' into memberlist	2018-02-13 10:47:17 +01:00
Fabian Reinartz	3f2e00fbea	cluster/api: improve metrics and cluster status	2018-02-09 11:16:00 +01:00
Fabian Reinartz	fd49dbb477	*: move to memberlist for clustering	2018-02-08 12:18:44 +01:00
conorbroderick	e8832619e0	Fixes AM wrongly counting alerts with EndTimes in the future as resolved	2018-02-07 15:52:26 +00:00
Jose Donizetti	fc9306cd7e	Add expired silence validation (#1096) * Add expired silence validation * Add silence end time in the past validation	2018-01-21 15:29:51 +01:00
pasquier-s	364979bbf8	Display connections in the Status page (#1164) This change shows the status of the local connections in the web UI. It can be used to troubleshoot mesh issues.	2017-12-22 11:39:27 +01:00
Fabian Reinartz	405dbb8d9c	Fix wrong lock	2017-12-21 16:55:55 +01:00
stuart nelson	1abe4c9a56	Lock around variables used in Update() Found two places where struct members being updated in api.Update() were being accessed elsewhere without locks.	2017-12-21 12:08:39 +01:00
Jose Donizetti	10ed60361d	Fix silences negative filtering (#1095) * Fix silence negative filtering * Refactor extract filtering labels func	2017-11-15 14:29:06 -05:00
Jose Donizetti	95e80d1aa8	Add tests to receiver filtering (#1098)	2017-11-12 11:35:49 -05:00
Julius Volz	fdee5fcbfc	Fix UI when no silences are present (#1090) * Explicitly initialize silences list to avoid "null" JSON * Wrap "No silences found" message in error box * bindata fixup	2017-11-11 14:48:48 +01:00
Jose Donizetti	74808e40f3	Refactor silence constants (#1076) * Refactor remove dups silence state constants * Refactor to use const instead of string	2017-11-07 11:36:30 +01:00
Jose Donizetti	b9597f5c7b	Fix negative matchers filtering (#1077)	2017-11-04 14:38:16 +01:00

76 Commits