alertmanager

mirror of https://github.com/prometheus/alertmanager synced 2024-12-26 16:12:20 +00:00

Author	SHA1	Message	Date
Simon Pasquier	57c4ff10ab	api/v2: serve OpenAPI specification (#1751) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-02-20 15:36:19 +01:00
stuart nelson	51eebbef85	Stn/correctly mark api silences (#1733) * Update alert status on every GET to alerts Signed-off-by: stuart nelson <stuartnelson3@gmail.com>	2019-02-18 17:06:51 +01:00
beorn7	21de9ff88c	Various improvements after code review Most importantly, `api.New` now takes an `Options` struct as an argument, which allows some other things done here as well: - Timeout and concurrency limit are now in the options, streamlining the registration and the implementation of the limiting middleware. - A local registry is used for metrics, and the metrics used so far inside any of the api packages are using it now. The 'in flight' metric now contains the 'get' as a method label. I have also added a TODO to instrument other methods in the same way (otherwise, the label doesn't really make sense, semantically). I have also added an explicit error counter for requests rejected because of the concurrency limit. (They also show up as 503s in the generic HTTP instrumentation (or they would, if v2 were instrumented, too), but those 503s might have a number of reasons, while users might want to alert on concurrency limit problems explicitly). Signed-off-by: beorn7 <beorn@soundcloud.com>	2019-02-12 18:42:08 +01:00
beorn7	3382a0e949	Add HTTP instrumentation for GET requests in flight While the newly added in-flight instrumentation works for all GET requests, the existing HTTP instrumentation omits api/v2 calls. This commit adds a TODO note about that. Signed-off-by: beorn7 <beorn@soundcloud.com>	2019-02-11 19:34:06 +01:00
beorn7	4747fd9b2f	Propagate timeout to alert listing via context The context is created by the http.TimeoutHandler we use to set the timeout. I believe this is the only endpoint where propagating the timeout is feasible and needed. Signed-off-by: beorn7 <beorn@soundcloud.com>	2019-02-11 19:34:06 +01:00
beorn7	fc4b67ce80	Introduce a timeout and concurrency limit for HTTP requests The default concurrency limit is max(GOMAXPROCS, 8). That should not imply that each GET request eats a whole CPU. It's more to get some reasonable heuristics for the processing power of the hosting machine (while allowing at least 8 concurrent requests even on the smallest machines). As GET requests can easily overload the Alertmanager, rendering it incapable of doing its main task, namely sending alert notifications, we need to limit GET requests by default. In contrast, no timeout is set by default. The http.TimeoutHandler invokes quite a bit of machinery behind the scenes, in particular an additional layer of buffering. Thus, we should first get a bit of experience with it before we consider enforcing a timeout by default, even if setting a timeout is in general the safer setting for resiliency. Signed-off-by: beorn7 <beorn@soundcloud.com>	2019-02-11 19:34:06 +01:00
Max Leonard Inden	c57542127d	api: Combine v1 and v2 into generic api Instead of cmd/alertmanager/main.go instantiating and starting both api v1 and v2, delegate that work to a generic api combining the two. Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2019-02-04 14:31:33 +01:00
Max Leonard Inden	8e157b3af5	api/v2: Make cluster status peers and name optional If a user chooses to disable the Alertmanager cluster feature, there is no cluster name nor cluster peers. Hence these should be optional. Only cluster status is set to "disabled". Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2019-02-04 11:40:30 +01:00
Max Leonard Inden	2f055d9966	api/v2: Do not populate cluster info if clustering is disabled When users start Alertmanager with `--cluster.listen-address=`, the cluster will not be initialized, hence api.peer will be `nil`. So far this would result in a nil pointer dereference by the API v2 accessing the api.peer field. With this patch, api v2 skips populating the peers array, sets the name to an empty string and the status to "disabled" in case `api.peer` is nil. Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2019-01-31 16:56:59 +01:00
Max Leonard Inden	7aa8ea9d9d	api/v2: Disable serving swagger spec and redoc UI By default go-swagger serves the swagger spec and the redoc UI. This patch disables both. Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2019-01-17 16:19:29 +01:00
Simon Pasquier	b676fa79c0	*: update Makefile.common with new staticcheck (#1692) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-01-04 15:37:33 +01:00
Simon Pasquier	9a116736ef	api/v2: Add CORS support (#1667) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-12-16 14:05:34 +01:00
Max Leonard Inden	2b697aaa6b	api/v2: Extract shared properties of gettable and postable alert With issue 1465 on openapi-generator [1] being fixed, we can now extract shared properties of the gettable and postable alert definition into a shared object (`alert`) like we do for silence, gettable silence and postable silence. In addition this patch does the following changes to the UI: - Use `List GettableAlert` instead of plural type definition like `GettableAlerts` because the plural definitions are not generated. - Fix openapi-generator-cli docker image to specific hash. [1] https://github.com/OpenAPITools/openapi-generator/issues/1465 Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2018-11-28 14:35:39 +01:00
Max Leonard Inden	f504f953c1	ui: Move /alerts to API v2 Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2018-11-23 12:53:48 +01:00
Max Leonard Inden	b4b8b750df	api/v2/openapi.yaml: Differentiate between post and get silence Instead of having one general silence, differentiate between postable and gettable silence, hence making more fields required. Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2018-11-15 16:21:07 +01:00
Max Leonard Inden	e4e053b18e	ui: Move /status & /silences to API v2 This patch makes the Alertmanager UI (/status & /silences) use the api/v2 endpoint. In addition it adds logic to generate the elm side data model based on the OpenAPI specification. Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2018-11-15 13:24:26 +01:00
Max Leonard Inden	f1b920bcc9	api: Implement OpenAPI generated Alertmanager API V2 The current Alertmanager API v1 is undocumented and written by hand. This patch introduces a new Alertmanager API - v2. The API is fully generated via an OpenAPI 2.0 [1] specification (see `api/v2/openapi.yaml`) with the exception of the http handlers themselves. Pros: - Generated server code - Ability to generate clients in all major languages (Go, Java, JS, Python, Ruby, Haskell, elm [3] ...) - Strict contract (OpenAPI spec) between server and clients. - Instant feedback on frontend-breaking changes, due to strictly typed frontend language elm. - Generated documentation (See Alertmanager online Swagger UI [4]) Cons: - Dependency on open api ecosystem including go-swagger [2] In addition this patch includes the following changes. - README.md: Add API section - test: Duplicate acceptance test to API v1 & API v2 version The Alertmanager acceptance test framework has a decent test coverage on the Alertmanager API. Introducing the Alertmanager API v2 does not go hand in hand with deprecating API v1. They should live alongside each other for a couple of minor Alertmanager versions. Instead of porting the acceptance test framework to use the new API v2, this patch duplicates the acceptance tests, one using the API v1, the other API v2. Once API v1 is removed we can simply remove `test/with_api_v1` and bring `test/with_api_v2` to `test/`. [1] https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md [2] https://github.com/go-swagger/go-swagger/ [3] https://github.com/ahultgren/swagger-elm [4] http://petstore.swagger.io/?url=https://raw.githubusercontent.com/mxinden/alertmanager/apiv2/api/v2/openapi.yaml Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2018-09-04 13:38:34 +02:00
Max Inden	b1a8fdd169	Merge pull request #1521 from mxinden/errcheck *.go: Introduce errcheck enforcing error handling	2018-08-30 17:53:49 +02:00
Max Leonard Inden	1219541184	*.go: Introduce errcheck enforcing error handling Errcheck [1] enforces error handling across all go files. Functions can be excluded via `scripts/errcheck_excludes.txt`. This patch adds errcheck to the `test` Make target. [1] https://github.com/kisielk/errcheck Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2018-08-30 15:47:13 +02:00
Simon Pasquier	899226f3ac	*: remove v1/alerts/groups API endpoint (#1525) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-08-23 16:03:49 +02:00
comicmuse	ec263489e9	Add cache control headers to the API responses to avoid IE caching th… (#1500) Add cache control headers to the API responses to avoid IE caching the response.	2018-08-06 18:51:54 +02:00
Julius Volz	6d0edbe630	Fix a bunch of unhandled errors (#1501) ...as discovered by "gosec" (many other ones reported, but not all make a lot of sense to fix). Signed-off-by: Julius Volz <julius.volz@gmail.com>	2018-08-05 15:38:25 +02:00
Simon Pasquier	0ebaeccd4b	*: add missing license headers Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-05-14 17:37:13 +02:00
Simon Pasquier	75900ea62a	api: remove dead code (#1367) This is a follow-up of `f825d97de4`. Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-05-07 18:11:36 +02:00
Simon Pasquier	383024e63d	api: support more query filters (#1366) * api: support more query filters This change adds 2 new query filters to the /api/v1/alerts endpoint. - active, filter out active alerts when set to 'false' (default: 'true'). - unprocessed, filter out unprocessed alerts when set to 'false' (default: 'true'). The default values ensure that the API behavior remains the same as before when the query filters aren't provided. Signed-off-by: Simon Pasquier <spasquie@redhat.com> * api: address comments Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-05-07 18:07:19 +02:00
Max Leonard Inden	f825d97de4	api: Deprecate `api/alerts` endpoint With prometheus/prometheus commit e114ce0ff7a1ae06b24fdc479ffc7422074c1ebe [1] Prometheus switches from using `api/alerts` to `api/v1/alerts`. This commit is included starting from Prometheus v0.17.0. As discussed on the prometheus-developers mailing list [2] the deprecation period is long over. [1] github.com/prometheus/prometheus/commit/e114ce0ff7a1ae06b24fdc479ffc7422074c1ebe [2] https://groups.google.com/d/msg/prometheus-developers/2CCuFTMbmAg/Qg58rvyzAQAJ Signed-off-by: Max Leonard Inden <IndenML@gmail.com>	2018-05-04 09:59:14 +02:00
Simon Pasquier	f53b24765d	api: initialize alerts_received_total labels (#1310)	2018-04-04 10:38:17 +02:00
Simon Pasquier	b95b32821f	ui: replace deprecated InstrumentHandler() (#1302) This change replaces the deprecated InstrumentHandler function by the equivalent functions from the promhttp package. The following metrics are removed: * http_request_duration_microseconds (Summary). * http_request_size_bytes (Summary). * http_requests_total (Counter). And the following metrics are added instead: * alertmanager_http_request_duration_seconds (Histogram). * alertmanager_http_response_size_bytes (Histogram). * promhttp_metric_handler_requests_in_flight (Gauge). * promhttp_metric_handler_requests_total (Counter).	2018-03-28 15:28:38 +02:00
Stuart Nelson	319687ab3c	Re-simplify match filters fn	2018-03-20 16:11:01 +01:00
Stuart Nelson	0c026b4387	Remove empty alert labels on ingest The same behavior exists in prometheus. This is a bit superfluous, but in the event people are using old versions of prometheus or a different metric gathering system, it's still valid to check.	2018-03-20 12:06:34 +01:00
Stuart Nelson	4c98f4b4a9	Fix matchLabels logic	2018-03-20 11:47:53 +01:00
Stuart Nelson	f5df55666b	Filter empty matchers correctly	2018-03-20 10:08:58 +01:00
Corentin Chary	dd75201f1c	Add /-/ready based on mesh status (#1209) * Wait for the gossip to settle before sending notifications See #1209 for details. As a heuristic for mesh readiness, try to see if the mesh looks stable (the number of peers isn't changing too much). This implementation always marks the Alertmanager as ready after a maximum of 60s. This adds one new flag to control this behavior: ``` --cluster.settle-timeout=60s mesh settling timeout. Do not wait more than this duration on startup. ``` It also adds `/-/ready` which always returns 200 (in order to make it clear that we are ready as soon as we can receive requests). The mesh status is exposed in `/api/v1/status` and visible on `/#/status`. * cluster: fix typos and base interval on gossipInterval	2018-03-02 15:45:21 +01:00
pasquier-s	e8a92f65ef	Run staticcheck as part of the build process (#1264) This change also fixes potential issues highlighted by running staticcheck.	2018-02-28 17:42:32 +01:00
pasquier-s	29e441f88f	Fix miscellaneous issues revealed by Go 1.10 (#1256) * provider/mem: fix format verbs in tests * api: fix format verb	2018-02-22 14:57:45 +00:00
pasquier-s	382a0d8089	api: support zero StartsAt for alerts (#1238) When the API receives alerts where StartsAt is zero, it updates the value to EndsAt (if not zero itself) or "now". This ensures that the alert validation will not fail since StartsAt has to be less than or equal to EndsAt.	2018-02-13 16:26:34 +01:00
Stuart Nelson	a552afd998	Merge branch 'master' into memberlist	2018-02-13 10:47:17 +01:00
Fabian Reinartz	3f2e00fbea	cluster/api: improve metrics and cluster status	2018-02-09 11:16:00 +01:00
Fabian Reinartz	fd49dbb477	*: move to memberlist for clustering	2018-02-08 12:18:44 +01:00
conorbroderick	e8832619e0	Fixes AM wrongly counting alerts with EndTimes in the future as resolved	2018-02-07 15:52:26 +00:00
Jose Donizetti	fc9306cd7e	Add expired silence validation (#1096) * Add expired silence validation * Add silence end time in the past validation	2018-01-21 15:29:51 +01:00
pasquier-s	364979bbf8	Display connections in the Status page (#1164) This change shows the status of the local connections in the web UI. It can be used to troubleshoot mesh issues.	2017-12-22 11:39:27 +01:00
Fabian Reinartz	405dbb8d9c	Fix wrong lock	2017-12-21 16:55:55 +01:00
stuart nelson	1abe4c9a56	Lock around variables used in Update() Found two places where struct members being updated in api.Update() were being accessed elsewhere without locks.	2017-12-21 12:08:39 +01:00
Jose Donizetti	10ed60361d	Fix silences negative filtering (#1095) * Fix silence negative filtering * Refactor extract filtering labels func	2017-11-15 14:29:06 -05:00
Jose Donizetti	e303646b80	Fix typo (#1103)	2017-11-15 14:25:46 -05:00
Jose Donizetti	95e80d1aa8	Add tests to receiver filtering (#1098)	2017-11-12 11:35:49 -05:00
Julius Volz	fdee5fcbfc	Fix UI when no silences are present (#1090) * Explicitly initialize silences list to avoid "null" JSON * Wrap "No silences found" message in error box * bindata fixup	2017-11-11 14:48:48 +01:00
Jose Donizetti	74808e40f3	Refactor silence constants (#1076) * Refactor remove dups silence state constants * Refactor to use const instead of string	2017-11-07 11:36:30 +01:00
Jose Donizetti	b9597f5c7b	Fix negative matchers filtering (#1077)	2017-11-04 14:38:16 +01:00

76 Commits