alertmanager

Commit Graph

Author	SHA1	Message	Date
janhorstmann	bd70e73fc7	Update mixin dashboard (#4078) Update and rewrite the mixin dashboard to use the grafonnet ([1]) library. Grafana has deprecated angular plugins ([2]) as used by grafonnet-lib ([3]) with removal pending for grafana version 12. Additionally grafonnet-lib is deprecated/unmaintained in favor of grafonnet. Therefore the mixin dashboard has been updated to use grafonnet. [1] https://github.com/grafana/grafonnet [2] https://grafana.com/docs/grafana/latest/developers/angular_deprecation/ [3] https://github.com/grafana/grafonnet-lib Signed-off-by: Jan Horstmann <horstmann@osism.tech>	2024-10-29 09:59:51 +00:00
chengzw	aff09c28dc	fix label mismatch for alertmanager_notifications_failed_total Signed-off-by: chengzw <chengzw258@163.com>	2023-11-11 22:26:38 +08:00
gotjosh	07b89eb117	Mixin: Pin the mixtool version in CircleCI In mixtool, the tip of master broke for our mixin - I have managed to trace it down and opened a PR (see https://github.com/grafana/dashboard-linter/pull/143) but for now, let's pin the version to make sure our CI is not affected. Signed-off-by: gotjosh <josue.abreu@gmail.com>	2023-08-03 16:15:26 +01:00
gotjosh	c494009f61	Mixin: Fix mixin linting In accordance with a new rule introduced as part of https://github.com/grafana/dashboard-linter/pull/79 this is now required. However, for the new rule of `panel-unit-rule` we don't reap any benefits from specifying a particular unit for our panels, the defaults work perfectly fine so they're ignored. Signed-off-by: gotjosh <josue.abreu@gmail.com>	2022-06-29 16:25:15 +01:00
gotjosh	4d09995c26	Mixin: `template-job-rule` now only validates job and not both instance and job (#2944) With https://github.com/grafana/dashboard-linter/pull/49 `template-job-rule` no longer validates both `instance` and `job` labels. Add the new rule of `template-instance-rule` to the exclusions to preserve the previous behaviour. Signed-off-by: gotjosh <josue.abreu@gmail.com>	2022-06-15 22:27:11 +02:00
gotjosh	35bf59f182	Mixin: Rename exclusion rule from `panel-job-instance-rule` to `target-instance-rule` Within `9a32e58ed0`, the rules have been split into two different rules: `target-job-rule` `target-instance-rule` All of our queries do contain the `job` label but as per the reason, we don't need both in this particular case. Fixes #2899 Signed-off-by: gotjosh <josue.abreu@gmail.com>	2022-05-02 13:08:22 +01:00
clyang82	5414963190	fix lint error Signed-off-by: clyang82 <chuyang@redhat.com>	2021-12-10 08:20:23 +00:00
fpetkovski	b408b522bc	Improve the AlertmanagerMembersInconsistent alert The expression alertmanager_cluster_members{job="alertmanager"}[5m]) is assumed to return one series for each alertmanager instance in the cluster. When running inside Kubernetes, alertmanager pods can get evicted and rescheduled. This can change the instance label and produce a new series for that alertmanager instance. When the same pod gets evicted several times in a row, there will be a short interval in which Prometheus will return values from both the new series and the old series. As a result, counting the number of series for the alertmanager_cluster_members metric will overestimate the number of instances in the given cluster. This commit modifies the AlertmanagerMembersInconsistent alert to increase the for clause to 15m in order to reduce the probability of a false positive. Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>	2021-06-22 08:21:02 +02:00
Arthur Silva Sens	8598683b24	[mixins] Alertmanager Overview dashboard (#2540) * Implements a Grafana dashboard to the mixin. The dashboard aims to show an overview of the overall health of Alertmanager. Signed-off-by: ArthurSens <arthursens2005@gmail.com>	2021-06-07 19:54:22 +02:00
SuperQ	99f64e944b	Update build * Drop /vendor. * Update Go to 1.16. * Update djfarrelly/maildev to 1.1.0. * Update protoc to 3.15.8. * Update mixin test for Go 1.16. * Bump Go modules. Signed-off-by: SuperQ <superq@gmail.com>	2021-04-22 13:11:44 +02:00
Björn Rabenstein	ce108378d4	Fix and improve AlertmanagerClusterFailedToSendAlerts (#2437) The alert was just looking at the minimum across integrations. So a complete failure of one integration would be masked by a still working other integration. With this fix, the `integration` label is retained (as it was already expected by the `description`), and thus any failing integration will trigger the alert. In addition, an `alertmanagerCriticalIntegrationsRegEx` is provided that allows to mark integrations as critical. Integrations that are not used to deliver critical alerts, or those that are just there for auditing and logging purposes can now be configured to only trigger a warning alert if they fail. Signed-off-by: beorn7 <beorn@grafana.com>	2020-12-23 15:15:38 +01:00
Tom Wilkie	6c5dee008f	Beginnings of an Alertmanager mixin. (#1629) Add an Alertmanager mixin Signed-off-by: beorn7 <beorn@grafana.com> Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com> Co-authored-by: beorn7 <beorn@grafana.com> Co-authored-by: Simon Pasquier <spasquie@redhat.com>	2020-12-03 15:57:42 +01:00

12 Commits