86 Commits

Author SHA1 Message Date
George Robinson f69a508665
Remove metrics from compat package (#3714)
This commit removes the metrics from the compat package
in favour of the existing logging and the additional tools
at hand, such as amtool, to validate Alertmanager configurations.

Due to the global nature of the compat package, a consequence
of config.Load, these metrics have proven to be less useful
in practice than expected, both in Alertmanager and other projects
such as Mimir.

There are a number of reasons for this:

1. Because the compat package is global, these metrics cannot be
   reset each time config.Load is called: in multi-tenant projects
   like Mimir, loading a config for one tenant would reset the
   metrics for all tenants. This is also the reason the metrics
   are counters and not gauges.

2. Since the metrics are counters, it is difficult to create
   meaningful dashboards for Alertmanager as, unlike in Mimir,
   configurations are not reloaded at fixed intervals, and as such,
   operators cannot use rate to track configuration changes
   over time.

In Alertmanager, there are much better tools available to validate
that an Alertmanager configuration is compatible with the UTF-8
parser, including both the existing logging from Alertmanager
server and amtool check-config.

In other projects like Mimir, we can track configurations for
individual tenants using log aggregation and storage systems
such as Loki. This gives operators far more information than
what is possible with the metrics, including the timestamp,
input and ID of tenant configurations that are incompatible
or have disagreement.

Signed-off-by: George Robinson <george.robinson@grafana.com>
2024-02-08 09:59:03 +00:00
George Robinson fa6a7e6dd6
Fix inconsistent defaults in UTF-8 behavior (#3668)
This commit fixes inconsistent UTF-8 behavior if the compat package is
not initialized and feature flags are not passed to the API. This can
happen when Alertmanager is used as a package in software such
as Cortex or Mimir.

The inconsistent behavior is that Alertmanager will accept UTF-8 alerts
but reject UTF-8 configurations.

Since feature flags are optional via api.Options, we cannot force them
to be passed to api.New at compile time. Instead, it's better to defer
back to the compat package which is consistent even when not initialized.

Signed-off-by: George Robinson <george.robinson@grafana.com>
2024-01-15 10:03:51 +00:00
Matthieu MOREL b81bad8711 use Go standard errors
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-12-08 16:44:13 +01:00
George Robinson 70bd5dad98
Support UTF-8 label matchers: Use compat package in Alertmanager server (#3567)
* Support UTF-8 label matchers: Use compat package in Alertmanager server

This pull request adds use of the compat package in Alertmanager server, which allows users to switch between the new matchers/parse parser and the old pkg/labels parser. The new matchers/parse parser uses a fallback mechanism: if the input cannot be parsed by the new parser, it then attempts to use the old parser. If an input is parsed by the old parser but not by the new parser, a warning log is emitted.

Signed-off-by: George Robinson <george.robinson@grafana.com>

---------

Signed-off-by: George Robinson <george.robinson@grafana.com>
2023-11-24 10:01:40 +00:00
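The fallback mechanism described in this commit can be sketched roughly as follows. This is an illustration only, not the code in the compat package: `parseFunc`, `withFallback`, `strictParser` and `lenientParser` are hypothetical names, the two stand-in parsers use toy rules rather than the real grammars, and the warning is passed to a callback instead of a real logger.

```go
package main

import (
	"fmt"
	"strings"
)

// parseFunc is a hypothetical parser signature. The real parsers return
// typed matchers; a plain string keeps the sketch short.
type parseFunc func(input string) (string, error)

// strictParser stands in for the new parser (toy rule: value must be quoted).
func strictParser(input string) (string, error) {
	if !strings.Contains(input, `="`) {
		return "", fmt.Errorf("strict parser: cannot parse %q", input)
	}
	return input, nil
}

// lenientParser stands in for the old parser (toy rule: input contains "=").
func lenientParser(input string) (string, error) {
	if !strings.Contains(input, "=") {
		return "", fmt.Errorf("lenient parser: cannot parse %q", input)
	}
	return input, nil
}

// withFallback tries newParser first. If that fails it tries oldParser, and
// if only the old parser accepts the input it reports a warning.
func withFallback(newParser, oldParser parseFunc, warn func(msg string)) parseFunc {
	return func(input string) (string, error) {
		m, newErr := newParser(input)
		if newErr == nil {
			return m, nil
		}
		m, oldErr := oldParser(input)
		if oldErr != nil {
			// Neither parser accepts the input.
			return "", fmt.Errorf("%w (fallback also failed: %v)", newErr, oldErr)
		}
		warn(fmt.Sprintf("input %q is only valid in the old parser: %v", input, newErr))
		return m, nil
	}
}
```

With these stand-ins, `foo="bar"` parses without a warning, `foo=bar` parses through the fallback and produces one warning, and `foo` is rejected by both parsers.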
gotjosh acb58400fd
Refactor: Move `inTimeIntervals` from `notify` to `timeinterval` (#3556)
* Refactor: Move `inTimeIntervals` from `notify` to `timeinterval`

There's absolutely no change of functionality here and I've expanded coverage for similar logic in both places.
---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-10-13 14:15:05 +01:00
gotjosh f66bbab421
Fix tests after rebase
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2022-06-17 13:20:21 +01:00
gotjosh cfb909f419
Marker: Rename `SetSilenced` to `SetActiveOrSilenced`
This accurately reflects what the function _actually_ does. If no active silence IDs are provided and the list of inhibitions is already empty, the alert is set to Active. It took me a while to realise this while working out how we populate the alert list.

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2022-06-17 12:51:23 +01:00
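The behaviour this rename describes can be sketched as below. The type and field names (`alertStatus`, `silencedBy`, `inhibitedBy`) are assumptions made for illustration and are not taken from the Marker implementation; only the rule itself comes from the commit message.

```go
package main

// alertState is a simplified, hypothetical alert state.
type alertState string

const (
	alertStateActive     alertState = "active"
	alertStateSuppressed alertState = "suppressed"
)

// alertStatus is a hypothetical per-alert status record.
type alertStatus struct {
	state       alertState
	silencedBy  []string
	inhibitedBy []string
}

// setActiveOrSilenced records the active silences for an alert. The alert
// becomes active only when there are no active silences and no existing
// inhibitions; otherwise it stays suppressed.
func (s *alertStatus) setActiveOrSilenced(activeSilenceIDs []string) {
	s.silencedBy = activeSilenceIDs
	if len(activeSilenceIDs) == 0 && len(s.inhibitedBy) == 0 {
		s.state = alertStateActive
		return
	}
	s.state = alertStateSuppressed
}
```

Calling the function with an empty list therefore does not always mean "active": an alert that is still inhibited remains suppressed.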
gotjosh 805e505288
Alert metric reports different results to what the user sees via API (#2943)
* Alert metric reports different results to what the user sees via API

Fixes #1439 and #2619.

The previous metric is not _technically_ reporting incorrect results, as the alerts _are_ still around and will be re-used if that same alert (equal fingerprint) is received before it is GCed. Therefore, I have kept the old metric under a new name, `alertmanager_marked_alerts`, and repurposed the current metric to match what the user sees in the UI.

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2022-06-16 12:16:06 +02:00
Matthias Loibl a6d10bd5bc
Update golangci-lint and fix complaints (#2853)
* Copy latest golangci-lint files from Prometheus

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* Use grafana/regexp over stdlib regexp

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* Fix typos in comments

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* Fix goimports complains in import sorting

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* gofumpt all Go files

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* Update naming to comply with revive linter

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* config: Fix error messages to be lower case

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* test/cli: Fix error messages to be lower case

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* .golangci.yaml: Remove obsolete space

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* config: Fix expected victorOps error

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* Use stdlib regexp

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* Clean up Go modules

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>
2022-03-25 17:59:51 +01:00
beorn7 e84c265196 Include pending silences for future muting decisions
Previously, if a pending silence existed for an alert, and it later
became active without any silences getting added in the meantime, we
would miss the existence of that newly active silence.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-05-27 22:15:57 +02:00
Kiril Vladimirov 91083d6cd9 types: Fix typo in Silence.Matchers' documentation
Signed-off-by: Kiril Vladimirov <kiril@vladimiroff.org>
2021-01-22 17:02:50 +02:00
Kiril Vladimirov 7320d83cbc Replace types.Matcher(s)? with labels.Matcher(s)?
Signed-off-by: Kiril Vladimirov <kiril@vladimiroff.org>
2021-01-22 17:02:48 +02:00
Atibhi Agrawal 6b36afbbec
Add negative matchers for routing. (#2434)
Add negative route matchers using label.Matcher

Signed-off-by: aSquare14 <atibhi.a@gmail.com>

Signed-off-by: beorn7 <beorn@grafana.com>

Co-authored-by: Björn Rabenstein <beorn@grafana.com>
2021-01-15 21:11:39 +01:00
Josh Soref 0f2c65d265 Spelling (#2167)
* spelling: inhibition

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: matchers

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: notification

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: nonexistent

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: obfuscated

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: occurred

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: relevant

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: unexpected

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: marshaled

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: marshaling

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-01-23 17:06:16 +01:00
johncming df1f1c8d74 types: remove redundant statements. (#2116)
Signed-off-by: johncming <johncming@yahoo.com>
2019-11-26 10:04:38 +01:00
johncming 52b4cecd56 types: remove unused equal method and add test case. (#2043)
Signed-off-by: johncming <johncming@yahoo.com>
2019-09-24 14:34:26 +02:00
Simon Pasquier 27e99e9e35 types: refactor *memMarker.Count method
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-05-24 13:36:01 +02:00
beorn7 3c981a92f7 Improve `Mutes` performance for silences
Add version tracking of silences states. Adding a silence to the state
increments the version. If the version hasn't changed since the last
time an alert was checked for being silenced, we only have to verify
that the relevant silences are still active rather than checking the
alert against all silences.

Signed-off-by: beorn7 <beorn@soundcloud.com>
2019-02-28 12:34:41 +01:00
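The version-tracking idea in this commit can be sketched as follows. This is a simplified model under stated assumptions, not the silence package itself: all type and function names are hypothetical, a silence matches an alert by a single toy label, pending silences are ignored, and `fullScans` exists only to make the fast path observable.

```go
package main

// silence is a toy silence that matches alerts carrying a given label.
type silence struct {
	id     string
	label  string
	active bool
}

// silences holds all silences plus a version that increments on every add.
type silences struct {
	version   int
	byID      map[string]*silence
	fullScans int // counts slow-path scans, for demonstration only
}

func newSilences() *silences {
	return &silences{byID: map[string]*silence{}}
}

func (s *silences) add(sil *silence) {
	s.byID[sil.id] = sil
	s.version++
}

// cacheEntry remembers which silences matched an alert at which version.
type cacheEntry struct {
	version int
	ids     []string
}

type silencer struct {
	store *silences
	cache map[string]cacheEntry
}

func newSilencer(store *silences) *silencer {
	return &silencer{store: store, cache: map[string]cacheEntry{}}
}

// mutes reports whether the alert is silenced. If the version is unchanged
// since the last check, only the previously matching silences are verified.
func (m *silencer) mutes(alertLabel string) bool {
	var ids []string
	entry, seen := m.cache[alertLabel]
	if seen && entry.version == m.store.version {
		// Fast path: check that the relevant silences are still active.
		for _, id := range entry.ids {
			if sil, ok := m.store.byID[id]; ok && sil.active {
				ids = append(ids, id)
			}
		}
	} else {
		// Slow path: check the alert against all silences.
		m.store.fullScans++
		for id, sil := range m.store.byID {
			if sil.active && sil.label == alertLabel {
				ids = append(ids, id)
			}
		}
	}
	m.cache[alertLabel] = cacheEntry{version: m.store.version, ids: ids}
	return len(ids) > 0
}
```

In this model a repeated check of the same alert touches only the silences that matched last time, and a full scan happens again only after a new silence has been added.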
beorn7 12671bd261 Improve doc comments for Marker and friends
This clarifies a bunch of things I have run into during code reading
in preparation for some performance improvements around muting.

It also moves doc comments from places where they don't show up in
godoc to visible places.

It also fixes golint warnings.

Signed-off-by: beorn7 <beorn@soundcloud.com>
2019-02-25 17:48:15 +01:00
Max Leonard Inden 09a7370572
main.go: Move marker metric registering into types/types.go
Instead of registering marker metrics inside of
cmd/alertmanager/main.go, register them in types/types.go, encapsulating
marker specific logic in its module, not in main.go. In addition it
paves the path for removing the usage of the global metric registry in
the future, by taking a local metric registerer.

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2019-02-05 14:59:22 +01:00
Simon Pasquier b676fa79c0 *: update Makefile.common with new staticcheck (#1692)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-01-04 15:37:33 +01:00
Simon Pasquier 008b4a93da
types: fix alert merging
Alert merging assumed that EndsAt would always be empty for firing
alerts. This is no longer true starting with Prometheus v2.4.0: EndsAt
is set to a multiple of the evaluation interval or resend interval
(whichever is the largest). This change updates the merging logic to
support both cases.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-11-09 16:48:46 +00:00
Simon Pasquier 6a7c912559 Sort alerts in correct order (#1349)
* Sort dispatched alerts by job+instance in the correct order (#1178)

Signed-off-by: Ted Zlatanov <tzz@lifelogs.com>

* dispatch: add unit test for alerts sorting

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-06-14 15:54:33 +02:00
Simon Pasquier 0ebaeccd4b *: add missing license headers
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-05-14 17:37:13 +02:00
pasquier-s e8a92f65ef Run staticcheck as part of the build process (#1264)
This change also fixes potential issues highlighted by running
staticcheck.
2018-02-28 17:42:32 +01:00
Jose Donizetti fc9306cd7e Add expired silence validation (#1096)
* Add expired silence validation

* Add silence end time in the past validation
2018-01-21 15:29:51 +01:00
Jose Donizetti 20598bfd71 Remove old silence code (#1080) 2017-11-11 15:41:17 +01:00
Jose Donizetti cf85bd84f2 Add CalcSilenceState Test (#1085) 2017-11-11 15:13:12 +01:00
Jose Donizetti bc9b34d3db Add test to matchers (#1079)
* Add Test to Matcher and Matchers

* Move matchers string test to match_test.go
2017-11-07 11:39:22 +01:00
Julius Volz 9b72c10134 Minor code cleanups 2017-11-01 23:08:34 +01:00
Frederic Branczyk 0ef6695055
*: Remove .WasInhibited and .WasSilenced fields of Alert type 2017-10-10 15:50:15 +02:00
Corentin Chary bff889b490 silence|alerts: add metrics about current silences and alerts
This adds metrics that look like this:
```
alertmanager_alerts{state="active"} 6
alertmanager_alerts{state="suppressed"} 0
alertmanager_silences{state="active"} 1
alertmanager_silences{state="expired"} 1
alertmanager_silences{state="pending"} 0
```

This can be used to monitor alertmanager's usage and validate that
alertmanagers in a mesh have a similar number of silences and alerts.
2017-10-02 13:33:29 +02:00
Corentin Chary 9b2afbf18b Make sure Matchers are always ordered
This fixes https://github.com/prometheus/alertmanager/issues/881
Also add some unit tests
2017-06-23 15:30:34 +02:00
Max Leonard Inden 401e440db4
Return silence state on /silences
silenceState = "expired" | "active" | "pending"
```
"status": {
  "state": "expired"
}
```
2017-05-10 12:01:21 +02:00
Max Inden d463f1c298 Sync ui-rewrite with master (#779) 2017-05-10 11:49:02 +02:00
stuart nelson 6a909abf17 Add processing status field to alert 2017-04-27 14:18:52 +02:00
Fabian Reinartz 3269bc39e1 *: switch group key to matcher serialization
Turn the GroupKey into a string that is composed of the matchers of the
path in the routing tree and the grouping labels.
Only hash it at the very end to ensure we don't exceed size limits of
integration APIs.
2017-04-21 12:06:23 +02:00
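The approach in this commit can be illustrated with a small sketch: build a readable key from the route path and the grouping labels, and hash it only when a bounded-size value is needed. The key layout, the function names and the use of SHA-256 are assumptions for illustration and are not taken from the commit.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"sort"
	"strconv"
	"strings"
)

// groupKey joins the matcher strings of each route on the path with the
// sorted grouping labels. Sorting keeps the key deterministic.
func groupKey(routePath []string, groupLabels map[string]string) string {
	names := make([]string, 0, len(groupLabels))
	for name := range groupLabels {
		names = append(names, name)
	}
	sort.Strings(names)

	pairs := make([]string, 0, len(names))
	for _, name := range names {
		pairs = append(pairs, name+"="+strconv.Quote(groupLabels[name]))
	}
	return strings.Join(routePath, "/") + ":{" + strings.Join(pairs, ", ") + "}"
}

// hashKey produces a fixed-length value for integrations that limit key size.
func hashKey(key string) string {
	sum := sha256.Sum256([]byte(key))
	return hex.EncodeToString(sum[:])
}
```

Keeping the unhashed string around until the last step means the key stays readable in logs, while the hashed form has a fixed length of 64 hexadecimal characters.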
Fabian Reinartz 8d88d9e05b Merge pull request #481 from prometheus/fabxc-meshsil
*: integrate new silence package
2016-08-30 16:53:34 +02:00
Fabian Reinartz 98101f3868 silence: fix doc strings 2016-08-30 14:19:22 +02:00
Fabian Reinartz a4e8703567 *: integrate new silence package 2016-08-30 12:15:23 +02:00
Fabian Reinartz 1baf98fb1a provider: remove NotificationInfos provider 2016-08-23 13:57:19 +02:00
Fabian Reinartz 66c2171bd8 *: rename NotifyInfo to NotificationInfo 2016-08-09 12:01:31 +02:00
Fabian Reinartz c0103dd8c6 provider/mesh: filter deleted silences from results 2016-08-09 12:00:52 +02:00
Fabian Reinartz 3d350a34b5 provider/mesh: extract silence deletion into state 2016-08-09 12:00:28 +02:00
Fabian Reinartz 4761663380 types: make Matchers sorted and compareable 2016-08-09 12:00:28 +02:00
Fabian Reinartz 6ad866dd27 types: validate ID existance for silences 2016-08-09 12:00:28 +02:00
Fabian Reinartz b92c5f5bd4 provider/mesh: add silence state 2016-08-09 11:59:35 +02:00
Fabian Reinartz 81cbf3cda7 *: refactor Silence type, use UUID
This commit removes the dependency on model.Silence for the internal
Silence type, uses UUIDs instead of uint64s and clarifies invariants
around timestamp handling.

The created_at timestamp is removed for the time being.
2016-08-09 11:59:35 +02:00
Fabian Reinartz 0e954e10cf provider/boltmem: Add Put/Get for BoltDB provider 2016-05-03 12:46:34 +02:00
Fabian Reinartz 438e22f246 Add merge alert test 2016-02-03 14:11:59 +01:00