* TimeMuter returns the names of time intervals
This commit updates the TimeMuter interface to also return the names
of the time intervals that muted the alerts.
Signed-off-by: George Robinson <george.robinson@grafana.com>
---------
Signed-off-by: George Robinson <george.robinson@grafana.com>
The elm npm package doesn't support linux/arm64. The easiest option
is to force Docker to run this as an AMD64 container.
Upstream issue:
https://github.com/elm/compiler/issues/2283
Signed-off-by: Holger Hans Peter Freyther <holger@freyther.de>
Addresses:
Scanning your code and 410 packages across 83 dependent modules for known vulnerabilities...
=== Symbol Results ===
Vulnerability #1: GO-2024-2687
HTTP/2 CONTINUATION flood in net/http
More info: https://pkg.go.dev/vuln/GO-2024-2687
Module: golang.org/x/net
Found in: golang.org/x/net@v0.20.0
Fixed in: golang.org/x/net@v0.23.0
Example traces found:
#1: cli/root.go:122:52: cli.NewAlertmanagerClient calls config.NewClientFromConfig, which eventually calls http2.ConfigureTransports
#2: types/types.go:290:28: types.MultiError.Error calls http2.ConnectionError.Error
#3: notify/notify.go:998:21: notify.TimeActiveStage.Exec calls log.jsonLogger.Log, which eventually calls http2.ErrCode.String
#4: notify/notify.go:998:21: notify.TimeActiveStage.Exec calls log.jsonLogger.Log, which eventually calls http2.FrameHeader.String
#5: notify/notify.go:998:21: notify.TimeActiveStage.Exec calls log.jsonLogger.Log, which eventually calls http2.FrameType.String
#6: types/types.go:290:28: types.MultiError.Error calls http2.GoAwayError.Error
#7: notify/notify.go:998:21: notify.TimeActiveStage.Exec calls log.jsonLogger.Log, which eventually calls http2.Setting.String
#8: notify/notify.go:998:21: notify.TimeActiveStage.Exec calls log.jsonLogger.Log, which eventually calls http2.SettingID.String
#9: types/types.go:290:28: types.MultiError.Error calls http2.StreamError.Error
#10: api/v2/client/silence/silence_client.go:196:35: silence.Client.PostSilences calls client.Runtime.Submit, which eventually calls http2.Transport.NewClientConn
#11: api/v2/client/silence/silence_client.go:196:35: silence.Client.PostSilences calls client.Runtime.Submit, which eventually calls http2.Transport.RoundTrip
#12: notify/email/email.go:253:14: email.Email.Notify calls fmt.Fprintf, which eventually calls http2.chunkWriter.Write
#13: types/types.go:290:28: types.MultiError.Error calls http2.connError.Error
#14: types/types.go:290:28: types.MultiError.Error calls http2.duplicatePseudoHeaderError.Error
#15: test/cli/acceptance.go:362:3: cli.Alertmanager.Start calls http2.gzipReader.Close
#16: test/cli/acceptance.go:366:22: cli.Alertmanager.Start calls io.ReadAll, which calls http2.gzipReader.Read
#17: types/types.go:290:28: types.MultiError.Error calls http2.headerFieldNameError.Error
#18: types/types.go:290:28: types.MultiError.Error calls http2.headerFieldValueError.Error
#19: api/v2/client/silence/silence_client.go:196:35: silence.Client.PostSilences calls client.Runtime.Submit, which eventually calls http2.noDialH2RoundTripper.RoundTrip
#20: types/types.go:290:28: types.MultiError.Error calls http2.pseudoHeaderError.Error
#21: notify/email/email.go:253:14: email.Email.Notify calls fmt.Fprintf, which eventually calls http2.stickyErrWriter.Write
#22: test/cli/acceptance.go:362:3: cli.Alertmanager.Start calls http2.transportResponseBody.Close
#23: test/cli/acceptance.go:366:22: cli.Alertmanager.Start calls io.ReadAll, which calls http2.transportResponseBody.Read
#24: notify/notify.go:998:21: notify.TimeActiveStage.Exec calls log.jsonLogger.Log, which eventually calls http2.writeData.String
Your code is affected by 1 vulnerability from 1 module.
This scan also found 0 vulnerabilities in packages you import and 2
vulnerabilities in modules you require, but your code doesn't appear to call
these vulnerabilities.
Use '-show verbose' for more details.
Signed-off-by: Holger Hans Peter Freyther <holger@freyther.de>
* Add date and tz functions to templates
This commit adds the date and tz functions to templates. This means
users can now format time in a specified format and also change
the timezone to their specific locale.
An example of how these functions work, and can be composed together,
can be seen here:
{{ .StartsAt | tz "Europe/Paris" | date "15:04:05 MST" }}
Signed-off-by: George Robinson <george.robinson@grafana.com>
---------
Signed-off-by: George Robinson <george.robinson@grafana.com>
* Update TestRouteID tests
This commit updates the TestRouteID tests to be simpler without
reducing test coverage. It also adds new cases that show a bug
in the existing code where conflicting IDs can be returned.
Signed-off-by: George Robinson <george.robinson@grafana.com>
* Fix Route.ID() returns conflicting IDs
This commit fixes a bug where Route.ID() returns conflicting IDs.
For example, the configuration:
receiver: test
routes:
- matchers:
- foo=bar
continue: true
routes:
- matchers:
- bar=baz
- matchers:
- foo=bar
continue: true
routes:
- matchers:
- bar=baz
Gives the following Route IDs:
{}
{}/{foo="bar"}/0
{}/{foo="bar"}/{bar="baz"}/0
{}/{foo="bar"}/1
{}/{foo="bar"}/{bar="baz"}/0
When it should give these Route IDs:
{}
{}/{foo="bar"}/0
{}/{foo="bar"}/0/{bar="baz"}/0
{}/{foo="bar"}/1
{}/{foo="bar"}/1/{bar="baz"}/0
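The corrected IDs above can be reproduced with a minimal sketch that keeps each route's position among its siblings at every level of the path. The `route` type, its fields, and the helper functions are hypothetical and much simpler than the real routing tree.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// route is a hypothetical, pared-down stand-in for a routing tree node.
type route struct {
	matchers string // pre-rendered matchers, e.g. `{foo="bar"}`
	parent   *route
	routes   []*route
}

// add appends a child route and wires up its parent pointer.
func (r *route) add(matchers string) *route {
	c := &route{matchers: matchers, parent: r}
	r.routes = append(r.routes, c)
	return c
}

// id joins the parent's full ID, this route's matchers, and this route's
// position among its siblings. Because the parent's ID already carries the
// parent's own position, sibling subtrees with identical matchers no longer
// produce the same ID.
func (r *route) id() string {
	if r.parent == nil {
		return r.matchers
	}
	var b strings.Builder
	b.WriteString(r.parent.id())
	b.WriteByte('/')
	b.WriteString(r.matchers)
	for i, sibling := range r.parent.routes {
		if sibling == r {
			b.WriteByte('/')
			b.WriteString(strconv.Itoa(i))
			break
		}
	}
	return b.String()
}

// walk collects the IDs of r and its descendants depth-first.
func walk(r *route, ids []string) []string {
	ids = append(ids, r.id())
	for _, c := range r.routes {
		ids = walk(c, ids)
	}
	return ids
}

// buildExample mirrors the configuration from the commit message.
func buildExample() *route {
	root := &route{matchers: "{}"}
	root.add(`{foo="bar"}`).add(`{bar="baz"}`)
	root.add(`{foo="bar"}`).add(`{bar="baz"}`)
	return root
}

func main() {
	for _, id := range walk(buildExample(), nil) {
		fmt.Println(id)
	}
}
```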
Signed-off-by: George Robinson <george.robinson@grafana.com>
---------
Signed-off-by: George Robinson <george.robinson@grafana.com>
This commit rewrites the existing TestTimeActiveStage unit tests
to have complete isolation between test cases. Before this change,
each test case affected the state of its subsequent tests.
The motivation behind this change is to make it easier to assert
that alerts have been marked as muted.
Signed-off-by: George Robinson <george.robinson@grafana.com>
* Add more benchmarks for inhibition rules
This commit adds more benchmarks for inhibition rules where
just the last rule in the benchmark inhibits the labels.
Signed-off-by: George Robinson <george.robinson@grafana.com>
---------
Signed-off-by: George Robinson <george.robinson@grafana.com>
* Add benchmarks for Mutes
This commit adds benchmarks for Mutes in the inhibit package.
Signed-off-by: George Robinson <george.robinson@grafana.com>
---------
Signed-off-by: George Robinson <george.robinson@grafana.com>
* Add benchmarks for Mutes
This commit updates the existing benchmarks for silences to also
benchmark Mutes. This complements the existing Query benchmarks
by also measuring the time taken to mark silenced alerts.
---------
Signed-off-by: George Robinson <george.robinson@grafana.com>
* Add godot linter
Signed-off-by: George Robinson <george.robinson@grafana.com>
* Remove extra line from LICENSE
Signed-off-by: George Robinson <george.robinson@grafana.com>
---------
Signed-off-by: George Robinson <george.robinson@grafana.com>
Note that this does not stop showing classic metrics. For now, it is
up to the scrape config to decide whether to keep those instead, or
both.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* Cut Alertmanager version 0.27 from the rc.0 (#3740)
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* update the release md to reflect the current timings
Signed-off-by: gotjosh <josue.abreu@gmail.com>
---------
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Release: Cut 0.27.0-rc.0
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* small fixes
- typo in respond
- add PR numbers for UTF-8
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* more wordsmithing
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Fix flaky test TestClusterJoinAndReconnect/TestTLSConnection (#3722)
Wait until `p2.Status()` returns, because it blocks until we're ready. That way, we're guaranteed to know that the cluster size is 2.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
---------
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Update the docs on how to use UTF-8 in label matchers and parse mode feature flags
Signed-off-by: George Robinson <george.robinson@grafana.com>
---------
Signed-off-by: George Robinson <george.robinson@grafana.com>
* Fix panic in acceptance tests
This commit attempts to address a panic that occurs in acceptance
tests if a server in the cluster fails to start.
Signed-off-by: George Robinson <george.robinson@grafana.com>
* Remove started and check am.cmd.Process != nil
Signed-off-by: George Robinson <george.robinson@grafana.com>
---------
Signed-off-by: George Robinson <george.robinson@grafana.com>
This commit fixes a log line in the featurecontrol package which
should be "UTF-8 strict mode" and not "UTF-8 mode".
Signed-off-by: George Robinson <george.robinson@grafana.com>
This commit fixes a small number of inconsistencies in the compat
package logging. It now consistently uses the terms "classic matchers
parser" and "UTF-8 matchers parser" instead of "old matchers parser"
and "new matchers parser".
Signed-off-by: George Robinson <george.robinson@grafana.com>
* feat: add counter to track alerts dropped outside of time_intervals
Addresses: #3512
This adds a new counter metric `alertmanager_alerts_supressed_total`
that is incremented by `len(alerts)` when alerts are suppressed for
being outside of a time_interval, i.e. inside one of the
mute_time_intervals or outside all of the active_time_intervals.
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
* test: add time interval suppression metric checks for notify
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
* test: fix failure message log values in notifier
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
* ref: address PR feedback for #3565
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
* fix: track suppressed notifications metric for inhibit/silence
Based on PR feedback:
https://github.com/prometheus/alertmanager/pull/3565/files#r1393068026
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
* fix: broken notifier tests
- Fixed the metric count check so that the suppression metric is
compared against the difference between the input and output
notifications of the suppression. The comparison was previously
inverted.
- Stopped using `Reset()` to compare collection counts between the
multiple stages that are executed in `TestMuteStageWithSilences()`.
The intent was to compare a clean metric collection after each stage
execution, but the final stage, where all silences are lifted, results in
no metric being created in the test, causing `prom_testutil.ToFloat64()`
to panic. Changed to separate variables to check counts between each
stage, taking prior counts into account.
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
* rename metric and add constants
Signed-off-by: gotjosh <josue.abreu@gmail.com>
---------
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
Signed-off-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>
This commit removes the metrics from the compat package
in favour of the existing logging and the additional tools
at hand, such as amtool, to validate Alertmanager configurations.
Due to the global nature of the compat package, a consequence
of config.Load, these metrics have proven to be less useful
in practice than expected, both in Alertmanager and other projects
such as Mimir.
There are a number of reasons for this:
1. Because the compat package is global, these metrics cannot be
reset each time config.Load is called. In multi-tenant projects
like Mimir, loading a config for one tenant would reset the
metrics for all tenants. This is also the reason the metrics
are counters and not gauges.
2. Since the metrics are counters, it is difficult to create
meaningful dashboards for Alertmanager as, unlike in Mimir,
configurations are not reloaded at fixed intervals, and as such,
operators cannot use rate to track configuration changes
over time.
In Alertmanager, there are much better tools available to validate
that an Alertmanager configuration is compatible with the UTF-8
parser, including both the existing logging from Alertmanager
server and amtool check-config.
In other projects like Mimir, we can track configurations for
individual tenants using log aggregation and storage systems
such as Loki. This gives operators far more information than
what is possible with the metrics, including the timestamp,
input and ID of tenant configurations that are incompatible
or have disagreement.
Signed-off-by: George Robinson <george.robinson@grafana.com>
* feat: implement webhook_url_file for discord
implements #3482
Signed-off-by: Philipp Born <git@pborn.eu>
* feat: implement webhook_url_file for msteams
implements #3536
Signed-off-by: Philipp Born <git@pborn.eu>
---------
Signed-off-by: Philipp Born <git@pborn.eu>
There is no need to register these metrics in amtool, so use
compat.NewMetrics(nil) instead of compat.RegisteredMetrics.
Signed-off-by: George Robinson <george.robinson@grafana.com>