* Add benchmarks for Mutes
This commit adds benchmarks for Mutes in the inhibit package.
Signed-off-by: George Robinson <george.robinson@grafana.com>
---------
Signed-off-by: George Robinson <george.robinson@grafana.com>
* Add benchmarks for Mutes
This commit updates the existing benchmarks for silences to also
benchmark Mutes. This complements the existing Query benchmarks
by also measuring the time taken to mark silenced alerts.
---------
Signed-off-by: George Robinson <george.robinson@grafana.com>
* Add godot linter
Signed-off-by: George Robinson <george.robinson@grafana.com>
* Remove extra line from LICENSE
Signed-off-by: George Robinson <george.robinson@grafana.com>
---------
Signed-off-by: George Robinson <george.robinson@grafana.com>
Note that this does not stop classic metrics from being shown. For now,
it is up to the scrape config to decide whether to keep those instead, or
to keep both.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* Cut Alertmanager version 0.27 from the rc.0 (#3740)
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* update the release md to reflect the current timings
Signed-off-by: gotjosh <josue.abreu@gmail.com>
---------
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Release: Cut 0.27.0-rc.0
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* small fixes
- typo in respond
- add PR numbers for UTF-8
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* more wordsmithing
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Fix flaky test TestClusterJoinAndReconnect/TestTLSConnection (#3722)
Wait until `p2.Status()` returns, because it blocks until we're ready. That way, we're guaranteed to know that the cluster size is 2.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
---------
Signed-off-by: gotjosh <josue.abreu@gmail.com>
Wait until `p2.Status()` returns, because it blocks until we're ready. That way, we're guaranteed to know that the cluster size is 2.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Update the docs on how to use UTF-8 in label matchers and parse mode feature flags
Signed-off-by: George Robinson <george.robinson@grafana.com>
---------
Signed-off-by: George Robinson <george.robinson@grafana.com>
* Fix panic in acceptance tests
This commit attempts to address a panic that occurs in acceptance
tests if a server in the cluster fails to start.
Signed-off-by: George Robinson <george.robinson@grafana.com>
* Remove started and check am.cmd.Process != nil
Signed-off-by: George Robinson <george.robinson@grafana.com>
---------
Signed-off-by: George Robinson <george.robinson@grafana.com>
This commit fixes a log line in the featurecontrol package which
should be "UTF-8 strict mode" and not "UTF-8 mode".
Signed-off-by: George Robinson <george.robinson@grafana.com>
This commit fixes a small number of inconsistencies in the compat
package logging. It now consistently uses the terms "classic matchers
parser" and "UTF-8 matchers parser" instead of "old matchers parser"
and "new matchers parser".
Signed-off-by: George Robinson <george.robinson@grafana.com>
* feat: add counter to track alerts dropped outside of time_intervals
Addresses: #3512
This adds a new counter metric `alertmanager_alerts_supressed_total`
that is incremented by `len(alerts)` when alerts are suppressed for
being outside of a time_interval, i.e. inside a mute_time_intervals or
outside an active_time_intervals.
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
* test: add time interval suppression metric checks for notify
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
* test: fix failure message log values in notifier
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
* ref: address PR feedback for #3565
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
* fix: track suppressed notifications metric for inhibit/silence
Based on PR feedback:
https://github.com/prometheus/alertmanager/pull/3565/files#r1393068026
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
* fix: broken notifier tests
- fixed the metric count check so that the suppression metric is
compared against the diff between the input and output notifications of
the suppression, i.e. how many notifications it suppressed; the check
was previously inverted.
- stopped using `Reset()` to compare collection counts between the
multiple stages that are executed in `TestMuteStageWithSilences()`.
The intent was to compare a clean metric collection after each stage
execution, but the final stage, where all silences are lifted, results in
no metric being created in the test, causing `prom_testutil.ToFloat64()`
to panic. Changed to separate vars to check counts between each stage,
with care to consider prior counts.
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
* rename metric and add constants
Signed-off-by: gotjosh <josue.abreu@gmail.com>
---------
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
Signed-off-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>
This commit removes the metrics from the compat package
in favour of the existing logging and the additional tools
at hand, such as amtool, to validate Alertmanager configurations.
Due to the global nature of the compat package, a consequence
of config.Load, these metrics have proven to be less useful
in practice than expected, both in Alertmanager and other projects
such as Mimir.
There are a number of reasons for this:
1. Because the compat package is global, these metrics cannot be
reset each time config.Load is called: in multi-tenant projects
like Mimir, loading a config for one tenant would reset
the metrics for all tenants. This is also the reason the metrics
are counters and not gauges.
2. Since the metrics are counters, it is difficult to create
meaningful dashboards for Alertmanager because, unlike in Mimir,
configurations are not reloaded at fixed intervals. As such,
operators cannot use rate to track configuration changes
over time.
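The first reason can be illustrated with a small sketch. The names here are hypothetical and deliberately simplified: `incompatibleTotal` stands in for a metric owned by a global package, and `load` stands in for a config load that may or may not reset it first.

```go
package main

import "fmt"

// incompatibleTotal is a hypothetical package-level value standing in
// for a metric owned by a global package and shared by every tenant.
var incompatibleTotal int

// load is a hypothetical config load. incompatible is the number of
// incompatible matchers found in the tenant's config. When resetFirst
// is true, the shared value is cleared before counting.
func load(incompatible int, resetFirst bool) {
	if resetFirst {
		incompatibleTotal = 0 // also wipes what other tenants recorded
	}
	incompatibleTotal += incompatible
}

func main() {
	load(2, true) // tenant A: two incompatible matchers
	load(0, true) // tenant B: clean config, but the reset erased A's count
	fmt.Println("with reset:", incompatibleTotal) // prints "with reset: 0"

	incompatibleTotal = 0
	load(2, false) // tenant A
	load(0, false) // tenant B
	fmt.Println("without reset:", incompatibleTotal) // prints "without reset: 2"
}
```

Without the reset the value only ever grows, which is why a counter is the only metric type that fits, and why it says little about the current state of any single configuration.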
In Alertmanager, there are much better tools available to validate
that an Alertmanager configuration is compatible with the UTF-8
parser, including both the existing logging from Alertmanager
server and amtool check-config.
In other projects like Mimir, we can track configurations for
individual tenants using log aggregation and storage systems
such as Loki. This gives operators far more information than
what is possible with the metrics, including the timestamp,
input and ID of tenant configurations that are incompatible
or have disagreement.
Signed-off-by: George Robinson <george.robinson@grafana.com>
* feat: implement webhook_url_file for discord
implements #3482
Signed-off-by: Philipp Born <git@pborn.eu>
* feat: implement webhook_url_file for msteams
implements #3536
Signed-off-by: Philipp Born <git@pborn.eu>
---------
Signed-off-by: Philipp Born <git@pborn.eu>
There is no need to register these metrics in amtool, so use
compat.NewMetrics(nil) instead of compat.RegisteredMetrics.
Signed-off-by: George Robinson <george.robinson@grafana.com>