Commit Graph

81 Commits

Author SHA1 Message Date
TJ Hoplock f6b942cf9b
chore!: adopt log/slog, drop go-kit/log (#4089)
* chore!: adopt log/slog, drop go-kit/log

The bulk of this change set was automated by the following script which
is being used to aid in converting the various exporters/projects to use
slog:

https://gist.github.com/tjhop/49f96fb7ebbe55b12deee0b0312d8434

This commit includes several changes:
- bumps exporter-toolkit to v0.13.1 for log/slog support
- updates golangci-lint deprecated configs
- enables sloglint linter
- removes old go-kit/log linter configs
- introduces some `if logger == nil { $newLogger }` additions to prevent
  nil references
- converts cluster membership config to use a stdlib compatible slog
  adapter, rather than creating a custom io.Writer for use as the
  membership `logOutput` config
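
A minimal sketch of the nil-logger guard mentioned in the list above, assuming Go 1.21+ with log/slog. The helper name `ensureLogger` and the choice of a discarding text handler are illustrative assumptions, not the code from this change set:

```go
package main

import (
	"io"
	"log/slog"
)

// ensureLogger returns l unchanged when it is non-nil. Otherwise it returns
// a logger that discards every record, so callers can log without nil checks.
// The name and the discard behaviour are assumptions made for this sketch.
func ensureLogger(l *slog.Logger) *slog.Logger {
	if l == nil {
		return slog.New(slog.NewTextHandler(io.Discard, nil))
	}
	return l
}

func main() {
	var logger *slog.Logger // nil, as an unconfigured caller might pass it
	logger = ensureLogger(logger)
	// Safe to call: the record is dropped instead of causing a nil dereference.
	logger.Info("pipeline started", "stage", "notify")
}
```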

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>

* chore: address PR feedback

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>

---------

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
2024-11-06 09:09:57 +00:00
George Robinson cc6de9c666 Remove Id return from silences.Set(*pb.Silence)
This commit removes the Id from the method silences.Set(*pb.Silence)
as it is redundant. The Id is still set even when creating a silence
fails. This will be fixed in a later change.

Signed-off-by: George Robinson <george.robinson@grafana.com>
2024-06-20 15:47:49 +01:00
George Robinson c4a763c401
#3513: Mark muted alerts (#3793)
* Mark muted groups

This commit updates TimeMuteStage and TimeActiveStage to mark groups
as muted when its alerts are muted by an active or mute time interval,
and remove any existing markers when outside all active and mute
time intervals.

Signed-off-by: George Robinson <george.robinson@grafana.com>

* Remove unlock to defer

Signed-off-by: George Robinson <george.robinson@grafana.com>

---------

Signed-off-by: George Robinson <george.robinson@grafana.com>
2024-05-13 11:16:26 +01:00
George Robinson fc8c7d146f
#3513: Rewrite TestTimeMuteStage tests (#3794) 2024-04-11 13:53:50 +02:00
George Robinson 2dc23c90c9
Rewrite TestTimeActiveStage tests (#3795)
This commit rewrites the existing TestTimeActiveStage unit tests
to have complete isolation between test cases. Before this change,
each test case affected the state of its subsequent tests.

The motivation behind this change is to make it easier to assert
that alerts have been marked as muted.

Signed-off-by: George Robinson <george.robinson@grafana.com>
2024-04-11 11:42:16 +02:00
TJ Hoplock f00025d037
feat: add counter to track alerts dropped outside of time_intervals (#3565)
* feat: add counter to track alerts dropped outside of time_intervals

Addresses: #3512

This adds a new counter metric `alertmanager_alerts_supressed_total`
that is incremented by `len(alerts)` when alerts are suppressed for
being outside of a time_interval, i.e. inside one of the
mute_time_intervals or outside all of the active_time_intervals.

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>

* test: add time interval suppression metric checks for notify

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>

* test: fix failure message log values in notifier

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>

* ref: address PR feedback for #3565

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>

* fix: track suppressed notifications metric for inhibit/silence

Based on PR feedback:

https://github.com/prometheus/alertmanager/pull/3565/files#r1393068026

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>

* fix: broken notifier tests

- fixed metric count check to properly check the diff between
  input/output notifications from the suppression to compare to suppression
  metric, was previously inverted to compare to how many notifications it
  suppressed.
- stopped using `Reset()` to compare collection counts between the
  multiple stages that are executed in `TestMuteStageWithSilences()`.
  the intent was to compare a clean metric collection after each stage
  execution, but the final stage where all silences are lifted results in
  no metric being created in the test, causing `prom_testutil.ToFloat64()`
  to panic. changed to separate vars to check counts between each stage,
  with care to consider prior counts.

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>

* rename metric and add constants

Signed-off-by: gotjosh <josue.abreu@gmail.com>

---------

Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
Signed-off-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>
2024-02-13 11:17:24 +00:00
Matthieu MOREL b9e347b9d1 golangci-lint: enable testifylint linter
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-12-10 08:50:03 +00:00
Walther Lee 3416d5a4f5
Add context reasons to notifications failed counter (#3631)
---------

Signed-off-by: Walther Lee <walther.lee@reddit.com>
Co-authored-by: Walther Lee <walther.lee@reddit.com>
Co-authored-by: Ben Kochie <superq@gmail.com>
2023-12-08 15:30:43 +01:00
gotjosh acb58400fd
Refactor: Move `inTimeIntervals` from `notify` to `timeinterval` (#3556)
* Refactor: Move `inTimeIntervals` from `notify` to `timeinterval`

There's absolutely no change of functionality here and I've expanded coverage for similar logic in both places.
---------

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-10-13 14:15:05 +01:00
Colin Douch cfe4411deb
Add the receiver name to notification metrics (#3045)
* Add receiver name as a label to notify metrics

This commit adds in a second label to the notify family of metrics
(e.g. numTotalFailedNotifications) - the receiver name. This allows
disambiguating which receiver is failing when one has many receivers
with the same integration type

Signed-off-by: sinkingpoint <colin@quirl.co.nz>

* Gate receiver names behind a feature flag

Signed-off-by: sinkingpoint <colin@quirl.co.nz>

---------

Signed-off-by: sinkingpoint <colin@quirl.co.nz>
Signed-off-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>
2023-09-06 13:42:55 +01:00
Yijie Qin 7923bc5f8e
add status code label to the numTotalFailedNotifications metric (#3094)
* add reason label to the numTotalFailedNotifications metric

Signed-off-by: Yijie Qin <qinyijie@amazon.com>
2023-02-03 12:09:21 +00:00
Julien Pivotto b0443021dc Expires notify log sooner when possible
It seems useless to keep the notifications in the nflog for longer than
twice the repeat interval. This should help reduce memory usage of
clustered alertmanagers.

Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2022-10-14 10:03:17 +02:00
Ben Ridley 33a0e77a71
Add timezone support to time intervals. (#2782)
* Add explicit UTC to time interval tests

Signed-off-by: Ben Ridley <benridley29@gmail.com>

* Add timezone support to time intervals

Signed-off-by: Ben Ridley <benridley29@gmail.com>

* Update time interval documentation with time zone info

Signed-off-by: Ben Ridley <benridley29@gmail.com>

* Refactor notification tests to test timezone support

Signed-off-by: Ben Ridley <benridley29@gmail.com>

* Make use of Local more clear

Signed-off-by: Ben Ridley <benridley29@gmail.com>

* Fix documentation about timezone support.

Makes it clear that the default is UTC, but others are supported.

Signed-off-by: Ben Ridley <benridley29@gmail.com>

* Remove commented/unused function

Signed-off-by: Ben Ridley <benridley29@gmail.com>

* Fix tests using incorrect timezones

Previously tests were using time zone names that were unsupported by the
RFC822 parser. This switches the tests to use RFC822Z and specifies the
zones by number.

Signed-off-by: Ben Ridley <benridley29@gmail.com>

* Add a few more timezone test cases

Signed-off-by: Ben Ridley <benridley29@gmail.com>

* Remove unnecessary if/else branch

Co-authored-by: Sylvain Rabot <sylvain@abstraction.fr>
Signed-off-by: Ben Ridley <benridley29@gmail.com>

* Rename timezone to location for consistency with Go stdlib

Signed-off-by: Ben Ridley <benridley29@gmail.com>

* Make Windows timezone error more specific

Signed-off-by: Ben Ridley <benridley29@gmail.com>

* Update docs to use 'location'

Signed-off-by: Ben Ridley <benridley29@gmail.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Rabot <sylvain@abstraction.fr>
Signed-off-by: Ben Ridley <benridley29@gmail.com>

Signed-off-by: Ben Ridley <benridley29@gmail.com>
Co-authored-by: Sylvain Rabot <sylvain@abstraction.fr>
2022-09-22 14:45:17 +02:00
gotjosh cfb909f419
Marker: Rename `SetSilenced` to `SetActiveOrSilenced`
This accurately reflects what the function _actually_ does. If no active silence IDs are provided and the list of inhibitions we have is already empty, the alert is actually set to Active. It took me a while to realise this while working out how we populate the alert list.

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2022-06-17 12:51:23 +01:00
Matthias Loibl a6d10bd5bc
Update golangci-lint and fix complaints (#2853)
* Copy latest golangci-lint files from Prometheus

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* Use grafana/regexp over stdlib regexp

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* Fix typos in comments

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* Fix goimports complaints in import sorting

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* gofumpt all Go files

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* Update naming to comply with revive linter

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* config: Fix error messages to be lower case

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* test/cli: Fix error messages to be lower case

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* .golangci.yaml: Remove obsolete space

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* config: Fix expected victorOps error

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* Use stdlib regexp

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>

* Clean up Go modules

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>
2022-03-25 17:59:51 +01:00
Sinuhe Tellez Rivera d155153305
Adds: Active time interval (#2779)
* add active time interval

Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* fix active time interval

Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* fix unittests for active time interval

Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update notify/notify.go

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update dispatch/route.go

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* split the stage for active and mute intervals

Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update notify/notify.go

Adds doc for a helper function

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update notify/notify.go

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update notify/notify.go

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update notify/notify.go

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* fix code after commit suggestions

Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Allow mute_time_interval and time_intervals to coexist in the config

Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* docs: configuration's doc has been updated about time intervals

Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update config/config.go

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update docs/configuration.md

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update docs/configuration.md

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update docs/configuration.md

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update docs/configuration.md

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* updates configuration readme to improve active time description

Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* merge deprecated mute_time_intervals and time_intervals

Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update docs/configuration.md

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update docs/configuration.md

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update docs/configuration.md

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update docs/configuration.md

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update docs/configuration.md

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update docs/configuration.md

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update cmd/alertmanager/main.go

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update cmd/alertmanager/main.go

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* fmt main.go

Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update docs/configuration.md

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update docs/configuration.md

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* fix lint error

Signed-off-by: clyang82 <chuyang@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Document that matchers are ANDed together

Signed-off-by: Mac Chaffee <me@macchaffee.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Remove extra parentheticals

Signed-off-by: Mac Chaffee <me@macchaffee.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* config: root route should have empty matchers

Unmarshal should validate that the root route does
not contain any matchers. Prior to this change,
only the deprecated match structures were checked.

Signed-off-by: Philip Gough <philip.p.gough@gmail.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* chore: Let git ignore temporary files for ui/app

Signed-off-by: nekketsuuu <nekketsuuu@users.noreply.github.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* adding max_alerts parameter to slack webhook config

correcting the logic to truncate fields instead of dropping alerts in the slack integration

Signed-off-by: Prashant Balachandran <pnair@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* *: bump to Go 1.17 (#2792)

* *: bump to Go 1.17

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* *: fix yamllint errors

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Automate CSS-inlining for default HTML email template (#2798)

* Automate CSS-inlining for default HTML email template

The original HTML email template was added in `template/email.html`.
It looks like the CSS was manually inlined.  Most likely using the
premailer.dialect.ca web form, which is mentioned in the README for
the Mailgun transactional-email-templates project.  The resulting HTML
with inlined CSS was then copied into `template/default.tmpl`.  This
has resulted in `email.html` and `default.tmpl` diverging at times.

This commit adds build automation to inline the CSS automatically
using [juice][1].  The Go template containing the resulting HTML has
been moved into its own file to avoid the script that performs the CSS
inlining having to parse the `default.tmpl` file to insert it there.

Fixes #1939.

[1]: https://www.npmjs.com/package/juice

Signed-off-by: Brad Ison <bison@xvdf.io>

* Update asset/assets_vfsdata.go

Signed-off-by: Brad Ison <bison@xvdf.io>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* go.{mod,sum}: update Go dependencies

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* amtool to support http_config to access alertmanager (#2764)

* Support http_config for amtool

Co-authored-by: Julien Pivotto <roidelapluie@gmail.com>
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: clyang82 <chuyang@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* notify/sns: detect FIFO topic based on the rendered value

Since the TopicARN field is a template string, it's safer to check for
the ".fifo" suffix in the rendered string.
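
A rough sketch of the idea, using the standard library's text/template as a stand-in for Alertmanager's own template rendering. The function name `isFIFO`, the ARN, and the `Name` field are hypothetical:

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// isFIFO renders the topic ARN template first and only then checks the
// ".fifo" suffix, because the raw template text may not end in ".fifo"
// even when the rendered ARN does.
func isFIFO(topicTmpl string, data any) (bool, error) {
	t, err := template.New("topic").Parse(topicTmpl)
	if err != nil {
		return false, err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, data); err != nil {
		return false, err
	}
	return strings.HasSuffix(sb.String(), ".fifo"), nil
}

func main() {
	tmpl := "arn:aws:sns:us-east-1:123456789012:{{ .Name }}"
	data := map[string]string{"Name": "alerts.fifo"}

	// Checking the unrendered template string would miss the suffix.
	fmt.Println(strings.HasSuffix(tmpl, ".fifo")) // false

	ok, err := isFIFO(tmpl, data)
	fmt.Println(ok, err) // true <nil>
}
```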

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* config: delegate Sigv4 validation to the inner type

This change also adds unit tests for SNS configuration.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* fix unittests

Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* fix comment about active time interval

Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* fix another comment about active time interval

Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update docs/configuration.md

Fix typo in documentation

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

* Update docs/configuration.md

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>

Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Co-authored-by: clyang82 <chuyang@redhat.com>
Co-authored-by: Mac Chaffee <me@macchaffee.com>
Co-authored-by: Philip Gough <philip.p.gough@gmail.com>
Co-authored-by: nekketsuuu <nekketsuuu@users.noreply.github.com>
Co-authored-by: Prashant Balachandran <pnair@redhat.com>
Co-authored-by: Simon Pasquier <pasquier.simon@gmail.com>
Co-authored-by: Brad Ison <brad.ison@redhat.com>
Co-authored-by: Julien Pivotto <roidelapluie@gmail.com>
2022-03-04 15:24:29 +01:00
Bryan Boreham f5768fb193
Update xxhash to v2.1.1 and improve pooling (#2709)
* Update xxhash to v2.1.1

This saves linking two different versions

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-10-19 01:09:37 +02:00
Julien Pivotto b2a4cacb95 Update go dependencies & switch to go-kit/log
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-08-02 12:43:23 +02:00
Ben Ridley a1136942bb Fix typo in label to help debugging (again)
Signed-off-by: Ben Ridley <benridley29@gmail.com>
2021-07-13 13:49:30 +10:00
Ben Ridley 01287a4b6d Fix test case not being included in mute count
Signed-off-by: Ben Ridley <benridley29@gmail.com>
2021-07-13 13:48:16 +10:00
Ben Ridley c70481f71f Fix minor timezone typo to help debugging.
Signed-off-by: Ben Ridley <benridley29@gmail.com>
2021-07-13 10:33:37 +10:00
Ben Ridley 4ccbbaef20 Ensure time interval comparisons are in UTC
Signed-off-by: Ben Ridley <benridley29@gmail.com>
2021-07-13 10:27:13 +10:00
beorn7 e84c265196 Include pending silences for future muting decisions
Previously, if a pending silence existed for an alert, and it later
became active without any silences getting added in the meantime, we
would miss the existence of that newly active silence.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-05-27 22:15:57 +02:00
Ganesh Vernekar 10757eb5fb
Export newMetrics function and metrics struct (#2523)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-03-24 12:37:58 +05:30
Ben Kochie 53535551f5
Fix up golangci-lint errors.
Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-16 10:43:45 +01:00
Ben Ridley df54b4bacf Improve documentation wording and formatting in response to maintainer feedback
Signed-off-by: Ben Ridley <benridley29@gmail.com>
2021-03-01 08:30:02 +11:00
Ben Ridley 5d4231b001 Use consistent naming for mute time intervals
Signed-off-by: Ben Ridley <benridley29@gmail.com>
2021-03-01 08:30:02 +11:00
Ben Ridley a3cb125e5c Move timeinterval library into locally maintained package
Signed-off-by: Ben Ridley <benridley29@gmail.com>
2021-03-01 08:30:01 +11:00
Ben Ridley f53e7a984c Add tests for TimeMuteStage
Signed-off-by: Ben Ridley <benridley29@gmail.com>
2021-03-01 08:30:01 +11:00
Simon Pasquier 9f7f4ead46
notify: don't use the global metrics registry (#1977)
* notify: don't use the global metrics registry

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Address Max's comment

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-08-26 16:37:13 +02:00
Simon Pasquier 0c3120efac *: split notify package
Instead of keeping all notifiers in the notify package, it splits them
into individual sub-packages. This improves readability and
maintainability of the code.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-06-18 15:36:19 +02:00
Simon Pasquier 2abd78cbb7
*: use persistent HTTP clients (#1904)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-06-07 10:37:49 +02:00
beorn7 3c981a92f7 Improve `Mutes` performance for silences
Add version tracking of silences states. Adding a silence to the state
increments the version. If the version hasn't changed since the last
time an alert was checked for being silenced, we only have to verify
that the relevant silences are still active rather than checking the
alert against all silences.
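
The version check described above could be sketched roughly as follows. All type and field names here are hypothetical and the matching is reduced to a single label equality; the sketch also ignores silences that change state over time without a new silence being added:

```go
package main

import "fmt"

// silence is a simplified stand-in: it matches one label name/value pair.
type silence struct {
	id, name, value string
	active          bool
}

// store bumps its version every time a silence is added.
type store struct {
	version int
	items   map[string]silence
}

func (s *store) add(sil silence) {
	s.items[sil.id] = sil
	s.version++
}

// cacheEntry records which silences matched an alert, and at which version.
type cacheEntry struct {
	version int
	ids     []string
}

type muter struct {
	st        *store
	cache     map[string]cacheEntry
	fullScans int // counts slow-path checks, for illustration only
}

// mutes reports whether the alert is silenced. If the store version has not
// changed since the last check, only the previously matching silences are
// re-verified; otherwise the alert is checked against all silences.
func (m *muter) mutes(alertKey string, labels map[string]string) bool {
	entry, seen := m.cache[alertKey]
	var ids []string
	if seen && entry.version == m.st.version {
		for _, id := range entry.ids {
			if sil, ok := m.st.items[id]; ok && sil.active {
				ids = append(ids, id)
			}
		}
	} else {
		m.fullScans++
		for id, sil := range m.st.items {
			if v, ok := labels[sil.name]; ok && v == sil.value && sil.active {
				ids = append(ids, id)
			}
		}
	}
	m.cache[alertKey] = cacheEntry{version: m.st.version, ids: ids}
	return len(ids) > 0
}

func main() {
	st := &store{items: map[string]silence{}}
	m := &muter{st: st, cache: map[string]cacheEntry{}}
	st.add(silence{id: "s1", name: "job", value: "api", active: true})
	labels := map[string]string{"job": "api"}

	fmt.Println(m.mutes("alert-1", labels), m.fullScans) // true 1 (full scan)
	fmt.Println(m.mutes("alert-1", labels), m.fullScans) // true 1 (fast path)
	st.add(silence{id: "s2", name: "job", value: "db", active: true})
	fmt.Println(m.mutes("alert-1", labels), m.fullScans) // true 2 (version changed)
}
```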

Signed-off-by: beorn7 <beorn@soundcloud.com>
2019-02-28 12:34:41 +01:00
beorn7 f3d9c89bbc Create a `Muter` implementation for silences
This encapsulates the logic of querying and marking silenced
alerts. It removes the code duplication flagged earlier.

I removed the error returned by the setAlertStatus function as we were
only logging it, and that's already done anyway when the error is
received from the `silence.Query` call (now in the `Mutes` method).

Signed-off-by: beorn7 <beorn@soundcloud.com>
2019-02-26 16:42:59 +01:00
Max Leonard Inden 09a7370572
main.go: Move marker metric registering into types/types.go
Instead of registering marker metrics inside of
cmd/alertmanager/main.go, register them in types/types.go, encapsulating
marker specific logic in its module, not in main.go. In addition it
paves the path for removing the usage of the global metric registry in
the future, by taking a local metric registerer.

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2019-02-05 14:59:22 +01:00
Simon Pasquier 306fd73e32 *: remove use of golang.org/x/net/context
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-11-09 10:00:23 +01:00
Simon Pasquier b7d891cf39 notify: notify resolved alerts properly (#1408)
* notify: notify resolved alerts properly

PR #1205, while fixing an existing issue, introduced another bug when
the send_resolved flag of the integration is set to true.

With send_resolved set to false, the semantics remain the same:
AlertManager generates a notification when new firing alerts are added
to the alert group. The notification only carries firing alerts.

With send_resolved set to true, AlertManager generates a notification
when new firing or resolved alerts are added to the alert group. The
notification carries both the firing and resolved notifications.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Fix comments

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-06-08 11:37:38 +02:00
pasquier-s 7b80919b36 Remove unused code (#1272) 2018-03-03 11:07:47 +01:00
pasquier-s e8a92f65ef Run staticcheck as part of the build process (#1264)
This change also fixes potential issues highlighted by running
staticcheck.
2018-02-28 17:42:32 +01:00
pasquier-s 62b957cc14 Notify only when new firing alerts are added (#1205)
After the initial notification has been sent, AlertManager shouldn't notify the
receiver again when no new alerts have been added to the group during
group_interval.

This change also modifies the acceptance test framework to assert that no
notification has been received in a given interval.
2018-01-23 16:52:03 +01:00
pasquier-s 9b10acae68 Don't notify resolved alerts if none were firing (#1198)
* Don't notify resolved alerts if none were firing

* Fix comments
2018-01-18 11:12:17 +01:00
Jose Donizetti d75ff37a38 Refactor inhibit stage (#1105)
* Refactor BuildPipeline to receive a muter

* Remove marker not used by InhibitStage
2017-12-14 16:22:31 +01:00
Julius Volz 947970af44 Convert Alertmanager to use non-global go-kit loggers
Fixes https://github.com/prometheus/alertmanager/issues/1040
2017-10-22 00:20:40 -07:00
Frederic Branczyk 0ef6695055
*: Remove .WasInhibited and .WasSilenced fields of Alert type 2017-10-10 15:50:15 +02:00
Fabian Reinartz d73a655bf4 Simplify silence modifications, add update endpoint (#796)
* Simplify silence modifications, add update endpoint

* vendor: add pkg/errors

* ui: Handle upserting of silences

.

* Regenerate bindata
2017-05-16 16:48:25 +02:00
stuart nelson 6a909abf17 Add processing status field to alert 2017-04-27 14:18:52 +02:00
Fabian Reinartz 3269bc39e1 *: switch group key to matcher serialization
Turn the GroupKey into a string that is composed of the matchers of the
path in the routing tree and the grouping labels.
Only hash it at the very end to ensure we don't exceed size limits of
integration APIs.
2017-04-21 12:06:23 +02:00
Fabian Reinartz b1486ca546 silence: move to gogoproto
This generates the protobuf Go code with gogoproto and switches to
standard library time types.
2017-04-18 12:47:42 +02:00
Fabian Reinartz 4258b028d6 nflog: switch to gogoproto
This switches the nflog to generate Go code via gogoproto and thereby
use standard library timestamp types.
2017-04-18 10:03:57 +02:00
Fabian Reinartz 309c6af4b2
nflog: use alert set instead of hash for deduplication
Building a hash over an entire set of alerts causes problems, because
the hash differs on any change, whereas we only want to send
notifications if the alert and its state have changed. Therefore this
introduces a list of alerts that are active and a list of alerts that
are resolved. If the currently active alerts of a group are a subset of
the ones that have been notified about before then they are
deduplicated. The resolved notifications work the same way, with a
separate list of resolved notifications that have already been sent.
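
The subset rule described above can be illustrated with a simplified sketch. The function names are hypothetical, alerts are reduced to string identifiers, and other inputs the real deduplication may consider are left out:

```go
package main

import "fmt"

// isSubset reports whether every element of sub also appears in super.
func isSubset(sub, super []string) bool {
	set := make(map[string]struct{}, len(super))
	for _, s := range super {
		set[s] = struct{}{}
	}
	for _, s := range sub {
		if _, ok := set[s]; !ok {
			return false
		}
	}
	return true
}

// needsNotification returns false (deduplicated) only when the current
// firing alerts are a subset of those already notified as firing AND the
// current resolved alerts are a subset of those already notified as resolved.
func needsNotification(firing, resolved, sentFiring, sentResolved []string) bool {
	return !isSubset(firing, sentFiring) || !isSubset(resolved, sentResolved)
}

func main() {
	sentFiring := []string{"a", "b"}
	sentResolved := []string{"c"}

	// Firing set shrank but contains nothing new: deduplicated.
	fmt.Println(needsNotification([]string{"a"}, nil, sentFiring, sentResolved)) // false
	// A new firing alert "d" appeared: notify.
	fmt.Println(needsNotification([]string{"a", "d"}, nil, sentFiring, sentResolved)) // true
	// "b" is newly resolved and was not reported as resolved before: notify.
	fmt.Println(needsNotification([]string{"a"}, []string{"b"}, sentFiring, sentResolved)) // true
}
```

Comparing sets rather than a single hash is what lets an unchanged or shrinking group stay quiet while any genuinely new firing or resolved alert still triggers a notification.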
2017-04-13 15:13:47 +02:00