Commit Graph

21 Commits

Author SHA1 Message Date
Atibhi Agrawal
4ffba65ff5
Add negative matchers for inhibitors (#2460)
Add new matchers for inhibition rules

Signed-off-by: aSquare14 <atibhi.a@gmail.com>
2021-01-22 15:54:11 +01:00
Simon Pasquier
25b32434a6
store: fix potential flaky test (#2077)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-10-22 09:25:31 +02:00
Simon Pasquier
4535311c34 dispatch: don't garbage-collect alerts from store
The aggregation group is already responsible for removing the resolved
alerts. Running the garbage collection in parallel introduces a race and
eventually resolved notifications may be dropped.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-09-18 11:42:14 +02:00
Matthias Loibl
c7f1eb7553
Use github.com/oklog/run not archived oklog/oklog
Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>
2019-04-22 15:11:40 +02:00
Simon Pasquier
c78b449f4a provider/mem: fix dropped alerts
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-04-19 15:35:21 +02:00
stuart nelson
0f634debfd
Merge pull request #1767 from prometheus/beorn7/muting2
Improve doc comments for Marker and friends
2019-02-26 13:00:04 +01:00
beorn7
f7df3743da Another doc comment fix for inhibition tests
Signed-off-by: beorn7 <beorn@soundcloud.com>
2019-02-26 12:40:53 +01:00
beorn7
12671bd261 Improve doc comments for Marker and friends
This clarifies a bunch of things I have run into during code reading
in preparation for some performance improvements around muting.

It also moves doc comments from places where they don't show up in
godoc to visible places.

It also fixes golint warnings.

Signed-off-by: beorn7 <beorn@soundcloud.com>
2019-02-25 17:48:15 +01:00
beorn7
33638b1412 Fix doc comment
Signed-off-by: beorn7 <beorn@soundcloud.com>
2019-02-25 17:18:35 +01:00
beorn7
22db73fbf7 Modify the self-inhibition prevention semantics
This has been discussed in #666 (issue of hell...).

As concluded there, the cleanest semantics is most likely the
following: "An alert that matches both target and source side cannot
inhibit alerts for which the same is true." The two open questions
were:
1. How difficult is the implementation?
2. Is it needed?

This relatively simple commit proves that the answer to (1) is: Not
very difficult. (This also includes a performance-improving
simplification, which would have been possible without a change of
semantics.)

The answer to (2) is twofold:

For one, the original use case in #666 wasn't solved by our interim
solution. What we solved is the case where the self-inhibition is
triggered by a wide target match, i.e. I have a specific alert that
should inhibit a whole group of target alerts without inhibiting
itself. What we did _not_ solve is the inverted case: Self-inhibition
by a wide source match, i.e. an alert that should only fire if none of
a whole group of source alert fires. I mean, we "fixed" it as in, the
target alert will never be inhibited, but @lmb in #666 wanted the
alert to be inhibited _sometimes_ (just not _always_).

The other part is that I think that the asymmetry in our interim
solution will at some point haunt us. Thus, I really would like to get
this change in before we do a 1.0 release.

In practice, I expect this to be only relevant in very rare cases. But
those cases will be most difficult to reason with, and I claim that
the solution in this commit is matching what humans intuitively
expect.

Signed-off-by: beorn7 <beorn@soundcloud.com>
2019-02-25 16:10:08 +01:00
Max Leonard Inden
09a7370572
main.go: Move marker metric registering into types/types.go
Instead of registering marker metrics inside of
cmd/alertmanager/main.go, register them in types/types.go, encapsulating
marker specific logic in its module, not in main.go. In addition it
paves the path for removing the usage of the global metric registry in
the future, by taking a local metric registerer.

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2019-02-05 14:59:22 +01:00
stuart nelson
e883ccb9de
pull out shared code for storing alerts (#1507)
Move the code for storing and GC'ing alerts from being re-implemented in
several packages to existing in its own package

Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
2018-09-03 14:52:53 +02:00
Julius Volz
6d0edbe630 Fix a bunch of unhandled errors (#1501)
...as discovered by "gosec" (many other ones reported, but not all make
a lot of sense to fix).

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-08-05 15:38:25 +02:00
stuart nelson
80f2eeb2ca
Fix resolved alerts still inhibiting (#1331)
* inhibit: update inhibition cache when alerts resolve

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* inhibit: remove unnecessary fmt.Sprintf

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* inhibit: add unit tests

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* inhibit: use NopLogger in tests

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Update old alert with result of merge with new

On ingest, alerts with matching fingerprints are
merged if the new alert's start and end times
overlap with the old alert's.

The merge creates a new alert, which is then
updated in the internal alert store.

The original alert is not updated (because merge
creates a copy), so it is never marked as resolved
in the inhibitor's reference to it.

The code within the inhibitor relies on skipping
over resolved alerts, but because the old alert is
never updated it is never marked as resolved. Thus
it continues to inhibit other alerts until it is
cleaned up by the internal GC.

This commit updates the struct of the old alert
with the result of the merge with the new alert.

An alternative would be to always update the
inhibitor's internal cache of alerts regardless of
an alert's resolve status.

Signed-off-by: stuart nelson <stuartnelson3@gmail.com>

* Update inhibitor cache even if alert is resolved

This seems like a better choice than the previous
commit. I think it is more sane to have the
inhibitor update its own cache, rather than having
one of its pointers updated externally.

Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
2018-04-18 16:26:04 +02:00
Frederic Branczyk
f4c226c7d5
inhibit: Fix race in stopping inhibitor 2017-11-24 12:05:52 +01:00
Alin Sinpalean
dc3c78e6ae Prevent alerts from inhibiting themselves (#1017)
* Prevent alerts from inhibiting themselves.

* Don't inhibit alerts that match the source filter.

* No-op, nudge CircleCI.
2017-11-07 11:30:14 +01:00
Julius Volz
947970af44 Convert Alertmanager to use non-global go-kit loggers
Fixes https://github.com/prometheus/alertmanager/issues/1040
2017-10-22 00:20:40 -07:00
Frederic Branczyk
de24c5b7ec Fix inhibit race (#1032)
* inhibit: restructure stop logic, to prevent race condition

* vendor oklog/oklog/pkg/group
2017-10-07 13:01:37 +02:00
stuart nelson
6a909abf17 Add processing status field to alert 2017-04-27 14:18:52 +02:00
Carl Henrik Lunde
2935ff84e7 inhibitrule: defer Unlock to fix race 2016-10-08 22:07:29 +02:00
Fabian Reinartz
3931d4e64b *: restructure package tree
This commit packages up individual modules and removes the top-level
main package.
2016-08-09 14:24:52 +02:00