The aggregation group is already responsible for removing the resolved
alerts. Running the garbage collection in parallel introduces a race and
eventually resolved notifications may be dropped.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
This has been discussed in #666 (issue of hell...).
As concluded there, the cleanest semantics is most likely the
following: "An alert that matches both target and source side cannot
inhibit alerts for which the same is true." The two open questions
were:
1. How difficult is the implementation?
2. Is it needed?
This relatively simple commit proves that the answer to (1) is: Not
very difficult. (This also includes a performance-improving
simplification, which would have been possible without a change of
semantics.)
The answer to (2) is twofold:
For one, the original use case in #666 wasn't solved by our interim
solution. What we solved is the case where the self-inhibition is
triggered by a wide target match, i.e. I have a specific alert that
should inhibit a whole group of target alerts without inhibiting
itself. What we did _not_ solve is the inverted case: Self-inhibition
by a wide source match, i.e. an alert that should only fire if none of
a whole group of source alert fires. I mean, we "fixed" it as in, the
target alert will never be inhibited, but @lmb in #666 wanted the
alert to be inhibited _sometimes_ (just not _always_).
The other part is that I think that the asymmetry in our interim
solution will at some point haunt us. Thus, I really would like to get
this change in before we do a 1.0 release.
In practice, I expect this to be only relevant in very rare cases. But
those cases will be most difficult to reason with, and I claim that
the solution in this commit is matching what humans intuitively
expect.
Signed-off-by: beorn7 <beorn@soundcloud.com>
Instead of registering marker metrics inside of
cmd/alertmanager/main.go, register them in types/types.go, encapsulating
marker specific logic in its module, not in main.go. In addition it
paves the path for removing the usage of the global metric registry in
the future, by taking a local metric registerer.
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
Move the code for storing and GC'ing alerts from being re-implemented in
several packages to existing in its own package
Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
* inhibit: update inhibition cache when alerts resolve
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* inhibit: remove unnecessary fmt.Sprintf
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* inhibit: add unit tests
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* inhibit: use NopLogger in tests
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Update old alert with result of merge with new
On ingest, alerts with matching fingerprints are
merged if the new alert's start and end times
overlap with the old alert's.
The merge creates a new alert, which is then
updated in the internal alert store.
The original alert is not updated (because merge
creates a copy), so it is never marked as resolved
in the inhibitor's reference to it.
The code within the inhibitor relies on skipping
over resolved alerts, but because the old alert is
never updated it is never marked as resolved. Thus
it continues to inhibit other alerts until it is
cleaned up by the internal GC.
This commit updates the struct of the old alert
with the result of the merge with the new alert.
An alternative would be to always update the
inhibitor's internal cache of alerts regardless of
an alert's resolve status.
Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
* Update inhibitor cache even if alert is resolved
This seems like a better choice than the previous
commit. I think it is more sane to have the
inhibitor update its own cache, rather than having
one of its pointers updated externally.
Signed-off-by: stuart nelson <stuartnelson3@gmail.com>