// Copyright 2016 Prometheus Team
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package inhibit

import (
	"testing"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/common/model"
	"github.com/prometheus/common/promslog"

	"github.com/prometheus/alertmanager/config"
	"github.com/prometheus/alertmanager/pkg/labels"
	"github.com/prometheus/alertmanager/provider"
	"github.com/prometheus/alertmanager/store"
	"github.com/prometheus/alertmanager/types"
)

var nopLogger = promslog.NewNopLogger()
func TestInhibitRuleHasEqual(t *testing.T) {
	t.Parallel()

	now := time.Now()
	cases := []struct {
		initial map[model.Fingerprint]*types.Alert
		equal   model.LabelNames
		input   model.LabelSet
		result  bool
	}{
		{
			// No source alerts at all.
			initial: map[model.Fingerprint]*types.Alert{},
			input:   model.LabelSet{"a": "b"},
			result:  false,
		},
		{
			// No equal labels, so any source alert satisfies the requirement.
			initial: map[model.Fingerprint]*types.Alert{1: {}},
			input:   model.LabelSet{"a": "b"},
			result:  true,
		},
		{
			// Matching but already resolved.
			initial: map[model.Fingerprint]*types.Alert{
				1: {
					Alert: model.Alert{
						Labels:   model.LabelSet{"a": "b", "b": "f"},
						StartsAt: now.Add(-time.Minute),
						EndsAt:   now.Add(-time.Second),
					},
				},
				2: {
					Alert: model.Alert{
						Labels:   model.LabelSet{"a": "b", "b": "c"},
						StartsAt: now.Add(-time.Minute),
						EndsAt:   now.Add(-time.Second),
					},
				},
			},
			equal:  model.LabelNames{"a", "b"},
			input:  model.LabelSet{"a": "b", "b": "c"},
			result: false,
		},
		{
			// Matching and unresolved.
			initial: map[model.Fingerprint]*types.Alert{
				1: {
					Alert: model.Alert{
						Labels:   model.LabelSet{"a": "b", "c": "d"},
						StartsAt: now.Add(-time.Minute),
						EndsAt:   now.Add(-time.Second),
					},
				},
				2: {
					Alert: model.Alert{
						Labels:   model.LabelSet{"a": "b", "c": "f"},
						StartsAt: now.Add(-time.Minute),
						EndsAt:   now.Add(time.Hour),
					},
				},
			},
			equal:  model.LabelNames{"a"},
			input:  model.LabelSet{"a": "b"},
			result: true,
		},
		{
			// Equal label does not match.
			initial: map[model.Fingerprint]*types.Alert{
				1: {
					Alert: model.Alert{
						Labels:   model.LabelSet{"a": "c", "c": "d"},
						StartsAt: now.Add(-time.Minute),
						EndsAt:   now.Add(-time.Second),
					},
				},
				2: {
					Alert: model.Alert{
						Labels:   model.LabelSet{"a": "c", "c": "f"},
						StartsAt: now.Add(-time.Minute),
						EndsAt:   now.Add(-time.Second),
					},
				},
			},
			equal:  model.LabelNames{"a"},
			input:  model.LabelSet{"a": "b"},
			result: false,
		},
	}

	for _, c := range cases {
		r := &InhibitRule{
			Equal:  map[model.LabelName]struct{}{},
			scache: store.NewAlerts(),
		}
		for _, ln := range c.equal {
			r.Equal[ln] = struct{}{}
		}
		for _, v := range c.initial {
			r.scache.Set(v)
		}
		if _, have := r.hasEqual(c.input, false); have != c.result {
			t.Errorf("Unexpected result %t, expected %t", have, c.result)
		}
	}
}

func TestInhibitRuleMatches(t *testing.T) {
	t.Parallel()

	rule1 := config.InhibitRule{
		SourceMatch: map[string]string{"s1": "1"},
		TargetMatch: map[string]string{"t1": "1"},
		Equal:       model.LabelNames{"e"},
	}
	rule2 := config.InhibitRule{
		SourceMatch: map[string]string{"s2": "1"},
		TargetMatch: map[string]string{"t2": "1"},
		Equal:       model.LabelNames{"e"},
	}

	m := types.NewMarker(prometheus.NewRegistry())
	ih := NewInhibitor(nil, []config.InhibitRule{rule1, rule2}, m, nopLogger)
	now := time.Now()
	// Active alert that matches the source filter of rule1.
	sourceAlert1 := &types.Alert{
		Alert: model.Alert{
			Labels:   model.LabelSet{"s1": "1", "t1": "2", "e": "1"},
			StartsAt: now.Add(-time.Minute),
			EndsAt:   now.Add(time.Hour),
		},
	}
	// Active alert that matches the source filter _and_ the target filter of rule2.
	sourceAlert2 := &types.Alert{
		Alert: model.Alert{
			Labels:   model.LabelSet{"s2": "1", "t2": "1", "e": "1"},
			StartsAt: now.Add(-time.Minute),
			EndsAt:   now.Add(time.Hour),
		},
	}

	ih.rules[0].scache = store.NewAlerts()
	ih.rules[0].scache.Set(sourceAlert1)
	ih.rules[1].scache = store.NewAlerts()
	ih.rules[1].scache.Set(sourceAlert2)

	cases := []struct {
		target   model.LabelSet
		expected bool
	}{
		{
			// Matches target filter of rule1, inhibited.
			target:   model.LabelSet{"t1": "1", "e": "1"},
			expected: true,
		},
		{
			// Matches target filter of rule2, inhibited.
			target:   model.LabelSet{"t2": "1", "e": "1"},
			expected: true,
		},
		{
			// Matches target filter of rule1 (plus noise), inhibited.
			target:   model.LabelSet{"t1": "1", "t3": "1", "e": "1"},
			expected: true,
		},
		{
			// Matches target filter of rule1 plus rule2, inhibited.
			target:   model.LabelSet{"t1": "1", "t2": "1", "e": "1"},
			expected: true,
		},
		{
			// Doesn't match target filter, not inhibited.
			target:   model.LabelSet{"t1": "0", "e": "1"},
			expected: false,
		},
		{
			// Matches both source and target filters of rule1,
			// inhibited because sourceAlert1 matches only the
			// source filter of rule1.
			target:   model.LabelSet{"s1": "1", "t1": "1", "e": "1"},
			expected: true,
		},
		{
			// Matches both source and target filters of rule2,
			// not inhibited because sourceAlert2 also matches both
			// the source and target filters of rule2.
target: model.LabelSet{"s2": "1", "t2": "1", "e": "1"},
expected: false,
},
{
// Matches target filter, equal label doesn't match, not inhibited
target: model.LabelSet{"t1": "1", "e": "0"},
expected: false,
},
}
for _, c := range cases {
if actual := ih.Mutes(c.target); actual != c.expected {
t.Errorf("Expected (*Inhibitor).Mutes(%v) to return %t but got %t", c.target, c.expected, actual)
}
}
}
func TestInhibitRuleMatchers(t *testing.T) {
t.Parallel()
rule1 := config.InhibitRule{
SourceMatchers: config.Matchers{&labels.Matcher{Type: labels.MatchEqual, Name: "s1", Value: "1"}},
TargetMatchers: config.Matchers{&labels.Matcher{Type: labels.MatchNotEqual, Name: "t1", Value: "1"}},
Equal: model.LabelNames{"e"},
}
rule2 := config.InhibitRule{
SourceMatchers: config.Matchers{&labels.Matcher{Type: labels.MatchEqual, Name: "s2", Value: "1"}},
TargetMatchers: config.Matchers{&labels.Matcher{Type: labels.MatchEqual, Name: "t2", Value: "1"}},
Equal: model.LabelNames{"e"},
}
m := types.NewMarker(prometheus.NewRegistry())
ih := NewInhibitor(nil, []config.InhibitRule{rule1, rule2}, m, nopLogger)
now := time.Now()
// Active alert that matches the source filter of rule1. With t1 set to "2"
// it also satisfies rule1's negative target matcher (t1 != "1").
sourceAlert1 := &types.Alert{
Alert: model.Alert{
Labels: model.LabelSet{"s1": "1", "t1": "2", "e": "1"},
StartsAt: now.Add(-time.Minute),
EndsAt: now.Add(time.Hour),
},
}
// Active alert that matches the source filter _and_ the target filter of rule2.
sourceAlert2 := &types.Alert{
Alert: model.Alert{
Labels: model.LabelSet{"s2": "1", "t2": "1", "e": "1"},
StartsAt: now.Add(-time.Minute),
EndsAt: now.Add(time.Hour),
},
}
ih.rules[0].scache = store.NewAlerts()
ih.rules[0].scache.Set(sourceAlert1)
ih.rules[1].scache = store.NewAlerts()
ih.rules[1].scache.Set(sourceAlert2)
cases := []struct {
target model.LabelSet
expected bool
}{
{
// Doesn't match target filter of rule1 (t1 is "1", rule1 targets t1 != "1"), not inhibited.
target: model.LabelSet{"t1": "1", "e": "1"},
expected: false,
},
{
// Matches target filter of rule2, inhibited.
target: model.LabelSet{"t2": "1", "e": "1"},
expected: true,
},
{
// Doesn't match target filter of rule1 (plus noise), not inhibited.
target: model.LabelSet{"t1": "1", "t3": "1", "e": "1"},
expected: false,
},
{
// Matches target filter of rule2 but not of rule1, inhibited.
target: model.LabelSet{"t1": "1", "t2": "1", "e": "1"},
expected: true,
},
{
// Matches target filter of rule1 (t1 != "1"), inhibited.
target: model.LabelSet{"t1": "0", "e": "1"},
expected: true,
},
{
// Matches the source filter of rule1 but not its target filter
// (t1 is "1", rule1 targets t1 != "1"), and matches no target
// filter of rule2 either, so not inhibited.
target: model.LabelSet{"s1": "1", "t1": "1", "e": "1"},
expected: false,
},
{
// Matches both source and target filters of rule2, so sourceAlert2
// (which also matches both) cannot inhibit it via rule2. Still
// inhibited via rule1: the absent t1 label satisfies t1 != "1" and
// sourceAlert1 shares the equal label.
target: model.LabelSet{"s2": "1", "t2": "1", "e": "1"},
expected: true,
},
{
// Matches no target filter (t1 is "1", t2 is absent) and the equal label differs, not inhibited.
target: model.LabelSet{"t1": "1", "e": "0"},
expected: false,
},
}
for _, c := range cases {
if actual := ih.Mutes(c.target); actual != c.expected {
t.Errorf("Expected (*Inhibitor).Mutes(%v) to return %t but got %t", c.target, c.expected, actual)
}
}
}
type fakeAlerts struct {
alerts []*types.Alert
finished chan struct{}
}
func newFakeAlerts(alerts []*types.Alert) *fakeAlerts {
return &fakeAlerts{
alerts: alerts,
finished: make(chan struct{}),
}
}
func (f *fakeAlerts) GetPending() provider.AlertIterator { return nil }
func (f *fakeAlerts) Get(model.Fingerprint) (*types.Alert, error) { return nil, nil }
func (f *fakeAlerts) Put(...*types.Alert) error { return nil }
func (f *fakeAlerts) Subscribe() provider.AlertIterator {
ch := make(chan *types.Alert)
done := make(chan struct{})
go func() {
for _, a := range f.alerts {
ch <- a
}
// Send another (meaningless) alert to make sure that the inhibitor has
// processed everything.
ch <- &types.Alert{
Alert: model.Alert{
Labels: model.LabelSet{},
StartsAt: time.Now(),
},
}
close(f.finished)
<-done
}()
return provider.NewAlertIterator(ch, done, nil)
}
func TestInhibit(t *testing.T) {
t.Parallel()
now := time.Now()
inhibitRule := func() config.InhibitRule {
return config.InhibitRule{
SourceMatch: map[string]string{"s": "1"},
TargetMatch: map[string]string{"t": "1"},
Equal: model.LabelNames{"e"},
}
}
// alertOne is muted by alertTwo while alertTwo is active.
alertOne := func() *types.Alert {
return &types.Alert{
Alert: model.Alert{
Labels: model.LabelSet{"t": "1", "e": "f"},
StartsAt: now.Add(-time.Minute),
EndsAt: now.Add(time.Hour),
},
}
}
alertTwo := func(resolved bool) *types.Alert {
var end time.Time
if resolved {
end = now.Add(-time.Second)
} else {
end = now.Add(time.Hour)
}
return &types.Alert{
Alert: model.Alert{
Labels: model.LabelSet{"s": "1", "e": "f"},
StartsAt: now.Add(-time.Minute),
EndsAt: end,
},
}
}
type exp struct {
lbls model.LabelSet
muted bool
}
for i, tc := range []struct {
alerts []*types.Alert
expected []exp
}{
{
// alertOne shouldn't be muted since alertTwo hasn't fired.
alerts: []*types.Alert{alertOne()},
expected: []exp{
{
lbls: model.LabelSet{"t": "1", "e": "f"},
muted: false,
},
},
},
{
// alertOne should be muted by alertTwo which is active.
alerts: []*types.Alert{alertOne(), alertTwo(false)},
expected: []exp{
{
lbls: model.LabelSet{"t": "1", "e": "f"},
muted: true,
},
{
lbls: model.LabelSet{"s": "1", "e": "f"},
muted: false,
},
},
},
{
// alertOne shouldn't be muted since alertTwo is resolved.
alerts: []*types.Alert{alertOne(), alertTwo(false), alertTwo(true)},
expected: []exp{
{
lbls: model.LabelSet{"t": "1", "e": "f"},
muted: false,
},
{
lbls: model.LabelSet{"s": "1", "e": "f"},
muted: false,
},
},
},
} {
ap := newFakeAlerts(tc.alerts)
mk := types.NewMarker(prometheus.NewRegistry())
inhibitor := NewInhibitor(ap, []config.InhibitRule{inhibitRule()}, mk, nopLogger)
go func() {
// Block until the fake provider has delivered every alert.
<-ap.finished
inhibitor.Stop()
}()
inhibitor.Run()
for _, expected := range tc.expected {
if inhibitor.Mutes(expected.lbls) != expected.muted {
mute := "unmuted"
if expected.muted {
mute = "muted"
}
t.Errorf("tc: %d, expected alert with labels %q to be %s", i, expected.lbls, mute)
}
}
}
}