Sync Makefile.common to latest which updates promu version
and adds license check to default target.
Add missing license headers.
Signed-off-by: Paul Gier <pgier@redhat.com>
* api/v2: sort silences similarly to v1 api
Sort the queried silences to match behaviour in the v1 api.
Sort silences in-place instead of creating multiple slices.
Use separate function for sorting silences for easier testing.
Add unit test for sort order.
Signed-off-by: Paul Gier <pgier@redhat.com>
* try a more complicated but clearer approach explicitly returning a no-auth stmp.Auth when no username is supplied in config
Signed-off-by: Jo Walsh <jowalsh@bgs.ac.uk>
* fix test to expect no error from auth if username is not supplied
Signed-off-by: Jo Walsh <jowalsh@bgs.ac.uk>
* clean up some formatting errors in surplus comments
Signed-off-by: Jo Walsh <jowalsh@bgs.ac.uk>
* keep noAuth / loginAuth functions all together
Signed-off-by: Jo Walsh <jowalsh@bgs.ac.uk>
* Address latest comments
Co-Authored-By: Jo Walsh <jowalsh@bgs.ac.uk>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Even if an alert is already silenced and/or muted, I still want to
know that my newly created silence will affect it.
Signed-off-by: beorn7 <beorn@soundcloud.com>
Without the `-u`, it will load what's required in the `go.mod`
file. But with the `-u`, it will load new stuff once it's available,
which makes the build non-reproducible. (Without any change in the
`errcheck` repo, other things will happen.)
Signed-off-by: beorn7 <beorn@soundcloud.com>
Essentially, the Silences.Expire() will in that case have no effect
because the affected silence is immediately seen as expired from the
storage and thus not updated. The silence will stay around in its old
state.
This fix makes sure to use the same “now” throughout the expiration
process.
Signed-off-by: beorn7 <beorn@soundcloud.com>
Add version tracking of silences states. Adding a silence to the state
increments the version. If the version hasn't changed since the last
time an alert was checked for being silenced, we only have to verify
that the relevant silences are still active rather than checking the
alert against all silences.
Signed-off-by: beorn7 <beorn@soundcloud.com>
This encapsulates the logic of querying and marking silenced
alerts. It removes the code duplication flagged earlier.
I removed the error returned by the setAlertStatus function as we were
only logging it, and that's already done anyway when the error is
received from the `silence.Query` call (now in the `Mutes` method).
Signed-off-by: beorn7 <beorn@soundcloud.com>
This changes removes all usage of golang.org/x/net/context in the code
base. It also bumps a few dependencies for the same reason:
- github.com/gogo/protobuf
- go-openapi/*
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
This clarifies a bunch of things I have run into during code reading
in preparation for some performance improvements around muting.
It also moves doc comments from places where they don't show up in
godoc to visible places.
It also fixes golint warnings.
Signed-off-by: beorn7 <beorn@soundcloud.com>
This has been discussed in #666 (issue of hell...).
As concluded there, the cleanest semantics is most likely the
following: "An alert that matches both target and source side cannot
inhibit alerts for which the same is true." The two open questions
were:
1. How difficult is the implementation?
2. Is it needed?
This relatively simple commit proves that the answer to (1) is: Not
very difficult. (This also includes a performance-improving
simplification, which would have been possible without a change of
semantics.)
The answer to (2) is twofold:
For one, the original use case in #666 wasn't solved by our interim
solution. What we solved is the case where the self-inhibition is
triggered by a wide target match, i.e. I have a specific alert that
should inhibit a whole group of target alerts without inhibiting
itself. What we did _not_ solve is the inverted case: Self-inhibition
by a wide source match, i.e. an alert that should only fire if none of
a whole group of source alert fires. I mean, we "fixed" it as in, the
target alert will never be inhibited, but @lmb in #666 wanted the
alert to be inhibited _sometimes_ (just not _always_).
The other part is that I think that the asymmetry in our interim
solution will at some point haunt us. Thus, I really would like to get
this change in before we do a 1.0 release.
In practice, I expect this to be only relevant in very rare cases. But
those cases will be most difficult to reason with, and I claim that
the solution in this commit is matching what humans intuitively
expect.
Signed-off-by: beorn7 <beorn@soundcloud.com>
Instead of handling all config specific logic inside
Alertmangaer.main(), this patch introduces the config coordinator
component.
Tasks of the config coordinator:
- Load and parse configuration
- Notify subscribers on configuration changes
- Register and manage configuration specific metrics
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
In similar vein to prometheus/prometheus/pkg/relabel/relabel.go, extend
Regexp to include the original regular expression string to faithfully
output what was read.
Update TestEmptyFieldsAndRegex.
Fixes: #1753
Signed-off-by: Miek Gieben <miek@miek.nl>
Most importantly, `api.New` now takes an `Options` struct as an
argument, which allows some other things done here as well:
- Timout and concurrency limit are now in the options, streamlining
the registration and the implementation of the limiting middleware.
- A local registry is used for metrics, and the metrics used so far
inside any of the api packages are using it now.
The 'in flight' metric now contains the 'get' as a method label. I
have also added a TODO to instrument other methods in the same way
(otherwise, the label doesn't reall make sense, semantically). I have
also added an explicit error counter for requests rejected because of
the concurrency limit. (They also show up as 503s in the generic HTTP
instrumentation (or they would, if v2 were instrumented, too), but
those 503s might have a number of reasons, while users might want to
alert on concurrency limit problems explicitly).
Signed-off-by: beorn7 <beorn@soundcloud.com>
While the newly added in-flight instrumentation works for all GET
requests, the existing HTTP instrumentation omits api/v2 calls. This
commit adds a TODO note about that.
Signed-off-by: beorn7 <beorn@soundcloud.com>
The context is created by the http.TimeoutHandler we use to set the
timeout.
I believe this is the only endpoint where propagating the timeout is
feasible and needed.
Signed-off-by: beorn7 <beorn@soundcloud.com>
The default concurrency limit is max(GOMAXPROCS, 8). That should not
imply that each GET requests eats a whole CPU. It's more to get some
reasonable heuristics for the processing power of the hosting machine
(while allowing at least 8 concurrent requests even on the smallest
machines). As GET requests can easily overload the Alertmanager,
rendering it incapable of doing its main task, namely sending alert
notifications, we need to limit GET requests by default.
In contrast, no timeout is set by default. The http.TimeoutHandler
inovkes quite a bit of machinery behind the scenes, in particular an
additional layer of buffering. Thus, we should first get a bit of
experience with it before we consider enforcing a timeout by default,
even if setting a timeout is in general the safer setting for
resiliency.
Signed-off-by: beorn7 <beorn@soundcloud.com>