* Alert metric reports different results to what the user sees via API
Fixes #1439 and #2619.
The previous metric is not _technically_ reporting incorrect results, as the alerts _are_ still around and will be re-used if that same alert (equal fingerprint) is received before it is GCed. Therefore, I have kept the old metric under a new name, `alertmanager_marked_alerts`, and repurposed the current metric to match what the user sees in the UI.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
The function value and parameters of a defer statement are evaluated
immediately, so this "disp" value is always nil. Calling Stop() on a nil
dispatcher is a no-op, so the deferred call does nothing. Wrapping it in a
closure that refers to "disp" fixes it.
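A minimal sketch of the behaviour described above. The `dispatcher` type here is a hypothetical stand-in, not the real one; it only shows that the receiver of a deferred method call is captured when the defer statement executes, while a closure reads the variable when it runs.

```go
package main

import "fmt"

// dispatcher is a hypothetical stand-in for the real dispatcher type.
type dispatcher struct{ stopped bool }

// Stop is a no-op on a nil receiver, as described in the commit message.
func (d *dispatcher) Stop() {
	if d == nil {
		return
	}
	d.stopped = true
}

// runBroken defers disp.Stop() while disp is still nil. The receiver is
// evaluated at the defer statement, so the later assignment is never seen.
func runBroken() *dispatcher {
	var disp *dispatcher
	defer disp.Stop()
	disp = &dispatcher{}
	return disp
}

// runFixed wraps the call in a closure, so disp is read when the deferred
// function actually runs, after the assignment.
func runFixed() *dispatcher {
	var disp *dispatcher
	defer func() { disp.Stop() }()
	disp = &dispatcher{}
	return disp
}

func main() {
	fmt.Println("broken stopped:", runBroken().stopped) // prints "broken stopped: false"
	fmt.Println("fixed stopped:", runFixed().stopped)   // prints "fixed stopped: true"
}
```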
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Add CLI args for snapshot intervals
Signed-off-by: sed-i <82407168+sed-i@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: sed-i <82407168+sed-i@users.noreply.github.com>
* use same flag for silences and nflogs intervals
Signed-off-by: sed-i <82407168+sed-i@users.noreply.github.com>
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
* add active time interval
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* fix active time interval
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* fix unittests for active time interval
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update notify/notify.go
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update dispatch/route.go
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* split the stage for active and mute intervals
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update notify/notify.go
Adds doc for a helper function
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update notify/notify.go
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update notify/notify.go
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update notify/notify.go
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* fix code after commit suggestions
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Allow mute_time_interval and time_intervals to coexist in the config
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* docs: configuration's doc has been updated about time intervals
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update config/config.go
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update docs/configuration.md
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update docs/configuration.md
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update docs/configuration.md
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update docs/configuration.md
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* updates configuration readme to improve active time description
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* merge deprecated mute_time_intervals and time_intervals
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update docs/configuration.md
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update docs/configuration.md
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update docs/configuration.md
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update docs/configuration.md
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update docs/configuration.md
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update docs/configuration.md
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update cmd/alertmanager/main.go
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update cmd/alertmanager/main.go
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* fmt main.go
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update docs/configuration.md
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update docs/configuration.md
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* fix lint error
Signed-off-by: clyang82 <chuyang@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Document that matchers are ANDed together
Signed-off-by: Mac Chaffee <me@macchaffee.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Remove extra parentheticals
Signed-off-by: Mac Chaffee <me@macchaffee.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* config: root route should have empty matchers
Unmarshal should validate that the root route does
not contain any matchers. Prior to this change,
only the deprecated match structures were checked.
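A rough sketch of the kind of check this describes. The `route` struct and `validateRootRoute` function are hypothetical simplifications for illustration and are not the actual config types: the point is that both the deprecated match fields and the newer matchers are checked.

```go
package main

import (
	"errors"
	"fmt"
)

// route is a hypothetical, reduced stand-in for a routing tree node.
type route struct {
	Match    map[string]string // deprecated equality matchers
	MatchRE  map[string]string // deprecated regex matchers
	Matchers []string          // current matchers, kept as raw strings here
}

// validateRootRoute rejects a root route that carries any kind of matcher,
// not only the deprecated ones.
func validateRootRoute(r route) error {
	if len(r.Match) > 0 || len(r.MatchRE) > 0 || len(r.Matchers) > 0 {
		return errors.New("root route must not have any matchers")
	}
	return nil
}

func main() {
	fmt.Println(validateRootRoute(route{}))
	fmt.Println(validateRootRoute(route{Matchers: []string{`severity="critical"`}}))
}
```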
Signed-off-by: Philip Gough <philip.p.gough@gmail.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* chore: Let git ignore temporary files for ui/app
Signed-off-by: nekketsuuu <nekketsuuu@users.noreply.github.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* adding max_alerts parameter to slack webhook config
correcting the logic to truncate fields instead of dropping alerts in the slack integration
Signed-off-by: Prashant Balachandran <pnair@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* *: bump to Go 1.17 (#2792)
* *: bump to Go 1.17
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* *: fix yamllint errors
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Automate CSS-inlining for default HTML email template (#2798)
* Automate CSS-inlining for default HTML email template
The original HTML email template was added in `template/email.html`.
It looks like the CSS was manually inlined. Most likely using the
premailer.dialect.ca web form, which is mentioned in the README for
the Mailgun transactional-email-templates project. The resulting HTML
with inlined CSS was then copied into `template/default.tmpl`. This
has resulted in `email.html` and `default.tmpl` diverging at times.
This commit adds build automation to inline the CSS automatically
using [juice][1]. The Go template containing the resulting HTML has
been moved into its own file to avoid the script that performs the CSS
inlining having to parse the `default.tmpl` file to insert it there.
Fixes #1939.
[1]: https://www.npmjs.com/package/juice
Signed-off-by: Brad Ison <bison@xvdf.io>
* Update asset/assets_vfsdata.go
Signed-off-by: Brad Ison <bison@xvdf.io>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* go.{mod,sum}: update Go dependencies
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* amtool to support http_config to access alertmanager (#2764)
* Support http_config for amtool
Co-authored-by: Julien Pivotto <roidelapluie@gmail.com>
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: clyang82 <chuyang@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* notify/sns: detect FIFO topic based on the rendered value
Since the TopicARN field is a template string, it's safer to check for
the ".fifo" suffix in the rendered string.
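To illustrate why the rendered value matters, here is a small sketch using the standard `text/template` package. The `isFIFOTopic` helper, the example ARN, and the `.Name` field are assumptions made for this example; the real notifier uses its own templating code.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
	"text/template"
)

// isFIFOTopic renders the (possibly templated) topic ARN first and only then
// checks for the ".fifo" suffix. Checking the raw string would miss ARNs whose
// suffix is produced by the template.
func isFIFOTopic(topicARN string, data interface{}) (bool, error) {
	t, err := template.New("arn").Parse(topicARN)
	if err != nil {
		return false, err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		return false, err
	}
	return strings.HasSuffix(buf.String(), ".fifo"), nil
}

func main() {
	raw := "arn:aws:sns:us-east-1:123456789012:{{ .Name }}"
	data := map[string]string{"Name": "alerts.fifo"}

	fmt.Println("raw has suffix:", strings.HasSuffix(raw, ".fifo"))
	fifo, err := isFIFOTopic(raw, data)
	fmt.Println("rendered is FIFO:", fifo, err)
}
```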
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* config: delegate Sigv4 validation to the inner type
This change also adds unit tests for SNS configuration.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* fix unittests
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* fix comment about active time interval
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* fix another comment about active time interval
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update docs/configuration.md
Fix typo in documentation
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
* Update docs/configuration.md
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Sinuhe Tellez <dubyte@gmail.com>
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Co-authored-by: clyang82 <chuyang@redhat.com>
Co-authored-by: Mac Chaffee <me@macchaffee.com>
Co-authored-by: Philip Gough <philip.p.gough@gmail.com>
Co-authored-by: nekketsuuu <nekketsuuu@users.noreply.github.com>
Co-authored-by: Prashant Balachandran <pnair@redhat.com>
Co-authored-by: Simon Pasquier <pasquier.simon@gmail.com>
Co-authored-by: Brad Ison <brad.ison@redhat.com>
Co-authored-by: Julien Pivotto <roidelapluie@gmail.com>
* Add feature flag to enable discovery and use of public IP address for clustering.
Before this change, Alertmanager would refuse to start up if it used an
advertise address binding to any address (0.0.0.0) and the host only
had an interface with a public IP address. After this change, a feature
flag permits the use of a discovered public address for cluster
gossiping.
Signed-off-by: Devin Trejo <dtrejo@palantir.com>
* Enable support for custom callbacks as part of maintenance
This enables support for custom Maintenance callbacks as part of the periodic maintenance of silences and notification logs.
This is effectively a no-op for the Alertmanager, but it allows downstream implementations to inject custom logic as part of it.
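One way such a hook could look, as a sketch only. The names `maintenanceFunc`, `chooseMaintenance`, and `runMaintenance` are hypothetical and do not reflect the actual signatures: a nil override keeps the default behaviour, while a non-nil one replaces it on every tick.

```go
package main

import (
	"fmt"
	"time"
)

// maintenanceFunc is a hypothetical callback type for one maintenance pass.
type maintenanceFunc func() error

// chooseMaintenance returns the override when one is supplied and the
// built-in default otherwise, so standalone behaviour is unchanged.
func chooseMaintenance(def, override maintenanceFunc) maintenanceFunc {
	if override != nil {
		return override
	}
	return def
}

// runMaintenance calls fn on every tick until stopc is closed.
func runMaintenance(interval time.Duration, stopc <-chan struct{}, fn maintenanceFunc) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-stopc:
			return
		case <-t.C:
			if err := fn(); err != nil {
				fmt.Println("maintenance failed:", err)
			}
		}
	}
}

func main() {
	def := func() error { return nil }
	ran := make(chan struct{}, 1)
	custom := func() error {
		select {
		case ran <- struct{}{}: // signal the first run, never block afterwards
		default:
		}
		return nil
	}

	stopc := make(chan struct{})
	done := make(chan struct{})
	go func() {
		runMaintenance(10*time.Millisecond, stopc, chooseMaintenance(def, custom))
		close(done)
	}()

	<-ran
	close(stopc)
	<-done
	fmt.Println("custom maintenance ran at least once")
}
```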
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Add tests
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Fix tests and remove whitespace
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Address review comments
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* run go fmt
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Fix import ordering
Signed-off-by: gotjosh <josue.abreu@gmail.com>
Limits are not used in standalone alertmanager.
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
* notify: don't use the global metrics registry
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Address Max's comment
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* cmd/alertmanager: reject invalid external URLs
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Address Brian's comments
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Simplify the code according to Max's feedback
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Instead of keeping all notifiers in the notify package, it splits them
into individual sub-packages. This improves readability and
maintainability of the code.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
This encapsulates the logic of querying and marking silenced
alerts. It removes the code duplication flagged earlier.
I removed the error returned by the setAlertStatus function as we were
only logging it, and that's already done anyway when the error is
received from the `silence.Query` call (now in the `Mutes` method).
Signed-off-by: beorn7 <beorn@soundcloud.com>
Instead of handling all config specific logic inside
Alertmanager.main(), this patch introduces the config coordinator
component.
Tasks of the config coordinator:
- Load and parse configuration
- Notify subscribers on configuration changes
- Register and manage configuration specific metrics
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
Most importantly, `api.New` now takes an `Options` struct as an
argument, which allows some other things done here as well:
- Timeout and concurrency limit are now in the options, streamlining
the registration and the implementation of the limiting middleware.
- A local registry is used for metrics, and the metrics used so far
inside any of the api packages are using it now.
The 'in flight' metric now contains the 'get' as a method label. I
have also added a TODO to instrument other methods in the same way
(otherwise, the label doesn't really make sense, semantically). I have
also added an explicit error counter for requests rejected because of
the concurrency limit. (They also show up as 503s in the generic HTTP
instrumentation (or they would, if v2 were instrumented, too), but
those 503s might have a number of reasons, while users might want to
alert on concurrency limit problems explicitly).
Signed-off-by: beorn7 <beorn@soundcloud.com>
While the newly added in-flight instrumentation works for all GET
requests, the existing HTTP instrumentation omits api/v2 calls. This
commit adds a TODO note about that.
Signed-off-by: beorn7 <beorn@soundcloud.com>
The default concurrency limit is max(GOMAXPROCS, 8). That should not
imply that each GET request eats a whole CPU. It's more to get some
reasonable heuristics for the processing power of the hosting machine
(while allowing at least 8 concurrent requests even on the smallest
machines). As GET requests can easily overload the Alertmanager,
rendering it incapable of doing its main task, namely sending alert
notifications, we need to limit GET requests by default.
In contrast, no timeout is set by default. The http.TimeoutHandler
invokes quite a bit of machinery behind the scenes, in particular an
additional layer of buffering. Thus, we should first get a bit of
experience with it before we consider enforcing a timeout by default,
even if setting a timeout is in general the safer setting for
resiliency.
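As an illustration of the default described above, the following sketch computes max(GOMAXPROCS, 8) and applies it through a semaphore-based middleware that rejects excess GET requests with a 503. The names `defaultConcurrency`, `limitGET`, and `nestedStatuses` are hypothetical; this is not the actual middleware, and no timeout handling is included.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"runtime"
)

// defaultConcurrency returns max(GOMAXPROCS, 8).
func defaultConcurrency() int {
	n := runtime.GOMAXPROCS(0)
	if n < 8 {
		n = 8
	}
	return n
}

// limitGET lets at most limit GET requests run concurrently. Further GET
// requests are rejected immediately with 503; other methods pass through.
func limitGET(limit int, next http.Handler) http.Handler {
	sem := make(chan struct{}, limit)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			next.ServeHTTP(w, r)
			return
		}
		select {
		case sem <- struct{}{}:
			defer func() { <-sem }()
			next.ServeHTTP(w, r)
		default:
			http.Error(w, "concurrency limit exceeded", http.StatusServiceUnavailable)
		}
	})
}

// nestedStatuses issues a second GET while the first is still in flight,
// with a limit of 1, and returns both status codes.
func nestedStatuses() (outer, inner int) {
	var h http.Handler
	h = limitGET(1, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/outer" {
			rec := httptest.NewRecorder()
			h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/inner", nil))
			inner = rec.Code
		}
		w.WriteHeader(http.StatusOK)
	}))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/outer", nil))
	return rec.Code, inner
}

func main() {
	fmt.Println("default limit is at least 8:", defaultConcurrency() >= 8)
	outer, inner := nestedStatuses()
	fmt.Println("outer:", outer, "inner:", inner) // prints "outer: 200 inner: 503"
}
```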
Signed-off-by: beorn7 <beorn@soundcloud.com>
Instead of registering marker metrics inside of
cmd/alertmanager/main.go, register them in types/types.go, encapsulating
marker specific logic in its module, not in main.go. In addition it
paves the path for removing the usage of the global metric registry in
the future, by taking a local metric registerer.
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
Instead of cmd/alertmanager/main.go instantiating and starting both api
v1 and v2, delegate that work to a generic api combining the two.
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
The current Alertmanager API v1 is undocumented and written by hand.
This patch introduces a new Alertmanager API - v2. The API is fully
generated via an OpenAPI 2.0 [1] specification (see
`api/v2/openapi.yaml`) with the exception of the HTTP handlers themselves.
Pros:
- Generated server code
- Ability to generate clients in all major languages
(Go, Java, JS, Python, Ruby, Haskell, *elm* [3] ...)
- Strict contract (OpenAPI spec) between server and clients.
- Instant feedback on frontend-breaking changes, due to strictly
typed frontend language elm.
- Generated documentation (See Alertmanager online Swagger UI [4])
Cons:
- Dependency on OpenAPI ecosystem including go-swagger [2]
In addition this patch includes the following changes.
- README.md: Add API section
- test: Duplicate acceptance test to API v1 & API v2 version
The Alertmanager acceptance test framework has decent test coverage
on the Alertmanager API. Introducing the Alertmanager API v2 does not go
hand in hand with deprecating API v1. They should live alongside each
other for a couple of minor Alertmanager versions.
Instead of porting the acceptance test framework to use the new API v2,
this patch duplicates the acceptance tests, one using the API v1, the
other API v2.
Once API v1 is removed we can simply remove `test/with_api_v1` and bring
`test/with_api_v2` to `test/`.
[1]
https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md
[2] https://github.com/go-swagger/go-swagger/
[3] https://github.com/ahultgren/swagger-elm
[4]
http://petstore.swagger.io/?url=https://raw.githubusercontent.com/mxinden/alertmanager/apiv2/api/v2/openapi.yaml
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
Move the code for storing and GC'ing alerts, previously re-implemented in
several packages, into its own package.
Signed-off-by: stuart nelson <stuartnelson3@gmail.com>