Adds a job which runs periodically and refreshes cluster.peer dns records.
The problem is that when you restart all of the alertmanager instances in an environment like Kubernetes, DNS may contain old alertmanager instance IPs, but on startup (when Join() happens) none of the new instance IPs. As at the start DNS is not empty resolvePeers waitIfEmpty=true, will return and "islands" of 1 alertmanager instances will form.
Signed-off-by: Povilas Versockas <p.versockas@gmail.com>
* [silences] Don't merge expired silences
If they're expired, they should be cleaned up on
the next GC cycle, but merging them in means that
they'll probably be gossip'd continually between
the cluster members.
Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
* Add analogous behavior+test for nflog
The code for nflog was also constantly re-adding
nflogs to the internal memory store, the same as
the silence code was.
Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
* Add retention to TestQuery
With the default 0 retention, the alerts would not
be merged.
Signed-off-by: Stuart Nelson <stuartnelson3@gmail.com>
Instead of having one general silence, differentiate between postable
and gettable silence, hence making more fields required.
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
With the previous patch /status and /silences were requested from api
v2. Requesting alerts from api v1 is done in a separate commit to be
able to revert it once alerts also come from api v2.
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
This patch makes the Alertmanager UI (/status & /silences) use the
api/v2 endpoint. In addition it adds logic to generate the elm side data
model based on the OpenAPI specification.
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
Support `w` duration when making silences. Remove
`y` as a supported duration, silences are for
short term notification prevention.
Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
Alert merging assumed that EndsAt would always be empty for firing
alerts. This is no longer true starting with Prometheus v2.4.0: EndsAt
is set to a multiple of the evaluation interval or resend interval
(whichever is the largest). This change updates the merging logic to
support both cases.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Instead of only testing single instance Alertmanagers, this patch
enables individual tests to spin up Alertmanager clusters.
In addition it adds two tests:
1. A test firing alerts against a cluster, expecting to only receive a a
notification by one of the Alertmanager instances in the cluster.
2. A test firing alerts both against a single instance as well as a
cluster, making sure the output equals.
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
When creating a `NewClient`, pass only the hostname
as value for the `host` parameter, instead of `n.conf.Smarthost`,
which is hostname port.
This is the same as `smtp.Dial` does, which is called in
the else-branch.
Signed-off-by: Claudio Kressibucher <ckressibucher@graphicworks.ch>
alertmanager_notifications_total is increased only on
successful notifications, and not on any attempted
notification as the current description points
Signed-off-by: Chris Mark <chrismarkou92@gmail.com>
The documentation stated that the `alertname=`
part of a matcher could be dropped and it would be
assumed that the first value was the alertname.
This behavior was only honored if there was a
single matcher, but failed if there were multiple.
Any time we have one or more matchers, check to
see if the first matcher fails parsing. If so,
assume it's intended to be used as the alertname,
and append that value to the matcher.
Signed-off-by: Stuart Nelson <stuartnelson3@gmail.com>
This fixes the ineffectual assignment for the default GC interval when specifiying a duration of 0
Signed-off-by: Jannick Fahlbusch <git@jf-projects.de>
* Add support for images and links in the PagerDuty notification config
This only works when using the v2 API. It supports template values for all fields.
Signed-off-by: Sean Houghton <sean.houghton@activision.com>
* Migrate from go-bindata to vfsgen
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Ensure idempotent generation for vfsgen
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* asset: update generated files
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Fix asset paths for Windows platforms
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Regenerate assets
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Add vfs wrapper that returns constant mod time
This is identical to what we had with go-bindata and avoids the extra
step of storing the identity of the complete file system in another
location.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Additional cleanup
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Use a button instead of a for buttons
Using <a> was causing an unintended redirect to
the root. It appears this behavior has changed in
0.19.
Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
* bindata
Signed-off-by: stuart nelson <stuartnelson3@gmail.com>