124 Commits

Author SHA1 Message Date
QuentinBisson
4aea4560ce
Fix flapping acceptance test
Signed-off-by: QuentinBisson <quentin@giantswarm.io>
2021-04-28 15:04:37 +02:00
Ben Kochie
53535551f5
Fix up golangci-lint errors.
Signed-off-by: Ben Kochie <superq@gmail.com>
2021-03-16 10:43:45 +01:00
Kiril Vladimirov
7320d83cbc Replace types.Matcher(s)? with labels.Matcher(s)?
Signed-off-by: Kiril Vladimirov <kiril@vladimiroff.org>
2021-01-22 17:02:48 +02:00
Josh Soref
0f2c65d265 Spelling (#2167)
* spelling: inhibition

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: matchers

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: notification

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: nonexistent

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: obfuscated

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: occurred

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: relevant

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: unexpected

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: marshaled

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: marshaling

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-01-23 17:06:16 +01:00
Ilya Gladyshev
196c62f488 At least one non-empty silence matcher (#2081)
* check if at least one silence matcher doesn't match empty strings

Signed-off-by: qoops <ilya.v.gladyshev@gmail.com>

* fixed grammar

Signed-off-by: qoops <ilya.v.gladyshev@gmail.com>
2019-10-31 15:42:03 +01:00
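The check described in the commit above (at least one silence matcher must not match the empty string) could be sketched as follows. The `Matcher` type and the `validateMatchers` function are hypothetical simplifications made up for this illustration; they are not the actual Alertmanager types.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// Matcher is a hypothetical, simplified silence matcher
// (not the real Alertmanager type).
type Matcher struct {
	Name    string
	Value   string
	IsRegex bool
}

// matchesEmpty reports whether the matcher accepts an empty label value.
func (m Matcher) matchesEmpty() (bool, error) {
	if !m.IsRegex {
		return m.Value == "", nil
	}
	// Anchor the pattern so that it has to match the whole (empty) value.
	re, err := regexp.Compile("^(?:" + m.Value + ")$")
	if err != nil {
		return false, err
	}
	return re.MatchString(""), nil
}

// validateMatchers rejects a silence whose matchers all match the empty string.
func validateMatchers(ms []Matcher) error {
	for _, m := range ms {
		empty, err := m.matchesEmpty()
		if err != nil {
			return fmt.Errorf("invalid matcher %q: %w", m.Name, err)
		}
		if !empty {
			return nil // One non-empty matcher is enough.
		}
	}
	return errors.New("at least one matcher must not match the empty string")
}

func main() {
	onlyEmpty := []Matcher{{Name: "job", Value: ".*", IsRegex: true}}
	mixed := append(onlyEmpty, Matcher{Name: "env", Value: "prod"})
	fmt.Println("only empty-matching:", validateMatchers(onlyEmpty))
	fmt.Println("mixed:", validateMatchers(mixed))
}
```

A silence that only contains matchers such as `job=~".*"` would apply to every alert, which is presumably what the check is meant to prevent.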
stuart nelson
a74758e4c7
Merge pull request #1830 from pgier/amtool-tests
test/cli: add basic amtool cli tests
2019-07-24 17:41:10 +02:00
Paul Gier
588e1e3f9f test/cli: add periods to comment sentences and import ordering
Signed-off-by: Paul Gier <pgier@redhat.com>
2019-06-20 10:37:59 -05:00
Simon Pasquier
1207b90029 test/with_api_v2: remove calls to the v1 API
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-06-20 17:37:08 +02:00
Paul Gier
50ad3114a0 test/cli: add basic tests for amtool cli
Signed-off-by: Paul Gier <pgier@redhat.com>
2019-06-19 17:33:35 -05:00
Simon Pasquier
0c3120efac *: split notify package
Instead of keeping all notifiers in the notify package, it splits them
into individual sub-packages. This improves readability and
maintainability of the code.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-06-18 15:36:19 +02:00
Simon Pasquier
c20873b1fe
test/with_api_v2: fix variable shadowing (#1889)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-05-15 10:49:59 +02:00
stuart nelson
6749f9faa9
Merge pull request #1871 from johncming/hotfix/dup-close
test/with_api_v1: delete duplicate close of http body.
2019-05-06 12:41:29 +02:00
johncming
1c38a90eeb test/with_api_v1: delete duplicate close of http body.
Signed-off-by: johncming <johncming@yahoo.com>
2019-05-05 08:40:18 +08:00
stuart nelson
1cc6c6f79c Move alert endpoints filter parsing to single function
They are exactly the same, no reason to duplicate.

Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
2019-04-30 10:59:17 +02:00
Paul Gier
8688c7b9ad api/v2: move generated client code from test to api/v2 (#1792)
- Move the generated api/v2 client code out of the test directory
and into the api/v2 directory with models and restapi.
- Remove duplicate models directory
- Update tests to use api/v2 package for models and client

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-03-12 17:11:23 +01:00
Karsten Weiss
c637ca1a6e Fix typos in comments and metric HELPs (#1790)
No functional change.

Signed-off-by: Karsten Weiss <knweiss@gmail.com>
2019-03-12 10:29:26 +01:00
Paul Gier
458f1d646b Makefile improvements
- make clean shouldn't print errors when files/directories have already
been removed
- add copyright header to generated api files to pass license check

Signed-off-by: Paul Gier <pgier@redhat.com>
2019-03-11 10:45:45 -05:00
Simon Pasquier
bc373f562f *: fix filter parameters with comma
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-03-08 09:56:05 +01:00
Simon Pasquier
c7de536129
*: use stdlib context (#1768)
This change removes all usage of golang.org/x/net/context in the code
base. It also bumps a few dependencies for the same reason:
- github.com/gogo/protobuf
- go-openapi/*

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-26 12:18:57 +01:00
stuart nelson
51eebbef85
Stn/correctly mark api silences (#1733)
* Update alert status on every GET to alerts

Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
2019-02-18 17:06:51 +01:00
Max Leonard Inden
8e157b3af5
api/v2: Make cluster status peers and name optional
If a user chooses to disable the Alertmanager cluster feature, there is
neither a cluster name nor cluster peers. Hence these should be optional. Only
cluster status is set to "disabled".

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2019-02-04 11:40:30 +01:00
Max Leonard Inden
2f055d9966
api/v2: Do not populate cluster info if clustering is disabled
When users start Alertmanager with `--cluster.listen-address=`, the
cluster will not be initialized, hence api.peer will be `nil`. So far
this would result in a nil pointer dereference by the API v2 accessing
the api.peer field.

With this patch, api v2 skips populating the peers array, sets the name
to an empty string and the status to "disabled" in case `api.peer` is
nil.

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2019-01-31 16:56:59 +01:00
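The nil guard described in the commit above can be illustrated with a minimal sketch. The `Peer` and `ClusterStatus` types, the `clusterStatus` function and the `"ready"` status value are assumptions made for this example; only the behavior for a nil peer (no peers, empty name, status `"disabled"`) is taken from the commit message.

```go
package main

import "fmt"

// Peer is a hypothetical stand-in for the cluster peer held in api.peer.
type Peer struct {
	Name    string
	Members []string
}

// ClusterStatus is a hypothetical, simplified response model.
type ClusterStatus struct {
	Status string
	Name   string
	Peers  []string
}

// clusterStatus builds the response without dereferencing a nil peer.
func clusterStatus(p *Peer) ClusterStatus {
	if p == nil {
		// Clustering is disabled: empty name, no peers, status "disabled".
		return ClusterStatus{Status: "disabled", Name: "", Peers: []string{}}
	}
	// The "ready" value is an assumption made for this sketch.
	return ClusterStatus{Status: "ready", Name: p.Name, Peers: p.Members}
}

func main() {
	fmt.Printf("%+v\n", clusterStatus(nil))
	fmt.Printf("%+v\n", clusterStatus(&Peer{Name: "am-0", Members: []string{"am-1"}}))
}
```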
Simon Pasquier
b676fa79c0 *: update Makefile.common with new staticcheck (#1692)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-01-04 15:37:33 +01:00
Max Leonard Inden
2b697aaa6b
api/v2: Extract shared properties of gettable and postable alert
With issue 1465 on openapi-generator [1] being fixed, we can now extract
shared properties of the gettable and postable alert definition into a
shared object (`alert`) like we do for silence, gettable silence and
postable silence.

In addition this patch does the following changes to the UI:

- Use `List GettableAlert` instead of plural type definition like
`GettableAlerts` because the plural definitions are not generated.

- Fix openapi-generator-cli docker image to specific hash.

[1] https://github.com/OpenAPITools/openapi-generator/issues/1465

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2018-11-28 14:35:39 +01:00
Max Inden
091a8a83b1
Merge pull request #1632 from mxinden/alerts-api-v2
ui: Move alerts to api v2
2018-11-26 14:28:13 +01:00
Simon Pasquier
5fd944a603 test/with_api_v1: add test for route prefix
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-11-23 13:59:28 +01:00
Simon Pasquier
d6f8437b9b test/with_api_v2: add test for route prefix
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-11-23 13:59:28 +01:00
Max Leonard Inden
f504f953c1
ui: Move /alerts to API v2
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2018-11-23 12:53:48 +01:00
Max Inden
573389a9bb
Merge pull request #1623 from simonpasquier/add-test-apiv2
test: add acceptance test for firing alerts with EndsAt
2018-11-18 16:32:59 +01:00
Simon Pasquier
2ea37af92c test: add acceptance test for firing alerts with EndsAt
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-11-15 16:37:41 +01:00
Max Leonard Inden
b4b8b750df
api/v2/openapi.yaml: Differentiate between post and get silence
Instead of having one general silence, differentiate between postable
and gettable silence, hence making more fields required.

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2018-11-15 16:21:07 +01:00
Max Leonard Inden
e4e053b18e
ui: Move /status & /silences to API v2
This patch makes the Alertmanager UI (/status & /silences) use the
api/v2 endpoint. In addition it adds logic to generate the elm side data
model based on the OpenAPI specification.

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2018-11-15 13:24:26 +01:00
Simon Pasquier
306fd73e32 *: remove use of golang.org/x/net/context
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-11-09 10:00:23 +01:00
Max Leonard Inden
d123cbe696
test: Enable testing against cluster of Alertmanagers
Instead of only testing single instance Alertmanagers, this patch
enables individual tests to spin up Alertmanager clusters.

In addition it adds two tests:

1. A test firing alerts against a cluster, expecting to receive a
notification from only one of the Alertmanager instances in the cluster.

2. A test firing alerts both against a single instance as well as a
cluster, making sure the outputs are equal.

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2018-10-24 15:59:36 +02:00
Simon Pasquier
460b7a72fc test: Don't run TestResolved() in parallel and reduce to 2 runs (#1544)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-11 12:55:48 +02:00
Max Leonard Inden
f1b920bcc9
api: Implement OpenAPI generated Alertmanager API V2
The current Alertmanager API v1 is undocumented and written by hand.
This patch introduces a new Alertmanager API - v2. The API is fully
generated via an OpenAPI 2.0 [1] specification (see
`api/v2/openapi.yaml`) with the exception of the HTTP handlers themselves.

Pros:
- Generated server code
- Ability to generate clients in all major languages
  (Go, Java, JS, Python, Ruby, Haskell, *elm* [3] ...)
    - Strict contract (OpenAPI spec) between server and clients.
    - Instant feedback on frontend-breaking changes, due to strictly
      typed frontend language elm.
- Generated documentation (See Alertmanager online Swagger UI [4])

Cons:
- Dependency on open api ecosystem including go-swagger [2]

In addition this patch includes the following changes.

- README.md: Add API section

- test: Duplicate acceptance test to API v1 & API v2 version

  The Alertmanager acceptance test framework has a decent test coverage
  on the Alertmanager API. Introducing the Alertmanager API v2 does not go
  hand in hand with deprecating API v1. They should live alongside each
  other for a couple of minor Alertmanager versions.

  Instead of porting the acceptance test framework to use the new API v2,
  this patch duplicates the acceptance tests, one using the API v1, the
  other API v2.

  Once API v1 is removed we can simply remove `test/with_api_v1` and bring
  `test/with_api_v2` to `test/`.

[1]
https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md

[2] https://github.com/go-swagger/go-swagger/

[3] https://github.com/ahultgren/swagger-elm

[4]
http://petstore.swagger.io/?url=https://raw.githubusercontent.com/mxinden/alertmanager/apiv2/api/v2/openapi.yaml

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2018-09-04 13:38:34 +02:00
Max Leonard Inden
1219541184
*.go: Introduce errcheck enforcing error handling
Errcheck [1] enforces error handling across all go files. Functions can
be excluded via `scripts/errcheck_excludes.txt`.

This patch adds errcheck to the `test` Make target.

[1] https://github.com/kisielk/errcheck

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2018-08-30 15:47:13 +02:00
Julius Volz
6d0edbe630 Fix a bunch of unhandled errors (#1501)
...as discovered by "gosec" (many other ones reported, but not all make
a lot of sense to fix).

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-08-05 15:38:25 +02:00
Simon Pasquier
b7d891cf39 notify: notify resolved alerts properly (#1408)
* notify: notify resolved alerts properly

The PR #1205 while fixing an existing issue introduced another bug when
the send_resolved flag of the integration is set to true.

With send_resolved set to false, the semantics remain the same:
AlertManager generates a notification when new firing alerts are added
to the alert group. The notification only carries firing alerts.

With send_resolved set to true, AlertManager generates a notification
when new firing or resolved alerts are added to the alert group. The
notification carries both the firing and resolved alerts.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Fix comments

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-06-08 11:37:38 +02:00
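The `send_resolved` semantics described in the commit above can be sketched as a simple filter over the alerts of a group. The `Alert` type and the `alertsToNotify` function are hypothetical and only illustrate which alerts a notification would carry; they are not the code of the notify package.

```go
package main

import "fmt"

// Alert is a hypothetical, simplified alert.
type Alert struct {
	Name     string
	Resolved bool
}

// alertsToNotify returns the alerts a notification would carry:
// firing alerts only when sendResolved is false, firing and
// resolved alerts when it is true.
func alertsToNotify(alerts []Alert, sendResolved bool) []Alert {
	out := make([]Alert, 0, len(alerts))
	for _, a := range alerts {
		if a.Resolved && !sendResolved {
			continue
		}
		out = append(out, a)
	}
	return out
}

func main() {
	group := []Alert{{Name: "HighLatency"}, {Name: "DiskFull", Resolved: true}}
	fmt.Println("send_resolved=false:", alertsToNotify(group, false))
	fmt.Println("send_resolved=true: ", alertsToNotify(group, true))
}
```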
stuart nelson
80f2eeb2ca
Fix resolved alerts still inhibiting (#1331)
* inhibit: update inhibition cache when alerts resolve

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* inhibit: remove unnecessary fmt.Sprintf

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* inhibit: add unit tests

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* inhibit: use NopLogger in tests

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Update old alert with result of merge with new

On ingest, alerts with matching fingerprints are
merged if the new alert's start and end times
overlap with the old alert's.

The merge creates a new alert, which is then
updated in the internal alert store.

The original alert is not updated (because merge
creates a copy), so it is never marked as resolved
in the inhibitor's reference to it.

The code within the inhibitor relies on skipping
over resolved alerts, but because the old alert is
never updated it is never marked as resolved. Thus
it continues to inhibit other alerts until it is
cleaned up by the internal GC.

This commit updates the struct of the old alert
with the result of the merge with the new alert.

An alternative would be to always update the
inhibitor's internal cache of alerts regardless of
an alert's resolve status.

Signed-off-by: stuart nelson <stuartnelson3@gmail.com>

* Update inhibitor cache even if alert is resolved

This seems like a better choice than the previous
commit. I think it is more sane to have the
inhibitor update its own cache, rather than having
one of its pointers updated externally.

Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
2018-04-18 16:26:04 +02:00
Simon Pasquier
c92ed69ce8 Split cli package (#1314)
* cli: move commands to cli/cmd

* cli: use StatusAPI interface for config command

* cli: use SilenceAPI interface for silence commands

* cli: use AlertAPI for alert command

* cli: move back commands to cli package

And move API client code to its own package.

* cli: remove unused structs
2018-04-11 11:17:41 +02:00
Simon Pasquier
4cba49155d dispatch: don't reset timer if flush is in-progress (#1301)
When the aggregation group receives an alert that is past the initial
group_wait value, it should reset its timer only if the timer has never
expired. Otherwise it means that the flush is already in-progress.
2018-03-29 12:22:49 +02:00
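A minimal sketch of the guard described in the commit above, reading the condition as "the timer has not expired yet". The function name `shouldResetTimer` and its parameters are hypothetical and chosen for illustration only.

```go
package main

import (
	"fmt"
	"time"
)

// shouldResetTimer reports whether an incoming alert should reset the
// group's timer for an immediate flush. alertAge is the time elapsed
// since the alert started; timerExpired records whether the group's
// timer has already fired at least once.
func shouldResetTimer(timerExpired bool, alertAge, groupWait time.Duration) bool {
	if timerExpired {
		// A flush has already been triggered; resetting would interfere with it.
		return false
	}
	// The alert is already past the initial group_wait.
	return alertAge > groupWait
}

func main() {
	fmt.Println(shouldResetTimer(false, time.Minute, 30*time.Second))
	fmt.Println(shouldResetTimer(true, time.Minute, 30*time.Second))
}
```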
Simon Pasquier
0c086e3b12 cli: extract client bindings of the v1 API (#1278)
* cli: extract client bindings of the v1 API from amtool

This is a continuation of [1] but the code is kept in the alertmanager
repository rather than having it in client_golang.

[1] https://github.com/prometheus/client_golang/pull/333

Co-Authored-By: Fabian Reinartz <fab.reinartz@gmail.com>
Co-Authored-By: Tristan Colgate <tcolgate@gmail.com>
Co-Authored-By: Corin Lawson <corin@responsight.com>
Co-Authored-By: stuart nelson <stuartnelson3@gmail.com>

* cli: fix httpSilenceAPI.Set() method

* vendor: remove github.com/prometheus/client_golang/api/alertmanager

* cli: don't use the model.Alert type
2018-03-28 19:19:04 +02:00
Brian Brazil
aa950668bf The default group_by is meant to be no labels. (#1287)
This is what the intended default is, and what
the documentation says.
2018-03-16 18:39:23 +01:00
Corentin Chary
dd75201f1c Add /-/ready based on mesh status (#1209)
* Wait for the gossip to settle before sending notifications

See #1209 for details.

As a heuristic for mesh readiness, try to see if
the mesh looks stable (the number of peers isn't changing too much).
This implementation always marks the Alertmanager as ready after a maximum of 60s.

This adds one new flag to control this behavior:
```
      --cluster.settle-timeout=60s  mesh settling timeout. Do not wait more than this duration on startup.
```

It also adds `/-/ready`, which always returns 200 (in order to make it clear
that we are ready as soon as we can receive requests).

The mesh status is exposed in `/api/v1/status` and visible on `/#/status`.

* cluster: fix typos and base interval on gossipInterval
2018-03-02 15:45:21 +01:00
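The settling heuristic from the commit above could look roughly like the sketch below. It works on a recorded series of peer counts instead of a live cluster so that it stays deterministic. The name `pollsUntilSettled`, the `stable` and `maxPolls` parameters, and the use of strict equality as a stand-in for "isn't changing too much" are all assumptions of this example.

```go
package main

import "fmt"

// pollsUntilSettled returns how many polls it takes until the peer count
// has stayed unchanged for `stable` consecutive polls. It never returns
// more than maxPolls, which models the settle timeout: after that many
// polls the instance is considered ready regardless.
func pollsUntilSettled(counts []int, stable, maxPolls int) int {
	unchanged := 0
	prev := -1 // No observation yet.
	for i, n := range counts {
		polls := i + 1
		if polls >= maxPolls {
			return maxPolls // Timeout: stop waiting.
		}
		if n == prev {
			unchanged++
		} else {
			unchanged = 0
		}
		if unchanged >= stable {
			return polls
		}
		prev = n
	}
	// Ran out of observations without settling: treat as timeout.
	return maxPolls
}

func main() {
	fmt.Println(pollsUntilSettled([]int{1, 2, 3, 3, 3, 3}, 3, 10))
	fmt.Println(pollsUntilSettled([]int{1, 2, 3, 4, 5}, 3, 4))
}
```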
pasquier-s
c39a913f8a test: enable race detection (#1262)
This change enables race detection when running the tests. It also fixes
a couple of existing race conditions.
2018-02-27 18:18:53 +01:00
Stuart Nelson
a552afd998 Merge branch 'master' into memberlist
2018-02-13 10:47:17 +01:00
Fabian Reinartz
e6df2d8751 Adapt cluster listen address flag in tests
2018-02-12 11:31:55 +01:00
pasquier-s
76ee5388e7 Forbid 0 value for group_interval and repeat_interval (#1230)
Setting one of these parameters to a zero value doesn't make sense
semantically and can cause high CPU usage.
2018-02-09 10:53:46 +01:00
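The validation described in the commit above can be sketched as follows. The `Route` type, the `durationPtr` helper and the error messages are hypothetical. A nil pointer stands for "not set", which this sketch assumes is still allowed.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Route is a hypothetical, simplified routing node. A nil interval
// means the value is not set on this node.
type Route struct {
	GroupInterval  *time.Duration
	RepeatInterval *time.Duration
}

// durationPtr is a small helper to take the address of a duration value.
func durationPtr(d time.Duration) *time.Duration { return &d }

// validate rejects intervals that are explicitly set to zero.
func (r Route) validate() error {
	if r.GroupInterval != nil && *r.GroupInterval == 0 {
		return errors.New("group_interval cannot be zero")
	}
	if r.RepeatInterval != nil && *r.RepeatInterval == 0 {
		return errors.New("repeat_interval cannot be zero")
	}
	return nil
}

func main() {
	fmt.Println(Route{GroupInterval: durationPtr(0)}.validate())
	fmt.Println(Route{RepeatInterval: durationPtr(4 * time.Hour)}.validate())
}
```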
Fabian Reinartz
fd49dbb477 *: move to memberlist for clustering
2018-02-08 12:18:44 +01:00