The CI environment isn't as performant as local machines: the time
needed to fully initialize the test environment can be significant and
skew the verification. Rather than setting the "virtual" clock used to
measure alert timings at the beginning of the acceptance test, it is
better to wait for the test bed to be ready.
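
A minimal sketch of the idea, not the actual acceptance-test code: poll a readiness check until it succeeds, and only then record the reference time used for alert timings. The helper `waitReady`, the `errNotReady` value, and the polling interval are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical error returned by a readiness probe.
var errNotReady = errors.New("not ready")

// waitReady polls check every interval until it returns nil or the timeout
// elapses. It is an illustrative helper, not part of the Alertmanager tests.
func waitReady(check func() error, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		err := check()
		if err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("test bed not ready after %v: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	attempts := 0
	// Simulated readiness probe that succeeds on the third attempt.
	check := func() error {
		attempts++
		if attempts < 3 {
			return errNotReady
		}
		return nil
	}
	if err := waitReady(check, 10*time.Millisecond, time.Second); err != nil {
		fmt.Println("error:", err)
		return
	}
	// Only now start the "virtual" clock used to measure alert timings.
	start := time.Now()
	fmt.Println("ready after", attempts, "attempts; clock started at", start.Format(time.RFC3339))
}
```

Starting the clock after the readiness check keeps slow CI start-up time out of the measured alert timings.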
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Update Go to 1.19
* Update Go.
* Update some Go modules.
* Update Swagger to the latest for Go 1.19 compatibility.
* api/v2: regenerate
* Adapt to the changes in the client package
* asset/assets_vfsdata.go: regenerate
Signed-off-by: SuperQ <superq@gmail.com>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
The CI keeps reporting flakes for our acceptance test around the starting and stopping of the Alertmanagers. While I have an idea of where these failures are coming from, it would be nice to get a confirmation by structuring our error messages a bit better.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
While merging #2944, I noticed the CI failed: https://app.circleci.com/pipelines/github/prometheus/alertmanager/2686/workflows/b6f87b0a-20c3-455b-b706-432c38a77511/jobs/12028.
It seemed like a deadlock between uncoordinated goroutines, but I couldn't pinpoint (or reproduce; I tried with -race and -count) the exact problem. However, from the logs, I could tell where the problem originated, and I have a hunch it has to do with the way net listeners are handled by the TODO removed.
The more worrying part of the CI failure is that it took 10m to time out. With this change, we force-close the connection with a 5s deadline, so at the very least we get the feedback faster.
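
As an illustration of the deadline-then-force-close pattern (a sketch under assumptions, not the code changed by this commit), the following uses the standard `net/http` server: it attempts a graceful shutdown bounded by a deadline and falls back to closing all connections. The helpers `startServer` and `stopWithDeadline` are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"time"
)

// startServer starts an HTTP server on a random local port.
// Hypothetical helper used only to make the sketch runnable.
func startServer() (*http.Server, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	srv := &http.Server{Handler: http.NotFoundHandler()}
	// Serve returns http.ErrServerClosed once the server is stopped.
	go func() { _ = srv.Serve(ln) }()
	return srv, nil
}

// stopWithDeadline tries a graceful shutdown and force-closes the server
// if the shutdown does not complete within d.
func stopWithDeadline(srv *http.Server, d time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), d)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		// Graceful shutdown did not finish in time: drop remaining connections.
		if cerr := srv.Close(); cerr != nil {
			return fmt.Errorf("forced close failed: %w", cerr)
		}
		return fmt.Errorf("forced close after %v: %w", d, err)
	}
	return nil
}

func main() {
	srv, err := startServer()
	if err != nil {
		fmt.Println("start:", err)
		return
	}
	if err := stopWithDeadline(srv, 5*time.Second); err != nil {
		fmt.Println("stop:", err)
		return
	}
	fmt.Println("server stopped")
}
```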
Signed-off-by: gotjosh <josue.abreu@gmail.com>
This commit moves the code formerly in /client into /test/with_api_v1
so that we can discourage use of the v1 client without breaking things.
Signed-off-by: sinkingpoint <colin@quirl.co.nz>
* check if at least one silence matcher doesn't match empty strings
Signed-off-by: qoops <ilya.v.gladyshev@gmail.com>
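
The check described above can be sketched as follows. This is a simplified illustration, not the Alertmanager implementation: the `matcher` type, `validateMatchers`, and `errOnlyEmptyMatchers` are hypothetical names, and regex matchers are assumed to be fully anchored.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// matcher is a simplified, hypothetical stand-in for a silence matcher.
type matcher struct {
	Name    string
	Value   string
	IsRegex bool
}

// matchesEmpty reports whether the matcher accepts the empty string.
func (m matcher) matchesEmpty() (bool, error) {
	if !m.IsRegex {
		return m.Value == "", nil
	}
	// Anchor the pattern so that the whole label value has to match.
	re, err := regexp.Compile("^(?:" + m.Value + ")$")
	if err != nil {
		return false, err
	}
	return re.MatchString(""), nil
}

var errOnlyEmptyMatchers = errors.New("at least one matcher must not match the empty string")

// validateMatchers succeeds as soon as one matcher rejects the empty string.
func validateMatchers(ms []matcher) error {
	for _, m := range ms {
		empty, err := m.matchesEmpty()
		if err != nil {
			return fmt.Errorf("invalid matcher %q: %w", m.Name, err)
		}
		if !empty {
			return nil
		}
	}
	return errOnlyEmptyMatchers
}

func main() {
	// A silence whose only matcher is `job=~".*"` would match every alert.
	fmt.Println(validateMatchers([]matcher{{Name: "job", Value: ".*", IsRegex: true}}))
	fmt.Println(validateMatchers([]matcher{{Name: "job", Value: "api"}}))
}
```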
* fixed grammar
Signed-off-by: qoops <ilya.v.gladyshev@gmail.com>
Instead of keeping all notifiers in the notify package, it splits them
into individual sub-packages. This improves readability and
maintainability of the code.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
- Move the generated api/v2 client code out of the test directory
and into the api/v2 directory with models and restapi.
- Remove duplicate models directory
- Update tests to use api/v2 package for models and client
Signed-off-by: Paul Gier <pgier@redhat.com>
- make clean shouldn't print errors when files/directories have already
been removed
- add copyright header to generated api files to pass license check
Signed-off-by: Paul Gier <pgier@redhat.com>
This change removes all usage of golang.org/x/net/context in the code
base. It also bumps a few dependencies for the same reason:
- github.com/gogo/protobuf
- go-openapi/*
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
If a user chooses to disable the Alertmanager cluster feature, there is
no cluster name nor cluster peers. Hence these should be optional. Only
cluster status is set to "disabled".
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
When users start Alertmanager with `--cluster.listen-address=`, the
cluster will not be initialized, hence api.peer will be `nil`. So far
this would result in a nil pointer dereference by the API v2 accessing
the api.peer field.
With this patch, API v2 skips populating the peers array, sets the name
to an empty string and the status to "disabled" when `api.peer` is
nil.
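
A minimal sketch of this nil guard, using simplified stand-in types rather than the generated API v2 models: `peer`, `clusterStatus`, and `statusFor` are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// peer is a simplified stand-in for the cluster peer held by the API.
type peer struct {
	name    string
	status  string
	members []string
}

// clusterStatus is a simplified stand-in for the API v2 response model.
type clusterStatus struct {
	Name   string
	Status string
	Peers  []string
}

// statusFor builds the cluster status and tolerates a nil peer, which is
// what the API sees when clustering is disabled.
func statusFor(p *peer) clusterStatus {
	if p == nil {
		// No cluster: empty name, no peers, status "disabled".
		return clusterStatus{Status: "disabled"}
	}
	return clusterStatus{
		Name:   p.name,
		Status: p.status,
		Peers:  append([]string(nil), p.members...),
	}
}

func main() {
	fmt.Printf("%+v\n", statusFor(nil))
	fmt.Printf("%+v\n", statusFor(&peer{name: "am-0", status: "ready", members: []string{"am-1"}}))
}
```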
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
With issue 1465 on openapi-generator [1] being fixed, we can now extract
shared properties of the gettable and postable alert definition into a
shared object (`alert`) like we do for silence, gettable silence and
postable silence.
In addition, this patch makes the following changes to the UI:
- Use `List GettableAlert` instead of a plural type definition such as
`GettableAlerts` because the plural definitions are not generated.
- Pin the openapi-generator-cli Docker image to a specific hash.
[1] https://github.com/OpenAPITools/openapi-generator/issues/1465
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
Instead of having one general silence, differentiate between postable
and gettable silence, hence making more fields required.
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
This patch makes the Alertmanager UI (/status & /silences) use the
api/v2 endpoint. In addition it adds logic to generate the elm side data
model based on the OpenAPI specification.
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
Instead of only testing single instance Alertmanagers, this patch
enables individual tests to spin up Alertmanager clusters.
In addition it adds two tests:
1. A test firing alerts against a cluster, expecting to receive a
notification from only one of the Alertmanager instances in the cluster.
2. A test firing alerts against both a single instance and a
cluster, making sure the outputs are equal.
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
The current Alertmanager API v1 is undocumented and written by hand.
This patch introduces a new Alertmanager API - v2. The API is fully
generated via an OpenAPI 2.0 [1] specification (see
`api/v2/openapi.yaml`) with the exception of the HTTP handlers themselves.
Pros:
- Generated server code
- Ability to generate clients in all major languages
(Go, Java, JS, Python, Ruby, Haskell, *elm* [3] ...)
- Strict contract (OpenAPI spec) between server and clients.
- Instant feedback on frontend-breaking changes, due to strictly
typed frontend language elm.
- Generated documentation (See Alertmanager online Swagger UI [4])
Cons:
- Dependency on the OpenAPI ecosystem, including go-swagger [2]
In addition this patch includes the following changes.
- README.md: Add API section
- test: Duplicate acceptance test to API v1 & API v2 version
The Alertmanager acceptance test framework has decent test coverage
of the Alertmanager API. Introducing the Alertmanager API v2 does not go
hand in hand with deprecating API v1. They should live alongside each
other for a couple of minor Alertmanager versions.
Instead of porting the acceptance test framework to use the new API v2,
this patch duplicates the acceptance tests, one using the API v1, the
other API v2.
Once API v1 is removed we can simply remove `test/with_api_v1` and bring
`test/with_api_v2` to `test/`.
[1]
https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md
[2] https://github.com/go-swagger/go-swagger/
[3] https://github.com/ahultgren/swagger-elm
[4]
http://petstore.swagger.io/?url=https://raw.githubusercontent.com/mxinden/alertmanager/apiv2/api/v2/openapi.yaml
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
Errcheck [1] enforces error handling across all Go files. Functions can
be excluded via `scripts/errcheck_excludes.txt`.
This patch adds errcheck to the `test` Make target.
[1] https://github.com/kisielk/errcheck
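
To illustrate the kind of code this linter pushes toward, here is a small sketch in which every returned error is either handled or explicitly discarded. The `failingCloser` type and the `writeAndClose` helper are hypothetical and exist only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// failingCloser is a hypothetical writer whose Close always fails.
type failingCloser struct{}

func (failingCloser) Write(p []byte) (int, error) { return len(p), nil }
func (failingCloser) Close() error                { return errors.New("close failed") }

// writeAndClose writes msg and closes w, propagating any error instead of
// silently dropping it.
func writeAndClose(w io.WriteCloser, msg string) error {
	if _, err := io.WriteString(w, msg); err != nil {
		// The write error takes precedence; the Close error is
		// explicitly discarded rather than ignored by accident.
		_ = w.Close()
		return fmt.Errorf("write: %w", err)
	}
	if err := w.Close(); err != nil {
		return fmt.Errorf("close: %w", err)
	}
	return nil
}

func main() {
	fmt.Println(writeAndClose(failingCloser{}, "hello"))
}
```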
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
* notify: notify resolved alerts properly
PR #1205, while fixing an existing issue, introduced another bug when
the send_resolved flag of the integration is set to true.
With send_resolved set to false, the semantics remain the same:
Alertmanager generates a notification when new firing alerts are added
to the alert group. The notification only carries firing alerts.
With send_resolved set to true, Alertmanager generates a notification
when new firing or resolved alerts are added to the alert group. The
notification carries both the firing and resolved alerts.
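
The semantics described above can be sketched roughly as follows. This is a simplified model, not the notify package's code: the `alert` type, the `notified` set, and `buildNotification` are hypothetical, and state tracking is reduced to "has this alert already been sent in this state".

```go
package main

import "fmt"

// alert is a simplified alert identified by fingerprint and state.
type alert struct {
	Fingerprint uint64
	Resolved    bool
}

// buildNotification returns the alerts a notification would carry and
// whether a notification is needed at all. notified records the alerts
// (in their current state) that were already sent.
func buildNotification(group []alert, notified map[alert]bool, sendResolved bool) ([]alert, bool) {
	var payload []alert
	send := false
	for _, a := range group {
		if a.Resolved && !sendResolved {
			// Resolved alerts are neither carried nor a trigger.
			continue
		}
		payload = append(payload, a)
		if !notified[a] {
			// A firing or resolved alert not yet sent in this state.
			send = true
		}
	}
	return payload, send
}

func main() {
	group := []alert{{Fingerprint: 1}, {Fingerprint: 2, Resolved: true}}
	// Both alerts were previously sent while firing.
	notified := map[alert]bool{
		alert{Fingerprint: 1}: true,
		alert{Fingerprint: 2}: true,
	}
	for _, sendResolved := range []bool{false, true} {
		payload, send := buildNotification(group, notified, sendResolved)
		fmt.Printf("send_resolved=%v notify=%v alerts=%d\n", sendResolved, send, len(payload))
	}
}
```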
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Fix comments
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* inhibit: update inhibition cache when alerts resolve
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* inhibit: remove unnecessary fmt.Sprintf
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* inhibit: add unit tests
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* inhibit: use NopLogger in tests
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Update old alert with result of merge with new
On ingest, alerts with matching fingerprints are
merged if the new alert's start and end times
overlap with the old alert's.
The merge creates a new alert, which is then
updated in the internal alert store.
The original alert is not updated (because merge
creates a copy), so it is never marked as resolved
in the inhibitor's reference to it.
The code within the inhibitor relies on skipping
over resolved alerts, but because the old alert is
never updated it is never marked as resolved. Thus
it continues to inhibit other alerts until it is
cleaned up by the internal GC.
This commit updates the struct of the old alert
with the result of the merge with the new alert.
An alternative would be to always update the
inhibitor's internal cache of alerts regardless of
an alert's resolve status.
Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
* Update inhibitor cache even if alert is resolved
This seems like a better choice than the previous
commit. I think it is more sane to have the
inhibitor update its own cache, rather than having
one of its pointers updated externally.
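
The problem and the chosen fix can be sketched with simplified types. This is an illustration under assumptions, not the inhibitor's code: the `alert` type uses integer timestamps, `merge` only keeps the earliest start time, and `cache` is a hypothetical stand-in for the inhibitor's internal store.

```go
package main

import "fmt"

// alert is a simplified alert; EndsAt == 0 means it is still firing.
type alert struct {
	Fingerprint uint64
	StartsAt    int64
	EndsAt      int64
}

// resolvedAt reports whether the alert has ended at or before now.
func (a *alert) resolvedAt(now int64) bool {
	return a.EndsAt != 0 && a.EndsAt <= now
}

// merge returns a new alert; neither input is modified.
func merge(old, updated *alert) *alert {
	res := *updated
	if old.StartsAt < res.StartsAt {
		res.StartsAt = old.StartsAt
	}
	return &res
}

// cache is a stand-in for the inhibitor's own alert cache.
type cache map[uint64]*alert

// set always stores the alert, whether it is firing or resolved.
func (c cache) set(a *alert) { c[a.Fingerprint] = a }

func main() {
	c := cache{}
	old := &alert{Fingerprint: 1, StartsAt: 10}
	c.set(old)

	// A resolved update arrives and is merged into a copy.
	merged := merge(old, &alert{Fingerprint: 1, StartsAt: 20, EndsAt: 50})
	fmt.Println("stale entry resolved:", c[1].resolvedAt(60))

	// Updating the cache even for resolved alerts fixes the stale entry.
	c.set(merged)
	fmt.Println("updated entry resolved:", c[1].resolvedAt(60))
}
```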
Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
* cli: move commands to cli/cmd
* cli: use StatusAPI interface for config command
* cli: use SilenceAPI interface for silence commands
* cli: use AlertAPI for alert command
* cli: move commands back to cli package
And move API client code to its own package.
* cli: remove unused structs