* Update Go to 1.18
* Update circleci machine image.
* Switch maildev to new upstream image location.
* Update Go modules to 1.17 format.
* Make dependabot monthly to match prometheus/prometheus.
Signed-off-by: SuperQ <superq@gmail.com>
This is a bit of a nitpicky change, but having error logs on bad requests
is a bit of a pain. A bad client can spam the logs with
bad requests that are not actually an issue for the server:
we just send back the error and move on. This commit moves a couple of logs
from `Error` to `Debug` so that they can be filtered more easily.
Signed-off-by: sinkingpoint <colin@quirl.co.nz>
The CI keeps reporting flakes for our acceptance test around the starting and stopping of the Alertmanagers. While I have an idea of where these failures are coming from, it would be nice to get a confirmation by structuring our error messages a bit better.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
As part of #2971, I'm about to extend the tests for silences. Extract the functions into helpers in a separate file and add names to the expectations so that we can easily identify them.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
As noted in #2867, there is an unnecessary require.Eventually in a
silence test. This PR addresses that by using a channel to signal that
the maintenance loop has completed.
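Roughly, the pattern looks like the sketch below. This is an illustration under assumptions, not the test from the PR: `maintenance` and `runOnce` are hypothetical stand-ins, and the real loop has a different signature.

```go
package main

import (
	"fmt"
	"time"
)

// maintenance is a hypothetical stand-in for a periodic maintenance loop.
// It calls doWork on every tick until stopc is closed.
func maintenance(tick <-chan time.Time, stopc <-chan struct{}, doWork func()) {
	for {
		select {
		case <-stopc:
			return
		case <-tick:
			doWork()
		}
	}
}

// runOnce drives the loop for exactly one tick and waits on a channel for
// it to finish, instead of polling for the result with a timeout.
func runOnce() int {
	calls := 0
	tick := make(chan time.Time) // unbuffered on purpose
	stopc := make(chan struct{})
	done := make(chan struct{})

	go func() {
		defer close(done) // signals that the loop has returned
		maintenance(tick, stopc, func() { calls++ })
	}()

	tick <- time.Now() // returns only once the loop has received the tick
	close(stopc)       // ask the loop to stop after the current iteration
	<-done             // closing done orders the write to calls before this read
	return calls
}

func main() {
	fmt.Println("maintenance runs:", runOnce())
}
```

Waiting on `done` makes the test deterministic: there is no polling interval to tune and no timeout to hit on a slow CI machine.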
Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>
github.com/benbjohnson/clock provides a time interface to programs
rather than using the stdlib time package. This allows mocking time in
programs and tests. In this commit, the clock is used to speed up and
simplify testing of the silences package.
Signed-off-by: Joe Blubaugh <joe.blubaugh@grafana.com>
Add a dependabot dependency check to keep dependencies up to date and to receive security updates on time.
Signed-off-by: David Ureba <david.ureba@aiven.io>
This is now required by a new rule introduced in https://github.com/grafana/dashboard-linter/pull/79. However, for the new `panel-unit-rule` we don't reap any benefit from specifying a particular unit for our panels; the defaults work perfectly fine, so the rule is ignored.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
This accurately reflects what the function _actually_ does. If no active silence IDs are provided and the list of inhibitions we have is already empty, the alert is actually set to Active. It took me a while to realise this while I was working out how we populate the alert list.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
While merging #2944, I noticed the CI failed: https://app.circleci.com/pipelines/github/prometheus/alertmanager/2686/workflows/b6f87b0a-20c3-455b-b706-432c38a77511/jobs/12028.
It seemed like a deadlock between uncoordinated routines but I couldn't pinpoint (or reproduce; I tried with -race and -count) the exact problem. However, from the logs, I could point out where the problem originated, and I have a hunch it has to do with the way net listeners are handled by the TODO removed.
The more worrying part of the CI failure is that it took 10m to time out. With this change we'll force close the connection with a 5s deadline, so at the very least we'll get the feedback faster.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Alert metric reports different results to what the user sees via API
Fixes #1439 and #2619.
The previous metric is not _technically_ reporting incorrect results, as the alerts _are_ still around and will be re-used if that same alert (equal fingerprint) is received before it is GCed. Therefore, I have kept the old metric under a new name, `alertmanager_marked_alerts`, and repurposed the current metric to match what the user sees in the UI.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
With https://github.com/grafana/dashboard-linter/pull/49 `template-job-rule` no longer validates both `instance` and `job` labels. Add the new rule of `template-instance-rule` to the exclusions to preserve the previous behaviour.
Signed-off-by: gotjosh <josue.abreu@gmail.com>