The aggregation group is already responsible for removing the resolved
alerts. Running the garbage collection in parallel introduces a race
that can eventually cause resolved notifications to be dropped.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Instead of registering marker metrics inside of
cmd/alertmanager/main.go, register them in types/types.go. This keeps
marker-specific logic in its own module rather than in main.go. In
addition, by taking a local metric registerer, it paves the way for
removing the usage of the global metric registry in the future.
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
Move the code for storing and GC'ing alerts, which was previously
re-implemented in several packages, into its own package.
Signed-off-by: stuart nelson <stuartnelson3@gmail.com>
Errcheck [1] enforces error handling across all Go files. Functions can
be excluded via `scripts/errcheck_excludes.txt`.
This patch adds errcheck to the `test` Make target.
[1] https://github.com/kisielk/errcheck
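
As a rough illustration of the kind of code errcheck is meant to catch, the
sketch below handles the error returned by `os.Remove` instead of discarding
it. The helper `removeStale` and the file name are hypothetical and not part
of this patch.

```go
package main

import (
	"fmt"
	"os"
)

// removeStale deletes path and reports any failure to the caller.
// A bare `os.Remove(path)` statement would discard the returned error,
// which is the pattern errcheck is meant to catch.
func removeStale(path string) error {
	if err := os.Remove(path); err != nil && !os.IsNotExist(err) {
		return fmt.Errorf("remove %s: %w", path, err)
	}
	return nil
}

func main() {
	// The file name is hypothetical; a missing file is not treated as an error.
	if err := removeStale("stale.snapshot"); err != nil {
		fmt.Println("cleanup failed:", err)
	}
}
```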
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
... rather than in the Subscribe method. Currently the cleanup for a
given Alert subscription is done in a blocking goroutine, started in
the Subscribe method.
This simplifies it by moving the cleanup to the GC.
Additionally, it simplifies the Subscribe method by making the buffered
channel large enough to hold all pending alerts, which removes the need
to start a goroutine in Subscribe at all.
Signed-off-by: Sergiusz Urbaniak <sergiusz.urbaniak@gmail.com>
TestAlertsSubscribePutStarvation tests starvation of `iterator.Close` and
`alerts.Put`. Both `Subscribe` and `Put` use the Alerts.mtx lock. `Subscribe`
needs it to subscribe and more importantly unsubscribe `Alerts.listeners`.
`Put` uses the lock to add additional alerts and iterate the `Alerts.listeners`
map. If the channel of a listener is at its limit, `alerts.Lock` is blocked,
whereby a listener cannot unsubscribe as the lock is held by `alerts.Lock`.
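
The blocking pattern described above can be reproduced in isolation. The
sketch below is not the test itself: the `alerts` type, `put` and
`demoStarvation` are hypothetical, and `sync.Mutex.TryLock` (Go 1.18 or
later) is used only to observe whether the lock is held.

```go
package main

import (
	"fmt"
	"sync"
)

// alerts is a hypothetical stand-in with a single listener channel.
type alerts struct {
	mtx      sync.Mutex
	listener chan int
}

// put mimics a Put that sends to a listener while holding the lock.
func (a *alerts) put(v int, locked chan<- struct{}) {
	a.mtx.Lock()
	defer a.mtx.Unlock()
	close(locked)   // signal that the lock is now held
	a.listener <- v // blocks while the listener channel is full
}

// demoStarvation reports whether the lock was held while the channel was
// full, and whether it became free once the channel was drained.
func demoStarvation() (heldWhileFull, freeAfterDrain bool) {
	a := &alerts{listener: make(chan int, 1)}
	a.listener <- 0 // fill the channel to its limit

	locked, finished := make(chan struct{}), make(chan struct{})
	go func() {
		a.put(1, locked)
		close(finished)
	}()

	<-locked
	// An unsubscribe would need the lock here, but put still holds it.
	heldWhileFull = !a.mtx.TryLock()

	<-a.listener // draining one alert lets put finish and unlock
	<-finished
	freeAfterDrain = a.mtx.TryLock()
	if freeAfterDrain {
		a.mtx.Unlock()
	}
	return heldWhileFull, freeAfterDrain
}

func main() {
	held, free := demoStarvation()
	fmt.Println("lock held while channel full:", held)
	fmt.Println("lock free after drain:", free)
}
```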
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
This change adds snapshotting to the silences and notification infos
providers. Snapshots are created periodically and on shutdown. They are
loaded into memory on startup.
Periodic snapshotting is run right after garbage collection.
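
The ordering described above (GC, then snapshot, plus a final snapshot on
shutdown and a load on startup) might look roughly like the sketch below.
The `store` type, the JSON encoding and all function names are assumptions
made for illustration; they are not the providers' actual types or format.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"sync"
	"time"
)

// store is a hypothetical stand-in for a silence or notification info provider.
type store struct {
	mtx     sync.Mutex
	entries map[string]time.Time // ID -> expiry time
}

// gc removes entries that have expired at time now.
func (s *store) gc(now time.Time) {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	for id, exp := range s.entries {
		if !exp.After(now) {
			delete(s.entries, id)
		}
	}
}

// snapshot writes the current state to w (JSON is an assumed format).
func (s *store) snapshot(w io.Writer) error {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	return json.NewEncoder(w).Encode(s.entries)
}

// load replaces the in-memory state with a snapshot read from r.
func (s *store) load(r io.Reader) error {
	entries := map[string]time.Time{}
	if err := json.NewDecoder(r).Decode(&entries); err != nil {
		return err
	}
	s.mtx.Lock()
	defer s.mtx.Unlock()
	s.entries = entries
	return nil
}

// maintain runs GC and then a snapshot on every tick, and takes a final
// snapshot when stopc is closed.
func (s *store) maintain(tick <-chan time.Time, stopc <-chan struct{}, save func() error) error {
	for {
		select {
		case now := <-tick:
			s.gc(now)
			if err := save(); err != nil {
				return err
			}
		case <-stopc:
			return save()
		}
	}
}

// restoredCount runs one maintenance cycle and a shutdown, then loads the
// last snapshot into a fresh store and returns how many entries survived.
func restoredCount() (int, error) {
	now := time.Now()
	s := &store{entries: map[string]time.Time{
		"expired": now.Add(-time.Minute),
		"active":  now.Add(time.Hour),
	}}
	var buf bytes.Buffer
	save := func() error {
		buf.Reset() // each snapshot replaces the previous one
		return s.snapshot(&buf)
	}
	tick, stopc, errc := make(chan time.Time), make(chan struct{}), make(chan error, 1)
	go func() { errc <- s.maintain(tick, stopc, save) }()
	tick <- now  // one periodic cycle: GC, then snapshot
	close(stopc) // shutdown triggers a final snapshot
	if err := <-errc; err != nil {
		return 0, err
	}
	fresh := &store{}
	if err := fresh.load(&buf); err != nil {
		return 0, err
	}
	return len(fresh.entries), nil
}

func main() {
	n, err := restoredCount()
	fmt.Println("entries restored on startup:", n, "error:", err)
}
```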
This changes the end timestamp for unstarted silences to the
start timestamp so the silence remains valid by not having the end
time before the start time.
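
A minimal sketch of this rule, assuming it is applied when a silence is
expired: the `Silence` struct below carries only the two timestamps, and
`expire`, `valid` and `unstartedStaysValid` are hypothetical helpers rather
than the provider's API.

```go
package main

import (
	"fmt"
	"time"
)

// Silence is a pared-down, hypothetical stand-in with only the two timestamps.
type Silence struct {
	StartsAt, EndsAt time.Time
}

// expire ends the silence as of now. A silence that has not started yet
// gets EndsAt set to StartsAt, so the end never precedes the start.
func expire(s Silence, now time.Time) Silence {
	if now.Before(s.StartsAt) {
		s.EndsAt = s.StartsAt
	} else {
		s.EndsAt = now
	}
	return s
}

// valid reports whether the end time is not before the start time.
func valid(s Silence) bool { return !s.EndsAt.Before(s.StartsAt) }

// unstartedStaysValid expires a silence that starts in one hour and checks
// that it is still valid and ends exactly at its start time.
func unstartedStaysValid() bool {
	now := time.Now()
	s := Silence{StartsAt: now.Add(time.Hour), EndsAt: now.Add(2 * time.Hour)}
	e := expire(s, now)
	return valid(e) && e.EndsAt.Equal(e.StartsAt)
}

func main() {
	fmt.Println("unstarted silence stays valid:", unstartedStaysValid())
}
```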
This change extracts setting logic directly into the silence state.
Only assigning of a UUID and mesh propagation are left directly to
the mesh provider.
Validity of modifying silence state extracted into its own method.
Test for state modification added.