Load silences at startup from a local JSON file and write them out every
10 seconds.

It's not perfect (a write may be interrupted and leave an inconsistent
file if the program is terminated mid-write), but this is a temporary
solution to keep people from going crazy about lost silences until we have
proper storage management. If the JSON file gets corrupted, the alert
manager simply starts up without any silences loaded.
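
A minimal sketch of the load-at-startup, save-periodically scheme described above. The Silence fields, the Silencer type, and the method names are assumptions made for illustration, not the actual code in this change. The write goes straight to the target file, so it has the same non-atomic weakness mentioned above.

```go
package main

import (
	"encoding/json"
	"log"
	"os"
	"sync"
	"time"
)

// Silence is a hypothetical, simplified stand-in for the real silence type.
type Silence struct {
	ID      uint64            `json:"id"`
	Filters map[string]string `json:"filters"`
	EndsAt  time.Time         `json:"endsAt"`
}

// Silencer (hypothetical name) keeps the silences in memory.
type Silencer struct {
	mu       sync.Mutex
	silences []Silence
}

// LoadFromFile replaces the in-memory silences with the contents of path.
// On a missing or corrupted file it returns an error and leaves the
// in-memory state untouched, so the caller can start without silences.
func (s *Silencer) LoadFromFile(path string) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	var loaded []Silence
	if err := json.Unmarshal(data, &loaded); err != nil {
		return err
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.silences = loaded
	return nil
}

// SaveToFile writes all silences to path as JSON. The write is not atomic:
// terminating the program mid-write can leave a truncated file.
func (s *Silencer) SaveToFile(path string) error {
	s.mu.Lock()
	data, err := json.Marshal(s.silences)
	s.mu.Unlock()
	if err != nil {
		return err
	}
	return os.WriteFile(path, data, 0644)
}

// Run saves the silences every interval until stop is closed.
func (s *Silencer) Run(path string, interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			if err := s.SaveToFile(path); err != nil {
				log.Printf("error saving silences: %v", err)
			}
		case <-stop:
			return
		}
	}
}
```

At startup a caller would call LoadFromFile, log any error and carry on with an empty set, then start Run in a goroutine with a 10-second interval.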
This adds mandatory Summary and Description fields to Event.
As for the alert name, there were two options: keep it a separate field and
treat it separately everywhere (including in silence Filter matching), or
make it a required field in the event's labels. The latter caused far
less trouble, so I went with that. The alertname label still has no
special meaning to most parts of the code, except that the API checks for
its presence and the web UI displays it differently.
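
A rough sketch of what treating the alert name as a required label could look like. Event, Summary, Description, and the alertname label come from the description above. The Validate and Matches methods, and the exact-match filter, are assumptions for illustration (the real Filter matching may work differently).

```go
package main

import "errors"

// alertNameLabel is the label that every event must carry.
const alertNameLabel = "alertname"

// Event is a simplified stand-in for the real Event type.
type Event struct {
	Summary     string
	Description string
	Labels      map[string]string
}

// Validate (hypothetical) is the kind of check the API could run on
// incoming events: all mandatory fields and the alertname label must be set.
func (e *Event) Validate() error {
	if e.Summary == "" {
		return errors.New("missing summary")
	}
	if e.Description == "" {
		return errors.New("missing description")
	}
	if e.Labels[alertNameLabel] == "" {
		return errors.New("missing alertname label")
	}
	return nil
}

// Matches (hypothetical) reports whether every label in filter has the same
// value on the event. The alertname is matched like any other label, so it
// needs no special handling here.
func (e *Event) Matches(filter map[string]string) bool {
	for name, want := range filter {
		if e.Labels[name] != want {
			return false
		}
	}
	return true
}
```

Because the alert name is just another label, a silence can target it with the same filter mechanism as any other label.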
Start with the simplest possible locking scheme: lock the object-global
mutex at the beginning of each user-facing method. This is equivalent to
the implicit locking provided by the reactor.

The reasoning behind this change is the incredible overhead of the
previous reactor request/response code.

Overhead of the reactor model for every user-facing method:
- 2 struct type definitions (req/resp)
- 1 channel
- 1 struct member definition site
- 1 channel init site
- 1 struct population site
- 1 struct servicing site
- 1 struct closing site
- 1 actual execution method

Overhead of the new lock-based code:
Per object:
- 1 lock
Per method:
- 1 taking the lock
- 1 actual execution method
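
To make the two lists above concrete, here is a sketch of the same Get method written both ways. All type and method names are hypothetical, and the comments mapping code sites to list items are one possible reading of the lists, not a description of the actual code.

```go
package main

import "sync"

// Reactor style: a request/response pair for a single method.
type getRequest struct {
	key      string
	response chan getResponse // response channel for this request
}

type getResponse struct {
	value string
	ok    bool
}

type reactorStore struct {
	data        map[string]string
	getRequests chan getRequest // struct member definition site
}

func newReactorStore(initial map[string]string) *reactorStore {
	s := &reactorStore{
		data:        initial,
		getRequests: make(chan getRequest), // channel init site
	}
	go s.run()
	return s
}

// run is the reactor loop. Only this goroutine touches data, which is the
// implicit locking. The sketch never closes getRequests, so it runs forever.
func (s *reactorStore) run() {
	for req := range s.getRequests { // servicing site
		v, ok := s.get(req.key)
		req.response <- getResponse{value: v, ok: ok}
		close(req.response) // closing site
	}
}

func (s *reactorStore) Get(key string) (string, bool) {
	req := getRequest{key: key, response: make(chan getResponse)} // population site
	s.getRequests <- req
	resp := <-req.response
	return resp.value, resp.ok
}

func (s *reactorStore) get(key string) (string, bool) { // actual execution method
	v, ok := s.data[key]
	return v, ok
}

// Lock style: one mutex per object, taken at the top of each method.
type lockedStore struct {
	mu   sync.Mutex // per object: 1 lock
	data map[string]string
}

func (s *lockedStore) Get(key string) (string, bool) {
	s.mu.Lock() // per method: taking the lock
	defer s.mu.Unlock()
	return s.get(key)
}

func (s *lockedStore) get(key string) (string, bool) { // actual execution method
	v, ok := s.data[key]
	return v, ok
}
```

Both versions serialize access to data. The lock-based one needs no extra types or channels for each additional method.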
Close() was not synchronized through the main dispatcher loop, so it could
close channels that were currently being written to by methods called from
said dispatcher loop. This led to a crash, since sending on a closed
channel panics. Instead, Close() now sends a closeRequest, which is handled
in the dispatcher.
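
A minimal sketch of the fix, assuming a dispatcher loop that selects over incoming work and close requests. Only Close() and closeRequest are named in the description above. The Dispatcher type, its fields, and Submit are hypothetical. The sketch also assumes Close() is called exactly once and that nothing is submitted afterwards.

```go
package main

// closeRequest asks the dispatcher loop to shut down. done is closed by the
// loop once the shutdown has completed.
type closeRequest struct {
	done chan struct{}
}

// Dispatcher is a hypothetical, stripped-down dispatcher.
type Dispatcher struct {
	work          chan string
	closeRequests chan *closeRequest
	out           chan string // written to only from the dispatcher loop
}

func NewDispatcher() *Dispatcher {
	d := &Dispatcher{
		work:          make(chan string),
		closeRequests: make(chan *closeRequest),
		out:           make(chan string, 16),
	}
	go d.run()
	return d
}

// run is the dispatcher loop. Because closing out happens here, it can never
// overlap with a send from handle, which also runs on this goroutine.
func (d *Dispatcher) run() {
	for {
		select {
		case item := <-d.work:
			d.handle(item)
		case req := <-d.closeRequests:
			close(d.out)
			close(req.done)
			return
		}
	}
}

// handle is called from the dispatcher loop and writes to out. It blocks if
// the buffer of out is full and nobody is reading.
func (d *Dispatcher) handle(item string) {
	d.out <- item
}

// Submit hands an item to the dispatcher loop.
func (d *Dispatcher) Submit(item string) {
	d.work <- item
}

// Close no longer closes channels itself. It sends a closeRequest and waits
// until the dispatcher loop has handled it.
func (d *Dispatcher) Close() {
	req := &closeRequest{done: make(chan struct{})}
	d.closeRequests <- req
	<-req.done
}
```

Since the loop handles one case at a time, any handle call already in progress finishes before the close request is processed.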