I don't have a way to test all the other notification mechanisms, which
is something we should fix in general. For now, only PagerDuty and email
have the new runbook and alertmanager URL information.
I'm not very happy with the cleanliness of this change, or of the
codebase overall, but since we need this urgently tomorrow, I hope
this is fine for now.
Load silences from a local JSON file at startup and write them back out
every 10 seconds.
It's not perfect (a write may be interrupted and leave an inconsistent
file if the program is terminated while writing), but this is a
temporary solution to keep people from going crazy about lost silences
until we have proper storage management. If the JSON file gets
corrupted, the alert manager simply starts up without any silences
loaded.
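A minimal sketch of the load-at-startup, save-periodically scheme
described above. This is an illustration, not the actual implementation:
the Silence fields, the Silences type, its method names, and the
silences.json file name are all assumptions.

```go
package main

import (
	"encoding/json"
	"log"
	"os"
	"sync"
	"time"
)

// Silence is a hypothetical, pared-down silence record.
type Silence struct {
	ID      uint64 `json:"id"`
	Comment string `json:"comment"`
}

// Silences is an in-memory silence set guarded by a mutex.
type Silences struct {
	mu    sync.Mutex
	items []Silence
}

// Add appends a silence to the set.
func (s *Silences) Add(sc Silence) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items = append(s.items, sc)
}

// Len reports how many silences are currently held.
func (s *Silences) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.items)
}

// Snapshot serializes the current set to JSON.
func (s *Silences) Snapshot() ([]byte, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return json.Marshal(s.items)
}

// Restore replaces the set with the decoded data. On corrupt input the
// set is left empty and the decode error is returned.
func (s *Silences) Restore(data []byte) error {
	var items []Silence
	err := json.Unmarshal(data, &items)
	if err != nil {
		items = nil
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items = items
	return err
}

// LoadFile restores the set from path. A missing file leaves the set
// unchanged; a corrupt file leaves it empty.
func (s *Silences) LoadFile(path string) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	return s.Restore(data)
}

// SaveFile writes the snapshot straight to path. The write is not
// atomic, so termination mid-write can leave a truncated file.
func (s *Silences) SaveFile(path string) error {
	data, err := s.Snapshot()
	if err != nil {
		return err
	}
	return os.WriteFile(path, data, 0o644)
}

// SaveEvery saves to path once per interval until stop is closed.
func (s *Silences) SaveEvery(path string, interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			if err := s.SaveFile(path); err != nil {
				log.Printf("saving silences: %v", err)
			}
		case <-stop:
			return
		}
	}
}

func main() {
	const path = "silences.json" // hypothetical file name
	var s Silences
	if err := s.LoadFile(path); err != nil {
		log.Printf("starting without silences: %v", err)
	}
	stop := make(chan struct{})
	go s.SaveEvery(path, 10*time.Second, stop)
	// A real program would serve requests here and close stop on shutdown.
	close(stop)
}
```

Writing to a temporary file and renaming it over the target would
narrow the interrupted-write window, but that goes beyond what this
change does.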
Start with the simplest possible locking scheme: lock the object-wide
mutex at the beginning of each user-facing method. This is equivalent to
the implicit locking provided by the reactor.
The motivation for this change is the incredible boilerplate overhead of
the previous reactor request/response code.
Overhead of the reactor model for every user-facing method:
- 2 struct type definitions (req/resp)
- 1 channel
- 1 struct member definition site
- 1 channel init site
- 1 struct population site
- 1 struct servicing site
- 1 struct closing site
- 1 actual execution method
New lock-based code:
Per object: 1 lock
Per method:
- 1 lock acquisition
- 1 actual execution method
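To make the comparison above concrete, here is a hedged sketch of one
user-facing method written both ways. The types and names (reactorStore,
lockedStore, Count) are invented for illustration and do not come from
the actual codebase; the comments mark where each item in the reactor
overhead list shows up.

```go
package main

import (
	"fmt"
	"sync"
)

// Reactor style: every user-facing method needs its own request and
// response types, a channel, and a case in the reactor loop.

type countRequest struct {
	response chan countResponse
}

type countResponse struct {
	n int
}

type reactorStore struct {
	items     []string
	countReqs chan countRequest // struct member definition site
	done      chan struct{}
}

func newReactorStore(items ...string) *reactorStore {
	s := &reactorStore{
		items:     items,
		countReqs: make(chan countRequest), // channel init site
		done:      make(chan struct{}),
	}
	go s.run()
	return s
}

// run is the reactor loop, the only goroutine that touches items.
func (s *reactorStore) run() {
	for {
		select {
		case req := <-s.countReqs: // servicing site
			req.response <- countResponse{n: s.count()}
			close(req.response) // closing site
		case <-s.done:
			return
		}
	}
}

// count is the actual execution method.
func (s *reactorStore) count() int { return len(s.items) }

// Count is the user-facing wrapper around the request/response exchange.
func (s *reactorStore) Count() int {
	req := countRequest{response: make(chan countResponse)} // population site
	s.countReqs <- req
	return (<-req.response).n
}

// Stop shuts the reactor goroutine down.
func (s *reactorStore) Stop() { close(s.done) }

// Lock style: one mutex per object, one lock acquisition per method.

type lockedStore struct {
	mu    sync.Mutex
	items []string
}

func (s *lockedStore) Add(item string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items = append(s.items, item)
}

func (s *lockedStore) Count() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.items)
}

func main() {
	r := newReactorStore("a", "b")
	defer r.Stop()

	l := &lockedStore{}
	l.Add("a")
	l.Add("b")

	fmt.Println(r.Count(), l.Count())
}
```

Both variants serialize access to items; the lock-based one simply does
so without the per-method channel plumbing.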