alertmanager

Max Inden 3735df3ac7 (2018-07-11 17:19:33 +02:00)

cluster: Do not exit when failing to join cluster (#1465)

Alertmanager exits with a non-zero exit code if the initial cluster join fails. This behavior can be undesirable because:

- As Alertmanager is a critical component with an at-least-once guarantee, failing on joining the cluster is unnecessary, since Alertmanager still functions by itself.
- In an environment like Kubernetes that discovers peers via DNS, peers might roll out one by one, leaving the DNS entries unpopulated for the first peer of a set. Failing on the initial join prevents a roll-out.

Instead of failing on the initial join, this patch only logs the failure. The cluster can be joined later via `handleReconnect`.

This is a regression introduced in PR #1456 [1].

[1] https://github.com/prometheus/alertmanager/pull/1456

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
clusterpb	*: move to memberlist for clustering	2018-02-08 12:18:44 +01:00
advertise.go	cluster: fail when no private address can be found (#1437)	2018-07-05 22:59:56 +02:00
advertise_test.go	cluster: fail when no private address can be found (#1437)	2018-07-05 22:59:56 +02:00
channel.go	gossip large messages via SendReliable (#1415)	2018-06-15 13:40:21 +02:00
channel_test.go	gossip large messages via SendReliable (#1415)	2018-06-15 13:40:21 +02:00
cluster.go	cluster: Do not exit when failing to join cluster (#1465)	2018-07-11 17:19:33 +02:00
cluster_test.go	cluster: make sure we don't miss the first pushPull (#1456)	2018-07-09 11:16:04 +02:00
delegate.go	cluster: make sure we don't miss the first pushPull (#1456)	2018-07-09 11:16:04 +02:00