1528 Commits

Author SHA1 Message Date
Fabian Reinartz 56ff38288e *: cut 0.15.0-rc.0 2018-02-28 12:43:43 +01:00
Fabian Reinartz 187b116bba Build with Go 1.10 2018-02-28 11:37:17 +01:00
pasquier-s c39a913f8a test: enable race detection (#1262)
This change enables race detection when running the tests. It also fixes
a couple of existing race conditions.
2018-02-27 18:18:53 +01:00
pasquier-s 3df093968c cluster: gather alertmanager_peer_position all the time (#1247)
* cluster: gather alertmanager_peer_position all the time

This change moves the gathering of the alertmanager_peer_position metric
outside of the clusterWait() function so that the metric is computed
accurately even when no alerting group fires.

* cluster: add alertmanager_cluster_health_score metric

This metric is retrieved from the memberlist library.
2018-02-27 10:37:56 +01:00
pasquier-s c2dac90434 silence: fix skipped test (#1258)
TestStateMerge() was skipped because of a typo. Fixing the name revealed
that the test itself needed to be updated following the switch to the
memberlist library.
2018-02-27 10:17:48 +01:00
Brian Brazil 5cb71e1def Fix spelling and comment style. (#1257) 2018-02-27 10:07:33 +01:00
pasquier-s 29e441f88f Fix miscellaneous issues revealed by Go 1.10 (#1256)
* provider/mem: fix format verbs in tests

* api: fix format verb
2018-02-22 14:57:45 +00:00
stuart nelson 0f9c9a0bb0 Remove unused functions for mesh (#1251)
These functions were used with weaveworks/mesh,
but are no longer needed with memberlist.
2018-02-16 18:16:06 +01:00
Frederic Branczyk 28db2409fd Merge pull request #1222 from simonpasquier/httpcfg
*: configure http client from config
2018-02-16 14:44:37 +01:00
Fabian Reinartz dd675e0c89 Merge pull request #1242 from roidelapluie/ptc
cluster: Make peer timeout configurable
2018-02-14 11:20:53 +01:00
Fabian Reinartz 4e434573c7 Merge pull request #1245 from simonpasquier/fix-join
cluster: pass resolved peers to Join()
2018-02-14 11:20:34 +01:00
Simon Pasquier f4c81c43e9 cluster: pass resolved peers to Join() 2018-02-13 16:53:09 +01:00
Julien Pivotto dc293439ca cluster: Make peer timeout configurable
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2018-02-13 16:31:33 +01:00
pasquier-s 382a0d8089 api: support zero StartsAt for alerts (#1238)
When the API receives alerts where StartsAt is zero, it updates the
value to EndsAt (if not zero itself) or "now". This ensures that the
alert validation will not fail since StartsAt has to be less than or
equal to EndsAt.
2018-02-13 16:26:34 +01:00
Frederic Branczyk 32f90a02ca Merge pull request #1239 from simonpasquier/remove-dead-code
cmd: remove unused code
2018-02-13 16:21:44 +01:00
Simon Pasquier 955c92f1b6 Configure http client for Wechat 2018-02-13 14:52:53 +01:00
Simon Pasquier 8b93f1085d Add tests for HTTP client configuration 2018-02-13 14:30:59 +01:00
Frederic Branczyk d678022fea *: configure http client from config 2018-02-13 14:30:59 +01:00
Simon Pasquier 9d16fe8266 cmd: remove unused code 2018-02-13 14:20:54 +01:00
Frederic Branczyk 8eb8c1baa0 Merge pull request #1232 from prometheus/memberlist
*: move to memberlist for clustering
2018-02-13 11:26:51 +01:00
Stuart Nelson a552afd998 Merge branch 'master' into memberlist 2018-02-13 10:47:17 +01:00
stuart nelson 30af4d051b release 0.14 (#1237) 2018-02-13 09:13:44 +01:00
Fabian Reinartz e6df2d8751 Adapt cluster listen address flag in tests 2018-02-12 11:31:55 +01:00
Stuart Nelson 46c6b3f2f1 Update frontend 2018-02-12 11:13:27 +01:00
Fabian Reinartz 6cfbe6e8b4 update cluster listen address flag 2018-02-12 10:22:49 +01:00
songjiayang d07a072b08 Fix WeChat issue (#1229)
* fix wechat issue

* wechat issue code review
2018-02-11 20:09:47 +01:00
Fabian Reinartz 3f2e00fbea cluster/api: improve metrics and cluster status 2018-02-09 11:16:00 +01:00
Fabian Reinartz 247bfff606 cluster: remove MergeSingle 2018-02-09 11:06:51 +01:00
pasquier-s 76ee5388e7 Forbid 0 value for group_interval and repeat_interval (#1230)
Setting one of these parameters to a zero value doesn't make sense
semantically and can cause high CPU usage.
2018-02-09 10:53:46 +01:00
Mike Bryant 6615ed15d2 Add templating to PD-CEF fields; Add missing field (#1231)
* Allow templating of Component and Group in PagerDuty v2

Related to #1211

* Add missing PD-CEF field Component
2018-02-09 10:50:18 +01:00
Andrey Kuzmin 5101d65938 Fix the slowness of the Silence UI (#1235)
* Cache tabs and fix slow css

* update bindata
2018-02-09 10:42:44 +01:00
Fabian Reinartz fd49dbb477 *: move to memberlist for clustering 2018-02-08 12:18:44 +01:00
Frederic Branczyk 168cb217c6 Merge pull request #1233 from Conorbro/resolved-alert-counter-fix
Fixes AM wrongly counting alerts with EndTimes in the future as resolved
2018-02-08 10:54:13 +01:00
conorbroderick e8832619e0 Fixes AM wrongly counting alerts with EndTimes in the future as resolved 2018-02-07 15:52:26 +00:00
Corentin Chary a43a513b77 Fix OpsGenie notifier and add unit tests (#1224)
See #1223, looks like OpsGenie now sometimes returns a 422 when you
don't specify a team. This change cleans up the JSON output and
adds a few unit tests.
2018-02-06 13:45:59 +01:00
pasquier-s 17bd637c97 Add mesh metrics (#1225)
* Add mesh metrics

This change adds 2 new metrics for the mesh:

* alertmanager_peer_connection, state of the connection between the
  Alertmanager instance and a peer.
* alertmanager_peer_terminations_total, total number of terminated
  connections.

It also moves the gathering of the alertmanager_peer_position metric
outside of the meshWait() function so that the metric is computed
accurately even when no alerting group fires.

* Remove 'nick' label from alertmanager_peer_connection metric
2018-02-06 12:13:52 +01:00
Carlos Alexandro Becker c5ea346d06 allow global opsgenie api key (#1208)
* allow global opsgenie api key

* added missing files

* removed test
2018-01-29 16:05:17 +01:00
Carlos Alexandro Becker 23f31d7d5a improved error when victorops fails (#1207)
* improved error when victorops fails

* moved to debug

* allocate mem only once

* joining strings

* logging receiver name

* passing only group name
2018-01-29 16:00:04 +01:00
Tom Paine 081fc7d982 Update simple.yml (#1216)
match spacing on other receiver groups
2018-01-29 15:58:44 +01:00
Daniel Bonatto 94bef6419f Fixes prometheus/alertmanager#1211 (#1214)
Add template to severity field for PagerDuty API v2.
2018-01-27 11:22:41 +01:00
pasquier-s 62b957cc14 Notify only when new firing alerts are added (#1205)
After the initial notification has been sent, Alertmanager shouldn't notify the
receiver again when no new alerts have been added to the group during
group_interval.

This change also modifies the acceptance test framework to assert that no
notification has been received in a given interval.
2018-01-23 16:52:03 +01:00
Stuart Nelson b45c11b561 Fix tests 2018-01-21 15:38:19 +01:00
Jose Donizetti fc9306cd7e Add expired silence validation (#1096)
* Add expired silence validation

* Add silence end time in the past validation
2018-01-21 15:29:51 +01:00
Jose Donizetti 2fe013bcaa Add tests to memory provider (#1104) 2018-01-21 15:27:21 +01:00
pasquier-s 63598904dc Fix pending connections never going to established (#1204) 2018-01-21 15:09:50 +01:00
pasquier-s 9b10acae68 Don't notify resolved alerts if none were firing (#1198)
* Don't notify resolved alerts if none were firing

* Fix comments
2018-01-18 11:12:17 +01:00
benbradley 0db01af11e amtool silence update: support d/w/y suffixes for the expire flag (#1197) 2018-01-15 19:45:46 +01:00
Stuart Nelson d20282e1e3 Correct CHANGELOG.md 2018-01-12 14:24:40 +01:00
stuart nelson fb713f6d82 v0.13.0 (#1194) 2018-01-12 11:29:15 +01:00
Stuart Nelson 7d36d79aba Update silence query long help 2018-01-12 10:44:38 +01:00