Simon Pasquier
f4c81c43e9
cluster: pass resolved peers to Join()
2018-02-13 16:53:09 +01:00
Frederic Branczyk
8eb8c1baa0
Merge pull request #1232 from prometheus/memberlist
...
*: move to memberlist for clustering
2018-02-13 11:26:51 +01:00
Stuart Nelson
a552afd998
Merge branch 'master' into memberlist
2018-02-13 10:47:17 +01:00
stuart nelson
30af4d051b
release 0.14 ( #1237 )
2018-02-13 09:13:44 +01:00
Fabian Reinartz
e6df2d8751
Adapt cluster listen address flag in tests
2018-02-12 11:31:55 +01:00
Stuart Nelson
46c6b3f2f1
Update frontend
2018-02-12 11:13:27 +01:00
Fabian Reinartz
6cfbe6e8b4
update cluster listen address flag
2018-02-12 10:22:49 +01:00
songjiayang
d07a072b08
Fix WeChat issue ( #1229 )
...
* fix wechat issue
* wechat issue code review
2018-02-11 20:09:47 +01:00
Fabian Reinartz
3f2e00fbea
cluster/api: improve metrics and cluster status
2018-02-09 11:16:00 +01:00
Fabian Reinartz
247bfff606
cluster: remove MergeSingle
2018-02-09 11:06:51 +01:00
pasquier-s
76ee5388e7
Forbid 0 value for group_interval and repeat_interval ( #1230 )
...
Setting one of these parameters to a zero value doesn't make sense
semantically and can cause high CPU usage.
2018-02-09 10:53:46 +01:00
Mike Bryant
6615ed15d2
Add templating to PD-CEF fields; Add missing field ( #1231 )
...
* Allow templating of Component and Group in PagerDuty v2
Related to #1211
* Add missing PD-CEF field Component
2018-02-09 10:50:18 +01:00
Andrey Kuzmin
5101d65938
Fix the slowness of the Silence UI ( #1235 )
...
* Cache tabs and fix slow css
* update bindata
2018-02-09 10:42:44 +01:00
Fabian Reinartz
fd49dbb477
*: move to memberlist for clustering
2018-02-08 12:18:44 +01:00
Frederic Branczyk
168cb217c6
Merge pull request #1233 from Conorbro/resolved-alert-counter-fix
...
Fixes AM wrongly counting alerts with EndTimes in the future as resolved
2018-02-08 10:54:13 +01:00
conorbroderick
e8832619e0
Fixes AM wrongly counting alerts with EndTimes in the future as resolved
2018-02-07 15:52:26 +00:00
Corentin Chary
a43a513b77
Fix OpsGenie notifier and add unit tests ( #1224 )
...
See #1223 , looks like OpsGenie now sometimes returns a 422 when you
don't specify a team. This change cleans up the JSON output and
adds a few unit tests.
2018-02-06 13:45:59 +01:00
pasquier-s
17bd637c97
Add mesh metrics ( #1225 )
...
* Add mesh metrics
This change adds 2 new metrics for the mesh:
* alertmanager_peer_connection, state of the connection between the
Alertmanager instance and a peer.
* alertmanager_peer_terminations_total, total number of terminated
connections.
It also moves the gathering of the alertmanager_peer_position metric
outside of the meshWait() function so that the metric is computed
accurately even when no alerting group fires.
* Remove 'nick' label from alertmanager_peer_connection metric
2018-02-06 12:13:52 +01:00
Carlos Alexandro Becker
c5ea346d06
allow global opsgenie api key ( #1208 )
...
* allow global opsgenie api key
* added missing files
* removed test
2018-01-29 16:05:17 +01:00
Carlos Alexandro Becker
23f31d7d5a
improved error when victorops fails ( #1207 )
...
* improved error when victorops fails
* moved to debug
* allocate mem only once
* joining strings
* logging receiver name
* passing only group name
2018-01-29 16:00:04 +01:00
Tom Paine
081fc7d982
Update simple.yml ( #1216 )
...
match spacing on other receiver groups
2018-01-29 15:58:44 +01:00
Daniel Bonatto
94bef6419f
Fixes prometheus/alertmanager#1211 ( #1214 )
...
Add template to severity field for PagerDuty API v2.
2018-01-27 11:22:41 +01:00
pasquier-s
62b957cc14
Notify only when new firing alerts are added ( #1205 )
...
After the initial notification has been sent, AlertManager shouldn't notify the
receiver again when no new alerts have been added to the group during
group_interval.
This change also modifies the acceptance test framework to assert that no
notification has been received in a given interval.
2018-01-23 16:52:03 +01:00
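The behaviour described above amounts to a subset check: after the first notification, notify again within group_interval only if some currently firing alert was not in the last notification. The sketch below illustrates that idea under simplifying assumptions; the name `hasNewFiring` and the use of string IDs are hypothetical, and repeat_interval and resolved alerts are deliberately ignored.

```go
package main

import "fmt"

// hasNewFiring is a hypothetical helper. It reports whether any currently
// firing alert is missing from the set included in the last notification.
func hasNewFiring(firing []string, notified map[string]struct{}) bool {
	for _, id := range firing {
		if _, ok := notified[id]; !ok {
			return true
		}
	}
	return false
}

func main() {
	// Alerts "a" and "b" were included in the previous notification.
	notified := map[string]struct{}{"a": {}, "b": {}}

	fmt.Println(hasNewFiring([]string{"a", "b"}, notified))      // nothing new
	fmt.Println(hasNewFiring([]string{"a", "b", "c"}, notified)) // "c" is new
}
```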
Stuart Nelson
b45c11b561
Fix tests
2018-01-21 15:38:19 +01:00
Jose Donizetti
fc9306cd7e
Add expired silence validation ( #1096 )
...
* Add expired silence validation
* Add silence end time in the past validation
2018-01-21 15:29:51 +01:00
Jose Donizetti
2fe013bcaa
Add tests to memory provider ( #1104 )
2018-01-21 15:27:21 +01:00
pasquier-s
63598904dc
Fix pending connections never going to established ( #1204 )
2018-01-21 15:09:50 +01:00
pasquier-s
9b10acae68
Don't notify resolved alerts if none were firing ( #1198 )
...
* Don't notify resolved alerts if none were firing
* Fix comments
2018-01-18 11:12:17 +01:00
benbradley
0db01af11e
amtool silence update: support d/w/y suffixes for the expire flag ( #1197 )
2018-01-15 19:45:46 +01:00
Stuart Nelson
d20282e1e3
Correct CHANGELOG.md
2018-01-12 14:24:40 +01:00
stuart nelson
fb713f6d82
v0.13.0 ( #1194 )
2018-01-12 11:29:15 +01:00
Stuart Nelson
7d36d79aba
Update silence query long help
2018-01-12 10:44:38 +01:00
Thomás S. Bregolin
cdb44955cf
Make --expired list only expired silences ( #1176 ) ( #1190 )
...
This means there's no longer a way to list both active and expired
silences at the same time. This is the desired behaviour according to
consensus at https://github.com/prometheus/alertmanager/pull/1175
2018-01-12 10:35:06 +01:00
pasquier-s
907ac510f8
Fix flaky TestBatching acceptance test ( #1193 )
...
This change decreases the repeat_interval parameter from 5s to 4.9s to
make sure that the alerts are effectively sent after 5 seconds.
The workflow is:
- The dispatcher flushes the alerts at t0, sends the notification and
marks the notification log at t0+epsilon.
- The dispatcher flushes the alerts at t1, t2, t3 and t4 and doesn't
send the notifications as expected.
- At t5, the dispatcher flushes the alerts because current_time - (t0+epsilon)
is greater than repeat_interval.
If repeat_interval is exactly 5s, there is a small chance that it is
greater than current_time - (t0+epsilon).
2018-01-11 22:45:59 +01:00
Colin Douch
17846f2e33
Fix updating silence comments ( #1189 )
...
Possibly another regression introduced by #976 . We use the wrong
variable to update comments in the `amtool silence update` command
which causes us to fail silently. This fixes that.
2018-01-10 17:05:03 +01:00
pasquier-s
a7d4e4ea7c
Log snapshot sizes on maintenance ( #1155 )
...
* Log snapshot sizes on maintenance
* Add metrics for snapshot sizes
This change adds 2 new gauges for tracking the last snapshots' sizes:
- alertmanager_nflog_snapshot_size_bytes
- alertmanager_silences_snapshot_size_bytes
2018-01-10 14:53:57 +01:00
stuart nelson
7b787dab05
Re-introduce prometheus durations in amtool silence creation ( #1185 )
...
* Fixes #1183
* Update expires comment
The default time is already output thanks to
kingpin.
2018-01-09 10:47:41 +01:00
stuart nelson
3aa7f03b10
Template secret keys for pagerduty notifier ( #1168 ) ( #1182 )
...
The tmpl() call was removed when migrating to
support pd v2 events api.
2018-01-08 13:41:10 +01:00
stuart nelson
3c61fe3fef
Return reload status from http endpoint ( #1152 ) ( #1180 )
...
* Return reload status from http endpoint (#1152 )
* Use same reload messaging as prometheus
2018-01-08 11:51:05 +01:00
Frederic Branczyk
0b5af7510b
Merge pull request #1159 from simonpasquier/add-healthy-probes
...
Add /-/healthy endpoint
2018-01-08 11:25:16 +01:00
Calle Pettersson
b7da058efb
Switch cmd/alertmanager to kingpin ( #974 )
2018-01-06 11:22:26 +01:00
Conor Broderick
a1153e83ff
Merge pull request #1167 from prometheus/fix-error-message
...
Fix error message
2018-01-03 11:10:39 +00:00
Christian Hoffmann
0e63715b23
UI: Fix JavaScript error in MSIE due to endswith() usage ( #1172 )
...
* index: avoid endswith() for MSIE compatibility
MSIE does not support endswith() [1]. substr() can
be used to work around this limitation.
[1] https://docs.microsoft.com/en-us/scripting/javascript/reference/endswith-method-string-javascript
* index: clean up comment
* ui: update bindata
2018-01-02 14:25:54 +01:00
Andrey Kuzmin
b8d20dffca
Update bindata.go
2018-01-02 12:46:24 +01:00
Andrey Kuzmin
1ccc7b1133
Don't output malformed error body
2018-01-02 12:45:36 +01:00
Andrey Kuzmin
6f8ccb031c
Fix expire buttons on the silences page ( #1171 )
...
* Only show confirmation for the specific silence
* Update bindata.go
2018-01-02 12:25:34 +01:00
Fabian Reinartz
92c04096a8
Merge pull request #1154 from dvrkps/patch-1
...
travis: update go version
2017-12-27 19:05:12 +01:00
pasquier-s
364979bbf8
Display connections in the Status page ( #1164 )
...
This change shows the status of the local connections in the web UI. It
can be used to troubleshoot mesh issues.
2017-12-22 11:39:27 +01:00
Calle Pettersson
608848390f
Switch amtool to kingpin ( #976 )
...
* Switch cmd/amtool to kingpin
* Touch-ups
* Implement long help
* Add missing short-form of --output
* Fix backwards compatibility for config file options
* Fix vendoring
* Review fixes
* Fix flag word order
2017-12-22 11:17:13 +01:00
anthraxn8b
2a0989094b
Added 2nd email address to "to" field ( #1163 )
...
Did this to give an example with multiple email addresses in the "to" field.
2017-12-22 00:14:23 +01:00