Commit Graph

50 Commits

Author SHA1 Message Date
Conor Broderick
1fd20fc954 Add dropped alertmanagers to alertmanagers API (#3865) 2018-02-21 09:00:07 +00:00
Matt Bostock
f0fd701309 Clarify notify metric descriptions (#2551)
The implementation of `sendAll` means that we observe latencies even for
notifications that would be considered dropped due to errors when
sending them.

Similarly, we count alerts as 'sent' even if an error occurred when
trying to send them (meaning they are potentially not sent at all).

336c7870ea/notifier/notifier.go (L340-L347)
2018-02-19 14:40:49 +00:00
Krasi Georgiev
7858745c04 rename structs for consistency 2018-01-30 17:49:05 +00:00
Krasi Georgiev
719c579f7b refactor main execution reloadReady handling, update some comments 2018-01-17 18:14:24 +00:00
Krasi Georgiev
5260c650ec use the config hash for the map lookup 2018-01-16 11:10:54 +00:00
Krasi Georgiev
8369826808 comment to rethink the map reference for the notifier discovery 2018-01-16 09:47:53 +00:00
Krasi Georgiev
d12e6f29fc discovery manager ApplyConfig now takes a direct ServiceDiscoveryConfig so that it can be used for the notify manager
reimplement the service discovery for the notify manager

Signed-off-by: Krasi Georgiev <krasi.root@gmail.com>
2018-01-15 13:39:44 +00:00
Shubheksha Jalan
ec94df49d4 Refactor SD configuration to remove config dependency (#3629)
* refactor: move targetGroup struct and CheckOverflow() to their own package

* refactor: move auth and security related structs to a utility package, fix import error in utility package

* refactor: Azure SD, remove SD struct from config

* refactor: DNS SD, remove SD struct from config into dns package

* refactor: ec2 SD, move SD struct from config into the ec2 package

* refactor: file SD, move SD struct from config to file discovery package

* refactor: gce, move SD struct from config to gce discovery package

* refactor: move HTTPClientConfig and URL into util/config, fix import error in httputil

* refactor: consul, move SD struct from config into consul discovery package

* refactor: marathon, move SD struct from config into marathon discovery package

* refactor: triton, move SD struct from config to triton discovery package, fix test

* refactor: zookeeper, move SD structs from config to zookeeper discovery package

* refactor: openstack, remove SD struct from config, move into openstack discovery package

* refactor: kubernetes, move SD struct from config into kubernetes discovery package

* refactor: notifier, use targetgroup package instead of config

* refactor: tests for file, marathon, triton SD - use targetgroup package instead of config.TargetGroup

* refactor: retrieval, use targetgroup package instead of config.TargetGroup

* refactor: storage, use config util package

* refactor: discovery manager, use targetgroup package instead of config.TargetGroup

* refactor: use HTTPClient and TLS config from configUtil instead of config

* refactor: tests, use targetgroup package instead of config.TargetGroup

* refactor: fix tagetgroup.Group pointers that were removed by mistake

* refactor: openstack, kubernetes: drop prefixes

* refactor: remove import aliases forced due to vscode bug

* refactor: move main SD struct out of config into discovery/config

* refactor: rename configUtil to config_util

* refactor: rename yamlUtil to yaml_config

* refactor: kubernetes, remove prefixes

* refactor: move the TargetGroup package to discovery/

* refactor: fix order of imports
2017-12-29 21:01:34 +01:00
Krasi Georgiev
e405e2f1ea refactored discovery 2017-12-18 17:22:49 +00:00
Luke Overend
9532c2c700 Pass ams to go routine when sending alerts (#3284)
Currently when sending alerts via the go routine within `sendAll`, the value
of `ams` is not passed to the routine, causing it to use the updated value of `ams`.

Example config:

```
alerting:
  alertmanagers:
    - basic_auth:
        username: 'prometheus'
        password: 'test123'
      static_configs:
      - targets:
        - localhost:9094
    - static_configs:
      - targets:
        - localhost:9095
```

In this example alerts sent to `localhost:9094` fail with:

```
level=error ts=2017-10-12T10:03:53.456819948Z caller=notifier.go:445
component=notifier alertmanager=http://localhost:9094/api/v1/alerts
count=1 msg="Error sending alert" err="bad response status 401
Unauthorized"
```

If you change the order to be:

```
alerting:
  alertmanagers:
    - static_configs:
      - targets:
        - localhost:9095
    - basic_auth:
        username: 'prometheus'
        password: 'test123'
      static_configs:
      - targets:
        - localhost:9094
```

It works as expected.

This commit changes the behavour so `ams` is passed to the go routine so
`n.sendOne` uses the appropriate `http.Client` details.
2017-12-12 13:40:00 +00:00
Julius Volz
099df0c5f0 Migrate "golang.org/x/net/context" -> "context" (#3333)
In some places, where ctxhttp or gRPC are concerned, we still need to use the
old contexts.
2017-10-24 21:21:42 -07:00
Marc Sluiter
6a633eece1 Added go-conntrack for monitoring http connections (#3241)
Added metrics for in- and outgoing traffic with go-conntrack.
2017-10-06 11:22:19 +01:00
Fabian Reinartz
d21f149745 *: migrate to go-kit/log 2017-09-08 22:01:51 +05:30
Fabian Reinartz
669075c6b9 Merge branch 'master' into dev-2.0 2017-06-06 09:36:51 +02:00
Chris Goller
42de0ae013 Use log.Logger interface for all discovery services 2017-06-01 11:25:55 -05:00
Frederic Branczyk
45df5c2daf
Merge branch 'release-1.6' 2017-05-22 13:44:44 +02:00
Frederic Branczyk
94e8b43aae
notifier: clone and not reuse LabelSet in AM discovery 2017-05-18 10:12:42 +02:00
Fabian Reinartz
6e804b3497 Merge branch 'master' into dev-2.0 2017-05-12 13:29:58 +02:00
Frederic Branczyk
0c96c4b157
notifier: expose metric for number of discovered alertmanagers 2017-05-08 10:37:19 +02:00
Fabian Reinartz
73b8ff0ddc Merge branch 'master' into dev-2.0 2017-04-27 10:19:55 +02:00
David Symonds
04ad889751 Preserve Alertmanager URLs as *url.URL.
Render a nicer link in the web UI.
2017-04-25 16:17:46 +10:00
Fabian Reinartz
8ffc851147 Merge branch 'master' into dev-2.0 2017-04-04 15:17:56 +02:00
Julius Volz
589061919a Merge pull request #2465 from Gouthamve/alert-metrics-2429
Better Metrics For Alerts
2017-03-31 21:45:05 +02:00
Goutham Veeramachaneni
f27ce34a13
Use Registerer to Register All Metrics
* Made Metric a Gauge so that it can be registered.
2017-04-01 00:14:30 +05:30
Goutham Veeramachaneni
7ba0a9e81a Add Comment About Initialising Counters 2017-03-31 23:39:02 +05:30
Goutham Veeramachaneni
0d0c9d5440
Move Registerer to Config Struct in Notifier 2017-03-31 21:20:12 +05:30
Julius Volz
815762a4ad Move retrieval.NewHTTPClient -> httputil.NewClientFromConfig 2017-03-20 14:17:04 +01:00
Goutham Veeramachaneni
f35816613e
Refactored Notifier to use Registerer
* Brought metrics back into Notifier

Notifier still implements a Collector. Check if that is needed.
2017-03-03 02:53:16 +05:30
Goutham Veeramachaneni
41da5c4ef2
Better Metrics For Alerts
* Closes prometheus/prometheus#2429
* Moved metrics to top of file for easier access
* Initialised CounterVecs
2017-03-02 23:58:15 +05:30
Fabian Reinartz
9304179ef7 Merge branch 'master' into dev-2.0 2017-03-02 08:16:58 +01:00
Julius Volz
f152ac5e23 notifier: Allow swapping out HTTP Doer
We need to be able to modify the HTTP POST in Weave Cortex to add
multitenancy information to a notification. Since we only really need a
special header in the end, the other option would be to just allow
passing in headers to the notifier. But swapping out the whole Doer is
more general and allows others to swap out the network-talky bits of the
notifier for their own use. Doing this via contexts here wouldn't work
well, due to the decoupled flow of data in the notifier.

There was no existing interface containing the ctxhttp.Post() or
ctxhttp.Do() methods, so I settled on just using Do() as a swappable
function directly (and with a more minimal signature than Post).
2017-02-27 20:36:22 +01:00
Fabian Reinartz
1d3cdd0d67 Merge branch 'master' into dev-2.0-rebase 2017-01-30 17:43:01 +01:00
Matt Bostock
4160892109 Correct notifications_dropped description
The current description does not accurately describe when the metric is incremented.

Aside from Alertmanger missing from the configuration, `prometheus_notifications_dropped_total` is incremented when errors occur while sending alert notifications to Alertmanager, or because the notifications queue is full, or because the number of notifications to be sent exceeds the queue capacity.

I think calling these cases 'errors' in a generic sense is more useful than the current description.
2017-01-13 23:36:00 +00:00
Fabian Reinartz
8b4e4a9d2b notifier: fully use labels.Labels 2016-12-29 16:53:11 +01:00
Fabian Reinartz
5817cb5bde *: migrate from model.* to promql.* types 2016-12-25 00:37:46 +01:00
Fabian Reinartz
2ad56aabd4 notifier: extract alertmanager into interface 2016-11-25 11:19:43 +01:00
Fabian Reinartz
b1f28b48a3 Fix typo 2016-11-25 08:47:04 +01:00
Fabian Reinartz
3fb4d1191b config: rename AlertingConfig, resolve file paths 2016-11-24 15:19:37 +01:00
Fabian Reinartz
d4deb8bbf2 web: show discovered Alertmanagers in UI 2016-11-24 15:06:50 +01:00
Fabian Reinartz
f210d96497 notifier: use dynamic service discovery 2016-11-23 18:23:37 +01:00
Dominik Schulz
182e17958a Trivial spelling corrections and a small comment. 2016-10-18 20:14:38 +02:00
Frederic Branczyk
572423cbe5
notifier: attach external labels before relabelling
By attaching external labels before relabelling, they are available
during relabelling.
2016-09-27 14:34:56 +02:00
Frederic Branczyk
b655aa002f introduce top level alerting config node 2016-08-09 14:19:25 +02:00
Frederic Branczyk
7714b9c781 move relabeling functionality to its own package
also remove the returned error as it was always nil
2016-08-09 14:19:20 +02:00
Frederic Branczyk
679d225c8d allow relabeling of alerts
in case of dropping don't even enqueue them
2016-08-09 14:18:31 +02:00
Dmitry Vorobev
273e457da4 web: return status code and error message for config resource 2016-07-15 10:15:24 +02:00
beorn7
064b57858e Consistently use the Seconds() method for conversion of durations
This also fixes one remaining case of recording integral numbers
of seconds only for a metric, i.e. this will probably fix #1796.
2016-07-07 15:24:35 +02:00
Fabian Reinartz
9baf120cd5 notifier: dispatch to multiple Alertmanagers
This commit extends the notifier to dispatch alert batches
to multiple Alertmanagers concurrently.
It changes the `-alertmanager.url` flag to accept a comma
separated list of URLs and/or to be set multiple times.
2016-06-06 11:41:10 +02:00
Dmitry Savintsev
7fdb62c253 fix several minor golint style issues 2016-05-11 14:26:18 +02:00
Fabian Reinartz
bfa8aaa017 Rename notification to notifier 2016-03-01 12:39:08 +01:00