This is done to prevent the latter operation from blocking/starving the former, as previously, the `tsets` channel was consumed by the same goroutine that consumes and feeds the buffered `n.more` channel, the `tsets` channel was less likely to be ready as it's unbuffered and only fed every `SDManager.updatert` seconds.
See https://github.com/prometheus/prometheus/issues/13676 and https://github.com/prometheus/prometheus/issues/8768
The synchronization with the sendLoop goroutine is managed through the n.mtx mutex.
This uses a similar approach than scrape manager's efbd6e41c5/scrape/manager.go (L115-L117)
The old TestHangingNotifier was replaced by the new one to more closely reflect reality.
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
to show "targets groups update" starvation when the notifications queue is full and an Alertmanager
is down.
The existing `TestHangingNotifier` that was added in https://github.com/prometheus/prometheus/pull/10948 doesn't really reflect the reality as the SD changes are manually fed into `syncCh` in a continuous way, whereas in reality, updates are only resent every `updatert`.
The test added here sets up an SD manager and links it to the notifier. The SD changes will be triggered by that manager as it's done in reality.
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
Co-authored-by: Ethan Hunter <ehunter@hudson-trading.com>
When all alerts were dropped after alert relabeling, the `sendAll()`
function didn't release the lock properly which created a deadlock with
the Alertmanager target discovery.
In addition, the commit detects early when there are no Alertmanager
endpoint to notify to avoid unnecessary work.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* refactor: move from io/ioutil to io and os packages
* use fs.DirEntry instead of os.FileInfo after os.ReadDir
Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>
This creates a new `model` directory and moves all data-model related
packages over there:
exemplar labels relabel rulefmt textparse timestamp value
All the others are more or less utilities and have been moved to `util`:
gate logging modetimevfs pool runtime
Signed-off-by: beorn7 <beorn@grafana.com>
We are re-enabling HTTP 2 again. There has been a few bugfixes upstream
in go, and we have also enabled ReadIdleTimeout.
Fix#7588Fix#9068
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* Fix: Use json.Unmarshal() instead of json.Decoder
See https://ahmet.im/blog/golang-json-decoder-pitfalls/
json.Decoder is for JSON streams, not single JSON objects / bodies.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Revert modifications to targetgroup parsing
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Testify: move to require
Moving testify to require to fail tests early in case of errors.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* More moves
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* Refactor test assertions
This pull request gets rid of assert.True where possible to use
fine-grained assertions.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
This also fixes a bug in query_log_file, which now is relative to the config file like all other paths.
Signed-off-by: Andy Bursavich <abursavich@gmail.com>
* storage: Replace usage of sync/atomic with uber-go/atomic
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* tsdb: Replace usage of sync/atomic with uber-go/atomic
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* web: Replace usage of sync/atomic with uber-go/atomic
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* notifier: Replace usage of sync/atomic with uber-go/atomic
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* cmd: Replace usage of sync/atomic with uber-go/atomic
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* scripts: Verify that we are not using restricted packages
It checks that we are not directly importing 'sync/atomic'.
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* Reorganise imports in blocks
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* notifier/test: Apply PR suggestions
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* storage/remote: avoid storing references on newEntry
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* Revert "scripts: Verify that we are not using restricted packages"
This reverts commit 278d32748e.
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
* web: Group imports accordingly
Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
This change makes sure that nearly-identical Alertmanager configurations
aren't merged together.
The config's identifier was the MD5 hash of the configuration serialized
to JSON but because `relabel.Regexp` has no public field and doesn't
implement the JSON.Marshaler interface, it was always serialized to
"{}".
In practice, the identifier can be based on the index of the
configuration in the list.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* Update go.mod dependencies before release
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Add issue for showing query warnings in promtool
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Revert json-iterator back to 1.1.6
It produced errors when marshaling Point values with special float
values.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Fix expected step values in promtool tests after client_golang update
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Update generated protobuf code after proto dep updates
Signed-off-by: Julius Volz <julius.volz@gmail.com>
With v0.16.0 Alertmanager introduced a new API (v2). This patch adds a
configuration option for Prometheus to send alerts to the v2 endpoint
instead of the defautl v1 endpoint.
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
1. If alerts is empty after `relabelAlerts`, just return to avoid
subsequent unnecessary operations
2. minor fix in notifier's test case
3. minor fix in comment
Signed-off-by: YaoZengzeng <yaozengzeng@zju.edu.cn>
From the documentation:
> The default HTTP client's Transport may not
> reuse HTTP/1.x "keep-alive" TCP connections if the Body is
> not read to completion and closed.
This effectively enable keep-alive for the fixed requests.
Signed-off-by: Romain Baugue <romain.baugue@elwinar.com>
- Unmarshall external_labels config as labels.Labels, add tests.
- Convert some more uses of model.LabelSet to labels.Labels.
- Remove old relabel pkg (fixes#3647).
- Validate external label names.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
* *: remove use of golang.org/x/net/context
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* scrape: fix TestTargetScrapeScrapeCancel
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
* refactor: move targetGroup struct and CheckOverflow() to their own package
* refactor: move auth and security related structs to a utility package, fix import error in utility package
* refactor: Azure SD, remove SD struct from config
* refactor: DNS SD, remove SD struct from config into dns package
* refactor: ec2 SD, move SD struct from config into the ec2 package
* refactor: file SD, move SD struct from config to file discovery package
* refactor: gce, move SD struct from config to gce discovery package
* refactor: move HTTPClientConfig and URL into util/config, fix import error in httputil
* refactor: consul, move SD struct from config into consul discovery package
* refactor: marathon, move SD struct from config into marathon discovery package
* refactor: triton, move SD struct from config to triton discovery package, fix test
* refactor: zookeeper, move SD structs from config to zookeeper discovery package
* refactor: openstack, remove SD struct from config, move into openstack discovery package
* refactor: kubernetes, move SD struct from config into kubernetes discovery package
* refactor: notifier, use targetgroup package instead of config
* refactor: tests for file, marathon, triton SD - use targetgroup package instead of config.TargetGroup
* refactor: retrieval, use targetgroup package instead of config.TargetGroup
* refactor: storage, use config util package
* refactor: discovery manager, use targetgroup package instead of config.TargetGroup
* refactor: use HTTPClient and TLS config from configUtil instead of config
* refactor: tests, use targetgroup package instead of config.TargetGroup
* refactor: fix tagetgroup.Group pointers that were removed by mistake
* refactor: openstack, kubernetes: drop prefixes
* refactor: remove import aliases forced due to vscode bug
* refactor: move main SD struct out of config into discovery/config
* refactor: rename configUtil to config_util
* refactor: rename yamlUtil to yaml_config
* refactor: kubernetes, remove prefixes
* refactor: move the TargetGroup package to discovery/
* refactor: fix order of imports
Currently when sending alerts via the go routine within `sendAll`, the value
of `ams` is not passed to the routine, causing it to use the updated value of `ams`.
Example config:
```
alerting:
alertmanagers:
- basic_auth:
username: 'prometheus'
password: 'test123'
static_configs:
- targets:
- localhost:9094
- static_configs:
- targets:
- localhost:9095
```
In this example alerts sent to `localhost:9094` fail with:
```
level=error ts=2017-10-12T10:03:53.456819948Z caller=notifier.go:445
component=notifier alertmanager=http://localhost:9094/api/v1/alerts
count=1 msg="Error sending alert" err="bad response status 401
Unauthorized"
```
If you change the order to be:
```
alerting:
alertmanagers:
- static_configs:
- targets:
- localhost:9095
- basic_auth:
username: 'prometheus'
password: 'test123'
static_configs:
- targets:
- localhost:9094
```
It works as expected.
This commit changes the behavour so `ams` is passed to the go routine so
`n.sendOne` uses the appropriate `http.Client` details.