Commit Graph

404 Commits

Author SHA1 Message Date
Jop Zinkweg
1f69c38ba4
Add discovery support for triton compute nodes (#7250)
Added optional configuration item role, defaults to 'container' (backwards-compatible).
Setting role to 'cn' will discover compute nodes instead.

Human-friendly compute node hostname discovery depends on cmon 1.7.0:
c1a2aeca36

Adjust testcases to use discovery config per case as two different types are now supported.

Updated documentation:
* new role setting
* clarify what the name 'container' covers as triton uses different names in different locations

Signed-off-by: jzinkweg <jzinkweg@gmail.com>
2020-05-22 16:19:21 +01:00
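
The role handling described in 1f69c38ba4 can be pictured roughly as follows. This is a minimal sketch, not the Triton discovery code itself: `normalizeRole` is a hypothetical helper, and only the two role values and the `container` default come from the commit message.

```go
package main

import "fmt"

// normalizeRole is a hypothetical helper that mirrors the behaviour described
// in the commit message; it is not the actual Prometheus implementation.
func normalizeRole(role string) (string, error) {
	switch role {
	case "":
		// An omitted role falls back to "container", so older configs keep working.
		return "container", nil
	case "container", "cn":
		return role, nil
	default:
		return "", fmt.Errorf("unknown role %q, expected \"container\" or \"cn\"", role)
	}
}

func main() {
	for _, r := range []string{"", "container", "cn", "vm"} {
		got, err := normalizeRole(r)
		fmt.Printf("role=%q -> %q, err=%v\n", r, got, err)
	}
}
```

Defaulting the empty value rather than rejecting it is what keeps the change backwards-compatible.
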
Aleksandra Gacek
8e53c19f9c discovery/kubernetes: expose label_selector and field_selector
Close #6807

Co-authored-by @shuttie
Signed-off-by: Aleksandra Gacek <algacek@google.com>
2020-02-15 14:57:56 +01:00
Grebennikov Roman
b4445ff03f discovery/kubernetes: expose label_selector and field_selector
Closes #6096

Signed-off-by: Grebennikov Roman <grv@dfdx.me>
2020-02-15 14:57:38 +01:00
Julien Pivotto
9d9bc524e5 Add query log (#6520)
* Add query log, make stats logged in JSON like in the API

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-01-08 13:28:43 +00:00
Callum Styan
67838643ee
Add config option for remote job name (#6043)
* Track remote write queues via a map so we don't care about index.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Support a job name for remote write/read so we can differentiate between
them using the name.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Remote write/read has Name to not confuse the meaning of the field with
scrape job names.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Split queue/client label into remote_name and url labels.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Don't allow for duplicate remote write/read configs.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Ensure we restart remote write queues if the hash of their config has
not changed, but the remote name has changed.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Include name in remote read/write config hashes, simplify duplicates
check, update test accordingly.

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-12-12 12:47:23 -08:00
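
The bookkeeping described in 67838643ee (queues tracked in a map, the name included in the config hash, duplicates rejected) might look roughly like this. It is a sketch under assumptions: `remoteWriteConfig`, `configHash` and `indexByHash` are hypothetical names, and hashing the JSON form with SHA-256 is chosen only for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// remoteWriteConfig is a hypothetical, trimmed-down stand-in for one remote write entry.
type remoteWriteConfig struct {
	Name string `json:"name"`
	URL  string `json:"url"`
}

// configHash hashes the whole entry, name included, so a rename alone changes the hash.
func configHash(c remoteWriteConfig) (string, error) {
	b, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:]), nil
}

// indexByHash keys entries by hash instead of list position and rejects duplicates.
func indexByHash(cfgs []remoteWriteConfig) (map[string]remoteWriteConfig, error) {
	out := make(map[string]remoteWriteConfig, len(cfgs))
	for _, c := range cfgs {
		h, err := configHash(c)
		if err != nil {
			return nil, err
		}
		if _, dup := out[h]; dup {
			return nil, fmt.Errorf("duplicate remote write config: name=%q url=%q", c.Name, c.URL)
		}
		out[h] = c
	}
	return out, nil
}

func main() {
	cfgs := []remoteWriteConfig{
		{Name: "primary", URL: "http://remote.example/write"},
		{Name: "backup", URL: "http://remote.example/write"},
	}
	byHash, err := indexByHash(cfgs)
	fmt.Println(len(byHash), err)
}
```

Because the name is part of the hashed data, two entries that share a URL but differ in name are distinct, while a true copy of an entry is caught as a duplicate.
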
Simon Pasquier
cccd542891
*: avoid missed Alertmanager targets (#6455)
This change makes sure that nearly-identical Alertmanager configurations
aren't merged together.

The config's identifier was the MD5 hash of the configuration serialized
to JSON but because `relabel.Regexp` has no public field and doesn't
implement the JSON.Marshaler interface, it was always serialized to
"{}".

In practice, the identifier can be based on the index of the
configuration in the list.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-12-12 17:00:19 +01:00
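
The serialization pitfall described in cccd542891 can be reproduced with the standard library alone. In this sketch `pattern` is a hypothetical stand-in for a type like `relabel.Regexp` (only unexported fields, no custom JSON marshalling), and the `config-N` identifier format is an assumption made for illustration.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// pattern stands in for a type with only unexported fields and no MarshalJSON
// method; encoding/json therefore renders it as an empty object.
type pattern struct {
	original string
}

// amConfig is a hypothetical, trimmed-down Alertmanager configuration.
type amConfig struct {
	Scheme string  `json:"scheme"`
	Regex  pattern `json:"regex"`
}

// legacyID derives the identifier from the JSON form, as the old code did.
func legacyID(c amConfig) string {
	b, _ := json.Marshal(c) // cannot fail for this plain struct
	sum := md5.Sum(b)
	return hex.EncodeToString(sum[:])
}

// indexID derives the identifier from the position in the configuration list.
func indexID(i int) string {
	return fmt.Sprintf("config-%d", i)
}

func main() {
	a := amConfig{Scheme: "http", Regex: pattern{original: "alertmanager-a.*"}}
	b := amConfig{Scheme: "http", Regex: pattern{original: "alertmanager-b.*"}}
	fmt.Println("legacy IDs collide:", legacyID(a) == legacyID(b))
	fmt.Println("index IDs:", indexID(0), indexID(1))
}
```

The two configurations differ only in a field that the encoder cannot see, so their hashes are equal and they would be merged; position-based identifiers avoid depending on the serialized form at all.
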
johncming
8d3083e256 config: add test case for scrape interval larger than timeout. (#6037)
Signed-off-by: johncming <johncming@yahoo.com>
2019-09-23 13:26:56 +02:00
Bartek Plotka
f0863a604e Removed extra tsdb/testutil after merge.
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
2019-08-14 10:12:32 +01:00
Chris Marchbanks
a6a55c433c Improve desired shards calculation (#5763)
The desired shards calculation now properly keeps track of the rate of
pending samples, and uses the previously unused integralAccumulator to
adjust for missing information in the desired shards calculation.

Also, configure more capacity for each shard. The default capacity of 10
causes shards to block on each other while sending remote requests.
Default to a 500 sample capacity and explain in the documentation that
having more capacity will help throughput.

Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2019-08-13 10:10:21 +01:00
Chris Marchbanks
529ccff07b
Remove all usages of stretchr/testify
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2019-08-08 19:49:27 -06:00
Max Leonard Inden
41c22effbe
config&notifier: Add option to use Alertmanager API v2
With v0.16.0 Alertmanager introduced a new API (v2). This patch adds a
configuration option for Prometheus to send alerts to the v2 endpoint
instead of the default v1 endpoint.

Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2019-06-21 16:33:53 +02:00
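
The version switch described in 41c22effbe amounts to picking the alert endpoint from a configured API version. A minimal sketch, assuming a hypothetical `alertsPath` helper; the two paths are Alertmanager's v1 and v2 alert endpoints, and treating an empty value as v1 follows the "default v1" wording of the commit.

```go
package main

import "fmt"

// alertsPath is a hypothetical helper mapping a configured API version to the
// Alertmanager path that alerts are posted to.
func alertsPath(apiVersion string) (string, error) {
	switch apiVersion {
	case "", "v1":
		// v1 stays the default when nothing is configured.
		return "/api/v1/alerts", nil
	case "v2":
		return "/api/v2/alerts", nil
	default:
		return "", fmt.Errorf("unsupported Alertmanager API version %q", apiVersion)
	}
}

func main() {
	for _, v := range []string{"", "v1", "v2", "v3"} {
		p, err := alertsPath(v)
		fmt.Printf("api_version=%q -> %q, err=%v\n", v, p, err)
	}
}
```
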
Callum Styan
e9129abeff Remove max_retries from queue_config since it's not used in remote write
anymore.

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-06-10 12:43:08 -07:00
Tariq Ibrahim
8fdfa8abea refine error handling in prometheus (#5388)
i) Uses the more idiomatic Wrap and Wrapf methods for creating nested errors.
ii) Fixes some incorrect usages of fmt.Errorf where the error messages don't have any formatting directives.
iii) Does away with the use of fmt package for errors in favour of pkg/errors

Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-03-26 00:01:12 +01:00
Callum Styan
5603b857a9 Check if label value is valid when unmarshaling external labels from
YAML, add a test to config_tests for valid/invalid external label
value.

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-03-18 20:31:12 +00:00
Tom Wilkie
c7b3535997 Use pkg/relabelling in remote write.
- Unmarshall external_labels config as labels.Labels, add tests.
- Convert some more uses of model.LabelSet to labels.Labels.
- Remove old relabel pkg (fixes #3647).
- Validate external label names.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2019-03-18 20:31:12 +00:00
Julien Pivotto
4397916cb2 Add honor_timestamps (#5304)
Fixes #5302

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2019-03-15 10:04:15 +00:00
Callum Styan
83c46fd549 update Consul vendor code so that catalog.ServiceMultipleTags can be (#5151)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-03-12 10:31:27 +00:00
Simon Pasquier
027d2ece14 config: resolve more file paths (#5284)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-03-12 10:24:15 +00:00
Simon Pasquier
e72c875e63
config: fix Kubernetes config with empty API server (#5256)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-22 15:51:47 +01:00
Simon Pasquier
c8a1a5a93c
discovery/kubernetes: fix support for password_file and bearer_token_file (#5211)
* discovery/kubernetes: fix support for password_file

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Create and pass custom RoundTripper to Kubernetes client

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Use inline HTTPClientConfig

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-20 11:22:34 +01:00
Callum Styan
6f69e31398 Tail the TSDB WAL for remote_write
This change switches the remote_write API to use the TSDB WAL.  This should reduce memory usage and prevent sample loss when the remote end point is down.

We use the new LiveReader from TSDB to tail WAL segments.  Logic for finding the tracking segment is included in this PR.  The WAL is tailed once for each remote_write endpoint specified. Reading from the segment is based on a ticker rather than relying on fsnotify write events, which were found to be complicated and unreliable in early prototypes.

Enqueuing a sample for sending via remote_write can now block, to provide back pressure.  Queues are still required to achieve parallelism and batching.  We have updated the queue config based on new defaults for queue capacity and pending samples values - much smaller values are now possible.  The remote_write resharding code has been updated to prevent deadlocks, and extra tests have been added for these cases.

As part of this change, we attempt to guarantee that samples are not lost; however this initial version doesn't guarantee this across Prometheus restarts or non-retryable errors from the remote end (eg 400s).

This change also includes the following optimisations:
- only marshal the proto request once, not once per retry
- maintain a single copy of the labels for given series to reduce GC pressure

Other minor tweaks:
- only reshard if we've also successfully sent recently
- add pending samples, latest sent timestamp, WAL events processed metrics

Co-authored-by: Chris Marchbanks <csmarchbanks.com> (initial prototype)
Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com> (sharding changes)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2019-02-12 11:39:13 +00:00
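
One of the optimisations listed in 6f69e31398, marshalling the request once rather than once per retry, can be sketched as below. This is an illustration under assumptions: JSON stands in for the protobuf encoding, and `sample`, `sendBatch` and `errUnavailable` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// errUnavailable is a hypothetical retryable error used by the demo sender.
var errUnavailable = errors.New("remote endpoint unavailable")

// sample is a hypothetical, trimmed-down sample type.
type sample struct {
	Metric string  `json:"metric"`
	Value  float64 `json:"value"`
}

// sendBatch encodes the batch once and reuses the same bytes for every attempt.
// It returns the number of attempts that were made.
func sendBatch(batch []sample, send func([]byte) error, maxAttempts int) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	payload, err := json.Marshal(batch) // encoded once, outside the retry loop
	if err != nil {
		return 0, err
	}
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if lastErr = send(payload); lastErr == nil {
			return attempt, nil
		}
	}
	return maxAttempts, fmt.Errorf("giving up after %d attempts: %w", maxAttempts, lastErr)
}

func main() {
	calls := 0
	flaky := func(_ []byte) error {
		calls++
		if calls < 3 {
			return errUnavailable
		}
		return nil
	}
	attempts, err := sendBatch([]sample{{Metric: "up", Value: 1}}, flaky, 5)
	fmt.Println(attempts, err)
}
```

Keeping the encoding outside the loop means a struggling remote endpoint costs extra network attempts but no extra allocation or CPU for re-encoding.
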
Simon Pasquier
f678e27eb6
*: use latest release of staticcheck (#5057)
* *: use latest release of staticcheck

It also fixes a couple of things in the code flagged by the additional
checks.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Use official release of staticcheck

Also run 'go list' before staticcheck to avoid failures when downloading packages.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-01-04 14:47:38 +01:00
Marcel D. Juhnke
c7d83b2b6a discovery: add support for Managed Identity authentication in Azure SD (#4590)
Signed-off-by: Marcel Juhnke <marrat@marrat.de>
2018-12-19 10:03:33 +00:00
Bartek Płotka
62c8337e77 Moved configuration into relabel package. (#4955)
Adapted top dir relabel to use pkg relabel structs.

Removal of this is tracked separately here: https://github.com/prometheus/prometheus/issues/3647

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
2018-12-18 11:26:36 +00:00
Ryota Arai
135d580ab2 Introduce min_shards for remote write to set minimum number of shards. (#4924)
Signed-off-by: Ryota Arai <ryota.arai@gmail.com>
2018-12-04 17:32:14 +00:00
Julius Volz
d28246e337
Fix config loading panics on nil pointer slice elements (#4942)
Fixes https://github.com/prometheus/prometheus/issues/4902
Fixes https://github.com/prometheus/prometheus/issues/4889

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-12-03 18:09:02 +08:00
mengnan
a5d39361ab discovery/azure: Fail hard when Azure authentication parameters are missing (#4907)
* discovery/azure: fail hard when client_id/client_secret is empty

Signed-off-by: mengnan <supernan1994@gmail.com>

* discovery/azure: fail hard when authentication parameters are missing

Signed-off-by: mengnan <supernan1994@gmail.com>

* add unit test

Signed-off-by: mengnan <supernan1994@gmail.com>

* add unit test

Signed-off-by: mengnan <supernan1994@gmail.com>

* format code

Signed-off-by: mengnan <supernan1994@gmail.com>
2018-11-29 16:47:59 +01:00
Ben Kochie
c6399296dc
Fix spelling/typos (#4921)
* Fix spelling/typos

Fix spelling/typos reported by codespell/misspell.
* UK -> US spelling changes.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-11-27 17:44:29 +01:00
Simon Pasquier
ff08c40091 discovery/openstack: support tls_config
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-25 14:31:32 +02:00
Simon Pasquier
128ff546b8 config: add test for OpenStack SD (#4594)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-09-13 21:44:27 +05:30
Tariq Ibrahim
f708fd5c99 Adding support for multiple azure environments (#4569)
Signed-off-by: Tariq Ibrahim <tariq.ibrahim@microsoft.com>
2018-09-04 17:55:40 +02:00
Daisy T
7d01ead689 change time.duration to model.duration for standardization (#4479)
Signed-off-by: Daisy T <daisyts@gmx.com>
2018-08-24 16:55:21 +02:00
Goutham Veeramachaneni
c28cc5076c Saner defaults and metrics for remote-write (#4279)
* Rename queueCapacity to shardCapacity
* Saner defaults for remote write
* Reduce allocs on retries

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2018-07-18 05:15:16 +01:00
Paul Gier
d24d2acd11 config: set target group source index during unmarshalling (#4245)
* config: set target group source index during unmarshalling

Fixes issue #4214 where the scrape pool is unnecessarily reloaded for a
config reload where the config hasn't changed.  Previously, the discovery
manager changed the static config after loading which caused the in-memory
config to differ from a freshly reloaded config.

Signed-off-by: Paul Gier <pgier@redhat.com>

* [issue #4214] Test that static targets are not modified by discovery manager

Signed-off-by: Paul Gier <pgier@redhat.com>
2018-06-13 16:34:59 +01:00
Philippe Laflamme
2aba238f31 Use common HTTPClientConfig for marathon_sd configuration (#4009)
This adds support for basic authentication which closes #3090

The support for specifying the client timeout was removed as discussed in https://github.com/prometheus/common/pull/123. Marathon was the only sd mechanism doing this and configuring the timeout is done through `Context`.

DC/OS uses a custom `Authorization` header for authenticating. This adds 2 new configuration properties to reflect this.

Existing configuration files that use the bearer token will no longer work. More work is required to make this backwards compatible.
2018-04-05 09:08:18 +01:00
Manos Fokas
25f929b772 Yaml UnmarshalStrict implementation. (#4033)
* Updated yaml vendor package.

* remove checkOverflow duplicate in rulefmt

* remove duplicated HTTPClientConfig.Validate()

* Added yaml static check.
2018-04-04 09:07:39 +01:00
Kristiyan Nikolov
be85ba3842 discovery/ec2: Support filtering instances in discovery (#4011) 2018-03-31 07:51:11 +01:00
Corentin Chary
60dafd425c consul: improve consul service discovery (#3814)
* consul: improve consul service discovery

Related to #3711

- Add the ability to filter by tag and node-meta in an efficient way (`/catalog/services`
  allows filtering by node-meta, and returns a `map[string]string` of `service`->`tags`).
  Tags and node-meta are also used in `/catalog/service` requests.
- Do not require a call to the catalog if services are specified by name. This is important
  because on a large cluster `/catalog/services` changes all the time.
- Add `allow_stale` configuration option to do stale reads. Non-stale
  reads can be costly, even more when you are doing them to a remote
  datacenter with 10k+ targets over WAN (which is common for federation).
- Add `refresh_interval` to minimize the strain on the catalog and on the
  service endpoint. This is needed because of that kind of behavior from
  consul: https://github.com/hashicorp/consul/issues/3712 and because a catalog
  on a large cluster would basically change *all* the time. No need to discover
  targets in 1sec if we scrape them every minute.
- Added plenty of unit tests.

Benchmarks
----------

```yaml
scrape_configs:

- job_name: prometheus
  scrape_interval: 60s
  static_configs:
    - targets: ["127.0.0.1:9090"]

- job_name: "observability-by-tag"
  scrape_interval: "60s"
  metrics_path: "/metrics"
  consul_sd_configs:
    - server: consul.service.par.consul.prod.crto.in:8500
      tag: marathon-user-observability  # Used in After
      refresh_interval: 30s             # Used in After+delay
  relabel_configs:
    - source_labels: [__meta_consul_tags]
      regex: ^(.*,)?marathon-user-observability(,.*)?$
      action: keep

- job_name: "observability-by-name"
  scrape_interval: "60s"
  metrics_path: "/metrics"
  consul_sd_configs:
    - server: consul.service.par.consul.prod.crto.in:8500
      services:
        - observability-cerebro
        - observability-portal-web

- job_name: "fake-fake-fake"
  scrape_interval: "15s"
  metrics_path: "/metrics"
  consul_sd_configs:
    - server: consul.service.par.consul.prod.crto.in:8500
      services:
        - fake-fake-fake
```

Note: tested with ~1200 services, ~5000 nodes.

| Resource | Empty | Before | After | After + delay |
| -------- |:-----:|:------:|:-----:|:-------------:|
|/service-discovery size|5K|85MiB|27k|27k|
|`go_memstats_heap_objects`|100k|1M|120k|110k|
|`go_memstats_heap_alloc_bytes`|24MB|150MB|28MB|27MB|
|`rate(go_memstats_alloc_bytes_total[5m])`|0.2MB/s|28MB/s|2MB/s|0.3MB/s|
|`rate(process_cpu_seconds_total[5m])`|0.1%|15%|2%|0.01%|
|`process_open_fds`|16|*1236*|22|22|
|`rate(prometheus_sd_consul_rpc_duration_seconds_count{call="services"}[5m])`|~0|1|1|*0.03*|
|`rate(prometheus_sd_consul_rpc_duration_seconds_count{call="service"}[5m])`|0.1|*80*|0.5|0.5|
|`prometheus_target_sync_length_seconds{quantile="0.9",scrape_job="observability-by-tag"}`|N/A|200ms|0.2ms|0.2ms|
|Network bandwidth|~10kbps|~2.8Mbps|~1.6Mbps|~10kbps|

Filtering by tag using relabel_configs uses **100kiB and 23kiB/s per service per job** and quite a lot of CPU. It also sends an additional *1Mbps* of traffic to consul.
Being a little bit smarter about this reduces the overhead quite a lot.
Limiting the number of `/catalog/services` queries per second almost removes the overhead of service discovery.

* consul: tweak `refresh_interval` behavior

`refresh_interval` now does what is advertised in the documentation,
there won't be more than one update per `refresh_interval`. It now
defaults to 30s (which was also the current waitTime in the consul query).

This also makes sure we don't wait another 30s if we already waited 29s
in the blocking call by subtracting the number of elapsed seconds.

Hopefully this will do what people expect and will be safer
for existing consul infrastructures.
2018-03-23 14:48:43 +00:00
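
The `refresh_interval` adjustment at the end of 60dafd425c (do not wait a further 30s after a blocking call that already took 29s) boils down to subtracting the elapsed time and clamping at zero. A minimal sketch; `remainingWait` is a hypothetical name, not the function used in the consul discovery code.

```go
package main

import (
	"fmt"
	"time"
)

// remainingWait returns how long to sleep after a blocking call that already
// took elapsed, so that updates happen at most once per refreshInterval.
func remainingWait(refreshInterval, elapsed time.Duration) time.Duration {
	if elapsed >= refreshInterval {
		return 0 // the interval has already passed, refresh immediately
	}
	return refreshInterval - elapsed
}

func main() {
	interval := 30 * time.Second
	for _, e := range []time.Duration{0, 29 * time.Second, 45 * time.Second} {
		fmt.Printf("blocked for %v, sleep another %v\n", e, remainingWait(interval, e))
	}
}
```
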
pasquier-s
fc8cf08f42 Prevent invalid label names with labelmap (#3868)
This change ensures that the relabeling configurations using labelmap
can't generate invalid label names.
2018-02-21 10:02:22 +00:00
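
To make the problem addressed by fc8cf08f42 concrete, the sketch below applies a labelmap-style rewrite to a label name and rejects results that are not valid label names. It is an illustration only: the commit may well perform its check when the configuration is loaded rather than when it is applied, and `mapLabelName` is a hypothetical helper. The pattern in `labelNameRE` is the documented form of a valid Prometheus label name.

```go
package main

import (
	"fmt"
	"regexp"
)

// labelNameRE matches valid Prometheus label names.
var labelNameRE = regexp.MustCompile(`^[a-zA-Z_][a-zA-Z0-9_]*$`)

// mapLabelName applies a labelmap-style rewrite to one label name. matched is
// false when expr does not match the whole name. An error is returned when the
// rewritten name is not a valid label name.
func mapLabelName(expr, replacement, name string) (mapped string, matched bool, err error) {
	re, err := regexp.Compile("^(?:" + expr + ")$") // anchor the expression on both ends
	if err != nil {
		return "", false, err
	}
	if !re.MatchString(name) {
		return "", false, nil
	}
	mapped = re.ReplaceAllString(name, replacement)
	if !labelNameRE.MatchString(mapped) {
		return "", true, fmt.Errorf("labelmap produced invalid label name %q", mapped)
	}
	return mapped, true, nil
}

func main() {
	const expr = `__meta_kubernetes_label_(.+)`
	for _, repl := range []string{"$1", "k8s-$1"} {
		got, ok, err := mapLabelName(expr, repl, "__meta_kubernetes_label_app")
		fmt.Printf("replacement=%q -> %q matched=%v err=%v\n", repl, got, ok, err)
	}
}
```
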
Shubheksha Jalan
0471e64ad1 Use shared types from the common repo (#3674)
* refactor: use shared types from common repo, remove util/config

* vendor: add common/config

* fix nit
2018-01-11 16:10:25 +01:00
Shubheksha Jalan
ec94df49d4 Refactor SD configuration to remove config dependency (#3629)
* refactor: move targetGroup struct and CheckOverflow() to their own package

* refactor: move auth and security related structs to a utility package, fix import error in utility package

* refactor: Azure SD, remove SD struct from config

* refactor: DNS SD, remove SD struct from config into dns package

* refactor: ec2 SD, move SD struct from config into the ec2 package

* refactor: file SD, move SD struct from config to file discovery package

* refactor: gce, move SD struct from config to gce discovery package

* refactor: move HTTPClientConfig and URL into util/config, fix import error in httputil

* refactor: consul, move SD struct from config into consul discovery package

* refactor: marathon, move SD struct from config into marathon discovery package

* refactor: triton, move SD struct from config to triton discovery package, fix test

* refactor: zookeeper, move SD structs from config to zookeeper discovery package

* refactor: openstack, remove SD struct from config, move into openstack discovery package

* refactor: kubernetes, move SD struct from config into kubernetes discovery package

* refactor: notifier, use targetgroup package instead of config

* refactor: tests for file, marathon, triton SD - use targetgroup package instead of config.TargetGroup

* refactor: retrieval, use targetgroup package instead of config.TargetGroup

* refactor: storage, use config util package

* refactor: discovery manager, use targetgroup package instead of config.TargetGroup

* refactor: use HTTPClient and TLS config from configUtil instead of config

* refactor: tests, use targetgroup package instead of config.TargetGroup

* refactor: fix targetgroup.Group pointers that were removed by mistake

* refactor: openstack, kubernetes: drop prefixes

* refactor: remove import aliases forced due to vscode bug

* refactor: move main SD struct out of config into discovery/config

* refactor: rename configUtil to config_util

* refactor: rename yamlUtil to yaml_config

* refactor: kubernetes, remove prefixes

* refactor: move the TargetGroup package to discovery/

* refactor: fix order of imports
2017-12-29 21:01:34 +01:00
Brian Brazil
fba80da635
Fix default of read_recent to be false. (#3617)
This is what is documented in the migration guide, and the default settings
should make sense for a true long term storage.

Document the setting.
2017-12-23 17:21:38 +00:00
Krasi Georgiev
e405e2f1ea refactored discovery 2017-12-18 17:22:49 +00:00
Alberto Cortés
29da2fb9cd testutil: update to go1.9 testing.Helper 2017-12-08 19:06:53 +01:00
Alberto Cortés
8f6a9f7833 config: simplify tests by using testutil.NotOk (#3289)
Also include filename in all LoadFile errors

Also add message to testutil.NotOk so we can identify failing tests when
using table driven tests.
2017-12-08 16:52:25 +00:00
Tobias Schmidt
7098c56474 Add remote read filter option
For special remote read endpoints which have only data for specific
queries, it is desired to limit the number of queries sent to the
configured remote read endpoint to reduce latency and performance
overhead.
2017-11-13 23:30:01 +01:00
Krasi Georgiev
e86d82ad2d Fix regression of alert rules state loss on config reload. (#3382)
* incorrect map name for the group prevented copying state from existing alert rules on config reload

* applyConfig test

* few nits

* nits 2
2017-11-01 12:58:00 +01:00
Thibault Chataigner
bf4a279a91 Remote storage reads based on oldest timestamp in primary storage (#3129)
Currently all read queries are simply pushed to remote read clients.
This is fine, except for remote storage for which it is inefficient and
makes queries slower even if remote read is unnecessary.
So we need instead to compare the oldest timestamp in primary/local
storage with the query range lower boundary. If the oldest timestamp
is older than the mint parameter, then there is no need for remote read.
This is an optional behavior per remote read client.

Signed-off-by: Thibault Chataigner <t.chataigner@criteo.com>
2017-10-18 12:08:14 +01:00
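
The decision described in bf4a279a91 is a single comparison between the oldest local timestamp and the query's lower bound. A minimal sketch, with `needsRemoteRead` as a hypothetical name; how the boundary case of equal timestamps is treated is an assumption, since the commit message only covers the strictly-older case.

```go
package main

import "fmt"

// needsRemoteRead reports whether a query starting at mint reaches further
// back than the oldest sample held in local storage. Both values are
// timestamps in the same unit (for example milliseconds since the epoch).
func needsRemoteRead(oldestLocal, mint int64) bool {
	// Assumption: when oldestLocal == mint, local storage is treated as sufficient.
	return mint < oldestLocal
}

func main() {
	const oldestLocal = 1_000
	for _, mint := range []int64{500, 1_500} {
		fmt.Printf("mint=%d remote read needed: %v\n", mint, needsRemoteRead(oldestLocal, mint))
	}
}
```
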
Alberto Cortés
6c67296423 config: fix error message for unexpected result of yaml marshal 2017-10-12 19:50:07 +02:00
Alberto Cortés
0f3d8ea075 config: use testutil package 2017-10-12 19:50:07 +02:00
Fabian Reinartz
2d0b8e8b94 Merge branch 'master' into dev-2.0 2017-10-05 13:09:18 +02:00
Alberto Cortés
bb3dad9cba config: simplify some returns 2017-09-26 16:57:56 +02:00
Bryan Boreham
9d6b945e41 Default HTTP keep-alive ON for remote read/write 2017-09-11 09:48:30 +00:00
Bryan Boreham
e0a4d18301 Allow http keep-alive setting to be overridden in config 2017-09-11 09:07:14 +00:00
Fabian Reinartz
e746282772 Merge branch 'master' into dev-2.0 2017-09-11 10:55:19 +02:00
Jamie Moore
7a135e0a1b Add the ability to assume a role for ec2 discovery 2017-09-10 00:36:43 +10:00
Fabian Reinartz
87918f3097 Merge branch 'master' into dev-2.0 2017-09-04 14:09:21 +02:00
Johannes 'fish' Ziemke
70f3d1e9f9 k8s: Support discovery of ingresses (#3111)
* k8s: Support discovery of ingresses

* Move additional labels below allocation

This makes it more obvious why the additional elements are allocated.
Also fix allocation for node where we only set a single label.

* k8s: Remove port from ingress discovery

* k8s: Add comment to ingress discovery example
2017-09-04 13:10:44 +02:00
CuiHaozhi
b1c18bf29b discovery openstack: support discovery hosts, add rule option.
Signed-off-by: CuiHaozhi <cuihz@wise2c.com>
2017-08-29 10:14:00 -04:00
Max Leonard Inden
1c96fbb992
Expose current Prometheus config via /status/config
This PR adds the `/status/config` endpoint which exposes the currently
loaded Prometheus config. This is the same config that is displayed on
`/config` in the UI in YAML format. The response payload looks like
this:
```
{
  "status": "success",
  "data": {
    "yaml": <CONFIG>
  }
}
```
2017-08-13 22:21:18 +02:00
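
A client could decode the payload shown in 1c96fbb992 along these lines. This is a sketch: the struct and `parseConfigResponse` are hypothetical, only the `status`, `data` and `yaml` keys come from the commit message, and the sample body is made up for the demo.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// configResponse mirrors the payload shape shown in the commit message.
type configResponse struct {
	Status string `json:"status"`
	Data   struct {
		YAML string `json:"yaml"`
	} `json:"data"`
}

// parseConfigResponse extracts the YAML configuration from a response body.
func parseConfigResponse(body []byte) (string, error) {
	var resp configResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", err
	}
	if resp.Status != "success" {
		return "", fmt.Errorf("unexpected status %q", resp.Status)
	}
	return resp.Data.YAML, nil
}

func main() {
	// Made-up body for illustration; a real one would come from the endpoint.
	body := []byte(`{"status":"success","data":{"yaml":"global:\n  scrape_interval: 15s\n"}}`)
	cfg, err := parseConfigResponse(body)
	fmt.Printf("err=%v\n%s", err, cfg)
}
```
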
Fabian Reinartz
25f3e1c424 Merge branch 'master' into mergemaster 2017-08-10 17:04:25 +02:00
Yuki Ito
1bf3b91ae0 Make sure that url for remote_read/write is not nil (#3024) 2017-08-07 08:49:45 +01:00
Tom Wilkie
5169f990f9 Review feedback: add yaml struct tags, don't embed queue config.
Also, rename QueueManageConfig to QueueConfig, for consistency with tags.
2017-08-01 14:43:56 +01:00
Tom Wilkie
454b661145 Make queue manager configurable. 2017-07-25 13:47:34 +01:00
Fabian Reinartz
dba7586671 Merge branch 'master' into dev-2.0 2017-07-11 17:22:14 +02:00
Fuente, Pablo Andres
9eb8c6e1d2 Renaming the config_notwin test to config_default 2017-07-10 11:08:16 -03:00
Fuente, Pablo Andres
fe73de9452 Renaming config test file to fix build tags
Renaming the name of a file of the config tests, in order to properly
use the Go build tags feature.
2017-07-10 00:02:08 -03:00
Fuente, Pablo Andres
193dc47230 Fixing code style to adhere to gofmt 2017-07-09 02:43:33 -03:00
Fuente, Pablo Andres
902fafb8e7 Fixing tests for Windows
Fixing the config/config_test, the discovery/file/file_test and the
promql/promql_test tests for Windows. For most of the tests, the fix involved
correct handling of path separators. In the case of the promql tests, the
issue was related to the removal of the temporary directories used by the
storage. The issue is that the RemoveAll() call returns an error when it
tries to remove a directory which is not empty, which seems to be true due to
some kind of process that is still running after closing the storage. To fix
it I added some retries to the removal of the temporary directories.
Adding tags file from Universal Ctags to .gitignore
2017-07-09 01:59:30 -03:00
Goutham Veeramachaneni
98d20d5880 Make sure rendering config produces valid config
Fixes #2899

Signed-off-by: Goutham Veeramachaneni <goutham@boomerangcommerce.com>
2017-07-05 16:09:29 +02:00
Fabian Reinartz
65b087bcc1 config: resolve file SD paths relative to config 2017-07-04 11:40:26 +02:00
Fabian Reinartz
9ba61df45a Merge pull request #2789 from mtanda/aws_default_region
config: set default region for EC2 SD
2017-06-12 16:15:53 +02:00
Mitsuhiro Tanda
64cef5cd05 handle NewSession() error 2017-06-06 22:02:50 +09:00
Christian Groschupp
8f781e411c Openstack Service Discovery (#2701)
* Add openstack service discovery.

* Add gophercloud code for openstack service discovery.

* first changes for juliusv comments.

* add gophercloud code for floatingip.

* Add tests to openstack sd.

* Add testify suite vendor files.

* add copyright and make changes for code climate.

* Fixed typos in provider openstack.

* Renamed tenant to project in openstack sd.

* Change type of password to Secret in openstack sd.
2017-06-01 23:49:02 +02:00
Roman Vynar
dbe2eb2afc Hide consul token on UI. (#2797) 2017-06-01 22:14:23 +01:00
Mitsuhiro Tanda
a1ddab463e config: set default region for EC2 SD 2017-06-02 01:40:24 +09:00
Tobias Schmidt
287ec6e6cc Fix outdated target_group naming in error message
The target_groups config has been renamed to static_configs, the error
message for overflow attributes should reflect that.
2017-05-31 11:01:13 +02:00
Julius Volz
240bb671e2 config: Fix overflow checking in global config (#2783) 2017-05-30 20:58:06 +02:00
Conor Broderick
6766123f93 Replace regex with Secret type and remarshal config to hide secrets (#2775) 2017-05-29 12:46:23 +01:00
Brian Akins
27d66628a1 Allow limiting Kubernetes service discovery to certain namespaces
Allow namespace discovery to be more easily extended in the future by using a struct rather than just a list.

Rename fields for kubernetes namespace discovery
2017-04-27 07:41:36 -04:00
Julius Volz
eb14678a25 Make remote read/write use config.HTTPClientConfig 2017-03-20 13:37:50 +01:00
Julius Volz
02395a224d [WIP] Remote Read 2017-03-20 13:13:44 +01:00
Julius Volz
525da88c35 Merge pull request #2479 from YKlausz/consul-tls
Adding consul capability to connect via tls
2017-03-20 11:40:18 +01:00
Goutham Veeramachaneni
5c89cec65c Stricter Relabel Config Checking for Labeldrop/keep (#2510)
* Minor code cleanup

* Labeldrop/Labelkeep Now *Only* Support Regex

Ref prometheus/prometheus#2368
2017-03-18 22:32:08 +01:00
yklausz
75880b594f Adding consul capability to connect via tls 2017-03-17 22:37:18 +01:00
Michael Kraus
04eadf6e20 Allow Marathon SD without bearer_token and bearer_token_file 2017-03-02 13:17:19 +01:00
Michael Kraus
47bdcf0f67 Allow the use of bearer_token or bearer_token_file for MarathonSD 2017-03-02 09:44:20 +01:00
Julius Volz
e9476b35d5 Re-add multiple remote writers
Each remote write endpoint gets its own set of relabeling rules.

This is based on the (yet-to-be-merged)
https://github.com/prometheus/prometheus/pull/2419, which removes legacy
remote write implementations.
2017-02-20 13:23:12 +01:00
Alex Somesan
b22eb65d0f Cleaner separation between ServiceAccount and custom authentication in K8S SD (#2348)
* Canonical usage of cluster service-account in K8S SD

* Early validation for opt-in custom auth in K8S SD

* Fix typo in condition
2017-01-19 10:52:52 +01:00
Fabian Reinartz
7eb849e6a8 Merge pull request #2307 from joyent/triton_discovery
Add Joyent Triton discovery
2017-01-18 05:08:11 +01:00
Richard Kiene
f3d9692d09 Add Joyent Triton discovery 2017-01-17 20:34:32 +00:00
Björn Rabenstein
ad40d0abbc Merge pull request #2288 from prometheus/limit-scrape
Add ability to limit scrape samples, and related metrics
2017-01-08 01:34:06 +01:00
Brian Brazil
30448286c7 Add sample_limit to scrape config.
This imposes a hard limit on the number of samples ingested from the
target. This is counted after metric relabelling, to allow dropping of
problematic metrics.

This is intended as a very blunt tool to prevent overload due to
misbehaving targets that suddenly jump in sample count (e.g. adding
a label containing email addresses).

Add metric to track how often this happens.

Fixes #2137
2016-12-16 15:10:09 +00:00
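
The ordering described in 30448286c7 (relabel first, then count against the limit) can be sketched as follows. The names `sample`, `ingest` and `errSampleLimit` are hypothetical, the `keep` callback stands in for metric relabelling, and treating a limit of 0 as "no limit" is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// errSampleLimit is a hypothetical error returned when the limit is exceeded.
var errSampleLimit = errors.New("sample limit exceeded")

// sample is a hypothetical, trimmed-down scraped sample.
type sample struct {
	Name  string
	Value float64
}

// ingest applies keep (a stand-in for metric relabelling) first and only then
// counts samples against limit. A limit of 0 disables the check in this sketch.
func ingest(scraped []sample, keep func(sample) bool, limit int) ([]sample, error) {
	kept := make([]sample, 0, len(scraped))
	for _, s := range scraped {
		if !keep(s) {
			continue // dropped by relabelling, does not count toward the limit
		}
		kept = append(kept, s)
		if limit > 0 && len(kept) > limit {
			return nil, errSampleLimit // nothing from this scrape is ingested
		}
	}
	return kept, nil
}

func main() {
	scraped := []sample{{"up", 1}, {"noisy", 2}, {"http_requests_total", 3}}
	dropNoisy := func(s sample) bool { return s.Name != "noisy" }
	for _, limit := range []int{2, 1} {
		got, err := ingest(scraped, dropNoisy, limit)
		fmt.Printf("limit=%d kept=%d err=%v\n", limit, len(got), err)
	}
}
```

Counting after the drop step is what lets an operator remove a problematic metric by relabelling instead of raising the limit.
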
Tristan Colgate-McFarlane
4d9134e6d8 Add labeldrop and labelkeep actions. (#2279)
Introduce two new relabel actions. labeldrop, and labelkeep.
These can be used to filter the set of labels by matching regex

- labeldrop: drops all labels that match the regex
- labelkeep: drops all labels that do not match the regex
2016-12-14 10:17:42 +00:00
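
The two actions introduced in 4d9134e6d8 are mirror images of each other, which a short sketch makes visible. `filterLabels` is a hypothetical helper, not the relabelling code itself; anchoring the expression on both ends follows the way Prometheus treats relabel regexes.

```go
package main

import (
	"fmt"
	"regexp"
)

// filterLabels returns a copy of labels after applying a labeldrop or
// labelkeep action with the given regular expression.
func filterLabels(labels map[string]string, action, expr string) (map[string]string, error) {
	if action != "labeldrop" && action != "labelkeep" {
		return nil, fmt.Errorf("unsupported action %q", action)
	}
	re, err := regexp.Compile("^(?:" + expr + ")$") // match the whole label name
	if err != nil {
		return nil, err
	}
	out := make(map[string]string, len(labels))
	for name, value := range labels {
		matched := re.MatchString(name)
		// labeldrop keeps the non-matching labels, labelkeep keeps the matching ones.
		if (action == "labeldrop") != matched {
			out[name] = value
		}
	}
	return out, nil
}

func main() {
	labels := map[string]string{"job": "api", "env": "prod", "tmp_id": "42"}
	dropped, _ := filterLabels(labels, "labeldrop", "tmp_.*")
	kept, _ := filterLabels(labels, "labelkeep", "job|env")
	fmt.Println(len(dropped), len(kept))
}
```
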
Fabian Reinartz
cc35104504 config: fix naming and typo 2016-11-25 11:04:33 +01:00
Fabian Reinartz
3fb4d1191b config: rename AlertingConfig, resolve file paths 2016-11-24 15:19:37 +01:00
Fabian Reinartz
183c5749b9 config: add Alertmanager configuration 2016-11-23 18:23:37 +01:00
Fabian Reinartz
200bbe1bad config: extract SD and HTTPClient configurations 2016-11-23 18:23:37 +01:00
Fabian Reinartz
ec66082749 Merge branch 'ec2_sd_profile_support' of https://github.com/Ticketmaster/prometheus into Ticketmaster-ec2_sd_profile_support 2016-11-21 11:49:23 +01:00
Kraig Amador
bec6870ed4 ec2_sd_configs: Support profiles for configuring the ec2 service 2016-11-03 08:38:02 -07:00
beorn7
b2f28a9e82 Merge branch 'release-1.2' 2016-11-03 14:42:15 +01:00
Brian Brazil
d1ece12c70 Handle null Regex in the config as the empty regex. (#2150) 2016-11-03 13:34:15 +00:00
bekbulatov
2bc12fa2fb Set timeout for marathon_sd 2016-10-24 11:27:08 +01:00
bekbulatov
c689b35858 Merge branch 'master' into marathon_tls 2016-10-24 10:37:32 +01:00
Matti Savolainen
aabf4a419b use LabelName.IsValid() instead of LabelNameRE and MatchString instead of Match 2016-10-19 16:30:52 +03:00
Matti Savolainen
ec6524ce74 test the labelTarget regex to make sure it properly validates pre-interpolated label names. 2016-10-19 13:32:42 +03:00
Matti Savolainen
f867c1fd58 formatting and text fixes, adjust regexp 2016-10-19 13:31:55 +03:00
Matti Savolainen
56907ba6e3 Add interpolation to good test config. Fix regex 2016-10-19 01:19:19 +03:00
Matti Savolainen
7a36af1c85 add comment about interpolation 2016-10-19 00:42:49 +03:00
Matti Savolainen
3b8e7c1277 Merge branch 'master' of https://github.com/prometheus/prometheus into bug/target_label_unmarshal 2016-10-19 00:33:52 +03:00
Matti Savolainen
5a1e909b5d Make TargetLabel in RelabelConfig a string 2016-10-19 00:33:22 +03:00
Björn Rabenstein
d93f73874f Merge pull request #2093 from dominikschulz/spelling
Trivial spelling corrections
2016-10-18 22:46:03 +02:00
Dominik Schulz
182e17958a Trivial spelling corrections and a small comment. 2016-10-18 20:14:38 +02:00
bekbulatov
ac702f66eb Resolve merge conflicts 2016-10-18 14:14:24 +01:00
Fabian Reinartz
1b6dfa32a9 config: rename role 'endpoint' to 'endpoints' 2016-10-17 11:53:49 +02:00
Frederic Branczyk
2e18c81a00 config: adapt unit tests 2016-10-17 10:32:10 +02:00
Fabian Reinartz
b24602f713 kubernetes: merge back into single configuration 2016-10-17 10:32:10 +02:00
Fabian Reinartz
2331701b50 kubernetes: Add K8S v2 pod discovery
This adds plumbing for a parallel version of the new K8S SD
and adds pod discovery as the first role.
2016-10-17 10:32:10 +02:00
Dominik Schulz
72cbf8af6f Fix small copy and paste error 2016-10-08 08:49:00 +02:00
bekbulatov
01b53c1180 Add tls support 2016-10-07 13:40:22 +01:00
Brian Brazil
77605649a9 Add support for remote write relabelling.
Switch back to a single remote writer, as we were only ever meant to
have one and the relabel semantics are clearer that way.
2016-10-05 07:43:19 +01:00
Tom Wilkie
4520e12440 Add HTTP Basic Auth & TLS support to the generic write path. (#1957)
* Add config, HTTP Basic Auth and TLS support to the generic write path.

- Move generic write path configuration to the config file
- Factor out config.TLSConfig -> tls.Config translation
- Support TLSConfig for generic remote storage
- Rename Run to Start, and make it non-blocking.
- Dedupe code in httputil for TLS config.
- Make remote queue metrics global.
2016-09-19 22:47:51 +02:00
Tobias Schmidt
874cb44bb6 Merge pull request #1996 from ton31337/Fix/allow_numbers_as_first_letter
Allow number to be the first letter as well for `job_name`
2016-09-16 11:08:52 -04:00
Donatas Abraitis
1aa8898b66 Allow number to be the first letter as well for job_name 2016-09-16 14:06:47 +03:00
Ingo Gottwald
3b546d061f Add support for GCE discovery 2016-09-16 08:55:33 +02:00
Alexey Miroshkin
e29d9394e5 Forbid invalid relabel configurations
This fix adds a check that the target_label value is set when the action is
replace or hashmod
Issue [#1900]
2016-08-29 16:56:06 +02:00
Fabian Reinartz
be596f82b4 Merge pull request #1783 from knyar/json
Allow URLs in targets defined via a JSON file
2016-08-10 09:42:17 +02:00
Frederic Branczyk
b655aa002f introduce top level alerting config node 2016-08-09 14:19:25 +02:00
Frederic Branczyk
679d225c8d allow relabeling of alerts
in case of dropping don't even enqueue them
2016-08-09 14:18:31 +02:00
Fabian Reinartz
7a0b3af0b7 config: validate Kubernetes role correctly. 2016-07-18 22:24:41 +09:00
Fabian Reinartz
919558f601 config: remove deprecated target_groups configuration 2016-07-14 09:55:00 +09:00
Fabian Reinartz
7221228843 discovery/kubernetes: select between discovery role
This adds `role` field to the Kubernetes SD config, which indicates
which type of Kubernetes SD should be run.
This no longer allows discovering pods and nodes with the same SD
configuration for example.
2016-07-05 14:22:12 +02:00
Anton Tolchanov
772a3af38f Allow URLs in targets defined via a JSON file
This enables defining `blackbox_exporter` targets (which can be URLs,
because of relabeling) in a JSON file.

Not sure if this is the best approach, but current behaviour is
inconsistent (`UnmarshalYAML` does not have this check) and breaks the
officially documented way to use `blackbox_exporter`.
2016-07-04 00:05:57 +03:00
Ali Reza
98c156c361 reorder config validation, move checkOverflow before any other validation 2016-06-13 10:02:20 +07:00
Fabian Reinartz
0f21bd31ca config: deprecate target_groups for static_configs
This change deprecates the `target_groups` option in favor
of `static_configs`. The old configuration is still accepted
but prints a warning.
Configuration loading errors if both options are set.
2016-06-08 15:55:25 +02:00
Jimmi Dyson
206bcfcdaa
Kubernetes SD: Remove kubeletPort config option 2016-06-07 12:34:55 +01:00
Ali Reza
c81b4e8a87 change config names to files for consistency 2016-05-30 07:47:58 +07:00
Gregory G. Tseng
7997c14b0d Add ServerName into TLS Config 2016-05-26 14:24:49 -07:00
Seth Miller
0988e3b937 Add support for Azure discovery
This change adds the ability to do target discovery with Microsoft's Azure platform.
2016-04-06 22:47:02 -05:00
Fabian Reinartz
37c709f917 Fix global config YAML issues 2016-02-15 14:08:25 +01:00
Fabian Reinartz
44a5e860ed Fix scrape timeout config checks 2016-02-15 12:07:46 +01:00
Julius Volz
829a029dda Update two more __meta_dns_srv_name references.
Although they are only in examples/tests and don't affect anything, they
could be confusing (the label has been renamed in the rest of the code a
while ago).
2016-02-14 22:20:39 +01:00
Fabian Reinartz
e26e4b6e89 Restrict scrape timeout to interval length 2016-02-12 12:52:22 +01:00
beorn7
a7408bfb47 Unify duration parsing
It's actually happening in several places (and for flags, we use the
standard Go time.Duration...). This at least reduces all our
home-grown parsing to one place (in model).
2016-01-29 15:41:50 +01:00
Julien Dehee
061fe2f364 Support AirBnB's Smartstack Nerve client for SD
nerve's registration format differs from serverset. With this commit
there is now a dedicated treecache file in util,
and two separate files for serverset and nerve.

Reference:
https://github.com/airbnb/nerve
2016-01-18 14:07:28 +01:00
Fabian Reinartz
4d1c9296d5 Add new defaults for relabel configurations 2015-11-16 13:16:13 +01:00
Brian Brazil
fd2bd81cd8 Allow all instance labels in target groups
With the blackbox exporter, the instance label will commonly
be used for things other than hostnames so remove this restriction.
https://example.com or https://example.com/probe/me are some examples.

To prevent user error, check that urls aren't provided as targets
when there's no relabelling that could potentially fix them.
2015-11-07 14:35:20 +00:00
Fabian Reinartz
f2a8261cdb Merge pull request #1177 from fabric8io/kubernetes-discovery
Kubernetes SD authentication options cleanup
2015-10-24 20:32:25 +02:00
Fabian Reinartz
180da1ba65 Add overflow check in TLS config 2015-10-24 17:12:34 +02:00
Jimmi Dyson
87940ec213 Kubernetes SD: Rename masters to api_servers in config 2015-10-24 14:41:14 +01:00
Jimmi Dyson
7ff5cc66ea Kubernetes SD authentication options cleanup 2015-10-23 16:47:52 +01:00
Brian Brazil
1ddf75240d config: Don't hide username, it's not secret.
Usernames are not generally considered to be secrets,
and treating them as secrets may lead to confusion
as to how secure they are. Obscuring them also makes
debugging harder.
2015-10-08 15:13:21 +01:00
Matt Jibson
dcb4856d72 Add SD for Amazon EC2 instances 2015-10-06 18:36:17 -04:00
Julius Volz
dac26cef71 Rename global "labels" config option to "external_labels". 2015-09-29 20:54:20 +02:00
Matt Jibson
0e99fa6c46 Allow labelmap action 2015-09-21 15:41:19 -04:00
Jimmi Dyson
ec04ba38a2 Kubernetes SD config check 2015-09-09 13:24:44 +01:00
Jimmi Dyson
a1574aa2b3 Move TLS options to scrape config
Fixes #1013, fixes #989
2015-09-09 09:52:21 +01:00
Fabian Reinartz
1ef5ed0cf2 Merge pull request #1048 from xperimental/fix/marathon-config
Fix parsing Marathon SD config
2015-09-06 20:09:46 +02:00
Robert Jacob
eb7416e71f Fix missing unmarshal for Marathon SD config. 2015-09-06 20:02:22 +02:00
Jimmi Dyson
d7a7fd4589 Kubernetes SD improvements
* Support multiple masters with retries against each master as required.
* Scrape masters' metrics.
* Add role meta label for node/service/master to make it easier for relabeling.
2015-09-04 11:31:20 +01:00
Julius Volz
f63a899744 Change config regexes to full-string matches.
This anchors all regular expressions entered via the config to match a
full string vs. a substring.

THIS IS A BREAKING CHANGE!

Fixes part of https://github.com/prometheus/prometheus/issues/996
2015-09-01 15:46:41 +02:00
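Note on f63a899744: the difference between substring and full-string matching can be sketched in Go as below. This is only an illustration, not the code from the commit; the helper name `anchorFull` and the `^(?:...)$` wrapping are assumptions chosen for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// anchorFull is a hypothetical helper: it wraps a user-supplied pattern
// in a non-capturing group with ^ and $ so that it has to match the
// entire input rather than any substring of it.
func anchorFull(pattern string) (*regexp.Regexp, error) {
	return regexp.Compile("^(?:" + pattern + ")$")
}

func main() {
	unanchored := regexp.MustCompile("node")
	anchored, err := anchorFull("node")
	if err != nil {
		panic(err)
	}

	// Substring semantics: "node" is found inside "node-exporter".
	fmt.Println(unanchored.MatchString("node-exporter"))
	// Full-string semantics: "node-exporter" is not exactly "node".
	fmt.Println(anchored.MatchString("node-exporter"))
	// Full-string semantics: an exact match still succeeds.
	fmt.Println(anchored.MatchString("node"))
}
```

Under full-string semantics, a pattern that previously relied on substring matching needs an explicit `.*` prefix or suffix, which is why the change is flagged as breaking.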
Julius Volz
995d3b831d Fix most golint warnings.
This is with `golint -min_confidence=0.5`.

I left several lint warnings untouched because they were either
incorrect or I felt it was better not to change them at the moment.
2015-08-26 12:44:46 +02:00
Fabian Reinartz
438e232c9b Fix grouping of import blocks 2015-08-22 09:42:45 +02:00
Fabian Reinartz
306e8468a0 Switch from client_golang/model to common/model 2015-08-21 13:33:38 +02:00
Fabian Reinartz
ac0be60bb9 Add license headers 2015-08-20 13:03:56 +02:00
Fabian Reinartz
139f27bf8a Increase default retry interval for file SD
The automatic refresh is a safety mechanism in case
file watches fail. As they seem to be working well the
interval can be increased.
2015-08-16 15:06:26 +02:00
Fabian Reinartz
3c6dd161d7 Scrape all services on empty services list. 2015-08-14 17:39:41 +02:00
Fabian Reinartz
b964da4b75 Merge pull request #905 from fabric8io/kubernetes-discovery
Kubernetes discovery
2015-08-13 15:08:32 +02:00
Fabian Reinartz
24e91720ad Merge pull request #980 from prometheus/map-labels
Retrieval: Add relabel action to map label names with a regex.
2015-08-13 14:36:59 +02:00
Brian Brazil
4e70a0a14e Retrieval: Add relabel action to map label names with a regex.
The intended use case is where a user has tags/labels coming
from metadata in Kubernetes or EC2, and wants to make
some subset of them into target labels.
2015-08-13 13:19:11 +01:00
Brian Brazil
43449b0581 config: Update tests/examples to use __tmp_ 2015-08-13 10:39:21 +01:00
Jimmi Dyson
923f8111d4 Initial Kubernetes discovery
Fixes #904
2015-08-13 10:38:52 +01:00
Fabian Reinartz
cdcfada2ac Merge pull request #965 from prometheus/fabxc/relpath
Resolve relative paths on configuration loading
2015-08-10 19:43:42 +02:00
Fabian Reinartz
73f1cc807d Check token and cert file existence in promtool 2015-08-10 11:42:29 +02:00
Fabian Reinartz
54202bc5a8 Merge pull request #902 from xperimental/feature/marathon-discovery
retrieval/discovery: Service discovery using marathon API
2015-08-10 01:43:37 +02:00
Robert Jacob
4d0f974c42 Add service discovery using Marathon API. 2015-08-10 01:36:24 +02:00
Will Rouesnel
7810448dbe Add proxy_url parameter to allow specifying per-job HTTP proxy servers
Allow scrape_configs to have an optional proxy_url option which specifies
a proxy to be used for all connections to hosts in that config.

Internally this modifies the various client functions to take a *url.URL pointer
which currently must point to an HTTP proxy (but has been left open-ended to
allow the url format to be extended to support others, such as maybe SOCKS if
needed).
2015-08-08 04:29:27 +10:00
Fabian Reinartz
7a67472fc1 Resolve relative paths on configuration loading
This moves the concern of resolving the files relative to the config
file into the configuration loading itself.
It also fixes #921 which did not load the cert and token files relatively.
2015-08-05 18:08:04 +02:00
Jimmi Dyson
52cf6b3e6e Configuration options for bearer tokens, client certs & CA certs
Fixes #918, fixes #917
2015-08-04 17:18:46 +01:00
Johannes 'fish' Ziemke
9ab340e95e Add support for A record based DNS SD
If using A records, the user needs to specify "port" and set "type" to
"A".
2015-07-30 15:55:38 +02:00
Fabian Reinartz
187fe4e3d3 Fix missing defaults for empty global config blocks 2015-07-17 21:25:56 +02:00
Fabian Reinartz
2a53b107c1 Fix missing defaults in empty configurations 2015-07-17 19:15:01 +02:00
Fabian Reinartz
435fc7234f config: add overflow detection for serverset config 2015-07-14 02:46:00 +02:00
Fabian Reinartz
02e06839f2 config: hide authentication credentials in String() output 2015-07-06 14:28:07 +02:00
Brian Brazil
52859b8033 Merge pull request #836 from prometheus/shard
Add 'hashmod' relabel action.
2015-06-24 21:40:10 +01:00
Brian Brazil
682f949ab1 Add 'hashmod' relabel action.
This takes the modulus of a hash of some labels.
Combined with a keep relabel action, this allows
for sharding of targets across multiple prometheus
servers.
2015-06-24 21:14:53 +01:00
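Note on 682f949ab1: the sharding idea (hash some label values, take the modulus, keep only targets in one bucket) can be sketched in Go as below. This is a minimal illustration under stated assumptions: the function names `hashMod` and `keepOnShard`, the `;` separator, and the use of the last 8 bytes of an MD5 sum are choices made for this example and are not taken from the commit.

```go
package main

import (
	"crypto/md5"
	"encoding/binary"
	"fmt"
	"strings"
)

// hashMod (hypothetical) joins the source label values with sep, hashes
// the result, and reduces the hash modulo modulus. The hash function is
// an assumption for illustration; any stable hash gives the same effect.
func hashMod(values []string, sep string, modulus uint64) uint64 {
	sum := md5.Sum([]byte(strings.Join(values, sep)))
	return binary.BigEndian.Uint64(sum[8:]) % modulus
}

// keepOnShard (hypothetical) mimics a "keep" rule applied after hashMod:
// a target is kept only if its bucket equals this server's shard index.
func keepOnShard(address string, shard, shards uint64) bool {
	return hashMod([]string{address}, ";", shards) == shard
}

func main() {
	const shards = 3
	targets := []string{"10.0.0.1:9100", "10.0.0.2:9100", "10.0.0.3:9100", "10.0.0.4:9100"}
	for _, t := range targets {
		fmt.Printf("%s -> shard %d\n", t, hashMod([]string{t}, ";", shards))
	}
	// A server configured as shard 0 would scrape only these targets.
	for _, t := range targets {
		if keepOnShard(t, 0, shards) {
			fmt.Println("shard 0 keeps", t)
		}
	}
}
```

Because the bucket depends only on the label values, every server computes the same assignment independently, so each target is kept by exactly one shard.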
Fabian Reinartz
4319b06dd2 config: add omitempty for consul SD config. 2015-06-24 16:22:52 +02:00
Fabian Reinartz
7ec15956e4 config: show original input on String() 2015-06-23 19:40:44 +02:00
Fabian Reinartz
dc7d27ab9a retrieval: add honor label handling and parametrized querying.
This commit adds the honor_labels and params arguments to the scrape
config. This allows specifying query parameters used by the scrapers
and giving scraped labels precedence.
2015-06-23 13:45:14 +02:00
Brian Brazil
4d895242f9 Add support for Zookeeper Serversets for SD.
It can discover an entire tree of serversets, or just one.
2015-06-16 11:02:08 +01:00
Brian Brazil
0dbae36d36 Allow ingested metrics to be relabeled.
The main purpose of this is to allow for blacklisting
of expensive metrics as a tactical option.
It could also find uses for renaming and removing labels
from federation.
2015-06-13 15:18:27 +01:00
Brian Brazil
58ceae82bc Revert "Allow ingested metrics to be relabeled."
This reverts commit f2f26ca08f.

Was accidentally pushed to master instead of a branch for PR.
2015-06-12 22:12:26 +01:00
Brian Brazil
f2f26ca08f Allow ingested metrics to be relabeled.
The main purpose of this is to allow for blacklisting
of expensive metrics as a tactical option.
It could also find uses for renaming and removing labels
from federation.
2015-06-12 22:06:30 +01:00
Fabian Reinartz
116e6df096 config: raise error on unknown config parameters.
The YAML parser ignores additional parameters on unmarshaling. This causes
frequent confusion with bad configs that pass parsing.
These changes raise errors on additional parameters.
2015-06-12 13:42:56 +02:00
Fabian Reinartz
3a24a7779d config: extend and format config example/test. 2015-06-12 13:39:12 +02:00
Fabian Reinartz
458550560c config: error on missing regex in relabel config.
Fixes issue #787.
2015-06-10 23:42:51 +02:00
Fabian Reinartz
b5fe2e9afe Merge pull request #773 from prometheus/fabxc/simple-cfg
config: simplify default config handling.
2015-06-08 16:22:06 +02:00
Fabian Reinartz
f6c33a2347 config: prevent overwrite of DefaultGlobalConfig 2015-06-08 16:02:10 +02:00
Fabian Reinartz
db3367e83f config: ensure correct labelname in JSON target group. 2015-06-06 10:08:42 +02:00
Fabian Reinartz
0af1cff8af config: simplify default config handling. 2015-06-06 09:04:04 +02:00