145 Commits

Author SHA1 Message Date
Tariq Ibrahim 3f7ed7de49 Adding new metric type to track in-flight remote read queries. (#4677)
Signed-off-by: tariqibrahim <tariq.ibrahim@microsoft.com>
2018-10-10 14:48:32 -07:00
Tom Wilkie d3a1ff1abf Reduce memory usage of remote read by reducing pointer usage. (#4655)
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-09-25 19:14:00 +01:00
Tom Wilkie 457e4bb58e Limit the number of samples remote read can return. (#4532)
* Limit the number of samples remote read can return.

- Return 413 entity too large.
- Limit can be set by a flag. Allow 0 to mean no limit.
- Include limit in error message.
- Set default limit to 50M (* 16 bytes = 800MB).

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-09-05 15:50:50 +02:00
Daisy T 7d01ead689 Change time.Duration to model.Duration for standardization (#4479)
Signed-off-by: Daisy T <daisyts@gmx.com>
2018-08-24 16:55:21 +02:00
Julius Volz 8fbe1b5133 Handle a bunch of unchecked errors (#4461)
There are many more (mostly finalizers like Close/Stop/etc.), but most of
the others seemed like one couldn't do much about them anyway.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-08-17 17:24:35 +02:00
Henri DF ffb7836c14 Send "Accept-Encoding" header in read request (#4421)
We should be doing this since we only accept Snappy-encoded responses.

Signed-off-by: Henri DF <henridf@gmail.com>
2018-07-26 12:45:04 +01:00
Henri DF 3abb2cc349 Fix typo (#4423)
Signed-off-by: Henri DF <henridf@gmail.com>
2018-07-26 08:49:53 +01:00
Goutham Veeramachaneni c28cc5076c Saner defaults and metrics for remote-write (#4279)
* Rename queueCapacity to shardCapacity
* Saner defaults for remote write
* Reduce allocs on retries

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2018-07-18 05:15:16 +01:00
Thomas Jackson 92c6f0c92e Add offset to selectParams (#4226)
* Add Start/End to SelectParams
* Make remote read use the new selectParams for start/end

This commit continues to send the start/end time of the remote read
query as the overarching PromQL time range. The specific range of data
that the query is interested in receiving is now part of the ReadHints
(upstream discussion in #4226).

* Remove unused vendored code

The genproto.sh script was updated, but the code wasn't regenerated.
This simply removes the vendored deps that are no longer part of the
codegen output.

Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>
2018-07-18 04:58:00 +01:00
Tom Wilkie 0b189b2da9 Review feedback.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-06-18 17:21:12 +01:00
Corentin Chary 530107f8ef federation: nil pointer dereference when using remote read
```
level=error ts=2018-06-13T07:19:04.515149169Z caller=stdlib.go:89 component=web caller="http: panic serving [::1" msg="]:56202: runtime error: invalid memory address or nil pointer dereference"
level=error ts=2018-06-13T07:19:04.516199547Z caller=stdlib.go:89 component=web caller="http: panic serving [::1" msg="]:56204: runtime error: invalid memory address or nil pointer dereference"
level=error ts=2018-06-13T07:19:04.51717692Z caller=stdlib.go:89 component=web caller="http: panic serving [::1" msg="]:56206: runtime error: invalid memory address or nil pointer dereference"
level=error ts=2018-06-13T07:19:04.564952878Z caller=stdlib.go:89 component=web caller="http: panic serving [::1" msg="]:56208: runtime error: invalid memory address or nil pointer dereference"
level=error ts=2018-06-13T07:19:04.566575791Z caller=stdlib.go:89 component=web caller="http: panic serving [::1" msg="]:56210: runtime error: invalid memory address or nil pointer dereference"
level=error ts=2018-06-13T07:19:04.567106063Z caller=stdlib.go:89 component=web caller="http: panic serving [::1" msg="]:56212: runtime error: invalid memory address or nil pointer dereference"
```

When remote read is enabled, federation will call `q.Select(nil, mset...)`
which will break remote reads because it currently doesn't handle nil
SelectParams.

Signed-off-by: Corentin Chary <c.chary@criteo.com>
2018-06-18 17:21:12 +01:00
Andreas Auernhammer 37d1bcf495 limit size of POST requests against remote read endpoint (#4239)
This commit fixes a denial-of-service issue of the remote
read endpoint. It limits the size of the POST request body
to 32 MB such that clients cannot write arbitrary amounts
of data to the server memory.

Fixes #4238

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2018-06-08 08:19:20 +01:00
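The kind of request-body cap described in 37d1bcf495 can be sketched with the standard library's `http.MaxBytesReader`. This is a minimal illustration, not the code from the commit: the handler, the `statusFor` helper, the `/read` path and the tiny limits used in `main` are assumptions made for the example. Only the 32 MB figure comes from the commit message.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// maxBodyBytes mirrors the 32 MB figure from the commit message.
const maxBodyBytes = 32 << 20

// limitedHandler is a hypothetical handler that refuses to buffer more
// than limit bytes of the request body.
func limitedHandler(limit int64) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Reads past limit fail instead of growing server memory.
		r.Body = http.MaxBytesReader(w, r.Body, limit)
		body, err := io.ReadAll(r.Body)
		if err != nil {
			// For simplicity every read error is reported as "too large".
			http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
			return
		}
		fmt.Fprintf(w, "read %d bytes", len(body))
	})
}

// statusFor sends payload to the handler in-process and returns the
// resulting HTTP status code.
func statusFor(limit int64, payload string) int {
	req := httptest.NewRequest(http.MethodPost, "/read", strings.NewReader(payload))
	rec := httptest.NewRecorder()
	limitedHandler(limit).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	// Small limits keep the demonstration cheap; a server would pass maxBodyBytes.
	fmt.Println("within limit:", statusFor(8, "tiny"))
	fmt.Println("over limit:", statusFor(8, "this body is too long"))
	fmt.Println("production-style limit in bytes:", maxBodyBytes)
}
```

Wrapping the body before reading it means the limit applies no matter how the handler later consumes the request.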
Bryan Boreham 3277aeefaa Add queue name to logger for remote writes
More than one remote_write destination can be configured, in which
case it's essential to know which one each log message refers to.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2018-06-01 13:04:00 +00:00
Tom Wilkie b58199bf12 Review feedback.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-05-29 11:35:43 +01:00
Tom Wilkie 3353bbd018 Add proper unclean shutdown handling with a cancellable context.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-05-29 09:51:29 +01:00
Tom Wilkie e51d6c4b6c Make remote flush deadline a command line param.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-05-23 15:06:01 +01:00
Tom Wilkie a6c353613a Make the flush deadline configurable.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-05-23 15:04:36 +01:00
Tom Wilkie aa17263edd Remove WaitGroup and extra goroutine.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-05-23 15:04:34 +01:00
Tom Wilkie f3c61f8bb2 Only give remote queues 1 minute to flush samples on shutdown.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-05-23 15:04:32 +01:00
Henri DF 2952387ed1 Pass query hints down into remote read query proto (#4122)
Signed-off-by: Henri DF <henridf@gmail.com>
2018-05-08 09:48:13 +01:00
Adam Shannon 809881d7f5 support reading basic_auth password_file for HTTP basic auth (#4077)
Issue: https://github.com/prometheus/prometheus/issues/4076

Signed-off-by: Adam Shannon <adamkshannon@gmail.com>
2018-04-25 18:19:06 +01:00
Mario Trangoni 464e747f1e fix some comments typos (#4059) 2018-04-08 10:51:54 +01:00
Tom Wilkie dc860e7d0e Fix nit. 2018-03-12 16:48:51 +00:00
Tom Wilkie 390b018c90 Test sample timeout delivery. 2018-03-12 15:35:43 +00:00
Tom Wilkie 22d820ef8e Review feedback. 2018-03-12 14:27:48 +00:00
Tom Wilkie f8c9d375b6 Correctly stop the timer used in the remote write path. 2018-03-09 12:00:26 +00:00
ferhat elmas ffa673f7d8 General simplifications (#3887)
Another try as in #1516
2018-02-26 07:58:10 +00:00
Fabian Reinartz 7ccd4b39b8 *: implement query params
This adds a parameter to the storage selection interface which allows
query engine(s) to pass information about the operations surrounding a
data selection.
This can for example be used by remote storage backends to infer the
correct downsampling aggregates that need to be provided.
2018-02-13 12:17:22 +01:00
Tom Wilkie a730083cbf Merge pull request #3731 from bboreham/reuse-timer
Re-use timer in remote storage queue
2018-02-05 10:54:08 +01:00
Krasi Georgiev b75428ec19 rename package retrieval to scrape
no functional changes, just renaming retrieval to scrape
2018-02-01 09:55:07 +00:00
Bryan Boreham 8a4535e6ad Re-use timer instead of creating new ones on every sample
The docs for `time.After()` note that "The underlying Timer is not
recovered by the garbage collector until the timer fires".
2018-01-24 12:36:29 +00:00
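The timer re-use that 8a4535e6ad describes can be sketched as follows. This is an assumed, simplified batching loop rather than the queue code from the repository: `collect`, its parameters and the batching behaviour are hypothetical. The point is the single `time.Timer` that is stopped, drained and reset, where a `time.After` call per iteration would allocate a fresh timer each time.

```go
package main

import (
	"fmt"
	"time"
)

// collect reads values from in until it is closed and groups them into
// batches. A batch is flushed when it reaches batchSize or when
// flushAfter elapses. One timer is reused for the whole loop.
func collect(in <-chan int, batchSize int, flushAfter time.Duration) [][]int {
	var batches [][]int
	var pending []int

	timer := time.NewTimer(flushAfter)
	defer timer.Stop()

	flush := func() {
		if len(pending) > 0 {
			batches = append(batches, pending)
			pending = nil
		}
	}

	for {
		select {
		case v, ok := <-in:
			if !ok {
				flush()
				return batches
			}
			pending = append(pending, v)
			if len(pending) >= batchSize {
				flush()
				// Stop the timer and drain a pending tick, if any,
				// so that Reset starts from a clean state.
				if !timer.Stop() {
					select {
					case <-timer.C:
					default:
					}
				}
				timer.Reset(flushAfter)
			}
		case <-timer.C:
			// The tick was just received, so Reset is safe here.
			flush()
			timer.Reset(flushAfter)
		}
	}
}

func main() {
	in := make(chan int, 5)
	for i := 1; i <= 5; i++ {
		in <- i
	}
	close(in)
	fmt.Println(collect(in, 2, time.Hour))
}
```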
Tom Wilkie f2c5399e39 Merge pull request #3561 from twiedenbein/master
fixed bug with initialization of queueconfig
2018-01-17 12:24:58 +00:00
Shubheksha Jalan 0471e64ad1 Use shared types from the `common` repo (#3674)
* refactor: use shared types from common repo, remove util/config

* vendor: add common/config

* fix nit
2018-01-11 16:10:25 +01:00
Shubheksha Jalan ec94df49d4 Refactor SD configuration to remove `config` dependency (#3629)
* refactor: move targetGroup struct and CheckOverflow() to their own package

* refactor: move auth and security related structs to a utility package, fix import error in utility package

* refactor: Azure SD, remove SD struct from config

* refactor: DNS SD, remove SD struct from config into dns package

* refactor: ec2 SD, move SD struct from config into the ec2 package

* refactor: file SD, move SD struct from config to file discovery package

* refactor: gce, move SD struct from config to gce discovery package

* refactor: move HTTPClientConfig and URL into util/config, fix import error in httputil

* refactor: consul, move SD struct from config into consul discovery package

* refactor: marathon, move SD struct from config into marathon discovery package

* refactor: triton, move SD struct from config to triton discovery package, fix test

* refactor: zookeeper, move SD structs from config to zookeeper discovery package

* refactor: openstack, remove SD struct from config, move into openstack discovery package

* refactor: kubernetes, move SD struct from config into kubernetes discovery package

* refactor: notifier, use targetgroup package instead of config

* refactor: tests for file, marathon, triton SD - use targetgroup package instead of config.TargetGroup

* refactor: retrieval, use targetgroup package instead of config.TargetGroup

* refactor: storage, use config util package

* refactor: discovery manager, use targetgroup package instead of config.TargetGroup

* refactor: use HTTPClient and TLS config from configUtil instead of config

* refactor: tests, use targetgroup package instead of config.TargetGroup

* refactor: fix targetgroup.Group pointers that were removed by mistake

* refactor: openstack, kubernetes: drop prefixes

* refactor: remove import aliases forced due to vscode bug

* refactor: move main SD struct out of config into discovery/config

* refactor: rename configUtil to config_util

* refactor: rename yamlUtil to yaml_config

* refactor: kubernetes, remove prefixes

* refactor: move the TargetGroup package to discovery/

* refactor: fix order of imports
2017-12-29 21:01:34 +01:00
Tom Wiedenbein 937ac8c060 fixed bug with initialization of queueconfig
QueueConfigs would only ever initialize to the default settings, and would not pick up their respective values from YAML.
2017-12-08 02:11:45 -08:00
Fabian Reinartz 83cd270ea4 *: adapt to storage interface changes 2017-11-23 19:05:04 +01:00
Tobias Schmidt 7098c56474 Add remote read filter option
For special remote read endpoints which have only data for specific
queries, it is desired to limit the number of queries sent to the
configured remote read endpoint to reduce latency and performance
overhead.
2017-11-13 23:30:01 +01:00
Tobias Schmidt 434f0374f7 Refactor remote storage querier handling
* Decouple remote client from ReadRecent feature.
* Separate remote read filter into a small, testable function.
* Use storage.Queryable interface to compose independent
  functionalities.
2017-11-13 23:19:15 +01:00
Julius Volz 9f10c63cff Fix remote read labelset corruption (#3456)
The labelsets returned from remote read are mutated in higher levels
(like seriesFilter.Labels()) and since the concreteSeriesSet didn't
return a copy, the external mutation affected the labelset in the
concreteSeries itself. This resulted in bizarre bugs where local and
remote series would show with identical label sets in the UI, but not be
deduplicated, since internally, a series might come to look like:

{__name__="node_load5", instance="192.168.1.202:12090", job="node_exporter", node="odroid", node="odroid"}

(note the repetition of the last label)
2017-11-12 00:47:47 +01:00
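The aliasing problem behind 9f10c63cff can be illustrated with a small stand-alone sketch. The types and method names below (`Label`, `series`, `LabelsShared`, `LabelsCopy`) are hypothetical and are not the Prometheus types. The sketch only shows why returning an internal slice lets a caller's mutation leak back into the series, and how returning a copy avoids that.

```go
package main

import "fmt"

// Label is a hypothetical name/value pair.
type Label struct{ Name, Value string }

// series is a hypothetical stand-in for a series that owns a label set.
type series struct{ labels []Label }

// LabelsShared returns the internal slice. Callers that modify the
// result also modify the series.
func (s *series) LabelsShared() []Label { return s.labels }

// LabelsCopy returns a fresh slice, so caller-side mutation stays local.
func (s *series) LabelsCopy() []Label {
	out := make([]Label, len(s.labels))
	copy(out, s.labels)
	return out
}

func main() {
	s := &series{labels: []Label{{"job", "node_exporter"}, {"node", "odroid"}}}

	c := s.LabelsCopy()
	c[0].Value = "changed"
	fmt.Println("after mutating the copy:", s.labels[0].Value)

	sh := s.LabelsShared()
	sh[0].Value = "changed"
	fmt.Println("after mutating the shared slice:", s.labels[0].Value)
}
```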
Krasi Georgiev 5d8f93a22a now using only github.com/gogo/protobuf
bumped all grpc-gateway packages to v1.2.2
updated and ran the genproto.sh script
2017-11-02 11:31:57 +00:00
Tom Wilkie 1af3ef431d s/TestRemoveLabels/TestSeriesSetFilter/ 2017-10-26 13:50:39 +01:00
Tom Wilkie 9c3c98e8de Revert "Port 'Don't disable HTTP keep-alives for remote storage connections.' to 2.0 (see #3173)"
This reverts commit 0997191b18.
2017-10-26 13:43:48 +01:00
Tom Wilkie 746752b946 Merge external labels in order. 2017-10-26 11:44:49 +01:00
Tom Wilkie 6e4d4ea402 Initialise some counters in remote storage API. 2017-10-26 11:09:45 +01:00
Tom Wilkie 2ae04d0e79 Add license header. 2017-10-26 11:09:16 +01:00
Tom Wilkie e8c264e47a Add comment. 2017-10-26 11:09:16 +01:00
Tom Wilkie ee011d906d Port remote read server to 2.0. 2017-10-26 11:09:14 +01:00
Bryan Boreham 0997191b18 Port 'Don't disable HTTP keep-alives for remote storage connections.' to 2.0 (see #3173)
Removes configurability introduced in #3160 in favour of hard-coding,
per advice from @brian-brazil.
2017-10-26 11:08:33 +01:00
Tom Wilkie 56820726fa Move a couple of the encoding/decoding functions into codec.go 2017-10-26 11:08:33 +01:00
Conor Broderick 08b7328669 Port Metric name validation to 2.0 (see #2975) 2017-10-26 11:08:33 +01:00