Commit Graph

144 Commits

Author SHA1 Message Date
Tom Wilkie
d3a1ff1abf
Reduce memory usage of remote read by reducing pointer usage. (#4655)
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-09-25 19:14:00 +01:00
Tom Wilkie
457e4bb58e
Limit the number of samples remote read can return. (#4532)
* Limit the number of samples remote read can return.

- Return 413 entity too large.
- Limit can be set be a flag.  Allow 0 to mean no limit.
- Include limit in error message.
- Set default limit to 50M (* 16 bytes = 800MB).

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-09-05 15:50:50 +02:00
Daisy T
7d01ead689 change time.duration to model.duration for standardization (#4479)
Signed-off-by: Daisy T <daisyts@gmx.com>
2018-08-24 16:55:21 +02:00
Julius Volz
8fbe1b5133
Handle a bunch of unchecked errors (#4461)
There are many more (mostly finalizers like Close/Stop/etc.), but most of
the others seemed like one couldn't do much about them anyway.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-08-17 17:24:35 +02:00
Henri DF
ffb7836c14 Send "Accept-Encoding" header in read request (#4421)
We should be doing this since we only accept Snappy-encoded responses.

Signed-off-by: Henri DF <henridf@gmail.com>
2018-07-26 12:45:04 +01:00
Henri DF
3abb2cc349 Fix typo (#4423)
Signed-off-by: Henri DF <henridf@gmail.com>
2018-07-26 08:49:53 +01:00
Goutham Veeramachaneni
c28cc5076c Saner defaults and metrics for remote-write (#4279)
* Rename queueCapacity to shardCapacity
* Saner defaults for remote write
* Reduce allocs on retries

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2018-07-18 05:15:16 +01:00
Thomas Jackson
92c6f0c92e Add offset to selectParams (#4226)
* Add Start/End to SelectParams
* Make remote read use the new selectParams for start/end

This commit will continue sending the start/end time of the remote read
query as the overarching promql time and the specific range of data that
the query is intersted in receiving a response to is now part of the
ReadHints (upstream discussion in #4226).

* Remove unused vendored code

The genproto.sh script was updated, but the code wasn't regenerated.
This simply removes the vendored deps that are no longer part of the
codegen output.

Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>
2018-07-18 04:58:00 +01:00
Tom Wilkie
0b189b2da9 Review feedback.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-06-18 17:21:12 +01:00
Corentin Chary
530107f8ef federation: nil pointer deference when using remove read
```
level=error ts=2018-06-13T07:19:04.515149169Z caller=stdlib.go:89 component=web caller="http: panic serving [::1" msg="]:56202: runtime error: invalid memory address or nil pointer dereference"
level=error ts=2018-06-13T07:19:04.516199547Z caller=stdlib.go:89 component=web caller="http: panic serving [::1" msg="]:56204: runtime error: invalid memory address or nil pointer dereference"
level=error ts=2018-06-13T07:19:04.51717692Z caller=stdlib.go:89 component=web caller="http: panic serving [::1" msg="]:56206: runtime error: invalid memory address or nil pointer dereference"
level=error ts=2018-06-13T07:19:04.564952878Z caller=stdlib.go:89 component=web caller="http: panic serving [::1" msg="]:56208: runtime error: invalid memory address or nil pointer dereference"
level=error ts=2018-06-13T07:19:04.566575791Z caller=stdlib.go:89 component=web caller="http: panic serving [::1" msg="]:56210: runtime error: invalid memory address or nil pointer dereference"
level=error ts=2018-06-13T07:19:04.567106063Z caller=stdlib.go:89 component=web caller="http: panic serving [::1" msg="]:56212: runtime error: invalid memory address or nil pointer dereference"
```

When remove read is enabled, federation will call `q.Select(nil, mset...)`
which will break remote reads because it currently doesn't handle empty
SelectParams.

Signed-off-by: Corentin Chary <c.chary@criteo.com>
2018-06-18 17:21:12 +01:00
Andreas Auernhammer
37d1bcf495 limit size of POST requests against remote read endpoint (#4239)
This commit fixes a denial-of-service issue of the remote
read endpoint. It limits the size of the POST request body
to 32 MB such that clients cannot write arbitrary amounts
of data to the server memory.

Fixes #4238

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2018-06-08 08:19:20 +01:00
Bryan Boreham
3277aeefaa Add queue name to logger for remote writes
More than one remote_write destination can be configured, in which
case it's essential to know which one each log message refers to.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2018-06-01 13:04:00 +00:00
Tom Wilkie
b58199bf12 Review feedback.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-05-29 11:35:43 +01:00
Tom Wilkie
3353bbd018 Add proper unclean shutdown handling with a cancellable context.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-05-29 09:51:29 +01:00
Tom Wilkie
e51d6c4b6c Make remote flush deadline a command line param.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-05-23 15:06:01 +01:00
Tom Wilkie
a6c353613a Make the flush deadline configurable.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-05-23 15:04:36 +01:00
Tom Wilkie
aa17263edd Remove WaitGroup and extra goroutine.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-05-23 15:04:34 +01:00
Tom Wilkie
f3c61f8bb2 Only give remote queues 1 minute to flush samples on shutdown.
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2018-05-23 15:04:32 +01:00
Henri DF
2952387ed1 Pass query hints down into remote read query proto (#4122)
Signed-off-by: Henri DF <henridf@gmail.com>
2018-05-08 09:48:13 +01:00
Adam Shannon
809881d7f5 support reading basic_auth password_file for HTTP basic auth (#4077)
Issue: https://github.com/prometheus/prometheus/issues/4076

Signed-off-by: Adam Shannon <adamkshannon@gmail.com>
2018-04-25 18:19:06 +01:00
Mario Trangoni
464e747f1e fix some comments typos (#4059) 2018-04-08 10:51:54 +01:00
Tom Wilkie
dc860e7d0e Fix nit. 2018-03-12 16:48:51 +00:00
Tom Wilkie
390b018c90 Test sample timeout delivery. 2018-03-12 15:35:43 +00:00
Tom Wilkie
22d820ef8e Review feedback. 2018-03-12 14:27:48 +00:00
Tom Wilkie
f8c9d375b6 Correctly stop the timer used in the remote write path. 2018-03-09 12:00:26 +00:00
ferhat elmas
ffa673f7d8 General simplifications (#3887)
Another try as in #1516
2018-02-26 07:58:10 +00:00
Fabian Reinartz
7ccd4b39b8 *: implement query params
This adds a parameter to the storage selection interface which allows
query engine(s) to pass information about the operations surrounding a
data selection.
This can for example be used by remote storage backends to infer the
correct downsampling aggregates that need to be provided.
2018-02-13 12:17:22 +01:00
Tom Wilkie
a730083cbf
Merge pull request #3731 from bboreham/reuse-timer
Re-use timer in remote storage queue
2018-02-05 10:54:08 +01:00
Krasi Georgiev
b75428ec19 rename package retrieve to scrape
no fucnctinal changes just renaming retrieval to scrape
2018-02-01 09:55:07 +00:00
Bryan Boreham
8a4535e6ad Re-use timer instead of creating new ones on every sample
The docs for `time.After()` note that "The underlying Timer is not
recovered by the garbage collector until the timer fires".
2018-01-24 12:36:29 +00:00
Tom Wilkie
f2c5399e39
Merge pull request #3561 from twiedenbein/master
fixed bug with initialization of queueconfig
2018-01-17 12:24:58 +00:00
Shubheksha Jalan
0471e64ad1 Use shared types from the common repo (#3674)
* refactor: use shared types from common repo, remove util/config

* vendor: add common/config

* fix nit
2018-01-11 16:10:25 +01:00
Shubheksha Jalan
ec94df49d4 Refactor SD configuration to remove config dependency (#3629)
* refactor: move targetGroup struct and CheckOverflow() to their own package

* refactor: move auth and security related structs to a utility package, fix import error in utility package

* refactor: Azure SD, remove SD struct from config

* refactor: DNS SD, remove SD struct from config into dns package

* refactor: ec2 SD, move SD struct from config into the ec2 package

* refactor: file SD, move SD struct from config to file discovery package

* refactor: gce, move SD struct from config to gce discovery package

* refactor: move HTTPClientConfig and URL into util/config, fix import error in httputil

* refactor: consul, move SD struct from config into consul discovery package

* refactor: marathon, move SD struct from config into marathon discovery package

* refactor: triton, move SD struct from config to triton discovery package, fix test

* refactor: zookeeper, move SD structs from config to zookeeper discovery package

* refactor: openstack, remove SD struct from config, move into openstack discovery package

* refactor: kubernetes, move SD struct from config into kubernetes discovery package

* refactor: notifier, use targetgroup package instead of config

* refactor: tests for file, marathon, triton SD - use targetgroup package instead of config.TargetGroup

* refactor: retrieval, use targetgroup package instead of config.TargetGroup

* refactor: storage, use config util package

* refactor: discovery manager, use targetgroup package instead of config.TargetGroup

* refactor: use HTTPClient and TLS config from configUtil instead of config

* refactor: tests, use targetgroup package instead of config.TargetGroup

* refactor: fix tagetgroup.Group pointers that were removed by mistake

* refactor: openstack, kubernetes: drop prefixes

* refactor: remove import aliases forced due to vscode bug

* refactor: move main SD struct out of config into discovery/config

* refactor: rename configUtil to config_util

* refactor: rename yamlUtil to yaml_config

* refactor: kubernetes, remove prefixes

* refactor: move the TargetGroup package to discovery/

* refactor: fix order of imports
2017-12-29 21:01:34 +01:00
Tom Wiedenbein
937ac8c060
fixed bug with initialization of queueconfig
QueueConfigs would only ever initialize to the default settings, and would not pick up their respective values from YAML.
2017-12-08 02:11:45 -08:00
Fabian Reinartz
83cd270ea4 *: adapt to storage interface changes 2017-11-23 19:05:04 +01:00
Tobias Schmidt
7098c56474 Add remote read filter option
For special remote read endpoints which have only data for specific
queries, it is desired to limit the number of queries sent to the
configured remote read endpoint to reduce latency and performance
overhead.
2017-11-13 23:30:01 +01:00
Tobias Schmidt
434f0374f7 Refactor remote storage querier handling
* Decouple remote client from ReadRecent feature.
* Separate remote read filter into a small, testable function.
* Use storage.Queryable interface to compose independent
  functionalities.
2017-11-13 23:19:15 +01:00
Julius Volz
9f10c63cff
Fix remote read labelset corruption (#3456)
The labelsets returned from remote read are mutated in higher levels
(like seriesFilter.Labels()) and since the concreteSeriesSet didn't
return a copy, the external mutation affected the labelset in the
concreteSeries itself. This resulted in bizarre bugs where local and
remote series would show with identical label sets in the UI, but not be
deduplicated, since internally, a series might come to look like:

{__name__="node_load5", instance="192.168.1.202:12090", job="node_exporter", node="odroid", node="odroid"}

(note the repetition of the last label)
2017-11-12 00:47:47 +01:00
Krasi Georgiev
5d8f93a22a now using only github.com/gogo/protobuf
bumped all grpc-gateway packages to v1.2.2
updated and run  the denproto.sh script
2017-11-02 11:31:57 +00:00
Tom Wilkie
1af3ef431d s/TestRemoveLabels/TestSeriesSetFilter/ 2017-10-26 13:50:39 +01:00
Tom Wilkie
9c3c98e8de Revert "Port 'Don't disable HTTP keep-alives for remote storage connections.' to 2.0 (see #3173)"
This reverts commit 0997191b18.
2017-10-26 13:43:48 +01:00
Tom Wilkie
746752b946 Merge external labels in order. 2017-10-26 11:44:49 +01:00
Tom Wilkie
6e4d4ea402 Initialise some counters in remote storage API. 2017-10-26 11:09:45 +01:00
Tom Wilkie
2ae04d0e79 Add license header. 2017-10-26 11:09:16 +01:00
Tom Wilkie
e8c264e47a Add comment. 2017-10-26 11:09:16 +01:00
Tom Wilkie
ee011d906d Port remote read server to 2.0. 2017-10-26 11:09:14 +01:00
Bryan Boreham
0997191b18 Port 'Don't disable HTTP keep-alives for remote storage connections.' to 2.0 (see #3173)
Removes configurability introduced in #3160 in favour of hard-coding,
per advice from @brian-brazil.
2017-10-26 11:08:33 +01:00
Tom Wilkie
56820726fa Move a couple of the encoding/decoding functions into codec.go 2017-10-26 11:08:33 +01:00
Conor Broderick
08b7328669 Port Metric name validation to 2.0 (see #2975) 2017-10-26 11:08:33 +01:00
Tom Wilkie
8fe0212ff7 Port 'Make queue manager configurable.' to 2.0, see #2991 2017-10-26 11:08:33 +01:00