Commit Graph

55 Commits

Author SHA1 Message Date
Romain Baugue
95193fa027 Exhaust every request body before closing it (#5166) (#5479)
From the documentation:
> The default HTTP client's Transport may not
> reuse HTTP/1.x "keep-alive" TCP connections if the Body is
> not read to completion and closed.

This effectively enable keep-alive for the fixed requests.

Signed-off-by: Romain Baugue <romain.baugue@elwinar.com>
2019-04-18 09:50:37 +01:00
Simon Pasquier
c1682adb2f Bump prometheus/common to v0.3.0 (#5344)
* Reload certificates from disk automatically

This change bumps github.com/prometheus/common to include
https://github.com/prometheus/common/pull/173

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* scrape: close idle connections on reload/stop

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* use v0.3.0 tag

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-04-10 13:20:00 +01:00
Brian Brazil
f7184978f4 Protect against memory exhaustion when scraping.
Now that we're not losing the scrape cache across failed
scrape, a scrape that continually failed but had varying
series or metadata (e.g. timestamps in metric names,
plus hitting smaple_limit) would grow the cache indefinitely.

Add some code to catch that, and flush the cache anyway.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2019-04-04 19:09:11 +01:00
Brian Brazil
dd3073616c Don't lose the scrape cache on a failed scrape.
This avoids CPU usage increasing when the target comes back.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2019-04-04 19:09:11 +01:00
Tariq Ibrahim
8fdfa8abea refine error handling in prometheus (#5388)
i) Uses the more idiomatic Wrap and Wrapf methods for creating nested errors.
ii) Fixes some incorrect usages of fmt.Errorf where the error messages don't have any formatting directives.
iii) Does away with the use of fmt package for errors in favour of pkg/errors

Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-03-26 00:01:12 +01:00
Tom Wilkie
807fd33ecc Review feedback.
- Update read path to use labels.Labels.
- Fix the tests.
- Remove pack.
- Remove unused function.
- Fix race in tests.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2019-03-18 20:31:12 +00:00
Simon Pasquier
23069b87dc scrape: fallback to hostname if lookup fails (#5366)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-03-15 12:02:16 +00:00
Julien Pivotto
4397916cb2 Add honor_timestamps (#5304)
Fixes #5302

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2019-03-15 10:04:15 +00:00
xjewer
0d1a69353e scrape: Add global jitter for HA server (#5181)
* scrape: Add global jitter for HA server

Covers issue in https://github.com/prometheus/prometheus/pull/4926#issuecomment-449039848
where the HA setup become a problem for targets unable to be scraped simultaneously.
The new jitter per server relies on the hostname and external labels which necessarily to be uniq.

As before, scrape offset will be calculated with regard the absolute time, so even
restart/reload doesn't change scrape time per scrape target + prometheus instance.

Use fqdn if possible, otherwise fall back to the hostname. It adds extra random seed
to calculate server hash to be distinguish on machines with the same hostname, but
different DC.

Signed-off-by: Aleksei Semiglazov <xjewer@gmail.com>
2019-03-12 10:46:15 +00:00
Julien Pivotto
04ce817c49 scrape: Rewrite scrape loop options as a struct (#5314)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2019-03-12 10:26:18 +00:00
Nguyen Hai Truong
aed9ea144a Remove duplicated words in comments
Although it is spelling mistakes, it might make an affects
while reading.

Co-Authored-By: Kim Bao Long longkb@vn.fujitsu.com
Signed-off-by: Nguyen Hai Truong <truongnh@vn.fujitsu.com>
2019-02-20 17:41:02 -08:00
Simon Pasquier
12708acd15
scrape: catch errors when creating HTTP clients (#5182)
* scrape: catch errors when creating HTTP clients

This change makes sure that no scrape pool is created with a nil HTTP
client.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Address Tariq's comment

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Address Brian's comment

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-02-13 14:24:22 +01:00
JoeWrightss
4cb6c202ff Fix fmt.Errorf error message (#5199)
Signed-off-by: zhoulin xie <zhoulin.xie@daocloud.io>
2019-02-10 15:16:20 +05:30
Matt Layher
302148fd69 *: apply gofmt -s
Signed-off-by: Matt Layher <mdlayher@gmail.com>
2019-01-16 17:28:14 -05:00
Simon Pasquier
f678e27eb6
*: use latest release of staticcheck (#5057)
* *: use latest release of staticcheck

It also fixes a couple of things in the code flagged by the additional
checks.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Use official release of staticcheck

Also run 'go list' before staticcheck to avoid failures when downloading packages.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-01-04 14:47:38 +01:00
Bartek Płotka
62c8337e77 Moved configuration into relabel package. (#4955)
Adapted top dir relabel to use pkg relabel structs.

Removal of this in a separate tracked here: https://github.com/prometheus/prometheus/issues/3647

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
2018-12-18 11:26:36 +00:00
F4ncyMooN
cd491e2d3a wait for interval-now%interval to make sure target will be collected with a fixed interval when restart prometheus (#4926)
Signed-off-by: hlv <hlv@freewheel.tv>
2018-12-05 09:58:39 +00:00
Ben Kochie
c6399296dc
Fix spelling/typos (#4921)
* Fix spelling/typos

Fix spelling/typos reported by codespell/misspell.
* UK -> US spelling changes.

Signed-off-by: Ben Kochie <superq@gmail.com>
2018-11-27 17:44:29 +01:00
Brian Brazil
d2f0f54d68
Pass through content-type for non-compressed output. (#4912)
Fixes #4911

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-11-26 13:05:07 +00:00
Wei Guo
996fd958ac fix deadlock in scrape manager (#4894)
Scrape manager will fall in deadlock when we reload configs frequently.
2018-11-23 11:23:55 +02:00
arugaki
b98a5acb57 fix buf redeclared in scrapeloop (#4873)
* buf Has been declared in scrape.go in line 785, I think it is unnecessary to declare a new variable again here.

Signed-off-by: arugakiWei <arugaki.wei@daocloud.io>

* delete the buf in line 785 because  it is never used.

Signed-off-by: arugakiWei <arugaki.wei@daocloud.io>
2018-11-22 12:13:52 +08:00
Simon Pasquier
ed19373a78
*: remove use of golang.org/x/net/context (#4869)
* *: remove use of golang.org/x/net/context

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* scrape: fix TestTargetScrapeScrapeCancel

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-11-19 12:31:16 +01:00
Brian Brazil
9c03e11c2c Hook OpenMetrics parser into scraping.
Extend metadata api to support units.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-10-18 13:58:00 +01:00
Brian Brazil
ffe7efb411 Prepare for multiple text formats
Pass content type down to text parser.

Add layer of indirection in front of text parser,
and rename to avoid future clashes.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2018-10-18 13:58:00 +01:00
Will Hegedus
193ebe7e34 Updates to /targets and /rules (scrape duration, last evaluation time) (#4722)
* Add evaluationTimestamp (Last Evaluation) column to display on /rules
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>

* Add lastScrapeDuration ("Scrape Duration") to display on /targets
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>

* Updates based on Julius' feedback

Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>

* Update to set timestamp to when eval started (after eval completes)

Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>

* Update /rules to display time since last evaluation

Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>

* Re-order Last Eval/Eval Time to be consistent with targets page

Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>
2018-10-12 18:26:59 +02:00
Krasi Georgiev
47a673c3a0
process scrape loops reloading in parallel (#4526)
The scrape manage receiver's channel now just saves the target sets
and another backgorund runner updates the scrape loops every 5 seconds.
This is so that the scrape manager doesn't block the receiving channel
when it does the long background reloading of the scrape loops.

Active and dropped targets are now saved in each scrape pool instead of
the scrape manager. This is mainly to avoid races when getting the
targets via the web api.

When reloading the scrape loops now happens in parallel to speed up the
final disared state and this also speeds up the prometheus's shutting
down.

Also updated some funcs signatures in the web package for consistency.

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2018-09-26 12:20:56 +03:00
Julius Volz
8fbe1b5133
Handle a bunch of unchecked errors (#4461)
There are many more (mostly finalizers like Close/Stop/etc.), but most of
the others seemed like one couldn't do much about them anyway.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-08-17 17:24:35 +02:00
Krasi Georgiev
9f2f6accba fix the TestManagerReloadNoChange test (#4267)
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2018-07-04 12:01:19 +01:00
Fabian Reinartz
057a5ae2b1 Address comments
Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-06-06 11:21:17 -04:00
Fabian Reinartz
ad4c33c1ff scrape,api: provide per-target metric metadata
This adds a per-target cache of scraped metadata. The metadata is only
available for the lifecycle of the attached target. An API endpoint allows
to select metadata by metric name and a label selection of targets.

Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-06-06 05:56:10 -04:00
Fabian Reinartz
76a4a46cb0 pkg/textparse: refactor and add metadata handling
Extends the parser to allow retrieving metadata.
The lexer now yields proper tokens that are fed into a hand-written
parser on top.

Signed-off-by: Fabian Reinartz <freinartz@google.com>
2018-05-25 11:27:25 -04:00
Elif T. Kuş
57dcdfb15f Rewrote tests with testutil for several test files (#4086)
* promql: Rewrote tests with testutil for functions_test

Signed-off-by: Elif T. Kuş <elifkus@gmail.com>

* pkg/relabel: Rewrote tests with testutil for relabel_test

Signed-off-by: Elif T. Kuş <elifkus@gmail.com>

* discovery/consul: Rewrote tests with testutil for consul_test

Signed-off-by: Elif T. Kuş <elifkus@gmail.com>

* scrape: Rewrote tests with testutil for manager_test

Signed-off-by: Elif T. Kuş <elifkus@gmail.com>
2018-04-27 13:11:16 +01:00
Karsten Weiss
d79d573f71 Fix spelling mistakes found by codespell (#4065)
Signed-off-by: Karsten Weiss <knweiss@gmail.com>
2018-04-27 13:04:02 +01:00
Adam Shannon
809881d7f5 support reading basic_auth password_file for HTTP basic auth (#4077)
Issue: https://github.com/prometheus/prometheus/issues/4076

Signed-off-by: Adam Shannon <adamkshannon@gmail.com>
2018-04-25 18:19:06 +01:00
Björn Rabenstein
91e470d733
Merge pull request #4096 from simonpasquier/fix-scrape-races-2.2
Fix scrape races (release-2.2 branch)
2018-04-25 15:36:29 +02:00
Simon Pasquier
2cbba4e948 scrape: fix data races
This commit avoids passing the full scrape configuration down to the
scrape loop to fix data races when the scrape configuration is being
reloaded.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-04-18 11:17:31 +02:00
Simon Pasquier
8b89ab0173 scrape: add test detecting data races
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2018-04-18 11:17:25 +02:00
beorn7
94ff07b81d Merge branch 'release-2.2'
Signed-off-by: beorn7 <beorn@soundcloud.com>
2018-04-10 16:50:35 +02:00
Krasi Georgiev
dc29dd1c6f add mutex for DiscoveredLabels
Signed-off-by: Krasi Georgiev <krasi.root@gmail.com>
2018-04-10 00:18:58 +03:00
Krasi Georgiev
ddd46de6f4 Races/3994 (#4005)
Fix race by properly locking access to scrape pools. Use separate mutex for information needed by UI so that UI isn't blocked when targets are being updated.
2018-04-09 15:18:25 +01:00
Mario Trangoni
464e747f1e fix some comments typos (#4059) 2018-04-08 10:51:54 +01:00
Fabian Reinartz
184b6e3767
Merge pull request #3968 from zjwzte/fix-magic-number
Fix magic number.
2018-03-28 14:09:43 +02:00
ferhat elmas
ec8e4d8a7c all: remove unnecessary type conversions (#3992)
excep promql due to not to create conflict with #3966.
2018-03-21 09:25:22 +00:00
zjwzte
b7a37a1604 Fix magic number. 2018-03-15 10:15:35 +08:00
Fabian Reinartz
ef567ceb7d
Merge pull request #3835 from krasi-georgiev/pool-package-generalize
Pool package generalize
2018-02-28 14:30:46 +01:00
ferhat elmas
ffa673f7d8 General simplifications (#3887)
Another try as in #1516
2018-02-26 07:58:10 +00:00
Conor Broderick
99006d3baf Added dropped targets API to targets endpoint (#3870) 2018-02-21 17:26:18 +00:00
Krasi Georgiev
933ab8a34e stupid return mistake fix, and dropped the additional assertion check! 2018-02-20 13:32:23 +02:00
Krasi Georgiev
675ce533c9 refactored TestScrapeLoopAppend and added a test for empty labels 2018-02-20 11:05:54 +00:00
Brian Brazil
4addee2bee Fix honor_labels for empty labels, prune empty labels.
The semantics of honor_labels are that if a target exposes
and empty label it will override the target labels. This PR
fixes that by once again distinguishing between empty labels
and missing labels in this one use case.

Beyond that empty labels should be pruned and not added to storage,
which this also fixes.

Fixes #3841
2018-02-20 11:05:54 +00:00