Commit Graph

167 Commits

Author SHA1 Message Date
Julien Pivotto
2051ae2e6a Revert "Fix race condition in Rule manager.Update() function"
This reverts commit 8b11d2cfb6.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-02 08:01:21 +01:00
fuling
8b11d2cfb6 Fix race condition in Rule manager.Update() function
Signed-off-by: fuling <fuling.lgz@alibaba-inc.com>
2020-03-01 12:00:51 +01:00
Tobias Guggenmos
4835bbf376
Merge branch 'master' into split_parser 2020-02-19 15:18:13 +01:00
Bartlomiej Plotka
34426766d8 Unify Iterator interfaces. All point to storage now.
This is part of https://github.com/prometheus/prometheus/pull/5882 that can be done to simplify things.
All todos I added will be fixed in follow up PRs.

* querier.Querier, querier.Appender, querier.SeriesSet, and querier.Series interfaces merged
with storage interface.go. All imports that.
* querier.SeriesIterator replaced by chunkenc.Iterator
* Added chunkenc.Iterator.Seek method and tests for xor implementation (?)
* Since we properly handle SelectParams for Select methods I adjusted min max
based on that. This should help in terms of performance for queries with functions like offset.
* added Seek to deletedIterator and test.
* storage/tsdb was removed as it was only a unnecessary glue with incompatible structs.

No logic was changed, only different source of abstractions, so no need for benchmarks.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-02-17 18:03:54 +00:00
Tobias Guggenmos
20b1f596f6 Fix build errors in rest of prometheus
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
2020-02-17 16:09:23 +01:00
Julien Pivotto
135cc30063
rules: Make deleted rule series as stale after a reload (#6745)
* rules: Make deleted rule series as stale after a reload

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-02-12 16:22:18 +01:00
Julien Pivotto
f82d55e79f
Refactor and simplify rule_group_interval_seconds (#6711)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-01-29 11:26:08 +00:00
Julien Pivotto
56ebd5afde Delete prometheus_rule_group metrics when groups are removed (#6693)
* Delete prometheus_rule_group metrics when groups are removed

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-01-27 12:41:32 +00:00
Julien Pivotto
5f27ac3583 Refactor query log fields (#6694)
* Refactor query log fields

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-01-27 09:53:10 +00:00
Harkishen Singh
84e6459c4d Adds support for line-column numbers for invalid rules, promtool (#6533)
Signed-off-by: Harkishen Singh <harkishensingh@hotmail.com>
2020-01-15 18:07:54 +00:00
Julien Pivotto
0011bba19b Query Log: add origin of the rules (#6592)
* Query Log: add origin of the rules

We don't set rule name and rule kind because the added value would be
quite low, given we have now the file, the group and the query.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-01-10 08:28:17 +00:00
Julien Pivotto
e079c9ed45 manager: add full stops on comments
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2019-12-19 11:46:22 +01:00
alburthoffman
156dcb8cca avoid stopping rule groups if new rule groups are as same as old rule… (#6450)
* avoid stopping rule groups if new rule groups are as same as old rule groups

Signed-off-by: alburthoffman <alburthoffman@gmail.com>
2019-12-19 10:41:11 +00:00
Simon Pasquier
06066a3619
*: improve error messages when parsing bad rules (#5965)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-08-28 17:36:48 +02:00
Brian Brazil
2184b79763
Mark deleted rule's series as stale on next evaluation. (#5759)
Fixes #5755

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2019-08-07 16:11:05 +01:00
beorn7
dd81912554 Add objectives to Summaries
With the next release of client_golang, Summaries will not have
objectives by default. To not lose the objectives we have right now,
explicitly state the current default objectives.

Signed-off-by: beorn7 <beorn@grafana.com>
2019-06-12 02:03:13 +02:00
pbhudiaBAE
43953b105b Sorting alerts by group name in /alerts (#5448)
* Working group name

Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>

* Working categorised by group name

Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>

* Changed group sorting in web

Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>

* Fixed group sorting and comments

Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>

* Fixed group sorting and comments with gofmt

Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>

* Added file and group name

Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>

* reverted back to full path to yml file

Signed-off-by: Pritam Bhudia <pritam.bhudia@baesystems.com>
2019-05-14 23:14:27 +02:00
Yao Zengzeng
5544cb252a fix some mistakes in comments (#5533)
Signed-off-by: YaoZengzeng <yaozengzeng@zju.edu.cn>
2019-05-05 10:48:42 +01:00
Simon Pasquier
45506841e6
*: enable all default linters (#5504)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2019-05-03 15:11:28 +02:00
Bjoern Rabenstein
38d518c0fe Rework #5009 after comments
Signed-off-by: Bjoern Rabenstein <bjoern@rabenste.in>
2019-04-17 01:40:10 +02:00
Tariq Ibrahim
8fdfa8abea refine error handling in prometheus (#5388)
i) Uses the more idiomatic Wrap and Wrapf methods for creating nested errors.
ii) Fixes some incorrect usages of fmt.Errorf where the error messages don't have any formatting directives.
iii) Does away with the use of fmt package for errors in favour of pkg/errors

Signed-off-by: tariqibrahim <tariq181290@gmail.com>
2019-03-26 00:01:12 +01:00
James Ravn
e15d8c5802 reload: copy state on both name and labels (#5368)
* reload: copy state on both name and labels

Fix https://github.com/prometheus/prometheus/issues/5193

Using just name causes the linked issue - if new rules are inserted with
the same name (but different labels), the reordering will cause stale
markers to be inserted in the next eval for all shifted rules, despite
them not being stale.

Ideally we want to avoid stale markers for time series that still exist
in the new rules, with name and labels being the unique identifer.

This change adds labels to the internal map when copying the old rule
data to the new rule data. This prevents the problem of staling rules
that simply shifted order.

If labels change, it is a new time series and the old series will stale
regardless. So it should be safe to always match on name and labels when
copying state.

Signed-off-by: James Ravn <james@r-vn.org>
2019-03-15 15:23:36 +00:00
David Symonds
46361a7c85 rules: Fix sorting of result from (*Manager).RuleGroups (#5260)
The previous code was defective in that it never sorted groups within a
file due to doing a multi-key sort incorrectly.

Signed-off-by: David Symonds <dsymonds@gmail.com>
2019-02-23 09:51:44 +01:00
beorn7
2db1eeb4ec Fix prometheus_rule_group_last_evaluation_timestamp_seconds
It should be a unix timestamp, not the seconds in the minute.

Signed-off-by: beorn7 <beorn@soundcloud.com>
2019-02-06 11:02:49 +01:00
Ganesh Vernekar
787eb1e904 Set rule_group_last_duration_seconds to seconds (#5153)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2019-01-31 11:07:58 +01:00
Vishnunarayan K I
fd3ef6ba34 Add metric rule_group_rules_loaded to get the number of rules loaded (#5090)
Signed-off-by: Vishnunarayan K I <appukuttancr@gmail.com>
2019-01-13 14:28:07 +00:00
Tom Wilkie
121603c417
Expose rules.NewGroupMetrics and rules.Metrics. (#5059)
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
2019-01-03 12:07:06 +00:00
Bartek Płotka
de213d4a5e rule manager: Moved metric registration to custom registerer which is already available. (#4961)
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
2018-12-28 10:20:29 +00:00
mknapphrt
f0e9196dca Return warnings on a remote read fail (#4832)
Signed-off-by: Mark Knapp <mknapp@hudson-trading.com>
2018-11-30 14:27:12 +00:00
Krasi Georgiev
0754e5334b
querier for RestoreForState not closed. (#4922)
Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>
2018-11-28 15:25:17 +02:00
Wei Guo
e329cbf673 Add metric prometheus_rule_group_last_evaluation for recording and alerting (#4852)
* add metric prometheus_rule_group_last_evaluation for recording and alerting

Signed-off-by: Wei Guo <me@imkira.com>

* fix issues from comments

Signed-off-by: Wei Guo <me@imkira.com>
2018-11-27 14:38:13 +08:00
Will Hegedus
193ebe7e34 Updates to /targets and /rules (scrape duration, last evaluation time) (#4722)
* Add evaluationTimestamp (Last Evaluation) column to display on /rules
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>

* Add lastScrapeDuration ("Scrape Duration") to display on /targets
Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>

* Updates based on Julius' feedback

Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>

* Update to set timestamp to when eval started (after eval completes)

Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>

* Update /rules to display time since last evaluation

Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>

* Re-order Last Eval/Eval Time to be consistent with targets page

Signed-off-by: Will Hegedus <wbhegedus@liberty.edu>
2018-10-12 18:26:59 +02:00
Ganesh Vernekar
5790d23fd8 Unit testing for rules (#4350)
* Unit testing for rules
* Specifying order of group evaluation in unit tests

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2018-09-25 17:06:26 +01:00
Chris Marchbanks
63ed9d1b70 Send EndsAt along with alerts (#4550)
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2018-08-28 16:05:00 +01:00
Chris Marchbanks
87f1dad16d throttle resends of alerts to 1 minute by default (#4538)
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
2018-08-27 17:41:42 +01:00
Julius Volz
8fbe1b5133
Handle a bunch of unchecked errors (#4461)
There are many more (mostly finalizers like Close/Stop/etc.), but most of
the others seemed like one couldn't do much about them anyway.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-08-17 17:24:35 +02:00
Julien Pivotto
0b4d22b245 rules/manager: remove a no-longer-relevant comment (#4503)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2018-08-15 09:33:39 +01:00
Benji Visser
8bb6e0dd6e Show rule evaluation errors on rules page (#4457)
* adding information about the health and errors for Rules

adding Health() and LastError() to the Rule interface. This will allow
us to easily surface information about rules.

Signed-off-by: noqcks <benny@noqcks.io>

* updating rules.html with fields for Rule errors and health state

Signed-off-by: noqcks <benny@noqcks.io>

* fix code comment grammar & access Rule health/error info using a mutex

Signed-off-by: noqcks <benny@noqcks.io>

* s/Errors/Error/ in rules.html to remain consistent with targets.html

Signed-off-by: noqcks <benny@noqcks.io>

* adding periods to code comments in reporting/alerting

Signed-off-by: noqcks <benny@noqcks.io>

* putting health/error below mutex in struct field

Signed-off-by: noqcks <benny@noqcks.io>
2018-08-07 00:33:45 +02:00
Julius Volz
90521a65f8
Remove error return value from NotifyFunc() (#4459)
It's always nil and we also forgot to check it.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-08-04 21:31:12 +02:00
Ganesh Vernekar
f1db699dff Persist alert 'for' state across restarts (#4061)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2018-08-02 11:18:24 +01:00
Max Leonard Inden
71fafad099
api/v1: Coninue work exposing rules and alerts
Signed-off-by: Max Leonard Inden <IndenML@gmail.com>
2018-07-30 15:31:51 +02:00
Bryan Boreham
afdb66dfac Expose Group.CopyState() (#4304)
This makes the `rules` package more useful to projects that use
Prometheus as a library.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2018-07-18 15:14:38 +02:00
Julius Volz
9e3171f6e3 rules: Minor naming/comment cleanups (#4328)
Signed-off-by: Julius Volz <julius.volz@gmail.com>
2018-07-18 04:54:33 +01:00
Alin Sinpalean
9dc763cc03 Run rule evaluation with timestamps precisely evaluation_interval apart (#4201)
* Run rule evaluation with timestamps precisely evaluation_interval apart from one another.

Signed-off-by: Alin Sinpalean <alin.sinpalean@gmail.com>
2018-06-01 15:23:07 +01:00
Mario Trangoni
464e747f1e fix some comments typos (#4059) 2018-04-08 10:51:54 +01:00
Bryan Boreham
93494d8b7e Add an OpenTracing span for each rule (#4027)
* Add an OpenTracing span for each rule

So that tags and child spans can be traced back to the rule that they
refer to.
2018-03-30 21:29:19 +01:00
ferhat elmas
ffa673f7d8 General simplifications (#3887)
Another try as in #1516
2018-02-26 07:58:10 +00:00
Fabian Reinartz
7ccd4b39b8 *: implement query params
This adds a parameter to the storage selection interface which allows
query engine(s) to pass information about the operations surrounding a
data selection.
This can for example be used by remote storage backends to infer the
correct downsampling aggregates that need to be provided.
2018-02-13 12:17:22 +01:00
Brian Brazil
30b4439bbd Remove rule_type label from rule metrics.
This is not really needed now that we have rule groups
to distinguish rules.
2017-12-04 11:44:38 +00:00
Brian Brazil
b97f4cf48c Add metrics for rule group interval and last duration. 2017-12-04 11:44:38 +00:00