prometheus

Commit Graph

Author	SHA1	Message	Date
beorn7	a075900f9a	Merge branch 'beorn7/persistence' into beorn7/ingestion-tweaks	2015-03-18 19:09:31 +01:00
Julius Volz	b2651027fc	Fix special value handling in division and modulo. This fixes https://github.com/prometheus/prometheus/issues/597	2015-03-16 14:23:40 +01:00
beorn7	be11cb2b07	Remove the sample ingestion channel. The one central sample ingestion channel has caused a variety of trouble. This commit removes it. Targets and rule evaluation call an Append method directly now. To incorporate multiple storage backends (like OpenTSDB), storage.Tee forks the Append into two different appenders. Note that the tsdb queue manager had its own queue anyway. It was a queue after a queue... Much queue, so overhead... Targets have their own little buffer (implemented as a channel) to avoid stalling during an http scrape. But a new scrape will only be started once the old one is fully ingested. The contraption of three pipelined ingesters was removed. A Target is an ingester itself now. Despite more logic in Target, things should be less confusing now. Also, remove lint and vet warnings in ast.go.	2015-03-15 14:08:22 +01:00
beorn7	9e85ab0eef	Apply the new signature/fingerprinting functions from client_golang. This requires the new version of client_golang (vendoring will follow in the next commit), which changes the fingerprinting for clientmodel.Metric.	2015-03-03 18:34:01 +01:00
Fabian Reinartz	6f754073d5	Add OR operation and vector matching options. This commit implements the OR operation between two vectors. Vector matching using the ON clause is added to limit the set of labels that define a match between two elements. Group modifiers (GROUP_LEFT/GROUP_RIGHT) to request many-to-one matching are added.	2015-03-03 11:35:10 +01:00
Julius Volz	42601acfde	Replace labelsToKey() with metric Fingerprint (fixes grouping bug).	2015-02-21 17:45:47 +01:00
Julius Volz	7fefccd929	Write() directly into hash and use model.SeparatorByte.	2015-02-21 17:19:13 +01:00
Julius Volz	645cf57bed	Fix aggregation grouping key calculation.	2015-02-21 14:05:50 +01:00
Julius Volz	72d7b325a1	Implement offset operator. This allows changing the time offset for individual instant and range vectors in a query. For example, this returns the value of `foo` 5 minutes in the past relative to the current query evaluation time: foo offset 5m Note that the `offset` modifier always needs to follow the selector immediately. I.e. the following would be correct: sum(foo offset 5m) // GOOD. While the following would be incorrect: sum(foo) offset 5m // INVALID. The same works for range vectors. This returns the 5-minutes-rate that `foo` had a week ago: rate(foo[5m] offset 1w) This change touches the following components: * Lexer/parser: additions to correctly parse the new `offset`/`OFFSET` keyword. * AST: vector and matrix nodes now have an additional `offset` field. This is used during their evaluation to adjust query and result times appropriately. * Query analyzer: now works on separate sets of ranges and instants per offset. Isolating different offsets from each other completely in this way keeps the preloading code relatively simple. No storage engine changes were needed by this change. The rules tests have been changed to not probe the internal implementation details of the query analyzer anymore (how many instants and ranges have been preloaded). This would also become too cumbersome to test with the new model, and measuring the result of the query should be sufficient. This fixes https://github.com/prometheus/prometheus/issues/529 This fixed https://github.com/prometheus/promdash/issues/201	2015-02-18 02:41:27 +01:00
Brian Brazil	60271d58bf	Change the 2nd argument of round to toNearest. This is more useful if you want to get a multiple of 2 or 5, while still working for .001.	2015-02-05 16:13:40 +00:00
Fabian Reinartz	fa1e90003b	Query timeout added. This is related to #454. Queries now timeout after a duration set by the -query.timeout flag. The TotalEvalTimer is now started/stopped inside any of the ast.Eval* functions.	2015-02-03 08:04:27 +01:00
Julius Volz	d4374a9265	More efficient JSON query result format. This depends on https://github.com/prometheus/client_golang/pull/51. For vectors, the result format looks like this: ```json { "version": 1, "type" : "vector", "value" : [ { "timestamp" : 1421765411.045, "value" : "65.475000", "metric" : { "quantile" : "0.5", "instance" : "http://localhost:9090/metrics", "job" : "prometheus", "__name__" : "http_request_duration_microseconds", "handler" : "/static/", "method" : "get", "code" : "304" } }, { "timestamp" : 1421765411.045, "value" : "5826.339000", "metric" : { "quantile" : "0.9", "instance" : "http://localhost:9090/metrics", "job" : "prometheus", "__name__" : "http_request_duration_microseconds", "handler" : "prometheus", "method" : "get", "code" : "200" } }, /* ... */ ] } ``` For matrices, it looks like this: ```json { "version": 1, "type" : "matrix", "value" : [ { "metric" : { "quantile" : "0.99", "instance" : "http://localhost:9090/metrics", "job" : "prometheus", "__name__" : "http_request_duration_microseconds", "handler" : "/static/", "method" : "get", "code" : "200" }, "values" : [ [ 1421765547.659, "29162.953000" ], [ 1421765548.659, "29162.953000" ], [ 1421765549.659, "29162.953000" ], /* ... */ ] } ] } ```	2015-01-26 13:06:22 +01:00
Bjoern Rabenstein	5859b74f1b	Clean up license issues. - Move CONTRIBUTORS.md to the more common AUTHORS. - Added the required NOTICE file. - Changed "Prometheus Team" to "The Prometheus Authors". - Reverted the erroneous changes to the Apache License.	2015-01-21 20:07:45 +01:00
Julius Volz	cc27fb8aab	Rename remaining all-caps constants in AST layer. Change-Id: Ibe97e30981969056ffcdb89e63c1468ea1ffa140	2014-12-25 01:30:47 +01:00
Julius Volz	2ade9d40cf	Clarify why we need int constants for expression types. Change-Id: I053fc5d32c118dbdb204dc8193337f981aff796e	2014-12-25 00:45:30 +01:00
Julius Volz	00a2a93a05	Add regression tests for metrics mutations in AST. It turned out in the end, that only drop_common_metrics() produced any erroneous output in the old system. The second expression in the test ("sum(testmetric) keeping_extra") already worked in the old code, but why not keep it in... The way to test ranged evaluations is a bit clumsy so far, so I want to build a nicer test framework in the end, where all the test cases can be specified as text files which specify desired inputs, outputs, query step widths, etc. Change-Id: I821859789e69b8232bededf670a1b76e9e8c8ca4	2014-12-12 20:34:55 +01:00
Julius Volz	c9618d11e8	Introduce copy-on-write for metrics in AST. This depends on changes in: https://github.com/prometheus/client_golang/tree/cow-metrics. Change-Id: I80b94833a60ddf954c7cd92fd2cfbebd8dd46142	2014-12-12 20:34:55 +01:00
Bjoern Rabenstein	7d11019aa2	Squash a few trivial TODOs. - Delete unneeded file view_adapter.go. - Assessed that we still need the fingerprints in nodes (to create iterators). - Turned numMemChunkDescs into a metric. Change-Id: I29be963c795a075ec00c095f76bf26405535609d	2014-11-27 18:26:06 +01:00
Julius Volz	3d47f94149	Drop metric names after transformations. After many transformations, it doesn't make sense to keep the metric names, since the result of the transformation is no longer that metric. This drops the metric name after such transformations and makes the web UI deal well with missing metric names. This depends on the current branch on the following things: - prometheus/client_golang needs to be at `e237cf15c6` in branch "julius/int-fingerprints" (to be merged with new storage) - prometheus/promdash needs to be at `dd7691c9c2` Change-Id: Ib3c8cad8d647d9854e8c653c424b8c235ccc231d	2014-11-25 17:13:04 +01:00
Bjoern Rabenstein	14bda4180c	Changes after pair code review. Change-Id: Ib72d40f8e9027818cfbbd32a7a7201eebda07455	2014-11-25 17:12:59 +01:00
Bjoern Rabenstein	f5f9f3514a	Major code cleanup. - Make it go-vet and golint clean. - Add comments, TODOs, etc. Change-Id: If1392d96f3d5b4cdde597b10c8dff1769fcfabe2	2014-11-25 17:02:53 +01:00
Julius Volz	e7ed39c9a6	Initial experimental snapshot of next-gen storage. Change-Id: Ifb8709960dbedd1d9f5efd88cdd359ee9fa9d26d	2014-11-25 17:02:00 +01:00
Julius Volz	01f652cb4c	Separate storage implementation from interfaces. This was initially motivated by wanting to distribute the rule checker tool under `tools/rule_checker`. However, this was not possible without also distributing the LevelDB dynamic libraries because the tool transitively depended on Levigo: rule checker -> query layer -> tiered storage layer -> leveldb This change separates external storage interfaces from the implementation (tiered storage, leveldb storage, memory storage) by putting them into separate packages: - storage/metric: public, implementation-agnostic interfaces - storage/metric/tiered: tiered storage implementation, including memory and LevelDB storage. I initially also considered splitting up the implementation into separate packages for tiered storage, memory storage, and LevelDB storage, but these are currently so intertwined that it would be another major project in itself. The query layers and most other parts of Prometheus now have no notion of the storage implementation anymore and just use whatever implementation they get passed in via interfaces. The rule_checker is now a static binary :) Change-Id: I793bbf631a8648ca31790e7e772ecf9c2b92f7a0	2014-04-16 13:30:19 +02:00
Julius Volz	d411a7d810	Allow reversing vector and scalar arguments in binops. This allows putting a scalar as the first argument of a binary operator in which the second argument is a vector: <scalar> <binop> <vector> For example, 1 / http_requests_total ...will output a vector in which every sample value is 1 divided by the respective input vector element. This even works for filter binary operators now: 1 == http_requests_total Returns a vector with all values set to 1 for every element in http_requests_total whose initial value was 1. Note: For filter binary operators, the resulting values are always taken from the left-hand-side of the operation, no matter whether the scalar or the vector argument is the left-hand-side. That is, 1 != http_requests_total ...will set all result vector sample values to 1, although these are exactly the sample elements that were != 1 in the input vector. If you want to just filter elements without changing their sample values, you still need to do: http_requests_total != 1 The new filter form is a bit exotic, and so probably won't be used often. But it was easier to implement it than disallow it completely or change its behavior. Change-Id: Idd083f2bd3a1219ba1560cf4ace42f5b82e797a5	2014-04-08 17:16:18 +02:00
Julius Volz	c7c0b33d0b	Add regex-matching support for labels. There are four label-matching ops for selecting timeseries now: - Equal: = - NotEqual: != - RegexMatch: =~ - RegexNoMatch: !~ Instead of looking up labels by a simple clientmodel.LabelSet (basically an equals op for every key/value pair in the set), timeseries fingerprint selection is now done via a list of metric.LabelMatchers. Change-Id: I510a83f761198e80946146770ebb64e4abc3bb96	2014-04-01 14:24:53 +02:00
Julius Volz	86fc13a52e	Convert metric.Values to slice of values. The initial impetus for this was that it made unmarshalling sample values much faster. Other relevant benchmark changes in ns/op: Benchmark old new speedup ================================================================== BenchmarkMarshal 179170 127996 1.4x BenchmarkUnmarshal 404984 132186 3.1x BenchmarkMemoryGetValueAtTime 57801 50050 1.2x BenchmarkMemoryGetBoundaryValues 64496 53194 1.2x BenchmarkMemoryGetRangeValues 66585 54065 1.2x BenchmarkStreamAdd 45.0 75.3 0.6x BenchmarkAppendSample1 1157 1587 0.7x BenchmarkAppendSample10 4090 4284 0.95x BenchmarkAppendSample100 45660 44066 1.0x BenchmarkAppendSample1000 579084 582380 1.0x BenchmarkMemoryAppendRepeatingValues 22796594 22005502 1.0x Overall, this gives us good speedups in the areas where they matter most: decoding values from disk and accessing the memory storage (which is also used for views). Some of the smaller append examples take minimally longer, but the cost seems to get amortized over larger appends, so I'm not worried about these. Also, we're currently not bottlenecked on the write path and have plenty of other optimizations available in that area if it becomes necessary. Memory allocations during appends don't change measurably at all. Change-Id: I7dc7394edea09506976765551f35b138518db9e8	2014-03-11 18:23:37 +01:00
Julius Volz	3f226c9724	Rename {Scalar,Vector}Literal to {Scalar,Vector}Selector. Change-Id: Ie92301f47f5f49f30b3a62c365e377108982b080	2014-02-22 22:33:42 +01:00
Bjoern Rabenstein	fd63500ed3	Make rules/ast golint clean. Mostly, that means adding compliant doc strings to exported items. Also, remove 'go vet' warnings where possible. (Some unfortunately cannot be avoided, arguably due to bugs in 'go vet'.) Change-Id: I2827b6dd317492864c1383c3de1ea9eac5a219bb	2014-02-14 15:01:39 +01:00
Julius Volz	0378c2ca1f	Nonexistent labels in BY-clauses shouldn't propagate to result. This fixes bug 2. of https://github.com/prometheus/prometheus/issues/374 Change-Id: Ia4a13153616bafce5bf10597966b071434422d09	2014-01-24 16:05:30 +01:00
Julius Volz	6dc36d0c3e	Don't keep extra labels in aggregations by default. MIN/MAX/SUM/AVG/COUNT aggregations will now by default drop all labels that are not specifically part of a BY-clause, even if a label value is the same within all timeseries of an aggregation group. The old behavior of keeping extra labels may still be switched on by adding KEEPING_EXTRA to the end of an aggregation statement: sum(http_requests) by (job, method) keeping_extra I'm open to better syntax/naming suggestions. Change-Id: I21d3fe7af9e98552ce3dffa3ce7c0a4ba4c0b4a4	2013-12-16 12:53:10 +01:00
Julius Volz	740d448983	Use custom timestamp type for sample timestamps and related code. So far we've been using Go's native time.Time for anything related to sample timestamps. Since the range of time.Time is much bigger than what we need, this has created two problems: - there could be time.Time values which were out of the range/precision of the time type that we persist to disk, therefore causing incorrectly ordered keys. One bug caused by this was: https://github.com/prometheus/prometheus/issues/367 It would be good to use a timestamp type that's more closely aligned with what the underlying storage supports. - sizeof(time.Time) is 192, while Prometheus should be ok with a single 64-bit Unix timestamp (possibly even a 32-bit one). Since we store samples in large numbers, this seriously affects memory usage. Furthermore, copying/working with the data will be faster if it's smaller. MEMORY USAGE RESULTS Initial memory usage comparisons for a running Prometheus with 1 timeseries and 100,000 samples show roughly a 13% decrease in total (VIRT) memory usage. In my tests, this advantage for some reason decreased a bit the more samples the timeseries had (to 5-7% for millions of samples). This I can't fully explain, but perhaps garbage collection issues were involved. WHEN TO USE THE NEW TIMESTAMP TYPE The new clientmodel.Timestamp type should be used whenever time calculations are either directly or indirectly related to sample timestamps. For example: - the timestamp of a sample itself - all kinds of watermarks - anything that may become or is compared to a sample timestamp (like the timestamp passed into Target.Scrape()). When to still use time.Time: - for measuring durations/times not related to sample timestamps, like duration telemetry exporting, timers that indicate how frequently to execute some action, etc. NOTE ON OPERATOR OPTIMIZATION TESTS We don't use operator optimization code anymore, but it still lives in the code as dead code. It still has tests, but I couldn't get all of them to pass with the new timestamp format. I commented out the failing cases for now, but we should probably remove the dead code soon. I just didn't want to do that in the same change as this. Change-Id: I821787414b0debe85c9fffaeb57abd453727af0f	2013-12-03 09:11:28 +01:00
Julius Volz	0003027dce	Add needed trailing spaces in logs.	2013-08-12 18:22:48 +02:00
Julius Volz	aa5d251f8d	Use github.com/golang/glog for all logging.	2013-08-12 17:54:36 +02:00
Julius Volz	81f0b85013	Return [] instead of null for empty result vectors.	2013-07-25 12:16:32 +02:00
Matt T. Proud	30b1cf80b5	WIP - Snapshot of Moving to Client Model.	2013-06-25 15:52:42 +02:00
Julius Volz	74cb676537	Implement Stringer interface for rules and all their children.	2013-06-07 15:54:32 +02:00
Julius Volz	51689d965d	Add debug timers to instant and range queries. This adds timers around several query-relevant code blocks. For now, the query timer stats are only logged for queries initiated through the UI. In other cases (rule evaluations), the stats are simply thrown away. My hope is that this helps us understand where queries spend time, especially in cases where they sometimes hang for unusual amounts of time.	2013-06-05 18:32:54 +02:00
Matt T. Proud	c10780c966	Introduce telemetry for rule evaluator durations. This commit adds telemetry for the Prometheus expression rule evaluator, which will enable meta-Prometheus monitoring of customers to ensure that no instance is falling behind in answering routine queries. A few other sundry simplifications are introduced, too.	2013-05-23 21:29:27 +02:00
Matt T. Proud	8f4c7ece92	Destroy naked returns in half of corpus. The use of naked return values is frowned upon. This is the first of two bulk updates to remove them.	2013-05-16 10:53:25 +03:00
Julius Volz	0877680761	Implement a COUNT ... BY aggregation operator. This also removes the now obsolete scalar count() function and corrects the expressions test naming (broken in `2202cd71c9 (L6R59)`) so that the expression tests will actually run.	2013-05-08 16:35:16 +02:00
Julius Volz	56324d8ce2	Make AST query storage non-global.	2013-05-07 13:15:10 +02:00
Julius Volz	99dcbe0f94	Integrate memory and disk layers in view rendering.	2013-04-19 16:01:27 +02:00
Julius Volz	c4d0969c00	Propagate more errors during rule evaluation.	2013-04-09 13:47:20 +02:00
Julius Volz	ec413459fa	Depointerize Matrix/Vector types as well as time.Time arguments.	2013-03-28 18:07:12 +01:00
Matt T. Proud	c53a72a894	Test data for the curator.	2013-03-27 18:13:43 +01:00
Julius Volz	b836066c71	Eliminate need to get fingerprints during query execution time.	2013-03-27 14:42:03 +01:00
Julius Volz	2b8f0b2cc7	Constantize metric name label name.	2013-03-26 16:20:23 +01:00
Julius Volz	3880a86c9c	In case of empty query results, return an empty matrix.	2013-03-25 12:14:48 +01:00
Julius Volz	8e4c5b0cea	Use AST query analyzer and views with tiered storage.	2013-03-21 18:16:52 +01:00
Julius Volz	16d9dcd6a8	Add copyright notices to all remaining files.	2013-02-07 11:49:04 +01:00

60 Commits