And use the new method to call to compact Histograms during
parsing. This happens for both `Histogram` and `FloatHistogram`. In
this way, if targets decide to optimize the exposition size by merging
spans with empty buckets in between, we still get a normalized
results. It will also normalize away any valid but weird
representations like empty spans, spans with offset zero, and empty
buckets at the start or end of a span.
The implementation seemed easy at first as it just turns the
`compactBuckets` helper into a generic function (which now got its own
file). However, the integer Histograms have delta buckets instead of
absolute buckets, which had to be treated specially in the generic
`compactBuckets` function. To make sure it works, I have added plenty
of explicit tests for `Histogram` in addition to the `FloatHistogram`
tests.
I have also updated the doc comment for the `Compact` method.
Based on the insights now expressed in the doc comment, compacting
with a maxEmptyBuckets > 0 is rarely useful. Therefore, this commit
also sets the value to 0 in the two cases we were using 3 so far. We
might still want to reconsider, so I don't want to remove the
maxEmptyBuckets parameter right now.
Signed-off-by: beorn7 <beorn@grafana.com>
This follow a simple function-based approach to access the count and
sum fields of a native Histogram. It might be more elegant to
implement “accessors” via the dot operator, as considered in the
brainstorming doc [1]. However, that would require the introduction of
a whole new concept in PromQL. For the PoC, we should be fine with the
function-based approch. Even the obvious inefficiencies (rate'ing a
whole histogram twice when we only want to rate each the count and the
sum once) could be optimized behind the scenes.
Note that the function-based approach elegantly solves the problem of
detecting counter resets in the sum of observations in the case of
negative observations. (Since the whole native Histogram is rate'd,
the counter reset is detected for the Histogram as a whole.)
We will decide later if an “accessor” approach is really needed. It
would change the example expression for average duration in
functions.md from
histogram_sum(rate(http_request_duration_seconds[10m]))
/
histogram_count(rate(http_request_duration_seconds[10m]))
to
rate(http_request_duration_seconds.sum[10m])
/
rate(http_request_duration_seconds.count[10m])
[1]: https://docs.google.com/document/d/1ch6ru8GKg03N02jRjYriurt-CZqUVY09evPg6yKTA1s/edit
Signed-off-by: beorn7 <beorn@grafana.com>
* Labels: create signature with/without labels
Instead of creating a new Labels slice then converting to signature,
go directly to the signature and save time.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Labels: refactor Builder tests
Have one test with a range of cases, and have them check the final
output rather than checking the internal structure of the Builder.
Also add a couple of cases where the value is "", which should be
interpreted as 'delete'.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Labels: add 'Keep' function to Builder
This lets us replace `Labels.WithLabels` with the more general `Builder`.
In `engine.resultMetric()` we can call `Keep()` instead of checking
and calling `Del()`.
Avoid calling `Sort()` in `Builder.Labels()` if we didn't add anything,
so that `Keep()` has the same performance as `WithLabels()`.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
For conventional histograms, we need to gather all the individual
bucket timeseries at a data point to do the quantile calculation. The
code so far mirrored this behavior for the new native
histograms. However, since a single data point contains all the
buckets alreade, that's actually not needed. This PR simplifies the
code while still detecting a mix of conventional and native
histograms.
The weird signature calculation for the conventional histograms is
getting even weirder because of that. If this PR turns out to do the
right thing, I will implement a proper fix for the signature
calculation upstream.
Signed-off-by: beorn7 <beorn@grafana.com>
* MergeFloatBucketIterator for []FloatBucketIterator
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* histogram_quantile for histograms
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix histogram_quantile
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Unit test and enhancements
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Iterators to iterate buckets in reverse and all buckets together including zero bucket
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Consider all buckets for histogram_quantile and fix the implementation
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Remove unneeded code
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix lint
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Add test case to showcase the problem in #9590
Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>
* Don't unwrap ParenExpr in newStepInvariantExpr
Fixes#9590
Signed-off-by: Thomas Jackson <jacksontj.89@gmail.com>
This creates a new `model` directory and moves all data-model related
packages over there:
exemplar labels relabel rulefmt textparse timestamp value
All the others are more or less utilities and have been moved to `util`:
gate logging modetimevfs pool runtime
Signed-off-by: beorn7 <beorn@grafana.com>
This commit adds `@ <timestamp>` modifier as per this design doc: https://docs.google.com/document/d/1uSbD3T2beM-iX4-Hp7V074bzBRiRNlqUdcWP6JTDQSs/edit.
An example query:
```
rate(process_cpu_seconds_total[1m])
and
topk(7, rate(process_cpu_seconds_total[1h] @ 1234))
```
which ranks based on last 1h rate and w.r.t. unix timestamp 1234 but actually plots the 1m rate.
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
* Use go1.14 new hash/maphash to hash both RHS and LHS instead of XOR'ing
which has been resulting in hash collisions.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Refactor engine labelset signature generation, just use labels.Labels
instead of hashes.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address review comments; function comments + store result of
lhs.String+rhs.String as key.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Replace all signatureFunc usage with signatureFuncString.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Make optimizations to labels String function and generation of rhs+lhs
as string in resultMetric.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Use separate string functions that don't use strconv just for engine
maps.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Use a byte invalid separator instead of quoting and have a buffer
attached to EvalNodeHelper instead of using a global pool in the labels
package.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address review comments.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Address more review comments, labels has a function that now builds a
byte slice without turning it into a string.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Use two different non-ascii hex codes as byte separators between labels
and between sets of labels when building bytes of a Labels struct.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* We only need the 2nd byte invalid sep. at the beginning of a
labels.Bytes
Signed-off-by: Callum Styan <callumstyan@gmail.com>
* Move check for empty VectorSelector to typeChecking
* Move check for twice set metric name to typeChecking
* Make child of MatrixSelector a general Node
* rename checkType to checkAST
* Rename fail to addParseErr
* Remove trailing whitespace
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
* Cleanup PromQL functions
The engine ensures, for Matrix functions, that functions are called with exactly one series at the time.
Therefore a lot of code can be inlined and we can directly assume the first element of the arguments exists and contains all the samples needed.
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>