Add an explanation for the quantile aggregation operator

Sadly, just linking to the Histogram best practice document, as done
for `histogram_quantile`, would be confusing here. That document only
deals with quantiles in the context of Histograms and Summaries, which
is very different from the context of the `quantile` aggregator and
the `quantile_over_time` function. That difference is already a source
of a lot of confusion.

Thus, I think the least bad solution is to add a short explanation in
this section directly. There isn't even a good resource on the
internet we can link to. A lot of statisticians use φ-quantiles, but
they don't have a generally accepted name for them.

I have added the explanation after the other detailed explanations of
`count_values`, `topk` and `bottomk`. I think that fits quite nicely
into the flow.

Signed-off-by: beorn7 <beorn@grafana.com>
beorn7 2020-07-06 17:25:55 +02:00
parent ad7da8fd35
commit cf698f71e5

@@ -219,13 +219,18 @@ identical between all elements of the vector.
`count_values` outputs one time series per unique sample value. Each series has
an additional label. The name of that label is given by the aggregation
parameter, and the label value is the unique sample value. The value of each
time series is the number of times that sample value was present.
`topk` and `bottomk` are different from other aggregators in that a subset of
the input samples, including the original labels, are returned in the result
vector. `by` and `without` are only used to bucket the input vector.
`quantile` calculates the φ-quantile, the value that ranks at number φ*N among
the N metric values of the dimensions aggregated over. φ is provided as the
aggregation parameter. For example, `quantile(0.5, ...)` calculates the median,
`quantile(0.95, ...)` the 95th percentile.
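
To make the ranking idea concrete, here is a minimal Python sketch of a
φ-quantile over a list of sample values. It is an illustration only, not
the implementation behind the `quantile` aggregator. The function name
`phi_quantile`, the zero-based rank φ·(N−1), the linear interpolation
between neighboring ranks, and the handling of an empty input or an
out-of-range φ are all assumptions made for this example.

```python
import math


def phi_quantile(phi, values):
    """Illustrative φ-quantile (hypothetical helper, not a Prometheus API).

    Assumptions of this sketch: the rank is phi * (N - 1) on the sorted
    values, and a fractional rank is resolved by linear interpolation
    between the two neighboring values.
    """
    # Assumed edge-case handling: no data or NaN φ yields NaN.
    if not values or math.isnan(phi):
        return math.nan
    # Assumed edge-case handling: φ outside [0, 1] maps to -Inf / +Inf.
    if phi < 0:
        return -math.inf
    if phi > 1:
        return math.inf

    ordered = sorted(values)
    rank = phi * (len(ordered) - 1)           # fractional, zero-based rank
    lower = math.floor(rank)                  # index at or below the rank
    upper = min(len(ordered) - 1, lower + 1)  # next index, clamped to the end
    weight = rank - lower                     # distance from the lower index
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


samples = [10.0, 20.0, 30.0, 40.0, 50.0]
median = phi_quantile(0.5, samples)   # middle value of the sorted samples
p95 = phi_quantile(0.95, samples)     # interpolated between 40.0 and 50.0
```

With these five samples, the median is 30.0 and the 0.95-quantile lands
at about 48, between the two largest values. Other interpolation
conventions exist and would give slightly different results for
fractional ranks.
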
Example:
If the metric `http_requests_total` had time series that fan out by