Commit Graph

132 Commits

Author SHA1 Message Date
Arve Knudsen
7cbf749096
Upgrade to github.com/oklog/ulid/v2 (#16168)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2025-03-05 16:03:25 +01:00
Matthieu MOREL
c7d4b53ec1 chore: enable unused-parameter from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-02-19 19:50:28 +01:00
Oleg Zaytsev
c8359fcd6b
Fix bug in lbl!~".+" shortcut (#15684)
We were appending to the wrong slice, so instead of removing values, we
were adding them.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-12-17 17:34:24 +01:00
Oleg Zaytsev
9ad93ba8df
Optimize l=~".+" matcher (#15474)
Since dot is matching newline now, `l=~".+"` is "any non empty label
value", and #14144 added a specific method in the index for that so we
don't need to run the matcher on each one of the label values.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-11-27 12:33:20 +01:00
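
A minimal sketch of why `l=~".+"` reduces to "any non-empty label value" once dot matches newline. It uses Go's standard `regexp` package; the helper name `matchesFully` and the way the pattern is anchored here are assumptions for illustration, not the matcher code from the commit.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesFully reports whether pattern matches the whole of value.
// dotAll toggles the `s` flag, under which `.` also matches '\n'.
// (Hypothetical helper, for illustration only.)
func matchesFully(pattern, value string, dotAll bool) bool {
	flags := ""
	if dotAll {
		flags = "s"
	}
	re := regexp.MustCompile("^(?" + flags + ":" + pattern + ")$")
	return re.MatchString(value)
}

func main() {
	for _, v := range []string{"", "a", "a\nb"} {
		fmt.Printf("%q dotAll=%v plain=%v\n",
			v, matchesFully(".+", v, true), matchesFully(".+", v, false))
	}
}
```

With the `s` flag set, only the empty string fails to match, so the regexp never needs to be run against each value.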
Arve Knudsen
06d54fcc6c
[PERF] TSDB: Optimize inverse matching (#14144)
Simple follow-up to #13620. Modify `tsdb.PostingsForMatchers` to use the optimized tsdb.IndexReader.PostingsForLabelMatching method also for inverse matching.

Introduce method `PostingsForAllLabelValues`, to avoid changing the existing method.

The performance is much improved for a subset of the cases; there are up to
~60% CPU gains and ~12.5% reduction in memory usage. 

Remove `TestReader_InversePostingsForMatcherHonorsContextCancel` since
`inversePostingsForMatcher` only passes `ctx` to `IndexReader` implementations now.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-11-19 15:49:01 +00:00
Ben Ye
f9057544cb
Fix AllPostings added twice (#13893)
* handle all postings added twice

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2024-11-10 18:17:21 +01:00
György Krajcsovits
a4083f14e8 Fix populateWithDelChunkSeriesIterator corrupting chunk meta
When handling recoded histogram chunks the min time of the chunk is
updated by mistake. It should only update when the chunk is completely new.
Otherwise the ongoing chunk's meta will be later than the previously
written samples in it.

Same bug as https://github.com/prometheus/prometheus/pull/14629

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-10-18 10:34:22 +02:00
Mario Fernandez
5814920601
Fix: optimize .* regexp performance
Shortcut for `.*` matches newlines as well.
Add preamble change ^(?s:
Add test
dotAll flag for all regexes
Add and fix regex tests

Signed-off-by: Mario Fernandez <mariofer@redhat.com>
2024-09-17 12:18:31 +02:00
Arve Knudsen
b0aba26ed5 tsdb: Fix ValNone typo in comment
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-08-23 08:20:20 +02:00
Bryan Boreham
e04d137649 [PERF] TSDB: Query head and ooo-head together
Add `HeadAndOOOQuerier` which iterates just once over series, then
where necessary merges chunks from in-order and out-of-order lists.

Add a ChunkQuerier for in-order and ooo together

Add copy-last-chunk behaviour to HeadAndOOOChunkReader

Out-of-order chunk IDs are distinguished from in-order by setting bit 23.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 11:19:02 +01:00
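
A minimal sketch of tagging an ID with a flag bit, as the message above describes for out-of-order chunk IDs. It assumes the ID is held in a `uint64` and that "bit 23" counts from the least significant bit as bit 0; the names `oooBit`, `markOOO`, `isOOO` and `clearOOO` are hypothetical.

```go
package main

import "fmt"

// oooBit is a hypothetical constant for the flag described above:
// bit 23, counting from the least significant bit as bit 0.
const oooBit uint64 = 1 << 23

// markOOO returns id with the out-of-order flag set.
func markOOO(id uint64) uint64 { return id | oooBit }

// isOOO reports whether the out-of-order flag is set on id.
func isOOO(id uint64) bool { return id&oooBit != 0 }

// clearOOO returns id with the out-of-order flag removed.
func clearOOO(id uint64) uint64 { return id &^ oooBit }

func main() {
	id := markOOO(5)
	fmt.Println(id, isOOO(id), clearOOO(id))
}
```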
Bryan Boreham
da31da3ea6 Refactor: extract selectSeriesSet and selectChunkSeriesSet
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-08-14 11:19:02 +01:00
🌲 Harry 🌊 John 🏔
d5f6887294 Pass limit param as hint to storage.Querier
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
2024-06-20 09:47:38 -07:00
Oleg Zaytsev
64a9abb8be
Change LabelValuesFor() to accept index.Postings (#14280)
The only call we have to LabelValuesFor() has an index.Postings, and we
expand it to pass to this method, which will iterate over the values.

That's a waste of resources: we can iterate on the index.Postings
directly.

If there's any downstream implementation that has a slice of series,
they can always do an index.ListPostings from them: doing that is
cheaper than expanding an abstract index.Postings.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-11 15:36:46 +02:00
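
A rough sketch of the idea in the commit above: consume a postings iterator directly instead of expanding it into a slice first. The `postings` interface here is a trimmed, hypothetical stand-in (the real `index.Postings` has more methods), and the in-memory `series` map stands in for the index.

```go
package main

import (
	"fmt"
	"sort"
)

// postings is a trimmed, hypothetical iterator over series references.
type postings interface {
	Next() bool
	At() uint64
}

// listPostings wraps a slice of references, for callers that already hold one.
type listPostings struct {
	refs []uint64
	cur  uint64
}

func (l *listPostings) Next() bool {
	if len(l.refs) == 0 {
		return false
	}
	l.cur, l.refs = l.refs[0], l.refs[1:]
	return true
}

func (l *listPostings) At() uint64 { return l.cur }

// labelValuesFor walks p once and collects the distinct values of label name.
func labelValuesFor(p postings, name string, series map[uint64]map[string]string) []string {
	seen := map[string]struct{}{}
	var values []string
	for p.Next() {
		v, ok := series[p.At()][name]
		if !ok {
			continue
		}
		if _, dup := seen[v]; dup {
			continue
		}
		seen[v] = struct{}{}
		values = append(values, v)
	}
	sort.Strings(values)
	return values
}

func main() {
	series := map[uint64]map[string]string{
		1: {"job": "api"},
		2: {"job": "db"},
		3: {"job": "api"},
	}
	p := &listPostings{refs: []uint64{1, 2, 3}}
	fmt.Println(labelValuesFor(p, "job", series))
}
```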
Ben Ye
6683895620
optimize regex matching for empty label values in posting match (#14075)
Also update tests.

Signed-off-by: Ben Ye <benye@amazon.com>
2024-05-29 16:03:33 +01:00
George Krajcsovits
fdaafdb041
tsdb: check for context cancel before regex matching postings (#14096)
* tsdb: check for context cancel before regex matching postings

Regex matching can be heavy if the regex takes a lot of cycles to
evaluate and we can get stuck evaluating postings for a long time
without this fix. The constant checkContextEveryNIterations=100
may be changed later.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-05-15 06:26:19 +02:00
Arve Knudsen
5c4310aa37
[ENHANCEMENT] TSDB: Optimize querying with regexp matchers
Add method `PostingsForLabelMatching` to `tsdb.IndexReader`, to obtain postings for labels with a certain name and values accepted by a provided callback, and use it from `tsdb.PostingsForMatchers`.
The intention is to optimize regexp matcher paths, especially not having to load all label values before matching on them.

Plus tests, and refactor some `tsdb/index.Reader` methods.

Benchmarking shows memory reduction up to ~100%, and speedup of up to ~50%.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2024-05-09 10:55:30 +01:00
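
A toy sketch of callback-based matching as described above: the index walks its own label values and asks the callback which ones to accept, so the caller never has to load the full value list. The `toyIndex` type and the lowercase method are hypothetical and much simpler than `tsdb.IndexReader`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// toyIndex maps label name -> label value -> series references.
type toyIndex map[string]map[string][]uint64

// postingsForLabelMatching returns the sorted, de-duplicated references of all
// series whose value for label name is accepted by match.
func (ix toyIndex) postingsForLabelMatching(name string, match func(value string) bool) []uint64 {
	seen := map[uint64]struct{}{}
	var refs []uint64
	for value, series := range ix[name] {
		if !match(value) {
			continue
		}
		for _, ref := range series {
			if _, dup := seen[ref]; !dup {
				seen[ref] = struct{}{}
				refs = append(refs, ref)
			}
		}
	}
	sort.Slice(refs, func(i, j int) bool { return refs[i] < refs[j] })
	return refs
}

func main() {
	ix := toyIndex{"job": {"api": {1, 3}, "api-canary": {2}, "db": {4}}}
	refs := ix.postingsForLabelMatching("job", func(v string) bool {
		return strings.HasPrefix(v, "api")
	})
	fmt.Println(refs)
}
```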
Alan Protasio
d15869af32
Avoid creating new slices for labels values on postings for matchers (#13958)
* Avoid creating new slices for labels values on postings for matchers

Signed-off-by: alanprot <alanprot@gmail.com>

* refactor

Signed-off-by: alanprot <alanprot@gmail.com>

---------

Signed-off-by: alanprot <alanprot@gmail.com>
2024-04-24 16:41:33 +02:00
Matthieu MOREL
6f595c6762
golangci-lint: enable whitespace linter (#13905)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-04-11 09:27:54 +01:00
Bryan Boreham
080d440bf8 Merge remote-tracking branch 'origin/main' into pr/13461
2024-03-25 12:14:26 +00:00
machine424
f477e0539a
Move from golang.org/x/exp/slices into slices now that we only support Go >= 1.21
Prevent adding back golang.org/x/exp/slices.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-02-28 14:54:53 +01:00
Marco Pracucci
501bc6419e
Add ShardedPostings() support to TSDB (#10421)
This PR is a reference implementation of the proposal described in #10420.

In addition to what is described in #10420, in this PR I've introduced labels.StableHash(). The idea is to offer a hashing function which doesn't change over time, and that's used by query sharding in order to get a stable behaviour over time. The implementation of labels.StableHash() is the hashing function used by Prometheus before stringlabels, and what's used by Grafana Mimir for query sharding (because it was built before stringlabels was a thing).

Follow up work
As mentioned in #10420, if this PR is accepted I'm also open to uploading another fundamental piece used by Grafana Mimir query sharding to accelerate the query execution: an optional, configurable and fast in-memory cache for the series hashes.

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-01-29 11:57:27 +00:00
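
A rough sketch of hash-based sharding in the spirit of the commit above: a series belongs to a shard when a stable hash of its labels, modulo the shard count, equals the shard index. FNV-1a from the standard library is used purely as a stand-in here; it is not the algorithm behind `labels.StableHash()`, and all names below are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

type label struct{ name, value string }

// hashLabels is a stand-in stable hash over a label set (FNV-1a, an assumption).
func hashLabels(lbls []label) uint64 {
	h := fnv.New64a()
	for _, l := range lbls {
		h.Write([]byte(l.name))
		h.Write([]byte{0xff}) // separator so ("ab","c") differs from ("a","bc")
		h.Write([]byte(l.value))
		h.Write([]byte{0xff})
	}
	return h.Sum64()
}

// shardedPostings keeps only the references whose labels hash into shardIndex.
func shardedPostings(refs []uint64, series map[uint64][]label, shardIndex, shardCount uint64) []uint64 {
	var out []uint64
	for _, ref := range refs {
		if hashLabels(series[ref])%shardCount == shardIndex {
			out = append(out, ref)
		}
	}
	return out
}

func main() {
	series := map[uint64][]label{
		1: {{"job", "api"}},
		2: {{"job", "db"}},
		3: {{"job", "cache"}},
	}
	refs := []uint64{1, 2, 3}
	for shard := uint64(0); shard < 2; shard++ {
		fmt.Println(shard, shardedPostings(refs, series, shard, 2))
	}
}
```

Because the hash depends only on the label set, every series lands in exactly one shard, and in the same shard on every query.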
Marco Pracucci
ec9cada56e
Remove unused isRegexMetaCharacter()
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-01-26 06:35:02 +01:00
Marco Pracucci
515890ec53
Use Matcher.SetMatches()
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-01-26 06:26:52 +01:00
Filip Petkovski
583f3e587c
Optimize histogram iterators (#13340)
Optimize histogram iterators

Histogram iterators allocate new objects in the AtHistogram and
AtFloatHistogram methods, which makes calculating rates over long
ranges expensive.

In #13215 we allowed an existing object to be reused
when converting an integer histogram to a float histogram. This commit follows
the same idea and allows injecting an existing object in the AtHistogram and
AtFloatHistogram methods. When the injected value is nil, iterators allocate
new histograms, otherwise they populate and return the injected object.

The commit also adds a CopyTo method to Histogram and FloatHistogram which
is used in the BufferedIterator to overwrite items in the ring instead of making
new copies.

Note that a specialized HPoint pool is needed for all of this to work 
(`matrixSelectorHPool`).

---------

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
2024-01-23 17:02:14 +01:00
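
A minimal sketch of the object-reuse pattern described above: the accessor takes an optional destination, allocates only when it is nil, and otherwise overwrites and returns the injected object. The `histogram` and `iterator` types are simplified, hypothetical stand-ins for the real ones.

```go
package main

import "fmt"

// histogram is a simplified stand-in for the real histogram types.
type histogram struct {
	Count   uint64
	Sum     float64
	Buckets []int64
}

// copyTo overwrites dst with h, reusing dst's bucket slice when it has capacity.
func (h *histogram) copyTo(dst *histogram) {
	dst.Count = h.Count
	dst.Sum = h.Sum
	dst.Buckets = append(dst.Buckets[:0], h.Buckets...)
}

type iterator struct{ cur histogram }

// atHistogram returns the current sample. When into is nil a new histogram
// is allocated; otherwise into is populated and returned.
func (it *iterator) atHistogram(into *histogram) *histogram {
	if into == nil {
		into = &histogram{}
	}
	it.cur.copyTo(into)
	return into
}

func main() {
	it := &iterator{cur: histogram{Count: 3, Sum: 1.5, Buckets: []int64{1, 2}}}
	reused := &histogram{Buckets: make([]int64, 0, 8)}
	got := it.atHistogram(reused)
	fmt.Println(got == reused, got.Count, got.Buckets)
}
```

A caller that scans a long range can pass the same destination on every step, which avoids one allocation per sample.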
Oleg Zaytsev
ed172a6667
Optimize label values with matchers by taking shortcuts (#13426)
Don't calculate postings beforehand: we may not need them. If all
matchers are for the requested label, we can just filter its values.

Also, if there are no values at all, no need to run any kind of
logic.

Also add more labelValuesWithMatchers benchmarks

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-01-23 11:40:21 +01:00
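
A rough sketch of the two shortcuts described above: return immediately when the label has no values, and filter the values directly with any matcher that targets the requested label, reporting whether other matchers remain that would still need postings. The `matcher` type and `filterLabelValues` are hypothetical simplifications.

```go
package main

import "fmt"

// matcher is a simplified stand-in for a label matcher.
type matcher struct {
	name  string
	match func(value string) bool
}

// filterLabelValues applies the matchers that target name directly to values.
// needPostings is true when some matcher targets a different label, in which
// case the caller would still have to consult postings.
func filterLabelValues(name string, values []string, ms []matcher) (out []string, needPostings bool) {
	if len(values) == 0 {
		return nil, false // nothing to filter, skip all further work
	}
	out = append([]string(nil), values...)
	for _, m := range ms {
		if m.name != name {
			needPostings = true
			continue
		}
		kept := out[:0] // filter in place
		for _, v := range out {
			if m.match(v) {
				kept = append(kept, v)
			}
		}
		out = kept
	}
	return out, needPostings
}

func main() {
	ms := []matcher{{name: "job", match: func(v string) bool { return v != "db" }}}
	fmt.Println(filterLabelValues("job", []string{"api", "db", "web"}, ms))
}
```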
Matthieu MOREL
8f6cf3aabb tsdb: use Go standard errors
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-12-11 12:18:54 +00:00
Fiona Liao
ce126230e7
Fix chunks iterator bug when tombstone covers a whole chunk (#13209)
When no samples are returned in a chunk because all the samples have
been deleted, the chunk iterator then stops without iterating through
any remaining chunks.

Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>
2023-11-29 11:24:04 +01:00
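
A simplified sketch of the behaviour the fix above restores: when a deletion interval removes every sample of one chunk, iteration skips that chunk and carries on with the remaining ones. Chunks are modelled as plain timestamp slices; all names are hypothetical.

```go
package main

import "fmt"

// interval is a closed deletion range [min, max].
type interval struct{ min, max int64 }

// survivingChunks drops deleted samples and returns the non-empty remainders.
func survivingChunks(chunks [][]int64, deleted interval) [][]int64 {
	var out [][]int64
	for _, c := range chunks {
		var kept []int64
		for _, ts := range c {
			if ts < deleted.min || ts > deleted.max {
				kept = append(kept, ts)
			}
		}
		if len(kept) == 0 {
			// Fully deleted chunk: skip it, but keep iterating.
			// Stopping here is the kind of bug the commit describes.
			continue
		}
		out = append(out, kept)
	}
	return out
}

func main() {
	chunks := [][]int64{{1, 2}, {3, 4}, {5, 6}}
	fmt.Println(survivingChunks(chunks, interval{min: 3, max: 4}))
}
```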
Fiona Liao
5bee0cfce2
Change ChunkReader.Chunk() to ChunkOrIterable()
The ChunkReader interface's Chunk() has been changed to ChunkOrIterable(). 

This is a precursor to OOO native histogram support - with OOO native histograms, the chunks.Meta passed to Chunk() can result in multiple chunks being returned rather than just a single chunk (e.g. if oooMergedChunk has a counter reset in the middle). 

To support this, ChunkOrIterable() requires either a single chunk or an iterable to be returned. If an iterable is returned, the caller has the responsibility of converting the samples from the iterable into possibly multiple chunks. The OOOHeadChunkReader now returns an iterable rather than a chunk to prepare for the native histograms case. Also, as a beneficial side effect, oooMergedChunk and boundedChunk have been simplified, as they only need to implement the Iterable interface now, not the full Chunk interface.

---------

Signed-off-by: Fiona Liao <fiona.y.liao@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
2023-11-28 11:14:29 +01:00
Jeanette Tan
52eb303031 Refactor assigning MinTime in histogram chunks
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-11-02 21:23:05 +08:00
Jeanette Tan
27abf09e7f Fix missing MinTime in histogram chunks
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-11-02 13:33:39 +08:00
Goutham Veeramachaneni
86729d4d7b
Update exp package (#12650)
2023-09-21 22:53:51 +02:00
Björn Rabenstein
f8dd8770ac
Merge pull request #12757 from bboreham/reuse-bufiter
TSDB: re-use iterator when moving between series
2023-09-21 14:08:53 +02:00
Alan Protasio
959c98441b Add context argument to tsdb.PostingsForMatchers
Signed-off-by: Alan Protasio <alanprot@gmail.com>
2023-09-16 18:13:32 +02:00
zenador
69edd8709b
Add warnings (and annotations) to PromQL query results (#12152)
Return annotations (warnings and infos) from PromQL queries

This generalizes the warnings we have already used before (but only for problems with remote read) as "annotations".

Annotations can be warnings or infos (the latter could be false positives). We do not treat them differently in the API for now and return them all as "warnings". It would be easy to distinguish them and return infos separately, should that appear useful in the future.

The new annotations are then used to create a lot of warnings or infos during PromQL evaluations. Partially these are things we have wanted for a long time (e.g. inform the user that they have applied `rate` to a metric that doesn't look like a counter), but the new native histograms have created even more needs for those annotations (e.g. if a query tries to aggregate float numbers with histograms).

The annotations added here are not yet complete. A prominent example would be a warning about a range too short for a rate calculation. But such a warning is trickier to create with good fidelity and we will tackle it later.

Another TODO is to take annotations into account when evaluating recording rules.

---------

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2023-09-14 18:57:31 +02:00
Arve Knudsen
156222cc50
Add context argument to LabelQuerier.LabelValues (#12665)
Add context argument to LabelQuerier.LabelValues and
LabelQuerier.SortedLabelValues.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2023-09-14 16:02:04 +02:00
Arve Knudsen
a964349e97
Add context argument to LabelQuerier.LabelNames (#12666)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2023-09-14 10:39:51 +02:00
Arve Knudsen
4451ba10b4
Add context argument to IndexReader.Postings (#12667)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2023-09-13 17:45:06 +02:00
Arve Knudsen
6daee89e5f
Add context argument to Querier.Select (#12660)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2023-09-12 12:37:38 +02:00
Dimitar Dimitrov
b40865833d
PostingsForMatchers race with creating new series (#12558)
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
2023-08-29 11:03:27 +02:00
Bryan Boreham
bdc7983956 TSDB: re-use iterator when moving between series
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-08-26 14:01:44 +00:00
George Krajcsovits
6cd2d1621f
Hide histogram chunk append and reset header internals (#12352)
tsdb: Hide histogram chunk append and reset header internals

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
2023-07-26 15:08:16 +02:00
Patrick Oyarzun
68e5937474
Apply relevant label matchers in LabelValues before fetching extra postings (#12274)
* Apply matchers when fetching label values

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

* Avoid extra copying of label values

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>

---------

Signed-off-by: Patrick Oyarzun <patrick.oyarzun@grafana.com>
2023-07-04 10:37:58 +01:00
Alan Protasio
73078bf738
Optimizing Group Regex (#12375)
Signed-off-by: Alan Protasio <alanprot@gmail.com>
2023-05-30 13:49:22 +02:00
Alan Protasio
8c5d4b4add
Optimize MatchNotEqual (#12377)
Signed-off-by: Alan Protasio <alanprot@gmail.com>
2023-05-21 10:41:30 +02:00
George Krajcsovits
92d6980360
Fix populateWithDelChunkSeriesIterator and gauge histograms (#12330)
Use AppendableGauge to detect corrupt chunk with gauge histograms.
Detect if first sample is a gauge but the chunk is not set up to contain
gauge histograms.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
2023-05-19 10:24:06 +02:00
Alan Protasio
c0f1abb574 MatchNotRegexp optimization
Signed-off-by: Alan Protasio <alanprot@gmail.com>
2023-05-10 20:08:38 -07:00
Matthieu MOREL
bae9a21200
Merge branch 'main' into linter/nilerr
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-04-19 19:56:39 +02:00
beorn7
5b53aa1108 style: Replace else if cascades with switch
Wiser coders than myself have come to the conclusion that a `switch`
statement is almost always superior to a statement that includes any
`else if`.

The exceptions that I have found in our codebase are just these two:

* The `else if` is followed by an additional statement before the next
  condition (separated by a `;`).
* The whole thing is within a `for` loop and `break` statements are
  used. In this case, using `switch` would require tagging the `for`
  loop, which probably tips the balance.

Why are `switch` statements more readable?

For one, fewer curly braces. But more importantly, the conditions all
have the same alignment, so the whole thing follows the natural flow
of going down a list of conditions. With `else if`, in contrast, all
conditions but the first are "hidden" behind `} else if `, harder to
spot and (for no good reason) presented differently from the first
condition.

I'm sure the aforementioned wise coders can list even more reasons.

In any case, I like it so much that I have found myself recommending
it in code reviews. I would like to make it a habit in our code base,
without making it a hard requirement that we would test on the CI. But
for that, there has to be a role model, so this commit eliminates all
`else if` occurrences, unless it is autogenerated code or fits one of
the exceptions above.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:22:31 +02:00
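
A small illustration of the style change argued for above, showing the same logic as an `else if` cascade and as a tagless `switch`. The function names are made up for this example and do not come from the codebase.

```go
package main

import "fmt"

// classifyElseIf uses an else-if cascade: every condition after the
// first sits behind "} else if".
func classifyElseIf(n int) string {
	var s string
	if n < 0 {
		s = "negative"
	} else if n == 0 {
		s = "zero"
	} else {
		s = "positive"
	}
	return s
}

// classifySwitch expresses the same logic with a tagless switch:
// all conditions line up under one another.
func classifySwitch(n int) string {
	switch {
	case n < 0:
		return "negative"
	case n == 0:
		return "zero"
	default:
		return "positive"
	}
}

func main() {
	for _, n := range []int{-1, 0, 1} {
		fmt.Println(n, classifyElseIf(n), classifySwitch(n))
	}
}
```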
beorn7
c3c7d44d84 lint: Adjust to the lint warnings raised by current versions of golint-ci
We haven't updated golint-ci in our CI yet, but this commit prepares
for that.

There are a lot of new warnings, and it is mostly because the "revive"
linter got updated. I agree with most of the new warnings, mostly
around not naming unused function parameters (although it is justified
in some cases for documentation purposes – while things like mocks are
a good example where not naming the parameter is clearer).

I'm pretty upset about the "empty block" warning to include `for`
loops. It's such a common pattern to do something in the head of the
`for` loop and then have an empty block. There is still an open issue
about this: https://github.com/mgechev/revive/issues/810 I have
disabled "revive" altogether in files where empty blocks are used
excessively, and I have made the effort to add individual
`// nolint:revive` where empty blocks are used just once or twice.
It's borderline noisy, though, but let's go with it for now.

I should mention that none of the "empty block" warnings for `for`
loop bodies were legitimate.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:10:10 +02:00
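
A minimal sketch of the "empty block" pattern mentioned above: all the work happens in the head of the `for` loop, so the body is intentionally empty and carries a `// nolint:revive` comment as in the commit message. The `counter` type is hypothetical.

```go
package main

import "fmt"

// counter is a toy iterator that advances until it reaches limit.
type counter struct{ n, limit int }

// Next advances the counter and reports whether it moved.
func (c *counter) Next() bool {
	if c.n >= c.limit {
		return false
	}
	c.n++
	return true
}

// drain exhausts the iterator; the loop condition does all the work.
func drain(c *counter) int {
	for c.Next() { // nolint:revive
	}
	return c.n
}

func main() {
	fmt.Println(drain(&counter{limit: 3}))
}
```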
Matthieu MOREL
fb3eb21230 enable gocritic, unconvert and unused linters
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-04-13 19:20:22 +00:00