Commit Graph

1053 Commits

Author SHA1 Message Date
Björn Rabenstein 2e58d46522
Merge pull request #13662 from prometheus/nhcb
Native histograms custom buckets storage
2024-06-27 21:44:20 +02:00
Bryan Boreham 348f7f8d0c
Merge pull request #14341 from charleskorn/charleskorn/cleanup-pending-read
Fix issue where pending OOO read can be left dangling if creating querier fails
2024-06-25 09:23:54 +01:00
Ben Ye 246b7c6a5c
TSDB: Change block populator to accept postings index function (#14213)
Signed-off-by: Ben Ye <benye@amazon.com>
2024-06-25 09:21:48 +01:00
Ben Ye 5585a3c7e5
tsdb: expose hook to customize block querier (#14114)
* expose hook for block querier

Signed-off-by: Ben Ye <benye@amazon.com>

* update comment

Signed-off-by: Ben Ye <benye@amazon.com>

* use defined type

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2024-06-25 09:47:06 +02:00
Charles Korn 2c5e88748e
Fix issue where pending OOO read can be left dangling if creating querier fails
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-06-25 14:22:44 +10:00
Jeanette Tan dda5f48c9e Merge branch 'main' into nhcb-review-2 2024-06-20 22:50:00 +08:00
Oleg Zaytsev fd1a89b7c8
Pass affected labels to `MemPostings.Delete()` (#14307)
* Pass affected labels to MemPostings.Delete

As suggested by @bboreham, we can track the labels of the deleted series
and avoid iterating through all the label/value combinations.

This looks much faster on the MemPostings.Delete call. We don't have a
benchmark on stripeSeries.gc() where we'll pay the price of iterating
the labels of each one of the deleted series.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-18 10:28:56 +00:00
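
To illustrate the `MemPostings.Delete()` change described in the commit above (#14307), here is a minimal sketch of deleting series by visiting only the label pairs they carried. It is not the Prometheus implementation: `postingsIndex`, `label`, `add` and `remove` are hypothetical names, and the index is reduced to a single map guarded by one mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// label is a name/value pair. All types here are hypothetical stand-ins,
// not the real MemPostings data structures.
type label struct{ name, value string }

type postingsIndex struct {
	mtx sync.RWMutex
	m   map[label][]uint64 // label pair -> series refs
}

func newPostingsIndex() *postingsIndex {
	return &postingsIndex{m: map[label][]uint64{}}
}

// add registers a series ref under each of its labels.
func (p *postingsIndex) add(ref uint64, lbls ...label) {
	p.mtx.Lock()
	defer p.mtx.Unlock()
	for _, l := range lbls {
		p.m[l] = append(p.m[l], ref)
	}
}

// remove drops the deleted refs, visiting only the label pairs listed in
// affected instead of every label/value combination in the index.
func (p *postingsIndex) remove(deleted map[uint64]struct{}, affected map[label]struct{}) {
	p.mtx.Lock()
	defer p.mtx.Unlock()
	for l := range affected {
		refs, ok := p.m[l]
		if !ok {
			continue
		}
		// Build a fresh slice so that readers holding the old one are not disturbed.
		kept := make([]uint64, 0, len(refs))
		for _, ref := range refs {
			if _, gone := deleted[ref]; !gone {
				kept = append(kept, ref)
			}
		}
		if len(kept) == 0 {
			delete(p.m, l)
			continue
		}
		p.m[l] = kept
	}
}

func main() {
	idx := newPostingsIndex()
	idx.add(1, label{"job", "api"}, label{"env", "prod"})
	idx.add(2, label{"job", "api"}, label{"env", "dev"})

	// The caller collects the labels of the series it is deleting.
	deleted := map[uint64]struct{}{2: {}}
	affected := map[label]struct{}{{"job", "api"}: {}, {"env", "dev"}: {}}
	idx.remove(deleted, affected)

	fmt.Println(idx.m)
}
```

The trade-off mentioned in the commit message still applies to this sketch: the caller has to iterate the labels of every deleted series to build `affected`.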
Ben Ye 0e6fca8e76 add unit test
Signed-off-by: Ben Ye <benye@amazon.com>
2024-06-16 12:09:42 -07:00
Ben Ye e7db2e30a4 fix check context cancellation not incrementing count
Signed-off-by: Ben Ye <benye@amazon.com>
2024-06-15 11:43:26 -07:00
Ben Ye 5a218708f1
tsdb: Extend compactor interface to allow compactions to create multiple output blocks (#14143)
* add hook to allow head compaction to create multiple output blocks

Signed-off-by: Ben Ye <benye@amazon.com>

* change Compact interface; remove BlockPopulator changes

Signed-off-by: Ben Ye <benye@amazon.com>

* rebase main

Signed-off-by: Ben Ye <benye@amazon.com>

* fix lint

Signed-off-by: Ben Ye <benye@amazon.com>

* fix unit test

Signed-off-by: Ben Ye <benye@amazon.com>

* address feedback; add unit test

Signed-off-by: Ben Ye <benye@amazon.com>

* Apply suggestions from code review

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Update tsdb/compact_test.go

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2024-06-12 17:31:25 -04:00
Sebastian Rabenhorst 05380aa0ac
agent db: make rejecting ooo samples configurable (#14094)
feat: Make OOO ingestion time window configurable for Prometheus Agent.

Signed-off-by: Sebastian Rabenhorst <sebastian.rabenhorst@shopify.com>
2024-06-12 11:07:42 -03:00
Oleg Zaytsev 64a9abb8be
Change LabelValuesFor() to accept index.Postings (#14280)
The only call we have to LabelValuesFor() has an index.Postings, and we
expand it to pass to this method, which will iterate over the values.

That's a waste of resources: we can iterate on the index.Postings
directly.

If there's any downstream implementation that has a slice of series,
they can always do an index.ListPostings from them: doing that is
cheaper than expanding an abstract index.Postings.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-11 15:36:46 +02:00
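
The argument in the `LabelValuesFor()` commit above (#14280) can be sketched as follows: the function consumes a postings iterator directly, and a caller that only has a slice of series wraps it in a cheap list iterator. The `postings` interface, `listPostings` and the `series` map are simplified, hypothetical stand-ins and do not reproduce the real `index.Postings` API.

```go
package main

import "fmt"

// postings is a simplified, hypothetical iterator over series refs.
type postings interface {
	Next() bool
	At() uint64
}

// listPostings iterates over an in-memory slice of refs.
type listPostings struct {
	refs []uint64
	cur  uint64
}

func newListPostings(refs ...uint64) *listPostings {
	return &listPostings{refs: refs}
}

func (l *listPostings) Next() bool {
	if len(l.refs) == 0 {
		return false
	}
	l.cur, l.refs = l.refs[0], l.refs[1:]
	return true
}

func (l *listPostings) At() uint64 { return l.cur }

// labelValuesFor walks the iterator directly and collects the distinct values
// of the label called name, in first-seen order. No intermediate slice of
// refs is built.
func labelValuesFor(p postings, name string, series map[uint64]map[string]string) []string {
	seen := map[string]struct{}{}
	var values []string
	for p.Next() {
		v, ok := series[p.At()][name]
		if !ok {
			continue // unknown series, or series without this label
		}
		if _, dup := seen[v]; dup {
			continue
		}
		seen[v] = struct{}{}
		values = append(values, v)
	}
	return values
}

func main() {
	series := map[uint64]map[string]string{
		1: {"job": "api"},
		2: {"job": "db"},
		3: {"job": "api"},
	}
	// A caller that only has a slice of refs wraps it in a list iterator.
	fmt.Println(labelValuesFor(newListPostings(1, 2, 3), "job", series))
}
```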
Bryan Boreham c5d923aa7c
Merge pull request #14279 from colega/fix-label-names-for-not-found
headIndexReader.LabelNamesFor: skip not found series
2024-06-11 01:06:19 +03:00
Oleg Zaytsev 10a3c7220b
`MemPostings.PostingsForLabelMatching()`: don't hold the mutex while matching (#14286)
* MemPostings.PostingsForLabelMatching: let mutex go

This changes the `MemPostings.PostingsForLabelMatching` implementation
to stop holding the read mutex while matching the label values.

We've seen that this method can be slow when the matcher is expensive,
that's why we even added a context expiration check.

However, there are critical processes that might be waiting on this mutex:
writes (adding new series) and compaction (deleting the
garbage-collected ones), so we should avoid holding it for a long period
of time.

Given that we've copied the values to a slice anyway, there's no need to
hold the lock while matching.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-10 14:24:17 +02:00
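
The locking pattern described in the commit above (#14286) could look roughly like the sketch below: copy the label values under the read lock, release it, run the potentially slow matcher on the copy, then take the lock again only briefly to collect the postings. `postingsIndex`, `add` and `refsForLabelMatching` are hypothetical names, and the data layout is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// postingsIndex is a hypothetical stand-in for an in-memory postings index.
type postingsIndex struct {
	mtx sync.RWMutex
	m   map[string]map[string][]uint64 // label name -> label value -> series refs
}

func newPostingsIndex() *postingsIndex {
	return &postingsIndex{m: map[string]map[string][]uint64{}}
}

func (p *postingsIndex) add(name, value string, ref uint64) {
	p.mtx.Lock()
	defer p.mtx.Unlock()
	if p.m[name] == nil {
		p.m[name] = map[string][]uint64{}
	}
	p.m[name][value] = append(p.m[name][value], ref)
}

// refsForLabelMatching returns the sorted refs of all series whose value for
// the label called name is accepted by match.
func (p *postingsIndex) refsForLabelMatching(name string, match func(string) bool) []uint64 {
	// Step 1: copy the label values while holding the read lock.
	p.mtx.RLock()
	values := make([]string, 0, len(p.m[name]))
	for v := range p.m[name] {
		values = append(values, v)
	}
	p.mtx.RUnlock()

	// Step 2: run the matcher with no lock held, so writers are not blocked.
	matched := values[:0] // filter in place; values is our private copy
	for _, v := range values {
		if match(v) {
			matched = append(matched, v)
		}
	}

	// Step 3: take the read lock again, briefly, to collect the refs.
	// A value deleted in the meantime simply contributes nothing.
	var refs []uint64
	p.mtx.RLock()
	for _, v := range matched {
		refs = append(refs, p.m[name][v]...)
	}
	p.mtx.RUnlock()

	sort.Slice(refs, func(i, j int) bool { return refs[i] < refs[j] })
	return refs
}

func main() {
	idx := newPostingsIndex()
	idx.add("job", "api", 1)
	idx.add("job", "app", 2)
	idx.add("job", "db", 3)
	fmt.Println(idx.refsForLabelMatching("job", func(v string) bool { return v[0] == 'a' }))
}
```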
Oleg Zaytsev 2dc177d8af
`MemPostings.Delete()`: reduce locking/unlocking (#13286)
* MemPostings: reduce locking/unlocking

MemPostings.Delete is called from Head.gc(), i.e. it gets the IDs of the
series that have churned.

I'd assume that many label values aren't affected by that churn at all,
so it doesn't make sense to touch the lock while checking them.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-10 14:23:22 +02:00
Oleg Zaytsev d0d361da53
headIndexReader.LabelNamesFor: skip not found series
It's quite common during the compaction cycle to hold series IDs for
series that aren't in the TSDB head anymore.

We shouldn't fail if that happens, as the caller has no way to figure
out which one of the IDs doesn't exist.

Fixes https://github.com/prometheus/prometheus/issues/14278

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-07 16:09:53 +02:00
Jeanette Tan 14f8dded39 Merge branch 'main' into nhcb
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-06-07 19:17:14 +08:00
Jeanette Tan 9adc1699c3 fix according to code review
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-06-07 18:50:59 +08:00
Ben Ye 8a08f452b6
tsdb: Allow passing a custom compactor to override the default one (#14113)
* expose hook in tsdb to allow customizing compactor

Signed-off-by: Ben Ye <benye@amazon.com>

* address comment

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2024-06-04 19:11:36 -04:00
Bryan Boreham 42b546a43d
tsdb: add details to duplicate sample error (#13277)
Now the error will include the timestamp and the existing and new values.
When you are trying to track down the source of this error, it can be
useful to see that the values are close, or alternating, or something
else.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-06-04 08:54:09 +01:00
Arve Knudsen b8b9015e38 tsdb/index: Fix TestReader_PostingsForLabelMatchingHonorsContextCancel
Fix number of series in
TestReader_PostingsForLabelMatchingHonorsContextCancel (off by one).

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-06-03 17:29:06 +02:00
Bryan Boreham 3ee52abb53 [ENHANCEMENT] TSDB: Save map lookup on validation
Goes faster.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-05-30 09:17:11 +01:00
Bryan Boreham 7d98487447 [ENHANCEMENT] TSDB: let Resize re-use buffer
This saves having to zero the buffer every time.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-05-30 09:17:11 +01:00
Bryan Boreham c0bb156eca [ENHANCEMENT] TSDB: Eliminate pointer when storing exemplars
Saves memory and effort.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-05-30 09:17:11 +01:00
Bryan Boreham 3eb5581877 [ENHANCEMENT] TSDB: Reduce map lookups on exemplar index
In many cases we already have a pointer to the entry.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-05-30 09:17:11 +01:00
Bryan Boreham f0c50b5a66 [Test] TSDB: BenchmarkResizeExemplar multiple per series
One exemplar per series is not a typical workload. Make it the same as
`BenchmarkAddExemplar`.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-05-30 09:17:11 +01:00
Bryan Boreham 929fbf860e [Test] TSDB: let BenchmarkAddExemplar reuse slots
Test with different amounts of capacity and exemplars, so that sometimes
new exemplars are evicting older exemplars.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-05-30 09:16:30 +01:00
Ben Ye 6683895620
optimize regex matching for empty label values in posting match (#14075)
Also update tests.

Signed-off-by: Ben Ye <benye@amazon.com>
2024-05-29 16:03:33 +01:00
Arve Knudsen b2396c0c8f Upgrade to golangci-lint v1.59.0
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-05-27 22:38:48 +02:00
Alan Protasio 8894d65cd6
Fix head stats and hooks when replaying a corrupted snapshot (#14079)
* Fixing head stats and hooks when replaying a corrupted snapshot

Signed-off-by: alanprot <alanprot@gmail.com>

* Fixing create/removed series metrics

Signed-off-by: alanprot <alanprot@gmail.com>

* Refactoring to have common code between gc and flush method

Signed-off-by: alanprot <alanprot@gmail.com>

* Update tsdb/head.go

Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

* refactor

Signed-off-by: alanprot <alanprot@gmail.com>

* Update tsdb/head_test.go

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

* Update tsdb/head_test.go

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>

---------

Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2024-05-24 22:43:21 -04:00
Björn Rabenstein 3119b8a055
Merge pull request #13218 from machine424/ro-promtool
Make DBReadOnly more RO
2024-05-21 13:27:40 +02:00
Oleg Zaytsev fe9cb5a803
Check context every 128 labels instead of 100 (#14118)
Follow up on https://github.com/prometheus/prometheus/pull/14096

As promised, I bring a benchmark, which shows a very small improvement
if context is checked every 128 iterations of label instead of every
100.

It's much easier for a computer to check modulo 128 than modulo 100.
This is a very small 0-2% improvement but I'd say this is one of the
hottest paths of the app so this is still relevant.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-21 11:30:43 +02:00
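
A minimal sketch of the periodic context check discussed in the commit above (#14118). The constant name `checkContextEveryNIterations` comes from the earlier commit message (#14096); everything else (`countMatching`, `countNonEmpty`) is hypothetical. With a power-of-two interval the compiler can typically reduce the modulo to cheap bit operations, which is the effect the commit attributes its small improvement to.

```go
package main

import (
	"context"
	"fmt"
)

// checkContextEveryNIterations is a power of two so that the modulo below
// can typically be compiled down to cheap bit operations.
const checkContextEveryNIterations = 128

// countMatching counts the values accepted by match, checking the context
// once every checkContextEveryNIterations iterations instead of every time.
func countMatching(ctx context.Context, values []string, match func(string) bool) (int, error) {
	matched := 0
	for i, v := range values {
		if i%checkContextEveryNIterations == 0 {
			if err := ctx.Err(); err != nil {
				return matched, err
			}
		}
		if match(v) {
			matched++
		}
	}
	return matched, nil
}

// countNonEmpty is a demo helper: it runs countMatching with a context that
// is optionally cancelled before the loop starts.
func countNonEmpty(values []string, cancelFirst bool) (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if cancelFirst {
		cancel()
	}
	return countMatching(ctx, values, func(v string) bool { return v != "" })
}

func main() {
	values := make([]string, 300)
	for i := range values {
		values[i] = "v"
	}
	n, err := countNonEmpty(values, false)
	fmt.Println(n, err)
	n, err = countNonEmpty(values, true)
	fmt.Println(n, err)
}
```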
Arve Knudsen 5ca56eeb6b
tsdb/index: Refactor Reader tests (#14071)
tsdb/index: Refactor Reader tests

Co-authored-by: Björn Rabenstein <github@rabenste.in>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
2024-05-16 11:51:46 +02:00
Oleksandr Redko f10c3454e9 Enable perfsprint linter and fix up code
Signed-off-by: Oleksandr Redko <oleksandr.red+github@gmail.com>
2024-05-15 17:51:05 +03:00
György Krajcsovits b215a41be4 tsdb/index/postings: fix missing lock unlock
Followup to #14096

Unfortunately the previous PR introduced this bug by not releasing the
lock before returning.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-05-15 14:02:39 +02:00
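
The class of bug fixed in the commit above (an early return that leaves a lock held) is commonly avoided in Go by deferring the unlock. The sketch below is only an illustration of that idiom, not the actual fix: `valueStore`, `matchValues` and the `stop` callback are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errCancelled = errors.New("cancelled")

// valueStore is a hypothetical container of label values guarded by a mutex.
type valueStore struct {
	mtx    sync.RWMutex
	values []string
}

// matchValues returns the values accepted by match. stop is a hypothetical
// cancellation signal that is consulted on every iteration.
func (s *valueStore) matchValues(match func(string) bool, stop func() bool) ([]string, error) {
	s.mtx.RLock()
	// The deferred unlock runs on every return path, including the early
	// error return inside the loop.
	defer s.mtx.RUnlock()

	var out []string
	for _, v := range s.values {
		if stop() {
			return nil, errCancelled
		}
		if match(v) {
			out = append(out, v)
		}
	}
	return out, nil
}

func main() {
	s := &valueStore{values: []string{"a", "b"}}
	_, err := s.matchValues(func(string) bool { return true }, func() bool { return true })
	fmt.Println("error:", err)

	// If the read lock had leaked, TryLock would report false here.
	fmt.Println("write lock available:", s.mtx.TryLock())
}
```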
George Krajcsovits fdaafdb041
tsdb: check for context cancel before regex matching postings (#14096)
* tsdb: check for context cancel before regex matching postings

Regex matching can be heavy if the regex takes a lot of cycles to
evaluate and we can get stuck evaluating postings for a long time
without this fix. The constant checkContextEveryNIterations=100
may be changed later.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-05-15 06:26:19 +02:00
Jeanette Tan f028496133 Merge branch 'main' into nhcb
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-05-14 16:20:15 +08:00
Arve Knudsen 5c4310aa37
[ENHANCEMENT] TSDB: Optimize querying with regexp matchers
Add method `PostingsForLabelMatching` to `tsdb.IndexReader`, to obtain postings for labels with a certain name and values accepted by a provided callback, and use it from `tsdb.PostingsForMatchers`.
The intention is to optimize regexp matcher paths, especially not having to load all label values before matching on them.

Plus tests, and refactor some `tsdb/index.Reader` methods.

Benchmarking shows memory reduction up to ~100%, and speedup of up to ~50%.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2024-05-09 10:55:30 +01:00
Arve Knudsen d699dc3c77
Fix language in docs and comments (#14041)
Fix language in docs and comments

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
2024-05-08 17:57:09 +02:00
Arve Knudsen 108a6bc9f6 tsdb/chunkenc.Pool: Refactor Get and Put
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-05-08 13:37:25 +02:00
Jeanette Tan 796b1bbfde Merge branch 'main' into nhcb
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-05-08 19:11:39 +08:00
Alan Protasio d15869af32
Avoid creating new slices for labels values on postings for matchers (#13958)
* Avoid creating new slices for labels values on postings for matchers

Signed-off-by: alanprot <alanprot@gmail.com>

* refactor

Signed-off-by: alanprot <alanprot@gmail.com>

---------

Signed-off-by: alanprot <alanprot@gmail.com>
2024-04-24 16:41:33 +02:00
György Krajcsovits bcafa5f1f9 Merge remote-tracking branch 'upstream/main' into update-nhcb 2024-04-24 11:06:59 +02:00
Arthur Silva Sens b5b5e1e5ae
Merge pull request #13919 from GiedriusS/dont_forget_to_unregister
tsdb/wlog: unregister metrics on WL close
2024-04-18 16:44:03 -03:00
Giedrius Statkevičius bdf490726a tsdb/wlog: add test for metrics unregistering
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-04-18 11:11:37 +03:00
machine424 c5a1cc9148
chore(tsdb): add a sandboxDir to DBReadOnly, the directory can be used for transient file writes.
use it in loadDataAsQueryable to make sure the RO Head doesn't truncate or cut new chunks in data/chunks_head/.

add a -sandbox-dir-root flag to "promtool tsdb dump/dump-openmetrics" to control the root of that sandbox directory.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-04-15 17:00:25 +02:00
Giedrius Statkevičius 3b8fe00767 tsdb/wlog: unregister metrics on WL close
Thanos can create and destroy TSDBs dynamically, and once a TSDB
disappears its files are deleted. Calculating the size of the
WAL then fails with errors like:

```
msg: "Failed to calculate size of "wal" dir", "err": "lstat
/tsdbdir/wal: no such file or directory", "caller": "wlog.go:271"
```

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-04-11 11:30:05 +03:00
Matthieu MOREL 6f595c6762
golangci-lint: enable whitespace linter (#13905)
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-04-11 09:27:54 +01:00
Jonathan Halterman 633224886a
Write out of order hint when initially creating meta file (#13894)
Signed-off-by: Jonathan Halterman <jonathan@grafana.com>
Signed-off-by: Jonathan Halterman <jhalterman@gmail.com>
Co-authored-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com>
2024-04-08 17:34:14 +02:00
Łukasz Mierzwa 277f04f0c4
Stop compactions if there's a block to write (#13754)
* Stop compactions if there's a block to write

db.Compact() checks if there's a block to write with HEAD chunks before calling db.compactBlocks().
This is to ensure that if we need to write a block then it happens ASAP, otherwise memory usage might keep growing.

But what can also happen is that we don't need to write any block, we start db.compactBlocks(),
compaction takes hours, and in the meantime HEAD needs to write out chunks to a block.

This can be especially problematic if, for example, you run Thanos sidecar that's uploading blocks,
which requires that compactions are disabled. Then you disable Thanos sidecar and re-enable compactions.
When db.compactBlocks() is finally called it might have a huge number of blocks to compact, which might
take a very long time, during which HEAD cannot write out chunks to a new block.
In such a case memory usage will keep growing until either:
- compactions are finally finished and HEAD can write a block
- we run out of memory and Prometheus gets OOM-killed

This change adds a check for pending HEAD block writes inside db.compactBlocks(), so that
we bail out early if there are still compactions to run, but we also need to write a new
block.

Also add a test for compactBlocks.

---------

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
Signed-off-by: Lukasz Mierzwa <lukasz@cloudflare.com>
2024-04-07 18:28:28 +01:00
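
The early bail-out described in the commit above (#13754) can be sketched as a loop that re-checks, before each compaction, whether the head has a block waiting to be written. This is a simplified illustration: the `compactBlocks` signature, the `plans` slice and the `headNeedsWrite` and `compact` callbacks are hypothetical and do not mirror the real `db.compactBlocks()`.

```go
package main

import "fmt"

// compactBlocks runs the planned compactions in order, but stops early as
// soon as headNeedsWrite reports a pending head block. It returns how many
// compactions completed. The remaining plans are left for a later call.
func compactBlocks(plans [][]string, headNeedsWrite func() bool, compact func([]string) error) (int, error) {
	done := 0
	for _, plan := range plans {
		// Writing out the head takes priority over further compactions,
		// otherwise memory usage could keep growing while we compact.
		if headNeedsWrite() {
			return done, nil
		}
		if err := compact(plan); err != nil {
			return done, err
		}
		done++
	}
	return done, nil
}

func main() {
	plans := [][]string{{"b1", "b2"}, {"b3", "b4"}, {"b5", "b6"}}

	// Simulate a head that becomes ready to be written after two checks.
	checks := 0
	headNeedsWrite := func() bool {
		checks++
		return checks > 2
	}

	done, err := compactBlocks(plans, headNeedsWrite, func(plan []string) error {
		fmt.Println("compacting", plan)
		return nil
	})
	fmt.Println("completed:", done, "error:", err)
}
```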