prometheus

Commit Graph

Author	SHA1	Message	Date
Björn Rabenstein	2e58d46522	Merge pull request #13662 from prometheus/nhcb Native histograms custom buckets storage	2024-06-27 21:44:20 +02:00
Bryan Boreham	348f7f8d0c	Merge pull request #14341 from charleskorn/charleskorn/cleanup-pending-read Fix issue where pending OOO read can be left dangling if creating querier fails	2024-06-25 09:23:54 +01:00
Ben Ye	246b7c6a5c	TSDB: Change block populator to accept postings index function (#14213 ) Signed-off-by: Ben Ye <benye@amazon.com>	2024-06-25 09:21:48 +01:00
Ben Ye	5585a3c7e5	tsdb: expose hook to customize block querier (#14114 ) * expose hook for block querier Signed-off-by: Ben Ye <benye@amazon.com> * update comment Signed-off-by: Ben Ye <benye@amazon.com> * use defined type Signed-off-by: Ben Ye <benye@amazon.com> --------- Signed-off-by: Ben Ye <benye@amazon.com>	2024-06-25 09:47:06 +02:00
Charles Korn	2c5e88748e	Fix issue where pending OOO read can be left dangling if creating querier fails Signed-off-by: Charles Korn <charles.korn@grafana.com>	2024-06-25 14:22:44 +10:00
Jeanette Tan	dda5f48c9e	Merge branch 'main' into nhcb-review-2	2024-06-20 22:50:00 +08:00
Oleg Zaytsev	fd1a89b7c8	Pass affected labels to `MemPostings.Delete()` (#14307 ) * Pass affected labels to MemPostings.Delete As suggested by @bboreham, we can track the labels of the deleted series and avoid iterating through all the label/value combinations. This looks much faster on the MemPostings.Delete call. We don't have a benchmark on stripeSeries.gc() where we'll pay the price of iterating the labels of each one of the deleted series. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	2024-06-18 10:28:56 +00:00
Ben Ye	0e6fca8e76	add unit test Signed-off-by: Ben Ye <benye@amazon.com>	2024-06-16 12:09:42 -07:00
Ben Ye	e7db2e30a4	fix check context cancellation not incrementing count Signed-off-by: Ben Ye <benye@amazon.com>	2024-06-15 11:43:26 -07:00
Ben Ye	5a218708f1	tsdb: Extend compactor interface to allow compactions to create multiple output blocks (#14143 ) * add hook to allow head compaction to create multiple output blocks Signed-off-by: Ben Ye <benye@amazon.com> * change Compact interface; remove BlockPopulator changes Signed-off-by: Ben Ye <benye@amazon.com> * rebase main Signed-off-by: Ben Ye <benye@amazon.com> * fix lint Signed-off-by: Ben Ye <benye@amazon.com> * fix unit test Signed-off-by: Ben Ye <benye@amazon.com> * address feedbacks; add unit test Signed-off-by: Ben Ye <benye@amazon.com> * Apply suggestions from code review Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Update tsdb/compact_test.go Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> --------- Signed-off-by: Ben Ye <benye@amazon.com> Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>	2024-06-12 17:31:25 -04:00
Sebastian Rabenhorst	05380aa0ac	agent db: make rejecting ooo samples configurable (#14094 ) feat: Make OOO ingestion time window configurable for Prometheus Agent. Signed-off-by: Sebastian Rabenhorst <sebastian.rabenhorst@shopify.com>	2024-06-12 11:07:42 -03:00
Oleg Zaytsev	64a9abb8be	Change LabelValuesFor() to accept index.Postings (#14280 ) The only call we have to LabelValuesFor() has an index.Postings, and we expand it to pass to this method, which will iterate over the values. That's a waste of resources: we can iterate on the index.Postings directly. If there's any downstream implementation that has a slice of series, they can always do an index.ListPostings from them: doing that is cheaper than expanding an abstract index.Postings. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	2024-06-11 15:36:46 +02:00
Bryan Boreham	c5d923aa7c	Merge pull request #14279 from colega/fix-label-names-for-not-found headIndexReader.LabelNamesFor: skip not found series	2024-06-11 01:06:19 +03:00
Oleg Zaytsev	10a3c7220b	`MemPostings.PostingsForLabelMatching()`: don't hold the mutex while matching (#14286 ) * MemPostings.PostingsForLabelMatching: let mutex go This changes the `MemPostings.PostingsForLabelMatching` implementation to stop holding the read mutex while matching the label values. We've seen that this method can be slow when the matcher is expensive, that's why we even added a context expiration check. However, there are critical processes that might be waiting on this mutex: writes (adding new series) and compaction (deleting the garbage-collected ones), so we should avoid holding it for a long period of time. Given that we've copied the values to a slice anyway, there's no need to hold the lock while matching. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	2024-06-10 14:24:17 +02:00
Oleg Zaytsev	2dc177d8af	`MemPostings.Delete()`: reduce locking/unlocking (#13286 ) * MemPostings: reduce locking/unlocking MemPostings.Delete is called from Head.gc(), i.e. it gets the IDs of the series that have churned. I'd assume that many label values aren't affected by that churn at all, so it doesn't make sense to touch the lock while checking them. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	2024-06-10 14:23:22 +02:00
Oleg Zaytsev	d0d361da53	headIndexReader.LabelNamesFor: skip not found series It's quite common during the compaction cycle to hold series IDs for series that aren't in the TSDB head anymore. We shouldn't fail if that happens, as the caller has no way to figure out which one of the IDs doesn't exist. Fixes https://github.com/prometheus/prometheus/issues/14278 Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	2024-06-07 16:09:53 +02:00
Jeanette Tan	14f8dded39	Merge branch 'main' into nhcb Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>	2024-06-07 19:17:14 +08:00
Jeanette Tan	9adc1699c3	fix according to code review Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>	2024-06-07 18:50:59 +08:00
Ben Ye	8a08f452b6	tsdb: Allow passing a custom compactor to override the default one (#14113 ) * expose hook in tsdb to allow customizing compactor Signed-off-by: Ben Ye <benye@amazon.com> * address comment Signed-off-by: Ben Ye <benye@amazon.com> --------- Signed-off-by: Ben Ye <benye@amazon.com>	2024-06-04 19:11:36 -04:00
Bryan Boreham	42b546a43d	tsdb: add details to duplicate sample error (#13277 ) Now the error will include the timestamp and the existing and new values. When you are trying to track down the source of this error, it can be useful to see that the values are close, or alternating, or something else. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-06-04 08:54:09 +01:00
Arve Knudsen	b8b9015e38	tsdb/index: Fix TestReader_PostingsForLabelMatchingHonorsContextCancel Fix number of series in TestReader_PostingsForLabelMatchingHonorsContextCancel (off by one). Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2024-06-03 17:29:06 +02:00
Bryan Boreham	3ee52abb53	[ENHANCEMENT] TSDB: Save map lookup on validation Goes faster. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-05-30 09:17:11 +01:00
Bryan Boreham	7d98487447	[ENHANCEMENT] TSDB: let Resize re-use buffer This saves having to zero the buffer every time. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-05-30 09:17:11 +01:00
Bryan Boreham	c0bb156eca	[ENHANCEMENT] TSDB: Eliminate pointer when storing exemplars Saves memory and effort. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-05-30 09:17:11 +01:00
Bryan Boreham	3eb5581877	[ENHANCEMENT] TSDB: Reduce map lookups on exemplar index In many cases we already have a pointer to the entry. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-05-30 09:17:11 +01:00
Bryan Boreham	f0c50b5a66	[Test] TSDB: BenchmarkResizeExemplar multiple per series One exemplar per series is not a typical workload. Make it the same as `BenchmarkAddExemplar`. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-05-30 09:17:11 +01:00
Bryan Boreham	929fbf860e	[Test] TSDB: let BenchmarkAddExemplar reuse slots Test with different amounts of capacity and exemplars, so that sometimes new exemplars are evicting older exemplars. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-05-30 09:16:30 +01:00
Ben Ye	6683895620	optimize regex matching for empty label values in posting match (#14075 ) Also update tests. Signed-off-by: Ben Ye <benye@amazon.com>	2024-05-29 16:03:33 +01:00
Arve Knudsen	b2396c0c8f	Upgrade to golangci-lint v1.59.0 Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2024-05-27 22:38:48 +02:00
Alan Protasio	8894d65cd6	Fix head stats and hooks when replaying a corrupted snapshot (#14079 ) * Fixing head stats and hooks when replaying a corrupted snapshot Signed-off-by: alanprot <alanprot@gmail.com> * Fixing create/removed series metrics Signed-off-by: alanprot <alanprot@gmail.com> * Refactoring to have common code between gc and flush method Signed-off-by: alanprot <alanprot@gmail.com> * Update tsdb/head.go Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> Signed-off-by: Alan Protasio <alanprot@gmail.com> * refactor Signed-off-by: alanprot <alanprot@gmail.com> * Update tsdb/head_test.go Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Alan Protasio <alanprot@gmail.com> * Update tsdb/head_test.go Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com> Signed-off-by: Alan Protasio <alanprot@gmail.com> --------- Signed-off-by: alanprot <alanprot@gmail.com> Signed-off-by: Alan Protasio <alanprot@gmail.com> Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>	2024-05-24 22:43:21 -04:00
Björn Rabenstein	3119b8a055	Merge pull request #13218 from machine424/ro-promtool Make DBReadOnly more RO	2024-05-21 13:27:40 +02:00
Oleg Zaytsev	fe9cb5a803	Check context every 128 labels instead of 100 (#14118 ) Follow up on https://github.com/prometheus/prometheus/pull/14096 As promised, I bring a benchmark, which shows a very small improvement if context is checked every 128 iterations of label instead of every 100. It's much easier for a computer to check modulo 128 than modulo 100. This is a very small 0-2% improvement but I'd say this is one of the hottest paths of the app so this is still relevant. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	2024-05-21 11:30:43 +02:00
Arve Knudsen	5ca56eeb6b	tsdb/index: Refactor Reader tests (#14071 ) tsdb/index: Refactor Reader tests Co-authored-by: Björn Rabenstein <github@rabenste.in> Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> --------- Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> Co-authored-by: Björn Rabenstein <github@rabenste.in>	2024-05-16 11:51:46 +02:00
Oleksandr Redko	f10c3454e9	Enable perfsprint linter and fix up code Signed-off-by: Oleksandr Redko <oleksandr.red+github@gmail.com>	2024-05-15 17:51:05 +03:00
György Krajcsovits	b215a41be4	tsdb/index/postings: fix missing lock unlock Followup to #14096 Unfortunately the previous PR introduced this bug by not releasing the lock before returning. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	2024-05-15 14:02:39 +02:00
George Krajcsovits	fdaafdb041	tsdb: check for context cancel before regex matching postings (#14096 ) * tsdb: check for context cancel before regex matching postings Regex matching can be heavy if the regex takes a lot of cycles to evaluate and we can get stuck evaluating postings for a long time without this fix. The constant checkContextEveryNIterations=100 may be changed later. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	2024-05-15 06:26:19 +02:00
Jeanette Tan	f028496133	Merge branch 'main' into nhcb Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>	2024-05-14 16:20:15 +08:00
Arve Knudsen	5c4310aa37	[ENHANCEMENT] TSDB: Optimize querying with regexp matchers Add method `PostingsForLabelMatching` to `tsdb.IndexReader`, to obtain postings for labels with a certain name and values accepted by a provided callback, and use it from `tsdb.PostingsForMatchers`. The intention is to optimize regexp matcher paths, especially not having to load all label values before matching on them. Plus tests, and refactor some `tsdb/index.Reader` methods. Benchmarking shows memory reduction up to ~100%, and speedup of up to ~50%. Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>	2024-05-09 10:55:30 +01:00
Arve Knudsen	d699dc3c77	Fix language in docs and comments (#14041 ) Fix language in docs and comments --------- Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> Co-authored-by: Björn Rabenstein <github@rabenste.in>	2024-05-08 17:57:09 +02:00
Arve Knudsen	108a6bc9f6	tsdb/chunkenc.Pool: Refactor Get and Put Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2024-05-08 13:37:25 +02:00
Jeanette Tan	796b1bbfde	Merge branch 'main' into nhcb Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>	2024-05-08 19:11:39 +08:00
Alan Protasio	d15869af32	Avoid creating new slices for labels values on postings for matchers (#13958 ) * Avoid creating new slices for labels values on postings for matchers Signed-off-by: alanprot <alanprot@gmail.com> * refactor Signed-off-by: alanprot <alanprot@gmail.com> --------- Signed-off-by: alanprot <alanprot@gmail.com>	2024-04-24 16:41:33 +02:00
György Krajcsovits	bcafa5f1f9	Merge remote-tracking branch 'upstream/main' into update-nhcb	2024-04-24 11:06:59 +02:00
Arthur Silva Sens	b5b5e1e5ae	Merge pull request #13919 from GiedriusS/dont_forget_to_unregister tsdb/wlog: unregister metrics on WL close	2024-04-18 16:44:03 -03:00
Giedrius Statkevičius	bdf490726a	tsdb/wlog: add test for metrics unregistering Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>	2024-04-18 11:11:37 +03:00
machine424	c5a1cc9148	chore(tsdb): add a sandboxDir to DBReadOnly, the directory can be used for transient file writes. use it in loadDataAsQueryable to make sure the RO Head doesn't truncate or cut new chunks in data/chunks_head/. add a -sandbox-dir-root flag to "promtool tsdb dump/dump-openmetrics" to control the root of that sandbox directory. Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2024-04-15 17:00:25 +02:00
Giedrius Statkevičius	3b8fe00767	tsdb/wlog: unregister metrics on WL close Thanos can create and destroy TSDBs dynamically, and once a TSDB disappears its files are deleted. Calculating the size of the WAL then fails with errors like: ``` msg: "Failed to calculate size of "wal" dir", "err": "lstat /tsdbdir/wal: no such file or directory", "caller": "wlog.go:271" ``` Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>	2024-04-11 11:30:05 +03:00
Matthieu MOREL	6f595c6762	golangci-lint: enable whitespace linter (#13905 ) Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2024-04-11 09:27:54 +01:00
Jonathan Halterman	633224886a	Write out of order hint when initially creating meta file (#13894 ) Signed-off-by: Jonathan Halterman <jonathan@grafana.com> Signed-off-by: Jonathan Halterman <jhalterman@gmail.com> Co-authored-by: Jesus Vazquez <jesusvazquez@users.noreply.github.com>	2024-04-08 17:34:14 +02:00
Łukasz Mierzwa	277f04f0c4	Stop compactions if there's a block to write (#13754 ) * Stop compactions if there's a block to write db.Compact() checks if there's a block to write with HEAD chunks before calling db.compactBlocks(). This is to ensure that if we need to write a block then it happens ASAP, otherwise memory usage might keep growing. But what can also happen is that we don't need to write any block, we start db.compactBlocks(), compaction takes hours, and in the meantime HEAD needs to write out chunks to a block. This can be especially problematic if, for example, you run Thanos sidecar that's uploading block, which requires that compactions are disabled. Then you disable Thanos sidecar and re-enable compactions. When db.compactBlocks() is finally called it might have a huge number of blocks to compact, which might take a very long time, during which HEAD cannot write out chunks to a new block. In such case memory usage will keep growing until either: - compactions are finally finished and HEAD can write a block - we run out of memory and Prometheus gets OOM-killed This change adds a check for pending HEAD block writes inside db.compactBlocks(), so that we bail out early if there are still compactions to run, but we also need to write a new block. Also add a test for compactBlocks. --------- Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com> Signed-off-by: Lukasz Mierzwa <lukasz@cloudflare.com>	2024-04-07 18:28:28 +01:00

1053 Commits