prometheus/tsdb
Łukasz Mierzwa 277f04f0c4
Stop compactions if there's a block to write (#13754)

db.Compact() checks if there's a block to write with HEAD chunks before calling db.compactBlocks().
This is to ensure that if we need to write a block then it happens ASAP, otherwise memory usage might keep growing.

But what can also happen is that we don't need to write any block, we start db.compactBlocks(),
compaction takes hours, and in the meantime HEAD needs to write out chunks to a block.

This can be especially problematic if, for example, you run a Thanos sidecar that's uploading blocks,
which requires that compactions are disabled. Then you disable the Thanos sidecar and re-enable compactions.
When db.compactBlocks() is finally called it might have a huge number of blocks to compact, which might
take a very long time, during which HEAD cannot write out chunks to a new block.
In such a case memory usage will keep growing until either:
- compactions are finally finished and HEAD can write a block
- we run out of memory and Prometheus gets OOM-killed

This change adds a check for pending HEAD block writes inside db.compactBlocks(), so that
we bail out early when there are still compactions to run but a new block also needs to be
written.
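
A minimal sketch of the idea, not the actual db.go code: the head type, its threshold field, needsBlockWrite() and the plans slice are all hypothetical stand-ins made up for illustration. It only shows the control flow, where the pending-write check is repeated before every compaction step instead of once up front.

```go
package main

// head is a toy stand-in for the TSDB HEAD. All names and fields here are
// hypothetical and do not mirror the real Prometheus types.
type head struct {
	minTime, maxTime int64 // time range currently held in memory
	threshold        int64 // assumed span after which a block should be written
}

// needsBlockWrite reports whether the head holds enough data that it should
// be persisted to a new block.
func (h *head) needsBlockWrite() bool {
	return h.maxTime-h.minTime > h.threshold
}

// compactBlocks runs the queued compaction steps one at a time. Before each
// step it re-checks the head and bails out early if a block write is pending,
// so the caller can persist the head and come back to the remaining steps.
func compactBlocks(h *head, plans []func()) (done int, aborted bool) {
	for _, run := range plans {
		if h.needsBlockWrite() {
			return done, true
		}
		run()
		done++
	}
	return done, false
}
```

With this shape, a caller that gets aborted == true would write the head block first and then call compactBlocks again for whatever is left, so a long backlog of compactions can no longer hold up the head for its whole duration.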

Also add a test for compactBlocks.

---------

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
Signed-off-by: Lukasz Mierzwa <lukasz@cloudflare.com>
2024-04-07 18:28:28 +01:00
agent tsdb tests: make work with labels SymbolTable 2024-02-26 11:45:25 +00:00
chunkenc Optimize histogram iterators (#13340) 2024-01-23 17:02:14 +01:00
chunks Remove unused function tsdb/chunks.PopulatedChunk (#13763) 2024-03-14 11:15:17 +01:00
docs Enforce chunks ordering when writing index. (#8085) 2024-02-04 16:31:49 +01:00
encoding tsdb/encoding: use Go standard errors package 2023-11-11 19:01:11 +01:00
errors Enable default revive rules (#13068) 2023-11-29 17:23:34 +00:00
fileutil tests: remove err from message when testify prints it already 2024-02-01 14:18:01 +00:00
goversion remove obsolete build tag 2024-01-17 22:26:32 +08:00
index fix function and struct name 2024-03-09 17:53:17 +08:00
record tsdb tests: make work with labels SymbolTable 2024-02-26 11:45:25 +00:00
testdata
tombstones Update tombstones.go 2023-11-11 19:22:06 +01:00
tsdbutil Revert "Adding small test update for temp dir using t.TempDir (#13293)" 2023-12-30 19:17:30 +00:00
wlog refactor: utilize standard functions max/min 2024-04-04 03:15:38 +09:00
.gitignore
CHANGELOG.md
README.md
block.go Move from golang.org/x/exp/slices into slices now that we only support Go >= 1.21 2024-02-28 14:54:53 +01:00
block_test.go TSDB: move function only used in tests 2024-03-15 08:54:47 +00:00
blockwriter.go tsdb: use Go standard errors 2023-12-11 12:18:54 +00:00
blockwriter_test.go Add a chunk size limit in bytes (#12054) 2023-08-24 15:21:17 +02:00
compact.go Merge pull request #13864 from yeya24/expose-compactor-metrics 2024-04-05 11:24:41 +02:00
compact_test.go Optimize histogram iterators (#13340) 2024-01-23 17:02:14 +01:00
db.go Stop compactions if there's a block to write (#13754) 2024-04-07 18:28:28 +01:00
db_test.go Stop compactions if there's a block to write (#13754) 2024-04-07 18:28:28 +01:00
example_test.go Add context argument to Querier.Select (#12660) 2023-09-12 12:37:38 +02:00
exemplar.go Move from golang.org/x/exp/slices into slices now that we only support Go >= 1.21 2024-02-28 14:54:53 +01:00
exemplar_test.go Standardise exemplar label as "trace_id" 2024-02-15 14:20:08 +00:00
head.go TSDB: Don't rely on integer overflow in head compaction check (#13755) 2024-03-26 12:17:38 +01:00
head_append.go refactor: utilize standard functions max/min 2024-04-04 03:15:38 +09:00
head_bench_test.go tsdb: use Go standard errors 2023-12-11 12:18:54 +00:00
head_read.go [refactor] moving mergedOOOChunks Iterator (#13881) 2024-04-03 10:14:34 +02:00
head_read_test.go Add ShardedPostings() support to TSDB (#10421) 2024-01-29 11:57:27 +00:00
head_test.go TSDB: Don't rely on integer overflow in head compaction check (#13755) 2024-03-26 12:17:38 +01:00
head_wal.go tsdb: create SymbolTables for labels as required 2024-02-26 11:45:25 +00:00
isolation.go tsdb: create isolation transaction slice on demand 2023-10-21 13:45:47 +00:00
isolation_test.go
mocks_test.go tsdb: use Go standard errors 2023-12-11 12:18:54 +00:00
ooo_head.go Fix issue where queries can fail or omit OOO samples if OOO head compaction occurs between creating a querier and reading chunks (#13115) 2023-11-24 12:38:38 +01:00
ooo_head_read.go [refactor] moving mergedOOOChunks Iterator (#13881) 2024-04-03 10:14:34 +02:00
ooo_head_read_test.go Move from golang.org/x/exp/slices into slices now that we only support Go >= 1.21 2024-02-28 14:54:53 +01:00
ooo_head_test.go ci(lint): enable godot; append dot at the end of comments 2023-10-31 19:53:38 +02:00
ooo_isolation.go Fix issue where queries can fail or omit OOO samples if OOO head compaction occurs between creating a querier and reading chunks (#13115) 2023-11-24 12:38:38 +01:00
ooo_isolation_test.go Fix issue where queries can fail or omit OOO samples if OOO head compaction occurs between creating a querier and reading chunks (#13115) 2023-11-24 12:38:38 +01:00
querier.go Merge remote-tracking branch 'origin/main' into pr/13461 2024-03-25 12:14:26 +00:00
querier_bench_test.go Optimize label values with matchers by taking shortcuts (#13426) 2024-01-23 11:40:21 +01:00
querier_test.go Merge remote-tracking branch 'origin/main' into pr/13461 2024-03-25 12:14:26 +00:00
repair.go tsdb: use Go standard errors 2023-12-11 12:18:54 +00:00
repair_test.go tsdb tests: use go-cmp instead of DeepEquals 2024-02-08 19:32:33 +00:00
tsdbblockutil.go Optimize histogram iterators (#13340) 2024-01-23 17:02:14 +01:00

README.md

TSDB

This directory contains the Prometheus TSDB (Time Series DataBase) library, which handles storage and querying of all Prometheus v2 data.

Documentation

External resources

A series of blog posts explaining different components of TSDB: