Commit Graph

607 Commits

Author SHA1 Message Date
beorn7
3ce988b031 Merge branch 'sparsehistogram' into beorn7/sparsehistogram 2022-07-13 18:07:54 +02:00
beorn7
28f028e938 Merge branch 'main' into sparsehistogram 2022-07-12 19:07:13 +02:00
beorn7
5d14046d28 tsdb: Fix chunk handling during appendHistogram
Previously, the maxTime wasn't updated properly in case of a recoding
happening.

My apologies for reformatting many lines for line length. During the
bug hunt, I tried to make things more readable in a reasonably wide
editor window.

Signed-off-by: beorn7 <beorn@grafana.com>
2022-07-06 18:44:53 +02:00
beorn7
642c5758ff tsdb: Expose histogram append bug
Signed-off-by: beorn7 <beorn@grafana.com>
2022-07-06 18:44:45 +02:00
beorn7
49be0784b4 tsdb: Fix chunk handling during histogram recoding
Previously, the maxTime wasn't updated properly in case of a recoding
happening.

My apologies for reformatting many lines for line length. During the
bug hunt, I tried to make things more readable in a reasonably wide
editor window.

Signed-off-by: beorn7 <beorn@grafana.com>
2022-07-06 14:34:02 +02:00
Jesus Vazquez
6cfe44d7fd
WaitUntilIdle optimize idling time (#10878)
Relates to @bboreham optimization in https://github.com/prometheus/prometheus/pull/10859

Bryan did reduce the sleep time improving the deltas on the benchmark by
quite a lot. However I've been working on a similar implementation for
out of order and I noticed that we actually get into this method
thousands of times.

@ywwg had the brilliant idea of not always sleeping before the select
but actually make it a case in the select so we only sleep if we need
to.

The benchmark deltas are amazing

```
❯ benchstat old_implementation.txt new_implementation_using_time_after.txt
name                                                                                                     old time/op  new time/op  delta
LoadWAL/batches=10,seriesPerBatch=100,samplesPerSeries=7200,exemplarsPerSeries=0,mmappedChunkT=0-8        521ms ±25%   253ms ± 6%  -51.47%  (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=100,samplesPerSeries=7200,exemplarsPerSeries=36,mmappedChunkT=0-8       773ms ± 3%   369ms ±31%  -52.23%  (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=100,samplesPerSeries=7200,exemplarsPerSeries=72,mmappedChunkT=0-8       592ms ±28%   297ms ±28%  -49.80%  (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=100,samplesPerSeries=7200,exemplarsPerSeries=360,mmappedChunkT=0-8      547ms ± 2%  999ms ±187%     ~     (p=0.690 n=5+5)
LoadWAL/batches=10,seriesPerBatch=10000,samplesPerSeries=50,exemplarsPerSeries=0,mmappedChunkT=0-8        11.3s ± 4%    1.3s ±44%  -88.48%  (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=10000,samplesPerSeries=50,exemplarsPerSeries=2,mmappedChunkT=0-8        11.1s ± 1%    1.2s ±20%  -89.08%  (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=0,mmappedChunkT=0-8        1.24s ± 3%   0.18s ± 7%  -85.76%  (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=2,mmappedChunkT=0-8        1.24s ± 2%   0.18s ± 5%  -85.24%  (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=5,mmappedChunkT=0-8        1.23s ± 5%   0.27s ±33%  -77.73%  (p=0.008 n=5+5)
LoadWAL/batches=10,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=24,mmappedChunkT=0-8       1.28s ± 1%   0.36s ± 7%  -71.51%  (p=0.008 n=5+5)
LoadWAL/batches=100,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=0,mmappedChunkT=3800-8    12.1s ± 1%    3.1s ± 6%  -74.33%  (p=0.008 n=5+5)
LoadWAL/batches=100,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=2,mmappedChunkT=3800-8    12.1s ± 1%    3.4s ± 4%  -71.94%  (p=0.008 n=5+5)
LoadWAL/batches=100,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=5,mmappedChunkT=3800-8    12.1s ± 1%    3.8s ±17%  -68.35%  (p=0.008 n=5+5)
LoadWAL/batches=100,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=24,mmappedChunkT=3800-8   12.4s ± 1%    4.0s ±18%  -67.71%  (p=0.008 n=5+5)
```

Benchmarked on Linux
```
goos: linux
goarch: amd64
pkg: github.com/prometheus/prometheus/tsdb
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
```

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
2022-06-30 15:00:04 +02:00
Julien Pivotto
bacd776356
Merge pull request #10907 from damnever/fix/panic
Fix panic if series is not found when deleting series
2022-06-30 11:23:08 +02:00
Peter Štibraný
ffc60d8397
Reduce chunk write queue memory usage 2 (#10874)
* Job queue

This PR reimplements chan chunkWriteJob with custom buffered queue that should use less memory, because it doesn't preallocate entire buffer for maximum queue size at once. Instead it allocates individual "segments" with smaller size.

As elements are added to the queue, they fill individual segments. When elements are removed from the queue (and segments), empty segments can be thrown away. This doesn't change memory usage of the queue when it's full, but should decrease its memory footprint when it's empty (queue will keep max 1 segment in such case).

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Modify test to work with low resolution timer.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Improve comments.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
2022-06-29 17:51:27 +05:30
Xiaochao Dong (@damnever)
6b042da2d8 Fix panic if series is not found when deleting series
Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
2022-06-24 15:55:32 +08:00
Steve Azzopardi
04fe2c9522
fix(tsdb): inc mmap corruption counter on mmap out of sequence error (#10406)
What
---
When we see out of sequence chunks increase the chunk corruption counter
to indicate that one of the chunks was corrupted.

Reference: https://github.com/prometheus/prometheus/pull/10406#issuecomment-1142595527
Signed-off-by: Steve Azzopardi <steveazz@outlook.com>
2022-06-22 14:03:12 +05:30
Peter Štibraný
03a2313f7a
Reduce chunk write queue memory usage (#10873)
* dont waste space on the chunkRefMap
* add time factor
* add comments
* better readability
* add instrumentation and more comments
* formatting
* uppercase comments
* Address review feedback. Renamed "free" to "shrink" everywhere, updated comments and threshold to 1000.
* double space

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
Co-authored-by: Peter Štibraný <pstibrany@gmail.com>

Co-authored-by: Mauro Stettler <mauro.stettler@gmail.com>
2022-06-17 13:11:39 +05:30
Bryan Boreham
9f77d23889
tsdb: commit data periodically in CreateBlock (#10788)
To avoid building up data in memory, commit and make a new appender
periodically.

The number `commitAfter = 10000` was chosen arbitrarily; testing with
10x more or less gives slightly worse results.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-06-17 11:26:19 +05:30
Łukasz Mierzwa
d65f037def
Don't increment prometheus_tsdb_compactions_failed_total when context is canceled (#10772)
When restarting Prometheus I sometimes see:

caller=db.go:832 level=error component=tsdb msg="compaction failed" err="compact head: persist head block: 2 errors: populate block: context canceled; context canceled"

And prometheus_tsdb_compactions_failed_total metric gets incremented.
This makes it more difficult to write alerts based on
prometheus_tsdb_compactions_failed_total metric since any restart can
trigger it.

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
2022-06-17 11:21:43 +05:30
beorn7
095b6c93dd Merge branch 'main' into sparsehistogram 2022-06-14 14:27:35 +02:00
Bryan Boreham
542b9ecdbd
tsdb: reduce sleep time when reading WAL (#10859)
The code sleeps for a short time to allow goroutines to finish, however
it seems the duration can be reduced a lot, speeding up the reading
process.

I checked using some WAL data from production, and the queue is almost
always empty at the time we enter `waitForIdle()` so there is no danger
of spinning in the tight loop.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-06-12 11:54:11 +05:30
beorn7
40ad5e284a Merge branch 'main' into beorn7/sparsehistogram 2022-06-09 20:50:30 +02:00
Bryan Boreham
9f79a6f4b5
tsdb: faster CRC check by avoiding allocations (#10789)
Instead of creating a new hashing object every time, call `crc32.Checksum`
which computes the answer without allocations.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-06-08 08:00:59 +05:30
Matej Gera
1dd247f68b
Remote Write: Rename confusing walDir parameter to dir (#10464)
* Rename walDir parameter to dir

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Improve NewQueueManager comment

Signed-off-by: Matej Gera <matejgera@gmail.com>
2022-05-30 21:45:30 -07:00
David Leadbeater
57f4aab27d
Update godoc links and remove note about TSDB versioning (#10754)
Signed-off-by: David Leadbeater <dgl@dgl.cx>
2022-05-26 18:34:43 +10:00
maizige
10b677b826
fix typo (#10696)
Update doc comment

Signed-off-by: gemaizi <864321211@qq.com>
2022-05-25 18:01:45 +02:00
Filip Petkovski
d3cb39044e
Fix typo in symbol table size exceeded error message (#10746)
This commit fixes a typo when reporting an error that the the symbols
table size has been exceeded.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2022-05-25 10:40:36 +02:00
Julien Pivotto
6e3a0efe40
Make necessary change to compile promql parser to wasm (#10683)
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2022-05-12 09:12:05 +02:00
Matthias Rampke
78f2645787
test(tsdb): break up repeated test to avoid timeout (#10671)
On macOS, the TestTombstoneCleanRetentionLimitsRace performs very
poorly. It takes more than a second to write out one block, and as it
writes 400 of them, we run into the 10-minute test timeout frequently.

While this doesn't fix the actual performance issue, breaking each
iteration into a subtest makes the test pass reliably (because each
iteration comfortably finishes in under a minute).

Related report: https://groups.google.com/g/prometheus-developers/c/jxQ6Ayg6VJ4/m/03H_DS9PDAAJ

Signed-off-by: Matthias Rampke <matthias@prometheus.io>
2022-05-09 00:39:26 +02:00
Łukasz Mierzwa
88f9b248b4
Correctly format error message (#10669)
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
2022-05-06 00:42:31 +02:00
Bryan Boreham
4b9f248e85
unit tests: make all Labels sorted alphabetically (#10532)
"Labels is a sorted set of labels. Order has to be guaranteed upon
instantiation." says the comment, so fix all the tests that break this
rule.

For `BenchmarkLabelValuesWithMatchers()` and
`BenchmarkHeadLabelValuesWithMatchers()` the amount of work done changes
significantly if you put the labels in order, because all series refs
get neatly partitioned by the `tens` label, so I renamed the labels
to maintain the previous behaviour.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-05-04 23:41:36 +02:00
beorn7
654c07783c Fix deprecation
Signed-off-by: beorn7 <beorn@grafana.com>
2022-05-04 13:43:23 +02:00
beorn7
3bc711e333 Merge branch 'main' into sparsehistogram 2022-05-04 13:37:13 +02:00
Matthieu MOREL
e2ede285a2
refactor: move from io/ioutil to io and os packages (#10528)
* refactor: move from io/ioutil to io and os packages
* use fs.DirEntry instead of os.FileInfo after os.ReadDir

Signed-off-by: MOREL Matthieu <matthieu.morel@cnp.fr>
2022-04-27 11:24:36 +02:00
Oleg Zaytsev
af0f6da5cb
Fix chunk overflow appending samples at a variable rate (#10607)
* Add a test with variable samples rate append

This test overflows the chunk created in memseries, and the total amount
of samples in the (only) mmapped chunk is 29, instead of the 65565
appended ones.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Cut new chunk when rate prediction was wrong

When appending samples at a slow rate, and then appending at a higher
rate, the prediction we made to cut a new chunk is no longer valid.
Sometimes this can even cause an overflow in the chunk, if more samples
than uint16 can hold are appended.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Improve comment on 2*samplesPerChunk

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Assert that all chunks have less than 240 samples

Also, trigger new chunk at 240, not at more than 240

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2022-04-20 14:54:20 +02:00
Paschalis Tsilias
40c1efe8bc
tsdb/agent: Ignore duplicate exemplars (#10595)
* tsdb/agent: Ignore duplicate exemplars

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Make each exemplar unique in TestCommit

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Re-Trigger CI for Windows and UI-related steps

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Change test comment to properly re-trigger pipeline

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>

* Defer Close() calls for test agent and segment reader

Signed-off-by: Paschalis Tsilias <paschalist0@gmail.com>
2022-04-18 11:41:04 -04:00
Julien Pivotto
685ce9964d
Merge pull request #10599 from prometheus/release-2.35
Merge back release 2.35
2022-04-15 00:10:06 +02:00
Robert Fratto
286dfc70b7
tsdb/agent: port grafana/agent#676 (#10587)
* tsdb/agent: port grafana/agent#676

grafana/agent#676 fixed an issue where a loading a WAL with multiple
segments may result in ref ID collision. The starting ref ID for new
series should be set to the highest ref ID across all series records
from all WAL segments.

This fixes an issue where the starting ref ID was incorrectly set to the
highest ref ID found in the newest segment, which may not have any ref
IDs at all if no series records have been appended to it yet.

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: update terminology (s/ref ID/nextRef)

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2022-04-14 10:27:06 +02:00
chavacava
0b41fd6e71
Fix data races in WAL replay (#10571)
Signed-off-by: chavacava <salvadorcavadini+github@gmail.com>
2022-04-12 16:00:20 +05:30
beorn7
7ee1836ef5 Merge branch 'main' into sparsehistogram 2022-04-05 18:31:19 +02:00
Bryan Boreham
2c1be4df7b
tsdb: more efficient sorting of postings read from WAL at startup (#10500)
* tsdb: avoid slice-to-interface allocation in EnsureOrder

This is pulling the `seriesRefSlice` out of the loop, so the compiler
doesn't allocate a new one on the heap every time.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* tsdb: use pointer type in Pool for EnsureOrder

As noted by staticcheck, Pool prefers the objects in the pool to have
pointer type. This is a little more fiddly to code, but avoids
allocation of a wrapper object every time a slice is put into the pool.

Removed a comment that said fixing this has a performance penalty: not
borne out by benchmarks.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-03-30 15:10:19 +05:30
Wilbert Guo
83a2e52bc2
Add SyncForState Implementation for Ruler HA (#10070)
* continuously syncing activeAt for alerts

Signed-off-by: Yijie Qin <qinyijie@amazon.com>
Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

* add import

Signed-off-by: Yijie Qin <qinyijie@amazon.com>
Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

* Refactor SyncForState and add unit tests

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

* Format code

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

* Add hook for syncForState

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

Fix go lint

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

Refactor syncForState override implementation

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

Add syncForState override func as argument to Update()

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

Fix go formatting

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

Fix circleci test errors

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

Remove overrideFunc as argument to run()

Signed-off-by: Wilbert Guo <wilbeguo@amazon.com>

* remove the syncForState

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* use the override function to decide if need to replace the activeAt or not

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fix test case

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fix format

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* Trigger build

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fixing comments

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* return the result of map of alerts instead of single one

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* upper case the QueryforStateSeries

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* use a more generic rule group post process function type

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fix indentation

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fix gofmt

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fix lint

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fixing naming

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fix comments

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* add the lastEvalTimestamp as parameter

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* fmt

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

* change funcType to func

Signed-off-by: Yijie Qin <qinyijie@amazon.com>

Co-authored-by: Yijie Qin <qinyijie@amazon.com>
Co-authored-by: Yijie Qin <63399121+qinxx108@users.noreply.github.com>
2022-03-29 02:16:46 +02:00
Howie
1291ec7185
deleting *.tmp WAL files on startup (#10317)
* fix issue #10245

Signed-off-by: lihaowei <haoweili35@gmail.com>

* minor changes

Signed-off-by: lihaowei <haoweili35@gmail.com>

* review changes

Signed-off-by: lihaowei <haoweili35@gmail.com>

* minor changes

Signed-off-by: lihaowei <haoweili35@gmail.com>
2022-03-24 16:14:14 +05:30
beorn7
f9c411604d Fix spelling errors
Signed-off-by: beorn7 <beorn@grafana.com>
2022-03-22 16:02:13 +01:00
beorn7
4210aac74a Merge branch 'main' into sparsehistogram 2022-03-22 14:47:42 +01:00
Chris Marchbanks
c1387494dd
Merge pull request #10452 from prometheus/release-2.34
Merge Release 2.34 into main
2022-03-15 12:32:18 -06:00
songjiayang
8d8be43824
Update wal.md (#10442)
update exemplar record type

Signed-off-by: songjiayang <songjiayang1@gmail.com>
2022-03-15 22:33:45 +05:30
Mauro Stettler
b025390cb4
Disable chunk write queue by default, allow user to configure the exact size (#10425)
* Disable chunk write queue by default

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* update flag description

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
2022-03-11 17:26:59 +01:00
Łukasz Mierzwa
da23c4649a
Enable misspell check in golangci-lint (#10393)
Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
2022-03-03 18:11:19 +01:00
Łukasz Mierzwa
a4317bf0ec
Run gofumpt on all files (#10392)
* Run gofumpt on all files

Getting golangci-lint errors when building on my laptop, possibly because I have newer version of gofumpt then what it was formatted with.
Run gofumpt -w -extra on all files as it will be needed in the future anyway.

* Update golangci-lint to v1.44.2

v1.44.0 upgraded gofumpt so bumping version in CI will help keep formatting correct for everyone

* Address golangci-lint error

Getting 'error-strings: error strings should not be capitalized or end with punctuation or a newline' from revive here.
Drop new line.

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
2022-03-03 17:21:05 +01:00
cui fliter
c9b56d1a49
all: fix some typos (#10389)
Signed-off-by: cuishuang <imcusg@gmail.com>
2022-03-03 12:03:07 +00:00
Ganesh Vernekar
4cc25c0cb0
Fix panic on query when m-map replay fails with snapshot enabled (#10348)
* Fix panic on query when m-map replay fails with snapshot enabled

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix flake

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2022-02-25 08:53:40 -07:00
Björn Rabenstein
d1edb006c1
Merge pull request #10341 from prometheus/release-2.33
Merge release-2.33 forward into main
2022-02-22 22:51:05 +01:00
Ganesh Vernekar
24827782cb
Fix panics when m-mapping head chunks (#10316)
* Fix panics when m-mapping head chunks

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix reviews

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2022-02-22 20:35:15 +05:30
Dieter Plaetinck
aa8874bc56
clarify Head.appendableMinValidTime (#10303)
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2022-02-17 16:30:48 +05:30
lwangrabbit
9fde6edbf5
tsdb/wal: Move comment of w.writer.Append(...) to the WriteTo interface (#10198)
Signed-off-by: wanglipeng <wanglipeng@huayun.com>

Co-authored-by: wanglipeng <wanglipeng@huayun.com>
2022-01-30 22:14:16 -08:00
Eng Zer Jun
3e67654d37
refactor: use T.TempDir() and B.TempDir to create temporary directory
The directory created by `T.TempDir()` and `B.TempDir()` is
automatically removed when the test and all its subtests complete.

Reference: https://pkg.go.dev/testing#T.TempDir
Reference: https://pkg.go.dev/testing#B.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2022-01-22 18:57:30 +08:00
Robert Fratto
b71a6dbbd1
tsdb/agent: Fix deadlock from simultaneous GC and write (#10166)
* tsdb/agent: Fix deadlock from simultaneous GC and write

This commit fixes a potential deadlock where storing in-memory series
references could deadlock with a WAL GC cycle.

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* add missing license header

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* order local imports

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* align deadlock testing with discovery/manager_test.go method

Also prevents GCs from running concurrently, which could also cause a
deadlock (even though it's currently impossible for two GCs to run
concurrently).

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2022-01-19 20:23:06 +05:30
Mauro Stettler
bf959b36cb
Nits after PR 10051 merge (#10159)
Signed-off-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>
2022-01-19 20:20:35 +05:30
Ganesh Vernekar
129ed4ec8b
Fix Example() function in TSDB (#10153)
* Fix Example() function in TSDB

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2022-01-11 17:24:03 +05:30
Mauro Stettler
0df3489275
Write chunks via queue, predicting the refs (#10051)
* Write chunks via queue, predicting the refs

Our load tests have shown that there is a latency spike in the
remote write handler whenever the head chunks need to be written,
because chunkDiskMapper.WriteChunk() blocks until the chunks are written
to disk.

This adds a queue to the chunk disk mapper which makes the WriteChunk()
method non-blocking unless the queue is full. Reads can still be served
from the queue.

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* address PR feeddback

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* initialize metrics without .Add(0)

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* change isRunningMtx to normal lock

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* do not re-initialize chunkrefmap

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* update metric outside of lock scope

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* add benchmark for adding job to chunk write queue

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* remove unnecessary "success" var

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* gofumpt -extra

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* avoid WithLabelValues call in addJob

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* format comments

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* addressing PR feedback

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* rename cutExpectRef to cutAndExpectRef

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* use head.Init() instead of .initTime()

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* address PR feedback

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* PR feedback

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* update test according to PR feedback

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* replace callbackWg -> awaitCb

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* better test of truncation with empty files

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

* replace callbackWg -> awaitCb

Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2022-01-10 13:36:45 +00:00
Oleg Zaytsev
a83d46ee9c
Tidy postingsWithIndexHeap (#10123)
Unexported postingsWithIndexHeap's methods that don't need to be
exported, and added detailed comments.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2022-01-06 16:03:44 +05:30
Bryan Boreham
82860a770c
tsdb: use simpler map key to improve exemplar ingest performance (#10111)
* tsdb: fix exemplar benchmarks

Go benchmarks are expected to do an amount of work that varies with
the `b.N` parameter. Previously these benchmarks would report a result
like 0.01 ns/op, which is nonsense.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* tsdb: use simpler map key to improve exemplar perf

Prometheus holds an index of exemplars so it can discard the oldest one
for a series when a new one is added.
Since the keys are not for human eyes, we can use a simpler format
and save the effort of quoting label values.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Exemplars: allocate index map with estimated size

This avoids Go having to re-size the map several times as it grows.
16 exemplars per series is a guess; if it is too low then the map will
be sparse, while if it is too high then the map will have to resize once
or twice.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-01-06 15:58:58 +05:30
Oleg Zaytsev
701545286d
Pop intersected postings heap without popping (#10092)
See this comment for detailed explanation:

https://github.com/prometheus/prometheus/pull/9907#issuecomment-1002189932

TL;DR: if we don't call Pop() on the heap implementation, we don't need
to return our param as an `interface{}` so we save an allocation.
This would be popped for every label value, so it can be thousands of
saved allocations here (see benchmarks).

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2022-01-05 16:16:43 +05:30
Peter Štibraný
e51a17b501
CompactBlockMetas should produce correct mint/maxt for overlapping blocks. (#10108)
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
2022-01-05 15:10:00 +05:30
Oleg Zaytsev
3947238ce0
Label values with matchers by intersecting postings (#9907)
* LabelValues w/matchers by intersecting postings

Instead of iterating all matched series to find the values, this
checks if each one of the label values is present in the matched series
(postings).

Pending to be benchmarked.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Benchmark labelValuesWithMatchers

name                                                                   old time/op    new time/op
Querier/Head/labelValuesWithMatchers/i_with_n="1"                         157ms ± 0%        48ms ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="^.+$"                      1.80s ± 0%       0.46s ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",j!="foo"                144ms ± 0%        57ms ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"      304ms ± 0%       111ms ± 0%
Querier/Head/labelValuesWithMatchers/n_with_j!="foo"                      761ms ± 0%       164ms ± 0%
Querier/Head/labelValuesWithMatchers/n_with_i="1"                        6.11µs ± 0%      6.62µs ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1"                        117ms ± 0%        62ms ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="^.+$"                     1.44s ± 0%       0.24s ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",j!="foo"              92.1ms ± 0%      70.3ms ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"     196ms ± 0%       115ms ± 0%
Querier/Block/labelValuesWithMatchers/n_with_j!="foo"                     1.23s ± 0%       0.21s ± 0%
Querier/Block/labelValuesWithMatchers/n_with_i="1"                       1.06ms ± 0%      0.88ms ± 0%

name                                                                   old alloc/op   new alloc/op
Querier/Head/labelValuesWithMatchers/i_with_n="1"                        29.5MB ± 0%      26.9MB ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="^.+$"                     46.8MB ± 0%     251.5MB ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",j!="foo"               29.5MB ± 0%      22.3MB ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"     46.8MB ± 0%      23.9MB ± 0%
Querier/Head/labelValuesWithMatchers/n_with_j!="foo"                     10.3kB ± 0%  138535.2kB ± 0%
Querier/Head/labelValuesWithMatchers/n_with_i="1"                        5.54kB ± 0%      7.09kB ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1"                       39.1MB ± 0%      28.5MB ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="^.+$"                     287MB ± 0%       253MB ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",j!="foo"              34.3MB ± 0%      23.9MB ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"    51.6MB ± 0%      25.5MB ± 0%
Querier/Block/labelValuesWithMatchers/n_with_j!="foo"                     144MB ± 0%       139MB ± 0%
Querier/Block/labelValuesWithMatchers/n_with_i="1"                       6.43kB ± 0%      8.66kB ± 0%

name                                                                   old allocs/op  new allocs/op
Querier/Head/labelValuesWithMatchers/i_with_n="1"                          104k ± 0%        500k ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="^.+$"                       204k ± 0%        600k ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",j!="foo"                 104k ± 0%        500k ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"       204k ± 0%        500k ± 0%
Querier/Head/labelValuesWithMatchers/n_with_j!="foo"                       66.0 ± 0%       255.0 ± 0%
Querier/Head/labelValuesWithMatchers/n_with_i="1"                          61.0 ± 0%       205.0 ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1"                         304k ± 0%        600k ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="^.+$"                     5.20M ± 0%       0.70M ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",j!="foo"                204k ± 0%        600k ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"      304k ± 0%        600k ± 0%
Querier/Block/labelValuesWithMatchers/n_with_j!="foo"                     3.00M ± 0%       0.00M ± 0%
Querier/Block/labelValuesWithMatchers/n_with_i="1"                         61.0 ± 0%       247.0 ± 0%

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Don't expand postings to intersect them

Using a min heap we can check whether matched postings intersect with
each one of the label values postings. This avoid expanding postings
(and thus having all of them in memory at any point).

Slightly slower than the expanding postings version for some cases, but
definitely pays the price once the cardinality grows.

Still offers 10x latency improvement where previous latencies were
reaching 1s.

Benchmark results:

name \ time/op                                                         old.txt      intersect.txt    intersect_noexpand.txt
Querier/Head/labelValuesWithMatchers/i_with_n="1"                       157ms ± 0%        48ms ± 0%              110ms ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="^.+$"                    1.80s ± 0%       0.46s ± 0%              0.18s ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",j!="foo"              144ms ± 0%        57ms ± 0%              125ms ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"    304ms ± 0%       111ms ± 0%              177ms ± 0%
Querier/Head/labelValuesWithMatchers/n_with_j!="foo"                    761ms ± 0%       164ms ± 0%              134ms ± 0%
Querier/Head/labelValuesWithMatchers/n_with_i="1"                      6.11µs ± 0%      6.62µs ± 0%             4.29µs ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1"                      117ms ± 0%        62ms ± 0%              120ms ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="^.+$"                   1.44s ± 0%       0.24s ± 0%              0.15s ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",j!="foo"            92.1ms ± 0%      70.3ms ± 0%            125.4ms ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"   196ms ± 0%       115ms ± 0%              170ms ± 0%
Querier/Block/labelValuesWithMatchers/n_with_j!="foo"                   1.23s ± 0%       0.21s ± 0%              0.14s ± 0%
Querier/Block/labelValuesWithMatchers/n_with_i="1"                     1.06ms ± 0%      0.88ms ± 0%             0.92ms ± 0%

name \ alloc/op                                                        old.txt      intersect.txt    intersect_noexpand.txt
Querier/Head/labelValuesWithMatchers/i_with_n="1"                      29.5MB ± 0%      26.9MB ± 0%             19.1MB ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="^.+$"                   46.8MB ± 0%     251.5MB ± 0%             36.3MB ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",j!="foo"             29.5MB ± 0%      22.3MB ± 0%             19.1MB ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"   46.8MB ± 0%      23.9MB ± 0%             20.7MB ± 0%
Querier/Head/labelValuesWithMatchers/n_with_j!="foo"                   10.3kB ± 0%  138535.2kB ± 0%              6.4kB ± 0%
Querier/Head/labelValuesWithMatchers/n_with_i="1"                      5.54kB ± 0%      7.09kB ± 0%             4.30kB ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1"                     39.1MB ± 0%      28.5MB ± 0%             20.7MB ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="^.+$"                   287MB ± 0%       253MB ± 0%               38MB ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",j!="foo"            34.3MB ± 0%      23.9MB ± 0%             20.7MB ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"  51.6MB ± 0%      25.5MB ± 0%             22.3MB ± 0%
Querier/Block/labelValuesWithMatchers/n_with_j!="foo"                   144MB ± 0%       139MB ± 0%                0MB ± 0%
Querier/Block/labelValuesWithMatchers/n_with_i="1"                     6.43kB ± 0%      8.66kB ± 0%             5.86kB ± 0%

name \ allocs/op                                                       old.txt      intersect.txt    intersect_noexpand.txt
Querier/Head/labelValuesWithMatchers/i_with_n="1"                        104k ± 0%        500k ± 0%               300k ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="^.+$"                     204k ± 0%        600k ± 0%               400k ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",j!="foo"               104k ± 0%        500k ± 0%               300k ± 0%
Querier/Head/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"     204k ± 0%        500k ± 0%               300k ± 0%
Querier/Head/labelValuesWithMatchers/n_with_j!="foo"                     66.0 ± 0%       255.0 ± 0%              139.0 ± 0%
Querier/Head/labelValuesWithMatchers/n_with_i="1"                        61.0 ± 0%       205.0 ± 0%               87.0 ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1"                       304k ± 0%        600k ± 0%               400k ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="^.+$"                   5.20M ± 0%       0.70M ± 0%              0.50M ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",j!="foo"              204k ± 0%        600k ± 0%               400k ± 0%
Querier/Block/labelValuesWithMatchers/i_with_n="1",i=~"^.*$",j!="foo"    304k ± 0%        600k ± 0%               400k ± 0%
Querier/Block/labelValuesWithMatchers/n_with_j!="foo"                   3.00M ± 0%       0.00M ± 0%              0.00M ± 0%
Querier/Block/labelValuesWithMatchers/n_with_i="1"                       61.0 ± 0%       247.0 ± 0%              129.0 ± 0%

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Apply comment suggestions from the code review

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* Change else { if } to else if

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Remove sorting of label values

We were not sorting them before, so no need to sort them now

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-12-28 15:59:03 +01:00
Ganesh Vernekar
6c7577177c
Merge pull request #10056 from charlesxsh/fix-TestWALRestoreCorrupted
fix potential goroutine leaks at TestWALRestoreCorrupted
2021-12-21 14:54:45 +05:30
beorn7
86cc83b13c storage: iterator fixes after merge
Signed-off-by: beorn7 <beorn@grafana.com>
2021-12-18 14:12:01 +01:00
beorn7
64c7bd2b08 Merge branch 'main' into sparsehistogram 2021-12-18 14:04:25 +01:00
Shihao Xia
3696d7dedb fix potential goroutine leaks
Signed-off-by: Shihao Xia <charlesxsh@hotmail.com>
2021-12-17 18:35:30 -05:00
Julien Pivotto
27343277fa
Merge release-2.32 forward into main (#10032)
* storage: expose bug in iterators #10027

Signed-off-by: beorn7 <beorn@grafana.com>

* storage: fix bug #10027 in iterators' Seek method

Signed-off-by: beorn7 <beorn@grafana.com>

* Append reporting metrics without limit

If reporting metrics fails due to reaching the limit, this makes the
target appear as UP in the UI, but the metrics are missing.

This commit bypasses that limit for report metrics.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Remove check against cfg so interval/ timeout are always set (#10023) (#10031)

Signed-off-by: Nicholas Blott <blottn@tcd.ie>

Co-authored-by: Nicholas Blott <blottn@tcd.ie>

* Cut v2.32.1

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Apply suggestions from code review

Signed-off-by: Julius Volz <julius.volz@gmail.com>

Co-authored-by: Levi Harrison <git@leviharrison.dev>

Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu>
Co-authored-by: Nicholas Blott <blottn@tcd.ie>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
Co-authored-by: Levi Harrison <git@leviharrison.dev>
2021-12-17 23:18:38 +01:00
beorn7
0ede6ae321 storage: fix bug #10027 in iterators' Seek method
Signed-off-by: beorn7 <beorn@grafana.com>
2021-12-16 12:07:35 +01:00
beorn7
6f33ab2b35 Merge branch 'main' into sparsehistogram 2021-12-15 13:49:33 +01:00
Ganesh Vernekar
227d155e3f
Fix queries after a failed snapshot replay (#9980)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-12-09 15:15:27 +05:30
Ganesh Vernekar
05d4d97bcd
Fix queries after a failed snapshot replay (#9980)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-12-08 15:32:14 +00:00
Nick Pillitteri
084bd70708
Convert atomic Int64 to native type when logging value (#9938)
The atomic Int64 type wasn't able to be represented when logging
via go-kit log and ended up as `"unsupported value type"`.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
2021-12-06 22:25:22 +01:00
beorn7
e8e9155a11 Merge branch 'main' into sparsehistogram 2021-11-30 18:22:37 +01:00
beorn7
e4e24453fa Merge branch 'main' into beorn7/merge2 2021-11-30 17:19:06 +01:00
Robert Fratto
4cbddb41eb
tsdb/agent: Synchronize appender code with grafana/agent main (#9664)
* tsdb/agent: synchronize appender code with grafana/agent main

This commit synchronize the appender code with grafana/agent main. This
includes adding support for appending exemplars.

Closes #9610

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: fix build error

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: introduce some exemplar tests, refactor tests a little

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: address review feedback

- Re-use hash when creating a new series
- Fix typo in exemplar append error

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: remove unused AddFast method

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: close wal reader after test

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: add out-of-order tracking, change series TS in commit

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* address review feedback

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2021-11-30 21:14:40 +05:30
Björn Rabenstein
b866db009b
storage: Fix and improve the Seek method of various iterators (#9878)
There was a subtle and nasty bug in listSeriesIterator.Seek.

In addition, the Seek call is defined to be a no-op if the current
position of the iterator is already pointing to a suitable
sample. This commit adds fast paths for this case to several
potentially expensive Seek calls.

Another bug was in concreteSeriesIterator.Seek. It always searched the
whole series and not from the current position of the iterator.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-29 15:17:56 +05:30
Björn Rabenstein
7e42acd3b1
tsdb: Rework iterators (#9877)
- Pick At... method via return value of Next/Seek.
- Do not clobber returned buckets.
- Add partial FloatHistogram suppert.

Note that the promql package is now _only_ dealing with
FloatHistograms, following the idea that PromQL only knows float
values.

As a byproduct, I have removed the histogramSeries metric. In my
understanding, series can have both float and histogram samples, so
that metric doesn't make sense anymore.

As another byproduct, I have converged the sampleBuf and the
histogramSampleBuf in memSeries into one. The sample type stored in
the sampleBuf has been extended to also contain histograms even before
this commit.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-29 13:24:23 +05:30
Ganesh Vernekar
26c0a433f5
Support appending different sample types to the same series (#9705)
* Support appending different sample types to the same series

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix build

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-26 17:43:27 +05:30
Bryan Boreham
1b74a3812e
Fix panic, out of order chunks, and race warning during WAL replay (#9856)
* Fix panic on WAL replay

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Refactor: introduce walSubsetProcessor

walSubsetProcessor packages up the `processWALSamples()` function and
its input and output channels, helping to clarify how these things
relate.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Refactor: extract more methods onto walSubsetProcessor

This makes the main logic easier to follow.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Fix race warning by locking processWALSamples

Although we have waited for the processor to finish, we still get a
warning from the race detector because it doesn't know how the different
parts relate.

Add a lock round each batch of samples, so the race detector can see
that we never access series owned by the processor outside of a lock.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Added test to reproduce issue 9859

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Remove redundant unit test

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix out of order chunks during WAL replay

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix nits

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
2021-11-25 13:36:14 +05:30
Oleg Zaytsev
5e746e4e88
Check postings bytes length when decoding (#9766)
Added validation to expected postings length compared to the bytes slice
length. With 32bit postings, we expect to have 4 bytes per each posting.
If the number doesn't add up, we know that the input data is not
compatible with our code (maybe it's cut, or padded with trash, or even
written in a different coded).

This is needed in downstream projects to correctly identify cached
postings written with an unknown codec, but it's also a good idea to
validate it here.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2021-11-24 15:26:37 +05:30
Darshan Chaudhary
9dcf8b2208
Add the ability to disable tsdb isolation (#9270)
* Disable isolation in isolation struct

Signed-off-by: darshanime <deathbullet@gmail.com>

* Run tsdb tests with isolation disabled

Signed-off-by: darshanime <deathbullet@gmail.com>

* Check for isolation disabled in isoState.Close()

Signed-off-by: darshanime <deathbullet@gmail.com>

* use t.Skip to skip isolation tests when disabled

Signed-off-by: darshanime <deathbullet@gmail.com>

* address review comments

Signed-off-by: darshanime <deathbullet@gmail.com>

* fix test for defaultIsolationState

Signed-off-by: darshanime <deathbullet@gmail.com>

* Change flag name. Set flag in DB. Do not init txRing. Close isoState.

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Test disabled isolation in CircleCI test_go

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Skip isolation related tests in db_test.go

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-19 15:41:32 +05:30
beorn7
5d4db805ac Merge branch 'main' into sparsehistogram 2021-11-17 19:57:31 +01:00
Dieter Plaetinck
067efc3725
clarify HeadChunkID type and usage (#9726)
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-11-17 18:35:10 +05:30
Sunil Thaha
a484a83d4a
fix: panic when checkpoint directory is empty (#9687)
Calling `wal.NewSegmentBufReader()` without any segments would cause a
`panic` resulting in prometheus crashing. This patch fixes the panic by
making segmentBufReader return a EOF if there are not segments.

This also means an empty checkpoint directory which should never be the
case unless it has been tampered with (or has issues due to the
underlying filesystem e.g. NFS) would be ignored by Prometheus and would
continue to run instead of the current behaviour which is to panic.

Fixes: https://github.com/prometheus/prometheus/issues/9605

Signed-off-by: Sunil Thaha <sthaha@redhat.com>
2021-11-17 16:39:04 +05:30
Dieter Plaetinck
0fac9bb859
Add basic initial developer docs for TSDB (#9451)
* Add basic initial developer docs for TSDB

There's a decent amount of content already out there (blog posts,
conference talks, etc), but:
* when they get stale, they don't tend to get updated
* they still leave me with questions that I'ld like to answer
  for developers (like me) who want to use, or work with, TSDB

What I propose is developer docs inside the prometheus
repository.  Easy to find and harness the power of the community
to expand it and keep it up to date.

* perfect is the enemy of good.  Let's have a base and incrementally improve
* Markdown docs should be broad but not too deep.  Source code comments
  can complement them, and are the ideal place for implementation details.

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* use example code that works out of the box

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Apply suggestions from code review

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* PR feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* more docs

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* PR feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Apply suggestions from code review

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Apply suggestions from code review

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Update tsdb/docs/usage.md

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* final tweaks

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* workaround docs versioning issue

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Move example code to real executable, testable example.

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* cleanup example test and make sure it always reproduces

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* obtain temp dir in a way that works with older Go versions

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* Fix Ganesh's comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-17 15:51:27 +05:30
beorn7
4c28d9fac7 Move to histogram.Histogram pointers
This is to avoid copying the many fields of a histogram.Histogram all
the time.

This also fixes a bunch of formerly broken tests.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-12 23:17:35 +01:00
Mauro Stettler
8a4f659126 fix error message
Signed-off-by: Mauro Stettler <mauro.stettler@gmail.com>
2021-11-12 21:45:46 +01:00
Robert Fratto
72a9f7fee9
Share TSDB locker code with agent (#9623)
* share tsdb db locker code with agent

Closes #9616

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* add flag to disable lockfile for agent

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* use agentOnlySetting instead of PreAction

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb: address review feedback

1. Rename Locker to DirLocker
2. Move DirLocker to tsdb/tsdbutil
3. Name metric using fmt.Sprintf
4. Refine error checking in DirLocker test

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb: create test utilities to assert expected DirLocker behavior

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/tsdbutil: fix lint errors

Signed-off-by: Robert Fratto <robertfratto@gmail.com>

* tsdb/agent: fix windows test failure

Use new DB variable instead of overriding the old one.

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2021-11-11 11:45:25 -05:00
beorn7
f1065e44a4 model: String method for histogram.Histogram
This includes a regular bucket iterator and a string method for
histogram.Bucket.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-11 17:29:22 +01:00
Peter Štibraný
422e7839d4
Add more size checks when writing individual sections in the index. (#9710)
* Add more size checks when writing individual sections in the index.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>

* Use uint and add comment about it.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
2021-11-11 15:44:28 +05:30
Mateusz Gozdek
83086aee00 tsdb/agent: use unique registry per tests
So tests can run in parallel.

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-11 01:37:24 +01:00
Mateusz Gozdek
2f312ff4c5 tsdb: mark TestTombstoneCleanRetentionLimitsRace test as slow
It takes over 100 seconds to execute this test, so I'd consider it as
slow.

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-11 01:37:24 +01:00
Levi Harrison
7400e07fa9
Close DB in Agent tests (#9630)
* Close agent db in tests

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Close first DB before opening second

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Use seperate variables for different DBs?

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Close remote storage

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Fix closing of stuff

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Remove the build flags after a rebase

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix closing of stuff 2

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-11-10 20:05:54 +05:30
Mateusz Gozdek
f4650c27e7 tsdb/wal: fix flaky TestReaderFuzz* tests
It seems sometimes you can get error like:

                Error:          Not equal:
                                expected: []byte(nil)
                                actual  : []byte{}

                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1,2 +1,3 @@
                                -([]uint8) <nil>
                                +([]uint8) {
                                +}

This commit does what bytes.Equal does to silence those differences. I'm
not sure if this is a correct solution or just covering up the actual bug.

Closes #9574

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-09 14:32:20 +01:00
Björn Rabenstein
4c56a193c5
Merge pull request #9478 from prometheus/beorn7/pkg-deprecation
Move packages out of deprecated pkg directory
2021-11-09 11:09:16 +01:00
曹明
a0d31c28fc tsdb: Add windows arm64 support.
Signed-off-by: 曹明 <caoming1@kingsoft.com>
2021-11-09 11:07:27 +01:00
Mateusz Gozdek
b319b14431
tsdb/chunks: preallocate at least some space on non-Windows systems (#9581)
To avoid potential chunk corruption read, which I am not sure why is
happening.

Closes #9561.

Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
2021-11-09 13:47:00 +05:30
beorn7
c954cd9d1d Move packages out of deprecated pkg directory
This creates a new `model` directory and moves all data-model related
packages over there:
  exemplar labels relabel rulefmt textparse timestamp value

All the others are more or less utilities and have been moved to `util`:
  gate logging modetimevfs pool runtime

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-09 08:03:10 +01:00
beorn7
a1e595edac Fix two trivial lint warnings
Not sure why those show up for me locally but not if run by the CI.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-08 22:32:13 +01:00
beorn7
8f92c90897 Add TODOs and some minor tweaks
Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-07 17:12:04 +01:00
Dieter Plaetinck
cda025b5b5
TSDB: demistify SeriesRefs and ChunkRefs (#9536)
* TSDB: demistify seriesRefs and ChunkRefs

The TSDB package contains many types of series and chunk references,
all shrouded in uint types.  Often the same uint value may
actually mean one of different types, in non-obvious ways.

This PR aims to clarify the code and help navigating to relevant docs,
usage, etc much quicker.

Concretely:

* Use appropriately named types and document their semantics and
  relations.
* Make multiplexing and demuxing of types explicit
  (on the boundaries between concrete implementations and generic
  interfaces).
* Casting between different types should be free.  None of the changes
  should have any impact on how the code runs.

TODO: Implement BlockSeriesRef where appropriate (for a future PR)

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* feedback

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* agent: demistify seriesRefs and ChunkRefs

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-11-06 15:40:04 +05:30
johncming
b882d2b7c7
tsdb/wal: Avoid writing closed channel. (#9566)
Signed-off-by: johncming <johncming@yahoo.com>
2021-11-06 15:11:06 +05:30