Commit Graph

171 Commits

Author SHA1 Message Date
Ganesh Vernekar 30b2592fb5
Avoid empty mmap files by using .tmp files to write headers
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-07-24 14:32:26 +05:30
Krasi Georgiev 9a75b5f84b
Avoid panic when the headChunk is nil during isolation.
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
2020-07-24 14:32:17 +05:30
Ganesh Vernekar 48fae12b89
Fix unsequential m-map files (#7414)
* Fix unsequential m-map files

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-06-18 19:24:58 +05:30
Ganesh Vernekar 1627d234da
Moves the atomically accessed member to the top of the struct (#7365)
* Moves the 64bit atomically accessed field to the top of the struct.

Signed-off-by: Bryan Varner <1652015+bvarner@users.noreply.github.com>

* Moves the 64bit atomically accessed field to the top of the struct.

Signed-off-by: Bryan Varner <1652015+bvarner@users.noreply.github.com>

* Fixing up go fmt formatting issues.

Signed-off-by: Bryan Varner <1652015+bvarner@users.noreply.github.com>

Co-authored-by: Bryan Varner <1652015+bvarner@users.noreply.github.com>
2020-06-09 10:55:43 +05:30
Peter Štibraný ff80690a6e
Optimise lowWatermark in Isolation (#7332)
* Track open appenders in doubly-linked list to make lowWatermark O(1).
* Use RW locks.
* Added BenchmarkIsolationWithState.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
2020-06-03 20:09:05 +02:00
Jess G fdc49fae5b
Added time range parameters to labelNames API (#7288)
* add time range params to labelNames api

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* evaluate min/max time range when reading labels from the head

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* add time range params to labelValues api

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* fix test, add docs

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* add a test for head min max range

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* fix test to match comment

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* address CR comments

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* combine vars only used once

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* add time range params to labelNames api

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* evaluate min/max time range when reading labels from the head

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* add time range params to labelValues api

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* fix test, add docs

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* add a test for head min max range

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* fix test to match comment

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* address CR comments

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* combine vars only used once

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* fix test

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* restart ci

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* use range expectedLabelNames instead of range actualLabelNames in test

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
2020-05-30 13:50:09 +01:00
Ganesh Vernekar a1355eb7c7
Remove time based m-map file creation (#7314)
* Remove time based m-map file creation

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-05-29 20:08:41 +05:30
Ganesh Vernekar 83619aa9ac
Preallocate m-map file only for Windows (#7306)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-05-28 20:24:19 +05:30
Guangwen Feng 2393d6137b
Add unit test case for func Type in record.go (#7082)
Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>
2020-05-27 12:08:33 +05:30
Krasimir Georgiev f4dd45609a
Use min and maxt of the range head when creating a block (#7282)
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
2020-05-22 17:00:06 +05:30
Krasimir Georgiev 09df8d94e0
More explicit chunks and head error handling. (#7277) 2020-05-22 12:03:23 +03:00
Ganesh Vernekar 1c99adb9fd
Callbacks for lifecycle of series in TSDB (#7159)
* Callbacks for lifecycle of series in TSDB

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add more comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-05-20 18:52:08 +05:30
Ganesh Vernekar d4b9fe801f
M-map full chunks of Head from disk (#6679)
When appending to the head and a chunk is full it is flushed to the disk and m-mapped (memory mapped) to free up memory

Prom startup now happens in these stages
 - Iterate the m-maped chunks from disk and keep a map of series reference to its slice of mmapped chunks.
- Iterate the WAL as usual. Whenever we create a new series, look for it's mmapped chunks in the map created before and add it to that series.

If a head chunk is corrupted the currpted one and all chunks after that are deleted and the data after the corruption is recovered from the existing WAL which means that a corruption in m-mapped files results in NO data loss.

[Mmaped chunks format](https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/head_chunks.md)  - main difference is that the chunk for mmaping now also includes series reference because there is no index for mapping series to chunks.
[The block chunks](https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/chunks.md) are accessed from the index which includes the offsets for the chunks in the chunks file - example - chunks of series ID have offsets 200, 500 etc in the chunk files.
In case of mmaped chunks, the offsets are stored in memory and accessed from that. During WAL replay, these offsets are restored by iterating all m-mapped chunks as stated above by matching the series id present in the chunk header and offset of that chunk in that file.

**Prombench results**

_WAL Replay_

1h Wal reply time
30% less wal reply time - 4m31 vs 3m36
2h Wal reply time
20% less wal reply time - 8m16 vs 7m

_Memory During WAL Replay_

High Churn:
10-15% less RAM -  32gb vs 28gb
20% less RAM after compaction 34gb vs 27gb
No Churn:
20-30% less RAM -  23gb vs 18gb
40% less RAM after compaction 32.5gb vs 20gb

Screenshots are in [this comment](https://github.com/prometheus/prometheus/pull/6679#issuecomment-621678932)


Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-05-06 21:00:00 +05:30
Bartlomiej Plotka 532f7bbac9
Merge pull request #7204 from prometheus/release-2.18
[Merge Without Squash] Merge release-2.18 back to master.
2020-05-05 18:58:45 +01:00
Ben Ye 1e4e37144d
Fixed wrongly handled not ready TSDB on web and API. (#7182)
* fix federate endpoint panic

Signed-off-by: yeya24 <yb532204897@gmail.com>

* Fixed all cases of not ready TSDB being wrongly handled.

* Fixed issue for federation.
* Ensured this will never happen again thanks to interfaces
* Fixes same issue for stats.
* Added tests for readiness.
* Fixed bug in stats. It was:
   status.MaxTime = db.Head().MaxTime()
   status.MinTime = db.Head().MaxTime()


Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-04-29 17:16:14 +01:00
ga 05038b48bd
Goroutine: Fix ambiguous variable (#7175)
Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>
2020-04-28 11:02:26 +01:00
Goutham Veeramachaneni 84b4d079c8
Make sure deleted intervals are excluded from Seek (#6980)
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
2020-04-23 10:00:30 +01:00
Julien Pivotto fc3fb3265a
Merge pull request #7145 from prometheus/release-2.17
Backport release 2.17 into master
2020-04-20 14:08:12 +02:00
Julien Pivotto ed1852ab95
TSDB: Isolation: avoid creating appenderId's without appender (#7135)
Prior to this commit we could have situations where we are creating an
appenderId but never creating an appender to go with it, therefore
blocking the low watermak.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-04-17 20:51:03 +02:00
ZouYu 2b7437d60e
Fix some warnings: 'redundant type from array, slice, or map composite literal' (#7109)
Signed-off-by: ZouYu <zouy.fnst@cn.fujitsu.com>
2020-04-15 11:17:41 +01:00
Marek Slabicki 8224ddec23
Capitalizing first letter of all log lines (#7043)
Signed-off-by: Marek Slabicki <thaniri@gmail.com>
2020-04-11 09:22:18 +01:00
Brian Brazil cd73b3d33e
Reduce how much old WAL we keep around. (#7098)
Previously we were keeping up to around 6 hours of WAL around by
removing 1/3 every hours. This was excessive, so switch to removing 2/3
which will up to around 3 hours of WAL around.

This will roughly halve the size of the WAL and halve startup time for
those who are I/O bound. This may increase the checkpoint size for
those with certain churn patterns, but by much less than we're saving
from the segments.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-04-07 15:55:57 +05:30
Brad Walker 3348930df5
Replace fileutil.ReadDir with ioutil.ReadDir (#7029) (#7033)
* tsdb: Replace fileutil.ReadDir with ioutil.ReadDir (#7029)

Signed-off-by: Brad Walker <brad@bradmwalker.com>

* tsdb: Remove fileutil.ReadDir (#7029)

Signed-off-by: Brad Walker <brad@bradmwalker.com>
2020-04-06 19:04:20 +05:30
MengZeLee a7982ffc0f
Fix typo (#7068)
Fix typo.

Signed-off-by: MengZn <adnt587@gmail.com>
2020-03-30 13:18:34 +05:30
Brian Brazil 7646cbca32
Use .UTC everywhere we use time.Unix (#7066)
time.Unix attaches the local timezone, which can then
leak out (e.g. in the alert json). While this is harmless,
we should be consistent.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-03-29 17:35:39 +01:00
Julien Pivotto 9057decce2
Merge pull request #7060 from prometheus/release-2.17
Release 2.17
2020-03-27 15:57:07 +01:00
Julien Pivotto ceef10cee4 Reset comment
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-26 00:17:56 +01:00
Julien Pivotto 73228b1b68 Those links should not be reverted
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-25 20:37:26 +01:00
Julien Pivotto 653f343547 Revert head posting optimization
This reverts commit 52630ad0c7.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-25 20:19:33 +01:00
Bartlomiej Plotka d5c33877f9
storage: Added Chunks{Queryable/Querier/SeriesSet/Series/Iteratable. Added generic Merge{SeriesSet/Querier} implementation. (#7005)
* storage: Added Chunks{Queryable/Querier/SeriesSet/Series/Iteratable. Added generic Merge{SeriesSet/Querier} implementation.

## Rationales:

In many places (e.g. chunk Remote read, Thanos Receive fetching chunk from TSDB), we operate on encoded chunks not samples.
This means that we unnecessary decode/encode, wasting CPU, time and memory.
This PR adds chunk iterator interfaces and makes the merge code to be reused between both seriesSets

I will make the use of it in following PR inside tsdb itself. For now fanout implements it and mergers.

All merges now also allows passing series mergers. This opens doors for custom deduplications other than TSDB vertical ones (e.g. offline one we have in Thanos).

## Changes

* Added Chunk versions of all iterating methods. It all starts in Querier/ChunkQuerier. The plan is that
Storage will implement both chunked and samples.
* Added Seek to chunks.Iterator interface for iterating over chunks.
* NewMergeChunkQuerier was added; Both this and NewMergeQuerier are now using generigMergeQuerier to share the code. Generic code was added.
* Improved tests.
* Added some TODO for further simplifications in next PRs.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Moved s/Labeled/SeriesLabels as per Krasi suggestion.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Krasi's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Second iteration of Krasi comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Another round of comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-03-24 20:15:47 +00:00
Ben Kochie fac7a4a050
Merge pull request #7037 from prometheus/bjk/golint
Enable golint in CI
2020-03-24 09:20:08 +01:00
Ben Kochie 269e7c8091
Fix golint issues.
Signed-off-by: Ben Kochie <superq@gmail.com>
2020-03-23 20:38:43 +01:00
Ganesh Vernekar 6fdc852813
Fix TestHeadDeleteSimple to test reloaded Head too (#7021)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-03-23 16:55:25 +02:00
Ganesh Vernekar e64a149984
Close Head in DBReadOnly.FlushWAL (#7022)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-03-23 14:49:44 +05:30
zhulongcheng e813f60fd6
tsdb: fix sequence check for WAL segments (#7032)
Signed-off-by: zhulongcheng <zhulongcheng.dev@gmail.com>
2020-03-23 13:16:28 +05:30
zhulongcheng dbb8f5861d
tsdb: add tombstonesHeaderSize constant (#7028)
Signed-off-by: zhulongcheng <zhulongcheng.dev@gmail.com>
2020-03-22 12:59:35 +05:30
Julien Pivotto f1984bb007
Merge pull request #7025 from prometheus/release-2.17
Merge release 2.17 into master
2020-03-21 21:39:07 +01:00
beorn7 526cff39b9 Fix tests that were broken by #7009
Signed-off-by: beorn7 <beorn@grafana.com>
2020-03-20 21:22:58 +01:00
Bartlomiej Plotka c4eefd1b3a storage: Removed SelectSorted method; Simplified interface; Added requirement for remote read to sort response.
This is technically BREAKING CHANGE, but it was like this from the beginning: I just notice that we rely in
Prometheus on remote read being sorted. This is because we use selected data from remote reads in MergeSeriesSet
which rely on sorting.

I found during work on https://github.com/prometheus/prometheus/pull/5882 that
we do so many repetitions because of this, for not good reason. I think
I found a good balance between convenience and readability with just one method.
Smaller the interface = better.

Also I don't know what TestSelectSorted was testing, but now it's testing sorting.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-03-20 21:14:43 +01:00
Callum Styan f802f1e8ca
Fix bug with WAL watcher and Live Reader metrics usage. (#6998)
* Fix bug with WAL watcher and Live Reader metrics usage.

Calling NewXMetrics when creating a Watcher or LiveReader results in a
registration error, which we're ignoring, and as a result other than the
first Watcher/Reader created, we had no metrics for either. So we would
only have metrics like Watcher Records Read for the first remote write
config in a users config file.

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2020-03-20 17:34:15 +01:00
Bartlomiej Plotka 8fa4ada9ae
Merge pull request #7010 from prometheus/beorn7/fix-test
Fix tests that were broken by #7009
2020-03-19 17:41:02 +00:00
Ganesh Vernekar e50fdbc70c
Live m-mapping of chunks on disk (#6830)
* Live m-mapping of chunks on disk

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments Part 2

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments Part 3

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments Part 4

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Attempt to fix windows bug

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-03-19 22:03:44 +05:30
beorn7 c0ecbb38af Fix tests that were broken by #7009
Signed-off-by: beorn7 <beorn@grafana.com>
2020-03-19 16:28:23 +01:00
Björn Rabenstein 1da83305be
Merge pull request #7009 from prometheus/release-2.17
Merge release-2.17 into master
2020-03-19 13:46:28 +01:00
zhulongcheng 5f5c7a4477
tsdb: sort checkpoints by segment number (#6987)
Signed-off-by: zhulongcheng <zhulongcheng.dev@gmail.com>
2020-03-18 20:40:41 +05:30
Julien Pivotto 8907ba6235 Make TSDB use storage errors
This fixes #6992, which was introduced by #6777. There was an
intermediate component which translated TSDB errors into storage errors,
but that component was deleted and this bug went unnoticed, until we
were watching at the Prombench results. Without this, scrape will fail
instead of dropping samples or using "Add" when the series have been
garbage collected.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-17 22:24:25 +01:00
Julien Pivotto 0f9e78bd88
tsdb: fix races around head chunks (#6985)
* tsdb: fix races around head chunks

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-16 13:59:22 +01:00
Björn Rabenstein d80b0810c1
Move crucial actions to defer (#6918)
With defer having less of a performance penalty, there is no reason
not to do those crucial operations via defer.

Context: With isolation in place, if we forget to Commit/Rollback, the
low watermark will get stuck forever.

The current code should not have any bugs, but moving to defer helps
to avoid future bugs.

This is also moving the `closeAppend` in the `Commit` implementation
itself to defer. If logging to the WAL fails, we would have missed the
`closeAppend`.

Signed-off-by: beorn7 <beorn@grafana.com>
2020-03-13 20:54:47 +01:00
Bartlomiej Plotka 9d9c45588e Addressed Goutham's review.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-03-13 19:18:31 +00:00
Bartlomiej Plotka cd9516316a Addressed Brian's comments.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-03-13 14:37:47 +00:00