Commit Graph

271 Commits

Author SHA1 Message Date
Bartlomiej Plotka
4ae2ef94e0
tsdb: Delete blocks atomically; Remove tmp blocks on start; Added test. (#7772)
## Changes:

* Rename dir when deleting
* Ignoring blocks with broken meta.json on start (we do that on reload)
* Compactor writes <ulid>.tmp-for-creation blocks instead of just .tmp
* Delete tmp-for-creation and tmp-for-deletion blocks during DB open.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-08-11 06:56:08 +01:00
Zhou Hao
40ace418d1
fix misspell (#7764)
Signed-off-by: Zhou Hao <zhouhao@cn.fujitsu.com>
2020-08-07 08:57:25 +01:00
Frederic Branczyk
e0cf219f0d
tsdb: Save allocations on labels by re-using label array
Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>
2020-08-05 10:27:14 +02:00
Robert-André Mauchin
ed6ce7ac98
Convert int to string using rune() (#7707)
See https://github.com/golang/go/issues/32479

Fix #7706.

Signed-off-by: Robert-André Mauchin <zebob.m@gmail.com>
2020-08-03 15:10:04 +01:00
Bartlomiej Plotka
28c5cfaf0d
tsdb: Moved code merge series and iterators to differen files; cleanup. No functional changes just move! (#7714)
I did not want to move those in previous PR to make it easier to review. Now small cleanup time for readability. (:

## Changes

* Merge series goes to `storage/merge.go` leaving `fanout.go` for just fanout code.
* Moved `fanout test` code from weird separate package to storage.
* Unskiped one test: TestFanout_SelectSorted/chunk_querier
* Moved block series set codes responsible for querying blocks to `querier.go` from `compact.go`



Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-08-03 11:32:56 +01:00
johncming
ac677ed8b3
promql: delete redundant return value. (#7721)
Signed-off-by: johncming <johncming@yahoo.com>
2020-08-03 10:45:53 +01:00
Julien Pivotto
30e079bbd5
TSDB: Fix master tests (#7705)
Now appenders take a context.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-31 17:33:54 +02:00
Bartlomiej Plotka
e6d7cc5fa4
tsdb: Added ChunkQueryable implementations to db; unified MergeSeriesSets and vertical to single struct. (#7069)
* tsdb: Added ChunkQueryable implementations to db; unified compactor, querier and fanout block iterating.

Chained to https://github.com/prometheus/prometheus/pull/7059

* NewMerge(Chunk)Querier now takies multiple primaries allowing tsdb DB code to use it.
* Added single SeriesEntry / ChunkEntry for all series implementations.
* Unified all vertical, and non vertical for compact and querying to single
merge series / chunk sets by reusing VerticalSeriesMergeFunc for overlapping algorithm (same logic as before)
* Added block (Base/Chunk/)Querier for block querying. We then use populateAndTomb(Base/Chunk/) to iterate over chunks or samples.
* Refactored endpoint tests and querier tests to include subtests.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed comments from Brian and Beorn.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed snapshot test and added chunk iterator support for DBReadOnly.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed race when iterating over Ats first.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed tests.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed populate block tests.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed endpoints test.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed test.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Added test & fixed case of head open chunk.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed DBReadOnly tests and bug producing 1 sample chunks.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Added cases for partial block overlap for multiple full chunks.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Added extra tests for chunk meta after compaction.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed small vertical merge bug and added more tests for that.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-07-31 16:03:02 +01:00
Annanay
48b9afd14b Address comments
Signed-off-by: Annanay <annanayagarwal@gmail.com>
2020-07-30 17:25:51 +05:30
Annanay
0b4d448d29 Fix tests
Signed-off-by: Annanay <annanayagarwal@gmail.com>
2020-07-30 17:18:47 +05:30
Annanay
263d2aa5f5 Fix failing tests
Signed-off-by: Annanay <annanayagarwal@gmail.com>
2020-07-30 17:06:56 +05:30
Annanay
9bba8a6eae Merge branch 'master' into appender-context
Signed-off-by: Annanay <annanayagarwal@gmail.com>
2020-07-30 16:43:18 +05:30
Annanay
89129cd39a Address comments
Signed-off-by: Annanay <annanayagarwal@gmail.com>
2020-07-30 16:41:13 +05:30
Javier Palomo Almena
b58a613443
Replace sync/atomic with uber-go/atomic (#7683)
* storage: Replace usage of sync/atomic with uber-go/atomic

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* tsdb: Replace usage of sync/atomic with uber-go/atomic

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* web: Replace usage of sync/atomic with uber-go/atomic

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* notifier: Replace usage of sync/atomic with uber-go/atomic

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* cmd: Replace usage of sync/atomic with uber-go/atomic

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* scripts: Verify that we are not using restricted packages

It checks that we are not directly importing 'sync/atomic'.

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* Reorganise imports in blocks

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* notifier/test: Apply PR suggestions

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* storage/remote: avoid storing references on newEntry

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* Revert "scripts: Verify that we are not using restricted packages"

This reverts commit 278d32748e.

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* web: Group imports accordingly

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
2020-07-30 13:15:42 +05:30
Javier Palomo Almena
348ff4285f
tsdb: Replace sync/atomic with uber-go/atomic in tsdb (#7659)
* tsdb/chunks: Replace sync/atomic with uber-go/atomic

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* tsdb/heaad: Replace sync/atomic with uber-go/atomic

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* vendor: Make go.uber.org/atomic a direct dependency

There is no modifications to go.sum and vendor/ because
it was already vendored.

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>

* tsdb: Remove comments referring to the sync/atomic alignment bug

Related: https://golang.org/pkg/sync/atomic/#pkg-note-BUG

Signed-off-by: Javier Palomo <javier.palomo.almena@gmail.com>
2020-07-28 10:12:42 +05:30
Annanay
f40e4579b7 gofmt
Signed-off-by: Annanay <annanayagarwal@gmail.com>
2020-07-24 20:40:19 +05:30
Annanay
7f98a744e5 Add context to Appender interface
Signed-off-by: Annanay <annanayagarwal@gmail.com>
2020-07-24 19:40:51 +05:30
Ben Ye
50c261502e
add tsdb cmds into promtool (#6088)
Signed-off-by: yeya24 <yb532204897@gmail.com>

update tsdb cli in makefile and promu

Signed-off-by: yeya24 <yb532204897@gmail.com>

remove building tsdb bin

Signed-off-by: yeya24 <yb532204897@gmail.com>

remove useless func

Signed-off-by: yeya24 <yb532204897@gmail.com>

refactor analyzeBlock

Signed-off-by: yeya24 <yb532204897@gmail.com>

Fix Makefile

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-07-23 19:35:50 +01:00
johncming
9801f52b0a
tsdb/chunks: fix bug of data race(#7643). (#7646)
Signed-off-by: johncming <johncming@yahoo.com>
2020-07-23 18:05:19 +05:30
Ganesh Vernekar
4a8531a64b
BlocksToDelete function in DB options (#7638)
* Optional retention filter for DB

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Specify len for the map creation

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-07-22 20:49:33 +05:30
Julien Pivotto
ffc925dd21
TSDB: Error when we commit/rollback twice (#7593)
* TSDB: Error when we commit/rollback twice

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-22 11:57:38 +02:00
Julien Pivotto
cfe30a7b62
TSDB: Use t.Cleanup to delete temporary files (#7620)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-21 10:39:02 +02:00
Julien Pivotto
62805b2fe9
tsdb: test for leaks (#7566)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-07-21 10:08:06 +02:00
Krasimir Georgiev
ccab2b30c9
Test no panic after a WAL corruption (#7625)
* no panic the head memseries has chunks in it

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>

* fix a panic when querying after a wal corruption.

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>

* review nits

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>

* Add test for reading the data after a wal corruption.

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>

Update tsdb/db_test.go

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

Update tsdb/db_test.go

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>

* spellings

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2020-07-21 12:32:13 +05:30
Julien Pivotto
9b8cc663f7
Merge pull request #7623 from prometheus/release-2.20
Release 2.20
2020-07-20 19:16:06 +02:00
Krasi Georgiev
d30492cbb0 Avoid panic when the headChunk is nil during isolation.
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
2020-07-20 18:23:18 +03:00
Zhou Hao
ddedf454d0
add os.RemoveAll err verification (#7540)
* add os.RemoveAll err verification for watcher_test

Signed-off-by: Zhou Hao <zhouhao@cn.fujitsu.com>

* add os.RemoveAll err verification for db_test

Signed-off-by: Zhou Hao <zhouhao@cn.fujitsu.com>

* add os.RemoveAll err verification for write_test

Signed-off-by: Zhou Hao <zhouhao@cn.fujitsu.com>

* add os.RemoveAll err verification for queue_manager_test

Signed-off-by: Zhou Hao <zhouhao@cn.fujitsu.com>

* tsdb/wal/watcher_test: add close operation before delete

Signed-off-by: Zhou Hao <zhouhao@cn.fujitsu.com>
2020-07-17 11:47:32 +05:30
Ganesh Vernekar
1760c7474c
Replay m-map chunks irrespective of WAL (#7589)
* Replay m-map chunks irrespective of WAL

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* More logs

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-07-16 18:34:08 +05:30
Björn Rabenstein
e0067a7bd8
Merge pull request #7573 from codesome/mmap-empty-files
Avoid empty mmap files by using .tmp files to write headers
2020-07-16 12:13:34 +02:00
Ganesh Vernekar
b8a7e80f9b
Fix review comments
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-07-16 12:43:27 +05:30
Ganesh Vernekar
ea013343ca
Log when starting to create a checkpoint (#7581)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-07-15 19:15:37 +05:30
Ganesh Vernekar
7a763ff61e
Avoid empty mmap files by using .tmp files to write headers
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-07-14 14:59:28 +05:30
Bartlomiej Plotka
823b218e1b
Fixed race between compact (gc, populate) and head append causing unknown symbol error. (#7560)
* Fixed race between compact (gc, populate) and head append causing unknown symbol error.

Fixes https://github.com/prometheus/prometheus/issues/7373

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-07-14 09:36:22 +01:00
Bartlomiej Plotka
492061b24c
Revert "Fix unknown symbol error during head compaction (#7526)" (#7556)
This reverts commit 30505a202a.
2020-07-11 22:37:16 +05:30
Ganesh Vernekar
30505a202a
Fix unknown symbol error during head compaction (#7526)
* Fix race during head compaction

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Comment out the test

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Skip test instead of commenting it out

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-07-07 17:29:09 +05:30
Marco Pracucci
2f6bf7de4c
Optimise labels regex matchers containing a literal within the pattern (#7503)
* Added labels matchers regex fast path for literals within the regex

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-07-07 09:38:04 +01:00
Harkishen Singh
f32307b656
Increments WAL corruption metric on WAL corruption during checkpointing (#7491)
* Increments wal corruption metric on error during checkpointing

Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>

* check for wal corruption error

Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>
2020-07-05 11:25:42 +05:30
Ganesh Vernekar
e65e2e0dac
Fix panic from db metrics (#7501)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-07-05 10:11:42 +05:30
Bartlomiej Plotka
1861bf38f5
tombstones: Fixed Add method in order to support trimming time series; Simplified the algo. (#7471)
* tombstones: Fixed Add method in order to support edge trimming; Simplified the algo.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Removed duplicated test case.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fixed comment, removed "edge" mention.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Removed trimming word.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-06-29 17:00:22 +01:00
Marco Pracucci
cef4dd6fff
Optimized label regex matcher with literal prefix and/or suffix (#7453)
* Optimized label regex matcher with literal prefix and/or suffix

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added license

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added more tests cases with newlines

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Restored deleted test

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-06-26 15:19:09 +05:30
Ganesh Vernekar
082c17b691
Introduce SortedLabelValues/LabelValues to speedup queries for high cardinality (#7448)
* Introduce LabelValuesUnsorted to speedup queries for high cardinality

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add sort check

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-06-25 14:10:29 +01:00
Bartlomiej Plotka
b788986717
storage: Adjusted fully storage layer support for chunk iterators: Remote read client, readyStorage, fanout. (#7059)
* Fixed nits introduced by https://github.com/prometheus/prometheus/pull/7334
* Added ChunkQueryable implementation to fanout and readyStorage.
* Added more comments.
* Changed NewVerticalChunkSeriesMerger to CompactingChunkSeriesMerger, removed tiny interface by reusing VerticalSeriesMergeFunc for overlapping algorithm for
both chunks and series, for both querying and compacting (!) + made sure duplicates are merged.
* Added ErrChunkSeriesSet
* Added Samples interface for seamless []promb.Sample to []tsdbutil.Sample conversion.
* Deprecating non chunks serieset based StreamChunkedReadResponses, added chunk one.
* Improved tests.
* Split remote client into Write (old storage) and read.
* Queryable client is now SampleAndChunkQueryable. Since we cannot use nice QueryableFunc I moved
all config based options to sampleAndChunkQueryableClient to aboid boilerplate.

In next commit: Changes for TSDB.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-06-24 14:41:52 +01:00
Joe Lei
74a73ba1cf
fix analyze limit not work expected (#7430)
Signed-off-by: joelei <thezero12@hotmail.com>
2020-06-22 10:38:10 +01:00
Ganesh Vernekar
b7c46a8c79
Merge remote-tracking branch 'upstream/master' into merge-release-2.19
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-06-19 12:40:29 +05:30
Ganesh Vernekar
48fae12b89
Fix unsequential m-map files (#7414)
* Fix unsequential m-map files

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-06-18 19:24:58 +05:30
Marco Pracucci
3b529ddbce
Cleanup bstream_test.go based on post-merge feedback received on #7390 (#7413)
* Fixed bstream test license

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Simplified bstreamReader.loadNextBuffer()

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed date in license

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-06-18 14:49:39 +05:30
Simon Pasquier
d634785944
tsdb/docs: fix head chunks directory + link from README (#7309)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-17 20:38:21 +05:30
Simon Pasquier
2f12049371
tsdb: improve logs when encountering corruption (#7308)
* tsdb: improve logs when encountering corruption

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Wrap corrupted block errors

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Add file path to head chunks

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2020-06-17 16:40:00 +02:00
Marco Pracucci
f42ed03dc5
Optimized bstream reader used by XORChunk iterator (#7390)
* Optimized bstream reader

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed linter

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added license to new file

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed type cast

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Changed comments

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Improved comments and rolledback no-op changes

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed race condition

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-06-15 16:44:40 +01:00
Julien Pivotto
f893786153
Fix TSDB test failure (#7394)
PR #7338 was not rebased on top of master and interface had changed.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-06-14 22:07:23 +05:30
Krasimir Georgiev
ab6203b7c7
add head compaction test (#7338) 2020-06-12 13:29:26 +03:00
Ganesh Vernekar
9593b64ce6
Merge branch 'master' into to-merge-release-2.19 2020-06-10 20:01:25 +05:30
Kemal Akkoyun
66dfb951c4
*: Consistent Error/Warning handling for SeriesSet iterator: Allowing Async Select (#7251)
* Add errors and Warnings to SeriesSet

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Change Querier interface and refactor accordingly

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Refactor promql/engine to propagate warnings at eval stage

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Address review issues

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Make sure all the series from all Selects are pre-advanced

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Address review issues

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Separate merge series sets

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Clean

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Refactor merge querier failure handling

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Refactored and simplified fanout with improvements from incoming chunk iterator PRs.

* Secondary logic is hidden, instead of weird failed series set logic we had.
* Fanout is well commented
* Fanout closing record all errors
* MergeQuerier improved API (clearer)
* deferredGenericMergeSeriesSet is not needed as we return no samples anyway for failed series sets (next = false).

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fix formatting

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Fix CI issues

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Added final tests for error handling.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

* Moved hints in populate to be allocated only when needed.
* Used sync.Once in secondary Querier to achieve all-or-nothing partial response logic.
* Select after first Next is done will panic.

NOTE: in lazySeriesSet in theory we could just panic, I think however we can
totally just return error, it will panic in expand anyway.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Utilize errWithWarnings

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Fix recently introduced expansion issue

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Add tests for secondary querier error handling

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Implement lazy merge

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Add name to test cases

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Reorganize

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Address review comments

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Address review comments

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Remove redundant warnings

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Fix rebase mistake

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-06-09 17:57:31 +01:00
Ganesh Vernekar
1627d234da
Moves the atomically accessed member to the top of the struct (#7365)
* Moves the 64bit atomically accessed field to the top of the struct.

Signed-off-by: Bryan Varner <1652015+bvarner@users.noreply.github.com>

* Moves the 64bit atomically accessed field to the top of the struct.

Signed-off-by: Bryan Varner <1652015+bvarner@users.noreply.github.com>

* Fixing up go fmt formatting issues.

Signed-off-by: Bryan Varner <1652015+bvarner@users.noreply.github.com>

Co-authored-by: Bryan Varner <1652015+bvarner@users.noreply.github.com>
2020-06-09 10:55:43 +05:30
Peter Štibraný
ff80690a6e
Optimise lowWatermark in Isolation (#7332)
* Track open appenders in doubly-linked list to make lowWatermark O(1).
* Use RW locks.
* Added BenchmarkIsolationWithState.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
2020-06-03 20:09:05 +02:00
Jess G
fdc49fae5b
Added time range parameters to labelNames API (#7288)
* add time range params to labelNames api

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* evaluate min/max time range when reading labels from the head

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* add time range params to labelValues api

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* fix test, add docs

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* add a test for head min max range

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* fix test to match comment

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* address CR comments

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* combine vars only used once

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* add time range params to labelNames api

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* evaluate min/max time range when reading labels from the head

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* add time range params to labelValues api

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* fix test, add docs

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* add a test for head min max range

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* fix test to match comment

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* address CR comments

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* combine vars only used once

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* fix test

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* restart ci

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>

* use range expectedLabelNames instead of range actualLabelNames in test

Signed-off-by: jessicagreben <Jessica.greben1+github@gmail.com>
2020-05-30 13:50:09 +01:00
Ganesh Vernekar
a1355eb7c7
Remove time based m-map file creation (#7314)
* Remove time based m-map file creation

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-05-29 20:08:41 +05:30
Ganesh Vernekar
83619aa9ac
Preallocate m-map file only for Windows (#7306)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-05-28 20:24:19 +05:30
Guangwen Feng
2393d6137b
Add unit test case for func Type in record.go (#7082)
Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>
2020-05-27 12:08:33 +05:30
Krasimir Georgiev
f4dd45609a
Use min and maxt of the range head when creating a block (#7282)
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
2020-05-22 17:00:06 +05:30
Krasimir Georgiev
09df8d94e0
More explicit chunks and head error handling. (#7277) 2020-05-22 12:03:23 +03:00
Ganesh Vernekar
1c99adb9fd
Callbacks for lifecycle of series in TSDB (#7159)
* Callbacks for lifecycle of series in TSDB

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add more comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-05-20 18:52:08 +05:30
Ganesh Vernekar
d4b9fe801f
M-map full chunks of Head from disk (#6679)
When appending to the head and a chunk is full it is flushed to the disk and m-mapped (memory mapped) to free up memory

Prom startup now happens in these stages
 - Iterate the m-maped chunks from disk and keep a map of series reference to its slice of mmapped chunks.
- Iterate the WAL as usual. Whenever we create a new series, look for it's mmapped chunks in the map created before and add it to that series.

If a head chunk is corrupted the currpted one and all chunks after that are deleted and the data after the corruption is recovered from the existing WAL which means that a corruption in m-mapped files results in NO data loss.

[Mmaped chunks format](https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/head_chunks.md)  - main difference is that the chunk for mmaping now also includes series reference because there is no index for mapping series to chunks.
[The block chunks](https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/chunks.md) are accessed from the index which includes the offsets for the chunks in the chunks file - example - chunks of series ID have offsets 200, 500 etc in the chunk files.
In case of mmaped chunks, the offsets are stored in memory and accessed from that. During WAL replay, these offsets are restored by iterating all m-mapped chunks as stated above by matching the series id present in the chunk header and offset of that chunk in that file.

**Prombench results**

_WAL Replay_

1h Wal reply time
30% less wal reply time - 4m31 vs 3m36
2h Wal reply time
20% less wal reply time - 8m16 vs 7m

_Memory During WAL Replay_

High Churn:
10-15% less RAM -  32gb vs 28gb
20% less RAM after compaction 34gb vs 27gb
No Churn:
20-30% less RAM -  23gb vs 18gb
40% less RAM after compaction 32.5gb vs 20gb

Screenshots are in [this comment](https://github.com/prometheus/prometheus/pull/6679#issuecomment-621678932)


Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-05-06 21:00:00 +05:30
Bartlomiej Plotka
532f7bbac9
Merge pull request #7204 from prometheus/release-2.18
[Merge Without Squash] Merge release-2.18 back to master.
2020-05-05 18:58:45 +01:00
Ben Ye
1e4e37144d
Fixed wrongly handled not ready TSDB on web and API. (#7182)
* fix federate endpoint panic

Signed-off-by: yeya24 <yb532204897@gmail.com>

* Fixed all cases of not ready TSDB being wrongly handled.

* Fixed issue for federation.
* Ensured this will never happen again thanks to interfaces
* Fixes same issue for stats.
* Added tests for readiness.
* Fixed bug in stats. It was:
   status.MaxTime = db.Head().MaxTime()
   status.MinTime = db.Head().MaxTime()


Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-04-29 17:16:14 +01:00
ga
05038b48bd
Goroutine: Fix ambiguous variable (#7175)
Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>
2020-04-28 11:02:26 +01:00
Goutham Veeramachaneni
84b4d079c8
Make sure deleted intervals are excluded from Seek (#6980)
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
2020-04-23 10:00:30 +01:00
Julien Pivotto
fc3fb3265a
Merge pull request #7145 from prometheus/release-2.17
Backport release 2.17 into master
2020-04-20 14:08:12 +02:00
Julien Pivotto
ed1852ab95
TSDB: Isolation: avoid creating appenderId's without appender (#7135)
Prior to this commit we could have situations where we are creating an
appenderId but never creating an appender to go with it, therefore
blocking the low watermak.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-04-17 20:51:03 +02:00
ZouYu
2b7437d60e
Fix some warnings: 'redundant type from array, slice, or map composite literal' (#7109)
Signed-off-by: ZouYu <zouy.fnst@cn.fujitsu.com>
2020-04-15 11:17:41 +01:00
Marek Slabicki
8224ddec23
Capitalizing first letter of all log lines (#7043)
Signed-off-by: Marek Slabicki <thaniri@gmail.com>
2020-04-11 09:22:18 +01:00
Brian Brazil
cd73b3d33e
Reduce how much old WAL we keep around. (#7098)
Previously we were keeping up to around 6 hours of WAL around by
removing 1/3 every hours. This was excessive, so switch to removing 2/3
which will up to around 3 hours of WAL around.

This will roughly halve the size of the WAL and halve startup time for
those who are I/O bound. This may increase the checkpoint size for
those with certain churn patterns, but by much less than we're saving
from the segments.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-04-07 15:55:57 +05:30
Brad Walker
3348930df5
Replace fileutil.ReadDir with ioutil.ReadDir (#7029) (#7033)
* tsdb: Replace fileutil.ReadDir with ioutil.ReadDir (#7029)

Signed-off-by: Brad Walker <brad@bradmwalker.com>

* tsdb: Remove fileutil.ReadDir (#7029)

Signed-off-by: Brad Walker <brad@bradmwalker.com>
2020-04-06 19:04:20 +05:30
MengZeLee
a7982ffc0f
Fix typo (#7068)
Fix typo.

Signed-off-by: MengZn <adnt587@gmail.com>
2020-03-30 13:18:34 +05:30
Brian Brazil
7646cbca32
Use .UTC everywhere we use time.Unix (#7066)
time.Unix attaches the local timezone, which can then
leak out (e.g. in the alert json). While this is harmless,
we should be consistent.

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
2020-03-29 17:35:39 +01:00
Julien Pivotto
9057decce2
Merge pull request #7060 from prometheus/release-2.17
Release 2.17
2020-03-27 15:57:07 +01:00
Julien Pivotto
ceef10cee4 Reset comment
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-26 00:17:56 +01:00
Julien Pivotto
73228b1b68 Those links should not be reverted
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-25 20:37:26 +01:00
Julien Pivotto
653f343547 Revert head posting optimization
This reverts commit 52630ad0c7.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-25 20:19:33 +01:00
Bartlomiej Plotka
d5c33877f9
storage: Added Chunks{Queryable/Querier/SeriesSet/Series/Iteratable. Added generic Merge{SeriesSet/Querier} implementation. (#7005)
* storage: Added Chunks{Queryable/Querier/SeriesSet/Series/Iteratable. Added generic Merge{SeriesSet/Querier} implementation.

## Rationales:

In many places (e.g. chunk Remote read, Thanos Receive fetching chunk from TSDB), we operate on encoded chunks not samples.
This means that we unnecessary decode/encode, wasting CPU, time and memory.
This PR adds chunk iterator interfaces and makes the merge code to be reused between both seriesSets

I will make the use of it in following PR inside tsdb itself. For now fanout implements it and mergers.

All merges now also allows passing series mergers. This opens doors for custom deduplications other than TSDB vertical ones (e.g. offline one we have in Thanos).

## Changes

* Added Chunk versions of all iterating methods. It all starts in Querier/ChunkQuerier. The plan is that
Storage will implement both chunked and samples.
* Added Seek to chunks.Iterator interface for iterating over chunks.
* NewMergeChunkQuerier was added; Both this and NewMergeQuerier are now using generigMergeQuerier to share the code. Generic code was added.
* Improved tests.
* Added some TODO for further simplifications in next PRs.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Moved s/Labeled/SeriesLabels as per Krasi suggestion.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Krasi's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Second iteration of Krasi comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Another round of comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-03-24 20:15:47 +00:00
Ben Kochie
fac7a4a050
Merge pull request #7037 from prometheus/bjk/golint
Enable golint in CI
2020-03-24 09:20:08 +01:00
Ben Kochie
269e7c8091
Fix golint issues.
Signed-off-by: Ben Kochie <superq@gmail.com>
2020-03-23 20:38:43 +01:00
Ganesh Vernekar
6fdc852813
Fix TestHeadDeleteSimple to test reloaded Head too (#7021)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-03-23 16:55:25 +02:00
Ganesh Vernekar
e64a149984
Close Head in DBReadOnly.FlushWAL (#7022)
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-03-23 14:49:44 +05:30
zhulongcheng
e813f60fd6
tsdb: fix sequence check for WAL segments (#7032)
Signed-off-by: zhulongcheng <zhulongcheng.dev@gmail.com>
2020-03-23 13:16:28 +05:30
zhulongcheng
dbb8f5861d
tsdb: add tombstonesHeaderSize constant (#7028)
Signed-off-by: zhulongcheng <zhulongcheng.dev@gmail.com>
2020-03-22 12:59:35 +05:30
Julien Pivotto
f1984bb007
Merge pull request #7025 from prometheus/release-2.17
Merge release 2.17 into master
2020-03-21 21:39:07 +01:00
beorn7
526cff39b9 Fix tests that were broken by #7009
Signed-off-by: beorn7 <beorn@grafana.com>
2020-03-20 21:22:58 +01:00
Bartlomiej Plotka
c4eefd1b3a storage: Removed SelectSorted method; Simplified interface; Added requirement for remote read to sort response.
This is technically BREAKING CHANGE, but it was like this from the beginning: I just notice that we rely in
Prometheus on remote read being sorted. This is because we use selected data from remote reads in MergeSeriesSet
which rely on sorting.

I found during work on https://github.com/prometheus/prometheus/pull/5882 that
we do so many repetitions because of this, for not good reason. I think
I found a good balance between convenience and readability with just one method.
Smaller the interface = better.

Also I don't know what TestSelectSorted was testing, but now it's testing sorting.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-03-20 21:14:43 +01:00
Callum Styan
f802f1e8ca
Fix bug with WAL watcher and Live Reader metrics usage. (#6998)
* Fix bug with WAL watcher and Live Reader metrics usage.

Calling NewXMetrics when creating a Watcher or LiveReader results in a
registration error, which we're ignoring, and as a result other than the
first Watcher/Reader created, we had no metrics for either. So we would
only have metrics like Watcher Records Read for the first remote write
config in a users config file.

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2020-03-20 17:34:15 +01:00
Bartlomiej Plotka
8fa4ada9ae
Merge pull request #7010 from prometheus/beorn7/fix-test
Fix tests that were broken by #7009
2020-03-19 17:41:02 +00:00
Ganesh Vernekar
e50fdbc70c
Live m-mapping of chunks on disk (#6830)
* Live m-mapping of chunks on disk

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments Part 2

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments Part 3

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments Part 4

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Attempt to fix windows bug

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
2020-03-19 22:03:44 +05:30
beorn7
c0ecbb38af Fix tests that were broken by #7009
Signed-off-by: beorn7 <beorn@grafana.com>
2020-03-19 16:28:23 +01:00
Björn Rabenstein
1da83305be
Merge pull request #7009 from prometheus/release-2.17
Merge release-2.17 into master
2020-03-19 13:46:28 +01:00
zhulongcheng
5f5c7a4477
tsdb: sort checkpoints by segment number (#6987)
Signed-off-by: zhulongcheng <zhulongcheng.dev@gmail.com>
2020-03-18 20:40:41 +05:30
Julien Pivotto
8907ba6235 Make TSDB use storage errors
This fixes #6992, which was introduced by #6777. There was an
intermediate component which translated TSDB errors into storage errors,
but that component was deleted and this bug went unnoticed, until we
were watching at the Prombench results. Without this, scrape will fail
instead of dropping samples or using "Add" when the series have been
garbage collected.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-17 22:24:25 +01:00
Julien Pivotto
0f9e78bd88
tsdb: fix races around head chunks (#6985)
* tsdb: fix races around head chunks

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-03-16 13:59:22 +01:00
Björn Rabenstein
d80b0810c1
Move crucial actions to defer (#6918)
With defer having less of a performance penalty, there is no reason
not to do those crucial operations via defer.

Context: With isolation in place, if we forget to Commit/Rollback, the
low watermark will get stuck forever.

The current code should not have any bugs, but moving to defer helps
to avoid future bugs.

This is also moving the `closeAppend` in the `Commit` implementation
itself to defer. If logging to the WAL fails, we would have missed the
`closeAppend`.

Signed-off-by: beorn7 <beorn@grafana.com>
2020-03-13 20:54:47 +01:00
Bartlomiej Plotka
9d9c45588e Addressed Goutham's review.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-03-13 19:18:31 +00:00
Bartlomiej Plotka
cd9516316a Addressed Brian's comments.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-03-13 14:37:47 +00:00