Commit Graph

55 Commits

Author SHA1 Message Date
beorn7
7093b089f2 Use more varbit in histogram chunks
This adds bit buckets for larger numbers to varbit encoding and also
an unsigned version of varbit encoding.

Then, varbit encoding is used for all the histogram chunk data instead
of varint.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-13 20:03:35 +02:00
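The "bit buckets" idea in 7093b089f2 can be sketched as follows: each value is written with a short prefix that selects a size class, followed by only as many payload bits as that class needs. The bucket table, the names `bucket`, `encodedBitsSigned` and `encodedBitsUnsigned`, and the prefix lengths are all invented for illustration; this is a minimal sketch of the general technique, not the layout the commit actually implements.

```go
package main

import "fmt"

// bucket is a hypothetical size class: a prefix of prefixBits bits
// announces that payloadBits bits of value follow.
type bucket struct {
	prefixBits  int
	payloadBits int
}

// buckets is an invented table, ordered from smallest to largest.
// The last entries are the "buckets for larger numbers".
var buckets = []bucket{
	{1, 0}, // the value 0 needs no payload
	{2, 3},
	{3, 6},
	{4, 9},
	{5, 12},
	{6, 18},
	{7, 25},
	{8, 56},
	{8, 64},
}

// fitsSigned reports whether v fits in n bits as a two's complement value.
func fitsSigned(v int64, n int) bool {
	if n == 0 {
		return v == 0
	}
	if n >= 64 {
		return true
	}
	lim := int64(1) << (n - 1)
	return v >= -lim && v < lim
}

// fitsUnsigned reports whether v fits in n bits as an unsigned value.
func fitsUnsigned(v uint64, n int) bool {
	if n == 0 {
		return v == 0
	}
	if n >= 64 {
		return true
	}
	return v < uint64(1)<<n
}

// encodedBitsSigned returns the total number of bits (prefix plus payload)
// used by the smallest bucket that can hold v.
func encodedBitsSigned(v int64) int {
	for _, b := range buckets {
		if fitsSigned(v, b.payloadBits) {
			return b.prefixBits + b.payloadBits
		}
	}
	return 0 // unreachable: the 64-bit bucket always fits
}

// encodedBitsUnsigned is the unsigned counterpart of encodedBitsSigned.
func encodedBitsUnsigned(v uint64) int {
	for _, b := range buckets {
		if fitsUnsigned(v, b.payloadBits) {
			return b.prefixBits + b.payloadBits
		}
	}
	return 0 // unreachable: the 64-bit bucket always fits
}

func main() {
	for _, v := range []int64{0, 3, -4, 4, 1 << 62} {
		fmt.Printf("signed   %d -> %d bits\n", v, encodedBitsSigned(v))
	}
	for _, v := range []uint64{0, 7, 8} {
		fmt.Printf("unsigned %d -> %d bits\n", v, encodedBitsUnsigned(v))
	}
}
```

The unsigned variant gets one extra bit of range per bucket because no bit is spent on the sign, which is presumably why it pays off for values that can never be negative.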
Björn Rabenstein
7309c20e7e
Merge pull request #9500 from codesome/resettests
Add unit test for counter reset header
2021-10-13 18:19:21 +02:00
Ganesh Vernekar
dcaf568279
Metadata -> Layout renaming
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-13 20:27:48 +05:30
Ganesh Vernekar
4e206c7c77
Fix reviews
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-13 20:23:31 +05:30
Ganesh Vernekar
85e6686f84
Add unit test for counter reset header
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-13 15:57:42 +05:30
Björn Rabenstein
311673d62e
Save on slice allocations in histogramIterator (#9494)
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-13 14:43:49 +05:30
Björn Rabenstein
c450c01eb9
Remove obsolete TODOs about metadata (#9490)
Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-12 17:00:30 +05:30
beorn7
7a8bb8222c Style cleanup of all the changes in sparsehistogram so far
A lot of this code was hacked together, literally during a
hackathon. This commit intends not to change the code substantially,
but just make the code obey the usual style practices.

A (possibly incomplete) list of areas:

* Generally address linter warnings.

* The `pkg` directory is deprecated as per dev-summit. No new packages should
  be added to it. I moved the new `pkg/histogram` package to `model`
  anticipating what's proposed in #9478.

* Make the naming of the Sparse Histogram more consistent. Including
  abbreviations, there were just too many names for it: SparseHistogram,
  Histogram, Histo, hist, his, shs, h. The idea is to call it "Histogram" in
  general. Only add "Sparse" if it is needed to avoid confusion with
  conventional Histograms (which is rare because the TSDB really has no notion
  of conventional Histograms). Use abbreviations only in local scope, and then
  really abbreviate (not just removing three out of seven letters like in
  "Histo"). This is in the spirit of
  https://github.com/golang/go/wiki/CodeReviewComments#variable-names

* Several other minor name changes.

* A lot of formatting of doc comments. For one, following
  https://github.com/golang/go/wiki/CodeReviewComments#comment-sentences
  , but also layout questions, anticipating how things will look
  when rendered by `godoc` (even where `godoc` doesn't render them
  right now because they are for unexported types or not a doc comment
  at all but just a normal code comment - consistency is queen!).

* Re-enabled `TestQueryLog` and `TestEndpoints` (they pass now,
  leaving them disabled was presumably an oversight).

* Bucket iterator for histogram.Histogram is now created with a
  method.

* HistogramChunk.iterator now allows iterator recycling. (I think
  @dieterbe only commented it out because he was confused by the
  question in the comment.)

* HistogramAppender.Append panics now because we decided to treat
  staleness marker differently.

Signed-off-by: beorn7 <beorn@grafana.com>
2021-10-11 13:02:03 +02:00
Ganesh Vernekar
5d4dc7e413
Convert the header into an enum
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-07 19:53:24 +05:30
Ganesh Vernekar
175ef4ebcf
Add a NotCounterReset flag
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-06 16:06:34 +05:30
Ganesh Vernekar
a280b6c2da
Fix review comments
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-06 15:28:10 +05:30
Ganesh Vernekar
eb9931e961
Add info about counter resets in chunk meta
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-04 18:44:12 +05:30
Ganesh Vernekar
1dd22ed655
Support stale samples for sparse histograms (#9352)
* Support stale samples for sparse histograms

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Don't cut a new chunk for every stale sample

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Update comments for HistoAppender.Appendable

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-10-01 13:41:51 +05:30
Ganesh Vernekar
c373200b75
Cut a new chunk on counter resets for any bucket
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-24 20:02:08 +05:30
Ganesh Vernekar
19e98e5469
Support storing the zero threshold in the histogram chunk (#9165)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-06 18:08:41 +05:30
Ganesh Vernekar
7026e6b4e4
Fix tests in histo_test.go (#9163)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-06 14:26:56 +05:30
Ganesh Vernekar
8b70e87ab9
Merge remote-tracking branch 'upstream/main' into sparse-refactor
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-05 12:16:08 +05:30
jinglina
1a430e5f89
remove redundant parentheses (#9134)
Signed-off-by: jinglina <jinglinax@163.com>
2021-07-29 18:26:57 +05:30
Bryan Boreham
6788760efa
Reduce memory allocation in benchmarkIterator() (#5983)
Previously it was allocating millions of chunks, all containing the
same 250 samples.  Above some ratio of CPU performance to available
memory, the benchmark cannot run.

Make 250 a const and just allocate one chunk which we iterate
repeatedly till we reach the benchmark count.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-07-26 19:36:54 +05:30
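The approach described in 6788760efa (one chunk, iterated repeatedly until the benchmark count is reached) might look roughly like the sketch below. The names `samplesPerChunk`, `newChunk` and `iterate` are hypothetical, and a plain `[]int64` stands in for a real chunk; this is not the actual `benchmarkIterator()` code.

```go
package main

import "fmt"

// samplesPerChunk mirrors the "make 250 a const" idea from the commit message.
const samplesPerChunk = 250

// newChunk builds one stand-in chunk. A plain slice is used here as an
// assumption; the real benchmark works on encoded chunks.
func newChunk() []int64 {
	c := make([]int64, samplesPerChunk)
	for i := range c {
		c[i] = int64(i)
	}
	return c
}

// iterate walks the same chunk over and over until n samples have been
// visited, so memory use stays constant no matter how large n gets.
func iterate(chunk []int64, n int) int {
	if len(chunk) == 0 {
		return 0
	}
	visited := 0
	for visited < n {
		for range chunk {
			if visited == n {
				break
			}
			visited++
		}
	}
	return visited
}

func main() {
	chunk := newChunk() // allocated once, reused for every pass
	fmt.Println(iterate(chunk, 1000000))
}
```

Because only one chunk is ever allocated, the benchmark count can grow with CPU speed without the memory footprint growing along with it.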
Ganesh Vernekar
4fefd7520e
Skip the failing TestHistoChunkSameBuckets (#9089)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-15 14:35:35 +05:30
beorn7
cb75747bce Fix re-encoding
Signed-off-by: beorn7 <beorn@grafana.com>
2021-07-06 00:20:35 +02:00
beorn7
01957eee2b Fix interjections even more
Signed-off-by: beorn7 <beorn@grafana.com>
2021-07-05 23:59:33 +02:00
beorn7
dc1c744169 Fix interjections at the end
Signed-off-by: beorn7 <beorn@grafana.com>
2021-07-05 23:01:39 +02:00
Oleg Zaytsev
40126a8494
Use binary literals for xor chunk encoding
An opinionated cosmetic change, but since Go 1.13 we have these fancy
0b.... literals, so we don't need to write hex and comment the binary
value.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2021-07-05 16:39:24 +02:00
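As a small illustration of the change in 40126a8494, the two functions below return the same value, once written as a hex literal with the bit pattern in a comment and once as a binary literal. The function names and the specific bit pattern are made up for this example and are not taken from the xor chunk code.

```go
package main

import "fmt"

// hexPrefix shows the older style: a hex literal plus a comment
// spelling out the bit pattern.
func hexPrefix() uint64 {
	return 0x0e // 1110
}

// binPrefix shows the style available since Go 1.13: the literal
// itself is the bit pattern, so no comment is needed.
func binPrefix() uint64 {
	return 0b1110
}

func main() {
	fmt.Println(hexPrefix() == binPrefix())
	fmt.Printf("%04b\n", binPrefix())
}
```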
beorn7
deb02d59fb Fix lint issues
Signed-off-by: beorn7 <beorn@grafana.com>
2021-07-05 15:27:46 +02:00
Dieter Plaetinck
dc6b068c67 bugfix: only bump numRead when all fields are successfully read
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-07-05 15:57:47 +03:00
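The fix named in dc6b068c67 follows a common pattern: read every field of a sample first, and only then commit the result and advance the counter. A minimal sketch of that pattern is shown below; the types `fieldReader` and `sampleIterator` and the two-field sample layout are assumptions for illustration, not the histogram iterator itself.

```go
package main

import (
	"errors"
	"fmt"
)

// fieldReader is a hypothetical source of fields that fails when it runs dry.
type fieldReader struct {
	data []int64
	pos  int
}

func (r *fieldReader) readField() (int64, error) {
	if r.pos >= len(r.data) {
		return 0, errors.New("unexpected end of data")
	}
	v := r.data[r.pos]
	r.pos++
	return v, nil
}

// sampleIterator reads samples made of two fields: a timestamp and a value.
type sampleIterator struct {
	r       fieldReader
	numRead int // number of samples read completely
	t, v    int64
	err     error
}

// Next reads both fields into locals and bumps numRead only after
// every read has succeeded, so a partial read never counts as a sample.
func (it *sampleIterator) Next() bool {
	t, err := it.r.readField()
	if err != nil {
		it.err = err
		return false
	}
	v, err := it.r.readField()
	if err != nil {
		it.err = err
		return false
	}
	it.t, it.v = t, v
	it.numRead++
	return true
}

func main() {
	// Three fields: one complete sample followed by a truncated one.
	it := &sampleIterator{r: fieldReader{data: []int64{1000, 42, 2000}}}
	for it.Next() {
		fmt.Println(it.t, it.v)
	}
	fmt.Println("numRead:", it.numRead, "err:", it.err)
}
```

Had `numRead` been incremented before the second read, the truncated sample in this example would have been counted even though it was never fully decoded.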
Dieter Plaetinck
98f86d671a cleanup comments
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-07-05 15:56:38 +03:00
Dieter Plaetinck
99ae04bb6f add SHS chunk recoding and head cutting to head block (no tests yet)
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-07-05 15:56:38 +03:00
Ganesh Vernekar
4c01ff5194
Bunch of fixes for sparse histograms (#9043)
* Do not panic on histoAppender.Append

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* M-map all chunks on shutdown

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Support negative schema for querying

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-03 23:04:34 +05:30
Dieter Plaetinck
6c13375ac8
sparsehistogram recoding upon detection that new buckets have appeared (#9030)
* bucketIterator which returns all valid bucket indices for a []span

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* support for comparing []spans and generating interjections

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* add license header

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* assert order fix

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* handle pathological 0-length span case more gracefully

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* stale todo

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* decode-recode histograms when new buckets appear

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* factor out recoding and also add it to the fallback case

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* make linter happy

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-07-02 11:50:30 +05:30
beorn7
518b77c59d Fix a few trivial style nits
Signed-off-by: beorn7 <beorn@grafana.com>
2021-07-01 17:11:54 +02:00
Ganesh Vernekar
f4d3af73f0
Query histograms from TSDB and unit test for append+query (#9022)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-30 20:18:13 +05:30
Dieter Plaetinck
4d27816ea5
Sparsehistogram: improve dod encoding, testing, encode chunk metadata (#9015)
* factor out different varbit schemes and include Beorn's "optimum" for buckets

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* use more compact dod encoding scheme for SHS chunk columns

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* remove FB VB and xor dod encoding because we won't use it

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* HistoChunk metadata encoding

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* add SparseHistogram.Copy()

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* histogram test: test appending a few histograms

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* add license headers

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-06-30 16:15:43 +05:30
Ganesh Vernekar
04ad56d9b8
Append sparse histograms into the Head block (#9013)
* Append sparse histograms into the Head block

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Add AtHistogram() to Iterator interface. Make HistoChunk conform to Chunk interface.

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-29 20:08:46 +05:30
Dieter Plaetinck
58917d1b76
sparsehistogram: integer types and timestamp separation (#9014)
* integer types and timestamp separation

1) unify types to int64. as suggested by beorn. we want to support
   counters going down (resets) even if we plan to create new chunks for
   now, in that case
2) histogram type doesn't know its own timestamp. include it separately
   in appending and iteration

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* correction: count and zeroCount to remain unsigned

to make api more resilient and that's what we use in protobuf anyway

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* temp hack. Ganesh will fix

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-06-29 19:27:59 +05:30
Dieter Plaetinck
fd11a339a7
Sparsehistogram chunk implementation (#9009)
* histogram chunk

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* xorAppender.AppendHistogram non-method

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* basic histogram chunk test

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-06-29 14:07:41 +05:30
Julien Pivotto
6c56a1faaa
Testify: move to require (#8122)
* Testify: move to require

Moving testify to require to fail tests early in case of errors.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* More moves

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-10-29 09:43:23 +00:00
johncming
28ca0965f0
tsdb/chunkenc: fix typo of return error. (#7670)
* tsdb/chunkenc: fix typo of return error.

Signed-off-by: johncming <johncming@yahoo.com>

* tsdb: fix typo of function in markdown.

Signed-off-by: johncming <johncming@yahoo.com>
2020-10-28 12:03:11 +00:00
Julien Pivotto
4e5b1722b3
Move away from testutil, refactor imports (#8087)
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-10-22 11:00:08 +02:00
Marco Pracucci
3b529ddbce
Cleanup bstream_test.go based on post-merge feedback received on #7390 (#7413)
* Fixed bstream test license

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Simplified bstreamReader.loadNextBuffer()

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed date in license

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-06-18 14:49:39 +05:30
Marco Pracucci
f42ed03dc5
Optimized bstream reader used by XORChunk iterator (#7390)
* Optimized bstream reader

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed linter

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added license to new file

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed type cast

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Changed comments

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Improved comments and rolledback no-op changes

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed race condition

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-06-15 16:44:40 +01:00
Bartlomiej Plotka
d5c33877f9
storage: Added Chunks{Queryable/Querier/SeriesSet/Series/Iteratable. Added generic Merge{SeriesSet/Querier} implementation. (#7005)
* storage: Added Chunks{Queryable/Querier/SeriesSet/Series/Iteratable. Added generic Merge{SeriesSet/Querier} implementation.

## Rationales:

In many places (e.g. chunk Remote read, Thanos Receive fetching chunks from TSDB), we operate on encoded chunks, not samples.
This means that we unnecessarily decode/encode, wasting CPU, time and memory.
This PR adds chunk iterator interfaces and lets the merge code be reused between both seriesSets.

I will make use of it in a following PR inside tsdb itself. For now, fanout and the mergers implement it.

All merges now also allow passing series mergers. This opens the door for custom deduplication other than the TSDB vertical one (e.g. the offline one we have in Thanos).

## Changes

* Added Chunk versions of all iterating methods. It all starts in Querier/ChunkQuerier. The plan is that
Storage will implement both chunked and samples.
* Added Seek to chunks.Iterator interface for iterating over chunks.
* NewMergeChunkQuerier was added; Both this and NewMergeQuerier are now using generigMergeQuerier to share the code. Generic code was added.
* Improved tests.
* Added some TODO for further simplifications in next PRs.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Moved s/Labeled/SeriesLabels as per Krasi suggestion.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Krasi's comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Second iteration of Krasi comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Another round of comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-03-24 20:15:47 +00:00
Peter Štibraný
1d396b96dc
Specify that returned samples must be ordered by timestamp. (#6877)
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
2020-02-26 13:11:55 +00:00
Bartlomiej Plotka
59c9d6ef45 Addressed Brian's comments, moved metrics to main.go
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-02-17 18:03:57 +00:00
Bartlomiej Plotka
5d84e5d895 Make chunkenc.Iterator.At behaviour unspecified without Next done.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-02-17 18:03:57 +00:00
Bartlomiej Plotka
cfba92a133 Addressed comments.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-02-17 18:03:57 +00:00
Bartlomiej Plotka
849faa407b Minor fixes.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-02-17 18:03:57 +00:00
Bartlomiej Plotka
2cf637fbf5 Addressed comments.
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-02-17 18:03:57 +00:00
Bartlomiej Plotka
34426766d8 Unify Iterator interfaces. All point to storage now.
This is part of https://github.com/prometheus/prometheus/pull/5882 that can be done to simplify things.
All todos I added will be fixed in follow up PRs.

* querier.Querier, querier.Appender, querier.SeriesSet, and querier.Series interfaces merged
with storage interface.go. All imports that.
* querier.SeriesIterator replaced by chunkenc.Iterator
* Added chunkenc.Iterator.Seek method and tests for xor implementation (?)
* Since we properly handle SelectParams for Select methods I adjusted min max
based on that. This should help in terms of performance for queries with functions like offset.
* added Seek to deletedIterator and test.
* storage/tsdb was removed as it was only unnecessary glue with incompatible structs.

No logic was changed, only different source of abstractions, so no need for benchmarks.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
2020-02-17 18:03:54 +00:00
Marco Pracucci
699f3e8f4d
Added comments to the Chunk interface
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2020-02-05 13:07:41 +01:00