Changes:
* Make `NewReader` method useful. It was impossible to use it, because closer was always nil.
* ReadSymbols, TOC and ReadOffsetTable are now public functions (used by Thanos).
* decbufXXX are now functions.
* More verbose errors.
* Removed unused crc32 field.
* Some var name changes to make it more verbose:
  * symbols -> allocatedSymbols
  * symbolsSlice -> symbolsV1
  * symbols -> symbolsV2
* Pre-calculate symbolsTableSize.
* Initialized symbols for Symbols() method with valid length.
* Added test for Symbol method.
* Made Decoder LookupSymbol method public. Kept Decode public as it is useful as helper from index package.
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
Avoid a tree of merge objects, which can result in
what I suspect is n^2 calls to Seek when using Without.
With 100k metrics, and a regex of ^$ in BenchmarkHeadPostingForMatchers:
Before:
BenchmarkHeadPostingForMatchers-8 1 51633185216 ns/op 29745528 B/op 200357 allocs/op
After:
BenchmarkHeadPostingForMatchers-8 10 108924996 ns/op 25715025 B/op 101748 allocs/op
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
This saves memory, about a quarter of the size of the postings map
itself with high-cardinality labels (not including the post ids).
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
This reduces memory by only having to store the string's 16
bytes+map overhead once per label name, rather than duplicating it in every
entry for the label value.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
Reuse the string already allocated for symbols
in the posting tables.
Use a slice for symbols in v2 format.
Move symbol size logic into the index code.
Avoid duplication of lookupSymbol logic.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
fixes: https://github.com/prometheus/tsdb/issues/426
Use `filepath.Join()` instead of strings containing forward-slash path delimiters (needed for non-*nix OSes), as suggested by @krasi-georgiev.
Use more meaningful names for the serializedStringTuples and stringTuples structs.
Signed-off-by: knrt10 <tripathi.kautilya@gmail.com>
Co-authored-by: Krasi Georgiev <kgeorgie@redhat.com>
Currently the offsets are cast into uint32 even though the index can
grow larger than 4GiB.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
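
A short illustration of the problem, assuming an offset just past the 4GiB mark: converting a 64-bit offset to `uint32` wraps silently. `truncateOffset` is a hypothetical name used only for this sketch.

```go
package main

import "fmt"

// truncateOffset mimics the cast described above: the upper 32 bits of
// the offset are discarded without any error. (Hypothetical helper.)
func truncateOffset(off uint64) uint32 {
	return uint32(off)
}

func main() {
	const fourGiB = uint64(1) << 32
	off := fourGiB + 16 // an offset 16 bytes past 4GiB
	fmt.Println(off, truncateOffset(off)) // 4294967312 16
}
```
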