Commit Graph

60 Commits

Author SHA1 Message Date
Fabian Reinartz
ac5bd71d8f Doc fixes 2017-11-10 14:10:20 +00:00
Fabian Reinartz
b7c3cfecbf index: abstract ByteSlice and adjust indexReader
This replaces the builtin byte slice with an interface for the index
reader. This allows the complex decoding of the index file format
to be used against more generalized implementations.
2017-11-09 17:38:32 +00:00
Fabian Reinartz
c354d6bd59 index: simplify checksum validation 2017-11-09 15:58:36 +00:00
Fabian Reinartz
798f2bdb0a
Merge pull request #189 from sunhay/checksum-checks
Index checksum validation on reads
2017-11-09 15:35:09 +00:00
Nipun Talukdar
791a2dda4d Fixed a problem of adding padding of 4 zero bytes in some cases (#194)
* Fixed a problem of adding padding of 4 zero bytes in some cases

* Incorporated review comments
2017-11-03 20:16:19 +01:00
ranbochen
a27cf34a36 fix bugs on platform windows to pass all test case. (#192)
* fix bugs on platform windows to pass all test case.

* fix bugs on platform windows to pass all test case

* clean up codes
2017-10-31 15:37:41 +01:00
Sunny Klair
aecac3b521 Resuse single CRC for index checksum validation 2017-10-27 12:29:59 -04:00
Sunny Klair
b65dd43c5b Validate index series/postings/symbol table checksums on read 2017-10-26 15:34:31 -04:00
Sunny Klair
ab02ea4de4 Validate index offset table checksums on read 2017-10-26 13:48:31 -04:00
Sunny Klair
4fdf9b195c Validate index TOC checksum on read 2017-10-25 18:12:13 -04:00
Fabian Reinartz
665955da48 Clarify postings index semantics, handle staleness
The postings list index may point to series that no longer
exist during garbage collection. This clarifies that this is valid
behavior.
It would be possible, though more complex, to always keep them in sync.
However, series existance means nothing in itself as the queried time
range defines whether there's actual data. Thus our definition is sane
overall as long as drift is kept small.
2017-10-11 09:37:19 +02:00
Fabian Reinartz
fb9da52b11 Add more verbose error handling for closing, reduce locking
This commit introduces error returns in various places and is explicit
about closing persisted blocks.
{Index,Chunk,Tombstone}Readers are more consistent about their Close()
method. Whenever a reader is retrieved, the corresponding close method
must eventually be called. We use this to track pending readers against
persisted blocks.

Querier's against the DB no longer hold a read lock for their entire
lifecycle. This avoids long running queriers to starve new ones when we
have to acquire a write lock when reloading blocks.
2017-10-10 12:13:37 +02:00
Goutham Veeramachaneni
da565f975e Merge pull request #161 from prometheus/fileutil
Remove dependency on etcd/pkg/fileutil
2017-10-04 17:08:54 +05:30
Fabian Reinartz
bbe72dccb9 Remove dependency on etcd/pkg/fileutil 2017-10-04 10:23:41 +02:00
Fabian Reinartz
78df406dac Allocate and cache strings for persisted blocks
This change loads the full symbol table when we open a persisted block
and allocates a string for each. This ensures that strings retrieved
through the index can be used after the block was closed.
Before we backed the strings by the mmap'd byte regions which would
segfault in this case.

Also remove an inconsistency in the disk format and move both offset
tables to the end (breaking change).
2017-10-02 15:56:57 +02:00
Goutham Veeramachaneni
8919baef03
Expose NewIndexReader() and cleanups
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-09-13 13:47:20 +05:30
Fabian Reinartz
3870ec285c Merge pull request #140 from prometheus/locks
Fix various races
2017-09-11 10:41:33 +02:00
Fabian Reinartz
b09d90c79c Add decoding method to retrieve unsafe strings
When decoding data from mmaped blocks, we would like to retrieve
a string backed by the mmaped region. As the underlying byte slice
never changes, this is safe.
2017-09-08 18:41:43 +02:00
Goutham Veeramachaneni
afaf12fe45
Compress the series chunk details in index.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-09-08 20:25:19 +05:30
Fabian Reinartz
1ddedf2b30 Change series ID from uint32 to uint64 2017-09-04 16:08:38 +02:00
Matt Layher
78b15c3434
Add newCRC32 function to simplify hash initialization 2017-08-26 12:04:00 -04:00
Fabian Reinartz
2644c8665c Don't allocate ChunkMetas, reuse postings slices 2017-08-06 20:41:24 +02:00
Fabian Reinartz
96d7f540d4 Persist series without allocating the full set
Change index persistence for series to not be accumulated in memory
before being written as one large batch. `Labels` and `ChunkMeta`
objects are reused.
This cuts down memory spikes during compaction of multiple blocks
significantly.

As part of the the Index{Reader,Writer} now have an explicit notion of
symbols and series must be inserted in order.
2017-08-06 12:06:41 +02:00
Goutham Veeramachaneni
a110a64abd
Add full Snapshot support
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-06-06 18:15:54 +05:30
Goutham Veeramachaneni
34a86af3c6
Move tombstones to their own thing.
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-17 08:36:56 +05:30
Goutham Veeramachaneni
4f1d857590
Implement Delete on HeadBlock
Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-15 23:28:14 +05:30
Goutham Veeramachaneni
5579efbd5b
Initial implentation of Deletes on persistedBlock
Very much a WIP

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-05-14 14:36:26 +05:30
Fabian Reinartz
2032a11d98 Add padding between fixed-sized index sections 2017-05-02 12:43:51 +02:00
Fabian Reinartz
34ba92eeeb Move CRC back to chunks file, alignment for fixed-sized ints 2017-04-30 10:18:07 +02:00
Fabian Reinartz
a54f46d5e7 Migrate last IndexWriter pieces to decbuf 2017-04-30 10:18:07 +02:00
Fabian Reinartz
94f3fd9812 Move encoding helpers into separate file 2017-04-30 10:18:07 +02:00
Fabian Reinartz
35b62f001e Change offset table layout, add TOC, ... 2017-04-30 10:18:07 +02:00
Fabian Reinartz
8b1f514a2d index: validate current write stages 2017-04-30 10:18:07 +02:00
Fabian Reinartz
9b4eafcc4c Simplify and document postings serialization 2017-04-30 10:10:18 +02:00
Fabian Reinartz
0aad526d1a Simplify label value index
This removes the flag from the label value index and serializes it using
encbufs.
TODO: move CRC32 checksum into label value index hash table for
referntial integrity.
2017-04-30 10:10:18 +02:00
Fabian Reinartz
d30b181406 Switch series serialization to use encbufs 2017-04-30 10:10:18 +02:00
Fabian Reinartz
2ebaf1af4f Add encode buffer and simplify symbol serialization 2017-04-30 10:10:18 +02:00
Fabian Reinartz
433e73f865 Change series and symbol table format 2017-04-30 10:10:18 +02:00
Fabian Reinartz
df96d97dab Move chunk checksum 2017-04-30 10:10:18 +02:00
Julius Volz
8d1fb4fa01 Minor comment fixes and additions. 2017-04-28 15:41:42 +02:00
Fabian Reinartz
778103b450 Add liecence file and headers 2017-04-10 20:59:45 +02:00
Fabian Reinartz
7de2217011 Add fast-path for equality matching 2017-04-05 15:37:48 +02:00
Fabian Reinartz
10c7c9acbe Adjust import names to new repository organisation 2017-04-04 11:27:26 +02:00
Goutham Veeramachaneni
71e05a22c7
Add mockIndex And Refactor Tests To Use That 2017-03-30 04:48:41 +05:30
Goutham Veeramachaneni
7b94a4e17d
Rename bytePostings To bigEndianPostings
* To be more specific about the contents of the byte slice.
2017-03-27 14:04:42 +05:30
Goutham Veeramachaneni
efb0dfe1be
Implement Postings Iterator Over Bytes
Closes fabxc/tsdb#18
2017-03-26 23:40:12 +05:30
Fabian Reinartz
2ef3682560 Hotfix erroneous "label index missing" error 2017-03-20 11:37:06 +01:00
Fabian Reinartz
a8e8903350 Use ChunkMeta references for clarity
This has been a common source of hard to debug issues. Its a premature
and unbenchmarked optimization and semantically, we want ChunkMetas to
be references in all changed cases.
2017-03-14 15:40:16 +01:00
Fabian Reinartz
8a7addfc44 Split persistence by chunk/index instead of read/write 2017-03-07 12:48:52 +01:00
Fabian Reinartz
2a825f6c28 Consolidate mem index into HeadBlock 2016-12-22 01:12:28 +01:00