Commit Graph

383 Commits

Author SHA1 Message Date
Ganesh Vernekar eeace6bcab
Add couple of metrics to track sparse histograms in TSDB (#9271)
* Add couple of metrics to track sparse histograms in TSDB

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix Beorn's comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-30 19:08:44 +05:30
Ganesh Vernekar c373200b75
Cut a new chunk on counter resets for any bucket
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-24 20:02:08 +05:30
Ganesh Vernekar eedb86783e
Fix queries on blocks for sparse histograms and add unit test (#9209)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-16 18:52:29 +05:30
Ganesh Vernekar 42f576aa18
Add test for sparse histogram compaction (#9208)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-16 16:32:23 +05:30
Ganesh Vernekar f0688c21d6
Log sparse histograms into WAL and replay from it (#9191)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-11 17:38:48 +05:30
Ganesh Vernekar 095f572d4a
Sync sparsehistogram branch with main (#9189)
* Fix `kuma_sd` targetgroup reporting (#9157)

* Bundle all xDS targets into a single group

Signed-off-by: austin ce <austin.cawley@gmail.com>

* Snapshot in-memory chunks on shutdown for faster restarts (#7229)

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Rename links

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Remove Individual Data Type Caps in Per-shard Buffering for Remote Write (#8921)

* Moved everything to nPending buffer

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Simplify exemplar capacity addition

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Added pre-allocation

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Don't allocate if not sending exemplars

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Avoid deadlock when processing duplicate series record (#9170)

* Avoid deadlock when processing duplicate series record

`processWALSamples()` needs to be able to send on its output channel
before it can read the input channel, so reads to allow this in case the
output channel is full.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* processWALSamples: update comment

Previous text seems to relate to an earlier implementation.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Optimise WAL loading by removing extra map and caching min-time (#9160)

* BenchmarkLoadWAL: close WAL after use

So that goroutines are stopped and resources released

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* BenchmarkLoadWAL: make series IDs co-prime with #workers

Series are distributed across workers by taking the modulus of the
ID with the number of workers, so multiples of 100 are a poor choice.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* BenchmarkLoadWAL: simulate mmapped chunks

Real Prometheus cuts chunks every 120 samples, then skips those samples
when re-reading the WAL. Simulate this by creating a single mapped chunk
for each series, since the max time is all the reader looks at.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Fix comment

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Remove series map from processWALSamples()

The locks that is commented to reduce contention in are now sharded
32,000 ways, so won't be contended. Removing the map saves memory and
goes just as fast.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* loadWAL: Cache the last mmapped chunk time

So we can skip calling append() for samples it will reject.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Improvements from code review

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Full stops and capitals on comments

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Cache max time in both places mmappedChunks is updated

Including refactor to extract function `setMMappedChunks`, to reduce
code duplication.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Update head min/max time when mmapped chunks added

This ensures we have the correct values if no WAL samples are added for
that series.

Note that `mSeries.maxTime()` was always `math.MinInt64` before, since
that function doesn't consider mmapped chunks.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Split Go and React Tests (#8897)

* Added go-ci and react-ci

Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu>
Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Remove search keymap from new expression editor (#9184)

Signed-off-by: Julius Volz <julius.volz@gmail.com>

Co-authored-by: Austin Cawley-Edwards <austin.cawley@gmail.com>
Co-authored-by: Levi Harrison <git@leviharrison.dev>
Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu>
Co-authored-by: Bryan Boreham <bjboreham@gmail.com>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
2021-08-11 15:43:17 +05:30
Ganesh Vernekar 19e98e5469
Support storing the zero threshold in the histogram chunk (#9165)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-06 18:08:41 +05:30
Ganesh Vernekar 7026e6b4e4
Fix tests in histo_test.go (#9163)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-06 14:26:56 +05:30
Ganesh Vernekar 8b70e87ab9
Merge remote-tracking branch 'upstream/main' into sparse-refactor
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-05 12:16:08 +05:30
Ganesh Vernekar 848cb5a6d6
Enhanced WAL replay for duplicate series record (#7438)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-03 20:03:54 +05:30
Ganesh Vernekar 8002a3ab80
Breakdown tsdb/head.go into multiple files (#9147)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-08-03 14:14:26 +02:00
jinglina 1a430e5f89
remove redundant parentheses (#9134)
Signed-off-by: jinglina <jinglinax@163.com>
2021-07-29 18:26:57 +05:30
Darshan Chaudhary c4f2e9eec5
Add present_over_time (#9097)
* Add present_over_time

Signed-off-by: darshanime <deathbullet@gmail.com>

* Add tests for present_over_time

Signed-off-by: darshanime <deathbullet@gmail.com>

* Address PR comments

Signed-off-by: darshanime <deathbullet@gmail.com>

* Add documentation for present_over_time

Signed-off-by: darshanime <deathbullet@gmail.com>

* Update documentation

Signed-off-by: darshanime <deathbullet@gmail.com>

* Update documentation comment

Signed-off-by: darshanime <deathbullet@gmail.com>
2021-07-29 12:38:11 +02:00
Oleg Zaytsev f9482c5bf6
Clarify computeChunkEndTime's purpose (#9049)
I was struggling to understand the purpose of this method until I
tweaked the tests, so I decided to write down my observations.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2021-07-28 18:39:05 +05:30
Bryan Boreham 60804c5a09
remote_write: reduce blocking from garbage-collect of series (#9109)
* Refactor: pass segment-reading function as param

To allow a different implementation to be used when garbage-collecting.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* remote_write: reduce blocking from GC of series

Add a method `UpdateSeriesSegment()` which is used together with
`SeriesReset()` to garbage-collect old series. This allows us to
split the lock around queueManager series data and avoid blocking
`Append()` while reading series from the last checkpoint.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Cosmetic: review feedback on comments

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* remote-write benchmark: include GC of series

Reduce the total number of samples per iteration from 5000*5000
(25 million) which is too big for my laptop, to 1*10000.

Extend `createTimeseries()` to add additional labels, so that the
queue manager is doing more realistic work.

Move the Append() call to a background goroutine - this works because
TestWriteClient uses a WaitGroup to signal completion.

Call `StoreSeries()` and `SeriesReset()` while adding samples, to
simulate the garbage-collection that wal.Watcher does.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Change BenchmarkSampleDelivery to call UpdateSeriesSegment

This matches what Watcher.garbageCollectSeries() is doing now

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-07-27 13:21:48 -07:00
Bryan Boreham dea37853d9
tsdb: use dennwc/varint to speed up WAL decoding (#9106)
* tsdb: use dennwc/varint to speed up decoding

This is a tiny library, MIT-licensed, which unrolls the loop to go
about twice as fast.

Needed to copy the sign-inverting logic inline, previously provided by
the `binary` package.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* More comments to explain varint decoding

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-07-27 10:02:57 +05:30
Bryan Boreham 6788760efa
Reduce memory allocation in benchmarkIterator() (#5983)
Previously it was allocating millions of chunks, all containing the
same 250 samples.  Above some ratio of CPU performance to available
memory, the benchmark cannot run.

Make 250 a const and just allocate one chunk which we iterate
repeatedly till we reach the benchmark count.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-07-26 19:36:54 +05:30
Oleg Zaytsev b1ed4a0a66
LabelNames API with matchers (#9083)
* Push the matchers for LabelNames all the way into the index.

NB This doesn't actually implement it in the index, just plumbs it through for now...

Signed-off-by: Tom Wilkie <tom@grafana.com>

* Hack it up.  Does not work.

Signed-off-by: Tom Wilkie <tom@grafana.com>

* Revert changes I don't understand

Can't see why do we need to hold a mutex on symbols, and the purpose of
the LabelNamesFor method.

Maybe I'll need to re-add this later.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Implement LabelNamesFor

This method provides the label names that appear in the postings
provided. We do that deeper than the label values because we know
beforehand that most of the label names we'll be the same across
different postings, and we don't want to go down an up looking up the
same symbols for all different series.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Mutex on symbols should be unlocked

However, I still don't understand why do we need a mutex here.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Fix head.LabelNamesFor

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Implement mockIndex LabelNames with matchers

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Nitpick on slice initialisation

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Add tests for LabelNamesWithMatchers

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Fix the mutex mess on head.LabelValues/LabelNames

I still don't see why we need to grab that unrelated mutex, but at least
now we're grabbing it consistently

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Check error after iterating postings

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Use the error from posting when there was en error in postings

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Update storage/interface.go comment

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* Update tsdb/index/index.go comment

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* Update tsdb/index/index.go wrapped error msg

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* Update tsdb/index/index.go wrapped error msg

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* Update tsdb/index/index.go warpped error msg

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>

* Remove unneeded comment

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Add testcases for LabelNames w/matchers in api.go

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Use t.Cleanup() instead of defer in tests

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

Co-authored-by: Tom Wilkie <tom@grafana.com>
Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>
2021-07-20 18:08:08 +05:30
Jupiter 7337ecf0d3
Log when total symbol size exceeds 2^32 bytes. (#9104)
* Compaction fails when total symbol size exceeds 2^32 bytes.
Signed-off-by: tanghengjian <1040104807@qq.com>

* Compaction fails when total symbol size exceeds 2^32 bytes.

Signed-off-by: tanghengjian <1040104807@qq.com>

* Compaction fails when total symbol size exceeds 2^32 bytes.

Signed-off-by: root <tanghengjian@oppo.com>

Co-authored-by: root <tanghengjian@oppo.com>
2021-07-20 15:51:36 +05:30
Ganesh Vernekar 59d02b5ef0
tsdb: Block Head GC till pending readers are done reading (#9081)
* tsdb: Block Head GC till pending readers are done reading

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix review comments 2

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix the exclusiveness of maxt in WaitForPendingReadersInTimeRange

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-20 14:17:20 +05:30
Martin Disibio 1bcd13d6b5
Exemplar resize (#8974)
* Create experimental circular buffer resize method, benchmarks

Signed-off-by: Martin Disibio <mdisibio@gmail.com>

* Optimize exemplar resize to only replay as many exemplars as needed

Signed-off-by: Martin Disibio <mdisibio@gmail.com>

* More comments, benchmark AddExemplar

Signed-off-by: Martin Disibio <mdisibio@gmail.com>

* optimizations

Signed-off-by: Martin Disibio <mdisibio@gmail.com>

* comment

Signed-off-by: Martin Disibio <mdisibio@gmail.com>

* Slight refactor of resize benchmark + make use of resize via runtime
reloadable storage config.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Some more config related changes.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Address some review comments.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Address more review comments.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Refactor to remove usage of noopExemplarStorage and avoid race condition
when resizing from Head code.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* Fix or add comments to clarify some of the new behaviour.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

* fix potential panics related to negative exemplar buffer lengths

Signed-off-by: Callum Styan <callumstyan@gmail.com>

Co-authored-by: Callum Styan <callumstyan@gmail.com>
2021-07-20 10:22:57 +05:30
Ganesh Vernekar 4fefd7520e
Skip the failing TestHistoChunkSameBuckets (#9089)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-15 14:35:35 +05:30
ide-rea 14b24d15b0
update checkpoint replay status (#8898)
* Consider  wal checkpoint replay status

Signed-off-by: XiaoYu Zhang <ideoutrea@163.com>

* Fix tests failed

Signed-off-by: XiaoYu Zhang <ideoutrea@163.com>

* Update checkpoint replay status

Signed-off-by: XiaoYu Zhang <ideoutrea@163.com>
2021-07-13 15:38:07 +05:30
Ganesh Vernekar 79305e704b
Compare block sizes with sparse histograms (#9045)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-08 11:01:53 +05:30
Bryan Boreham dc8f505595
tsdb: coalesce lock/unlock operations for append (#9061)
Fetch the low watermark value under the same lock as we need for the
appender, rather than releasing then re-aquiring a lock on the same
Mutex.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2021-07-07 18:58:20 +05:30
beorn7 cb75747bce Fix re-encoding
Signed-off-by: beorn7 <beorn@grafana.com>
2021-07-06 00:20:35 +02:00
beorn7 01957eee2b Fix interjections even more
Signed-off-by: beorn7 <beorn@grafana.com>
2021-07-05 23:59:33 +02:00
beorn7 dc1c744169 Fix interjections at the end
Signed-off-by: beorn7 <beorn@grafana.com>
2021-07-05 23:01:39 +02:00
Ganesh Vernekar 1acb701e5c
Fix TSDB race while reading histograms
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-05 21:51:35 +05:30
Ganesh Vernekar 9f206a7a05
Fix race in TSBD while reading/writing histograms (#9051)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-05 21:27:26 +05:30
Oleg Zaytsev 40126a8494
Use binary literals for xor chunk encoding
An opinionated cosmetic change, but since go 1.13 we have this fancy
0b.... literals so we don't need to write hex and comment the binary
value.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2021-07-05 16:39:24 +02:00
beorn7 deb02d59fb Fix lint issues
Signed-off-by: beorn7 <beorn@grafana.com>
2021-07-05 15:27:46 +02:00
Dieter Plaetinck dc6b068c67 bugfix: only bump numRead when all fields are successfully read
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-07-05 15:57:47 +03:00
Dieter Plaetinck 98f86d671a cleanup comments
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-07-05 15:56:38 +03:00
Dieter Plaetinck 99ae04bb6f add SHS chunk recoding and head cutting to head block (no tests yet)
Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-07-05 15:56:38 +03:00
Ganesh Vernekar 67871fd1f2
Support compaction of Head block for histograms (#9044)
* Update querier.go to support Head compaction with histograms

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Add test for Head compaction with histograms

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Fix tests

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-04 16:12:37 +05:30
Ganesh Vernekar 4c01ff5194
Bunch of fixes for sparse histograms (#9043)
* Do not panic on histoAppender.Append

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* M-map all chunks on shutdown

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Support negative schema for querying

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-07-03 23:04:34 +05:30
Dieter Plaetinck 6c13375ac8
sparsehistogram recoding upon detection that new buckets have appeared (#9030)
* bucketIterator which returns all valid bucket indices for a []span

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* support for comparing []spans and generating interjections

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* add license header

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* assert order fix

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* handle pathological 0-length span case more gracefully

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* stale todo

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* decode-recode histograms when new buckets appear

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* factor out recoding and also add it to the fallback case

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* make linter happy

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-07-02 11:50:30 +05:30
beorn7 518b77c59d Fix a few trivial style nits
Signed-off-by: beorn7 <beorn@grafana.com>
2021-07-01 17:11:54 +02:00
Ganesh Vernekar f4d3af73f0
Query histograms from TSDB and unit test for append+query (#9022)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-30 20:18:13 +05:30
Dieter Plaetinck 4d27816ea5
Sparsehistogram: improve dod encoding, testing, encode chunk metadata (#9015)
* factor out different varbit schemes and include Beorn's "optimum" for buckets

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* use more compact dod encoding scheme for SHS chunk columns

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* remove FB VB and xor dod encoding because we won't use it

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* HistoChunk metadata encoding

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* add SparseHistogram.Copy()

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* histogram test: test appending a few histograms

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* add license headers

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-06-30 16:15:43 +05:30
Ganesh Vernekar 04ad56d9b8
Append sparse histograms into the Head block (#9013)
* Append sparse histograms into the Head block

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Add AtHistogram() to Iterator interface. Make HistoChunk conform to Chunk interface.

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-29 20:08:46 +05:30
Dieter Plaetinck 58917d1b76
sparsehistogram: integer types and timestamp separation (#9014)
* integer types and timestamp separation

1) unify types to int64. as suggested by beorn. we want to support
   counters going down (resets) even if we plan to create new chunks for
   now, in that case
2) histogram type doesn't know its own timestamp. include it separately
   in appending and iteration

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* correction: count and zeroCount to remain unsigned

to make api more resilient and that's what we use in protobuf anyway

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* temp hack. Ganesh will fix

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-06-29 19:27:59 +05:30
Dieter Plaetinck fd11a339a7
Sparsehistogram chunk implementation (#9009)
* histogram chunk

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* xorAppender.AppendHistogram non-method

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>

* basic histogram chunk test

Signed-off-by: Dieter Plaetinck <dieter@grafana.com>
2021-06-29 14:07:41 +05:30
Ganesh Vernekar 64bea6999e
HistogramAppender interface for sparse histograms (#9007)
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2021-06-28 20:30:55 +05:30
Julien Pivotto fa6b2897f0
Merge pull request #8956 from LeviHarrison/fix-tsdb-test-flake
CI: Ignore goleak in TSDB test
2021-06-20 23:41:55 +02:00
Levi Harrison 437c470c40 Added ignore
Signed-off-by: Levi Harrison <git@leviharrison.dev>
2021-06-17 10:04:52 -04:00
Levi Harrison 4a4882d4c7 Replace godoc.org links
Signed-off-by: Levi Harrison <git@leviharrison.dev>
2021-06-17 07:18:51 -04:00
Julien Pivotto b1c179be85
Fix main build (#8948)
Was broken after the merge of #8824

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2021-06-16 17:18:32 +02:00
Julien Duchesne 8855c2e626
Add `prometheus_tsdb_clean_start` metric (#8824)
Add cleanup of the lockfile when the db is cleanly closed

The metric describes the status of the lockfile on startup
0: Already existed
1: Did not exist
-1: Disabled

Therefore, if the min value over time of this metric is 0, that means that executions have exited uncleanly
We can then use that metric to have a much lower threshold on the crashlooping alert:

If the metric exists and it has been zero, two restarts is enough to trigger the alarm
If it does not exist (old prom version for example), the current five restarts threshold remains

Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>

* Change metric name + set unset value to -1

Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>

* Only check the last value of the clean start alert

Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>

* Fix test + nit

Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>
2021-06-16 15:03:02 +05:30