prometheus

Commit Graph

Author	SHA1	Message	Date
Ganesh Vernekar	eeace6bcab	Add couple of metrics to track sparse histograms in TSDB (#9271 ) * Add couple of metrics to track sparse histograms in TSDB Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix Beorn's comments Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-08-30 19:08:44 +05:30
Ganesh Vernekar	c373200b75	Cut a new chunk on counter resets for any bucket Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-08-24 20:02:08 +05:30
Ganesh Vernekar	eedb86783e	Fix queries on blocks for sparse histograms and add unit test (#9209 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-08-16 18:52:29 +05:30
Ganesh Vernekar	42f576aa18	Add test for sparse histogram compaction (#9208 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-08-16 16:32:23 +05:30
Ganesh Vernekar	f0688c21d6	Log sparse histograms into WAL and replay from it (#9191 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-08-11 17:38:48 +05:30
Ganesh Vernekar	095f572d4a	Sync sparsehistogram branch with main (#9189 ) * Fix `kuma_sd` targetgroup reporting (#9157) * Bundle all xDS targets into a single group Signed-off-by: austin ce <austin.cawley@gmail.com> * Snapshot in-memory chunks on shutdown for faster restarts (#7229) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Rename links Signed-off-by: Levi Harrison <git@leviharrison.dev> * Remove Individual Data Type Caps in Per-shard Buffering for Remote Write (#8921) * Moved everything to nPending buffer Signed-off-by: Levi Harrison <git@leviharrison.dev> * Simplify exemplar capacity addition Signed-off-by: Levi Harrison <git@leviharrison.dev> * Added pre-allocation Signed-off-by: Levi Harrison <git@leviharrison.dev> * Don't allocate if not sending exemplars Signed-off-by: Levi Harrison <git@leviharrison.dev> * Avoid deadlock when processing duplicate series record (#9170) * Avoid deadlock when processing duplicate series record `processWALSamples()` needs to be able to send on its output channel before it can read the input channel, so reads to allow this in case the output channel is full. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * processWALSamples: update comment Previous text seems to relate to an earlier implementation. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Optimise WAL loading by removing extra map and caching min-time (#9160) * BenchmarkLoadWAL: close WAL after use So that goroutines are stopped and resources released Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * BenchmarkLoadWAL: make series IDs co-prime with #workers Series are distributed across workers by taking the modulus of the ID with the number of workers, so multiples of 100 are a poor choice. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * BenchmarkLoadWAL: simulate mmapped chunks Real Prometheus cuts chunks every 120 samples, then skips those samples when re-reading the WAL. Simulate this by creating a single mapped chunk for each series, since the max time is all the reader looks at. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Fix comment Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Remove series map from processWALSamples() The locks that is commented to reduce contention in are now sharded 32,000 ways, so won't be contended. Removing the map saves memory and goes just as fast. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * loadWAL: Cache the last mmapped chunk time So we can skip calling append() for samples it will reject. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Improvements from code review Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Full stops and capitals on comments Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Cache max time in both places mmappedChunks is updated Including refactor to extract function `setMMappedChunks`, to reduce code duplication. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Update head min/max time when mmapped chunks added This ensures we have the correct values if no WAL samples are added for that series. Note that `mSeries.maxTime()` was always `math.MinInt64` before, since that function doesn't consider mmapped chunks. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Split Go and React Tests (#8897) * Added go-ci and react-ci Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu> Signed-off-by: Levi Harrison <git@leviharrison.dev> * Remove search keymap from new expression editor (#9184) Signed-off-by: Julius Volz <julius.volz@gmail.com> Co-authored-by: Austin Cawley-Edwards <austin.cawley@gmail.com> Co-authored-by: Levi Harrison <git@leviharrison.dev> Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu> Co-authored-by: Bryan Boreham <bjboreham@gmail.com> Co-authored-by: Julius Volz <julius.volz@gmail.com>	2021-08-11 15:43:17 +05:30
Ganesh Vernekar	19e98e5469	Support storing the zero threshold in the histogram chunk (#9165 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-08-06 18:08:41 +05:30
Ganesh Vernekar	7026e6b4e4	Fix tests in histo_test.go (#9163 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-08-06 14:26:56 +05:30
Ganesh Vernekar	8b70e87ab9	Merge remote-tracking branch 'upstream/main' into sparse-refactor Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-08-05 12:16:08 +05:30
Ganesh Vernekar	848cb5a6d6	Enhanced WAL replay for duplicate series record (#7438 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-08-03 20:03:54 +05:30
Ganesh Vernekar	8002a3ab80	Breakdown tsdb/head.go into multiple files (#9147 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-08-03 14:14:26 +02:00
jinglina	1a430e5f89	remove redundant parentheses (#9134 ) Signed-off-by: jinglina <jinglinax@163.com>	2021-07-29 18:26:57 +05:30
Darshan Chaudhary	c4f2e9eec5	Add present_over_time (#9097 ) * Add present_over_time Signed-off-by: darshanime <deathbullet@gmail.com> * Add tests for present_over_time Signed-off-by: darshanime <deathbullet@gmail.com> * Address PR comments Signed-off-by: darshanime <deathbullet@gmail.com> * Add documentation for present_over_time Signed-off-by: darshanime <deathbullet@gmail.com> * Update documentation Signed-off-by: darshanime <deathbullet@gmail.com> * Update documentation comment Signed-off-by: darshanime <deathbullet@gmail.com>	2021-07-29 12:38:11 +02:00
Oleg Zaytsev	f9482c5bf6	Clarify computeChunkEndTime's purpose (#9049 ) I was struggling to understand the purpose of this method until I tweaked the tests, so I decided to write down my observations. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	2021-07-28 18:39:05 +05:30
Bryan Boreham	60804c5a09	remote_write: reduce blocking from garbage-collect of series (#9109 ) * Refactor: pass segment-reading function as param To allow a different implementation to be used when garbage-collecting. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * remote_write: reduce blocking from GC of series Add a method `UpdateSeriesSegment()` which is used together with `SeriesReset()` to garbage-collect old series. This allows us to split the lock around queueManager series data and avoid blocking `Append()` while reading series from the last checkpoint. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Cosmetic: review feedback on comments Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * remote-write benchmark: include GC of series Reduce the total number of samples per iteration from 5000*5000 (25 million) which is too big for my laptop, to 110000. Extend `createTimeseries()` to add additional labels, so that the queue manager is doing more realistic work. Move the Append() call to a background goroutine - this works because TestWriteClient uses a WaitGroup to signal completion. Call `StoreSeries()` and `SeriesReset()` while adding samples, to simulate the garbage-collection that wal.Watcher does. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Change BenchmarkSampleDelivery to call UpdateSeriesSegment This matches what Watcher.garbageCollectSeries() is doing now Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2021-07-27 13:21:48 -07:00
Bryan Boreham	dea37853d9	tsdb: use dennwc/varint to speed up WAL decoding (#9106 ) * tsdb: use dennwc/varint to speed up decoding This is a tiny library, MIT-licensed, which unrolls the loop to go about twice as fast. Needed to copy the sign-inverting logic inline, previously provided by the `binary` package. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * More comments to explain varint decoding Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2021-07-27 10:02:57 +05:30
Bryan Boreham	6788760efa	Reduce memory allocation in benchmarkIterator() (#5983 ) Previously it was allocating millions of chunks, all containing the same 250 samples. Above some ratio of CPU performance to available memory, the benchmark cannot run. Make 250 a const and just allocate one chunk which we iterate repeatedly till we reach the benchmark count. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2021-07-26 19:36:54 +05:30
Oleg Zaytsev	b1ed4a0a66	LabelNames API with matchers (#9083 ) * Push the matchers for LabelNames all the way into the index. NB This doesn't actually implement it in the index, just plumbs it through for now... Signed-off-by: Tom Wilkie <tom@grafana.com> * Hack it up. Does not work. Signed-off-by: Tom Wilkie <tom@grafana.com> * Revert changes I don't understand Can't see why do we need to hold a mutex on symbols, and the purpose of the LabelNamesFor method. Maybe I'll need to re-add this later. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Implement LabelNamesFor This method provides the label names that appear in the postings provided. We do that deeper than the label values because we know beforehand that most of the label names will be the same across different postings, and we don't want to go down and up looking up the same symbols for all different series. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Mutex on symbols should be unlocked However, I still don't understand why do we need a mutex here. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Fix head.LabelNamesFor Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Implement mockIndex LabelNames with matchers Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Nitpick on slice initialisation Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Add tests for LabelNamesWithMatchers Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Fix the mutex mess on head.LabelValues/LabelNames I still don't see why we need to grab that unrelated mutex, but at least now we're grabbing it consistently Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Check error after iterating postings Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Use the error from posting when there was an error in postings Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com> Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Update storage/interface.go comment Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com> * Update tsdb/index/index.go comment Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com> * Update tsdb/index/index.go wrapped error msg Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com> * Update tsdb/index/index.go wrapped error msg Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com> * Update tsdb/index/index.go wrapped error msg Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com> * Remove unneeded comment Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Add testcases for LabelNames w/matchers in api.go Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Use t.Cleanup() instead of defer in tests Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Tom Wilkie <tom@grafana.com> Co-authored-by: Ganesh Vernekar <15064823+codesome@users.noreply.github.com>	2021-07-20 18:08:08 +05:30
Jupiter	7337ecf0d3	Log when total symbol size exceeds 2^32 bytes. (#9104 ) * Compaction fails when total symbol size exceeds 2^32 bytes. Signed-off-by: tanghengjian <1040104807@qq.com> * Compaction fails when total symbol size exceeds 2^32 bytes. Signed-off-by: tanghengjian <1040104807@qq.com> * Compaction fails when total symbol size exceeds 2^32 bytes. Signed-off-by: root <tanghengjian@oppo.com> Co-authored-by: root <tanghengjian@oppo.com>	2021-07-20 15:51:36 +05:30
Ganesh Vernekar	59d02b5ef0	tsdb: Block Head GC till pending readers are done reading (#9081 ) * tsdb: Block Head GC till pending readers are done reading Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix review comments Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix review comments 2 Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix the exclusiveness of maxt in WaitForPendingReadersInTimeRange Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-07-20 14:17:20 +05:30
Martin Disibio	1bcd13d6b5	Exemplar resize (#8974 ) * Create experimental circular buffer resize method, benchmarks Signed-off-by: Martin Disibio <mdisibio@gmail.com> * Optimize exemplar resize to only replay as many exemplars as needed Signed-off-by: Martin Disibio <mdisibio@gmail.com> * More comments, benchmark AddExemplar Signed-off-by: Martin Disibio <mdisibio@gmail.com> * optimizations Signed-off-by: Martin Disibio <mdisibio@gmail.com> * comment Signed-off-by: Martin Disibio <mdisibio@gmail.com> * Slight refactor of resize benchmark + make use of resize via runtime reloadable storage config. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Some more config related changes. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Address some review comments. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Address more review comments. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Refactor to remove usage of noopExemplarStorage and avoid race condition when resizing from Head code. Signed-off-by: Callum Styan <callumstyan@gmail.com> * Fix or add comments to clarify some of the new behaviour. Signed-off-by: Callum Styan <callumstyan@gmail.com> * fix potential panics related to negative exemplar buffer lengths Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Callum Styan <callumstyan@gmail.com>	2021-07-20 10:22:57 +05:30
Ganesh Vernekar	4fefd7520e	Skip the failing TestHistoChunkSameBuckets (#9089 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-07-15 14:35:35 +05:30
ide-rea	14b24d15b0	update checkpoint replay status (#8898 ) * Consider wal checkpoint replay status Signed-off-by: XiaoYu Zhang <ideoutrea@163.com> * Fix tests failed Signed-off-by: XiaoYu Zhang <ideoutrea@163.com> * Update checkpoint replay status Signed-off-by: XiaoYu Zhang <ideoutrea@163.com>	2021-07-13 15:38:07 +05:30
Ganesh Vernekar	79305e704b	Compare block sizes with sparse histograms (#9045 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-07-08 11:01:53 +05:30
Bryan Boreham	dc8f505595	tsdb: coalesce lock/unlock operations for append (#9061 ) Fetch the low watermark value under the same lock as we need for the appender, rather than releasing then re-acquiring a lock on the same Mutex. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2021-07-07 18:58:20 +05:30
beorn7	cb75747bce	Fix re-encoding Signed-off-by: beorn7 <beorn@grafana.com>	2021-07-06 00:20:35 +02:00
beorn7	01957eee2b	Fix interjections even more Signed-off-by: beorn7 <beorn@grafana.com>	2021-07-05 23:59:33 +02:00
beorn7	dc1c744169	Fix interjections at the end Signed-off-by: beorn7 <beorn@grafana.com>	2021-07-05 23:01:39 +02:00
Ganesh Vernekar	1acb701e5c	Fix TSDB race while reading histograms Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-07-05 21:51:35 +05:30
Ganesh Vernekar	9f206a7a05	Fix race in TSDB while reading/writing histograms (#9051 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-07-05 21:27:26 +05:30
Oleg Zaytsev	40126a8494	Use binary literals for xor chunk encoding An opinionated cosmetic change, but since Go 1.13 we have these fancy 0b... literals so we don't need to write hex and comment the binary value. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>	2021-07-05 16:39:24 +02:00
beorn7	deb02d59fb	Fix lint issues Signed-off-by: beorn7 <beorn@grafana.com>	2021-07-05 15:27:46 +02:00
Dieter Plaetinck	dc6b068c67	bugfix: only bump numRead when all fields are successfully read Signed-off-by: Dieter Plaetinck <dieter@grafana.com>	2021-07-05 15:57:47 +03:00
Dieter Plaetinck	98f86d671a	cleanup comments Signed-off-by: Dieter Plaetinck <dieter@grafana.com>	2021-07-05 15:56:38 +03:00
Dieter Plaetinck	99ae04bb6f	add SHS chunk recoding and head cutting to head block (no tests yet) Signed-off-by: Dieter Plaetinck <dieter@grafana.com>	2021-07-05 15:56:38 +03:00
Ganesh Vernekar	67871fd1f2	Support compaction of Head block for histograms (#9044 ) * Update querier.go to support Head compaction with histograms Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Add test for Head compaction with histograms Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Fix tests Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-07-04 16:12:37 +05:30
Ganesh Vernekar	4c01ff5194	Bunch of fixes for sparse histograms (#9043 ) * Do not panic on histoAppender.Append Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * M-map all chunks on shutdown Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Support negative schema for querying Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-07-03 23:04:34 +05:30
Dieter Plaetinck	6c13375ac8	sparsehistogram recoding upon detection that new buckets have appeared (#9030 ) * bucketIterator which returns all valid bucket indices for a []span Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * support for comparing []spans and generating interjections Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * add license header Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * assert order fix Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * handle pathological 0-length span case more gracefully Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * stale todo Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * decode-recode histograms when new buckets appear Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * factor out recoding and also add it to the fallback case Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * make linter happy Signed-off-by: Dieter Plaetinck <dieter@grafana.com>	2021-07-02 11:50:30 +05:30
beorn7	518b77c59d	Fix a few trivial style nits Signed-off-by: beorn7 <beorn@grafana.com>	2021-07-01 17:11:54 +02:00
Ganesh Vernekar	f4d3af73f0	Query histograms from TSDB and unit test for append+query (#9022 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-06-30 20:18:13 +05:30
Dieter Plaetinck	4d27816ea5	Sparsehistogram: improve dod encoding, testing, encode chunk metadata (#9015 ) * factor out different varbit schemes and include Beorn's "optimum" for buckets Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * use more compact dod encoding scheme for SHS chunk columns Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * remove FB VB and xor dod encoding because we won't use it Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * HistoChunk metadata encoding Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * add SparseHistogram.Copy() Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * histogram test: test appending a few histograms Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * add license headers Signed-off-by: Dieter Plaetinck <dieter@grafana.com>	2021-06-30 16:15:43 +05:30
Ganesh Vernekar	04ad56d9b8	Append sparse histograms into the Head block (#9013 ) * Append sparse histograms into the Head block Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com> * Add AtHistogram() to Iterator interface. Make HistoChunk conform to Chunk interface. Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-06-29 20:08:46 +05:30
Dieter Plaetinck	58917d1b76	sparsehistogram: integer types and timestamp separation (#9014 ) * integer types and timestamp separation 1) unify types to int64. as suggested by beorn. we want to support counters going down (resets) even if we plan to create new chunks for now, in that case 2) histogram type doesn't know its own timestamp. include it separately in appending and iteration Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * correction: count and zeroCount to remain unsigned to make api more resilient and that's what we use in protobuf anyway Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * temp hack. Ganesh will fix Signed-off-by: Dieter Plaetinck <dieter@grafana.com>	2021-06-29 19:27:59 +05:30
Dieter Plaetinck	fd11a339a7	Sparsehistogram chunk implementation (#9009 ) * histogram chunk Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * xorAppender.AppendHistogram non-method Signed-off-by: Dieter Plaetinck <dieter@grafana.com> * basic histogram chunk test Signed-off-by: Dieter Plaetinck <dieter@grafana.com>	2021-06-29 14:07:41 +05:30
Ganesh Vernekar	64bea6999e	HistogramAppender interface for sparse histograms (#9007 ) Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2021-06-28 20:30:55 +05:30
Julien Pivotto	fa6b2897f0	Merge pull request #8956 from LeviHarrison/fix-tsdb-test-flake CI: Ignore goleak in TSDB test	2021-06-20 23:41:55 +02:00
Levi Harrison	437c470c40	Added ignore Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-06-17 10:04:52 -04:00
Levi Harrison	4a4882d4c7	Replace godoc.org links Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-06-17 07:18:51 -04:00
Julien Pivotto	b1c179be85	Fix main build (#8948 ) Was broken after the merge of #8824 Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-06-16 17:18:32 +02:00
Julien Duchesne	8855c2e626	Add `prometheus_tsdb_clean_start` metric (#8824 ) Add cleanup of the lockfile when the db is cleanly closed. The metric describes the status of the lockfile on startup: 0: Already existed; 1: Did not exist; -1: Disabled. Therefore, if the min value over time of this metric is 0, that means that executions have exited uncleanly. We can then use that metric to have a much lower threshold on the crashlooping alert: If the metric exists and it has been zero, two restarts is enough to trigger the alarm. If it does not exist (old prom version for example), the current five restarts threshold remains. Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com> * Change metric name + set unset value to -1 Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com> * Only check the last value of the clean start alert Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com> * Fix test + nit Signed-off-by: Julien Duchesne <julien.duchesne@grafana.com>	2021-06-16 15:03:02 +05:30

383 Commits