prometheus

Commit Graph

Author	SHA1	Message	Date
Bjoern Rabenstein	c3b282bd14	Add regression tests for 'loop until op is consumed' bug. - Most of this is the actual regression test in tiered_test.go. - Working on that regression test uncovered problems in tiered_test.go that are fixed in this commit. - The 'op.consumed = false' line added to freelist.go was actually not fixing a bug. Instead, there was no bug at all. So this commit removes that line again, but adds a regression test to make sure that the assumed bug is indeed not there (cf. freelist_test.go). - Removed more code duplication in operation.go (following the same approach as before, i.e. embedding op type A into op type B if everything in A is the same as in B with the exception of String() and ExtractSample()). (This change makes struct literals for ops more clunky, but that only affects tests. No code change whatsoever was necessary in the actual code after this refactoring.) - Fix another op leak in tiered.go. Change-Id: Ia165c52e33290ad4f6aba9c83d92318d4f583517	2014-03-12 18:40:24 +01:00
Julius Volz	86fc13a52e	Convert metric.Values to slice of values. The initial impetus for this was that it made unmarshalling sample values much faster. Other relevant benchmark changes in ns/op: Benchmark old new speedup ================================================================== BenchmarkMarshal 179170 127996 1.4x BenchmarkUnmarshal 404984 132186 3.1x BenchmarkMemoryGetValueAtTime 57801 50050 1.2x BenchmarkMemoryGetBoundaryValues 64496 53194 1.2x BenchmarkMemoryGetRangeValues 66585 54065 1.2x BenchmarkStreamAdd 45.0 75.3 0.6x BenchmarkAppendSample1 1157 1587 0.7x BenchmarkAppendSample10 4090 4284 0.95x BenchmarkAppendSample100 45660 44066 1.0x BenchmarkAppendSample1000 579084 582380 1.0x BenchmarkMemoryAppendRepeatingValues 22796594 22005502 1.0x Overall, this gives us good speedups in the areas where they matter most: decoding values from disk and accessing the memory storage (which is also used for views). Some of the smaller append examples take minimally longer, but the cost seems to get amortized over larger appends, so I'm not worried about these. Also, we're currently not bottlenecked on the write path and have plenty of other optimizations available in that area if it becomes necessary. Memory allocations during appends don't change measurably at all. Change-Id: I7dc7394edea09506976765551f35b138518db9e8	2014-03-11 18:23:37 +01:00
Bjoern Rabenstein	9ea9189dd1	Remove the multi-op-per-fingerprint capability. Currently, rendering a view is capable of handling multiple ops for the same fingerprint efficiently. However, this capability requires a lot of complexity in the code, which we are not using at all because the way we assemble a viewRequest will never have more than one operation per fingerprint. This commit weeds out the said capability, along with all the code needed for it. It is still possible to have more than one operation for the same fingerprint; it will just be handled in a less efficient way (as proven by the unit tests). As a result, scanjob.go could be removed entirely. This commit also contains a few related refactorings and removals of dead code in operation.go, view.go, and freelist.go. Also, the docstrings received some love. Change-Id: I032b976e0880151c3f3fdb3234fb65e484f0e2e5	2014-03-04 16:29:56 +01:00
Julius Volz	740d448983	Use custom timestamp type for sample timestamps and related code. So far we've been using Go's native time.Time for anything related to sample timestamps. Since the range of time.Time is much bigger than what we need, this has created two problems: - there could be time.Time values which were out of the range/precision of the time type that we persist to disk, therefore causing incorrectly ordered keys. One bug caused by this was: https://github.com/prometheus/prometheus/issues/367 It would be good to use a timestamp type that's more closely aligned with what the underlying storage supports. - sizeof(time.Time) is 192, while Prometheus should be ok with a single 64-bit Unix timestamp (possibly even a 32-bit one). Since we store samples in large numbers, this seriously affects memory usage. Furthermore, copying/working with the data will be faster if it's smaller. MEMORY USAGE RESULTS Initial memory usage comparisons for a running Prometheus with 1 timeseries and 100,000 samples show roughly a 13% decrease in total (VIRT) memory usage. In my tests, this advantage for some reason decreased a bit the more samples the timeseries had (to 5-7% for millions of samples). This I can't fully explain, but perhaps garbage collection issues were involved. WHEN TO USE THE NEW TIMESTAMP TYPE The new clientmodel.Timestamp type should be used whenever time calculations are either directly or indirectly related to sample timestamps. For example: - the timestamp of a sample itself - all kinds of watermarks - anything that may become or is compared to a sample timestamp (like the timestamp passed into Target.Scrape()). When to still use time.Time: - for measuring durations/times not related to sample timestamps, like duration telemetry exporting, timers that indicate how frequently to execute some action, etc. NOTE ON OPERATOR OPTIMIZATION TESTS We don't use operator optimization code anymore, but it still lives in the code as dead code. It still has tests, but I couldn't get all of them to pass with the new timestamp format. I commented out the failing cases for now, but we should probably remove the dead code soon. I just didn't want to do that in the same change as this. Change-Id: I821787414b0debe85c9fffaeb57abd453727af0f	2013-12-03 09:11:28 +01:00
Matt T. Proud	772d3d6b11	Consolidate LevelDB storage construction. There are too many parameters to constructing a LevelDB storage instance for a construction method, so I've opted to take an idiomatic approach of embedding them in a struct for easier mediation and versioning.	2013-08-03 17:25:03 +02:00
Matt T. Proud	f7704af4f8	Code Review: Formatting comments.	2013-07-15 15:12:01 +02:00
Julius Volz	d2da21121c	Implement getValueRangeAtIntervalOp for faster range queries. This also short-circuits optimize() for now, since it is complex to implement for the new operator, and ops generated by the query layer already fulfill the needed invariants. We should still investigate later whether to completely delete operator optimization code or extend it to support getValueRangeAtIntervalOp operators.	2013-06-26 18:10:36 +02:00
Matt T. Proud	30b1cf80b5	WIP - Snapshot of Moving to Client Model.	2013-06-25 15:52:42 +02:00
Julius Volz	f2b4067b7b	Speedup and clean up operation optimization.	2013-06-20 03:01:13 +02:00
Julius Volz	f2b48b8c4a	Make getValuesAtIntervalOp consume all chunk data in one pass. This is mainly a small performance improvement, since we skip past the last extracted time immediately if it was also the last sample in the chunk, instead of trying to extract non-existent values before the chunk end again and again and only gradually approaching the end of the chunk.	2013-05-22 18:14:45 +02:00
Julius Volz	71a3172abb	Fix and optimize getValuesAtIntervalOp data extraction. - only the data extracted in the last loop iteration of ExtractSamples() was emitted as output - if e.g. op interval < sample interval, there were situations where the same sample was added multiple times to the output	2013-05-14 13:55:17 +02:00
Julius Volz	99dcbe0f94	Integrate memory and disk layers in view rendering.	2013-04-19 16:01:27 +02:00
Matt T. Proud	ceb6611957	Fix regression in subsequent range op. compactions. We have an anomaly whereby subsequent range operations fail to be compacted into one single range operation. This fixes such behavior.	2013-03-21 18:11:04 +01:00
Matt T. Proud	978acd4e96	Simplify time group optimizations. The old code performed well according to the benchmarks, but the new code shaves 1/6th of the time off the original and with less code.	2013-03-21 18:08:48 +01:00
Matt T. Proud	582354f6de	Fix remaining ``make advice`` issues.	2013-03-21 18:08:47 +01:00
Matt T. Proud	615e6d13d7	Run ``make format``.	2013-03-21 18:08:47 +01:00
Julius Volz	caeb759ed7	Add tests for and fix getValuesAlongRangeOp value extraction.	2013-03-21 18:08:47 +01:00
Julius Volz	e2fb497eba	Add operator value extraction tests.	2013-03-21 18:08:47 +01:00
Julius Volz	12a8863582	Add data extraction methods to operator types.	2013-03-21 18:08:47 +01:00
Matt T. Proud	41068c2e84	Checkpoint.	2013-03-21 18:06:51 +01:00

20 Commits
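
Commit 740d448983 above argues for replacing time.Time with a compact 64-bit timestamp for sample data. The following is a minimal sketch of that idea, not the actual clientmodel.Timestamp implementation: the Timestamp type, its second-level resolution, and the FromTime, Time, and Before helpers are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
	"unsafe"
)

// Timestamp is a hypothetical stand-in for the custom type the commit
// describes: a single 64-bit integer holding seconds since the Unix epoch.
// The resolution used by the real clientmodel.Timestamp may differ.
type Timestamp int64

// FromTime converts a time.Time into the compact representation,
// discarding sub-second precision and location information.
func FromTime(t time.Time) Timestamp { return Timestamp(t.Unix()) }

// Time converts back to a time.Time for display or duration arithmetic.
func (ts Timestamp) Time() time.Time { return time.Unix(int64(ts), 0) }

// Before reports whether ts is strictly earlier than other. Ordering is
// plain integer comparison, so it matches the order of the stored value.
func (ts Timestamp) Before(other Timestamp) bool { return ts < other }

func main() {
	now := time.Now()
	ts := FromTime(now)

	// Compare the in-memory footprint of the two representations.
	fmt.Println("time.Time size in bytes:", unsafe.Sizeof(now))
	fmt.Println("Timestamp size in bytes:", unsafe.Sizeof(ts))
	fmt.Println("earlier sample sorts first:", (ts - 60).Before(ts))
}
```

Because the in-memory value is a plain integer, comparisons in memory agree with the ordering of a persisted key of the same width, which is the class of problem the commit message links to issue 367.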