prometheus

mirror of https://github.com/prometheus/prometheus synced 2024-12-27 17:13:22 +00:00

Author	SHA1	Message	Date
beorn7	836f1db04c	Improve MetricsForLabelMatchers WIP: This needs more tests. It now gets a from and through value, which it may opportunistically use to optimize the retrieval. With possible future range indices, this could be used in a very efficient way. This change merely applies some easy checks, which should nevertheless solve the use case of heavy rule evaluations on servers with a lot of series churn. Idea is the following: - Only archive series that are at least as old as the headChunkTimeout (which was already extremely unlikely to happen). - Then maintain a high watermark for the last archival, i.e. no archived series has a sample more recent than that watermark. - Any query that doesn't reach to a time before that watermark doesn't have to touch the archive index at all. (A production server at SoundCloud with the aforementioned series churn and heavy rule evaluations spends 50% of its CPU time in archive index lookups. Since rule evaluations usually only touch very recent values, most of those lookups should disappear with this change.) - Federation with a very broad label matcher will profit from this, too. As a byproduct, the un-needed MetricForFingerprint method was removed from the Storage interface.	2016-03-09 00:25:59 +01:00
beorn7	167b83695c	Merge branch 'beorn7/storage5' into beorn7/storage6	2016-03-08 00:20:44 +01:00
beorn7	01795382c9	Merge branch 'beorn7/storage4' into beorn7/storage5	2016-03-08 00:20:13 +01:00
beorn7	f7fc542db6	Merge branch 'master' into beorn7/storage4 Conflicts: storage/local/persistence.go	2016-03-08 00:14:00 +01:00
beorn7	c13b1ecfe9	Make chunk iterators more DRY This finally extracts all the common code of the two chunk iterators into one. Any future chunk encodings with fast access by index can use the same iterator by simply providing an indexAccessor. Other future chunk encodings without fast index access (like Gorilla-style) can still implement the chunkIterator interface as usual.	2016-03-07 20:23:14 +01:00
beorn7	32f280a3cd	Slim down the chunkIterator interface For one, remove unneeded methods. Then, instead of using a channel for all values, use a bufio.Scanner-like interface. This removes the need for creating a goroutine and avoids the (unnecessary) locking performed by channel sending and receiving. This will make it much easier to write new chunk implementations (like Gorilla-style encoding).	2016-03-07 19:50:13 +01:00
beorn7	b6fdb355d7	Move dump-heads into its own tool	2016-03-07 16:30:19 +01:00
beorn7	f193f2b8ef	Add a command to promtool that dumps metadata of heads.db I needed this today for debugging. It can certainly be improved, but it's already quite helpful. I refactored the reading of heads.db files out of persistence, which is an improvement, too. I made minor changes to the cli package to allow outputting via the io.Writer interface.	2016-03-07 16:21:57 +01:00
beorn7	fc7de5374a	Quarantine series upon problem writing to the series file This fixes https://github.com/prometheus/prometheus/issues/1059 , but not in the obvious way (simply not updating the persist watermark, because that's actually not that simple - we don't really know what has gone wrong exactly). As any errors relevant here are most likely caused by severe and unrecoverable problems with the series file, using the new quarantine feature is the right step. We don't really have to be worried about any inconsistent state of the series because it will be removed for good ASAP. Another plus is that we don't have to declare the whole storage dirty anymore.	2016-03-03 13:15:02 +01:00
beorn7	0ea5801e47	Handle errors caused by data corruption more gracefully This requires all the panic calls upon unexpected data to be converted into errors returned. This pollutes the function signatures quite a lot. Well, this is Go... The ideas behind this are the following: - panic only if it's a programming error. Data corruptions happen, and they are not programming errors. - If we detect a data corruption, we "quarantine" the series, essentially removing it from the database and putting its data into a separate directory for forensics. - Failure during writing to a series file is not considered corruption automatically. It will call setDirty, though, so that a crash recovery upon the next restart will commence and check for that. - Series quarantining and setDirty calls are logged and counted in metrics, but are hidden from the user of the interfaces in interface.go, with the notable exception of Append(). The reasoning is that we treat corruption by removing the corrupted series, i.e. a query for it will return no results on its next call anyway, so return no results right now. In the case of Append(), we want to tell the user that no data has been appended, though. Minor side effects: - Now consistently using filepath.* instead of path.*. - Introduced structured logging where I touched it. This makes things less consistent, but a complete change to structured logging would be out of scope for this PR.	2016-03-02 23:02:34 +01:00
beorn7	b6840997a7	Merge branch 'beorn7/storage2' into beorn7/storage3	2016-03-02 16:11:25 +01:00
beorn7	ce58fd357b	Merge branch 'beorn7/storage' into beorn7/storage2 Conflicts: storage/local/chunk.go storage/local/interface.go	2016-03-02 16:09:32 +01:00
beorn7	2581648f70	Separate iterators by offset Add test that exposes the problem.	2016-03-02 16:01:03 +01:00
beorn7	c740789ce3	Improve predict_linear Fixes https://github.com/prometheus/prometheus/issues/1401 This removes the last (and in fact bogus) use of BoundaryValues. Thus, a whole lot of unused (and arguably sub-optimal / ugly) code can be removed here, too.	2016-02-25 12:10:55 +01:00
beorn7	4b503ed9a5	Merge branch 'master' into beorn7/storage2	2016-02-24 14:03:49 +01:00
beorn7	059295332f	Merge remote-tracking branch 'origin/master' into beorn7/storage	2016-02-24 14:02:27 +01:00
beorn7	53005c3085	Merge branch 'beorn7/storage' into beorn7/storage2	2016-02-24 14:00:56 +01:00
beorn7	28e9bbc15f	Populate chunkDesc.chunkLastTime during checkpoint loading, too	2016-02-24 13:58:34 +01:00
Björn Rabenstein	a8c79f0a0c	Merge pull request #1422 from prometheus/release-0.17 Merge more commits from 0.17.	2016-02-23 23:07:44 +01:00
beorn7	8fa1560e48	Fix a very special case of handling the checkpoint timer	2016-02-23 16:48:35 +01:00
beorn7	41e44f6ab9	Merge branch 'master' into beorn7/storage2	2016-02-22 16:54:33 +01:00
Björn Rabenstein	d9eb624322	Merge pull request #1415 from prometheus/release-0.17 Forward-merge release-0.17 into master	2016-02-22 16:39:48 +01:00
beorn7	4d1f7b49b6	Fix a race condition in calculatePersistenceUrgencyScore	2016-02-22 15:48:39 +01:00
beorn7	454ecf3f52	Rework the way ranges and instants are handled In a way, our instants were also ranges, just with the staleness delta as range length. They are now treated equally, just that in one case, the range length is set as range, in the other the staleness delta. However, there are "real" instants where start and end time of a query are the same. In those cases, we only want to return a single value (the one closest before or at the equal start and end time). If that value is the last sample in the series, odds are we have it already in the series object. In that case, there is no need to pin or load any chunks. A special singleSampleSeriesIterator is created for that. This should greatly speed up instant queries as they happen frequently for rule evaluations.	2016-02-22 01:47:18 +01:00
beorn7	b876f8e6a5	Move lastSamplePair method up to memorySeries This implies a slight change of behavior as only samples added to the respective instance of a memorySeries are returned. However, this is most likely what we want anyway. Following cases: - Server has been restarted: Given the time it takes to cleanly shutdown and start up a server, the series are now stale anyway. An improved staleness handling (still to be implemented) will be based on tracking if a given target is continuing to expose samples for a given time series. In that case, we need a full scrape cycle to decide about staleness. So again, it makes sense to consider everything stale directly after a server restart. - Series unarchived due to a read request: The series is definitely stale so we don't want to return anything anyway. - Freshly created time series or series unarchived because of a sample append: That happens because appending a sample is imminent. Before the fingerprint lock is released, the series will have received a sample, and lastSamplePair will always return the expected value.	2016-02-19 18:16:41 +01:00
beorn7	1e13f89039	Return SamplePair instead of *SamplePair consistently Formalize ZeroSamplePair as return value for non-existing samples. Change LastSamplePairForFingerprint to return a SamplePair (and not a pointer to it), which saves allocations in a potentially extremely frequent call.	2016-02-19 17:00:40 +01:00
beorn7	d290340367	Fix and improve chunkDesc locking	2016-02-19 16:24:38 +01:00
beorn7	0e202dacb4	Streamline series iterator creation This will fix issue #1035 and will also help to make issue #1264 less bad. The fundamental problem in the current code: In the preload phase, we quite accurately determine which chunks will be used for the query being executed. However, in the subsequent step of creating series iterators, the created iterators are referencing _all_ in-memory chunks in their series, even the un-pinned ones. In iterator creation, we copy a pointer to each in-memory chunk of a series into the iterator. While this creates a certain amount of allocation churn, the worst thing about it is that copying the chunk pointer out of the chunkDesc requires a mutex acquisition. (Remember that the iterator will also reference un-pinned chunks, so we need to acquire the mutex to protect against concurrent eviction.) The worst case happens if a series doesn't even contain any relevant samples for the query time range. We notice that during preloading but then we will still create a series iterator for it. But even for series that do contain relevant samples, the overhead is quite bad for instant queries that retrieve a single sample from each series, but still go through all the effort of series iterator creation. All of that is particularly bad if a series has many in-memory chunks. This commit addresses the problem from two sides: First, it merges preloading and iterator creation into one step, i.e. the preload call returns an iterator for exactly the preloaded chunks. Second, the required mutex acquisition in chunkDesc has been greatly reduced. That was enabled by a side effect of the first step, which is that the iterator is only referencing pinned chunks, so there is no risk of concurrent eviction anymore, and chunks can be accessed without mutex acquisition. To simplify the code changes for the above, the long-planned change of ValueAtTime to ValueAtOrBefore time was performed at the same time. 
(It should have been done first, but it kind of accidentally happened while I was in the middle of writing the series iterator changes. Sorry for that.) So far, we actively filtered the up to two values that were returned by ValueAtTime, i.e. we invested work to retrieve up to two values, and then we invested more work to throw one of them away. The SeriesIterator.BoundaryValues method can be removed once #1401 is fixed. But I really didn't want to load even more changes into this PR. Benchmarks: The BenchmarkFuzz.* benchmarks run 83% faster (i.e. about six times faster) and allocate 95% fewer bytes. The reason for that is that the benchmark reads one sample after another from the time series and creates a new series iterator for each sample read. To find out how much these improvements matter in practice, I have mirrored a beefy Prometheus server at SoundCloud that suffers from both issues #1035 and #1264. To reach steady state that would be comparable, the server needs to run for 15d. So far, it has run for 1d. The test server currently has only half as many memory time series and 60% of the memory chunks the main server has. The 90th percentile rule evaluation cycle time is ~11s on the main server and only ~3s on the test server. However, these numbers might get much closer over time. In addition to performance improvements, this commit removes about 150 LOC.	2016-02-19 16:24:38 +01:00
beorn7	ef3ab96111	Populate first and last time in the chunk descriptor earlier The first time is kind of trivial as we always know it when we create a new chunkDesc. The last time is only known when the chunk is closed, so we have to set it at that time. The change saves a lot of digging down into the chunk itself. Especially the last time is relatively expensive as it involves the creation of an iterator. The first time access now doesn't require locking, which is also a nice gain.	2016-02-15 14:06:09 +01:00
beorn7	9a3edea477	Remove race condition from TestRetentionCutoff	2016-02-12 12:13:19 +01:00
Julius Volz	9b6d69610a	Fix various typos in comments. Helpfully reported by https://goreportcard.com/report/github.com/prometheus/prometheus :)	2016-02-10 03:47:00 +01:00
Fabian Reinartz	1f877f3d2a	Fix deadlock, structure target logging	2016-02-03 10:39:34 +01:00
Fabian Reinartz	59f1e722df	Return error on sample appending	2016-02-02 14:01:44 +01:00
beorn7	ec08c9a391	Rework the way to communicate backpressure (AKA suspended ingestion) This gives up on the idea to communicate through the Append() call (by either not returning as it is now or returning an error as suggested/explored elsewhere). Here I have added a Throttled() call, which has the advantage that it can be called before a whole _batch_ of Append()'s. Scrapes will happen completely or not at all. Same for rule group evaluations. That's a highly desired behavior (as discussed elsewhere). The code is even simpler now as the whole ingestion buffer could be removed. Logging of throttled mode has been streamlined and will create at most one message per minute.	2016-02-01 14:45:44 +01:00
beorn7	87ef24cd25	Add instrumentation and refactor things around "rushed mode"	2016-01-26 17:44:21 +01:00
beorn7	a2cd479058	Fix calculation of chunks to persist after restart Since we are not overestimating the number of chunks to persist anymore, this commit also adjusts the default value for -storage.local.memory-chunks. Update of documentation will follow.	2016-01-25 19:33:51 +01:00
beorn7	972d94433a	Introduce a hysteresis for "rushed mode" "Rushed mode" is formerly known as "degraded mode", which is changed with this commit, too. The name "degraded" was very misleading. Also, switch into rushed mode if we have too many chunks in memory and an at least reasonable amount of chunks to persist so that speeding up persisting chunks can help.	2016-01-25 19:24:37 +01:00
beorn7	14796bdb60	Improve chunkMaxBatchSize doc comment	2016-01-25 18:57:51 +01:00
beorn7	582af1618c	Streamline chunk writing This helps to avoid allocations in the same way we were already doing it during reading.	2016-01-25 16:36:36 +01:00
beorn7	99b9611351	Remove a race condition from TestRetentionCutoff	2016-01-25 16:36:14 +01:00
beorn7	3f4d22e4c7	Update doc comment This should have gone into a previous commit, but I forgot to save this particular file.	2016-01-12 12:38:18 +01:00
beorn7	add2ebdd56	Tolerate the lost+found directory in the data directory	2016-01-11 18:05:36 +01:00
Björn Rabenstein	6293f3a374	Merge pull request #1304 from prometheus/beorn7/storage Improve handling of series file truncation	2016-01-11 17:27:08 +01:00
beorn7	cb117d8346	Add a series ops metric "purge_on_request" It counts series deletions triggered via the API.	2016-01-11 17:22:16 +01:00
beorn7	4221c7de5c	Improve handling of series file truncation If only very few chunks are to be truncated from a very large series file, the rewrite of the file is a large overhead. With this change, a certain ratio of the file has to be dropped to make it happen. While only causing disk overhead at about the same ratio (by default 10%), it will cut down I/O by a lot in the above scenario.	2016-01-11 16:42:10 +01:00
Corentin Chary	7b6c3e556c	Use '.' instead of '=' to separate labels from their values in Graphite Using .label=value. was weird to use in Graphite and didn't bring much value.	2016-01-11 13:57:14 +01:00
Julius Volz	75fdcf5698	Merge pull request #1197 from iksaif/master Add support for remote storage on Graphite	2015-11-10 09:46:17 +01:00
Corentin Chary	a2e4439086	Add support for remote storage on Graphite Allows using Graphite over TCP or UDP. Metric labels and values are used to construct a valid Graphite path in a way that will allow us to eventually read them back and reconstruct the metrics. For example, this metric: model.Metric{ model.MetricNameLabel: "test:metric", "testlabel": "test:value", "testlabel2": "test:value", } Will become: test:metric.testlabel=test:value.testlabel2=test:value escape.go takes care of escaping values to match the Graphite character set; it basically uses percent-encoding as a fallback, which will work pretty well in the Graphite/Grafana world. The remote storage module also has an optional 'prefix' parameter to prefix all metrics with a path (for example, 'prometheus.'). Graphite URLs are simply in the form tcp://host:port or udp://host:port.	2015-11-10 07:58:57 +01:00
Fabian Reinartz	33aab4169c	Anchor regexes in vector matching This commit makes the regex behavior of vector matching consistent with configuration and label_replace() by anchoring it. Fixes #1200	2015-11-05 11:23:43 +01:00
Fabian Reinartz	e3b6ec9784	Switch to common/log	2015-10-03 10:21:43 +02:00
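
Commit 32f280a3cd above replaces a channel of values with a bufio.Scanner-like interface for chunk iterators. The following is only a minimal sketch of what such an interface can look like, not the code from storage/local: the names samplePair, sampleIterator, sliceIterator and sumValues are hypothetical and chosen for illustration.

```go
package main

import "fmt"

// samplePair is a hypothetical stand-in for a timestamped sample value.
type samplePair struct {
	Timestamp int64
	Value     float64
}

// sampleIterator is an illustrative bufio.Scanner-like interface. The
// method names are assumptions, not the ones used in storage/local.
type sampleIterator interface {
	// Scan advances to the next sample and reports whether one exists.
	Scan() bool
	// Value returns the current sample; meaningful only after Scan returned true.
	Value() samplePair
	// Err returns the first error encountered while scanning, if any.
	Err() error
}

// sliceIterator walks an in-memory slice. No goroutine and no channel are
// involved, so iterating needs no channel send/receive synchronization.
type sliceIterator struct {
	samples []samplePair
	pos     int // index of the next sample to hand out
	cur     samplePair
}

func (it *sliceIterator) Scan() bool {
	if it.pos >= len(it.samples) {
		return false
	}
	it.cur = it.samples[it.pos]
	it.pos++
	return true
}

func (it *sliceIterator) Value() samplePair { return it.cur }

// Err always returns nil here because reading from a slice cannot fail.
func (it *sliceIterator) Err() error { return nil }

// sumValues drains an iterator using the usual Scanner loop shape.
func sumValues(it sampleIterator) (float64, error) {
	var sum float64
	for it.Scan() {
		sum += it.Value().Value
	}
	return sum, it.Err()
}

func main() {
	it := &sliceIterator{samples: []samplePair{{1, 0.5}, {2, 1.5}, {3, 2}}}
	sum, err := sumValues(it)
	fmt.Println(sum, err)
}
```

The caller checks Err once after the loop, as with bufio.Scanner, instead of handling an error on every step.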


589 Commits