prometheus

Commit Graph

Author	SHA1	Message	Date
beorn7	47e3c90f9b	Clean up error propagation Only return an error where callers are doing something with it except simply logging and ignoring. All the errors touched in this commit flag the storage as dirty anyway, and that fact is logged anyway. So most of what is being removed here is just log spam. As discussed earlier, the class of errors that flags the storage as dirty signals fundamental corruption, so even bubbling up a one-time warning to the user (e.g. about incomplete results) isn't helping much because _anything_ happening in the storage has to be doubted from that point on (and in fact retroactively into the past, too). Flagging the storage dirty, and alerting on it (plus marking the state in the web UI) is the only way I can see right now. As a byproduct, I cleaned up the setDirty method a bit and improved the logged errors.	2016-03-09 18:56:30 +01:00
beorn7	99854a84d7	Merge branch 'beorn7/storage6' into beorn7/storage7	2016-03-09 17:23:25 +01:00
beorn7	5e4fa96719	Merge branch 'beorn7/storage5' into beorn7/storage6	2016-03-09 17:21:32 +01:00
beorn7	b343e65907	Merge branch 'beorn7/storage4' into beorn7/storage5	2016-03-09 17:14:42 +01:00
beorn7	d0a4477446	Merge branch 'beorn7/storage3' into beorn7/storage4 Conflicts: storage/local/preload.go storage/local/storage.go storage/local/storage_test.go	2016-03-09 17:13:16 +01:00
beorn7	55eddab25f	Merge branch 'beorn7/storage2' into beorn7/storage3	2016-03-09 16:48:46 +01:00
beorn7	161eada3ad	Make chunkIterator even leaner.	2016-03-09 16:20:39 +01:00
beorn7	dad302144d	Make a naked return less naked	2016-03-09 15:06:00 +01:00
beorn7	beb36df4bb	De-flag preloadChunksForRange Now there is preloadChunksForRange and preloadChunksForInstant in both the series and the storage.	2016-03-09 14:50:09 +01:00
beorn7	bbd34d7ccf	Merge branch 'beorn7/storage6' into beorn7/storage7	2016-03-09 00:50:33 +01:00
beorn7	7cdfae1466	Merge branch 'beorn7/storage5' into beorn7/storage6	2016-03-09 00:50:17 +01:00
beorn7	d6b00b4f6c	Merge branch 'beorn7/storage4' into beorn7/storage5	2016-03-09 00:50:05 +01:00
beorn7	eb9caf13be	Merge branch 'beorn7/storage3' into beorn7/storage4	2016-03-09 00:49:52 +01:00
beorn7	d284864c87	Merge branch 'beorn7/storage2' into beorn7/storage3	2016-03-09 00:49:41 +01:00
beorn7	dcb7c0d3ee	Merge branch 'master' into beorn7/storage2	2016-03-09 00:48:51 +01:00
beorn7	836f1db04c	Improve MetricsForLabelMatchers WIP: This needs more tests. It now gets a from and through value, which it may opportunistically use to optimize the retrieval. With possible future range indices, this could be used in a very efficient way. This change merely applies some easy checks, which should nevertheless solve the use case of heavy rule evaluations on servers with a lot of series churn. Idea is the following: - Only archive series that are at least as old as the headChunkTimeout (which was already extremely unlikely to happen). - Then maintain a high watermark for the last archival, i.e. no archived series has a sample more recent than that watermark. - Any query that doesn't reach to a time before that watermark doesn't have to touch the archive index at all. (A production server at Soundcloud with the aforementioned series churn and heavy rule evaluations spends 50% of its CPU time in archive index lookups. Since rule evaluations usually only touch very recent values, most of those lookups should disappear with this change.) - Federation with a very broad label matcher will profit from this, too. As a byproduct, the unneeded MetricForFingerprint method was removed from the Storage interface.	2016-03-09 00:25:59 +01:00
Björn Rabenstein	eebe077f98	Merge pull request #1476 from prometheus/beorn7/makefile Use UTC for build timestamp	2016-03-08 18:18:54 +01:00
beorn7	6ba379e256	Use UTC for build timestamp	2016-03-08 17:47:17 +01:00
beorn7	d77d625ad3	Merge branch 'master' into beorn7/storage6	2016-03-08 17:39:14 +01:00
Brian Brazil	84c421da8e	Merge pull request #1475 from prometheus/fabxc/targetsort Sort exported targets	2016-03-08 16:24:55 +00:00
Fabian Reinartz	f2e359962c	Sort exported targets	2016-03-08 17:12:27 +01:00
Fabian Reinartz	eb915ec40f	Merge pull request #1474 from prometheus/fabxc/spinfix Handle closed target provider channel	2016-03-08 17:02:05 +01:00
Fabian Reinartz	56fc9bdff3	Handle closed target provider channel This fixes the case where a target provider closes the update channel and exits before the context is canceled. This should only be true for the static provider but it's safer to generally handle this case.	2016-03-08 15:49:03 +01:00
Tobias Schmidt	2f151d02eb	Merge pull request #1456 from prometheus/validate-alertmanager-url Validate alertmanager URL	2016-03-07 20:09:46 -05:00
Tobias Schmidt	7763bbd993	Validate alertmanager URL	2016-03-07 20:07:17 -05:00
beorn7	167b83695c	Merge branch 'beorn7/storage5' into beorn7/storage6	2016-03-08 00:20:44 +01:00
beorn7	01795382c9	Merge branch 'beorn7/storage4' into beorn7/storage5	2016-03-08 00:20:13 +01:00
beorn7	c01658e20d	Merge branch 'beorn7/storage3' into beorn7/storage4	2016-03-08 00:18:00 +01:00
beorn7	f138847d31	Merge branch 'beorn7/storage2' into beorn7/storage3	2016-03-08 00:17:33 +01:00
beorn7	f7fc542db6	Merge branch 'master' into beorn7/storage4 Conflicts: storage/local/persistence.go	2016-03-08 00:14:00 +01:00
beorn7	3d86130d8c	Merge branch 'master' into beorn7/storage3	2016-03-07 23:39:12 +01:00
beorn7	1f30c8de8d	Merge branch 'master' into beorn7/storage2	2016-03-07 23:38:42 +01:00
beorn7	c13b1ecfe9	Make chunk iterators more DRY This finally extracts all the common code of the two chunk iterators into one. Any future chunk encodings with fast access by index can use the same iterator by simply providing an indexAccessor. Other future chunk encodings without fast index access (like Gorilla-style) can still implement the chunkIterator interface as usual.	2016-03-07 20:23:14 +01:00
beorn7	32f280a3cd	Slim down the chunkIterator interface For one, remove unneeded methods. Then, instead of using a channel for all values, use a bufio.Scanner-like interface. This removes the need for creating a goroutine and avoids the (unnecessary) locking performed by channel sending and receiving. This will make it much easier to write new chunk implementations (like Gorilla-style encoding).	2016-03-07 19:50:13 +01:00
Björn Rabenstein	1bd4c92e1f	Merge pull request #1457 from prometheus/beorn7/promtool Add a command to promtool that dumps metadata of heads.db	2016-03-07 17:22:48 +01:00
beorn7	b6fdb355d7	Move dump-heads into its own tool	2016-03-07 16:30:19 +01:00
beorn7	f193f2b8ef	Add a command to promtool that dumps metadata of heads.db I needed this today for debugging. It can certainly be improved, but it's already quite helpful. I refactored the reading of heads.db files out of persistence, which is an improvement, too. I made minor changes to the cli package to allow outputting via the io.Writer interface.	2016-03-07 16:21:57 +01:00
Fabian Reinartz	6bbb4af837	Merge pull request #1465 from prometheus/beorn7/fix-test2 Fix flaky file-sd test	2016-03-07 15:46:18 +01:00
beorn7	d44b83690e	Fix flaky file-sd test	2016-03-07 15:39:18 +01:00
Björn Rabenstein	2a2cc52828	Merge pull request #1405 from prometheus/beorn7/storage Streamline series iterator creation	2016-03-07 13:30:56 +01:00
Fabian Reinartz	5b9e85e556	Merge pull request #1404 from prometheus/scraperef2 Retrieval refactoring	2016-03-06 22:17:00 +01:00
Fabian Reinartz	6ceb7e7887	Merge pull request #1463 from mischief/linuxisms scripts: drop -f from hostname, openbsd does not support it	2016-03-05 08:57:18 +01:00
Nick Owens	53777e7bc4	scripts: drop -f from hostname, openbsd does not support it	2016-03-04 19:59:28 -08:00
Fabian Reinartz	8d2a73aff0	Merge pull request #1451 from pdbogen/origin/1446 rewrite operator balancing to be recursive	2016-03-03 19:42:17 +01:00
Patrick Bogen	250344b344	use short variable assignment	2016-03-03 09:46:50 -08:00
beorn7	fc7de5374a	Quarantine series upon problem writing to the series file This fixes https://github.com/prometheus/prometheus/issues/1059 , but not in the obvious way (simply not updating the persist watermark, because that's actually not that simple - we don't really know what has gone wrong exactly). As any errors relevant here are most likely caused by severe and unrecoverable problems with the series file, using the new quarantine feature is the right step. We don't really have to be worried about any inconsistent state of the series because it will be removed for good ASAP. Another plus is that we don't have to declare the whole storage dirty anymore.	2016-03-03 13:15:02 +01:00
Fabian Reinartz	29e31dc3c6	Merge pull request #1452 from prometheus/fix-style-checker Detect code style violations in deeply nested files	2016-03-03 09:22:56 +01:00
Tobias Schmidt	d7889e61bb	Detect code style violations in deeply nested files So far the style check did not recognize issues in files in deeply nested directories, e.g. retrieval/discovery/kubernetes/discovery.go.	2016-03-03 02:21:16 -05:00
Patrick Bogen	2062fbae0f	rewrite operator balancing to be recursive	2016-03-02 15:56:40 -08:00
beorn7	0ea5801e47	Handle errors caused by data corruption more gracefully This requires all the panic calls upon unexpected data to be converted into errors returned. This pollutes the function signatures quite a lot. Well, this is Go... The ideas behind this are the following: - panic only if it's a programming error. Data corruptions happen, and they are not programming errors. - If we detect a data corruption, we "quarantine" the series, essentially removing it from the database and putting its data into a separate directory for forensics. - Failure during writing to a series file is not considered corruption automatically. It will call setDirty, though, so that a crash recovery upon the next restart will commence and check for that. - Series quarantining and setDirty calls are logged and counted in metrics, but are hidden from the user of the interfaces in interface.go, with the notable exception of Append(). The reasoning is that we treat corruption by removing the corrupted series, i.e. a query for it will return no results on its next call anyway, so return no results right now. In the case of Append(), we want to tell the user that no data has been appended, though. Minor side effects: - Now consistently using filepath.* instead of path.*. - Introduced structured logging where I touched it. This makes things less consistent, but a complete change to structured logging would be out of scope for this PR.	2016-03-02 23:02:34 +01:00
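
Commit 32f280a3cd in the list above describes replacing a channel-based chunk iterator with a bufio.Scanner-like interface, which avoids a goroutine and the locking that channel sends and receives imply. The following is only a minimal Go sketch of that general pattern. The names samplePair, sampleIterator, sliceIterator, and sumValues are hypothetical and are not taken from the Prometheus code base.

```go
package main

import "fmt"

// samplePair is a hypothetical stand-in for one timestamped sample.
type samplePair struct {
	Timestamp int64
	Value     float64
}

// sampleIterator is a hypothetical bufio.Scanner-style iterator:
// call Scan until it returns false, read Value after each successful
// Scan, and check Err once at the end.
type sampleIterator interface {
	Scan() bool
	Value() samplePair
	Err() error
}

// sliceIterator walks an in-memory slice. It needs no goroutine and
// no channel, so iteration involves no synchronization.
type sliceIterator struct {
	samples []samplePair
	pos     int        // index of the next sample to return
	cur     samplePair // sample exposed by Value
}

// Scan advances to the next sample and reports whether one exists.
func (it *sliceIterator) Scan() bool {
	if it.pos >= len(it.samples) {
		return false
	}
	it.cur = it.samples[it.pos]
	it.pos++
	return true
}

// Value returns the sample selected by the last successful Scan.
func (it *sliceIterator) Value() samplePair { return it.cur }

// Err always returns nil here; an iterator decoding real chunk data
// would report corruption through it.
func (it *sliceIterator) Err() error { return nil }

// sumValues consumes any sampleIterator and adds up the sample values.
func sumValues(it sampleIterator) (float64, error) {
	var sum float64
	for it.Scan() {
		sum += it.Value().Value
	}
	return sum, it.Err()
}

func main() {
	it := &sliceIterator{samples: []samplePair{{1, 1.5}, {2, 2.5}}}
	sum, err := sumValues(it)
	fmt.Println(sum, err)
}
```

Deferring the error to a single Err call keeps the loop body free of error handling, which is the same trade-off bufio.Scanner makes in the Go standard library.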
