prometheus

Commit Graph

Author	SHA1	Message	Date
beorn7	5c41ca84e5	Catch negative staleness delta set on the command line	2016-11-01 15:17:59 +01:00
Brian Brazil	6bc29ba857	Fix regression from #1957, specify non-zero default timeout. (#2121) Fixes #2075	2016-10-26 14:47:41 +01:00
Julius Volz	ab80ced756	storage: separate chunk package, publish more names This is a followup to https://github.com/prometheus/prometheus/pull/2011. This publishes more of the methods and other names of the chunk code and moves the chunk code to its own package. There's some unavoidable ugliness: the chunk and chunkDesc metrics are used by both packages, so I had to move them to the chunk package. That isn't great, but I don't see how to do it better without a larger redesign of everything. Same for the evict requests and some other types.	2016-09-26 13:25:11 +02:00
Fabian Reinartz	57b358b82a	vendor: update govalidator (#2023) Fixes #2022	2016-09-23 01:06:51 +02:00
Matt Bostock	dd98766b32	cmd/prometheus/main.go: Fix typo in comment	2016-09-21 21:59:25 +01:00
Tom Wilkie	4520e12440	Add HTTP Basic Auth & TLS support to the generic write path. (#1957) * Add config, HTTP Basic Auth and TLS support to the generic write path. - Move generic write path configuration to the config file - Factor out config.TLSConfig -> tls.Config translation - Support TLSConfig for generic remote storage - Rename Run to Start, and make it non-blocking. - Dedupe code in httputil for TLS config. - Make remote queue metrics global.	2016-09-19 22:47:51 +02:00
Julius Volz	c187308366	storage: Contextify storage interfaces. This is based on https://github.com/prometheus/prometheus/pull/1997. This adds contexts to the relevant Storage methods and already passes PromQL's new per-query context into the storage's query methods. The immediate motivation is supporting multi-tenancy in Frankenstein, but this could also be used by Prometheus's normal local storage to support cancellations and timeouts at some point.	2016-09-19 16:29:07 +02:00
Julius Volz	ed5a0f0abe	promql: Allow per-query contexts. For Weaveworks' Frankenstein, we need to support multitenancy. In Frankenstein, we initially solved this without modifying the promql package at all: we constructed a new promql.Engine for every query and injected a storage implementation into that engine which would be primed to only collect data for a given user. This is problematic to upstream, however. Prometheus assumes that there is only one engine: the query concurrency gate is part of the engine, and the engine contains one central cancellable context to shut down all queries. Also, creating a new engine for every query seems like overkill. Thus, we want to be able to pass per-query contexts into a single engine. This change gets rid of the promql.Engine's built-in base context and allows passing in a per-query context instead. Central cancellation of all queries is still possible by deriving all passed-in contexts from one central one, but this is now the responsibility of the caller. The central query context is now created in main() and passed into the relevant components (web handler / API, rule manager). In a next step, the per-query context would have to be passed to the storage implementation, so that the storage can implement multi-tenancy or other features based on the contextual information.	2016-09-19 15:38:17 +02:00
Julius Volz	5f5a78e807	Merge pull request #1974 from prometheus/disable-local-storage Allow disabling local storage.	2016-09-17 18:40:01 +02:00
Tom Wilkie	d83879210c	Switch back to protos over HTTP, instead of GRPC. My aim is to support the new grpc generic write path in Frankenstein. On the surface this seems easy - however I've hit a number of problems that make me think it might be better to not use grpc just yet. The explanation of the problems requires a little background. At weave, traffic to frankenstein needs to go through a couple of services first, for SSL and to be authenticated. So traffic goes: internet -> frontend -> authfe -> frankenstein - The frontend is Nginx, and adds/removes SSL. It's done this way for legacy reasons, so the certs can be managed in one place, although eventually we imagine we'll merge it with authfe. All traffic from frontend is sent to authfe. - Authfe checks the auth tokens / cookie etc and then picks the service to forward the RPC to. - Frankenstein accepts the reads and does the right thing with them. First problem I hit was Nginx won't proxy http2 requests - it can accept them, but all calls downstream are http1 (see https://trac.nginx.org/nginx/ticket/923). This wasn't such a big deal, so it now looks like: internet --(grpc/http2)--> frontend --(grpc/http1)--> authfe --(grpc/http1)--> frankenstein Next problem was golang grpc server won't accept http1 requests (see https://groups.google.com/forum/#!topic/grpc-io/JnjCYGPMUms). It is possible to link a grpc server in with a normal go http mux, as long as the mux server is serving over SSL, as the golang http client & server won't do http2 over anything other than an SSL connection. This would require making all our service to service comms SSL. So I had a go at writing a grpc http1 server, and got pretty far. But it was a bit of a mess. So finally I thought I'd make a separate grpc frontend for this, running in parallel with the frontend/authfe combo on a different port - and first up I'd need a grpc reverse proxy. Ideally we'd have some nice, generic reverse proxy that only knew about a map from service names -> downstream service, and didn't need to decode & re-encode every request as it went through. It seems like this can't be done with golang's grpc library - see https://github.com/mwitkow/grpc-proxy/issues/1. And then I was surprised to find you can't do grpc from browsers! See http://www.grpc.io/faq/ - not important to us, but I'm starting to question why we decided to use grpc in the first place? It would seem we could have most of the benefits of grpc with protos over HTTP, and this wouldn't preclude moving to grpc when it's a bit more mature? In fact, the gRPC FAQ even admits as much: > Why is gRPC better than any binary blob over HTTP/2? > This is largely what gRPC is on the wire.	2016-09-15 23:21:54 +01:00
Tobias Schmidt	29ced0090f	Fix common English misspellings	2016-09-14 23:23:28 -04:00
Julius Volz	b24e5d63bc	Add noop local storage engine. This adds a flag -storage.local.engine which allows turning off local storage in Prometheus. Instead of adding if-conditions and nil checks to all parts of Prometheus that deal with Prometheus's local storage (including the web interface), disabling local storage simply means replacing the normal local storage with a noop version that throws samples away and returns empty query results. We also don't add the noop storage to the fanout appender to decrease internal overhead. Instead of returning empty results, an alternate behavior could be to return errors on any query that point out that the local storage is disabled. Not sure which one is more preferable, so I went with the empty result option for now.	2016-09-14 13:18:05 +02:00
Julius Volz	a88e950d1f	Mark remote write address flag as experimental.	2016-09-01 00:58:53 +02:00
Julius Volz	aa3f2b7216	Generic write cleanups and changes. - fold metric name into labels - return initialization errors back to main - add snappy compression - better context handling - pre-allocation of labels - remove generic naming - other cleanups	2016-08-30 17:24:48 +02:00
Brian Brazil	36d2c4bd0b	Add generic write path using grpc. This uses a new proto format, with scope for multiple samples per timeseries in future. This will allow users to pump samples out to whatever they like without having to change the core Prometheus code. There's also an example receiver to save users figuring out the boilerplate themselves.	2016-08-30 17:19:18 +02:00
Julius Volz	4a866c13be	Fix ApplyConfig() error handling Currently, Prometheus starts up without any error when there is an invalid rule file :-/	2016-08-13 00:59:02 +02:00
Julius Volz	08891beb5f	Merge pull request #1828 from drawks/iss-1821 Error on non-flag commandline arguments	2016-07-21 00:35:53 +02:00
Björn Rabenstein	12709af249	Merge pull request #1838 from prometheus/release-1.0 Explicitly add logging flags to our custom flag set	2016-07-21 00:33:12 +02:00
Dave Rawks	00ea36cdbe	Error on non-flag commandline arguments - Added minor cmdline parsing logic change to bail on unconsumed arguments. Fixes #1821	2016-07-20 10:28:26 -07:00
beorn7	bf6201483c	Improve wording on log flag comment	2016-07-20 17:32:42 +02:00
beorn7	25385aafcb	Explicitly add logging flags to our custom flag set In https://github.com/prometheus/prometheus/pull/1782, we moved to a custom flag set to avoid getting test flags into the main prometheus binary. However, that removed the logging flags, too. This commit updates the vendoring to a version of the log package that allows adding the log flags to our flag set explicitly.	2016-07-20 17:27:39 +02:00
Dmitry Vorobev	273e457da4	web: return status code and error message for config resource	2016-07-15 10:15:24 +02:00
Fabian Reinartz	59d26e8536	web: add -web.route-prefix flag Fixes #1191	2016-07-07 11:49:16 +02:00
Fabian Reinartz	8c24dfdb86	cmd/prometheus: use own flag set Fixes #1743	2016-07-03 14:23:31 +02:00
Fabian Reinartz	dd57e7ef5c	Merge pull request #1699 from prometheus/fabxc-multiam notifier: dispatch to multiple Alertmanagers	2016-06-06 12:01:41 +02:00
Fabian Reinartz	9baf120cd5	notifier: dispatch to multiple Alertmanagers This commit extends the notifier to dispatch alert batches to multiple Alertmanagers concurrently. It changes the `-alertmanager.url` flag to accept a comma separated list of URLs and/or to be set multiple times.	2016-06-06 11:41:10 +02:00
beorn7	99881ded63	Make the number of fingerprint mutexes configurable With a lot of series accessed in a short timeframe (by a query, a large scrape, checkpointing, ...), there is actually quite a significant amount of lock contention if something similar is running at the same time. In those cases, the number of locks needs to be increased. On the same front, as our fingerprints don't have a lot of entropy, I introduced some additional shuffling. With the current state, only changes in the least significant bits of a FP would matter.	2016-06-02 19:18:00 +02:00
beorn7	da8cb10b43	Partition the status tab into items in a dropdown I got feedback from different sources about rules and targets being too heavy in the status tab if there are lots of them. This change also allows for more fine-granular locking.	2016-05-18 18:13:55 +02:00
Steve Durrheimer	399d5c6375	Make version information consistent between prometheus components	2016-05-05 22:33:18 +02:00
beorn7	865d16f870	Rename Gorilla into varbit	2016-03-23 16:30:41 +01:00
beorn7	8cdced3850	Implement Gorilla-inspired chunk encoding This is not a verbatim implementation of the Gorilla encoding. First of all, it could not, even if we wanted, because Prometheus has a different chunking model (constant size, not constant time). Second, this adds a number of changes that improve the encoding in general or at least for the specific use case of Prometheus (and are partially only possible in the context of Prometheus). See comments in the code for details.	2016-03-17 14:47:08 +01:00
Tobias Schmidt	2f151d02eb	Merge pull request #1456 from prometheus/validate-alertmanager-url Validate alertmanager URL	2016-03-07 20:09:46 -05:00
Tobias Schmidt	7763bbd993	Validate alertmanager URL	2016-03-07 20:07:17 -05:00
beorn7	b6fdb355d7	Move dump-heads into its own tool	2016-03-07 16:30:19 +01:00
beorn7	f193f2b8ef	Add a command to promtool that dumps metadata of heads.db I needed this today for debugging. It can certainly be improved, but it's already quite helpful. I refactored the reading of heads.db files out of persistence, which is an improvement, too. I made minor changes to the cli package to allow outputting via the io.Writer interface.	2016-03-07 16:21:57 +01:00
Fabian Reinartz	bfa8aaa017	Rename notification to notifier	2016-03-01 12:39:08 +01:00
Fabian Reinartz	fce17b41c5	Merge pull request #1408 from prometheus/hostname Log argument parse errors	2016-02-19 12:22:12 +01:00
Fabian Reinartz	e62677d7ba	Log argument parse errors Fixes #1407	2016-02-19 12:20:10 +01:00
Ignacio Carbajo	6a323b1e6d	Fix minor typo	2016-02-17 22:52:44 +00:00
beorn7	ec08c9a391	Rework the way to communicate backpressure (AKA suspended ingestion) This gives up on the idea to communicate through the Append() call (by either not returning as it is now or returning an error as suggested/explored elsewhere). Here I have added a Throttled() call, which has the advantage that it can be called before a whole _batch_ of Append()'s. Scrapes will happen completely or not at all. Same for rule group evaluations. That's a highly desired behavior (as discussed elsewhere). The code is even simpler now as the whole ingestion buffer could be removed. Logging of throttled mode has been streamlined and will create at most one message per minute.	2016-02-01 14:45:44 +01:00
Fabian Reinartz	d9f836e5b8	Merge pull request #1340 from prometheus/validate-externa-url Validate URL parameters	2016-01-27 15:49:08 +01:00
beorn7	a2cd479058	Fix calculation of chunks to persist after restart Since we are not overestimating the number of chunks to persist anymore, this commit also adjusts the default value for -storage.local.memory-chunks. Update of documentation will follow.	2016-01-25 19:33:51 +01:00
Tobias Schmidt	122d73858d	Validate URL parameters	2016-01-25 00:37:09 -05:00
Julius Volz	b150c5768c	Add missing word in comment.	2016-01-21 01:37:08 +01:00
Fabian Reinartz	7e1b39c682	Fix startup/teardown order, add documentation	2016-01-18 17:34:25 +01:00
beorn7	4221c7de5c	Improve handling of series file truncation If only very few chunks are to be truncated from a very large series file, the rewrite of the file is a large overhead. With this change, a certain ratio of the file has to be dropped to make it happen. While only causing disk overhead at about the same ratio (by default 10%), it will cut down I/O by a lot in the above scenario.	2016-01-11 16:42:10 +01:00
Fabian Reinartz	37d80c4b25	Fix premature rule evaluation This commit prevents rule evaluation from starting until after the storage is ready.	2016-01-08 17:51:22 +01:00
Richard Hartmann	7da42eee6e	main.go: Remove warning about external_labels	2016-01-07 11:15:14 +01:00
Julius Volz	87d1831f12	Document INFLUXDB_PW env var in username flag Fixes https://github.com/prometheus/prometheus/issues/1281	2016-01-04 00:18:41 +01:00
Fabian Reinartz	62075aa037	Reduce noisy no-alertmanager warning	2015-12-17 15:42:26 +01:00


87 Commits