prometheus

Commit Graph

Author	SHA1	Message	Date
Mateusz Gozdek	0bfef847b0	discovery/consul: fix leaking goroutine from test Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-10 09:40:43 +01:00
Conor Evans	c28b9a0574	Add datacenter to Consul service discovery logs (#9668) * add datacenter to consul service discovery logs Signed-off-by: Conor Evans <coevans@tcd.ie>	2021-11-08 09:34:21 +01:00
Mateusz Gozdek	1a6c2283a3	Format Go source files using 'gofumpt -w -s -extra' Part of #9557 Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>	2021-11-02 19:52:34 +01:00
Julien Pivotto	63b3e4e5ec	Enable HTTP2 again (#9398) We are re-enabling HTTP 2 again. There has been a few bugfixes upstream in go, and we have also enabled ReadIdleTimeout. Fix #7588 Fix #9068 Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2021-09-26 23:16:12 +02:00
jinglina	ed24e51e7c	remove redundant type conversion (#9126) Signed-off-by: jinglina <jinglinax@163.com>	2021-07-28 13:33:46 +05:30
Levi Harrison	faed8df31d	Enable reading consul token from file (#8926) * Adopted common http client Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-06-12 00:06:59 +02:00
Levi Harrison	b5f6f8fb36	Switched to go-kit/log Signed-off-by: Levi Harrison <git@leviharrison.dev>	2021-06-11 12:28:36 -04:00
Austin Cawley-Edwards	301815e48b	Update prometheus-common and the consul HTTP client (#8913) * Update to prometheus-common@v0.29.0 Signed-off-by: austin ce <austin.cawley@gmail.com>	2021-06-11 14:24:41 +02:00
Frederic Hemberger	39a87fd9d2	consul_sd: Add namespace support for Consul Enterprise Signed-off-by: Frederic Hemberger <mail@frederic-hemberger.de>	2021-06-09 16:35:02 +02:00
songjiayang	b781b5cac5	Refactor file discovery init function (#8891) * Refactor file discovery init function Combine to one init function like other discovery. Signed-off-by: songjiayang <songjiayang1@gmail.com>	2021-06-04 14:43:24 +02:00
Nick Triller	fddf4918c0	Send empty targetgroup if nothing discovered Signed-off-by: Nick Triller <nicktriller@gmail.com>	2021-04-29 09:06:52 +02:00
Julien Pivotto	6c56a1faaa	Testify: move to require (#8122) * Testify: move to require Moving testify to require to fail tests early in case of errors. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * More moves Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-29 09:43:23 +00:00
Julien Pivotto	1282d1b39c	Refactor test assertions (#8110) * Refactor test assertions This pull request gets rid of assert.True where possible to use fine-grained assertions. Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-27 11:06:53 +01:00
Julien Pivotto	4e5b1722b3	Move away from testutil, refactor imports (#8087) Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-10-22 11:00:08 +02:00
johncming	a5beb627ff	some fixies for consul sd. (#7799) * discovery/consul: make duration more accurate. Signed-off-by: johncming <johncming@yahoo.com> * discovery/consul: fix bug when context done. Signed-off-by: johncming <johncming@yahoo.com>	2020-08-25 15:46:14 +02:00
Andy Bursavich	4e6a94a27d	Invert service discovery dependencies (#7701) This also fixes a bug in query_log_file, which now is relative to the config file like all other paths. Signed-off-by: Andy Bursavich <abursavich@gmail.com>	2020-08-20 13:48:26 +01:00
Julien Pivotto	9da53391d1	Merge pull request #7739 from prometheus/release-2.20 Merge release-2.20 into the main branch after Consul fix	2020-08-04 20:15:43 +02:00
Julien Pivotto	3a7120bc07	Consul: Reduce WatchTimeout to 2m and set it as timeout for requests Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-08-03 00:42:55 +02:00
Julien Pivotto	93e9c010f3	Add more Go leak tests (#7652) * Implement go leak test for promql Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Implement go leak test for Consul SD Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu> * Implement go leak test in discovery manager Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>	2020-07-24 10:10:20 +01:00
John Bampton	98a69b77d1	Fix spelling (#7512) Signed-off-by: John Bampton <jbampton@users.noreply.github.com>	2020-07-04 14:54:26 +02:00
Pierre Souchay	1508678001	Use 10m timeouts for watches (#7423) use ?wait=10m will give results as fast as usual when data is changing but will perform far less requests when services do not change. On large infrastructure, this will reduce quite a lot the number of qps on Consul servers while having the same performance for freshness of results. Signed-off-by: Pierre Souchay <p.souchay@criteo.com>	2020-06-20 20:22:45 +01:00
Mathilde Gilles	9b9c58aea8	[Consul] Add health label to metrics (#5313) Label metrics with the target health using consul's /health endpoint. Signed-off-by: Mathilde Gilles <m.gilles@criteo.com>	2020-02-25 13:32:30 +00:00
Simon Pasquier	fe76ccbfe3	discovery/consul: fix logging of tags (#6783) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2020-02-13 13:11:44 +01:00
Ben Ye	60527de355	keep consul service metrics in global variables (#6764) Signed-off-by: yeya24 <yb532204897@gmail.com>	2020-02-06 05:48:58 +00:00
Josh Soref	91d76c8023	Spelling (#6517 ) * spelling: alertmanager Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: attributes Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: autocomplete Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: bootstrap Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: caught Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: chunkenc Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: compaction Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: corrupted Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: deletable Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: expected Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: fine-grained Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: initialized Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: iteration Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: javascript Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: multiple Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: number Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: overlapping Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: possible Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: postings Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: procedure Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: programmatic Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: queuing Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: querier Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: repairing Signed-off-by: Josh Soref 
<jsoref@users.noreply.github.com> * spelling: received Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: reproducible Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: retention Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: sample Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: segements Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: semantic Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: software [LICENSE] Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: staging Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: timestamp Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: unfortunately Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: uvarint Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: subsequently Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> * spelling: ressamples Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>	2020-01-02 15:54:09 +01:00
Jean-Baptiste Le Duigou	5973227434	adding additional unit tests for getDataCenter() in consul (#6192 ) * adding additional unit tests for getDataCenter() in consul Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com> * Consult Tests : update comments to start with uppercase and end with point Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com> * Consult Test : using table-driven tests Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com> * Consul Test : cleaner syntax Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com> * Consul Test : even cleaner syntax Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com> * Consul Test : update comments Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com> * Fixing naming convention by removing underscore in function name Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com> * Removing duplicated test case for getDatacenter() Signed-off-by: Jean-Baptiste Le Duigou <jb.leduigou@gmail.com>	2019-11-15 14:52:39 +01:00
Simon Pasquier	19ce6b7f5f	discovery: fix more error logs on context cancelation (#6133) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-10-18 11:48:51 +02:00
Ganesh Vernekar	5ecef3542d	Cleanup after merging tsdb into prometheus Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>	2019-08-13 14:04:14 +05:30
AllenZMC	41151ca8dc	fix mis-spelling in consul_test.go (#5836) Signed-off-by: czm <zhongming.chang@daocloud.io>	2019-08-06 06:11:41 +01:00
beorn7	dd81912554	Add objectives to Summaries With the next release of client_golang, Summaries will not have objectives by default. To not lose the objectives we have right now, explicitly state the current default objectives. Signed-off-by: beorn7 <beorn@grafana.com>	2019-06-12 02:03:13 +02:00
Simon Pasquier	45506841e6	*: enable all default linters (#5504) Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-05-03 15:11:28 +02:00
Tariq Ibrahim	8fdfa8abea	refine error handling in prometheus (#5388) i) Uses the more idiomatic Wrap and Wrapf methods for creating nested errors. ii) Fixes some incorrect usages of fmt.Errorf where the error messages don't have any formatting directives. iii) Does away with the use of fmt package for errors in favour of pkg/errors Signed-off-by: tariqibrahim <tariq181290@gmail.com>	2019-03-26 00:01:12 +01:00
Simon Pasquier	782d00059a	discovery: factorize for SD based on refresh (#5381) * discovery: factorize for SD based on refresh Signed-off-by: Simon Pasquier <spasquie@redhat.com> * discovery: use common metrics for refresh Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-03-25 11:54:22 +01:00
Mario Trangoni	5354ffff99	Fix some spelling issues (#5361) See, $ codespell -S './vendor/,./.git,./web/ui/static/vendor*' --ignore-words-list="uint,dur,ue,iff,te,wan" Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>	2019-03-14 14:38:54 +00:00
Callum Styan	83c46fd549	update Consul vendor code so that catalog.ServiceMultipleTags can be (#5151) Signed-off-by: Callum Styan <callumstyan@gmail.com>	2019-03-12 10:31:27 +00:00
Simon Pasquier	f9462d5d44	discovery/consul: pass current context to Consul queries Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-02-18 14:23:56 +01:00
Simon Pasquier	f678e27eb6	*: use latest release of staticcheck (#5057) * *: use latest release of staticcheck It also fixes a couple of things in the code flagged by the additional checks. Signed-off-by: Simon Pasquier <spasquie@redhat.com> * Use official release of staticcheck Also run 'go list' before staticcheck to avoid failures when downloading packages. Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2019-01-04 14:47:38 +01:00
Samuel Alfageme	240321acee	Add taggedAddress to the labels in ConsulSD (#5001) Useful when multiple (tagged) addresses for a node are exposed on the catalog API Ref. https://www.consul.io/api/catalog.html#taggedaddresses Signed-off-by: Samuel Alfageme <samuel@alfage.me>	2018-12-18 11:51:05 +01:00
Ben Kochie	c6399296dc	Fix spelling/typos (#4921) * Fix spelling/typos Fix spelling/typos reported by codespell/misspell. * UK -> US spelling changes. Signed-off-by: Ben Kochie <superq@gmail.com>	2018-11-27 17:44:29 +01:00
Simon Pasquier	1cd29f782c	discovery/consul: close idle connections on stop Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2018-08-01 17:26:52 +02:00
Romain Baugue	b41be4ef52	Discovery consul service meta (#4280) * Upgrade Consul client * Add ServiceMeta to the labels in ConsulSD Signed-off-by: Romain Baugue <romain.baugue@elwinar.com>	2018-07-18 05:06:56 +01:00
Julius Volz	5cf0113762	Add "omitempty" to some SD config YAML field tags (#4338) Especially for Kubernetes SD, this fixes a bug where the rendered configuration says "api_server: null", which when read back is not interpreted as an un-set API server (thus the default is not applied). Signed-off-by: Julius Volz <julius.volz@gmail.com>	2018-07-03 13:43:41 +02:00
Elif T. Kuş	57dcdfb15f	Rewrote tests with testutil for several test files (#4086) * promql: Rewrote tests with testutil for functions_test Signed-off-by: Elif T. Kuş <elifkus@gmail.com> * pkg/relabel: Rewrote tests with testutil for relabel_test Signed-off-by: Elif T. Kuş <elifkus@gmail.com> * discovery/consul: Rewrote tests with testutil for consul_test Signed-off-by: Elif T. Kuş <elifkus@gmail.com> * scrape: Rewrote tests with testutil for manager_test Signed-off-by: Elif T. Kuş <elifkus@gmail.com>	2018-04-27 13:11:16 +01:00
Adam Shannon	809881d7f5	support reading basic_auth password_file for HTTP basic auth (#4077) Issue: https://github.com/prometheus/prometheus/issues/4076 Signed-off-by: Adam Shannon <adamkshannon@gmail.com>	2018-04-25 18:19:06 +01:00
sev3ryn	cc917aee7f	fix of endless loop while doing Consul service discovery. (#4044) Reloading Prometheus configs doesn't make loop end. It produced a goroutine leak	2018-04-05 10:41:09 +01:00
Manos Fokas	25f929b772	Yaml UnmarshalStrict implementation. (#4033) * Updated yaml vendor package. * remove checkOverflow duplicate in rulefmt * remove duplicated HTTPClientConfig.Validate() * Added yaml static check.	2018-04-04 09:07:39 +01:00
Corentin Chary	60dafd425c	consul: improve consul service discovery (#3814)

* consul: improve consul service discovery

Related to #3711

- Add the ability to filter by tag and node-meta in an efficient way (`/catalog/services` allows filtering by node-meta, and returns a `map[string]string` or `service`->`tags`). Tags and node-meta are also used in `/catalog/service` requests.
- Do not require a call to the catalog if services are specified by name. This is important because on a large cluster `/catalog/services` changes all the time.
- Add `allow_stale` configuration option to do stale reads. Non-stale reads can be costly, even more when you are doing them to a remote datacenter with 10k+ targets over WAN (which is common for federation).
- Add `refresh_interval` to minimize the strain on the catalog and on the service endpoint. This is needed because of that kind of behavior from consul: https://github.com/hashicorp/consul/issues/3712 and because a catalog on a large cluster would basically change all the time. No need to discover targets in 1sec if we scrape them every minute.
- Added plenty of unit tests.

Benchmarks
----------

```yaml
scrape_configs:
  - job_name: prometheus
    scrape_interval: 60s
    static_configs:
      - targets: ["127.0.0.1:9090"]

  - job_name: "observability-by-tag"
    scrape_interval: "60s"
    metrics_path: "/metrics"
    consul_sd_configs:
      - server: consul.service.par.consul.prod.crto.in:8500
        tag: marathon-user-observability  # Used in After
        refresh_interval: 30s             # Used in After+delay
    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: ^(.,)?marathon-user-observability(,.)?$
        action: keep

  - job_name: "observability-by-name"
    scrape_interval: "60s"
    metrics_path: "/metrics"
    consul_sd_configs:
      - server: consul.service.par.consul.prod.crto.in:8500
        services:
          - observability-cerebro
          - observability-portal-web

  - job_name: "fake-fake-fake"
    scrape_interval: "15s"
    metrics_path: "/metrics"
    consul_sd_configs:
      - server: consul.service.par.consul.prod.crto.in:8500
        services:
          - fake-fake-fake
```

Note: tested with ~1200 services, ~5000 nodes.

| Resource | Empty | Before | After | After + delay |
| -------- |:-----:|:------:|:-----:|:-------------:|
|/service-discovery size|5K|85MiB|27k|27k|27k|
|`go_memstats_heap_objects`|100k|1M|120k|110k|
|`go_memstats_heap_alloc_bytes`|24MB|150MB|28MB|27MB|
|`rate(go_memstats_alloc_bytes_total[5m])`|0.2MB/s|28MB/s|2MB/s|0.3MB/s|
|`rate(process_cpu_seconds_total[5m])`|0.1%|15%|2%|0.01%|
|`process_open_fds`|16|1236|22|22|
|`rate(prometheus_sd_consul_rpc_duration_seconds_count{call="services"}[5m])`|~0|1|1|0.03|
|`rate(prometheus_sd_consul_rpc_duration_seconds_count{call="service"}[5m])`|0.1|80|0.5|0.5|
|`prometheus_target_sync_length_seconds{quantile="0.9",scrape_job="observability-by-tag"}`|N/A|200ms|0.2ms|0.2ms|
|Network bandwidth|~10kbps|~2.8Mbps|~1.6Mbps|~10kbps|

Filtering by tag using relabel_configs uses 100kiB and 23kiB/s per service per job and quite a lot of CPU. It also sends an additional 1Mbps of traffic to consul. Being a little bit smarter about this reduces the overhead quite a lot. Limiting the number of `/catalog/services` queries per second almost removes the overhead of service discovery.

* consul: tweak `refresh_interval` behavior

`refresh_interval` now does what is advertised in the documentation: there won't be more than one update per `refresh_interval`. It now defaults to 30s (which was also the current waitTime in the consul query). This also makes sure we don't wait another 30s if we already waited 29s in the blocking call, by subtracting the number of elapsed seconds. Hopefully this will do what people expect it does and will be safer for existing consul infrastructures.	2018-03-23 14:48:43 +00:00
zemek	8a01a0fbed	Set consul server default to localhost:8500 (#3703)	2018-01-24 12:14:32 +00:00
Shubheksha Jalan	0471e64ad1	Use shared types from the `common` repo (#3674) * refactor: use shared types from common repo, remove util/config * vendor: add common/config * fix nit	2018-01-11 16:10:25 +01:00
Callum Styan	97464236c7	comments with TargetProvider should read Discoverer instead (#3667)	2018-01-08 23:59:18 +00:00

66 Commits