prometheus

Commit Graph

Author	SHA1	Message	Date
Bryan Boreham	1e3fef6ab0	scraping: limit detail on dropped targets, to save memory (#12647 ) It's possible (quite common on Kubernetes) to have a service discovery return thousands of targets then drop most of them in relabel rules. The main place this data is used is to display in the web UI, where you don't want thousands of lines of display. The new limit is `keep_dropped_targets`, which defaults to 0 for backwards-compatibility. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2023-08-14 15:39:25 +01:00
Goutham Veeramachaneni	ad4f514e66	Add OTLP Ingestion endpoint (#12571 ) * Add OTLP Ingestion endpoint We copy files from the otel-collector-contrib. See the README in `storage/remote/otlptranslator/README.md`. This supersedes: https://github.com/prometheus/prometheus/pull/11965 Signed-off-by: gouthamve <gouthamve@gmail.com> * Return a 200 OK It is what the OTEL Golang SDK expect :( https://github.com/open-telemetry/opentelemetry-go/issues/4363 Signed-off-by: Goutham <gouthamve@gmail.com> --------- Signed-off-by: gouthamve <gouthamve@gmail.com> Signed-off-by: Goutham <gouthamve@gmail.com>	2023-07-28 12:35:28 +02:00
Julien Pivotto	c572d9d6d9	Merge pull request #11905 from charleskorn/api-response-format-extension-point Add extension point for returning different content types from API endpoints	2023-07-15 22:49:29 +02:00
Marco Pracucci	7cc4292328	Export MinTime and MaxTime Signed-off-by: Marco Pracucci <marco@pracucci.com>	2023-07-06 17:48:13 +02:00
Julien Pivotto	0186ec7873	Merge pull request #12516 from vinted/convert_queryopts_to_interface promql: convert QueryOpts to interface	2023-07-04 23:38:31 +02:00
Julien Pivotto	986fde06b2	Merge pull request #11688 from damnever/fix/datamodelvalidation-remotewriteapi Validate the metric names and labels in the remote write handler	2023-07-04 13:52:02 +02:00
Charles Korn	097faf33c6	Merge branch 'main' into api-response-format-extension-point # Conflicts: # web/api/v1/api.go # web/api/v1/api_test.go	2023-07-04 13:26:13 +10:00
Giedrius Statkevičius	3f230fc9f8	promql: convert QueryOpts to interface Convert QueryOpts to an interface so that downstream projects like https://github.com/thanos-community/promql-engine could extend the query options with engine specific options that are not in the original engine. Will be used to enable query analysis per-query. Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>	2023-07-03 16:20:31 +03:00
Julien Pivotto	e043b273a6	Merge pull request #12439 from prometheus/release-2.45 Merge release 2.45.0 back to main	2023-06-17 10:16:48 +02:00
Arthur Silva Sens	1ea477f4bc	Add feature flag to squash metadata from /api/v1/metadata (#12391 ) Signed-off-by: ArthurSens <arthursens2005@gmail.com>	2023-06-12 16:17:20 +01:00
Jesus Vazquez	bfa466d00f	Create release candidate 2.45.0-rc.0 (#12435 ) Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>	2023-06-07 12:29:04 +02:00
Baskar Shanmugam	905a0bd63a	Added 'limit' query parameter support to /api/v1/status/tsdb endpoint (#12336 ) * Added 'topN' query parameter support to /api/v1/status/tsdb endpoint Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> * Updated query parameter for tsdb status to 'limit' Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> * Corrected Stats() parameter name from topN to limit Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> * Fixed p.Stats CI failure Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com> --------- Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>	2023-05-22 14:37:07 +02:00
Vladimir Varankin	d281ebb178	web: display GOMEMLIMIT in runtime info Signed-off-by: Vladimir Varankin <vladimir@varank.in>	2023-04-23 20:24:34 +02:00
Julien Pivotto	8f1dc4a70f	Merge pull request #12248 from yeya24/consistent-response Use same error for instant and range query when 400	2023-04-21 11:44:20 +02:00
Julien Pivotto	e2512078e5	Merge pull request #12241 from mmorel-35/linter/nilerr enable gocritic, unconvert and unused linters	2023-04-20 15:13:31 +02:00
gotjosh	2f22c8b7f8	Merge pull request #12270 from prometheus/gotjosh/allow-filtering-of-rules-by-name-api Rules API: Allow filtering by rule name	2023-04-20 12:03:08 +01:00
gotjosh	e78be38cc0	don't show empty groups Signed-off-by: gotjosh <josue.abreu@gmail.com>	2023-04-20 11:20:20 +01:00
Matthieu MOREL	bae9a21200	Merge branch 'main' into linter/nilerr Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2023-04-19 19:56:39 +02:00
beorn7	5b53aa1108	style: Replace `else if` cascades with `switch` Wiser coders than myself have come to the conclusion that a `switch` statement is almost always superior to a statement that includes any `else if`. The exceptions that I have found in our codebase are just these two: * The `if else` is followed by an additional statement before the next condition (separated by a `;`). * The whole thing is within a `for` loop and `break` statements are used. In this case, using `switch` would require tagging the `for` loop, which probably tips the balance. Why are `switch` statements more readable? For one, fewer curly braces. But more importantly, the conditions all have the same alignment, so the whole thing follows the natural flow of going down a list of conditions. With `else if`, in contrast, all conditions but the first are "hidden" behind `} else if `, harder to spot and (for no good reason) presented differently from the first condition. I'm sure the aforemention wise coders can list even more reasons. In any case, I like it so much that I have found myself recommending it in code reviews. I would like to make it a habit in our code base, without making it a hard requirement that we would test on the CI. But for that, there has to be a role model, so this commit eliminates all `if else` occurrences, unless it is autogenerated code or fits one of the exceptions above. Signed-off-by: beorn7 <beorn@grafana.com>	2023-04-19 17:22:31 +02:00
beorn7	c3c7d44d84	lint: Adjust to the lint warnings raised by current versions of golint-ci We haven't updated golint-ci in our CI yet, but this commit prepares for that. There are a lot of new warnings, and it is mostly because the "revive" linter got updated. I agree with most of the new warnings, mostly around not naming unused function parameters (although it is justified in some cases for documentation purposes – while things like mocks are a good example where not naming the parameter is clearer). I'm pretty upset about the "empty block" warning to include `for` loops. It's such a common pattern to do something in the head of the `for` loop and then have an empty block. There is still an open issue about this: https://github.com/mgechev/revive/issues/810 I have disabled "revive" altogether in files where empty blocks are used excessively, and I have made the effort to add individual `// nolint:revive` where empty blocks are used just once or twice. It's borderline noisy, though, but let's go with it for now. I should mention that none of the "empty block" warnings for `for` loop bodies were legitimate. Signed-off-by: beorn7 <beorn@grafana.com>	2023-04-19 17:10:10 +02:00
gotjosh	96b6463f25	review comments Signed-off-by: gotjosh <josue.abreu@gmail.com>	2023-04-18 16:26:32 +01:00
gotjosh	f3394bf7a1	Rules API: Allow filtering by rule name Introduces support for a new query parameter in the `/rules` API endpoint that allows filtering by rule names. If all the rules of a group are filtered, we skip the group entirely. Signed-off-by: gotjosh <josue.abreu@gmail.com>	2023-04-18 10:12:08 +01:00
Ben Ye	fd3630b9a3	add ctx to QueryEngine interface Signed-off-by: Ben Ye <benye@amazon.com>	2023-04-17 21:32:38 -07:00
Matthieu MOREL	fb3eb21230	enable gocritic, unconvert and unused linters Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2023-04-13 19:20:22 +00:00
beorn7	c0879d64cf	promql: Separate `Point` into `FPoint` and `HPoint` In other words: Instead of having a “polymorphous” `Point` that can either contain a float value or a histogram value, use an `FPoint` for floats and an `HPoint` for histograms. This seemingly small change has a _lot_ of repercussions throughout the codebase. The idea here is to avoid the increase in size of `Point` arrays that happened after native histograms had been added. The higher-level data structures (`Sample`, `Series`, etc.) are still “polymorphous”. The same idea could be applied to them, but at each step the trade-offs needed to be evaluated. The idea with this change is to do the minimum necessary to get back to pre-histogram performance for functions that do not touch histograms. Here are comparisons for the `changes` function. The test data doesn't include histograms yet. Ideally, there would be no change in the benchmark result at all. First runtime v2.39 compared to directly prior to this commit: ``` name old time/op new time/op delta RangeQuery/expr=changes(a_one[1d]),steps=1-16 391µs ± 2% 542µs ± 1% +38.58% (p=0.000 n=9+8) RangeQuery/expr=changes(a_one[1d]),steps=10-16 452µs ± 2% 617µs ± 2% +36.48% (p=0.000 n=10+10) RangeQuery/expr=changes(a_one[1d]),steps=100-16 1.12ms ± 1% 1.36ms ± 2% +21.58% (p=0.000 n=8+10) RangeQuery/expr=changes(a_one[1d]),steps=1000-16 7.83ms ± 1% 8.94ms ± 1% +14.21% (p=0.000 n=10+10) RangeQuery/expr=changes(a_ten[1d]),steps=1-16 2.98ms ± 0% 3.30ms ± 1% +10.67% (p=0.000 n=9+10) RangeQuery/expr=changes(a_ten[1d]),steps=10-16 3.66ms ± 1% 4.10ms ± 1% +11.82% (p=0.000 n=10+10) RangeQuery/expr=changes(a_ten[1d]),steps=100-16 10.5ms ± 0% 11.8ms ± 1% +12.50% (p=0.000 n=8+10) RangeQuery/expr=changes(a_ten[1d]),steps=1000-16 77.6ms ± 1% 87.4ms ± 1% +12.63% (p=0.000 n=9+9) RangeQuery/expr=changes(a_hundred[1d]),steps=1-16 30.4ms ± 2% 32.8ms ± 1% +8.01% (p=0.000 n=10+10) RangeQuery/expr=changes(a_hundred[1d]),steps=10-16 37.1ms ± 2% 40.6ms ± 2% +9.64% (p=0.000 n=10+10) RangeQuery/expr=changes(a_hundred[1d]),steps=100-16 105ms ± 1% 117ms ± 1% +11.69% (p=0.000 n=10+10) RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16 783ms ± 3% 876ms ± 1% +11.83% (p=0.000 n=9+10) ``` And then runtime v2.39 compared to after this commit: ``` name old time/op new time/op delta RangeQuery/expr=changes(a_one[1d]),steps=1-16 391µs ± 2% 547µs ± 1% +39.84% (p=0.000 n=9+8) RangeQuery/expr=changes(a_one[1d]),steps=10-16 452µs ± 2% 616µs ± 2% +36.15% (p=0.000 n=10+10) RangeQuery/expr=changes(a_one[1d]),steps=100-16 1.12ms ± 1% 1.26ms ± 1% +12.20% (p=0.000 n=8+10) RangeQuery/expr=changes(a_one[1d]),steps=1000-16 7.83ms ± 1% 7.95ms ± 1% +1.59% (p=0.000 n=10+8) RangeQuery/expr=changes(a_ten[1d]),steps=1-16 2.98ms ± 0% 3.38ms ± 2% +13.49% (p=0.000 n=9+10) RangeQuery/expr=changes(a_ten[1d]),steps=10-16 3.66ms ± 1% 4.02ms ± 1% +9.80% (p=0.000 n=10+9) RangeQuery/expr=changes(a_ten[1d]),steps=100-16 10.5ms ± 0% 10.8ms ± 1% +3.08% (p=0.000 n=8+10) RangeQuery/expr=changes(a_ten[1d]),steps=1000-16 77.6ms ± 1% 78.1ms ± 1% +0.58% (p=0.035 n=9+10) RangeQuery/expr=changes(a_hundred[1d]),steps=1-16 30.4ms ± 2% 33.5ms ± 4% +10.18% (p=0.000 n=10+10) RangeQuery/expr=changes(a_hundred[1d]),steps=10-16 37.1ms ± 2% 40.0ms ± 1% +7.98% (p=0.000 n=10+10) RangeQuery/expr=changes(a_hundred[1d]),steps=100-16 105ms ± 1% 107ms ± 1% +1.92% (p=0.000 n=10+10) RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16 783ms ± 3% 775ms ± 1% -1.02% (p=0.019 n=9+9) ``` In summary, the runtime doesn't really improve with this change for queries with just a few steps. For queries with many steps, this commit essentially reinstates the old performance. This is good because the many-step queries are the one that matter most (longest absolute runtime). In terms of allocations, though, this commit doesn't make a dent at all (numbers not shown). The reason is that most of the allocations happen in the sampleRingIterator (in the storage package), which has to be addressed in a separate commit. Signed-off-by: beorn7 <beorn@grafana.com>	2023-04-13 19:25:16 +02:00
Ben Ye	fb67d368a2	use consistent error for instant and range query 400 Signed-off-by: Ben Ye <benye@amazon.com>	2023-04-11 13:45:34 -07:00
Xiaochao Dong (@damnever)	2b7202c4cc	Validate the metric names and labels in the remote write handler Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>	2023-04-05 19:09:05 +08:00
pbudner	46683eadf7	fix: advertise correct flag to enable remote write receiver Signed-off-by: pbudner <mail@pascalbudner.de>	2023-03-11 13:50:52 +01:00
Charles Korn	38c1930f48	Merge branch 'main' into api-response-format-extension-point Signed-off-by: Charles Korn <charles.korn@grafana.com>	2023-03-09 12:06:26 +11:00
Charles Korn	46a28899a0	Implement fully-featured content negotiation for API requests, and allow overriding the default API codec. Signed-off-by: Charles Korn <charles.korn@grafana.com>	2023-03-09 12:02:45 +11:00
Julien Pivotto	db2d759b81	Add support for lookbackdelta per query via the API Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2023-03-08 00:30:05 +01:00
Charles Korn	eaad7c0fc8	Merge branch 'main' into api-response-format-extension-point Signed-off-by: Charles Korn <charleskorn@users.noreply.github.com>	2023-02-15 14:18:23 +01:00
Charles Korn	deba5120ea	Address PR feedback: reduce log level. Signed-off-by: Charles Korn <charles.korn@grafana.com>	2023-02-11 15:34:25 +01:00
Charles Korn	857b23873f	Expose QueryData so that implementations of Codec.CanEncode() can perform a type assertion against Response.Data. Signed-off-by: Charles Korn <charles.korn@grafana.com>	2023-02-02 15:30:56 +11:00
Charles Korn	a0dd1468be	Move custom jsoniter code into json_codec.go. Signed-off-by: Charles Korn <charles.korn@grafana.com>	2023-02-02 13:10:20 +11:00
Charles Korn	3e94dd8c8f	Add extension point for returning different content types from API endpoints Signed-off-by: Charles Korn <charles.korn@grafana.com>	2023-02-02 13:10:19 +11:00
Marco Pracucci	3db77b4491	API: change HTTP status code tracked in metrics from 503/422 to 499 if a request is canceled Signed-off-by: Marco Pracucci <marco@pracucci.com>	2023-01-26 13:06:37 +01:00
Julien Pivotto	2c408289f8	Add stabilizing to UI Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2023-01-19 11:33:54 +01:00
Julien Pivotto	ce55e5074d	Add 'keep_firing_for' field to alerting rules This commit adds a new 'keep_firing_for' field to Prometheus alerting rules. The 'resolve_delay' field specifies the minimum amount of time that an alert should remain firing, even if the expression does not return any results. This feature was discussed at a previous dev summit, and it was determined that a feature like this would be useful in order to allow the expression time to stabilize and prevent confusing resolved messages from being propagated through Alertmanager. This approach is simpler than having two PromQL queries, as was sometimes discussed, and it should be easy to implement. This commit does not include tests for the 'resolve_delay' field. This is intentional, as the purpose of this commit is to gather comments on the proposed design of the 'resolve_delay' field before implementing tests. Once the design of the 'resolve_delay' field has been finalized, a follow-up commit will be submitted with tests." See https://github.com/prometheus/prometheus/issues/11570 Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2023-01-13 12:11:39 +01:00
Łukasz Mierzwa	e1b7082008	Show individual scrape pools on /targets page (#11142 ) * Add API endpoints for getting scrape pool names This adds api/v1/scrape_pools endpoint that returns the list of names of all the scrape pools configured. Having it allows to find out what scrape pools are defined without having to list and parse all targets. The second change is adding scrapePool query parameter support in api/v1/targets endpoint, that allows to filter returned targets by only finding ones for passed scrape pool name. Both changes allow to query for a specific scrape pool data, rather than getting all the targets for all possible scrape pools. The problem with api/v1/targets endpoint is that it returns huge amount of data if you configure a lot of scrape pools. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com> * Add a scrape pool selector on /targets page Current targets page lists all possible targets. This works great if you only have a few scrape pools configured, but for systems with a lot of scrape pools and targets this slow things down a lot. Not only does the /targets page load very slowly in such case (waiting for huge API response) but it also take a long time to render, due to huge number of elements. This change adds a dropdown selector so it's possible to select only intersting scrape pool to view. There's also scrapePool query param that will open selected pool automatically. Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com> Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>	2022-12-23 11:55:08 +01:00
Ganesh Vernekar	e3719d670b	Merge remote-tracking branch 'upstream/main' into sparsehistogram Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>	2022-10-25 14:38:56 -04:00
Alan Protasio	5ac12ac351	api: Wrapped promQL based API errors with `returnAPIError` function (#11356 ) * wrap api error on get series/labels on `returnAPIError` function Signed-off-by: Alan Protasio <approtas@amazon.com> * lint Signed-off-by: Alan Protasio <approtas@amazon.com> * query exemplars Signed-off-by: Alan Protasio <approtas@amazon.com> Signed-off-by: Alan Protasio <approtas@amazon.com>	2022-10-20 11:17:00 +02:00
Jesus Vazquez	e934d0f011	Merge 'main' into sparsehistogram Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>	2022-10-05 22:14:49 +02:00
Bryan Boreham	3330d85ba8	Replace sort.Strings and sort.Ints with faster slices.Sort (#11318 ) Use new experimental package `golang.org/x/exp/slices`. slices.Sort works on values that are directly comparable, like ints, so avoids the overhead of an interface call to `.Less()`. Left tests unchanged, because they don't need the speed and it may be a cross-check that slices.Sort gives the same answer. Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2022-09-30 20:03:56 +05:30
Miguel Ángel Ortuño	e4b87a7a2a	api: export point marshaling functions (#11323 ) Export `marshalTimestamp` and `marshalValue` functions by moving them under their own util package. Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>	2022-09-29 20:16:48 +05:30
Alan Protasio	f1a3dbbb6b	GetSeries should Select sorted results only if more than one matcher is requested (#11313 ) Signed-off-by: Alan Protasio <approtas@amazon.com> Signed-off-by: Alan Protasio <approtas@amazon.com>	2022-09-16 09:40:41 +02:00
Julien Pivotto	96d5a32659	Update go to 1.19, set min version to 1.18 (#11279 ) * Update go to 1.19, set min version to 1.18 Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu> * Update golangci-lint Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu> Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>	2022-09-07 11:30:48 +02:00
beorn7	c9fd3c235d	Merge branch 'main' into sparsehistogram	2022-08-10 17:54:37 +02:00
Julius Volz	b57deb6eb0	Add /api/v1/format_query API endpoint for formatting queries (#11036 ) * Add /api/v1/format_query API endpoint for formatting queries This uses the formatting functionality introduced in https://github.com/prometheus/prometheus/pull/10544. I've chosen "query" instead of "expr" in both the endpoint and parameter names to stay consistent with the existing API endpoints. Otherwise, I would have preferred to use the term "expr". Signed-off-by: Julius Volz <julius.volz@gmail.com> * Add docs for /api/v1/format_query endpoint Signed-off-by: Julius Volz <julius.volz@gmail.com> * Add note that formatting expressions removes comments Signed-off-by: Julius Volz <julius.volz@gmail.com>	2022-07-20 14:55:09 +02:00
beorn7	095b6c93dd	Merge branch 'main' into sparsehistogram	2022-06-14 14:27:35 +02:00


279 Commits