Commit Graph

1486 Commits

Author SHA1 Message Date
Bryan Boreham
a073e04a9b
Merge pull request #12366 from prometheus/release-2.44
Merge release 2.44 back to main
2023-05-16 18:06:29 +01:00
Bryan Boreham
1ac5131f69
Release 2.44.0 (#12364)
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-05-14 07:13:03 +01:00
beorn7
9e500345f3 textparse/scrape: Add option to scrape both classic and native histograms
So far, if a target exposes a histogram with both classic and native
buckets, a native-histogram enabled Prometheus would ignore the
classic buckets. With the new scrape config option
`scrape_classic_histograms` set, both buckets will be ingested,
creating all the series of a classic histogram in parallel to the
native histogram series. For example, a histogram `foo` would create a
native histogram series `foo` and classic series called `foo_sum`,
`foo_count`, and `foo_bucket`.

This feature can be used in a migration strategy from classic to
native histograms, where it is desired to have a transition period
during which both native and classic histograms are present.

Note that two bugs in classic histogram parsing were found and fixed
as a byproduct of testing the new feature:

1. Series created from classic _gauge_ histograms didn't get the
   _sum/_count/_bucket prefix set.
2. Values of classic _float_ histograms weren't parsed properly.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-05-13 01:32:25 +02:00
Bryan Boreham
94d9367bbf
Create 2.44.0-rc.2 (#12341)
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-05-08 11:30:29 +01:00
Bryan Boreham
3d26faade4
Create 2.44.0-rc.1 (#12323)
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-05-03 16:18:28 +01:00
Bryan Boreham
aeccf9e770 Bump version to 2.44.0-rc0
Including CHANGELOG.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-04-24 15:38:43 +00:00
Vladimir Varankin
d281ebb178 web: display GOMEMLIMIT in runtime info
Signed-off-by: Vladimir Varankin <vladimir@varank.in>
2023-04-23 20:24:34 +02:00
Julien Pivotto
8f1dc4a70f
Merge pull request #12248 from yeya24/consistent-response
Use same error for instant and range query when 400
2023-04-21 11:44:20 +02:00
Julien Pivotto
e2512078e5
Merge pull request #12241 from mmorel-35/linter/nilerr
enable gocritic, unconvert and unused linters
2023-04-20 15:13:31 +02:00
gotjosh
2f22c8b7f8
Merge pull request #12270 from prometheus/gotjosh/allow-filtering-of-rules-by-name-api
Rules API: Allow filtering by rule name
2023-04-20 12:03:08 +01:00
gotjosh
e78be38cc0
don't show empty groups
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-04-20 11:20:20 +01:00
Matthieu MOREL
bae9a21200
Merge branch 'main' into linter/nilerr
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-04-19 19:56:39 +02:00
beorn7
5b53aa1108 style: Replace else if cascades with switch
Wiser coders than myself have come to the conclusion that a `switch`
statement is almost always superior to a statement that includes any
`else if`.

The exceptions that I have found in our codebase are just these two:

* The `if else` is followed by an additional statement before the next
  condition (separated by a `;`).
* The whole thing is within a `for` loop and `break` statements are
  used. In this case, using `switch` would require tagging the `for`
  loop, which probably tips the balance.

Why are `switch` statements more readable?

For one, fewer curly braces. But more importantly, the conditions all
have the same alignment, so the whole thing follows the natural flow
of going down a list of conditions. With `else if`, in contrast, all
conditions but the first are "hidden" behind `} else if `, harder to
spot and (for no good reason) presented differently from the first
condition.

I'm sure the aforemention wise coders can list even more reasons.

In any case, I like it so much that I have found myself recommending
it in code reviews. I would like to make it a habit in our code base,
without making it a hard requirement that we would test on the CI. But
for that, there has to be a role model, so this commit eliminates all
`if else` occurrences, unless it is autogenerated code or fits one of
the exceptions above.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:22:31 +02:00
beorn7
c3c7d44d84 lint: Adjust to the lint warnings raised by current versions of golint-ci
We haven't updated golint-ci in our CI yet, but this commit prepares
for that.

There are a lot of new warnings, and it is mostly because the "revive"
linter got updated. I agree with most of the new warnings, mostly
around not naming unused function parameters (although it is justified
in some cases for documentation purposes – while things like mocks are
a good example where not naming the parameter is clearer).

I'm pretty upset about the "empty block" warning to include `for`
loops. It's such a common pattern to do something in the head of the
`for` loop and then have an empty block. There is still an open issue
about this: https://github.com/mgechev/revive/issues/810 I have
disabled "revive" altogether in files where empty blocks are used
excessively, and I have made the effort to add individual
`// nolint:revive` where empty blocks are used just once or twice.
It's borderline noisy, though, but let's go with it for now.

I should mention that none of the "empty block" warnings for `for`
loop bodies were legitimate.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:10:10 +02:00
gotjosh
96b6463f25
review comments
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-04-18 16:26:32 +01:00
gotjosh
f3394bf7a1
Rules API: Allow filtering by rule name
Introduces support for a new query parameter in the `/rules` API endpoint that allows filtering by rule names.

If all the rules of a group are filtered, we skip the group entirely.

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-04-18 10:12:08 +01:00
Ben Ye
fd3630b9a3 add ctx to QueryEngine interface
Signed-off-by: Ben Ye <benye@amazon.com>
2023-04-17 21:32:38 -07:00
Matthieu MOREL
fb3eb21230 enable gocritic, unconvert and unused linters
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-04-13 19:20:22 +00:00
beorn7
817a2396cb Name float values as "floats", not as "values"
In the past, every sample value was a float, so it was fine to call a
variable holding such a float "value" or "sample". With native
histograms, a sample might have a histogram value. And a histogram
value is still a value. Calling a float value just "value" or "sample"
or "V" is therefore misleading. Over the last few commits, I already
renamed many variables, but this cleans up a few more places where the
changes are more invasive.

Note that we do not to attempt naming in the JSON APIs or in the
protobufs. That would be quite a disruption. However, internally, we
can call variables as we want, and we should go with the option of
avoiding misunderstandings.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-13 19:25:24 +02:00
beorn7
630bcb494b storage: Use separate sample types for histogram vs. float
Previously, we had one “polymorphous” `sample` type in the `storage`
package. This commit breaks it up into `fSample`, `hSample`, and
`fhSample`, each still implementing the `tsdbutil.Sample` interface.

This reduces allocations in `sampleRing.Add` but inflicts the penalty
of the interface wrapper, which makes things worse in total.

This commit therefore just demonstrates the step taken. The next
commit will tackle the interface overhead problem.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-13 19:25:24 +02:00
beorn7
c0879d64cf promql: Separate Point into FPoint and HPoint
In other words: Instead of having a “polymorphous” `Point` that can
either contain a float value or a histogram value, use an `FPoint` for
floats and an `HPoint` for histograms.

This seemingly small change has a _lot_ of repercussions throughout
the codebase.

The idea here is to avoid the increase in size of `Point` arrays that
happened after native histograms had been added.

The higher-level data structures (`Sample`, `Series`, etc.) are still
“polymorphous”. The same idea could be applied to them, but at each
step the trade-offs needed to be evaluated.

The idea with this change is to do the minimum necessary to get back
to pre-histogram performance for functions that do not touch
histograms. Here are comparisons for the `changes` function. The test
data doesn't include histograms yet. Ideally, there would be no change
in the benchmark result at all.

First runtime v2.39 compared to directly prior to this commit:

```
name                                                  old time/op    new time/op    delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16            391µs ± 2%     542µs ± 1%  +38.58%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16           452µs ± 2%     617µs ± 2%  +36.48%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16         1.12ms ± 1%    1.36ms ± 2%  +21.58%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16        7.83ms ± 1%    8.94ms ± 1%  +14.21%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16           2.98ms ± 0%    3.30ms ± 1%  +10.67%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16          3.66ms ± 1%    4.10ms ± 1%  +11.82%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16         10.5ms ± 0%    11.8ms ± 1%  +12.50%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16        77.6ms ± 1%    87.4ms ± 1%  +12.63%  (p=0.000 n=9+9)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16       30.4ms ± 2%    32.8ms ± 1%   +8.01%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16      37.1ms ± 2%    40.6ms ± 2%   +9.64%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16      105ms ± 1%     117ms ± 1%  +11.69%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16     783ms ± 3%     876ms ± 1%  +11.83%  (p=0.000 n=9+10)
```

And then runtime v2.39 compared to after this commit:

```
name                                                  old time/op    new time/op    delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16            391µs ± 2%     547µs ± 1%  +39.84%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16           452µs ± 2%     616µs ± 2%  +36.15%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16         1.12ms ± 1%    1.26ms ± 1%  +12.20%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16        7.83ms ± 1%    7.95ms ± 1%   +1.59%  (p=0.000 n=10+8)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16           2.98ms ± 0%    3.38ms ± 2%  +13.49%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16          3.66ms ± 1%    4.02ms ± 1%   +9.80%  (p=0.000 n=10+9)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16         10.5ms ± 0%    10.8ms ± 1%   +3.08%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16        77.6ms ± 1%    78.1ms ± 1%   +0.58%  (p=0.035 n=9+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16       30.4ms ± 2%    33.5ms ± 4%  +10.18%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16      37.1ms ± 2%    40.0ms ± 1%   +7.98%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16      105ms ± 1%     107ms ± 1%   +1.92%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16     783ms ± 3%     775ms ± 1%   -1.02%  (p=0.019 n=9+9)
```

In summary, the runtime doesn't really improve with this change for
queries with just a few steps. For queries with many steps, this
commit essentially reinstates the old performance. This is good
because the many-step queries are the one that matter most (longest
absolute runtime).

In terms of allocations, though, this commit doesn't make a dent at
all (numbers not shown). The reason is that most of the allocations
happen in the sampleRingIterator (in the storage package), which has
to be addressed in a separate commit.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-13 19:25:16 +02:00
Ben Ye
fb67d368a2 use consistent error for instant and range query 400
Signed-off-by: Ben Ye <benye@amazon.com>
2023-04-11 13:45:34 -07:00
Hayk Davtyan
408f31f786
[WebUI/ScrapePoolList] Case-insensitive search of "Scrape Pools" (#12207)
Signed-off-by: hayk96 <hayko5999@gmail.com>
2023-04-02 11:37:58 +02:00
Alison
3ac49d4ae2
Set SourceMap to false to fix UI in MCR builds (#12175)
* Set sourceMap to false

Signed-off-by: Alison Burgess <alburgess@microsoft.com>

* add sourcemap=false to package.json build

Signed-off-by: Alison Burgess <alburgess@microsoft.com>

* set sourcemap to true in tsconfig

Signed-off-by: Alison Burgess <alburgess@microsoft.com>

---------

Signed-off-by: Alison Burgess <alburgess@microsoft.com>
Co-authored-by: Alison Burgess <alburgess@microsoft.com>
2023-03-23 22:25:48 +01:00
Julien Pivotto
de50efbf7a
Merge pull request #12165 from prometheus/release-2.43
Merge 2.43 in main
2023-03-21 17:26:28 +01:00
Julien Pivotto
1070c9b06c Release 2.43.0
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-03-21 13:07:51 +01:00
Julien Pivotto
2c6168be5f Release 2.43.0-rc.1
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-03-16 20:21:40 +01:00
pbudner
46683eadf7 fix: advertise correct flag to enable remote write receiver
Signed-off-by: pbudner <mail@pascalbudner.de>
2023-03-11 13:50:52 +01:00
Julien Pivotto
b6d91e8bf8 Release 2.43.0-rc.0
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-03-09 14:30:02 +01:00
Julien Pivotto
db2d759b81 Add support for lookbackdelta per query via the API
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-03-08 00:30:05 +01:00
Julien Pivotto
8e4350dd59 Directly include SVG logo in the page.
Use HTML <svg> element to include Prometheus logo in the status page.

Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-03-04 00:36:07 +01:00
Shan Aminzadeh
3f6f5d3357
Scope GroupBy labels to metric (#11914)
Signed-off-by: Shan Aminzadeh <shan.aminzadeh@chronosphere.io>
2023-02-17 10:23:16 +01:00
Fish-pro
43d77f7c41 Use http constants instead of string
Signed-off-by: Fish-pro <zechun.chen@daocloud.io>
2023-02-10 10:21:05 +08:00
Julien Pivotto
c70d85baed
Merge pull request #11916 from prometheus/release-2.42
Merge 2.42 to main
2023-02-02 10:17:27 +01:00
Kemal Akkoyun
225c61122d
Cut v2.42.0 (#11912)
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
2023-01-31 18:04:46 +01:00
Kemal Akkoyun
7b9cc7eea3
Cut v2.42.0-rc.0 (#11902)
* Cut v2.42.0-rc.0

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Add missing log items

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
2023-01-27 13:43:47 +01:00
Bartlomiej Plotka
6dcfb71740
Merge pull request #11897 from pracucci/propose-to-change-query-canceled-status-code
API: change HTTP status code from 503/422 to 499 if a request is canceled
2023-01-26 13:51:43 +00:00
Marco Pracucci
3db77b4491
API: change HTTP status code tracked in metrics form 503/422 to 499 if a request is canceled
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-01-26 13:06:37 +01:00
Kemal Akkoyun
bab3b4e3c3 Upgrade dependencies
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
2023-01-26 09:19:50 +01:00
Kemal Akkoyun
ae597cac62
Upgrade several UI dependencies (#11894)
* build(deps): bump @codemirror/autocomplete in /web/ui

Bumps [@codemirror/autocomplete](https://github.com/codemirror/autocomplete) from 6.3.0 to 6.4.0.
- [Release notes](https://github.com/codemirror/autocomplete/releases)
- [Changelog](https://github.com/codemirror/autocomplete/blob/main/CHANGELOG.md)
- [Commits](https://github.com/codemirror/autocomplete/compare/6.3.0...6.4.0)

---
updated-dependencies:
- dependency-name: "@codemirror/autocomplete"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump @fortawesome/fontawesome-svg-core in /web/ui

Bumps [@fortawesome/fontawesome-svg-core](https://github.com/FortAwesome/Font-Awesome) from 6.2.0 to 6.2.1.
- [Release notes](https://github.com/FortAwesome/Font-Awesome/releases)
- [Changelog](https://github.com/FortAwesome/Font-Awesome/blob/6.x/CHANGELOG.md)
- [Commits](https://github.com/FortAwesome/Font-Awesome/compare/6.2.0...6.2.1)

---
updated-dependencies:
- dependency-name: "@fortawesome/fontawesome-svg-core"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-25 19:49:14 +01:00
Julien Pivotto
44ef49805c
Merge pull request #11890 from prometheus/release-2.41
Merge back Release 2.41
2023-01-25 11:22:46 +01:00
Shan Aminzadeh
cdfd18ce00
[lezer-promql] Fix package.json main to point to correct cjs module (#11888)
Signed-off-by: Shan Aminzadeh <shan.aminzadeh@chronosphere.io>
2023-01-25 11:00:59 +01:00
Julien Pivotto
2c408289f8 Add stabilizing to UI
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-01-19 11:33:54 +01:00
Julien Pivotto
ce55e5074d Add 'keep_firing_for' field to alerting rules
This commit adds a new 'keep_firing_for' field to Prometheus alerting
rules. The 'resolve_delay' field specifies the minimum amount of time
that an alert should remain firing, even if the expression does not
return any results.

This feature was discussed at a previous dev summit, and it was
determined that a feature like this would be useful in order to allow
the expression time to stabilize and prevent confusing resolved messages
from being propagated through Alertmanager.

This approach is simpler than having two PromQL queries, as was
sometimes discussed, and it should be easy to implement.

This commit does not include tests for the 'resolve_delay' field.  This
is intentional, as the purpose of this commit is to gather comments on
the proposed design of the 'resolve_delay' field before implementing
tests. Once the design of the 'resolve_delay' field has been finalized,
a follow-up commit will be submitted with tests."

See https://github.com/prometheus/prometheus/issues/11570

Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-01-13 12:11:39 +01:00
beorn7
d121db7a65
federate: Fix PeekBack usage
In most cases, there is no sample at `maxt`, so `PeekBack` has to be
used. So far, `PeekBack` did not return a float histogram, and we
disregarded even any returned normal histogram. This fixes both, and
also tweaks the unit test to discover the problem (by using an earlier
timestamp than "now" for the samples in the TSDB).

Signed-off-by: beorn7 <beorn@grafana.com>
2023-01-12 20:43:02 +05:30
Ganesh Vernekar
7a88bc3581
Test federation with native histograms
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2023-01-12 20:43:02 +05:30
Ganesh Vernekar
33f880d123
Add native histogram support in federation
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
2023-01-12 20:42:59 +05:30
Levi Harrison
3b4cbf8da4
Inject readiness state through context (#11617)
Signed-off-by: Levi Harrison <git@leviharrison.dev>

Signed-off-by: Levi Harrison <git@leviharrison.dev>
2023-01-09 00:04:00 +01:00
Aleksey Smirnov
84c6f0e584
Init value for useState hook calls once (#11802)
Signed-off-by: Smirnov Aleksey <aleksey.smirnov@sbermarket.ru>

Signed-off-by: Smirnov Aleksey <aleksey.smirnov@sbermarket.ru>
Co-authored-by: Smirnov Aleksey <aleksey.smirnov@sbermarket.ru>
2023-01-03 22:09:00 +01:00
Łukasz Mierzwa
e1b7082008
Show individual scrape pools on /targets page (#11142)
* Add API endpoints for getting scrape pool names

This adds api/v1/scrape_pools endpoint that returns the list of *names* of all the scrape pools configured.
Having it allows to find out what scrape pools are defined without having to list and parse all targets.

The second change is adding scrapePool query parameter support in api/v1/targets endpoint, that allows to
filter returned targets by only finding ones for passed scrape pool name.

Both changes allow to query for a specific scrape pool data, rather than getting all the targets for all possible scrape pools.
The problem with api/v1/targets endpoint is that it returns huge amount of data if you configure a lot of scrape pools.

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>

* Add a scrape pool selector on /targets page

Current targets page lists all possible targets. This works great if you only have a few scrape pools configured,
but for systems with a lot of scrape pools and targets this slow things down a lot.
Not only does the /targets page load very slowly in such case (waiting for huge API response) but it also take
a long time to render, due to huge number of elements.
This change adds a dropdown selector so it's possible to select only intersting scrape pool to view.
There's also scrapePool query param that will open selected pool automatically.

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>

Signed-off-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
2022-12-23 11:55:08 +01:00