Commit Graph

1520 Commits

Author SHA1 Message Date
Julien Pivotto
76dd9b5470
Merge pull request #12618 from prometheus/release-2.46
Merge release 2.46 into main
2023-07-31 10:07:17 +02:00
Goutham Veeramachaneni
ad4f514e66
Add OTLP Ingestion endpoint (#12571)
* Add OTLP Ingestion endpoint

We copy files from the otel-collector-contrib. See the README in
`storage/remote/otlptranslator/README.md`.

This supersedes: https://github.com/prometheus/prometheus/pull/11965

Signed-off-by: gouthamve <gouthamve@gmail.com>

* Return a 200 OK

It is what the OTEL Golang SDK expects :(

https://github.com/open-telemetry/opentelemetry-go/issues/4363

Signed-off-by: Goutham <gouthamve@gmail.com>

---------

Signed-off-by: gouthamve <gouthamve@gmail.com>
Signed-off-by: Goutham <gouthamve@gmail.com>
2023-07-28 12:35:28 +02:00
Julien Pivotto
16c645a6dd Release 2.46.0
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-07-25 13:38:08 +02:00
Julien Pivotto
c37af1eda5 Release 2.46.0-rc.0
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
2023-07-20 16:52:56 +02:00
beorn7
2ea8df4734 histogram: Expose #12305
Native histograms without a zero threshold aren't federated properly.

This adds a test to prove the specific failure mode, which is that
histograms with a zero threshold of zero are federated as classic
histograms.

The underlying reason is that the protobuf parser identifies a native
histogram by detecting a zero bucket or by detecting integer buckets.
Therefore, a float histogram with a zero threshold of zero and an
unpopulated zero bucket falls through the cracks (no integer buckets,
no zero bucket).

This commit also adds a test case for the latter.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-07-19 15:29:11 +02:00
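
The failure mode described in the message above can be sketched as follows. The `histogram` struct and the `looksNative` function are hypothetical simplifications, not the actual protobuf parser code. They only model the two detection rules the message names (a detectable zero bucket, or integer buckets), and how a "detectable zero bucket" is defined here is an assumption.

```go
package main

import "fmt"

// histogram is a hypothetical, simplified stand-in for a scraped histogram.
type histogram struct {
	zeroThreshold float64   // width of the zero bucket
	zeroCount     float64   // population of the zero bucket
	intBuckets    []int64   // integer buckets
	floatBuckets  []float64 // float buckets
}

// looksNative models the detection rule from the commit message:
// a histogram is treated as native if a zero bucket is detected
// or if integer buckets are present. (Assumed logic, for illustration.)
func looksNative(h histogram) bool {
	hasZeroBucket := h.zeroThreshold != 0 || h.zeroCount != 0
	return hasZeroBucket || len(h.intBuckets) > 0
}

func main() {
	// A float histogram with a zero threshold of zero and an
	// unpopulated zero bucket: no integer buckets, no zero bucket.
	h := histogram{floatBuckets: []float64{1.5, 2.5}}
	fmt.Println(looksNative(h)) // prints "false": it falls through the cracks
}
```

Under these assumptions, the histogram in `main` is not recognized as native even though it carries float buckets, which matches the case the new test is meant to expose.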
Julien Pivotto
c572d9d6d9
Merge pull request #11905 from charleskorn/api-response-format-extension-point
Add extension point for returning different content types from API endpoints
2023-07-15 22:49:29 +02:00
Julien Pivotto
f3f3d8f5ca
Merge pull request #12540 from bboreham/slices-sorts2
Replace sort.Sort with faster slices.SortFunc
2023-07-11 13:08:19 +02:00
Bryan Boreham
ce153e3fff Replace sort.Sort with faster slices.SortFunc
The generic version is more efficient.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-07-10 09:43:45 +00:00
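
As a rough illustration of the replacement named in the commit above, the sketch below sorts the same data once with `sort.Sort` and once with `slices.SortFunc`. It assumes the Go 1.21+ standard-library `slices` and `cmp` packages; the commit itself may have used a different `slices` package with a different comparison signature. The `label` type and both helper functions are hypothetical.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
	"sort"
)

// label is a hypothetical element type used only for this example.
type label struct{ name, value string }

// byName implements sort.Interface, the pre-generics approach.
type byName []label

func (b byName) Len() int           { return len(b) }
func (b byName) Less(i, j int) bool { return b[i].name < b[j].name }
func (b byName) Swap(i, j int)      { b[i], b[j] = b[j], b[i] }

// sortOld sorts through the sort.Interface indirection.
func sortOld(ls []label) { sort.Sort(byName(ls)) }

// sortNew uses the generic slices.SortFunc with a comparison function.
func sortNew(ls []label) {
	slices.SortFunc(ls, func(a, b label) int {
		return cmp.Compare(a.name, b.name)
	})
}

func main() {
	ls := []label{{"job", "api"}, {"instance", "a:9090"}, {"env", "prod"}}
	sortNew(ls)
	fmt.Println(ls[0].name, ls[1].name, ls[2].name)
}
```

The generic version needs no wrapper type and no interface method calls per comparison, which is one plausible reason for the efficiency gain the message mentions.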
Marco Pracucci
7cc4292328
Export MinTime and MaxTime
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2023-07-06 17:48:13 +02:00
Julien Pivotto
0186ec7873
Merge pull request #12516 from vinted/convert_queryopts_to_interface
promql: convert QueryOpts to interface
2023-07-04 23:38:31 +02:00
Julien Pivotto
986fde06b2
Merge pull request #11688 from damnever/fix/datamodelvalidation-remotewriteapi
Validate the metric names and labels in the remote write handler
2023-07-04 13:52:02 +02:00
Charles Korn
097faf33c6
Merge branch 'main' into api-response-format-extension-point
# Conflicts:
#	web/api/v1/api.go
#	web/api/v1/api_test.go
2023-07-04 13:26:13 +10:00
Giedrius Statkevičius
3f230fc9f8 promql: convert QueryOpts to interface
Convert QueryOpts to an interface so that downstream projects like
https://github.com/thanos-community/promql-engine could extend the query
options with engine specific options that are not in the original
engine.

Will be used to enable query analysis per-query.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2023-07-03 16:20:31 +03:00
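
A minimal sketch of why an interface helps downstream engines, as the message above argues. The method set of `QueryOpts` shown here, and the `baseOpts`, `analyzeOpts`, and `describe` names, are assumptions made for illustration, not the actual Prometheus or Thanos definitions.

```go
package main

import (
	"fmt"
	"time"
)

// QueryOpts is modeled as an interface (assumed method set).
type QueryOpts interface {
	EnablePerStepStats() bool
	LookbackDelta() time.Duration
}

// baseOpts is a hypothetical default implementation.
type baseOpts struct {
	perStepStats bool
	lookback     time.Duration
}

func (o baseOpts) EnablePerStepStats() bool     { return o.perStepStats }
func (o baseOpts) LookbackDelta() time.Duration { return o.lookback }

// analyzeOpts is a hypothetical downstream extension: it embeds the
// original options and adds an engine-specific one.
type analyzeOpts struct {
	QueryOpts
	analyze bool
}

func (o analyzeOpts) Analyze() bool { return o.analyze }

// describe shows how a downstream engine could detect its own option
// while still accepting the plain QueryOpts interface.
func describe(opts QueryOpts) string {
	if a, ok := opts.(interface{ Analyze() bool }); ok && a.Analyze() {
		return "analysis enabled"
	}
	return "default engine behaviour"
}

func main() {
	base := baseOpts{lookback: 5 * time.Minute}
	fmt.Println(describe(base))
	fmt.Println(describe(analyzeOpts{QueryOpts: base, analyze: true}))
}
```

With a concrete struct, the extra option would have no place to live; with an interface, the caller passes any value that satisfies the original method set.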
Matthias Loibl
71d149a79f
Merge pull request #12472 from metalmatze/request-counter-init
web: Initialize requestCounter metrics to 0 with handler and 200 labels
2023-06-30 14:07:20 +02:00
cui fliter
484a9e4071
fix some typos (#12498)
Signed-off-by: cui fliter <imcusg@gmail.com>
2023-06-29 12:28:13 +02:00
Matthias Loibl
686482ab34
Remove Add(0)
Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>
2023-06-27 18:10:38 +02:00
Julien Pivotto
490bf641be
Merge pull request #12487 from prometheus/release-2.45
Merge Release 2.45 back to main
2023-06-23 23:56:10 +02:00
Jesus Vazquez
8ef767e396
Release 2.45.0 (#12486)
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
2023-06-23 17:01:52 +02:00
Jesus Vazquez
c858049744
Create 2.45.0-rc.1 (#12478)
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
2023-06-20 15:13:02 +00:00
Matthias Loibl
8bc2a19469
web: Initialize requestCounter metrics to 0 with handler and 200k labels.
Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>
2023-06-19 17:50:16 +02:00
Julien Pivotto
e043b273a6
Merge pull request #12439 from prometheus/release-2.45
Merge release 2.45.0 back to main
2023-06-17 10:16:48 +02:00
Arthur Silva Sens
1ea477f4bc
Add feature flag to squash metadata from /api/v1/metadata (#12391)
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2023-06-12 16:17:20 +01:00
Jesus Vazquez
edfc97a77e
Bump UI version (#12440)
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
2023-06-07 16:00:15 +02:00
Jesus Vazquez
bfa466d00f
Create release candidate 2.45.0-rc.0 (#12435)
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
2023-06-07 12:29:04 +02:00
cui fliter
6e7ac76981
fix problematic link (#12405)
Signed-off-by: cui fliter <imcusg@gmail.com>
2023-05-29 10:26:11 +02:00
Baskar Shanmugam
905a0bd63a
Added 'limit' query parameter support to /api/v1/status/tsdb endpoint (#12336)
* Added 'topN' query parameter support to /api/v1/status/tsdb endpoint

Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>

* Updated query parameter for tsdb status to 'limit'

Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>

* Corrected Stats() parameter name from topN to limit

Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>

* Fixed p.Stats CI failure

Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>

---------

Signed-off-by: Baskar Shanmugam <baskar.shanmugam.career@gmail.com>
2023-05-22 14:37:07 +02:00
Bryan Boreham
a073e04a9b
Merge pull request #12366 from prometheus/release-2.44
Merge release 2.44 back to main
2023-05-16 18:06:29 +01:00
Bryan Boreham
1ac5131f69
Release 2.44.0 (#12364)
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-05-14 07:13:03 +01:00
beorn7
9e500345f3 textparse/scrape: Add option to scrape both classic and native histograms
So far, if a target exposes a histogram with both classic and native
buckets, a native-histogram enabled Prometheus would ignore the
classic buckets. With the new scrape config option
`scrape_classic_histograms` set, both buckets will be ingested,
creating all the series of a classic histogram in parallel to the
native histogram series. For example, a histogram `foo` would create a
native histogram series `foo` and classic series called `foo_sum`,
`foo_count`, and `foo_bucket`.

This feature can be used in a migration strategy from classic to
native histograms, where it is desired to have a transition period
during which both native and classic histograms are present.

Note that two bugs in classic histogram parsing were found and fixed
as a byproduct of testing the new feature:

1. Series created from classic _gauge_ histograms didn't get the
   _sum/_count/_bucket suffix set.
2. Values of classic _float_ histograms weren't parsed properly.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-05-13 01:32:25 +02:00
Bryan Boreham
94d9367bbf
Create 2.44.0-rc.2 (#12341)
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-05-08 11:30:29 +01:00
Bryan Boreham
3d26faade4
Create 2.44.0-rc.1 (#12323)
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-05-03 16:18:28 +01:00
Bryan Boreham
aeccf9e770 Bump version to 2.44.0-rc0
Including CHANGELOG.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2023-04-24 15:38:43 +00:00
Vladimir Varankin
d281ebb178 web: display GOMEMLIMIT in runtime info
Signed-off-by: Vladimir Varankin <vladimir@varank.in>
2023-04-23 20:24:34 +02:00
Julien Pivotto
8f1dc4a70f
Merge pull request #12248 from yeya24/consistent-response
Use same error for instant and range query when 400
2023-04-21 11:44:20 +02:00
Julien Pivotto
e2512078e5
Merge pull request #12241 from mmorel-35/linter/nilerr
enable gocritic, unconvert and unused linters
2023-04-20 15:13:31 +02:00
gotjosh
2f22c8b7f8
Merge pull request #12270 from prometheus/gotjosh/allow-filtering-of-rules-by-name-api
Rules API: Allow filtering by rule name
2023-04-20 12:03:08 +01:00
gotjosh
e78be38cc0
don't show empty groups
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-04-20 11:20:20 +01:00
Matthieu MOREL
bae9a21200
Merge branch 'main' into linter/nilerr
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-04-19 19:56:39 +02:00
beorn7
5b53aa1108 style: Replace else if cascades with switch
Wiser coders than myself have come to the conclusion that a `switch`
statement is almost always superior to a statement that includes any
`else if`.

The exceptions that I have found in our codebase are just these two:

* The `else if` is followed by an additional statement before the next
  condition (separated by a `;`).
* The whole thing is within a `for` loop and `break` statements are
  used. In this case, using `switch` would require tagging the `for`
  loop, which probably tips the balance.

Why are `switch` statements more readable?

For one, fewer curly braces. But more importantly, the conditions all
have the same alignment, so the whole thing follows the natural flow
of going down a list of conditions. With `else if`, in contrast, all
conditions but the first are "hidden" behind `} else if `, harder to
spot and (for no good reason) presented differently from the first
condition.

I'm sure the aforementioned wise coders can list even more reasons.

In any case, I like it so much that I have found myself recommending
it in code reviews. I would like to make it a habit in our code base,
without making it a hard requirement that we would test on the CI. But
for that, there has to be a role model, so this commit eliminates all
`else if` occurrences, unless it is autogenerated code or fits one of
the exceptions above.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:22:31 +02:00
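
The readability argument in the message above can be seen in a small side-by-side sketch. Both functions are hypothetical and behave identically; only the control-flow style differs.

```go
package main

import "fmt"

// signElseIf uses an else-if cascade: every condition after the first
// sits behind "} else if".
func signElseIf(v float64) string {
	var s string
	if v < 0 {
		s = "negative"
	} else if v == 0 {
		s = "zero"
	} else {
		s = "positive"
	}
	return s
}

// signSwitch expresses the same logic as a tagless switch: all
// conditions share the same alignment.
func signSwitch(v float64) string {
	switch {
	case v < 0:
		return "negative"
	case v == 0:
		return "zero"
	default:
		return "positive"
	}
}

func main() {
	for _, v := range []float64{-1, 0, 2} {
		fmt.Println(signElseIf(v), signSwitch(v))
	}
}
```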
beorn7
c3c7d44d84 lint: Adjust to the lint warnings raised by current versions of golangci-lint
We haven't updated golangci-lint in our CI yet, but this commit prepares
for that.

There are a lot of new warnings, and it is mostly because the "revive"
linter got updated. I agree with most of the new warnings, mostly
around not naming unused function parameters (although it is justified
in some cases for documentation purposes – while things like mocks are
a good example where not naming the parameter is clearer).

I'm pretty upset about the "empty block" warning to include `for`
loops. It's such a common pattern to do something in the head of the
`for` loop and then have an empty block. There is still an open issue
about this: https://github.com/mgechev/revive/issues/810 I have
disabled "revive" altogether in files where empty blocks are used
excessively, and I have made the effort to add individual
`// nolint:revive` where empty blocks are used just once or twice.
It's borderline noisy, though, but let's go with it for now.

I should mention that none of the "empty block" warnings for `for`
loop bodies were legitimate.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-19 17:10:10 +02:00
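
The pattern the message above defends, a `for` loop that does all its work in the loop head and has an empty body, looks roughly like the sketch below. `skipSpaces` is a hypothetical helper; the suppression comment is written as it appears in the message, and how a given linter version interprets it is an assumption.

```go
package main

import "fmt"

// skipSpaces returns the index of the first non-space byte in s.
// All the work happens in the loop head, so the body is empty.
func skipSpaces(s string) int {
	i := 0
	for ; i < len(s) && s[i] == ' '; i++ { // nolint:revive
	}
	return i
}

func main() {
	fmt.Println(skipSpaces("   up{job=\"api\"}")) // prints 3
}
```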
gotjosh
96b6463f25
review comments
Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-04-18 16:26:32 +01:00
gotjosh
f3394bf7a1
Rules API: Allow filtering by rule name
Introduces support for a new query parameter in the `/rules` API endpoint that allows filtering by rule names.

If all the rules of a group are filtered, we skip the group entirely.

Signed-off-by: gotjosh <josue.abreu@gmail.com>
2023-04-18 10:12:08 +01:00
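
The filtering behaviour described above, including skipping groups whose rules are all filtered out, could be sketched like this. The `rule` and `group` types and the `filterGroups` function are hypothetical and do not mirror the actual API handler.

```go
package main

import "fmt"

// rule and group are hypothetical, simplified stand-ins.
type rule struct{ name string }

type group struct {
	name  string
	rules []rule
}

// filterGroups keeps only rules whose name is in names. Groups left
// without any rule are skipped entirely. An empty filter keeps everything.
func filterGroups(groups []group, names map[string]struct{}) []group {
	if len(names) == 0 {
		return groups
	}
	var out []group
	for _, g := range groups {
		var kept []rule
		for _, r := range g.rules {
			if _, ok := names[r.name]; ok {
				kept = append(kept, r)
			}
		}
		if len(kept) == 0 {
			continue // don't show empty groups
		}
		out = append(out, group{name: g.name, rules: kept})
	}
	return out
}

func main() {
	groups := []group{
		{name: "node", rules: []rule{{"HighCPU"}, {"DiskFull"}}},
		{name: "http", rules: []rule{{"HighLatency"}}},
	}
	filtered := filterGroups(groups, map[string]struct{}{"DiskFull": {}})
	fmt.Println(len(filtered), filtered[0].name, filtered[0].rules[0].name)
}
```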
Ben Ye
fd3630b9a3 add ctx to QueryEngine interface
Signed-off-by: Ben Ye <benye@amazon.com>
2023-04-17 21:32:38 -07:00
Matthieu MOREL
fb3eb21230 enable gocritic, unconvert and unused linters
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2023-04-13 19:20:22 +00:00
beorn7
817a2396cb Name float values as "floats", not as "values"
In the past, every sample value was a float, so it was fine to call a
variable holding such a float "value" or "sample". With native
histograms, a sample might have a histogram value. And a histogram
value is still a value. Calling a float value just "value" or "sample"
or "V" is therefore misleading. Over the last few commits, I already
renamed many variables, but this cleans up a few more places where the
changes are more invasive.

Note that we do not attempt to rename anything in the JSON APIs or in the
protobufs. That would be quite a disruption. However, internally, we
can call variables as we want, and we should go with the option of
avoiding misunderstandings.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-13 19:25:24 +02:00
beorn7
630bcb494b storage: Use separate sample types for histogram vs. float
Previously, we had one “polymorphous” `sample` type in the `storage`
package. This commit breaks it up into `fSample`, `hSample`, and
`fhSample`, each still implementing the `tsdbutil.Sample` interface.

This reduces allocations in `sampleRing.Add` but inflicts the penalty
of the interface wrapper, which makes things worse in total.

This commit therefore just demonstrates the step taken. The next
commit will tackle the interface overhead problem.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-13 19:25:24 +02:00
beorn7
c0879d64cf promql: Separate Point into FPoint and HPoint
In other words: Instead of having a “polymorphous” `Point` that can
either contain a float value or a histogram value, use an `FPoint` for
floats and an `HPoint` for histograms.

This seemingly small change has a _lot_ of repercussions throughout
the codebase.

The idea here is to avoid the increase in size of `Point` arrays that
happened after native histograms had been added.

The higher-level data structures (`Sample`, `Series`, etc.) are still
“polymorphous”. The same idea could be applied to them, but at each
step the trade-offs need to be evaluated.

The idea with this change is to do the minimum necessary to get back
to pre-histogram performance for functions that do not touch
histograms. Here are comparisons for the `changes` function. The test
data doesn't include histograms yet. Ideally, there would be no change
in the benchmark result at all.

First runtime v2.39 compared to directly prior to this commit:

```
name                                                  old time/op    new time/op    delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16            391µs ± 2%     542µs ± 1%  +38.58%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16           452µs ± 2%     617µs ± 2%  +36.48%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16         1.12ms ± 1%    1.36ms ± 2%  +21.58%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16        7.83ms ± 1%    8.94ms ± 1%  +14.21%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16           2.98ms ± 0%    3.30ms ± 1%  +10.67%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16          3.66ms ± 1%    4.10ms ± 1%  +11.82%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16         10.5ms ± 0%    11.8ms ± 1%  +12.50%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16        77.6ms ± 1%    87.4ms ± 1%  +12.63%  (p=0.000 n=9+9)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16       30.4ms ± 2%    32.8ms ± 1%   +8.01%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16      37.1ms ± 2%    40.6ms ± 2%   +9.64%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16      105ms ± 1%     117ms ± 1%  +11.69%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16     783ms ± 3%     876ms ± 1%  +11.83%  (p=0.000 n=9+10)
```

And then runtime v2.39 compared to after this commit:

```
name                                                  old time/op    new time/op    delta
RangeQuery/expr=changes(a_one[1d]),steps=1-16            391µs ± 2%     547µs ± 1%  +39.84%  (p=0.000 n=9+8)
RangeQuery/expr=changes(a_one[1d]),steps=10-16           452µs ± 2%     616µs ± 2%  +36.15%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_one[1d]),steps=100-16         1.12ms ± 1%    1.26ms ± 1%  +12.20%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_one[1d]),steps=1000-16        7.83ms ± 1%    7.95ms ± 1%   +1.59%  (p=0.000 n=10+8)
RangeQuery/expr=changes(a_ten[1d]),steps=1-16           2.98ms ± 0%    3.38ms ± 2%  +13.49%  (p=0.000 n=9+10)
RangeQuery/expr=changes(a_ten[1d]),steps=10-16          3.66ms ± 1%    4.02ms ± 1%   +9.80%  (p=0.000 n=10+9)
RangeQuery/expr=changes(a_ten[1d]),steps=100-16         10.5ms ± 0%    10.8ms ± 1%   +3.08%  (p=0.000 n=8+10)
RangeQuery/expr=changes(a_ten[1d]),steps=1000-16        77.6ms ± 1%    78.1ms ± 1%   +0.58%  (p=0.035 n=9+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1-16       30.4ms ± 2%    33.5ms ± 4%  +10.18%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=10-16      37.1ms ± 2%    40.0ms ± 1%   +7.98%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=100-16      105ms ± 1%     107ms ± 1%   +1.92%  (p=0.000 n=10+10)
RangeQuery/expr=changes(a_hundred[1d]),steps=1000-16     783ms ± 3%     775ms ± 1%   -1.02%  (p=0.019 n=9+9)
```

In summary, the runtime doesn't really improve with this change for
queries with just a few steps. For queries with many steps, this
commit essentially reinstates the old performance. This is good
because the many-step queries are the ones that matter most (longest
absolute runtime).

In terms of allocations, though, this commit doesn't make a dent at
all (numbers not shown). The reason is that most of the allocations
happen in the sampleRingIterator (in the storage package), which has
to be addressed in a separate commit.

Signed-off-by: beorn7 <beorn@grafana.com>
2023-04-13 19:25:16 +02:00
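
The size argument in the message above can be illustrated with a small sketch. The field layouts below are assumptions chosen for illustration (the real types may differ), and `FloatHistogram` is only a placeholder. The point is that a combined point type carries a histogram pointer in every element, while a float-only type does not.

```go
package main

import (
	"fmt"
	"unsafe"
)

// FloatHistogram is a hypothetical placeholder for a histogram value.
type FloatHistogram struct{ Count, Sum float64 }

// Point models the old combined type: a float value or a histogram.
type Point struct {
	T int64
	V float64
	H *FloatHistogram
}

// FPoint holds only a float sample.
type FPoint struct {
	T int64
	F float64
}

// HPoint holds only a histogram sample.
type HPoint struct {
	T int64
	H *FloatHistogram
}

// pointSize and fPointSize report the per-element size of each type.
func pointSize() uintptr  { return unsafe.Sizeof(Point{}) }
func fPointSize() uintptr { return unsafe.Sizeof(FPoint{}) }

func main() {
	// Exact numbers depend on the platform, so none are asserted here.
	fmt.Println("Point bytes: ", pointSize())
	fmt.Println("FPoint bytes:", fPointSize())
	fmt.Println("HPoint bytes:", unsafe.Sizeof(HPoint{}))
}
```

For float-only queries, slices of the smaller `FPoint` mean less memory to move per step, which is consistent with the many-step benchmark results quoted in the message.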
Ben Ye
fb67d368a2 use consistent error for instant and range query 400
Signed-off-by: Ben Ye <benye@amazon.com>
2023-04-11 13:45:34 -07:00
Xiaochao Dong (@damnever)
2b7202c4cc Validate the metric names and labels in the remote write handler
Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
2023-04-05 19:09:05 +08:00
Hayk Davtyan
408f31f786
[WebUI/ScrapePoolList] Case-insensitive search of "Scrape Pools" (#12207)
Signed-off-by: hayk96 <hayko5999@gmail.com>
2023-04-02 11:37:58 +02:00