Commit Graph

115 Commits

Author SHA1 Message Date
Arve Knudsen
89bbb885e5
Upgrade to golangci-lint v1.62.0 (#15424)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-11-20 17:22:20 +01:00
Bryan Boreham
5571c7dc98 FastRegexMatcher: use stack memory for lowercase copy of string
Up to 32-byte values this saves garbage, runs faster.
For prefixes, only `toLower` the part we need for the map lookup.

Split toNormalisedLower into fast and slow paths, to avoid a penalty
for the `copy` call in the case where no allocations are done.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-10-28 16:28:58 +00:00
Bryan Boreham
31c5760551
Neater string vs byte-slice conversions (#14425)
unsafe.Slice and unsafe.StringData were added in Go 1.20

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-21 12:19:21 +02:00
Mario Fernandez
5814920601
Fix: optimize .* regexp performance
Shortcut for `.*` matches newlines as well.
Add preamble change ^(?s:
Add test
dotAll flag por al regex
Add and fix regex tests

Signed-off-by: Mario Fernandez <mariofer@redhat.com>
2024-09-17 12:18:31 +02:00
Owen Williams
9da75328ea
fix(utf8): ensure correct validation when legacy mode turned on (#14736)
fix(utf8): ensure correct validation when legacy mode turned on

This depends on the included update of the prometheus/common dependency.

---------

Signed-off-by: Owen Williams <owen.williams@grafana.com>
2024-08-28 17:15:42 +02:00
beorn7
0f760f63dd lint: Revamp our linting rules, mostly around doc comments
Several things done here:

- Set `max-issues-per-linter` to 0 so that we actually see all linter
  warnings and not just 50 per linter. (As we also set
  `max-same-issues` to 0, I assume this was the intention from the
  beginning.)

- Stop using the golangci-lint default excludes (by setting
  `exclude-use-default: false`. Those are too generous and don't match
  our style conventions. (I have re-added some of the excludes
  explicitly in this commit. See below.)

- Re-add the `errcheck` exclusion we have used so far via the
  defaults.

- Exclude the signature requirement `govet` has for `Seek` methods
  because we use non-standard `Seek` methods a lot. (But we keep other
  requirements, while the default excludes completely disabled the
  check for common method segnatures.)

- Exclude warnings about missing doc comments on exported symbols. (We
  used to be pretty adamant about doc comments, but stopped that at
  some point in the past. By now, we have about 500 missing doc
  comments. We may consider reintroducing this check, but that's
  outside of the scope of this commit. The default excludes of
  golangci-lint essentially ignore doc comments completely.)

- By stop using the default excludes, we now get warnings back on
  malformed doc comments. That's the most impactful change in this
  commit. It does not enforce doc comments (again), but _if_ there is
  a doc comment, it has to have the recommended form. (Most of the
  changes in this commit are fixing this form.)

- Improve wording/spelling of some comments in .golangci.yml, and
  remove an outdated comment.

- Leave `package-comments` inactive, but add a TODO asking if we
  should change that.

- Add a new sub-linter `comment-spacings` (and fix corresponding
  comments), which avoids missing spaces after the leading `//`.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-08-22 17:36:11 +02:00
Bryan Boreham
d84282b105 Labels: use single byte as separator - small speedup
Since `seps` is a variable, `seps[0]` has to be bounds-checked every
time. Replacing with a constant everywhere it is used skips this
overhead.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-07-15 09:47:16 +01:00
Bryan Boreham
82a8c6abe2
[ENHANCEMENT] Optimize regexps with multiple prefixes (#13843)
For example `foo.*|bar.*|baz.*`. Instead of checking each one in turn,
we build a map of prefixes, then check the smaller set that could match
the string supplied.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Improve testing and readability

Address review comments on #13843

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-07-03 18:45:36 +01:00
Marco Pracucci
ec31acaf02
FastRegexMatcher: small optimization for the literal prefix case
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-07-01 10:12:50 +02:00
Bryan Boreham
7a82e4b503 Labels benchmarks: remove artefact of small symbol-tables
Symbol tables with fewer than 128 entries, so everything can be
represented as a single byte, are not realistic.

Stuff the symbol table with fake entries before adding the real ones.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-06-21 16:49:10 +01:00
Bryan Boreham
2ba7bc9446 Labels: further optimisation for dedupelabels
Inline (by copy-paste) the fast path of `decodeVarint` in various
places where it gets called a lot.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-06-21 16:46:13 +01:00
Bryan Boreham
2ced2f6aec [PERF] Labels: faster varint for dedupelabels
Including tests.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-06-21 11:57:09 +01:00
Bryan Boreham
84602bbace
Merge branch 'main' into fix-matcher-string-with-empty-label-name 2024-06-19 05:56:25 -04:00
Oleg Zaytsev
4f78cc809c
Refactor toNormalisedLower: shorter and slightly faster. (#14299)
Refactor toNormalisedLower: shorter and slightly faster

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-18 09:57:37 +00:00
Oleg Zaytsev
03cf6141d4
Fix Matcher.String() with empty label name
When the label name is empty, which can happen now with quoted label
name, it should be quoted when printed as a string again.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-13 18:46:35 +02:00
Ranveer Avhad
39902ba694
[BUGFIX] FastRegexpMatcher: do Unicode normalization as part of case-insensitive comparison (#14170)
* Converted string to standarized form
* Added golang.org/x/text in Go dependencies
* Added test cases for FastRegexMatcher
* Added benchmark for toNormalizedLower

Signed-off-by: RA <ranveeravhad777@gmail.com>
2024-06-10 18:31:41 -04:00
Marco Pracucci
d966ae6400
Optimize containsInOrder() inlining it
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-06-04 10:34:15 +02:00
Marco Pracucci
a0807733be
Improved tests
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-06-04 10:34:15 +02:00
Marco Pracucci
78fdd2188d
Improve contains check done by FastRegexMatcher
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-06-04 10:34:15 +02:00
Bryan Boreham
1e0b0e250a
Merge pull request #14090 from colega/improve-zeroOrOneCharacterStringMatcher-Matches
Improve `zeroOrOneCharacterStringMatcher` by using `utf8.DecodeRuneInString`
2024-05-16 09:28:53 +01:00
Oleksandr Redko
f10c3454e9 Enable perfsprint linter and fix up code
Signed-off-by: Oleksandr Redko <oleksandr.red+github@gmail.com>
2024-05-15 17:51:05 +03:00
Oleg Zaytsev
8b4c9459a2
Check utf8.RuneError result
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-13 17:44:26 +02:00
Oleg Zaytsev
dbe88fae22
Add invalid utf8 test cases to regexp
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-13 17:05:31 +02:00
Oleg Zaytsev
bcff5059e6
Use utf8.DecodeRuneInString(s)
This replaces the custom `moreThanOneRune` function with the standard
`utf8.DecodeRuneInString(s)` that can be used to figure out the size of
the first rune.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-13 15:41:00 +02:00
Oleg Zaytsev
fdfc6d4725
Benchmark zeroOrOneCharacterStringMatcher.Matches
This adds some more test cases for unicode values, and also a benchmark
for zeroOrOneCharacterStringMatcher.Matches()

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-13 15:36:55 +02:00
Oleg Zaytsev
b7b4355807
Use bytes.Buffer from stack buf in Matcher.String()
Also removed the growing until there's a benchmark for that.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-09 10:00:24 +02:00
Oleg Zaytsev
6ebda5a7bc
Optimize Matcher.String()
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-08 17:05:27 +02:00
Oleg Zaytsev
dabd789fd5
Quote label name in matchers when needed
When the label name of a matcher contains non-standard characters, like
a dot, or starts with a digit, it should be quoted.

If it's not quoted, then `VectorSelector.String()` isn't a valid PromQL.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-08 16:58:51 +02:00
Oleg Zaytsev
2524a91591
Fix FastRegexMatcher matching multibyte runes with . (#14059)
When `zeroOrOneCharacterStringMatcher` wach checking the input string,
it assumed that if there are more than one bytes, then there are more
than one runes, but that's not necessarily true.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-07 16:33:37 +02:00
Matthieu MOREL
d496687c8e golangci-lint: enable usestdlibvars linter
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-04-08 19:26:23 +00:00
Bryan Boreham
7c28521451 [TESTS] Truncate some long test names, for readability
The strings produced by these tests can run to thousands of characters,
which makes test logs difficult to read.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-03 10:10:39 +01:00
carehabit
a672662073
all: fix some typos (#13863)
Signed-off-by: carehabit <shenyuting@outlook.com>
2024-04-01 18:06:05 +02:00
Domantas
435f330d0b
[BUGFIX] labels: don't modify original labels in DropMetricName (#13845)
Restrict the capacity of first argument to `append()` to force an allocation.
This is for the slice implementation only.

Signed-off-by: Domantas Jadenkus <djadenkus@gmail.com>
2024-03-27 10:35:17 +00:00
Bryan Boreham
48786ad4e8 Use slices insteda of exp/slices
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-25 12:20:18 +00:00
Bryan Boreham
080d440bf8 Merge remote-tracking branch 'origin/main' into pr/13461 2024-03-25 12:14:26 +00:00
Oleg Zaytsev
d12e785075
Improve readability
As suggested by @bboreham

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-03-18 11:16:09 +01:00
Oleg Zaytsev
9699598952
Improve Labels.Compare performance w/stringlabels
I was bored on a train and I spent some amount of time trying to scratch
some nanoseconds off the Labels.Compare when running with stringlabels.

I would be ashamed to admit the real amount of time I spent on it.

The worst thing is, I can't really explain why this is performing so
much better, and someone should re-run the benchmarks on their machine
to confirm that it's not something related to general relativity because
the train is moving. I also added some extra real-life benchmark cases
with longer labelsets (these aren't the longest we have in production,
but kubernetes labelsets are fairly common in Prometheus so I thought it
would be nice to have them).

My benchmarks show this diff:

goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/model/labels
                                       │     old     │                 new                 │
                                       │   sec/op    │   sec/op     vs base                │
Labels_Compare/equal                     5.898n ± 0%   5.875n ± 1%   -0.40% (p=0.037 n=10)
Labels_Compare/not_equal                 11.78n ± 2%   11.01n ± 1%   -6.54% (p=0.000 n=10)
Labels_Compare/different_sizes           4.959n ± 1%   4.906n ± 2%   -1.05% (p=0.050 n=10)
Labels_Compare/lots                      21.32n ± 0%   17.54n ± 5%  -17.75% (p=0.000 n=10)
Labels_Compare/real_long_equal           15.06n ± 1%   14.92n ± 0%   -0.93% (p=0.000 n=10)
Labels_Compare/real_long_different_end   25.20n ± 0%   24.43n ± 0%   -3.04% (p=0.000 n=10)
geomean                                  11.86n        11.25n        -5.16%

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-03-17 17:08:06 +01:00
Bryan Boreham
0bb5588386
labels: optimize String method (#13673)
Use a stack buffer to reduce memory allocations.

`Write(AppendQuote(AvailableBuffer` does not allocate or copy when
the buffer has sufficient space.

Also add a benchmark, with some refactoring.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-03-12 11:34:03 +00:00
machine424
f477e0539a
Move from golang.org/x/exp/slices into slices now that we only support Go >= 1.21
Prevent adding back golang.org/x/exp/slices.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2024-02-28 14:54:53 +01:00
Bryan Boreham
e1a741a0d7 labels: update copyright dates
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-26 11:45:25 +00:00
Bryan Boreham
55e7de04f8 model/labels (stringlabels): use strings.Clone
Suggestion from @colega.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-26 11:45:25 +00:00
Bryan Boreham
d16ce3c9bd model/labels (dedupelabels): small clarifications
Suggestion from @colega.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-26 11:45:25 +00:00
Bryan Boreham
b39286fd1f Add dedupelabels tag to not build regular labels
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-23 13:50:27 +00:00
Bryan Boreham
d51a5344cd labels: new version de-duplicating strings in SymbolTables
The individual strings for label names and values are held in a table,
and each Labels value is a run of varint-encoded indexes into that table.

When creating new labels, a sync.Mutex is locked around reads and writes.
When reading labels, there is no locking because the table of strings
used by those labels is immutable.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-23 13:50:27 +00:00
Bryan Boreham
28191109a8 Labels: add fake versions of SymbolTable apis
So we can use them where necessary for internlabels implementation.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-23 13:50:27 +00:00
Bryan Boreham
d1af84f6ee Labels: move Builder and Reset out of common
New internstrings implementation is different.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-23 13:50:27 +00:00
Bryan Boreham
5aa4473894 labels tests: extend TestBuilder
Start with empty base labels, also check new and re-used symbol tables

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-23 13:50:27 +00:00
Bryan Boreham
bb82a57e64 Labels: Call NewScratchBuilder in test_utils
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-23 13:50:27 +00:00
Bryan Boreham
cc5dc6a61b labels: use Equal instead of DeepEqual
This will work better with a different data structure.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-02-23 13:50:27 +00:00
beorn7
553d92affd model/labels: Fix new lint warning in test
Signed-off-by: beorn7 <beorn@grafana.com>
2024-02-07 18:12:26 +01:00