Support UTF-8 label matchers: Update the docs on how to use UTF-8 in label matchers and parse mode feature flags (#3572)

* Update the docs on how to use UTF-8 in label matchers and parse mode feature flags

Signed-off-by: George Robinson <george.robinson@grafana.com>
---------

Signed-off-by: George Robinson <george.robinson@grafana.com>
This commit is contained in:
George Robinson 2024-02-14 09:20:42 +00:00 committed by GitHub
parent 4d6ddd25c9
commit c2cf3db045
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
1 changed files with 192 additions and 24 deletions

View File

@ -421,27 +421,146 @@ source_matchers:
## Label matchers
Label matchers are used both in routes and inhibition rules to match certain alerts.
Label matchers match alerts to routes, silences, and inhibition rules.
**Important**: Prometheus is adding support for UTF-8 in labels and metrics. In order to also support UTF-8 in the Alertmanager, Alertmanager versions 0.27 and later have a new parser for matchers that has a number of backwards incompatible changes. While most matchers will be forward-compatible, some will not. Alertmanager is operating a transition period where it supports both UTF-8 and classic matchers, and has provided a number of tools to help you prepare for the transition.
If this is a new Alertmanager installation, we recommend enabling UTF-8 strict mode before creating an Alertmanager configuration file. You can find instructions on how to enable UTF-8 strict mode [here](#utf-8-strict-mode).
If this is an existing Alertmanager installation, we recommend running the Alertmanager in the default mode called fallback mode before enabling UTF-8 strict mode. In this mode, Alertmanager will log a warning if you need to make any changes to your configuration file before UTF-8 strict mode can be enabled. Alertmanager will make UTF-8 strict mode the default in the next two versions, so it's important to transition as soon as possible.
Irrespective of whether an Alertmanager installation is a new or existing installation, you can also use `amtool` to validate that an Alertmanager configuration file is compatible with UTF-8 strict mode before enabling it in Alertmanager server. You do not need a running Alertmanager server to do this. You can find instructions on how to validate an Alertmanager configuration file using `amtool` [here](#verification).
### Alertmanager server operational modes
During the transition period, Alertmanager supports three modes of operation. These are known as fallback mode, UTF-8 strict mode and classic mode. Fallback mode is the default mode.
Operators of Alertmanager servers should transition to UTF-8 strict mode before the end of the transition period. Alertmanager will make UTF-8 strict mode the default in the next two versions, so it's important to transition as soon as possible.
#### Fallback mode
Alertmanager runs in a special mode called fallback mode as its default mode. As operators, you should not experience any difference in how your routes, silences or inhibition rules work.
In fallback mode, configurations are first parsed as UTF-8 matchers, and if incompatible with the UTF-8 parser, are then parsed as classic matchers. If your Alertmanager configuration contains matchers that are incompatible with the UTF-8 parser, Alertmanager will parse them as classic matchers and log a warning. This warning also includes a suggestion on how to change the matchers from classic matchers to UTF-8 matchers. For example:
> ts=2024-02-11T10:00:00Z caller=parse.go:176 level=warn msg="Alertmanager is moving to a new parser for labels and matchers, and this input is incompatible. Alertmanager has instead parsed the input using the classic matchers parser as a fallback. To make this input compatible with the UTF-8 matchers parser please make sure all regular expressions and values are double-quoted. If you are still seeing this message please open an issue." input="foo=" origin=config err="end of input: expected label value" suggestion="foo=\"\""
Here the matcher `foo=` can be made into a valid UTF-8 matcher by double quoting the right hand side of the expression to give `foo=""`. These two matchers are equivalent, however with UTF-8 matchers the right hand side of the matcher is a required field.
In rare cases, a configuration can cause disagreement between the UTF-8 and classic parser. This happens when a matcher is valid in both parsers, but due to added support for UTF-8, results in different parsings depending on which parser is used. If your Alertmanager configuration has disagreement, Alertmanager will use the classic parser and log a warning. For example:
> ts=2024-02-11T10:00:00Z caller=parse.go:183 level=warn msg="Matchers input has disagreement" input="qux=\"\\xf0\\x9f\\x99\\x82\"\n" origin=config
Any occurrences of disagreement should be looked at on a case by case basis as depending on the nature of the disagreement, the configuration might not need updating before enabling UTF-8 strict mode. For example `\xf0\x9f\x99\x82` is the byte sequence for the 🙂 emoji. If the intention is to match a literal 🙂 emoji then no change is required. However, if the intention is to match the literal `\xf0\x9f\x99\x82` then the matcher should be changed to `qux="\\xf0\\x9f\\x99\\x82"`.
#### UTF-8 strict mode
In UTF-8 strict mode, Alertmanager disables support for classic matchers:
> alertmanager --config.file=config.yml --enable-feature="utf8-strict-mode"
This mode should be enabled for new Alertmanager installations, and existing Alertmanager installations once all warnings of incompatible matchers have been resolved. Alertmanager will not start in UTF-8 strict mode until all the warnings of incompatible matchers have been resolved:
> ts=2024-02-11T10:00:00Z caller=coordinator.go:118 level=error component=configuration msg="Loading configuration file failed" file=config.yml err="end of input: expected label value"
UTF-8 strict mode will be the default mode of Alertmanager at the end of the transition period.
#### Classic mode
Classic mode is equivalent to Alertmanager versions 0.26.0 and older:
```
alertmanager --config.file=config.yml --enable-feature="classic-mode"
```
You can use this mode if you suspect there is an issue with fallback mode or UTF-8 strict mode. In such cases, please open an issue on GitHub with as much information as possible.
### Verification
You can use `amtool` to validate that an Alertmanager configuration file is compatible with UTF-8 strict mode before enabling it in Alertmanager server. You do not need a running Alertmanager server to do this.
Just like Alertmanager server, `amtool` will log a warning if the configuration is incompatible or contains disagreement:
```
amtool check-config config.yml
Checking 'config.yml'
level=warn msg="Alertmanager is moving to a new parser for labels and matchers, and this input is incompatible. Alertmanager has instead parsed the input using the classic matchers parser as a fallback. To make this input compatible with the UTF-8 matchers parser please make sure all regular expressions and values are double-quoted. If you are still seeing this message please open an issue." input="foo=" origin=config err="end of input: expected label value" suggestion="foo=\"\""
level=warn msg="Matchers input has disagreement" input="qux=\"\\xf0\\x9f\\x99\\x82\"\n" origin=config
SUCCESS
Found:
- global config
- route
- 2 inhibit rules
- 2 receivers
- 0 templates
```
You will know if a configuration is compatible with UTF-8 strict mode when no warnings are logged in `amtool`:
```
amtool check-config config.yml
Checking 'config.yml' SUCCESS
Found:
- global config
- route
- 2 inhibit rules
- 2 receivers
- 0 templates
```
You can also use `amtool` in UTF-8 strict mode as an additional level of verification. You will know that a configuration is invalid because the command will fail:
```
amtool check-config config.yml --enable-feature="utf8-strict-mode"
level=warn msg="UTF-8 mode enabled"
Checking 'config.yml' FAILED: end of input: expected label value
amtool: error: failed to validate 1 file(s)
```
You will know that a configuration is valid because the command will succeed:
```
amtool check-config config.yml --enable-feature="utf8-strict-mode"
level=warn msg="UTF-8 mode enabled"
Checking 'config.yml' SUCCESS
Found:
- global config
- route
- 2 inhibit rules
- 2 receivers
- 0 templates
```
### `<matcher>`
A matcher is a string with a syntax inspired by PromQL and OpenMetrics. The syntax of a matcher consists of three tokens:
#### UTF-8 matchers
A UTF-8 matcher consists of three tokens:
- An unquoted literal or a double-quoted string for the label name.
- One of `=`, `!=`, `=~`, or `!~`. `=` means equals, `!=` means not equal, `=~` means matches the regular expression and `!~` means doesn't match the regular expression.
- An unquoted literal or a double-quoted string for the regular expression or label value.
Unquoted literals can contain all UTF-8 characters other than the reserved characters. These are whitespace, and all characters in ``` { } ! = ~ , \ " ' ` ```. For example, `foo`, `[a-zA-Z]+`, and `Προμηθεύς` (Prometheus in Greek) are all examples of valid unquoted literals. However, `foo!` is not a valid literal as `!` is a reserved character.
Double-quoted strings can contain all UTF-8 characters. Unlike unquoted literals, there are no reserved characters. You can even use UTF-8 code points. For example, `"foo!"`, `"bar,baz"`, `"\"baz qux\""` and `"\xf0\x9f\x99\x82"` are valid double-quoted strings.
#### Classic matchers
A classic matcher is a string with a syntax inspired by PromQL and OpenMetrics. The syntax of a classic matcher consists of three tokens:
- A valid Prometheus label name.
- One of `=`, `!=`, `=~`, or `!~`. `=` means equals, `!=` means that the strings are not equal, `=~` is used for equality of regex expressions and `!~` is used for un-equality of regex expressions. They have the same meaning as known from PromQL selectors.
- A UTF-8 string, which may be enclosed in double quotes. Before or after each token, there may be any amount of whitespace.
The 3rd token may be the empty string. Within the 3rd token, OpenMetrics escaping rules apply: `\"` for a double-quote, `\n` for a line feed, `\\` for a literal backslash. Unescaped `"` must not occur inside the 3rd token (only as the 1st or last character). However, literal line feed characters are tolerated, as are single `\` characters not followed by `\`, `n`, or `"`. They act as a literal backslash in that case.
Matchers are ANDed together, meaning that all matchers must evaluate to "true" when tested against the labels on a given alert. For example, an alert with these labels:
#### Composition of matchers
```json
{"alertname":"Watchdog","severity":"none"}
```
You can compose matchers to create complex match expressions. When composed, all matchers must match for the entire expression to match. For example, the expression `{alertname="Watchdog", severity=~"warning|critical"}` will match an alert with labels `alertname=Watchdog, severity=critical` but not an alert with labels `alertname=Watchdog, severity=none` as while the alertname is Watchdog the severity is neither warning nor critical.
would NOT match this list of matchers:
You can compose matchers into expressions with a YAML list:
```yaml
matchers:
@ -449,11 +568,60 @@ matchers:
- severity =~ "warning|critical"
```
In the configuration, multiple matchers are combined in a YAML list. However, it is also possible to combine multiple matchers within a single YAML string, again using syntax inspired by PromQL. In such a string, a leading `{` and/or a trailing `}` is optional and will be trimmed before further parsing. Individual matchers are separated by commas outside of quoted parts of the string. Those commas may be surrounded by whitespace. Parts of the string inside unescaped double quotes `"…"` are considered quoted (and commas don't act as separators there). If double quotes are escaped with a single backslash `\`, they are ignored for the purpose of identifying quoted parts of the input string. If the input string, after trimming the optional trailing `}`, ends with a comma, followed by optional whitespace, this comma and whitespace will be trimmed.
or as a PromQL inspired expression where each matcher is comma separated:
Here are some examples of valid string matchers:
```
{alertname="Watchdog", severity=~"warning|critical"}
```
1. Shown below are two equality matchers combined in a long form YAML list.
A single trailing comma is permitted:
```
{alertname="Watchdog", severity=~"warning|critical",}
```
The open `{` and close `}` brace are optional:
```
alertname="Watchdog", severity=~"warning|critical"
```
However, both must be either present or omitted. You cannot have incomplete open or close braces:
```
{alertname="Watchdog", severity=~"warning|critical"
```
```
alertname="Watchdog", severity=~"warning|critical"}
```
You cannot have duplicate open or close braces either:
```
{{alertname="Watchdog", severity=~"warning|critical",}}
```
Whitespace (spaces, tabs and newlines) is permitted outside double quotes and has no effect on the matchers themselves. For example:
```
{
alertname = "Watchdog",
severity =~ "warning|critical",
}
```
is equivalent to:
```
{alertname="Watchdog",severity=~"warning|critical"}
```
#### More examples
Here are some more examples:
1. Two equals matchers composed as a YAML list:
```yaml
matchers:
@ -461,25 +629,25 @@ Here are some examples of valid string matchers:
- dings != bums
```
2. Similar to example 1, shown below are two equality matchers combined in a short form YAML list.
2. Two matchers combined composed as a short-form YAML list:
```yaml
matchers: [ foo = bar, dings != bums ]
```
As shown below, in the short-form, it's generally better to quote the list elements to avoid problems with special characters like commas:
As shown below, in the short-form, it's better to use double quotes to avoid problems with special characters like commas:
```yaml
matchers: [ "foo = \"bar,baz\"", "dings != bums" ]
```
3. You can also put both matchers into one PromQL-like string. Single quotes for the whole string work best here.
3. You can also put both matchers into one PromQL-like string. Single quotes work best here:
```yaml
matchers: [ '{foo="bar", dings!="bums"}' ]
```
4. To avoid any confusion about YAML string quoting and escaping, you can use YAML block quoting and then only worry about the OpenMetrics escaping inside the block. A complex example with a regular expression and different quotes inside the label value is shown below:
4. To avoid issues with escaping and quoting rules in YAML, you can also use a YAML block:
```yaml
matchers: