Add NodeSystemSaturation and NodeMemoryMajorPagesFaults
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
parent e15e7d6a7b
commit 580c497261
@@ -323,6 +323,41 @@
             description: 'CPU usage at {{ $labels.instance }} has been above 80% for the last 15 minutes, is currently at {{ printf "%.2f" $value }}%.',
           },
         },
+        {
+          alert: 'NodeSystemSaturation',
+          expr: |||
+            node_load1{%(nodeExporterSelector)s}
+            / count without (cpu, mode) (node_cpu_seconds_total{%(nodeExporterSelector)s, mode="idle"}) > 2
+          ||| % $._config,
+          'for': '15m',
+          labels: {
+            severity: 'critical',
+          },
+          annotations: {
+            summary: 'System saturated, load per core is very high.',
+            description: |||
+              System load per core at {{ $labels.instance }} has been above 2 for the last 15 minutes, is currently at {{ printf "%.2f" $value }}.
+              This might indicate this instance resources saturation and can cause it becoming unresponsive.
+            |||,
+          },
+        },
+        {
+          alert: 'NodeMemoryMajorPagesFaults',
+          expr: |||
+            rate(node_vmstat_pgmajfault{%(nodeExporterSelector)s}[5m]) > 500
+          ||| % $._config,
+          'for': '15m',
+          labels: {
+            severity: 'warning',
+          },
+          annotations: {
+            summary: 'Memory major page faults are occurring at very high rate.',
+            description: |||
+              Memory major pages are occurring at very high rate at {{ $labels.instance }}, 500 major page faults per second for the last 15 minutes, is currently at {{ printf "%.2f" $value }}.
+              Please check that there is enough memory available at this instance.
+            |||,
+          },
+        },
         {
           alert: 'NodeMemoryHighUtilization',
           expr: |||
@@ -352,7 +387,7 @@
             summary: 'Disk IO queue is high.',
             description: |||
               Disk IO queue (aqu-sq) is high on {{ $labels.device }} at {{ $labels.instance }}, has been above 10 for the last 15 minutes, is currently at {{ printf "%.2f" $value }}.
-              This symptom might indicate disk saturation.,
+              This symptom might indicate disk saturation.
             |||,
           },
         },
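Note on the templated expressions: both new rules format their `expr` text block with `% $._config`, so each `%(nodeExporterSelector)s` placeholder is replaced by the configured selector before the rule reaches Prometheus. A minimal sketch of that substitution, assuming a hypothetical selector value `job="node"` (the real value of `$._config.nodeExporterSelector` is not shown in this diff):

```jsonnet
// Hypothetical stand-in for $._config; the actual selector is defined elsewhere in the repository.
local config = {
  nodeExporterSelector: 'job="node"',
};

{
  // Same expression as the NodeSystemSaturation rule, formatted with the stand-in config.
  // The % operator fills every %(nodeExporterSelector)s placeholder from the object on its right.
  expr: |||
    node_load1{%(nodeExporterSelector)s}
    / count without (cpu, mode) (node_cpu_seconds_total{%(nodeExporterSelector)s, mode="idle"}) > 2
  ||| % config,
}
```

Evaluating this with `jsonnet` should produce an `expr` string with `job="node"` in both matchers. Since `count without (cpu, mode)` over the idle series yields the number of CPUs per instance, the ratio is the load per core: for example, a 1-minute load of 10 on a 4-core host gives 2.5, which is above the threshold of 2.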