* Unify CPU collector conventions
Add a common CPU metric description.
* All collectors use the same `nodeCpuSecondsDesc`.
* All collectors drop the `cpu` prefix for `cpu` label values.
* Fix subsystem string in cpu_freebsd.
* Fix Linux CPU freq label names.
* Improve stat linux metric names.
cpu is no longer used.
* node_cpu -> node_cpu_seconds_total for Linux
* Improve filesystem metric names with units
* Improve units and names of linux disk stats
Remove sector metrics, the bytes metrics cover those already.
* Infiniband counters should end in _total
* Improve timex metric names, convert to more normal units.
See
3c073991eb/kernel/time/ntp.c (L909)
for what stabil means, looks like a moving average of some form.
* Update test fixture
* For meminfo metrics that had "kB" units, add _bytes
* Interrupts counter should have _total
Linux "guest" metrics for VMs are already accounted for in node_cpu
`user` and `nice` metrics. Separate these into their own metric to
avoid duplication of data.
* cpu: Support processor-less (memory-only) NUMA nodes
Processor-less (memory-only) NUMA nodes exist e.g. in systems that use
Intel Optane drives for RAM expansion using Intel Memory Drive
Technology (IMDT).
IMDT RAM expansion supports two modes:
* "Unify Remote Memory domains": present a processor-less (memory-only)
NUMA domain, which is the default
* "Expand local memory domains": to expand each processor’s memory domain
with a portion of the memory made available by Optane and IMDT
This commit fixes a crash in the first case (when "cpulist" is empty).
Here's an example of such a system:
$ numastat -m|head -n5
Per-node system memory usage (in MBs):
Node 0 Node 1 Node 2 Total
--------------- --------------- --------------- ---------------
MemTotal 118239.56 130816.00 464384.00 713439.56
$ for i in {0..2}; do echo -n "$i: " ; cat /sys/bus/node/devices/node$i/cpulist ; done
0: 0-7,16-23
1: 8-15,24-31
2:
$ /opt/vsmp/bin/vsmpversion -vvv
Memory Drive Technology: 8.2.1455.74 (Sep 28 2017 13:09:59)
System configuration:
Boards: 3
1 x Proc. + I/O + Memory
2 x NVM devices (Intel SSDPED1K375GAQ)
Processors: 2, Cores: 16, Threads: 32
Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz Stepping 01
Memory (MB): 713472 (of 977450), Cache: 251416, Private: 12562
1 x 249088MB [262036/ 678/12270]
1 x 232192MB [357707/125369/ 146] 82:00.0#1
1 x 232192MB [357707/125369/ 146] 83:00.0#1
* cpu: rename some variables (pkg => node)
* cpu: Use %v not %q in log.Debugf() format strings
* Move NodeCollector into package collector
* Refactor collector enabling
* Update README with new collector enabled flags
* Fix out-of-date inline flag reference syntax
* Use new flags in end-to-end tests
* Add flag to disable all default collectors
* Track if a flag has been set explicitly
* Add --collectors.disable-defaults to README
* Revert disable-defaults flag
* Shorten flags
* Fixup timex collector registration
* Fix end-to-end tests
* Change procfs and sysfs path flags
* Fix review comments
* cpu: Metric 'package_throttles_total' is per package.
'package_throttles_total' is per package, not per cpu. This also reduces
the total number of cpu time series a lot (esp for multi core cpus).
* cpu: Better handling of a cpulist edge-case.
* cpu: Extract the package number from the directory name.
Do not rely on the range index.
* cpu: Add package_throttle_count for node0 cpu1
This file must be ignored by the cpu collector.