diff --git a/README.md b/README.md
index 772b9c2f..6ca81e64 100644
--- a/README.md
+++ b/README.md
@@ -68,8 +68,25 @@ zfs | Exposes [ZFS](http://open-zfs.org/) performance statistics. | [Linux](http
 
 ### Disabled by default
 
-The perf collector may not work by default on all Linux systems due to kernel
-configuration and security settings. To allow access, set the following sysctl
+`node_exporter` also implements a number of collectors that are disabled by default. Reasons for this vary by
+collector, and may include:
+* High cardinality
+* Prolonged runtime that exceeds Prometheus' `scrape_interval` or `scrape_timeout`
+* Significant resource demands on the host
+
+You can enable additional collectors as desired by adding them to your
+init system's or service supervisor's startup configuration for
+`node_exporter`, but caution is advised. Enable at most one at a time,
+testing first on a non-production system, then by hand on a single
+production node. When enabling additional collectors, you should
+carefully monitor the change by observing the
+`scrape_duration_seconds` metric to ensure that collection completes
+and does not time out. In addition, monitor the
+`scrape_samples_post_metric_relabeling` metric to see the changes in
+cardinality.
+
+The `perf` collector may not work out of the box on some Linux systems due to kernel
+configuration and security settings. To allow access, set the following `sysctl`
 parameter:
 
 ```
@@ -85,7 +102,7 @@ Depending on the configured value different metrics will be available, for most
 cases `0` will provide the most complete set. For more information see [`man 2
 perf_event_open`](http://man7.org/linux/man-pages/man2/perf_event_open.2.html).
 
-By default, the perf collector will only collect metrics of the CPUs that
+By default, the `perf` collector will only collect metrics of the CPUs that
 `node_exporter` is running on (ie
 [`runtime.NumCPU`](https://golang.org/pkg/runtime/#NumCPU). If this is
 insufficient (e.g. if you run `node_exporter` with its CPU affinity set to
@@ -96,7 +113,7 @@ configuration is zero indexed and can also take a stride value; e.g.
 `--collector.perf --collector.perf.cpus=1-10:5` would collect on CPUs
 1, 5, and 10.
 
-The perf collector is also able to collect
+The `perf` collector is also able to collect
 [tracepoint](https://www.kernel.org/doc/html/latest/core-api/tracepoint.html)
 counts when using the `--collector.perf.tracepoint` flag. Tracepoints can be
 found using [`perf list`](http://man7.org/linux/man-pages/man1/perf.1.html) or
@@ -126,13 +143,13 @@ perf | Exposes perf based metrics (Warning: Metrics are dependent on kernel conf
 
 ### Textfile Collector
 
-The textfile collector is similar to the [Pushgateway](https://github.com/prometheus/pushgateway),
+The `textfile` collector is similar to the [Pushgateway](https://github.com/prometheus/pushgateway),
 in that it allows exporting of statistics from batch jobs. It can also be used
 to export static metrics, such as what role a machine has. The Pushgateway
-should be used for service-level metrics. The textfile module is for metrics
+should be used for service-level metrics. The `textfile` module is for metrics
 that are tied to a machine.
 
-To use it, set the `--collector.textfile.directory` flag on the Node exporter. The
+To use it, set the `--collector.textfile.directory` flag on the `node_exporter` command line. The
 collector will parse all files in that directory matching the glob `*.prom`
 using the [text format](http://prometheus.io/docs/instrumenting/exposition_formats/). **Note:** Timestamps are not supported.
 
@@ -203,6 +220,7 @@ The `node_exporter` is designed to monitor the host system. It's not recommended
 to deploy it as a Docker container because it requires access to the host system.
 Be aware that any non-root mount points you want to monitor will need to be bind-mounted
 into the container.
+
 If you start container for host monitoring, specify `path.rootfs` argument.
 This argument must match path in bind-mount of host root. The node\_exporter will use
 `path.rootfs` as prefix to access host filesystem.
diff --git a/docs/TIME.md b/docs/TIME.md
index 18773e0b..340c72d6 100644
--- a/docs/TIME.md
+++ b/docs/TIME.md
@@ -2,15 +2,15 @@
 
 ## `ntp` collector
 
-This collector is intended for usage with local NTPD like [ntp.org](http://ntp.org/), [chrony](https://chrony.tuxfamily.org/comparison.html) or [OpenNTPD](http://www.openntpd.org/).
+This collector is intended for usage with local NTP daemons including [ntp.org](http://ntp.org/), [chrony](https://chrony.tuxfamily.org/comparison.html), and [OpenNTPD](http://www.openntpd.org/).
 
-Note, some chrony packages have `local stratum 10` configuration value making chrony a valid server when it is unsynchronised. This configuration makes one of `node_ntp_sanity` heuristics unreliable.
+Note, some chrony packages have `local stratum 10` configuration value making chrony a valid server when it is unsynchronised. This configuration makes one of the heuristics that derive `node_ntp_sanity` unreliable.
 
-Note, OpenNTPD does not listen for SNTP queries by default, you should add `listen on 127.0.0.1` configuration line to use this collector with OpenNTPD.
+Note, OpenNTPD does not listen for SNTP queries by default. Add `listen on 127.0.0.1` to the OpenNTPD configuration when using this collector with that package.
 
 ### `node_ntp_stratum`
 
-This metric shows [stratum](https://en.wikipedia.org/wiki/Network_Time_Protocol#Clock_strata) of local NTPD.
+This metric shows the [stratum](https://en.wikipedia.org/wiki/Network_Time_Protocol#Clock_strata) of the local NTP daemon.
 
 Stratum `16` means that clock are unsynchronised. See also aforementioned note about default local stratum in chrony.
 