* Introduce histogram support
Prior to this change, the custom queries were restricted to counters and
gauges.
This change introduces a new ColumnUsage, namely HISTOGRAM, that expects
the column to contain an array of upper inclusive bounds for each
observation bucket in the emitted metric. It also expects three more
columns to be present with the suffixes:
- `_bucket`, containing an array of cumulative counters for the
observation buckets;
- `_sum`, the total sum of all observed values; and
- `_count`, the count of events that have been observed.
A flag has been added to the MetricMap struct to easily identify metrics
that should emit a histogram and the construction of a histogram metric
is aided by the pg.Array function and a new helper dbToUint64 function.
Finally, an example of usage is given in queries.yaml.
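For illustration, a histogram metric could be assembled from those four
column values roughly as follows (the helper below is a sketch, not the
exporter's actual code):

```go
package exporter

import "github.com/prometheus/client_golang/prometheus"

// histogramFromColumns is a hypothetical helper showing how the bounds,
// bucket counts, sum and count columns could be combined into a single
// Prometheus histogram metric.
func histogramFromColumns(desc *prometheus.Desc, bounds []float64, bucketCounts []uint64, sum float64, count uint64, labels ...string) prometheus.Metric {
	// Map each upper-inclusive bound to its cumulative bucket counter.
	buckets := make(map[float64]uint64, len(bounds))
	for i, upperBound := range bounds {
		buckets[upperBound] = bucketCounts[i]
	}
	return prometheus.MustNewConstHistogram(desc, count, sum, buckets, labels...)
}
```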
Fixes #402
Signed-off-by: Corin Lawson <corin@responsight.com>
* Introduces tests for histogram support
Prior to this change, the histogram support was untested.
This change introduces a new integration test that reads a user query
containing a number of histogram metrics. Also, additional checks have
been added to TestBooleanConversionToValueAndString to test dbToUint64.
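Those checks exercise conversions along the lines of the sketch below (an
approximation of what such a helper might look like, not the actual
implementation):

```go
package exporter

import "strconv"

// dbToUint64 sketches converting the value types commonly returned by
// database/sql drivers into a uint64 (the type Prometheus histograms use
// for bucket and event counts). The exact cases handled by the real
// helper may differ.
func dbToUint64(t interface{}) (uint64, bool) {
	switch v := t.(type) {
	case uint64:
		return v, true
	case int64:
		return uint64(v), true
	case float64:
		return uint64(v), true
	case []byte:
		n, err := strconv.ParseUint(string(v), 10, 64)
		return n, err == nil
	case string:
		n, err := strconv.ParseUint(v, 10, 64)
		return n, err == nil
	case nil:
		return 0, true
	default:
		return 0, false
	}
}
```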
Signed-off-by: Corin Lawson <corin@responsight.com>
Update query for pg_stat_user_tables:
* Split up to multi-line format to make it easier to read.
* Remove duplicated column `COALESCE(last_vacuum, '1970-01-01Z')`.
Signed-off-by: Ben Kochie <superq@gmail.com>
* do not panic when envs are set incorrectly
* do not panic when envs are set incorrectly - fix tests
Co-authored-by: Will Rouesnel <wrouesnel@wrouesnel.com>
The existing 'pg_stat_replication' data does not
include stats for inactive replication slots. This
commit adds a minimal set of metrics from
'pg_replication_slots' to show whether a slot is
active and what its lag is.
This is helpful for detecting whether an inactive
slot is causing the server to run out of storage
by forcing WAL to be retained.
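For illustration, a query of roughly this shape (a sketch for PostgreSQL
10+; the exporter's actual query and metric names may differ) is enough to
expose whether a slot is active and how far behind it is:

```go
package exporter

// replicationSlotsQuery is a hypothetical example of the kind of query
// that exposes slot activity and lag; it is not necessarily the exact
// query added by this commit.
const replicationSlotsQuery = `
SELECT slot_name,
       database,
       active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS lag_bytes
  FROM pg_replication_slots`
```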
Failures in parsing the user's queries are just being swallowed, which
makes troubleshooting YAML issues frustrating/impossible. I'm presuming
this was not intentional, since there is error handling code in the
function that calls this one, though it is unreachable as far as I can
tell without this change.
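A sketch of the idea (all names here are illustrative, not the exporter's
actual code): the parse step returns the YAML error to its caller instead
of only logging it, so the existing error handling becomes reachable:

```go
package exporter

import (
	"fmt"

	"gopkg.in/yaml.v2"
)

// parseQueries illustrates propagating a YAML parse failure to the caller
// instead of swallowing it; the real function and types differ.
func parseQueries(content []byte) (map[string]interface{}, error) {
	var queries map[string]interface{}
	if err := yaml.Unmarshal(content, &queries); err != nil {
		// Previously an error like this was only logged and then dropped.
		return nil, fmt.Errorf("failed to parse user queries: %v", err)
	}
	return queries, nil
}
```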
Co-authored-by: Will Rouesnel <wrouesnel@wrouesnel.com>
* Add a build info metric
Add a standard Prometheus build info metric to make monitoring rollouts
easier.
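For context, the standard way to expose such a metric with the Prometheus
Go client is roughly the following (a sketch assuming the common/version
collector is the mechanism used; the port is the exporter's default):

```go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"github.com/prometheus/common/version"
)

func main() {
	// Exposes postgres_exporter_build_info with version, revision,
	// branch and goversion labels.
	prometheus.MustRegister(version.NewCollector("postgres_exporter"))
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":9187", nil))
}
```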
* Update prometheus vendoring.
* Fix build error
Fix missing bool in builtinMetricMaps.
Signed-off-by: Ben Kochie <superq@gmail.com>
During a refactor the pg_stat_bgwriter metrics were accidentally removed
and not re-added by 34fdb69ee2.
This commit restores these metrics. Resolves #336.
Closes #326, as it provides a viable solution: use a K8S init container
to fully construct the PostgreSQL URI and 'hand it over' to the postgres_exporter
process.
Adds a file containing some basic alerting rules for Prometheus as a launch point for more community contributions (or just when
I get around to coming up with some more I'm interested in).
In the user queries.yml file, the created namespaces can now be optionally
cached by setting cache_seconds, which will prevent the query from being
re-run within that timeframe if previous results are available.
Supersedes #211, credit to @SamSaffron for the original PR.
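The caching decision boils down to a freshness check along these lines (a
sketch only; the type and field names are hypothetical):

```go
package exporter

import "time"

// cachedResult is a hypothetical stand-in for whatever the exporter keeps
// per namespace between scrapes.
type cachedResult struct {
	lastScrape time.Time
}

// useCache reports whether a namespace's previous results are still fresh
// enough, per its cache_seconds setting, to skip re-running the query.
func useCache(cache map[string]cachedResult, namespace string, cacheSeconds uint64) bool {
	entry, ok := cache[namespace]
	if !ok {
		return false
	}
	return time.Since(entry.lastScrape) < time.Duration(cacheSeconds)*time.Second
}
```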
* Fix problem: if autodiscovery is enabled, the exporter made a second connection to the database named in the connection string (the current database is now excluded from the SQL query);
* Fix problem: default metrics and settings were not collected when autodiscovery was enabled. --disable-default-metrics and --disable-settings-metrics can now be used together with --auto-discover-databases.
* Use a struct instead of interface{} when parsing user queries
* Use MappingOptions
* Split function to be more testable
* Rename function to parseUserQueries
* Start adding tests for query parsing
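Roughly, the direction of the refactor looks like the sketch below (the
struct fields, including those of MappingOptions, are illustrative guesses
rather than the actual definitions):

```go
package exporter

import "gopkg.in/yaml.v2"

// MappingOptions sketches a typed view of a single metric column mapping;
// the real struct's fields may differ.
type MappingOptions struct {
	Usage       string `yaml:"usage"`
	Description string `yaml:"description"`
}

// UserQuery sketches one entry in the user queries file.
type UserQuery struct {
	Query   string                      `yaml:"query"`
	Metrics []map[string]MappingOptions `yaml:"metrics"`
}

// parseUserQueries unmarshals the user queries file into typed structs
// instead of bare interface{} values.
func parseUserQueries(content []byte) (map[string]UserQuery, error) {
	var queries map[string]UserQuery
	if err := yaml.Unmarshal(content, &queries); err != nil {
		return nil, err
	}
	return queries, nil
}
```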
Some backstory
==============
I was attempting to use postgres_exporter with the official Docker
container (https://hub.docker.com/_/postgres) in a Kubernetes
StatefulSet, with a side-car configuration, but found that I wasn't able
to connect even when sharing the Postgres Unix listening socket between
both containers. After copying the container over to an Alpine base I
quickly found out that postgres_exporter was actually starting before
the main Postgres container had dropped the Unix socket onto the file
system. A quick workaround is to write a bash for loop that checks for
the existence of the Unix socket, but this would require maintaining a
container; besides, other users may find retries on startup useful.
Implementation
==============
All changes are made to the getServer function and all variables are
local. I was unsure whether it was worth adding command-line switches,
but this would allow for a more sophisticated back-off loop in the
future.
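For illustration only, a loop of the intended shape might look like this
(the function name, retry count and interval are made up; the actual
change keeps everything local to getServer and adds no new switches):

```go
package exporter

import (
	"database/sql"
	"fmt"
	"time"

	_ "github.com/lib/pq"
)

// openWithRetries sketches retrying the initial connection so the exporter
// can start before PostgreSQL has created its Unix socket. The retry count
// and interval are illustrative values, not the actual ones.
func openWithRetries(dsn string) (*sql.DB, error) {
	const (
		attempts = 10
		wait     = 3 * time.Second
	)
	var lastErr error
	for i := 0; i < attempts; i++ {
		db, err := sql.Open("postgres", dsn)
		if err == nil {
			if err = db.Ping(); err == nil {
				return db, nil
			}
			db.Close()
		}
		lastErr = err
		time.Sleep(wait)
	}
	return nil, fmt.Errorf("could not connect after %d attempts: %v", attempts, lastErr)
}
```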
Hope this helps, and let me know if you would like me to change
anything.
If the exporter is scraped by multiple Prometheus servers (as we do), Collect() could be called concurrently. As a result, in some cases one of the Prometheus servers could get pg_up = 0, because it was explicitly set to zero on the first Collect call.
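One way to avoid the race (a sketch only, assuming a mutex-based approach;
the actual fix may be structured differently) is to serialise scrapes and
set pg_up from the outcome of the current scrape:

```go
package exporter

import (
	"sync"

	"github.com/prometheus/client_golang/prometheus"
)

// Exporter is trimmed down to the parts relevant to the race; the real
// exporter carries much more state.
type Exporter struct {
	mu sync.Mutex
	up prometheus.Gauge
}

func NewExporter() *Exporter {
	return &Exporter{
		up: prometheus.NewGauge(prometheus.GaugeOpts{
			Name: "pg_up",
			Help: "Whether the last scrape was able to connect to the server.",
		}),
	}
}

func (e *Exporter) Describe(ch chan<- *prometheus.Desc) {
	ch <- e.up.Desc()
}

// Collect holds the mutex for the whole scrape, so a second Prometheus
// server scraping concurrently can never observe the transient pg_up = 0
// that the start of another scrape would otherwise expose.
func (e *Exporter) Collect(ch chan<- prometheus.Metric) {
	e.mu.Lock()
	defer e.mu.Unlock()

	up := 1.0
	if err := e.scrape(ch); err != nil {
		up = 0
	}
	e.up.Set(up)
	ch <- e.up
}

// scrape stands in for the actual database scrape.
func (e *Exporter) scrape(ch chan<- prometheus.Metric) error { return nil }
```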
* Add exclude-databases option
* Update readme to explain --exclude-databases
* Add comments to ExcludeDatabases function and unexport Contains function
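Conceptually the filtering amounts to something like the sketch below (the
signatures are illustrative; the real ExcludeDatabases and contains
helpers may differ):

```go
package exporter

// excludeDatabases sketches filtering autodiscovered database names
// against the exclude list; details are illustrative.
func excludeDatabases(discovered, excluded []string) []string {
	filtered := make([]string, 0, len(discovered))
	for _, db := range discovered {
		if !contains(excluded, db) {
			filtered = append(filtered, db)
		}
	}
	return filtered
}

func contains(list []string, item string) bool {
	for _, v := range list {
		if v == item {
			return true
		}
	}
	return false
}
```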
This reverts commit 6585e6672f.
There have been some weird changes added to the upstream archiver library
which break the build. Since it is only used in the build process, I'm in
no hurry to update it.