4020 Commits

Author SHA1 Message Date
Tom Wilkie
4d9b917d11 Instrument Prometheus with OpenTracing (#2554)
* Use request.Context() instead of a global map of contexts.

* Add some basic opentracing instrumentation on the query path.

* Remove tracehandler endpoint.
2017-05-02 18:49:29 -05:00
Stephan Erb
0b9fca983b Fix reload of ZooKeeper service discovery config (#2669)
Rationale:

* When the config is reloaded and the provider context is canceled, we need to
  exit the current ZK `TargetProvider.Run` method as a new provider will be
  instantiated.
* In case `Stop` is called on the `ZookeeperTreeCache`, the update/events
  channel may not be closed as it is shared by multiple caches and would
  thus be double closed.
* Stopping all `zookeeperTreeCacheNode`s on teardown ensures all associated
  watcher go-routines will be closed eagerly rather than implicitly on
  connection close events.
2017-05-02 18:21:37 -05:00
Fabian Reinartz
86426c0566 Merge pull request #2672 from svend/kubernetes-pods-port-comment
Document what ports are scraped by default in k8s example
2017-05-02 11:12:13 +02:00
Svend Sorensen
94a3e863e4 Document what ports are scraped by default in k8s example
The Kubernetes pod SD creates a target for each declared port, as documented:

https://prometheus.io/docs/operating/configuration/#pod

> The pod role discovers all pods and exposes their containers as targets. For
> each declared port of a container, a single target is generated. If a
> container has no specified ports, a port-free target per container is created
> for manually adding a port via relabeling.

This results in the default port being the declared port, or no port if none are
declared.
2017-05-01 15:58:48 -07:00
Fabian Reinartz
8c483e27d3 Merge pull request #2661 from prometheus/texts
pkg/textparse: implement timestamp parsing
2017-04-27 17:49:57 +02:00
Fabian Reinartz
1df03d8346 make: disable remote tests temporarily 2017-04-27 17:27:19 +02:00
Fabian Reinartz
377886b371 pkg/textparse: implement timestamp parsing 2017-04-27 17:02:07 +02:00
Conor Broderick
314b81062d Updated vendoring for log level reporting issue (#2660) 2017-04-27 14:25:13 +01:00
Brian Akins
27d66628a1 Allow limiting Kubernetes service discovery to certain namespaces
Allow namespace discovery to be more easily extended in the future by using a struct rather than just a list.

Rename fields for kubernetes namespace discovery
2017-04-27 07:41:36 -04:00
Fabian Reinartz
e829dbe2be retrieval: comment out accept header again 2017-04-27 11:46:08 +02:00
Fabian Reinartz
0f3110487d Merge remote-tracking branch 'origin/dev-2.0' into dev-2.0 2017-04-27 10:25:04 +02:00
Fabian Reinartz
37deb21c45 vendor: remove unused dependency and last ref to fabxc/tsdb 2017-04-27 10:23:34 +02:00
Fabian Reinartz
73b8ff0ddc Merge branch 'master' into dev-2.0 2017-04-27 10:19:55 +02:00
Julius Volz
fe11c5933a Fix mutation of active alert elements by notifier (#2656)
This caused the external label application in the notifier to bleed back
into the rule manager's active alerting elements.
2017-04-26 10:29:42 -05:00
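
The class of bug described above (labels applied in one component leaking back into another component's data) is usually avoided by copying before mutating. The following Go sketch shows that idea under stated assumptions: the `alert` type and `withExternalLabels` are hypothetical, not the notifier's actual types, and the choice to leave existing labels unchanged is an assumption made for the example.

```go
package main

import "fmt"

// alert is a hypothetical, simplified stand-in for an alert with labels.
type alert struct {
	Labels map[string]string
}

// withExternalLabels returns copies of the given alerts with the external
// labels added. Labels already present on an alert are kept as they are
// (an assumption of this sketch). The input alerts are never modified.
func withExternalLabels(alerts []*alert, external map[string]string) []*alert {
	out := make([]*alert, 0, len(alerts))
	for _, a := range alerts {
		lbls := make(map[string]string, len(a.Labels)+len(external))
		for k, v := range a.Labels {
			lbls[k] = v
		}
		for k, v := range external {
			if _, ok := lbls[k]; !ok {
				lbls[k] = v
			}
		}
		out = append(out, &alert{Labels: lbls})
	}
	return out
}

func main() {
	active := []*alert{{Labels: map[string]string{"alertname": "HighLatency"}}}
	sent := withExternalLabels(active, map[string]string{"region": "eu"})

	fmt.Println("active:", active[0].Labels) // still without the external label
	fmt.Println("sent:  ", sent[0].Labels)
}
```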
Fabian Reinartz
5248118b10 Merge pull request #2654 from dsymonds/master
Add maintainers' GitHub usernames to MAINTAINERS.md.
2017-04-25 08:43:36 +02:00
David Symonds
8bb07490a2 Add maintainers' GitHub usernames to MAINTAINERS.md.
CONTRIBUTING.md instructs people to loop them in using that mechanism,
but nothing lists the right username.
2017-04-25 16:32:23 +10:00
Fabian Reinartz
60d9138b6b Merge pull request #2653 from dsymonds/master
Preserve Alertmanager URLs as *url.URL.
2017-04-25 08:27:31 +02:00
David Symonds
04ad889751 Preserve Alertmanager URLs as *url.URL.
Render a nicer link in the web UI.
2017-04-25 16:17:46 +10:00
Conor Broderick
9eb1a5d6bf Handle invalid query in graph UI (#2652) 2017-04-24 10:50:57 +01:00
Brian Brazil
8b8ba26129 Merge pull request #2644 from prometheus/release-1.6
Merge 1.6.1 release from 1.6 branch
2017-04-19 15:22:24 +01:00
Brian Brazil
8097a3c523 Cut v1.6.1 (#2640) 2017-04-19 14:23:56 +01:00
Fabian Reinartz
6a947878eb Merge pull request #2643 from prometheus/ci-2.0
Add license to files.
2017-04-19 14:50:20 +02:00
Brian Brazil
5c9a6ce747 Add license to files.
This should fix CI for dev-2.0.
2017-04-19 13:46:22 +01:00
beorn7
e499ef8cac Merge bug fixes from branch 'release-1.6' 2017-04-18 18:06:01 +02:00
Björn Rabenstein
872ed88166 Merge pull request #2638 from prometheus/beorn7/storage
storage: Don't panic if storage has no FPs even after initial wait
2017-04-18 17:02:07 +02:00
beorn7
1dd737d7c3 storage: Don't panic if storage has no FPs even after initial wait 2017-04-18 15:59:12 +02:00
Matt Layher
1faf33acac Add promlint check for histogram/summary reserved names (#2626) 2017-04-15 22:38:01 +01:00
Tobias Schmidt
09a977a782 Create sha256 checksums file during release 2017-04-15 12:26:51 -03:00
Tobias Schmidt
619cc0e0ff Merge pull request #2625 from mdlayher/promlint-cleanup
Simplify promlint problems gathering, use protobuf accessors
2017-04-14 22:47:30 +02:00
Matt Layher
cc4198f421 Simplify promlint problems gathering, use protobuf accessors 2017-04-14 16:40:40 -04:00
Matt Layher
34a4813464 Initial promlint counter _total suffix check (#2624) 2017-04-14 22:09:54 +02:00
Matt Layher
254cb1ec29 Use untyped metrics for some promlint tests (#2623) 2017-04-14 19:38:57 +01:00
Björn Rabenstein
67d511784d Merge pull request #2619 from prometheus/release-1.6
Cut v1.6.0
2017-04-14 20:12:22 +02:00
beorn7
10f6453829 Cut v1.6.0 2017-04-14 19:53:58 +02:00
Jack Neely
896f951e68 Force buckets in a histogram to be monotonic for quantile estimation (#2610)
* Force buckets in a histogram to be monotonic for quantile estimation

The assumption that bucket counts increase monotonically with increasing
upperBound may be violated during:

  * Recording rule evaluation of histogram_quantile, especially when rate()
     has been applied to the underlying bucket timeseries.
  * Evaluation of histogram_quantile computed over federated bucket
     timeseries, especially when rate() has been applied

This is because scraped data is not made available to RR evaluation or
federation atomically, so some buckets are computed with data from the N
most recent scrapes, but the other buckets are missing the most recent
observations.

Monotonicity is usually guaranteed because if a bucket with upper bound
u1 has count c1, then any bucket with a higher upper bound u > u1 must
have counted all c1 observations and perhaps more, so that c >= c1.

Randomly interspersed partial sampling breaks that guarantee, and rate()
exacerbates it. Specifically, suppose bucket le=1000 has a count of 10 from
4 samples but the bucket with le=2000 has a count of 7, from 3 samples. The
monotonicity is broken. It is exacerbated by rate() because under normal
operation, cumulative counting of buckets will cause the bucket counts to
diverge such that small differences from missing samples are not a problem.
rate() removes this divergence.

bucketQuantile depends on that monotonicity to do a binary search for the
bucket with the qth percentile count, so breaking the monotonicity
guarantee causes bucketQuantile() to return undefined (nonsense) results.

As a somewhat hacky solution until the Prometheus project is ready to
accept the changes required to make scrapes atomic, we calculate the
"envelope" of the histogram buckets, essentially removing any decreases
in the count between successive buckets.

* Fix up comment docs for ensureMonotonic

* ensureMonotonic: Use switch statement

Use switch statement rather than if/else for better readability.
Process the most frequent cases first.
2017-04-14 16:21:49 +02:00
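
The "envelope" idea described in this commit message can be sketched in a few lines of Go. This is an illustration of the technique, not the committed code: the `bucket` type and its field names are assumptions, and the buckets are assumed to be sorted by upper bound already. Any count lower than the running maximum is raised to that maximum, which removes every decrease between successive buckets.

```go
package main

import "fmt"

// bucket is a hypothetical stand-in for one cumulative histogram bucket.
type bucket struct {
	upperBound float64
	count      float64
}

// ensureMonotonic rewrites the counts in place so that they never decrease
// as upperBound increases. Buckets must be sorted by upperBound.
func ensureMonotonic(buckets []bucket) {
	if len(buckets) == 0 {
		return
	}
	max := buckets[0].count
	for i := 1; i < len(buckets); i++ {
		switch {
		case buckets[i].count > max:
			// Normal case: the cumulative count grew.
			max = buckets[i].count
		case buckets[i].count < max:
			// Count dropped (e.g. a missing sample): lift it to the envelope.
			buckets[i].count = max
		}
	}
}

func main() {
	// The example from the message: le=1000 has count 10, le=2000 has count 7.
	bs := []bucket{{1000, 10}, {2000, 7}, {4000, 12}}
	ensureMonotonic(bs)
	fmt.Println(bs)
}
```

After the call, the le=2000 bucket carries the count 10 instead of 7, so a search over the counts sees a non-decreasing sequence again.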
Matt Layher
283756c503 Initial commit of 'promtool check-metrics', promlint package (#2605) 2017-04-13 23:53:41 +02:00
Conor Broderick
ee62807b62 Added min/max to graph to accommodate for constant time series (#2612)
Added min/max to graph to accommodate constant time series
2017-04-12 14:25:25 +01:00
Björn Rabenstein
1fb2190eeb Merge pull request #2607 from prometheus/beorn7/storage
Vendoring update prior to 1.6 release
2017-04-11 14:31:58 +02:00
beorn7
c53f256a09 storage: Fix use of counter (Set -> Add) 2017-04-11 12:58:24 +02:00
beorn7
1ae50b1d1b vendoring: Update client_golang/prometheus
This is mostly required to enable summaries without quantiles
2017-04-11 12:58:24 +02:00
beorn7
92d4cf7663 vendoring: Remove unused packages 2017-04-11 12:58:24 +02:00
Brian Brazil
0e0fc5a7f4 Correct example name to adapter. (#2590) 2017-04-10 17:24:53 +01:00
Fabian Reinartz
757cba7c31 cmd/prometheus: Undo GOGC adjustment 2017-04-10 16:22:01 +02:00
Fabian Reinartz
ece483c0c1 version: cut 2.0.0-alpha.0 2017-04-10 13:03:47 +02:00
Fabian Reinartz
f2d610c1e5 vendor: update tsdb for fast equal matching 2017-04-10 13:00:27 +02:00
Björn Rabenstein
acd72ae1a7 Merge pull request #2591 from prometheus/beorn7/storage
storage: Several optimizations of checkpointing
2017-04-07 20:02:14 +02:00
Goutham Veeramachaneni
cffb1acf7f Test Longer Tests in Travis (#2570)
* Test Longer Tests in Travis

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>

* Make test Target Run All Tests

* Add test-short to run short tests

test is running all the tests now as we are running make tests in
CircleCI and I think the base image is shared across Prometheus Org.

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>

* Remove Empty Line

Signed-off-by: Goutham Veeramachaneni <cs14btech11014@iith.ac.in>
2017-04-07 13:46:06 +02:00
beorn7
f20b84e816 flags: Improve doc strings for checkpoint flags 2017-04-07 13:10:12 +02:00
beorn7
f338d791d2 storage: Several optimizations of checkpointing
- checkpointSeriesMapAndHeads accepts a context now to allow
  cancelling.

- If a shutdown is initiated, cancel the ongoing checkpoint. (We will
  create a final checkpoint anyway.)

- Always wait for at least as long as the last checkpoint took before
  starting the next checkpoint (to cap the time spent checkpointing
  at 50%).

- If an error has occurred during checkpointing, don't bother to sync
  the write.

- Make sure the temporary checkpoint file is deleted, even if an error
  has occurred.

- Clean up the checkpoint loop a bit. (The concurrent Timer.Reset(0)
  call might have caused a race.)
2017-04-07 13:10:12 +02:00
Björn Rabenstein
934d86b936 Merge pull request #2593 from prometheus/beorn7/storage2
storage: Recover from corrupted indices for archived series
2017-04-07 12:55:35 +02:00