Commit Graph

361 Commits

Author SHA1 Message Date
Bjoern Rabenstein
934d09f738 Fix race during shutdown.
Change-Id: I2f8bf48d92a14f1e5ecde27c1b138734d7653394
2014-11-25 17:08:45 +01:00
Bjoern Rabenstein
38fc24d0ed Fix targetpool_test.go and other tests.
Change-Id: I91a4dd1d39e01f174e1aaae653ce1ed7aecaa624
2014-11-25 17:08:26 +01:00
Julius Volz
7f5d3c2c29 Fix and improve the fp locker.
Benchmark:
$ go test -bench 'Fingerprint' -test.run 'Fingerprint' -test.cpu=1,2,4

OLD
BenchmarkFingerprintLockerParallel        500000              3618 ns/op
BenchmarkFingerprintLockerParallel-2      100000             12257 ns/op
BenchmarkFingerprintLockerParallel-4      500000             10164 ns/op
BenchmarkFingerprintLockerSerial        10000000               283 ns/op
BenchmarkFingerprintLockerSerial-2      10000000               284 ns/op
BenchmarkFingerprintLockerSerial-4      10000000               288 ns/op

NEW
BenchmarkFingerprintLockerParallel       1000000              1018 ns/op
BenchmarkFingerprintLockerParallel-2     1000000              1164 ns/op
BenchmarkFingerprintLockerParallel-4     2000000               910 ns/op
BenchmarkFingerprintLockerSerial        50000000                56.0 ns/op
BenchmarkFingerprintLockerSerial-2      50000000                47.9 ns/op
BenchmarkFingerprintLockerSerial-4      50000000                54.5 ns/op

Change-Id: I3c65a43822840e7e64c3c3cfe759e1de51272581
2014-11-25 17:07:45 +01:00
Bjoern Rabenstein
7ad55ef83c Actually close the iterator channels.
Change-Id: I6f6a2aef5ff55c6b2d21ad91d02ae6b0ecba4ae8
2014-11-25 17:07:45 +01:00
Bjoern Rabenstein
8fba3302bc Bold changes to concurrency.
(WIP. Probably doesn't work yet.)

Change-Id: Id1537dfcca53831a1d428078a5863ece7bdf4875
2014-11-25 17:07:45 +01:00
Bjoern Rabenstein
fcdf5a8ee7 Fix bugs in chunk evict code.
Also, simplify code by re-looking up metric in metric map.

Change-Id: Ib2092f9184374e5a543e87d3a9f4a74fda64b193
2014-11-25 17:07:45 +01:00
Bjoern Rabenstein
7e6a03fbf9 Fix a few concurrency issues before starting to use the new fp locker.
Change-Id: I8615e8816e79ef0882e123163ee590c739b79d12
2014-11-25 17:07:45 +01:00
Julius Volz
db92620163 Instrument eviction and purge durations.
Change-Id: Ia5b2319363ad2644674c9b7a94162a89bcc296fb
2014-11-25 17:07:45 +01:00
Julius Volz
e0ee7ec7ab Add fingerprintLocker for locking individual fingerprints.
Change-Id: Id41ba555715229edf7d6543f56736b82f6eff1ef
2014-11-25 17:07:45 +01:00
Julius Volz
df1b2a2422 Fix indexing latency instrumentation.
Change-Id: I532c170121cd2996d1a378adbb1fd551cd5a4e38
2014-11-25 17:07:44 +01:00
Bjoern Rabenstein
01dd618a20 Fix a locking bug.
Change-Id: I183780785991d0b4165ce9186f53eb8201fb3ed5
2014-11-25 17:07:44 +01:00
Julius Volz
a746fbb8bc Instrument indexing: queue length, batch sizes and latencies.
Change-Id: I60bcbd24b160e47d418a485d8cffa39344a257c6
2014-11-25 17:07:44 +01:00
Bjoern Rabenstein
aea32b0b4b Avoid redundant fingerprint calculation.
Change-Id: Ief8a165dcfa5030226953346ec9dfe4a7787df1f
2014-11-25 17:07:44 +01:00
Bjoern Rabenstein
e9ff29c547 Comment/code cleanup.
Change-Id: I38736e3d0fec79759a2bafa35aecf914480ff810
2014-11-25 17:07:44 +01:00
Bjoern Rabenstein
0031a448e2 Add WaitForIndexing.
Change-Id: I5a5c975c4246632f937413322c855bbe63d00802
2014-11-25 17:07:44 +01:00
Bjoern Rabenstein
c7aad110fb Add an indexing queue and batch the ops.
Some other improvements on the way, in particular codec -> codable
renaming and addition of LookupSet methods.

Change-Id: I978f8f3f84ca8e4d39a9d9f152ae0ad274bbf4e2
2014-11-25 17:07:44 +01:00
Bjoern Rabenstein
71206dbc06 More code cleanups.
Add license text everywhere.
And others....

Change-Id: I11ccde267a2ef7eb366c4788ba7aeae14ba7545c
2014-11-25 17:07:44 +01:00
Julius Volz
f0d5d4bda3 Fix bug around index purging.
Change-Id: I8cea00e03f72bbeead2cbd2d26b34d986059ced0
2014-11-25 17:07:44 +01:00
Julius Volz
630b5a087a Also consider on-disk fingerprints during purge.
This reintroduces LevelDB iterators so that we can iterate through all
the on-disk fingerprints.

Change-Id: I007ee4638d038d2a4461bbda27f30fcaad411474
2014-11-25 17:07:35 +01:00
Bjoern Rabenstein
f5f9f3514a Major code cleanup.
- Make it go-vet and golint clean.
- Add comments, TODOs, etc.

Change-Id: If1392d96f3d5b4cdde597b10c8dff1769fcfabe2
2014-11-25 17:02:53 +01:00
Bjoern Rabenstein
3592dc2359 Implement series eviction.
Change-Id: I7a503e0ba78aae3761d032851b06f2807122b085
2014-11-25 17:02:52 +01:00
Bjoern Rabenstein
bbf49200ab Implement methods in persistence.go.
Change-Id: I804cdd0b30420e171825fd86fe1281eca0d5e638
2014-11-25 17:02:23 +01:00
Bjoern Rabenstein
5a128a04a9 Major reorganization of the storage.
Most important, the heads file will now persist all the chunk descs,
too. Implicitly, it will serve as the persisted form of the
fp-to-series map.

Change-Id: Ic867e78f2714d54c3b5733939cc5aef43f7bd08d
2014-11-25 17:02:01 +01:00
Bjoern Rabenstein
e7cb9ddb9f Use a sync.pool for the staging buffer in codec.go.
Change-Id: I1aae6847f77b5a7c75582b07c199b1943cf90552
2014-11-25 17:02:01 +01:00
Bjoern Rabenstein
4770cf76a4 Make index package more self-contained.
Moved interna from diskPersistence into the indexer.
TotalIndexer now called diskIndexer.

Change-Id: I6c8c62cb171f12bbd8a5474773af7786d71ba388
2014-11-25 17:02:01 +01:00
Bjoern Rabenstein
89f10e8eb2 Move to using the standard library interfaces for encoding/decoding.
BinaryMarshaler instead of encodable.
BinaryUnmarshaler instead of decodable.

Left 'codable' in place for lack of a better word.

Change-Id: I8a104be7d6db916e8dbc47ff95e6ff73b845ac22
2014-11-25 17:02:01 +01:00
Bjoern Rabenstein
af77d5ef0b Added a few missing implementations in index.go.
Also, added closing of persistence and mem storage.

Change-Id: Iacf0d22c3520dd2584d9546984c1f8a5ed6cd54e
2014-11-25 17:02:01 +01:00
Julius Volz
cca7ebe906 Some more cleanups / obsolete code removals.
Change-Id: I584144ceeeedafdb114266d8a6d2513e67b1d010
2014-11-25 17:02:00 +01:00
Julius Volz
7e85711df0 Beginnings of a tiered index implementation.
This reintroduces a LevelDB-based metrics index.

Change-Id: I4111540301c52255a07b2f570761707a32f72c05
2014-11-25 17:02:00 +01:00
Julius Volz
8dfaa5ecd2 Remove use of freelists for chunk bufs.
Change-Id: Ib887fdb61e1d96da0cd32545817b925ba88831c1
2014-11-25 17:02:00 +01:00
Julius Volz
7b35e0f0b8 Use constants from math package instead of literals.
Change-Id: I55427ba32c2cbb32ee42ec1e3153160965ab8b3c
2014-11-25 17:02:00 +01:00
Julius Volz
15929eece2 Unpin any already loaded chunks upon preloading error.
Change-Id: Ib451136e3ef21bce8b814c21b66eaab727ab341b
2014-11-25 17:02:00 +01:00
Julius Volz
fd01d07589 Check that chunk buffer length fits in 16 bit.
Change-Id: Id086a54aa8a1990c1979e747c1c02e53bed6d447
2014-11-25 17:02:00 +01:00
Bjoern Rabenstein
1ca7f24137 Remove float diff tolerance altogether.
Change-Id: I9ea9683a4665d5800fca75560bb4b8a8b4406d55
2014-11-25 17:02:00 +01:00
Bjoern Rabenstein
d742edfe0d Fix precision loss.
Large delta values often imply a difference between a large base value
and the large delta value, potentially resulting in small numbers with
a huge precision error. Since large delta values need 8 bytes anyway,
we are not even saving memory.

As a solution, always save the absoluto value rather than a delta once
8 bytes would be needed for the delta. Timestamps are then saved as 8
byte integers, while values are always saved as float64 in that case.

Change-Id: I01100d600515e16df58ce508b50982ffd762cc49
2014-11-25 17:02:00 +01:00
Bjoern Rabenstein
dc2e463a97 Improvements after review.
Change-Id: I484359282d4c7113518bbbb131f4f18383c08fdb
2014-11-25 17:02:00 +01:00
Bjoern Rabenstein
52c9dc43a3 Improve testing.
In particular, create a fuzz test for time series.

Change-Id: I523a17912405a0b6b46bd395c781d201dfe55036
2014-11-25 17:02:00 +01:00
Julius Volz
3b25867d61 Add chunk persistence tests, fix storage tests.
Change-Id: Id0b8f5382e99efa839cc0f826e92bbda985fe9a9
2014-11-25 17:02:00 +01:00
Bjoern Rabenstein
ecdf5ab14f Index-persistence switched from gob to a hand-coded solution.
Change-Id: Ib4ec42535bd08df16d34d4774bb638e35c5a1841
2014-11-25 17:02:00 +01:00
Julius Volz
e7ed39c9a6 Initial experimental snapshot of next-gen storage.
Change-Id: Ifb8709960dbedd1d9f5efd88cdd359ee9fa9d26d
2014-11-25 17:02:00 +01:00
Julius Volz
c6e9f085a3 Update used Go version to 1.3.
Go downloads moved to a different URL and require following redirects
(curl's '-L' option) now.

Go 1.3 deliberately randomizes ranges over maps, which uncovered some
bugs in our tests. These are fixed too.

Change-Id: Id2d9e185d8d2379a9b7b8ad5ba680024565d15f4
2014-11-25 17:02:00 +01:00
Bjoern Rabenstein
1909686789 Make metrics exported by the Prometheus server itself more consistent.
- Always spell out the time unit (e.g. milliseconds instead of ms).

- Remove "_total" from the names of metrics that are not counters.

- Make use of the "Namespace" and "Subsystem" fields in the options.

- Removed the "capacity" facet from all metrics about channels/queues.
  These are all fixed via command line flags and will never change
  during the runtime of a process. Also, they should not be part of
  the same metric family. I have added separate metrics for the
  capacity of queues as convenience. (They will never change and are
  only set once.)

- I left "metric_disk_latency_microseconds" unchanged, although that
  metric measures the latency of the storage device, even if it is not
  a spinning disk. "SSD" is read by many as "solid state disk", so
  it's not too far off. (It should be "solid state drive", of course,
  but "metric_drive_latency_microseconds" is probably confusing.)

- Brian suggested to not mix "failure" and "success" outcome in the
  same metric family (distinguished by labels). For now, I left it as
  it is. We are touching some bigger issue here, especially as other
  parts in the Prometheus ecosystem are following the same
  principle. We still need to come to terms here and then change
  things consistently everywhere.

Change-Id: If799458b450d18f78500f05990301c12525197d3
2014-11-25 17:02:00 +01:00
Julius Volz
80b3d3bf34 Speed up disk flushes by removing unnecessary sort.
The first sort in groupByFingerprint already ensures that all resulting sample
lists contain only one fingerprint. We also already assume that all
samples passed into AppendSamples (and thus groupByFingerprint) are
chronologically sorted within each fingerprint.

The extra chronological sort is thus superfluous. Furthermore, this
second sort didn't only sort chronologically, but also compared all
metric fingerprints again (although we already know that we're only
sorting within samples for the same fingerprint). This caused a huge
memory and runtime overhead.

In a heavily loaded real Prometheus, this brought down disk flush times
from ~9 minutes to ~1 minute.

OLD:
BenchmarkLevelDBAppendRepeatingValues   5  331391808 ns/op  44542953 B/op   597788 allocs/op
BenchmarkLevelDBAppendsRepeatingValues  5  329893512 ns/op  46968288 B/op  3104373 allocs/op

NEW:
BenchmarkLevelDBAppendRepeatingValues   5  299298635 ns/op  43329497 B/op   567616 allocs/op
BenchmarkLevelDBAppendsRepeatingValues 20   92204601 ns/op   1779454 B/op    70975 allocs/op

Change-Id: Ie2d8db3569b0102a18010f9e106e391fda7f7883
2014-11-25 17:01:59 +01:00
Julius Volz
21cafe6cd7 Only evict memory series after they are on disk.
This fixes the problem where samples become temporarily unavailable for
queries while they are being flushed to disk. Although the entire
flushing code could use some major refactoring, I'm explicitly trying to
do the minimal change to fix the problem since there's a whole new
storage implementation in the pipeline.

Change-Id: I0f5393a30b88654c73567456aeaea62f8b3756d9
2014-11-25 17:01:59 +01:00
Bjoern Rabenstein
8956faeccb Migrate to new client_golang.
This change will only be submitted when the new client_golang has been
moved to the new version.

Change-Id: Ifceb59333072a08286a8ac910709a8ba2e3a1581
2014-11-25 17:01:59 +01:00
Brian Brazil
e041c0cd46 Add console and alert templates with access to all data.
Move rulemanager to it's own package to break cicrular dependency.
Make NewTestTieredStorage available to tests, remove duplication.

Change-Id: I33b321245a44aa727bfc3614a7c9ae5005b34e03
2014-05-30 16:24:56 +01:00
Bjoern Rabenstein
ca6a4fccef Weed out our homegrown test.Tester.
The Go stdlib has testing.TB now, which fulfills the exact same
purpose.

Change-Id: I0db9c73400e208ca376b932a02b7e3402234b87c
2014-05-21 19:27:24 +02:00
Julius Volz
4df5c7ab18 Optimize label matcher memory and runtime behavior.
This optimizes the runtime and memory allocation behavior for label matchers
other than type "Equal". Instead of creating a new set for every union of
fingerprints, this simply adds new fingerprints to the existing set to achieve
the same effect.

The current behavior made a production Prometheus unresponsive when running a
NotEqual match against the "instance" label (a label with high value
cardinality).

BEFORE:
BenchmarkGetFingerprintsForNotEqualMatcher        10   170430297 ns/op  39229944 B/op    40709 allocs/op

AFTER:
BenchmarkGetFingerprintsForNotEqualMatcher      5000      706260 ns/op    217717 B/op     1116 allocs/op

Change-Id: Ifd78e81e7dfbf5d7249e50ad1903a5d9c42c347a
2014-05-05 11:29:17 -04:00
Bjoern Rabenstein
de9a88b964 Ensure temporal order in streams.
BenchmarkAppendSample.* before this change:

BenchmarkAppendSample1   1000000              1142 ns/op
--- BENCH: BenchmarkAppendSample1
        memory_test.go:81: 1 cycles with 9992.000000 bytes per cycle, totalling 9992
        memory_test.go:81: 100 cycles with 250.399994 bytes per cycle, totalling 25040
        memory_test.go:81: 10000 cycles with 239.428802 bytes per cycle, totalling 2394288
        memory_test.go:81: 1000000 cycles with 255.504684 bytes per cycle, totalling 255504688
BenchmarkAppendSample10   500000              3823 ns/op
--- BENCH: BenchmarkAppendSample10
        memory_test.go:81: 1 cycles with 15536.000000 bytes per cycle, totalling 15536
        memory_test.go:81: 100 cycles with 662.239990 bytes per cycle, totalling 66224
        memory_test.go:81: 10000 cycles with 601.937622 bytes per cycle, totalling 6019376
        memory_test.go:81: 500000 cycles with 598.582764 bytes per cycle, totalling 299291408
BenchmarkAppendSample100           50000             41111 ns/op
--- BENCH: BenchmarkAppendSample100
        memory_test.go:81: 1 cycles with 79824.000000 bytes per cycle, totalling 79824
        memory_test.go:81: 100 cycles with 4924.479980 bytes per cycle, totalling 492448
        memory_test.go:81: 10000 cycles with 4278.019043 bytes per cycle, totalling 42780192
        memory_test.go:81: 50000 cycles with 4275.242676 bytes per cycle, totalling 213762144
BenchmarkAppendSample1000           5000            533933 ns/op
--- BENCH: BenchmarkAppendSample1000
        memory_test.go:81: 1 cycles with 840224.000000 bytes per cycle, totalling 840224
        memory_test.go:81: 100 cycles with 62789.281250 bytes per cycle, totalling 6278928
        memory_test.go:81: 5000 cycles with 55208.601562 bytes per cycle, totalling 276043008
ok      github.com/prometheus/prometheus/storage/metric/tiered  27.828s

BenchmarkAppendSample.* after this change:

BenchmarkAppendSample1   1000000              1109 ns/op
--- BENCH: BenchmarkAppendSample1
        memory_test.go:131: 1 cycles with 9992.000000 bytes per cycle, totalling 9992
        memory_test.go:131: 100 cycles with 250.399994 bytes per cycle, totalling 25040
        memory_test.go:131: 10000 cycles with 239.220795 bytes per cycle, totalling 2392208
        memory_test.go:131: 1000000 cycles with 255.492630 bytes per cycle, totalling 255492624
BenchmarkAppendSample10   500000              3663 ns/op
--- BENCH: BenchmarkAppendSample10
        memory_test.go:131: 1 cycles with 15536.000000 bytes per cycle, totalling 15536
        memory_test.go:131: 100 cycles with 662.239990 bytes per cycle, totalling 66224
        memory_test.go:131: 10000 cycles with 601.889587 bytes per cycle, totalling 6018896
        memory_test.go:131: 500000 cycles with 598.550903 bytes per cycle, totalling 299275472
BenchmarkAppendSample100           50000             40694 ns/op
--- BENCH: BenchmarkAppendSample100
        memory_test.go:131: 1 cycles with 78976.000000 bytes per cycle, totalling 78976
        memory_test.go:131: 100 cycles with 4928.319824 bytes per cycle, totalling 492832
        memory_test.go:131: 10000 cycles with 4277.961426 bytes per cycle, totalling 42779616
        memory_test.go:131: 50000 cycles with 4275.054199 bytes per cycle, totalling 213752720
BenchmarkAppendSample1000           5000            530744 ns/op
--- BENCH: BenchmarkAppendSample1000
        memory_test.go:131: 1 cycles with 842192.000000 bytes per cycle, totalling 842192
        memory_test.go:131: 100 cycles with 62765.441406 bytes per cycle, totalling 6276544
        memory_test.go:131: 5000 cycles with 55209.812500 bytes per cycle, totalling 276049056
ok      github.com/prometheus/prometheus/storage/metric/tiered  27.468s

Change-Id: Idaa339cd83539b5e4391614541a2c3a04002d66d
2014-04-22 15:22:54 +02:00
Julius Volz
1b29975865 Fix RWLock memory storage deadlock.
This fixes https://github.com/prometheus/prometheus/issues/390

The cause for the deadlock was a lock semantic in Go that wasn't
obvious to me when introducing this bug:

http://golang.org/pkg/sync/#RWMutex.Lock

Key phrase: "To ensure that the lock eventually becomes available, a
blocked Lock call excludes new readers from acquiring the lock."

In the memory series storage, we have one function
(GetFingerprintsForLabelMatchers) acquiring an RLock(), which calls
another function also acquiring the same RLock()
(GetLabelValuesForLabelName). That normally doesn't deadlock, unless a
Lock() call from another goroutine happens right in between the two
RLock() calls, blocking both the Lock() and the second RLock() call from
ever completing.

  GoRoutine 1          GoRoutine 2
  ======================================
  RLock()
  ...                  Lock() [DEADLOCK]
  RLock() [DEADLOCK]   Unlock()
  RUnlock()
  RUnlock()

Testing deadlocks is tricky, but the regression test I added does
reliably detect the deadlock in the original code on my machine within a
normal concurrent reader/writer run duration of 250ms.

Change-Id: Ib34c2bb8df1a80af44550cc2bf5007055cdef413
2014-04-17 13:43:13 +02:00