817 Commits

Author SHA1 Message Date
Joerg Mann
8cee26a02f monitoring: mars-status update, add zabbix template
- add zabbix template, cronjob and config
2013-12-12 08:31:07 +01:00
Thomas Schoebel-Theuer
1d52efb880 brick_mem: improve debugging messages 2013-12-05 08:08:57 +01:00
Thomas Schoebel-Theuer
ff2b4337ea infra: show version tags in /proc/sys/mars/version 2013-12-05 08:08:26 +01:00
Thomas Schoebel-Theuer
7e124d0550 all: release light0.1beta0.15 2013-11-21 11:54:10 +01:00
Thomas Schoebel-Theuer
03e6ae01f6 doc: describe throttling 2013-11-21 11:53:17 +01:00
Thomas Schoebel-Theuer
eb9aebc3ae infra: fix delay computation in limiter 2013-11-21 07:20:01 +01:00
Thomas Schoebel-Theuer
fd30cd6b44 infra: show ops count in limiter 2013-11-21 07:20:01 +01:00
Thomas Schoebel-Theuer
af418eb9f0 infra: make limiter {min,max}_window configurable 2013-11-21 07:20:00 +01:00
Thomas Schoebel-Theuer
8b74dddc24 infra: fix limiter overflow in denominator 2013-11-21 07:20:00 +01:00
Thomas Schoebel-Theuer
8696e417db infra: make limiter max_delay settable 2013-11-21 07:20:00 +01:00
Thomas Schoebel-Theuer
606528768f if: fix amount of throttling 2013-11-20 11:54:14 +01:00
Frank Liepold
f3c9d8757f test_suite: current state
Signed-off-by: Thomas Schoebel-Theuer <schoebel@bell.site>
2013-11-20 11:13:57 +01:00
Thomas Schoebel-Theuer
6579393177 light: rename throttling parameters and defaults 2013-11-20 11:13:57 +01:00
Thomas Schoebel-Theuer
65bdee3b08 infra: show cumulatives in all limiters 2013-11-19 12:22:45 +01:00
Frank Liepold
871e3994db light: fix throttling calculation of request sizes
Signed-off-by: Thomas Schoebel-Theuer <schoebel@bell.site>
2013-11-19 11:44:15 +01:00
Thomas Schoebel-Theuer
cc857ac4e7 doc: update ChangeLog 2013-11-18 13:44:05 +01:00
Thomas Schoebel-Theuer
97f296f8f7 doc: update INSTALL 2013-11-18 13:41:50 +01:00
Thomas Schoebel-Theuer
c167fccfa3 doc: README points to http://schoebel.github.io/mars/ 2013-11-18 13:41:50 +01:00
Thomas Schoebel-Theuer
67927ef56a doc: add logo for github pages 2013-11-18 08:19:10 +01:00
Frank Liepold
18fd90d538 test_suite: current state
Signed-off-by: Thomas Schoebel-Theuer <schoebel@bell.site>
2013-11-07 10:38:21 +01:00
Joerg Mann
fa8f8bdb0c mars-status: fixes, rewrite version- and linkcheck, add historyview
Signed-off-by: Thomas Schoebel-Theuer <schoebel@bell.site>
2013-11-06 14:43:09 +01:00
Thomas Schoebel-Theuer
3b0a78803d sio: remove non-working kmap() 2013-11-05 13:02:35 +01:00
Thomas Schoebel-Theuer
232349e544 if: remove non-working kmap() 2013-11-05 12:31:34 +01:00
Thomas Schoebel-Theuer
9134be1a3e all: allow throttling of bulk write requests 2013-10-31 08:24:56 +01:00
Thomas Schoebel-Theuer
0a8292cb80 if: add diskstats 2013-10-31 08:02:09 +01:00
Frank Liepold
02558d5ab0 marsadm: correct message
Signed-off-by: Thomas Schoebel-Theuer <tst@1und1.de>
2013-10-22 09:40:06 +02:00
Frank Liepold
5766d22e6b marsadm: invalidate does not delete logfiles or version links anymore
Signed-off-by: Thomas Schoebel-Theuer <tst@1und1.de>
2013-10-22 09:40:06 +02:00
Frank Liepold
83361f0745 marsadm: leave-resource removes logfiles and version links of the resource
Signed-off-by: Thomas Schoebel-Theuer <tst@1und1.de>
2013-10-22 09:40:06 +02:00
Frank Liepold
c832799910 light: allow logfiles not to be consecutive on secondary site
If there are holes in the logfile sequence and these holes concern only logfiles
which are already applied (i.e. logfiles lying before all replay links),
the secondary can continue working.
Warnings are written as long as the situation exists.

Signed-off-by: Thomas Schoebel-Theuer <tst@1und1.de>
2013-10-22 09:39:22 +02:00
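The rule described in commit c832799910 can be sketched as follows. This is only an illustration, not the MARS implementation: logfiles and replay links are modelled as plain integer sequence numbers, and the function name `holes_are_harmless` is hypothetical.

```python
def holes_are_harmless(present, replay_positions):
    """Return True if every hole in the logfile sequence lies before all replay links.

    present          -- sequence numbers of the logfiles that exist (assumed model)
    replay_positions -- sequence number each replay link points to (assumed model)
    """
    present = set(present)
    if not present:
        return True
    # A hole is any sequence number missing between the first and last logfile.
    missing = [n for n in range(min(present), max(present) + 1) if n not in present]
    if not missing:
        return True
    if not replay_positions:
        # Assumption: without any replay link we cannot prove the hole is harmless.
        return False
    lowest_replay = min(replay_positions)
    # Harmless only if each missing logfile was already applied everywhere.
    return all(n < lowest_replay for n in missing)
```

With logfiles 1, 2, 4, 5 present, the hole at 3 is tolerated when all replay links point to 4 or later, and it is not tolerated when some replay link still points to 2.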
Frank Liepold
675e46d689 light: report next logfile to be copyable in case of logfile sequence holes
Up to now holes in the logfile sequence caused the copy process to stop after
having fetched the last logfile before the hole.

E.g. in emergency mode such holes are created intentionally on the primary
side. After the situation has been cleaned up, the secondary must be able to
fetch newly created logfiles.

Signed-off-by: Thomas Schoebel-Theuer <tst@1und1.de>
2013-10-22 09:38:17 +02:00
Thomas Schoebel-Theuer
915f955333 light: fix copy_next_is_available propagation 2013-10-17 14:49:41 +02:00
Thomas Schoebel-Theuer
35b9345d94 server: fix socket shutdown in error path 2013-10-17 07:48:32 +02:00
Thomas Schoebel-Theuer
99644a943a all: make *_switch() code idempotent
New semantics: it must be possible to call the switch functions
even when nothing has changed.
2013-10-17 07:48:32 +02:00
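The idempotency requirement from commit 99644a943a can be illustrated with a toy model. The class `Brick` and its attributes are hypothetical and do not mirror the real *_switch() functions; the sketch only shows that a repeated call with an unchanged target state has no further effect.

```python
class Brick:
    """Toy stand-in for a brick with a power state (hypothetical model)."""

    def __init__(self):
        self.powered = False
        self.transitions = 0  # counts real state changes only

    def switch(self, on):
        """Bring the brick into the requested state; safe to call repeatedly."""
        if self.powered == on:
            # Nothing has changed: return without side effects.
            return self.powered
        self.powered = on
        self.transitions += 1
        return self.powered
```

Calling `switch(True)` twice in a row leaves `transitions` at 1, so a caller may invoke the switch function on every maintenance pass without tracking whether anything changed.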
Thomas Schoebel-Theuer
7a2755a56f light: prevent races on device size 2013-10-17 07:48:32 +02:00
Thomas Schoebel-Theuer
ffc97c5c68 if: fix set_capacity() 2013-10-17 07:48:31 +02:00
Thomas Schoebel-Theuer
be24c712e0 bio: fix usage of i_size_read() 2013-10-17 07:35:35 +02:00
Thomas Schoebel-Theuer
8971edad18 if: set capacity upon regular switch() maintenance 2013-10-17 07:35:34 +02:00
Thomas Schoebel-Theuer
7f8bf6c29a brick_mem: add /proc/sys/mars/mem_allow_freelist 2013-10-17 07:30:10 +02:00
Thomas Schoebel-Theuer
4abb584aad doc: move pictures to images/ 2013-10-04 10:53:42 +02:00
Thomas Schoebel-Theuer
3b1705af99 doc: new chapter about use cases MARS vs DRBD 2013-09-17 13:36:28 +02:00
Frank Liepold
6b41af4cd9 test_suite: new and updated test cases 2013-09-17 13:36:27 +02:00
Frank Liepold
08e5803cd1 light: workaround flying IO before reporting memory leaks
We report an error if there are unfreed mrefs after the device brick
has been switched to power off.

Instead of reporting an error at once, we report only warnings in the first 20
seconds. If there are still unfreed mrefs after that time, an error is reported.
2013-09-17 13:36:27 +02:00
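The grace period from commit 08e5803cd1 can be sketched as a small classification function. The function name, the string return values and the exact boundary handling at 20 seconds are assumptions made for illustration; only the 20-second figure comes from the commit message.

```python
GRACE_SECONDS = 20  # from the commit message


def leak_report_level(unfreed_mrefs, seconds_since_power_off, grace=GRACE_SECONDS):
    """Decide how to report unfreed mrefs after the device brick was powered off."""
    if unfreed_mrefs == 0:
        return "ok"
    if seconds_since_power_off < grace:
        # IO may still be in flight, so only warn for now.
        return "warning"
    # Still unfreed after the grace period: treat as a leak.
    return "error"
```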
Thomas Schoebel-Theuer
74e12ad531 infra: add mapfree_grace_keep_mb 2013-09-17 13:36:27 +02:00
Thomas Schoebel-Theuer
0755380a52 light: show CONFIG_DEBUG* in modinfo 2013-09-17 12:16:36 +02:00
Thomas Schoebel-Theuer
797132cfb8 sio: adapt to newer kernels (kmap_atomic) 2013-09-17 12:16:36 +02:00
Thomas Schoebel-Theuer
453fcb59d8 if: fix early kill of if_brick 2013-09-17 12:16:36 +02:00
Thomas Schoebel-Theuer
9134c1b771 light: add transferstatus symlink 2013-09-17 12:16:36 +02:00
Frank Liepold
ebe0ca6ad9 light: reduce cascades on lamport clock workaround
Signed-off-by: Thomas Schoebel-Theuer <tst@1und1.de>

Some filesystems like ext3 have only full second resolution.

Therefore, we _must_ advance the Lamport clock in whole seconds
when working on such gear, since we want to prevent lost
updates which would be caused by standstill Lamport clocks.

Sometimes, the Lamport clock gets updated more than once per second of
real time. In such cases, the Lamport clock will run much faster
than real time. After some weeks of operation, the Lamport clock
will be far in the future.

In general, we cannot do anything about that. When some fine-grained
information cannot be coded into some specific data type, it
cannot be coded.

However, when updates start to occur less frequently, we want to
_leave_ the workaround mode ASAP. The old code set tv_nsec to 0
which made it very likely that the workaround was triggered
again unnecessarily.

In order to _reduce_ that effect, we prevent unnecessary cascades
of whole-second leaps by setting the nanoseconds constantly to 1
if the full second was increased due to insufficient capabilities
of the underlying filesystem. At least in those cases where
Lamport timestamps are transferred over the network and/or we have
mixed configurations between ext3/ext4, we hope to
decrease the risk of endless cascades.

Experience shows that the new code behaves better.
2013-08-28 14:54:04 +02:00
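The whole-second advancing described in commit ebe0ca6ad9 can be sketched as follows. This is a simplified model under stated assumptions, not the MARS code: timestamps are `(sec, nsec)` tuples, the filesystem is assumed to store only the seconds part, and `next_stamp` and `simulate` are hypothetical names. The sketch shows why the clock runs ahead of real time under frequent updates; it does not model the network effects mentioned in the message.

```python
def next_stamp(last, real_now):
    """Return the next Lamport stamp for a filesystem with one-second resolution.

    last, real_now -- (sec, nsec) tuples (assumed representation)
    """
    if real_now[0] > last[0]:
        # Real time is already in a later second: no workaround needed.
        return real_now
    # Same or earlier second: leap to the next whole second.
    # The nanoseconds are set to 1 rather than 0, as the commit message describes.
    return (last[0] + 1, 1)


def simulate(start, real_times):
    """Apply next_stamp for a series of real-time readings and collect the stamps."""
    stamps = []
    last = start
    for now in real_times:
        last = next_stamp(last, now)
        stamps.append(last)
    return stamps
```

For example, three updates arriving within real second 100 push the modelled clock to second 103, and a later update at real second 101 still has to leap to 104 because the clock is already ahead.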
Frank Liepold
96be062f63 tests: update 2013-08-06 14:40:16 +02:00
Thomas Schoebel-Theuer
4b59be870e copy: speedup by making overlap the default
Since commit 62e2f5944b, aio prevents races on the length
of a transaction logfile.

Therefore, we can safely enable IO parallelism at writes fired off
by copy.

The old behaviour was a serious IO bottleneck.
2013-08-06 14:30:05 +02:00