Commit Graph

339 Commits

Author SHA1 Message Date
Thomas Schoebel-Theuer
8deb1c7d02 marsadm: unlink leftover deletion links 2017-08-25 15:07:59 +02:00
Thomas Schoebel-Theuer
6a9795f247 marsadm: speed up error text retrieval 2017-07-05 07:38:15 +02:00
Thomas Schoebel-Theuer
86a4f1674c marsadm: introduce configurable MARS_PATH 2017-07-05 07:38:15 +02:00
Thomas Schoebel-Theuer
4c74c8e985 marsadm: fetch newest symlinks at join-resource 2017-07-05 07:38:15 +02:00
Thomas Schoebel-Theuer
12e41def3f marsadm: cleanup old remains on join-resource --force 2017-07-05 07:38:15 +02:00
Thomas Schoebel-Theuer
66734e4211 marsadm: log-purge-all must not fail on empty resource 2017-07-05 07:38:15 +02:00
Thomas Schoebel-Theuer
d3ede5b39f marsadm: tolerate empty resource dirs at leave-resource 2017-07-05 07:38:15 +02:00
Thomas Schoebel-Theuer
ee94c1279a marsadm: safeguard rsync at join-cluster 2017-07-05 07:38:15 +02:00
Thomas Schoebel-Theuer
1950c0fc1b marsadm: internal wait-cluster before doing join-resource
This is necessary when the full mesh communication is relaxed.
2017-07-05 07:38:15 +02:00
Thomas Schoebel-Theuer
1d85ec9cb3 userspace: rework ssh and rsync 2017-07-05 07:38:14 +02:00
Thomas Schoebel-Theuer
60a08c7387 marsadm: better --dry-run 2017-07-05 07:38:14 +02:00
Thomas Schoebel-Theuer
7bb3b2abcd marsadm: fix syslog quotation characters 2017-07-05 07:38:14 +02:00
Thomas Schoebel-Theuer
a53b467808 marsadm: add feature version number 2017-05-28 19:13:14 +02:00
Thomas Schoebel-Theuer
12f7e83ab0 marsadm: sync caches upon detach 2017-02-09 10:13:38 +01:00
Thomas Schoebel-Theuer
812011aa07 marsadm: make logrotate more robust against missing logfiles
This should not happen at all.

However, over several million operation hours, it does occur when
hardware is defective. Try self-healing as far as possible.
2017-01-25 09:30:52 +01:00
Thomas Schoebel-Theuer
838c98ca6d marsadm: systematically missing macros *-logcount 2016-08-09 09:37:10 +02:00
Thomas Schoebel-Theuer
d09cc8e218 marsadm: fix {replay,fetch,work}-lognr and replay-basenr
Exporting these had been forgotten, and they were not systematic.
2016-08-09 09:37:10 +02:00
Thomas Schoebel-Theuer
474d7d0a05 marsadm: fix wrong lognr result in corner case 2016-08-09 09:37:10 +02:00
Thomas Schoebel-Theuer
6559c534be marsadm: directly switch back to former primary
Use the new knowledge about old primary.

This is only relevant for people who are consistently ignoring
mars-manual.pdf which clearly states that intermediate
"marsadm secondary" should not be used at all, except for the
last step in final destruction of a resource.
2016-08-09 09:37:10 +02:00
Thomas Schoebel-Theuer
79a1d20c69 marsadm: fix annoying perl warning 2016-08-09 09:37:10 +02:00
Thomas Schoebel-Theuer
f89e0a7d96 marsadm: lowlevel IP address commands
This is absolutely necessary for coping with changes in network
setups.
2016-03-09 09:42:38 +01:00
Thomas Schoebel-Theuer
207635632b marsadm: check uniqueness of IPs at join-cluster 2016-03-01 11:58:23 +01:00
Thomas Schoebel-Theuer
20eca8c447 marsadm: verbose callstack at ldie 2016-03-01 11:58:23 +01:00
Thomas Schoebel-Theuer
83ae4720fa marsadm: reimplement buggy primitive macros
The old version was complicated and error-prone, due to historic
development.

Now the structure should be much simpler.
2016-02-15 07:10:41 +01:00
Thomas Schoebel-Theuer
8c3cfe97f3 marsadm: show wrong permissions
Feature request by Tilmann Steinberg.

It greatly eases debugging when searching for a source of wrong
permissions.

Some admin tools like Puppet seem to have their own default notion
of "secure permissions" and try to "fix" them ;)
2016-02-15 07:10:41 +01:00
Thomas Schoebel-Theuer
c0d57bef7a marsadm: fix view-wait-is-* when symlinks are not yet present 2016-02-15 07:10:40 +01:00
Thomas Schoebel-Theuer
1edef479fc marsadm: show the old designated primary in the log
This is vital for incident analysis.
2016-02-03 22:01:49 +01:00
Thomas Schoebel-Theuer
89014d29c3 marsadm: new primitive device-opened
This is absolutely needed for race avoidance in scripting.
2016-02-03 22:01:48 +01:00
Thomas Schoebel-Theuer
561c2bd6c6 marsadm: rename occurrences of deprecated present-{disk,device} 2016-02-03 22:01:48 +01:00
Thomas Schoebel-Theuer
6418370357 marsadm: rename present-{disk,device} to *-present and deprecate it
This is important for namespace systematics of primitive macros.

First name the object, then name its property. Like in OO.

Exception: when _finding_ the object itself needs an operation, or
additional information, e.g. %get-disk{} (this is the "lookup operation"
for the object itself, at least by concept).

For compatibility, the old forms will also be accepted
(silently, undocumented).
2016-02-03 22:01:48 +01:00
Thomas Schoebel-Theuer
08c776fc36 marsadm: allow devices as size argument 2016-02-03 22:01:48 +01:00
Thomas Schoebel-Theuer
f4f9ba93e2 marsadm: correct replay error checking 2016-02-03 22:01:48 +01:00
Thomas Schoebel-Theuer
7ff2d896ea marsadm: fix join-cluster when the peer is actively running
In such a case rsync may report an error because some symlinks
were updated in the meantime or have vanished. We can safely
ignore that.
2016-02-03 22:01:48 +01:00
Thomas Schoebel-Theuer
e36a2ea4f1 marsadm: fix present-{disk,device} 2016-02-03 22:01:48 +01:00
Thomas Schoebel-Theuer
0e6bb47cb6 marsadm: fix edge cases of try_to_avoid_splitbrain()
Originally a trivial silly bug (boolean value was wrong), leading to an
endless loop when a local versionlink was missing, which can happen
only after a primary crash at the wrong moment shortly after a logrotate
(not even during ordinary operations), followed by a hard reboot.

As documented in mars-manual.pdf, you simply need "modprobe mars"
to recover after such a crash reboot. MARS remembers the primary state
persistently for you and restores everything _automatically_.

Using "marsadm primary" in such a case to switch the current primary
to primary again (after an unnecessary "marsadm secondary" which is
strongly discouraged by mars-manual.pdf), although the host is / was
already in primary state after the reboot, is at least as silly as
the mentioned bug. Doing this in an /etc/init.d/ startup script
where it really doesn't belong is even more silly.

The latter is even an OPERATIONAL RISK, because "marsadm secondary"
works _globally_ in the whole cluster (as documented in mars-manual.pdf).
Such an improper startup script _can_ (potentially) disturb another
cluster member which had become primary in the _meantime_ during reboot.
Global cluster operations don't belong in startup scripts, because
reboots may happen unintentionally at any time.
2016-02-03 22:00:47 +01:00
Thomas Schoebel-Theuer
e207443833 marsadm: fix binary operators =~ and "match" 2016-01-21 08:09:48 +01:00
Thomas Schoebel-Theuer
feb0b34604 marsadm: fix irritating "Inconsistent" display at primary side
At an actual primary, "Inconsistent" would be the correct description
for the state of the _disk_.

However most sysadmins will confuse this with the state of the
_replication_ (which is of course never inconsistent during
writeback from the memory buffer).

Although documented correctly, misunderstandings continue
to survive, because humans are automatically abstracting away
from detail components such as a "disk", and are automatically
assuming that "marsadm view" would relate to the replication
as a whole.

Avoid misunderstandings by more detailed message distinctions
aiming to address all of these in parallel.
2016-01-15 17:58:30 +01:00
Thomas Schoebel-Theuer
cd122db700 marsadm: display logfile replay errors in diskstate 2016-01-15 17:58:27 +01:00
Thomas Schoebel-Theuer
cc1074fc53 marsadm: add primitive macro errno-text 2016-01-15 17:29:47 +01:00
Thomas Schoebel-Theuer
6c41326f7a marsadm: add basic macro replay-code 2016-01-15 17:23:14 +01:00
Thomas Schoebel-Theuer
cc1d786654 marsadm: disallow ordinary switching when logfiles are damaged
Only primary --force should be possible in such a (rare) case.
2016-01-15 17:10:48 +01:00
Thomas Schoebel-Theuer
69386b33d9 marsadm: fix /mars security issues
Only relevant for non-storage servers to which customers have access.

Notice that /mars is a _reserved_ filesystem for MARS-internal purposes.
It has nothing to do with an ordinary filesystem.

Users generally have to be kept out.
2016-01-13 14:12:00 +01:00
Thomas Schoebel-Theuer
3a543d5ca5 marsadm: improve weird --host=other deletion 2015-10-07 10:42:07 +02:00
Thomas Schoebel-Theuer
8e786d129f marsadm: remove distracting warning
This is no longer needed.
2015-08-04 14:18:45 +02:00
Thomas Schoebel-Theuer
58294defe5 marsadm: safeguard {create,join}-resource against old remains 2015-08-04 10:21:32 +02:00
Thomas Schoebel-Theuer
3e92223e47 marsadm: fix annoying warning in corner case 2015-07-22 12:19:41 +02:00
Thomas Schoebel-Theuer
b190337eb8 marsadm: use $real_host for deletions
This can make a difference when using --host= and the other
host no longer really exists.
2015-05-05 09:30:08 +02:00
Thomas Schoebel-Theuer
5f485b6a02 marsadm: better leave-cluster cleanup of dead files 2015-05-05 09:30:08 +02:00
Thomas Schoebel-Theuer
fb880f9b2c marsadm: leave-cluster should rmmod only on real host 2015-05-05 09:30:08 +02:00
Thomas Schoebel-Theuer
c6fc05a3be marsadm: allow --force --host= cleanup on non-joined host 2015-05-05 09:30:04 +02:00