362 Commits

Author SHA1 Message Date
Thomas Schoebel-Theuer 12f7e83ab0 marsadm: sync caches upon detach 2017-02-09 10:13:38 +01:00
Thomas Schoebel-Theuer 812011aa07 marsadm: make logrotate more robust against missing logfiles
This should not happen at all.

Over several million operating hours, however, it does occur when
hardware is defective. Try self-healing as far as possible.
2017-01-25 09:30:52 +01:00
Thomas Schoebel-Theuer 185b63070c log-impex: provisional compatibility 2016-08-09 09:37:10 +02:00
Thomas Schoebel-Theuer f048aec390 userspace: add example cronjob 2016-08-09 09:37:10 +02:00
Thomas Schoebel-Theuer bb6b65a002 userspace: add basic systemd unit
First try. May need some improvements in the future.
2016-08-09 09:37:10 +02:00
Thomas Schoebel-Theuer 838c98ca6d marsadm: systematically missing macros *-logcount 2016-08-09 09:37:10 +02:00
Thomas Schoebel-Theuer d09cc8e218 marsadm: fix {replay,fetch,work}-lognr and replay-basenr
These had not been exported, and they were not systematic.
2016-08-09 09:37:10 +02:00
Thomas Schoebel-Theuer 474d7d0a05 marsadm: fix wrong lognr result in corner case 2016-08-09 09:37:10 +02:00
Thomas Schoebel-Theuer 6559c534be marsadm: directly switch back to former primary
Use the new knowledge about old primary.

This is only relevant for people who are consistently ignoring
mars-manual.pdf which clearly states that intermediate
"marsadm secondary" should not be used at all, except for the
last step in final destruction of a resource.
2016-08-09 09:37:10 +02:00
Thomas Schoebel-Theuer 79a1d20c69 marsadm: fix annoying perl warning 2016-08-09 09:37:10 +02:00
Thomas Schoebel-Theuer f89e0a7d96 marsadm: lowlevel IP address commands
This is absolutely necessary for coping with changes in network
setups.
2016-03-09 09:42:38 +01:00
Thomas Schoebel-Theuer 207635632b marsadm: check uniqueness of IPs at join-cluster 2016-03-01 11:58:23 +01:00
Thomas Schoebel-Theuer 20eca8c447 marsadm: verbose callstack at ldie 2016-03-01 11:58:23 +01:00
Thomas Schoebel-Theuer 83ae4720fa marsadm: reimplement buggy primitive macros
The old version was complicated and error-prone, due to its
historical development.

Now the structure should be much simpler.
2016-02-15 07:10:41 +01:00
Thomas Schoebel-Theuer 8c3cfe97f3 marsadm: show wrong permissions
Feature request by Tilmann Steinberg.

It greatly eases debugging when searching for a source of wrong
permissions.

Some admin tools like Puppet seem to have their own default notion
of "secure permissions" and try to "fix" them ;)
2016-02-15 07:10:41 +01:00
Thomas Schoebel-Theuer c0d57bef7a marsadm: fix view-wait-is-* when symlinks are not yet present 2016-02-15 07:10:40 +01:00
Thomas Schoebel-Theuer 1edef479fc marsadm: show the old designated primary in the log
This is vital for incident analysis.
2016-02-03 22:01:49 +01:00
Thomas Schoebel-Theuer 89014d29c3 marsadm: new primitive device-opened
This is absolutely needed for race avoidance in scripting.
2016-02-03 22:01:48 +01:00
Thomas Schoebel-Theuer 561c2bd6c6 marsadm: rename occurrences of deprecated present-{disk,device} 2016-02-03 22:01:48 +01:00
Thomas Schoebel-Theuer 6418370357 marsadm: rename present-{disk,device} to *-present and deprecate it
This is important for namespace systematics of primitive macros.

First name the object, then name its property. Like in OO.

Exception: when _finding_ the object itself needs an operation, or
additional information, e.g. %get-disk{} (this is the "lookup operation"
for the object itself, at least by concept).

For compatibility, the old forms are still accepted
(silently, undocumented).
2016-02-03 22:01:48 +01:00
Thomas Schoebel-Theuer 08c776fc36 marsadm: allow devices as size argument 2016-02-03 22:01:48 +01:00
Thomas Schoebel-Theuer f4f9ba93e2 marsadm: correct replay error checking 2016-02-03 22:01:48 +01:00
Thomas Schoebel-Theuer 7ff2d896ea marsadm: fix join-cluster when the peer is actively running
In such a case rsync may report an error because some symlinks
were updated in the meantime or have vanished. We can safely
ignore that.
2016-02-03 22:01:48 +01:00
Thomas Schoebel-Theuer e36a2ea4f1 marsadm: fix present-{disk,device} 2016-02-03 22:01:48 +01:00
Thomas Schoebel-Theuer 0e6bb47cb6 marsadm: fix edge cases of try_to_avoid_splitbrain()
Originally a trivial silly bug (boolean value was wrong), leading to an
endless loop when a local versionlink was missing, which can happen
only after a primary crash at the wrong moment shortly after a logrotate
(not even during ordinary operations), followed by a hard reboot.

As documented in mars-manual.pdf, you simply need "modprobe mars"
to recover after such a crash reboot. MARS remembers the primary state
persistently for you and restores everything _automatically_.

Using "marsadm primary" in such a case to switch the current primary
to primary again (after an unnecessary "marsadm secondary" which is
strongly discouraged by mars-manual.pdf), although the host is / was
already in primary state after the reboot, is at least as silly as
the mentioned bug. Doing this in an /etc/init.d/ startup script,
where it really does not belong, is even sillier.

The latter is even an OPERATIONAL RISK, because "marsadm secondary"
works _globally_ in the whole cluster (as documented in mars-manual.pdf).
Such an improper startup script _can_ (potentially) disturb another
cluster member which had become primary in the _meantime_ during reboot.
Global cluster operations do not belong in startup scripts, because
reboots may happen unintentionally at any time.
2016-02-03 22:00:47 +01:00
Thomas Schoebel-Theuer e207443833 marsadm: fix binary operators =~ and "match" 2016-01-21 08:09:48 +01:00
Thomas Schoebel-Theuer feb0b34604 marsadm: fix irritating "Inconsistent" display at primary side
At an actual primary, "Inconsistent" would be the correct description
for the state of the _disk_.

However most sysadmins will confuse this with the state of the
_replication_ (which is of course never inconsistent during
writeback from the memory buffer).

Although documented correctly, misunderstandings continue
to survive, because humans are automatically abstracting away
from detail components such as a "disk", and are automatically
assuming that "marsadm view" would relate to the replication
as a whole.

Avoid misunderstandings by more detailed message distinctions
aiming to address all of these in parallel.
2016-01-15 17:58:30 +01:00
Thomas Schoebel-Theuer cd122db700 marsadm: display logfile replay errors in diskstate 2016-01-15 17:58:27 +01:00
Thomas Schoebel-Theuer cc1074fc53 marsadm: add primitive macro errno-text 2016-01-15 17:29:47 +01:00
Thomas Schoebel-Theuer 6c41326f7a marsadm: add basic macro replay-code 2016-01-15 17:23:14 +01:00
Thomas Schoebel-Theuer cc1d786654 marsadm: disallow ordinary switching when logfiles are damaged
Only primary --force should be possible in such a (rare) case.
2016-01-15 17:10:48 +01:00
Thomas Schoebel-Theuer 69386b33d9 marsadm: fix /mars security issues
Only relevant for non-storage servers to which customers have access.

Notice that /mars is a _reserved_ filesystem for MARS-internal purposes.
It has nothing to do with an ordinary filesystem.

Users generally have to be kept out.
2016-01-13 14:12:00 +01:00
Thomas Schoebel-Theuer 3a543d5ca5 marsadm: improve weird --host=other deletion 2015-10-07 10:42:07 +02:00
Thomas Schoebel-Theuer 8e786d129f marsadm: remove distracting warning
This is no longer needed.
2015-08-04 14:18:45 +02:00
Thomas Schoebel-Theuer 58294defe5 marsadm: safeguard {create,join}-resource against old remains 2015-08-04 10:21:32 +02:00
Thomas Schoebel-Theuer 3e92223e47 marsadm: fix annoying warning in corner case 2015-07-22 12:19:41 +02:00
Thomas Schoebel-Theuer b190337eb8 marsadm: use $real_host for deletions
This can make a difference when using --host= and the other
host no longer really exists.
2015-05-05 09:30:08 +02:00
Thomas Schoebel-Theuer 5f485b6a02 marsadm: better leave-cluster cleanup of dead files 2015-05-05 09:30:08 +02:00
Thomas Schoebel-Theuer fb880f9b2c marsadm: leave-cluster should rmmod only on real host 2015-05-05 09:30:08 +02:00
Thomas Schoebel-Theuer c6fc05a3be marsadm: allow --force --host= cleanup on non-joined host 2015-05-05 09:30:04 +02:00
Thomas Schoebel-Theuer 2a99f12294 marsadm: cleanup all orphaned symlinks 2015-05-05 09:12:42 +02:00
Thomas Schoebel-Theuer 1eea119870 marsadm: allow --force on 'all' 2015-05-05 09:12:37 +02:00
Thomas Schoebel-Theuer 3f9571999d marsadm: allow log-rotate on secondaries upon --force
This does not make much sense, but is provided for cases where you are
really desperate.
2015-05-05 08:56:41 +02:00
Thomas Schoebel-Theuer 27e975f4d7 marsadm: allow --threshold=human-readable 2015-03-24 08:33:03 +01:00
Thomas Schoebel-Theuer 5bead43add marsadm: distinguish multiples of 1024 from 1000 2015-03-24 08:33:03 +01:00
Thomas Schoebel-Theuer f43d5fd58e marsadm: add list inquiry functions 2015-03-24 08:33:03 +01:00
Thomas Schoebel-Theuer 8d60cb304c marsadm: show replication infos only on secondaries 2015-03-24 08:33:03 +01:00
Thomas Schoebel-Theuer c481f75cb8 marsadm: hint on wasted disk space 2015-03-24 08:33:03 +01:00
Thomas Schoebel-Theuer c88965e24a marsadm: report disk/device sizes 2015-03-24 08:33:03 +01:00
Thomas Schoebel-Theuer 1f2680dd62 marsadm: fix external races on resize 2015-03-24 08:33:03 +01:00