mars/userspace
Thomas Schoebel-Theuer 461ac8b4cd marsadm: new switch semantics on marsadm primary
Apparently, sysadmins often forget to execute "marsadm up mydata"
(or similar) after a failover.

Recall the failover command sequence:
"marsadm pause-fetch mydata; marsadm primary --force mydata"

Some months later, other sysadmins in the group stumble over
the very old "pause-fetch" after a regular planned handover via
"marsadm primary mydata". The handover works, but the former primary
(which is now secondary) no longer fetches data, because the
very old pause-fetch command was never reverted.

Afterwards, /mars slowly fills up over a long time.

Some time later (e.g. a few days), a monitoring alert "/mars too full"
fires at midnight, leading to an unnecessary on-duty call.

A different type of monitoring could help, by tracking not only
the fill level of /mars, but also view-todo-fetch or
similar. However, some people dislike this, because there
are operational use cases (like creation of backups) where pause-fetch
is executed _deliberately_ for a longer time.

Here is a workaround for a forgotten resume-fetch / up after
the first failover:

Once the _original_ "marsadm primary" or "primary --force" has
succeeded, as indicated by the appearance of /dev/mars/mydata, we
simply execute the equivalent of "marsadm up mydata".
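
For installations that predate this change, the manual sequence could
be sketched as a small shell wrapper. The function name
failover_with_up and the MARSADM variable are hypothetical and exist
only for illustration; the marsadm subcommands are the ones quoted in
this message.

```shell
# Hypothetical wrapper, not part of marsadm.
# MARSADM may be overridden (e.g. MARSADM=echo) for a dry run.
MARSADM="${MARSADM:-marsadm}"

failover_with_up() {
    res="$1"
    # Failover sequence as quoted above ...
    "$MARSADM" pause-fetch "$res" &&
    "$MARSADM" primary --force "$res" &&
    # ... followed by the step that is often forgotten.
    "$MARSADM" up "$res"
}
```

With MARSADM=echo, calling "failover_with_up mydata" only prints the
three subcommands in order, without touching any resource.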

This changes the semantics of the "primary" command. Hopefully
no scripts in this world will break.
2020-11-07 08:25:47 +01:00
cron.d            userspace: improved cron job                          2017-09-27 07:11:46 +02:00
udev              infra: add new udev rules                             2014-12-09 14:31:42 +01:00
make-man.sh       doc: create manpage automatically from marsadm --help 2015-02-27 11:32:57 +01:00
mars-log-impex.c  log-impex: provisionary compatibility                 2016-08-09 09:37:10 +02:00
marsadm           marsadm: new switch semantics on marsadm primary      2020-11-07 08:25:47 +01:00
marsadm.8         doc: update version                                   2015-03-09 09:53:06 +01:00
write-reboot.c    all: clarify license GPLv2+                           2014-11-25 18:09:17 +01:00