Commit Graph

839 Commits

Author SHA1 Message Date
Thomas Schoebel-Theuer
a48dcca14d marsadm: new commands {de,}activate-guest 2020-11-29 17:41:06 +01:00
Thomas Schoebel-Theuer
4cae329dc9 marsadm: speedup join-resource on non-reachable peers 2020-11-27 23:17:25 +01:00
Thomas Schoebel-Theuer
4e6ef0751b marsadm: fix annoying warning 2020-11-27 23:17:25 +01:00
Thomas Schoebel-Theuer
6fed821b6e marsadm: carefully shortcut self-waiting 2020-11-27 23:17:24 +01:00
Thomas Schoebel-Theuer
74f0da534b marsadm: fix join-cluster dir creation 2020-11-27 21:04:04 +01:00
Thomas Schoebel-Theuer
94df66a3c1 marsadm: fix too strong race detection 2020-11-26 11:40:11 +01:00
Thomas Schoebel-Theuer
feb0540224 marsadm: fix typo 2020-11-12 05:37:17 +01:00
Thomas Schoebel-Theuer
30730e4a50 marsadm: safeguard races on log-purge-res 2020-11-10 16:04:01 +01:00
Thomas Schoebel-Theuer
54226c78a7 marsadm: safeguard purge of recent logfiles 2020-11-10 16:04:01 +01:00
Thomas Schoebel-Theuer
6020414d25 marsadm: safeguard deletion of recent logfiles 2020-11-10 16:04:01 +01:00
Thomas Schoebel-Theuer
f871eb9514 marsadm: safeguard deletion of last logfile 2020-11-10 16:04:01 +01:00
Thomas Schoebel-Theuer
e9e5c1a1da marsadm: safeguard logrotate --force 2020-11-10 16:04:01 +01:00
Thomas Schoebel-Theuer
dc1e778abb marsadm: safeguard race between split-brain log-deletes 2020-11-10 16:04:01 +01:00
Thomas Schoebel-Theuer
54cb4605d0 all: bump versions 2020-11-10 16:04:01 +01:00
Thomas Schoebel-Theuer
3aa037d976 marsadm: push new replay links to primary 2020-11-10 16:04:01 +01:00
Thomas Schoebel-Theuer
69199818c8 marsadm: include probe_dir in transitive closure 2020-11-10 16:04:01 +01:00
Thomas Schoebel-Theuer
1c0e4cf9a9 marsadm: new ssh-less split-cluster method 2020-11-10 16:03:50 +01:00
Thomas Schoebel-Theuer
25cec3526e marsadm: new ssh-less merge-cluster method 2020-11-10 16:02:46 +01:00
Thomas Schoebel-Theuer
9cb9e81310 marsadm: new _push_link_foreign onto foreign IP 2020-11-10 16:02:05 +01:00
Thomas Schoebel-Theuer
9a72f86c60 marsadm: new option --no-ssh 2020-11-07 08:56:09 +01:00
Thomas Schoebel-Theuer
7511ebadcf marsadm: local peer and resource cache 2020-11-07 08:56:09 +01:00
Thomas Schoebel-Theuer
29b22a779f marsadm: check peer activations 2020-11-07 08:56:07 +01:00
Thomas Schoebel-Theuer
ab6990593d marsadm: better _get_ip 2020-11-07 08:34:57 +01:00
Thomas Schoebel-Theuer
18319eed23 marsadm: fix parsing of backslash-terminated lines
Suggested-by: dhrmn <notifications@github.com>
2020-11-07 08:25:47 +01:00
Thomas Schoebel-Theuer
70a4aae762 marsadm: primitives {is,todo,nr}-secondary 2020-11-07 08:25:47 +01:00
Thomas Schoebel-Theuer
7427478957 marsadm: primitives wait-{is,todo}-{primary,secondary}-{on,off} 2020-11-07 08:25:47 +01:00
Thomas Schoebel-Theuer
461ac8b4cd marsadm: new switch semantics on marsadm primary
Apparently, sysadmins often forget to execute "marsadm up mydata"
(or similar) after a failover.

Recall the failover command sequence:
"marsadm pause-fetch mydata; marsadm primary --force mydata"

Some months later, other sysadmins in the group stumble over
the very old "pause-fetch" after a regular planned handover via
"marsadm primary mydata". The handover works, but the former primary
(which is now secondary) no longer fetches data, because the
very old pause-fetch command was never reverted.

Afterwards, /mars is filling up slowly over a long time.

Some time later (e.g. a few days), a monitoring alert "/mars too full"
fires at midnight, leading to an unnecessary on-duty call.

A different type of monitoring could help by tracking not only
the fill level of /mars, but also view-todo-fetch or
similar. However, some people dislike this, because there
are operational use cases (like creation of backups) where pause-fetch
is executed _deliberately_ for a longer time.

Here is a workaround for a forgotten resume-fetch / up after
the first failover:

After the _original_ "marsadm primary" or "primary --force" has
succeeded, as indicated by the appearance of /dev/mars/mydata, we
simply execute the equivalent of "marsadm up mydata".

This changes the semantics of the "primary" command. Hopefully
no scripts anywhere will break.
2020-11-07 08:25:47 +01:00
Thomas Schoebel-Theuer
08c41805ec marsadm: purge any left-over probe dirs 2020-11-07 08:25:47 +01:00
Thomas Schoebel-Theuer
838b85c508 marsadm: global purge at cron 2020-11-07 08:25:47 +01:00
Thomas Schoebel-Theuer
93ef671cf3 marsadm: global purge at link-purge-all 2020-11-07 08:25:47 +01:00
Thomas Schoebel-Theuer
4d05bb3796 marsadm: split up link_purge_global 2020-11-07 08:25:47 +01:00
Thomas Schoebel-Theuer
bd5412d4f5 marsadm: fix version detection for gone members 2020-11-07 08:25:47 +01:00
Thomas Schoebel-Theuer
5b1ca6773a marsadm: safeguard missing old deletions 2020-11-07 08:25:47 +01:00
Thomas Schoebel-Theuer
533b13b3df marsadm: fix initial join-resource on slow metadata communication 2020-11-07 08:25:47 +01:00
Thomas Schoebel-Theuer
c3585565be marsadm: fix join-cluster on unknown peer 2020-11-07 08:25:47 +01:00
Thomas Schoebel-Theuer
1dd31c1285 marsadm: only ask myself upon self wait-cluster 2020-11-07 08:25:47 +01:00
Thomas Schoebel-Theuer
90947c1b14 marsadm: fix wait-cluster race abort 2020-11-07 08:25:47 +01:00
Thomas Schoebel-Theuer
72cbf7b8be marsadm: skip unnecessary wait-cluster restart 2020-11-07 08:01:07 +01:00
Thomas Schoebel-Theuer
b2cd7ddf23 marsadm: clear any local caches 2020-10-28 06:09:11 +01:00
Thomas Schoebel-Theuer
d3acf3f9c8 marsadm: fix join-cluster missing dirs 2020-10-28 06:09:11 +01:00
Thomas Schoebel-Theuer
c9b7fcf7f9 marsadm: safeguard join-resource endless loop 2020-10-28 06:09:11 +01:00
Thomas Schoebel-Theuer
e3ebc5762b marsadm: view disk-error 2020-09-30 14:24:27 +02:00
Thomas Schoebel-Theuer
26b40474cb marsadm: re-activate any forgotten fetch on handover 2020-09-21 14:40:48 +02:00
Thomas Schoebel-Theuer
ed95e24496 marsadm: allow leave-resource --force on empty resource 2020-09-19 17:42:34 +02:00
Thomas Schoebel-Theuer
ae2668b265 marsadm: hint admins on --ignore-sync 2020-09-18 17:45:57 +02:00
Thomas Schoebel-Theuer
23748272ca marsadm: remove stray nonsense 2020-09-18 17:45:57 +02:00
Thomas Schoebel-Theuer
87064c1c5a marsadm: fix primitive disk-present 2020-09-10 11:21:38 +02:00
Thomas Schoebel-Theuer
11792c250e marsadm: remove annoying doubled error code 2020-09-05 23:08:30 +02:00
Thomas Schoebel-Theuer
60baf9c378 marsadm: fix old deletions max_nr detection 2020-09-05 23:06:38 +02:00
Thomas Schoebel-Theuer
24bb735d5a marsadm: report summary on non-reachable non-member hosts 2020-09-03 16:29:55 +02:00
Thomas Schoebel-Theuer
2dbc0769d0 marsadm: old deletion method must ignore non-members 2020-09-03 16:29:55 +02:00
Thomas Schoebel-Theuer
3a727a04b7 marsadm: use ssh-free push at lowlevel-delete-host 2020-09-03 16:29:55 +02:00
Thomas Schoebel-Theuer
1e30e0c945 marsadm: use ssh-free push at lowlevel-set-host-ip 2020-09-03 16:29:55 +02:00
Thomas Schoebel-Theuer
f9044fc9bf marsadm: workaround versionlink appearance race with log-rotate 2020-09-03 16:29:55 +02:00
Thomas Schoebel-Theuer
ac689b8640 marsadm: workaround race with primary logrotate 2020-09-03 16:29:55 +02:00
Thomas Schoebel-Theuer
80f18138d3 marsadm: now simplify get_alive_links() 2020-09-03 16:29:55 +02:00
Thomas Schoebel-Theuer
bcc1a63318 marsadm: new concept guest members 2020-09-03 16:29:55 +02:00
Thomas Schoebel-Theuer
2180337e85 marsadm: avoid old rsync method at join-resource 2020-09-03 16:29:55 +02:00
Thomas Schoebel-Theuer
aecccd547c marsadm: unify naming of versionlink 2020-09-03 16:29:55 +02:00
Thomas Schoebel-Theuer
c7983a6fb6 marsadm: purge stray and/or transient guest links 2020-09-03 16:29:55 +02:00
Thomas Schoebel-Theuer
6d2091eb8e marsadm: add --keep-backups for alivelink purge 2020-09-03 16:29:55 +02:00
Thomas Schoebel-Theuer
8cddbc1851 marsadm: do not delete versionlinks during ongoing join-resource 2020-09-01 19:35:10 +02:00
Thomas Schoebel-Theuer
6750a4fc63 marsadm: join-resource needs preliminary guest-like activation 2020-09-01 19:35:10 +02:00
Thomas Schoebel-Theuer
019b991cda marsadm: earlier device check at {create,join}-resource 2020-09-01 19:35:10 +02:00
Thomas Schoebel-Theuer
3deaa91ba9 marsadm: fix non-generic timestamp override 2020-09-01 19:35:10 +02:00
Thomas Schoebel-Theuer
eddddd5fcd marsadm: fix single-resource phased ldie 2020-09-01 19:35:10 +02:00
Thomas Schoebel-Theuer
e71faba173 marsadm: fix invalid subtraction in corner case 2020-08-12 08:56:48 +02:00
Thomas Schoebel-Theuer
d4c64f60fd marsadm: safeguard race on readlink 2020-08-12 08:56:48 +02:00
Thomas Schoebel-Theuer
89b647a261 marsadm: silence compat warning 2020-08-12 08:56:48 +02:00
Thomas Schoebel-Theuer
859c208835 marsadm: silence warnings 2020-08-12 08:56:47 +02:00
Thomas Schoebel-Theuer
58a5537d0a marsadm: purge historic links 2020-08-02 13:21:29 +02:00
Thomas Schoebel-Theuer
a6167603ad marsadm: adjust report to masses of peers 2020-08-02 13:21:29 +02:00
Thomas Schoebel-Theuer
62c542bad1 marsadm: fix and speedup detection of common peers 2020-08-02 13:21:29 +02:00
Thomas Schoebel-Theuer
9b618876a7 marsadm: safeguard peer matching 2020-08-02 13:21:29 +02:00
Thomas Schoebel-Theuer
08ee99d304 marsadm: safeguard wait-cluster against illegal timestamps 2020-08-02 13:21:28 +02:00
Thomas Schoebel-Theuer
58359ff381 marsadm: safeguard features against illegal values 2020-08-02 13:21:28 +02:00
Thomas Schoebel-Theuer
38bd337aeb marsadm: fix globs without any wildcard 2020-08-02 13:21:28 +02:00
Thomas Schoebel-Theuer
201648d414 Revert "marsadm: fix corner case of "all""
This reverts commit ea804c111a.
2020-08-02 13:21:28 +02:00
Thomas Schoebel-Theuer
f5d6f29ebf marsadm: new alivelinks 2020-08-02 13:21:26 +02:00
Thomas Schoebel-Theuer
620327703b marsadm: silence warning 2020-08-02 12:10:20 +02:00
Thomas Schoebel-Theuer
1cfca1590b marsadm: fix feature version computation 2020-08-02 12:10:20 +02:00
Thomas Schoebel-Theuer
b9f68d947f marsadm: report marsadm_version 2020-08-02 12:10:20 +02:00
Thomas Schoebel-Theuer
d24c57e50a all: bump features version 2020-08-02 10:56:17 +02:00
Thomas Schoebel-Theuer
1eb85b831b marsadm: show age of hanging IO requests 2020-08-02 10:56:17 +02:00
Thomas Schoebel-Theuer
73210b2c2b marsadm: use stderr for several messages 2020-08-02 10:56:17 +02:00
Thomas Schoebel-Theuer
6bdbfbbb36 marsadm: cron in phases with single sleep 2020-08-02 10:56:17 +02:00
Thomas Schoebel-Theuer
5d347a5201 marsadm: fix LOOP timeout 2020-08-02 10:56:17 +02:00
Thomas Schoebel-Theuer
950e0ca258 marsadm: new lskip 2020-08-02 10:56:17 +02:00
Thomas Schoebel-Theuer
911f7cb83d marsadm: fix failure compensation 2020-08-02 10:56:17 +02:00
Thomas Schoebel-Theuer
a9c6e20f9f marsadm: new --error-injection-phase for testing 2020-08-02 10:56:17 +02:00
Thomas Schoebel-Theuer
4331383355 marsadm: join-resource also push links 2020-08-02 10:56:17 +02:00
Thomas Schoebel-Theuer
3daffa9656 marsadm: new join-cluster method without ssh/rsync 2020-08-02 10:56:17 +02:00
Thomas Schoebel-Theuer
90c165c272 marsadm: fix wait-cluster after join-cluster 2020-07-31 09:26:20 +02:00
Thomas Schoebel-Theuer
c1bed57e80 marsadm: fix full ping 2020-07-31 09:26:20 +02:00
Thomas Schoebel-Theuer
eebb5098d4 marsadm: safeguard missing replaylink 2020-07-20 09:45:20 +02:00
Thomas Schoebel-Theuer
c0154f2e06 marsadm: tighten try_to_avoid_splitbrain 2020-07-20 09:45:19 +02:00
Thomas Schoebel-Theuer
752ed6397f marsadm: decrease speakiness of info messages 2020-07-20 09:45:19 +02:00
Thomas Schoebel-Theuer
fd689d0bd2 marsadm: decrease speakiness of compressions/digests 2020-07-20 09:45:19 +02:00
Thomas Schoebel-Theuer
8c7b2d6027 marsadm: safeguard file creation and touch 2020-07-20 09:45:19 +02:00
Thomas Schoebel-Theuer
95683eef95 marsadm: fix device detection for EXTREMELY old modules 2020-07-20 09:45:19 +02:00
Thomas Schoebel-Theuer
84c37376c6 marsadm: fix file detection 2020-07-20 09:45:19 +02:00
Thomas Schoebel-Theuer
ec00d2abb9 marsadm: fix leave-resource new deletions 2020-07-20 09:45:19 +02:00
Thomas Schoebel-Theuer
fc4af8c32a marsadm: fix --parallel error_count and status 2020-07-10 08:45:42 +02:00
Thomas Schoebel-Theuer
27ea1238a1 marsadm: fix remote alivelink timestamp race 2020-07-10 08:45:42 +02:00
Thomas Schoebel-Theuer
230cb716a0 marsadm: fix attach/detach timeout when no modprobe 2020-07-10 08:45:42 +02:00
Thomas Schoebel-Theuer
9772c52bec marsadm: fix device_exists() fallback to local detection 2020-07-10 08:45:42 +02:00
Thomas Schoebel-Theuer
cdbc8aa752 marsadm: allow --singlestep phase execution for debugging 2020-06-30 21:07:09 +02:00
Thomas Schoebel-Theuer
027be54fd7 marsadm: introduce fail_action for error compensation 2020-06-30 21:07:09 +02:00
Thomas Schoebel-Theuer
12e4747e50 marsadm: invalidate cannot be forced on primary 2020-06-30 21:07:09 +02:00
Thomas Schoebel-Theuer
cd2cb5c1bc marsadm: factor out helper device_exists() 2020-06-30 21:07:09 +02:00
Thomas Schoebel-Theuer
468c80aeeb marsadm: do not init systemd-want 2020-06-30 21:07:09 +02:00
Thomas Schoebel-Theuer
fc2f7062fe marsadm: allow empty expansion of 'all' 2020-06-30 21:07:09 +02:00
Thomas Schoebel-Theuer
f46b562c3f marsadm: pretty-print default-header 2020-05-29 21:06:01 +02:00
Thomas Schoebel-Theuer
d34b204030 marsadm: reduce deprecated _get_actual_primary()
Final removal is only possible after an agreement is found
that *-1and1 macros can be removed.
2020-05-17 07:38:23 +02:00
Thomas Schoebel-Theuer
37a7acaf6f marsadm: distinguish role ForcedPrimary 2020-05-17 07:38:23 +02:00
Thomas Schoebel-Theuer
5ee5298e7b marsadm: new primitives nr-{attach,sync,fetch,replay,primary} 2020-05-17 07:38:23 +02:00
Thomas Schoebel-Theuer
0f3f43575b marsadm: fix join-resource corner case 2020-05-17 07:38:23 +02:00
Thomas Schoebel-Theuer
3b533dea06 marsadm: report LocalDevice stats 2020-04-13 11:24:02 +02:00
Thomas Schoebel-Theuer
c6eb62e890 marsadm: new primitive device-error 2020-04-13 11:24:02 +02:00
Thomas Schoebel-Theuer
61bbdec62f marsadm: new primitive device-nrflying 2020-04-13 11:24:01 +02:00
Thomas Schoebel-Theuer
2f35c9f6e7 marsadm: new primitives device-{ops-rate,amount-rate,rate} 2020-04-13 11:24:01 +02:00
Thomas Schoebel-Theuer
628c636dff all: distinguish *_ops_* from *_amount_* at limiter 2020-04-13 11:24:01 +02:00
Thomas Schoebel-Theuer
6e760727c4 all: bump features version 2020-04-13 11:21:17 +02:00
Thomas Schoebel-Theuer
03cc3874e8 marsadm: set and report flags in cleartext 2020-04-13 11:21:17 +02:00
Thomas Schoebel-Theuer
3ba04911c2 marsadm: bump version 2020-04-11 08:16:51 +02:00
Thomas Schoebel-Theuer
08a9c7a273 marsadm: new EXPERIMENTAL deletion method 2020-04-11 08:16:51 +02:00
Thomas Schoebel-Theuer
57ed669472 marsadm: final deletions via cron 2020-04-08 20:39:38 +02:00
Thomas Schoebel-Theuer
80c70599c8 marsadm: allow deletion of directories 2020-04-08 20:39:38 +02:00
Thomas Schoebel-Theuer
b9f0f57a32 marsadm: obey .deleted otherwise 2020-04-08 20:39:38 +02:00
Thomas Schoebel-Theuer
f9d2f2696f marsadm: obey .deleted in -l -f -e 2020-04-08 20:39:38 +02:00
Thomas Schoebel-Theuer
6ce4cfa723 marsadm: obey .deleted in all globs 2020-04-08 20:39:38 +02:00
Thomas Schoebel-Theuer
96646fee1e marsadm: new handover waiting 2020-04-06 15:12:43 +02:00
Thomas Schoebel-Theuer
582a3de94e marsadm: allow busy looping 2020-04-06 15:12:43 +02:00
Thomas Schoebel-Theuer
81ed8e7eed marsadm: factor out forking 2020-04-06 15:12:43 +02:00
Thomas Schoebel-Theuer
f2990a9d4f marsadm: further try_to_avoid_splitbrain() 2020-04-06 15:12:43 +02:00
Thomas Schoebel-Theuer
24f4051b53 marsadm: make check_primary_gone() more robust 2020-04-06 15:12:43 +02:00
Thomas Schoebel-Theuer
1c5416b6fc marsadm: stabilize versionlink correction 2020-04-06 15:12:43 +02:00
Thomas Schoebel-Theuer
a3eb193dc0 marsadm: do not fail logrotate at secondaries 2020-04-06 15:12:43 +02:00
Thomas Schoebel-Theuer
1477d2adfb marsadm: reduce sleep time 2020-03-28 10:23:30 +01:00
Thomas Schoebel-Theuer
7ab9ac1a38 marsadm: skip unnecessary deletion wait 2020-03-28 10:23:30 +01:00
Thomas Schoebel-Theuer
bd61306a75 marsadm: avoid unnecessary rsync 2020-03-28 10:23:30 +01:00
Thomas Schoebel-Theuer
44a4054886 marsadm: speedup join-resource 2020-03-28 10:23:30 +01:00
Thomas Schoebel-Theuer
762477849c marsadm: avoid mutual symlink clobbering 2020-03-28 10:21:22 +01:00
Thomas Schoebel-Theuer
ea804c111a marsadm: fix corner case of "all" 2020-03-28 10:21:22 +01:00
Thomas Schoebel-Theuer
263d9fa9d7 marsadm: new command update-cluster 2020-03-28 10:21:22 +01:00
Thomas Schoebel-Theuer
c3e5df459f marsadm: fix race between fetch and primary --force 2020-02-28 09:41:05 +01:00
Thomas Schoebel-Theuer
c3f9970029 marsadm: new option --parallel 2020-02-15 15:32:35 +01:00
Thomas Schoebel-Theuer
002f10839a marsadm: implicit log-purge-all before {create,join}-resource
After certain incidents, leftovers may remain.
Before complaining about them and before refusing an important
repair step, just clean up beforehand.
2020-01-25 20:15:23 +01:00
Thomas Schoebel-Theuer
a65205b8e1 marsadm: fix interpretation of leading zeros 2020-01-25 20:15:23 +01:00
Thomas Schoebel-Theuer
f482f6db33 marsadm: new command err-purge-all 2020-01-25 20:15:23 +01:00