marsadm: try to avoid split brain on primary switching

Thomas Schoebel-Theuer 2014-01-21 13:43:43 +01:00
parent 2b71a212de
commit 49c13052f7
3 changed files with 70 additions and 5 deletions

@@ -131,7 +131,7 @@ tst@1und1.de
\end_layout
\begin_layout Date
-Version 0.10 (incomplete)
+Version 0.11 (incomplete)
\end_layout
\begin_layout Lowertitleback
@@ -2593,7 +2593,7 @@ marsadm primary
\begin_layout Standard
The preconditions try to protect you from doing silly things, such as accidental
ly provoking a split brain error state.
-We want to avoid split brain as well as we can.
+We try to avoid split brain as best as we can.
Therefore, we distinguish between
\emph on
intended
@@ -2603,7 +2603,53 @@ intended
emergency
\emph default
switching.
-Intended switching will try to avoid split brain as best as it can.
+Intended switching will try to avoid split brain
+\emph on
+as best as it can
+\emph default
+.
+\end_layout
+\begin_layout Standard
+\noindent
+\begin_inset Graphics
+filename images/MatieresCorrosives.png
+lyxscale 50
+scale 17
+\end_inset
+Don't
+\emph on
+rely
+\emph default
+on split brain avoidance, in particular when scripting any higher-level
+applications such as cluster managers.
+\family typewriter
+marsadm
+\family default
+does its best, but at least in case of (unnoticed) network outages / partitions
+(or even
+\emph on
+very
+\emph default
+slow / overloaded networks), an attempt to become up-to-date is likely
+to fail.
+If you want to
+\emph on
+ensure
+\emph default
+that no split brain can result from intended primary switching, please
+give the
+\family typewriter
+primary
+\family default
+command only after your secondary is
+\emph on
+known
+\emph default
+to be up-to-date.
\end_layout
\begin_layout Standard

Binary file not shown.
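
Note on the manual change: the new paragraph advises scripts (such as cluster managers) to issue the primary command only once the secondary is known to be up-to-date. A minimal sketch of that ordering follows. Both `switch_if_uptodate` and the two code references are hypothetical names introduced here for illustration; how "known to be up-to-date" is determined is not specified in this commit and is left to the caller.

```perl
use strict;
use warnings;

# Hypothetical guard (not part of marsadm): perform the switch only when
# the caller-supplied check says the secondary is known to be up-to-date.
#   $is_uptodate : code ref returning true/false (assumption: site-specific)
#   $do_switch   : code ref that issues the actual primary command
sub switch_if_uptodate {
    my ($is_uptodate, $do_switch) = @_;
    return 0 unless $is_uptodate->();   # refuse rather than risk split brain
    $do_switch->();
    return 1;
}
```

A caller might pass something like `sub { system("marsadm", "primary", $res) == 0 }` as `$do_switch`; the resource argument shown there is an assumption about the local setup, not something this diff documents.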

@@ -600,7 +600,25 @@ sub detect_splitbrain {
}
sub try_to_avoid_splitbrain {
-# NYI
+my ($cmd, $res) = @_;
+my ($min, $max) = get_minmax_versions($res);
+my @host_list = glob("$mars/resource-$res/replay-*");
+return if scalar(@host_list) < 2;
+my $vers_glob = sprintf("$mars/resource-$res/version-%09d-*", $max);
+for (;;) {
+my $ok = 1;
+my @versions = glob($vers_glob);
+my $first = get_link(shift @versions);
+while (@versions) {
+my $next = get_link(shift @versions);
+if ($next ne $first) {
+$ok = 0;
+}
+}
+last if $ok;
+lprint "trying to avoid split brain: logfile update not yet completed.\n";
+sleep_timeout();
+}
}
sub get_size {
@@ -1395,7 +1413,8 @@ sub primary_phase2 {
return if $force;
return unless $cmd eq "primary";
check_primary_gone($res);
-try_to_avoid_splitbrain(@_);
+my $ok = detect_splitbrain($res);
+try_to_avoid_splitbrain(@_) if $ok;
}
# when necessary, switch to primary
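
Note on `try_to_avoid_splitbrain`: the loop above waits until all version links for the newest logfile carry the same value. The following standalone sketch isolates that agreement check and adds two guards that the diff does not show: an empty list is treated as "no agreement", and the number of polls is bounded. `all_links_agree`, `wait_for_agreement`, `$max_tries` and `$pause` are hypothetical names for illustration; in the real script, `sleep_timeout()` may already enforce a limit (its body is not part of this diff).

```perl
use strict;
use warnings;

# True only if the list is non-empty and every entry equals the first one.
# Undefined entries (e.g. an unreadable link) count as disagreement.
sub all_links_agree {
    my @targets = @_;
    my $first = shift @targets;
    return 0 unless defined $first;
    for my $next (@targets) {
        return 0 unless defined $next && $next eq $first;
    }
    return 1;
}

# Poll $fetch (a code ref returning the current link values) until they
# agree; give up after $max_tries polls instead of looping forever.
sub wait_for_agreement {
    my ($fetch, $max_tries, $pause) = @_;
    for my $try (1 .. $max_tries) {
        return 1 if all_links_agree($fetch->());
        sleep($pause) if $pause && $try < $max_tries;
    }
    return 0;
}
```

Inside marsadm, `$fetch` would presumably wrap the `glob($vers_glob)` and `get_link` calls from the hunk above; that wiring is an assumption and is not shown here.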