user-manual: rework resource operations

Thomas Schoebel-Theuer 2019-09-09 13:55:36 +02:00 committed by Thomas Schoebel-Theuer
parent 6d0da533ca
commit 45c27aaba9
1 changed files with 149 additions and 55 deletions


@ -13564,7 +13564,8 @@ from
\emph default \emph default
) the local disk. ) the local disk.
These processes are automatically paused. These processes are automatically paused.
As another contrast to DRBD, the respective processes will usually Another difference to DRBD: the fetch / replay processes etc will usually
\emph on \emph on
automatically automatically
\emph default \emph default
@ -16151,7 +16152,7 @@ pairs
\emph on \emph on
potentially potentially
\emph default \emph default
occur from any other other source host which happens to be reachable (although occur from any other source host which happens to be reachable (although
the current implementation prefers the current designated primary, but the current implementation prefers the current designated primary, but
this may change in future). this may change in future).
In addition, In addition,
@ -16164,13 +16165,13 @@ all
\emph default \emph default
communication. communication.
It only stops fetching logfiles. It only stops fetching logfiles.
The symlink update running in background is The symlink updates running in background (default port 7777) are
\emph on \emph on
not not
\emph default \emph default
stopped, in order to always propagate as much metadata as possible in the stopped, in order to always propagate as much metadata as possible throughout
cluster. the cluster.
In case of a later incident, chances are higher for a better knowledge In case of a later incident, chances will be higher for a better knowledge
of the of the
\emph on \emph on
real real
@ -17198,11 +17199,29 @@ status open
\begin_layout Plain Layout \begin_layout Plain Layout
\size scriptsize
There are three variants:
\end_layout
\begin_layout Plain Layout
Variant 1: planned handover (no
\family typewriter
--force
\family default
)
\end_layout
\begin_layout Plain Layout
\size scriptsize \size scriptsize
Precondition: sync must have finished at any resource member. Precondition: sync must have finished at any resource member.
All relevant transaction logfiles must be either already locally present, All relevant transaction logfiles must be either already locally present,
or be fetchable (see or be fetchable (see
\family typewriter \family typewriter
marsadm up
\family default
, or low-level commands
\family typewriter
resume-fetch resume-fetch
\family default \family default
and and
@ -17232,23 +17251,16 @@ marsadm secondary
\emph on \emph on
not recommended not recommended
\emph default \emph default
), ), at least the old primary must be reachable.
\emph on The (old) primary's virtual device
all
\emph default
other members of the resource must be reachable (since we have no memory
who was the old primary before), and then they must also match the same
preconditions.
When another host is currently primary (whether designated or not), it
must match the preconditions of
\family typewriter
marsadm secondary
\family default
(that means, its local
\family typewriter \family typewriter
/dev/mars/mydata /dev/mars/mydata
\family default \family default
device must not be in use any more). must not be in use any more (see
\family typewriter
marsadm wait-umount
\family default
).
A split brain must not already exist. A split brain must not already exist.
\end_layout \end_layout
@ -17270,27 +17282,36 @@ Switches the
designated primary designated primary
\series default \series default
. .
There are three variants:
\end_layout \end_layout
\begin_layout Plain Layout \begin_layout Plain Layout
\size scriptsize \size scriptsize
1) Description of the
\series bold \series bold
Handover Handover
\series default \series default
when protocol (when
\emph on
not
\emph default
giving
\family typewriter \family typewriter
--force --force
\family default \family default
: when another host is currently primary, it is first asked to leave its is not given): when another host is currently primary, it is first asked
primary role, and it is waited until it actually has become secondary. to leave its primary role.
After that, the local host is asked to become primary. When systemd templates are active, this will be automatically triggered
via
\family typewriter
systemctl stop $stop_unit
\family default
.
Otherwise, you are responsible for stopping the load yourself, and you should
use
\family typewriter
marsadm wait-umount
\family default
in advance for checking.
Anyway, the handover protocol is waiting until the former primary has actually
become secondary.
After that, the local host is requested to become primary.
Before actually becoming primary, all relevant logfiles are transferred Before actually becoming primary, all relevant logfiles are transferred
over the network and replayed, in order to avoid accidental creation of over the network and replayed, in order to avoid accidental creation of
split brain as best as possible split brain as best as possible
@ -17393,6 +17414,28 @@ join-resource
after the handover completed successfully. after the handover completed successfully.
\end_layout \end_layout
\begin_layout Enumerate
\size scriptsize
use the option
\family typewriter
--ignore-sync
\family default
, which leads to a restart of the running sync from position 0.
\end_layout
\begin_layout Plain Layout
Variant 2: planned handover (no
\family typewriter
--force
\family default
) with sync abort (
\family typewriter
--ignore-sync
\family default
)
\end_layout
\begin_layout Plain Layout \begin_layout Plain Layout
\size scriptsize \size scriptsize
@ -17410,6 +17453,14 @@ Handover ignoring running syncs,
time. time.
\end_layout \end_layout
\begin_layout Plain Layout
Variant 3: unplanned failover (
\family typewriter
--force
\family default
)
\end_layout
\begin_layout Plain Layout \begin_layout Plain Layout
\size scriptsize \size scriptsize
@ -17475,7 +17526,7 @@ Never
\family typewriter \family typewriter
primary --force primary --force
\family default \family default
when when planned handover via
\family typewriter \family typewriter
primary primary
\family default \family default
@ -17574,16 +17625,16 @@ B
\family typewriter \family typewriter
UpToDate UpToDate
\family default \family default
, you can prevent a split brain by yourself even when giving , you have some
\emph on
chance
\emph default
for avoiding a split brain even with
\family typewriter \family typewriter
primary --force primary --force
\family default \family default
afterwards. .
However, checking / assuring this is However, there is no guarantee.
\emph on
your
\emph default
responsibility!
\end_layout \end_layout
\begin_layout Plain Layout \begin_layout Plain Layout
@ -17662,8 +17713,11 @@ reference "subsec:Split-Brain-Resolution"
\family typewriter \family typewriter
marsadm invalidate marsadm invalidate
\family default \family default
cannot always resolve a split brain at other secondaries (which are neither cannot resolve a split brain at
the old nor the new designated primary). \emph on
other
\emph default
secondaries (which are neither the old nor the new designated primary).
Therefore, prefer the Therefore, prefer the
\family typewriter \family typewriter
leave-resource leave-resource
@ -17716,7 +17770,7 @@ wrong
\family typewriter \family typewriter
primary --force primary --force
\family default \family default
, you will have a chance to recover by either forcing the , you will have a chance for recovery by either forcing the
\begin_inset Quotes eld \begin_inset Quotes eld
\end_inset \end_inset
@ -17770,7 +17824,7 @@ reference "subsec:Forced-Switching"
. .
For your safety, For your safety,
\family typewriter \family typewriter
force --force
\family default \family default
does not work in newer marsadm (after mars0.1stable52) when your replica does not work in newer marsadm (after mars0.1stable52) when your replica
is a current sync target. is a current sync target.
@ -17934,6 +17988,10 @@ uniquely
Notice: in difference to DRBD, you Notice: in difference to DRBD, you
\series bold \series bold
don't need don't need
\series default
and you
\series bold
should not use
\series default \series default
this command during normal operation, including handover. this command during normal operation, including handover.
Any resource member which is Any resource member which is
@ -18084,11 +18142,15 @@ last
\family typewriter \family typewriter
leave-resource leave-resource
\family default \family default
(or the dangerous (or before
\emph on
forcefully killing
\emph default
your resource via the dangerous
\family typewriter \family typewriter
delete-resource delete-resource
\family default \family default
), you will need this before you can do that. ).
\end_layout \end_layout
\end_inset \end_inset
@ -18392,12 +18454,36 @@ Rationale: remove hindering split-brain /
leave-resource leave-resource
\family default \family default
leftovers. leftovers.
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
Usually, you don't need this.
\family typewriter
leave-resource
\family default
and
\family typewriter
invalidate
\family default
are already doing a similar logfile cleanup for you.
\end_layout \end_layout
\begin_layout Plain Layout \begin_layout Plain Layout
\size scriptsize \size scriptsize
Use this only when split brain does not go away by means of Use this only as a desperate last resort when split brain does not go away
by means of
\family typewriter \family typewriter
leave-resource leave-resource
\family default \family default
@ -18411,14 +18497,11 @@ could
\end_layout \end_layout
\begin_layout Plain Layout \begin_layout Plain Layout
\begin_inset Graphics THIS IS
filename images/MatieresToxiques.png \emph on
lyxscale 50 POTENTIALLY
scale 17 \emph default
DANGEROUS
\end_inset
THIS IS POTENTIALLY DANGEROUS!
\end_layout \end_layout
\begin_layout Plain Layout \begin_layout Plain Layout
@ -18429,8 +18512,19 @@ This command
might might
\emph default \emph default
destroy some valuable logfiles / other information in case the local informatio destroy some valuable logfiles / other information in case the local informatio
n is outdated or otherwise incorrect. n is outdated or otherwise incorrect, as could be the case during very awkward
MARS does its best for checking anything, but there is no guarantee. disaster scenarios, such as corrupted
\family typewriter
/mars
\family default
filesystems.
MARS does its best for checking anything, but there cannot be an absolute
guarantee.
\end_layout
\begin_layout Plain Layout
That said, not a single incident has been observed during millions of operation
hours.
\end_layout \end_layout
\begin_layout Plain Layout \begin_layout Plain Layout