#LyX 2.1 created this file. For more info see http://www.lyx.org/
\lyxformat 474
\begin_document
\begin_header
\textclass scrreprt
\begin_preamble
\usepackage[dvipsnames]{xcolor}
\usepackage{listings}
\end_preamble
\options abstracton
\use_default_options true
\begin_modules
customHeadersFooters
enumitem
fixltx2e
\end_modules
\maintain_unincluded_children false
\language english
\language_package default
\inputencoding auto
\fontencoding global
\font_roman default
\font_sans default
\font_typewriter default
\font_math auto
\font_default_family rmdefault
\use_non_tex_fonts false
\font_sc false
\font_osf false
\font_sf_scale 100
\font_tt_scale 100
\graphics default
\default_output_format default
\output_sync 0
\bibtex_command default
\index_command default
\paperfontsize 10
\spacing single
\use_hyperref true
\pdf_title "MARS Manual"
\pdf_author "Thomas Schöbel-Theuer"
\pdf_bookmarks true
\pdf_bookmarksnumbered false
\pdf_bookmarksopen false
\pdf_bookmarksopenlevel 1
\pdf_breaklinks true
\pdf_pdfborder true
\pdf_colorlinks true
\pdf_backref false
\pdf_pdfusetitle true
\papersize a4paper
\use_geometry true
\use_package amsmath 1
\use_package amssymb 1
\use_package cancel 1
\use_package esint 1
\use_package mathdots 1
\use_package mathtools 1
\use_package mhchem 1
\use_package stackrel 1
\use_package stmaryrd 1
\use_package undertilde 1
\cite_engine basic
\cite_engine_type default
\biblio_style plain
\use_bibtopic false
\use_indices false
\paperorientation portrait
\suppress_date false
\justification true
\use_refstyle 1
\index Index
\shortcut idx
\color #008000
\end_index
\leftmargin 3.7cm
\topmargin 2.7cm
\rightmargin 2.8cm
\bottommargin 2.3cm
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\paragraph_indentation default
\quotes_language english
\papercolumns 1
\papersides 2
\paperpagestyle headings
\tracking_changes false
\output_changes false
\html_math_output 0
\html_css_as_file 0
\html_be_strict false
\end_header
\begin_body
\begin_layout Title
\family typewriter
MARS Manual
\begin_inset Newline newline
\end_inset
\begin_inset space ~
\end_inset
\end_layout
\begin_layout Subtitle
Multiversion Asynchronous Replicated Storage
\begin_inset Newline newline
\end_inset
\begin_inset space ~
\end_inset
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/earth-mars-transfer.fig
width 70col%
\end_inset
\end_layout
\begin_layout Author
Thomas Schöbel-Theuer (
\family typewriter
tst@1und1.de
\family default
)
\end_layout
\begin_layout Date
Version 0.1-36
\end_layout
\begin_layout Lowertitleback
\noindent
Copyright (C) 2013-16 Thomas Schöbel-Theuer
\begin_inset Newline newline
\end_inset
Copyright (C) 2013-16 1&1 Internet AG (see
\begin_inset Flex URL
status open
\begin_layout Plain Layout
http://www.1und1.de
\end_layout
\end_inset
shortly called 1&1 in the following).
\begin_inset Newline newline
\end_inset
\size footnotesize
Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License, Version 1.3 or any later
version published by the Free Software Foundation; with no Invariant Sections,
no Front-Cover Texts, and no Back-Cover Texts.
A copy of the license is included in the section entitled
\begin_inset Quotes eld
\end_inset
\begin_inset CommandInset ref
LatexCommand nameref
reference "chap:GNU-FDL"
\end_inset
\begin_inset Quotes erd
\end_inset
.
\end_layout
\begin_layout Abstract
\family typewriter
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
sloppy
\end_layout
\end_inset
MARS
\family default
is a block-level storage replication system for long distances and flaky
networks, licensed under the GPL.
It runs as a Linux kernel module.
The sysadmin interface is similar to DRBD
\begin_inset Foot
status open
\begin_layout Plain Layout
Registered trademarks are the property of their respective owners.
\end_layout
\end_inset
, but its internal engine is completely different from DRBD: it works with
\series bold
transaction logging
\series default
, similar to some database systems.
\end_layout
\begin_layout Abstract
Therefore, MARS can provide stronger
\series bold
consistency guarantees
\series default
.
Even in case of network bottlenecks, problems, or failures, the secondaries
may become outdated (reflecting an older state), but never inconsistent.
In contrast to DRBD, MARS preserves the
\series bold
order of write operations
\series default
even when the network is flaky (
\series bold
Anytime Consistency
\series default
).
\end_layout
\begin_layout Abstract
The current version of MARS supports
\begin_inset Formula $k>2$
\end_inset
replicas and works
\series bold
asynchronously
\series default
.
Therefore, application performance is completely decoupled from any network
problems.
Future versions are planned to also support synchronous or near-synchronous
modes.
\end_layout
\begin_layout Abstract
\paragraph_spacing double
\noindent
\begin_inset space ~
\end_inset
\begin_inset Newline newline
\end_inset
\begin_inset space ~
\end_inset
\begin_inset Newline newline
\end_inset
\begin_inset Box Frameless
position "c"
hor_pos "c"
has_inner_box 1
inner_pos "c"
use_parbox 0
use_makebox 1
width "100col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\begin_inset Graphics
filename images/earth-mars-transfer.fig
width 70col%
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\begin_inset CommandInset toc
LatexCommand tableofcontents
\end_inset
\end_layout
\begin_layout Chapter
Why You should Replicate Big Data at Block Layer
\begin_inset CommandInset label
LatexCommand label
name "chap:Why-You-should"
\end_inset
\end_layout
\begin_layout Section
Cost Arguments from Architecture
\end_layout
\begin_layout Standard
Datacenters are usually not operated for fun or as a hobby.
Costs are therefore a very important argument.
\end_layout
\begin_layout Standard
Many enterprise system architects start with a particular architecture
in mind, called
\begin_inset Quotes eld
\end_inset
big cluster
\begin_inset Quotes erd
\end_inset
.
There is a common belief that otherwise
\series bold
scalability
\series default
could not be achieved:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/Architecure_Big_Cluster.pdf
width 100col%
\end_inset
\end_layout
\begin_layout Standard
\noindent
The crucial point here is the storage network:
\begin_inset Formula $n$
\end_inset
frontend servers are interconnected with
\begin_inset Formula $m=O(n)$
\end_inset
storage servers, in order to achieve properties like scalability, failure
tolerance, etc.
\end_layout
\begin_layout Standard
Since
\emph on
any
\emph default
of the
\begin_inset Formula $n$
\end_inset
frontends must be able to access
\emph on
any
\emph default
of the
\begin_inset Formula $m$
\end_inset
storages in realtime, the storage network must be dimensioned for
\begin_inset Formula $O(n\cdot m)=O(n^{2})$
\end_inset
network connections running in parallel.
Even if the total network throughput scaled only with
\begin_inset Formula $O(n)$
\end_inset
, the network has to
\emph on
switch
\emph default
the packets from
\begin_inset Formula $n$
\end_inset
sources to
\begin_inset Formula $m$
\end_inset
destinations (and back again) in
\series bold
realtime
\series default
.
\end_layout
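\begin_layout Standard
As a rough illustration with assumed numbers (they are not taken from any
real installation): suppose
\begin_inset Formula $n=1000$
\end_inset
frontends and
\begin_inset Formula $m=500$
\end_inset
storage servers.
The storage network must then be able to switch between
\begin_inset Formula 
\[
n\cdot m=1000\cdot500=500\,000
\]
\end_inset
frontend / storage pairs.
Doubling both numbers yields
\begin_inset Formula $2000\cdot1000=2\,000\,000$
\end_inset
pairs, i.e.
four times as many, although the number of servers has only doubled.
\end_layout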
\begin_layout Standard
This
\series bold
cross-bar functionality
\series default
in realtime makes the storage network expensive.
Some further factors are increasing the costs of storage networks:
\end_layout
\begin_layout Itemize
In order to limit error propagation from other networks, the storage network
is often built as a
\emph on
physically separate
\emph default
/
\emph on
dedicated
\emph default
network.
\end_layout
\begin_layout Itemize
Because storage networks react strongly to high latencies and packet
loss, they often need to be dimensioned for the
\series bold
worst case
\series default
(load peaks, packet storms, etc.), which requires the best, and therefore
most expensive, components for reducing latency and increasing throughput.
Dimensioning for the worst case instead of the average case plus some safety
margin is nothing but expensive
\series bold
overdimensioning
\series default
/
\series bold
over-engineering
\series default
.
\end_layout
\begin_layout Itemize
When multipathing is required for improving fault tolerance of the storage
network itself, these efforts will even
\series bold
double
\series default
.
\end_layout
\begin_layout Itemize
When geo-redundancy is required, the whole mess may easily more than double
once again, because in case of disasters such as terrorist attacks the backup
datacenter must be prepared to take over for multiple days or weeks.
\end_layout
\begin_layout Standard
Fortunately, there is an alternative called
\begin_inset Quotes eld
\end_inset
sharding architecture
\begin_inset Quotes erd
\end_inset
which does not need a storage network at all, at least when built and
dimensioned properly.
Instead, it
\emph on
should have
\emph default
(but does not always need) a so-called replication network which can, when
present, be dimensioned much smaller because it needs neither realtime
operation nor scalability to
\begin_inset Formula $O(n^{2})$
\end_inset
:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/Architecure_Sharding.pdf
width 100col%
\end_inset
\end_layout
\begin_layout Standard
\noindent
Sharding architectures are extremely well suited when both the input traffic
and the data are
\series bold
already partitioned
\series default
.
This is the case, for example, when several thousand or even millions of
customers operate on disjoint data sets, as in web hosting where each
webspace resides in its own home directory, or when each of millions of
MySQL database instances has to be isolated from its neighbours.
\end_layout
\begin_layout Standard
Even in cases when any customer may potentially access any of the data items
residing in the whole storage pool (e.g.
in a search engine), sharding can often be applied.
The trick is to create some relatively simple content-based dynamic switching
or redirect mechanism in the input network traffic, similar to HTTP load
balancers or redirectors.
\end_layout
\begin_layout Standard
Only when partitioning of input traffic plus data is not possible in a
reasonable way do big cluster architectures, as implemented for example in
Ceph or Swift (and partly even possible with MARS when restricted to the
block layer), have their
\series bold
use case
\series default
.
Only under such a precondition are they really needed.
\end_layout
\begin_layout Standard
When sharding is possible, it is the preferred model due to cost and performance
reasons.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Notice that MARS' new remote device feature from the 0.2 branch series (which
is a replacement for iSCSI)
\emph on
could
\emph default
be used for implementing the
\begin_inset Quotes eld
\end_inset
big cluster
\begin_inset Quotes erd
\end_inset
model at block layer.
\end_layout
\begin_layout Standard
Nevertheless, this sub-variant is not the preferred model.
The following is a super-model which combines both the
\begin_inset Quotes eld
\end_inset
big cluster
\begin_inset Quotes erd
\end_inset
and sharding models at block layer in a very flexible way.
The following example shows only two servers from a pool consisting of
hundreds or thousands of servers:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/MARS_Cluster_on_Demand.pdf
width 100col%
\end_inset
\end_layout
\begin_layout Standard
\noindent
The idea is to use iSCSI or the MARS remote device
\emph on
only where necessary
\emph default
.
Preferably, local storage is divided into multiple Logical Volumes (LVs)
via LVM, which are
\emph on
directly
\emph default
used
\emph on
locally
\emph default
by Virtual Machines (VMs), such as KVM or filesystem-based variants like
LXC containers.
\end_layout
\begin_layout Standard
In the above example, the left machine has relatively little CPU power or
RAM compared to its storage capacity.
Therefore, not
\emph on
all
\emph default
LVs could be instantiated locally at the same time without causing operational
problems, but
\emph on
some
\emph default
of them can be run locally.
The example solution is to
\emph on
exceptionally(!)
\emph default
export LV3 to the right server, which has some otherwise unused CPU and
RAM capacity.
\end_layout
\begin_layout Standard
Notice that locally running VMs don't produce any storage network traffic
at all.
Therefore, this is the preferred runtime configuration.
\end_layout
\begin_layout Standard
Only in cases of resource imbalance, such as (transient) CPU or RAM peaks
(e.g.
caused by DDoS attacks),
\emph on
some
\emph default
containers may be run somewhere else over the network.
In a well-balanced and well-dimensioned system, this will be the
\series bold
small minority
\series default
, and should only be used for dealing with temporary load peaks etc.
\end_layout
\begin_layout Standard
Running VMs directly on the same servers as their storage is a
\series bold
major cost reducer.
\end_layout
\begin_layout Standard
You simply don't need to buy and operate
\begin_inset Formula $n+m$
\end_inset
servers, but only about
\begin_inset Formula $\max(n,m)+m\cdot\epsilon$
\end_inset
servers, where
\begin_inset Formula $\epsilon$
\end_inset
corresponds to some relatively small extra resources needed by MARS.
\end_layout
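\begin_layout Standard
A small worked example may make this concrete.
The numbers are assumptions chosen purely for illustration, in particular
the value of
\begin_inset Formula $\epsilon$
\end_inset
: with
\begin_inset Formula $n=100$
\end_inset
,
\begin_inset Formula $m=100$
\end_inset
and
\begin_inset Formula $\epsilon=0.1$
\end_inset
, the big cluster model needs
\begin_inset Formula $n+m=200$
\end_inset
servers, whereas the combined model needs only about
\begin_inset Formula 
\[
\max(n,m)+m\cdot\epsilon=100+100\cdot0.1=110
\]
\end_inset
servers, which is a reduction of
\begin_inset Formula $(200-110)/200=45\%$
\end_inset
under these assumptions.
\end_layout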
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
In addition to this and to reduced networking costs, there are further cost
savings in power consumption, air conditioning, Height Units (HUs), number
of HDDs, operating costs, etc., as explained below in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Cost-Arguments-from"
\end_inset
.
\end_layout
\begin_layout Standard
The sharding model needs a different approach to load balancing of storage
space than the big cluster model.
There are several possibilities at different layers:
\end_layout
\begin_layout Itemize
Dynamically growing the sizes of LVs via
\family typewriter
lvresize
\family default
followed by
\family typewriter
marsadm resize
\family default
followed by
\family typewriter
xfs_growfs
\family default
or similar operations.
\end_layout
\begin_layout Itemize
Moving customer data at filesystem or database level via
\family typewriter
rsync
\family default
or
\family typewriter
mysqldump
\family default
or similar.
\end_layout
\begin_layout Itemize
Moving whole LVs via MARS, as shown in the following example:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/MARS_Background_Migration.pdf
width 100col%
\end_inset
\end_layout
\begin_layout Standard
\noindent
The idea is to dynamically create
\emph on
additional
\emph default
LV replicas for the sake of background migration.
Examples:
\end_layout
\begin_layout Itemize
In case you had no redundancy at LV level before, you have
\begin_inset Formula $k=1$
\end_inset
replicas during ordinary operation.
If not yet done, you should transparently introduce MARS into your LVM-based
stack by using the so-called
\begin_inset Quotes eld
\end_inset
standalone mode
\begin_inset Quotes erd
\end_inset
of MARS.
When necessary, create the first MARS replica with
\family typewriter
marsadm create-resource
\family default
on your already-existing LV data, which is retained unmodified, and restart
your application again.
Now, for the sake of migration, you just create an additional replica at
another server via
\family typewriter
marsadm join-resource
\family default
there and wait until the second mirror has been fully
\series bold
synced
\series default
in the background, while your application is running and while the contents
of the LV are modified
\emph on
in parallel
\emph default
by your ordinary applications.
Then you do a primary
\series bold
handover
\series default
to your mirror.
This is usually a matter of minutes, or even seconds.
Once the application runs again at the new location, you can delete the
old replica via
\family typewriter
marsadm leave-resource
\family default
and
\family typewriter
lvremove
\family default
.
Finally, you may re-use the freed-up space for something else (e.g.
\family typewriter
lvresize
\family default
of
\emph on
another
\emph default
LV followed by
\family typewriter
marsadm resize
\family default
followed by
\family typewriter
xfs_growfs
\family default
or similar).
For the sake of hardware lifecycle management, you may follow a different strategy:
evacuate the original source server completely via the above MARS migration
method, and eventually decommission it.
\end_layout
\begin_layout Itemize
In case you already have a redundant LV copy somewhere, you should run a
similar procedure, but starting with
\begin_inset Formula $k=2$
\end_inset
replicas, and temporarily increasing the number of replicas to either
\begin_inset Formula $k'=3$
\end_inset
when moving each replica step-by-step, or you may even directly go up to
\begin_inset Formula $k'=4$
\end_inset
when moving pairs at once.
\end_layout
\begin_layout Itemize
When already starting with
\begin_inset Formula $k>2$
\end_inset
LV replicas, you can proceed analogously,
or you may use a reduced variant.
For example, we have some mission-critical servers at 1&1 which are running
\begin_inset Formula $k=4$
\end_inset
replicas all the time on relatively small but important LVs for extremely
increased safety.
Only in such a case, you may have the freedom to temporarily decrease from
\begin_inset Formula $k=4$
\end_inset
to
\begin_inset Formula $k'=3$
\end_inset
and then go back up to
\begin_inset Formula $k''=4$
\end_inset
again.
This has the advantage of requiring less temporary storage space for
\emph on
swapping
\emph default
some LVs.
\end_layout
\begin_layout Section
Cost Arguments from Technology
\begin_inset CommandInset label
LatexCommand label
name "sec:Cost-Arguments-from"
\end_inset
\end_layout
\begin_layout Standard
A common preconception is that
\begin_inset Quotes eld
\end_inset
big cluster
\begin_inset Quotes erd
\end_inset
is the cheapest scaling storage technology when built on so-called
\begin_inset Quotes eld
\end_inset
commodity hardware
\begin_inset Quotes erd
\end_inset
.
While this is very often true for the
\begin_inset Quotes eld
\end_inset
commodity hardware
\begin_inset Quotes erd
\end_inset
part, it is often not true for the
\begin_inset Quotes eld
\end_inset
big cluster
\begin_inset Quotes erd
\end_inset
part.
But let us first look at the
\begin_inset Quotes eld
\end_inset
commodity
\begin_inset Quotes erd
\end_inset
part.
\end_layout
\begin_layout Standard
Here are some rough market prices for basic storage as determined around
the end of 2016 / start of 2017:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Tabular
<lyxtabular version="3" rows="6" columns="3">
<features rotate="0" tabularvalignment="middle">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
Technology
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
Enterprise-Grade
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
Price in € / TB
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
Consumer SATA disks via on-board SATA controllers
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
no (small-scale)
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
< 30 possible
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
SAS disks via SAS HBAs (e.g.
in external 14
\begin_inset Quotes erd
\end_inset
shelves)
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
< 80
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
SAS disks via hardware RAID + LVM (+DRBD/MARS)
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
yes
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
80 to 150
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
Commercial storage appliances via iSCSI
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
yes
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
around 1000
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
Cloud storage, S3 over 5 years lifetime
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
yes
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size small
3000 to 8000
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\begin_layout Standard
\noindent
You can see that any self-built and self-administered storage (whose price
varies with slower high-capacity versus faster low-capacity disks) is much
cheaper than any commercial offering, by about a factor of 10 or even more.
If you need to operate several petabytes of data, self-built storage is
always cheaper than a commercial one, even if additional manpower is
needed for commissioning and operation.
Here we just assume that the storage is needed permanently for at least
5 years, as is the case in web hosting, databases, backup / archival systems,
and many other application areas.
\end_layout
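\begin_layout Standard
To put the table into perspective, here is the plain arithmetic for a net
capacity of 1 PB = 1000 TB.
Only the rough per-TB prices from the table above are used; the capacity
of 1 PB is merely an assumed example:
\begin_inset Formula 
\[
\begin{array}{lrcl}
\text{hardware RAID + LVM:} & 1000\cdot(80\ldots150) & = & 80\,000\ldots150\,000\text{ EUR}\\
\text{storage appliances:} & 1000\cdot1000 & = & 1\,000\,000\text{ EUR}\\
\text{cloud storage (5 years):} & 1000\cdot(3000\ldots8000) & = & 3\,000\,000\ldots8\,000\,000\text{ EUR}
\end{array}
\]
\end_inset
Compared with the self-built variant, the appliance is thus more expensive
by a factor between
\begin_inset Formula $1\,000\,000/150\,000\approx6.7$
\end_inset
and
\begin_inset Formula $1\,000\,000/80\,000=12.5$
\end_inset
, and cloud storage by a factor between 20 and 100.
\end_layout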
\begin_layout Standard
Cloud storage is heavily overhyped.
From a commercial perspective it usually pays off only when your storage
demands vary
\emph on
extremely
\emph default
over time, and when you need some
\emph on
extra
\emph default
capacity only
\emph on
temporarily
\emph default
for a
\emph on
very
\emph default
short time.
\end_layout
\begin_layout Standard
In addition to basic storage prices, many further factors come into play
when roughly comparing big clusters versus sharding (
\family roman
\series medium
\shape up
\size normal
\emph off
\bar no
\strikeout off
\uuline off
\uwave off
\noun off
\color none
\begin_inset Formula $\times2$
\end_inset
\family default
\series default
\shape default
\size default
\emph default
\bar default
\strikeout default
\uuline default
\uwave default
\noun default
\color inherit
means with geo-redundancy):
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Tabular
<lyxtabular version="3" rows="5" columns="5">
<features rotate="0" tabularvalignment="middle">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
BC
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
SHA
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
BC
\begin_inset Formula $\times2$
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
SHA
\begin_inset Formula $\times2$
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
# of Disks
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
>200%
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
<120%
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
>400%
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
<240%
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
# of Servers
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\begin_inset Formula $\approx\times2$
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\begin_inset Formula $\approx\times1.1$
\end_inset
possible
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\begin_inset Formula $\approx\times4$
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\begin_inset Formula $\approx\times2.2$
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Power Consumption
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\begin_inset Formula $\approx\times2$
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
ditto
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
ditto
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
ditto
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
HU Consumption
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\begin_inset Formula $\approx\times2$
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
ditto
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
ditto
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
ditto
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\begin_layout Standard
\noindent
The crucial point is not only the number of extra servers needed for dedicated
storage boxes, but also the total number of HDDs.
While big cluster implementations like Ceph or Swift can
\emph on
theoretically
\emph default
use some form of erasure coding to avoid full object replicas, their
\emph on
practice
\emph default
as seen in our internal 1&1 Ceph clusters is similar to RAID-10, but just
on objects instead of block-based sectors.
\end_layout
\begin_layout Standard
Therefore, a big cluster typically needs raw disk capacity of more than 200%
of its net capacity, whereas in a sharded cluster hardware RAID-60 with a
significantly smaller overhead is typically sufficient to provide adequate
failure tolerance at disk level.
\end_layout
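\begin_layout Standard
The following sketch shows where such percentages can come from.
The group size
\begin_inset Formula $g$
\end_inset
and the replica count
\begin_inset Formula $r$
\end_inset
are assumptions for illustration only, and hot spares as well as free-space
reserves are ignored.
A RAID-6 group of
\begin_inset Formula $g$
\end_inset
disks stores net data on
\begin_inset Formula $g-2$
\end_inset
of them, and striping over several such groups (RAID-60) does not change
this ratio.
With
\begin_inset Formula $g=14$
\end_inset
, the raw-to-net ratio is
\begin_inset Formula 
\[
\frac{g}{g-2}=\frac{14}{12}\approx1.17
\]
\end_inset
i.e.
about 117% of the net capacity.
In contrast, keeping
\begin_inset Formula $r=2$
\end_inset
full object replicas needs a ratio of at least
\begin_inset Formula $r=2$
\end_inset
, i.e.
200%, and
\begin_inset Formula $r=3$
\end_inset
replicas would need 300%.
\end_layout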
\begin_layout Standard
There is a surprising consequence from this: geo-redundancy is not as expensive
as many people believe.
It just needs to be built with the proper architecture.
A sharded geo-redundant pool based on hardware RAID-60 costs roughly
the same as (or when taking
\begin_inset Formula $O(n^{2})$
\end_inset
storage networks into account it is possibly even cheaper than) a big cluster
with full replicas without geo-redundancy.
A geo-redundant sharded pool provides even better failure compensation.
\end_layout
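\begin_layout Standard
This can be checked against the disk counts from the table above, using
nothing but the bounds given there (SHA
\begin_inset Formula $\times2$
\end_inset
below 240%, BC above 200%, BC
\begin_inset Formula $\times2$
\end_inset
above 400%):
\begin_inset Formula 
\[
\frac{\text{SHA}\times2}{\text{BC}}<\frac{240\%}{200\%}=1.2\qquad\text{and}\qquad\frac{\text{BC}\times2}{\text{SHA}\times2}>\frac{400\%}{240\%}\approx1.67
\]
\end_inset
In other words, a geo-redundant sharded pool needs at most about 20% more
disks than a big cluster without geo-redundancy, while a geo-redundant
big cluster needs at least about 67% more disks than a geo-redundant sharded
pool.
\end_layout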
\begin_layout Standard
Notice that geo-redundancy implies by definition that an unforeseeable
\series bold
full datacenter loss
\series default
(e.g.
caused by
\series bold
disasters
\series default
like a terrorist attack or an earthquake) must be compensated for
\series bold
several days or weeks
\series default
.
Therefore it is
\emph on
not
\emph default
sufficient to take a big cluster and just spread it to two different locations.
\end_layout
\begin_layout Standard
In any case, a MARS-based geo-redundant sharding pool is cheaper than using
commercial storage appliances which are much more expensive by their nature.
\end_layout
\begin_layout Section
Performance Arguments from Architecture
\end_layout
\begin_layout Standard
Some people think that replication is easily done at the filesystem layer.
There exist lots of cluster filesystems and other filesystem-layer solutions
which claim to be able to replicate your data, sometimes even over long
distances.
\end_layout
\begin_layout Standard
Trying to replicate several petabytes of data, or some billions of inodes,
is however a much bigger challenge than many people can imagine.
\end_layout
\begin_layout Standard
Choosing the wrong layer for
\series bold
mass data replication
\series default
may get you into trouble.
Here is an explanation why replication at the block layer is easier
and less error-prone:
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/Layers.pdf
width 100col%
\end_inset
\end_layout
\begin_layout Standard
\noindent
The picture shows the main components of a standalone Unix / Linux system.
In the late 1970s / early 1980s, a so-called
\series bold
Buffer Cache
\series default
was introduced into the architecture of Unix.
Today's Linux has refined the concept to various internal caches such as
the Page Cache and the Dentry Cache.
\end_layout
\begin_layout Standard
All these caches serve only one purpose: they reduce the load on the
storage by exploiting fast RAM.
A well-tuned cache can yield high cache hit ratios, typically 99%, and in
some cases (as observed in practice) even more than 99.9%.
\end_layout
\begin_layout Standard
Now start distributing the system over long distances.
There are two potential cut points A and B.
Cutting at A means replication at filesystem level.
B means replication at block level.
\end_layout
\begin_layout Standard
When replicating at A, you will notice that the caches are
\emph on
below
\emph default
your cut point.
Thus you will have to re-implement
\series bold
distributed caches
\series default
, and you will have to
\series bold
maintain cache coherence
\series default
.
\end_layout
\begin_layout Standard
When replicating at B, the Linux caches are
\emph on
above
\emph default
your cut point.
Thus you will receive much less traffic, typically already reduced by a
factor of 100, or even more.
This is much easier to cope with.
You will also profit from
\series bold
journalling filesystems
\series default
like
\family typewriter
ext4
\family default
or
\family typewriter
xfs
\family default
.
In contrast,
\emph on
truly distributed
\begin_inset Foot
status open
\begin_layout Plain Layout
In this context,
\begin_inset Quotes eld
\end_inset
truly
\begin_inset Quotes erd
\end_inset
means that the POSIX semantics would always be guaranteed cluster-wide,
even in case of partial failures.
In practice, some distributed filesystems like NFS don't even obey the
POSIX standard
\emph on
locally
\emph default
on a single standalone client.
We know of projects which have
\emph on
failed
\emph default
precisely because of this.
\end_layout
\end_inset
\emph default
journalling is typically not available with distributed cluster filesystems.
\end_layout
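\begin_layout Standard
The relation between cache hit ratio and the traffic arriving below the
caches can be sketched as follows.
This is an idealized model which only counts requests absorbed by the cache.
It ignores that written data must eventually reach the storage, so the real
reduction depends on the workload.
The function name is chosen for illustration only.
\end_layout

```python
def traffic_reduction_factor(hit_ratio: float) -> float:
    """Requests arriving above the cache per request that reaches the storage."""
    if not 0.0 <= hit_ratio < 1.0:
        raise ValueError("hit_ratio must be in [0, 1)")
    return 1.0 / (1.0 - hit_ratio)


# Hit ratios mentioned in the text: 99% and 99.9%.
for hit_ratio in (0.99, 0.999):
    factor = traffic_reduction_factor(hit_ratio)
    print(f"hit ratio {hit_ratio:.1%}: traffic reduced by a factor of about {factor:.0f}")
```

\begin_layout Standard
A hit ratio of 99% corresponds to a factor of about 100, and 99.9% to
a factor of about 1000.
\end_layout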
\begin_layout Standard
A
\emph on
potential
\emph default
drawback of block layer replication is that you are typically limited to
active-passive replication.
An active-active operation is not impossible at block layer (see combinations
of DRBD with
\family typewriter
ocfs2
\family default
), but less common, and less safe to operate.
\end_layout
\begin_layout Standard
This limitation isn't necessarily caused by the choice of layer.
It is simply caused by the
\series bold
laws of physics
\series default
: communication is always limited by the speed of light.
A distributed filesystem is nothing else but a logically
\series bold
distributed shared memory
\series default
(DSM).
\end_layout
\begin_layout Standard
Some decades of research on DSM have shown that there exist applications
/ workloads where the DSM model is
\emph on
inferior
\emph default
to the direct communication paradigm.
Even in short-distance / cluster scenarios.
Long-distance DSM is extremely cumbersome.
\end_layout
\begin_layout Standard
Therefore: you simply shouldn't try to solve long-distance communication
needs via communication over filesystems.
Even simple producer-consumer scenarios (one-way communication) are less
performant (e.g.
when compared to plain TCP/IP) when it comes to distributed POSIX semantics.
There is simply too much
\series bold
synchronisation overhead at metadata level
\series default
.
\end_layout
\begin_layout Standard
If you have a need for mixed operations at different locations in parallel:
just split your data set into disjoint filesystem instances (or database
/ VM instances, etc).
All you need is careful thought about the
\emph on
appropriate
\emph default
\emph on
granularity
\emph default
of your data sets (such as well-chosen
\emph on
sets
\emph default
of user home directory subtrees, or database sets logically belonging together,
etc).
\end_layout
\begin_layout Standard
Replication at filesystem level is often at single-file granularity.
If you have several millions or even billions of inodes, you may easily
find yourself in a snakepit.
\end_layout
\begin_layout Standard
Conclusion: active-passive operation over long distances (such as between
continents) is even an advantage.
It keeps you from trying bad / almost impossible things.
\end_layout
\begin_layout Chapter
Use Cases for MARS vs DRBD
\begin_inset CommandInset label
LatexCommand label
name "chap:Use-Cases-for"
\end_inset
\end_layout
\begin_layout Standard
DRBD has a long history of successfully providing HA features to many users
of Linux.
With the advent of MARS, many people are wondering what the difference
is.
They ask for recommendations.
In which use cases should DRBD be recommended, and in which other cases
is MARS the better choice?
\end_layout
\begin_layout Standard
The following table is a short guide to the most important cases where the
decision is rather clear:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Tabular
<lyxtabular version="3" rows="6" columns="2">
<features rotate="0" tabularvalignment="middle">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Use Case
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Recommendation
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
server pairs, each directly connected via
\series bold
crossover cables
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
DRBD
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
active-active
\series default
/ dual-primary, e.g.
\family typewriter
\series bold
gfs2
\family default
\series default
,
\family typewriter
\series bold
ocfs2
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
DRBD
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
distance
\series bold
> 50km
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
MARS
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
> 100 server pairs
\series default
over a short-distance
\series bold
shared
\series default
line
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
MARS
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
all else / intermediate cases
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
read the following details
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\begin_layout Standard
\noindent
There exist some use cases where DRBD is clearly better than MARS.
1&1 has long experience with DRBD in setups where it works very well,
in particular coupling Linux devices rack-to-rack via crossover cables.
DRBD is just
\emph on
constructed
\emph default
for that use case (RAID-1 over network).
In such a scenario, DRBD is better than MARS because it uses less disk
space.
In addition, newer DRBD versions can run over high-speed but short-distance
interconnects like InfiniBand (via the SDP protocol).
Another use case for DRBD is active-active / dual-primary mode, e.g.
\family typewriter
ocfs2
\family default
\begin_inset Foot
status open
\begin_layout Plain Layout
Notice that
\family typewriter
ocfs2
\family default
is apparently not constructed for long distances.
1&1 has some experience with a specific short-distance cluster where the
\family typewriter
ocfs2
\family default
/
\family typewriter
DRBD
\family default
combination scaled a little bit better than
\family typewriter
NFS
\family default
, but worse than
\family typewriter
glusterfs
\family default
(using 2 clients in both cases -- notice that
\family typewriter
glusterfs
\family default
showed extremely bad performance when trying to enable active-active
\family typewriter
glusterfs
\family default
replication between 2 server instances, therefore we ended up using active-pass
ive DRBD replication below a single
\family typewriter
glusterfs
\family default
server).
Conclusion:
\family typewriter
NFS
\family default
<
\family typewriter
ocfs2
\family default
<
\family typewriter
glusterfs
\family default
< sharding.
We found that
\family typewriter
glusterfs
\family default
on top of active-passive DRBD scaled about 2 times better than
\family typewriter
NFS
\family default
on top of active-passive DRBD, while
\family typewriter
ocfs2
\family default
on top of
\family typewriter
DRBD
\family default
in active-active mode was somewhere in between.
All comparisons were made on clusters with an increasing workload over time
(measured as number of customers which could be safely operated).
Each system was replaced by the next one when the respective scalability
was at its respective end, each time leading to operational problems.
The ultimate solution was to replace all of these clustering concepts by
the general concept of
\series bold
sharding
\series default
.
\end_layout
\end_inset
over short
\begin_inset Foot
status open
\begin_layout Plain Layout
Active-active won't work over long distances at all because of high network
latencies (cf chapter
\begin_inset CommandInset ref
LatexCommand ref
reference "chap:Why-You-should"
\end_inset
).
Probably, for replication of whole clusters over long distances DRBD and
MARS could be stacked: using DRBD on top of MARS for active-active clustering
of
\family typewriter
gfs2
\family default
or
\family typewriter
ocfs2
\family default
, and a MARS instance
\emph on
below
\emph default
for failover of
\emph on
one
\emph default
of the DRBD replicas over long distances.
\end_layout
\end_inset
distances.
\end_layout
\begin_layout Standard
On the other hand, there exist other use cases where DRBD did not work as
expected, leading to incidents and other operational problems.
We analyzed them for our specific use cases.
The later author of MARS came to the conclusion that they could only be
resolved by fundamental changes in the overall architecture of DRBD.
The development of MARS started at the personal initiative of the author,
first in the form of a personal project during holidays, but was later picked
up by 1&1 as an official project.
\end_layout
\begin_layout Standard
MARS and DRBD simply have
\series bold
different application areas
\series default
.
\end_layout
\begin_layout Standard
In the following, we will discuss the pros and cons of each system in particular
situations and contexts, and we shed some light on their conceptual and
operational differences.
\end_layout
\begin_layout Section
Network Bottlenecks
\begin_inset CommandInset label
LatexCommand label
name "sec:Network-Bottlenecks"
\end_inset
\end_layout
\begin_layout Subsection
Behaviour of DRBD
\begin_inset CommandInset label
LatexCommand label
name "sub:Behaviour-of-DRBD"
\end_inset
\end_layout
\begin_layout Standard
In order to describe the most important problem we found when DRBD was used
to couple whole datacenters (each encompassing thousands of servers) over
metro distances, we strip down that complicated real-life scenario to a
simplified laboratory scenario in order to demonstrate the effect with
minimal means.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Notice that the following DRBD effect does not appear with crossover cables.
The following scenario covers a non-standard case of DRBD.
DRBD works fine when no network bottleneck appears!
\end_layout
\begin_layout Standard
The following picture illustrates an effect which has been observed in 1&1
datacenters when running masses of DRBD instances through a single network
bottleneck.
In addition, the effect is also reproducible by an older version of the
MARS test suite
\begin_inset Foot
status open
\begin_layout Plain Layout
The effect was demonstrated some years ago with DRBD version 8.3.13.
By construction, it is independent of any of the DRBD series 8.3.x, 8.4.x,
or 9.0.x.
\end_layout
\end_inset
:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/network-bottleneck-drbd.fig
width 80col%
\end_inset
\end_layout
\begin_layout Standard
\noindent
The simplified scenario is the following:
\end_layout
\begin_layout Enumerate
DRBD is loaded with a low to medium, but constant rate of write operations
for the sake of simplicity of the scenario.
\end_layout
\begin_layout Enumerate
The network has some throughput bottleneck, depicted as a red line.
For the sake of simplicity, we just linearly decrease it over time, starting
from full throughput, down to zero.
The decrease is very slow (taking some minutes, or even hours).
\end_layout
\begin_layout Standard
What will happen in this scenario?
\end_layout
\begin_layout Standard
As long as the actual DRBD write throughput is lower than the network bandwidth
(left part of the horizontal blue line), DRBD works as expected.
\end_layout
\begin_layout Standard
Once the maximum network throughput (red line) starts to fall short of the
required application throughput (first blue dotted line), we get into trouble.
By its very nature, DRBD works
\series bold
synchronously
\series default
.
Therefore, it
\emph on
must
\emph default
transfer all your application writes through the bottleneck, but now it
is impossible
\begin_inset Foot
status open
\begin_layout Plain Layout
This is independent from the DRBD protocols A through C, because it just
depends on an information-theoretic argument independently from any protocol.
We have a fundamental conflict between network capabilities and application
demands here, which cannot be circumvented due to the
\series bold
synchronous
\series default
nature of DRBD.
\end_layout
\end_inset
due to the bottleneck.
As a consequence, the application running on top of DRBD will see increasingly
higher IO latencies and/or stalls / hangs.
We found practical cases (at least with former versions of DRBD) where
IO latencies exceeded practical monitoring limits such as
\begin_inset Formula $5$
\end_inset
s by far, up to the range of
\emph on
minutes
\emph default
.
As an experienced sysadmin, you know what happens next: your application
will run into an incident, and your customers will be dissatisfied.
\end_layout
\begin_layout Standard
In order to deal with such situations, DRBD has lots of tuning parameters.
In particular, the
\family typewriter
timeout
\family default
parameter and/or the
\family typewriter
ping-timeout
\family default
parameter will determine when DRBD will give up in such a situation and
simply drop the network connection as an emergency measure.
Dropping the network connection is roughly equivalent to an automatic
\family typewriter
disconnect
\family default
, followed by an automatic re-connect attempt after
\family typewriter
connect-int
\family default
seconds.
During the dropped connection, the incident will appear as being resolved,
but at some hidden cost
\begin_inset Foot
status open
\begin_layout Plain Layout
By appropriately tuning various DRBD parameters, such as
\family typewriter
timeout
\family default
and/or
\family typewriter
ping-timeout
\family default
, you can keep the impact of the incident below some viable limit.
However, the automatic disconnect will then happen earlier and more often
in practice.
Flaky or overloaded networks may easily lead to an enormous number of automatic
disconnects.
\end_layout
\end_inset
.
\end_layout
\begin_layout Standard
What happens next in our scenario? During the
\family typewriter
disconnect
\family default
, DRBD will record all positions of writes in its bitmap and/or in its activity
log.
As soon as the automatic re-connect succeeds after
\family typewriter
connect-int
\family default
seconds, DRBD has to do a partial re-sync of those blocks which were marked
dirty in the meantime.
This leads to an
\emph on
additional
\emph default
bandwidth demand
\begin_inset Foot
status open
\begin_layout Plain Layout
DRBD parameters
\family typewriter
sync-rate
\family default
resp
\family typewriter
resync-rate
\family default
may be used to tune the height of the additional demand.
In addition, the newer parameters
\family typewriter
c-plan-ahead
\family default
,
\family typewriter
c-fill-target
\family default
,
\family typewriter
c-delay-target
\family default
,
\family typewriter
c-min-rate
\family default
,
\family typewriter
c-max-rate
\family default
and friends may be used to dynamically adapt to
\emph on
some
\emph default
situations where the application throughput
\emph on
could
\emph default
fit through the bottleneck.
These newer parameters were developed in a cooperation between 1&1 and
Linbit, the maker of DRBD.
\end_layout
\begin_layout Plain Layout
Please note that lowering / dynamically adapting the resync rates may help
in lowering the
\emph on
probability
\emph default
of occurrences of the above problems in practical scenarios where the bottlenec
k would recover to viable limits after some time.
However, lowering the rates will also increase the
\emph on
duration
\emph default
of re-sync operations accordingly.
The
\emph on
total amount of re-sync data
\emph default
simply does not decrease when lowering
\family typewriter
resync-rate
\family default
; it even tends to increase over time when new requests arrive.
Therefore, the
\emph on
expected value
\emph default
of problems caused by
\emph on
strong
\emph default
network bottlenecks (i.e.
when not even the ordinary application rate fits through) is
\emph on
not
\emph default
improved by lowering or adapting
\family typewriter
resync-rate
\family default
, but rather the expected value mostly depends on the
\emph on
relation
\emph default
between the amount of holdback data versus the amount of application write
data, both measured for the duration of some given strong bottleneck.
\end_layout
\end_inset
as indicated by the upper dotted blue box.
\end_layout
\begin_layout Standard
Of course, there is
\emph on
absolutely no chance
\emph default
to get the increased amount of data through our bottleneck, since not even
the ordinary application load (lower dotted lines) could be transferred.
\end_layout
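\begin_layout Standard
The argument can be illustrated with a toy model of the scenario.
The following Python sketch assumes a linearly declining bandwidth and
a constant application write rate, as in the picture.
All numbers, as well as the names outage and window, are hypothetical
and do not correspond to DRBD parameters.
The re-sync demand is an upper bound, because it ignores that the bitmap
coalesces repeated writes to the same block.
\end_layout

```python
def bandwidth(t: float, full_bw: float, ramp_time: float) -> float:
    """Available network throughput at time t (red line): linear decline to zero."""
    return max(0.0, full_bw * (1.0 - t / ramp_time))


def crossing_time(app_rate: float, full_bw: float, ramp_time: float) -> float:
    """Time at which the bandwidth falls below the constant application write rate."""
    return ramp_time * (1.0 - app_rate / full_bw)


def resync_demand(app_rate: float, outage: float, window: float) -> float:
    """Throughput needed after a re-connect: the ongoing writes plus the data
    dirtied during `outage` seconds, to be re-synced within `window` seconds."""
    return app_rate + app_rate * outage / window


# Hypothetical numbers: a 100 MB/s line declining to zero within one hour,
# loaded with 20 MB/s of application writes.
FULL_BW, RAMP, APP = 100.0, 3600.0, 20.0

t_cross = crossing_time(APP, FULL_BW, RAMP)
t_reconnect = t_cross + 10.0  # re-connect attempt 10 s after the drop
available = bandwidth(t_reconnect, FULL_BW, RAMP)
needed = resync_demand(APP, outage=10.0, window=60.0)

print(f"bottleneck reached after {t_cross:.0f} s")
print(f"at re-connect: available {available:.1f} MB/s, needed {needed:.1f} MB/s")
```

\begin_layout Standard
In this model, the bandwidth available after the crossing point is always
below the application rate, while the demand after any re-connect is always
above it, regardless of the chosen outage and window values.
\end_layout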
\begin_layout Standard
Therefore, you run a
\series bold
very high risk
\series default
that the re-sync cannot finish before the next
\family typewriter
timeout
\family default
/
\family typewriter
ping-timeout
\family default
cycle will drop the network connection again.
\end_layout
\begin_layout Standard
What will be the final result when that risk materializes? Simply, your
secondary site will be
\emph on
permanently
\emph default
in state
\family typewriter
inconsistent
\family default
.
This means, you have lost your redundancy.
In our scenario, there is no chance at all to become consistent again,
because the network bottleneck declines more and more, slowly.
It is simply
\emph on
hopeless
\emph default
, by construction.
\end_layout
\begin_layout Standard
In case you lose your primary site now, you are completely lost.
\end_layout
\begin_layout Standard
Some people may argue that the probability of such a scenario is low.
We don't agree with this argument, not only because it really happens in
practice, where it may even take some days until the problems are fixed.
In case of
\series bold
rolling disasters
\series default
, the network is very likely to become flaky and/or overloaded shortly before
the final damage.
Even in other cases, you can easily end up with inconsistent secondaries.
It occurs not only in the lab, but also in practice if you operate some
hundreds or even thousands of DRBD instances.
\end_layout
\begin_layout Standard
The point is that you can produce ill behaviour
\emph on
systematically
\emph default
just by overloading the network a bit for some sufficient duration.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
When coupling whole datacenters via some thousands of DRBD connections,
any (short) network loss will almost certainly increase the re-sync network
load each time the outage appears to be over.
As a consequence, overload may be
\emph on
provoked
\emph default
by the re-sync repair attempts.
This may easily lead to self-amplifying
\series bold
throughput storms
\series default
in some resonance frequency (similar to self-destruction of a bridge when
an army is marching over it in lockstep).
\end_layout
\begin_layout Standard
The only way for reliable prevention of loss of secondaries is to start
any re-connect
\emph on
only
\emph default
in such situations where you can
\emph on
predict in advance
\emph default
that the re-sync is
\emph on
guaranteed
\emph default
to finish before any network bottleneck / loss will cause an automatic
disconnect again.
We don't know of any method which can reliably predict the future behaviour
of a complex network.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
Conclusion: in the presence of network bottlenecks, you run a considerable
risk that your DRBD mirrors get destroyed just in that moment when you
desperately need them.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Notice that crossover cables usually never show behaviour like that depicted
by the red line.
Crossover cables are
\emph on
passive components
\emph default
which normally
\begin_inset Foot
status open
\begin_layout Plain Layout
Exceptions might be mechanical jiggling of plugs, or electromagnetic
interference.
We never noticed either of them.
\end_layout
\end_inset
either work, or not.
The binary connect / disconnect behaviour of DRBD has no problems coping
with that.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
or
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Linbit recommends a
\series bold
workaround
\series default
for the inconsistencies during re-sync: LVM snapshots.
We tried it, but found a
\emph on
performance penalty
\emph default
which made it prohibitive for our concrete application.
A problem seems to be the cost of destroying snapshots.
LVM uses by default a BOW strategy (Backup On Write, which is the counterpart
of COW = Copy On Write).
BOW increases IO latencies during ordinary operation.
Retaining snapshots is cheap, but reverting them may be very costly, depending
on workload.
We didn't fully investigate that effect, and our experience is a few years
old.
You might come to a different conclusion for a different workload, for
newer versions of system software, or for a different strategy if you carefully
investigate the field.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
DRBD problems usually arise
\emph on
only
\emph default
when the network throughput shows some
\begin_inset Quotes eld
\end_inset
awkward
\begin_inset Quotes erd
\end_inset
analog behaviour, such as overload, or as occasionally produced by various
switches / routers / transmitters, or other potential sources of packet
loss.
\end_layout
\begin_layout Subsection
Behaviour of MARS
\begin_inset CommandInset label
LatexCommand label
name "sub:Behaviour-of-MARS"
\end_inset
\end_layout
\begin_layout Standard
The behaviour of MARS in the above scenario:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/network-bottleneck-mars.fig
width 80col%
\end_inset
\end_layout
\begin_layout Standard
\noindent
When the network is constrained, an asynchronous system like MARS will continue
to serve the user IO requests (dotted green line) without any impact /
incident while the actual network throughput (solid green line) follows
the red line.
In the meantime, all changes to the block device are recorded in the
transaction logfiles.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Here is one point in favour of DRBD: MARS stores its transaction logs on
the filesystem
\family typewriter
/mars/
\family default
.
When the network bottleneck lasts very long (some days or even some
weeks), the filesystem will eventually run out of space.
Section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Defending-Overflow"
\end_inset
discusses countermeasures against that in detail.
In contrast to MARS, DRBD allocates its bitmap
\emph on
statically
\emph default
at resource creation time.
It uses up less space, and you don't have to monitor it for (potential)
overflows.
The space for transaction logs is the price you have to pay if you want
or need anytime consistency, or asynchronous replication in general.
\end_layout
\begin_layout Standard
In order to really grasp the
\emph on
heart
\emph default
of the difference between synchronous and asynchronous replication, we
look at the following modified scenario:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/network-flaky-mars.fig
width 80col%
\end_inset
\end_layout
\begin_layout Standard
\noindent
This time, the network throughput (red line) is varying
\begin_inset Foot
status open
\begin_layout Plain Layout
In real life, many long-distance lines or even some heavily used metro lines
usually show fluctuations of their network bandwidth by an order of magnitude,
or even higher.
We have measured them.
The overall behaviour can be characterized as
\begin_inset Quotes eld
\end_inset
\series bold
chaotic
\series default
\begin_inset Quotes erd
\end_inset
.
\end_layout
\end_inset
in some unpredictable way.
As before, the application throughput served by MARS is assumed to be constant
(dotted green line, often superseded by the solid green line).
The actual replication network throughput is depicted by the solid green
line.
\end_layout
\begin_layout Standard
As you can see, a drop in network throughput below the application demand
has no impact on the application throughput, but only on the replication
network throughput.
Whenever the network throughput is held back due to the flaky network,
it simply catches up as soon as possible by overshooting the application
throughput.
The amount of lag-behind is visualized as shaded area: downward shading
(below the application throughput) means an increase of the lag-behind,
while the upwards shaded areas (beyond the application throughput) indicate
a decrease of the lag-behind (catch-up).
Once the lag-behind has been fully caught up, the network throughput suddenly
jumps back to the application throughput (here visible in two cases).
\end_layout
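\begin_layout Standard
The catch-up behaviour described above can be sketched as a simple backlog
model.
The following Python code is not the MARS implementation.
It assumes discrete time steps, a constant application write rate, and a
replicator which always sends as much pending data as the line permits.
The capacity values are made up for illustration.
\end_layout

```python
def simulate_async_replication(app_rate, net_capacity):
    """Return a list of (sent, backlog) pairs, one per time step.

    app_rate:     data written by the application per time step (constant)
    net_capacity: available network capacity for each time step (red line)
    """
    backlog = 0.0
    trace = []
    for capacity in net_capacity:
        pending = backlog + app_rate   # everything not yet transferred
        sent = min(capacity, pending)  # use the line as fully as possible
        backlog = pending - sent       # lag-behind carried to the next step
        trace.append((sent, backlog))
    return trace


# Flaky line: a dip below the application rate of 5, followed by recovery.
capacity = [10, 10, 2, 0, 3, 10, 10, 10, 10]
trace = simulate_async_replication(app_rate=5, net_capacity=capacity)

for step, (sent, backlog) in enumerate(trace):
    print(f"step {step}: sent {sent:4.1f}, lag-behind {backlog:4.1f}")
```

\begin_layout Standard
During the dip, the lag-behind grows while the application is unaffected.
After the recovery, the replication throughput overshoots the application
rate until the lag-behind is zero, and then falls back to the application
rate.
\end_layout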
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Note that the existence of lag-behind areas roughly corresponds to
DRBD disconnect states, and in turn to DRBD inconsistent states of the
secondary as long as the lag-behind has not been fully caught up.
The very rough
\begin_inset Foot
status open
\begin_layout Plain Layout
Of course, this visualization is not exact.
On one hand, the DRBD inconsistency phase may start later than depicted here,
because it only starts
\emph on
after
\emph default
the first automatic disconnect, upon the first automatic re-connect.
In addition, the amount of resync data may be smaller than the amount of
corresponding MARS transaction logfile data, because the DRBD bitmap will
coalesce multiple writes to the same block into one single transfer.
On the other hand, DRBD will transfer no data at all during its disconnected
state, while MARS continues to do its best.
This leads to a prolongation of the DRBD inconsistent phase.
Depending on properties of the workload and of the network, the real duration
of the inconsistency phase may be either shorter or longer.
\end_layout
\end_inset
duration of the corresponding DRBD inconsistency phase is visualized as
magenta line at the time scale.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
MARS utilizes the existing network bandwidth as well as possible in order
to pipe through as much data as possible, provided that there exists some
data waiting to be transferred.
Conceptually, there exists no better way due to information-theoretic limits
(besides data compression).
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Note that
\emph on
on average
\emph default
during a longer period of time, the network must have enough capacity for
transporting all of your data.
MARS cannot magically break through information-theoretic limits.
It cannot magically transport gigabytes of data over modem lines.
Only
\emph on
relatively short
\emph default
network problems / packet loss can be compensated.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
In case of lag-behind, the version of the data replicated to the secondary
site corresponds to some time in the past.
Since the data is always transferred in the same order as originally submitted
at the primary site, the secondary never gets inconsistent.
Your mirror always remains usable.
Your only potential problem could be the outdated state, corresponding
to some state in the past.
However, the
\begin_inset Quotes eld
\end_inset
as-best-as-possible
\begin_inset Quotes erd
\end_inset
approach to the network transfer ensures that your version is always
\emph on
as up-to-date as possible
\emph default
even under ill-behaving network bottlenecks.
\series bold
There is simply no better way to do it.
\series default
In the presence of temporary network bottlenecks such as network congestion,
there exists no better method than the one prescribed by the information-theoretic
limit (red line, neglecting data compression).
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
In order to get all of your data through the line, the network must become
healthy again at some point.
Otherwise, data will be recorded until the capacity of the
\family typewriter
/mars/
\family default
filesystem is exhausted, leading to an emergency mode (see section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Resolution-of-Emergency"
\end_inset
).
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
MARS' property of never sacrificing local data consistency (at the possible
cost of actuality, as long as you have enough capacity in
\family typewriter
/mars/
\family default
) is called
\series bold
Anytime Consistency
\series default
.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Even when the capacity of
\family typewriter
/mars/
\family default
is exhausted and when emergency mode is entered, the replicas will not
become inconsistent by themselves.
However, when the emergency mode is later
\emph on
cleaned up
\emph default
for a replica, it will become temporarily inconsistent during the fast
full sync.
Details are in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Resolution-of-Emergency"
\end_inset
.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Conclusion: you can even use
\series bold
traffic shaping
\series default
on MARS' TCP connections in order to globally balance your network throughput
(of course at the cost of actuality, but without sacrificing local data
consistency).
If you tried to do the same with DRBD, you could easily provoke a disaster.
MARS simply tolerates any network problems, provided that there is enough
disk space for transaction logfiles.
Even in case of completely filling up your disk with transaction logfiles
after some days or weeks, you will not lose local consistency anywhere
(see section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Defending-Overflow"
\end_inset
).
\end_layout
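\begin_layout Standard
As a minimal sketch of such traffic shaping, assume a Linux node whose replication traffic leaves via the interface
\family typewriter
eth0
\family default
(the interface name, the rate and the burst size are illustrative assumptions, not recommendations).
A token bucket filter from the standard
\family typewriter
tc
\family default
tool could then cap the outgoing rate.
Note that this simple form shapes all traffic leaving that interface, not only the MARS TCP connections.
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=bash"
inline false
status open
\begin_layout Plain Layout
# assumption: replication traffic leaves via eth0
\end_layout
\begin_layout Plain Layout
tc qdisc add dev eth0 root tbf rate 100mbit burst 128kb latency 400ms
\end_layout
\begin_layout Plain Layout
# show the active queueing discipline
\end_layout
\begin_layout Plain Layout
tc qdisc show dev eth0
\end_layout
\begin_layout Plain Layout
# remove the shaper again
\end_layout
\begin_layout Plain Layout
tc qdisc del dev eth0 root
\end_layout
\end_inset
\end_layout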
\begin_layout Standard
Finally, here is yet another scenario where MARS can cope with the situation:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/network-constant-mars.fig
width 80col%
\end_inset
\end_layout
\begin_layout Standard
\noindent
This time, the network throughput limit (solid red line) is assumed to be
constant.
However, the application workload (dotted green line) shows some heavy
peaks.
We know from our 1&1 datacenters that such application behaviour is
very common (e.g.
in case of certain kinds of DDoS attacks etc).
\end_layout
\begin_layout Standard
When the peaks exceed the network capacity for some time, the
replication network throughput (solid green line) will be limited for a
short time, stay a little bit longer at the limit, and finally drop down
again to the normal workload.
In other words, you get a flexible buffering behaviour, coping with the
peaks.
\end_layout
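\begin_layout Standard
This buffering behaviour can be described by a simple backlog model.
As a rough sketch (the symbols below are introduced only for illustration), let
\begin_inset Formula $w(t)$
\end_inset
be the application write rate,
\begin_inset Formula $c$
\end_inset
the constant network throughput limit, and
\begin_inset Formula $B(t)$
\end_inset
the amount of data not yet replicated:
\begin_inset Formula
\[
B(t+\Delta t)=\max\bigl(0,\;B(t)+(w(t)-c)\,\Delta t\bigr)
\]
\end_inset
The replication throughput stays at the limit
\begin_inset Formula $c$
\end_inset
as long as
\begin_inset Formula $B(t)>0$
\end_inset
, which is why the solid green line remains at the limit somewhat longer than the peak itself lasts.
\end_layout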
\begin_layout Standard
Similar scenarios (where both the application workload has peaks and the
network is flaky to some degree) are rather common.
If you used DRBD there, you would likely run into regular application
performance problems and/or frequent automatic disconnect cycles, depending
on the height and on the duration of the peaks, and on network resources.
\end_layout
\begin_layout Section
Long Distances / High Latencies
\end_layout
\begin_layout Standard
In general, and in some theories, latencies are conceptually independent
of throughput, at least to some degree.
All 4 possible combinations exist:
\end_layout
\begin_layout Enumerate
There exist communication lines with high latencies but also high throughput.
Examples are raw fibre cables on the floor of the Atlantic.
\end_layout
\begin_layout Enumerate
High latencies on low-throughput lines are very easy to achieve.
If you have never seen this, you have never run interactive
\family typewriter
vi
\family default
over
\family typewriter
ssh
\family default
in parallel to downloads on your old-fashioned modem line.
\end_layout
\begin_layout Enumerate
Low latencies need not be incompatible with high throughput.
See Myrinet, InfiniBand or high-speed point-to-point interconnects, such
as modern RAM busses.
\end_layout
\begin_layout Enumerate
Low latency combined with low throughput is also possible: in an ATM system
(or another pre-reservation system for bandwidth), just increase the multiplex
factor on low-capacity but short lines, which is only possible at the cost
of assigned bandwidth.
\end_layout
\begin_layout Standard
In the
\emph on
internet
\emph default
practice, however, it is very likely that high latencies will also lead
to worse throughput, because of the
\emph on
congestion control algorithms
\emph default
running all over the world.
\end_layout
\begin_layout Standard
We have experimented with extremely large TCP send/receive buffers plus
various window sizes and congestion control algorithms over long-distance
lines between the USA and Europe.
Yes, it is possible to improve the behaviour to some degree.
But magic does not happen.
Natural laws will always hold.
You simply cannot travel faster than the speed of light.
\end_layout
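\begin_layout Standard
A back-of-the-envelope estimate illustrates why.
Assume a single TCP connection with window size
\begin_inset Formula $W$
\end_inset
and round-trip time
\begin_inset Formula $\mathrm{RTT}$
\end_inset
, whose throughput is bounded by
\begin_inset Formula $W/\mathrm{RTT}$
\end_inset
.
Further assume a transatlantic path of about 6000 km of fibre, where light propagates at roughly
\begin_inset Formula $2\cdot10^{5}$
\end_inset
km/s.
These figures are illustrative assumptions, not measurements from our experiments.
Filling a 1 Gbit/s line then requires:
\begin_inset Formula
\[
\mathrm{RTT}\geq\frac{2\cdot6000\,\mathrm{km}}{2\cdot10^{5}\,\mathrm{km/s}}=60\,\mathrm{ms},\qquad W\geq10^{9}\,\mathrm{bit/s}\cdot0.06\,\mathrm{s}=6\cdot10^{7}\,\mathrm{bit}\approx7.5\,\mathrm{MB}
\]
\end_inset
Real routes are longer than the geographic distance and add switching delays, so the practical round-trip time is higher, and any packet loss shrinks the effective window further.
\end_layout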
\begin_layout Standard
Our experience leads to the following rule of thumb, not formally proven
by anything, but just observed in practice:
\end_layout
\begin_layout Quotation
In general
\begin_inset Foot
status open
\begin_layout Plain Layout
We have heard of cases where even distances of less than 50 km did not work
with DRBD.
It depends on application workload, on properties of the line, and on congestion
caused by other traffic.
Some other people told us that according to
\emph on
their
\emph default
experience, only much shorter distances should be considered operable, in
the range of just a few kilometers.
However, they agree that DRBD is rock stable when used on crossover cables.
\end_layout
\end_inset
, synchronous data replication (not limited to applications of DRBD) works
reliably only over distances
\begin_inset Formula $<50$
\end_inset
km, or sometimes even less.
\end_layout
\begin_layout Standard
There may be some exceptions, e.g.
when dealing with low-end workstation loads.
But when you are responsible for a whole datacenter and/or some centralized
storage units, don't waste your time by trying (almost) impossible things.
We recommend using MARS in such use cases.
\end_layout
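\begin_layout Standard
The physical floor behind this rule of thumb can be estimated as follows, assuming a 50 km fibre link in which light propagates at roughly
\begin_inset Formula $2\cdot10^{5}$
\end_inset
km/s (illustrative figures):
\begin_inset Formula
\[
\mathrm{RTT}\geq\frac{2\cdot50\,\mathrm{km}}{2\cdot10^{5}\,\mathrm{km/s}}=0.5\,\mathrm{ms},\qquad\frac{1}{0.5\,\mathrm{ms}}=2000\ \mathrm{round\ trips\ per\ second}
\]
\end_inset
A synchronous write has to wait for at least one such round trip, so a strictly serialized writer cannot exceed about 2000 writes per second on this link, before any switching, queueing or disk latency is added.
This is only a lower bound on the delay; the rule of thumb itself stems from practical observation, not from this estimate.
\end_layout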
\begin_layout Section
Higher Consistency Guarantees vs Actuality
\end_layout
\begin_layout Standard
We already saw in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Network-Bottlenecks"
\end_inset
that certain types of network bottlenecks can easily (and reproducibly)
destroy the consistency of your DRBD secondary, while MARS will preserve
local consistency at the cost of actuality (
\series bold
anytime consistency
\series default
).
\end_layout
\begin_layout Standard
Some people, often working in database operations, argue insistently
that actuality is so valuable that it must not be sacrificed under
any circumstances.
\end_layout
\begin_layout Standard
Anyone arguing this way has at least the following choices (list may be
incomplete):
\end_layout
\begin_layout Enumerate
None of the above use cases for MARS apply.
For instance, short distance replication over crossover cables is sufficient
(which occurs very often), or the network is reliable enough such that
bottlenecks can never occur (e.g.
because the total load is extremely low, or conversely the network is extremely
overengineered / expensive), or the occurrence of bottlenecks can
\emph on
provably
\emph default
be taken into account.
In such cases, DRBD is clearly a better solution than MARS, because it
provides better actuality than the current version of MARS, and it uses
fewer disk resources.
\end_layout
\begin_layout Enumerate
In the presence of network bottlenecks, people didn't notice and/or didn't
understand and/or underestimated the risk of accidental invalidation
of their DRBD secondaries.
They should carefully check that risk.
They should convince themselves that the risk is
\emph on
really
\emph default
bearable.
Once they are hit by a systematic chain of events which
\emph on
reproducibly
\emph default
provokes the bad effect, it is too late
\begin_inset Foot
status open
\begin_layout Plain Layout
Some people seem to need a bad experience before they understand the difference
between risk caused by reproducible effects and mere bad luck.
\end_layout
\end_inset
.
\end_layout
\begin_layout Enumerate
In the presence of network bottlenecks, people found a solution such that
DRBD does not automatically re-connect after the connection has been dropped
due to network problems (c.f.
\family typewriter
ko-count
\family default
parameter).
So the risk of inconsistency
\emph on
appears
\emph default
to have vanished.
In some cases, people did not notice that the risk has
\emph on
not completely
\begin_inset Foot
status open
\begin_layout Plain Layout
Hint: what's the
\emph on
conceptual
\emph default
difference between an automatic and a manual re-connect? Yes, you can try
to
\emph on
lower
\emph default
the risk in some cases by transferring risks to human analysis and human
decisions, but did you take into account the possibility of human errors?
\end_layout
\end_inset
\emph default
vanished, and/or they did not notice that now the actuality produced by
DRBD is even drastically worse than that of MARS (in the same situation).
It is true that DRBD provides better actuality in
\family typewriter
connected
\family default
state, but for a full picture the actuality in
\family typewriter
disconnected
\family default
state should not be neglected
\begin_inset Foot
status open
\begin_layout Plain Layout
Hint: a potential hurdle may be the fact that the current format of
\family typewriter
/proc/drbd
\family default
displays neither the timestamp of the first
\emph on
relevant
\emph default
network drop nor the total amount of lag-behind user data (which is
\emph on
not
\emph default
the same as the number of dirty bits in the bitmap), while
\family typewriter
marsadm view
\family default
can display both.
So it is difficult to judge the risks.
One possibility is to inspect the DRBD messages in the syslog, but quantification
could remain hard.
\end_layout
\end_inset
.
So they didn't notice that their argumentation on the importance of actuality
may be fundamentally wrong.
A possible way to overcome that may be re-reading section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Behaviour-of-MARS"
\end_inset
and comparing its outcome with the corresponding outcome of DRBD in the
same situation.
\end_layout
\begin_layout Enumerate
People are stuck in contradictory requirements because the current version
of MARS does not yet support synchronous or pseudo-synchronous operation
modes.
This should be resolved some day.
\end_layout
\begin_layout Standard
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
A common misunderstanding is about the actuality guarantees provided by
filesystems.
By default, the buffer cache / page cache uses a
\series bold
writeback strategy
\series default
for performance reasons.
Even modern journalling filesystems will (by default) provide only consistency
guarantees, but no strong actuality guarantee.
In case of power loss, some transactions may be even
\emph on
rolled back
\emph default
in order to restore consistency.
According to POSIX
\begin_inset Foot
status open
\begin_layout Plain Layout
The above argumentation also applies analogously to Windows filesystems.
\end_layout
\end_inset
and other standards, the only
\emph on
reliable
\emph default
way to achieve actuality is usage of system calls like
\family typewriter
sync()
\family default
,
\family typewriter
fsync()
\family default
,
\family typewriter
fdatasync()
\family default
, flags like
\family typewriter
O_DIRECT
\family default
, or similar.
For performance reasons, the
\emph on
vast majority of applications
\emph default
don't use them at all, or use them only sparingly!
\end_layout
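\begin_layout Standard
The difference can be tried out from the shell.
The following sketch assumes GNU coreutils
\family typewriter
dd
\family default
; the file name
\family typewriter
demo.dat
\family default
is purely illustrative.
Only the second and third variant ask the kernel for more than the default writeback behaviour.
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=bash"
inline false
status open
\begin_layout Plain Layout
# default: data may still sit in the page cache when dd returns
\end_layout
\begin_layout Plain Layout
dd if=/dev/zero of=demo.dat bs=4k count=256
\end_layout
\begin_layout Plain Layout
# conv=fsync: dd calls fsync() on the output file before exiting
\end_layout
\begin_layout Plain Layout
dd if=/dev/zero of=demo.dat bs=4k count=256 conv=fsync
\end_layout
\begin_layout Plain Layout
# oflag=direct: open the output with O_DIRECT
\end_layout
\begin_layout Plain Layout
# (some filesystems, e.g. tmpfs, may reject this flag)
\end_layout
\begin_layout Plain Layout
dd if=/dev/zero of=demo.dat bs=4k count=256 oflag=direct
\end_layout
\begin_layout Plain Layout
# flush all dirty buffers system-wide
\end_layout
\begin_layout Plain Layout
sync
\end_layout
\end_inset
\end_layout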
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
It makes no sense to require strong actuality guarantees from any block
layer replication (whether DRBD or future versions of MARS) while higher
layers such as filesystems or even applications are already sacrificing
them!
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
In summary, the
\series bold
anytime consistency
\series default
provided by MARS is an argument you should consider, even if you need an
extra hard disk for transaction logfiles.
\end_layout
\begin_layout Chapter
Quick Start Guide
\end_layout
\begin_layout Standard
This chapter is for impatient but experienced sysadmins who already know
DRBD.
For more complete information, refer to chapter
\begin_inset CommandInset ref
LatexCommand nameref
reference "chap:The-Sysadmin-Interface"
\end_inset
.
\end_layout
\begin_layout Section
Preparation: What you Need
\begin_inset CommandInset label
LatexCommand label
name "sec:Preparation:-What-you"
\end_inset
\end_layout
\begin_layout Standard
Typically, you will use MARS on servers in a datacenter for replication
of large amounts of data.
\end_layout
\begin_layout Standard
Typically, you will use MARS for replication
\emph on
between
\emph default
multiple datacenters, when the distances are greater than
\begin_inset Formula $\approx50$
\end_inset
km.
Many other solutions, even from commercial storage vendors, will not work
reliably over large distances when your network is not
\emph on
extremely
\emph default
reliable, or when you try to push huge amounts of data from high-performance
applications through a network bottleneck.
If you have ever encountered such problems (or want to avoid them in advance),
MARS is for you.
\end_layout
\begin_layout Standard
You can use MARS both on dedicated storage servers (e.g.
for serving Windows clients) and on standalone Linux servers where CPU
and storage are not separated.
\end_layout
\begin_layout Standard
In order to protect your data from low-level disk failures, you should use
a hardware RAID controller with BBU.
Software RAID is explicitly
\emph on
not
\emph default
recommended, because it generally provides worse performance due to the
lack of a hardware BBU (for some benchmark comparisons with and without BBU, see
\begin_inset Flex URL
status collapsed
\begin_layout Plain Layout
https://github.com/schoebel/blkreplay/raw/master/doc/blkreplay.pdf
\end_layout
\end_inset
).
\end_layout
\begin_layout Standard
Typically, you will need more than one RAID set
\begin_inset Foot
status open
\begin_layout Plain Layout
For low-cost storage, RAID-5 is no longer regarded as safe for today's typical
storage sizes, because the error rate is regarded as too high.
Therefore, use RAID-6.
If you need more than 15 disks in total, create multiple RAID sets (each
having at most 15 disks, better about 12 disks) and stripe them via LVM
(or via your hardware RAID controller if it supports RAID-60).
\end_layout
\end_inset
for large amounts of data.
Therefore, use of LVM is also recommended
\begin_inset Foot
status open
\begin_layout Plain Layout
You may also combine MARS with commercial storage boxes connected via Fibre Channel
or iSCSI, but we do not yet have operational experience with such
setups at 1&1.
\end_layout
\end_inset
for your data.
\end_layout
\begin_layout Standard
MARS' tolerance of networking problems comes at some cost.
You will need some extra space for the transaction logfiles of MARS, residing
on the
\family typewriter
/mars/
\family default
filesystem.
\end_layout
\begin_layout Standard
The exact space requirements for
\family typewriter
/mars/
\family default
depend on the
\emph on
average write rate
\emph default
of your application, not on the size of your data.
We found that only a few applications write more than 1 TB per day.
Most write even less than 100 GB per day.
Usually, you want to dimension
\family typewriter
/mars/
\family default
such that you can survive a network loss lasting 3 days / about one weekend.
This can be achieved with current technology rather easily: as a simple
rule of thumb, just use one
\series bold
dedicated disk
\series default
having a capacity of 4 TB or more.
Typically, that will provide you with plenty of headroom even for bigger
networking incidents.
\end_layout
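\begin_layout Standard
As a worked example of this rule of thumb, let
\begin_inset Formula $r$
\end_inset
be the average write rate and
\begin_inset Formula $T$
\end_inset
the network outage to be survived (both symbols are introduced here only for illustration).
The space needed for transaction logfiles is then roughly
\begin_inset Formula $S\approx r\cdot T$
\end_inset
.
For the heavy writer mentioned above:
\begin_inset Formula
\[
S\approx1\,\mathrm{TB/day}\cdot3\,\mathrm{days}=3\,\mathrm{TB}<4\,\mathrm{TB}
\]
\end_inset
For a more typical application writing 100 GB per day, the same outage needs only about 300 GB, so a 4 TB disk would last for roughly 40 days.
\end_layout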
\begin_layout Standard
Dedicated disks for
\family typewriter
/mars/
\family default
have another advantage: their mechanical head movement is completely independent
of the head movements of your data disks.
For best performance, attach that dedicated disk to your hardware RAID
controller with BBU, building a separate RAID set (even if it consists
only of a single disk -- notice that the
\series bold
hardware BBU
\series default
is the crucial point).
\end_layout
\begin_layout Standard
If you are concerned about reliability, use two disks combined into
a relatively small RAID-1 set.
For extremely high performance demands, you may consider (and check) RAID-10.
\end_layout
\begin_layout Standard
Since the transaction logfiles are highly sequential in their access pattern,
a cheap but high-capacity SATA disk (or nearline-SAS disk) is usually sufficient.
At the time of this writing, standard SATA SSDs have proven to be
\emph on
not
\emph default
(yet) preferable.
Although they offer a high random IOPS rate, their sequential throughput
is worse, and their long-term stability is questioned by many people at
the time of this writing.
However, as technology evolves and becomes more mature, this could change
in the future.
\end_layout
\begin_layout Standard
Use
\family typewriter
ext4
\family default
for
\family typewriter
/mars/
\family default
.
Avoid
\family typewriter
ext3
\family default
, and don't use
\family typewriter
xfs
\family default
\begin_inset Foot
status open
\begin_layout Plain Layout
It seems that the late internal resource allocation strategy of
\family typewriter
xfs
\family default
(or another currently unknown reason) could be the reason for some resource
deadlocks which appear only with
\family typewriter
xfs
\family default
and only under
\emph on
extremely
\emph default
high IO load in combination with high memory pressure.
\end_layout
\end_inset
at all.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Notice that the filesystem
\family typewriter
/mars/
\family default
has nothing to do with an ordinary filesystem.
It is completely reserved for MARS internal purposes, namely as a
\series bold
storage container
\series default
for MARS' persistent data.
It does not obey any userspace rules like FHS (filesystem hierarchy standard),
and it should not be accessed by any userspace tool except the official
\family typewriter
marsadm
\family default
tool.
You should regard its internal data format as a
\series bold
blackbox
\series default
.
The internal data format may change in the future, or the complete
\family typewriter
/mars/
\family default
filesystem may be even replaced by a totally different container format,
while the official
\family typewriter
marsadm
\family default
interface is supposed to remain stable.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
That said, you might look into its contents
\emph on
by hand
\emph default
out of curiosity or for
\emph on
debugging purposes
\emph default
, and only as root.
But don't program any tools / monitoring scripts / etc bypassing the official
\family typewriter
marsadm
\family default
tool.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
Like DRBD, the current version of MARS has
\series bold
no security
\series default
built in.
MARS assumes that it is running in a
\series bold
trusted network
\series default
.
Anyone who can connect to the MARS ports (default 7777 to 7779) can potentially
break in and become root! Therefore, you
\series bold
must
\series default
protect your network by appropriate means, such as firewalling and/or encrypted
VPN.
\end_layout
\begin_layout Standard
Currently, MARS provides no shared secret like DRBD, because a simple shared
secret is way too weak to provide any real security (potentially misleading
people about the real level of security).
Future versions of MARS should provide at least 2-factor authentication,
and encryption via dynamic session keys.
Until that is implemented, use a secured VPN instead! And don't forget
to
\emph on
audit
\emph default
it for security holes!
\end_layout
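\begin_layout Standard
The firewalling mentioned above could look like the following minimal sketch, assuming
\family typewriter
iptables
\family default
on each node and a single peer whose address is
\family typewriter
192.0.2.2
\family default
(a documentation address used here as a placeholder).
It only restricts who may connect to the default MARS ports; it does not encrypt anything and does not replace a secured VPN.
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=bash"
inline false
status open
\begin_layout Plain Layout
# assumption: 192.0.2.2 is the address of the peer node
\end_layout
\begin_layout Plain Layout
iptables -A INPUT -p tcp -s 192.0.2.2 --dport 7777:7779 -j ACCEPT
\end_layout
\begin_layout Plain Layout
# everybody else is refused on the MARS ports
\end_layout
\begin_layout Plain Layout
iptables -A INPUT -p tcp --dport 7777:7779 -j DROP
\end_layout
\end_inset
\end_layout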
\begin_layout Section
Setup Primary and Secondary Cluster Nodes
\end_layout
\begin_layout Standard
If you already use DRBD, you may migrate to MARS (or even back from MARS
to DRBD) if you use
\emph on
external
\begin_inset Foot
status open
\begin_layout Plain Layout
\emph on
Internal
\emph default
DRBD metadata should also work as long as the filesystem inside your block
device / disk already exists and is not re-created.
The latter would destroy the DRBD metadata, but even that will not hurt
you really: you can always switch back to DRBD using
\emph on
external
\emph default
metadata, as long as you have some small spare space somewhere.
\end_layout
\end_inset
\emph default
DRBD metadata (which is not touched by MARS).
\end_layout
\begin_layout Subsection
Kernel and MARS Module
\end_layout
\begin_layout Standard
At the time of this writing, a small pre-patch for the Linux kernel is needed.
It is trivial and consists mostly of
\family typewriter
EXPORT_SYMBOL()
\family default
statements.
The pre-patch must be applied to the kernel source tree before building
your (custom) kernel.
Future versions of MARS are planned to require no pre-patch anymore.
\end_layout
\begin_layout Standard
The MARS kernel module can be built in two different ways:
\end_layout
\begin_layout Enumerate
in place in the kernel source tree:
\family typewriter
cd block/ && git clone git://github.com/schoebel/mars
\end_layout
\begin_layout Enumerate
as a separate kernel module, only for experienced
\begin_inset Foot
status open
\begin_layout Plain Layout
You should be familiar with the problems arising from orthogonal combination
of different kernel versions with different MARS module versions and with
different
\family typewriter
marsadm
\family default
userspace tool versions at the package management level.
Hint:
\family typewriter
modinfo
\family default
is your friend.
\end_layout
\end_inset
sysadmins: see file
\family typewriter
Makefile.dist
\family default
(tested with some older versions of Debian; may need some extra work with
other distros).
\end_layout
\begin_layout Standard
Further / more accurate / latest instructions can be found in
\family typewriter
README
\family default
and in
\family typewriter
INSTALL
\family default
.
You must not only install the kernel and the
\family typewriter
mars.ko
\family default
kernel module to all of your cluster nodes, but also the
\family typewriter
marsadm
\family default
userspace tool.
\end_layout
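\begin_layout Standard
After installation, a quick plausibility check on each node is to compare the running kernel with the kernel the module was built for.
The following sketch uses only the standard
\family typewriter
uname
\family default
and
\family typewriter
modinfo
\family default
tools and assumes that the module is installed under the name
\family typewriter
mars
\family default
.
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=bash"
inline false
status open
\begin_layout Plain Layout
# kernel release currently running
\end_layout
\begin_layout Plain Layout
uname -r
\end_layout
\begin_layout Plain Layout
# kernel release the module was built for (vermagic field)
\end_layout
\begin_layout Plain Layout
modinfo -F vermagic mars
\end_layout
\end_inset
\end_layout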
\begin_layout Subsection
Setup your Cluster Nodes
\begin_inset CommandInset label
LatexCommand label
name "sub:Setup-your-Cluster"
\end_inset
\end_layout
\begin_layout Standard
For your cluster, you need at least two nodes.
In the following, they will be called A and B.
In the beginning, A will have the
\family typewriter
primary
\family default
role, while B will be your initial
\family typewriter
secondary
\family default
.
The roles may change later.
\end_layout
\begin_layout Enumerate
You must be
\family typewriter
root
\family default
.
\end_layout
\begin_layout Enumerate
On each of A and B, create the
\family typewriter
/mars/
\family default
mountpoint.
\end_layout
\begin_layout Enumerate
On each node, create an
\family typewriter
ext4
\family default
filesystem on your separate disk / RAID set via
\family typewriter
mkfs.ext4
\family default
(for requirements on size etc see section
\begin_inset CommandInset ref
LatexCommand nameref
reference "sec:Preparation:-What-you"
\end_inset
).
\end_layout
\begin_layout Enumerate
On each node, mount that filesystem to
\family typewriter
/mars/
\family default
.
It is advisable to add an entry to
\family typewriter
/etc/fstab
\family default
.
\end_layout
\begin_layout Enumerate
For security reasons, execute
\family typewriter
chmod 0700 /mars
\family default
everywhere after
\family typewriter
/mars/
\family default
has been mounted.
If you forget this step, any following
\family typewriter
marsadm
\family default
command will issue a warning, but will fix the problem for you.
\end_layout
\begin_layout Enumerate
On node A, say
\family typewriter
marsadm create-cluster
\family default
.
\begin_inset Newline newline
\end_inset
This must be done
\emph on
exactly once
\emph default
, on exactly one node of your cluster.
Never do this twice or on different nodes, because that would create two
different clusters which would have nothing to do with each other.
The
\family typewriter
marsadm
\family default
tool protects you against accidentally joining / merging two different
clusters.
If you accidentally created two different clusters, just umount that
\family typewriter
/mars/
\family default
partition and start over with step 3 at that node.
\end_layout
\begin_layout Enumerate
On node B, you must have a working
\family typewriter
ssh
\family default
connection to node A (as
\family typewriter
root
\family default
).
Test it by saying
\family typewriter
ssh A w
\family default
on node B.
It should work without entering a password (otherwise, use
\family typewriter
ssh-agent
\family default
to achieve that).
In addition,
\family typewriter
rsync
\family default
must be installed.
\end_layout
\begin_layout Enumerate
On node B, say
\family typewriter
marsadm join-cluster A
\end_layout
\begin_layout Enumerate
Only
\emph on
after
\begin_inset Foot
status open
\begin_layout Plain Layout
In fact, you may already
\family typewriter
modprobe mars
\family default
at node A after the
\family typewriter
marsadm create-cluster
\family default
.
Just don't do any of the
\family typewriter
*-cluster
\family default
operations when the kernel module is loaded.
All other operations should have no such restriction.
\end_layout
\end_inset
\emph default
that, do
\family typewriter
modprobe mars
\family default
on each node.
\end_layout
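\begin_layout Standard
The steps above can be summarized as the following session sketch.
The device name
\family typewriter
/dev/sdX1
\family default
is a placeholder for your dedicated disk or RAID set; all other commands are taken from the list above.
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=bash"
inline false
status open
\begin_layout Plain Layout
# --- on both nodes A and B, as root ---
\end_layout
\begin_layout Plain Layout
# assumption: /dev/sdX1 is your dedicated disk / RAID set
\end_layout
\begin_layout Plain Layout
mkdir /mars
\end_layout
\begin_layout Plain Layout
mkfs.ext4 /dev/sdX1
\end_layout
\begin_layout Plain Layout
mount /dev/sdX1 /mars
\end_layout
\begin_layout Plain Layout
chmod 0700 /mars
\end_layout
\begin_layout Plain Layout
# --- on node A only, exactly once ---
\end_layout
\begin_layout Plain Layout
marsadm create-cluster
\end_layout
\begin_layout Plain Layout
# --- on node B only ---
\end_layout
\begin_layout Plain Layout
ssh A w
\end_layout
\begin_layout Plain Layout
marsadm join-cluster A
\end_layout
\begin_layout Plain Layout
# --- finally, on both nodes ---
\end_layout
\begin_layout Plain Layout
modprobe mars
\end_layout
\end_inset
\end_layout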
\begin_layout Section
Creating and Maintaining Resources
\begin_inset CommandInset label
LatexCommand label
name "sec:Creating-and-Maintaining"
\end_inset
\end_layout
\begin_layout Standard
In the following example session, a block device
\family typewriter
/dev/lv-x/mydata
\family default
(called
\emph on
disk
\emph default
for short) must already exist on both nodes A and B, having the same
\begin_inset Foot
status open
\begin_layout Plain Layout
Actually, the disk at the initially secondary side may be larger than that
at the initially primary side.
This will waste space and is therefore not recommended.
\end_layout
\end_inset
size.
For the sake of simplicity, the disk (underlying block device) as well
as its later logical resource name as well as its later virtual device
name will all be named uniformly by the same suffix
\family typewriter
mydata
\family default
.
In general, you might name each of them differently, but that is not recommended
since it may easily lead to confusion in larger installations.
\end_layout
\begin_layout Standard
You may already have some data on your disk
\family typewriter
/dev/lv-x/mydata
\family default
at the initially primary side A.
Before using it for MARS, it must be unused for any other purpose (such
as being mounted, or used by DRBD, etc).
MARS will require
\series bold
exclusive access
\series default
to it.
\end_layout
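\begin_layout Standard
A quick manual check of these preconditions could look like the following sketch, run on each of A and B.
It uses only the standard
\family typewriter
lsblk
\family default
and
\family typewriter
blockdev
\family default
tools and is merely a convenience; it does not replace the safety checks of
\family typewriter
marsadm
\family default
.
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=bash"
inline false
status open
\begin_layout Plain Layout
# the MOUNTPOINT column must be empty: the disk is not mounted
\end_layout
\begin_layout Plain Layout
lsblk -o NAME,SIZE,MOUNTPOINT /dev/lv-x/mydata
\end_layout
\begin_layout Plain Layout
# size in bytes; compare the output of both nodes
\end_layout
\begin_layout Plain Layout
blockdev --getsize64 /dev/lv-x/mydata
\end_layout
\end_inset
\end_layout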
\begin_layout Enumerate
On node A, say
\family typewriter
marsadm create-resource mydata /dev/lv-x/mydata
\family default
.
\begin_inset Newline newline
\end_inset
As a result, a directory
\family typewriter
/mars/resource-mydata/
\family default
will be created on node A, containing some symlinks.
Node A will automatically start in the primary role for this resource.
Therefore, a new pseudo-device
\family typewriter
/dev/mars/mydata
\family default
will also appear after a few seconds.
\begin_inset Newline newline
\end_inset
Note that the initial contents of
\family typewriter
/dev/mars/mydata
\family default
will be exactly the same as in your pre-existing disk
\family typewriter
/dev/lv-x/mydata
\family default
.
\begin_inset Newline newline
\end_inset
If you like, you may already use
\family typewriter
/dev/mars/mydata
\family default
for mounting your pre-existing data, or for creating a fresh filesystem,
or for exporting via iSCSI, and so on.
You may even do so before any other cluster node has joined the resource
(so-called
\begin_inset Quotes eld
\end_inset
standalone mode
\begin_inset Quotes erd
\end_inset
).
But you can also do so later after setup of (one or more) secondaries.
\end_layout
\begin_layout Enumerate
Wait a few seconds until the directory
\family typewriter
/mars/resource-mydata/
\family default
and its symlinks also appear on cluster node B.
The command
\family typewriter
marsadm wait-cluster
\family default
may be helpful.
\end_layout
\begin_layout Enumerate
On node B, say
\family typewriter
marsadm join-resource mydata /dev/lv-x/mydata
\family default
.
\begin_inset Newline newline
\end_inset
As a result, the initial full-sync from node A to node B should start
automatically.
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Of course, the old contents of your disk
\family typewriter
/dev/lv-x/mydata
\family default
at side B (and
\emph on
only
\emph default
there!) are overwritten by the version from side A.
Since you are an experienced sysadmin, you knew that, and it was just the
effect you deliberately wanted to achieve.
If you didn't check that your old contents didn't contain any valuable
data (or if you accidentally provided a wrong disk device argument), it
is too late now.
The
\family typewriter
marsadm
\family default
command checks that the disk device argument is really a block device,
and that exclusive access to it is possible (as well as some further safety
checks, e.g.
matching sizes).
However, MARS cannot know the
\emph on
purpose
\emph default
of your generic block device.
MARS (as well as DRBD) is completely ignorant of the
\emph on
contents
\emph default
of a generic block device; it does not interpret it in any way.
Therefore, you may use MARS (as well as DRBD) for mirroring Windows filesystems
, or raw devices from databases, or virtual machines, or whatever.
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Hint: by default, MARS uses the so-called
\begin_inset Quotes eld
\end_inset
fast fullsync
\begin_inset Quotes erd
\end_inset
algorithm.
It works similarly to 
\family typewriter
rsync
\family default
, first reading the data on both sides and computing an md5 checksum for
each block.
Heavy-weight data is only transferred over the long-distance network upon
checksum mismatch.
This is extremely fast if your data is already (almost) identical on both
sides.
Conversely, if you know in advance that your initial data is completely
different on both sides, you may choose to switch off the fast fullsync
algorithm via
\family typewriter
echo 0 > /proc/sys/mars/do_fast_fullsync
\family default
in order to save the additional IO overhead and network latencies introduced
by the separate checksum comparison steps.
\end_layout
\begin_layout Enumerate
Optionally, only for experienced sysadmins who
\emph on
really
\emph default
 know what they are doing: if you intend to create a 
\emph on
new
\emph default
filesystem on
\family typewriter
/dev/mars/mydata
\family default
\emph on
after(!)
\emph default
having created the MARS resource as well as
\emph on
after
\emph default
having already joined it on every replica, you may abandon the fast fullsync
phase
\emph on
before
\emph default
creating the fresh filesystem, because the old content of
\family typewriter
/dev/mars/mydata
\family default
will then be just garbage not used by the freshly created filesystem
\begin_inset Foot
status open
\begin_layout Plain Layout
It is
\emph on
vital
\emph default
that the transaction logfile contents created by
\family typewriter
mkfs
\family default
is
\emph on
fully
\emph default
propagated to the secondaries and then replayed there.
\end_layout
\begin_layout Plain Layout
Analogously, another exception is also possible, but at your own risk (be
careful, really!): when migrating your data from DRBD to MARS, and you
have ensured that (1) at the end of using DRBD both your replicas were
really equal (you should have checked that), and (2) before and after setting
up any side of MARS (
\family typewriter
create-resource
\family default
as well as
\family typewriter
join-resource
\family default
) nothing has been written at all to it (i.e.
no usage, neither of
\family typewriter
/dev/lv/mydata
\family default
nor of
\family typewriter
/dev/mars/mydata
\family default
has occurred in any way), the first transaction logfile
\family typewriter
/mars/resource-mydata/log-000000001-$primary
\family default
created by MARS will be empty.
Check whether this is really true! Then, and only then, you may also issue
a
\family typewriter
fake-sync
\family default
.
\end_layout
\end_inset
.
Then, and only then, you may say
\family typewriter
marsadm fake-sync mydata
\family default
in order to abort the sync operation.
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
Never do a
\family typewriter
fake-sync
\family default
unless you are
\series bold
absolutely sure
\series default
that you really don't need to sync the data! Otherwise, you are
\emph on
guaranteed
\emph default
to have produced harmful inconsistencies.
If you accidentally issued
\family typewriter
fake-sync
\family default
, you may start over the fast full sync at your secondary side by saying 
\family typewriter
marsadm invalidate mydata
\family default
(analogously to the corresponding DRBD command).
\end_layout
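\begin_layout Standard
The checksum comparison behind the fast fullsync hint above can be sketched with ordinary shell tools.
This is only an illustration of the idea on two regular files, not the MARS kernel implementation: the function name sync_blocks, the default block size of 4096 bytes and the use of dd and md5sum are assumptions made for this example.
\end_layout

```shell
#!/bin/sh
# Illustration only: block-wise checksum comparison in the spirit of the
# "fast fullsync" idea. sync_blocks is a hypothetical name, not a MARS tool.
# Usage: sync_blocks SRC DST [BLOCKSIZE]; prints the number of copied blocks.
sync_blocks() {
  src=$1
  dst=$2
  bs=${3:-4096}
  size=$(wc -c < "$src")
  nblocks=$(( (size + bs - 1) / bs ))
  copied=0
  i=0
  while [ "$i" -lt "$nblocks" ]; do
    # Cheap step: read block i on both sides and compare md5 checksums.
    a=$(dd if="$src" bs="$bs" skip="$i" count=1 2>/dev/null | md5sum | cut -d' ' -f1)
    b=$(dd if="$dst" bs="$bs" skip="$i" count=1 2>/dev/null | md5sum | cut -d' ' -f1)
    if [ "$a" != "$b" ]; then
      # Expensive step: transfer the block only upon checksum mismatch.
      dd if="$src" of="$dst" bs="$bs" skip="$i" seek="$i" count=1 \
         conv=notrunc 2>/dev/null
      copied=$((copied + 1))
    fi
    i=$((i + 1))
  done
  echo "$copied"
}
```

\begin_layout Standard
When both files are already (almost) identical, nearly all blocks take only the cheap checksum path.
When they are completely different, every block pays for the checksum and for the copy, which is the overhead the hint above suggests avoiding by switching the algorithm off.
\end_layout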
\begin_layout Section
Keeping Resources Operational
\end_layout
\begin_layout Subsection
Logfile Rotation / Deletion
\begin_inset CommandInset label
LatexCommand label
name "sub:Logfile-Rotation"
\end_inset
\end_layout
\begin_layout Standard
As explained in section
\begin_inset CommandInset ref
LatexCommand nameref
reference "sec:The-Transaction-Logger"
\end_inset
, all changes to your resource data are recorded in transaction logfiles
residing on the
\family typewriter
/mars/
\family default
filesystem.
These files are always growing over time.
In order to avoid filesystem overflow, the following must be done in regular
time intervals:
\end_layout
\begin_layout Enumerate
\family typewriter
marsadm log-rotate all
\family default
\begin_inset Newline newline
\end_inset
This starts appending to a new logfile on all of your resources.
The logfiles are automatically numbered by an increasing 9-digit logfile
number.
This will suffice for many centuries even if you rotated the logfile once a
 minute.
Practical frequencies for logfile rotation are more like once an hour,
or every 10 minutes when having highly-loaded storage servers.
\end_layout
\begin_layout Enumerate
\family typewriter
marsadm log-delete-all all
\family default
\begin_inset Newline newline
\end_inset
This determines all logfiles from all resources which are no longer needed
(i.e.
which are
\emph on
fully
\emph default
replayed, on
\emph on
all
\emph default
relevant secondaries).
All superfluous logfiles are then deleted, including all copies on all
secondaries.
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
The current version of MARS deletes either
\emph on
all
\emph default
replicas of a logfile everywhere, or
\emph on
none
\emph default
of the replicas.
This is a simple rule, but has the drawback that one node may hinder other
nodes from freeing space in
\family typewriter
/mars/
\family default
.
In particular, the command
\family typewriter
marsadm pause-replay $res
\family default
(as well as
\family typewriter
marsadm disconnect $res
\family default
) will freeze space reclamation in the whole cluster if the pause
 lasts very long.
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Best practice is to do both
\family typewriter
log-rotate
\family default
and
\family typewriter
log-delete-all
\family default
in a
\family typewriter
cron
\family default
job.
In addition, you should establish some regular monitoring of the free space
present in the
\family typewriter
/mars/
\family default
filesystem.
\end_layout
\begin_layout Standard
More detailed information about avoidance of 
\family typewriter
/mars/
\family default
overflow is in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Defending-Overflow"
\end_inset
.
\end_layout
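\begin_layout Standard
A minimal sketch of a helper for such a cron job, assuming a POSIX shell.
The function names and the variables MARSADM, MARS_DIR and MIN_FREE_PCT are hypothetical; only the two marsadm commands are taken from the text above.
\end_layout

```shell
#!/bin/sh
# Hypothetical housekeeping helpers for a cron job. MARSADM, MARS_DIR and
# MIN_FREE_PCT are illustrative names, not part of MARS itself.
MARSADM="${MARSADM:-marsadm}"
MARS_DIR="${MARS_DIR:-/mars}"
MIN_FREE_PCT="${MIN_FREE_PCT:-20}"   # arbitrary example threshold

# Rotate first, then delete logfiles which are no longer needed.
mars_housekeeping() {
  $MARSADM log-rotate all || return 1
  $MARSADM log-delete-all all || return 1
}

# Print the used percentage of the filesystem holding MARS_DIR.
mars_used_pct() {
  df -P "$MARS_DIR" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Return non-zero (and warn on stderr) when free space is below the threshold.
mars_check_space() {
  used=$(mars_used_pct)
  free=$((100 - used))
  if [ "$free" -lt "$MIN_FREE_PCT" ]; then
    echo "WARNING: only ${free}% free on $MARS_DIR" >&2
    return 1
  fi
}
```

\begin_layout Standard
A wrapper script called from cron (for example every 10 minutes on highly loaded servers) would source these definitions and call mars_housekeeping followed by mars_check_space.
The non-zero exit status of mars_check_space can then be picked up by whatever monitoring you already use.
\end_layout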
\begin_layout Subsection
Switch Primary / Secondary Roles
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/switching.fig
width 90col%
\end_inset
\end_layout
\begin_layout Standard
\noindent
In contrast to DRBD, MARS distinguishes between
\emph on
intended
\emph default
and
\emph on
forced
\emph default
switching.
This distinction is necessary due to differences in the communication architect
ure (asynchronous communication vs synchronous communication, see sections
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-Lamport-Clock"
\end_inset
and
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-Symlink-Tree"
\end_inset
).
\end_layout
\begin_layout Standard
Asynchronous communication means that (in the worst case) a message may take
 an (almost) arbitrary amount of time in a disturbed network to propagate to another
 node.
As a consequence, the risk for accidentally creating an (unintended) split
brain is increased (compared to a synchronous system like DRBD).
\end_layout
\begin_layout Standard
In order to minimize this risk, MARS has invested a lot of effort into an
internal handover protocol when you start an
\emph on
intended
\emph default
primary switch.
\end_layout
\begin_layout Subsubsection
Intended Switching / Planned Handover
\begin_inset CommandInset label
LatexCommand label
name "sub:Intended-Switching"
\end_inset
\end_layout
\begin_layout Standard
Before starting a planned handover from your old primary
\family typewriter
A
\family default
to a new primary
\family typewriter
B
\family default
, you should check the replication of the resource.
As a human, use
\family typewriter
marsadm view mydata
\family default
.
For scripting, use the macros from section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Predefined-Trivial-Macros"
\end_inset
(see also section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Scripting-HOWTO"
\end_inset
; an example can be found in
\begin_inset Flex URL
status collapsed
\begin_layout Plain Layout
contrib/example-scripts/check-mars-switchable.sh
\end_layout
\end_inset
).
The network should be OK, and the amount of replication delay should be
as low as possible.
Otherwise, handover may take a very long time.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Best practice is to
\series bold
prepare a planned handover
\series default
by the following steps:
\end_layout
\begin_layout Enumerate
Check the network and the replication lag.
It should be low (a few hundred megabytes, or a low number of gigabytes
- see also the rough time forecast shown by
\family typewriter
marsadm view mydata
\family default
when there is a larger replication delay, or directly access the forecast
by
\family typewriter
marsadm view-replinfo
\family default
).
\end_layout
\begin_layout Enumerate
Stop your application, then umount
\family typewriter
/dev/mars/mydata
\family default
on host
\family typewriter
A
\family default
.
\end_layout
\begin_layout Enumerate
When scripting, or when typing extremely fast, or for better safety, say
\family typewriter
marsadm wait-umount mydata
\family default
 on host 
\family typewriter
B
\family default
.
When your network is OK, the propagation of the device usage state
\begin_inset Foot
status open
\begin_layout Plain Layout
Notice that the usage check for
\family typewriter
/dev/mars/mydata
\family default
on host
\family typewriter
B
\family default
is based on the
\emph on
open count
\emph default
transferred from
\emph on
another
\emph default
node
\family typewriter
A
\family default
.
Since MARS is operating asynchronously (in contrast to DRBD), it may take
some time until our node
\family typewriter
B
\family default
knows that the device is no longer used at
\family typewriter
A
\family default
.
This can lead to a race condition if you automate an intended takeover
with a script like
\family typewriter
ssh root@A
\begin_inset Quotes eld
\end_inset
umount /dev/mars/mydata
\begin_inset Quotes erd
\end_inset
; ssh root@B
\begin_inset Quotes eld
\end_inset
marsadm primary mydata
\begin_inset Quotes erd
\end_inset
\family default
because your second ssh command may be faster than the internal MARS symlink
tree propagation (cf section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-Symlink-Tree"
\end_inset
).
In order to prevent such races, you are strongly advised to use the command
\end_layout
\begin_layout Itemize
\family typewriter
marsadm wait-umount mydata
\end_layout
\begin_layout Plain Layout
on node
\family typewriter
B
\family default
before trying to become primary.
See also section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Scripting-HOWTO"
\end_inset
.
\end_layout
\end_inset
should take only a few seconds.
Otherwise, check for any network problems or any other problems.
\end_layout
\begin_layout Enumerate
On host
\family typewriter
B
\family default
, wait until
\family typewriter
marsadm view mydata
\family default
(or
\family typewriter
view-diskstate
\family default
) shows
\family typewriter
UpToDate
\family default
.
It is possible to omit this step, but then you have no control over the duration
 of the handover, and in case of any transfer problems, disk space problems,
 etc., you risk producing a split brain (although 
\family typewriter
marsadm
\family default
will do its best to avoid it).
Doing the wait by yourself,
\emph on
before
\emph default
starting
\family typewriter
marsadm primary
\family default
, has a big advantage: you can abort the handover cycle at any time, just
by re-mounting the device
\family typewriter
/dev/mars/mydata
\family default
at the old primary
\family typewriter
A
\family default
again, and by re-starting your application.
Once you have started
\family typewriter
marsadm primary
\family default
on host
\family typewriter
B
\family default
, you might have to switch back, possibly even via 
\family typewriter
primary --force
\family default
(see sections
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Forced-Switching"
\end_inset
and
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Split-Brain-Resolution"
\end_inset
).
\end_layout
\begin_layout Standard
Switching the roles is very similar to DRBD: just issue the command
\end_layout
\begin_layout Itemize
\family typewriter
marsadm primary mydata
\end_layout
\begin_layout Standard
on your formerly secondary node
\family typewriter
B
\family default
.
\end_layout
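\begin_layout Standard
The planned handover described above can be sketched as a small shell helper to be run on host B, after the application has been stopped and the device has been unmounted on host A.
The function names, the retry parameters and the MARSADM variable are hypothetical.
The sketch also assumes that marsadm view-diskstate prints exactly the word UpToDate once the replica has caught up; please verify this against your installed version before relying on it.
\end_layout

```shell
#!/bin/sh
# Hypothetical helper for a planned handover; run on the new primary B.
# MARSADM is an illustrative override hook, not a MARS feature.
MARSADM="${MARSADM:-marsadm}"

# Poll until the disk state of resource $1 reads UpToDate.
# $2 = maximum number of polls (default 60), $3 = delay in seconds (default 5).
# Assumption: view-diskstate prints exactly "UpToDate" when caught up.
wait_uptodate() {
  res=$1
  tries=${2:-60}
  delay=${3:-5}
  n=0
  while [ "$n" -lt "$tries" ]; do
    state=$($MARSADM view-diskstate "$res") || return 1
    [ "$state" = "UpToDate" ] && return 0
    n=$((n + 1))
    sleep "$delay"
  done
  return 1
}

# Wait for the umount on the old primary to propagate, wait until the local
# replica is up to date, and only then switch directly (no "secondary" step).
planned_handover() {
  res=$1
  $MARSADM wait-umount "$res" || return 1
  wait_uptodate "$res" || return 1
  $MARSADM primary "$res"
}
```

\begin_layout Standard
Because the switch is only issued after both waits have succeeded, a failure of either wait leaves the old primary untouched, so the handover can still be aborted by re-mounting the device on host A.
\end_layout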
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
The most important difference to DRBD: don't use an intermediate
\family typewriter
marsadm secondary mydata
\family default
anywhere.
Although it would be possible, it has some
\emph on
disadvantages
\emph default
.
Always switch
\emph on
directly
\emph default
!
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
In contrast to DRBD, MARS remembers the designated primary, even when your
system crashes and reboots.
While in case of a crash you have to re-setup DRBD with commands like
\family typewriter
drbdadm up
\begin_inset Formula $\ldots$
\end_inset
; drbdadm primary
\begin_inset Formula $\ldots$
\end_inset
\family default
, MARS will automatically resume its former roles just by saying
\family typewriter
modprobe mars
\family default
.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Another fundamental difference to DRBD: when the network is healthy, there
can only exist
\emph on
one
\emph default
designated primary at a time (modulo some communication delays caused by
the
\begin_inset Quotes eld
\end_inset
eventually consistent
\begin_inset Quotes erd
\end_inset
communication model, see section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-Lamport-Clock"
\end_inset
).
By saying
\family typewriter
marsadm primary mydata
\family default
on host
\family typewriter
B
\family default
,
\series bold
all other
\series default
hosts (including
\family typewriter
A
\family default
) will
\series bold
automatically go into secondary role
\series default
after a while!
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
You simply
\emph on
don't need
\emph default
an intermediate
\family typewriter
marsadm secondary mydata
\family default
for planned handover!
\end_layout
\begin_layout Standard
Precondition for
\family typewriter
marsadm primary
\family default
is that you are up, that means in attached and connected state (cf.
\family typewriter
marsadm up
\family default
), and that any old primary (in this case
\family typewriter
A
\family default
) does not use its
\family typewriter
/dev/mars/mydata
\family default
device any longer, and that the network is healthy.
If some (parts of) logfiles are not yet (fully) transferred to the new
primary, you will need enough space on
\family typewriter
/mars/
\family default
at the target side.
If one of the preconditions described in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Operation-of-the"
\end_inset
is violated,
\family typewriter
marsadm primary
\family default
may refuse to start.
\end_layout
\begin_layout Standard
The preconditions try to protect you from doing silly things, such as accidental
ly provoking a split brain error state.
We try to avoid split brain as best as we can.
Therefore, we distinguish between
\emph on
intended
\emph default
and
\emph on
emergency
\emph default
switching.
Intended switching will try to avoid split brain
\emph on
as best as it can
\emph default
.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Don't
\emph on
rely
\emph default
on split brain avoidance, in particular when scripting any higher-level
applications such as cluster managers (cf.
section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Scripting-HOWTO"
\end_inset
).
\family typewriter
marsadm
\family default
does its best, but at least in case of (unnoticed) network outages / partitions
(or
\emph on
extremely, really extremely
\emph default
slow / overloaded networks), an attempt to become
\family typewriter
UpToDate
\family default
may fail.
If you want to
\emph on
ensure
\emph default
that no split brain can result from intended primary switching, please
 obey the best practices from above, and please give the 
\family typewriter
primary
\family default
command only after your secondary is
\emph on
known
\begin_inset Foot
status open
\begin_layout Plain Layout
As noted in many places in this manual, checking this cannot be done by
looking at the local state of a single cluster node.
You have to check several nodes.
\family typewriter
marsadm
\family default
can only check the
\emph on
local
\emph default
node reliably!
\end_layout
\end_inset
\emph default
to be
\emph on
really
\emph default
\family typewriter
UpToDate
\family default
(see
\family typewriter
marsadm wait-cluster
\family default
and
\family typewriter
marsadm view
\family default
and other macros described in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Inspecting-the-State"
\end_inset
).
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
A
\emph on
very rough
\emph default
estimation of the time to become
\family typewriter
UpToDate
\family default
is displayed by
\family typewriter
marsadm view mydata
\family default
or other macros (e.g.
\family typewriter
view-replinfo
\family default
).
However, on very flaky networks, the estimate may not only fluctuate considerably,
 but also be inaccurate.
\end_layout
\begin_layout Subsubsection
Forced Switching
\begin_inset CommandInset label
LatexCommand label
name "sub:Forced-Switching"
\end_inset
\end_layout
\begin_layout Standard
In case the connection to the old primary is lost for whatever reason, we
just don't know anything about its
\emph on
current
\emph default
state (which may deviate from its
\emph on
last known
\emph default
state).
The following command sequence will skip many checks and tell your node
to become primary forcefully:
\end_layout
\begin_layout Itemize
\family typewriter
marsadm pause-fetch mydata
\end_layout
\begin_deeper
\begin_layout Itemize
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
notice that this is similar to
\family typewriter
drbdadm disconnect mydata
\family default
 as you are probably used to from DRBD.
For better compatibility with DRBD, you may use the alternate syntax
\family typewriter
marsadm disconnect mydata
\family default
instead.
However, there is a subtle difference to DRBD: DRBD will drop
\emph on
both
\emph default
sides of its single bi-directional connection and no longer try to re-connect
 from either side, while 
\family typewriter
pause-fetch
\family default
is equivalent to
\family typewriter
pause-fetch-local
\family default
, which instructs only the
\emph on
local
\emph default
host to stop fetching logfiles.
Other members of the cluster, including the former primary, are
\emph on
not
\emph default
instructed to do so.
They may continue fetching logfiles over their own private TCP connections,
potentially using many connections in parallel, and potentially even from
any
\emph on
other
\emph default
member of the resource, if they think they can get the data from there.
In order to instruct
\begin_inset Foot
status open
\begin_layout Plain Layout
Notice that not all such instructions may arrive at all sites when the network
is interrupted (or extremely slow).
\end_layout
\end_inset
\emph on
all
\emph default
members of the resource to stop fetching logfiles, you may use
\family typewriter
marsadm pause-fetch-global mydata
\family default
instead (cf section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Operation-of-the"
\end_inset
).
\end_layout
\end_deeper
\begin_layout Itemize
\family typewriter
marsadm primary mydata --force
\end_layout
\begin_deeper
\begin_layout Itemize
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
this is the forceful switchover.
Use
\family typewriter
--force
\family default
only if you know what you are doing!
\end_layout
\end_deeper
\begin_layout Itemize
\family typewriter
marsadm resume-fetch mydata
\end_layout
\begin_deeper
\begin_layout Itemize
As such, the new primary does not really need this, because primaries are
producing their own logfiles without need for fetching.
This is only to undo the previous
\family typewriter
pause-fetch
\family default
, in order to avoid future surprises when the new primary eventually
 changes to secondary mode again (possibly in the distant future), and you have
 forgotten that fetching had been switched off.
\end_layout
\end_deeper
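\begin_layout Standard
The forced command sequence above can be wrapped into a guarded shell function.
This is a hedged sketch: the function name forced_switch, the confirmation string and the MARSADM variable are hypothetical, and only the three marsadm commands are taken from the list above.
\end_layout

```shell
#!/bin/sh
# Hypothetical wrapper around the forced switchover sequence.
# MARSADM is an illustrative override hook, not a MARS feature.
MARSADM="${MARSADM:-marsadm}"

# Usage: forced_switch RESOURCE yes-i-accept-split-brain
forced_switch() {
  res=$1
  confirm=$2
  # Require an explicit confirmation so that the sequence cannot be
  # triggered by accident from a script or from the shell history.
  if [ "$confirm" != "yes-i-accept-split-brain" ]; then
    echo "refusing: forced switching is likely to provoke a split brain" >&2
    return 2
  fi
  $MARSADM pause-fetch "$res" || return 1
  $MARSADM primary "$res" --force || return 1
  # Undo the pause-fetch so that fetching works again if this node
  # becomes secondary later.
  $MARSADM resume-fetch "$res"
}
```

\begin_layout Standard
The guard does not make the operation any safer; it only reduces the chance of issuing it unintentionally.
\end_layout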
\begin_layout Standard
When using
\family typewriter
--force
\family default
, many precondition checks and other internal checks are skipped, and in
particular the internal handover protocol for split brain avoidance.
\end_layout
\begin_layout Standard
Therefore, use of
\family typewriter
--force
\family default
is
\emph on
likely
\emph default
to
\series bold
provoke a split brain
\series default
.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
\series bold
Split brain
\series default
is always an
\series bold
erroneous state
\series default
 which should never be entered deliberately! Once you have entered it
 accidentally, you 
\series bold
must
\series default
resolve it ASAP (see section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Split-Brain-Resolution"
\end_inset
), otherwise you cannot operate your resource in the long term.
\end_layout
\begin_layout Standard
In order to prevent you from giving an accidental 
\family typewriter
--force
\family default
, the precondition is different:
\family typewriter
--force
\family default
works only in
\emph on
locally disconnected
\emph default
state.
This is similar to DRBD.
\end_layout
\begin_layout Standard
Remember:
\family typewriter
marsadm primary
\family default
without
\family typewriter
--force
\family default
tries to prevent split brain as best as it can.
Use of the
\family typewriter
--force
\family default
option will almost
\emph on
certainly
\emph default
provoke a split brain, at least if the old primary continues to operate
on its local
\family typewriter
/dev/mars/mydata
\family default
device.
Therefore, you are
\series bold
strongly advised
\series default
to do this
\series bold
only
\series default
after
\end_layout
\begin_layout Enumerate
\family typewriter
marsadm primary
\family default
without
\family typewriter
--force
\family default
has failed
\emph on
for no good reason
\emph default
\begin_inset Foot
status open
\begin_layout Plain Layout
Most reasons will be displayed by
\family typewriter
marsadm
\family default
when it is rejecting the switchover.
\end_layout
\end_inset
, and
\end_layout
\begin_layout Enumerate
You are sure you
\emph on
really
\emph default
want to switch, even when that eventually leads to a split brain.
You also declare that you are willing to do
\emph on
manual
\emph default
split-brain resolution as described in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Split-Brain-Resolution"
\end_inset
, or even destruction / reconstruction of a damaged node as described in
section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Final-Destroy-of"
\end_inset
.
\end_layout
\begin_layout Standard
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Notice: in case of
\emph on
connection loss
\emph default
(e.g.
networking problems / network partitions), you may not be able to reliably
detect whether a split brain actually resulted, or not.
\end_layout
\begin_layout Paragraph
Some Background
\end_layout
\begin_layout Standard
MARS handles split brain situations differently from DRBD.
When two primaries are accidentally active at the same time, each of them
writes into different logfiles
\family typewriter
/mars/resource-mydata/log-000000001-A
\family default
and
\family typewriter
/mars/resource-mydata/log-000000001-B
\family default
where the
\emph on
origin
\emph default
host is always recorded in the filename.
Therefore, both nodes
\emph on
can theoretically
\emph default
run in primary mode independently from each other, at least for some time.
They
\emph on
might
\emph default
even
\family typewriter
log-rotate
\family default
independently from each other.
However, this is really not a good idea.
The replication to third nodes will likely get stuck, and your
\family typewriter
/mars/
\family default
filesystem(s) will eventually run out of space.
Any further secondary node (when having
\begin_inset Formula $k>2$
\end_inset
replicas) will certainly get into serious problems: it simply does not
know which split-brain version it should follow.
Therefore, your redundancy will certainly no longer be up to date.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
\family typewriter
marsadm secondary
\family default
is
\emph on
strongly discouraged
\emph default
.
It tells the whole cluster that
\emph on
nobody
\emph default
is designated as primary any more.
\emph on
All
\emph default
nodes should go into secondary mode, globally.
In the current version of MARS, the secondaries will no longer fetch any
logfiles, since they don't know which version is the
\begin_inset Quotes eld
\end_inset
right
\begin_inset Quotes erd
\end_inset
one.
Syncing is also not possible.
When the device
\family typewriter
/dev/mars/mydata
\family default
is in use somewhere, it will remain in
\emph on
actual
\emph default
primary mode during that time.
As soon as the local
\family typewriter
/dev/mars/mydata
\family default
is released, the node will
\emph on
actually
\emph default
go into secondary mode if it is no longer designated as primary.
You should avoid this situation in advance by always 
\emph on
directly
\emph default
switching over from one primary to another one, without intermediate
\family typewriter
secondary
\family default
command.
This is different from DRBD.
\end_layout
\begin_layout Standard
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Split brain situations are detected
\emph on
passively
\emph default
by secondaries.
Whenever a secondary detects that somewhere a split brain has happened,
it refuses to replay any logfiles behind the split point (and also to fetch
them when possible), or anywhere where something appears suspect or ambiguous.
This tries to keep its local disk state always consistent, although outdated
 with respect to any of the split brain versions.
As a consequence, becoming primary may be impossible, because it cannot
always know which logfiles are the correct ones to replay before
\family typewriter
/dev/mars/mydata
\family default
can appear.
The ambiguity must be resolved first.
\end_layout
\begin_layout Standard
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
If you
\emph on
really
\emph default
need the local device
\family typewriter
/dev/mars/mydata
\family default
to disappear
\emph on
everywhere
\emph default
in a split brain situation, you don't need a
\emph on
strongly discouraged
\emph default
\family typewriter
marsadm secondary
\family default
command for this.
\family typewriter
marsadm detach
\family default
or
\family typewriter
marsadm down
\family default
 can do that as well, without destroying knowledge about the formerly designated
primary.
\end_layout
\begin_layout Subsection
Split Brain Resolution
\begin_inset CommandInset label
LatexCommand label
name "sub:Split-Brain-Resolution"
\end_inset
\end_layout
\begin_layout Standard
Split brain can naturally occur during a long-lasting network outage (aka
 network partition) when you (forcefully) switch primaries in between, or
due to final loss of your old primary node (fatal node crash) when not
all logfile data had been transferred immediately before the final crash.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
Remember that split brain is an
\series bold
erroneous state
\series default
which must be resolved as soon as possible!
\end_layout
\begin_layout Standard
Whenever split brain occurs for whatever reason, you have two choices for
resolution: either destroy one of your versions, or retain it under a different
resource name.
\end_layout
\begin_layout Standard
In either case, do the following steps ASAP:
\end_layout
\begin_layout Enumerate
\series bold
Manually
\series default
check which (surviving) version is the
\begin_inset Quotes eld
\end_inset
right
\begin_inset Quotes erd
\end_inset
one.
Any error is up to you: destroying the wrong version is
\emph on
your
\emph default
fault, not the fault of MARS.
\end_layout
\begin_layout Enumerate
If you did not already switch your primary to the final destination determined
in the previous step, do it now (see description in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Forced-Switching"
\end_inset
).
Don't use an intermediate
\family typewriter
marsadm secondary
\family default
command (as known from DRBD):
\emph on
directly
\emph default
switch to the new designated primary!
\end_layout
\begin_layout Enumerate
On each node with a wrong version (which you don't want to retain) which had been
primary before, umount your
\family typewriter
/dev/mars/mydata
\family default
or otherwise stop using it (e.g.
stop iSCSI or other users of the device).
Wait until each of them has actually left primary state and until their
local logfile(s) have been fully written back to the underlying disk.
\end_layout
\begin_layout Enumerate
Wait until the network works again.
All your (surviving) cluster nodes
\emph on
must
\emph default
\begin_inset Foot
status open
\begin_layout Plain Layout
If you are a MARS expert and you really know what you are doing (in particular,
you can anticipate the effects of the Lamport clock and of the symlink
update protocol including the
\begin_inset Quotes eld
\end_inset
eventually consistent
\begin_inset Quotes erd
\end_inset
behaviour including the not-yet-consistent intermediate states, see sections
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-Lamport-Clock"
\end_inset
and
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-Symlink-Tree"
\end_inset
), you may deviate from this requirement.
\end_layout
\end_inset
be able to communicate with each other.
If that is not possible, or if it takes too long, you may fall back to
the method described in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Final-Destroy-of"
\end_inset
, but do this only as far as necessary.
\end_layout
\begin_layout Standard
The next steps are different for different use cases:
\end_layout
\begin_layout Paragraph
Destroying a Wrong Split Brain Version
\end_layout
\begin_layout Standard
Continue with the following steps on each cluster node where you
do not want to retain its split-brain version.
Preferably, start with the old
\begin_inset Quotes eld
\end_inset
wrong
\begin_inset Quotes erd
\end_inset
primaries first (see advice at the end of this section):
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{enumerate}
\backslash
setcounter{enumi}{4}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
item
\end_layout
\end_inset
\family typewriter
marsadm invalidate mydata
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
end{enumerate}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\noindent
When no split brain is reported anymore after that (via
\family typewriter
marsadm view all
\family default
), you are done.
You need to repeat this on other secondaries only when necessary.
\end_layout
\begin_layout Standard
In very rare cases when things are screwed up very heavily (e.g.
a partly destroyed
\family typewriter
/mars/
\family default
partition), you may try an alternate method described in appendix
\begin_inset CommandInset ref
LatexCommand ref
reference "chap:Alternative-Methods-for"
\end_inset
.
\end_layout
\begin_layout Paragraph
Keeping a Split Brain Version
\end_layout
\begin_layout Standard
On those cluster node(s) where you want to retain the version (e.g.
for inspection purposes):
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{enumerate}
\backslash
setcounter{enumi}{4}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
item
\end_layout
\end_inset
\family typewriter
marsadm leave-resource mydata
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
item
\end_layout
\end_inset
After having done this on
\emph on
all
\emph default
those cluster nodes, check that the split brain is gone (e.g.
by saying
\family typewriter
marsadm view mydata
\family default
), as documented above.
In very rare cases, you might also need a
\family typewriter
log-purge-all
\family default
(see page
\begin_inset CommandInset ref
LatexCommand pageref
reference "log-purge-all$res"
\end_inset
).
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
item
\end_layout
\end_inset
Check that each underlying local disk
\family typewriter
/dev/lv-x/mydata
\family default
is really usable afterwards, e.g.
by test-mounting it (or
\family typewriter
fsck
\family default
if you can afford it).
If all is OK, don't forget to umount it before proceeding with the next
step.
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
item
\end_layout
\end_inset
Create a completely new MARS resource out of the underlying disk
\family typewriter
/dev/lv-x/mydata
\family default
having a different name, such as
\family typewriter
mynewdata
\family default
(see description in section
\begin_inset CommandInset ref
LatexCommand vref
reference "sec:Creating-and-Maintaining"
\end_inset
).
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
end{enumerate}
\end_layout
\end_inset
\end_layout
\begin_layout Paragraph
Keeping a Good Version
\end_layout
\begin_layout Standard
When you had a secondary which did not participate in the split brain, but
just got confused and therefore stopped replaying logfiles immediately
before the split-brain point, it may very well happen
\begin_inset Foot
status open
\begin_layout Plain Layout
In general, such a
\begin_inset Quotes eld
\end_inset
good
\begin_inset Quotes erd
\end_inset
behaviour cannot be guaranteed for all secondaries.
Race conditions in complex networks may asynchronously transfer
\begin_inset Quotes eld
\end_inset
wrong
\begin_inset Quotes erd
\end_inset
logfile data to a secondary much earlier than conflicting
\begin_inset Quotes eld
\end_inset
good
\begin_inset Quotes erd
\end_inset
logfile data which will be marked
\begin_inset Quotes eld
\end_inset
good
\begin_inset Quotes erd
\end_inset
only in the
\emph on
future.
\emph default
It is impossible to predict this in advance.
\end_layout
\end_inset
that you don't need to do any action for it.
When all wrong versions have disappeared from the cluster (by
\family typewriter
invalidate
\family default
or
\family typewriter
leave-resource
\family default
as described before), the confusion should be over, and the secondary should
automatically resume tracking of the new unique version.
\end_layout
\begin_layout Standard
Please check that
\emph on
all
\emph default
of your secondaries are no longer stuck.
You need to execute split brain resolution only for
\emph on
stuck
\emph default
nodes.
\end_layout
\begin_layout Standard
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Hint / advice for
\begin_inset Formula $k>2$
\end_inset
replicas: it is a good idea to start split brain resolution
\emph on
first
\emph default
with those (few) nodes which had been (accidentally) primary before, but
are not the new designated primary.
Usually, you had 2 primaries during split brain, so this will apply only
to
\emph on
one
\emph default
of them.
Leave the other one intact, by not umounting
\family typewriter
/dev/mars/mydata
\family default
at all, and keeping your applications running.
This also applies during emergency mode, see section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Emergency-Mode"
\end_inset
.
\emph on
First
\emph default
resolve the problem of the
\begin_inset Quotes eld
\end_inset
wrong
\begin_inset Quotes erd
\end_inset
primary(s) via
\family typewriter
invalidate
\family default
or
\family typewriter
leave-resource
\family default
.
Wait for a short while.
Then check the rest of your secondaries, whether they now are already following
the new (unique) primary, and finally check whether the split brain warning
reported by
\family typewriter
marsadm view all
\family default
is gone everywhere.
This way, you can often skip unnecessary invalidations of replicas.
\end_layout
\begin_layout Subsection
Final Destruction of a Damaged Node
\begin_inset CommandInset label
LatexCommand label
name "sub:Final-Destroy-of"
\end_inset
\end_layout
\begin_layout Standard
When a node has eventually died, do the following steps ASAP:
\end_layout
\begin_layout Enumerate
\emph on
Physically
\emph default
remove the dead node from your network.
Unplug all network cables! Failing to do so might provoke a disaster in
case it somehow resurrects in an uncontrolled manner, such as a partly-damaged
\family typewriter
/mars/
\family default
filesystem, a half-defective kernel, RAM / kernel memory corruption, disk
corruption, or whatever.
Don't risk any such unpredictable behaviour!
\end_layout
\begin_layout Enumerate
\series bold
Manually
\series default
check which of the surviving versions will be the
\begin_inset Quotes eld
\end_inset
right
\begin_inset Quotes erd
\end_inset
one.
Any error is up to you: resurrecting an unnecessarily old / outdated version
and/or destroying the newest / best version is
\emph on
your
\emph default
fault, not the fault of MARS.
\end_layout
\begin_layout Enumerate
If you did not already switch your primary to the final destination determined
in the previous step, do it now (see description in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Forced-Switching"
\end_inset
).
\end_layout
\begin_layout Enumerate
On a surviving node, but preferably
\emph on
not
\emph default
the new designated primary, give the following commands:
\end_layout
\begin_deeper
\begin_layout Enumerate
\family typewriter
marsadm --host=your-damaged-host down mydata
\end_layout
\begin_layout Enumerate
\family typewriter
marsadm --host=your-damaged-host leave-resource mydata
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
Check for misspellings, in particular the hostname of the dead node, and
check the command syntax before typing return! Otherwise, you may forcefully
destroy the wrong node!
\end_layout
\end_deeper
\begin_layout Enumerate
In case any of the previous commands should fail (which is rather likely),
repeat it with an additional
\family typewriter
--force
\family default
option.
Don't use
\family typewriter
--force
\family default
in the first place, always try without it first!
\end_layout
\begin_layout Enumerate
Repeat the same with
\emph on
all
\emph default
resources which were formerly present at
\family typewriter
your-damaged-host
\family default
.
\end_layout
\begin_layout Enumerate
Finally, say
\family typewriter
marsadm --host=your-damaged-host leave-cluster
\family default
(optionally augmented with
\family typewriter
--force
\family default
).
\end_layout
\begin_layout Standard
Now your surviving nodes should
\emph on
believe
\emph default
that the old node
\family typewriter
your-damaged-host
\family default
no longer exists, and that it no longer participates in any resource.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
Even if your dead node comes to life again in some way: always ensure that
the mars kernel module cannot run any more.
\emph on
Never
\emph default
do a
\family typewriter
modprobe mars
\family default
on a node marked as dead this way!
\end_layout
\begin_layout Standard
Further instructions for complicated cases are in appendix
\begin_inset CommandInset ref
LatexCommand ref
reference "chap:Alternative-De--and"
\end_inset
and
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Cleanup-in-case"
\end_inset
.
\end_layout
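\begin_layout Standard
The command sequence above can be sketched as a small Python helper.
This is only an illustrative sketch under stated assumptions, not part of MARS: the function names
\family typewriter
removal_commands
\family default
and
\family typewriter
run_with_force_fallback
\family default
are hypothetical, the position of
\family typewriter
--force
\family default
on the command line is an assumption (the text only says to add it), and the
\family typewriter
run
\family default
callable is supplied by the caller, so the sketch itself executes nothing.
\end_layout

```python
from typing import Callable, List, Sequence

Command = List[str]


def removal_commands(dead_host: str, resources: Sequence[str]) -> List[Command]:
    """Build the marsadm command lines for removing a dead host, in order."""
    host_opt = f"--host={dead_host}"
    commands: List[Command] = []
    for res in resources:
        # Per resource: first "down", then "leave-resource".
        commands.append(["marsadm", host_opt, "down", res])
        commands.append(["marsadm", host_opt, "leave-resource", res])
    # leave-cluster comes last, after all resources have been handled.
    commands.append(["marsadm", host_opt, "leave-cluster"])
    return commands


def run_with_force_fallback(cmd: Command, run: Callable[[Command], int]) -> Command:
    """Try cmd without --force first; retry with --force only on failure.

    Returns the command line that succeeded, raises RuntimeError otherwise.
    """
    if run(cmd) == 0:
        return cmd
    # Assumption: --force is placed right after the --host option.
    forced = cmd[:2] + ["--force"] + cmd[2:]
    if run(forced) == 0:
        return forced
    raise RuntimeError(f"command failed even with --force: {forced}")
```

\begin_layout Standard
For a dry run, pass a function as
\family typewriter
run
\family default
that only prints the command and returns 0.
Passing
\family typewriter
subprocess.call
\family default
would execute the commands for real, so double-check the hostname first.
\end_layout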
\begin_layout Subsection
Online Resizing during Operation
\end_layout
\begin_layout Standard
You should have LVM or some other means of increasing the physical size
of your disk (e.g.
via firmware of some RAID controllers).
The network must be healthy.
Do the following steps:
\end_layout
\begin_layout Enumerate
Increase your local disks (usually
\family typewriter
/dev/vg/mydata
\family default
)
\emph on
everywhere
\emph default
in the whole cluster.
In order to avoid wasting space, increase them
\emph on
uniformly
\emph default
to the same size (when possible).
The
\family typewriter
lvresize
\family default
tool is documented elsewhere.
\end_layout
\begin_layout Enumerate
Check that all MARS switches are on.
If not, say
\family typewriter
marsadm up mydata
\family default
everywhere.
\end_layout
\begin_layout Enumerate
At the primary:
\family typewriter
marsadm resize mydata
\end_layout
\begin_layout Enumerate
If you have intermediate layers such as iSCSI, you may need some
\family typewriter
iscsiadm
\family default
update or other command.
\end_layout
\begin_layout Enumerate
Now you may increase your filesystem.
This is specific for the filesystem type and documented elsewhere.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Hint: the secondaries will start syncing the newly added part of the underlying
primary disk.
In many cases, this is not really needed, because the new junk data just
does not matter.
If you are sure and if you know what you are doing, you may use
\family typewriter
marsadm fake-sync mydata
\family default
to abort such unnecessary traffic.
\end_layout
\begin_layout Section
The State of MARS
\begin_inset CommandInset label
LatexCommand label
name "sec:The-State-of"
\end_inset
\end_layout
\begin_layout Standard
In general, MARS tries to
\emph on
hide
\emph default
any network failures from you as best as it can.
After a network problem, MARS
\emph on
transparently
\emph default
tries to re-open any internal low-level socket connections ASAP, without need for sysadmin intervention.
In contrast to DRBD, network failures will
\emph on
not
\emph default
automatically alter the state of MARS, such as switching to
\family typewriter
disconnected
\family default
after a
\family typewriter
ko_timeout
\family default
or similar.
From a high-level sysadmin viewpoint, communication may just take a very
long time to succeed.
\end_layout
\begin_layout Standard
When the behaviour of MARS is different from DRBD, it is usually intended
as a feature.
\end_layout
\begin_layout Standard
MARS is not only an
\series bold
asynchronous
\series default
system at block IO level, but also
\series bold
at control level
\series default
.
\end_layout
\begin_layout Standard
This is
\emph on
necessary
\emph default
because in a widely distributed long-distance system running on slow or
even temporarily failing networks, actions may take a long time, and there
may be many actions
\series bold
started in parallel
\series default
.
\end_layout
\begin_layout Standard
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Synchronous concepts are generally not sufficient for expressing that.
Because of inherent asynchronicity and of dynamic creation / joining of
resources, it is neither possible to comprehensively depict a complex distributed
MARS system, nor a comprehensive standalone snippet of MARS, as a finite
state transition diagram
\begin_inset Foot
status open
\begin_layout Plain Layout
Probably it could be possible to formally model MARS as a Petri net.
However, complete Petri nets tend to become very complex, and to
describe lots of low-level details.
Expressing hierarchy, in a top-down fashion, is cumbersome.
We see no point in trying to do so.
\end_layout
\end_inset
.
\end_layout
\begin_layout Standard
Although MARS tries to
\emph on
approximate
\emph default
/
\emph on
emulate
\emph default
the synchronous control behaviour of DRBD at the interface level (
\family typewriter
marsadm
\family default
) in many situations as best as it can, the
\emph on
internal
\emph default
control model is necessarily asynchronous.
As an experienced sysadmin, you will be curious how it works in principle.
When you know something about it, you will no longer be surprised when
some (detail) behaviour is different from DRBD.
\end_layout
\begin_layout Standard
The general principle is an asynchronous 2-edge handshake protocol, which
is used almost everywhere in MARS:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/handshake.fig
width 80col%
\end_inset
\end_layout
\begin_layout Standard
We have a binary todo switch, which can be either in state
\begin_inset Quotes eld
\end_inset
on
\begin_inset Quotes erd
\end_inset
or
\begin_inset Quotes eld
\end_inset
off
\begin_inset Quotes erd
\end_inset
.
In addition, we have an actual response indicator, which is similar to
an LED indicating the actual status.
In our example, we imagine that both are used for controlling a big ventilator,
having a huge inert mass.
Imagine a big machine from a power plant, which is as tall as a human.
\end_layout
\begin_layout Standard
We start in a situation where the binary switch is off, and the ventilator
is stopped.
At point 1, we turn on the switch.
At that moment, a big contactor will sound like
\begin_inset Quotes eld
\end_inset
zonggg
\begin_inset Quotes erd
\end_inset
, and a big motor will start to hum.
At first you won't hear anything else.
It will take a while, say 1 minute, until the big wheel will have reached
its final operating RPM, due to the huge inert mass.
During that spin-up, the lights in your room will become slightly darker.
When having reached the full RPM at point 2, your workplace will then be
noisier, but in exchange your room lights will be back at ordinary strength,
and the actual response LED will light up in order to indicate that
the big fan is now operational.
\end_layout
\begin_layout Standard
Assume we want to turn the system off.
When turning the todo switch to
\begin_inset Quotes eld
\end_inset
off
\begin_inset Quotes erd
\end_inset
at point 3, first nothing will seem to happen at all.
The big wheel will keep spinning due to its heavy inert mass, and the RPM
as well as the sound will go down only slowly.
During spin-down, the actual response LED will stay illuminated, in order
to warn you that you should not touch the wheel, otherwise you may get
injured
\begin_inset Foot
status open
\begin_layout Plain Layout
Notice that it is only safe to access the wheel when
\emph on
both
\emph default
the switch and the LED are off.
Conversely, if at least one of them is on, something is going on inside
the machine.
Transferred to MARS: always look at
\emph on
both
\emph default
the todo switch and the corresponding actual indicator in order to not miss
something.
\end_layout
\end_inset
.
The LED will only go off after, say, 2 minutes, when the wheel has actually
stopped at point 4.
After that, the cycle may potentially start over again.
\end_layout
\begin_layout Standard
As you can see, all four possible combinations of the two boolean values
(their Cartesian product) occur in the diagram.
\end_layout
\begin_layout Standard
The same handshake protocol is used in MARS for communication between userspace
and kernelspace, as well as for communication in the widely distributed
system.
\end_layout
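\begin_layout Standard
The ventilator example can be written down as a minimal Python sketch.
It is not MARS code: the class
\family typewriter
Handshake
\family default
and its method names are hypothetical and only model one todo switch together with one actual indicator.
\end_layout

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Handshake:
    """One todo switch (the request) and one actual indicator (the response)."""
    todo: bool = False
    actual: bool = False

    def set_todo(self, value: bool) -> None:
        # Points 1 and 3: flipping the switch returns immediately;
        # the actual indicator does not change yet.
        self.todo = value

    def work_completed(self) -> None:
        # Points 2 and 4: the slow side reports that it has caught up.
        self.actual = self.todo

    def safe_to_touch(self) -> bool:
        # Nothing is going on only when BOTH switch and indicator are off.
        return not self.todo and not self.actual


def full_cycle() -> List[Tuple[bool, bool]]:
    """Walk through points 1 to 4 and record (todo, actual) after each."""
    hs = Handshake()
    states = []
    hs.set_todo(True)       # point 1: switch on, fan still spinning up
    states.append((hs.todo, hs.actual))
    hs.work_completed()     # point 2: full RPM reached, LED on
    states.append((hs.todo, hs.actual))
    hs.set_todo(False)      # point 3: switch off, wheel still spinning
    states.append((hs.todo, hs.actual))
    hs.work_completed()     # point 4: wheel stopped, LED off
    states.append((hs.todo, hs.actual))
    return states
```

\begin_layout Standard
The list returned by
\family typewriter
full_cycle()
\family default
contains each of the four combinations exactly once, and
\family typewriter
safe_to_touch()
\family default
is true only at the start and after point 4.
\end_layout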
\begin_layout Section
Inspecting the State of MARS
\begin_inset CommandInset label
LatexCommand label
name "sec:Inspecting-the-State"
\end_inset
\end_layout
\begin_layout Standard
The main command for viewing the current state of MARS is
\end_layout
\begin_layout Itemize
\family typewriter
marsadm view mydata
\end_layout
\begin_layout Standard
or its more specialized variant
\end_layout
\begin_layout Itemize
\family typewriter
marsadm view-
\emph on
$macroname
\emph default
mydata
\end_layout
\begin_layout Standard
where
\family typewriter
\emph on
$macroname
\family default
\emph default
is one of the macros described in chapter
\begin_inset CommandInset ref
LatexCommand ref
reference "chap:The-Macro-Processor"
\end_inset
, or a macro which has been written by yourself.
\end_layout
\begin_layout Standard
As always, you may replace the resource name
\family typewriter
mydata
\family default
with the special keyword
\family typewriter
all
\family default
in order to get the state of all locally joined resources, as well as a
list of all those resources.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
When using the variant
\family typewriter
marsadm view all
\family default
, additionally the global communication status will be displayed.
This helps humans in diagnosing problems.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Hint: use the compound command
\family typewriter
watch marsadm view all
\family default
for continuous display of the current state of MARS.
When starting this side-by-side in
\family typewriter
ssh
\family default
terminal windows for all your cluster nodes, you can easily watch what's
going on in the whole cluster.
\end_layout
\begin_layout Chapter
Basic Working Principle
\end_layout
\begin_layout Standard
Even if you are impatient, please read this chapter.
At the
\emph on
surface
\emph default
, MARS appears to be very similar to DRBD.
It almost looks like a drop-in replacement for DRBD.
\end_layout
\begin_layout Standard
If you take this similarity at face value, you could easily step into some trivial pitfalls,
because the internal working principle of MARS is totally different from
DRBD.
Please forget (almost) everything you already know about the internal working
principles of DRBD, and look at the very different working principles of
MARS.
\end_layout
\begin_layout Section
The Transaction Logger
\begin_inset CommandInset label
LatexCommand label
name "sec:The-Transaction-Logger"
\end_inset
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/MARS_Data_Flow.pdf
lyxscale 60
width 100text%
\end_inset
\end_layout
\begin_layout Standard
\noindent
The basic idea of MARS is to record all changes made to your block device
in a so-called
\series bold
transaction logfile
\series default
.
\emph on
Any
\emph default
write request is treated like a transaction which changes the contents
of your block device.
\end_layout
\begin_layout Standard
This is similar in concept to some database systems, but there exists no
separate
\begin_inset Quotes eld
\end_inset
commit
\begin_inset Quotes erd
\end_inset
operation:
\emph on
any
\emph default
write request is acting like a commit.
\end_layout
\begin_layout Standard
The picture shows the flow of write requests.
Let's start with the primary node.
\end_layout
\begin_layout Standard
Upon submission of a write request on
\family typewriter
/dev/mars/mydata
\family default
, it is first buffered in a
\emph on
temporary
\emph default
memory buffer.
\end_layout
\begin_layout Standard
The temporary memory buffer serves multiple purposes:
\end_layout
\begin_layout Itemize
It keeps track of the order of write operations.
\end_layout
\begin_layout Itemize
Additionally, it keeps track of the positions in the underlying disk
\family typewriter
/dev/lv-x/mydata
\family default
.
In particular, it detects when the same block is overwritten multiple times.
\end_layout
\begin_layout Itemize
During pending write operation, any concurrent reads are served from the
memory buffer.
\end_layout
\begin_layout Standard
After the write has been buffered in the temporary memory buffer, the main
logger thread of the transaction logger creates a so-called
\emph on
log entry
\emph default
and starts an
\begin_inset Quotes eld
\end_inset
append
\begin_inset Quotes erd
\end_inset
operation on the transaction logfile.
The log entry contains vital information such as the logical block number
in the underlying disk, the length of the data, a timestamp, some header
magic in order to detect corruption, the log entry sequence number, of
course the data itself, and optional information like a checksum or compression
information.
\end_layout
\begin_layout Standard
Once the log entry has been written through to the
\family typewriter
/mars/
\family default
filesystem via fsync(), the application waiting for the write operation
at
\family typewriter
/dev/mars/mydata
\family default
is signalled that the write was successful.
\end_layout
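\begin_layout Standard
The write path described above can be illustrated with a simplified in-memory Python sketch.
All names here (
\family typewriter
LogEntry
\family default
,
\family typewriter
TransactionLogger
\family default
,
\family typewriter
HEADER_MAGIC
\family default
) are hypothetical, the field layout is not the real on-disk format, CRC32 is merely one possible checksum, and a plain Python list stands in for the logfile that MARS writes through to stable storage.
\end_layout

```python
import zlib
from dataclasses import dataclass
from typing import Dict, List

HEADER_MAGIC = 0x4D415253  # hypothetical value, not the real on-disk magic


@dataclass(frozen=True)
class LogEntry:
    magic: int        # header magic for detecting corruption
    seq: int          # log entry sequence number
    timestamp: float
    block: int        # logical block number in the underlying disk
    length: int       # length of the data
    data: bytes       # the data itself
    checksum: int     # optional information, here a CRC32


class TransactionLogger:
    def __init__(self) -> None:
        self.buffer: Dict[int, bytes] = {}  # temporary memory buffer
        self.log: List[LogEntry] = []       # stands in for the logfile
        self.next_seq = 1

    def write(self, block: int, data: bytes, timestamp: float) -> int:
        """Buffer the write, append a log entry, then acknowledge it."""
        self.buffer[block] = data
        entry = LogEntry(HEADER_MAGIC, self.next_seq, timestamp,
                         block, len(data), data, zlib.crc32(data))
        # In the real system the entry must be on stable storage
        # before the application is told that the write succeeded.
        self.log.append(entry)
        self.next_seq += 1
        return entry.seq

    def read(self, block: int, disk: Dict[int, bytes]) -> bytes:
        """Serve reads from the buffer while a write is still pending."""
        if block in self.buffer:
            return self.buffer[block]
        return disk[block]
```

\begin_layout Standard
Note that
\family typewriter
write()
\family default
returns without touching the underlying disk at all: the acknowledgement depends only on the log entry, and writeback happens later.
\end_layout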
\begin_layout Standard
This may happen even
\emph on
before
\emph default
the writeback to the underlying disk
\family typewriter
/dev/lv-x/mydata
\family default
has started.
Even when you power off the system right now, the information is not lost:
it is present in the logfile, and can be reconstructed from there.
\end_layout
\begin_layout Standard
Notice that the order of log records present in the transaction log defines
a total order among the write requests which is
\emph on
compatible
\emph default
to the partial order of write requests issued on
\family typewriter
/dev/mars/mydata
\family default
.
\end_layout
\begin_layout Standard
Also notice that despite its sequential nature, the transaction logfile
is typically
\emph on
not
\emph default
the performance bottleneck of the system: since appending to a logfile
is almost purely sequential IO, it runs much faster than random IO on typical
datacenter workloads.
\end_layout
\begin_layout Standard
In order to reclaim the temporary memory buffer, its content must be written
back to the underlying disk
\family typewriter
/dev/lv-x/mydata
\family default
at some point.
After writeback, the temporary space is freed.
The writeback can do the following optimizations:
\end_layout
\begin_layout Enumerate
writeback may be in
\emph on
any
\emph default
order; in particular, it may be
\emph on
sorted
\emph default
according to ascending sector numbers.
This will reduce the average seek distances of magnetic disks in general.
\end_layout
\begin_layout Enumerate
when the same sector is overwritten multiple times, only the
\begin_inset Quotes eld
\end_inset
last
\begin_inset Quotes erd
\end_inset
version needs to be written back, skipping some intermediate versions.
\end_layout
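\begin_layout Standard
Both writeback optimizations can be sketched in a few lines of Python.
The function name
\family typewriter
plan_writeback
\family default
and the representation of pending writes as (sector, data) pairs in log order are assumptions made for illustration only.
\end_layout

```python
from typing import Dict, Iterable, List, Tuple


def plan_writeback(pending: Iterable[Tuple[int, bytes]]) -> List[Tuple[int, bytes]]:
    """Reduce pending writes (given in log order) to a writeback plan.

    Keeps only the last version of each sector and sorts the result
    by ascending sector number.
    """
    latest: Dict[int, bytes] = {}
    for sector, data in pending:
        # A later write to the same sector replaces the earlier one.
        latest[sector] = data
    # Sector numbers are unique keys, so sorting orders by sector only.
    return sorted(latest.items())
```

\begin_layout Standard
For example, three pending writes to sectors 7, 3 and again 7 result in only two disk writes, first to sector 3 and then to sector 7 with the newest data.
\end_layout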
\begin_layout Standard
In case the primary node crashes during writeback, it suffices to replay
the log entries from some point in the past until the end of the transaction
logfile.
It does no harm if you accidentally replay some log entries twice or even
more often: since the replay is in the original total order, any temporary
inconsistency is
\emph on
healed
\emph default
by the logfile application.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
In mathematics, the property that you can apply your logfile twice to your
data (or even as often as you want), is called
\series bold
idempotence
\series default
.
This is a very desirable property: it ensures that nothing goes wrong when
replaying
\begin_inset Quotes eld
\end_inset
too much
\begin_inset Quotes erd
\end_inset
/ starting your replay
\begin_inset Quotes eld
\end_inset
too early
\begin_inset Quotes erd
\end_inset
.
Idempotence is even more beneficial: in case anything should go wrong with
your data on your disk (e.g.
IO errors), replaying your logfile once more may
\begin_inset Foot
status open
\begin_layout Plain Layout
Miracles cannot be guaranteed, but
\emph on
higher chances
\emph default
and
\emph on
improvements
\emph default
can be expected (e.g.
better chances for
\family typewriter
fsck
\family default
).
\end_layout
\end_inset
even
\series bold
heal
\series default
some defects.
Good news for desperate sysadmins forced to work with flaky hardware!
\end_layout
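\begin_layout Standard
Idempotence of log replay can be demonstrated with a toy model in Python.
The dictionary standing in for the disk, the tuple layout of a log entry, and the function name
\family typewriter
replay
\family default
are assumptions for illustration, not the MARS implementation.
\end_layout

```python
from typing import Dict, List, Tuple

# (sequence number, sector, data), in the total order defined by the log
LogEntry = Tuple[int, int, bytes]


def replay(disk: Dict[int, bytes], log: List[LogEntry], start: int = 0) -> Dict[int, bytes]:
    """Apply log entries from index `start` to the end, in log order."""
    for _seq, sector, data in log[start:]:
        disk[sector] = data
    return disk


LOG: List[LogEntry] = [(1, 0, b"A"), (2, 1, b"B"), (3, 0, b"C")]

# State after a crash during out-of-order writeback: sector 0 already
# holds the newest version, while sector 1 has not been written at all.
crashed_disk: Dict[int, bytes] = {0: b"C"}

once = replay(dict(crashed_disk), LOG)
twice = replay(dict(once), LOG)
```

\begin_layout Standard
Here
\family typewriter
once
\family default
and
\family typewriter
twice
\family default
are equal: during the replay, sector 0 temporarily falls back to an older version, but the later log entry restores the newest one.
This only works when the replay starts early enough to cover all entries that had not yet been written back.
\end_layout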
\begin_layout Standard
The basic idea of the asynchronous replication of MARS is rather simple:
just transfer the logfiles to your secondary nodes, and replay them onto
their copy of the disk data (also called
\emph on
mirror
\emph default
) in the same order as the total order defined by the primary.
\end_layout
\begin_layout Standard
Therefore, a mirror of your data on any secondary may be outdated, but it
always corresponds to some version which was valid in the past.
This property is called
\series bold
anytime consistency
\begin_inset Foot
status open
\begin_layout Plain Layout
Your secondary nodes are always consistent in themselves.
Notice that this kind of consistency is a
\emph on
local
\emph default
consistency model.
There exists no global consistency in MARS.
Global consistency would be practically impossible in long-distance replication
where Einstein's law of the speed of light is limiting global consistency.
The front-cover picture showing the planets Earth and Mars tries to lead
your imagination away from global consistency models as used in
\begin_inset Quotes eld
\end_inset
DRBD Think(tm)
\begin_inset Quotes erd
\end_inset
, and tries to prepare you mentally for local consistency as in
\begin_inset Quotes eld
\end_inset
MARS Think(tm)
\begin_inset Quotes erd
\end_inset
.
\end_layout
\end_inset
.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
As you can see in the picture, the process of transferring the logfiles is
\emph on
independent
\emph default
from the process which replays the logfiles onto the data at some secondary
site.
Both processes can be switched on / off separately (see commands
\family typewriter
marsadm {dis,}connect
\family default
and
\family typewriter
marsadm {pause,resume}-replay
\family default
in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Operation-of-the"
\end_inset
).
This may be
\emph on
exploited
\emph default
: for example, you may replicate your logfiles as soon as possible (to protect
against catastrophic failures), but deliberately wait one hour until it
is replayed (under regular circumstances).
If your data inside your filesystem
\family typewriter
/mydata/
\family default
at the primary site is accidentally destroyed by
\family typewriter
rm -rf /mydata/
\family default
, you have an old copy at the secondary site.
This way, you can substitute
\emph on
some parts
\begin_inset Foot
status open
\begin_layout Plain Layout
Please note that MARS cannot
\emph on
fully
\emph default
substitute a backup system, because it can keep only
\emph on
physical
\emph default
copies, and does not create logical copies.
\end_layout
\end_inset
\emph default
of conventional backup functionality by MARS.
In case you need the current version, just replay in
\begin_inset Quotes eld
\end_inset
fast-forward
\begin_inset Quotes erd
\end_inset
mode (similar to old-fashioned video tapes).
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Future versions of MARS Full are planned to also allow
\begin_inset Quotes eld
\end_inset
fast-backward
\begin_inset Quotes erd
\end_inset
rewinding, of course at some cost.
\end_layout
\begin_layout Section
The Lamport Clock
\begin_inset CommandInset label
LatexCommand label
name "sec:The-Lamport-Clock"
\end_inset
\end_layout
\begin_layout Standard
MARS is always
\emph on
asynchronously
\emph default
communicating in the distributed system on
\emph on
any
\emph default
topics, even strategic decisions.
\end_layout
\begin_layout Standard
If there were a
\emph on
strict
\emph default
global consistency model, which would be roughly equivalent to a standalone
model, we would need
\emph on
locking
\emph default
in order to serialize conflicting requests.
It has been known for many decades that
\emph on
distributed locks
\emph default
do not only suffer from performance problems, but are also cumbersome
to get working reliably in scenarios where nodes or network links
may fail at any time.
\end_layout
\begin_layout Standard
Therefore, MARS uses a very different consistency model:
\series bold
Eventually Consistent
\series default
.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Notice that the network bottleneck problems described in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Network-Bottlenecks"
\end_inset
are
\emph on
demanding
\emph default
an
\begin_inset Quotes eld
\end_inset
eventually consistent
\begin_inset Quotes erd
\end_inset
model.
You have
\series bold
no chance
\series default
against natural laws, like Einstein's laws.
In order to cope with the problem area, you have to
\emph on
invest some additional effort
\emph default
.
Unfortunately, asynchronous communication models are more tricky to program
and to debug than simple strictly consistent models.
In particular, you
\emph on
have to cope with
\emph default
additional
\series bold
race conditions
\series default
\emph on
inherent
\emph default
\emph on
to
\emph default
the
\begin_inset Quotes eld
\end_inset
eventually consistent
\begin_inset Quotes erd
\end_inset
model.
In the face of the laws of the universe, motivate yourself by looking at
the graphics at the cover page: the planets are a
\emph on
symbol
\emph default
for what you have to do!
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Example: the asynchronous communication protocol of MARS leads to a different
behaviour from DRBD in case of
\series bold
network partitions
\series default
(temporary interruption of communication between some cluster nodes), because
MARS
\emph on
remembers
\emph default
the old state of remote nodes over long periods of time, while DRBD knows
absolutely nothing about its peers in disconnected state.
Sysadmins familiar with DRBD might find the following behaviour unusual:
\end_layout
\begin_layout Standard
\noindent
\align center
\size tiny
\begin_inset Tabular
<lyxtabular version="3" rows="6" columns="3">
<features rotate="0" tabularvalignment="middle">
<column alignment="left" valignment="top" width="0pt">
<column alignment="left" valignment="top" width="0pt">
<column alignment="left" valignment="top" width="0pt">
<row>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
Event
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
DRBD Behaviour
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
MARS Behaviour
\end_layout
\end_inset
</cell>
</row>
<row endhead="true">
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
1.
the network partitions
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
automatic disconnect
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
nothing happens, but replication lags behind
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
2.
on A:
\family typewriter
umount $device
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
works
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
works
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
3.
on A:
\family typewriter
{drbd,mars}adm secondary
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
works
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
works
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
4.
on B:
\family typewriter
{drbd,mars}adm primary
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
works, split brain happens
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
\size tiny
refused
\series default
because B believes that A is primary
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
5.
the network resumes
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
automatic connect attempt fails
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
communication automatically resumes
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\begin_layout Standard
\noindent
If you intentionally want to switch over (and to produce a split brain as
a side effect), the following variant must be used with MARS:
\end_layout
\begin_layout Standard
\noindent
\align center
\size tiny
\begin_inset Tabular
<lyxtabular version="3" rows="9" columns="3">
<features rotate="0" tabularvalignment="middle">
<column alignment="left" valignment="top" width="0pt">
<column alignment="left" valignment="top" width="0pt">
<column alignment="left" valignment="top" width="0pt">
<row>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
Event
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
DRBD Behaviour
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
MARS Behaviour
\end_layout
\end_inset
</cell>
</row>
<row endhead="true">
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
1.
the network partitions
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
automatic disconnect
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
nothing happens, but replication lags behind
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
2.
on A:
\family typewriter
umount $device
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
works
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
works
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
3.
on A:
\family typewriter
{drbd,mars}adm secondary
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
works
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
works (but
\emph on
not recommended!
\emph default
)
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
4.
on B:
\family typewriter
{drbd,mars}adm primary
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
split brain, but nobody knows
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
\size tiny
refused
\series default
because B believes that A is primary
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
5.
on B:
\family typewriter
marsadm disconnect
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
works, nothing happens
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
6.
on B:
\family typewriter
marsadm primary --force
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
-
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
works, split brain happens on B, but A doesn't know
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
7.
on B:
\family typewriter
marsadm connect
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
-
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
works, nothing happens
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
8.
the network resumes
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
automatic connect attempt fails
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size tiny
communication resumes, A now detects the split brain
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\begin_layout Standard
\noindent
In order to implement the consistency model
\begin_inset Quotes eld
\end_inset
eventually consistent
\begin_inset Quotes erd
\end_inset
, MARS uses a so-called Lamport
\begin_inset Foot
status open
\begin_layout Plain Layout
Published in the late 1970s by Leslie Lamport, also known as inventor of
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
LaTeX
\end_layout
\end_inset
.
\end_layout
\end_inset
clock.
MARS uses a special variant called
\begin_inset Quotes eld
\end_inset
physical Lamport clock
\begin_inset Quotes erd
\end_inset
.
\end_layout
\begin_layout Standard
The physical Lamport clock is another almost-realtime clock which
\emph on
can
\emph default
run independently from the Linux kernel system clock.
However, the Lamport clock tries to remain as near as possible to the system
clock.
\end_layout
\begin_layout Standard
Both clocks can be queried at any time via
\family typewriter
cat /proc/sys/mars/lamport_clock
\family default
.
The result will show both clocks in parallel, in units of seconds since
the Unix epoch, with nanosecond resolution.
\end_layout
\begin_layout Standard
When there are no network messages at all, both the system clock and the
Lamport clock will show almost the same time (except some minor differences
of a few nanoseconds resulting from the finite processor clock speed).
\end_layout
\begin_layout Standard
The physical Lamport clock works rather simply:
\emph on
any
\emph default
message on the network is augmented with a Lamport time stamp telling when
the message was
\emph on
sent
\emph default
according to the local Lamport clock of the sender.
Whenever such a message is received, the receiver checks whether the time
ordering relation would be violated: if the Lamport timestamp in the message
claims that the sender sent it
\emph on
after
\emph default
it arrived at the receiver (for example due to drift between their respective
local clocks), something must be wrong.
In this case, the local Lamport clock of the
\emph on
receiver
\emph default
is advanced to just after the sender's Lamport timestamp, such that the time
ordering relation is no longer violated.
\end_layout
\begin_layout Standard
As a consequence, any local Lamport clock may run ahead of the corresponding
local system clock.
In order to avoid accumulation of deltas between the Lamport and the system
clock, the Lamport clock will run slower after that, possibly until it
reaches the system clock again (if no other message arrives which sets
it forward again).
After having reached the system clock, the Lamport clock will continue
with
\begin_inset Quotes eld
\end_inset
normal
\begin_inset Quotes erd
\end_inset
speed.
\end_layout
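\begin_layout Standard
The clock rules described above can be illustrated by a small sketch.
It is only a toy model written in Python, not the MARS kernel code: the
class name, the method names and the choice of letting the clock creep forward
by one nanosecond per query while it is ahead of the system clock are assumptions
made for illustration.
\end_layout

```python
import time


class PhysicalLamportClock:
    """Toy model of a physical Lamport clock, in integer nanoseconds."""

    def __init__(self, system_clock=time.time_ns):
        self._system_clock = system_clock  # injectable, e.g. for experiments
        self._last = 0                     # last Lamport time handed out

    def now(self):
        """Return a Lamport time that never runs backwards.

        Normally this follows the system clock. While the Lamport clock
        is ahead, it only creeps forward by 1 ns per call (assumption),
        until the system clock has caught up again.
        """
        self._last = max(self._system_clock(), self._last + 1)
        return self._last

    def stamp_outgoing(self):
        """Timestamp to attach to a message that is about to be sent."""
        return self.now()

    def on_receive(self, sender_stamp):
        """Advance the local clock just past the sender's stamp if needed."""
        if sender_stamp >= self._last:
            # The message claims to come from our "future": jump forward,
            # so that the following now() is strictly after sender_stamp.
            self._last = sender_stamp
        return self.now()
```

\begin_layout Standard
In this model, a receiver whose system clock lags behind the sender jumps
forward on reception, then advances only minimally until its system clock
has caught up, and follows the system clock again thereafter.
\end_layout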
\begin_layout Standard
MARS uses the local Lamport clock for anything where other systems would
use the local system clock: for example, timestamp generation in the
\family typewriter
/mars/
\family default
filesystem.
Even symlinks created there are timestamped according to the Lamport clock.
Both the kernel module and the userspace tool
\family typewriter
marsadm
\family default
are always operating in the timescale of the Lamport clock.
Most importantly, all timestamp comparisons are always carried out with
respect to Lamport time.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Bigger differences between the Lamport and the system clock can be annoying
from a human point of view: when typing
\family typewriter
ls -l /mars/resource-mydata/
\family default
many timestamps may appear as if they were created in the
\begin_inset Quotes eld
\end_inset
future
\begin_inset Quotes erd
\end_inset
, because the
\family typewriter
ls
\family default
command compares the output formatting against the system clock (it does
not even know of the existence of the MARS Lamport clock).
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
Always use
\family typewriter
ntp
\family default
(or another clock synchronization service) in order to pre-synchronize
your system clocks as close as possible.
Bigger differences are not only annoying, but may lead some people to wrong
conclusions and therefore even lead to bad human decisions!
\end_layout
\begin_layout Standard
In a professional datacenter, you should use
\family typewriter
ntp
\family default
anyway, and you should monitor its effectiveness anyway.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Hint: many internal logfiles produced by the MARS kernel module contain
Lamport timestamps written as numerical values.
In order to convert them into human-readable form, use the command
\family typewriter
marsadm cat /mars/5.total.status
\family default
or similar.
\end_layout
\begin_layout Section
The Symlink Tree
\begin_inset CommandInset label
LatexCommand label
name "sec:The-Symlink-Tree"
\end_inset
\end_layout
\begin_layout Standard
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
The symlink tree as described here will be replaced by another representation
in future versions of MARS.
Therefore, don't do any scripting by directly accessing symlinks! Use the
primitive macros described in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Predefined-Trivial-Macros"
\end_inset
.
\end_layout
\begin_layout Standard
The current
\family typewriter
/mars/
\family default
filesystem container format contains not only transaction logfiles, but
also acts as a generic storage for (persistent) state information.
Both configuration information and runtime state information are currently
stored in symlinks.
Symlinks are
\begin_inset Quotes eld
\end_inset
misused
\begin_inset Foot
status open
\begin_layout Plain Layout
This means, the symlink targets need not be other files or directories,
but just any values like integers or strings.
\end_layout
\end_inset
\begin_inset Quotes erd
\end_inset
in order to represent some
\family typewriter
key -> value
\family default
pairs.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
It is not yet clear / decided, but there is a
\emph on
chance
\emph default
that the
\emph on
concept
\emph default
of
\family typewriter
key -> value
\family default
pairs will be retained in future versions of MARS.
Instead of being represented by symlinks, another representation will be
used, such that hopefully the
\family typewriter
key
\family default
part will remain in the form of a pathname, even if there were no longer
a physical representation in an actual filesystem.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
This is fundamentally different from DRBD: when your DRBD primary crashed
some time ago, and now comes up again, you have to set up DRBD again with
a sequence of commands like
\family typewriter
modprobe drbd; drbdadm up all; drbdadm primary all
\family default
or similar.
In contrast, MARS needs only
\family typewriter
modprobe mars
\family default
(after
\family typewriter
/mars/
\family default
has been mounted by
\family typewriter
/etc/fstab
\family default
).
The
\emph on
persistence
\emph default
of the symlinks residing in
\family typewriter
/mars/
\family default
will automatically remember your previous state, even if some of your resources
were primary while others were secondary (mixed operations).
You don't need to take any action in order to
\begin_inset Quotes eld
\end_inset
restore
\begin_inset Quotes erd
\end_inset
a previous state, no matter how
\begin_inset Quotes eld
\end_inset
complex
\begin_inset Quotes erd
\end_inset
it was.
\end_layout
\begin_layout Standard
(Almost) all symlinks appearing in the
\family typewriter
/mars/
\family default
directory tree are automatically replicated throughout the whole cluster,
provided that the cluster
\family typewriter
uuid
\family default
s are equal
\begin_inset Foot
status open
\begin_layout Plain Layout
This is protection against accidental
\begin_inset Quotes eld
\end_inset
merging
\begin_inset Quotes erd
\end_inset
of two unrelated clusters which had been created at different times with
different
\family typewriter
uuids
\family default
.
\end_layout
\end_inset
at all sites.
Thus the
\family typewriter
/mars/
\family default
directory forms some kind of
\emph on
global namespace
\emph default
.
\end_layout
\begin_layout Standard
In order to avoid name clashes, each pathname created at node A follows
a convention: the node name A should be a suffix of the pathname.
Typically, internal MARS names follow the scheme
\family typewriter
/mars/
\emph on
something
\emph default
/myname-A
\family default
.
When using the expert command
\family typewriter
marsadm {get,set}-link
\family default
(which will likely be replaced by something else in future MARS releases),
you should follow the best practice of systematically using pathnames like
\family typewriter
/mars/userspace/myname-A
\family default
or similar.
As a result, each node is automatically informed about the state of any
other node, such as B, when the corresponding information is recorded
on node B under the name
\family typewriter
/mars/userspace/myname-B
\family default
(context-dependent names).
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Experts only: the symlink replication works generically.
You might use the
\family typewriter
/mars/userspace/
\family default
directory in order to place your own symlink there (for whatever purpose,
which need not have to do with MARS).
However, the symlinks are likely to disappear.
Use
\family typewriter
marsadm {get,set}-link
\family default
instead.
There is a chance that these abstract commands (or variants thereof) will
be retained, by acting on the new data representation in future, even if
the old symlink format will vanish some day.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Important: the convention of placing the
\series bold
creator host name
\series default
inside your pathnames should be used wherever possible.
The name part is a kind of
\begin_inset Quotes eld
\end_inset
ownership indicator
\begin_inset Quotes erd
\end_inset
.
It is crucial that no host writes any symlink not
\begin_inset Quotes eld
\end_inset
belonging
\begin_inset Quotes erd
\end_inset
to it.
Other hosts may read foreign information as often as they want, but must never
modify it.
This way, your cluster nodes are able to
\emph on
communicate
\emph default
with each other via symlink / information updates.
\end_layout
\begin_layout Standard
Although experts might create (and change) the current symlinks with userspace
tools like
\family typewriter
ln -s
\family default
, you should use the following marsadm commands instead:
\end_layout
\begin_layout Itemize
\family typewriter
marsadm set-link myvalue /mars/userspace/mykey-A
\end_layout
\begin_layout Itemize
\family typewriter
marsadm delete-file /mars/userspace/mykey-A
\end_layout
\begin_layout Standard
There are many reasons for this: first, the
\family typewriter
marsadm set-link
\family default
command will automatically use the Lamport clock for symlink creation,
and therefore will avoid any errors resulting from a
\begin_inset Quotes eld
\end_inset
wrong
\begin_inset Quotes erd
\end_inset
system clock (as in
\family typewriter
ln -s
\family default
).
Second, the
\family typewriter
marsadm delete-file
\family default
(which also deletes symlinks) works on the
\emph on
whole cluster
\emph default
.
And finally, there is a chance that this will work in future versions of
MARS even after the symlinks have vanished.
\end_layout
\begin_layout Standard
What's the difference? If you try to remove your symlink locally by
hand via
\family typewriter
rm -f
\family default
, you will be surprised: since the symlink has been replicated to the other
cluster nodes, it will be re-transferred from there and will be resurrected
locally after some short time.
This way, you cannot delete any object reliably, because your whole cluster
(which may consist of many nodes) remembers all your state information
and will
\begin_inset Quotes eld
\end_inset
correct
\begin_inset Quotes erd
\end_inset
it whenever
\begin_inset Quotes eld
\end_inset
necessary
\begin_inset Quotes erd
\end_inset
.
\end_layout
\begin_layout Standard
In order to solve the deletion problem, MARS uses some internal deletion
protocol using auxiliary symlinks residing in
\family typewriter
/mars/todo-global/.
\family default
The deletion protocol ensures that all replicas get deleted in the whole
cluster, and only thereafter the auxiliary symlinks in
\family typewriter
/mars/todo-global/
\family default
are also deleted eventually.
\end_layout
\begin_layout Standard
You may update your already existing symlink via
\family typewriter
marsadm set-link some-other-value /mars/userspace/mykey-A
\family default
.
The new value will be propagated throughout the cluster according to a
\series bold
timestamp comparison protocol
\series default
: whenever node B notices that A has a
\emph on
newer
\emph default
version of some symlink (according to the Lamport timestamp), it will replace
its older version with the newer one.
The opposite does
\emph on
not
\emph default
work: if B notices that A has an older version, nothing happens.
This way, the timestamps of symlinks can only progress in forward direction,
but never backwards in time.
\end_layout
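\begin_layout Standard
The timestamp comparison protocol can be sketched in a few lines of Python.
This is a simplified model under stated assumptions, not the actual MARS
implementation: each node's symlink state is modelled as a dictionary mapping
a pathname to a pair of Lamport timestamp and value, and the function name
is hypothetical.
\end_layout

```python
def merge_links(local, remote):
    """Merge remote symlink state into local state; the newest stamp wins.

    Both arguments map a pathname to a (lamport_stamp, value) pair.
    Returns the list of pathnames that were created or updated locally.
    """
    updated = []
    for path, (remote_stamp, remote_value) in remote.items():
        local_entry = local.get(path)
        if local_entry is None or remote_stamp > local_entry[0]:
            # Remote version is unknown here or newer: take it over.
            local[path] = (remote_stamp, remote_value)
            updated.append(path)
        # Older (or equally old) remote versions are ignored.
    return updated


# Node B holds an old version, node A a newer one.
node_b = {"/mars/userspace/mykey-A": (100, "myvalue")}
node_a = {"/mars/userspace/mykey-A": (200, "some-other-value")}
merge_links(node_b, node_a)   # B takes over the newer version from A
merge_links(node_a, node_b)   # nothing happens: A is already up to date
```

\begin_layout Standard
Because a pathname missing at the local side is simply taken over from the
remote side, the same model also shows why a symlink removed locally by hand
reappears after the next exchange.
\end_layout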
\begin_layout Standard
As a consequence, symlink updates made
\begin_inset Quotes eld
\end_inset
by hand
\begin_inset Quotes erd
\end_inset
via
\family typewriter
ln -sf
\family default
may get lost when the local system clock is much earlier than the
Lamport clock.
\end_layout
\begin_layout Standard
When your cluster is fully connected by the network, the last timestamp
will finally win everywhere.
Only in case of network outages leading to
\emph on
network partitions
\emph default
, some information may be
\emph on
temporarily inconsistent
\emph default
, but only for the duration of the network outage.
The timestamp comparison protocol in combination with the Lamport clock
and with the persistence of the
\family typewriter
/mars/
\family default
filesystem will automatically heal any temporary inconsistencies as soon
as possible, even in case of temporary node shutdown.
\end_layout
\begin_layout Standard
The meaning of some internal MARS symlinks residing in
\family typewriter
/mars/
\family default
will hopefully be documented in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Documentation-of-the"
\end_inset
some day.
\end_layout
\begin_layout Section
Defending Overflow of
\family typewriter
/mars/
\begin_inset CommandInset label
LatexCommand label
name "sec:Defending-Overflow"
\end_inset
\end_layout
\begin_layout Standard
This section describes an important difference from DRBD.
The metadata of DRBD is allocated
\emph on
statically
\emph default
at
\emph on
creation
\emph default
\emph on
time
\emph default
of the resource.
In contrast, the MARS transaction logfiles are allocated
\emph on
dynamically
\emph default
at
\emph on
runtime
\emph default
.
\end_layout
\begin_layout Standard
This leads to a potential risk from the perspective of a sysadmin: what
happens if the
\family typewriter
/mars/
\family default
filesystem runs out of space?
\end_layout
\begin_layout Standard
No risk, no fun.
If you want a system which survives long-lasting network outages while
keeping your replicas always consistent (anytime consistency), you
\emph on
need
\emph default
dynamic memory for that.
It is
\emph on
impossible
\emph default
to solve that problem using static memory
\begin_inset Foot
status open
\begin_layout Plain Layout
The bitmaps used by DRBD don't preserve the
\emph on
order
\emph default
of write operations.
They cannot do that, because their space is
\begin_inset Formula $O(k)$
\end_inset
for some constant
\begin_inset Formula $k$
\end_inset
.
In contrast, MARS preserves the order.
Preserving the order as such (even when only
\emph on
facts
\emph default
about the order were recorded without recording the actual data contents)
requires
\begin_inset Formula $O(n)$
\end_inset
space where
\begin_inset Formula $n$
\end_inset
grows without bound over time.
\end_layout
\end_inset
.
\end_layout
\begin_layout Standard
Therefore, DRBD and MARS have different application areas.
If you just want a simple system for mirroring your data over short distances
like a crossover cable, DRBD will be a suitable choice.
However, if you need to replicate over longer distances, or if you need
higher levels of reliability even when multiple failures may accumulate
(such as network loss during a
\emph on
re
\emph default
sync of DRBD), the transaction logs of MARS can solve that, but at some
\emph on
cost
\emph default
.
\end_layout
\begin_layout Subsection
Countermeasures
\end_layout
\begin_layout Subsubsection
Dimensioning of
\family typewriter
/mars/
\begin_inset CommandInset label
LatexCommand label
name "sub:Dimensioning-of-/mars/"
\end_inset
\end_layout
\begin_layout Standard
The first (and most important) measure against overflow of
\family typewriter
/mars/
\family default
is simply to dimension it large enough to survive longer-lasting problems,
at least one weekend.
\end_layout
\begin_layout Standard
The recommended size is at least one dedicated disk, attached to a hardware
RAID controller with BBU (see section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Preparation:-What-you"
\end_inset
).
During normal operation, only a small fraction of that space is needed,
typically a few percent or even less than one percent.
However, it is your
\series bold
safety margin
\series default
.
Keep it high enough!
\end_layout
\begin_layout Subsubsection
Monitoring
\end_layout
\begin_layout Standard
The next (equally important) measure is
\series bold
monitoring in userspace
\series default
.
\end_layout
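\begin_layout Standard
A userspace check of this kind might look like the following Python sketch.
It is an illustration only and not part of MARS: the function names are
hypothetical, and the limits are the typical values of 30%, 20% and 10% free
space named in the list of countermeasures in this section.
Which actions are triggered at each level is left to the caller.
\end_layout

```python
import shutil

# Free-space limits as fractions of the filesystem size, ordered from the
# most severe to the least severe level (typical values, adjust as needed).
LIMITS = [(0.10, "ERROR"), (0.20, "WARNING"), (0.30, "INFO")]


def classify_free_space(free_fraction, limits=LIMITS):
    """Return the alert level for a given free fraction, or "OK"."""
    for limit, level in limits:
        if free_fraction < limit:
            return level
    return "OK"


def check_mars(path="/mars"):
    """Measure the free space of the filesystem at `path` and classify it."""
    usage = shutil.disk_usage(path)
    return classify_free_space(usage.free / usage.total)
```

\begin_layout Standard
A cron job or monitoring plugin could call
\family typewriter
check_mars()
\family default
 periodically and map the returned level to the actions described in this
 section.
\end_layout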
\begin_layout Standard
The following is a list of countermeasures both in userspace and in kernelspace,
in the order of
\begin_inset Quotes eld
\end_inset
defensive walling
\begin_inset Quotes erd
\end_inset
:
\end_layout
\begin_layout Enumerate
Regular userspace monitoring must throw an INFO if a certain freespace limit
\begin_inset Formula $l_{1}$
\end_inset
of
\family typewriter
/mars/
\family default
is undershot.
Typical values for
\begin_inset Formula $l_{1}$
\end_inset
are 30%.
Typical actions are automated calls of
\family typewriter
marsadm log-rotate all
\family default
followed by
\family typewriter
marsadm log-delete-all all
\family default
.
You have to implement that yourself in sysadmin space.
\end_layout
\begin_layout Enumerate
Regular userspace monitoring must throw a WARNING if a certain freespace
limit
\begin_inset Formula $l_{2}$
\end_inset
of
\family typewriter
/mars/
\family default
is undershot.
Typical values for
\begin_inset Formula $l_{2}$
\end_inset
are 20%.
Typical actions are (in addition to
\family typewriter
log-rotate
\family default
and
\family typewriter
log-delete-all
\family default
) alerting human supervisors via SMS and/or further, stronger automated actions.
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Frequently, a lot of space is occupied by files stemming from debugging output,
or from other programs or processes.
A hot candidate is
\begin_inset Quotes eld
\end_inset
forgotten
\begin_inset Quotes erd
\end_inset
removal of debugging output to
\family typewriter
/mars/
\family default
.
Sometimes, an
\family typewriter
rm -rf $(find /mars/ -name
\begin_inset Quotes eld
\end_inset
*.log
\begin_inset Quotes erd
\end_inset
)
\family default
can work miracles.
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Another source of space hogging is a
\begin_inset Quotes eld
\end_inset
forgotten
\begin_inset Quotes erd
\end_inset
\family typewriter
pause-sync
\family default
or
\family typewriter
disconnect
\family default
.
Therefore, a simple
\family typewriter
marsadm connect-global all
\family default
followed by
\family typewriter
marsadm resume-replay-global all
\family default
may also work miracles (if you didn't want to freeze some mirror deliberately).
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
If you just wanted to freeze a mirror at an outdated state for a very long
time, you simply
\emph on
cannot
\emph default
do that without causing infinite growth of space consumption in
\family typewriter
/mars/
\family default
.
Therefore, a
\family typewriter
marsadm leave-resource $res
\family default
at
\emph on
exactly that(!)
\emph default
secondary site where the mirror is frozen, can also work miracles.
If you want to automate this in userspace, be careful.
It is easy to get unintended effects when choosing the wrong site for
\family typewriter
leave-resource
\family default
.
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Hint: you can / should start some of these measures even earlier at the
INFO level (see item 1), or even earlier.
\end_layout
\begin_layout Enumerate
Regular userspace monitoring must throw an ERROR if a certain freespace
limit
\begin_inset Formula $l_{3}$
\end_inset
of
\family typewriter
/mars/
\family default
is undershot.
Typical values for
\begin_inset Formula $l_{3}$
\end_inset
are 10%.
Typical actions are alarming the CEO via SMS and/or even stronger automated
actions.
For example, you may choose to automatically call
\family typewriter
marsadm leave-resource $res
\family default
on some or all secondary nodes, so that the primary is left alone
and gets a chance to really delete its logfiles because no other node
could still need them.
\end_layout
\begin_layout Enumerate
First-level kernelspace action, automatically executed when
\family typewriter
\begin_inset Flex URL
status open
\begin_layout Plain Layout
/proc/sys/mars/required_free_space_4_gb
\end_layout
\end_inset
\family default
+
\family typewriter
\begin_inset Flex URL
status open
\begin_layout Plain Layout
/proc/sys/mars/required_free_space_3_gb
\end_layout
\end_inset
\family default
+
\family typewriter
\begin_inset Flex URL
status open
\begin_layout Plain Layout
/proc/sys/mars/required_free_space_2_gb
\end_layout
\end_inset
\family default
+
\family typewriter
\begin_inset Flex URL
status open
\begin_layout Plain Layout
/proc/sys/mars/required_free_space_1_gb
\end_layout
\end_inset
\family default
is undershot:
\begin_inset Newline newline
\end_inset
a warning will be issued.
\end_layout
\begin_layout Enumerate
Second-level kernelspace action, automatically executed when
\family typewriter
\begin_inset Flex URL
status open
\begin_layout Plain Layout
/proc/sys/mars/required_free_space_3_gb
\end_layout
\end_inset
\family default
+
\family typewriter
\begin_inset Flex URL
status open
\begin_layout Plain Layout
/proc/sys/mars/required_free_space_2_gb
\end_layout
\end_inset
\family default
+
\family typewriter
\begin_inset Flex URL
status open
\begin_layout Plain Layout
/proc/sys/mars/required_free_space_1_gb
\end_layout
\end_inset
\family default
is undershot:
\begin_inset Newline newline
\end_inset
all locally secondary resources will delete local copies of transaction
logfiles which are no longer needed locally.
This is a desperate action of the kernel module.
\end_layout
\begin_layout Enumerate
Third-level kernelspace action, automatically executed when
\family typewriter
\begin_inset Flex URL
status open
\begin_layout Plain Layout
/proc/sys/mars/required_free_space_2_gb
\end_layout
\end_inset
\family default
+
\family typewriter
\begin_inset Flex URL
status open
\begin_layout Plain Layout
/proc/sys/mars/required_free_space_1_gb
\end_layout
\end_inset
\family default
is undershot:
\begin_inset Newline newline
\end_inset
all locally secondary resources will stop fetching transaction logfiles.
This is a more desperate action of the kernel module.
You don't want to get there (except for testing).
\end_layout
\begin_layout Enumerate
Last desperate kernelspace action when all else has failed and
\family typewriter
\begin_inset Flex URL
status open
\begin_layout Plain Layout
/proc/sys/mars/required_free_space_1_gb
\end_layout
\end_inset
\family default
is undershot:
\begin_inset Newline newline
\end_inset
all locally primary resources will enter
\series bold
emergency mode
\series default
(see description below in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Emergency-Mode"
\end_inset
).
This is the most desperate action of the kernel module.
You don't want to get there (except for testing).
\end_layout
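\begin_layout Standard
The three userspace limits from items 1 to 3 can be sketched in Python as follows.
This is only a minimal sketch and not part of MARS: the names LIMITS, free_percent and alert_level are hypothetical, the percentages are the typical values named above, and treating a limit as undershot only when the free space is strictly below it is an assumption.
\end_layout

```python
import shutil

# Hypothetical limits l3, l2, l1 (percent of free space), most severe first.
# The numbers are the typical values from the list above; adapt them.
LIMITS = [(10.0, "ERROR"), (20.0, "WARNING"), (30.0, "INFO")]


def free_percent(path="/mars"):
    """Return the free space of the filesystem at `path` in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def alert_level(free_pct, limits=LIMITS):
    """Return the most severe level whose limit is undershot, else "OK"."""
    for limit, level in limits:
        if free_pct < limit:  # assumption: strictly below means undershot
            return level
    return "OK"
```

\begin_layout Standard
A periodic job could evaluate alert_level(free_percent()) and, starting at the INFO level, run the cleanup commands named above.
Alarming human supervisors is left to your own monitoring system.
\end_layout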
\begin_layout Standard
In addition, the kernel module obeys a general global limit
\family typewriter
\begin_inset Flex URL
status open
\begin_layout Plain Layout
/proc/sys/mars/required_total_space_0_gb
\end_layout
\end_inset
+
\family default
the sum of all of the above limits.
When the
\emph on
total size
\emph default
of
\family typewriter
/mars/
\family default
falls below that sum, the kernel module refuses to start at all, because
it assumes that it is senseless to try to operate MARS on a system with
such low storage resources.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
The current level of emergency kernel actions may be viewed at any time
via
\family typewriter
\begin_inset Flex URL
status collapsed
\begin_layout Plain Layout
/proc/sys/mars/mars_emergency_mode
\end_layout
\end_inset
\family default
.
\end_layout
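\begin_layout Standard
The cumulative nature of the kernelspace limits can be illustrated by a small Python sketch.
It is not taken from the kernel module: the function names and the numbering of the levels (0 for normal operation up to 4 for emergency mode) are assumptions made for illustration only, and the values actually shown in mars_emergency_mode may be encoded differently.
\end_layout

```python
from itertools import accumulate


def kernel_level(free_gb, required):
    """Map free space to a hypothetical action level.

    `required` holds [r1, r2, r3, r4], the values of
    required_free_space_1_gb .. required_free_space_4_gb.
    Returns 0 (nothing to do) .. 4 (emergency mode on primaries).
    """
    sums = list(accumulate(required))  # r1, r1+r2, r1+r2+r3, r1+r2+r3+r4
    for index, limit in enumerate(sums):
        if free_gb < limit:
            return len(sums) - index
    return 0


def can_start(total_gb, required_total_0, required):
    """Total size check: required_total_space_0_gb plus all other limits."""
    return total_gb >= required_total_0 + sum(required)
```

\begin_layout Standard
With all four limits set to 2 GB, for example, 5 GB of free space falls below the sum of the first three limits and therefore yields level 2 in this sketch (secondaries delete logfiles they no longer need locally).
\end_layout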
\begin_layout Subsubsection
Throttling
\end_layout
\begin_layout Standard
The last measure for defense against overflow is
\series bold
throttling your performance pigs
\series default
.
\end_layout
\begin_layout Standard
Motivation: in rare cases, some users with
\family typewriter
ssh
\family default
access can do
\emph on
very
\emph default
silly things.
For example, some of them are creating their own backups via user-cron
jobs, and they do it every 5 minutes.
Some example guy created a zip archive (almost 1GB) by regularly copying
his old zip archive into a new one, then appending deltas to the new one,
and finally deleting the old archive.
Every 5 minutes.
Yes, every 5 minutes, although new files were almost never added to
the archive.
Essentially, he copied over his archive, for nothing.
This led to massive bulk write requests, for ridiculous reasons.
\end_layout
\begin_layout Standard
In general, your hard disks (or even RAID systems) allow much higher write
IO rates than you can ever transport over a standard TCP network from your
primary site to your secondary, at least over longer distances (see use
cases for MARS in chapter
\begin_inset CommandInset ref
LatexCommand ref
reference "chap:Use-Cases-for"
\end_inset
).
Therefore, it is easy to create such a high write load that it will be
\emph on
impossible
\emph default
to replicate it over the network,
\emph on
by construction
\emph default
.
\end_layout
\begin_layout Standard
Therefore, we
\emph on
need
\emph default
some mechanism for throttling bulk writers whenever the network is weaker
than your IO subsystem.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Notice that DRBD will
\emph on
always
\emph default
throttle your writes whenever the network forms a bottleneck, due to its
synchronous operation mode.
In contrast, MARS allows for buffering of performance peaks in the transaction
logfiles.
\emph on
Only when
\emph default
your buffer in
\family typewriter
/mars/
\family default
runs short (cf subsection
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Dimensioning-of-/mars/"
\end_inset
), MARS will start to throttle your application writes.
\end_layout
\begin_layout Standard
There are a number of tuning knobs named
\family typewriter
/proc/sys/mars/write_throttle_*
\family default
with the following meaning:
\end_layout
\begin_layout Description
\family typewriter
write_throttle_start_percent
\family default
Whenever the used space in
\family typewriter
/mars/
\family default
is below this threshold, no throttling will occur at all.
Only when this threshold is exceeded, throttling will start
\emph on
slowly
\emph default
.
Typical values for this are 60%.
\end_layout
\begin_layout Description
\family typewriter
write_throttle_end_percent
\family default
Maximum throttling will occur once this space threshold is reached, i.e.
the throttling is now at its maximum effect.
Typical values for this are 90%.
When the actual space in
\family typewriter
/mars/
\family default
lies between
\family typewriter
write_throttle_start_percent
\family default
and
\family typewriter
write_throttle_end_percent
\family default
, the strength of throttling will be interpolated linearly between the extremes.
In practice, this should lead to an equilibrium between new input flow into
\family typewriter
/mars/
\family default
and output flow over the network to secondaries.
\end_layout
\begin_layout Description
\family typewriter
write_throttle_size_threshold_kb
\family default
(readonly) This parameter shows the internal strength calculation of the
throttling.
Only write
\begin_inset Foot
status open
\begin_layout Plain Layout
Read requests are never throttled at all.
\end_layout
\end_inset
requests exceeding this size (in KB) are throttled at all.
Typically, this will hurt the bulk performance pigs first, while leaving
ordinary users (issuing small requests) unaffected.
\end_layout
\begin_layout Description
\family typewriter
write_throttle_ratelimit_kb
\family default
Set the global IO rate in KB/s for those write requests which are throttled.
In case of strongest
\begin_inset Foot
status open
\begin_layout Plain Layout
In case of lighter throttling, the input flow into
\family typewriter
/mars/
\family default
may be higher because small requests are not throttled.
\end_layout
\end_inset
throttling, this parameter determines the input flow into
\family typewriter
/mars/
\family default
.
The default value is 5,000 KB/s.
Please adjust this value to your application needs and to your environment.
\end_layout
\begin_layout Description
\family typewriter
write_throttle_rate_kb
\family default
(readonly) Shows the current rate of exactly those requests which are actually
throttled (in contrast to
\emph on
all
\emph default
requests).
\end_layout
\begin_layout Description
\family typewriter
write_throttle_cumul_kb
\family default
(logically readonly) Same as before, but the cumulative sum of all throttled
requests since startup / reset.
This value can be reset from userspace in order to prevent integer overflow.
\end_layout
\begin_layout Description
\family typewriter
write_throttle_count_ops
\family default
(logically readonly) Shows the cumulative number of throttled requests.
This value can be reset from userspace in order to prevent integer overflow.
\end_layout
\begin_layout Description
\family typewriter
write_throttle_maxdelay_ms
\family default
Each request is delayed at most for this timespan.
Smaller values will improve the responsiveness of your userspace application,
but at the cost of potentially not delaying the requests sufficiently.
\end_layout
\begin_layout Description
\family typewriter
write_throttle_minwindow_ms
\family default
Set the minimum length of the measuring window.
The measuring window is the timespan for which the average (throughput)
rate is computed (see
\family typewriter
write_throttle_rate_kb
\family default
).
Lower values can increase the responsiveness of the controller algorithm,
but at the cost of accuracy.
\end_layout
\begin_layout Description
\family typewriter
write_throttle_maxwindow_ms
\family default
This parameter must be set substantially greater than
\family typewriter
write_throttle_minwindow_ms
\family default
.
In case the flow of throttled operations pauses for some natural reason
(e.g.
switched off, low load, etc), this parameter determines when a completely
new rate calculation should be started over
\begin_inset Foot
status open
\begin_layout Plain Layout
Motivation: if requests paused for one hour, the measuring window could
also grow to an hour.
Of course, that would lead to completely meaningless results.
Two requests in one hour is
\begin_inset Quotes eld
\end_inset
incorrect
\begin_inset Quotes erd
\end_inset
from a human point of view: we just have to ensure that averages are computed
with respect to a reasonable maximum time window in the magnitude of 10s.
\end_layout
\end_inset
.
\end_layout
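\begin_layout Standard
The linear interpolation between write_throttle_start_percent and write_throttle_end_percent can be sketched as follows.
This is an illustration under assumptions, not the kernel implementation: the function name throttle_strength and the normalization of the strength to the range from 0.0 to 1.0 are hypothetical, and the defaults are the typical values named above.
How the kernel module translates the strength into a size threshold and delays is not shown here.
\end_layout

```python
def throttle_strength(used_pct, start_pct=60.0, end_pct=90.0):
    """Return 0.0 (no throttling) .. 1.0 (maximum throttling).

    `used_pct` is the used space in /mars/ in percent. Between the two
    thresholds the strength grows linearly.
    """
    if used_pct <= start_pct:
        return 0.0
    if used_pct >= end_pct:
        return 1.0
    return (used_pct - start_pct) / (end_pct - start_pct)
```

\begin_layout Standard
With the typical values of 60% and 90%, a fill level of 75% lies exactly halfway between the thresholds, so this sketch yields half of the maximum strength.
\end_layout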
\begin_layout Subsection
Emergency Mode and its Resolution
\begin_inset CommandInset label
LatexCommand label
name "sub:Emergency-Mode"
\end_inset
\end_layout
\begin_layout Standard
When
\family typewriter
/mars/
\family default
is almost full and there is really absolutely no chance of getting rid
of any local transaction logfile (or free some space in any other way),
there is only one exit strategy: stop creating new logfile data.
\end_layout
\begin_layout Standard
This means that the ability for replication gets lost.
\end_layout
\begin_layout Standard
When entering emergency mode, the kernel module will execute the following
steps for all resources where the affected host is acting as a primary:
\end_layout
\begin_layout Enumerate
Do a kind of
\begin_inset Quotes eld
\end_inset
logrotate
\begin_inset Quotes erd
\end_inset
, but create a
\emph on
hole
\emph default
in the sequence of transaction logfile numbers.
The
\begin_inset Quotes eld
\end_inset
new
\begin_inset Quotes erd
\end_inset
logfile is left empty, i.e.
no data is written to it (for now).
The hole in the numbering will prevent any secondaries from replaying any
logfiles behind the hole (should they ever contain some data, e.g.
because the emergency mode has been left again).
This works because the secondaries are regularly checking the logfile numbers
for contiguity, and they will refuse to replay anything which is not contiguous.
As a result, the secondaries will be left in a consistent, but outdated
state (at least if they already were consistent before that).
\end_layout
\begin_layout Enumerate
The kernel module writes back all data present in the temporary memory buffer
(see figure in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-Transaction-Logger"
\end_inset
).
This may lead to a (short) delay of user write requests until that has
finished (typically fractions of a second or a few seconds).
The reason is that the temporary memory buffer must not be increased in
parallel during this phase (race conditions).
\end_layout
\begin_layout Enumerate
After the temporary memory buffer is empty, all local IO requests (whether
reads or writes) are directly going to the underlying disk.
This has the same effect as if MARS were no longer present.
Transaction logging no longer takes place.
\end_layout
\begin_layout Enumerate
Any sync from any secondary is stopped ASAP.
If they resume their sync at some later time, they will start over
from the beginning (position
\begin_inset Formula $0$
\end_inset
).
\end_layout
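\begin_layout Standard
The contiguity check from step 1 explains why a hole in the logfile numbering stops replay at the secondaries.
The following Python sketch illustrates the idea only: the function replayable and its arguments are hypothetical and do not mirror the actual data structures of the kernel module.
\end_layout

```python
def replayable(log_numbers, next_expected):
    """Return the logfile numbers a secondary would replay.

    Replay starts at `next_expected` and stops at the first hole in the
    sequence, so nothing behind the hole is ever applied.
    """
    result = []
    for number in sorted(log_numbers):
        if number < next_expected:
            continue  # already replayed earlier
        if number != next_expected:
            break  # hole found: refuse to go any further
        result.append(number)
        next_expected += 1
    return result
```

\begin_layout Standard
If the primary leaves out one number when entering emergency mode, every logfile behind the hole is ignored in this sketch, even when it later receives data.
\end_layout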
\begin_layout Standard
In order to leave emergency mode, the sysadmin should do the following steps:
\end_layout
\begin_layout Enumerate
Free enough space.
For example, delete any foreign files on
\family typewriter
/mars/
\family default
which have nothing to do with MARS, or resize the
\family typewriter
/mars/
\family default
filesystem, or whatever.
\end_layout
\begin_layout Enumerate
If
\family typewriter
\begin_inset Flex URL
status open
\begin_layout Plain Layout
/proc/sys/mars/mars_reset_emergency
\end_layout
\end_inset
\family default
is not set, now it is time to set it.
Normally, it should be already set.
\end_layout
\begin_layout Enumerate
Notice: as long as not enough space has been freed, a message containing
\family typewriter
\begin_inset Quotes eld
\end_inset
EMEGENCY MODE HYSTERESIS
\begin_inset Quotes erd
\end_inset
\family default
(or similar) will be displayed by
\family typewriter
marsadm view all
\family default
.
As a consequence, any sync will be automatically halted.
This applies to freshly invoked syncs also, for example created by
\family typewriter
invalidate
\family default
or
\family typewriter
join-resource
\family default
.
\end_layout
\begin_layout Enumerate
On the secondaries, use
\family typewriter
marsadm invalidate $res
\family default
in order to request updating your outdated mirrors.
\end_layout
\begin_layout Enumerate
On the primary:
\family typewriter
marsadm log-delete-all all
\end_layout
\begin_layout Enumerate
As soon as enough space has been freed everywhere to leave the
\family typewriter
EMEGENCY MODE HYSTERESIS
\family default
, sync should really start.
Until then, it remains halted.
\end_layout
\begin_layout Standard
Alternatively, there is another method by roughly following the instructions
from appendix
\begin_inset CommandInset ref
LatexCommand ref
reference "chap:Alternative-Methods-for"
\end_inset
, but in a slightly different order.
In this case, do
\family typewriter
leave-resource
\family default
everywhere on
\emph on
all
\emph default
secondaries, but
\emph on
don't
\emph default
start the
\family typewriter
join-resource
\family default
phase
\emph on
for now
\emph default
.
Then cleanup all your secondaries via
\family typewriter
log-purge-all
\family default
, and finally
\family typewriter
log-delete-all all
\family default
at the primary, and wait until the emergency has vanished everywhere.
Only after that, re-
\family typewriter
join-resource
\family default
your secondaries.
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Expert advice for
\begin_inset Formula $k=2$
\end_inset
replicas: this means you had only 1 mirror per resource before the overflow
happened.
Provided that you have enough space on your LVMs and on
\family typewriter
/mars/
\family default
, and provided that transaction logging has automatically restarted after
\family typewriter
leave-resource
\family default
and
\family typewriter
log-purge-all
\family default
, you can recover redundancy by creating a
\emph on
new
\emph default
replica via
\family typewriter
marsadm join-resource $res
\family default
on a
\emph on
third
\emph default
node.
Only after the initial full sync has finished there, run
\family typewriter
join-resource
\family default
at your original mirror.
This way, you will always retain at least one
\series bold
consistent mirror
\series default
somewhere.
After all is up-to-date, you can delete the superfluous mirror by
\family typewriter
marsadm leave-resource $res
\family default
and reclaim the disk space from its underlying LVM disk.
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
If you already have
\begin_inset Formula $k>2$
\end_inset
replicas in total, it may be a wise idea to prefer the
\family typewriter
leave-resource ; log-purge-all ; join-resource
\family default
method over
\family typewriter
invalidate
\family default
because it does not invalidate
\emph on
all
\emph default
your replicas at the same time (when handled properly in the right order).
\end_layout
\begin_layout Chapter
The Macro Processor
\begin_inset CommandInset label
LatexCommand label
name "chap:The-Macro-Processor"
\end_inset
\end_layout
\begin_layout Standard
\family typewriter
marsadm
\family default
comes with a customizable macro processor.
It can be used for high-level complex display of the state of MARS (so-called
\emph on
complex macros
\emph default
), as well as for low-level display of lots of individual state values (so-calle
d
\emph on
primitive macros
\emph default
).
\end_layout
\begin_layout Standard
From the commandline, any macro can be called via
\family typewriter
marsadm view-
\emph on
$macroname
\emph default
mydata
\family default
.
The short form
\family typewriter
marsadm view mydata
\family default
is equivalent to
\family typewriter
marsadm view-default mydata
\family default
.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
In general, the command
\family typewriter
marsadm view-
\emph on
$macroname
\emph default
all
\family default
will first call the macro
\family typewriter
\emph on
$macroname
\family default
\emph default
in a loop for
\emph on
all
\emph default
resources we are a
\emph on
member locally
\emph default
.
Finally, a trailing macro
\family typewriter
\emph on
$macroname
\emph default
-global
\family default
will be called with an empty
\family typewriter
%{res}
\family default
argument, provided that such a macro is defined.
This way, you can produce per-resource output followed by global output
which does not depend on a particular resource.
\end_layout
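\begin_layout Standard
The call order described above can be modelled by a few lines of Python.
This is a hypothetical model for illustration, not the implementation inside marsadm: the function view_all_calls and its arguments are invented names, and the empty string stands for the empty %{res} argument.
\end_layout

```python
def view_all_calls(macroname, local_resources, defined_macros):
    """Return the (macro, resource) calls made for `view-<macroname> all`.

    First one call per resource we are a member of locally, then one
    trailing call of `<macroname>-global` with an empty resource argument,
    but only if such a macro is defined.
    """
    calls = [(macroname, res) for res in local_resources]
    global_name = macroname + "-global"
    if global_name in defined_macros:
        calls.append((global_name, ""))
    return calls
```

\begin_layout Standard
For two local resources and a defined default-global macro, the model yields two per-resource calls followed by a single global call.
\end_layout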
\begin_layout Section
Predefined Macros
\end_layout
\begin_layout Standard
The macro processor is a very flexible and versatile tool for
\series bold
customizing
\series default
.
You can create your own macros, but probably the rich set of predefined
macros is already sufficient for your needs.
\end_layout
\begin_layout Subsection
Predefined Complex and High-Level Macros
\begin_inset CommandInset label
LatexCommand label
name "sub:Predefined-Complex-and"
\end_inset
\end_layout
\begin_layout Standard
The following predefined complex macros try to address the information needs
of humans.
Use them in scripts only if you are prepared for the fact that the
output format may change during development of MARS.
\end_layout
\begin_layout Standard
Notice: the definitions of predefined complex macros may be updated in the
course of the MARS project.
However, the primitive macros recursively called by the complex ones will
hopefully remain rather stable in the future (with the exception of bugfixes).
If you want to retain an old / outdated version of a complex macro, just
check it out from git, follow the instructions in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Creating-your-own"
\end_inset
, and preferably give it a different name in order to avoid confusion with
the newer version.
In general, it should be possible to use old macros with newer versions
of
\family typewriter
marsadm
\family default
\begin_inset Foot
status open
\begin_layout Plain Layout
You might need to check out also old versions of further macros and adapt
their names, whenever complex macros call each other.
\end_layout
\end_inset
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
default
\family default
This is equivalent to
\family typewriter
marsadm view mydata
\family default
without
\family typewriter
\emph on
-macroname
\family default
\emph default
suffix.
It shows a one-line status summary for each resource, optionally followed
by informational lines such as progress bars whenever a sync or a fetch
of logfiles is currently running.
The status line has the following fields:
\end_layout
\begin_deeper
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
%{res}
\family default
resource name.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
%include{diskstate}
\family default
see
\family typewriter
diskstate
\family default
macro below.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
%include{replstate}
\family default
see
\family typewriter
replstate
\family default
macro below.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
%include{flags}
\family default
see
\family typewriter
flags
\family default
macro below.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
%include{role}
\family default
see
\family typewriter
role
\family default
macro below.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
%include{primarynode}
\family default
see
\family typewriter
primarynode
\family default
macro below.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
%include{commstate}
\family default
see
\family typewriter
commstate
\family default
macro below.
\end_layout
\end_deeper
\begin_layout Labeling
\labelwidthstring 00.00.0000
\begin_inset space ~
\end_inset
After that, optional lines such as progress bars are appearing only when
something unusual is happening.
These lines are subject to future changes.
For example, wasted disk space due to missing
\family typewriter
resize
\family default
is reported when
\family typewriter
%{threshold}
\family default
is exceeded.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
1and1
\family default
\begin_inset space ~
\end_inset
or
\begin_inset space ~
\end_inset
\family typewriter
default-1and1
\family default
A variant of
\family typewriter
default
\family default
for internal use by 1&1 Internet AG.
You may call this complex macro by saying
\family typewriter
marsadm view-1and1 all
\family default
.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Note: the
\family typewriter
marsadm view-1and1
\family default
command has been intensely tested in Spring 2014 to produce exactly the
same output as the 1&1 internal
\begin_inset Foot
status open
\begin_layout Plain Layout
In addition to allowing customization, the macro processor is also meant
as an exit strategy for removing dependencies from non-free software.
\series bold
Please put your future macros also under GPL!
\end_layout
\end_inset
tool
\family typewriter
marsview
\family default
\begin_inset Foot
status open
\begin_layout Plain Layout
There are some subtle differences: numbers are displayed in a different
precision, some bug fixes in the macro version (which might have occurred
\emph on
in the meantime
\emph default
) may lead to different output as a side effect from bug fixes in
\emph on
predefined
\emph default
macros, because the original
\family typewriter
marsview
\family default
command is currently not actively maintained.
Documentation of
\family typewriter
marsview
\family default
can be found in the corresponding manpage, see
\family typewriter
man marsview
\family default
.
By construction, this is also the (unmaintained) documentation of
\family typewriter
marsadm view-1and1
\family default
and other
\family typewriter
-1and1
\family default
macros.
Notice that all
\family typewriter
*-1and1
\family default
macros are not officially supported by the developer of MARS, and they
may disappear in a future major release.
However, they could be useful for your own customization macros.
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Customization via your own macros (see section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Creating-your-own"
\end_inset
) is explicitly encouraged by the developer.
It would be nice if a vibrant user community emerged, with users helping each
other by exchanging macros.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Hint: in order to produce your own customized inspection / monitoring tools,
you may ask the author for an official reservation of a macro sub-namespace
such as
\family typewriter
*-
\emph on
yourcompanyname
\family default
\emph default
.
You will be fully responsible for your own reserved namespace and can do
with it whatever you want.
The official MARS release will guarantee that
\emph on
no name clashes
\emph default
with your reserved sub-namespace will occur in future.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
default-global
\family default
Currently, this just calls
\family typewriter
comminfo
\family default
(see below).
May be extended in future.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
diskstate
\family default
Shows the status of the underlying disk device, in the following order
of precedence
\begin_inset Foot
status open
\begin_layout Plain Layout
When an earlier list item is displayed, no combinations with following items
are possible.
This kind of
\begin_inset Quotes eld
\end_inset
hiding effect
\begin_inset Quotes erd
\end_inset
can lead to an
\emph on
information loss
\emph default
.
In order to get a non-lossy picture from the state of your system, please
look at the
\family typewriter
flags
\family default
which are able to display cartesian combinations of more detailed internal
states.
\end_layout
\end_inset
:
\end_layout
\begin_deeper
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
NotJoined
\family default
(cf
\family typewriter
%get-disk{}
\family default
) No underlying disk device is configured.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
NotPresent
\family default
(cf
\family typewriter
%disk-present{}
\family default
) The underlying disk device (as configured, see
\family typewriter
marsadm view-get-disk
\family default
) does not exist or the device node is not accessible.
Therefore MARS cannot work.
Check that LVM or other software is properly configured and running.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
Detached
\family default
(cf
\family typewriter
InConsistent
\family default
,
\family typewriter
NeedsReplay
\family default
,
\family typewriter
%todo-attach{}
\family default
,
\family typewriter
%is-attach{}
\family default
) The underlying disk is willingly switched off (see
\family typewriter
marsadm detach
\family default
), and it actually is no longer opened by MARS.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
Detaching
\family default
(cf
\family typewriter
%todo-attach{}
\family default
and
\family typewriter
%is-attach{}
\family default
) Access to the underlying disk is switched off, but actually not yet
\family typewriter
close()
\family default
d by MARS.
This can happen for a long time on a primary when other secondaries are
accessing the disk remotely for syncing.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
DefectiveLog[
\emph on
description-text
\emph default
]
\family default
(cf
\family typewriter
%replay-code{}
\family default
) Typically this indicates an
\family typewriter
md5
\family default
checksum error in a transaction logfile, or another (hardware / filesystem)
defect.
This occurs extremely rarely in practice, but has been observed more frequently
during a massive failure of air conditioning in a datacenter, when disk
temperatures rose to more than 80° Celsius.
Notice that a secondary
\series bold
refuses
\series default
to apply any knowingly defective logfile data to the disk.
Although this message is
\emph on
not directly
\emph default
referring to the underlying disk, it is mentioned here because of its superior
\series bold
relevance
\series default
for the diskstate.
A damaged transaction logfile will always affect the
\emph on
actuality
\emph default
of the disk, but not its
\emph on
integrity
\emph default
(by itself).
What to do in such a case?
\end_layout
\begin_deeper
\begin_layout Enumerate
When the damage is only at one of your secondaries, you should first ensure
that the primary has a good logfile after a
\family typewriter
marsadm log-rotate
\family default
, then try
\family typewriter
marsadm invalidate
\family default
at the damaged secondary.
It is crucial that the primary has a fresh correct logfile behind the error
position, and that it is continuing to operate correctly.
\end_layout
\begin_layout Enumerate
When
\emph on
all
\emph default
of your secondaries are reporting
\family typewriter
DefectiveLog
\family default
, the primary could have
\emph on
produced
\emph default
a damaged logfile (e.g.
in RAM, in a DMA channel, etc) while continuing to operate, and all of
your secondaries got that defective logfile.
After
\family typewriter
marsadm log-delete-all all
\family default
, you can check this by comparing the
\family typewriter
md5sum
\family default
of the first primary logfile (having the lowest serial number) with the
versions on your replicas.
The problem is that you don't know whether the primary side has a silent
corruption on any of its disks, or not.
You will need to take an operational decision whether to switchover to
a secondary via
\family typewriter
primary --force
\family default
, or whether to continue operation at the primary and
\family typewriter
invalidate
\family default
your secondaries.
\end_layout
\begin_layout Enumerate
When the original primary is affected in a very bad way, such that it crashed
badly and afterwards even recovery of the
\emph on
primary
\emph default
is impossible
\begin_inset Foot
status open
\begin_layout Plain Layout
In such a rare case, the
\emph on
original primary
\emph default
(but not any other host)
\series bold
refuses
\series default
to come up during recovery with
\emph on
his own
\emph default
logfile originally produced by
\emph on
himself
\emph default
.
This is not a bug, but saves you from incorrectly assuming that your original
primary disk were consistent - it is
\emph on
known
\emph default
to be inconsistent, but recovery is impossible due to the damaged logfile.
Thus
\emph on
this one
\emph default
replica is trapped by defective hardware.
The other replicas should not be affected.
\end_layout
\end_inset
due to this error (which typically occurs extremely rarely, observed twice
during 7 million operating hours on defective hardware), you
need to take an operational decision between the following alternatives:
\end_layout
\begin_deeper
\begin_layout Enumerate
switchover to a former secondary via
\family typewriter
primary --force
\family default
, producing a split brain, and producing some (typically small) data loss.
However, integrity is more important than actuality in such an extreme
case.
\end_layout
\begin_layout Enumerate
deconstruction of the resource at
\emph on
all
\emph default
replicas via
\family typewriter
leave-resource --force
\family default
, running
\family typewriter
fsck
\family default
or similar tools by hand at the underlying disks, selecting the best replica
out of them, and finally re-constructing the resource again.
\end_layout
\begin_layout Enumerate
restore your backup.
\end_layout
\end_deeper
\end_deeper
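\begin_deeper
\begin_layout Standard
The checksum comparison mentioned in the second case can be sketched as a
small shell helper.
This is only an illustration under assumptions: the function name
\family typewriter
compare_logfile_copies
\family default
is hypothetical, both file paths are supplied by the caller, and the replica's
copy is assumed to have been transferred to the local host beforehand.
\end_layout
```shell
# Hypothetical helper: compare the MD5 checksums of two copies of the same
# transaction logfile, e.g. the primary's original and a copy taken from a
# replica. Both arguments are plain file paths chosen by the caller.
compare_logfile_copies() {
    sum_a=$(md5sum < "$1") || return 2   # first file is unreadable
    sum_b=$(md5sum < "$2") || return 2   # second file is unreadable
    if [ "$sum_a" = "$sum_b" ]; then
        echo "identical"
    else
        echo "different"
        return 1
    fi
}
```
\begin_layout Standard
A result of
\family typewriter
different
\family default
only shows that the two copies diverge.
It does not tell which side holds the damaged version.
\end_layout
\end_deeper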
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
NoAttach
\family default
(cf
\family typewriter
%is-attach{}
\family default
) The underlying disk is currently not opened by MARS.
Reasons may be that the kernel module is not loaded, or an exclusive
\family typewriter
open()
\family default
is currently not possible because somebody else has already opened it.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
InConsistent
\family default
(cf
\family typewriter
%is-consistent{}
\family default
) A logfile replay and/or sync is known to be needed, or has yet to complete
(e.g.
after
\family typewriter
invalidate
\family default
has started) in order to restore local consistency (for details, look at
\family typewriter
flags
\family default
).
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Hint: in the current implementation of MARS, this will never happen on
secondaries during ordinary replay (but only when either sync has not yet finished,
or when the
\emph on
initial
\emph default
logfile replay after the sync has not yet finished), because the ordinary
logfile replay always maintains anytime consistency once a consistent state
has been reached.
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
\emph on
Only
\emph default
in case of a primary node crash, and
\emph on
only
\emph default
after attempts have failed to become primary again (e.g.
IO errors, etc), this
\emph on
can
\emph default
(but need not) mean that something went wrong.
Even in such an extremely unlikely event, chances are high that
\family typewriter
fsck
\family default
can fix any remaining problems (and, of course, you can also switchover
to a former secondary).
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
When this message appears, simply start MARS again (e.g.
\family typewriter
modprobe mars; marsadm up all
\family default
), in whatever role you are intending.
This will
\emph on
automatically
\emph default
try to replay any necessary transaction logfile(s) in order to fix the
inconsistency.
Only if the automatic fix fails and this message persists for a long time
without progress, you
\emph on
might
\emph default
have a problem.
As observed at a large installation at 1&1, this happens extremely rarely,
and then usually indicates that your hardware is likely to be defective.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
OutDated[FR]
\family default
(cf
\family typewriter
%work-reached{}
\family default
) Only at secondaries.
Tells whether it is
\emph on
currently known
\emph default
that the disk lags behind when compared to the
\emph on
currently known
\emph default
state of the current designated primary (if there exists one).
Only meaningful if a current designated primary exists.
Notice that this kind of status display is subject to
\emph on
natural races
\emph default
, for example when new logfile data has been produced in parallel, or network
propagation is very slow.
Additional information is in brackets:
\end_layout
\begin_deeper
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
[F]
\family default
Fetch is known to be needed.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
[R]
\family default
Replay is known to be needed.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
[FR]
\family default
Both are known to be needed.
\end_layout
\end_deeper
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
WriteBack
\family default
(cf
\family typewriter
%is-primary{}
\family default
) Appears only at actual primaries (whether designated or not), when the
writeback from the RAM buffer is active (see section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-Transaction-Logger"
\end_inset
).
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
Recovery
\family default
(cf
\family typewriter
%todo-primary{}
\family default
) Appears only at the designated primary before it actually has become primary.
Similar to database recovery, this indicates the recovery phase after a
crash
\begin_inset Foot
status open
\begin_layout Plain Layout
In some cases,
\family typewriter
primary --force
\family default
may also trigger this message.
\end_layout
\end_inset
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
EmergencyMode
\family default
(cf
\family typewriter
%is-emergency{}
\family default
) A current designated primary exists, and it is known that this host has
entered emergency mode.
See section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Emergency-Mode"
\end_inset
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
UpToDate
\family default
Displayed when none of the above has been detected.
\end_layout
\end_deeper
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
diskstate-1and1
\family default
A variant for internal use by 1&1 Internet AG.
See above note.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
replstate
\family default
Shows the status of the replication in the following order of precedence:
\end_layout
\begin_deeper
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
ModuleNotLoaded
\family default
(cf
\family typewriter
%is-module-loaded{}
\family default
) The kernel module is not loaded, and as a consequence
\family typewriter
/proc/sys/mars/
\family default
does not exist.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
UnResponsive
\family default
(cf
\family typewriter
%is-alive{%{host}}
\family default
) The main thread
\family typewriter
mars_light
\family default
did not do any noticeable work for more than
\family typewriter
%{window}
\family default
(default 30) seconds.
Notice that this may happen when deleting
\emph on
extremely
\emph default
large logfiles (up to hundreds of gigabytes or terabytes).
If this happens for a
\emph on
very
\emph default
long time, you should check whether you might need a reboot in order to
fix the hang.
The time window may be changed by
\family typewriter
--window=$seconds
\family default
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
NotJoined
\family default
(cf
\family typewriter
%get-disk{}
\family default
) No underlying disk device is configured for this resource.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
NotStarted
\family default
(cf
\family typewriter
%todo-attach{}
\family default
) Replication has not been started.
\end_layout
\begin_layout Itemize
When the current host is designated as a primary, the rest of the precedence
list looks as follows:
\end_layout
\begin_deeper
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
EmergencyMode
\family default
(cf.
\family typewriter
%is-emergency{}
\family default
) See section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Emergency-Mode"
\end_inset
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
Replicating
\family default
(cf.
\family typewriter
%is-primary{}
\family default
) Primary mode has been entered.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
NotYetPrimary
\family default
(catchall) This means the current host
\emph on
should
\emph default
act as a primary (see
\family typewriter
marsadm primary
\family default
or
\family typewriter
marsadm primary --force
\family default
), but currently doesn't (yet).
This happens during logfile replay, before primary mode is actually entered.
Notice that replay of very big logfiles may take a long time.
\end_layout
\end_deeper
\begin_layout Itemize
When the current host is
\emph on
not
\emph default
designated as a primary:
\end_layout
\begin_deeper
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
PausedSync
\family default
(cf.
\family typewriter
%sync-rest{}
\family default
and
\family typewriter
%todo-sync{}
\family default
) Some data needs to be synced, but sync is currently switched off.
See
\family typewriter
marsadm {pause,resume}-sync
\family default
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
Syncing
\family default
(cf.
\family typewriter
%is-sync{}
\family default
) Sync is currently running.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
PausedFetch
\family default
(cf.
\family typewriter
%todo{fetch}
\family default
) Fetch is currently switched off.
See
\family typewriter
marsadm {pause,resume}-fetch
\family default
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
PausedReplay
\family default
(cf.
\family typewriter
%todo{replay}
\family default
) Replay is currently switched off.
See
\family typewriter
marsadm {pause,resume}-replay
\family default
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
NoPrimaryDesignated
\family default
(cf.
\family typewriter
%get-primary{}
\family default
) A
\family typewriter
secondary
\family default
command has been given somewhere in the cluster.
Thus no designated primary exists.
All resource members are in state
\family typewriter
Secondary
\family default
or try to approach it.
Sync and other operations are not possible.
This state is therefore not recommended.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
PrimaryUnreachable
\family default
(cf.
\family typewriter
%is-alive{}
\family default
) A current designated primary has been set, but this host has not been
remotely updated for more than 30 seconds (see also
\family typewriter
--window=$seconds
\family default
).
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
Replaying
\family default
(catchall) None of the previous conditions have triggered.
\end_layout
\end_deeper
\end_deeper
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
replstate-1and1
\family default
A variant for internal use by 1&1 Internet AG.
See above note.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
flags
\family default
For each of disk, consistency, attach, sync, fetch, and replay, show exactly
one character.
Each character is either a capital one, or the corresponding lowercase
one, or a dash.
The meaning is as follows:
\end_layout
\begin_deeper
\begin_layout Labeling
\labelwidthstring 00.00.0000
disk/device:
\family typewriter
D
\family default
= the device
\family typewriter
/dev/mars/mydata
\family default
is present,
\family typewriter
d
\family default
= only the underlying disk
\family typewriter
/dev/lv-x/mydata
\family default
is present,
\family typewriter
-
\family default
= none present / configured.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
consistency: this relates to the
\emph on
underlying disk
\emph default
, not to
\family typewriter
/dev/mars/mydata
\family default
!
\family typewriter
C
\family default
= locally consistent,
\family typewriter
c
\family default
= maybe inconsistent (no guarantee), - = cannot determine.
Notice: this does not tell anything about
\emph on
actuality
\emph default
.
Notice: like the other flags, this flag is subject to races and therefore
should be relied on only in
\emph on
detached
\emph default
state! See also description of macro
\family typewriter
is-consistent
\family default
below.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
attach:
\family typewriter
A
\family default
= attached,
\family typewriter
a
\family default
= currently trying to attach/detach but not yet ready (intermediate state),
\family typewriter
-
\family default
= attach is switched off.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
sync:
\family typewriter
S
\family default
= sync finished,
\family typewriter
s
\family default
= currently syncing,
\family typewriter
-
\family default
= sync is switched off.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
fetch:
\family typewriter
F
\family default
= according to current knowledge, fetched logfiles are up-to-date,
\family typewriter
f
\family default
= currently fetching (some parts of) a logfile,
\family typewriter
-
\family default
= fetch is switched off.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
replay:
\family typewriter
R
\family default
= all fetched logfiles are replayed,
\family typewriter
r
\family default
= currently replaying,
\family typewriter
-
\family default
= replay is switched off.
\end_layout
\end_deeper
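\begin_deeper
\begin_layout Standard
The flag characters described above can be decoded mechanically.
The following shell sketch assumes that the macro prints exactly six characters
in the order listed above (disk, consistency, attach, sync, fetch, replay)
without any separators.
The function names
\family typewriter
flag_meaning
\family default
and
\family typewriter
describe_flags
\family default
are hypothetical.
\end_layout
```shell
# Hypothetical helper: translate one flag character of a given field
# into the meaning documented in this manual.
flag_meaning() {
    case "$1:$2" in
        disk:D)        echo "MARS device is present" ;;
        disk:d)        echo "only the underlying disk is present" ;;
        disk:-)        echo "none present / configured" ;;
        consistency:C) echo "locally consistent" ;;
        consistency:c) echo "maybe inconsistent (no guarantee)" ;;
        consistency:-) echo "cannot determine" ;;
        attach:A)      echo "attached" ;;
        attach:a)      echo "attach/detach in progress" ;;
        attach:-)      echo "attach is switched off" ;;
        sync:S)        echo "sync finished" ;;
        sync:s)        echo "currently syncing" ;;
        sync:-)        echo "sync is switched off" ;;
        fetch:F)       echo "fetched logfiles are up-to-date" ;;
        fetch:f)       echo "currently fetching" ;;
        fetch:-)       echo "fetch is switched off" ;;
        replay:R)      echo "all fetched logfiles are replayed" ;;
        replay:r)      echo "currently replaying" ;;
        replay:-)      echo "replay is switched off" ;;
        *)             echo "unexpected character '$2'" ;;
    esac
}

# Assumption: the flags string has exactly six characters, one per field,
# in the order disk, consistency, attach, sync, fetch, replay.
describe_flags() {
    flags=$1
    if [ "${#flags}" -ne 6 ]; then
        echo "expected exactly 6 flag characters" >&2
        return 1
    fi
    i=1
    for field in disk consistency attach sync fetch replay; do
        char=$(printf '%s' "$flags" | cut -c"$i")
        printf '%s: %s\n' "$field" "$(flag_meaning "$field" "$char")"
        i=$((i + 1))
    done
}
```
\begin_layout Standard
Because the flags are subject to races, such a decoder is suitable for display
purposes only.
\end_layout
\end_deeper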
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
flags-1and1
\family default
A variant for internal use by 1&1 Internet AG.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
todo-role
\family default
Shows the
\emph on
designated
\emph default
state:
\family typewriter
None
\family default
,
\family typewriter
Primary
\family default
or
\family typewriter
Secondary
\family default
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
role
\family default
Shows the
\emph on
actual
\emph default
state:
\family typewriter
None
\family default
,
\family typewriter
NotYetPrimary
\family default
,
\family typewriter
Primary
\family default
,
\family typewriter
RemainsPrimary
\family default
, or
\family typewriter
Secondary
\family default
.
Any differences to the designated state are indicated by a prefix to the
keyword
\family typewriter
Primary
\family default
:
\family typewriter
NotYet
\family default
means that it
\emph on
should
\emph default
become primary, but actually hasn't.
Vice versa,
\family typewriter
Remains
\family default
means that it
\emph on
should
\emph default
leave primary state in order to become secondary, but actually cannot do
that because the
\family typewriter
/dev/mars/mydata
\family default
device is currently in use.
\begin_inset Newline newline
\end_inset
\begin_inset Tabular
<lyxtabular version="3" rows="3" columns="3">
<features rotate="0" tabularvalignment="middle">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
%todo-primary{} == 0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
%todo-primary{} == 1
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
%is-primary{} == 0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
None
\family default
/
\family typewriter
Secondary
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
NotYetPrimary
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
%is-primary{} == 1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
RemainsPrimary
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
Primary
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
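\begin_deeper
\begin_layout Standard
The table can also be expressed as a small shell function.
This is a sketch only:
\family typewriter
role_from_switches
\family default
is a hypothetical name, and its two arguments are assumed to be the boolean
values printed by
\family typewriter
%todo-primary{}
\family default
and
\family typewriter
%is-primary{}
\family default
.
\end_layout
```shell
# Hypothetical helper: derive the role keyword from the two boolean values
# of %todo-primary{} (first argument) and %is-primary{} (second argument).
role_from_switches() {
    todo_primary=$1
    is_primary=$2
    case "$todo_primary:$is_primary" in
        0:0) echo "None/Secondary" ;;   # the two booleans cannot tell these apart
        1:0) echo "NotYetPrimary" ;;    # should be primary, but is not (yet)
        0:1) echo "RemainsPrimary" ;;   # should leave primary state, but has not
        1:1) echo "Primary" ;;
        *)   echo "invalid input" >&2; return 1 ;;
    esac
}
```
\end_deeper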
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
role-1and1
\family default
A variant for internal use by 1&1 Internet AG.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
primarynode
\family default
Display
\family typewriter
(none)
\family default
or the hostname of the designated primary.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
primarynode-1and1
\family default
A variant for internal use by 1&1 Internet AG.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
commstate
\family default
When the last metadata communication to the designated primary is longer
ago than
\family typewriter
${window}
\family default
(see also
\family typewriter
--window=
\emph on
seconds
\family default
\emph default
option), display that age in human readable form.
See also primitive macro
\family typewriter
%alive-age{}
\family default
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
syncinfo
\family default
Shows an informational progress bar when sync is running.
Intended for humans.
Scripts should not rely on any details from this.
Scripts may use this only as an
\emph on
approximate
\emph default
means for detecting progress (when comparing the
\emph on
full
\emph default
output text to a prior version and finding
\emph on
any
\emph default
difference, they may conclude that some progress has happened, however
small).
\end_layout
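\begin_deeper
\begin_layout Standard
The approximate progress check described above amounts to a plain text comparison
of two snapshots.
A minimal shell sketch, where
\family typewriter
sync_has_progressed
\family default
is a hypothetical name and both snapshots are assumed to have been captured
by the caller some seconds apart:
\end_layout
```shell
# Hypothetical helper: succeed (exit status 0) when two snapshots of the
# full syncinfo output differ in any way, which is taken as a sign of
# progress. No details of the output format are inspected.
sync_has_progressed() {
    previous=$1
    current=$2
    [ "$previous" != "$current" ]
}
```
\end_deeper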
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
syncinfo-1and1
\family default
A variant for internal use by 1&1 Internet AG.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
replinfo
\family default
Shows an informational progress bar when fetch is running.
This should not be used for scripting at all, because it contains realtime
information in human-readable form.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
replinfo-1and1
\family default
A variant for internal use by 1&1 Internet AG.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
fetch-line
\family default
Additional details, called by
\family typewriter
replinfo
\family default
.
Shows the amount of data to be fetched, as well as the current transfer
rate and a very rough estimate of the remaining duration.
When primitive macros
\family typewriter
%fetch-age{}
\family default
or
\family typewriter
%fetch-lag{}
\family default
exceed
\family typewriter
${window}
\family default
, their values are also displayed for human informational purposes.
See description of these primitive macros.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
replay-line
\family default
Additional details, called by
\family typewriter
replinfo
\family default
.
Shows the amount of data to be replayed, as well as the current replay
rate and a very rough estimate of the remaining duration.
When primitive macro
\family typewriter
%replay-age{}
\family default
exceeds
\family typewriter
${window}
\family default
, it is also displayed for human informational purposes.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
comminfo
\family default
When the network communication is in an unusual condition, display it.
Otherwise, don't produce any output.
\end_layout
\begin_layout Subsection
Predefined Primitive Macros
\begin_inset CommandInset label
LatexCommand label
name "sub:Predefined-Trivial-Macros"
\end_inset
\end_layout
\begin_layout Subsubsection
Intended for Humans
\end_layout
\begin_layout Standard
In the following, shell glob notation
\family typewriter
{a,b}
\family default
is used to document variants of similar macros in a single place.
When you actually call the macro, you must choose one of the possible variants
(excluding the braces).
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
the-err-msg
\family default
Show reported errors for a resource.
When the resource argument is missing or empty, show global error information.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
all-err-msg
\family default
As before, but show all information, including entries which are
\family typewriter
OK
\family default
.
This way, you get a list
\begin_inset Foot
status open
\begin_layout Plain Layout
The list may be extended in future versions of MARS.
\end_layout
\end_inset
of
\emph on
all
\emph default
potential error information present in the system.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{all,the}-wrn-msg
\family default
Show all / reported warnings in the system.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{all,the}-inf-msg
\family default
Show all / reported informational messages in the system.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{all,the}-msg
\family default
Show all / reported messages regardless of their classification.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{all,the}-global-msg
\family default
Show global messages not associated with any resource (the resource argument
of the
\family typewriter
marsadm
\family default
command is ignored in this case).
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{all,the}-global-{inf,wrn,err}-msg
\family default
Ditto, but more specific.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{all,the}-pretty-{global-,}{inf-,wrn-,err-,}msg
\family default
Ditto, but show numerical timestamps in a human readable form.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{all,the}-{global-,}{inf-,wrn-,err-,}count
\family default
Instead of showing the messages, show their count (number of lines).
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
errno-text
\family default
This macro takes 1 argument, which must represent a Linux
\family typewriter
errno
\family default
number, and converts it to human readable form (similar to the C
\family typewriter
strerror()
\family default
function).
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
todo-{attach,sync,fetch,replay,primary}
\family default
Shows a boolean value (0 or 1) indicating the current state of the
corresponding todo switch (whether on or off).
The meaning of todo switches is illustrated in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-State-of"
\end_inset
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
get-resource-{fat,err,wrn}
\family default
Access to the internal error status files.
This is not an official interface and may thus change at any time without
notice.
Use this only for human inspection, not for scripting!
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
These macros, as well as the error status files, are likely to disappear
in future versions of MARS.
They should be used for debugging only.
At least when merging into the upstream Linux kernel, only the
\family typewriter
*-msg
\family default
macros will likely survive.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
get-resource-{fat,err,wrn}-count
\family default
Ditto, but get the number of lines instead of the text.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
replay-code
\family default
Indicate the current state of logfile replay / recovery:
\end_layout
\begin_deeper
\begin_layout Labeling
\labelwidthstring 00.00.0000
(empty) Unknown.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
0 No replay is currently running.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
1 Replay is currently running.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
2 Replay has successfully stopped.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
<0 See Linux
\family typewriter
errno
\family default
code.
Typically this indicates a damaged logfile, or another filesystem error
at
\family typewriter
/mars
\family default
.
\end_layout
\end_deeper
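\begin_deeper
\begin_layout Standard
The value list above can be turned into readable text with a small shell function.
This is a sketch under assumptions:
\family typewriter
describe_replay_code
\family default
is a hypothetical name, and a negative value is assumed to be the negated
Linux
\family typewriter
errno
\family default
number.
\end_layout
```shell
# Hypothetical helper: explain a value printed by the replay-code macro.
# Assumption: a negative value is the negated Linux errno number.
describe_replay_code() {
    code=$1
    case "$code" in
        "")           echo "unknown" ;;
        0)            echo "no replay is running" ;;
        1)            echo "replay is running" ;;
        2)            echo "replay has successfully stopped" ;;
        -|-*[!0-9]*)  echo "unexpected value '$code'" ;;   # minus sign without a number
        -*)           echo "error, errno ${code#-}" ;;     # strip the leading minus sign
        *)            echo "unexpected value '$code'" ;;
    esac
}
```
\begin_layout Standard
The number reported in the error case could then be passed to the
\family typewriter
errno-text
\family default
macro in order to obtain a human readable description.
\end_layout
\end_deeper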
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
is-{attach,sync,fetch,replay,primary,module-loaded}
\family default
Shows a boolean value (0 or 1) indicating the
\emph on
actual
\emph default
state, whether the corresponding action has been actually carried out,
or not (yet).
Notice that the values indicated by
\family typewriter
is-*
\family default
may differ from the
\family typewriter
todo-*
\family default
values when something is not (yet) working.
More explanations can be found in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-State-of"
\end_inset
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
is-split-brain
\family default
Shows whether split brain (see section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Split-Brain-Resolution"
\end_inset
) has been detected, or not.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
is-consistent
\family default
Shows whether the
\emph on
underlying disk
\emph default
is in a locally consistent state, i.e.
whether it
\emph on
could
\emph default
be (potentially) detached and then used for read-only test-mounting
\begin_inset Foot
status open
\begin_layout Plain Layout
Notice that the
\emph on
writeback
\emph default
at the primary side is out-of-order by default, for performance reasons.
Therefore, the underlying disk is only guaranteed to be consistent when
there is no data left to be written back.
Notice that this condition is racy by construction.
When your primary node crashes during writeback and then comes up again,
you must do a
\family typewriter
modprobe mars
\family default
first in order to automatically replay the transaction logfiles, which
will automatically heal such temporary inconsistencies.
\end_layout
\end_inset
.
Don't confuse this with the consistency of
\family typewriter
/dev/mars/mydata
\family default
, which is by construction
\emph on
always
\emph default
locally consistent once it has appeared
\begin_inset Foot
status open
\begin_layout Plain Layout
Exceptions are possible when using
\family typewriter
marsadm fake-sync
\family default
.
Even in split brain situations,
\family typewriter
marsadm primary --force
\family default
tries to prevent any further potential exception as best as it can, by
not letting
\family typewriter
/dev/mars/mydata
\family default
appear and by insisting on split brain resolution first.
In future implementations, this might change if more pressure is put on
the developer to sacrifice consistency in preference to not waiting for
a full logfile replay.
\end_layout
\end_inset
.
By construction of MARS, the disk of secondaries will
\emph on
always
\emph default
remain in a locally consistent state once the initial sync has finished
as well as the initial logfile replay.
Notice that local consistency does not necessarily imply actuality (see
high-level explanation in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Behaviour-of-MARS"
\end_inset
).
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
is-emergency
\family default
Shows whether emergency mode (see section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Emergency-Mode"
\end_inset
) has been entered for the named resource, or not.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
rest-space
\family default
(global, no resource argument necessary) Shows the
\emph on
logically
\emph default
available space in
\family typewriter
/mars/
\family default
, which may deviate from the physically available space as indicated by
the
\family typewriter
df
\family default
command.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
get-{disk,device}
\family default
Show the name of the underlying disk, or of the
\family typewriter
/dev/mars/mydata
\family default
device (if it is available).
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{disk,device}-present
\family default
Show (as a boolean value) whether the underlying disk, or the
\family typewriter
/dev/mars/mydata
\family default
device, is available.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
device-opened
\family default
Show (as a number) how often
\family typewriter
/dev/mars/mydata
\family default
has actually been opened, e.g.
by
\family typewriter
mount
\family default
or by some processes like
\family typewriter
dd
\family default
, or by iSCSI, etc.
\end_layout
\begin_layout Subsubsection
Intended for Scripting
\end_layout
\begin_layout Standard
While complex macros may output a whole bunch of information, each of the following
primitive macros outputs exactly one value.
They are intended for script use (cf.
section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Scripting-HOWTO"
\end_inset
).
Of course, curious humans may also try them :)
\end_layout
\begin_layout Standard
In the following, shell glob notation
\family typewriter
{a,b}
\family default
is used to document similar variants of a macro in a single place.
When you actually call the macro, you must choose one of the possible variants
(excluding the braces).
\end_layout
\begin_layout Paragraph
Name Querying
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
cluster-members
\family default
Show a newline-separated list of all host names participating in the cluster.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
resource-members
\family default
Show a newline-separated list of all host names participating in the particular
resource
\family typewriter
%{res}
\family default
.
Notice that this may be a subset of
\family typewriter
%cluster-members{}
\family default
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{my,all}-resources
\family default
Show a newline-separated list of either all resource names existing in
the cluster, or only those where the current host
\family typewriter
%{host}
\family default
is a member.
Optionally, you may specify the hostname as a parameter, e.g.
\family typewriter
%my-resources{
\emph on
otherhost
\emph default
}
\family default
.
\end_layout
\begin_layout Paragraph
Amounts of Data Inquiry
\end_layout
\begin_layout Standard
\begin_inset Float figure
placement h
wide false
sideways false
status open
\begin_layout Plain Layout
\noindent
\align center
\begin_inset Graphics
filename images/fetch-replay-total.fig
width 80col%
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Overview of amounts / cursors
\begin_inset CommandInset label
LatexCommand label
name "fig:overview-on-amounts"
\end_inset
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\noindent
The following macros are meaningful for both primary and secondary nodes:
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
deletable-size
\family default
Show the total amount of
\emph on
locally present
\emph default
logfile data which
\emph on
could
\emph default
be deleted by
\family typewriter
marsadm log-delete-all mydata
\family default
.
This almost always differs from both
\family typewriter
replay-pos
\family default
and
\family typewriter
occupied-size
\family default
due to granularity reasons (only whole logfiles can be deleted).
Units are
\emph on
bytes
\emph default
, not kilobytes.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
occupied-size
\family default
Show the total amount of
\emph on
locally present
\emph default
logfile data (sum of all file sizes).
This is often roughly equal to
\family typewriter
fetch-pos
\family default
, but it may differ vastly (in both directions) when logfiles are not completely
transferred, when some are damaged, during split brain, after a
\family typewriter
join-resource
\family default
/
\family typewriter
invalidate
\family default
, or when the resource is in emergency mode (see section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Emergency-Mode"
\end_inset
).
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
disk-size
\family default
Show the size of the underlying local disk in bytes.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
resource-size
\family default
Show the logical size of the resource in bytes.
When this value is lower than
\family typewriter
disk-size
\family default
, you are wasting space.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
device-size
\family default
At a primary node, this may differ from
\family typewriter
resource-size
\family default
only for a very short time during the
\family typewriter
resize
\family default
operation.
At secondaries, there will be no difference.
\end_layout
\begin_layout Standard
\noindent
The following macros are only meaningful for secondary nodes.
By information theoretic limits, they can only tell what is
\emph on
locally known
\emph default
.
They
\series bold
cannot
\series default
reflect the
\begin_inset Quotes eld
\end_inset
true (global) state
\begin_inset Foot
status open
\begin_layout Plain Layout
Notice that according to Einstein's law, and according to observations by
Lamport, the concept of
\begin_inset Quotes eld
\end_inset
true state
\begin_inset Quotes erd
\end_inset
does not exist at all in a distributed system.
Anything you can know in a distributed system is always local knowledge,
which races with other (remote) knowledge, and may be outdated at
\emph on
any
\emph default
time.
\end_layout
\end_inset
\begin_inset Quotes erd
\end_inset
of a cluster, in particular during network partitions.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{sync,fetch,replay,work}-size
\family default
Show the total amount of data which is / was to be processed by either
sync, fetch, or replay.
\family typewriter
work-size
\family default
is equivalent to
\family typewriter
fetch-size
\family default
.
\family typewriter
replay-size
\family default
is equivalent to
\family typewriter
fetch-pos
\family default
(see below).
Units are
\emph on
bytes
\emph default
, not kilobytes.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{sync,fetch,replay,work}-pos
\family default
Show the total amount of data which is already processed (current
\begin_inset Quotes eld
\end_inset
cursor
\begin_inset Quotes erd
\end_inset
position).
\family typewriter
work-pos
\family default
is equivalent to
\family typewriter
replay-pos
\family default
.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
The 0% point is the
\emph on
locally contiguous
\emph default
amount of data since the last
\family typewriter
create-resource
\family default
,
\family typewriter
join-resource
\family default
, or
\family typewriter
invalidate
\family default
, or since the last emergency mode, but possibly shortened by
\family typewriter
log-delete
\family default
s.
Notice that the 0% point may be different on different cluster nodes, because
their resource history may be different or non-contiguous during split
brain, or after a
\family typewriter
join-resource
\family default
, or after
\family typewriter
invalidate
\family default
, or during / after emergency mode.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{sync,fetch,replay,work}-rest
\family default
Shows the difference between
\family typewriter
*-size
\family default
and
\family typewriter
*-pos
\family default
(amount of work to do).
\family typewriter
work-rest
\family default
is therefore the difference between
\family typewriter
fetch-size
\family default
and
\family typewriter
replay-pos
\family default
, which is the
\emph on
total
\emph default
amount of work to do (regardless of whether it still has to be fetched and/or replayed).
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{sync,fetch,replay,work}-reached
\family default
Boolean value indicating whether
\family typewriter
*-rest
\family default
dropped down to zero
\begin_inset Foot
status open
\begin_layout Plain Layout
Recall from chapter
\begin_inset CommandInset ref
LatexCommand ref
reference "chap:Use-Cases-for"
\end_inset
that MARS (in its current stage of development) only guarantees local
consistency, but cannot guarantee actuality in all imaginable situations.
Notice that a general notion of
\begin_inset Quotes eld
\end_inset
actuality
\begin_inset Quotes erd
\end_inset
is
\emph on
undefinable
\emph default
in a widely distributed system, according to Einstein's laws.
\end_layout
\begin_layout Plain Layout
Let's look at an example.
In case of a node crash, and after the node is up again, a
\family typewriter
modprobe mars
\family default
has to occur, in order to replay the transaction logs of MARS again.
However, during the preceding recovery phase, the journalling
\family typewriter
ext4
\family default
filesystem
\family typewriter
/mars/
\family default
\emph on
may
\emph default
have rolled back some internal symlink updates which have occurred immediately
before the crash.
MARS is relying on the fact that journalling filesystems like
\family typewriter
ext4
\family default
should do their recovery in a consistent way, possibly by sacrificing actuality
a little bit.
Therefore, the above macros cannot guarantee to deliver true information
about what is persisted at the moment.
\end_layout
\begin_layout Plain Layout
Notice that there are further potential caveats.
\end_layout
\begin_layout Plain Layout
In case of
\family typewriter
{sync,fetch}-reached
\family default
, MARS uses
\family typewriter
bio
\family default
callbacks resp.
\family typewriter
fdatasync()
\family default
by default, thus the underlying storage layer has
\emph on
told
\emph default
us that it
\emph on
believes
\emph default
it has committed the data in a reboot-safe way.
Whether this is
\emph on
really
\emph default
true does not depend on MARS, but on the lower layers of the storage hierarchy.
There exists hardware where this claim is known to be wrong under certain
circumstances, such as certain hard disk drives in certain modes of operation.
Please check the hardware for any violations of storage semantics under
certain circumstances such as power loss, and check information sources
like magazines about the problem area.
Please notice that such a problem, if it exists at all, is independent
from MARS.
It would also exist if you did not use MARS on the same system.
\end_layout
\end_inset
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{fetch,replay,work}-threshold-reached
\family default
Boolean value indicating whether
\family typewriter
*-rest
\family default
dropped down to
\family typewriter
%{threshold}
\family default
, which is pre-settable by the
\family typewriter
--threshold=
\emph on
size
\family default
\emph default
command line option (default is 10 MiB).
In asynchronous use cases of MARS, this should be preferred over
\family typewriter
*-reached
\family default
for
\emph on
human display
\emph default
, because it produces less flickering caused by the inevitable replication delay.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{fetch,replay,work}-almost-reached
\family default
Boolean value indicating whether
\family typewriter
*-rest
\family default
\emph on
almost
\emph default
/
\emph on
approximately
\emph default
dropped down to zero.
The default is that at least 990 permille are reached.
In asynchronous use cases of MARS, this can be preferred over
\family typewriter
*-reached
\family default
for
\emph on
human display
\emph default
only, because it produces less flickering caused by the inevitable replication
delay.
However, don't base any decisions on this!
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{sync,fetch,replay,work}-percent
\family default
The cursor position
\family typewriter
*-pos
\family default
as a percentage of
\family typewriter
*-size
\family default
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{sync,fetch,replay,work}-permille
\family default
The cursor position
\family typewriter
*-pos
\family default
as permille of
\family typewriter
*-size
\family default
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{sync,fetch,replay,work}-rate
\family default
Show the current throughput in bytes
\begin_inset Foot
status open
\begin_layout Plain Layout
Notice that the internal granularity reported by the kernel may be coarser,
such as KiB.
This interface abstracts away from kernel internals and thus presents
everything in byte units.
\end_layout
\end_inset
per second.
\family typewriter
work-rate
\family default
is the
\emph on
maximum
\emph default
of
\family typewriter
fetch-rate
\family default
and
\family typewriter
replay-rate
\family default
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{sync,fetch,replay,work}-remain
\family default
Show the
\emph on
estimated
\emph default
remaining time for completion of the respective operation.
This is just a very rough guess.
Units are seconds.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
summary-vector
\family default
Show the colon-separated CSV value
\family typewriter
%replay-pos{}:%fetch-pos{}:%fetch-size{}
\family default
.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
replay-basenr
\family default
Get currently first reachable logfile number (see figure
\begin_inset CommandInset ref
LatexCommand vref
reference "fig:overview-on-amounts"
\end_inset
).
Only for curious humans or for debugging / monitoring - don't base any
decisions on this.
Use the
\family typewriter
*-{pos,size}
\family default
macros instead.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{replay,fetch,work}-lognr
\family default
Get current logfile number of replay or fetch position, or of the currently
known last reachable number (see figure
\begin_inset CommandInset ref
LatexCommand vref
reference "fig:overview-on-amounts"
\end_inset
).
Only for curious humans or for debugging / monitoring - don't base any
decisions on this.
Use the
\family typewriter
*-{pos,size}
\family default
macros instead.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{replay,fetch,work}-logcount
\family default
Get current number of logfiles which are already replayed, or are already
fetched, or are to be applied in total (see figure
\begin_inset CommandInset ref
LatexCommand vref
reference "fig:overview-on-amounts"
\end_inset
).
Only for curious humans or for debugging / monitoring - don't base any
decisions on this.
Use the
\family typewriter
*-{rest}
\family default
macros instead.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
alive-timestamp
\family default
Tell the Lamport Unix timestamp (seconds since 1970) of the last metadata
communication to the designated primary (or to any other host given by
the first argument).
Returns
\begin_inset Formula $-1$
\end_inset
if no such host exists.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{fetch,replay,work}-timestamp
\family default
Tell the Lamport Unix timestamp (seconds since 1970) when the last progress
has been made.
When no such action exists,
\begin_inset Formula $-1$
\end_inset
is returned.
\family typewriter
%work-timestamp{
\emph on
hostname
\emph default
}
\family default
is the maximum of
\family typewriter
%fetch-timestamp{
\emph on
hostname
\emph default
}
\family default
and
\family typewriter
%replay-timestamp{
\emph on
hostname
\emph default
}
\family default
.
When the parameter
\family typewriter
\emph on
hostname
\family default
\emph default
is empty, the local host will be reported (default).
Example usage:
\family typewriter
marsadm view all --macro=
\begin_inset Quotes erd
\end_inset
%replay-timestamp{%todo-primary{}}
\begin_inset Quotes erd
\end_inset
\family default
shows the timestamp of the last reported
\begin_inset Foot
status open
\begin_layout Plain Layout
Updates of this information are occurring with lower frequency than actual
writebacks, for performance reasons.
The metadata network update protocol will add further delays.
Therefore, the accuracy is only in the range of minutes.
\end_layout
\end_inset
writeback action at the designated primary.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{alive,fetch,replay,work}-age
\family default
Tell the number of seconds since the last respective action, or
\begin_inset Formula $-1$
\end_inset
if none exists.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
{alive,fetch,replay,work}-lag
\family default
Report the time difference (in seconds) between the last
\emph on
known
\emph default
action at the local host and at the designated primary (or between any
other hosts when 2 parameters are given).
Returns
\begin_inset Formula $-1$
\end_inset
if no such action exists at any of the two hosts.
Attention! This need not reflect the
\emph on
actual
\emph default
state in case of networking problems.
Don't draw wrong conclusions from a high
\family typewriter
{fetch,replay}-lag
\family default
value: it could also mean that simply no write operation at all has occurred
at the primary side for a long time.
Conversely, a low lag value does not imply that the replication is recent:
it may refer to
\emph on
different
\emph default
write operations at each of the hosts; therefore it only tells that
\emph on
some
\emph default
progress has been made, but says nothing about the amount of the progress.
\end_layout
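\begin_layout Standard
\noindent
The relationships between the size, position, rest, percent and permille
macros described above can be summarised in a small Python model.
This is only an illustrative sketch and not the marsadm implementation:
the function names, the integer rounding and the handling of a zero size
are assumptions.
Only the formulas themselves (rest is size minus position, work-rest is
fetch-size minus replay-pos, the default threshold of 10 MiB and the default
of 990 permille) are taken from the descriptions above.
\end_layout
\begin_layout LyX-Code
```python
# Illustrative model only; names, rounding and the zero-size case are assumptions.
MIB = 1024 * 1024


def rest(size, pos):
    """Amount of work still to do: *-size minus *-pos."""
    return size - pos


def permille(size, pos):
    """Cursor position as permille of size (integer rounding assumed)."""
    if size <= 0:
        return 1000  # assumption: nothing to do counts as complete
    return pos * 1000 // size


def percent(size, pos):
    """Cursor position as a percentage of size (integer rounding assumed)."""
    if size <= 0:
        return 100  # assumption, see above
    return pos * 100 // size


def reached(size, pos):
    """True when the rest has dropped down to zero."""
    return rest(size, pos) <= 0


def threshold_reached(size, pos, threshold=10 * MIB):
    """True when the rest has dropped down to the threshold."""
    return rest(size, pos) <= threshold


def almost_reached(size, pos, limit=990):
    """True when at least `limit` permille are reached."""
    return permille(size, pos) >= limit


# Hypothetical values in the order replay-pos, fetch-pos, fetch-size.
replay_pos, fetch_pos, fetch_size = 950 * MIB, 990 * MIB, 1000 * MIB
work_rest = rest(fetch_size, replay_pos)  # fetch-size minus replay-pos
print(work_rest // MIB, permille(fetch_size, fetch_pos))
```
\end_layout
\begin_layout Standard
\noindent
With these hypothetical values, 50 MiB of work remain in total, so the
threshold variant with its default of 10 MiB is not yet satisfied for the
work cursor, while the fetch cursor has reached 990 permille.
\end_layout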
\begin_layout Paragraph
Misc Informational Status
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
get-primary
\family default
Return the name of the current designated primary node as locally known.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
actual-primary
\family default
(deprecated) try to determine the name of the node which
\emph on
appears
\emph default
to be the actual primary.
This is only a
\series bold
\emph on
guess
\series default
\emph default
, because it is not generally unique in split brain situations! Don't use
this macro.
Instead, use
\family typewriter
is-primary
\family default
on those nodes you are interested in.
The explanations from section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-State-of"
\end_inset
also apply to
\family typewriter
get-primary
\family default
versus
\family typewriter
actual-primary
\family default
analogously.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
is-alive
\family default
Boolean value indicating whether all other nodes participating in
\family typewriter
mydata
\family default
are reachable / healthy.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
uuid
\family default
(global) Show the unique identifier created by
\family typewriter
create-cluster
\family default
or by
\family typewriter
create-uuid
\family default
.
Hint: this is immutable, and it is firmly bound to the
\family typewriter
/mars/
\family default
filesystem.
It can only be destroyed by deleting the whole filesystem (see section
\begin_inset CommandInset ref
LatexCommand ref
reference "leave-cluster"
\end_inset
).
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
tree
\family default
(global) Indicate symlink tree version (see section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-Symlink-Tree"
\end_inset
).
\end_layout
\begin_layout Paragraph
Experts Only
\end_layout
\begin_layout Standard
The following is for hackers who know what they are doing.
It is not officially supported.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
wait-{is,todo}-{attach,sync,fetch,replay,primary}-{on,off}
\family default
This may be used to program some useful waiting conditions in advanced
macro scripts.
Use at your own risk!
\end_layout
\begin_layout Section
Creating your own Macros
\begin_inset CommandInset label
LatexCommand label
name "sub:Creating-your-own"
\end_inset
\end_layout
\begin_layout Standard
In order to create your own macros, you could start writing them from scratch
with your favorite ASCII text editor.
However, it is much easier to take an existing macro and to customize it
to your needs.
In addition, you can learn something about macro programming by looking
at the existing macro code.
\end_layout
\begin_layout Standard
Go to a new empty directory and say
\end_layout
\begin_layout Itemize
\family typewriter
marsadm dump-macros
\end_layout
\begin_layout Standard
in order to get the most interesting complex macros, or say
\end_layout
\begin_layout Itemize
\family typewriter
marsadm dump-all-macros
\end_layout
\begin_layout Standard
in order to additionally get some primitive macros which could be customized
if needed.
This will write lots of files
\family typewriter
*.tpl
\family default
into your current working directory.
\end_layout
\begin_layout Standard
Any modified or new macro file should be placed either into the current working
directory
\family typewriter
./
\family default
, or into
\family typewriter
$HOME/.marsadm/
\family default
, or into
\family typewriter
/etc/marsadm/
\family default
.
They will be searched in this order, and the first match will win.
When no macro file is found, the built-in version will be used if it exists.
This way, you may override built-in macros.
\end_layout
\begin_layout Standard
Example: if you have a file
\family typewriter
./mymacro.tpl
\family default
you just need to say
\family typewriter
marsadm view-mymacro mydata
\family default
in order to invoke it in the resource context
\family typewriter
mydata
\family default
.
\end_layout
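\begin_layout Standard
The search order described above can be modelled as follows.
This is a minimal Python sketch for illustration, not code taken from marsadm:
the function name
\family typewriter
find_macro_file
\family default
 and its parameters are hypothetical, and returning
\family typewriter
None
\family default
 merely stands for falling back to the built-in version, if one exists.
\end_layout
\begin_layout LyX-Code
```python
import os
import tempfile


def find_macro_file(name, cwd, home, etc_dir="/etc/marsadm"):
    """Return the first existing NAME.tpl in the documented search order.

    None stands for "fall back to the built-in macro, if one exists".
    """
    candidates = [
        os.path.join(cwd, name + ".tpl"),               # ./
        os.path.join(home, ".marsadm", name + ".tpl"),  # $HOME/.marsadm/
        os.path.join(etc_dir, name + ".tpl"),           # /etc/marsadm/
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path  # first match wins
    return None


# Small demonstration in throw-away directories.
with tempfile.TemporaryDirectory() as cwd, tempfile.TemporaryDirectory() as home:
    os.mkdir(os.path.join(home, ".marsadm"))
    for directory in (cwd, os.path.join(home, ".marsadm")):
        with open(os.path.join(directory, "mymacro.tpl"), "w") as f:
            f.write("example")
    found = find_macro_file("mymacro", cwd, home)
    print(found == os.path.join(cwd, "mymacro.tpl"))
```
\end_layout
\begin_layout Standard
In the demonstration, the same file name exists in two places, and the copy
in the current working directory takes precedence.
\end_layout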
\begin_layout Subsection
General Macro Syntax
\end_layout
\begin_layout Standard
Macros are simple ASCII text, enriched with calls to other macros.
\end_layout
\begin_layout Standard
ASCII text outside of comments is copied to the output verbatim.
Comments are skipped.
Comments may have one of the following well-known forms:
\end_layout
\begin_layout Itemize
\family typewriter
# skipped text until / including next newline character
\end_layout
\begin_layout Itemize
\family typewriter
// skipped text until / including next newline character
\end_layout
\begin_layout Itemize
\family typewriter
/* skipped text including any newline characters */
\end_layout
\begin_layout Itemize
denoted as Perl regex:
\family typewriter
\backslash
\backslash
\backslash
n
\backslash
s*
\family default
(single backslash directly followed by a newline character, and eating up
any whitespace characters at the beginning of the next line).
Hint: this may be fruitfully used to structure macros in a more readable
form / indentation.
\end_layout
\begin_layout Standard
Special characters are always initiated by a backslash.
The following pre-defined special character sequences are recognized:
\end_layout
\begin_layout Itemize
\family typewriter
\backslash
n
\family default
newline
\end_layout
\begin_layout Itemize
\family typewriter
\backslash
r
\family default
return (useful for DOS compatibility)
\end_layout
\begin_layout Itemize
\family typewriter
\backslash
t
\family default
tab
\end_layout
\begin_layout Itemize
\family typewriter
\backslash
f
\family default
formfeed
\end_layout
\begin_layout Itemize
\family typewriter
\backslash
b
\family default
backspace
\end_layout
\begin_layout Itemize
\family typewriter
\backslash
a
\family default
alarm (bell)
\end_layout
\begin_layout Itemize
\family typewriter
\backslash
e
\family default
escape (e.g.
for generating ANSI escape sequences)
\end_layout
\begin_layout Itemize
\family typewriter
\backslash
\family default
followed by anything else: assure that the next character is taken verbatim.
Although possible, please don't use this for escaping letters, because
further escape sequences might be pre-defined in future.
Best practice is to use this only for escaping the backslash itself, or
for escaping the percent sign when you don't want to call a macro (protect
against evaluation), or to escape a brace directly after a macro call (verbatim
brace not to be interpreted as a macro parameter).
\end_layout
\begin_layout Itemize
All other characters stand for themselves.
If you like, you should be able to produce XML, HTML, JSON and other ASCII-based
output formats this way.
\end_layout
\begin_layout Standard
Macro calls have the following syntax:
\end_layout
\begin_layout Itemize
\family typewriter
%
\emph on
macroname
\emph default
{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}{
\emph on
argn
\emph default
}
\end_layout
\begin_layout Itemize
Of course, arguments may be empty, denoted as
\family typewriter
{}
\end_layout
\begin_layout Itemize
It is possible to supply more arguments than required.
These are simply ignored.
\end_layout
\begin_layout Itemize
There must always be at least one argument, even for parameterless macros.
In such a case, it is good style to leave it empty (even if it is actually
ignored).
Just write
\family typewriter
%parameterlessmacro{}
\family default
in such a case.
\end_layout
\begin_layout Itemize
\family typewriter
%{
\emph on
varname
\emph default
}
\family default
syntax: As a special case, the macro name may be empty, but then the first
argument must denote a previously defined variable (such as assigned via
\family typewriter
%let{varname}{myvalue}
\family default
, or a pre-defined standard variable like
\family typewriter
%{res}
\family default
for the current resource name, see later paragraph
\begin_inset CommandInset ref
LatexCommand ref
reference "par:Predefined-Variables"
\end_inset
).
\end_layout
\begin_layout Itemize
Of course, macro calls may be (almost) arbitrarily nested.
\end_layout
\begin_layout Itemize
Of course, the
\emph on
correctness
\emph default
of nesting of braces must be generally obeyed, as usual in any other macro
processor language.
General rule: for each opening brace, there must be exactly one closing
brace somewhere afterwards.
\end_layout
\begin_layout Standard
These rules are hopefully simple and intuitive.
There are currently no exceptions.
In particular, there is no special infix operator syntax for arithmetic
expressions, and therefore no operator precedence rules are necessary.
You have to write nested arithmetic expressions always in the above prefix
syntax, like
\family typewriter
%*{7}{%+{2}{3}}
\family default
(similar to non-reverse Polish notation, i.e. prefix notation).
\end_layout
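\begin_layout Standard
To illustrate how such nested prefix expressions are evaluated, here is
a toy evaluator written in Python.
It is only a sketch under stated assumptions and is not the marsadm parser:
it handles nothing but non-negative integer literals and the three operators
+, - and *, and all names in it are hypothetical.
Calls with more than two arguments are folded from left to right.
\end_layout
\begin_layout LyX-Code
```python
# Toy evaluator for nested prefix calls such as %*{7}{%+{2}{3}}.
# It only knows +, - and * on integers; it is NOT the marsadm parser.
import operator
from functools import reduce

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def parse(text, i=0):
    """Parse one expression starting at text[i]; return (value, next_index)."""
    if text[i] == "%":
        op = text[i + 1]
        i += 2
        args = []
        while i < len(text) and text[i] == "{":
            value, i = parse(text, i + 1)
            if i >= len(text) or text[i] != "}":
                raise ValueError("missing closing brace")
            i += 1
            args.append(value)
        # More than two arguments are folded in sequence, left to right.
        return reduce(OPS[op], args), i
    start = i
    while i < len(text) and text[i].isdigit():
        i += 1
    return int(text[start:i]), i


def evaluate(text):
    """Evaluate a complete expression and reject trailing characters."""
    value, end = parse(text)
    if end != len(text):
        raise ValueError("trailing characters")
    return value


print(evaluate("%*{7}{%+{2}{3}}"))  # prints 35
```
\end_layout
\begin_layout Standard
The inner call is evaluated first and yields 5, which then becomes the second
argument of the multiplication.
No precedence rules are needed because the braces already fix the evaluation
order.
\end_layout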
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
When deeply nesting macros and their braces, you may easily feel as if you were
back in the good old days of Lisp.
Use the above backslash-newline syntax to indent your macros in a readable
and structured way.
Fortunately, modern text editors like (x)emacs or vim have modes for dealing
with the correctness of nested braces.
\end_layout
\begin_layout Subsection
Calling Builtin / Primitive Macros
\end_layout
\begin_layout Standard
Primitive macros can be called in two alternate forms:
\end_layout
\begin_layout Itemize
\family typewriter
%primitive-
\emph on
macroname
\emph default
{
\emph on
something
\emph default
}
\end_layout
\begin_layout Itemize
\family typewriter
%
\emph on
macroname
\emph default
{
\emph on
something
\emph default
}
\end_layout
\begin_layout Standard
When using the
\family typewriter
%primitive-*{}
\family default
form, you
\emph on
explicitly disallow
\emph default
interception of the call by a
\family typewriter
*.tpl
\family default
file.
Otherwise, you may override the standard definition even of primitive macros
by your own template files.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Notice that
\family typewriter
%call{}
\family default
conventions are used in such a case.
The parameters are passed via
\family typewriter
%{0}
\family default
\begin_inset Formula $\ldots$
\end_inset
\family typewriter
%{n}
\family default
variables (see description below).
\end_layout
\begin_layout Paragraph
Standard MARS State Inspection Macros
\end_layout
\begin_layout Standard
These are already described in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Predefined-Trivial-Macros"
\end_inset
.
When calling one of them, the call will simply expand to the corresponding
value.
\end_layout
\begin_layout Standard
Example:
\family typewriter
%get-primary{}
\family default
will expand to the hostname of the current designated primary node.
\end_layout
\begin_layout Paragraph
Further MARS State Inspection Macros
\end_layout
\begin_layout Paragraph
Variable Access Macros
\end_layout
\begin_layout Itemize
\family typewriter
%let{
\emph on
varname
\emph default
}{
\emph on
expression
\emph default
}
\family default
Evaluates both
\family typewriter
\emph on
varname
\family default
\emph default
and the
\family typewriter
\emph on
expression
\family default
\emph default
.
The
\family typewriter
\emph on
expression
\family default
\emph default
is then assigned to
\family typewriter
varname
\family default
.
\end_layout
\begin_layout Itemize
\family typewriter
%let{
\emph on
varname
\emph default
}{
\emph on
expression
\emph default
}
\family default
Evaluates both
\family typewriter
\emph on
varname
\family default
\emph default
and the
\family typewriter
\emph on
expression
\family default
\emph default
.
The
\family typewriter
\emph on
expression
\family default
\emph default
is then appended to
\family typewriter
varname
\family default
(concatenation).
\end_layout
\begin_layout Itemize
\family typewriter
%{
\emph on
varname
\emph default
}
\family default
Evaluates
\family typewriter
\emph on
varname
\family default
\emph default
, and outputs the value of the corresponding variable.
When the variable does not exist, the empty string is returned.
\end_layout
\begin_layout Itemize
\family typewriter
%{++}{
\emph on
varname
\emph default
}
\family default
or
\family typewriter
%{
\emph on
varname
\emph default
}{++}
\family default
Has the obvious well-known side effect e.g.
from C or Java.
You may also use
\family typewriter
--
\family default
instead of
\family typewriter
++
\family default
.
This is handy for programming loops (see below).
\end_layout
\begin_layout Itemize
\family typewriter
%dump-vars{}
\family default
Writes all currently defined variables (from the currently active scope)
to
\family typewriter
stderr
\family default
.
This is handy for debugging.
\end_layout
\begin_layout Paragraph
CSV Array Macros
\end_layout
\begin_layout Itemize
\family typewriter
%{
\emph on
varname
\emph default
}{
\emph on
delimiter
\emph default
}{
\emph on
index
\emph default
}
\family default
Evaluates all arguments.
The contents of
\family typewriter
\emph on
varname
\family default
\emph default
is interpreted as a list whose elements are separated by
\family typewriter
\emph on
delimiter
\family default
\emph default
.
The
\family typewriter
\emph on
index
\family default
\emph default
'th list element is returned.
\end_layout
\begin_layout Itemize
\family typewriter
%set{
\emph on
varname
\emph default
}{
\emph on
delimiter
\emph default
}{
\emph on
index
\emph default
}{
\emph on
expression
\emph default
}
\family default
Evaluates all arguments.
The contents of the old
\family typewriter
\emph on
varname
\family default
\emph default
is interpreted as a comma-separated list, delimited by
\family typewriter
\emph on
delimiter
\family default
\emph default
.
The
\family typewriter
\emph on
index
\family default
\emph default
'th list element is then assigned to, or substituted by,
\family typewriter
\emph on
expression
\family default
\emph default
.
\end_layout
\begin_layout Paragraph
Arithmetic Expression Macros
\end_layout
\begin_layout Standard
The following macros can also take more than two arguments, in which case
the corresponding arithmetic operation is carried out in sequence (whether
the associative law holds depends on the operator).
\end_layout
\begin_layout Itemize
\family typewriter
%+{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Evaluates the arguments, interprets them as numbers, and adds them together.
\end_layout
\begin_layout Itemize
\family typewriter
%-{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Subtraction.
\end_layout
\begin_layout Itemize
\family typewriter
%*{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Multiplication.
\end_layout
\begin_layout Itemize
\family typewriter
%/{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Division.
\end_layout
\begin_layout Itemize
\family typewriter
%%{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Modulus.
\end_layout
\begin_layout Itemize
\family typewriter
%&{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Bitwise Binary And.
\end_layout
\begin_layout Itemize
\family typewriter
%|{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Bitwise Binary Or.
\end_layout
\begin_layout Itemize
\family typewriter
%^{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Bitwise Binary Exclusive Or.
\end_layout
\begin_layout Itemize
\family typewriter
%<<{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Binary Shift Left.
\end_layout
\begin_layout Itemize
\family typewriter
%>>{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Binary Shift Right.
\end_layout
\begin_layout Itemize
\family typewriter
%min{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Compute the arithmetic minimum of the arguments.
\end_layout
\begin_layout Itemize
\family typewriter
%max{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Compute the arithmetic maximum of the arguments.
\end_layout
\begin_layout Paragraph
Boolean Condition Macros
\end_layout
\begin_layout Itemize
\family typewriter
%=={
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Numeric Equality.
\end_layout
\begin_layout Itemize
\family typewriter
%!={
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Numeric Inequality.
\end_layout
\begin_layout Itemize
\family typewriter
%<{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Numeric Less Than.
\end_layout
\begin_layout Itemize
\family typewriter
%<={
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Numeric Less or Equal.
\end_layout
\begin_layout Itemize
\family typewriter
%>{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Numeric Greater Than.
\end_layout
\begin_layout Itemize
\family typewriter
%>={
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Numeric Greater or Equal.
\end_layout
\begin_layout Itemize
\family typewriter
%eq{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
\begin_inset space ~
\end_inset
String Equality.
\end_layout
\begin_layout Itemize
\family typewriter
%ne{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
String Inequality.
\end_layout
\begin_layout Itemize
\family typewriter
%lt{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
String Less Than.
\end_layout
\begin_layout Itemize
\family typewriter
%le{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
String Less or Equal.
\end_layout
\begin_layout Itemize
\family typewriter
%gt{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
String Greater Than.
\end_layout
\begin_layout Itemize
\family typewriter
%ge{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
String Greater or Equal.
\end_layout
\begin_layout Itemize
\family typewriter
%=~{
\emph on
string
\emph default
}{
\emph on
regex
\emph default
}{
\emph on
opts
\emph default
}
\family default
or
\family typewriter
%match{
\emph on
string
\emph default
}{
\emph on
regex
\emph default
}{
\emph on
opts
\emph default
}
\family default
Checks whether
\family typewriter
\emph on
string
\family default
\emph default
matches the Perl regular expression
\family typewriter
\emph on
regex
\family default
\emph default
.
Modifiers can be given via
\family typewriter
\emph on
opts
\family default
\emph default
.
\end_layout
\begin_layout Paragraph
Short-Circuit Evaluation Operators
\end_layout
\begin_layout Standard
The following operators evaluate their arguments only when needed (like
in C).
\end_layout
\begin_layout Itemize
\family typewriter
%&&{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Logical And.
\end_layout
\begin_layout Itemize
\family typewriter
%and{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Alias for
\family typewriter
%&&{}
\family default
.
\end_layout
\begin_layout Itemize
\family typewriter
%||{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Logical Or.
\end_layout
\begin_layout Itemize
\family typewriter
%or{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}
\family default
Alias for
\family typewriter
%||{}
\family default
.
\end_layout
\begin_layout Paragraph
Unary Operators
\end_layout
\begin_layout Itemize
\family typewriter
%!{
\emph on
arg
\emph default
}
\family default
Logical Not.
\end_layout
\begin_layout Itemize
\family typewriter
%not{
\emph on
arg
\emph default
}
\family default
Alias for
\family typewriter
%!{}
\family default
.
\end_layout
\begin_layout Itemize
\family typewriter
%~{
\emph on
arg
\emph default
}
\family default
Bitwise Negation.
\end_layout
\begin_layout Paragraph
String Functions
\end_layout
\begin_layout Itemize
\family typewriter
%length{
\emph on
string
\emph default
}
\family default
Return the number of ASCII characters present in
\family typewriter
\emph on
string
\family default
\emph default
.
\end_layout
\begin_layout Itemize
\family typewriter
%toupper{
\emph on
string
\emph default
}
\family default
Return all ASCII characters converted to uppercase.
\end_layout
\begin_layout Itemize
\family typewriter
%tolower{
\emph on
string
\emph default
}
\family default
Return all ASCII characters converted to lowercase.
\end_layout
\begin_layout Itemize
\family typewriter
%append{
\emph on
varname
\emph default
}{
\emph on
string
\emph default
}
\family default
Equivalent to
\family typewriter
%let{
\emph on
varname
\emph default
}{%{
\emph on
varname
\emph default
}
\emph on
string
\emph default
}
\family default
.
\end_layout
\begin_layout Itemize
\family typewriter
%subst{
\emph on
string
\emph default
}{
\emph on
regex
\emph default
}{
\emph on
subst
\emph default
}{
\emph on
opts
\emph default
}
\family default
Perl regex substitution.
\end_layout
\begin_layout Itemize
\family typewriter
%sprintf{
\emph on
fmt
\emph default
}{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}{
\emph on
argn
\emph default
}
\family default
Perl
\family typewriter
sprintf()
\family default
operator.
For details, see the Perl manual.
\end_layout
\begin_layout Itemize
\family typewriter
%human-number{
\emph on
unit
\emph default
}{
\emph on
delim
\emph default
}{
\emph on
unit-sep
\emph default
}{
\emph on
number
\emph default
1}{
\emph on
number
\emph default
2}
\begin_inset Formula $\ldots$
\end_inset
\family default
Convert a number or a list of numbers into human-readable
\family typewriter
B
\family default
,
\family typewriter
KiB
\family default
,
\family typewriter
MiB
\family default
,
\family typewriter
GiB
\family default
,
\family typewriter
TiB
\family default
, as given by
\family typewriter
\emph on
unit
\family default
\emph default
.
When
\family typewriter
\emph on
unit
\family default
\emph default
is empty, a reasonable unit will be guessed automatically from the maximum
of all given numbers.
A single result string is produced, where multiple numbers are separated
by
\family typewriter
\emph on
delim
\family default
\emph default
when necessary.
When
\family typewriter
\emph on
delim
\family default
\emph default
is empty, the slash symbol
\family typewriter
/
\family default
is used by default (the most obvious use case is result strings like
\family typewriter
\begin_inset Quotes eld
\end_inset
17/32 KiB
\begin_inset Quotes erd
\end_inset
\family default
).
The final unit text is separated from the previous number(s) by
\family typewriter
\emph on
unit-sep
\family default
\emph default
.
When
\family typewriter
\emph on
unit-sep
\family default
\emph default
is empty, a single blank is used by default.
\end_layout
\begin_layout Itemize
\family typewriter
%human-seconds{
\emph on
number
\emph default
}
\family default
Convert the given number of seconds into
\family typewriter
hh:mm:ss
\family default
format.
\end_layout
\begin_layout Paragraph
Complex Helper Macros
\end_layout
\begin_layout Itemize
\family typewriter
%progress{20}
\family default
Return a string containing a progress bar showing the values from
\family typewriter
%summary-vector{}
\family default
.
The default width is 20 characters plus two braces.
\end_layout
\begin_layout Itemize
\family typewriter
%progress{20}{
\emph on
minvalue
\emph default
}{
\emph on
midvalue
\emph default
}{
\emph on
maxvalue
\emph default
}
\family default
Instead of taking the values from
\family typewriter
%summary-vector{}
\family default
, use the supplied values.
\family typewriter
minvalue
\family default
and
\family typewriter
midvalue
\family default
indicate two different intermediate points, while
\family typewriter
maxvalue
\family default
will determine the 100% point.
\end_layout
\begin_layout Paragraph
Control Flow Macros
\end_layout
\begin_layout Itemize
\family typewriter
%if{
\emph on
expression
\emph default
}{
\emph on
then-part
\emph default
}
\family default
or
\family typewriter
%if{
\emph on
expression
\emph default
}{
\emph on
then-part
\emph default
}{
\emph on
else-part
\emph default
}
\family default
Like in any other macro or programming language, this evaluates the
\family typewriter
expression
\family default
once, not copying its outcome to the output.
If the result is non-empty and is not a string denoting the number
\family typewriter
0
\family default
, the
\family typewriter
\emph on
then-part
\family default
\emph default
is evaluated and copied to the output.
Otherwise, the
\family typewriter
else-part
\family default
is evaluated and copied, provided that one exists.
\end_layout
\begin_layout Itemize
\family typewriter
%unless{
\emph on
expression
\emph default
}{
\emph on
then-part
\emph default
}
\family default
or
\family typewriter
%unless{
\emph on
expression
\emph default
}{
\emph on
then-part
\emph default
}{
\emph on
else-part
\emph default
}
\family default
Like
\family typewriter
%if{}
\family default
, but the expression is logically negated.
Essentially, this is a shorthand for
\family typewriter
%if{%not{expression}}{...}
\family default
or similar.
\end_layout
\begin_layout Itemize
\family typewriter
%elsif{
\emph on
expr1
\emph default
}{
\emph on
then1
\emph default
}{
\emph on
expr2
\emph default
}{
\emph on
then2
\emph default
}
\family default
\begin_inset Formula $\ldots$
\end_inset
or
\family typewriter
%elsif{
\emph on
expr1
\emph default
}{
\emph on
then1
\emph default
}{
\emph on
expr2
\emph default
}{
\emph on
then2
\emph default
}
\family default
\begin_inset Formula $\ldots$
\end_inset
\family typewriter
{
\emph on
odd-else-part
\emph default
}
\family default
This is for simplification of boring if-else-if chains.
The classical if-syntax (as shown above) has the drawback that inner if-parts
need to be nested into outer else-parts, so rather deep nestings may occur
when you are programming longer chains.
This is an alternate syntax for avoidance of deep nesting.
When giving an odd number of arguments, the last argument is taken as final
else-part.
\end_layout
\begin_layout Itemize
\family typewriter
%elsunless
\family default
\begin_inset Formula $\ldots$
\end_inset
Like
\family typewriter
%elsif
\family default
, but
\emph on
all
\emph default
conditions are negated.
\end_layout
\begin_layout Itemize
\family typewriter
%while{
\emph on
expression
\emph default
}{
\emph on
body
\emph default
}
\family default
Evaluates the
\family typewriter
\emph on
expression
\family default
\emph default
in a while loop, like in any other macro or programming language.
The
\family typewriter
\emph on
body
\family default
\emph default
is evaluated exactly as many times as the
\family typewriter
\emph on
expression
\family default
\emph default
holds.
Notice that endless loops can only be avoided by calling a non-pure macro
inspecting external state information, or by creating (and checking) another
side effect somewhere, such as assigning to a variable.
\end_layout
\begin_layout Itemize
\family typewriter
%until{
\emph on
expression
\emph default
}{
\emph on
body
\emph default
}
\family default
Like
\family typewriter
%while{
\emph on
expression
\emph default
}{
\emph on
body
\emph default
}
\family default
, but negate the expression.
\end_layout
\begin_layout Itemize
\family typewriter
%for{
\emph on
expr1
\emph default
}{
\emph on
expr2
\emph default
}{
\emph on
expr3
\emph default
}{
\emph on
body
\emph default
}
\family default
As you will expect from the corresponding C, Perl, Java, or (add your favorite
language) construct.
Only the syntactic sugar is a little bit different.
\end_layout
\begin_layout Itemize
\family typewriter
%foreach{
\emph on
varname
\emph default
}{
\emph on
CSV-delimited-string
\emph default
}{
\emph on
delimiter
\emph default
}{
\emph on
body
\emph default
}
\family default
As you can expect from similar
\family typewriter
foreach
\family default
constructs in other languages like Perl.
Currently, the macro processor has no arrays, but can use comma-separated
strings as a substitute.
\end_layout
\begin_layout Itemize
\family typewriter
%eval{
\emph on
count
\emph default
}{
\emph on
body
\emph default
}
\family default
Evaluates the
\family typewriter
\emph on
body
\family default
\emph default
exactly as many times as indicated by the numeric argument
\family typewriter
\emph on
count
\family default
\emph default
.
This may be used to re-evaluate the output of other macros once again.
\end_layout
\begin_layout Itemize
\family typewriter
%protect{
\emph on
body
\emph default
}
\family default
Equivalent to
\family typewriter
%eval{0}{
\emph on
body
\emph default
}
\family default
, which means that the body is not evaluated at all, but copied to the output
verbatim
\begin_inset Foot
status open
\begin_layout Plain Layout
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
TeX
\end_layout
\end_inset
\begin_inset space ~
\end_inset
or
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
LaTeX
\end_layout
\end_inset
\begin_inset space ~
\end_inset
fans usually know what this is good for ;)
\end_layout
\end_inset
.
\end_layout
\begin_layout Itemize
\family typewriter
%eval-down{
\emph on
body
\emph default
}
\family default
Evaluates the
\family typewriter
\emph on
body
\family default
\emph default
in a loop until the result does not change any more
\begin_inset Foot
status open
\begin_layout Plain Layout
Mathematicians knowing Banach's fixed-point theorem will know what this is
good for ;)
\end_layout
\end_inset
.
\end_layout
\begin_layout Itemize
\family typewriter
%tmp{
\emph on
body
\emph default
}
\family default
Evaluates the
\family typewriter
\emph on
body
\family default
\emph default
once in a temporary scope which is thrown away afterwards.
\end_layout
\begin_layout Itemize
\family typewriter
%call{
\emph on
macroname
\emph default
}{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}{
\emph on
argn
\emph default
}
\family default
Like in many other macro languages, this evaluates the named macro in a
new scope.
This means that any side effects produced by the called macro, such as
variable assignments, will be reverted after the call, and therefore not
influence the old scope.
However, notice that the arguments
\family typewriter
\emph on
arg1
\family default
\emph default
to
\family typewriter
\emph on
argn
\family default
\emph default
are evaluated in the
\emph on
old
\emph default
scope before the call actually happens (possibly producing side effects
if they contain some), and their result is respectively assigned to
\family typewriter
%{1}
\family default
through
\family typewriter
%{
\emph on
n
\emph default
}
\family default
in the new scope, analogously to the Shell or to Perl.
In addition, the new
\family typewriter
%{0}
\family default
gets the
\family typewriter
\emph on
macroname
\family default
\emph default
.
Notice that the argument evaluation happens non-lazily in the old scope
and therefore differs from other macro processors like
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
TeX
\end_layout
\end_inset
.
\end_layout
\begin_layout Itemize
\family typewriter
%include{
\emph on
macroname
\emph default
}{
\emph on
arg1
\emph default
}{
\emph on
arg2
\emph default
}{
\emph on
argn
\emph default
}
\family default
Like
\family typewriter
%call{}
\family default
, but evaluates the named macro in the
\emph on
current
\emph default
scope (similar to the
\family typewriter
source
\family default
command of the Bourne shell).
This means that any side effects produced by the called macro, such as
variable assignments, will
\emph on
not
\emph default
be reverted after the call.
Even the
\family typewriter
%{0}
\family default
through
\family typewriter
%{
\emph on
n
\emph default
}
\family default
variables will continue to exist (and may lead to confusion if you aren't
aware of that).
\end_layout
\begin_layout Itemize
\family typewriter
%callstack{}
\family default
Useful for debugging: show the current chain of macro invocations.
\end_layout
\begin_layout Paragraph
Time Handling Macros
\end_layout
\begin_layout Itemize
\family typewriter
%time{}
\family default
Return the current Lamport timestamp (see section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-Lamport-Clock"
\end_inset
), in units of seconds since the Unix epoch.
\end_layout
\begin_layout Itemize
\family typewriter
%sleep{
\emph on
seconds
\emph default
}
\family default
Pause for the given number of seconds.
\end_layout
\begin_layout Itemize
\family typewriter
%timeout{
\emph on
seconds
\emph default
}
\family default
Like
\family typewriter
%sleep{
\emph on
seconds
\emph default
}
\family default
, but abort the
\family typewriter
marsadm
\family default
command after the total waiting time has exceeded the timeout given by
the
\family typewriter
--timeout=
\family default
parameter.
\end_layout
\begin_layout Paragraph
Misc Macros
\end_layout
\begin_layout Itemize
\family typewriter
%warn{
\emph on
text
\emph default
}
\family default
Show a WARNING:
\end_layout
\begin_layout Itemize
\family typewriter
%die{
\emph on
text
\emph default
}
\family default
Abort execution with an error message.
\end_layout
\begin_layout Paragraph
Experts Only - Risky
\end_layout
\begin_layout Standard
The following macros are unstable and may change at any time without notice.
\end_layout
\begin_layout Itemize
\family typewriter
%get-msg{
\emph on
name
\emph default
}
\family default
Low-level access to system messages.
You should not use this, since this is not extensible (you must know the
name in advance).
\end_layout
\begin_layout Itemize
\family typewriter
%readlink{
\emph on
path
\emph default
}
\family default
Low-level access to symlinks.
Don't misuse this for circumvention of the abstraction macros from the
symlink tree!
\end_layout
\begin_layout Itemize
\family typewriter
%setlink{
\emph on
value
\emph default
}{
\emph on
path
\emph default
}
\family default
Low-level creation of symlinks.
Don't misuse this for circumvention of the abstraction macros for the symlink
tree!
\end_layout
\begin_layout Itemize
\family typewriter
%fetch-info{}
\family default
etc.
Low-level access to internal symlink formats.
Don't use this in scripts! Only for curious humans.
\end_layout
\begin_layout Itemize
\family typewriter
%is-almost-consistent{}
\family default
Whatever you guess this could mean, don't use it, at least never in
place of
\family typewriter
%is-consistent{}
\family default
- it is risky to base decisions on this.
Mostly for historical reasons.
\end_layout
\begin_layout Itemize
\family typewriter
%does{
\emph on
name
\emph default
}
\family default
Equivalent to
\family typewriter
%is-
\emph on
name
\emph default
{}
\family default
(just more handy for computing the macro name).
Use with care!
\end_layout
\begin_layout Subsection
Predefined Variables
\begin_inset CommandInset label
LatexCommand label
name "par:Predefined-Variables"
\end_inset
\end_layout
\begin_layout Itemize
\family typewriter
%{cmd}
\family default
The command argument of the invoked
\family typewriter
marsadm
\family default
command.
\end_layout
\begin_layout Itemize
\family typewriter
%{res}
\family default
The resource name given to the
\family typewriter
marsadm
\family default
command as a command line parameter (or possibly expanded from
\family typewriter
all
\family default
).
\end_layout
\begin_layout Itemize
\family typewriter
%{resdir}
\family default
The corresponding resource directory.
The current version of MARS uses
\family typewriter
/mars/resource-%{res}/
\family default
, but this may change in the future.
Normally, you should not need this, since everything should already be abstracted
for you.
In case you
\emph on
really
\emph default
need low-level access to something, please prefer this variable over
\family typewriter
%{mars}/resource-%{res}
\family default
because it is a bit more abstracted.
\end_layout
\begin_layout Itemize
\family typewriter
%{mars}
\family default
Currently the fixed string
\family typewriter
/mars
\family default
.
This may change in the future, probably with the advent of MARS Full.
\end_layout
\begin_layout Itemize
\family typewriter
%{host}
\family default
The hostname of the local node.
\end_layout
\begin_layout Itemize
\family typewriter
%{ip}
\family default
The IP address of the local node.
\end_layout
\begin_layout Itemize
\family typewriter
%{timeout}
\family default
The value given by the
\family typewriter
--timeout=
\family default
option, or the corresponding default value.
\end_layout
\begin_layout Itemize
\family typewriter
%{threshold}
\family default
The value given by the
\family typewriter
--threshold=
\family default
option, or the corresponding default value.
\end_layout
\begin_layout Itemize
\family typewriter
%{window}
\family default
The value given by the
\family typewriter
--window=
\family default
option, or the corresponding default value.
\end_layout
\begin_layout Itemize
\family typewriter
%{force}
\family default
The number of times the
\family typewriter
--force
\family default
option has been given.
\end_layout
\begin_layout Itemize
\family typewriter
%{dry-run}
\family default
The number of times the
\family typewriter
--dry-run
\family default
option has been given.
\end_layout
\begin_layout Itemize
\family typewriter
%{verbose}
\family default
The number of times the
\family typewriter
--verbose
\family default
option has been given.
\end_layout
\begin_layout Itemize
\family typewriter
%{callstack}
\family default
Same as the
\family typewriter
%callstack{}
\family default
macro.
The latter gives you an opportunity for overriding, while the former is
firmly built in.
\end_layout
\begin_layout Section
Scripting HOWTO
\begin_inset CommandInset label
LatexCommand label
name "sec:Scripting-HOWTO"
\end_inset
\end_layout
\begin_layout Standard
Both the
\series bold
asynchronous communication model
\series default
of MARS (cf section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-Lamport-Clock"
\end_inset
) including the Lamport clock, and the
\series bold
state model
\series default
(cf section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-State-of"
\end_inset
) are something you
\emph on
definitely
\emph default
should have in mind when you want to do some scripting.
Here is some further concrete advice:
\end_layout
\begin_layout Itemize
Don't access anything on
\family typewriter
/mars/
\family default
directly, except for debugging purposes.
Use
\family typewriter
marsadm
\family default
.
\end_layout
\begin_layout Itemize
Avoid running scripts in parallel, other than for inspection / monitoring
purposes.
When you give two
\family typewriter
marsadm
\family default
commands in parallel (whether on the same host, or on different hosts belonging
to the same cluster), it is very likely to produce a mess.
\family typewriter
marsadm
\family default
has no internal locking.
There is no cluster-wide locking at all.
Unfortunately, some systems like Pacemaker are violating this in many cases
(depending on their configuration).
It is best to have a dedicated, more or less centralized
\series bold
control machine
\series default
which controls masses of your georedundant working servers.
This reduces the risk of running interfering actions in parallel.
Of course, you need backup machines for your control machines, and in different
locations.
Not obeying this advice can easily lead to problems such as complex races
which are very difficult to solve in long-distance distributed systems,
even in general (not limited to MARS).
\end_layout
\begin_layout Itemize
\family typewriter
marsadm wait-cluster
\family default
is your friend.
Whenever your (near-)central script has to switch between different hosts
\family typewriter
A
\family default
and
\family typewriter
B
\family default
(of the same cluster), use it in the following way:
\begin_inset Newline newline
\end_inset
\family typewriter
ssh A
\begin_inset Quotes eld
\end_inset
marsadm action1
\begin_inset Quotes erd
\end_inset
; ssh B
\begin_inset Quotes eld
\end_inset
marsadm wait-cluster; marsadm action2
\begin_inset Quotes erd
\end_inset
\begin_inset Newline newline
\end_inset
\family default
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Don't ignore this advice! Interference is almost
\emph on
sure
\emph default
! As a rule of thumb, precede almost any action command with some appropriate
waiting command!
\end_layout
\begin_layout Itemize
Further friends are any
\family typewriter
marsadm wait-*
\family default
commands, such as
\family typewriter
wait-umount
\family default
.
\end_layout
\begin_layout Itemize
In some places, busy-wait loops might be needed, e.g.
for waiting until a specific resource is
\family typewriter
UpToDate
\family default
or matches some other condition.
Examples of waiting conditions can be found under
\family typewriter
github.com/schoebel/test-suite
\family default
in subdirectory
\family typewriter
mars/modules/
\family default
, specifically
\family typewriter
02_predicates.sh
\family default
or similar.
\end_layout
\begin_layout Itemize
In case of network problems, some commands may hang (forever) if you don't
set the
\family typewriter
--timeout=
\family default
option.
Don't forget to check the return status of any failed / timed-out commands,
and to take appropriate measures!
\end_layout
\begin_layout Itemize
Test your scripts in failure scenarios!
\end_layout
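\begin_layout Standard
The advice above can be sketched as a small set of shell helpers.
This is only a minimal sketch under stated assumptions, not an official part of MARS: the function names, the default timeout value, and the placement of the
\family typewriter
--timeout=
\family default
option before the subcommand are assumptions made for illustration.
Every remote action is preceded by
\family typewriter
marsadm wait-cluster
\family default
, is bounded by a timeout, and has its return status checked.
\end_layout

```shell
#!/bin/sh
# Sketch only: helper names and the default timeout are assumptions.

MARS_TIMEOUT="${MARS_TIMEOUT:-60}"   # hypothetical default, in seconds

# Run one marsadm action on a remote host. The action is preceded by
# wait-cluster, and both commands get --timeout= so that they cannot
# hang forever on network problems. (Placing the option before the
# subcommand is an assumption of this sketch.)
mars_step() {
    host="$1"; shift
    ssh "$host" "marsadm --timeout=$MARS_TIMEOUT wait-cluster && marsadm --timeout=$MARS_TIMEOUT $*"
}

# Run two actions strictly one after the other on hosts A and B of the
# same cluster, and stop at the first failed or timed-out step.
mars_sequence() {
    res="$1"; host_a="$2"; action1="$3"; host_b="$4"; action2="$5"
    mars_step "$host_a" "$action1" "$res" || { echo "step 1 failed on $host_a" >&2; return 1; }
    mars_step "$host_b" "$action2" "$res" || { echo "step 2 failed on $host_b" >&2; return 1; }
}

# Busy-wait until a caller-supplied predicate command succeeds.
# Gives up with status 1 after max_tries polls, one second apart.
wait_until() {
    max_tries="$1"; shift
    tries=0
    until "$@"; do
        tries=$((tries + 1))
        [ "$tries" -ge "$max_tries" ] && return 1
        sleep 1
    done
}
```

\begin_layout Standard
A central control script would call these helpers one at a time, never in parallel, for example
\family typewriter
mars_sequence myres A action1 B action2
\family default
with placeholder names, and would treat any non-zero return status as a reason to stop and investigate.
\end_layout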
\begin_layout Chapter
The Sysadmin Interface (
\family typewriter
marsadm
\family default
and
\family typewriter
/proc/sys/mars/
\family default
)
\family typewriter
\begin_inset CommandInset label
LatexCommand label
name "chap:The-Sysadmin-Interface"
\end_inset
\end_layout
\begin_layout Standard
In general, the term
\begin_inset Quotes eld
\end_inset
after a while
\begin_inset Quotes erd
\end_inset
means that other cluster nodes will take notice of your actions according
to the
\begin_inset Quotes eld
\end_inset
eventually consistent
\begin_inset Quotes erd
\end_inset
propagation protocol described in sections
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-Lamport-Clock"
\end_inset
and
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-Symlink-Tree"
\end_inset
.
Please be aware that this
\begin_inset Quotes eld
\end_inset
while
\begin_inset Quotes erd
\end_inset
may last very long in case of network outages or bad firewall rules.
\end_layout
\begin_layout Standard
In the following tables, column
\begin_inset Quotes eld
\end_inset
Cmp
\begin_inset Quotes erd
\end_inset
means compatibility with DRBD.
Please note that 100% exact compatibility is not possible, because of the
asynchronous communication paradigm.
\end_layout
\begin_layout Standard
The following table documents common options which work with (almost) any
command:
\end_layout
\begin_layout Standard
\size scriptsize
\begin_inset Tabular
<lyxtabular version="3" rows="10" columns="3">
<features rotate="0" islongtable="true" longtabularalignment="left">
<column alignment="left" valignment="top" width="0pt">
<column alignment="center" valignment="top">
<column alignment="left" valignment="top" width="0pt">
<row endhead="true" endfirsthead="true" endfoot="true" endlastfoot="true">
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Option
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Cmp
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
--dry-run
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Run the command without actually creating symlinks or touching files or
executing rsync.
This option
\emph on
should
\emph default
be tried first with any dangerous command, in order to check what would happen.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Don't use in scripts! Only use by hand!
\end_layout
\begin_layout Plain Layout
\size scriptsize
This option does not change the waiting logic.
Many commands wait until the desired effect has taken place.
However, with
\family typewriter
--dry-run
\family default
the desired effect will never happen, so the command may wait forever (or
abort with a timeout).
\end_layout
\begin_layout Plain Layout
\size scriptsize
In addition, this option can lead to additional aborts of the commands due
to unmet conditions, which cannot be met because the symlinks are not actually
created / altered.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Thus this option can give only a
\series bold
rough estimate
\series default
of what would happen later!
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
--force
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
almost
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Some preconditions are skipped, i.e.
the command will / should work although some (more or less) vital preconditions
are violated.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Instead of giving
\family typewriter
--force
\family default
, you may alternatively prefix your command with
\family typewriter
force-
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
THIS OPTION IS DANGEROUS!
\end_layout
\begin_layout Plain Layout
\size scriptsize
Use it only when you are absolutely sure that you know what you are doing!
\end_layout
\begin_layout Plain Layout
\size scriptsize
Use it only as a last resort if the same command without
\family typewriter
--force
\family default
has failed
\emph on
for no good reason
\emph default
!
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
--verbose
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
A few commands will produce more verbose output.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
--timeout=$seconds
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Some commands require response from either the local kernel module, or from
other cluster nodes.
In order to prevent infinite waiting in case of network outages or other
problems, the command will fail after the given timeout has been reached.
\end_layout
\begin_layout Plain Layout
\size scriptsize
When $seconds is -1, the command will wait forever.
\end_layout
\begin_layout Plain Layout
\size scriptsize
When $seconds is 0, the command will not wait if any precondition is not
met, and will abort without performing any action.
\end_layout
\begin_layout Plain Layout
\size scriptsize
The default timeout is 5s.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
--window=$seconds
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
The time window for checking the aliveness of other nodes in the network.
When no symlink updates have occurred during the last window, the node
is considered dead.
Default is 30s.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
--threshold=$size
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
The macros containing the substring
\family typewriter
-threshold-
\family default
or
\family typewriter
-almost-
\family default
use this as the default tolerance when deciding whether something has been
approximately reached.
Default is 10MiB.
\end_layout
\begin_layout Plain Layout
\size scriptsize
The $size argument may be a number optionally followed by one of the lowercase
characters k m g t p, indicating kilo, mega, giga, tera or peta bytes as
powers of 1000.
With the corresponding uppercase character, powers of 1024 are used
instead.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
--host=$host
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
The command acts as if it were executed on another host $host.
This option should not be used regularly, because the local information
in the symlink tree may be outdated or even wrong.
Additionally, some local information like remote sizes of physical devices
(e.g.
remote disks) is not present in the symlink tree at all, or is wrong
(reflecting only the
\emph on
local
\emph default
state).
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
THIS OPTION IS DANGEROUS!
\end_layout
\begin_layout Plain Layout
\size scriptsize
Use it only for final destruction of dead cluster nodes, see section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Final-Destroy-of"
\end_inset
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
--ip=$ip
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
By default,
\family typewriter
marsadm
\family default
always uses the IP for
\family typewriter
$host
\family default
as stored in the symlink tree (directory
\family typewriter
/mars/ips/
\family default
).
When such an IP entry does not (yet) exist (e.g.
\family typewriter
create-cluster
\family default
or
\family typewriter
join-cluster
\family default
), all local network interfaces are automatically scanned for IPv4 addresses,
and the first one is taken.
This may lead to wrong decisions if you have multiple network interfaces.
\end_layout
\begin_layout Plain Layout
\size scriptsize
In order to override the automatic IP detection and to explicitly specify
the IP address of your storage network, use this option.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
Usually you will need this only at
\family typewriter
{create,join}-cluster
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
--verbose
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
A few commands will produce more verbose output.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
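\begin_layout Standard
The following lines are a minimal sketch of how some of the options above might be given on the command line.
The option placement is assumed from the other examples in this manual, and the timeout value of 60 is purely illustrative.
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
# By hand only, never in scripts: preview a dangerous command first.
\end_layout
\begin_layout Plain Layout
# The preview is only a rough estimate and may wait until the timeout.
\end_layout
\begin_layout Plain Layout
marsadm --dry-run leave-cluster
\end_layout
\begin_layout Plain Layout
# Wait up to 60 seconds instead of the default 5 seconds.
\end_layout
\begin_layout Plain Layout
marsadm --timeout=60 wait-cluster
\end_layout
\end_inset
\end_layout
\begin_layout Standard
As a worked example for the $size suffixes: a value of 10m stands for 10 * 1000 * 1000 = 10000000 bytes, while 10M stands for 10 * 1024 * 1024 = 10485760 bytes, which corresponds to the stated default of 10MiB.
\end_layout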
\begin_layout Section
Cluster Operations
\begin_inset CommandInset label
LatexCommand label
name "sec:Cluster-Operations"
\end_inset
\end_layout
\begin_layout Standard
\size scriptsize
\begin_inset Tabular
<lyxtabular version="3" rows="6" columns="3">
<features rotate="0" islongtable="true" longtabularalignment="left">
<column alignment="left" valignment="top" width="0pt">
<column alignment="center" valignment="top">
<column alignment="left" valignment="top" width="0pt">
<row endhead="true" endfirsthead="true" endfoot="true" endlastfoot="true">
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Command / Params
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Cmp
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
create-cluster
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: the
\family typewriter
/mars/
\family default
filesystem must be mounted and it must be empty (
\family typewriter
mkfs.ext4
\family default
, see instructions in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Setup-your-Cluster"
\end_inset
).
The kernel module must
\emph on
not
\emph default
be loaded.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: the initial symlink tree is created in
\family typewriter
/mars/
\family default
.
Additionally, the
\family typewriter
/mars/uuid
\family default
symlink is created for later distribution in the cluster.
It uniquely identifies the cluster in the world.
\end_layout
\begin_layout Plain Layout
\size scriptsize
This must be called exactly once at the initial primary.
\end_layout
\begin_layout Plain Layout
Hint: use the
\family typewriter
--ip=
\family default
option if you have multiple interfaces.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
join-cluster
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$host
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: the
\family typewriter
/mars/
\family default
filesystem must be mounted and it must be empty (
\family typewriter
mkfs.ext4
\family default
, see instructions in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Setup-your-Cluster"
\end_inset
).
The kernel module must
\emph on
not
\emph default
be loaded.
The cluster must have been already created at another node
\family typewriter
$host
\family default
.
A working ssh connection to $host as root must exist (without password).
\family typewriter
rsync
\family default
must be installed at all cluster nodes.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: the initial symlink tree
\family typewriter
/mars/
\family default
is replicated from the remote host
\family typewriter
$host
\family default
, and the local host has been added as another cluster member.
\end_layout
\begin_layout Plain Layout
\size scriptsize
This must be called exactly once at every initial secondary node.
\end_layout
\begin_layout Plain Layout
Hint: use the
\family typewriter
--ip=
\family default
option if you have multiple interfaces.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
leave-cluster
\begin_inset CommandInset label
LatexCommand label
name "leave-cluster"
\end_inset
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: the
\family typewriter
/mars/
\family default
filesystem must be mounted and it must contain a valid MARS symlink tree
produced by the other
\family typewriter
marsadm
\family default
commands.
The local node must no longer be a member of any resource (see
\family typewriter
marsadm leave-resource
\family default
).
The kernel module should be loaded and the network should be operating
in order to also propagate the effect to the other nodes.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: the local node is removed from the replicated symlink tree
\family typewriter
/mars/
\family default
such that other nodes will cease to communicate with it after a while.
The converse is not true: the local node may continue
\begin_inset Foot
status open
\begin_layout Plain Layout
\size scriptsize
Reason:
\family typewriter
leave-cluster
\family default
removes only its
\emph on
own
\emph default
IP address from
\family typewriter
/mars/ips/
\family default
, but does not destroy the usual symmetry of the symlink tree by leaving
the other IPs intact.
Therefore, the local node will continue fetching updates from all nodes
present in
\family typewriter
/mars/ips/
\family default
.
As an effect, the local node will
\emph on
passively
\emph default
mirror the symlinks of other cluster members, but not vice versa.
There is no communication from the local node to the other ones, turning
the local node into a
\series bold
witness
\series default
according to some terminology from Distributed Systems.
This is a feature, not a bug.
It could be used for post-mortem analysis, or for monitoring purposes.
However,
\emph on
deletions
\emph default
of symlinks are not guaranteed to take place, so your witness may
\emph on
accumulate
\emph default
thousands of old symlinks over a long time.
If you want to eventually stop all communication to the local node, just
run
\family typewriter
rmmod
\family default
.
\end_layout
\end_inset
passively fetching the symlink tree.
In order to really stop all communication, the kernel module should be
unloaded afterwards.
The local
\family typewriter
/mars/
\family default
filesystem may be manually destroyed after that (at least if you need to
reuse it).
\end_layout
\begin_layout Plain Layout
\size scriptsize
In case of a permanent node loss (e.g.
fire, water, ...) this command should be used on another node $helper in order
to finally remove $damaged from the cluster via the command
\family typewriter
marsadm leave-cluster --host=$damaged --force
\family default
.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
In case you cannot use
\family typewriter
leave-resource
\family default
for any reason, you may do the following: just destroy the
\family typewriter
/mars/
\family default
filesystem on the host
\family typewriter
$deadhost
\family default
you want to remove (e.g.
by
\family typewriter
mkfs
\family default
), or take other measures to
\emph on
ensure
\emph default
that it cannot be accidentally re-used in any way (e.g.
physical destruction of the underlying RAID,
\family typewriter
lvremove
\family default
, etc).
On all other hosts, do
\family typewriter
rmmod mars
\family default
, then delete the symlink
\family typewriter
/mars/ips/ip-$deadhost
\family default
everywhere by hand, and finally
\family typewriter
modprobe mars
\family default
again.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
\size scriptsize
Notice that the last
\family typewriter
leave-resource
\family default
operation does not delete the cluster as such.
It just creates an
\emph on
empty
\emph default
cluster which no longer has any members.
In particular, the cluster ID
\family typewriter
/mars/uuid
\family default
is
\emph on
not
\emph default
removed, deliberately
\begin_inset Foot
status open
\begin_layout Plain Layout
\size scriptsize
This is a feature, not a bug.
The
\family typewriter
uuid
\family default
is created once, but never altered anywhere.
The only way to get rid of it is
\emph on
external
\emph default
deletion (not by
\family typewriter
marsadm
\family default
)
\emph on
together(!)
\emph default
with all other contents of
\family typewriter
/mars/
\family default
.
This prevents you from accidentally merging half-dead remains which could
have survived a disaster for any reason, such as snapshotting filesystems
/ VMs or whatever.
\end_layout
\end_inset
.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
Before you can re-use
\emph on
any
\emph default
left-over
\family typewriter
/mars/
\family default
filesystem for creating / joining a new / different cluster, you
\emph on
must
\emph default
obey the instructions in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Setup-your-Cluster"
\end_inset
and use
\family typewriter
mkfs.ext4
\family default
accordingly.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
wait-cluster
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
See section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Waiting"
\end_inset
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
create-uuid
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
Deprecated.
Only for compatibility with old version light0.1beta05 or earlier.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Precondition: the
\family typewriter
/mars/
\family default
filesystem must be mounted.
A
\family typewriter
uuid
\family default
(such as automatically created by recent versions of
\family typewriter
marsadm create-cluster
\family default
) must not already exist; i.e.
you have a very old and outdated symlink tree.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: the
\family typewriter
/mars/uuid
\family default
symlink is created for later distribution in the cluster.
It uniquely identifies the cluster in the world.
\end_layout
\begin_layout Plain Layout
\size scriptsize
This must be called at most once at the current primary.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
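\begin_layout Standard
The setup sequence described in the table above can be sketched as follows.
The host names nodeA and nodeB and both IP addresses are hypothetical placeholders, and the option placement is assumed from the other examples in this manual.
On both nodes, /mars/ is assumed to be freshly created and mounted, with the kernel module not loaded.
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
# On nodeA, the initial primary (exactly once):
\end_layout
\begin_layout Plain Layout
marsadm --ip=192.168.10.1 create-cluster
\end_layout
\begin_layout Plain Layout
# On nodeB, an initial secondary (passwordless root ssh to nodeA, rsync installed):
\end_layout
\begin_layout Plain Layout
marsadm --ip=192.168.10.2 join-cluster nodeA
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The manual removal of a dead host, as described in the hint above, might look like this on every surviving host.
The variable deadhost and its value are hypothetical, and rm is only one possible way of deleting the symlink by hand.
It assumes that /mars/ on the dead host has already been destroyed.
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
deadhost=nodeB   # hypothetical name of the host to remove
\end_layout
\begin_layout Plain Layout
rmmod mars
\end_layout
\begin_layout Plain Layout
rm /mars/ips/ip-$deadhost
\end_layout
\begin_layout Plain Layout
modprobe mars
\end_layout
\end_inset
\end_layout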
\begin_layout Section
Resource Operations
\begin_inset CommandInset label
LatexCommand label
name "sec:Resource-Operations"
\end_inset
\end_layout
\begin_layout Standard
Common precondition for all resource operations is that the
\family typewriter
/mars/
\family default
filesystem is mounted, that it contains a valid MARS symlink tree produced
by other
\family typewriter
marsadm
\family default
commands (including a unique
\family typewriter
uuid
\family default
), that your current node is a valid member of the cluster, and that the
kernel module is loaded.
When communication is impossible due to network outages or bad firewall
rules, most commands will succeed, but other cluster nodes may take a long
time to notice your changes.
\end_layout
\begin_layout Standard
Instead of executing
\family typewriter
marsadm
\family default
commands several times, once for each resource, you may give the special
resource argument
\family typewriter
all
\family default
.
This works even when combined with
\family typewriter
--force
\family default
, but be cautious when giving dangerous command combinations like
\family typewriter
marsadm delete-resource --force all
\family default
.
\end_layout
\begin_layout Standard
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Beware when combining this with
\family typewriter
--host=somebody
\family default
.
In some very rare cases, like final destruction of a whole datacenter after
an earthquake, you might need a combination like
\family typewriter
marsadm --host=defective delete-resource --force all
\family default
.
Don't use such combinations unless you
\emph on
really
\emph default
need them! You can easily shoot yourself in the foot if you do not operate
such commands carefully!
\end_layout
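\begin_layout Standard
A cautious way of using the special resource argument could look like the following sketch, which is run by hand and not from a script.
It only combines commands and options mentioned in this manual. The choice of leave-resource is purely illustrative.
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
# Preview for all resources; this is only a rough estimate of what would happen.
\end_layout
\begin_layout Plain Layout
marsadm --dry-run leave-resource all
\end_layout
\begin_layout Plain Layout
# Only if the preview looks right, run the real command.
\end_layout
\begin_layout Plain Layout
marsadm leave-resource all
\end_layout
\end_inset
\end_layout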
\begin_layout Subsection
Resource Creation / Deletion / Modification
\begin_inset CommandInset label
LatexCommand label
name "sub:Resource-Creation"
\end_inset
\end_layout
\begin_layout Standard
\size scriptsize
\begin_inset Tabular
<lyxtabular version="3" rows="6" columns="3">
<features rotate="0" islongtable="true" longtabularalignment="left">
<column alignment="left" valignment="top" width="0pt">
<column alignment="center" valignment="top">
<column alignment="left" valignment="top" width="0pt">
<row endhead="true" endfirsthead="true" endfoot="true" endlastfoot="true">
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Command / Params
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Cmp
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
create-resource
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$disk_dev
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
[$mars_name]
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
[$size]
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: the resource argument
\family typewriter
$res
\family default
must not denote an already existing resource name in the cluster.
The argument
\family typewriter
$disk_dev
\family default
must denote an absolute path to a usable local block device; its size must
be greater than zero.
When the optional
\family typewriter
$mars_name
\family default
is given, that name must not already exist on the local node; when not
given,
\family typewriter
$mars_name
\family default
defaults to
\family typewriter
$res
\family default
.
When the optional
\family typewriter
$size
\family default
argument is given, it must be a number, optionally followed by a lowercase
suffix
\family typewriter
k
\family default
,
\family typewriter
m
\family default
,
\family typewriter
g
\family default
,
\family typewriter
t
\family default
, or
\family typewriter
p
\family default
(denoting size factors as multiples of 1000), or an uppercase suffix
\family typewriter
K
\family default
,
\family typewriter
M
\family default
,
\family typewriter
G
\family default
,
\family typewriter
T
\family default
or
\family typewriter
P
\family default
(denoting size factors as multiples of 1024).
The given size must not exceed the actual size of
\family typewriter
$disk_dev
\family default
.
It will specify the future resource size as shown by
\family typewriter
marsadm view-resource-size $res
\family default
.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: the resource
\family typewriter
$res
\family default
is created; the initial role of the current node is primary.
The corresponding symlink tree information is asynchronously distributed
in the cluster (in the background).
The device
\family typewriter
/dev/mars/$mars_name
\family default
should appear after a while.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Notice: when
\family typewriter
$size
\family default
is strictly smaller than the size of
\family typewriter
$disk_dev
\family default
, you will unnecessarily waste some space.
\end_layout
\begin_layout Plain Layout
\size scriptsize
This must be called exactly once for any new resource.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
join-resource
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$disk_dev
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
[$mars_name]
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: the resource argument
\family typewriter
$res
\family default
must denote an already existing resource in the cluster (i.e.
its symlink tree information must have been received).
The resource must have a designated primary, and it must not be in emergency
mode.
There must not exist a split brain in the cluster.
The local node must not already be a member of that resource.
The argument
\family typewriter
$disk_dev
\family default
must denote an absolute path to a usable (but currently unused) local block
device; its size must be greater than or equal to the logical size of the resource.
When the optional
\family typewriter
$mars_name
\family default
is given, that name must not already exist on the local node; when not
given,
\family typewriter
$mars_name
\family default
defaults to
\family typewriter
$res
\family default
.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: the current node becomes a member of resource
\family typewriter
$res
\family default
, the initial role is secondary.
The initial full sync should start after a while.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Notice: when the size of
\family typewriter
$disk_dev
\family default
is strictly greater than the size of
the resource, you will unnecessarily waste some space.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
leave-resource
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: the local node must be a member of the resource
\family typewriter
$res
\family default
; its current role must be secondary.
Sync, fetch and replay must be paused (see commands
\family typewriter
pause-{sync,fetch,replay}
\family default
or their abbreviation
\family typewriter
down
\family default
).
The disk must be detached (see commands
\family typewriter
detach
\family default
or
\family typewriter
down
\family default
).
The kernel module should be loaded and the network should be operating
in order to also propagate the effect to the other nodes.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: the local node is no longer a member of
\family typewriter
$res
\family default
.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Notice: as a side effect for other nodes, their
\family typewriter
log-delete
\family default
may now become possible, since the current node no longer counts as
a candidate for logfile application.
In addition, a split brain situation may be (partly) resolved by this.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
Please notice that this command
\emph on
may
\emph default
lead to (but does not guarantee) split-brain resolution.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
\size scriptsize
The contents of the disk are not changed by this command.
Before issuing this command, check whether the disk appears to be locally
consistent (see
\family typewriter
view-is-consistent
\family default
)! After giving this command, any internal information indicating the consistency
state will be gone, and you will no longer be able to guess consistency
properties.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
When you are
\emph on
sure
\emph default
that the disk was consistent before (or that it is consistent now, after checking it manually),
you may re-create a new resource out of it via
\family typewriter
create-resource
\family default
.
\end_layout
\begin_layout Plain Layout
\size scriptsize
In case of a permanent node loss (e.g.
fire, water, ...), this command may be used on another node $helper in order
to finally remove all the resources of $damaged from the cluster via the command
\family typewriter
marsadm leave-resource $res --host=$damaged --force
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
delete-resource
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: the resource must be empty (i.e.
all members must have left via
\family typewriter
leave-resource
\family default
).
This precondition is overridable by
\family typewriter
--force
\family default
, increasing the danger to maximum! It is even possible to combine
\family typewriter
--force
\family default
with an invalid resource argument and an invalid
\family typewriter
--host=somebodyelse
\family default
argument in order to desperately try to destroy remains of incomplete or
physically damaged hardware.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: all cluster members will eventually be forcefully removed from
\family typewriter
$res
\family default
.
In case of network interruptions, the forced removal may take place far
in the future.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
THIS COMMAND IS
\emph on
VERY
\emph default
DANGEROUS!
\end_layout
\begin_layout Plain Layout
\size scriptsize
Use this only in desperate situations, and only manually.
Don't call this from scripts.
You are forcefully using a sledgehammer, even without
\family typewriter
--force
\family default
! The danger is that the
\emph on
true
\emph default
state of other cluster nodes may not be known in case of network problems.
Even if it were known, it could be compromised by
\series bold
byzantine failures
\series default
.
\end_layout
\begin_layout Plain Layout
\size scriptsize
It is strongly advised to try this command with
\family typewriter
--dry-run
\family default
first.
\end_layout
\begin_layout Plain Layout
\size scriptsize
When combined with
\family typewriter
--force
\family default
, this command will definitely
\series bold
murder
\series default
other cluster nodes, possibly after a long while, and even when they are
operating in primary mode / having split brains / etc.
However, there is no guarantee that other cluster nodes will be
\emph on
really
\emph default
dead -- it is (theoretically) possible that they remain only
\emph on
half
\emph default
\emph on
dead
\emph default
.
For example, a half dead node may continue to write data to
\family typewriter
/mars/
\family default
and thus eventually lead to an overflow.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
This command implies a forceful detach, possibly destroying consistency.
\size scriptsize
It is similar in spirit to a
\series bold
STONITH
\series default
.
In particular, when a cluster node was operating in primary mode (
\family typewriter
/dev/mars/mydata
\family default
being continuously in use), the forceful detach cannot be carried out until
the device is completely unused.
In the meantime, the current transaction logfile will be appended to, but
the file
\emph on
might
\emph default
be already unlinked (orphan file filling up the disk).
After the forceful detach, the underlying disk need not be consistent (although
MARS does its best).
Since this command deletes any symlinks which normally would indicate the
consistency state, no guarantees about consistency can be given after this
\emph on
in general
\emph default
! Always check consistency by hand!
\end_layout
\begin_layout Plain Layout
\size scriptsize
When possible / as soon as possible, check the local state on the other
nodes in order to
\emph on
really
\emph default
shut down the resource everywhere (e.g.
to
\emph on
really
\emph default
stop using the
\family typewriter
/dev/mars/mydata
\family default
device, etc).
\end_layout
\begin_layout Plain Layout
\size scriptsize
After this command, you
\emph on
should
\emph default
rebuild the resource under a different name, in order to avoid any clashes
caused by unexpected resurrection of
\begin_inset Quotes eld
\end_inset
dead
\begin_inset Quotes erd
\end_inset
or
\begin_inset Quotes eld
\end_inset
half-dead
\begin_inset Quotes erd
\end_inset
nodes (beware of snapshots / restores on virtual machines!!).
MARS does its best to avoid problems even in case the new resource name
should equal the old one, but there can be
\emph on
no guarantee
\emph default
in all possible failure scenarios / usage scenarios.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
When possible, prefer
\family typewriter
leave-resource
\family default
over this!
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
wait-resource
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
{is-,}{attach,
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
primary,
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
device}{-off,}
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
See section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Waiting"
\end_inset
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
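\begin_layout Standard
The following sketch strings together the commands from the table above for a typical life cycle.
It is only an illustration under assumptions: the resource name
\family typewriter
mydata
\family default
and the disk path
\family typewriter
/dev/vg/mydata
\family default
are hypothetical, and each block of commands has to be run on the node named in the comment.
Only commands and arguments mentioned in this manual are used.
\end_layout
\begin_layout Standard
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
# on the first node: create the resource, this node becomes primary
\end_layout
\begin_layout Plain Layout
marsadm create-resource mydata /dev/vg/mydata
\end_layout
\begin_layout Plain Layout
marsadm view-resource-size mydata
\end_layout
\begin_layout Plain Layout
# on a second node, after the symlink tree information has arrived:
\end_layout
\begin_layout Plain Layout
# join as secondary, the initial full sync should start after a while
\end_layout
\begin_layout Plain Layout
marsadm join-resource mydata /dev/vg/mydata
\end_layout
\begin_layout Plain Layout
# later, on a secondary that shall leave the resource:
\end_layout
\begin_layout Plain Layout
# pause sync / fetch / replay and detach, check consistency, then leave
\end_layout
\begin_layout Plain Layout
marsadm down mydata
\end_layout
\begin_layout Plain Layout
marsadm view-is-consistent mydata
\end_layout
\begin_layout Plain Layout
marsadm leave-resource mydata
\end_layout
\end_inset
\end_layout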
\begin_layout Subsection
Operation of the Resource
\begin_inset CommandInset label
LatexCommand label
name "sub:Operation-of-the"
\end_inset
\end_layout
\begin_layout Standard
Common preconditions are the preconditions from section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Resource-Operations"
\end_inset
, plus the respective resource
\family typewriter
$res
\family default
must exist, and the local node must be a member of it.
With the single exception of
\family typewriter
attach
\family default
itself, all other operations must be started in
\family typewriter
attached
\family default
state.
\end_layout
\begin_layout Standard
When
\family typewriter
$res
\family default
has the special reserved value
\family typewriter
all
\family default
, the following operations will work on all resources where the current
node is a member (analogously to DRBD).
\end_layout
\begin_layout Standard
\size scriptsize
\begin_inset Tabular
<lyxtabular version="3" rows="42" columns="3">
<features rotate="0" islongtable="true" longtabularalignment="left">
<column alignment="left" valignment="top" width="0pt">
<column alignment="center" valignment="top">
<column alignment="left" valignment="top" width="0pt">
<row endhead="true" endfirsthead="true" endfoot="true" endlastfoot="true">
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Command / Params
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Cmp
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
attach
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
yes
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: the local disk belonging to $res is not in use by anyone else.
Its contents have not been altered since the last
\family typewriter
detach
\family default
.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
Mounting
\emph on
read-only
\emph default
is allowed during the detached phase.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
\size scriptsize
However, be careful! If you
\emph on
accidentally
\emph default
forget to give the right read-only mount flags, if you use
\family typewriter
fsck
\family default
in repair mode in between, or alter the disk content in any other way (beware
of LVM snapshots / restores etc), you will almost certainly produce an
\series bold
unnoticed inconsistency
\series default
(not reported by
\family typewriter
view-is-consistent
\family default
)! MARS has
\emph on
no chance
\emph default
to notice anything like this!
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: MARS uses the local disk and is able to work with it (e.g.
replay logfiles on it).
\end_layout
\begin_layout Plain Layout
\size scriptsize
Note: the local disk is opened in exclusive read-write mode.
This should protect against most common misuse, such as opening the disk
in parallel to MARS.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
\size scriptsize
However, this does not necessarily protect against non-exclusive openers.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
detach
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
yes
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: the local
\family typewriter
/dev/mars/mydata
\family default
device (when present) is no longer opened by anybody.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: the local disk belonging to $res is no longer in use.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
In contrast to DRBD, you need not explicitly pause syncing, fetching, or
replaying
\emph on
to
\emph default
(as opposed to
\emph on
from
\emph default
) the local disk.
These processes are automatically paused.
As another contrast to DRBD, the respective processes will usually
\emph on
automatically
\emph default
resume after re-attach, as far as possible in the respective new situation.
This will usually work even over
\family typewriter
rmmod
\family default
or reboot cycles, since the internal symlink tree will automatically persist
all todo switches for you (cf.
section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-State-of"
\end_inset
).
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
\size scriptsize
Notice: only
\emph on
local
\emph default
transfer operations
\emph on
to
\emph default
the local disk are paused by a detach.
When another node is remotely running a sync
\emph on
from
\emph default
your local disk, it will likely remain in use for remote reading.
The reason is that the server part of MARS is operating purely passively,
in order to serve all remote requests as best as possible (similar to the
original Unix philosophy).
In order to really stop all accesses, do a
\family typewriter
pause-sync
\family default
on all other resource members where a sync is currently running.
You may also try
\family typewriter
pause-sync-global
\family default
.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
\size scriptsize
WARNING! After this, and after having paused any remote data access, you
might use the underlying disk for your own purposes, such as test-mounting
it in
\emph on
readonly
\emph default
mode.
\series bold
Don't modify
\series default
its contents in any way! Not even by an
\family typewriter
fsck
\family default
\begin_inset Foot
status open
\begin_layout Plain Layout
\size scriptsize
Some (but not all)
\family typewriter
fsck
\family default
tools for some filesystems have options to start only a test repair / verify
mode / dry run, without doing actual modifications to the data.
Of course, these modes
\emph on
can
\emph default
be used.
But be really sure! Double-check for the right options!
\end_layout
\end_inset
! Otherwise, you will have inconsistencies
\emph on
guaranteed
\emph default
.
MARS has no way of knowing about any modifications to your disk when bypassing
\family typewriter
/dev/mars/*
\family default
.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
In case you accidentally modified the underlying disk at the
\emph on
primary
\emph default
side, you may choose to resolve the inconsistencies by
\family typewriter
marsadm invalidate $res
\family default
on
\emph on
each
\emph default
secondary.
\end_layout
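\begin_layout Plain Layout
\size scriptsize
A read-only test mount of the underlying disk, as mentioned in the warning above, might look like the following sketch.
The resource name, the disk path and the mount point are hypothetical.
It assumes a secondary node on which no remote sync is reading from the disk.
Notice that a plain
\family typewriter
ro
\family default
flag may not be sufficient for all filesystems: some journalling filesystems replay their journal even on read-only mounts unless told otherwise (for example via the
\family typewriter
noload
\family default
option of ext4), so check the documentation of your filesystem first.
\end_layout
\begin_layout Plain Layout
\size scriptsize
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
# release the local disk (hypothetical resource name)
\end_layout
\begin_layout Plain Layout
marsadm detach mydata
\end_layout
\begin_layout Plain Layout
# look at the data without modifying it (hypothetical paths)
\end_layout
\begin_layout Plain Layout
mount -o ro /dev/vg/mydata /mnt/test
\end_layout
\begin_layout Plain Layout
umount /mnt/test
\end_layout
\begin_layout Plain Layout
# hand the unmodified disk back to MARS
\end_layout
\begin_layout Plain Layout
marsadm attach mydata
\end_layout
\end_inset
\end_layout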
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
pause-sync
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Equivalent to
\family typewriter
pause-sync-local
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
pause-sync-local
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: none additionally.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: any sync operation targeting the local disk (when not yet
completed) is paused after a while (cf. section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-State-of"
\end_inset
).
When successfully completed, this operation will remember the switch state
forever and automatically become relevant if a sync is needed again (e.g.
\family typewriter
invalidate
\family default
or
\family typewriter
resize
\family default
).
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
pause-sync-global
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Like
\family typewriter
*-local
\family default
, but operates on all members of the resource.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
resume-sync
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Equivalent to
\family typewriter
resume-sync-local
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
resume-sync-local
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: additionally, a primary must be designated, and it must not
be in emergency mode.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: any sync operation targeting the local disk (when not yet
completed) is resumed after a while.
When completed, this operation will remember the switch state forever and
become relevant if a sync is needed again (e.g.
\family typewriter
invalidate
\family default
or
\family typewriter
resize
\family default
).
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
resume-sync-global
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Like
\family typewriter
*-local
\family default
, but operates on all members of the resource.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
pause-fetch
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Equivalent to
\family typewriter
pause-fetch-local
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
pause-fetch-local
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: none additionally.
The resource
\emph on
should
\emph default
be in secondary role.
Otherwise the switch has
\emph on
no
\emph default
\emph on
immediate
\emph default
effect, but will come (possibly unexpectedly) into effect whenever secondary
role is entered later for whatever reason.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: any transfer of (parts of) transaction logfiles which are
present at another primary host to the local
\family typewriter
/mars/
\family default
storage is paused at its current stage.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
This switch works independently from
\family typewriter
{pause,resume}-replay
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
pause-fetch-global
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Like
\family typewriter
*-local
\family default
, but operates on all members of the resource.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
resume-fetch
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Equivalent to
\family typewriter
resume-fetch-local
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
resume-fetch-local
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: none additionally.
The resource
\emph on
should
\emph default
be in secondary role.
Otherwise the switch has
\emph on
no
\emph default
\emph on
immediate
\emph default
effect, but will come (possibly unexpectedly) into effect whenever secondary
role is entered later for whatever reason.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: any (parts of) transaction logfiles which are present at
another primary host should be transferred to the local
\family typewriter
/mars/
\family default
storage as far as not yet locally present.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
This works independently from
\family typewriter
{pause,resume}-replay
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
resume-fetch-global
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Like
\family typewriter
*-local
\family default
, but operates on all members of the resource.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
pause-replay
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Equivalent to
\family typewriter
pause-replay-local
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
pause-replay-local
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: none additionally.
The resource
\emph on
should
\emph default
be in secondary role.
Otherwise the switch has
\emph on
no
\emph default
\emph on
immediate
\emph default
effect, but will come (possibly unexpectedly) into effect whenever secondary
role is entered later for whatever reason.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: any local replay operations of transaction logfiles to the
local disk are paused at their current stage.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
This works independently from
\family typewriter
{pause,resume}-fetch
\family default
and
\family typewriter
{dis,}connect
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
pause-replay-global
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Like
\family typewriter
*-local
\family default
, but operates on all members of the resource.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
resume-replay
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Equivalent to
\family typewriter
resume-replay-local
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
resume-replay-local
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status collapsed
\begin_layout Plain Layout
\size scriptsize
Precondition: must be in secondary role.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: any (parts of) locally existing transaction logfiles (whether
replicated from other hosts or produced locally) are started for replay
to the local disk, as far as they have not yet been applied.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
resume-replay-global
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Like
\family typewriter
*-local
\family default
, but operates on all members of the resource.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
connect
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Equivalent to
\family typewriter
connect-local
\family default
and to
\family typewriter
resume-fetch-local
\family default
.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
Note: although this sounds similar to DRBD's
\family typewriter
drbdadm connect
\family default
, there are subtle differences.
DRBD has exactly one connection per resource, which is associated with
\emph on
pairs
\emph default
of nodes.
In contrast, MARS may create multiple connections per resource at runtime,
and these are associated with the
\emph on
target
\emph default
host (not with
\emph on
pairs
\emph default
of hosts).
As a consequence, the fetch may
\emph on
potentially
\emph default
occur from any other source host which happens to be reachable (the current
implementation prefers the currently designated primary, but this may
change in the future).
In addition,
\family typewriter
marsadm disconnect
\family default
does not stop
\emph on
all
\emph default
communication.
It only stops fetching logfiles.
The symlink update running in background is
\emph on
not
\emph default
stopped, in order to always propagate as much metadata as possible in the
cluster.
In case of a later incident, this improves the chances of knowing the
\emph on
real
\emph default
state of the cluster.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
connect-local
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Equivalent to
\family typewriter
resume-fetch-local
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
connect-global
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Equivalent to
\family typewriter
resume-fetch-global
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
disconnect
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Equivalent to
\family typewriter
disconnect-local
\family default
and to
\family typewriter
pause-fetch-local
\family default
.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
See the note above at
\family typewriter
connect
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
disconnect-local
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Equivalent to
\family typewriter
pause-fetch-local
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
disconnect-global
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
partly
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Equivalent to
\family typewriter
pause-fetch-global
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
up
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
yes
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Equivalent to
\family typewriter
attach
\family default
followed by
\family typewriter
resume-fetch
\family default
followed by
\family typewriter
resume-replay
\family default
followed by
\family typewriter
resume-sync
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
down
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
yes
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Equivalent to
\family typewriter
pause-sync
\family default
followed by
\family typewriter
pause-fetch
\family default
followed by
\family typewriter
pause-replay
\family default
followed by
\family typewriter
detach
\family default
.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
Hint: consider preferring plain
\family typewriter
detach
\family default
over this, because
\family typewriter
detach
\family default
will remember the last state of all switches, while
\family typewriter
down
\family default
will
\emph on
not
\emph default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
primary
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
almost
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: sync must have finished at every resource member.
All relevant transaction logfiles must be either already locally present,
or be fetchable (see
\family typewriter
resume-fetch
\family default
and
\family typewriter
resume-replay
\family default
).
When some logfile data is locally missing, there must be enough space on
\family typewriter
/mars/
\family default
to fetch it.
No replay may have been interrupted by a replay error (see macro
\family typewriter
%replay-code{}
\family default
or diskstate
\family typewriter
DefectiveLog
\family default
).
The current designated primary must be reachable over the network.
When there is no designated primary (i.e.
\family typewriter
marsadm secondary
\family default
had been executed before, which is explicitly
\emph on
not recommended
\emph default
),
\emph on
all
\emph default
other members of the resource must be reachable (since there is no record of
which host was the previous primary), and they must also meet the same
preconditions.
When another host is currently primary (whether designated or not), it
must meet the preconditions of
\family typewriter
marsadm secondary
\family default
(that means, its local
\family typewriter
/dev/mars/mydata
\family default
device must not be in use any more).
A split brain must not already exist.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition:
\family typewriter
/dev/mars/$dev_name
\family default
appears locally and is usable; the current host is in primary role.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Switches the
\series bold
designated primary
\series default
.
There are two variants:
\end_layout
\begin_layout Plain Layout
\size scriptsize
1)
\series bold
Handover
\series default
when
\emph on
not
\emph default
giving
\family typewriter
--force
\family default
: when another host is currently primary, it is first asked to leave its
primary role, and the command waits until it has actually become secondary.
After that, the local host is asked to become primary.
Before actually becoming primary, all relevant logfiles are transferred
over the network and replayed, in order to avoid accidentally creating a
split brain as far as possible
\begin_inset Foot
status open
\begin_layout Plain Layout
\size scriptsize
Note that split brain avoidance is
\series bold
best effort
\series default
and cannot be guaranteed in general.
For example, it may be impossible to avoid split brain in case of long-lasting
network outages.
\end_layout
\end_inset
.
Only after that,
\family typewriter
/dev/mars/$dev_name
\family default
will appear.
When network transfers of the symlink tree are very slow (or currently
impossible), this command may take a very long time.
\end_layout
\begin_layout Plain Layout
\size scriptsize
If a split brain is already detected in the initial situation, the
local host will refuse to switch the designated primary without
\family typewriter
--force
\family default
.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
In case of
\begin_inset Formula $k>2$
\end_inset
replicas: if you want to hand over between hosts
\family typewriter
A
\family default
and
\family typewriter
B
\family default
while a sync is currently running at host
\family typewriter
C
\family default
, you have the following options:
\end_layout
\begin_layout Enumerate
\size scriptsize
wait until the sync has finished (see macro
\family typewriter
sync-rest
\family default
, or
\family typewriter
marsadm view
\family default
in general).
\end_layout
\begin_layout Enumerate
\size scriptsize
do a
\family typewriter
leave-resource
\family default
on host
\family typewriter
C
\family default
, and later
\family typewriter
join-resource
\family default
after the handover completed successfully.
\end_layout
\begin_layout Plain Layout
\size scriptsize
2)
\series bold
Forced switching
\series default
: by giving
\family typewriter
--force
\family default
while
\family typewriter
pause-fetch
\family default
is active (but not
\family typewriter
pause-replay
\family default
), most preconditions are ignored, and MARS does its best to actually become
primary even if some logfiles are missing or incomplete or even defective.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
\family typewriter
\size scriptsize
primary --force
\family default
is a potentially harmful variant, because it will provoke a split brain
in many cases, and therefore in turn will lead to
\series bold
data loss
\series default
because one of your split brain versions must be discarded later in order
to resolve the split brain (see section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Split-Brain-Resolution"
\end_inset
).
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
\series bold
\size scriptsize
Never
\series default
call
\family typewriter
primary --force
\family default
when
\family typewriter
primary
\family default
without
\family typewriter
--force
\family default
is sufficient! If
\family typewriter
primary
\family default
without
\family typewriter
--force
\family default
complains that the device is in use at the former primary side, take it
seriously! Don't override with
\family typewriter
--force
\family default
, but rather umount
\begin_inset Foot
status open
\begin_layout Plain Layout
\size scriptsize
A common misconception is that a filesystem can be kept mounted without
provoking a split brain, as long as the application is stopped and thus
no longer writes any data into the filesystem.
This is wrong, because filesystems may write some metadata, like
bookkeeping information, even after hours or days of inactivity.
Therefore MARS insists that the device is no longer in use before any handover
can take place.
\end_layout
\end_inset
the device at the other side!
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
Only use
\family typewriter
primary --force
\family default
when something is
\emph on
already broken
\emph default
, such as a network outage, or a node crash, etc.
During ordinary operations (network OK, nodes OK), you should never need
\family typewriter
primary --force
\family default
!
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
If you umount
\family typewriter
/dev/mars/mydata
\family default
on the old primary
\family typewriter
A
\family default
, and then wait until
\family typewriter
marsadm view
\family default
(or another suitable macro) on the target host
\family typewriter
B
\family default
shows that everything is
\family typewriter
UpToDate
\family default
, you can prevent a split brain by yourself even when giving
\family typewriter
primary --force
\family default
afterwards.
However, checking / assuring this is
\emph on
your
\emph default
responsibility!
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
\family typewriter
\size scriptsize
primary --force
\family default
switches the
\emph on
designated
\emph default
primary.
In some extremely rare cases, when
\emph on
multiple
\emph default
faults have accumulated in a
\emph on
weird
\emph default
situation, it
\emph on
might
\emph default
be impossible to become the / an actual primary.
Typically you may be
\emph on
already
\emph default
in a split brain situation.
This has not been observed over a long period of operation with recent versions
of MARS, but in general becoming primary via
\family typewriter
--force
\family default
cannot always be guaranteed, although MARS does its best.
In split brain situations, or if you ever encounter such a problem, you
\emph on
must
\emph default
resolve the split brain immediately after giving this command (see section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Split-Brain-Resolution"
\end_inset
).
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
Hint in case of
\begin_inset Formula $k>2$
\end_inset
replicas:
\family typewriter
marsadm invalidate
\family default
cannot always resolve a split brain at other secondaries (which are neither
the old nor the new designated primary).
Therefore, prefer the
\family typewriter
leave-resource
\family default
method described in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Split-Brain-Resolution"
\end_inset
, starting with a
\family typewriter
leave-resource
\family default
phase at the old primary, and proceeding to
\begin_inset Quotes eld
\end_inset
unrelated
\begin_inset Quotes erd
\end_inset
secondaries step by step, until the split brain is gone.
Don't
\family typewriter
join-resource
\family default
again before the split brain is gone! This way, all these replicas will
remain consistent for now, but of course outdated (or potentially even
a
\begin_inset Quotes eld
\end_inset
wrong
\begin_inset Quotes erd
\end_inset
split-brain version, but
\emph on
potentially usable
\emph default
in case you get under pressure in some way).
In the hopefully unlikely case that you should later discover that you
accidentally forced the
\emph on
wrong
\emph default
replica via
\family typewriter
primary --force
\family default
, you will have a chance to recover by either forcing the
\begin_inset Quotes eld
\end_inset
correct
\begin_inset Quotes erd
\end_inset
host to primary (if it did not already leave the resource), or by creating
a completely fresh resource out of the
\begin_inset Quotes eld
\end_inset
correct
\begin_inset Quotes erd
\end_inset
local disk.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
Generally: in case of
\family typewriter
primary --force
\family default
, the preconditions are different.
The fetch
\emph on
must
\emph default
be switched off (see
\family typewriter
pause-fetch
\family default
), in order to get stable logfile positions.
See section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Forced-Switching"
\end_inset
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
secondary
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
almost
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: the local
\family typewriter
/dev/mars/$dev_name
\family default
is no longer in use (e.g.
umounted).
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: There exists no designated primary any more.
During split brain and when the network is OK (again), all actual primaries
(including the local host) will leave the primary role as soon as possible (i.e.
when their
\family typewriter
/dev/mars/mydata
\family default
is no longer in use).
Any secondary will start following (old) logfiles (even from backlogs)
by replaying transaction logs if it is
\emph on
uniquely
\emph default
possible (which is often violated during split brain).
On any secondary,
\family typewriter
/dev/mars/$dev_name
\family default
will have disappeared.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
Notice: in contrast to DRBD, you
\series bold
don't need
\series default
this command during normal operation, including handover.
Any resource member which is
\emph on
not
\emph default
designated as primary will
\emph on
automatically
\emph default
go into secondary role.
For example, if you have
\begin_inset Formula $k=4$
\end_inset
replicas, only
\emph on
one of them
\emph default
can be designated as a primary.
When the network is OK, all other 3 nodes will know this fact, and they
will
\emph on
automatically
\emph default
go into secondary mode, following the transaction logs from the (new) primary.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
\size scriptsize
Hint: avoid this command.
It turns off
\emph on
any
\emph default
primary,
\series bold
globally
\series default
\begin_inset Foot
status open
\begin_layout Plain Layout
\size scriptsize
A serious
\series bold
misconception
\series default
is the belief that it is possible to switch
\begin_inset Quotes eld
\end_inset
a certain node to secondary
\begin_inset Quotes erd
\end_inset
.
It is not possible to switch individual nodes to secondary, without affecting
other nodes! The concept of
\begin_inset Quotes eld
\end_inset
designated primary
\begin_inset Quotes erd
\end_inset
is
\series bold
global
\series default
throughout a resource!
\end_layout
\end_inset
.
You cannot start a sync after that (e.g.
\family typewriter
invalidate
\family default
or
\family typewriter
join-resource
\family default
or
\family typewriter
resume-sync
\family default
), because it is
\emph on
not unique
\emph default
from where the data should be fetched.
In split brain situations (when the network is OK again), this may have
further drawbacks.
It is much better / easier to
\series bold
\emph on
directly
\emph default
switch the designated primary
\series default
from one node to another via the
\family typewriter
primary
\family default
command.
See also section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Forced-Switching"
\end_inset
.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
\size scriptsize
There is only one valid use case where you
\emph on
really
\emph default
need this command: before finally destroying a resource via the
\emph on
last
\emph default
\family typewriter
leave-resource
\family default
(or the dangerous
\family typewriter
delete-resource
\family default
).
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
wait-umount
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
See section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Waiting"
\end_inset
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
log-purge-all
\begin_inset CommandInset label
LatexCommand label
name "log-purge-all$res"
\end_inset
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: none additionally.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: all locally known logfiles and version links are removed,
whenever they are not / no longer reachable by any split brain version.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Rationale: remove hindering split-brain /
\family typewriter
leave-resource
\family default
leftovers.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Use this only when split brain does not go away by means of
\family typewriter
leave-resource
\family default
(which
\emph on
could
\emph default
happen in very weird scenarios such as MARS running on virtual machines
doing a restore of their snapshots, or otherwise unexpected resurrection
of dead or half-dead nodes).
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
THIS IS POTENTIALLY DANGEROUS!
\end_layout
\begin_layout Plain Layout
\size scriptsize
This command
\emph on
might
\emph default
destroy some valuable logfiles / other information in case the local
information is outdated or otherwise incorrect.
MARS does its best to check everything, but there is no guarantee.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Hint: use
\family typewriter
--dry-run
\family default
beforehand for checking!
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
resize
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
[$size]
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
almost
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: The local host must be primary.
All disks in the cluster participating in
\family typewriter
$res
\family default
must be physically larger than the logical resource size (e.g., by use of
\family typewriter
lvm
\family default
; can be checked by macros
\family typewriter
%disk-size{}
\family default
and
\family typewriter
%resource-size{}
\family default
).
When the optional
\family typewriter
$size
\family default
argument is present, it must be smaller than the minimum of all physical
sizes, but larger than the current logical size of the resource.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: the logical size of
\family typewriter
/dev/mars/$dev_name
\family default
will reflect the new size after a while.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
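\begin_layout Standard
As an illustration of the
\family typewriter
resize
\family default
precondition, growing a resource whose underlying disks are LVM logical volumes might look like the following sketch.
The volume group
\family typewriter
vg0
\family default
, the resource name
\family typewriter
mydata
\family default
and the increment of 10 GiB are assumptions made up for this example, not values prescribed by MARS.
\end_layout
\begin_layout LyX-Code
# on every host participating in the resource:
\end_layout
\begin_layout LyX-Code
lvextend -L +10G /dev/vg0/mydata
\end_layout
\begin_layout LyX-Code
# afterwards, on the primary only:
\end_layout
\begin_layout LyX-Code
marsadm resize mydata
\end_layout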
\begin_layout Subsection
Logfile Operations
\end_layout
\begin_layout Standard
\size scriptsize
\begin_inset Tabular
<lyxtabular version="3" rows="4" columns="3">
<features rotate="0" islongtable="true" longtabularalignment="left">
<column alignment="left" valignment="top" width="0pt">
<column alignment="center" valignment="top">
<column alignment="left" valignment="top" width="0pt">
<row endhead="true" endfirsthead="true" endfoot="true" endlastfoot="true">
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Command / Params
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Cmp
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
log-rotate
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: the local node
\family typewriter
$host
\family default
must be primary at
\family typewriter
$res
\family default
.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: after a while, a new transaction logfile
\family typewriter
/mars/resource-$res/log-$new_nr-$host
\family default
will be used instead of
\family typewriter
/mars/resource-$res/log-$old_nr-$host
\family default
where
\family typewriter
$new_nr
\family default
=
\family typewriter
$old_nr
\family default
+ 1.
Without
\family typewriter
--force
\family default
, this will only carry out actions at the primary side since it makes no
sense on secondaries.
With
\family typewriter
--force
\family default
, secondaries are
\emph on
trying
\emph default
to
\emph on
remotely
\emph default
trigger a log-rotate, but without any guarantee (a split brain may well
result instead, so use this only if you are
\emph on
really
\emph default
desperate).
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
log-delete
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: the local node must be a member of
\family typewriter
$res
\family default
.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: when there exists an old transaction logfile
\family typewriter
/mars/resource-$res/log-$old_nr-$some_host
\family default
where
\family typewriter
$old_nr
\family default
is the minimum existing number and that logfile is no longer referenced
by any of the symlinks
\family typewriter
/mars/resource-$res/replay-*
\family default
, that logfile is marked for deletion in the whole cluster.
When no such logfile exists, nothing will happen.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
log-delete-all
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Like
\family typewriter
log-delete
\family default
, but mark
\emph on
all
\emph default
currently unreferenced logfiles for deletion.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
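\begin_layout Standard
A minimal housekeeping sketch combining these commands, to be run on the primary, could look as follows.
The resource name
\family typewriter
mydata
\family default
is a placeholder chosen for this example.
\end_layout
\begin_layout LyX-Code
marsadm log-rotate mydata
\end_layout
\begin_layout LyX-Code
marsadm log-delete-all mydata
\end_layout
\begin_layout Standard
Since
\family typewriter
log-delete-all
\family default
only marks logfiles which are no longer referenced by any of the
\family typewriter
replay-*
\family default
symlinks, logfiles which are still referenced there are expected to be left alone.
\end_layout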
\begin_layout Subsection
Consistency Operations
\end_layout
\begin_layout Standard
\size scriptsize
\begin_inset Tabular
<lyxtabular version="3" rows="4" columns="3">
<features rotate="0" islongtable="true" longtabularalignment="left">
<column alignment="left" valignment="top" width="0pt">
<column alignment="center" valignment="top">
<column alignment="left" valignment="top" width="0pt">
<row endhead="true" endfirsthead="true" endfoot="true" endlastfoot="true">
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Command / Params
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Cmp
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
invalidate
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: the local node must be in secondary role at
\family typewriter
$res
\family default
.
A
\emph on
designated
\emph default
primary must exist.
When having
\begin_inset Formula $k>2$
\end_inset
replicas, no split brain must exist (otherwise, or when
\family typewriter
invalidate
\family default
does not work in case of
\begin_inset Formula $k=2$
\end_inset
, use the
\family typewriter
leave-resource
\family default
;
\family typewriter
join-resource
\family default
method described in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Split-Brain-Resolution"
\end_inset
).
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: the local disk is marked as inconsistent, and a fast fullsync
from the designated primary will start after a while.
Notice that
\family typewriter
marsadm {pause,resume}-sync
\family default
will influence whether the sync really starts.
When the fullsync has finished successfully, the local node will be consistent
again.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
fake-sync
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: the local node must be in secondary role at
\family typewriter
$res
\family default
.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: when a fullsync is running, it will stop after a while, and
the local node will be
\emph on
marked
\emph default
as consistent, as if the fullsync had finished successfully.
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
\size scriptsize
ONLY USE THIS IF YOU REALLY KNOW WHAT YOU ARE DOING!
\begin_inset Newline newline
\end_inset
See the WARNING in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Creating-and-Maintaining"
\end_inset
\begin_inset Newline newline
\end_inset
Use this only
\emph on
before
\emph default
creating a fresh filesystem inside
\family typewriter
/dev/mars/$res
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
set-replay
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
\size scriptsize
ONLY FOR ADVANCED HACKERS WHO KNOW WHAT THEY ARE DOING!
\begin_inset Newline newline
\end_inset
This command is deliberately not documented.
You need the competence level RTFS (
\begin_inset Quotes eld
\end_inset
read the fucking sources
\begin_inset Quotes erd
\end_inset
).
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
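\begin_layout Standard
For example, a secondary whose local data is no longer trusted could be rebuilt roughly as follows.
This is only a sketch: the resource name
\family typewriter
mydata
\family default
is a placeholder, and the
\family typewriter
resume-sync
\family default
step is assumed to be necessary only when the sync had been paused before.
\end_layout
\begin_layout LyX-Code
# on the secondary to be rebuilt:
\end_layout
\begin_layout LyX-Code
marsadm invalidate mydata
\end_layout
\begin_layout LyX-Code
marsadm resume-sync mydata
\end_layout
\begin_layout LyX-Code
# observe the state until the fullsync has finished:
\end_layout
\begin_layout LyX-Code
marsadm view mydata
\end_layout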
\begin_layout Section
Further Operations
\end_layout
\begin_layout Subsection
Inspection Commands
\end_layout
\begin_layout Standard
\size scriptsize
\begin_inset Tabular
<lyxtabular version="3" rows="14" columns="3">
<features rotate="0" islongtable="true" longtabularalignment="left">
<column alignment="left" valignment="top" width="0pt">
<column alignment="center" valignment="top">
<column alignment="left" valignment="top" width="0pt">
<row endhead="true" endfirsthead="true" endfoot="true" endlastfoot="true">
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Command / Params
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Cmp
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
view-
\emph on
macroname
\begin_inset Newline newline
\end_inset
\emph default
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Display the output of a macro evaluation.
See section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Inspecting-the-State"
\end_inset
for a thorough description.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
view
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Equivalent to
\family typewriter
view-default
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
role
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Deprecated.
Use
\family typewriter
view-role
\family default
instead.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
state
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Deprecated.
Use
\family typewriter
view-state
\family default
instead.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
cstate
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Deprecated.
Use
\family typewriter
view-cstate
\family default
instead.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
dstate
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Deprecated.
Use
\family typewriter
view-dstate
\family default
instead.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
status
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Deprecated.
Use
\family typewriter
view-status
\family default
instead.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
show-state
\end_layout
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Deprecated.
Don't use it.
Use
\family typewriter
view-state
\family default
instead, or other macros.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
show-info
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Deprecated.
Don't use it.
Use
\family typewriter
view-info
\family default
instead, or other macros.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
show
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Deprecated.
Don't use it.
Use or implement some macros instead.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
show-errors
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Deprecated.
Use
\family typewriter
view-the-err-msg
\family default
,
\family typewriter
view-resource-err
\family default
, or similar macros.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
cat
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$file
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Write the file content to stdout, with all occurrences of numeric
timestamps converted to a human-readable format.
This is most useful for inspection of status and log files, e.g.
\family typewriter
marsadm cat /mars/5.total.log
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\begin_layout Subsection
Setting Parameters
\begin_inset CommandInset label
LatexCommand label
name "sub:Setting-Parameters"
\end_inset
\end_layout
\begin_layout Subsubsection
Per-Resource Parameters
\end_layout
\begin_layout Standard
\size scriptsize
\begin_inset Tabular
<lyxtabular version="3" rows="4" columns="3">
<features rotate="0" islongtable="true" longtabularalignment="left">
<column alignment="left" valignment="top" width="0pt">
<column alignment="center" valignment="top">
<column alignment="left" valignment="top" width="0pt">
<row endhead="true" endfirsthead="true" endfoot="true" endlastfoot="true">
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Command / Params
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Cmp
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
set-emergency-limit $res
\emph on
n
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
The argument
\emph on
n
\emph default
must be a percentage between 0 and 100 %.
When the remaining storage space in
\family typewriter
/mars/
\family default
falls below the given percentage, the resource will go
\emph on
earlier
\emph default
into emergency mode than under the global computation described in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Defending-Overflow"
\end_inset
.
0 means unlimited.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
get-emergency-limit $res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Query the value set by the preceding command.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
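\begin_layout Standard
As a minimal sketch of the two commands above: the resource name
\family typewriter
mydata
\family default
and the value 20 are hypothetical placeholders, not recommendations.
Under these assumptions, the first command would let the resource enter emergency mode as soon as less than 20 % of the space in
\family typewriter
/mars/
\family default
remains, and the second would query the setting again:
\end_layout
\begin_layout LyX-Code
marsadm set-emergency-limit mydata 20
\end_layout
\begin_layout LyX-Code
marsadm get-emergency-limit mydata
\end_layout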
\begin_layout Subsubsection
Global Parameters
\end_layout
\begin_layout Standard
\size scriptsize
\begin_inset Tabular
<lyxtabular version="3" rows="8" columns="3">
<features rotate="0" islongtable="true" longtabularalignment="left">
<column alignment="left" valignment="top" width="0pt">
<column alignment="center" valignment="top">
<column alignment="left" valignment="top" width="0pt">
<row endhead="true" endfirsthead="true" endfoot="true" endlastfoot="true">
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Command / Params
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Cmp
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
set-sync-limit-value
\emph on
n
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Limit the concurrency of sync operations to some maximum number.
0 means unlimited.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
get-sync-limit-value
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Query the value set by the preceding command.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
set-sync-pref-list res1,res2,resn
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Set the order of preferences for syncing.
The argument must be a comma-separated list of resource names.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
get-sync-pref-list
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Query the value set by the preceding command.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
set-connect-pref-list host1,host2,hostn
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Set the order of preferences for connections when there are more than 2
hosts participating in a cluster.
The argument must be a comma-separated list of node names.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
get-connect-pref-list
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Query the value set by the preceding command.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
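\begin_layout Standard
A short illustrative sequence for the global parameters: the limit of 2 is an arbitrary example value, and
\family typewriter
res1,res2
\family default
are the placeholder resource names from the table.
Under these assumptions, the following would restrict syncing to at most two concurrent sync operations, set a sync preference order, and then query both settings:
\end_layout
\begin_layout LyX-Code
marsadm set-sync-limit-value 2
\end_layout
\begin_layout LyX-Code
marsadm set-sync-pref-list res1,res2
\end_layout
\begin_layout LyX-Code
marsadm get-sync-limit-value
\end_layout
\begin_layout LyX-Code
marsadm get-sync-pref-list
\end_layout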
\begin_layout Subsection
Waiting
\begin_inset CommandInset label
LatexCommand label
name "sub:Waiting"
\end_inset
\end_layout
\begin_layout Standard
\size scriptsize
\begin_inset Tabular
<lyxtabular version="3" rows="5" columns="3">
<features rotate="0" islongtable="true" longtabularalignment="left">
<column alignment="left" valignment="top" width="0pt">
<column alignment="center" valignment="top">
<column alignment="left" valignment="top" width="0pt">
<row endhead="true" endfirsthead="true" endfoot="true" endlastfoot="true">
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Command / Params
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Cmp
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
wait-cluster
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: the
\family typewriter
/mars/
\family default
filesystem must be mounted and it must contain a valid MARS symlink tree
produced by the other
\family typewriter
marsadm
\family default
commands.
The kernel module must be loaded.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: none.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Wait until
\emph on
all
\emph default
nodes in the cluster have sent a message, or until timeout.
The default timeout is 30 s (exceptionally) and
\size default
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
\size scriptsize
may be changed by
\family typewriter
--timeout=$seconds
\family default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
wait-resource
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
{is-,}{attach,
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
primary,
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
device}{-off,}
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: the local node must be a member of the resource
\family typewriter
$res
\family default
.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: none.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Wait until the local node reaches a specified condition on
\family typewriter
$res
\family default
, or until timeout.
The default timeout of 60 s may be changed by
\family typewriter
--timeout=$seconds
\family default
.
The last argument denotes the condition.
The condition is inverted if suffixed by
\family typewriter
-off
\family default
.
When preceded by
\family typewriter
is-
\family default
(which is the most useful case), it is checked whether the condition is
actually reached.
When the
\family typewriter
is-
\family default
prefix is left off, the check is whether another
\family typewriter
marsadm
\family default
command has already been given which
\emph on
tries
\emph default
to achieve the intended result (typically, you may use this after the
\family typewriter
is-
\family default
variant has failed).
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
wait-connect
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
almost
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
This is an alias for
\family typewriter
wait-cluster
\family default
that waits only until those nodes are reachable which belong to
\family typewriter
$res
\family default
(instead of waiting for the
\emph on
full
\emph default
cluster).
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
wait-umount
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$res
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Precondition: no additional ones.
\end_layout
\begin_layout Plain Layout
\size scriptsize
Postcondition: the local
\family typewriter
/dev/mars/$dev_name
\family default
is no longer in use (e.g.
umounted).
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
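\begin_layout Standard
A minimal sketch of how the waiting commands might be combined in a script.
The resource name
\family typewriter
mydata
\family default
and the timeout of 120 s are hypothetical, and placing the option before the command name is an assumption about the option syntax that should be checked against the usage text of
\family typewriter
marsadm
\family default
:
\end_layout
\begin_layout LyX-Code
marsadm --timeout=120 wait-cluster
\end_layout
\begin_layout LyX-Code
marsadm wait-resource mydata is-primary
\end_layout
\begin_layout LyX-Code
marsadm wait-resource mydata is-device
\end_layout
\begin_layout Standard
Both conditions use the
\family typewriter
is-
\family default
prefix, so each call checks that the condition has actually been reached.
Appending
\family typewriter
-off
\family default
(for example
\family typewriter
is-device-off
\family default
) would wait for the inverse condition instead.
\end_layout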
\begin_layout Subsection
Low-Level Expert Commands
\end_layout
\begin_layout Standard
These commands are for experts and advanced sysadmins only.
The interface is not stable, i.e.
the meaning may change at any time.
Use at your own risk!
\end_layout
\begin_layout Standard
\size scriptsize
\begin_inset Tabular
<lyxtabular version="3" rows="4" columns="3">
<features rotate="0" islongtable="true" longtabularalignment="left">
<column alignment="left" valignment="top" width="0pt">
<column alignment="center" valignment="top">
<column alignment="left" valignment="top" width="0pt">
<row endhead="true" endfirsthead="true" endfoot="true" endlastfoot="true">
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Command / Params
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Cmp
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
set-link
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
RTFS.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
get-link
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
RTFS.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
delete-file
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
RTFS.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\begin_layout Standard
The following commands are for manual setup / repair of cluster membership.
Only to be used by experts who know what they are doing! In general, cluster-wide
operations on IP addresses may need to be repeated on all hosts in the
cluster iff the communication is not (yet) possible and/or not (yet) actually
working (e.g.
firewalling problems etc.).
\end_layout
\begin_layout Standard
\size scriptsize
\begin_inset Tabular
<lyxtabular version="3" rows="4" columns="3">
<features rotate="0" islongtable="true" longtabularalignment="left">
<column alignment="left" valignment="top" width="0pt">
<column alignment="center" valignment="top">
<column alignment="left" valignment="top" width="0pt">
<row endhead="true" endfirsthead="true" endfoot="true" endlastfoot="true">
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Command / Params
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Cmp
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "30col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
lowlevel-ls-host-ips
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "50col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
List all configured cluster members together with their currently configured
IP addresses, as known
\emph on
locally
\emph default
.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "30col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
lowlevel-set-host-ip
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$hostname
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$ip
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "50col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Change the assignment of IP addresses
\emph on
locally
\emph default
.
May be used when hosts are moved to different network locations, or when
different network interfaces are to be used for replication (e.g.
dedicated replication IPs).
Notice that the names of hosts must not change at all, only their IP addresses
may be changed.
Check active connections with
\family typewriter
netstat
\family default
& friends.
Updates may need some time to proceed (socket timeouts etc).
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "30col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
lowlevel-delete-host
\begin_inset Newline newline
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
strut
\backslash
hfill
\end_layout
\end_inset
$hostname
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "50col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Remove a host from the cluster membership
\emph on
locally
\emph default
, together with its IP address assignment.
This does not remove any further information.
In particular, resource memberships are untouched.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
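\begin_layout Standard
As a minimal sketch of how the lowlevel commands above might be combined
when a host gets a dedicated replication IP: the host name
\family typewriter
hostB
\family default
and the address
\family typewriter
192.0.2.17
\family default
are placeholders invented for this example, and the commands are assumed
to be subcommands of
\family typewriter
marsadm
\family default
.
Because the assignment is changed only
\emph on
locally
\emph default
, it may be necessary to repeat the commands on the other cluster members.
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=bash"
inline false
status open
\begin_layout Plain Layout
# show the host / IP assignments as known locally
\end_layout
\begin_layout Plain Layout
marsadm lowlevel-ls-host-ips
\end_layout
\begin_layout Plain Layout
# assign the placeholder IP to the placeholder host (the host name stays)
\end_layout
\begin_layout Plain Layout
marsadm lowlevel-set-host-ip hostB 192.0.2.17
\end_layout
\begin_layout Plain Layout
# check the local view again
\end_layout
\begin_layout Plain Layout
marsadm lowlevel-ls-host-ips
\end_layout
\end_inset
\end_layout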
\begin_layout Subsection
Senseless Commands (from DRBD)
\end_layout
\begin_layout Standard
\size scriptsize
\begin_inset Tabular
<lyxtabular version="3" rows="11" columns="3">
<features rotate="0" islongtable="true" longtabularalignment="left">
<column alignment="left" valignment="top" width="0pt">
<column alignment="center" valignment="top">
<column alignment="left" valignment="top" width="0pt">
<row endhead="true" endfirsthead="true" endfoot="true" endlastfoot="true">
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Command / Params
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Cmp
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
syncer
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
new-current-uuid
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
create-md
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
dump-md
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
dump
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
get-gi
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
show-gi
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
outdate
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
adjust
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
yes
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
Implemented as NOP (not necessary with MARS).
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
hidden-commands
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\begin_layout Subsection
Forbidden Commands (from DRBD)
\end_layout
\begin_layout Standard
These commands are not implemented because they would be dangerous in a MARS
context:
\end_layout
\begin_layout Standard
\size scriptsize
\begin_inset Tabular
<lyxtabular version="3" rows="3" columns="3">
<features rotate="0" islongtable="true" longtabularalignment="left">
<column alignment="left" valignment="top" width="0pt">
<column alignment="center" valignment="top">
<column alignment="left" valignment="top" width="0pt">
<row endhead="true" endfirsthead="true" endfoot="true" endlastfoot="true">
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Command / Params
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Cmp
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
Description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
invalidate-remote
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
This would be too dangerous in case you have multiple secondaries.
A similar effect can be achieved with the
\family typewriter
--host=
\family default
option.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "20col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\family typewriter
\size scriptsize
verify
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
no
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\size scriptsize
\begin_inset Box Frameless
position "t"
hor_pos "c"
has_inner_box 1
inner_pos "t"
use_parbox 0
use_makebox 0
width "60col%"
special "none"
height "1in"
height_special "totalheight"
status open
\begin_layout Plain Layout
\size scriptsize
This would cause unintended side effects due to races between logfile transfer
/ application and block-wise comparison of the underlying disks.
However,
\family typewriter
marsadm join-resource
\family default
or
\family typewriter
invalidate
\family default
will do the same as DRBD verify followed by DRBD resync, i.e.
this will automatically correct any errors found.
Note that the fast-fullsync algorithm of MARS will minimize network traffic.
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\begin_layout Section
The
\family typewriter
/proc/sys/mars/
\family default
and other Expert Tweaks
\begin_inset CommandInset label
LatexCommand label
name "sec:The-/proc/sys/mars/-Expert"
\end_inset
\end_layout
\begin_layout Standard
In general, you shouldn't need to deal with any tweaks in
\family typewriter
/proc/sys/mars/
\family default
because everything should already default to reasonable predefined values.
This interface allows access to some internal kernel variables of the
\family typewriter
mars.ko
\family default
kernel module at runtime.
Thus it is
\emph on
not
\emph default
a stable interface.
It is not only specific to MARS, but may also change between releases
without notice.
\end_layout
\begin_layout Standard
This section describes only those tweaks intended for sysadmins, not those
for developers / very deep internals.
\end_layout
\begin_layout Subsection
Syslogging
\end_layout
\begin_layout Standard
All internal messages produced by the kernel module belong to one of the
following classes:
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
0 debug messages
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
1 info messages
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
2 warnings
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
3 error messages
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
4 fatal error messages
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
5 any message (summary of 0 to 4)
\end_layout
\begin_layout Subsubsection
Logging to Files
\end_layout
\begin_layout Standard
These classes are used to produce status files
\family typewriter
$class.*.status
\family default
in the
\family typewriter
/mars/
\family default
and/or in the
\family typewriter
/mars/resource-
\emph on
mydata
\emph default
/
\family default
directory / directories.
\end_layout
\begin_layout Standard
When you create a file
\family typewriter
$class.*.log
\family default
in parallel to any
\family typewriter
$class.*.status
\family default
, the
\family typewriter
*.log
\family default
file will be appended forever with the same messages as in
\family typewriter
*.status
\family default
.
The difference is that *.status is regenerated anew from an empty starting
point, while *.log can (potentially) increase indefinitely unless you remove
it, or rename it to something else.
\end_layout
\begin_layout Standard
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Beware, any permanently present
\family typewriter
*.log
\family default
file can easily fill up your
\family typewriter
/mars/
\family default
partition until the problems described in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Defending-Overflow"
\end_inset
will appear.
Use
\family typewriter
*.log
\family default
only for a
\series bold
limited time
\series default
, and
\series bold
only for debugging!
\end_layout
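\begin_layout Standard
As a sketch of the mechanism just described, the following commands create
a
\family typewriter
*.log
\family default
file in parallel to each existing status file of one class in
\family typewriter
/mars/
\family default
, and remove the log files again afterwards.
It is assumed here that
\family typewriter
$class
\family default
stands for the numeric class from the list above (3 = error messages in
this example); the middle part of the file names is deliberately left to
the shell wildcard.
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=bash"
inline false
status open
\begin_layout Plain Layout
# assumption: $class is the numeric message class, here 3 = error messages
\end_layout
\begin_layout Plain Layout
class=3
\end_layout
\begin_layout Plain Layout
# create an empty *.log file next to each existing *.status file of this class
\end_layout
\begin_layout Plain Layout
for f in /mars/$class.*.status; do [ -e "$f" ] && touch "${f%.status}.log"; done
\end_layout
\begin_layout Plain Layout
# ... reproduce and inspect the problem, then stop the logging again
\end_layout
\begin_layout Plain Layout
rm -f /mars/$class.*.log
\end_layout
\end_inset
\end_layout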
\begin_layout Subsubsection
Logging to Syslog
\end_layout
\begin_layout Standard
The classes also play a role in the following
\family typewriter
/proc/sys/mars/
\family default
tweaks:
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
syslog_min_class
\family default
(rw) The
\emph on
minimum
\emph default
class number for
\emph on
permanent
\emph default
syslogging.
By default, this is set to -1 in order to switch off permanent logging completely.
Permanent logging can easily flood your syslog with such huge amounts of
messages (in particular when class=0), that your system as a whole may
become unusable (because vital kernel threads may be blocked too long or
too often by the userspace syslog daemon).
Instead, please use the flood-protected syslogging described below!
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
syslog_max_class
\family default
(rw) The
\emph on
maximum
\emph default
class number for
\emph on
permanent
\emph default
syslogging.
Please use the flood-protected version instead.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
syslog_flood_class
\family default
(rw) The minimum class of flood-protected syslogging.
The maximum class is always 4.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
syslog_flood_limit
\family default
(rw) The maximum number of messages after which the flood protection will
start.
This is a hard limit for the number of messages written to the syslog.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
syslog_flood_recovery_s
\family default
(rw) The number of seconds after which the internal flood counter is reset
(after flood protection state has been reached).
When no new messages appear after this time, the flood protection will
start over at count 0.
\end_layout
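\begin_layout Standard
The following sketch shows how the tweaks above might be set for flood-protected
syslogging of warnings and worse (classes 2 to 4).
The numeric values 20 and 300 are illustrative assumptions, not recommendations,
and the commands are assumed to be run as root while the kernel module
is loaded.
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=bash"
inline false
status open
\begin_layout Plain Layout
# keep permanent syslogging switched off (the documented default)
\end_layout
\begin_layout Plain Layout
echo -1 > /proc/sys/mars/syslog_min_class
\end_layout
\begin_layout Plain Layout
# flood-protected syslogging starting at class 2 (warnings)
\end_layout
\begin_layout Plain Layout
echo 2 > /proc/sys/mars/syslog_flood_class
\end_layout
\begin_layout Plain Layout
# illustrative values: at most 20 messages, counter reset after 300 seconds
\end_layout
\begin_layout Plain Layout
echo 20 > /proc/sys/mars/syslog_flood_limit
\end_layout
\begin_layout Plain Layout
echo 300 > /proc/sys/mars/syslog_flood_recovery_s
\end_layout
\end_inset
\end_layout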
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
The rationale behind flood protected syslogging: sysadmins are usually only
interested in the point in time where some problems / incidents / etc have
\emph on
started
\emph default
.
They are usually not interested in capturing
\emph on
each
\emph default
and
\emph on
every
\emph default
single error message (in particular when they are flooding the system logs).
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
If you
\emph on
really
\emph default
need complete error information, use the
\family typewriter
*.log
\family default
files described above, compress them and save them to somewhere else
\emph on
regularly
\emph default
by a cron job.
This bears much less overhead than filtering via the syslog daemon, or
even remote syslogging in real time, which will almost surely screw up your
system when network problems coincide with floods of messages that are in
turn caused by those problems.
Don't rely on real-time concepts, just do it the old-fashioned batch job
way.
\end_layout
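\begin_layout Standard
The batch approach could look roughly like the following sketch, to be run
from a cron job.
The archive directory
\family typewriter
/var/log/mars-archive
\family default
is a hypothetical location outside of
\family typewriter
/mars/
\family default
, and only the
\family typewriter
*.log
\family default
files directly in
\family typewriter
/mars/
\family default
are handled.
Since appending is described above as depending on the presence of a
\family typewriter
*.log
\family default
file, logging presumably pauses once a file has been moved away, until
a new one is created.
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=bash"
inline false
status open
\begin_layout Plain Layout
# hypothetical archive directory outside of /mars/
\end_layout
\begin_layout Plain Layout
archive=/var/log/mars-archive
\end_layout
\begin_layout Plain Layout
stamp=$(date +%Y%m%d-%H%M%S)
\end_layout
\begin_layout Plain Layout
mkdir -p "$archive"
\end_layout
\begin_layout Plain Layout
# move each log file away under a timestamped name, then compress it there
\end_layout
\begin_layout Plain Layout
for f in /mars/*.log; do [ -e "$f" ] || continue; t="$archive/$(basename "$f").$stamp"; mv "$f" "$t" && gzip "$t"; done
\end_layout
\end_inset
\end_layout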
\begin_layout Subsubsection
Tuning Verbosity of Logging
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
show_debug_messages
\family default
Boolean switch, 0 or 1.
Mostly useful only for developers.
This can easily flood your logs if you are not careful.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
show_log_messages
\family default
Boolean switch, 0 or 1.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
show_connections
\family default
Boolean switch, 0 or 1.
Show detailed internal statistics on sockets.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
show_statistics_local
\begin_inset space ~
\end_inset
/
\begin_inset space ~
\end_inset
show_statistics_global
\family default
Only useful for kernel developers.
Shows some internal information on internal brick instances, memory usage,
etc.
\end_layout
\begin_layout Subsection
Tuning the Sync
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
sync_flip_interval_sec
\family default
(rw) The sync process must not run in parallel to logfile replay, in order
to easily guarantee consistency of your disk.
If logfile replay would be paused for the full duration of very large or
long-lasting syncs (which could take some days over very slow networks),
your
\family typewriter
/mars/
\family default
filesystem could overflow because no replay would be possible in the meantime.
Therefore, MARS regularly flips between actually syncing and actually replaying,
if both are enabled.
You can set the time interval for flipping here.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
sync_limit
\family default
(rw) When > 0, this limits the maximum number of sync processes actually
running in parallel.
This is useful if you have a large number of resources, and you don't want
to overload the network with sync processes.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
sync_nr
\family default
(ro) Passive indicator for the number of sync processes currently running.
\end_layout
\begin_layout Labeling
\labelwidthstring 00.00.0000
\family typewriter
sync_want
\family default
(ro) Passive indicator for the number of sync processes which
\emph on
demand
\emph default
running.
\end_layout
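\begin_layout Standard
As an illustration, the sync tweaks above could be adjusted and observed
as follows.
The values 2 and 60 are arbitrary examples, not recommended settings, and
the commands are assumed to be run as root while the kernel module is loaded.
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=bash"
inline false
status open
\begin_layout Plain Layout
# allow at most 2 sync processes to run in parallel (illustrative value)
\end_layout
\begin_layout Plain Layout
echo 2 > /proc/sys/mars/sync_limit
\end_layout
\begin_layout Plain Layout
# flip between syncing and replaying every 60 seconds (illustrative value)
\end_layout
\begin_layout Plain Layout
echo 60 > /proc/sys/mars/sync_flip_interval_sec
\end_layout
\begin_layout Plain Layout
# compare the running syncs with the syncs which demand running
\end_layout
\begin_layout Plain Layout
cat /proc/sys/mars/sync_nr /proc/sys/mars/sync_want
\end_layout
\end_inset
\end_layout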
\begin_layout Chapter
Tips and Tricks
\end_layout
\begin_layout Section
Avoiding Inappropriate Clustermanager Types for Medium and Long-Distance
Replication
\end_layout
\begin_layout Standard
This section addresses some wide-spread misconceptions.
Its main target audience is developers, but sysadmins will profit from
\series bold
detailed explanations of problems and pitfalls
\series default
.
When the problems described in this section are solved at some point in the future,
this section will be shortened and some relevant parts moved to the appendix.
\end_layout
\begin_layout Standard
Doing
\series bold
High Availability (HA)
\series default
wrong at
\emph on
concept level
\emph default
may easily get you into trouble, and may cost you several millions of €
or $ in larger installations, or even knock you out of business when disasters
are badly dealt with at higher levels such as clustermanagers.
\end_layout
\begin_layout Subsection
General Cluster Models
\end_layout
\begin_layout Standard
The most commonly known cluster model is called
\series bold
shared-disk
\series default
, and typically controlled by clustermanagers like
\family typewriter
Pacemaker
\family default
:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/shared-disk-model.fig
width 50col%
\end_inset
\end_layout
\begin_layout Standard
\noindent
The most important property of shared-disk is that there exists only a single
disk instance.
Nowadays, this disk often has some
\emph on
internal
\emph default
redundancy such as RAID.
At
\emph on
system
\emph default
architecture layer / network level, there exists no redundant disk at all.
Only the application cluster is built redundant.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
It should be immediately clear that shared-disk clusters are only suitable
for short-distance operations in the same datacenter.
Although running one of the data access lines over short distances between
very near-by datacenters (e.g.
1 km) would be theoretically possible, there would be no sufficient protection
against failure of a whole datacenter.
\end_layout
\begin_layout Standard
Both DRBD and MARS belong to a different architectural model called
\series bold
shared-nothing
\series default
:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/shared-nothing-model.fig
width 50col%
\end_inset
\end_layout
\begin_layout Standard
\noindent
The characteristic feature of a shared-nothing model is (additional)
\series bold
redundancy at network level
\series default
.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Shared-nothing
\begin_inset Quotes eld
\end_inset
clusters
\begin_inset Foot
status open
\begin_layout Plain Layout
Notice that the term
\begin_inset Quotes eld
\end_inset
cluster computing
\begin_inset Quotes erd
\end_inset
usually refers to short-distance only.
Long-distance coupling should be called
\begin_inset Quotes eld
\end_inset
grid computing
\begin_inset Quotes erd
\end_inset
in preference.
As known from the scientific literature, grid computing requires different
concepts and methods in general.
Only for the sake of simplicity, we use
\begin_inset Quotes eld
\end_inset
cluster
\begin_inset Quotes erd
\end_inset
and
\begin_inset Quotes eld
\end_inset
grid
\begin_inset Quotes erd
\end_inset
interchangeably.
\end_layout
\end_inset
\begin_inset Quotes erd
\end_inset
could theoretically be built for
\emph on
any
\emph default
distances, from short to medium to long distances.
However, concrete technologies of disk coupling such as synchronous operation
may pose practical limits on the distances (see chapter
\begin_inset CommandInset ref
LatexCommand ref
reference "chap:Use-Cases-for"
\end_inset
).
\end_layout
\begin_layout Standard
In general, clustermanagers must fit to the model.
Some clustermanagers can be configured to fit to multiple models.
If so, this must be done properly, or you may get into serious trouble.
\end_layout
\begin_layout Standard
Some people don't know, or they don't believe, that different architectural
models like shared-disk or shared-nothing will
\emph on
require
\emph default
an
\emph on
appropriate
\emph default
type of clustermanager and/or a different configuration.
Failing to do so, by selecting an inappropriate clustermanager type
and/or an inappropriate configuration, may be hazardous.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Selection of the right model alone is not sufficient.
Some, if not many, clustermanagers have not been designed for long distances.
As explained in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Special-Requirements-for"
\end_inset
, long distances have further
\series bold
hard requirements
\series default
.
Disregarding them may also be hazardous!
\end_layout
\begin_layout Subsection
Handover / Failover Reasons and Scenarios
\end_layout
\begin_layout Standard
From a sysadmin perspective, there exist a number of different
\series bold
reasons
\series default
why the application workload must be switched from the currently active
side A to the currently passive side B:
\end_layout
\begin_layout Enumerate
Some
\series bold
defect
\series default
has occurred at cluster side A or at some corresponding part of the network.
\end_layout
\begin_layout Enumerate
Some
\series bold
maintenance
\series default
has to be done at side A which would cause a longer downtime (e.g.
security kernel update or replacement of core network equipment or maintenance
of UPS or of the BBU cache etc - hardware isn't 24/7/365 in practice, although
some vendors
\emph on
claim
\emph default
it - it is either not really true, or it becomes
\emph on
extremely
\emph default
expensive).
\end_layout
\begin_layout Standard
Both reasons are valid and must be automatically handled in larger installations.
In order to deal with all of these reasons, the following basic mechanisms
can be used in either model:
\end_layout
\begin_layout Enumerate
\series bold
Failover
\series default
(triggered either manually or automatically)
\end_layout
\begin_layout Enumerate
\series bold
Handover
\series default
(triggered manually
\begin_inset Foot
status open
\begin_layout Plain Layout
Automatic triggering could be feasible for prophylactic treatments.
\end_layout
\end_inset
)
\end_layout
\begin_layout Standard
It is important to not confuse handover with failover at concept level.
Not only the reasons / preconditions are very different, but also the
\emph on
requirements
\emph default
.
Example: precondition for handover is that
\emph on
both
\emph default
cluster sides are healthy, while precondition for failover is that
\emph on
some relevant(!)
\emph default
failure has been
\emph on
detected
\emph default
somewhere (whether this is
\emph on
really
\emph default
true is another matter).
Typically, failover must be able to run in masses, while planned handover
often has lower scaling requirements.
\end_layout
\begin_layout Standard
Not all existing clustermanagers are dealing with all of these cases (or
their variants) equally well, and some are not even dealing with some of
these cases / variants
\emph on
at all
\emph default
.
\end_layout
\begin_layout Standard
Some clustermanagers cannot easily express the concept of
\begin_inset Quotes eld
\end_inset
automatic triggering
\begin_inset Quotes erd
\end_inset
versus
\begin_inset Quotes eld
\end_inset
manual triggering
\begin_inset Quotes erd
\end_inset
of an action.
There exists simply no cluster-global switch which selects either
\begin_inset Quotes eld
\end_inset
manual mode
\begin_inset Quotes erd
\end_inset
or
\begin_inset Quotes eld
\end_inset
automatic mode
\begin_inset Quotes erd
\end_inset
(except when you start to hack the code and/or write new plugins; then
you might notice that there is almost no architectural layering / sufficient
separation between mechanism and strategy).
Being forced to permanently use an automatic mode for several hundreds
or even thousands of clusters is not only boring, but also bears a considerable
risk when the automatics make a wrong decision at hundreds of instances in parallel.
\end_layout
\begin_layout Subsection
Granularity and Layering Hierarchy for Long Distances
\end_layout
\begin_layout Standard
Many existing clustermanager solutions are dealing with a single cluster
instance, as the term
\begin_inset Quotes eld
\end_inset
\emph on
cluster
\emph default
manager
\begin_inset Quotes erd
\end_inset
suggests.
However, when running several hundreds or thousands of cluster instances,
you likely will not want to manage each of them individually.
In addition, failover should
\emph on
not only
\emph default
be
\emph on
triggered
\emph default
(not to be confused with
\emph on
executed
\emph default
) individually at cluster level, but likely
\emph on
also
\emph default
at a higher granularity such as a room, or a whole datacenter.
Otherwise, some chaos is likely to happen.
\end_layout
\begin_layout Standard
Here is what you probably will
\series bold
need
\series default
, possibly in contrast to what you may find on the market (whether OpenSource
or not).
For simplicity, the following diagram shows only two levels of granularity,
but can be easily extended to multiple layers of granularity, or to some
concept of various
\emph on
subsets of clusters
\emph default
:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/clustermanager-hierarchy.fig
width 70col%
\end_inset
\end_layout
\begin_layout Standard
\noindent
Notice that many existing clustermanager solutions are not addressing the
datacenter granularity at all.
Typically, they use concepts like
\series bold
quorums
\series default
for determining failures
\emph on
at cluster level
\emph default
solely, and then immediately executing failover of the cluster, sometimes
without clean architectural distinction between trigger and execution (similar
to the
\begin_inset Quotes eld
\end_inset
separation of concerns
\begin_inset Quotes erd
\end_inset
between
\series bold
mechanism
\series default
and
\series bold
strategy
\series default
in Operating Systems).
Sometimes there is even no internal software layering / modularization
according to this separation of concerns at all.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
When there is no distinction between different levels of granularity, you
are hopelessly bound to a non-extensible and thus non-adaptable system
when you need to operate masses of clusters.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
A lacking distinction between automatic mode and manual mode, and/or lack
of corresponding
\series bold
architectural software layers
\series default
is not only a blatant disregard of well-established best practices of
\series bold
software engineering
\series default
, but will bind you even more firmly to an inflexible system.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Terminology: for practical reasons, we use the general term
\begin_inset Quotes eld
\end_inset
clustermanager
\begin_inset Quotes erd
\end_inset
also for speaking about layers dealing with higher granularity, such as
datacenter layers, and also for long-distance replication scenarios, although
some terminology from grid computing would be more appropriate in a scientific
background.
\end_layout
\begin_layout Standard
Please consider the following: when it comes to long-distance HA, the above
layering architecture is also motivated by vastly different numbers of
instances for each layer.
Ideally, the topmost automatics layer should be able to oversee several
datacenters in parallel, in order to cope with (almost) global network
problems such as network partitions.
Additionally, it should also detect single cluster failures, or intermediate
problems like
\begin_inset Quotes eld
\end_inset
rack failure
\begin_inset Quotes erd
\end_inset
or
\begin_inset Quotes eld
\end_inset
room failure
\begin_inset Quotes erd
\end_inset
, as well as various types of (partial / intermediate) (replication) network
failures.
Incompatible decisions at each of the different granularities would be
a no-go in practice.
Somewhere and somehow, you need one single
\begin_inset Foot
status open
\begin_layout Plain Layout
If you have
\emph on
logical pairs of datacenters
\emph default
which are firmly bound together, you could also have several topmost automatics
instances, e.g.
for each
\emph on
pair
\emph default
of datacenters.
However, that would be very
\series bold
inflexible
\series default
, because then you cannot easily mix locations or migrate your servers between
datacenters.
Using
\begin_inset Formula $k>2$
\end_inset
replicas with MARS would also become a nightmare.
In your own interest, please don't create any concepts where masses of
hardware are firmly bound to fixed constants at some software layers.
\end_layout
\end_inset
top-most
\emph on
logical
\emph default
problem detection / ranking instance, which should be
\emph on
internally distributed
\emph default
of course, typically using some
\series bold
distributed consensus protocol
\series default
; but in contrast to many published distributed consensus algorithms, it
should be able to work with multiple granularities at the same time.
\end_layout
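\begin_layout Standard
The following Python sketch illustrates, under simplifying assumptions, how a single logical ranking instance could work with several granularities at the same time.
The names Cluster and rank_failures, the threshold of one half, and the fixed levels cluster / room / datacenter are hypothetical; they are not part of MARS or of any existing clustermanager.
The sketch only produces trigger proposals.
Executing a failover is deliberately left to a separate layer, and the distributed consensus protocol is not shown.
\end_layout

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical inventory record: one cluster and where it is located."""
    name: str
    room: str
    datacenter: str


def rank_failures(clusters, failed, threshold=0.5):
    """Propose failover triggers at the coarsest granularity that applies.

    clusters:  iterable of Cluster records (the inventory)
    failed:    set of cluster names currently reported as failed
    threshold: assumed fraction above which a room or datacenter is
               treated as failed as a whole
    Returns a list of (granularity, identifier) proposals.
    """
    failed = set(failed)
    by_dc = defaultdict(list)
    by_room = defaultdict(list)
    for c in clusters:
        by_dc[c.datacenter].append(c)
        by_room[(c.datacenter, c.room)].append(c)

    def ratio(members):
        return sum(m.name in failed for m in members) / len(members)

    proposals = []
    covered_dcs = set()       # datacenters already handled as a whole
    covered_names = set()     # clusters covered by a coarser proposal

    for dc, members in sorted(by_dc.items()):
        if ratio(members) > threshold:
            proposals.append(("datacenter", dc))
            covered_dcs.add(dc)
            covered_names.update(m.name for m in members)

    for (dc, room), members in sorted(by_room.items()):
        if dc in covered_dcs:
            continue          # avoid incompatible decisions at two levels
        if ratio(members) > threshold:
            proposals.append(("room", dc + "/" + room))
            covered_names.update(m.name for m in members)

    for name in sorted(failed - covered_names):
        proposals.append(("cluster", name))
    return proposals
```

\begin_layout Standard
Because each failed cluster is covered by exactly one proposal, decisions at different granularities cannot contradict each other in this sketch.
\end_layout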
\begin_layout Subsection
Methods and their Appropriateness
\end_layout
\begin_layout Subsubsection
Failover Methods
\begin_inset CommandInset label
LatexCommand label
name "sub:Failover-Methods"
\end_inset
\end_layout
\begin_layout Standard
Failover methods are only needed in case of an incident.
They should not be used for regular handover.
\end_layout
\begin_layout Paragraph
STONITH-like Methods
\end_layout
\begin_layout Standard
STONITH = Shoot The Other Node In The Head
\end_layout
\begin_layout Standard
These methods are widely known, although they have several serious drawbacks.
Some people even believe that
\emph on
any
\emph default
clustermanager must
\emph on
always
\emph default
have some STONITH-like functionality.
This is wrong.
There
\emph on
exist
\emph default
alternatives, as shown in the next paragraph.
\end_layout
\begin_layout Standard
The most obvious drawback is that STONITH will always create a
\series bold
damage
\series default
, by definition.
\end_layout
\begin_layout Standard
Example: a typical contemporary STONITH implementation uses IPMI for automatically
powering off your servers, or at least pushes the (virtual) reset button.
This will
\emph on
always
\emph default
create a certain type of damage: the affected systems will definitely not
be available, at least for some time until they have been (manually) rebooted.
\end_layout
\begin_layout Standard
This is a conceptual contradiction: the reason for starting failover is
that you want to restore availability as soon as possible, but in order
to do so you will first
\emph on
destroy
\emph default
the availability of a particular
\emph on
component
\emph default
.
This may be counter-productive.
\end_layout
\begin_layout Standard
Example: when your hot standby node B does not work as expected, or if it
works even
\emph on
worse
\emph default
than A before, you will lose some time until you
\emph on
can
\emph default
become operational again at the old side A.
\end_layout
\begin_layout Standard
Here is an example method for handling a failure scenario.
The old active side A is assumed to be no longer healthy.
The method uses a sequential state transition chain with a STONITH-like
step:
\end_layout
\begin_layout Description
Phase1 Check whether the hot standby B is currently usable.
If this is violated (which may happen during certain types of disasters),
abort the failover for any affected resources.
\end_layout
\begin_layout Description
Phase2
\emph on
Try
\emph default
to shut down the damaged side A (in the
\emph on
hope
\emph default
that there is no
\emph on
serious
\emph default
damage).
\end_layout
\begin_layout Description
Phase3 In case phase2 did not work during a grace period / after a timeout,
assume that A is badly damaged and therefore STONITH it.
\end_layout
\begin_layout Description
Phase4 Start the application at the hot standby B.
\end_layout
\begin_layout Standard
Notice: any cleanup actions, such as
\series bold
repair
\series default
of defective hard- or software etc, are outside the scope of failover processes.
Typically, they are executed much later when restoring redundancy.
\end_layout
\begin_layout Standard
Also notice: this method is a
\emph on
heavily
\emph default
distributed one, in the sense that sequential actions are alternated multiple
times on different hosts.
This is known to be cumbersome in distributed systems, in particular in
presence of network problems.
\end_layout
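\begin_layout Standard
The four phases above can be sketched as a sequential state transition chain in Python.
This is only an illustration: the function sequential_failover and its four callables are hypothetical placeholders which you would have to supply yourself, and the grace period of phase 2 is assumed to be handled inside try_shutdown.
\end_layout

```python
from enum import Enum


class Outcome(Enum):
    ABORTED = "aborted"
    FAILED_OVER = "failed over"


def sequential_failover(standby_usable, try_shutdown, stonith, start_app):
    """Run phases 1 to 4 strictly in sequence.

    All four arguments are caller-supplied callables (assumptions):
      standby_usable() -> bool  phase 1 check at side B
      try_shutdown()   -> bool  phase 2, True if A went down in time
      stonith()                 phase 3, forcefully power off A
      start_app()               phase 4, start the application at B
    Returns the outcome and a log of the steps taken.
    """
    log = []

    # Phase 1: without a usable standby, failover makes no sense.
    if not standby_usable():
        log.append("phase1: standby B not usable, abort")
        return Outcome.ABORTED, log
    log.append("phase1: standby B usable")

    # Phase 2: try a clean shutdown of A, waiting for a grace period.
    if try_shutdown():
        log.append("phase2: A shut down")
    else:
        log.append("phase2: shutdown of A timed out")
        # Phase 3: only reached when phase 2 did not succeed.
        stonith()
        log.append("phase3: STONITH of A")

    # Phase 4: the service comes back only now, after all the waiting.
    start_app()
    log.append("phase4: application started at B")
    return Outcome.FAILED_OVER, log
```

\begin_layout Standard
Notice that the user-visible service is restored only by the very last statement, after every preceding phase has completed or timed out.
\end_layout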
\begin_layout Standard
\begin_inset CommandInset label
LatexCommand label
name "Phase4-in-more"
\end_inset
Phase4 in more detail for DRBD, augmented with some pseudo code for application
control:
\end_layout
\begin_layout Enumerate
at side B:
\family typewriter
drbdadm disconnect all
\end_layout
\begin_layout Enumerate
at side B:
\family typewriter
drbdadm primary --force all
\end_layout
\begin_layout Enumerate
at side B:
\family typewriter
applicationmanager start all
\end_layout
\begin_layout Standard
The same phase4 using MARS:
\end_layout
\begin_layout Enumerate
at side B:
\family typewriter
marsadm pause-fetch all
\end_layout
\begin_layout Enumerate
at side B:
\family typewriter
marsadm primary --force all
\end_layout
\begin_layout Enumerate
at side B:
\family typewriter
applicationmanager start all
\end_layout
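\begin_layout Standard
The two command sequences above could be driven by a small wrapper like the following Python sketch.
The command lines are taken verbatim from the lists above (applicationmanager is the same pseudo command as there); the names PHASE4_COMMANDS and run_phase4 and the injectable run callable are assumptions made for illustration.
The wrapper stops at the first failing command instead of blindly continuing.
\end_layout

```python
import shlex

# Command sequences exactly as listed above; to be executed at side B.
PHASE4_COMMANDS = {
    "drbd": [
        "drbdadm disconnect all",
        "drbdadm primary --force all",
        "applicationmanager start all",   # pseudo command from the text
    ],
    "mars": [
        "marsadm pause-fetch all",
        "marsadm primary --force all",
        "applicationmanager start all",   # pseudo command from the text
    ],
}


def run_phase4(backend, run):
    """Execute phase 4 for the given backend ("drbd" or "mars").

    run is a caller-supplied callable which takes an argv list and
    returns the exit status, so the sketch stays testable without
    DRBD or MARS being installed.
    Returns (success, list of argv lists that were attempted).
    """
    attempted = []
    for line in PHASE4_COMMANDS[backend]:
        argv = shlex.split(line)
        attempted.append(argv)
        if run(argv) != 0:
            # Never start the application on top of a failed step.
            return False, attempted
    return True, attempted
```

\begin_layout Standard
On a real system, run could for example be a thin wrapper around subprocess.run which returns the returncode attribute of the result.
\end_layout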
\begin_layout Standard
This sequential 4-phase method is far from optimal, for the following reasons:
\end_layout
\begin_layout Itemize
The method tries to handle both failover and handover scenarios with one
single sequential recipe.
In case of a true failover scenario where it is
\emph on
already known for sure
\emph default
that side A is badly damaged, this method will unnecessarily waste time
for phase 2.
This could be fixed by introduction of a conceptual distinction between
handover and failover, but it would not fix the following problems.
\end_layout
\begin_layout Itemize
Before phase4 is started (which will re-establish the service from a user's
perspective), a lot of time is wasted by
\emph on
both
\emph default
phases 2
\emph on
and
\emph default
3.
Even if phase 2 were skipped, phase 3 would unnecessarily cost some
time.
In the next paragraph, an alternative method is explained which eliminates
all unnecessary waiting time.
\end_layout
\begin_layout Itemize
The above method is adapted to the shared-disk model.
It does not take advantage of the shared-nothing model, where further possibilities
for better solutions exist.
\end_layout
\begin_layout Itemize
In case of long-distance network partitions and/or sysadmin / system management
subnetwork outages, you may not even be able to (remotely) start STONITH
at all.
Thus the above method misses an important failure scenario.
\end_layout
\begin_layout Standard
Some people seem to have a
\emph on
binary
\emph default
view of the health of a system: in their view, a system is either
operational, or it is damaged.
This kind of view is ignoring the fact that some systems may be half-alive,
showing only
\emph on
minor
\emph default
problems, or problems occurring only from time to time.
\end_layout
\begin_layout Standard
It is obvious that damaging a healthy system is a bad idea by itself.
Even
\emph on
generally
\emph default
damaging a half-alive system in order to
\begin_inset Quotes eld
\end_inset
fix
\begin_inset Quotes erd
\end_inset
problems is not a good idea, because it may increase the damage
when you don't know the
\emph on
real
\emph default
reason
\begin_inset Foot
status open
\begin_layout Plain Layout
Example, occurring in masses: an incorrectly installed bootloader, or a
wrong BIOS boot priority order which unexpectedly leads to hangs or infinite
reboot cycles once the DHCP or BOOTP servers are no longer available /
reachable.
\end_layout
\end_inset
.
\end_layout
\begin_layout Standard
Even worse: in a distributed system
\begin_inset Foot
status open
\begin_layout Plain Layout
Notice: the STONITH concept is more or less associated with short-distance
scenarios where
\series bold
crossover cables
\series default
or similar equipment is used.
The assumption is that crossover cables can't go defective, or at least
it would be an extremely unlikely scenario.
For long-distance replication, this assumption is simply not true.
\end_layout
\end_inset
you sometimes
\emph on
cannot(!)
\emph default
know whether a system is healthy, or to what degree it is healthy.
Typical STONITH methods as used in some contemporary clustermanagers are
\series bold
assuming a worst case
\series default
, even if that worst case has not actually occurred.
\end_layout
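\begin_layout Standard
One way to avoid a binary view is to model health with an explicit unknown state plus a measure of how much evidence you actually have.
The following Python sketch is a minimal illustration under stated assumptions: the names Health, assess and allows_destructive_action, the probe names, and the evidence threshold of 0.8 are all hypothetical.
\end_layout

```python
from enum import Enum


class Health(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"    # half-alive: some probes pass, some fail
    BROKEN = "broken"
    UNKNOWN = "unknown"      # missing knowledge, NOT the same as broken


def assess(probes):
    """Summarize probe results without guessing.

    probes maps a probe name to True (passed), False (failed) or
    None (no answer, e.g. because the network path is down).
    Returns (Health, coverage), where coverage is the fraction of
    probes which delivered any answer at all.
    """
    answered = [result for result in probes.values() if result is not None]
    if not answered:
        return Health.UNKNOWN, 0.0
    coverage = len(answered) / len(probes)
    if all(answered):
        return Health.HEALTHY, coverage
    if not any(answered):
        return Health.BROKEN, coverage
    return Health.DEGRADED, coverage


def allows_destructive_action(state, coverage, min_coverage=0.8):
    """Permit damaging actions only on strong evidence of real breakage.

    An UNKNOWN or DEGRADED state never qualifies, so missing knowledge
    is never silently turned into a worst-case assumption.
    """
    return state is Health.BROKEN and coverage >= min_coverage
```

\begin_layout Standard
With such a model, an unreachable node is reported as unknown rather than as broken, and the decision about what to do with it is left to a higher layer.
\end_layout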
\begin_layout Standard
Therefore, avoid the following
\series bold
fundamental flaws
\series default
in failover concepts and healthiness models, which apply to implementors
/ configurators of clustermanagers:
\end_layout
\begin_layout Itemize
Don't mix up knowledge with conclusions about a (sub)system, and also don't
mix this up with the real state of that (sub)system.
In reality, you don't have any knowledge about a complex distributed system.
You only may have
\emph on
some
\emph default
knowledge about
\emph on
some
\emph default
parts of the system, but you cannot
\begin_inset Quotes eld
\end_inset
see
\begin_inset Quotes erd
\end_inset
a complex distributed system as a whole.
What you think is your knowledge, isn't knowledge in reality: in many cases,
it is
\emph on
conclusion
\emph default
, not knowledge.
Don't mix this up!
\end_layout
\begin_layout Itemize
Some systems are more complex than your model of them.
Don't neglect important parts (such as networks, routers, switches, cables,
plugs) which may lead you to wrong conclusions!
\end_layout
\begin_layout Itemize
Don't restrict your mind to boolean models of healthiness.
Doing so can easily create unnecessary damage by construction, and even
at concept level.
You should know from software engineering that defects in concepts or models
are much more serious than simple bugs in implementations.
Choosing the wrong model cannot be fixed as easily as a typical bug or
a typo.
\end_layout
\begin_layout Itemize
Try to deduce the state of a system as
\series bold
reliably
\series default
as possible.
If you don't know something for sure, don't generally assume that it has
gone wrong.
Don't confuse missing knowledge with the conclusion that something is bad.
Boolean algebra restricts your mind to either
\begin_inset Quotes eld
\end_inset
good
\begin_inset Quotes erd
\end_inset
or
\begin_inset Quotes eld
\end_inset
bad
\begin_inset Quotes erd
\end_inset
.
Use at least
\series bold
tri-state algebra
\series default
which has a means for expressing
\series bold
\begin_inset Quotes eld
\end_inset
unknown
\begin_inset Quotes erd
\end_inset
\series default
.
Even better: attach a probability to anything you (believe to) know.
Errare humanum est: nothing is absolutely sure.
\end_layout
\begin_layout Itemize
Oversimplification: don't report an
\begin_inset Quotes eld
\end_inset
unknown
\begin_inset Quotes erd
\end_inset
or even a
\begin_inset Quotes eld
\end_inset
broken
\begin_inset Quotes erd
\end_inset
state for a complex system whenever a smaller subsystem exists for which
you have some knowledge (or you can conclude something about it with reasonable
evidence).
Otherwise, your users / sysadmins may draw wrong conclusions, and assume
that the whole system is broken, while in reality only some minor part
has some minor problem.
Users could then likely make wrong decisions, which may then easily lead
to bigger damages.
\end_layout
\begin_layout Itemize
Murphy's law:
\series bold
never assume that something can't go wrong!
\series default
Doing so is a blatant misconception at topmost level: the
\emph on
purpose
\emph default
of a clustermanager is creating High Availability (HA) out of more or less
\begin_inset Quotes eld
\end_inset
unreliable
\begin_inset Quotes erd
\end_inset
components.
It is the damn duty of both a clustermanager and its configurator to try
to compensate
\emph on
any
\emph default
failures,
\emph on
regardless of their probability
\emph default
\begin_inset Foot
status open
\begin_layout Plain Layout
Never claim that something has only low probability (and is therefore
not relevant).
In the HA area, you simply
\series bold
cannot know
\series default
that, because you typically have
\emph on
sporadic
\emph default
incidents.
In extreme cases, the
\emph on
purpose
\emph default
of your HA solution is protection against 1 failure per 10 years.
You simply don't have the time to collect incident statistics
about that!
\end_layout
\end_inset
, as best as possible.
\end_layout
\begin_layout Itemize
Never confuse
\series bold
probability
\series default
with
\series bold
expected value!
\series default
If you don't know the mathematical term
\begin_inset Quotes eld
\end_inset
expected value
\begin_inset Quotes erd
\end_inset
, or if you don't know what this means
\emph on
in practice
\emph default
, don't take responsibility for millions of € or $.
\end_layout
\begin_layout Itemize
When operating masses of hard- and software: never assume that a particular
failure can occur only at a low number of instances.
There are
\series bold
\emph on
unknown(!)
\emph default
systematic errors
\series default
which may pop up at the wrong time and in huge masses when you don't expect
them.
\end_layout
\begin_layout Itemize
Multiple layers of fallback:
\emph on
any
\emph default
action can fail.
Be prepared to have a plan B, and even a plan C, and even better a plan
D, wherever possible.
\end_layout
\begin_layout Itemize
Never increase any damage anywhere, unnecessarily! Always try to
\emph on
minimize
\emph default
any damage! It can be mathematically proven that in deterministic probabilistic
systems having finite state, increases of a damage level
\emph on
at the wrong place
\emph default
will
\emph on
introduce
\emph default
an
\emph on
additional
\emph default
\emph on
risk
\emph default
of getting into an
\series bold
endless loop
\series default
.
This is also true for nondeterministic systems, as known from formal language
theory
\begin_inset Foot
status open
\begin_layout Plain Layout
Finite automata are known to be transformable to deterministic ones, usually
by an exponential increase in the number of states.
\end_layout
\end_inset
.
\end_layout
\begin_layout Itemize
Use the
\series bold
best effort principle
\series default
.
You should be aware of the following fact: in general, it is impossible
to create an
\emph on
absolutely reliable system
\emph default
out of unreliable components.
You can
\emph on
lower
\emph default
the risk of failures to any
\begin_inset Formula $\epsilon>0$
\end_inset
by investing a lot of resources and of money, but whatever you do:
\begin_inset Formula $\epsilon=0$
\end_inset
is impossible.
Therefore, be careful with boolean algebra.
Prefer approximation methods / optimizing methods instead.
Always do
\emph on
your
\emph default
best, instead of trying to reach a
\emph on
global
\emph default
optimum which likely does not exist at all (because the
\begin_inset Formula $\epsilon$
\end_inset
can only
\emph on
converge
\emph default
to an optimum, but will never actually reach it).
The best effort principle means the following: if you discover a method
for improving your operating state by reduction of a (potential) damage
in a reasonable time and with reasonable effort, then
\series bold
simply do it
\series default
.
Don't argue that a particular step is not a 100% solution for all of your
problems.
\emph on
Any
\emph default
\emph on
improvement
\emph default
is valuable.
\series bold
Don't miss any valuable step
\series default
having reasonable costs with respect to your budget.
Omitting valuable measures which have low costs is certainly a violation
of the best effort principle, because you are not doing
\emph on
your
\emph default
best.
Keep that in mind.
\begin_inset Newline newline
\end_inset
If you have
\emph on
understood
\emph default
this (e.g.
think deeply about it for at least one day), you will no longer advocate STONITH
methods
\emph on
in general
\emph default
, when there are alternatives.
STONITH methods are only valuable when you
\emph on
know in advance
\emph default
that the final outcome (after reboot) will most likely be better, and that
waiting for reboot will most likely
\emph on
pay off
\emph default
.
In general, this condition is
\emph on
not true
\emph default
if you have a healthy hot standby system.
This should be easy to see.
But there exist well-known clustermanager solutions / configurations blatantly
ignoring
\begin_inset Foot
status open
\begin_layout Plain Layout
For some
\emph on
special(!)
\emph default
cases of the shared-disk model, there exist some justifications for doing
STONITH
\emph on
before
\emph default
starting the application at the hot standby.
Under certain circumstances, it can happen that system A running amok could
destroy the data on your single shared disk (example: a filesystem doubly
mounted
\emph on
in parallel
\emph default
, which will certainly destroy your data, unless you are using
\family typewriter
ocfs2
\family default
or similar).
This argument is only valid for
\emph on
passive
\emph default
disks which are
\emph on
directly
\emph default
attached to
\emph on
both
\emph default
systems A and B, such that there is no
\emph on
external
\emph default
means for fencing the disk.
In case of iSCSI running over ordinary network equipment such as routers
or switches, the argument
\begin_inset Quotes eld
\end_inset
fencing the disk is otherwise not possible
\begin_inset Quotes erd
\end_inset
does not apply.
You can interrupt the iSCSI connection at the network gear, or you can often
do it at cluster A or at the iSCSI target.
Even commercial storage appliances speaking iSCSI can be remotely controlled
for forcefully aborting iSCSI sessions.
In modern times, the STONITH method no longer has such a justification.
The justification stems from ancient times when a disk was a purely passive
mechanical device, and its disk controller was part of the server system.
\end_layout
\end_inset
this.
Only when the former standby system does not work as expected (this means
that
\emph on
all
\emph default
of your redundant systems are not healthy enough for your application),
\emph on
only then
\begin_inset Foot
status open
\begin_layout Plain Layout
Notice that STONITH may be needed for (manual or partially automatic)
\emph on
repair
\emph default
in some cases, e.g.
when you know that a system has a kernel crash.
Don't mix up the repair phase with failover or handover phases.
Typically, they are executed at different times.
The repair phase is outside the scope of this section.
\end_layout
\end_inset
\emph default
STONITH is inevitable as a
\emph on
last resort
\emph default
option.
\begin_inset Newline newline
\end_inset
In short: blindly using STONITH without true need during failover is a violation
of the best effort principle.
You are simply not doing your best.
\end_layout
\begin_layout Itemize
When your budget is limited, carefully select those improvements which make
your system
\series bold
as reliable as possible
\series default
, given your fixed budget.
\end_layout
\begin_layout Itemize
Create statistics on the duration of your actions.
Based on this, try to get a
\emph on
balanced
\emph default
optimum between time and costs.
\end_layout
\begin_layout Itemize
Whatever actions you can
\series bold
start in parallel
\series default
for saving time, do it.
Otherwise you are disregarding the best effort principle, and your solution
will be sub-optimal.
You will require deep knowledge of parallel systems, as well as experience
in dealing with problems like (distributed) races.
Notice that
\emph on
any
\emph default
distributed system is
\emph on
inherently parallel
\emph default
.
Don't believe that sequential methods can deliver an optimum solution in
such a difficult area.
\end_layout
\begin_layout Itemize
If you don't have the
\series bold
necessary skills
\series default
for (a) recognizing already existing parallelism, (b) dealing with parallelism
at concept level, (c) programming and/or configuring parallelism race-free
and deadlock-free (or if you don't even know what a race condition is and
where it may occur in practice), then don't take responsibility for millions
of € or $.
\end_layout
\begin_layout Itemize
Avoid hard timeouts wherever possible.
Use
\series bold
adaptive timeouts
\series default
instead.
Reason: depending on hardware or workload, the same action A may take a
very short time on cluster 1, but take a very long time on cluster 2.
If you need to guard action A from hanging (which is almost always the
case because of Murphy's law), don't configure any fixed timeout for it.
When having several hundreds of clusters, you would need to use the
\emph on
worst case value
\emph default
, which is the longest time occurring somewhere at the very slow clusters
/ slow parts of the network.
This wastes a lot of time in case one of the fast clusters is hanging.
Adaptive timeouts work differently: they use a kind of
\begin_inset Quotes eld
\end_inset
progress bar
\begin_inset Quotes erd
\end_inset
to monitor the
\emph on
progress
\emph default
of an action.
They will abort only if there is
\emph on
no progress
\emph default
for a certain amount of time.
Hint: among others,
\family typewriter
marsadm view-*-rest
\family default
commands or macros are your friend.
\end_layout
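\begin_layout Standard
The adaptive timeouts recommended in the last item above can be sketched in Python as follows.
This is a minimal illustration, not a definitive implementation: the function wait_with_adaptive_timeout and its parameters are hypothetical, and get_remaining stands for any caller-supplied function reporting how much work is left (for example by parsing the output of a suitable marsadm view macro, whose exact name and output format are not assumed here).
\end_layout

```python
import time


def wait_with_adaptive_timeout(get_remaining, stall_limit, poll_interval=1.0,
                               clock=time.monotonic, sleep=time.sleep):
    """Wait until get_remaining() reaches zero.

    Unlike a fixed timeout, the total duration is not limited. The wait
    is aborted only when the remaining amount has not decreased for
    stall_limit seconds, i.e. when there is no progress.
    clock and sleep are injectable so the logic can be tested quickly.
    Returns True on completion, False when the action appears to hang.
    """
    best = get_remaining()
    last_progress = clock()
    while best > 0:
        sleep(poll_interval)
        remaining = get_remaining()
        now = clock()
        if remaining < best:
            # Progress was made: remember it and restart the stall timer.
            best = remaining
            last_progress = now
        elif now - last_progress >= stall_limit:
            return False
    return True
```

\begin_layout Standard
A slow cluster which keeps making progress is never aborted, while a hanging fast cluster is detected after stall_limit seconds instead of after the worst-case duration of the slowest cluster.
\end_layout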
\begin_layout Paragraph
ITON = Ignore The Other Node
\end_layout
\begin_layout Standard
This means
\series bold
fencing from application traffic
\series default
, and can be used as an alternative to STONITH when done properly.
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/fencing-hierarchy.fig
width 60col%
\end_inset
\end_layout
\begin_layout Standard
\noindent
Fencing from application traffic is best suited for the shared-nothing model,
but can also be adapted to the shared-disk model with some quirks.
\end_layout
\begin_layout Standard
The idea is simple: always route your application network traffic to the
current (logically) active side, whether it is currently A or B.
Just don't route any application requests to the current (logically) passive
side at all.
\end_layout
\begin_layout Standard
For failover (and
\emph on
only
\emph default
for that), you
\emph on
should not care about
\emph default
any split brain occurring at the low-level generic block device:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/split-brain-history.fig
width 50col%
\end_inset
\end_layout
\begin_layout Standard
\noindent
Although there is a split brain at the generic low-level block device, you
now define the
\begin_inset Quotes eld
\end_inset
logically active
\begin_inset Quotes erd
\end_inset
and
\begin_inset Quotes eld
\end_inset
logically passive
\begin_inset Quotes erd
\end_inset
side by yourself by
\emph on
logically ignoring
\emph default
the
\begin_inset Quotes eld
\end_inset
wrong
\begin_inset Quotes erd
\end_inset
side as defined by yourself:
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/split-brain-resolved.fig
width 50col%
\end_inset
\end_layout
\begin_layout Standard
\noindent
This is possible because the generic block devices provided by DRBD or MARS
are completely
\series bold
agnostic
\series default
of the
\begin_inset Quotes eld
\end_inset
meaning
\begin_inset Quotes erd
\end_inset
of either version A or B.
Higher levels such as clustermanagers (or humans like sysadmins) can assign
them a meaning like
\begin_inset Quotes eld
\end_inset
relevant
\begin_inset Quotes erd
\end_inset
or
\begin_inset Quotes eld
\end_inset
not relevant
\begin_inset Quotes erd
\end_inset
, or
\begin_inset Quotes eld
\end_inset
logically active
\begin_inset Quotes erd
\end_inset
or
\begin_inset Quotes eld
\end_inset
logically passive
\begin_inset Quotes erd
\end_inset
.
\end_layout
\begin_layout Standard
As a result of fencing from application traffic, the
\begin_inset Quotes eld
\end_inset
logically passive
\begin_inset Quotes erd
\end_inset
side will
\emph on
logically
\emph default
cease any actions such as updating user data, even if it is
\begin_inset Quotes eld
\end_inset
physically active
\begin_inset Quotes erd
\end_inset
during split-brain (when two primaries exist in DRBD or MARS sense
\begin_inset Foot
status open
\begin_layout Plain Layout
Hint: some clustermanagers and/or some people seem to define the term
\begin_inset Quotes eld
\end_inset
split-brain
\begin_inset Quotes erd
\end_inset
differently from DRBD or MARS.
In the context of generic block devices, split brain means that the
\emph on
history
\emph default
of both versions has been split to a Y-like
\series bold
fork
\series default
(for whatever reason), such that re-joining them
\emph on
incrementally
\emph default
by ordinary write operations is no longer guaranteed to be possible.
As a slightly simplified alternative, you might use the definition 
\begin_inset Quotes eld
\end_inset
two incompatible primaries exist in parallel
\begin_inset Quotes erd
\end_inset
, which means almost the same in practice.
 Details of the formal semantics are beyond the scope of this treatment.
\end_layout
\end_inset
).
\end_layout
\begin_layout Standard
If you already have some load balancing, or BGP, or another
\emph on
mechanism
\emph default
for dynamic routing, you already have an important part for the ITON method.
Additionally, ensure by an appropriate
\emph on
strategy
\emph default
that your balancer status / BGP announcement etc. always coincides with 
the
\begin_inset Quotes eld
\end_inset
logically active
\begin_inset Quotes erd
\end_inset
side (recall that even during split-brain
\emph on
you
\emph default
must define
\begin_inset Quotes eld
\end_inset
logically active
\begin_inset Quotes erd
\end_inset
\series bold
uniquely
\series default
\begin_inset Foot
status open
\begin_layout Plain Layout
A possible strategy is to use a Lamport clock for route changes: the change
with the most recent Lamport timestamp will always win over previous changes.
\end_layout
\end_inset
by yourself).
\end_layout
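\begin_layout Standard
The Lamport clock strategy mentioned in the footnote above can be sketched in a few lines of shell. This is only a minimal illustration under stated assumptions, not something provided by MARS or DRBD: the function names lamport_tick and newest_route, and the record format (timestamp first, then the side), are hypothetical. A real implementation would also need a tie-breaker, such as a node name, for equal timestamps.
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=bash"
inline false
status open
\begin_layout Plain Layout
#!/bin/bash
\end_layout
\begin_layout Plain Layout
# Hypothetical sketch: Lamport-stamped route changes.
\end_layout
\begin_layout Plain Layout
clock=0
\end_layout
\begin_layout Plain Layout
# Advance the local clock past a timestamp seen elsewhere.
\end_layout
\begin_layout Plain Layout
lamport_tick() {
\end_layout
\begin_layout Plain Layout
  local seen=${1:-0}
\end_layout
\begin_layout Plain Layout
  if [ "$seen" -gt "$clock" ]; then clock=$seen; fi
\end_layout
\begin_layout Plain Layout
  clock=$((clock + 1))
\end_layout
\begin_layout Plain Layout
}
\end_layout
\begin_layout Plain Layout
# Read "<timestamp> <side>" records from stdin and print
\end_layout
\begin_layout Plain Layout
# the side named by the most recent change.
\end_layout
\begin_layout Plain Layout
newest_route() {
\end_layout
\begin_layout Plain Layout
  sort -n -k1,1 | tail -n 1 | cut -d' ' -f2
\end_layout
\begin_layout Plain Layout
}
\end_layout
\begin_layout Plain Layout
lamport_tick 0; change1="$clock A"
\end_layout
\begin_layout Plain Layout
lamport_tick 7; change2="$clock B"
\end_layout
\begin_layout Plain Layout
{ echo "$change1"; echo "$change2"; } | newest_route
\end_layout
\end_inset
\end_layout
\begin_layout Standard
In this sketch the second change carries the higher timestamp, so side B is selected as the logically active side.
\end_layout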
\begin_layout Standard
Example:
\end_layout
\begin_layout Description
Phase1 Check whether the hot standby B is currently usable.
If this is violated (which may happen during certain types of disasters),
abort the failover for any affected resources.
\end_layout
\begin_layout Description
Phase2 Do the following
\emph on
in parallel
\begin_inset Foot
status open
\begin_layout Plain Layout
For database applications where no transactions should get lost, you should
slightly modify the order of operations: first fence the old side A, then
start the application at standby side B.
However, be warned that even this cannot guarantee that no transaction
is lost.
When the network between A and B is interrupted
\emph on
before
\emph default
the incident happens, DRBD will automatically disconnect, and MARS will
show a lagbehind.
In order to fully eliminate this possibility, you can either use DRBD and
configure it to hang forever during network outages (such that users will
be unable to commit any transactions at all), or you can use the shared-disk
model instead.
But in the latter case, you are introducing a SPOF at the single shared
disk.
The former case is logically almost equivalent to shared-disk, but avoids 
some parts of the physical SPOF.
In a truly distributed system, the famous CAP theorem is limiting your
possibilities.
Therefore, no general solution exists fulfilling all requirements at the
same time.
\end_layout
\end_inset
:
\end_layout
\begin_deeper
\begin_layout Itemize
Start all affected applications at the hot standby B.
This can be done with the same DRBD or MARS procedure as described
\begin_inset CommandInset ref
LatexCommand vpageref
reference "Phase4-in-more"
\end_inset
.
\end_layout
\begin_layout Itemize
Fence A by fixedly routing all affected application traffic to B.
\end_layout
\end_deeper
\begin_layout Standard
That's all that has to be done for a shared-nothing model.
Of course, this will likely produce a split-brain (even when using DRBD
in place of MARS), but that will not matter from a user's perspective,
because the users will no longer
\begin_inset Quotes eld
\end_inset
see
\begin_inset Quotes erd
\end_inset
the
\begin_inset Quotes eld
\end_inset
logically passive
\begin_inset Quotes erd
\end_inset
side A through their network.
Only during the relatively small time period where application traffic
was going to the old side A while not replicated to B due to the incident,
a very small number of updates
\emph on
could
\emph default
 have been lost.
In fields like webhosting, this is taken into account.
Users will usually not complain when some (smaller amount of) data is lost
due to split-brain.
They will complain when the service is unavailable.
\end_layout
\begin_layout Standard
This method is the fastest for restoring availability, because it doesn't
try to execute any (remote) action at side A.
Only from a sysadmin's perspective, there remain some cleanup tasks to
be done during the following repair phase, such as split-brain resolution,
which are outside the scope of this treatment.
\end_layout
\begin_layout Standard
By running the application fencing step
\emph on
sequentially
\emph default
 (including waiting until it has succeeded at least far enough that the old side A 
can no longer be reached by any users) before the failover step, you 
may reduce the amount of lost data, but at the cost of total duration.
Your service will take longer to be available again, while the amount of
lost data is typically somewhat smaller.
\end_layout
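\begin_layout Standard
The two orderings described above (parallel for fastest recovery, sequential for less data loss) can be sketched as follows. This is a hedged illustration only: the three helper functions are hypothetical placeholders that merely print a message, and must be replaced by your own site-specific commands for checking the standby, starting the applications, and rerouting the traffic.
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=bash"
inline false
status open
\begin_layout Plain Layout
#!/bin/bash
\end_layout
\begin_layout Plain Layout
# Hypothetical placeholders for site-specific commands.
\end_layout
\begin_layout Plain Layout
standby_usable() { echo "check hot standby B"; }
\end_layout
\begin_layout Plain Layout
start_apps_on_B() { echo "start applications at B"; }
\end_layout
\begin_layout Plain Layout
fence_A() { echo "route application traffic to B"; }
\end_layout
\begin_layout Plain Layout
mode=${1:-parallel}
\end_layout
\begin_layout Plain Layout
# Phase1: abort when the hot standby is not usable.
\end_layout
\begin_layout Plain Layout
standby_usable || exit 1
\end_layout
\begin_layout Plain Layout
# Phase2: start at B and fence A.
\end_layout
\begin_layout Plain Layout
if [ "$mode" = "sequential" ]; then
\end_layout
\begin_layout Plain Layout
  # Variant for less data loss: fence first, then start.
\end_layout
\begin_layout Plain Layout
  fence_A && start_apps_on_B
\end_layout
\begin_layout Plain Layout
else
\end_layout
\begin_layout Plain Layout
  # Fastest variant: run both steps in parallel.
\end_layout
\begin_layout Plain Layout
  start_apps_on_B &
\end_layout
\begin_layout Plain Layout
  fence_A &
\end_layout
\begin_layout Plain Layout
  wait
\end_layout
\begin_layout Plain Layout
fi
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Note that a plain wait does not report failures of the background jobs. A production script would have to check each step separately.
\end_layout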
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
A few people might clamour when some data is lost.
In long-distance replication scenarios with high update traffic, there
is
\emph on
simply no way at all
\emph default
for guaranteeing that no data can be lost ever.
According to the laws of Einstein and the laws of Distributed Systems like
the famous CAP theorem, this isn't the fault of DRBD+proxy or MARS, but
simply the
\emph on
consequence
\emph default
of having long distances.
If you want to protect against data loss as best as possible, then don't
use
\begin_inset Formula $k=2$
\end_inset
replicas.
Use
\begin_inset Formula $k\geq4$
\end_inset
, and spread them over different distances, such as mixed small + medium
+ long distances.
Future versions of MARS will support adaptive pseudo-synchronous modes,
which will allow individual adaptation to network latencies / distances.
\end_layout
\begin_layout Standard
The ITON method can be adapted to shared-disk by additionally fencing the
common disk from the (presumably) failed cluster node A.
\end_layout
\begin_layout Subsubsection
Handover Methods
\end_layout
\begin_layout Standard
Planned handover is conceptually simpler, because both sides must be (almost)
healthy as a
\emph on
precondition
\emph default
.
There are simply no pre-existing failures to deal with.
\end_layout
\begin_layout Standard
Here is an example using DRBD, with the application commands denoted as pseudo 
code:
\end_layout
\begin_layout Enumerate
at side A:
\family typewriter
applicationmanager stop all
\end_layout
\begin_layout Enumerate
at side A:
\family typewriter
drbdadm secondary all
\end_layout
\begin_layout Enumerate
at side B:
\family typewriter
drbdadm primary all
\end_layout
\begin_layout Enumerate
at side B:
\family typewriter
applicationmanager start all
\end_layout
\begin_layout Standard
MARS already has a conceptual distinction between handover and failover.
With MARS, it becomes even simpler, because a generic handover procedure
is already built in:
\end_layout
\begin_layout Enumerate
at side A:
\family typewriter
applicationmanager stop all
\end_layout
\begin_layout Enumerate
at side B:
\family typewriter
marsadm primary all
\end_layout
\begin_layout Enumerate
at side B:
\family typewriter
applicationmanager start all
\end_layout
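\begin_layout Standard
The MARS handover steps above could be wrapped into a small script run from a management host. This is a sketch under assumptions: the hosts are assumed to be reachable via ssh under the names A and B, and applicationmanager is the same pseudo command as in the lists above, not a real tool.
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=bash"
inline false
status open
\begin_layout Plain Layout
#!/bin/bash
\end_layout
\begin_layout Plain Layout
# Hypothetical wrapper; host names A and B are assumptions.
\end_layout
\begin_layout Plain Layout
# Stop at the first failing step instead of continuing.
\end_layout
\begin_layout Plain Layout
set -e
\end_layout
\begin_layout Plain Layout
ssh A "applicationmanager stop all"
\end_layout
\begin_layout Plain Layout
ssh B "marsadm primary all"
\end_layout
\begin_layout Plain Layout
ssh B "applicationmanager start all"
\end_layout
\end_inset
\end_layout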
\begin_layout Subsubsection
Hybrid Methods
\end_layout
\begin_layout Standard
In general, a planned handover may fail at any stage.
Notice that such a failure is still a failure, albeit one (partially) caused by 
the planned handover itself.
You have the following alternatives for automatically dealing with such
cases:
\end_layout
\begin_layout Enumerate
In case of a failure, switch back to the old side A.
\end_layout
\begin_layout Enumerate
Instead, forcefully switch to the new side B, similar to the methods described 
in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Failover-Methods"
\end_inset
.
\end_layout
\begin_layout Standard
Similar options exist for a failed failover (at least in theory), but chances
are lower for actually recovering if you have only
\begin_inset Formula $k=2$
\end_inset
replicas in total.
\end_layout
\begin_layout Standard
Whatever you decide to do in what case in whatever priority order, whether
you decide it in advance or during the course of a failing action: it simply
means that according to the best effort principle, you should
\series bold
never leave your system in a broken state
\series default
when there exists a chance to recover availability with any method.
\end_layout
\begin_layout Standard
Therefore, you should
\emph on
implement
\emph default
neither handover nor failover in their pure forms.
Always implement hybrid forms following the best effort principle.
\end_layout
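\begin_layout Standard
A hybrid form following the best effort principle might be structured like the sketch below. The three functions are hypothetical placeholders that only print a message, and the priority order shown (handover first, then switching back, then forced failover) is just one possible choice, not a recommendation from the MARS project.
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=bash"
inline false
status open
\begin_layout Plain Layout
#!/bin/bash
\end_layout
\begin_layout Plain Layout
# Hypothetical placeholders for site-specific commands.
\end_layout
\begin_layout Plain Layout
handover_to_B() { echo "planned handover from A to B"; }
\end_layout
\begin_layout Plain Layout
switch_back_to_A() { echo "restart at old side A"; }
\end_layout
\begin_layout Plain Layout
failover_to_B() { echo "forced failover to B"; }
\end_layout
\begin_layout Plain Layout
# Best effort: try each alternative in priority order.
\end_layout
\begin_layout Plain Layout
if handover_to_B; then
\end_layout
\begin_layout Plain Layout
  echo "handover succeeded"
\end_layout
\begin_layout Plain Layout
elif switch_back_to_A; then
\end_layout
\begin_layout Plain Layout
  echo "handover failed, service restored at A"
\end_layout
\begin_layout Plain Layout
elif failover_to_B; then
\end_layout
\begin_layout Plain Layout
  echo "handover failed, forced switch to B"
\end_layout
\begin_layout Plain Layout
else
\end_layout
\begin_layout Plain Layout
  echo "no method succeeded, manual action needed" >&2
\end_layout
\begin_layout Plain Layout
  exit 1
\end_layout
\begin_layout Plain Layout
fi
\end_layout
\end_inset
\end_layout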
\begin_layout Subsection
Special Requirements for Long Distances
\begin_inset CommandInset label
LatexCommand label
name "sub:Special-Requirements-for"
\end_inset
\end_layout
\begin_layout Standard
Most contemporary clustermanagers have been constructed for short distance
shared-nothing clusters, or even for
\emph on
local
\emph default
 shared-nothing clusters (cf.
DRBD over crossover cables), or even for shared-disk clusters (
\emph on
originally
\emph default
, when their
\emph on
concepts
\emph default
were developed).
Blindly using them for long-distance replication without modification /
adaptation bears some additional risks.
\end_layout
\begin_layout Itemize
Notice that long-distance replication always
\emph on
requires
\emph default
a
\series bold
shared-nothing
\series default
model.
\end_layout
\begin_layout Itemize
As a consequence,
\series bold
split brain
\series default
can appear
\emph on
regularly
\emph default
during failover.
There is no way for preventing it! This is an
\emph on
inherent property
\emph default
of distributed systems, not limited to MARS (e.g.
 also occurring with DRBD if you try to use it over long distances).
Therefore, you
\emph on
must
\emph default
 deal with occurrences of split-brain as a 
\emph on
requirement
\emph default
.
\end_layout
\begin_layout Itemize
The probability of
\series bold
network partitions
\series default
is much higher: although you should have been required by Murphy's law
to deal with network partitions already in short-distance scenarios, it
now becomes
\emph on
mandatory
\emph default
.
\end_layout
\begin_layout Itemize
Be prepared that in case of certain types of (more or less global) internet
partitions, you may not be able to trigger STONITH actions
\emph on
at all
\emph default
.
Therefore,
\series bold
fencing of application traffic
\series default
is
\emph on
mandatory
\emph default
.
\end_layout
\begin_layout Section
Creating Backups via Pseudo Snapshots
\end_layout
\begin_layout Standard
When all your secondaries are homogeneously located in a standby datacenter, 
they will be almost idle all the time.
This is a waste of computing resources.
\end_layout
\begin_layout Standard
Since MARS is no substitute for a full-fledged backup system, and since
backups may put high system load onto your active side, you may want to
utilize your passive hardware resources in a better way.
\end_layout
\begin_layout Standard
MARS supports this thanks to its ability to switch the
\family typewriter
pause-replay
\family default
\emph on
independently
\emph default
from
\family typewriter
pause-fetch
\family default
.
\end_layout
\begin_layout Standard
The basic idea is simple: just use
\family typewriter
pause-replay
\family default
at your secondary site, but leave the replication of transaction logfiles
intact by deliberately
\emph on
not
\emph default
saying
\family typewriter
pause-fetch
\family default
.
This way, your secondary replica (block device) will stay frozen for a
limited time, without losing your redundancy: since the transaction logs 
will continue to replicate in the meantime, you can start
\family typewriter
resume-replay
\family default
at any time, in particular when a primary-side incident should happen unexpecte
dly.
The former secondary will just catch up by replaying the outstanding parts
of the transaction logs in order to become up to date.
\end_layout
\begin_layout Standard
However, some
\emph on
details
\emph default
have to be obeyed.
In particular, the current version of MARS needs an additional
\family typewriter
detach
\family default
operation, in order to release exclusive access to the underlying disk
\family typewriter
/dev/vg/$res
\family default
.
Future versions of MARS are planned to support this more directly, without
need for an intermediate
\family typewriter
detach
\family default
operation.
\end_layout
\begin_layout Standard
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Beware:
\family typewriter
mount -o ro /dev/vg/$res
\family default
can lead to
\series bold
unnoticed write operations
\series default
if you are not careful! Some journalling filesystems like
\family typewriter
xfs
\family default
or
\family typewriter
ext4
\family default
may replay their journals onto the disk, leading to
\emph on
binary
\emph default
differences and thus
\series bold
destroying your consistency
\series default
later when you re-enable
\family typewriter
resume-replay
\family default
!
\end_layout
\begin_layout Standard
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Therefore, you may use small LVM snapshots (only in such cases).
Typically,
\family typewriter
xfs
\family default
journal replay will require only a few megabytes.
Therefore you typically don't need much temporary space for this.
Here is a more detailed description of steps:
\end_layout
\begin_layout Enumerate
\family typewriter
marsadm pause-replay $res
\end_layout
\begin_layout Enumerate
\family typewriter
marsadm detach $res
\end_layout
\begin_layout Enumerate
\family typewriter
lvcreate --size 100m --snapshot --name ro-$res /dev/vg/$res
\end_layout
\begin_layout Enumerate
\family typewriter
mount -o ro /dev/vg/ro-$res /mnt/tmp
\end_layout
\begin_layout Enumerate
Now draw your backup from
\family typewriter
/mnt/tmp/
\end_layout
\begin_layout Enumerate
\family typewriter
umount /mnt/tmp
\end_layout
\begin_layout Enumerate
\family typewriter
lvremove -f /dev/vg/ro-$res
\end_layout
\begin_layout Enumerate
\family typewriter
marsadm up $res
\end_layout
\begin_layout Standard
Hint: during the backup, the transaction logs will accumulate on
\family typewriter
/mars/
\family default
.
In order to avoid overflow of
\family typewriter
/mars/
\family default
 (cf.
section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Defending-Overflow"
\end_inset
), don't unnecessarily prolong the backup duration.
\end_layout
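\begin_layout Standard
The backup steps above can be combined into a script which restores replication even when an intermediate step fails. This is a sketch under assumptions, not an official MARS tool: the volume group name vg and the mount point /mnt/tmp are taken from the steps above, while the use of tar and the target path /backup/ are hypothetical choices for illustration.
\end_layout
\begin_layout Standard
\begin_inset listings
lstparams "language=bash"
inline false
status open
\begin_layout Plain Layout
#!/bin/bash
\end_layout
\begin_layout Plain Layout
# Sketch only. Assumed: volume group "vg", mount point
\end_layout
\begin_layout Plain Layout
# /mnt/tmp, and the backup target /backup/$res.tar.
\end_layout
\begin_layout Plain Layout
set -e
\end_layout
\begin_layout Plain Layout
res=${1:?resource name required}
\end_layout
\begin_layout Plain Layout
cleanup() {
\end_layout
\begin_layout Plain Layout
  # Undo in reverse order, also after a failed step.
\end_layout
\begin_layout Plain Layout
  umount /mnt/tmp 2>/dev/null || true
\end_layout
\begin_layout Plain Layout
  lvremove -f "/dev/vg/ro-$res" 2>/dev/null || true
\end_layout
\begin_layout Plain Layout
  marsadm up "$res"
\end_layout
\begin_layout Plain Layout
}
\end_layout
\begin_layout Plain Layout
trap cleanup EXIT
\end_layout
\begin_layout Plain Layout
marsadm pause-replay "$res"
\end_layout
\begin_layout Plain Layout
marsadm detach "$res"
\end_layout
\begin_layout Plain Layout
lvcreate --size 100m --snapshot --name "ro-$res" "/dev/vg/$res"
\end_layout
\begin_layout Plain Layout
mount -o ro "/dev/vg/ro-$res" /mnt/tmp
\end_layout
\begin_layout Plain Layout
tar -cf "/backup/$res.tar" -C /mnt/tmp .
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The trap ensures that the snapshot is removed and the resource is brought up again whenever the script exits, so that a failed backup does not leave the secondary detached.
\end_layout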
\begin_layout Chapter
MARS for Developers
\end_layout
\begin_layout Standard
This chapter is organized strictly top-down.
\end_layout
\begin_layout Standard
If you are a sysadmin and want to inform yourself about internals (useful
for debugging), the relevant information is at the beginning, and you don't
need to dive into all technical details at the end.
\end_layout
\begin_layout Standard
If you are a kernel developer and want to contribute code to the emerging
MARS community, please read it (almost) all.
Due to the top-down organization, sometimes you will need to follow some
forward references in order to understand details.
Therefore I recommend reading this chapter twice in two different reading
modes: in the first reading pass, you just get a raw network of principles
and structures in your brain (you don't want to grasp details, therefore
don't strive for a full understanding).
In the second pass, you will exploit your knowledge from the first pass 
for a deeper understanding of the details.
\end_layout
\begin_layout Standard
Alternatively, you may first read the sections about general architecture,
and then start a bottom-up scan by first reading the last section about
generic objects and aspects, and working in reverse
\emph on
section
\emph default
order (but read
\emph on
sub
\emph default
sections in-order) until you finally reach the kernel interfaces / symlink
trees.
\end_layout
\begin_layout Section
Motivation / Politics
\end_layout
\begin_layout Standard
MARS is not yet upstream in the Linux kernel.
This section tries to clear up some potential doubts.
Some people have asked why MARS uses its own internal framework instead
of
\emph on
directly
\emph default
\begin_inset Foot
status open
\begin_layout Plain Layout
Notice that
\emph on
indirect
\emph default
use of pre-existing Linux infrastructure is not only possible, but actually
implemented, by using it 
\emph on
internally
\emph default
in brick
\emph on
implementations
\emph default
(black-box principle).
However, such bricks are not portable to other environments like userspace.
\end_layout
\end_inset
being based on some already existing Linux kernel infrastructures like
the device mapper.
Here is a list of technical reasons:
\end_layout
\begin_layout Enumerate
The existing device mapper infrastructure is based on
\family typewriter
struct bio
\family default
.
In contrast, the new XIO personality of the generic brick infrastructure
is based on the concept of AIO (Asynchronous IO), which is a
\series bold
true superset
\series default
of block IO.
\end_layout
\begin_layout Enumerate
In particular,
\family typewriter
struct bio
\family default
 firmly references 
\family typewriter
struct page
\family default
(via intermediate
\family typewriter
struct bio_vec
\family default
), using types like
\family typewriter
sector_t
\family default
in the field
\family typewriter
bi_sector
\family default
.
Basic transfer units are blocks, or sectors, or pages, or the like.
In contrast,
\family typewriter
struct aio_object
\family default
used by the XIO personality can address
\series bold
arbitrary granularity
\series default
memory with byte resolution even at odd
\begin_inset Foot
status open
\begin_layout Plain Layout
Some brick
\emph on
implementations
\emph default
(as opposed to the capabilities of the
\emph on
interface
\emph default
) may be (and, in fact,
\emph on
are
\emph default
) restricted to
\family typewriter
PAGE_SIZE
\family default
operations or the like.
This is no general problem, because IOP can automatically insert some translato
r bricks extending the capabilities to universal granularity (of course
at some performance costs).
\end_layout
\end_inset
positions in (virtual) files / devices, similar to classical Unix file
IO, but
\emph on
asynchronously
\emph default
.
Practical experience shows that even non-functional properties like the performance 
of many datacenter workloads profit from that 
\begin_inset Foot
status open
\begin_layout Plain Layout
The current transaction logger uses variable-sized headers at
\begin_inset Quotes eld
\end_inset
odd
\begin_inset Quotes erd
\end_inset
addresses.
Although this increases
\family typewriter
memcpy()
\family default
load due to
\begin_inset Quotes eld
\end_inset
misalignment
\begin_inset Quotes erd
\end_inset
, the
\emph on
overall performance
\emph default
was provably better than in variants where sector / page alignment was
strictly obeyed, but space was wasted for alignments.
Such functionality is only possible if the XIO infrastructure
\emph on
allows
\emph default
\emph on
for
\emph default
(but doesn't force)
\begin_inset Quotes eld
\end_inset
mis-aligned
\begin_inset Quotes erd
\end_inset
IO operations.
In future, many different transaction logfile formats showing different
runtime behaviour (e.g.
optimized for high-throughput SSD loads) may co-exist in parallel.
Note that properly aligned XIO operations bear no noticeable overhead compared
to classical block IO, at least in typical datacenter RAID scenarios.
\end_layout
\end_inset
.
The AIO/XIO abstraction contains no fixed link to kernel abstractions and
should be
\series bold
easily portable
\series default
to other environments.
In summary, the new personality provides a uniform abstraction which abstracts
away from multiple different kernel interfaces; it is designed to be useful
even in userspace.
\end_layout
\begin_layout Enumerate
Kernel infrastructures for the concept of
\emph on
direct IO
\emph default
are different from those for
\emph on
buffered IO
\emph default
.
The XIO personality used by MARS subsumes both concepts as use case
\emph on
variants
\emph default
.
\series bold
Buffering
\series default
is an optional internal property of XIO bricks (almost non-functional property
with support for consistency guarantees).
\end_layout
\begin_layout Enumerate
The AIO/XIO personality is generically designed for remote operations over
networks, at arbitrary places in the IO stack, with (almost
\begin_inset Foot
status open
\begin_layout Plain Layout
By default, automatic network connection re-establishment and infinite network
retries are already implemented in the
\family typewriter
xio_client
\family default
and
\family typewriter
xio_server
\family default
bricks to provide fully transparent semantics.
However, this may be undesirable in case of fatal crashes.
Therefore, abort operations are also configurable, as well as network timeouts
which are then mapped to classical IO errors.
\end_layout
\end_inset
) no semantic differences to local operations (built-in
\series bold
network transparency
\series default
).
There are universal provisions for mixed operation of different versions
(
\series bold
rolling software updates
\series default
in clusters / grids).
\end_layout
\begin_layout Enumerate
The generic brick infrastructure (as well as its personalities like XIO
or any other future personality) supports
\series bold
dynamic re-wiring / re-configuration
\series default
\emph on
during
\emph default
operation (even while parallel IO requests are flying, some of them taking
different paths in the IO stack in parallel).
This is absolutely needed for MARS logfile rotation.
In the long term, this would be useful for many advanced new features and
products, not limited to multipathing.
\end_layout
\begin_layout Enumerate
The generic brick infrastructure (and in turn all personalities) provide
\series bold
additional comfort
\series default
to the programmer while enabling
\series bold
increased functionality
\series default
: by use of a generalization of
\series bold
aspect orientation
\series default
\begin_inset Foot
status open
\begin_layout Plain Layout
Similar to AOP, insertion of IOP bricks for checking / debugging etc is
one of the key advantages of the generic brick infrastructure.
In contrast to AOP where debugging is usually {en,dis}abled statically
at compile time, IOP allows for
\emph on
dynamic
\emph default
(re-)configuration of debugging bricks, automatic repair, and many more
features promoted by
\emph on
organic computing
\emph default
.
\end_layout
\end_inset
, the programmer need no longer worry about dynamic memory allocations for
\emph on
local state
\emph default
in a brick instance.
MARS is
\series bold
automating local state
\series default
even when dynamically instantiating new bricks (possibly having the same
brick type) at runtime.
Specifically, XIO is automating 
\series bold
request stacking
\series default
at the completion path this way, even while dynamically reconfiguring the
IO stack
\begin_inset Foot
status open
\begin_layout Plain Layout
The generic aspect orientation approach leads to better
\series bold
separation of concerns
\series default
: local state needed by brick implementations is not visible from outside
by default.
In other words, local state is also
\series bold
private state
\series default
.
Accidental hampering of internal operations is impeded.
\end_layout
\begin_layout Plain Layout
Example from the kernel: in
\family typewriter
include/linux/blkdev.h
\family default
the definition of
\family typewriter
struct request
\family default
contains the following comment:
\family typewriter
/* the following two fields are internal, NEVER access directly */
\family default
.
It appears that
\family typewriter
struct request
\family default
contains not only fields relevant for the caller, but also
\series bold
internal fields
\series default
needed only in
\emph on
some
\emph default
\emph on
specific
\emph default
callees.
For example,
\family typewriter
rb_node
\family default
is documented to be used only in IO schedulers.
\end_layout
\begin_layout Plain Layout
XIO goes one step further: there need not exist exactly one IO scheduler
instance in the IO stack for a single device.
Future
\family typewriter
xio_scheduler_{deadline,cfq,...}
\family default
brick types could be each instantiated many times, and in arbitrary places,
even for the same (logical) device.
The equivalent of
\family typewriter
rb_node
\family default
would then be automatically instantiated multiple times for the same IO
request, by automatically instantiating the right local aspect instances.
\end_layout
\end_inset
.
A similar automation
\begin_inset Foot
status open
\begin_layout Plain Layout
DM can achieve stacking and dynamic routing by a workaround called
\emph on
request cloning
\emph default
, potentially leading to mass creation of temporary / intermediate object
instances.
\end_layout
\end_inset
does not exist in the rest of the Linux kernel.
\end_layout
\begin_layout Enumerate
The generic brick infrastructure, together with personalities like XIO,
enables
\series bold
new long-term functional and non-functional opportunities
\series default
by use of concepts from instance-oriented programming (IOP
\begin_inset Foot
status open
\begin_layout Plain Layout
See
\begin_inset Flex URL
status collapsed
\begin_layout Plain Layout
http://athomux.net/papers/paper_inst2.pdf
\end_layout
\end_inset
\end_layout
\end_inset
).
The application area is
\series bold
not limited to device drivers
\series default
.
For example, a new personality for
\emph on
stackable filesystems
\emph default
could be developed in future.
\end_layout
\begin_layout Standard
In summary, anyone who would insist that MARS should be
\emph on
directly
\begin_inset Foot
status open
\begin_layout Plain Layout
Notice that kernel-specific structures like
\family typewriter
struct bio
\family default
are of course used by MARS, but only
\emph on
inside
\emph default
the blackbox implementation of bricks like
\family typewriter
mars_bio
\family default
or
\family typewriter
mars_if
\family default
which act as
\series bold
adaptors
\series default
to/from that structure.
It is possible to write further adaptors, e.g.
for direct interfacing to the device mapper infrastructure.
\end_layout
\end_inset
\emph default
based on pre-existing kernel structures / frameworks instead of contributing
a new framework would cause a
\emph on
massive regression of functionality
\emph default
.
\end_layout
\begin_layout Itemize
On one hand, all code contributed by the MARS project is
\series bold
non-intrusive
\series default
into the rest of the Linux kernel.
From the viewpoint of other parts of the kernel, the whole addition
\emph on
behaves
\emph default
\emph on
like
\emph default
a driver (although its infrastructure is much more than a driver).
\end_layout
\begin_layout Itemize
On the other hand, if people are interested, the contributed infrastructure
\emph on
may
\emph default
be used to
\emph on
add
\emph default
to the power of the Linux kernel.
It is designed to be
\series bold
open for contributions
\series default
.
\end_layout
\begin_layout Itemize
A
\emph on
possible
\emph default
(but not the only possible) way to do this is giving the generic brick
framework / the XIO personality as well as future personalities / the MARS
application the status of a
\emph on
subsystem
\emph default
inside the kernel (in the long term), similar to the SCSI subsystem or
the network subsystem.
No one is forced to use it, but anybody may use it if he/she likes.
\end_layout
\begin_layout Itemize
Politically, the author is a FOSS advocate willing to collaborate and to
support anyone interested in contributions.
The author's personal interest is long-term and is open for both in-tree
and out-of-tree extensions of both the framework and MARS by any other
party obeying the GPL and not hazarding FOSS by patents (instead supporting
organizations like the Open Invention Network).
The author is open to closer relationships with the Linux Foundation and
other parts of the Linux ecosystem.
\end_layout
\begin_layout Section
Architecture Overview
\end_layout
\begin_layout Standard
\begin_inset Graphics
filename images/MARS_Framework_Architecture.pdf
width 100col%
\end_inset
\end_layout
\begin_layout Section
Some Architectural Details
\end_layout
\begin_layout Standard
The following pictures show some
\begin_inset Quotes eld
\end_inset
zones of responsibility
\begin_inset Quotes erd
\end_inset
, not necessarily a strict hierarchy (although Dijkstra's famous layering
rules from THE are respected as far as possible).
The construction principle follows the concept of
\series bold
Instance Oriented Programming
\series default
(IOP) described in
\begin_inset Flex URL
status collapsed
\begin_layout Plain Layout
http://athomux.net/papers/paper_inst2.pdf
\end_layout
\end_inset
.
Please note that MARS is only instance-
\emph on
based
\emph default
\begin_inset Foot
status open
\begin_layout Plain Layout
Similar to OOP, where
\begin_inset Quotes eld
\end_inset
object-based
\begin_inset Quotes erd
\end_inset
means a weaker form of
\begin_inset Quotes eld
\end_inset
object-oriented
\begin_inset Quotes erd
\end_inset
, the term
\begin_inset Quotes eld
\end_inset
instance-based
\begin_inset Quotes erd
\end_inset
means that the
\emph on
strategy
\emph default
brick layer need not be fully modularized according to the IOP principles,
but the
\emph on
worker
\emph default
brick layer already is.
\end_layout
\end_inset
, while MARS Full is planned to be fully instance-
\emph on
oriented
\emph default
.
\end_layout
\begin_layout Subsection
MARS Architecture
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/mars-light-architecture.fig
width 40col%
\end_inset
\end_layout
\begin_layout Subsection
MARS Full Architecture (planned)
\end_layout
\begin_layout Standard
\noindent
\align center
\begin_inset Graphics
filename images/mars-full-architecture.fig
width 80col%
\end_inset
\end_layout
\begin_layout Section
Documentation of the Symlink Trees
\begin_inset CommandInset label
LatexCommand label
name "sec:Documentation-of-the"
\end_inset
\end_layout
\begin_layout Standard
The
\family typewriter
/mars/
\family default
symlink tree is serving the following purposes, all at the same time:
\end_layout
\begin_layout Enumerate
For
\series bold
communication
\series default
between cluster nodes, see sections
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-Lamport-Clock"
\end_inset
and
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:The-Symlink-Tree"
\end_inset
.
This communication is even the
\emph on
only
\emph default
communication between cluster nodes (apart from the
\emph on
contents
\emph default
of transaction logfiles and sync data).
\end_layout
\begin_layout Enumerate
\series bold
\emph on
Internal
\emph default
interface
\series default
between the kernel module and the userspace tool
\family typewriter
marsadm
\family default
.
\end_layout
\begin_layout Enumerate
\series bold
\emph on
Internal
\emph default
persistent repository
\series default
which keeps state information between reboots (also in case of node crashes).
It is even the
\emph on
only
\emph default
place where state information is kept.
There is no other place like
\family typewriter
/etc/drbd.conf
\family default
.
\end_layout
\begin_layout Standard
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Because of its internal character, its representation and semantics may
change at any time without notice (e.g.
via an
\emph on
internal
\emph default
upgrade procedure between major releases).
It is
\emph on
not
\emph default
an external interface to the outer world.
Don't build anything on it.
\end_layout
\begin_layout Standard
However, knowledge of the symlink tree is useful for advanced sysadmins,
for
\series bold
human inspection
\series default
and for
\series bold
debugging
\series default
.
And, of course, for developers.
\end_layout
\begin_layout Standard
As an
\begin_inset Quotes eld
\end_inset
official
\begin_inset Quotes erd
\end_inset
interface from outside, only the
\family typewriter
marsadm
\family default
command should be used.
\end_layout
\begin_layout Subsection
Documentation of the MARS Symlink Tree
\end_layout
\begin_layout Section
XIO Worker Bricks
\end_layout
\begin_layout Section
Strategy Worker Bricks
\end_layout
\begin_layout Standard
NYI
\end_layout
\begin_layout Section
The XIO Brick Personality
\end_layout
\begin_layout Section
The Generic Brick Infrastructure Layer
\end_layout
\begin_layout Section
The Generic Object and Aspect Infrastructure
\end_layout
\begin_layout Chapter
\start_of_appendix
Technical Data MARS
\end_layout
\begin_layout Standard
MARS has some built-in limitations which should be overcome
\begin_inset Foot
status open
\begin_layout Plain Layout
Some internal algorithms are quadratic.
The reason is that MARS evolved from a lab prototype which wasn't originally
intended for enterprise grade usage, but should have been succeeded by
the fully instance-oriented MARS Full much earlier.
\end_layout
\end_inset
by the future MARS Full.
Please don't exceed the following limits:
\end_layout
\begin_layout Itemize
maximum 10 nodes per cluster
\end_layout
\begin_layout Itemize
maximum 10 resources per cluster
\end_layout
\begin_layout Itemize
maximum 100 logfiles per resource
\end_layout
\begin_layout Chapter
Handout for Midnight Problem Solving
\end_layout
\begin_layout Standard
Here are generic instructions for the generic
\family typewriter
marsadm
\family default
and commandline level.
Other levels (e.g.
different types of cluster managers, PaceMaker, control scripts /
\family typewriter
rc
\family default
scripts /
\family typewriter
upstart
\family default
 scripts, etc.) should be described elsewhere.
\end_layout
\begin_layout Section
Inspecting the State of MARS
\end_layout
\begin_layout Standard
For manual inspection, please prefer the new
\family typewriter
marsadm view all
\family default
over the old
\family typewriter
marsadm view-1and1 all
\family default
.
It shows more appropriate / detailed information.
\end_layout
\begin_layout Standard
Hint: this might change in the future when somebody programs better macros
for the
\family typewriter
view-1and1
\family default
variant, or create even better other macros.
\end_layout
\begin_layout Quotation
\family typewriter
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
# watch marsadm view all
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Checking the low-level network connections at runtime:
\end_layout
\begin_layout Quotation
\family typewriter
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
# watch "netstat --tcp | grep 777"
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Meaning of the port numbers (as currently configured into the kernel module,
may change in future):
\end_layout
\begin_layout Itemize
7777 = metadata / symlink propagation
\end_layout
\begin_layout Itemize
7778 = transfer of transaction logfiles
\end_layout
\begin_layout Itemize
7779 = transfer of sync traffic
\end_layout
\begin_layout Standard
7777 must be always active on a healthy cluster.
7778 and 7779 will appear only on demand, when some data is transferred.
\end_layout
\begin_layout Standard
Hint: when one of the columns Send-Q or Recv-Q is constantly at high values,
you might have a network bottleneck.
\end_layout
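\begin_layout Standard
As a minimal sketch (assuming the same 
\family typewriter
netstat
\family default
 invocation as above), the three ports can be listed separately, which makes
 it easier to see which kind of traffic is currently active:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
# for p in 7777 7778 7779; do
\end_layout
\begin_layout Plain Layout
>   echo "== port $p"
\end_layout
\begin_layout Plain Layout
>   netstat --tcp | grep $p
\end_layout
\begin_layout Plain Layout
> done
\end_layout
\end_inset
 An empty section for 7778 or 7779 is normal as long as no data is being
 transferred.
\end_layout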
\begin_layout Section
Replication is Stuck
\end_layout
\begin_layout Standard
Indications of a stuck replication:
\end_layout
\begin_layout Itemize
One of the flags shown by
\family typewriter
marsadm view all
\family default
or
\family typewriter
marsadm view-flags all
\family default
 contains a symbol 
\family typewriter
"-"
\family default
(dash).
This means that some switch is currently switched off (deliberately).
Please check whether there is a valid reason why somebody else switched
it off.
 If it was switched off just by accident, use the following command to get
 the replication going again:
\family typewriter
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
# marsadm up all
\end_layout
\end_inset
\family default
(or replace
\family typewriter
all
\family default
by a particular resource name if you want to start only a specific one).
\begin_inset Newline newline
\end_inset
Note:
\family typewriter
up
\family default
is equivalent to the sequence
\family typewriter
attach; resume-fetch; resume-replay; resume-sync
\family default
.
Instead of switching each individual knob, use
\family typewriter
up
\family default
as a shortcut for switching on anything which is currently off.
\end_layout
\begin_layout Itemize
\family typewriter
netstat --tcp | grep 7777
\family default
does not show anything.
Please check the following:
\end_layout
\begin_deeper
\begin_layout Itemize
Is the kernel module loaded? Check
\family typewriter
lsmod | grep mars
\family default
.
When necessary, run
\family typewriter
modprobe mars
\family default
.
\end_layout
\begin_layout Itemize
Is the network interface down? Check
\family typewriter
ifconfig
\family default
, and/or
\family typewriter
ethtool
\family default
and friends, and fix it when necessary.
\end_layout
\begin_layout Itemize
Is a
\family typewriter
ping <partner-host>
\family default
possible? If not, fix the network / routing / firewall / etc.
When fixed, the MARS connections should automatically appear after about
1 minute.
\end_layout
\begin_layout Itemize
When
\family typewriter
ping
\family default
is possible, but a MARS connection to port 7777 does not appear after a
few minutes, try to connect to remote port 7777 by hand via
\family typewriter
telnet
\family default
.
But don't type anything, just abort the connection immediately when it
works! Typing anything will almost certainly throw a harsh error message
at the other server, which could unnecessarily alarm other people.
\end_layout
\end_deeper
\begin_layout Itemize
Check whether
\family typewriter
marsadm view all
\family default
shows some progress bars somewhere.
Example:
\family typewriter
\size scriptsize
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
istore-test-bap1:~# marsadm view all
\end_layout
\begin_layout Plain Layout
--------- resource lv-0
\end_layout
\begin_layout Plain Layout
lv-0 OutDated[F] PausedReplay dCAS-R Secondary istore-test-bs1
\end_layout
\begin_layout Plain Layout
replaying: [>...................] 1.21% (12/1020)MiB logs: [2..3]
\end_layout
\begin_layout Plain Layout
> fetch: 1008.198 MiB rate: 0 B/sec remaining: --:--:-- hrs
\end_layout
\begin_layout Plain Layout
> replay: 0 B rate: 0 B/sec remaining: 00:00:00 hrs
\end_layout
\end_inset
\family default
\size default
At least one of the
\family typewriter
rate:
\family default
values should be greater than 0.
When none of the
\family typewriter
rate:
\family default
values indicate any progress for a longer time, try
\family typewriter
marsadm up all
\family default
again.
If it doesn't help, check and repair the network.
If even this does not help, check the hardware for any IO hangups, or kernel
hangups.
First, check the RAID controllers.
Often (but not certainly), a stuck kernel can be recognized when many processes
are
\emph on
permanently
\emph default
in state "D", for a long time:
\family typewriter
ps ax | grep " D" | grep -v grep
\family default
or similar.
Please check whether there is just an overload, or
\emph on
really
\emph default
a true kernel problem.
Discrimination is not easy, and requires experience (as with any other
system; not limited to MARS).
A truly stuck kernel can only be resurrected by rebooting.
The same holds for any hardware problems.
\end_layout
\begin_layout Itemize
Check whether
\family typewriter
marsadm view all
\family default
reports any lines like
\family typewriter
WARNING: SPLIT BRAIN at '' detected
\family default
.
In such a case, check that there is
\emph on
really
\emph default
a split brain, before obeying the instructions in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Resolution-of-Split"
\end_inset
.
Notice that network outages or missing
\family typewriter
marsadm log-delete-all all
\family default
may continue to report an old split brain which has gone in the meantime.
\end_layout
\begin_layout Itemize
Check whether
\family typewriter
/mars/
\family default
is too full.
For a rough impression,
\family typewriter
df /mars/
\family default
may be used.
For getting authoritative values as internally used by the MARS emergency-mode
computations, use
\family typewriter
marsadm view-rest-space
\family default
(the unit is GiB).
In practice, the differences are only marginal, at least on bigger
\family typewriter
/mars/
\family default
partitions.
 When there is only little free space left (or none at all), please follow
 the instructions in section 
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Resolution-of-Emergency"
\end_inset
.
\end_layout
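\begin_layout Standard
The checks from this section can be run in one go as a first triage.
 This is only a sketch which reuses the commands mentioned above; the 
\family typewriter
echo
\family default
 messages are illustrative and not produced by MARS itself:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
# lsmod | grep mars || echo "kernel module not loaded"
\end_layout
\begin_layout Plain Layout
# netstat --tcp | grep 7777 || echo "no connection on port 7777"
\end_layout
\begin_layout Plain Layout
# marsadm view-flags all
\end_layout
\begin_layout Plain Layout
# marsadm view all | grep "SPLIT BRAIN"
\end_layout
\begin_layout Plain Layout
# ps ax | grep " D" | grep -v grep
\end_layout
\begin_layout Plain Layout
# df /mars/
\end_layout
\begin_layout Plain Layout
# marsadm view-rest-space
\end_layout
\end_inset
 Any suspicious output should then be followed up with the corresponding
 item of the list above.
\end_layout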
\begin_layout Section
Resolution of Emergency Mode
\begin_inset CommandInset label
LatexCommand label
name "sec:Resolution-of-Emergency"
\end_inset
\end_layout
\begin_layout Standard
Emergency mode occurs when
\family typewriter
/mars/
\family default
runs out of space, such that no new logfile data can be written anymore.
\end_layout
\begin_layout Standard
In emergency mode, the primary will write any write requests
\emph on
directly
\emph default
to the underlying disk, as if MARS were not present at all.
Thus, your application will continue to run.
Only the
\emph on
replication
\emph default
as such is stopped.
\end_layout
\begin_layout Standard
\begin_inset Note Greyedout
status open
\begin_layout Plain Layout
Notice: emergency mode means that your secondary nodes are usually in a
\emph on
consistent
\emph default
, but
\emph on
outdated
\emph default
state (exception: when a sync was running in parallel to the emergency
mode, then the sync will be automatically started over again).
You can check consistency via
\family typewriter
marsadm view-flags all
\family default
.
Only when a local disk shows a lower-case letter
\family typewriter
"d"
\family default
instead of an uppercase
\family typewriter
"D"
\family default
, it is known to be inconsistent (e.g.
during a sync).
 When there is a dash instead, it usually means that the disk is detached
or misconfigured or the kernel module is not started.
Please fix these problems first before believing that your local disk is
unusable.
Even if it is really inconsistent (which is very unlikely, typically occurring
only as a consequence of hardware failures, or of the above-mentioned exception
), you have a big chance to recover most of the data via
\family typewriter
fsck
\family default
and friends.
\end_layout
\end_inset
\end_layout
\begin_layout Standard
A currently existing Emergency mode can be detected by
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
primary:~# marsadm view-is-emergency all
\end_layout
\begin_layout Plain Layout
secondary:~# marsadm view-is-emergency all
\end_layout
\end_inset
Notice: this delivers the current state, telling nothing about the past.
\end_layout
\begin_layout Standard
Currently, emergency mode will also show something like
\family typewriter
WARNING: SPLIT BRAIN at '' detected
\family default
.
This ambiguity will be resolved in a future MARS release.
It is however not crucial: the resolution methods for both cases are very
similar.
If in doubt, start emergency resolution first, and only proceed to split
 brain resolution if it did not help.
\end_layout
\begin_layout Standard
Preconditions:
\end_layout
\begin_layout Itemize
Only current version of MARS: the space at the primary side should have
been already released, and the emergency mode should have been already
left.
Otherwise, you might need the split-brain resolution method from section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Resolution-of-Split"
\end_inset
.
\end_layout
\begin_layout Itemize
The network
\series bold
must
\series default
be working.
Check that the following gives an entry for each secondary:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
primary:~# netstat --tcp | grep 7777
\end_layout
\end_inset
When necessary, fix the network first (see instructions above).
\end_layout
\begin_layout Standard
Emergency mode should now be resolved via the following instructions:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
primary:~# marsadm view-is-emergency all
\end_layout
\begin_layout Plain Layout
primary:~# du -s /mars/resource-* | sort -n
\end_layout
\end_inset
Remember the affected resources.
Best practice is to do the following, starting with the
\emph on
biggest
\emph default
resource as shown by the
\family typewriter
du | sort
\family default
output in reverse order, but
\emph on
starting
\emph default
the following only with the
\emph on
affected
\emph default
resources in the first place:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
secondary1:~# marsadm invalidate <res1>
\end_layout
\begin_layout Plain Layout
secondary1:~# marsadm log-delete-all all
\end_layout
\begin_layout Plain Layout
...
 ditto with all resources showing emergency mode
\end_layout
\begin_layout Plain Layout
...
 ditto on all other secondaries
\end_layout
\begin_layout Plain Layout
primary:~# marsadm log-delete-all all
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Hint: during the resolution process, some other resources might have gone
into emergency mode concurrently.
In addition, it is possible that some secondaries are stuck at particular
resources while the corresponding primary has
\emph on
not yet
\emph default
entered emergency mode.
Please repeat the steps in such a case, and look for emergency modes at
secondaries additionally.
When necessary, extend your list of
\emph on
affected
\emph default
resources.
\end_layout
\begin_layout Standard
Hint: be patient.
Deleting large bulks of logfile data may take a long time, at least on
highly loaded systems.
You should give the cleanup processes at least 5 minutes before concluding
that an
\family typewriter
invalidate
\family default
followed by
\family typewriter
log-delete-all
\family default
had no effect! Don't forget to give the
\family typewriter
log-delete-all
\family default
at all cluster nodes, even when seemingly unaffected.
\end_layout
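\begin_layout Standard
On a secondary, the per-resource steps can be sketched as a loop.
 The resource names 
\family typewriter
lv-0
\family default
 and 
\family typewriter
lv-1
\family default
 are only placeholders for your own list of affected resources, ordered
 from the biggest to the smallest one:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
secondary1:~# for res in lv-0 lv-1; do
\end_layout
\begin_layout Plain Layout
>   marsadm invalidate $res
\end_layout
\begin_layout Plain Layout
>   marsadm log-delete-all all
\end_layout
\begin_layout Plain Layout
> done
\end_layout
\begin_layout Plain Layout
secondary1:~# watch marsadm view-is-emergency all
\end_layout
\end_inset
 The final 
\family typewriter
watch
\family default
 is just a convenient way of waiting for the cleanup to take effect.
\end_layout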
\begin_layout Standard
In very complex scenarios, when the primary roles of different resources
 are spread over different hosts (aka mixed operation), you may need to repeat
the whole cycle iteratively for a few cycles until the jam is resolved.
\end_layout
\begin_layout Standard
If it does not go away, you have another chance by the following split-brain
 resolution process, which will also clean up emergency mode as a side effect.
\end_layout
\begin_layout Section
Resolution of Split Brain and of Emergency Mode
\begin_inset CommandInset label
LatexCommand label
name "sec:Resolution-of-Split"
\end_inset
\end_layout
\begin_layout Standard
Hint: in many cases (but not guaranteed), the previous recipe for resolution
 of emergency mode will also clean up split brain.
Good chances are in case of
\begin_inset Formula $k=2$
\end_inset
total replicas.
Please collect your own experiences which method works better for you!
\end_layout
\begin_layout Standard
Precondition: the network must be working.
Check that the following gives an entry for each secondary:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
primary:~# netstat --tcp | grep 7777
\end_layout
\end_inset
When necessary, fix the network first (see instructions above).
\end_layout
\begin_layout Standard
Inspect the split brain situation:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
primary:~# marsadm view all
\end_layout
\begin_layout Plain Layout
primary:~# du -s /mars/resource-* | sort -n
\end_layout
\end_inset
Remember those resources where a message like
\family typewriter
WARNING: SPLIT BRAIN at '' detected
\family default
appears.
Do the following only for
\emph on
affected
\emph default
resources, starting with the biggest one (before proceeding to the next
one).
\end_layout
\begin_layout Standard
Do the following with only
\emph on
one
\emph default
resource at a time (before proceeding to the next one), and repeat the
actions on that resource at every secondary (if there are multiple secondaries)
:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
secondary1:~# marsadm leave-resource $res1
\end_layout
\begin_layout Plain Layout
secondary1:~# marsadm log-delete-all all
\end_layout
\end_inset
Check whether the split brain has vanished everywhere.
 Start over with other resources at their secondaries when necessary.
\end_layout
\begin_layout Standard
Finally, when no split brain is reported at any (former) secondary, do the
following on the primary:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
primary:~# marsadm log-delete-all all
\end_layout
\begin_layout Plain Layout
primary:~# sleep 30
\end_layout
\begin_layout Plain Layout
primary:~# marsadm view all
\end_layout
\end_inset
Now, the split brain should be gone even at the primary.
If not, repeat this step.
\end_layout
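\begin_layout Standard
The repetition on the primary can be sketched as a bounded loop.
 Three attempts is an arbitrary choice made for this example, not a MARS
 recommendation; the loop stops as soon as no split brain message is printed
 anymore:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
primary:~# for i in 1 2 3; do
\end_layout
\begin_layout Plain Layout
>   marsadm log-delete-all all
\end_layout
\begin_layout Plain Layout
>   sleep 30
\end_layout
\begin_layout Plain Layout
>   marsadm view all | grep "SPLIT BRAIN" || break
\end_layout
\begin_layout Plain Layout
> done
\end_layout
\end_inset
\end_layout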
\begin_layout Standard
In case even this should fail on some
\family typewriter
$res
\family default
(which is very unlikely), read the PDF manual before using
\family typewriter
marsadm log-purge-all $res
\family default
.
\end_layout
\begin_layout Standard
Finally, when the split brain is gone everywhere, rebuild the redundancy
at every secondary via
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
secondary1:~# marsadm join-resource $res1 /dev/<lv-x>/$res1
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\noindent
If even this method does not help, set up the whole cluster afresh by 
\family typewriter
rmmod mars
\family default
everywhere, and creating a fresh
\family typewriter
/mars/
\family default
filesystem everywhere, followed by the same procedure as installing MARS
for the first time (which is outside the scope of this handout).
\end_layout
\begin_layout Section
Handover of Primary Role
\end_layout
\begin_layout Standard
When there exists a method for primary handover in higher layers such as
cluster managers, please prefer that method (e.g.
\family typewriter
cm3
\family default
or other tools).
\end_layout
\begin_layout Standard
If suchalike doesn't work, or if you need to handover some resource
\family typewriter
$res1
\family default
by hand, do the following:
\end_layout
\begin_layout Itemize
Stop the load / application corresponding to
\family typewriter
$res1
\family default
on the old primary side.
\end_layout
\begin_layout Itemize
\family typewriter
umount /dev/mars/$res1
\family default
, or otherwise close any openers such as iSCSI.
\end_layout
\begin_layout Itemize
At the new primary:
\family typewriter
marsadm primary $res1
\end_layout
\begin_layout Itemize
Restart the application at the new site (in reverse order to above).
In case you want to switch
\emph on
all
\emph default
resources which are not yet at the new side, you may use
\family typewriter
marsadm primary all
\family default
.
\end_layout
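\begin_layout Standard
Put together, a manual handover of a single resource might look like the
 following sketch.
 The host names and the mountpoint 
\family typewriter
/mnt/$res1
\family default
 are assumptions made for illustration; stopping and restarting the application
 depends on your setup and is therefore not shown:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
oldprimary:~# umount /dev/mars/$res1
\end_layout
\begin_layout Plain Layout
newprimary:~# marsadm primary $res1
\end_layout
\begin_layout Plain Layout
newprimary:~# marsadm view $res1
\end_layout
\begin_layout Plain Layout
newprimary:~# mount /dev/mars/$res1 /mnt/$res1
\end_layout
\end_inset
\end_layout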
\begin_layout Section
Emergency Switching of Primary Role
\end_layout
\begin_layout Standard
Emergency switching is necessary when your primary is no longer reachable
over the network for a
\emph on
longer
\emph default
time, or when the hardware is defective.
\end_layout
\begin_layout Standard
Emergency switching will very often lead to a split brain, which requires
lots of manual actions to resolve (see above).
Therefore, try to avoid emergency switching when possible!
\end_layout
\begin_layout Standard
Hint: MARS can automatically recover after a primary crash / reboot, as
well as after secondary crashes, just by executing
\family typewriter
modprobe mars
\family default
after
\family typewriter
/mars/
\family default
had been mounted.
 Please consider waiting until your system comes up again, instead of risking
a split brain.
\end_layout
\begin_layout Standard
The decision between emergency switching and continuing operation at the
same primary side is an operational one.
MARS can support your decision by the following information at the potentially
new primary side (which was in secondary mode before):
\family typewriter
\size scriptsize
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
istore-test-bap1:~# marsadm view all
\end_layout
\begin_layout Plain Layout
--------- resource lv-0
\end_layout
\begin_layout Plain Layout
lv-0 InConsistent Syncing dcAsFr Secondary istore-test-bs1
\end_layout
\begin_layout Plain Layout
syncing: [====>..............] 27.84% (567/2048)MiB rate: 72583.00 KiB/sec remaining: 00:00:20
hrs
\end_layout
\begin_layout Plain Layout
> sync: 567.293/2048 MiB rate: 72583 KiB/sec remaining: 00:00:20 hrs
\end_layout
\begin_layout Plain Layout
replaying: [>:::::::::::::::::::] 0.00% (0/12902)KiB logs: [1..1]
\end_layout
\begin_layout Plain Layout
> fetch: 0 B rate: 38 KiB/s remaining: 00:00:00
\end_layout
\begin_layout Plain Layout
> replay: 12902.047 KiB rate: 0 B/s remaining: --:--:--
\end_layout
\end_inset
\family default
\size default
When your target is syncing (like in this example), you cannot switch to
it (same as with DRBD).
When you had an emergency mode before, you should first resolve that (whenever
possible).
When a split brain is reported, try to resolve it first (same as with DRBD).
Only in case you
\emph on
know
\emph default
 that the primary is really damaged, or it is really impossible to run
the application there for some reason, emergency switching is desirable.
\end_layout
\begin_layout Standard
Hint: in case the secondary is inconsistent for some reason, e.g.
because of an incremental fast full-sync, you have a last chance to recover
most data after forceful switching by using a filesystem check or suchalike.
This might be even faster than restoring data from the backup.
But use it only if you are
\emph on
really
\emph default
desperate!
\end_layout
\begin_layout Standard
The amount of data which is
\emph on
known
\emph default
to be missing at your secondary is shown after the
\family typewriter
> fetch:
\family default
in human-readable form.
However, in cases of networking problems this information may be outdated.
You
\emph on
always
\emph default
need to consider further facts which cannot be known by MARS.
\end_layout
\begin_layout Standard
When there exists a method for emergency switching of the primary in higher
 layers such as cluster managers, please prefer that method over the following
 one.
\end_layout
\begin_layout Standard
If suchalike doesn't work, or when a handover attempt has failed several
times, or if you
\emph on
really need
\emph default
forceful switching of some resource
\family typewriter
$res1
\family default
by hand, you can do the following:
\end_layout
\begin_layout Itemize
When possible, stop the load / application corresponding to
\family typewriter
$res1
\family default
on the old primary side.
\end_layout
\begin_layout Itemize
When possible,
\family typewriter
umount /dev/mars/$res1
\family default
, or otherwise close any openers such as iSCSI.
\end_layout
\begin_layout Itemize
When possible (if you have some time), wait until as much data has been
propagated to the new primary as possible (watch the
\family typewriter
fetch:
\family default
indicator).
\end_layout
\begin_layout Itemize
At the new primary:
\family typewriter
marsadm disconnect $res1; marsadm primary --force $res1
\end_layout
\begin_layout Itemize
Restart the application at the new site (in reverse order to above).
\end_layout
\begin_layout Itemize
After the application is known to run reliably, check for split brains and
 clean them up when necessary.
\end_layout
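\begin_layout Standard
The last steps of the list above can be sketched as follows, run at the
 designated new primary (the host name in the prompt is only a placeholder).
 The first command shows how much data is known to be missing, the last
 one checks for a resulting split brain:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
newprimary:~# marsadm view $res1 | grep "fetch:"
\end_layout
\begin_layout Plain Layout
newprimary:~# marsadm disconnect $res1
\end_layout
\begin_layout Plain Layout
newprimary:~# marsadm primary --force $res1
\end_layout
\begin_layout Plain Layout
newprimary:~# marsadm view all | grep "SPLIT BRAIN"
\end_layout
\end_inset
\end_layout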
\begin_layout Chapter
Alternative Methods for Split Brain Resolution
\begin_inset CommandInset label
LatexCommand label
name "chap:Alternative-Methods-for"
\end_inset
\end_layout
\begin_layout Standard
Instead of
\family typewriter
marsadm invalidate
\family default
, the following steps may be used.
In preference, start with the old
\begin_inset Quotes eld
\end_inset
wrong
\begin_inset Quotes erd
\end_inset
primaries first:
\end_layout
\begin_layout Enumerate
\family typewriter
marsadm leave-resource mydata
\end_layout
\begin_layout Enumerate
After having done this on one cluster node, check whether the split brain
is already gone (e.g.
by saying
\family typewriter
marsadm view mydata
\family default
).
There are chances that you don't need this on all of your nodes.
Only in very rare
\begin_inset Foot
status open
\begin_layout Plain Layout
When your network had partitioned in a very awkward way for a long time,
and when your partitioned primaries did several
\family typewriter
log-rotate
\family default
 operations independently from each other, there is a small chance that 
\family typewriter
leave-resource
\family default
does not clean up
\emph on
all
\emph default
remains of such an awkward situation.
Only in such a case, try
\family typewriter
log-purge-all
\family default
.
\end_layout
\end_inset
 cases, it might happen that the preceding 
\family typewriter
leave-resource
\family default
operations were not able to clean up all logfiles produced in parallel
by the split brain situation.
\end_layout
\begin_layout Enumerate
Read the documentation about
\family typewriter
log-purge-all
\family default
(see page
\begin_inset CommandInset ref
LatexCommand pageref
reference "log-purge-all$res"
\end_inset
) and use it.
\end_layout
\begin_layout Enumerate
If you want to restore redundancy, you can follow-up a
\family typewriter
join-resource
\family default
phase to the old resource name (using the correct device name, double-check
it!) This will restore your redundancy by overwriting your bad split brain
version with the correct one.
\end_layout
\begin_layout Standard
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
It is important to resolve the split brain
\emph on
before
\emph default
you can start the
\family typewriter
join-resource
\family default
reconstruction phase! In order to keep as many
\begin_inset Quotes eld
\end_inset
good
\begin_inset Quotes erd
\end_inset
versions as possible (e.g.
for emergency cases), don't re-join them all in parallel, but rather start
with the oldest / most outdated / worst / inconsistent version first.
It is recommended to start the next one only when the previous one has
 successfully finished.
\end_layout
\begin_layout Chapter
Alternative De- and Reconstruction of a Damaged Resource
\begin_inset CommandInset label
LatexCommand label
name "chap:Alternative-De--and"
\end_inset
\end_layout
\begin_layout Standard
In case
\family typewriter
leave-resource --host=
\family default
does not work, you may use the following fallback.
On the surviving new designated primary, give the following commands:
\end_layout
\begin_layout Enumerate
\family typewriter
marsadm disconnect-all mydata
\end_layout
\begin_layout Enumerate
\family typewriter
marsadm down mydata
\end_layout
\begin_layout Enumerate
Check by hand whether your local disk is consistent, e.g.
by test-mounting it readonly,
\family typewriter
fsck
\family default
, etc.
\end_layout
\begin_layout Enumerate
\family typewriter
marsadm delete-resource mydata
\end_layout
\begin_layout Enumerate
Check whether the other vital cluster nodes don't report the dead resource
any more, e.g.
\family typewriter
marsadm view all
\family default
at
\emph on
each
\emph default
of them.
 In case the resource has not disappeared everywhere (which may happen during
network problems), do the
\family typewriter
down ; delete-resource
\family default
steps also there (optionally again with
\family typewriter
--force
\family default
).
\end_layout
\begin_layout Enumerate
Be sure that the resource has disappeared
\emph on
everywhere
\emph default
.
When necessary, repeat the
\family typewriter
delete-resource
\family default
with
\family typewriter
--force
\family default
.
\end_layout
\begin_layout Enumerate
\family typewriter
marsadm create-resource newmydata ...
\family default
at the
\emph on
correct
\emph default
node using the
\emph on
correct
\emph default
disk device containing the
\emph on
correct
\emph default
version, and further steps to setup your resource from scratch, preferably
under a different name to minimize any risk.
\end_layout
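\begin_layout Standard
The first steps of the list above, as a sketch on the surviving designated
 primary.
 Here 
\family typewriter
<disk-device>
\family default
 stands for your underlying disk, and 
\family typewriter
/mnt
\family default
 is assumed to be an unused mountpoint; the readonly test-mount is only
 one possible way of checking consistency:
\begin_inset listings
inline false
status open
\begin_layout Plain Layout
primary:~# marsadm disconnect-all mydata
\end_layout
\begin_layout Plain Layout
primary:~# marsadm down mydata
\end_layout
\begin_layout Plain Layout
primary:~# mount -o ro <disk-device> /mnt
\end_layout
\begin_layout Plain Layout
primary:~# umount /mnt
\end_layout
\begin_layout Plain Layout
primary:~# marsadm delete-resource mydata
\end_layout
\begin_layout Plain Layout
primary:~# marsadm view all
\end_layout
\end_inset
 The final 
\family typewriter
marsadm view all
\family default
 should then be repeated on each of the other surviving nodes.
\end_layout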
\begin_layout Standard
\noindent
In any case,
\series bold
manually check
\series default
whether a split brain is reported for any resource on any of your
\emph on
surviving
\emph default
cluster nodes.
If you find one there (and only then), please (re-)execute the split brain
resolution steps on the affected node(s).
\end_layout
\begin_layout Chapter
Cleanup in case of Complicated Cascading Failures
\begin_inset CommandInset label
LatexCommand label
name "sub:Cleanup-in-case"
\end_inset
\end_layout
\begin_layout Standard
MARS does its best to recover even from multiple failures (e.g.
\series bold
rolling disasters
\series default
).
Chances are high that the instructions from sections
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Split-Brain-Resolution"
\end_inset
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Final-Destroy-of"
\end_inset
or appendix
\begin_inset CommandInset ref
LatexCommand ref
reference "chap:Alternative-Methods-for"
\end_inset
\begin_inset CommandInset ref
LatexCommand ref
reference "chap:Alternative-De--and"
\end_inset
will work even in case of multiple failures, such as a network failure
plus local node failure at only 1 node (even if that node is the former
primary node).
\end_layout
\begin_layout Standard
However, in general (e.g.
when more than 1 node is damaged and/or when the filesystem
\family typewriter
/mars/
\family default
is badly damaged) there is no general guarantee that recovery will
\emph on
always
\emph default
succeed under
\emph on
any
\emph default
(weird) circumstances.
That said, your chances for recovery are
\emph on
very
\emph default
high when some disk remains usable at least at one of your surviving secondarie
s.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
It should be very hard to finally trash a secondary, because the transaction
 logfiles contain 
\family typewriter
md5
\family default
checksums for all data records.
 Any attempt to replay corrupted logfiles is refused by MARS.
In addition, the sequence numbers of
\family typewriter
log-rotate
\family default
d logfiles are checked for contiguity.
Finally, the
\emph on
sequence path
\emph default
of logfile applications (consisting of logfile names plus their respective
length) is additionally secured by a
\family typewriter
git
\family default
-like incremental checksum over the whole path history (so-called
\begin_inset Quotes eld
\end_inset
version links
\begin_inset Quotes erd
\end_inset
).
This should detect split brains even if logfiles are appended / modified
\emph on
after
\emph default
a (forceful) switchover has already taken place.
\end_layout
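\begin_layout Standard
The contiguity check and the incremental path checksum can be illustrated
with a minimal shell sketch.
It is not the MARS implementation: the function names are hypothetical,
the inputs are passed as plain arguments instead of being read from
\family typewriter
/mars/
\family default
, and
\family typewriter
md5sum
\family default
merely stands in for whatever incremental checksum MARS computes internally.
\end_layout

```shell
#!/bin/sh
# Illustrative only: this is NOT how MARS implements its checks.

# seq_contiguous N1 N2 ...
# Succeeds iff every sequence number is its predecessor plus one.
seq_contiguous() {
    prev=""
    for raw in "$@"; do
        # Strip leading zeros so that e.g. 000000010 is not parsed as octal.
        n=${raw#"${raw%%[!0]*}"}
        n=${n:-0}
        if [ -n "$prev" ] && [ "$n" -ne $((prev + 1)) ]; then
            echo "gap between $prev and $n" >&2
            return 1
        fi
        prev=$n
    done
    return 0
}

# path_checksum "name:length" ...
# Chains a checksum over the whole history, so that changing any earlier
# element (name or length) changes the final value.
path_checksum() {
    sum=""
    for entry in "$@"; do
        sum=$(printf '%s %s' "$sum" "$entry" | md5sum | cut -d' ' -f1)
    done
    echo "$sum"
}
```

\begin_layout Standard
Two nodes whose histories diverge at any point end up with different final
values from
\family typewriter
path_checksum
\family default
, which is the property that makes a split brain detectable after the fact.
\end_layout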
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresToxiques.png
lyxscale 50
scale 17
\end_inset
That said, your risk of permanent data loss is very high if you remove the
\series bold
BBU
\series default
from your hardware RAID controller before all hot data has been flushed
to the physical disks.
Therefore, never try to
\begin_inset Quotes eld
\end_inset
repair
\begin_inset Quotes erd
\end_inset
a seemingly dead node before your replication is up again somewhere else!
Only unplug the network cables when advised, but never try to repair the
hardware instantly!
\end_layout
\begin_layout Standard
In desperate situations where none of the previous instructions
have succeeded, your last resort is to rebuild all your resources from
intact disks as follows:
\end_layout
\begin_layout Enumerate
Do
\family typewriter
rmmod mars
\family default
on all your cluster nodes and/or reboot them.
Note: if you are less desperate, chances are high that the following will
also work when the kernel module remains active and
\family typewriter
marsadm down
\family default
is given everywhere instead, but for an
\emph on
ultimate
\emph default
instruction you should eliminate
\emph on
potential
\emph default
kernel problems by
\family typewriter
rmmod
\family default
/
\family typewriter
reboot
\family default
, at least if you can afford the downtime on concurrently operating resources.
\end_layout
\begin_layout Enumerate
For safety, physically remove the storage network cables on
\emph on
all
\emph default
your cluster nodes.
Note: the same disclaimer holds.
MARS really does its best, even when
\family typewriter
delete-resource
\family default
is given while the network is fully active and multiple split-brain primaries
are actively using their local device in parallel (verified by some test cases
from the automatic test suite, but note that it is impossible to catch
all possible failure scenarios).
Don't challenge your fate if you are desperate! Don't
\emph on
rely
\emph default
on this! Nothing is absolutely fail-safe!
\end_layout
\begin_layout Enumerate
\series bold
Manually
\series default
check which surviving disk is usable, and which is the
\begin_inset Quotes eld
\end_inset
best
\begin_inset Quotes erd
\end_inset
one for your purpose.
\end_layout
\begin_layout Enumerate
Do
\family typewriter
modprobe mars
\family default
\emph on
only
\emph default
on that node.
If that fails,
\family typewriter
rmmod
\family default
and/or reboot again, and start over with a completely fresh
\family typewriter
/mars/
\family default
partition (
\family typewriter
mkfs.ext4 /mars/
\family default
or similar)
\emph on
everywhere
\emph default
on
\emph on
all
\emph default
cluster nodes, and continue with step 7.
\end_layout
\begin_layout Enumerate
If your old
\family typewriter
/mars/
\family default
works, and you did not already (forcefully) switch your designated primary
to the final destination, do it now (see description in section
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Forced-Switching"
\end_inset
).
Wait until any old logfile data has been replayed.
\end_layout
\begin_layout Enumerate
Say
\family typewriter
marsadm delete-resource mydata --force
\family default
.
This will clean up all internal symlink tree information for the resource,
but will leave your disk data intact.
\end_layout
\begin_layout Enumerate
Locally build up the new resource(s) as usual, out of the underlying disks.
\end_layout
\begin_layout Enumerate
Check whether the new resource(s) work in standalone mode.
\end_layout
\begin_layout Enumerate
When necessary, repeat these steps with other resources.
\end_layout
\begin_layout Standard
Now you can choose how to rebuild your cluster.
If you rebuilt
\family typewriter
/mars/
\family default
anywhere, you
\emph on
must
\emph default
rebuild it on
\emph on
all
\emph default
new cluster nodes and start over with a fresh
\family typewriter
join-cluster
\family default
on each of them, from scratch.
It is not possible to mix the old cluster with the new one.
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{enumerate}
\backslash
setcounter{enumi}{9}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
item
\end_layout
\end_inset
Finally, do all the necessary
\family typewriter
join-resource
\family default
s on the respective cluster nodes, according to your new redundancy scenario
after the failures (e.g.
after activating spare nodes, etc.).
If you have
\begin_inset Formula $k>2$
\end_inset
replicas, start
\family typewriter
join-resource
\family default
on the worst / most damaged version first, and start the next preferably
only after the previous sync has completed successfully.
This way, you will always retain some (old and outdated, but
hopefully still usable) replicas while a sync is running.
Don't start too many syncs in parallel.
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
end{enumerate}
\end_layout
\end_inset
\end_layout
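\begin_layout Standard
The command-line parts of the steps above can be sketched as a dry-run shell
script.
This is a sketch under assumptions, not a tested procedure: the helper
\family typewriter
run
\family default
, the function names and the device path
\family typewriter
/dev/vg/mydata
\family default
are hypothetical,
\family typewriter
create-resource
\family default
is assumed to be the usual way of building the new resource in step 7, and
step 5 as well as all manual checks are deliberately left out.
By default the script only prints the commands.
\end_layout

```shell
#!/bin/sh
# Dry-run helper (hypothetical): print commands unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 4, 6 and 7 on the node holding the best surviving disk.
# Arguments: resource name, underlying disk device (both placeholders).
rebuild_primary() {
    res=$1
    dev=$2
    run modprobe mars &&
    run marsadm delete-resource "$res" --force &&
    run marsadm create-resource "$res" "$dev"
}

# Step 10 on one new secondary; call it for one replica after the other.
rejoin_secondary() {
    res=$1
    dev=$2
    run marsadm join-resource "$res" "$dev"
}

# Example invocation with placeholder names.
rebuild_primary mydata /dev/vg/mydata
```

\begin_layout Standard
Keeping the dry-run mode as the default makes it possible to review the
exact command sequence before anything is executed on a damaged cluster.
\end_layout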
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Never use
\family typewriter
delete-resource
\family default
twice on the same resource name once you already have a working standalone
primary
\begin_inset Foot
status open
\begin_layout Plain Layout
Of course, as long as you have not created the
\emph on
same
\emph default
resource anew, you may repeat
\family typewriter
delete-resource
\family default
on other cluster nodes in order to get rid of local files / symlinks which
had not been propagated to other nodes before.
\end_layout
\end_inset
.
You might accidentally destroy your again-working copy! You
\emph on
can
\emph default
issue
\family typewriter
delete-resource
\family default
multiple times on different nodes, e.g.
when the network has problems, but doing so
\emph on
after
\emph default
re-establishment of the initial primary bears some risk.
Therefore, the safest way is first deleting the resources everywhere, and
then starting over afresh.
\end_layout
\begin_layout Standard
Before re-connecting any network cable on any non-primary (new secondaries),
ensure that all
\family typewriter
/dev/mars/mydata
\family default
devices are no longer in use (e.g.
from an old primary role before the incident happened), and that each local
disk is detached.
Only after that, you should be able to safely re-connect the network.
The
\family typewriter
delete-resource
\family default
given at the new primary should propagate now to each of your secondaries,
and your local disk should be usable for a re-
\family typewriter
join-resource
\family default
.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
When you did not rebuild your cluster from scratch with fresh
\family typewriter
/mars/
\family default
filesystems, and one of the old cluster nodes is supposed to be removed
permanently, use
\family typewriter
leave-resource
\family default
(optionally with
\family typewriter
--host=
\family default
and/or
\family typewriter
--force
\family default
) and finally
\family typewriter
leave-cluster
\family default
.
\end_layout
\begin_layout Chapter
Experts only: Special Trick Switching and Rebuild
\begin_inset CommandInset label
LatexCommand label
name "chap:Experts-only:-Special"
\end_inset
\end_layout
\begin_layout Standard
The following is a further alternative for
\series bold
experts
\series default
who really know what they are doing.
The method is very simple and therefore well-suited for coping with mass
failures, e.g.
\series bold
power blackout of whole datacenters
\series default
.
\end_layout
\begin_layout Standard
In case a primary datacenter fails as a whole for whatever reason and you
have a backup datacenter, do the following steps in the backup datacenter:
\end_layout
\begin_layout Enumerate
Fencing step: by means of firewalling,
\series bold
ensure
\series default
that the (virtually) damaged datacenter nodes
\series bold
cannot
\series default
be reached over the network.
For example, you may place REJECT rules into all of your local iptables
firewalls at the backup datacenter.
Alternatively / additionally, you may block the routes at the appropriate
central router(s) in your network.
\end_layout
\begin_layout Enumerate
Run the sequence
\family typewriter
marsadm disconnect all; marsadm primary --force all
\family default
on all nodes in the backup datacenter.
\end_layout
\begin_layout Enumerate
Restart your services in the backup datacenter (as far as necessary).
Depending on your network setup, further steps like switching BGP routes
etc. may be necessary.
\end_layout
\begin_layout Enumerate
Check that
\emph on
all
\emph default
your services are
\emph on
really
\emph default
up and running, before you try to repair anything! Failing to do so may
result in data loss when you execute the following restore method for
\emph on
experts
\emph default
.
\end_layout
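\begin_layout Standard
The fencing and takeover steps above can be sketched as follows.
This is only a sketch under assumptions: the helper
\family typewriter
run
\family default
and the function name are hypothetical, the addresses are placeholders
from the documentation range, and your firewall setup may require different
chains or additional rules.
By default the script only prints the commands.
\end_layout

```shell
#!/bin/sh
# Dry-run helper (hypothetical): print commands unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# fence_and_takeover IP...
# First blocks all traffic from and to the given addresses of the failed
# datacenter, and only then forces the primary role locally.
fence_and_takeover() {
    for ip in "$@"; do
        run iptables -I INPUT -s "$ip" -j REJECT || return 1
        run iptables -I OUTPUT -d "$ip" -j REJECT || return 1
    done
    run marsadm disconnect all &&
    run marsadm primary --force all
}

# Example invocation with placeholder addresses.
fence_and_takeover 192.0.2.11 192.0.2.12
```

\begin_layout Standard
The function aborts as soon as a firewall rule cannot be installed, so that
the forced switchover is never attempted without fencing.
\end_layout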
\begin_layout Standard
Now your backup datacenter should continue servicing your clients.
The final reconstruction of the originally primary datacenter works as
follows:
\end_layout
\begin_layout Enumerate
At the damaged primary datacenter, ensure that the MARS kernel module
is not running anywhere.
In case of a power blackout, you shouldn't have executed an automatic
\family typewriter
modprobe mars
\family default
anywhere during reboot, so you should be already done when all your nodes
are up again.
In case some nodes had no reboot, execute
\family typewriter
rmmod mars
\family default
everywhere.
If
\family typewriter
rmmod
\family default
refuses to run, you may need to umount the
\family typewriter
/dev/mars/mydata
\family default
device first.
When nothing else helps, you may just mass reboot your hanging nodes.
\end_layout
\begin_layout Enumerate
At the failed side, do
\family typewriter
rm -rf /mars/resource-$mydata/
\family default
for all those resources which had been primary before the blackout.
Do this
\emph on
only
\emph default
for those cases, otherwise you will need unnecessary
\family typewriter
leave-resource
\family default
s or
\family typewriter
invalidate
\family default
s later (e.g.
when half of your nodes were already running at the surviving side).
In order to avoid unnecessary traffic, please do this only as far as really
necessary.
Don't remove any other directories.
In particular,
\family typewriter
/mars/ips/
\family default
\emph on
must
\emph default
remain intact.
In case you accidentally deleted them, or you had to re-create
\family typewriter
/mars/
\family default
from scratch, try
\family typewriter
rsync
\family default
with the correct options.
\begin_inset Newline newline
\end_inset
\begin_inset Graphics
filename images/MatieresCorrosives.png
lyxscale 50
scale 17
\end_inset
Caution! Before doing this, check that the corresponding directory exists
at the backup datacenter, and that it is
\emph on
really
\emph default
healthy!
\end_layout
\begin_layout Enumerate
Un-Fencing: restore your network firewall / routes and check that they work
(
\family typewriter
ping
\family default
etc).
\end_layout
\begin_layout Enumerate
Do
\family typewriter
modprobe mars
\family default
everywhere.
All missing directories and their missing symlinks should be automatically
fetched from the backup datacenter.
\end_layout
\begin_layout Enumerate
Run
\family typewriter
marsadm join-resource $res
\family default
, but only at those places where the directory was removed previously, while
using the same disk devices as before.
This will minimize actual traffic thanks to the fast full sync algorithm.
\end_layout
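\begin_layout Standard
The reconstruction steps above can be sketched for a single node of the
failed datacenter as follows.
This is a sketch under assumptions: the helper
\family typewriter
run
\family default
, the function names, the resource name and the device path are hypothetical,
and the sketch neither performs the un-fencing nor waits until the symlinks
have been fetched from the backup datacenter.
By default the script only prints the commands.
\end_layout

```shell
#!/bin/sh
# Dry-run helper (hypothetical): print commands unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 1 and 2, to be run while the failed side is still fenced.
# Arguments: only those resources which had been primary on this node.
prepare_failed_node() {
    run rmmod mars || return 1
    for res in "$@"; do
        # Refuse empty names so that /mars/resource-/ is never targeted.
        [ -n "$res" ] || return 1
        run rm -rf "/mars/resource-$res/" || return 1
    done
}

# Steps 4 and 5, to be run after un-fencing.
# Arguments: pairs of resource name and the same disk device as before.
rejoin_failed_node() {
    run modprobe mars || return 1
    while [ $# -ge 2 ]; do
        run marsadm join-resource "$1" "$2" || return 1
        shift 2
    done
}

# Example invocation with placeholder names.
prepare_failed_node mydata
rejoin_failed_node mydata /dev/vg/mydata
```

\begin_layout Standard
Splitting the procedure into two functions mirrors the ordering requirement:
the resource directories are removed before the network is restored, and
the module is loaded only afterwards.
\end_layout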
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
It is
\series bold
crucial
\series default
that the fencing step is executed
\emph on
before
\emph default
any
\family typewriter
primary --force
\family default
! This way, no split brain will be
\emph on
visible
\emph default
at the backup datacenter side, because there is simply no chance for transferri
ng different versions over the network.
It is also crucial to remove any (potentially diverging) resource directories
\emph on
before
\emph default
the
\family typewriter
modprobe
\family default
! This way, the backup datacenter never runs into split brain.
This saves you a lot of detail work for split brain resolution when you
have to restore bulks of nodes in a short time.
\end_layout
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
In case the repair of a full datacenter takes so long that
some
\family typewriter
/mars/
\family default
partitions are about to run out of space at the surviving side, you may
use the
\family typewriter
leave-resource --host=failed-node
\family default
trick described earlier, followed by
\family typewriter
log-delete-all
\family default
.
Ideally, you have prepared a fully automatic script long before the incident,
which executes such steps only as far as necessary in each individual case.
\end_layout
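\begin_layout Standard
Such a script could look like the following sketch, which acts only when
the fill level of
\family typewriter
/mars/
\family default
exceeds a threshold.
All names are assumptions made for illustration: the helpers
\family typewriter
run
\family default
,
\family typewriter
used_percent
\family default
and
\family typewriter
free_logfiles
\family default
, the threshold of 90 percent, and the host and resource names are hypothetical.
By default the script only prints the commands.
\end_layout

```shell
#!/bin/sh
# Dry-run helper (hypothetical): print commands unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Print the used percentage of the filesystem holding the given path.
used_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# free_logfiles USED LIMIT HOST RES...
# Does nothing while USED is below LIMIT. Otherwise removes the failed
# host from each resource and then deletes the logfiles no longer needed.
free_logfiles() {
    used=$1
    limit=$2
    host=$3
    shift 3
    [ "$used" -ge "$limit" ] || return 0
    for res in "$@"; do
        run marsadm leave-resource "$res" --host="$host" || return 1
        run marsadm log-delete-all "$res" || return 1
    done
}

# Example invocation with a fixed fill level of 95 percent.
free_logfiles 95 90 failed-node mydata
```

\begin_layout Standard
In real use, the first argument would be taken from the live system, for
example
\family typewriter
free_logfiles "$(used_percent /mars)" 90 failed-node mydata
\family default
.
\end_layout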
\begin_layout Standard
\noindent
\begin_inset Graphics
filename images/lightbulb_brightlit_benj_.png
lyxscale 12
scale 7
\end_inset
Even better: train such scenarios in advance, and prepare scripts for mass
automation.
Look into section
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Scripting-HOWTO"
\end_inset
.
\end_layout
\begin_layout Chapter
GNU Free Documentation License
\begin_inset CommandInset label
LatexCommand label
name "chap:GNU-FDL"
\end_inset
\end_layout
\begin_layout Standard
\noindent
\family typewriter
\size footnotesize
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
lstinputlisting{fdl.txt}
\end_layout
\end_inset
\end_layout
\end_body
\end_document