mirror of
https://github.com/schoebel/mars
synced 2025-03-06 21:37:38 +00:00
Fix typos
[small adaptations by Thomas Schoebel-Theuer, and some problems with LyX-specific file format fixed]
parent b5792db970
commit dd1e4e1323
@@ -18,7 +18,7 @@ Finally we have [4/4] replicas.
 3) handover the primary to one node of the _new_ pair.

 4) use "marsadm leave-resource" to get rid of the _old_ two replicas.
-New replica status is again [2/4], but the the replicas have been
+New replica status is again [2/4], but the replicas have been
 migrated to the new pair in the meantime.

 5) Finally, use "marsadm split-cluster" to go back to [2/2].
@@ -31,7 +31,7 @@ This document including all attachments is under GNU FDL.
 ---++ 3. Processes
 %IMAGE{"processes.png" size="1000"}%

-After the initial synchronzation of the secondary with the primary the data written on the mars device /dev/mars/<resource-name> of a primary is at first written to sequential logfiles (residing in primary's directory /mars/resource-<resource-name>/) and then copied to the primary's underlying device. The secondary fetches the primary's logfiles in it's /mars/resource-<resource-name>/ directory and then copies the data to it's underlying device. %BR%
+After the initial synchronzation of the secondary with the primary the data written on the mars device /dev/mars/<resource-name> of a primary is at first written to sequential logfiles (residing in primary's directory /mars/resource-<resource-name>/) and then copied to the primary's underlying device. The secondary fetches the primary's logfiles in its /mars/resource-<resource-name>/ directory and then copies the data to its underlying device. %BR%
 We distinguish the following subprocesses of a replication:

 * Sync: Synchronizes the underlying device of the secondary with the data of the underlying device of the primary. Triggered by: <verbatim>marsadm invalidate <resource-name></verbatim> <verbatim>marsadm join-resource <resource-name> /dev/...</verbatim> During sync the data on the secondary's underlying device is inconsistent, i.e. unusable.
@@ -40,7 +40,7 @@ We distinguish the following subprocesses of a replication:

 #ChapterFour
 ---++ 4. Process state
-Mars works asynchronously. This means that each of the processes mentioned above does it's job without waiting for the others.%BR% Examples:
+Mars works asynchronously. This means that each of the processes mentioned above does its job without waiting for the others.%BR% Examples:
 * An application may write on the mars device on the primary (process 1) thereby filling the logfiles (process 2) independent from the process (the replay (process 3)) which writes the logfile data to the underlying device. If there is a gap between the writer and the replay (which occurs very rarely) this can be shown with <verbatim>marsadm view-1and1 <resource-name>|all</verbatim> or more specific <verbatim>marsadm view-replay-line-1and1 <resource-name>|all</verbatim> or if you are interested in numbers given in bytes <verbatim>marsadm view-replay-rest <resource-name>|all</verbatim>
 * If the network connection between primary and secondary is slow the fetch process (process 4) lags behind, i.e. there are more logfile data on the primary than on the secondary. Replacing "replay" with "fetch" in the commands given above shows the relevant information for the fetch process.
 * As you already guessed probably: The sync process can be watched by replacing "replay" with "sync" in the mentioned commands.
@@ -257,7 +257,7 @@ sub display_partner {
 print_screen "$SStatus$SSpeed\n";
 print_screen "\t\t---> WORK: Sync in progress = ($SStatus% < 100.00%)", "$Color_blue";
 if ( "$SConnect" ne "OK" ){
-print_screen ", transfered from $SConnect\n", "$Color_blue";
+print_screen ", transferred from $SConnect\n", "$Color_blue";
 } else {
 print_screen "\n";
 }
@@ -333,7 +333,7 @@ sub display_partner {
 print_screen "$RStatus$RSpeed\n";
 print_screen "\t\t---> WORK: Replay in progress = ($RStatus% < 100.00%)", "$Color_blue";
 if ( "$RConnect" ne "OK" ){
-print_screen ", transfered from $RConnect\n", "$Color_blue";
+print_screen ", transferred from $RConnect\n", "$Color_blue";
 } else {
 print_screen "\n";
 }
@@ -353,7 +353,7 @@ sub display_partner {

 ### replay - hints
 if ($PLogFile[2] != 0) {
-print_screen "\t\t---> WORK: Replay-Todo is actualy $PLogFile[2], ", "$Color_blue";
+print_screen "\t\t---> WORK: Replay-Todo is actually $PLogFile[2], ", "$Color_blue";
 if ( $PLogFile[2] < 0 ) {
 print_screen "replaying backwards ??? Check this !!!\n", "$Color_red";
 } elsif ( $PLogFile[2] > 0 ) {
@@ -930,7 +930,7 @@ sub info_version {

 ### status
 print_screen "MARS Status - $himself, $version", "$Color_blue";
-if ( $params->{'resource'} ) { print_screen ", Ressource: $params->{'resource'}", "$Color_blue"; }
+if ( $params->{'resource'} ) { print_screen ", Resource: $params->{'resource'}", "$Color_blue"; }
 print_screen "\n";

 ### marsadm
@@ -1061,14 +1061,14 @@ sub check_limit {
 print_screen "$mars_limit_sol $LimitSolEin,";
 ### restliches
 } elsif ( $mars_limit_sol < 1 ) {
-print_screen "is now unsed,", "$Color_green";
+print_screen "is now unused,", "$Color_green";
 } else {
 print_screen "is set to ";
 print_screen "$mars_limit_sol $LimitSolEin,", "$Color_red";
 }
 } elsif ( !($LimitSolVar) && ($LimitIstVar) ) {
 ### only ist
-print_screen "is actualy ";
+print_screen "is actually ";

 if ( $mars_limit_ist < 1 ) {
 if ( $LimitIstEin eq "on/off" ) {
@@ -1086,12 +1086,12 @@ sub check_limit {
 # TODO fixen !
 # } elsif ( ($LimitSolVar) && ($LimitIstVar) && ($mars_limit_sol < 1) ) {
 # ### sol & ist = 0
-# print_screen "is actualy unused(X),";
+# print_screen "is actually unused(X),";
 } else {
 ### sol & ist / rest ...
 print_screen "is set to ";
 print_screen "$mars_limit_sol $LimitSolEin", "$Color_red";
-print_screen ", actualy used ";
+print_screen ", actually used ";
 print_screen "$mars_limit_ist $LimitIstEin,", "$Color_red";
 }

@@ -1147,7 +1147,7 @@ sub check_systemstatus {

 my $mars_disk_space = `df '$mars_dir' | grep '$mars_dir'| awk '{print \$2}'`;
 $mars_disk_space = sprintf("%01.2f", $mars_disk_space / 1024);
-check_limit "-> Free-Space-Limit on /mars", "required_free_space_1_gb", "mb (actualy $mars_disk_space mb used)";
+check_limit "-> Free-Space-Limit on /mars", "required_free_space_1_gb", "mb (actually $mars_disk_space mb used)";
 print "\n";
 }

@@ -172,7 +172,7 @@ Scope
 \end_layout

 \begin_layout Standard
-The following topics are covered withing this document:
+The following topics are covered within this document:
 \end_layout

 \begin_layout Itemize
@@ -568,7 +568,7 @@ transaction logfile
 \emph on
 Any
 \emph default
-write reqeuest is treated like a transaction which changes the contents
+write request is treated like a transaction which changes the contents
 of your LV.
 \end_layout

@@ -717,7 +717,7 @@ fsync()
 \family typewriter
 /dev/mars/mydata
 \family default
-is signalled that the write was successful
+is signaled that the write was successful
 \begin_inset Foot
 status open

@@ -1309,7 +1309,7 @@ https://github.com/schoebel/blkreplay/raw/master/doc/blkreplay.pdf
 \end_layout

 \begin_layout Standard
-For many application workloads, RAID-6 provides a good compromize between
+For many application workloads, RAID-6 provides a good compromise between
 cost and performance.
 Reads are very fast due to RAID-6 striping, while the slow RAID-6 writes
 are partially compensated by the MARS kernel memory buffer (see section
@@ -1530,7 +1530,7 @@ dedicated HDD
 with a capacity of 4 TB or more.
 Typically, this will provide you with plenty of headroom even for bigger
 networking incidents.
-Performace of a single HDD over a BBU is typically good enough for
+Performance of a single HDD over a BBU is typically good enough for
 \family typewriter
 /mars
 \family default
@@ -1565,7 +1565,7 @@ mars
 \end_layout

 \begin_layout Enumerate
-For extemely high performance, separate SSD sets for the user data VG and
+For extremely high performance, separate SSD sets for the user data VG and
 for
 \family typewriter
 /mars
@@ -1578,7 +1578,7 @@ For extemely high performance, separate SSD sets for the user data VG and
 exist
 \emph default
 some workloads where sequntial IO to HDDs is faster than to SSDs.
-Sometimes, there are hidden performance bottlenecks, such as SAS busses,
+Sometimes, there are hidden performance bottlenecks, such as SAS buses,
 or some old-generation RAID controllers.
 \end_layout

@@ -4188,7 +4188,7 @@ netstat --tcp | grep 777


 \family default
-Both variants should show up some healty connections.
+Both variants should show up some healthy connections.
 If not, fix your network configuration and/or firewalling etc.
 Details are outside of the scope of this manual.
 \end_layout
@@ -4277,7 +4277,7 @@ noprefix "false"
 \emph on
 some dynamic behaviour
 \emph default
-like growth and hardware lifecyle.
+like growth and hardware lifecycle.
 Thus they need
 \emph on
 updates
@@ -4666,7 +4666,7 @@ huge
 \family typewriter
 Football
 \family default
-for hardware lifecyle or for long-term load balancing over a very long
+for hardware lifecycle or for long-term load balancing over a very long
 time): newer versions of
 \family typewriter
 marsadm
@@ -4839,7 +4839,7 @@ standalone mode
 \end_inset

 ).
-But you can also do so later after setup of (one ore many) secondaries.
+But you can also do so later after setup of (one or many) secondaries.
 \end_layout

 \begin_layout Enumerate
@@ -5720,7 +5720,7 @@ description-text
 \family typewriter
 %replay-code{}
 \family default
-) Typicially this indicates a checksum error in a transaction logfile, or
+) Typically this indicates a checksum error in a transaction logfile, or
 another (hardware / filesystem) defect.
 This occurs extremely rarely in practice, but has been observed more frequently
 during a massive failure of air conditioning in a datacenter, when disk
@@ -5805,7 +5805,7 @@ When the damage is only at one of your secondaries, and the primary continues
 \family typewriter
 marsadm cron
 \family default
-, wait for the secondary to get this knowlege over the network, and try
+, wait for the secondary to get this knowledge over the network, and try

 \family typewriter
 marsadm invalidate
@@ -5998,7 +5998,7 @@ marsadm invalidate
 \begin_inset Newline newline
 \end_inset

-There is an execption: shortly after
+There is an exception: shortly after
 \family typewriter
 join-resource
 \family default
@@ -6375,7 +6375,7 @@ UnResponsive
 \family typewriter
 mars_main
 \family default
-did not do any noticable work for more than
+did not do any noticeable work for more than
 \family typewriter
 %{window}
 \family default
@@ -6776,7 +6776,7 @@ fetch:
 \family typewriter
 F
 \family default
-= according to knowlege, fetched logfiles are up-to-date,
+= according to knowledge, fetched logfiles are up-to-date,
 \family typewriter
 f
 \family default
@@ -7639,7 +7639,7 @@ right
 \emph on
 actual
 \emph default
-primary mode during that time, and the secondaries will sync therefrom.
+primary mode during that time, and the secondaries will sync there from.
 As soon as the local
 \family typewriter
 /dev/mars/mydata
@@ -8097,7 +8097,7 @@ status open
 \begin_layout Plain Layout
 Notice: in certain network outage scenarios, you may not be able to remotely
 login to the console and to check whether a server is running.
-Therefore it may happen that you erronously think hostA is dead, while
+Therefore it may happen that you erroneously think hostA is dead, while
 in reality it continues running.
 Even if you would know it, you might not be able to remotely kill it in
 a STONITH-like manner.
@@ -8292,7 +8292,7 @@ connection loss
 \emph default
 (e.g.
 networking problems / network partitions), you may not be able to reliably
-detect whether a split brain has actually occured, or not.
+detect whether a split brain has actually occurred, or not.
 \end_layout

 \begin_layout Paragraph
@@ -8747,7 +8747,7 @@ In rare cases (when
 \family typewriter
 /mars
 \family default
-is almost full somewhere, or when emergency mode has occured somewhere),
+is almost full somewhere, or when emergency mode has occurred somewhere),
 you may need to run
 \family typewriter
 marsadm cron
@@ -8821,7 +8821,7 @@ On those cluster nodes where you want to retain some SPLIT BRAIN version
 \begin_inset Quotes eld
 \end_inset

-emergengy backup
+emergency backup
 \begin_inset Quotes erd
 \end_inset

@@ -9079,7 +9079,7 @@ first
 one
 \emph default
 of them.
-Leave the other one intact, by not umounting
+Leave the other one intact, by not unmounting
 \family typewriter
 /dev/mars/mydata
 \family default
@@ -10471,7 +10471,7 @@ write_throttle_start_percent
 slowly
 \emph default
 .
-Defaul value is 0, which means
+Default value is 0, which means
 \begin_inset Quotes eld
 \end_inset

@@ -10480,7 +10480,7 @@ off
 \end_inset

 .
-Practical values for this coule be around 80%.
+Practical values for this could be around 80%.
 \end_layout

 \begin_layout Description
@@ -10844,7 +10844,7 @@ marsadm cron
 \end_layout

 \begin_layout Enumerate
-As soon as emough space has been freed everywhere to leave the
+As soon as enough space has been freed everywhere to leave the
 \family typewriter
 EMEGENCY MODE HYSTERESIS
 \family default
@@ -10914,7 +10914,7 @@ Wait until sync has finished at hostB.
 \begin_layout Enumerate
 If you have more than 2 replicas in total: proceed with step 4 at hostC,
 and so on.
-This time, you could join multipe resources in parallel, because you already
+This time, you could join multiple resources in parallel, because you already
 have a life replica at hostB.
 \end_layout

@@ -11182,7 +11182,7 @@ Don't use in scripts! Only use by hand!
 \begin_layout Plain Layout

 \size scriptsize
-This option does not change the internal waiting logic for thois commands
+This option does not change the internal waiting logic for this commands
 which emulate synchronous behaviour on top of the asynchronous communication
 paradigm.
 Many commands are waiting until the desired effect has succeeded.
@@ -11679,7 +11679,7 @@ planned

 \end_inset

-Careful when using this on extremely huge LVs where the sync may take serveral
+Careful when using this on extremely huge LVs where the sync may take several
 days, or weeks.
 It is your sysadmin decision what you want to prefer: restarting the sync,
 or planned handover.
@@ -12819,7 +12819,7 @@ status open
 \size scriptsize
 Avoid any potential timeouts / hangs caused by networks or firewalls, by
 explicitly disabling the old ssh-based communication method, and relying
-on the new MARS communication protocol (by defaut on port 7777).
+on the new MARS communication protocol (by default on port 7777).
 \end_layout

 \end_inset
@@ -13008,7 +13008,7 @@ all local network interfaces are scanned by
 \family typewriter
 /sbin/ip
 \family default
-for IPv4 adresses, and the
+for IPv4 addresses, and the
 \emph on
 first
 \emph default
@@ -13327,7 +13327,7 @@ Postcondition: the initial symlink tree is created in
 /mars/uuid
 \family default
 symlink is created for later distribution in the cluster.
-It uniquely indentifies the cluster in the world.
+It uniquely identifies the cluster in the world.
 \end_layout

 \begin_layout Plain Layout
@@ -13631,7 +13631,7 @@ In ancient MARS versions before mars0.1astable101 the kernel module
 \emph on
 must not
 \emph default
-be loaded, and a working ssh connecttion to
+be loaded, and a working ssh connection to
 \family typewriter
 $host
 \family default
@@ -13823,7 +13823,7 @@ marsadm leave-resource
 \family default
 ).
 The kernel module should be loaded and the network should be operating
-in order to also propogate the effect to the other cluster nodes.
+in order to also propagate the effect to the other cluster nodes.
 \end_layout

 \begin_layout Plain Layout
@@ -13898,7 +13898,7 @@ rmmod

 \end_inset

-passivley fetching the symlink tree.
+passively fetching the symlink tree.
 In order to really stop all communication, the kernel module should be
 unloaded afterwards (rmmod mars).
 The local
@@ -13926,7 +13926,7 @@ zombies

 \size scriptsize
 In case of an unintended hardware destruction (e.g.
-fire, water, ...) this command should be used on another healty cluster node
+fire, water, ...) this command should be used on another healthy cluster node
 $helper in order to finally remove $damaged from the cluster via the command

 \family typewriter
@@ -14902,7 +14902,7 @@ Postcondition: the
 /mars/uuid
 \family default
 symlink is created for later distribution in the cluster.
-It uniquely indentifies the cluster in the world.
+It uniquely identifies the cluster in the world.
 \end_layout

 \begin_layout Plain Layout
@@ -14967,7 +14967,7 @@ Instead of executing
 \family typewriter
 marsadm
 \family default
-commands serveral times for each resource argument, you may give the special
+commands several times for each resource argument, you may give the special
 resource argument
 \family typewriter
 all
@@ -15328,8 +15328,8 @@ Postcondition: the resource
 \family typewriter
 $res
 \family default
-is created, the inital role of the current node is primary.
-The corresponding symlink tree information is asynchonously distributed
+is created, the initial role of the current node is primary.
+The corresponding symlink tree information is asynchronously distributed
 in the cluster (in the background).
 The device
 \family typewriter
@@ -15544,7 +15544,7 @@ Postcondition: the current node becomes a member of resource
 \family typewriter
 $res
 \family default
-, the inital role is secondary.
+, the initial role is secondary.
 The initial full sync should start after a while.
 \end_layout

@@ -15693,7 +15693,7 @@ marsadm down
 \family default
 ).
 The kernel module should be loaded and the network should be operating
-in order to also propogate the effect to the other cluster nodes.
+in order to also propagate the effect to the other cluster nodes.
 \end_layout

 \begin_layout Plain Layout
@@ -15931,7 +15931,7 @@ leave-resource
 --host=somebodyelse
 \family default
 argument in order to desperately try to destroy remains of incomplete or
-pysically damaged hardware.
+physically damaged hardware.
 \end_layout

 \begin_layout Plain Layout
@@ -16113,7 +16113,7 @@ half-dead
 \begin_inset Quotes erd
 \end_inset

-zombie nodes (beware of shapshot / restores on virtual machines!!).
+zombie nodes (beware of snapshot / restores on virtual machines!!).
 MARS does its best to avoid problems even in case the new resource name
 should equal the old one, but there can be
 \emph on
@@ -16422,7 +16422,7 @@ marsadm cron
 \emph on
 temporary
 \emph default
-purposes (in constrast to
+purposes (in contrast to
 \emph on
 full
 \emph default
@@ -17121,7 +17121,7 @@ pause-sync-global


 \size scriptsize
-WARNING! After this, and ather having paused any remote data access, you
+WARNING! After this, and other having paused any remote data access, you
 might use the underlying disk for your own purposes, such as test-mounting
 it in
 \emph on
@@ -17130,7 +17130,7 @@ readonly
 mode.

 \series bold
-Don't modifiy
+Don't modify
 \series default
 its contents in any way! Not even by an
 \family typewriter
@@ -17188,7 +17188,7 @@ primary
 \emph default
 side, you may choose to resolve the inconsistencies by
 \family typewriter
-marsadm invalide $res
+marsadm invalid $res
 \family default
 on
 \emph on
@@ -20738,7 +20738,7 @@ marsadm secondary
 not recommended
 \emph default
 ), at least the old primary must be reachable.
-The (old) primarie's virutal device
+The (old) primarie's virtual device
 \family typewriter
 /dev/mars/mydata
 \family default
@@ -21628,7 +21628,7 @@ reference "subsec:Forced-Switching"
 \emph on
 really
 \emph default
-need this command: before finally destroying a resouce via the
+need this command: before finally destroying a resource via the
 \emph on
 last
 \emph default
@@ -23399,7 +23399,7 @@ marked


 \size scriptsize
-THIS IS HIGLY DANGEROUS FOR DATA CONSISTENCY!
+THIS IS HIGHLY DANGEROUS FOR DATA CONSISTENCY!
 \end_layout

 \begin_layout Plain Layout
@@ -23954,7 +23954,7 @@ status open
 \begin_layout Plain Layout

 \size scriptsize
-Deprectated, will vanish.
+Deprecated, will vanish.
 Use
 \family typewriter
 view-role
@@ -24067,7 +24067,7 @@ status open
 \begin_layout Plain Layout

 \size scriptsize
-Deprectated, will vanish.
+Deprecated, will vanish.
 Use
 \family typewriter
 view-state
@@ -24180,7 +24180,7 @@ status open
 \begin_layout Plain Layout

 \size scriptsize
-Deprectated, will vanish.
+Deprecated, will vanish.
 Use
 \family typewriter
 view-cstate
@@ -24293,7 +24293,7 @@ status open
 \begin_layout Plain Layout

 \size scriptsize
-Deprectated, will vanish.
+Deprecated, will vanish.
 Use
 \family typewriter
 view-dstate
@@ -24406,7 +24406,7 @@ status open
 \begin_layout Plain Layout

 \size scriptsize
-Deprectated.
+Deprecated.
 Use
 \family typewriter
 view-status
@@ -24550,7 +24550,7 @@ status open
 \begin_layout Plain Layout

 \size scriptsize
-Deprectated, will vanish.
+Deprecated, will vanish.
 Don't use it.
 Use
 \family typewriter
@@ -24664,7 +24664,7 @@ status open
 \begin_layout Plain Layout

 \size scriptsize
-Deprectated, will vanish.
+Deprecated, will vanish.
 Don't use it.
 Use
 \family typewriter
@@ -24778,7 +24778,7 @@ status open
 \begin_layout Plain Layout

 \size scriptsize
-Deprectated, will vanish.
+Deprecated, will vanish.
 Don't use it.
 Implement your own macros instead.
 \end_layout
@@ -24888,7 +24888,7 @@ status open
 \begin_layout Plain Layout

 \size scriptsize
-Deprectated, will vanish.
+Deprecated, will vanish.
 Use
 \family typewriter
 view-the-err-msg
@@ -25005,7 +25005,7 @@ status open
 \begin_layout Plain Layout

 \size scriptsize
-Write the file content to stdout, but replace all occurences of numeric
+Write the file content to stdout, but replace all occurrences of numeric
 timestamps converted to a human-readable format.
 Thus is most useful for inspection of status and log files, e.g.

@@ -25740,7 +25740,7 @@ status open
 \begin_layout Plain Layout

 \size scriptsize
-Inquiry of the maxium sync concurrency.
+Inquiry of the maximum sync concurrency.
 See also the primitive macro
 \family typewriter
 %global-sync-limit-value{}
@@ -26787,7 +26787,7 @@ marsadm
 \emph on
 tries
 \emph default
-to achieves the intended result (typicially, you may use this after the
+to achieves the intended result (typically, you may use this after the

 \family typewriter
 is-
@@ -30319,7 +30319,7 @@ Now we come to benchmarking
 \end_layout

 \begin_layout Standard
-So you might expect that performace of
+So you might expect that performance of
 \family typewriter
 /dev/mars/lv-0
 \family default
@@ -31193,7 +31193,7 @@ socket bundling
 \end_layout

 \begin_layout Standard
-It is mostly intendend for lines showing high packet loss.
+It is mostly intended for lines showing high packet loss.
 By using multiple TCP sockets in parallel for emulating a single logical
 connection, throughput can be significantly increased.
 \end_layout
@@ -31273,7 +31273,7 @@ random
 \begin_layout Standard
 The next graphics shows the same, but over a medium distance of about 50km.
 This line is even more heavily loaded with respect to the number of TCP
-connections running in parallel (probly some 10,000 or even 100,000 if
+connections running in parallel (probably some 10,000 or even 100,000 if
 not more), and there is some kind of
 \begin_inset Quotes eld
 \end_inset
@@ -31517,7 +31517,7 @@ $class.*.status

 \end_inset

-Beware, any permamently present
+Beware, any permanently present
 \family typewriter
 *.log
 \family default
@@ -31566,7 +31566,7 @@ syslog_min_class
 \family default
 (rw) The
 \emph on
-mimimum
+minimum
 \emph default
 class number for
 \emph on
@@ -31574,7 +31574,7 @@ permanent
 \emph default
 syslogging.
 By default, this is set to -1 in order to switch off perment logging completely.
-Permament logging can easily flood your syslog with such huge amounts of
+Permanent logging can easily flood your syslog with such huge amounts of
 messages (in particular when class=0), that your system as a whole may
 become unusable (because vital kernel threads may be blocked too long or
 too often by the userspace syslog daemon).
@@ -31605,7 +31605,7 @@ permanent
 \family typewriter
 syslog_flood_class
 \family default
-(rw) The mimimum class of flood-protected syslogging.
+(rw) The minimum class of flood-protected syslogging.
 The maximum class is always 4.
 \end_layout

@@ -31615,9 +31615,9 @@ syslog_flood_class
 \family typewriter
 syslog_flood_limit
 \family default
-(rw) The maxmimum number of messages after which the flood protection will
+(rw) The maximum number of messages after which the flood protection will
 start.
-This is a hard limit for the the number of messages written to the syslog.
+This is a hard limit for the number of messages written to the syslog.
 \end_layout

 \begin_layout Labeling
@@ -32845,7 +32845,7 @@ $HOME/.marsadm/systemd-templates/
 /usr/local/lib/marsadm/systemd-templates/
 \family default
 .
-Futher places can be defined by overriding the $
+Further places can be defined by overriding the $
 \family typewriter
 MARS_PATH
 \family default
@@ -34203,7 +34203,7 @@ man systemd
 to fail seem to be different from general conflicts in unit dependencies.
 Although not precisely documented, the observed behaviour luckily appears
 to make HA more likely.
-There remains some uncertainity caused by the documented failure possibility.
+There remains some uncertainty caused by the documented failure possibility.
 A new option called
 \family typewriter
 --job-mode=append
@@ -34226,7 +34226,7 @@ unnecessary

 \end_inset

-, which resulted in some behavioural improvements) the
+, which resulted in some behavioral improvements) the
 \family typewriter
 --job-mode=fail
 \family default
@@ -34291,7 +34291,7 @@ is-failed
 \begin_layout Enumerate
 Systemd lacks an important property called Idempotence.
 Idempotence is a very common feature in big industry plants, where hundreds
-of human workers may act on controlling hundrets of facilities.
+of human workers may act on controlling hundreds of facilities.
 Each alarm call may cause a different person to try to
 \begin_inset Quotes eld
 \end_inset
@@ -34416,7 +34416,7 @@ controlled abortion
 at all.
 Care must be taken that failures caused by aborts will not occur too frequently
 for HA.
-When failures caused by aborts are occuring too frequently, the concept
+When failures caused by aborts are occurring too frequently, the concept
 of abort should be disabled.
 \end_layout

@@ -35268,7 +35268,7 @@ BindsTo=
 \family typewriter
 PartOf=
 \family default
-dependencies, peferably augmented with
+dependencies, preferably augmented with
 \family typewriter
 After=
 \family default
@@ -35897,7 +35897,7 @@ marsadm
 \family typewriter
 {all,the}-global-{inf,wrn,err}-msg
 \family default
-Dito, but more specific.
+Ditto, but more specific.
 \end_layout

 \begin_layout Labeling
@@ -35906,7 +35906,7 @@ marsadm
 \family typewriter
 {all,the}-pretty-{global-,}{inf-,wrn-,err-,}msg
 \family default
-Dito, but show numerical timestamps in a human readable form.
+Ditto, but show numerical timestamps in a human readable form.
 \end_layout

 \begin_layout Labeling
@ -35961,7 +35961,7 @@ todo-primary
|
||||
get-primary
|
||||
\family default
|
||||
is equal to the current host.
|
||||
Similary,
|
||||
Similarly,
|
||||
\family typewriter
|
||||
todo-secondary
|
||||
\family default
|
||||
@ -36014,7 +36014,7 @@ get-resource-{fat,err,wrn}
|
||||
\family typewriter
|
||||
get-resource-{fat,err,wrn}-count
|
||||
\family default
|
||||
Dito, but get the number of lines instead of the text.
|
||||
Ditto, but get the number of lines instead of the text.
|
||||
\end_layout
|
||||
|
||||
\begin_layout Labeling
|
||||
@ -36362,7 +36362,7 @@ device-opened
|
||||
\family typewriter
|
||||
/dev/mars/mydata
|
||||
\family default
|
||||
has been actually openend, e.g.
|
||||
has been actually opened, e.g.
|
||||
by
|
||||
\family typewriter
|
||||
mount
|
||||
@ -36392,7 +36392,7 @@ device-nrflying
|
||||
\family default
|
||||
Show the number of currently flying IO requests.
|
||||
This is an indicator of queueing at the low-level device.
|
||||
When it is permenantly very high, it may point at IO problems, such as
|
||||
When it is permanently very high, it may point at IO problems, such as
|
||||
RAID degradation.
|
||||
\end_layout
|
||||
|
||||
@ -36408,10 +36408,10 @@ disk-error
|
||||
\emph on
|
||||
known
|
||||
\emph default
|
||||
IO error, as reported upwards to applications, and before it was resetted
|
||||
IO error, as reported upwards to applications, and before it was reset
|
||||
for whatever reason.
|
||||
For example, it may be the last open() error on the underlying disk, or
|
||||
something else may have occured during operations, and sometimes it may
|
||||
something else may have occurred during operations, and sometimes it may
|
||||
have corrected itself.
|
||||
Normally, this should be always zero.
|
||||
When < 0 according to return-code conventions as explained at
|
||||
@ -36433,7 +36433,7 @@ device-error
|
||||
\emph on
|
||||
known
|
||||
\emph default
|
||||
IO error, as reported upwards to applications, and before it was resetted
|
||||
IO error, as reported upwards to applications, and before it was reset
|
||||
for whatever reason.
|
||||
Normally, this should be always zero.
|
||||
When < 0 according to return-code conventions as explained at
|
||||
@ -36927,8 +36927,8 @@ true state
|
||||
\end_inset
|
||||
|
||||
does not exist at all in a distributed system.
|
||||
Anything you can know in a distributed system is always local knowlege,
|
||||
which races with other (remote) knowlege, and may be outdated at
|
||||
Anything you can know in a distributed system is always local knowledge,
|
||||
which races with other (remote) knowledge, and may be outdated at
|
||||
\emph on
|
||||
any
|
||||
\emph default
|
||||
@ -37169,7 +37169,7 @@ told
|
||||
\emph on
|
||||
believes
|
||||
\emph default
|
||||
it has commited the data in a reboot-safe way.
|
||||
it has committed the data in a reboot-safe way.
|
||||
Whether this is
|
||||
\emph on
|
||||
really
|
||||
@ -37770,7 +37770,7 @@ mydata
|
||||
\family typewriter
|
||||
global-sync-limit-value
|
||||
\family default
|
||||
(global) Report the maxium parallelism degree of sync, as configurable
|
||||
(global) Report the maximum parallelism degree of sync, as configurable
|
||||
via
|
||||
\family typewriter
|
||||
set-global-sync-limit
|
||||
@ -38550,7 +38550,7 @@ delimiter
|
||||
index
|
||||
\family default
|
||||
\emph default
|
||||
'th list element is the assigend to, or substituted by,
|
||||
'th list element is the assigned to, or substituted by,
|
||||
\family typewriter
|
||||
\emph on
|
||||
expression
|
||||
@ -38582,7 +38582,7 @@ arg2
|
||||
\emph default
|
||||
}
|
||||
\family default
|
||||
Evaluates the arguments, inteprets them as numbers, and adds them together.
|
||||
Evaluates the arguments, interprets them as numbers, and adds them together.
|
||||
\end_layout
|
||||
|
||||
\begin_layout Itemize
|
||||
@ -39936,7 +39936,7 @@ arg1
|
||||
argn
|
||||
\family default
|
||||
\emph default
|
||||
are evaluted in the
|
||||
are evaluated in the
|
||||
\emph on
|
||||
old
|
||||
\emph default
|
||||
@ -40340,7 +40340,7 @@ The value given by the
|
||||
\family typewriter
|
||||
--timeout=
|
||||
\family default
|
||||
option, or the corresonding default value.
|
||||
option, or the corresponding default value.
|
||||
\end_layout
|
||||
|
||||
\begin_layout Itemize
|
||||
@ -40352,7 +40352,7 @@ The value given by the
|
||||
\family typewriter
|
||||
--threshold=
|
||||
\family default
|
||||
option, or the corresonding default value.
|
||||
option, or the corresponding default value.
|
||||
\end_layout
|
||||
|
||||
\begin_layout Itemize
|
||||
@ -40364,7 +40364,7 @@ The value given by the
|
||||
\family typewriter
|
||||
--window=
|
||||
\family default
|
||||
option, or the corresonding default value (60s).
|
||||
option, or the corresponding default value (60s).
|
||||
\end_layout
|
||||
|
||||
\begin_layout Itemize
|
||||
@ -41988,7 +41988,7 @@ marsadm view-flags all
|
||||
\family default
|
||||
, it is known to be inconsistent (e.g.
|
||||
during a sync).
|
||||
When there is a dash instead, it usually means that the disk is detatched
|
||||
When there is a dash instead, it usually means that the disk is detached
|
||||
or misconfigured or the kernel module is not started.
|
||||
Please fix these problems first before believing that your local disk is
|
||||
unusable.
|
||||
@ -42498,7 +42498,7 @@ old snapshot
|
||||
\family typewriter
|
||||
/mars/
|
||||
\family default
|
||||
and/or of some underly resource disk /
|
||||
and/or of some underlying resource disk /
|
||||
\emph on
|
||||
underlying
|
||||
\emph default
|
||||
@ -42527,7 +42527,7 @@ some(!)
|
||||
\family default
|
||||
has survived and the storage is operational again.
|
||||
For example, any defective RAID disks have been already replaced and the
|
||||
underlaying RAID is now
|
||||
underlying RAID is now
|
||||
\emph on
|
||||
rebuilt
|
||||
\emph default
|
||||
@ -43154,7 +43154,7 @@ longer
|
||||
|
||||
Do not blame MARS for anything which is outside its scope residing at kernel
|
||||
level.
|
||||
Wheter and when a failover is needed for whatever reason, and in which
|
||||
Whether and when a failover is needed for whatever reason, and in which
|
||||
parallelism degree, is clearly outside the scope of a
|
||||
\emph on
|
||||
component
|
||||
@ -43558,7 +43558,7 @@ good
|
||||
for emergency cases), don't re-join them all in parallel, but rather start
|
||||
with the oldest / most outdated / worst / inconsistent version first.
|
||||
It is recommended to start the next one only when the previous one has
|
||||
sucessfully finished.
|
||||
successfully finished.
|
||||
\end_layout
|
||||
|
||||
\begin_layout Chapter
|
||||
@ -43775,7 +43775,7 @@ It should be very hard to finally trash a secondary, because the transaction
|
||||
md5
|
||||
\family default
|
||||
checksums for all data records.
|
||||
Any attempt to replay currupted logfiles is refused by MARS.
|
||||
Any attempt to replay corrupted logfiles is refused by MARS.
|
||||
In addition, the sequence numbers of rotated logfiles (e.g.
|
||||
via
|
||||
\family typewriter
|
||||
@ -44211,7 +44211,7 @@ The following is a further alternative for
|
||||
experts
|
||||
\series default
|
||||
who really know what they are doing.
|
||||
The method is very simple and therefore well-suited for coping with mass
|
||||
The method is very simple and therefore well-suited for coping with mass
|
||||
failures, e.g.
|
||||
|
||||
\series bold
|
||||
@ -44499,7 +44499,7 @@ name "chap:Creating-Backups-via"
|
||||
\end_layout
|
||||
|
||||
\begin_layout Standard
|
||||
When all your secondaries are all homogenously located in a standby datacenter,
|
||||
When all your secondaries are all homogeneously located in a standby datacenter,
|
||||
they will be almost idle all the time.
|
||||
This is a waste of computing resources.
|
||||
\end_layout
|
||||
|
@ -65,7 +65,7 @@ config MARS_DEBUG
|
||||
default n
|
||||
help
|
||||
Some of these checks and some additional error tracing may
|
||||
consume noticable amounts of memory.
|
||||
consume noticeable amounts of memory.
|
||||
OFF for production systems. ON for testing!
|
||||
|
||||
config MARS_DEBUG_DEFAULT
|
||||
@ -111,7 +111,7 @@ config MARS_DEFAULT_PORT
|
||||
default 7777
|
||||
help
|
||||
Best practice is to uniformly use the same port number
|
||||
in a cluster. Therefore, this is a compiletime constant.
|
||||
in a cluster. Therefore, this is a compile-time constant.
|
||||
You may override this at insmod time via the mars_port= parameter.
|
||||
|
||||
config MARS_SEPARATE_PORTS
|
||||
@ -175,24 +175,24 @@ config MARS_ROLLOVER_INTERVAL
|
||||
depends on MARS
|
||||
default 3
|
||||
help
|
||||
May influence the system load; dont use too low nubmers.
|
||||
May influence the system load; don't use too low numbers.
|
||||
|
||||
config MARS_SCAN_INTERVAL
|
||||
int "re-scanning of symlinks in /mars/ (in seconds)"
|
||||
depends on MARS
|
||||
default 5
|
||||
help
|
||||
May influence the system load; dont use too low nubmers.
|
||||
May influence the system load; don't use too low numbers.
|
||||
|
||||
config MARS_PROPAGATE_INTERVAL
|
||||
int "network propagation delay of changes in /mars/ (in seconds)"
|
||||
depends on MARS
|
||||
default 5
|
||||
help
|
||||
May influence the system load; dont use too low nubmers.
|
||||
May influence the system load; don't use too low numbers.
|
||||
|
||||
config MARS_SYNC_FLIP_INTERVAL
|
||||
int "interrpt sync by logfile update after (seconds)"
|
||||
int "interrupt sync by logfile update after (seconds)"
|
||||
depends on MARS
|
||||
default 60
|
||||
help
|
||||
|
@ -383,7 +383,7 @@ int _make_mref(struct copy_brick *brick,
|
||||
st = &GET_STATE(brick, index);
|
||||
old_mref = READ_ONCE(st->table[queue]);
|
||||
if (unlikely(old_mref)) {
|
||||
MARS_ERR("cannot overrride old_mref=%p at index=%u queue=%d pos=%lld+%lld flags=%d\n",
|
||||
MARS_ERR("cannot override old_mref=%p at index=%u queue=%d pos=%lld+%lld flags=%d\n",
|
||||
old_mref,
|
||||
index, queue,
|
||||
current_pos, diff, flags);
|
||||
@ -570,7 +570,7 @@ restart:
|
||||
brick->copy_shutdown_started.tv_sec) {
|
||||
struct lamport_time force_when;
|
||||
|
||||
/* We use the force alrady after mars_copy_timeout / 2
|
||||
/* We use the force already after mars_copy_timeout / 2
|
||||
* because the shutdown itself may take some
|
||||
* further time (e.g. over network).
|
||||
*/
|
||||
@ -743,7 +743,7 @@ restart:
|
||||
* _starting_ the writes is in order.
|
||||
* This is only correct when all lower bricks obey the
|
||||
* order of ref_io() operations.
|
||||
* Currenty, bio and aio are obeying this. Be careful when
|
||||
* Currently, bio and aio are obeying this. Be careful when
|
||||
* implementing new IO bricks!
|
||||
*/
|
||||
if (mars_copy_strict_write_order &&
|
||||
|
@ -2,7 +2,7 @@ GPLed software AS IS, sponsored by 1&1 Internet AG (www.1und1.de).
|
||||
|
||||
The test suite is work in progress.
|
||||
|
||||
The test suite was developped by Frank Liepold during his stay at 1&1.
|
||||
The test suite was developed by Frank Liepold during his stay at 1&1.
|
||||
|
||||
The email address frank.liepold@1und1.de will no longer work, since Frank has
|
||||
left 1&1 since May 2014.
|
||||
@ -51,7 +51,7 @@ Contents
|
||||
1.1. Global settings
|
||||
------------------
|
||||
The directory where this README resides (normally <git-repo>/test_suite) is
|
||||
called base directory in the following. If relativ paths are given they refer to
|
||||
called base directory in the following. If relative paths are given they refer to
|
||||
this base directory.
|
||||
|
||||
The framework of the test suite consists of the files README,
|
||||
@ -119,11 +119,11 @@ subdirectory (we call it start directory) as follows:
|
||||
branch start directory -> test directory) by default. It may be changed by
|
||||
the option --config_root_dir=<my dir>.
|
||||
If the start directory coincides with the test directory the file
|
||||
<test directory name>.conf is included for conveniance (in the strict
|
||||
<test directory name>.conf is included for convenience (in the strict
|
||||
sense there are no true subdirectories of the start directory residing
|
||||
above the test directory).
|
||||
The <subdirnam>.conf files may reside in the test directory or any of
|
||||
it's parent directories (up to 20 levels).
|
||||
its parent directories (up to 20 levels).
|
||||
|
||||
|
||||
Examples:
|
||||
@ -212,7 +212,7 @@ The output consists of the following sections:
|
||||
--------------------------------------
|
||||
Calling start_test.sh --help from a test directory doesn't start a test but
|
||||
gives you the output mentioned in 1.2.1 without the sections produced during
|
||||
real test excecution.
|
||||
real test execution.
|
||||
In particular the section "Configuration variables" is printed. So you can
|
||||
determine which functions are called via the variable run_list. These functions
|
||||
should be commented extensively enough to be able to understand the test's
|
||||
|
@ -364,7 +364,7 @@ sub device_exists {
|
||||
# Silent fallback to local detection for old kernel module versions
|
||||
my $buildtag = get_alive_link("buildtag", $peer, 1);
|
||||
if (!$buildtag) {
|
||||
# VERY old MARS modules dont report their version
|
||||
# VERY old MARS modules don't report their version
|
||||
$buildtag = `cut -d' ' -f1 < /proc/sys/mars/version`;
|
||||
# Sometimes "never touch a running system" is a BAD strategy...
|
||||
lwarn "Please upgrade your EXTREMELY OLD module version '$buildtag'\n" if $buildtag;
|
||||
@ -815,7 +815,7 @@ sub _scan_caches {
|
||||
lhint "Peer '$this_peer' looks like decommissioned (or we are in split-cluster).\n";
|
||||
}
|
||||
}
|
||||
# ABOLUTE NOGO: the currently running host CANNOT be deleted
|
||||
# ABSOLUTE NOGO: the currently running host CANNOT be deleted
|
||||
if ($this_peer eq $real_host &&
|
||||
(!$raw_ip || get_link_stamp($path) < $now - $window)) {
|
||||
lwarn "IMPORTANT: this script is running under the REAL hostname '$real_host'\n";
|
||||
@ -854,7 +854,7 @@ sub _scan_caches {
|
||||
next;
|
||||
}
|
||||
}
|
||||
# All has been checked now: rember this peer.
|
||||
# All has been checked now: remember this peer.
|
||||
$total_peers{$this_peer} = {};
|
||||
}
|
||||
# Add all known resources to %total_resources but _not_ to %any_resources.
|
||||
@ -1145,7 +1145,7 @@ my $generated_scripts_subdir = defined($ENV{SYSTEMD_SCRIPTS_SUBDIR}) ?
|
||||
my $predefined_unit_path = "/etc/systemd/system,/run/systemd/system,/usr/lib/systemd/system";
|
||||
|
||||
my $systemd_system_dirs =
|
||||
# prefer the "offical" systemd path as documented in "man systemd.unit"
|
||||
# prefer the "official" systemd path as documented in "man systemd.unit"
|
||||
defined($ENV{SYSTEMD_UNIT_PATH}) ?
|
||||
join(",", split(":", $ENV{SYSTEMD_UNIT_PATH})) .
|
||||
(
|
||||
@ -2113,7 +2113,7 @@ sub systemd_commit {
|
||||
# At the moment, the complete transitive closure is re-computed once
|
||||
# a small detail has changed. This is on the safe side, but not optimal.
|
||||
# There is certainly room for improvement. However be cautious
|
||||
# with respect to correctness under all cirumstances.
|
||||
# with respect to correctness under all circumstances.
|
||||
#
|
||||
# Knuth is cited: "I can do it in half the time if it doesn't have
|
||||
# to be correct".
|
||||
@ -2144,7 +2144,7 @@ sub __systemd_generate {
|
||||
@res_list = ($res);
|
||||
} else {
|
||||
@res_list = get_any_resources($host);
|
||||
# We can only delete when the full set of transitive dependecies is known.
|
||||
# We can only delete when the full set of transitive dependencies is known.
|
||||
$do_delete = ($make_want && $make_watcher);
|
||||
}
|
||||
# Create initial systemd units
|
||||
@ -2607,7 +2607,7 @@ sub get_global_versions {
|
||||
lwarn "using different minor versions is possible, but you should upgrade your kernel module ASAP\n";
|
||||
}
|
||||
}
|
||||
# compute the mimimum of kernel features capabilities
|
||||
# compute the minimum of kernel features capabilities
|
||||
my $start_time = mars_time();
|
||||
my $stone_age = $start_time;
|
||||
if ($cron_autoclean_days) {
|
||||
@ -2689,7 +2689,7 @@ sub get_alive_links {
|
||||
next if $peer =~ $match_reserved_id;
|
||||
# After join-cluster & co, links may take a while to appear
|
||||
$peers{$peer} = 1 if $non_participating_peers;
|
||||
# peer must be a candiate matching the hosts spec
|
||||
# peer must be a candidate matching the hosts spec
|
||||
if ($hosts && $hosts ne "*") {
|
||||
next unless $peer =~ m/(^|[+,{}])$hosts($|[+,{}])/;
|
||||
}
|
||||
@ -3384,7 +3384,7 @@ sub check_not_primary {
|
||||
if (!$is_primary_recent || $desginated_primary_recent) {
|
||||
if ($max_retry-- < 0) {
|
||||
lwarn "Sorry, the primary status on resource '$res' is UNSTABLE or FLIPPING AROUND\n";
|
||||
ldie "Please check whether there are DISTRIBUED RACES or amok-running scripts etc.\n" unless $force;
|
||||
ldie "Please check whether there are DISTRIBUTED RACES or amok-running scripts etc.\n" unless $force;
|
||||
lwarn "You said --force, I will continue AT YOUR RISK\n"
|
||||
} else {
|
||||
_trigger();
|
||||
@ -4893,7 +4893,7 @@ sub senseless_cmd {
|
||||
|
||||
sub forbidden_cmd {
|
||||
my ($cmd, $res) = @_;
|
||||
ldie "command '$cmd' on resource '$res' cannot be used with MARS (migth affect too many hosts, lead to undesired consequences)\n";
|
||||
ldie "command '$cmd' on resource '$res' cannot be used with MARS (might affect too many hosts, lead to undesired consequences)\n";
|
||||
}
|
||||
|
||||
sub nyi_cmd {
|
||||
@ -5318,11 +5318,11 @@ sub merge_cluster_old {
|
||||
if ($other_uuid eq $uuid) {
|
||||
lprint "Other cluster peer '$peer' has the same UUID.\n";
|
||||
lprint "No resource name checking necessary.\n";
|
||||
lprint "Operation '$cmd' will therfore work logically idempotent.\n";
|
||||
lprint "Operation '$cmd' will therefore work logically idempotent.\n";
|
||||
} else {
|
||||
if (link_exists("$mars/tree-$peer")) {
|
||||
lwarn "A valid tree signature '$mars/tree-$peer' already exists, thus it appears to be already merged!\n";
|
||||
ldie "Aborting for saftey. Override via --force only if you know what you are doing!\n" unless $force;
|
||||
ldie "Aborting for safety. Override via --force only if you know what you are doing!\n" unless $force;
|
||||
}
|
||||
# Check that both sets of resources are disjoint
|
||||
lprint "Other cluster peer '$peer' has a different UUID, checking for resource name conflicts.\n";
|
||||
@ -5344,7 +5344,7 @@ sub merge_cluster_old {
|
||||
foreach my $res (@conflicts) {
|
||||
lprint "\t$res\n";
|
||||
}
|
||||
ldie "Cannot $cmd: some resource directories exist at both clusters with same name.\nThis cannot be overriden.\nPlease resolve the conflict by hand.\n";
|
||||
ldie "Cannot $cmd: some resource directories exist at both clusters with same name.\nThis cannot be overridden.\nPlease resolve the conflict by hand.\n";
|
||||
}
|
||||
lprint "List of total resources:\n";
|
||||
foreach my $res (keys(%total_res)) {
|
||||
@ -6134,7 +6134,7 @@ sub logrotate_res {
|
||||
lwarn "logfile '$next' already exists - nothing to do\n";
|
||||
return 0;
|
||||
}
|
||||
# safeguard defective /mars: the corresonding versionlink must exist.
|
||||
# safeguard defective /mars: the corresponding versionlink must exist.
|
||||
if (!is_link_recent($last)) {
|
||||
my $vers_path = $last;
|
||||
$vers_path =~ s:/log-:/version-:;
|
||||
@ -6432,7 +6432,7 @@ sub link_purge_global {
|
||||
# removal, because this would induce a _plethora_ of further changes to
|
||||
# many reports / commands / interfaces / etc etc.
|
||||
# Thus we _cannot_ use the _get_min_time() protection against dead / decommissioned
|
||||
# peers here, UNFORTUNATLY :(
|
||||
# peers here, UNFORTUNATELY :(
|
||||
# Reason: this protection can only protect at more fine-grained layers, but it
|
||||
# cannot protect the _base_ of all of this.
|
||||
# Example: if you destroy the _foundation_ of a building, you have agreed to
|
||||
@ -6567,7 +6567,7 @@ sub logdelete_res {
|
||||
my $next = shift(@paths);
|
||||
# never delete the very last logfile
|
||||
last unless $next;
|
||||
# safeguard: only delete logfiles having a minium age
|
||||
# safeguard: only delete logfiles having a minimum age
|
||||
last if !$force && is_link_recent($first);
|
||||
$nr = $first;
|
||||
$nr =~ s/^.*log-([0-9]+)-.+$/$1/;
|
||||
@ -7822,7 +7822,7 @@ sub progress_bar {
|
||||
sub make_numeric {
|
||||
my $number = shift;
|
||||
return 0 if (!defined($number) || $number eq "");
|
||||
# skip followin parts of comma-separated lists
|
||||
# skip following parts of comma-separated lists
|
||||
$number =~ s/,.*//;
|
||||
return $number;
|
||||
}
|
||||
@ -9742,7 +9742,7 @@ my %trivial_globs =
|
||||
"occupied-size"
|
||||
=> "",
|
||||
"replay-code"
|
||||
=> "When negative, this indidates that a replay/recovery error has occurred.",
|
||||
=> "When negative, this indicates that a replay/recovery error has occurred.",
|
||||
"errno-text"
|
||||
=> "Convert errno numbers (positive or negative) into human readable text.",
|
||||
"{sync,fetch,replay,work,syncpos}-{size,pos}"
|
||||
@ -10235,7 +10235,7 @@ my %cmd_table =
|
||||
"Deprecated.",
|
||||
"Please use \"marsadm cron\" instead.",
|
||||
"When possible, globally delete all old transaction logfiles which",
|
||||
"are known to be superflous, i.e. all secondaries no longer need",
|
||||
"are known to be superfluous, i.e. all secondaries no longer need",
|
||||
"to replay them.",
|
||||
"This must be regularly called by a cron job or similar, in order",
|
||||
"to prevent overflow of the /mars/ directory.",
|
||||
@ -10933,7 +10933,7 @@ marsadm [<global_options>] view[-<macroname>] [<resource_names> | all ]
|
||||
--verbose
|
||||
Increase speakyness of some commands.
|
||||
--parallel
|
||||
Only resonable when combined with \"all\".
|
||||
Only reasonable when combined with \"all\".
|
||||
For each resource, fork() a sub-process running independently
|
||||
from other resources. May speed up handover a lot.
|
||||
However, several cluster managers are known to have problems
|
||||
@ -10974,13 +10974,13 @@ marsadm [<global_options>] view[-<macroname>] [<resource_names> | all ]
|
||||
--timeout=<seconds>
|
||||
Current default: $timeout
|
||||
Abort safety checks and waiting loops after timeout with an error.
|
||||
When giving 'all' as resource agument, this works for each
|
||||
When giving 'all' as resource argument, this works for each
|
||||
resource independently.
|
||||
The special value -1 means \"infinite\".
|
||||
--window=<seconds>
|
||||
Current default: $window
|
||||
Treat other cluster nodes as healthy when some communcation has
|
||||
occured during the given time window.
|
||||
Treat other cluster nodes as healthy when some communication has
|
||||
occurred during the given time window.
|
||||
--stuck-seconds=<seconds>
|
||||
Current default: $stuck_seconds
|
||||
Some warnings, like stucking fetch or replay, will appear in
|
||||
@ -11732,7 +11732,7 @@ if (ref($func) eq "ARRAY") {
|
||||
sleep(1);
|
||||
my $now = mars_time();
|
||||
if ($now - $start_time > $timeout) {
|
||||
lwarn "Condition '$headline' for resources '$res' not reached withing $timeout s\n";
|
||||
lwarn "Condition '$headline' for resources '$res' not reached within $timeout s\n";
|
||||
last;
|
||||
}
|
||||
}
|
||||
|
@ -16,7 +16,7 @@ the time to fully analyze all distros / distro versions and their udev
|
||||
rules.
|
||||
|
||||
Since I am not an expert in writing udev rules (and I just needed
|
||||
a quickfix for my own work), the files in this directoy should
|
||||
a quickfix for my own work), the files in this directory should
|
||||
be regarded as examples.
|
||||
|
||||
For example, the file 65-mars.rules should be copied to /lib/udev/rules.d/
|
||||
|