From 2cde9074800fb179a376f839950afdceb11afe89 Mon Sep 17 00:00:00 2001 From: Thomas Schoebel-Theuer Date: Tue, 10 Sep 2019 12:38:59 +0200 Subject: [PATCH] user-manual: rework scripting advice --- docu/mars-user-manual.lyx | 380 +++++++++++++++++++------------------- 1 file changed, 187 insertions(+), 193 deletions(-) diff --git a/docu/mars-user-manual.lyx b/docu/mars-user-manual.lyx index dd4ff6a7..f33e1d6b 100644 --- a/docu/mars-user-manual.lyx +++ b/docu/mars-user-manual.lyx @@ -31718,6 +31718,193 @@ Same as the firmly built in. \end_layout +\begin_layout Section +Scripting Advice +\begin_inset CommandInset label +LatexCommand label +name "sec:Scripting-HOWTO" + +\end_inset + + +\end_layout + +\begin_layout Standard +Both the +\series bold +asynchronous communication model +\series default + of MARS including the Lamport clock, and the +\series bold +state model +\series default + (cf section +\begin_inset CommandInset ref +LatexCommand ref +reference "sec:The-State-of" + +\end_inset + +) are something you +\emph on +definitely +\emph default + should have in mind when you want to do some scripting. + Here is some advice: +\end_layout + +\begin_layout Itemize +Don't access anything on +\family typewriter +/mars/ +\family default + directly, except for debugging purposes. + Use +\family typewriter +marsadm +\family default +. +\end_layout + +\begin_layout Itemize +Avoid running scripts in parallel, other than for inspection / monitoring + purposes. + When you give two +\family typewriter +marsadm +\family default + commands in parallel (whether on the same host, or on different hosts belonging + to the same cluster), it is possible to produce a mess. + +\family typewriter +marsadm +\family default + has no internal locking. + There is no cluster-wide locking at all, because it would cause trouble + during long-distance network outages. + Unfortunately, some systems like Pacemaker are violating this in many cases + (depending on their configuration). 
+ Best is if you have a dedicated / more or less centralized +\series bold +control machine +\series default + which controls masses of your georedundant working servers. + This reduces the risk of running interfering actions in parallel. + Of course, you need backup machines for your control machines, and in different + locations. + Not obeying this advice can easily lead to problems such as complex races + which are very difficult to solve in long-distance distributed systems, + even in general (not limited to MARS). +\end_layout + +\begin_layout Itemize + +\family typewriter +marsadm wait-cluster +\family default + is your friend. + Whenever your (near-)central script has to switch between different hosts + +\family typewriter +A +\family default + and +\family typewriter +B +\family default + (of the same cluster), use it in the following way: +\begin_inset Newline newline +\end_inset + + +\family typewriter +ssh A +\begin_inset Quotes eld +\end_inset + +marsadm action1 +\begin_inset Quotes erd +\end_inset + +; ssh B +\begin_inset Quotes eld +\end_inset + +marsadm wait-cluster; marsadm action2 +\begin_inset Quotes erd +\end_inset + + +\begin_inset Newline newline +\end_inset + + +\family default + +\begin_inset Graphics + filename images/MatieresCorrosives.png + lyxscale 50 + scale 17 + +\end_inset + + Don't ignore this advice! Interference is almost +\emph on +sure +\emph default +! As a rule of thumb, precede almost any action command with some appropriate + waiting command! +\end_layout + +\begin_layout Itemize +Further friends are any +\family typewriter +marsadm wait-* +\family default + commands, such as +\family typewriter +wait-umount +\family default +. +\end_layout + +\begin_layout Itemize +In some places, busy-wait loops might be needed, e.g. + for waiting until a specific resource is +\family typewriter +UpToDate +\family default + or matches some other condition. 
+ Examples of waiting conditions can be found under +\family typewriter +github.com/schoebel/test-suite +\family default + in subdirectory +\family typewriter +mars/modules/ +\family default +, specifically +\family typewriter +02_predicates.sh +\family default + or similar. +\end_layout + +\begin_layout Itemize +In case of network problems, some command may hang (forever), if you don't + set the +\family typewriter +--timeout= +\family default + option. + Don't forget to check the return state of any failed / timed-out commands, + and to take appropriate measures! +\end_layout + +\begin_layout Itemize +Test your scripts in failure scenarios! +\end_layout + \begin_layout Chapter Troubleshooting \end_layout @@ -32014,199 +32201,6 @@ name "chap:The-Macro-Processor" \end_layout -\begin_layout Section -Scripting HOWTO -\begin_inset CommandInset label -LatexCommand label -name "sec:Scripting-HOWTO" - -\end_inset - - -\end_layout - -\begin_layout Standard -Both the -\series bold -asynchronous communication model -\series default - of MARS (cf section -\begin_inset CommandInset ref -LatexCommand ref -reference "sec:The-Lamport-Clock" - -\end_inset - -) including the Lamport clock, and the -\series bold -state model -\series default - (cf section -\begin_inset CommandInset ref -LatexCommand ref -reference "sec:The-State-of" - -\end_inset - -) is something you -\emph on -definitely -\emph default - should have in mind when you want to do some scripting. - Here is some further concrete advice: -\end_layout - -\begin_layout Itemize -Don't access anything on -\family typewriter -/mars/ -\family default - directly, except for debugging purposes. - Use -\family typewriter -marsadm -\family default -. -\end_layout - -\begin_layout Itemize -Avoid running scripts in parallel, other than for inspection / monitoring - purposes. 
- When you give two -\family typewriter -marsadm -\family default - commands in parallel (whether on the same host, or on different hosts belonging - to the same cluster), it is very likely to produce a mess. - -\family typewriter -marsadm -\family default - has no internal locking. - There is no cluster-wide locking at all. - Unfortunately, some systems like Pacemaker are violating this in many cases - (depending on their configuration). - Best is if you have a dedicated / more or less centralized -\series bold -control machine -\series default - which controls masses of your georedundant working servers. - This reduces the risk of running interfering actions in parallel. - Of course, you need backup machines for your control machines, and in different - locations. - Not obeying this advice can easily lead to problems such as complex races - which are very difficult to solve in long-distance distributed systems, - even in general (not limited to MARS). -\end_layout - -\begin_layout Itemize - -\family typewriter -marsadm wait-cluster -\family default - is your friend. - Whenever your (near-)central script has to switch between different hosts - -\family typewriter -A -\family default - and -\family typewriter -B -\family default - (of the same cluster), use it in the following way: -\begin_inset Newline newline -\end_inset - - -\family typewriter -ssh A -\begin_inset Quotes eld -\end_inset - -marsadm action1 -\begin_inset Quotes erd -\end_inset - -; ssh B -\begin_inset Quotes eld -\end_inset - -marsadm wait-cluster; marsadm action2 -\begin_inset Quotes erd -\end_inset - - -\begin_inset Newline newline -\end_inset - - -\family default - -\begin_inset Graphics - filename images/MatieresCorrosives.png - lyxscale 50 - scale 17 - -\end_inset - - Don't ignore this advice! Interference is almost -\emph on -sure -\emph default -! As a rule of thumb, precede almost any action command with some appropriate - waiting command! 
-\end_layout - -\begin_layout Itemize -Further friends are any -\family typewriter -marsadm wait-* -\family default - commands, such as -\family typewriter -wait-umount -\family default -. -\end_layout - -\begin_layout Itemize -In some places, busy-wait loops might be needed, e.g. - for waiting until a specific resource is -\family typewriter -UpToDate -\family default - or matches some other condition. - Examples of waiting conditions can be found under -\family typewriter -github.com/schoebel/test-suite -\family default - in subdirectory -\family typewriter -mars/modules/ -\family default -, specifically -\family typewriter -02_predicates.sh -\family default - or similar. -\end_layout - -\begin_layout Itemize -In case of network problems, some command may hang (forever), if you don't - set the -\family typewriter ---timeout= -\family default - option. - Don't forget the check the return state of any failed / timeouted commands, - and to take appropriate measures! -\end_layout - -\begin_layout Itemize -Test your scripts in failure scenarios! -\end_layout - \begin_layout Chapter The Sysadmin Interface ( \family typewriter