#LyX 2.0 created this file. For more info see http://www.lyx.org/ \lyxformat 413 \begin_document \begin_header \textclass scrreprt \begin_preamble \usepackage[dvipsnames]{xcolor} \usepackage{listings} \end_preamble \options abstracton \use_default_options true \begin_modules customHeadersFooters enumitem fixltx2e \end_modules \maintain_unincluded_children false \language english \language_package default \inputencoding auto \fontencoding global \font_roman default \font_sans default \font_typewriter default \font_default_family rmdefault \use_non_tex_fonts false \font_sc false \font_osf false \font_sf_scale 100 \font_tt_scale 100 \graphics default \default_output_format default \output_sync 0 \bibtex_command default \index_command default \paperfontsize 10 \spacing single \use_hyperref true \pdf_title "MARS Manual" \pdf_author "Thomas Schöbel-Theuer" \pdf_bookmarks true \pdf_bookmarksnumbered false \pdf_bookmarksopen false \pdf_bookmarksopenlevel 1 \pdf_breaklinks true \pdf_pdfborder true \pdf_colorlinks true \pdf_backref false \pdf_pdfusetitle true \papersize a4paper \use_geometry true \use_amsmath 1 \use_esint 1 \use_mhchem 1 \use_mathdots 1 \cite_engine basic \use_bibtopic false \use_indices false \paperorientation portrait \suppress_date false \use_refstyle 1 \index Index \shortcut idx \color #008000 \end_index \leftmargin 3.7cm \topmargin 2.7cm \rightmargin 2.8cm \bottommargin 2.3cm \secnumdepth 3 \tocdepth 3 \paragraph_separation indent \paragraph_indentation default \quotes_language english \papercolumns 1 \papersides 2 \paperpagestyle headings \tracking_changes false \output_changes false \html_math_output 0 \html_css_as_file 0 \html_be_strict false \end_header \begin_body \begin_layout Title \family typewriter MARS Manual \begin_inset Newline newline \end_inset \begin_inset space ~ \end_inset \end_layout \begin_layout Subtitle Multiversion Asynchronous Replicated Storage \begin_inset Newline newline \end_inset \begin_inset space ~ \end_inset \begin_inset 
Newline newline \end_inset \begin_inset Graphics filename images/earth-mars-transfer.fig width 70col% \end_inset \end_layout \begin_layout Author Thomas Schöbel-Theuer ( \family typewriter tst@1und1.de \family default ) \end_layout \begin_layout Date Version 0.12 (incomplete) \end_layout \begin_layout Lowertitleback \noindent Copyright (C) 2013 Thomas Schöbel-Theuer / 1&1 Internet AG \begin_inset Newline newline \end_inset (see \begin_inset Flex URL status open \begin_layout Plain Layout http://www.1und1.de \end_layout \end_inset shortly called 1&1 in the following). \begin_inset Newline newline \end_inset \size footnotesize Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled \begin_inset Quotes eld \end_inset \begin_inset CommandInset ref LatexCommand nameref reference "chap:GNU-FDL" \end_inset \begin_inset Quotes erd \end_inset . \end_layout \begin_layout Abstract \family typewriter \begin_inset ERT status open \begin_layout Plain Layout \backslash sloppy \end_layout \end_inset MARS \family default Light is a block-level storage replication system for long distances / flaky networks under GPL. It runs as a Linux kernel module. The sysadmin interface is similar to DRBD \begin_inset Foot status open \begin_layout Plain Layout Registered trademarks are the property of their respective owner. \end_layout \end_inset , but its internal engine is completely different from DRBD: it works with \series bold transaction logging \series default , similar to some database systems. \end_layout \begin_layout Abstract Therefore, MARS Light can provide stronger \series bold consistency guarantees \series default . 
Even in case of network bottlenecks / problems / failures, the secondaries may become outdated (reflect an elder state), but never become inconsistent. In contrast to DRBD, MARS Light preserves the \series bold order of write operations \series default even when the network is flaky ( \series bold Anytime Consistency \series default ). \end_layout \begin_layout Abstract The current version of MARS Light works \series bold asynchronously \series default . Therefore, application performance is completely decoupled from any network problems. Future versions are planned to also support synchronous or near-synchronous modes. \end_layout \begin_layout Abstract \paragraph_spacing double \noindent \begin_inset space ~ \end_inset \begin_inset Newline newline \end_inset \begin_inset space ~ \end_inset \begin_inset Newline newline \end_inset \begin_inset Box Frameless position "c" hor_pos "c" has_inner_box 1 inner_pos "c" use_parbox 0 use_makebox 1 width "100col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \begin_inset Graphics filename images/earth-mars-transfer.fig width 70col% \end_inset \end_layout \end_inset \end_layout \begin_layout Standard \begin_inset CommandInset toc LatexCommand tableofcontents \end_inset \end_layout \begin_layout Chapter Use Cases for MARS vs DRBD \begin_inset CommandInset label LatexCommand label name "chap:Use-Cases-for" \end_inset \end_layout \begin_layout Standard DRBD has a long history of successfully providing HA features to many users of Linux. With the advent of MARS, many people are wondering what the difference is. They ask for recommendations. In which use cases should DRBD be recommended, and in which other cases is MARS the better choice? \end_layout \begin_layout Standard There exist \emph on some \emph default cases where DRBD is better than MARS. 
1&1 has long experience with DRBD in setups where it works very well, in particular coupling Linux devices rack-to-rack via crossover cables. DRBD was \emph on constructed \emph default for exactly that use case (RAID-1 over network). \end_layout \begin_layout Standard On the other hand, there exist other cases where DRBD did not work as expected, leading to incidents and other operational problems. We analyzed them for those use cases, and found that they could only be resolved by fundamental changes in the overall architecture of DRBD. Therefore, we started the development of MARS. \end_layout \begin_layout Standard MARS and DRBD simply have \series bold different application areas \series default . \end_layout \begin_layout Standard In the following, we will discuss the pros and cons of each system in particular situations and contexts, and we shed some light on their conceptual and operational differences. \end_layout \begin_layout Section Network Bottlenecks \begin_inset CommandInset label LatexCommand label name "sec:Network-Bottlenecks" \end_inset \end_layout \begin_layout Subsection Behaviour of DRBD \begin_inset CommandInset label LatexCommand label name "sub:Behaviour-of-DRBD" \end_inset \end_layout \begin_layout Standard In order to describe the most important problem we found when DRBD was used to couple whole datacenters (each encompassing thousands of servers) over metro distances, we strip down that complicated real-life scenario to a simplified laboratory scenario in order to demonstrate the effect with minimal means. The following picture illustrates an effect which is not only observable in practice, but is also reproducible by the MARS test suite \begin_inset Foot status open \begin_layout Plain Layout The effect has been demonstrated with DRBD version 8.3.13. By construction, it is independent of which DRBD series is used (8.3.x, 8.4.x, or 9.0.x). 
\end_layout \end_inset : \end_layout \begin_layout Standard \noindent \align center \begin_inset Graphics filename images/network-bottleneck-drbd.fig width 80col% \end_inset \end_layout \begin_layout Standard \noindent The simplified scenario is the following: \end_layout \begin_layout Enumerate DRBD is loaded with a low to medium, but constant rate of write operations for the sake of simplicity of the scenario. \end_layout \begin_layout Enumerate The network has some throughput bottleneck, depicted as a red line. For the sake of simplicity, we just linearly decrease it over time, starting from full throughput, down to zero. The decrease is very slow (over some minutes, or even hours). \end_layout \begin_layout Standard What will happen in this scenario? \end_layout \begin_layout Standard As long as the actual DRBD write throughput is lower than the network bandwidth (left part of the horizontal blue line), DRBD works as expected. \end_layout \begin_layout Standard Once the maximum network throughput (red line) starts to fall short of the required application throughput (first blue dotted line), we get into trouble. By its very nature, DRBD works \series bold synchronously \series default . Therefore, it \emph on must \emph default transfer all your application writes through the bottleneck, but now it is impossible \begin_inset Foot status open \begin_layout Plain Layout This is independent of the DRBD protocols A through C, because it follows from an information-theoretic argument that holds for any protocol. We have a fundamental conflict between network capabilities and application demands here, which cannot be circumvented due to the \series bold synchronous \series default nature of DRBD. \end_layout \end_inset due to the bottleneck. As a consequence, the application running on top of DRBD will see increasingly high IO latencies and/or stalls / hangs. 
We found practical cases (at least with former versions of DRBD) where IO latencies exceeded practical monitoring limits such as \begin_inset Formula $5$ \end_inset s by far, up to the range of \emph on minutes \emph default . As an experienced sysadmin, you know what happens next: your application will run into an incident, and your customers will be dissatisfied. \end_layout \begin_layout Standard In order to deal with such situations, DRBD has lots of tuning parameters. In particular, the \family typewriter timeout \family default parameter and/or the \family typewriter ping-timeout \family default parameter will determine when DRBD will give up in such a situation and simply drop the network connection as an emergency measure. Dropping the network connection is roughly equivalent to an automatic \family typewriter disconnect \family default , followed by an automatic re-connect attempt after \family typewriter connect-int \family default seconds. During the dropped connection, the incident will appear as being resolved, but at some hidden cost \begin_inset Foot status open \begin_layout Plain Layout By appropriately tuning various DRBD parameters, such as \family typewriter timeout \family default and/or \family typewriter ping-timeout \family default , you can keep the impact of the incident below some viable limit. However, the automatic disconnect will then happen earlier and more often in practice. Flaky or overloaded networks may easily lead to an enormous number of automatic disconnects. \end_layout \end_inset . \end_layout \begin_layout Standard What happens next in our scenario? During the \family typewriter disconnect \family default , DRBD will record all positions of writes in its bitmap and/or in its activity log. As soon as the automatic re-connect succeeds after \family typewriter connect-int \family default seconds, DRBD has to do a partial re-sync of those blocks which were marked dirty in the meantime. 
This leads to an \emph on additional \emph default bandwidth demand \begin_inset Foot status open \begin_layout Plain Layout DRBD parameters \family typewriter sync-rate \family default resp. \family typewriter resync-rate \family default may be used to tune the height of the additional demand. In addition, the newer parameters \family typewriter c-plan-ahead \family default , \family typewriter c-fill-target \family default , \family typewriter c-delay-target \family default , \family typewriter c-min-rate \family default , \family typewriter c-max-rate \family default and friends may be used to dynamically adapt to \emph on some \emph default situations where the application throughput \emph on could \emph default fit through the bottleneck. These newer parameters were developed in cooperation between 1&1 and Linbit, the maker of DRBD. \end_layout \begin_layout Plain Layout Please note that lowering / dynamically adapting the resync rates may help in lowering the \emph on probability \emph default of occurrence of the above problems in practical scenarios where the bottleneck would recover to viable limits after some time. However, lowering the rates will also increase the \emph on duration \emph default of re-sync operations accordingly. The \emph on total amount of re-sync data \emph default simply does not decrease when lowering \family typewriter resync-rate \family default ; it even tends to increase over time when new requests arrive. Therefore, the \emph on expectancy value \emph default of problems caused by \emph on strong \emph default network bottlenecks (i.e. 
when not even the ordinary application rate fits through) is \emph on not \emph default improved by lowering or adapting \family typewriter resync-rate \family default , but rather the expectancy value mostly depends on the \emph on relation \emph default between the amount of holdback data versus the amount of application write data, both measured for the duration of some given strong bottleneck. \end_layout \end_inset as indicated by the upper dotted blue box. \end_layout \begin_layout Standard Of course, there is \emph on absolutely no chance \emph default to get the increased amount of data through our bottleneck, since not even the ordinary application load (lower dotted lines) could be transferred. \end_layout \begin_layout Standard Therefore, you run a \series bold very high risk \series default that the re-sync cannot finish before the next \family typewriter timeout \family default / \family typewriter ping-timeout \family default cycle will drop the network connection again. \end_layout \begin_layout Standard What will be the final result when that risk materializes? Simply, your secondary site will be in state \family typewriter inconsistent \family default . This means that you have lost your redundancy. In our scenario, there is no chance at all to become consistent again, because the network throughput keeps declining, slowly. It is simply \emph on hopeless \emph default , by construction. \end_layout \begin_layout Standard If you lose your primary site now, you are completely lost. \end_layout \begin_layout Standard Some people may argue that the probability of such a scenario is low. We do not agree with that argument, not only because it really happens in practice, where it may even take some days until problems are fixed. In case of \series bold rolling disasters \series default , the network is very likely to become flaky and/or overloaded shortly before the final damage. 
Even in other cases, you can easily end up with inconsistent secondaries. It occurs not only in the lab, but also in practice if you operate some hundreds or even thousands of DRBD instances. \end_layout \begin_layout Standard The point is that you can produce an ill behaviour \emph on systematically \emph default just by overloading the network a bit for some sufficient duration. \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset When coupling whole datacenters via some thousands of DRBD connections, any (short) network loss will almost certainly increase the re-sync network load each time the outage appears to be over. As a consequence, overload may be \emph on provoked \emph default by the re-sync repair attempts. This may easily lead to self-amplifying \series bold throughput storms \series default in some resonance frequency (similar to self-destruction of a bridge when an army is marching over it in lockstep). \end_layout \begin_layout Standard The only way for reliable prevention of loss of secondaries is to start any re-connect \emph on only \emph default in such situations where you can \emph on predict in advance \emph default that the re-sync is \emph on guaranteed \emph default to finish before any network bottleneck / loss will cause an automatic disconnect again. We don't know of any method which can reliably predict the future behaviour of a complex network. \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename images/MatieresToxiques.png lyxscale 50 scale 17 \end_inset Conclusion: in the presence of network bottlenecks, you run a considerable risk that your DRBD mirrors get destroyed just in that moment when you desperately need them. 
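\end_layout \begin_layout Standard The reasoning of this subsection can be condensed into a rough inequality. This is only a sketch under simplifying assumptions, and the symbols are introduced here for illustration: \begin_inset Formula $a$ \end_inset is the application write rate and \begin_inset Formula $n$ \end_inset the available network throughput (both taken as constant during the re-sync), \begin_inset Formula $D$ \end_inset is the amount of data marked dirty during the disconnect, and \begin_inset Formula $T$ \end_inset is the time until the next \family typewriter timeout \family default / \family typewriter ping-timeout \family default cycle would drop the connection again. Since DRBD has to transfer the ongoing application writes in addition to the re-sync data, the re-sync can finish in time only if \begin_inset Formula \[ n\cdot T\geq a\cdot T+D\quad\Longleftrightarrow\quad(n-a)\cdot T\geq D \] \end_inset Under a strong bottleneck we have \begin_inset Formula $n<a$ \end_inset , so the left-hand side is negative and the condition cannot be met for any \begin_inset Formula $T>0$ \end_inset , no matter how the timeouts are tuned. 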
\end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset Notice that crossover cables usually never show a behaviour like depicted by the red line. Crossover cables are \emph on passive components \emph default which normally \begin_inset Foot status open \begin_layout Plain Layout Exceptions might be mechanical jiggling of plugs, or electro-magnetical interferences. We never noticed any of them. \end_layout \end_inset either work, or not. The binary connect / disconnect behaviour of DRBD has no problems to cope with that. \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset or \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset Linbit recommends a \series bold workaround \series default for the inconsistencies during re-sync: LVM snapshots. We tried it, but found a \emph on performance penalty \emph default which made it prohibitive for our concrete application. A problem seems to be the cost of destroying snapshots. LVM uses by default a BOW strategy (Backup On Write, which is the counterpart of COW = Copy On Write). BOW increases IO latencies during ordinary operation. Retaining snapshots is cheap, but reverting them may be very costly, depending on workload. We didn't fully investigate that effect, and our experience is a few years old. You might come to a different conclusion for a different workload, for newer versions of system software, or for a different strategy if you carefully investigate the field. 
\end_layout \begin_layout Standard \noindent \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset DRBD problems usually arise \emph on only \emph default when the network throughput shows some \begin_inset Quotes eld \end_inset awkward \begin_inset Quotes erd \end_inset analog behaviour, such as overload, or as occasionally produced by various switches / routers / transmitters, or other potential sources of packet loss. \end_layout \begin_layout Subsection Behaviour of MARS \begin_inset CommandInset label LatexCommand label name "sub:Behaviour-of-MARS" \end_inset \end_layout \begin_layout Standard The behaviour of MARS in the above scenario: \end_layout \begin_layout Standard \noindent \align center \begin_inset Graphics filename images/network-bottleneck-mars.fig width 80col% \end_inset \end_layout \begin_layout Standard \noindent When the network is restrained, an asynchronous system like MARS will continue to serve the user IO requests (dotted green line) without any impact / incident while the actual network throughput (solid green line) follows the red line. In the meantime, all changes to the block device are recorded in the transaction logfiles. \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset Here is one point in favour of DRBD: MARS stores its transaction logs on the filesystem \family typewriter /mars/ \family default . When the network bottleneck lasts very long (some days or even some weeks), the filesystem will eventually run out of space. Section \begin_inset CommandInset ref LatexCommand ref reference "sec:Defending-Overflow" \end_inset discusses countermeasures against that in detail. In contrast to MARS, DRBD allocates its bitmap \emph on statically \emph default at resource creation time. It uses up less space, and you don't have to monitor it for (potential) overflows. 
The space for transaction logs is the price you have to pay if you want or need anytime consistency, or asynchronous replication in general. \end_layout \begin_layout Standard In order to really grasp the \emph on heart \emph default of the difference between synchronous and asynchronous replication, we look at the following modified scenario: \end_layout \begin_layout Standard \noindent \align center \begin_inset Graphics filename images/network-flaky-mars.fig width 80col% \end_inset \end_layout \begin_layout Standard \noindent This time, the network throughput (red line) is varying \begin_inset Foot status open \begin_layout Plain Layout In real life, many long-distance lines or even some heavily used metro lines usually show fluctuations of their network bandwidth by an order of magnitude, or even higher. We have measured them. The overall behaviour can be characterized as \begin_inset Quotes eld \end_inset \series bold chaotic \series default \begin_inset Quotes erd \end_inset . \end_layout \end_inset in some unpredictable way. As before, the application throughput served by MARS is assumed to be constant (dotted green line, often superseded by the solid green line). The actual replication network throughput is depicted by the solid green line. \end_layout \begin_layout Standard As you can see, a network dropdown undershooting the application demand has no impact on the application throughput, but only on the replication network throughput. Whenever the network throughput is held back due to the flaky network, it simply catches up as soon as possible by overshooting the application throughput. The amount of lag-behind is visualized as shaded area: downward shading (below the application throughput) means an increase of the lag-behind, while the upwards shaded areas (beyond the application throughput) indicate a decrease of the lag-behind (catch-up). 
Once the lag-behind has been fully caught up, the network throughput suddenly jumps back to the application throughput (here visible in two cases). \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset Note that the existence of lag-behind areas roughly corresponds to DRBD disconnect states, and in turn to DRBD inconsistent states of the secondary as long as the lag-behind has not been fully caught up. The very rough \begin_inset Foot status open \begin_layout Plain Layout Of course, this visualization is not exact. On one hand, the DRBD inconsistency phase may start later than depicted here, because it only starts \emph on after \emph default the first automatic disconnect, upon the first automatic re-connect. In addition, the amount of resync data may be smaller than the amount of corresponding MARS transaction logfile data, because the DRBD bitmap will coalesce multiple writes to the same block into one single transfer. On the other hand, DRBD will transfer no data at all during its disconnected state, while MARS continues to do its best. This leads to a prolongation of the DRBD inconsistent phase. Depending on properties of the workload and of the network, the real duration of the inconsistency phase may be either shorter or longer. \end_layout \end_inset duration of the corresponding DRBD inconsistency phase is visualized as magenta line at the time scale. \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset MARS utilizes the existing network bandwidth as best as possible in order to pipe through as much data as possible, provided that there exists some data requiring expedition. Conceptually, there exists no better way due to information theoretic limits (besides data compression). 
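\end_layout \begin_layout Standard The catch-up behaviour described above can be sketched as a simple fluid model. The symbols are assumptions introduced here for illustration only: \begin_inset Formula $a(t)$ \end_inset is the application write rate (dotted green line), \begin_inset Formula $n(t)$ \end_inset the available network throughput (red line), \begin_inset Formula $r(t)$ \end_inset the actual replication throughput (solid green line), and \begin_inset Formula $B(t)\geq0$ \end_inset the lag-behind, i.e. the amount of logged data not yet transferred to the secondary: \begin_inset Formula \[ r(t)=\begin{cases} n(t) & \text{if }B(t)>0\\ \min\bigl(a(t),n(t)\bigr) & \text{if }B(t)=0 \end{cases} \] \end_inset \begin_inset Formula \[ \frac{dB}{dt}=a(t)-r(t) \] \end_inset In this idealization (which ignores protocol overhead and data compression), the lag-behind grows while \begin_inset Formula $n(t)<a(t)$ \end_inset , shrinks while \begin_inset Formula $n(t)>a(t)$ \end_inset and \begin_inset Formula $B(t)>0$ \end_inset , and the replication throughput falls back to \begin_inset Formula $a(t)$ \end_inset as soon as \begin_inset Formula $B(t)$ \end_inset reaches zero. 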
\end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset In case of lag-behind, the version of the data replicated to the secondary site corresponds to some time in the past. Since the data is always transferred in the same order as originally submitted at the primary site, the secondary never gets inconsistent. Your mirror always remains usable. Your only potential problem could be the outdated state, corresponding to some state in the past. However, the \begin_inset Quotes eld \end_inset as-best-as-possible \begin_inset Quotes erd \end_inset approach to the network transfer ensures that your version is always \emph on as up-to-date as possible \emph default even under ill-behaving network bottlenecks. \series bold There is simply no better way to do it. \series default In presence of network bottlenecks, there exists no better method than prescribed by the information theoretic limit (red line, neglecting data compression). \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset MARS' property of never sacrificing local data consistency (at the possible cost of actuality) is called \series bold Anytime Consistency \series default . \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset Conclusion: you can even use \series bold traffic shaping \series default on MARS' TCP connections in order to globally balance your network throughput (of course at the cost of actuality, but without sacrificing local data consistency). If you would try to do the same with DRBD, you could easily provoke a disaster. 
MARS simply tolerates any network problems, provided that there is enough disk space for transaction logfiles. Even in case of completely filling up your disk with transaction logfiles after some days or weeks, you will not lose local consistency anywhere (see section \begin_inset CommandInset ref LatexCommand ref reference "sec:Defending-Overflow" \end_inset ). \end_layout \begin_layout Standard Finally, here is yet another scenario where MARS can cope with the situation: \end_layout \begin_layout Standard \noindent \align center \begin_inset Graphics filename images/network-constant-mars.fig width 80col% \end_inset \end_layout \begin_layout Standard \noindent This time, the network throughput limit (solid red line) is assumed to be constant. However, the application workload (dotted green line) shows some heavy peaks. We know from our 1&1 datacenters that such an application behaviour is very common. \end_layout \begin_layout Standard When the peaks are exceeding the network capabilities for some time, the replication network throughput (solid green line) will be limited for a short time, stay a little bit longer at the limit, and finally drop down again to the normal workload. In other words, you get a flexible buffering behaviour, coping with the peaks. \end_layout \begin_layout Standard Similar scenarios (where both the application workload has peaks and the network is flaky to some degree) are rather common. If you would use DRBD there, you were likely to run into regular application performance problems and/or frequent automatic disconnect cycles, depending on the height and on the duration of the peaks, and on network resources. \end_layout \begin_layout Section Long Distances / High Latencies \end_layout \begin_layout Standard In general and in some theories, latencies are conceptually independent from throughput, at least to some degree. 
All four possible combinations exist: \end_layout \begin_layout Enumerate There exist lines with high latencies but also high throughput. Examples are raw fibre cables on the floor of the Atlantic. \end_layout \begin_layout Enumerate High latencies on low-throughput lines are very easy to achieve. If you never saw it, you never ran interactive \family typewriter vi \family default over \family typewriter ssh \family default in parallel to downloads on your old-fashioned modem line. \end_layout \begin_layout Enumerate Low latencies need not be incompatible with high throughput. See Myrinet, InfiniBand or high-speed point-to-point interconnects, such as modern memory buses. \end_layout \begin_layout Enumerate Low latency combined with low throughput is also possible: in an ATM system (or another pre-reservation system for bandwidth), just increase the multiplex factor on low-capacity but short lines, which is only possible at the cost of assigned bandwidth. \end_layout \begin_layout Standard In \emph on internet \emph default practice, however, it is very likely that high latencies will also lead to worse throughput, because of the \emph on congestion control algorithms \emph default running all over the world. \end_layout \begin_layout Standard We have experimented with extremely large TCP send/receive buffers plus various window sizes and congestion control algorithms over long-distance lines between the USA and Europe. Yes, it is possible to improve the behaviour to some degree. But magic does not happen. Natural laws will always hold. You simply cannot travel faster than the speed of light. \end_layout \begin_layout Standard Our experience leads to the following rule of thumb, not formally proven by anything, but just observed in practice: \end_layout \begin_layout Quotation In general, synchronous data replication (not limited to applications of DRBD) works reliably only over distances \begin_inset Formula $<50$ \end_inset km. 
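\end_layout \begin_layout Standard To give a feeling for the physical lower bound behind this rule of thumb, here is a back-of-the-envelope estimate. It assumes a signal propagation speed \begin_inset Formula $v$ \end_inset in optical fibre of roughly \begin_inset Formula $2\cdot10^{5}$ \end_inset km/s (about two thirds of the vacuum speed of light; the exact value depends on the fibre) and ignores all switching, queueing and protocol delays, so real round-trip times are higher. For a distance of \begin_inset Formula $d=50$ \end_inset km: \begin_inset Formula \[ t_{\mathrm{rtt}}\geq\frac{2d}{v}=\frac{2\cdot50\,\mathrm{km}}{2\cdot10^{5}\,\mathrm{km/s}}=0.5\,\mathrm{ms} \] \end_inset With fully synchronous replication, a write is acknowledged only after the remote side has confirmed it, so it has to wait for at least one such round trip, and this lower bound grows linearly with the distance. 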
\end_layout \begin_layout Standard There may be some exceptions, at least when dealing with low-end workstation loads. But when you are responsible for a whole datacenter and/or some centralized storage units, don't waste your time by trying (almost) impossible things. We recommend to use MARS in such use cases. \end_layout \begin_layout Section Higher Consistency Guarantees vs Actuality \end_layout \begin_layout Standard We already saw in section \begin_inset CommandInset ref LatexCommand ref reference "sec:Network-Bottlenecks" \end_inset that certain types of network bottlenecks can easily (and reproducibly) destroy the consistency of your DRBD secondary, while MARS will preserve local consistency at the cost of actuality ( \series bold anytime consistency \series default ). \end_layout \begin_layout Standard Some people, often located at database operations, are obtrusively arguing that actuality is such a high good that it must not be sacrificed under any circumstances. \end_layout \begin_layout Standard Anyone arguing this way has at least the following choices (list may be incomplete): \end_layout \begin_layout Enumerate None of the above use cases for MARS apply. For instance, short distance replication over crossover cables is sufficient (which occurs very often), or the network is reliable enough such that bottlenecks can never occur (e.g. because the total load is extremely low, or conversely the network is extremely overengineered / expensive), or the occurrence of bottlenecks can \emph on provably \emph default be taken into account. In such cases, DRBD is clearly the better solution than MARS, because it provides better actuality than the current version of MARS, and it uses up less disk resources. \end_layout \begin_layout Enumerate In the presence of network bottlenecks, people didn't notice and/or didn't understand and/or did under-estimate the risk of accidental invalidation of their DRBD secondaries. They should carefully check that risk. 
They should convince themselves that the risk is \emph on really \emph default bearable. Once they are hit by a systematic chain of events which \emph on reproducibly \emph default provokes the bad effect, it is too late \begin_inset Foot status open \begin_layout Plain Layout Some people seem to need a bad experience before they understand the difference between risk caused by reproducible effects and inverted luck. \end_layout \end_inset . \end_layout \begin_layout Enumerate In the presence of network bottlenecks, people found a solution such that DRBD does not automatically re-connect after the connection has been dropped due to network problems (cf. \family typewriter ko-count \family default parameter). So the risk of inconsistency \emph on appears \emph default to have vanished. In some cases, people did not notice that the risk has \emph on not completely \begin_inset Foot status open \begin_layout Plain Layout Hint: what's the \emph on conceptual \emph default difference between an automatic and a manual re-connect? Yes, you can try to \emph on lower \emph default the risk in some cases by transferring risks to human analysis and human decisions, but did you take into account the possibility of human errors? \end_layout \end_inset \emph default vanished, and/or they did not notice that now the actuality produced by DRBD is even drastically worse than that of MARS (in the same situation).
It is true that DRBD provides better actuality in \family typewriter connected \family default state, but for a full picture the actuality in \family typewriter disconnected \family default state should not be neglected \begin_inset Foot status open \begin_layout Plain Layout Hint: a potential hurdle may be the fact that the current format of \family typewriter /proc/drbd \family default displays neither the timestamp of the first \emph on relevant \emph default network drop nor the total amount of lag-behind user data (which is \emph on not \emph default the same as the number of dirty bits in the bitmap), while \family typewriter marsadm view \family default can display it. So it is difficult to judge the risks. Inspecting DRBD messages in the syslog may help, but quantification could remain hard. \end_layout \end_inset . So they didn't notice that their argumentation on the importance of actuality may be fundamentally wrong. A possible way to overcome that may be re-reading section \begin_inset CommandInset ref LatexCommand ref reference "sub:Behaviour-of-MARS" \end_inset and comparing its outcome with the corresponding outcome of DRBD in the same situation. \end_layout \begin_layout Enumerate People are stuck in contradictory requirements because the current version of MARS Light does not yet support synchronous or pseudo-synchronous operation modes. This should be resolved some day. \end_layout \begin_layout Standard \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset A common misunderstanding is about the actuality guarantees provided by filesystems. The buffer cache / page cache by default uses a \series bold writeback strategy \series default for performance reasons. Even modern journalling filesystems will (by default) provide only consistency guarantees, but no strong actuality guarantee.
In case of power loss, some transactions may be even \emph on rolled back \emph default in order to restore consistency. According to POSIX \begin_inset Foot status open \begin_layout Plain Layout The above argumentation also applies to Windows filesystems in analogous way. \end_layout \end_inset and other standards, the only \emph on reliable \emph default way to achieve actuality is usage of system calls like \family typewriter sync() \family default , \family typewriter fsync() \family default , \family typewriter fdatasync() \family default , flags like \family typewriter O_DIRECT \family default , or similar. For performance reasons, the \emph on vast majority of applications \emph default don't use them at all, or use them only sparingly! \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset It makes no sense to require strong actuality guarantees from any block layer replication (whether DRBD or future versions of MARS) while higher layers such as filesystems or even applications are already sacrificing them! \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset In summary, the \series bold anytime consistency \series default provided by MARS is an argument you should consider, even if you need an extra hard disk for transaction logfiles. \end_layout \begin_layout Chapter Quick Start Guide \end_layout \begin_layout Standard This chapter is for impatient but experienced sysadmins who already know DRBD. For more complete information, refer to chapter \begin_inset CommandInset ref LatexCommand nameref reference "chap:The-Sysadmin-Interface" \end_inset . 
\end_layout \begin_layout Section Preparation: What you Need \begin_inset CommandInset label LatexCommand label name "sec:Preparation:-What-you" \end_inset \end_layout \begin_layout Standard Typically, you will use MARS Light on servers in a datacenter for replication of big masses of data. \end_layout \begin_layout Standard Typically, you will use MARS Light for replication \emph on between \emph default multiple datacenters, when the distances are greater than \begin_inset Formula $\approx50$ \end_inset km. Many other solutions, even from commercial storage vendors, will not work reliably over large distances when your network is not \emph on extremely \emph default reliable, or when you try to push huge masses of data from high-performance applications through a network bottleneck. If you ever encountered such problems (or want to avoid them in advance), MARS is for you. \end_layout \begin_layout Standard You can use MARS Light both at dedicated storage servers (e.g. for serving Windows clients), or at standalone Linux servers where CPU and storage are not separated. \end_layout \begin_layout Standard In order to protect your data from low-level disk failures, you should use a hardware RAID controller with BBU. Software RAID is explicitly \emph on not \emph default recommended, because it generally provides worse performance due to the lack of a hardware BBU (for some benchmark comparisons with and without BBU, see \begin_inset Flex URL status collapsed \begin_layout Plain Layout https://github.com/schoebel/blkreplay/raw/master/doc/blkreplay.pdf \end_layout \end_inset ). \end_layout \begin_layout Standard Typically, you will need more than one RAID set \begin_inset Foot status open \begin_layout Plain Layout For low-cost storage, RAID-5 is no longer regarded as safe for today's typical storage sizes, because the error rate is regarded as too high. Therefore, use RAID-6.
If you need more than 15 disks in total, create multiple RAID sets (each having at most 15 disks, better about 12 disks) and stripe them via LVM (or via your hardware RAID controller if it supports RAID-60). \end_layout \end_inset for big masses of data. Therefore, use of LVM is also recommended \begin_inset Foot status open \begin_layout Plain Layout You may also combine MARS with commercial storage boxes connected via Fibre Channel or iSCSI, but we do not yet have operational experience with such setups at 1&1. \end_layout \end_inset for your data. \end_layout \begin_layout Standard MARS' tolerance of networking problems comes with some cost. You will need some extra space for the transaction logfiles of MARS, residing at the \family typewriter /mars/ \family default filesystem. \end_layout \begin_layout Standard The exact space requirements for \family typewriter /mars/ \family default depend on the \emph on average write rate \emph default of your application, not on the size of your data. We found that only a few applications write more than 1 TB per day. Most write even less than 100 GB per day. Usually, you want to dimension \family typewriter /mars/ \family default such that you can survive a network loss lasting 3 days / about one weekend. This can be achieved with current technology rather easily: as a simple rule of thumb, just use one \series bold dedicated disk \series default having a capacity of 4 TB or more. Typically, that will provide you with plenty of headroom even for bigger networking incidents. \end_layout \begin_layout Standard Dedicated disks for \family typewriter /mars/ \family default have another advantage: their mechanical head movement is completely independent from your data head movements.
For best performance, attach that dedicated disk to your hardware RAID controller with BBU, building a separate RAID set (even if it consists only of a single disk -- notice that the \series bold hardware BBU \series default is the crucial point). \end_layout \begin_layout Standard If you are concerned about reliability, use two disks switched together as a relatively small RAID-1 set. For extremely high performance demands, you may consider (and check) RAID-10. \end_layout \begin_layout Standard Since the transaction logfiles are highly sequential in their access pattern, a cheap but high-capacity SATA disk (or nearline-SAS disk) is usually sufficient. At the time of this writing, standard SATA SSDs have proven to be \emph on not \emph default (yet) preferable. Although they offer a high random IOPS rate, their sequential throughput is worse, and their long-term stability is questioned by many people at the time of this writing. However, as technology evolves and becomes more mature, this could change in the future. \end_layout \begin_layout Standard Use \family typewriter ext4 \family default for \family typewriter /mars/ \family default . Avoid \family typewriter ext3 \family default , and don't use \family typewriter xfs \family default \begin_inset Foot status open \begin_layout Plain Layout It seems that the late internal resource allocation strategy of \family typewriter xfs \family default (or some other, currently unknown cause) could be the reason for some resource deadlocks which appear only with \family typewriter xfs \family default and only under \emph on extremely \emph default high IO load in combination with high memory pressure. \end_layout \end_inset at all.
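\end_layout

\begin_layout Standard
As a rough worked example of the sizing rule of thumb above, assume that the volume of the transaction logfiles is about equal to the amount of data written (an assumption made only for this illustration; header and metadata overhead is ignored). The symbols are introduced here for illustration only: 
\begin_inset Formula $C$
\end_inset

 is the required capacity of 
\family typewriter
/mars/
\family default
, 
\begin_inset Formula $R$
\end_inset

 the average write rate, and 
\begin_inset Formula $T$
\end_inset

 the duration of the network outage to be survived. For an application writing 1 TB per day and an outage of 3 days, this gives
\begin_inset Formula 
\[
C\geq R\cdot T=1\,\mathrm{TB/day}\cdot3\,\mathrm{days}=3\,\mathrm{TB}<4\,\mathrm{TB}
\]

\end_inset

so a single dedicated 4 TB disk would still leave some headroom under these assumptions. An application writing 100 GB per day would need only about 0.3 TB for the same outage.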
\end_layout \begin_layout Section Setup Primary and Secondary Cluster Nodes \end_layout \begin_layout Standard If you already use DRBD, you may migrate to MARS (or even back from MARS to DRBD) if you use \emph on external \begin_inset Foot status open \begin_layout Plain Layout \emph on Internal \emph default DRBD metadata should also work as long as the filesystem inside your block device / disk already exists and is not re-created. The latter would destroy the DRBD metadata, but even that will not really hurt you: you can always switch back to DRBD using \emph on external \emph default metadata, as long as you have some small spare space somewhere. \end_layout \end_inset \emph default DRBD metadata (which is not touched by MARS). \end_layout \begin_layout Subsection Kernel and MARS Module \end_layout \begin_layout Standard At the time of this writing, a small pre-patch for the Linux kernel is needed. It is trivial and consists mostly of \family typewriter EXPORT_SYMBOL() \family default statements. The pre-patch must be applied to the kernel source tree before building your (custom) kernel. Hopefully, the patch will be integrated upstream some day. \end_layout \begin_layout Standard The MARS kernel module can be built in two different ways: \end_layout \begin_layout Enumerate in place in the kernel source tree: \family typewriter cd block/ && git clone git://github.com/schoebel/mars \end_layout \begin_layout Enumerate as a separate kernel module, only for experienced \begin_inset Foot status open \begin_layout Plain Layout You should be familiar with the problems arising from orthogonal combination of different kernel versions with different MARS module versions and with different \family typewriter marsadm \family default userspace tool versions at the package management level. Hint: \family typewriter modinfo \family default is your friend.
\end_layout \end_inset sysadmins: see file \family typewriter Makefile.dist \family default (tested with Debian; may need some extra work with other distros). \end_layout \begin_layout Standard Further / more accurate / latest instructions can be found in \family typewriter README \family default and in \family typewriter INSTALL \family default . You must not only install the kernel and the \family typewriter mars.ko \family default kernel module to all of your cluster nodes, but also the \family typewriter marsadm \family default userspace tool. \end_layout \begin_layout Subsection Setup your Cluster Nodes \end_layout \begin_layout Standard For your cluster, you need at least two nodes. In the following, they will be called A and B. In the beginning, A will have the \family typewriter primary \family default role, while B will be your initial \family typewriter secondary \family default . The roles may change later. \end_layout \begin_layout Enumerate You must be \family typewriter root \family default . \end_layout \begin_layout Enumerate On each of A and B, create the \family typewriter /mars/ \family default mountpoint. \end_layout \begin_layout Enumerate On each node, create an \family typewriter ext4 \family default filesystem on your separate disk / RAID set (see description in section \begin_inset CommandInset ref LatexCommand nameref reference "sec:Preparation:-What-you" \end_inset ). \end_layout \begin_layout Enumerate On each node, mount that filesystem to \family typewriter /mars/ \family default . It is advisable to add an entry to \family typewriter /etc/fstab \family default . \end_layout \begin_layout Enumerate On node A, say \family typewriter marsadm create-cluster \family default . \begin_inset Newline newline \end_inset This must be done \emph on exactly once \emph default , on exactly one node of your cluster. 
Never do this twice or on different nodes, because that would create two different clusters which would have nothing to do with each other. The \family typewriter marsadm \family default tool protects you against accidentally joining / merging two different clusters. If you accidentally created two different clusters, just umount that \family typewriter /mars/ \family default partition and start over with step 3 at that node. \end_layout \begin_layout Enumerate On node B, you must have a working \family typewriter ssh \family default connection to node A. Test it by saying \family typewriter ssh A w \family default on node B. It should work without entering a password (otherwise, use \family typewriter ssh-agent \family default to achieve that). In addition, \family typewriter rsync \family default must be installed. \end_layout \begin_layout Enumerate On node B, say \family typewriter marsadm join-cluster A \end_layout \begin_layout Enumerate Only \emph on after \begin_inset Foot status open \begin_layout Plain Layout In fact, you may already \family typewriter modprobe mars \family default at node A after the \family typewriter marsadm create-cluster \family default . Just don't do any of the \family typewriter *-cluster \family default operations when the kernel module is loaded. All other operations should have no such restriction. \end_layout \end_inset \emph default that, do \family typewriter modprobe mars \family default on each node. 
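\end_layout

\begin_layout Standard
The steps above might be condensed into the following shell session sketch. The device name 
\family typewriter
/dev/sdX1
\family default
 for the dedicated 
\family typewriter
/mars/
\family default
 disk is a placeholder (an assumption, adapt it to your hardware); the 
\family typewriter
marsadm
\family default
 and 
\family typewriter
modprobe
\family default
 invocations are the ones named in the list. Everything must be run as 
\family typewriter
root
\family default
.
\end_layout

\begin_layout Standard
\begin_inset listings
inline false
status open

\begin_layout Plain Layout

# on both A and B; /dev/sdX1 is a placeholder device name
\end_layout

\begin_layout Plain Layout

mkdir -p /mars
\end_layout

\begin_layout Plain Layout

mkfs.ext4 /dev/sdX1
\end_layout

\begin_layout Plain Layout

mount /dev/sdX1 /mars
\end_layout

\begin_layout Plain Layout

# on A only, exactly once per cluster
\end_layout

\begin_layout Plain Layout

marsadm create-cluster
\end_layout

\begin_layout Plain Layout

# on B only: first check password-less ssh to A, then join
\end_layout

\begin_layout Plain Layout

ssh A w
\end_layout

\begin_layout Plain Layout

marsadm join-cluster A
\end_layout

\begin_layout Plain Layout

# finally, on both A and B
\end_layout

\begin_layout Plain Layout

modprobe mars
\end_layout

\end_inset

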
\end_layout \begin_layout Section Creating and Maintaining Resources \begin_inset CommandInset label LatexCommand label name "sec:Creating-and-Maintaining" \end_inset \end_layout \begin_layout Standard In the following example session, a block device \family typewriter /dev/lv-x/mydata \family default (called \emph on disk \emph default for short) must already exist on both nodes A and B, respectively, having the same \begin_inset Foot status open \begin_layout Plain Layout Actually, the disk at the initially secondary side may be larger than that at the initially primary side. This will waste space and is therefore not recommended. \end_layout \end_inset size. For the sake of simplicity, the disk (underlying block device) as well as its later logical resource name as well as its later virtual device name will all be named uniformly by the same suffix \family typewriter mydata \family default . In general, you might name each of them differently, but that is not recommended since it may easily lead to confusion in larger installations. \end_layout \begin_layout Standard You may already have some data inside your disk \family typewriter /dev/lv-x/mydata \family default at the initially primary side A. Before using it for MARS, it must be unused for any other purpose (such as being mounted, or used by DRBD, etc). MARS will require \series bold exclusive access \series default to it. \end_layout \begin_layout Enumerate On node A, say \family typewriter marsadm create-resource mydata /dev/lv-x/mydata \family default . \begin_inset Newline newline \end_inset As a result, a directory \family typewriter /mars/resource-mydata/ \family default will be created on node A, containing some symlinks. Node A will automatically start in the primary role for this resource. Therefore, a new pseudo-device \family typewriter /dev/mars/mydata \family default will also appear after a few seconds.
\begin_inset Newline newline \end_inset Note that the initial contents of \family typewriter /dev/mars/mydata \family default will be exactly the same as in your pre-existing disk \family typewriter /dev/lv-x/mydata \family default . \begin_inset Newline newline \end_inset If you like, you may already use \family typewriter /dev/mars/mydata \family default for mounting your already pre-existing data, or for creating a fresh filesystem, or for exporting via iSCSI, and so on. You may even do so before any other cluster node has joined the resource (so-called \begin_inset Quotes eld \end_inset standalone mode \begin_inset Quotes erd \end_inset ). But you can also do so later after setup of (one or many) secondaries. \end_layout \begin_layout Enumerate Wait a few seconds until the directory \family typewriter /mars/resource-mydata/ \family default and its symlink contents also appear on cluster node B. \end_layout \begin_layout Enumerate On node B, say \family typewriter marsadm join-resource mydata /dev/lv-x/mydata \family default . \begin_inset Newline newline \end_inset As a result, the initial full-sync from node A to node B should start automatically. \begin_inset Newline newline \end_inset \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset Of course, the old contents of your disk \family typewriter /dev/lv-x/mydata \family default at side B (and \emph on only \emph default there!) are overwritten by the version from side A. Since you are an experienced sysadmin, you knew that, and it was just the effect you deliberately wanted to achieve. If you didn't check that your old contents didn't contain any valuable data (or if you accidentally provided a wrong disk device argument), it is too late now. The \family typewriter marsadm \family default command checks that the disk device argument is really a block device, and that exclusive access to it is possible (as well as some further safety checks, e.g. matching sizes).
However, MARS cannot know the \emph on purpose \emph default of your generic block device. MARS (as well as DRBD) is completely ignorant of the \emph on contents \emph default of a generic block device; it does not interpret it in any way. Therefore, you may use MARS (as well as DRBD) for mirroring Windows filesystems, or raw devices from databases, or whatever. \begin_inset Newline newline \end_inset \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset Hint: by default, MARS uses the so-called \begin_inset Quotes eld \end_inset fast fullsync \begin_inset Quotes erd \end_inset algorithm. It works similarly to \family typewriter rsync \family default , first reading the data on both sides and computing an md5 checksum for each block. Heavy-weight data is only transferred over the long-distance network upon checksum mismatch. This is extremely fast if your data is already (almost) identical on both sides. Conversely, if you know in advance that your initial data is completely different on both sides, you may choose to switch off the fast fullsync algorithm via \family typewriter echo 0 > /proc/sys/mars/do_fast_fullsync \family default in order to save the additional IO overhead and network latencies introduced by the separate checksum comparison steps. \end_layout \begin_layout Enumerate Optionally: if you create a \emph on new \emph default filesystem on \family typewriter /dev/mars/mydata \family default \emph on after(!) \emph default having created the MARS resource, you may skip the fast fullsync phase altogether, because the old content of \family typewriter /dev/mars/mydata \family default is just garbage not used by the freshly created filesystem. Just say \family typewriter marsadm fake-sync mydata \family default in order to abort the sync operation.
\begin_inset Newline newline \end_inset \begin_inset Graphics filename images/MatieresToxiques.png lyxscale 50 scale 17 \end_inset Never do a \family typewriter fake-sync \family default unless you are \series bold absolutely sure \series default that you really don't need the data! Otherwise, you are almost \emph on guaranteed \emph default to have produced harmful inconsistencies. If you accidentally issued \family typewriter fake-sync \family default , you may start over the full sync at your secondary side at any time by saying \family typewriter marsadm invalidate mydata \family default (analogously to the corresponding DRBD command). \end_layout \begin_layout Section Keeping Resources Operational \end_layout \begin_layout Subsection Logfile Rotation / Deletion \begin_inset CommandInset label LatexCommand label name "sub:Logfile-Rotation" \end_inset \end_layout \begin_layout Standard As explained in section \begin_inset CommandInset ref LatexCommand nameref reference "sec:The-Transaction-Logger" \end_inset , all changes to your resource data are recorded in transaction logfiles residing on the \family typewriter /mars/ \family default filesystem. These files are always growing over time. In order to avoid filesystem overflow, the following must be done in regular time intervals: \end_layout \begin_layout Enumerate \family typewriter marsadm log-rotate all \family default \begin_inset Newline newline \end_inset This starts appending to a new logfile on all of your resources. The logfiles are automatically numbered by an increasing 9-digit logfile number. This will suffice for many centuries even if you log-rotate once a minute. Practical frequencies for logfile rotation are more like once an hour \begin_inset Foot status open \begin_layout Plain Layout Under \emph on extremely \emph default high load conditions, you might want to log-rotate several times an hour, in order to keep the size of each logfile under some practical limit.
At 1&1 datacenters, we have not yet encountered conditions where that was really \emph on necessary \emph default . \end_layout \end_inset , or once a day (depending on your load). \end_layout \begin_layout Enumerate \family typewriter marsadm log-delete-all all \family default \begin_inset Newline newline \end_inset This determines all logfiles from all resources which are no longer needed (i.e. which are \emph on fully \emph default applied, on \emph on all \emph default relevant secondaries). All superfluous logfiles are then deleted, including all copies on all secondaries. \begin_inset Newline newline \end_inset \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset The current version of MARS deletes either \emph on all \emph default replicas of a logfile everywhere, or \emph on none \emph default of the replicas. This is a simple rule, but has the drawback that one node may hinder other nodes from freeing space in \family typewriter /mars/ \family default . In particular, the command \family typewriter marsadm pause-replay $res \family default (as well as \family typewriter marsadm disconnect $res \family default ) will freeze the space reclamation in the whole cluster when the pause is lasting very long. \begin_inset Newline newline \end_inset \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset Best practice is to do both \family typewriter log-rotate \family default and \family typewriter log-delete-all \family default in a \family typewriter cron \family default job. In addition, you should establish some regular monitoring of the free space present in the \family typewriter /mars/ \family default filesystem. 
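\end_layout

\begin_layout Standard
Such a cron job could look like the following fragment. This is only a sketch: the file name 
\family typewriter
/etc/cron.d/mars
\family default
, the hourly schedule and the 
\family typewriter
PATH
\family default
 setting are assumptions for illustration (adjust the path to wherever 
\family typewriter
marsadm
\family default
 is installed on your nodes), while the two 
\family typewriter
marsadm
\family default
 commands are the ones described above.
\end_layout

\begin_layout Standard
\begin_inset listings
inline false
status open

\begin_layout Plain Layout

# /etc/cron.d/mars (hypothetical file name)
\end_layout

\begin_layout Plain Layout

# assumed search path for marsadm, adapt as needed
\end_layout

\begin_layout Plain Layout

PATH=/usr/sbin:/usr/bin:/sbin:/bin
\end_layout

\begin_layout Plain Layout

# once an hour: start new logfiles, then delete fully applied ones
\end_layout

\begin_layout Plain Layout

0 * * * * root marsadm log-rotate all; marsadm log-delete-all all
\end_layout

\end_inset

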
\end_layout \begin_layout Standard More detailed information about avoidance of \family typewriter /mars/ \family default overflow is in section \begin_inset CommandInset ref LatexCommand ref reference "sec:Defending-Overflow" \end_inset . \end_layout \begin_layout Subsection Switch Primary / Secondary Roles \end_layout \begin_layout Standard In contrast to DRBD, MARS Light distinguishes between \emph on intended \emph default and \emph on emergency \emph default switching. This distinction is necessary due to subtle differences in the communication architecture (asynchronous communication vs synchronous communication, see sections \begin_inset CommandInset ref LatexCommand ref reference "sec:The-Lamport-Clock" \end_inset and \begin_inset CommandInset ref LatexCommand ref reference "sec:The-Symlink-Tree" \end_inset ). \end_layout \begin_layout Subsubsection Intended Switching \begin_inset CommandInset label LatexCommand label name "sub:Intended-Switching" \end_inset \end_layout \begin_layout Standard Switching the roles is very similar to DRBD: just issue the command \end_layout \begin_layout Itemize \family typewriter marsadm primary mydata \end_layout \begin_layout Standard on your formerly secondary node. Precondition is that you are in connected state, and that the old primary does not use its \family typewriter /dev/mars/mydata \family default device any longer. If the preconditions are violated, \family typewriter marsadm primary \family default refuses to run. \end_layout \begin_layout Standard The preconditions try to protect you from doing silly things, such as accidentally provoking a split brain error state. We try to avoid split brain as best as we can. Therefore, we distinguish between \emph on intended \emph default and \emph on emergency \emph default switching. Intended switching will try to avoid split brain \emph on as best as it can \emph default . 
\end_layout \begin_layout Standard \noindent \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset Don't \emph on rely \emph default on split brain avoidance, in particular when scripting any higher-level applications such as cluster managers. \family typewriter marsadm \family default does its best, but at least in case of (unnoticed) network outages / partitions (or even \emph on very \emph default slow / overloaded networks), an attempt to become up-to-date is likely to fail. If you want to \emph on ensure \emph default that no split brain can result from intended primary switching, please give the \family typewriter primary \family default command only after your secondary is \emph on known \emph default to be up-to-date. \end_layout \begin_layout Standard Notice that the usage check for \family typewriter /dev/mars/mydata \family default is based on the \emph on open count \emph default transferred from another cluster node. Since MARS is operating asynchronously (in contrast to DRBD), it may take some time until our node knows that the device is no longer used at another node. This can lead to a race condition if you automate an intended takeover with a script like \family typewriter ssh A \begin_inset Quotes eld \end_inset umount /dev/mars/mydata \begin_inset Quotes erd \end_inset ; ssh B \begin_inset Quotes eld \end_inset marsadm primary mydata \begin_inset Quotes erd \end_inset \family default because your second ssh command may be faster than the internal MARS symlink tree propagation (cf section \begin_inset CommandInset ref LatexCommand ref reference "sec:The-Symlink-Tree" \end_inset ). In order to prevent such races, you should use the command \end_layout \begin_layout Itemize \family typewriter marsadm wait-umount mydata \end_layout \begin_layout Standard on node B before trying to become primary. 
The script should look like \family typewriter ssh A \begin_inset Quotes eld \end_inset umount /dev/mars/mydata \begin_inset Quotes erd \end_inset ; ssh B \begin_inset Quotes eld \end_inset marsadm wait-umount mydata && marsadm primary mydata \begin_inset Quotes erd \end_inset \family default . \end_layout \begin_layout Subsubsection Emergency Switching \begin_inset CommandInset label LatexCommand label name "sub:Emergency-Switching" \end_inset \end_layout \begin_layout Standard In case the connection to the old primary is lost for whatever reason, we just don't know anything about its \emph on current \emph default state (which may deviate from its \emph on last known \emph default state). The following variant will skip many checks and tell your node to become primary forcefully: \end_layout \begin_layout Itemize \family typewriter marsadm disconnect mydata \end_layout \begin_layout Itemize \family typewriter marsadm primary mydata --force \end_layout \begin_layout Itemize \family typewriter marsadm connect mydata \end_layout \begin_layout Standard The \family typewriter disconnect \family default is a precondition, analogous to DRBD. It tries to prevent you from accidental creation of a split brain error state. \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename images/MatieresToxiques.png lyxscale 50 scale 17 \end_inset \series bold Split brain \series default is always an \series bold erroneous state \series default which should never be entered deliberately! Once you have entered it accidentally, you \series bold must \series default resolve it ASAP (see section \begin_inset CommandInset ref LatexCommand ref reference "sub:Split-Brain-Resolution" \end_inset ), otherwise you cannot operate your resource any longer.
\end_layout \begin_layout Standard While \family typewriter marsadm primary \family default without \family typewriter --force \family default tries to prevent split brain as best as it can (even in \family typewriter disconnected \family default mode, which is a major difference from DRBD's behaviour), any use of the \family typewriter --force \family default option will almost \emph on certainly \emph default provoke a split brain if the old primary continues to operate on its local \family typewriter /dev/mars/mydata \family default device. Therefore, you are \series bold strongly advised \series default to do this \series bold only \series default after \end_layout \begin_layout Enumerate \family typewriter marsadm primary \family default without \family typewriter --force \family default has failed \emph on for no good reason \emph default \begin_inset Foot status open \begin_layout Plain Layout Most reasons will be displayed by \family typewriter marsadm \family default when it refuses to execute the switchover. \end_layout \end_inset , and \end_layout \begin_layout Enumerate You are sure you \emph on really \emph default want to switch, even when that eventually leads to a split brain. You also declare that you are willing to do \emph on manual \emph default split-brain resolution as described in section \begin_inset CommandInset ref LatexCommand ref reference "sub:Split-Brain-Resolution" \end_inset . \end_layout \begin_layout Standard \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset Notice: in case of \emph on connection loss \emph default (e.g. networking problems / network partitions) you might not be able to reliably detect whether a split brain will actually result, or not. \end_layout \begin_layout Standard In contrast to DRBD, split brain situations are handled differently by MARS Light.
When two primaries are accidentally active at the same time, each of them writes into a different logfile, \family typewriter /mars/resource-mydata/log-000000001-A \family default and \family typewriter /mars/resource-mydata/log-000000001-B \family default respectively, where the \emph on origin \emph default host is always recorded in the filename. Therefore, both nodes \emph on can theoretically \emph default run in primary mode independently from each other, at least for some time. They might even \family typewriter log-rotate \family default independently from each other. However, the replication will certainly get stuck, and your \family typewriter /mars/ \family default filesystem will eventually run out of space. Any other secondary node will certainly get into serious problems: it simply does not know which split-brain version it should follow. Therefore, you will certainly lose your redundancy. \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset When one of your multiple split brain nodes has left its current primary role, e.g. via \family typewriter marsadm secondary \family default and umounting its \family typewriter /dev/mars/mydata \family default device while the network is up (again), we cannot guarantee that it is always possible to re-enter primary mode, even when \family typewriter primary --force \family default is given. First clean up the split brain via \family typewriter leave-resource \family default and friends, or use the method described in section \begin_inset CommandInset ref LatexCommand ref reference "sub:Cleanup-in-case" \end_inset . Remember that split brain is an \series bold \emph on erroneous \series default \emph default state. Therefore it is \series bold generally not a good idea to (re-)enter it deliberately! \end_layout \begin_layout Standard Split brain situations are detected \emph on passively \emph default by secondaries.
Whenever a secondary detects that a split brain has happened somewhere, it just refuses to fetch and to apply any logfiles after the split point. This means that its local disk state will remain consistent, but outdated with respect to any of the split brain versions. \end_layout \begin_layout Subsection Split Brain Resolution \begin_inset CommandInset label LatexCommand label name "sub:Split-Brain-Resolution" \end_inset \end_layout \begin_layout Standard Split brain can naturally occur during a long-lasting network outage (aka network partition) when you (forcefully) switch primaries in between, or due to final loss of your old primary node (fatal node crash) when not all logfile data had been transferred immediately before the final crash. \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename images/MatieresToxiques.png lyxscale 50 scale 17 \end_inset Remember that split brain is always an \series bold erroneous state \series default which must be resolved as soon as possible! \end_layout \begin_layout Subsubsection Final Destruction of a Damaged Node \begin_inset CommandInset label LatexCommand label name "sub:Final-Destroy-of" \end_inset \end_layout \begin_layout Standard When a node has finally died, do the following steps ASAP: \end_layout \begin_layout Enumerate \emph on Physically \emph default remove the dead node from your network. Unplug all network cables! Failing to do so might provoke a disaster in case it somehow resurrects in an uncontrolled manner, such as with a partly-damaged \family typewriter /mars/ \family default filesystem, or whatever. Don't risk any such unpredictable behaviour! \end_layout \begin_layout Enumerate \series bold Manually \series default check which of the surviving versions will be the \begin_inset Quotes eld \end_inset right \begin_inset Quotes erd \end_inset one.
Any error is up to you: resurrecting an unnecessarily old / outdated version and/or destroying the newest / best version is \emph on your \emph default fault, not the fault of MARS. \end_layout \begin_layout Enumerate If you did not already switch your primary to the final destination determined in the previous step, do it now (see description in section \begin_inset CommandInset ref LatexCommand ref reference "sub:Emergency-Switching" \end_inset ). \end_layout \begin_layout Enumerate On the surviving new designated primary, give the following commands: \end_layout \begin_deeper \begin_layout Enumerate \family typewriter marsadm --host=your-damaged-host disconnect mydata \end_layout \begin_layout Enumerate \family typewriter marsadm --host=your-damaged-host leave-resource mydata \end_layout \end_deeper \begin_layout Enumerate In case any of the previous commands should fail (which is rather likely), repeat it with an additional \family typewriter --force \family default option. Don't use \family typewriter --force \family default in the first place, always try first without it! \end_layout \begin_layout Enumerate Repeat the same with \emph on all \emph default resources which were formerly present at \family typewriter your-damaged-host \family default . \end_layout \begin_layout Enumerate Finally, say \family typewriter marsadm --host=your-damaged-host leave-cluster \family default (optionally augmented with \family typewriter --force \family default ). \end_layout \begin_layout Standard Now your surviving nodes should \emph on believe \emph default that the old node \family typewriter your-damaged-host \family default no longer exists, and that it no longer participates in any resource.
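\end_layout \begin_layout Standard The rule from the steps above (always try first without \family typewriter --force \family default , repeat with it only after a failure) can be sketched as a small Python wrapper. This is only an illustration under stated assumptions: the helper names \family typewriter run_with_force_fallback \family default and \family typewriter remove_dead_host \family default are hypothetical, \family typewriter --force \family default is assumed to be accepted at the end of the command line as in the other examples of this manual, and the manual checks described above are not replaced by it. \end_layout \begin_layout Standard

```python
import subprocess
from typing import Callable, List, Sequence

# A runner takes a command line and returns its exit status (0 = success).
Runner = Callable[[Sequence[str]], int]


def _real_runner(argv: Sequence[str]) -> int:
    """Execute a command and return its exit status."""
    return subprocess.call(list(argv))


def run_with_force_fallback(argv: Sequence[str],
                            runner: Runner = _real_runner) -> List[str]:
    """Try argv as given; only if it fails, repeat it once with --force.

    Returns the command line that finally succeeded, or raises RuntimeError.
    """
    plain = list(argv)
    if runner(plain) == 0:
        return plain
    forced = plain + ["--force"]  # assumption: --force may be given last
    if runner(forced) == 0:
        return forced
    raise RuntimeError("failed even with --force: " + " ".join(plain))


def remove_dead_host(host: str, resources: Sequence[str],
                     runner: Runner = _real_runner) -> List[List[str]]:
    """Detach a dead host from every resource, then from the cluster."""
    done = []
    for res in resources:
        for cmd in ("disconnect", "leave-resource"):
            argv = ["marsadm", f"--host={host}", cmd, res]
            done.append(run_with_force_fallback(argv, runner))
    argv = ["marsadm", f"--host={host}", "leave-cluster"]
    done.append(run_with_force_fallback(argv, runner))
    return done
```

\end_layout \begin_layout Standard Passing a fake runner instead of the default one gives a dry run: the returned list shows which commands would have needed \family typewriter --force \family default , without touching the cluster.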
\end_layout \begin_layout Standard In case \family typewriter leave-resource --host= \family default does not work, you can try the following alternative: \end_layout \begin_layout Standard \begin_inset ERT status open \begin_layout Plain Layout \backslash begin{enumerate} \backslash setcounter{enumi}{3} \end_layout \end_inset \end_layout \begin_layout Standard \begin_inset ERT status open \begin_layout Plain Layout \backslash item \end_layout \end_inset On the surviving new designated primary, give the following commands \end_layout \begin_layout Enumerate \family typewriter marsadm disconnect-all mydata \end_layout \begin_layout Enumerate \family typewriter marsadm down mydata \end_layout \begin_layout Enumerate Check by hand whether your local disk is consistent, e.g. by test-mounting it, \family typewriter fsck \family default , etc. \end_layout \begin_layout Enumerate \family typewriter marsadm delete-resource mydata \end_layout \begin_layout Enumerate Check whether the other cluster nodes are dead. If not, STONITH them by hand. \end_layout \begin_layout Enumerate \family typewriter marsadm create-resource newmydata ... \family default and further steps to set up your resource from scratch. \end_layout \begin_layout Standard \begin_inset ERT status open \begin_layout Plain Layout \backslash end{enumerate} \end_layout \end_inset \end_layout \begin_layout Standard \noindent In any case, \series bold manually check \series default whether a split brain is reported for any resource on any of your \emph on surviving \emph default cluster nodes.
If you find one (and only then), please continue with the following recipe as if you had just had a temporary failure of \emph on some \emph default of the surviving nodes: \end_layout \begin_layout Subsubsection Split Brain Resolution after a Temporary Failure \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename images/MatieresToxiques.png lyxscale 50 scale 17 \end_inset Please remember that split brain is always an \series bold erroneous state \series default which must be resolved as soon as possible! \end_layout \begin_layout Standard Whenever split brain occurs for whatever reason, you have two choices for resolution: either destroy one of your versions, or retain it under a different resource name. \end_layout \begin_layout Standard In either case, do the following steps ASAP: \end_layout \begin_layout Enumerate \series bold Manually \series default check which (surviving) version is the \begin_inset Quotes eld \end_inset right \begin_inset Quotes erd \end_inset one. Any error is up to you: destroying the wrong version is \emph on your \emph default fault, not the fault of MARS. \end_layout \begin_layout Enumerate If you did not already switch your primary to the final destination determined in the previous step, do it now (see description in section \begin_inset CommandInset ref LatexCommand ref reference "sub:Emergency-Switching" \end_inset ). \end_layout \begin_layout Enumerate On each non-right version (which you don't want to retain) which had been primary before, umount your \family typewriter /dev/mars/mydata \family default or otherwise stop using it (e.g. stop iSCSI or other users of the device). Wait until each of them has actually left primary state and until their local logfile(s) have been fully written back to the underlying disk. \end_layout \begin_layout Enumerate Wait until the network works again.
All your (surviving) cluster nodes \emph on must \emph default \begin_inset Foot status open \begin_layout Plain Layout If you are a MARS expert and you really know what you are doing (in particular, you can anticipate the effects of the Lamport clock and of the symlink update protocol including the \begin_inset Quotes eld \end_inset eventually consistent \begin_inset Quotes erd \end_inset behaviour including the not-yet-consistent intermediate states, see sections \begin_inset CommandInset ref LatexCommand ref reference "sec:The-Lamport-Clock" \end_inset and \begin_inset CommandInset ref LatexCommand ref reference "sec:The-Symlink-Tree" \end_inset ), you may deviate from this requirement. \end_layout \end_inset be able to communicate with each other. If that is not possible, or if it takes too long, use the method described in section \begin_inset CommandInset ref LatexCommand ref reference "sub:Final-Destroy-of" \end_inset . \end_layout \begin_layout Enumerate If any of your (surviving) cluster nodes already has the \begin_inset Quotes eld \end_inset right \begin_inset Quotes erd \end_inset version and was not in a primary role when the split brain happened, you don't need to do the following steps for it, of course. The following applies only to those nodes which \emph on deviate \emph default from the correct version: \end_layout \begin_layout Enumerate It may happen that the \begin_inset Quotes eld \end_inset right \begin_inset Quotes erd \end_inset version you want to retain is \emph on not \emph default the version which is currently designated as primary for the whole cluster. Only in such a case, switch the primary role as described in sections \begin_inset CommandInset ref LatexCommand ref reference "sub:Intended-Switching" \end_inset or \begin_inset CommandInset ref LatexCommand ref reference "sub:Emergency-Switching" \end_inset .
Here is a repetition of the necessary steps: \end_layout \begin_deeper \begin_layout Enumerate First try \family typewriter marsadm primary mydata \family default on the new designated primary host. Don't mix up your shell windows! \end_layout \begin_layout Enumerate Only if that refuses to work \emph on for no good reason \emph default , do the following steps: \end_layout \begin_deeper \begin_layout Enumerate \family typewriter marsadm disconnect mydata \family default . \end_layout \begin_layout Enumerate \family typewriter marsadm primary mydata --force \family default . \end_layout \begin_layout Enumerate \family typewriter marsadm connect mydata \family default . \end_layout \end_deeper \end_deeper \begin_layout Standard The next steps are different for different use cases: \end_layout \begin_layout Paragraph Keeping a Split Brain Version \end_layout \begin_layout Standard Continue with the following steps on each of those cluster node(s) you don't want to retain: \end_layout \begin_layout Standard \begin_inset ERT status open \begin_layout Plain Layout \backslash begin{enumerate} \backslash setcounter{enumi}{6} \end_layout \end_inset \end_layout \begin_layout Standard \begin_inset ERT status open \begin_layout Plain Layout \backslash item \end_layout \end_inset \family typewriter marsadm leave-resource mydata \end_layout \begin_layout Standard \begin_inset ERT status open \begin_layout Plain Layout \backslash item \end_layout \end_inset After having done this on \emph on all \emph default non-right cluster nodes, check that the split brain is gone (e.g. by saying \family typewriter marsadm status \family default ).
In very rare \begin_inset Foot status open \begin_layout Plain Layout When your network had partitioned in a very awkward way for a long time, and when your partitioned primaries did several \family typewriter log-rotate \family default operations independently from each other, there is a small chance that \family typewriter leave-resource \family default does not clean up \emph on all \emph default remains of such an awkward situation. Only in such a case, try \family typewriter log-purge-all \family default . \end_layout \end_inset cases, it might happen that the preceding \family typewriter leave-resource \family default operations were not able to clean up all logfiles produced in parallel by the split brain situation. Only in such rare cases, read the documentation about \family typewriter log-purge-all \family default (see page \begin_inset CommandInset ref LatexCommand pageref reference "log-purge-all$res" \end_inset ) and try it. \end_layout \begin_layout Standard \begin_inset ERT status open \begin_layout Plain Layout \backslash item \end_layout \end_inset Check that each underlying local disk \family typewriter /dev/lv-x/mydata \family default is really usable afterwards, e.g. by test-mounting it (or \family typewriter fsck \family default if you can afford it). If all is OK, don't forget to umount it before proceeding with the next step. \end_layout \begin_layout Standard \begin_inset ERT status open \begin_layout Plain Layout \backslash item \end_layout \end_inset Create a completely new MARS resource out of the underlying disk \family typewriter /dev/lv-x/mydata \family default having a different name, such as \family typewriter mynewdata \family default (see description in section \begin_inset CommandInset ref LatexCommand nameref reference "sec:Creating-and-Maintaining" \end_inset ).
\end_layout \begin_layout Standard \begin_inset ERT status open \begin_layout Plain Layout \backslash end{enumerate} \end_layout \end_inset \end_layout \begin_layout Paragraph Destroying a Wrong Split Brain Version \end_layout \begin_layout Standard As before, do the \family typewriter leave-resource \family default step on each node and check that the split brain is gone, but omit the re-creation. Instead, you may just follow up with a \family typewriter join-resource \family default to the old resource name, in order to restore your redundancy by overwriting your bad split brain version with the correct one. \end_layout \begin_layout Standard Alternatively, you may try the following short procedure instead, which is however not guaranteed to resolve all (desperate) split-brain situations (see documentation of \family typewriter log-purge-all \family default on page \begin_inset CommandInset ref LatexCommand pageref reference "log-purge-all$res" \end_inset ): \end_layout \begin_layout Standard \begin_inset ERT status open \begin_layout Plain Layout \backslash begin{enumerate} \backslash setcounter{enumi}{6} \end_layout \end_inset \end_layout \begin_layout Standard \begin_inset ERT status open \begin_layout Plain Layout \backslash item \end_layout \end_inset On each node with a non- \begin_inset Quotes eld \end_inset right \begin_inset Quotes erd \end_inset version, say \family typewriter marsadm invalidate mydata \family default .
\end_layout \begin_layout Standard \begin_inset ERT status open \begin_layout Plain Layout \backslash end{enumerate} \end_layout \end_inset \end_layout \begin_layout Paragraph Keeping a Good Version \end_layout \begin_layout Standard When you had a secondary which did not participate in the split brain, but just got confused and therefore stopped applying logfiles immediately after the split-brain point, it may very well happen \begin_inset Foot status open \begin_layout Plain Layout In general, such a \begin_inset Quotes eld \end_inset good \begin_inset Quotes erd \end_inset behaviour cannot be guaranteed for all secondaries. Race conditions in complex networks may asynchronously transfer \begin_inset Quotes eld \end_inset wrong \begin_inset Quotes erd \end_inset logfile data to a secondary much earlier than conflicting \begin_inset Quotes eld \end_inset good \begin_inset Quotes erd \end_inset logfile data which will be marked \begin_inset Quotes eld \end_inset good \begin_inset Quotes erd \end_inset only in the \emph on future. \emph default It is impossible to predict this in advance. \end_layout \end_inset that you don't need to do any action for it. When all wrong versions have disappeared from the cluster (either by \family typewriter invalidate \family default or by \family typewriter leave-resource \family default ), the confusion should be over, and the secondary should automatically resume tracking of the new unique version. \end_layout \begin_layout Standard Please check that \emph on all \emph default of your secondaries are no longer stuck. You need to execute split brain resolution only for \emph on stuck \emph default nodes. \end_layout \begin_layout Subsubsection Cleanup in case of Complicated Cascading Failures \begin_inset CommandInset label LatexCommand label name "sub:Cleanup-in-case" \end_inset \end_layout \begin_layout Standard MARS Light does its best to recover even from multiple failures (e.g. 
\series bold rolling disasters \series default ). Chances are high that the previous instructions will work even in case of multiple failures, such as a network failure plus local node failure at only 1 node (even if that node is the former primary node). \end_layout \begin_layout Standard However, in general (e.g. when more than 1 node is damaged) there is no guarantee that recovery will \emph on always \emph default succeed under \emph on any \emph default (weird) circumstances. That said, your chances for recovery are \emph on very \emph default high when some disk remains usable at least at one of your surviving secondaries. \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset It should be very hard to finally trash a secondary, because the transaction logfiles contain \family typewriter md5 \family default checksums for all data records. Any attempt to apply corrupted logfiles is refused by MARS. In addition, the sequence numbers of log-rotated logfiles are checked for contiguity. Finally, the \emph on sequence path \emph default of logfile applications (consisting of logfile names plus their respective length) is additionally secured by a \family typewriter git \family default -like incremental checksum over the whole path (so-called \begin_inset Quotes eld \end_inset version links \begin_inset Quotes erd \end_inset ). This should detect split brains even if logfiles are appended to / modified \emph on after \emph default a (forceful) switchover has already taken place.
\end_layout \begin_layout Standard \noindent \begin_inset Graphics filename images/MatieresToxiques.png lyxscale 50 scale 17 \end_inset That said, your \begin_inset Quotes eld \end_inset chances \begin_inset Quotes erd \end_inset for final loss of data are very high if you remove the BBU of your hardware RAID system before all hot data has been flushed to the physical disks. Therefore, never try to \begin_inset Quotes eld \end_inset repair \begin_inset Quotes erd \end_inset a seemingly dead node before your replication is up again somewhere else! Only unplug the network cables when advised, but never try to repair the hardware instantly! \end_layout \begin_layout Standard In case of desperate situations where none of the previous instructions have succeeded, your last chance is rebuilding your resource from an intact disk as follows: \end_layout \begin_layout Enumerate Do \family typewriter rmmod mars \family default on all your cluster nodes and/or reboot them. Note: if you are less desperate, chances are high that the following will also work when the kernel module remains active and everywhere a \family typewriter marsadm down \family default is given instead, but for an \emph on ultimate \emph default instruction you should eliminate \emph on potential \emph default kernel problems by \family typewriter rmmod \family default / \family typewriter reboot \family default , at least if you can afford the downtime on concurrently operating resources. \end_layout \begin_layout Enumerate For safety, physically remove the storage network cables on \emph on all \emph default your cluster nodes. Note: the same disclaimer holds. MARS really does its best, even when \family typewriter delete-resource \family default is given while the network is fully active and multiple split-brain primaries are actively using their local device in parallel (approved by some testcases from the automatic test suite, but note that it is impossible to catch all possible failure scenarios). 
Don't challenge your fate if you are desperate! Don't \emph on rely \emph default on this! Nothing is absolutely fail-safe! \end_layout \begin_layout Enumerate \series bold Manually \series default check which surviving disk is usable, and which is the \begin_inset Quotes eld \end_inset best \begin_inset Quotes erd \end_inset one for your purpose. \end_layout \begin_layout Enumerate Do \family typewriter modprobe mars \family default \emph on only \emph default on that node. If that fails, \family typewriter rmmod \family default and/or reboot again, and start over with a completely fresh \family typewriter /mars/ \family default partition ( \family typewriter mkfs.ext4 /mars/ \family default or similar), and continue with step 7. \end_layout \begin_layout Enumerate If your old \family typewriter /mars/ \family default works, and you did not already (forcefully) switch your designated primary to the final destination, do it now (see description in section \begin_inset CommandInset ref LatexCommand ref reference "sub:Emergency-Switching" \end_inset ). \end_layout \begin_layout Enumerate Say \family typewriter marsadm delete-resource mydata --force \family default . \end_layout \begin_layout Enumerate Locally build up the new resource as usual. \end_layout \begin_layout Enumerate Check whether the new resource works in standalone mode. \end_layout \begin_layout Enumerate When necessary, repeat these steps with other resources. \end_layout \begin_layout Enumerate Finally, do all the \family typewriter join-resource \family default s on the respective cluster nodes, according to your new redundancy scenario after the failures (e.g. after activating spare nodes, etc). \end_layout \begin_layout Standard Now you can choose how to rebuild your cluster. If you rebuilt \family typewriter /mars/ \family default anywhere, you should do the same on all other (surviving) cluster nodes and start over with a fresh \family typewriter join-cluster \family default on them.
\end_layout \begin_layout Standard \noindent \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset Never use \family typewriter delete-resource \family default twice on the same resource name, at least once you already have a working standalone primary \begin_inset Foot status open \begin_layout Plain Layout Of course, when you have not created the \emph on same \emph default resource anew, you may repeat \family typewriter delete-resource \family default on other cluster nodes in order to get rid of local files / symlinks which had not been propagated to other nodes before. \end_layout \end_inset . You might accidentally destroy your again-working copy! \end_layout \begin_layout Standard Before re-connecting any network cable on any non-primary (new secondaries), ensure that all \family typewriter /dev/mars/mydata \family default devices are no longer in use (e.g. from an old primary role before the incident happened), and that each local disk is detached. Only after that, you should be able to safely re-connect the network. The \family typewriter delete-resource \family default given at the new primary should propagate now to each of your secondaries, and your local disk should be usable for a re- \family typewriter join-resource \family default . \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset When you did not rebuild your cluster from scratch with fresh \family typewriter /mars/ \family default filesystems, and one of the old cluster nodes is supposed to be removed permanently, use \family typewriter leave-resource \family default (optionally with \family typewriter --host= \family default and/or \family typewriter --force \family default ) and finally \family typewriter leave-cluster \family default .
\end_layout \begin_layout Chapter Basic Working Principle \end_layout \begin_layout Standard Even if you are impatient, please read this chapter. At the \emph on surface \emph default , MARS appears to be very similar to DRBD. It looks almost like a drop-in replacement for DRBD. \end_layout \begin_layout Standard If you take this naïvely, you could easily step into some trivial pitfalls, because the internal working principle of MARS is totally different from DRBD. Please forget (almost) anything you already know about the internal working principles of DRBD, and look at the very different working principles of MARS. \end_layout \begin_layout Section The Transaction Logger \begin_inset CommandInset label LatexCommand label name "sec:The-Transaction-Logger" \end_inset \end_layout \begin_layout Standard \noindent \align center \begin_inset Graphics filename images/MARS_Data_Flow.pdf lyxscale 60 width 100text% \end_inset \end_layout \begin_layout Standard \noindent The basic idea of MARS is to record all changes made to your block device in a so-called \series bold transaction logfile \series default . \emph on Any \emph default write request is treated like a transaction which changes the contents of your block device. \end_layout \begin_layout Standard This is similar in concept to some database systems, but there exists no separate \begin_inset Quotes eld \end_inset commit \begin_inset Quotes erd \end_inset operation: \emph on any \emph default write request is acting like a commit. \end_layout \begin_layout Standard The picture shows the flow of write requests. Let's start with the primary node. \end_layout \begin_layout Standard Upon submission of a write request on \family typewriter /dev/mars/mydata \family default , it is first buffered in a \emph on temporary \emph default memory buffer.
\end_layout \begin_layout Standard The temporary memory buffer serves multiple purposes: \end_layout \begin_layout Itemize It keeps track of the order of write operations. \end_layout \begin_layout Itemize Additionally, it keeps track of the positions in the underlying disk \family typewriter /dev/lv-x/mydata \family default . In particular, it detects when the same block is overwritten multiple times. \end_layout \begin_layout Itemize While a write operation is pending, any concurrent reads are served from the memory buffer. \end_layout \begin_layout Standard After the write has been buffered in the temporary memory buffer, the main logger thread of the transaction logger creates a so-called \emph on log entry \emph default and starts an \begin_inset Quotes eld \end_inset append \begin_inset Quotes erd \end_inset operation on the transaction logfile. The log entry contains vital information such as the logical block number in the underlying disk, the length of the data, a timestamp, some header magic in order to detect corruption, the log entry sequence number, of course the data itself, and optional information like a checksum or compression information. \end_layout \begin_layout Standard Once the log entry has been written through to the \family typewriter /mars/ \family default filesystem via fsync(), the application waiting for the write operation at \family typewriter /dev/mars/mydata \family default is signalled that the write was successful. \end_layout \begin_layout Standard This may happen even \emph on before \emph default the writeback to the underlying disk \family typewriter /dev/lv-x/mydata \family default has started. Even when you power off the system right now, the information is not lost: it is present in the logfile, and can be reconstructed from there.
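\end_layout \begin_layout Standard The write path described above can be illustrated with a toy model in Python. It is a sketch under stated assumptions, not the kernel implementation: the names \family typewriter ToyLogger \family default and \family typewriter LogEntry \family default are invented for illustration, a Python list stands in for the transaction logfile, dictionaries stand in for the temporary memory buffer and the underlying disk, and only three of the log entry fields are modelled. \end_layout \begin_layout Standard

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LogEntry:
    seq: int      # log entry sequence number
    block: int    # logical block number in the underlying disk
    data: bytes   # the payload itself


class ToyLogger:
    def __init__(self) -> None:
        self.logfile: List[LogEntry] = []   # stands in for the transaction logfile
        self.buffer: Dict[int, bytes] = {}  # temporary memory buffer (newest data per block)
        self.disk: Dict[int, bytes] = {}    # stands in for the underlying disk

    def write(self, block: int, data: bytes) -> int:
        """Buffer the write, append a log entry, then acknowledge."""
        self.buffer[block] = data
        entry = LogEntry(seq=len(self.logfile) + 1, block=block, data=data)
        self.logfile.append(entry)          # the real system waits for fsync() here
        return entry.seq                    # acknowledged before any writeback

    def read(self, block: int) -> Optional[bytes]:
        """Serve pending data from the buffer, otherwise from the disk."""
        if block in self.buffer:
            return self.buffer[block]
        return self.disk.get(block)
```

\end_layout \begin_layout Standard In this model a write is acknowledged while the disk dictionary is still untouched; the only durable copy is the log entry, which is exactly the situation described in the text.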
\end_layout \begin_layout Standard Notice that the order of log records present in the transaction log defines a total order among the write requests which is \emph on compatible \emph default with the partial order of write requests issued on \family typewriter /dev/mars/mydata \family default . \end_layout \begin_layout Standard Also notice that despite its sequential nature, the transaction logfile is typically \emph on not \emph default the performance bottleneck of the system: since appending to a logfile is almost purely sequential IO, it runs much faster than random IO on typical datacenter workloads. \end_layout \begin_layout Standard In order to reclaim the temporary memory buffer, its content must be written back to the underlying disk \family typewriter /dev/lv-x/mydata \family default at some point. After writeback, the temporary space is freed. The writeback can do the following optimizations: \end_layout \begin_layout Enumerate writeback may be in \emph on any \emph default order; in particular, it may be \emph on sorted \emph default according to ascending sector numbers. This will reduce the average seek distances of magnetic disks in general. \end_layout \begin_layout Enumerate when the same sector is overwritten multiple times, only the \begin_inset Quotes eld \end_inset last \begin_inset Quotes erd \end_inset version needs to be written back, skipping some intermediate versions. \end_layout \begin_layout Standard In case the primary node crashes during writeback, it suffices to replay the log entries from some point in the past until the end of the transaction logfile. It does no harm if you accidentally replay some log entries twice or even more often: since the replay is in the original total order, any temporary inconsistency is \emph on healed \emph default by the logfile application.
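\end_layout \begin_layout Standard The two writeback optimizations and the replay rule described above can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the function names \family typewriter plan_writeback \family default and \family typewriter replay \family default are hypothetical, a log entry is reduced to a pair of sector number and data, and a dictionary stands in for the disk. \end_layout \begin_layout Standard

```python
from typing import Dict, Iterable, List, Tuple

Entry = Tuple[int, bytes]  # (sector number, data), listed in logfile order


def plan_writeback(entries: Iterable[Entry]) -> List[Entry]:
    """Keep only the last version per sector, sorted by ascending sector number."""
    last: Dict[int, bytes] = {}
    for sector, data in entries:
        last[sector] = data          # later entries overwrite earlier ones
    return sorted(last.items())      # sector numbers are unique, so this sorts by sector


def replay(disk: Dict[int, bytes], entries: Iterable[Entry]) -> Dict[int, bytes]:
    """Apply log entries in their original total order; return the new disk state."""
    result = dict(disk)
    for sector, data in entries:
        result[sector] = data
    return result


log = [(5, b"x"), (2, b"y"), (5, b"z")]
clean = replay({}, log)

# Replaying the whole log a second time changes nothing.
assert replay(clean, log) == clean
# A half-finished or damaged disk state is brought to the same result.
assert replay({5: b"garbage"}, log) == clean
```

\end_layout \begin_layout Standard The sorted plan for this log contains only two writes, sector 2 and then sector 5 with its last version, although three log entries were recorded.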
\end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset In mathematics, the property that you can apply your logfile twice to your data (or even as often as you want) and still get the same result is called \series bold idempotence \series default . This is a very desirable property: it ensures that nothing goes wrong when applying \begin_inset Quotes eld \end_inset too much \begin_inset Quotes erd \end_inset / starting your replay \begin_inset Quotes eld \end_inset too early \begin_inset Quotes erd \end_inset . Idempotence is even more beneficial: in case anything should go wrong with your data on your disk (e.g. IO errors), applying your logfile once more may \begin_inset Foot status open \begin_layout Plain Layout Miracles cannot be guaranteed, but \emph on higher chances \emph default and \emph on improvements \emph default can be expected (e.g. better chances for \family typewriter fsck \family default ). \end_layout \end_inset even \series bold heal \series default some defects. Good news for desperate sysadmins forced to work with flaky hardware! \end_layout \begin_layout Standard The basic idea of the asynchronous replication of MARS is rather simple: just transfer the logfiles to your secondary nodes, and apply them to their copy of the disk data (also called \emph on mirror \emph default ) in the same order as the total order defined by the primary. \end_layout \begin_layout Standard Therefore, a mirror of your data on any secondary may be outdated, but it always corresponds to some version which was valid in the past. This property is called \series bold anytime consistency \begin_inset Foot status open \begin_layout Plain Layout Your secondary nodes are always consistent in themselves. Notice that this kind of consistency is a \emph on local \emph default consistency model. There exists no global consistency in MARS.
Global consistency would be practically impossible in long-distance replication where Einstein's law of the speed of light limits global consistency. The front-cover picture showing the planets Earth and Mars tries to lead your imagination away from global consistency models as used in \begin_inset Quotes eld \end_inset DRBD Think(tm) \begin_inset Quotes erd \end_inset , and tries to prepare you mentally for local consistency as in \begin_inset Quotes eld \end_inset MARS Think(tm) \begin_inset Quotes erd \end_inset . \end_layout \end_inset . \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset As you can see in the picture, the process of transferring the logfiles is \emph on independent \emph default from the process which applies the logfiles to the data at some secondary site. Both processes can be switched on / off separately (see commands \family typewriter marsadm {dis,}connect \family default and \family typewriter marsadm {pause,resume}-replay \family default in section \begin_inset CommandInset ref LatexCommand ref reference "sub:Operation-of-the" \end_inset ). This may be \emph on exploited \emph default : for example, you may replicate your logfiles as soon as possible (to protect against catastrophic failures), but deliberately wait one hour until they are applied (under regular circumstances). If your data inside your filesystem \family typewriter /mydata/ \family default at the primary site is accidentally destroyed by \family typewriter rm -rf /mydata/ \family default , you still have an old copy at the secondary site. This way, you can substitute \emph on some parts \begin_inset Foot status open \begin_layout Plain Layout Please note that MARS cannot \emph on fully \emph default substitute a backup system, because it can keep only \emph on physical \emph default copies, and does not create logical copies.
\end_layout \end_inset \emph default of conventional backup functionality by MARS. In case you need the current version, just replay in \begin_inset Quotes eld \end_inset fast-forward \begin_inset Quotes erd \end_inset mode (similar to old-fashioned video tapes). \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset Future versions of MARS Full are planned to also allow \begin_inset Quotes eld \end_inset fast-backward \begin_inset Quotes erd \end_inset rewinding, of course at some cost. \end_layout \begin_layout Section The Lamport Clock \begin_inset CommandInset label LatexCommand label name "sec:The-Lamport-Clock" \end_inset \end_layout \begin_layout Standard MARS is always \emph on asynchronously \emph default communicating in the distributed system on \emph on any \emph default topic, even strategic decisions. \end_layout \begin_layout Standard If there were a \emph on strict \emph default global consistency model, which is roughly equivalent to a standalone model, we would need \emph on locking \emph default in order to serialize conflicting requests. It has been known for decades that \emph on distributed locks \emph default not only suffer from performance problems, but are also cumbersome to get working reliably in scenarios where nodes or network links may fail at any time. \end_layout \begin_layout Standard Therefore, MARS uses a very different consistency model: \series bold Eventually Consistent \series default .
\end_layout \begin_layout Standard \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset The asynchronous communication protocol of MARS leads to a different behaviour from DRBD in case of \series bold network partitions \series default (temporary interruption of communication between some cluster nodes), because MARS \emph on remembers \emph default the old state of remote nodes over long periods of time, while DRBD knows absolutely nothing about its peers in disconnected state. Sysadmins familiar with DRBD might find the following behaviour unusual: \end_layout \begin_layout Standard \noindent \align center \size tiny \begin_inset Tabular \begin_inset Text \begin_layout Plain Layout \size tiny Event \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny DRBD Behaviour \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny MARS Behaviour \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny 1. the network partitions \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny automatic disconnect \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny nothing happens, but replication lags behind \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny 2. on A: \family typewriter umount $device \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny works \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny works \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny 3. on A: \family typewriter {drbd,mars}adm secondary \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny works \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny works \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny 4. 
on B: \family typewriter {drbd,mars}adm primary \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny works, split brain happens \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \series bold \size tiny refused \series default because B believes that A is primary \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny 5. the network resumes \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny automatic connect attempt fails \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny communication automatically resumes \end_layout \end_inset \end_inset \end_layout \begin_layout Standard \noindent If you intentionally want to switch over (and to produce a split brain as a side effect), the following variant must be used with MARS: \end_layout \begin_layout Standard \noindent \align center \size tiny \begin_inset Tabular \begin_inset Text \begin_layout Plain Layout \size tiny Event \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny DRBD Behaviour \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny MARS Behaviour \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny 1. the network partitions \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny automatic disconnect \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny nothing happens, but replication lags behind \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny 2. on A: \family typewriter umount $device \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny works \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny works \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny 3. 
on A: \family typewriter {drbd,mars}adm secondary \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny works \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny works \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny 4. on B: \family typewriter {drbd,mars}adm primary \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny split brain, but nobody knows \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \series bold \size tiny refused \series default because B believes that A is primary \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny 5. on B: \family typewriter marsadm disconnect \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny - \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny works, nothing happens \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny 6. on B: \family typewriter marsadm primary --force \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny - \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny works, split brain happens on B, but A doesn't know \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny 7. on B: \family typewriter marsadm connect \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny - \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny works, nothing happens \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny 8. 
the network resumes \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny automatic connect attempt fails \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size tiny communication resumes, A now detects the split brain \end_layout \end_inset \end_inset \end_layout \begin_layout Standard \noindent In order to implement the consistency model \begin_inset Quotes eld \end_inset eventually consistent \begin_inset Quotes erd \end_inset , MARS uses a so-called Lamport \begin_inset Foot status open \begin_layout Plain Layout Published in the late 1970s by Leslie Lamport, also known as the inventor of \begin_inset ERT status open \begin_layout Plain Layout \backslash LaTeX \end_layout \end_inset . \end_layout \end_inset clock. MARS uses a special variant called \begin_inset Quotes eld \end_inset physical Lamport clock \begin_inset Quotes erd \end_inset . \end_layout \begin_layout Standard The physical Lamport clock is another almost-realtime clock which \emph on can \emph default run independently from the Linux kernel system clock. However, the Lamport clock tries to remain as close as possible to the system clock. \end_layout \begin_layout Standard Both clocks can be queried at any time via \family typewriter cat /proc/sys/mars/lamport_clock \family default . The result will show both clocks in parallel, in units of seconds since the Unix epoch, with nanosecond resolution. \end_layout \begin_layout Standard When there are no network messages at all, both the system clock and the Lamport clock will show almost the same time (except for some minor differences of a few nanoseconds resulting from the finite processor clock speed). \end_layout \begin_layout Standard The physical Lamport clock works rather simply: \emph on any \emph default message on the network is augmented with a Lamport timestamp telling when the message was \emph on sent \emph default according to the local Lamport clock of the sender.
Whenever that message is received, the receiver checks whether the time ordering relation would be violated: whenever the Lamport timestamp in the message claims that the sender had sent it \emph on after \emph default it arrived at the receiver (due to drift between their respective local clocks), something must be wrong. In this case, the local Lamport clock of the \emph on receiver \emph default is advanced to shortly after the sender's Lamport timestamp, such that the time ordering relation is no longer violated. \end_layout \begin_layout Standard As a consequence, any local Lamport clock may run ahead of the corresponding local system clock. In order to avoid accumulation of deltas between the Lamport and the system clock, the Lamport clock will run slower after that, possibly until it reaches the system clock again (if no other message arrives which sets it forward again). After having reached the system clock, the Lamport clock will continue at \begin_inset Quotes eld \end_inset normal \begin_inset Quotes erd \end_inset speed. \end_layout \begin_layout Standard MARS uses the local Lamport clock for anything where other systems would use the local system clock: for example, timestamp generation in the \family typewriter /mars/ \family default filesystem. Even symlinks created there are timestamped according to the Lamport clock. Both the kernel module and the userspace tool \family typewriter marsadm \family default are always operating in the timescale of the Lamport clock. Most importantly, all timestamp comparisons are always carried out with respect to Lamport time.
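\end_layout

\begin_layout Standard
The receive rule described above can be sketched as a toy model. This is an illustration under stated assumptions, not the kernel implementation: the class name PhysicalLamportClock and its methods are hypothetical, times are integer nanoseconds, the system clock is passed in explicitly so that the example is deterministic, and the gradual slowdown of the real clock is simplified to a minimal tick of one nanosecond per query.

```python
class PhysicalLamportClock:
    """Toy model of a physical Lamport clock (integer nanoseconds)."""

    def __init__(self):
        self.last = 0  # last Lamport time handed out or learned

    def now(self, system_ns):
        # Never run backwards and never fall behind the system clock.
        # While ahead of the system clock, advance by only 1 ns per query,
        # so the system clock catches up eventually.
        self.last = max(system_ns, self.last + 1)
        return self.last

    def receive(self, sender_stamp_ns, system_ns):
        # If the message claims to be sent at or after the local "now",
        # jump to just after the sender's timestamp.
        local = self.now(system_ns)
        if sender_stamp_ns >= local:
            self.last = sender_stamp_ns + 1
        return self.last


sender = PhysicalLamportClock()
receiver = PhysicalLamportClock()

stamp = sender.now(system_ns=1000)               # message is stamped 1000
arrival = receiver.receive(stamp, system_ns=900)  # receiver's system clock lags
later = receiver.now(system_ns=950)              # still ahead of the system clock
caught_up = receiver.now(system_ns=2000)         # system clock has overtaken
```

In this model the receiver's Lamport time for the arrival is later than the sender's stamp although the receiver's system clock lags behind, and once the system clock has overtaken, both clocks agree again.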
\end_layout \begin_layout Standard \noindent \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset Bigger differences between the Lamport and the system clock can be annoying from a human point of view: when typing \family typewriter ls -l /mars/resource-mydata/ \family default many timestamps may appear as if they were created in the \begin_inset Quotes eld \end_inset future \begin_inset Quotes erd \end_inset , because the \family typewriter ls \family default command formats its output relative to the system clock (it does not even know of the existence of the MARS Lamport clock). \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename images/MatieresToxiques.png lyxscale 50 scale 17 \end_inset Always use \family typewriter ntp \family default (or another clock synchronization service) in order to pre-synchronize your system clocks as closely as possible. Bigger differences are not only annoying, but may lead some people to wrong conclusions and therefore even to bad human decisions! \end_layout \begin_layout Standard In a professional datacenter, you should use \family typewriter ntp \family default anyway, and you should also monitor its effectiveness. \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset Hint: many internal logfiles produced by the MARS kernel module contain Lamport timestamps written as numerical values. In order to convert them into human-readable form, use the command \family typewriter marsadm cat /mars/5.total.status \family default or similar.
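\end_layout

\begin_layout Standard
For ad-hoc scripting, a numerical timestamp can also be converted by hand. The following sketch assumes that a timestamp is written as seconds since the Unix epoch, followed by a dot and a fractional part of up to nine digits; this textual format is an assumption made for the illustration and is not confirmed by this manual, and format_stamp is a hypothetical helper, not part of marsadm.

```python
from datetime import datetime, timezone


def format_stamp(stamp):
    """Convert a 'seconds.nanoseconds' string into a readable UTC time."""
    sec, _, frac = stamp.partition(".")
    nsec = (frac + "000000000")[:9]  # pad the fraction to nanoseconds
    t = datetime.fromtimestamp(int(sec), tz=timezone.utc)
    return t.strftime("%Y-%m-%d %H:%M:%S") + "." + nsec + " UTC"
```

The result is always expressed in UTC, so it can be compared across nodes regardless of their local timezone settings.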
\end_layout \begin_layout Section The Symlink Tree \begin_inset CommandInset label LatexCommand label name "sec:The-Symlink-Tree" \end_inset \end_layout \begin_layout Standard The \family typewriter /mars/ \family default filesystem not only contains transaction logfiles, but also acts as generic storage for (persistent) state information. Both configuration information and runtime state information are stored in symlinks. Symlinks are \begin_inset Quotes eld \end_inset misused \begin_inset Foot status open \begin_layout Plain Layout This means that the symlink targets need not be other files or directories, but may be arbitrary values such as integers or strings. \end_layout \end_inset \begin_inset Quotes erd \end_inset in order to represent some \family typewriter key -> value \family default pairs. \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset This results in fundamentally different behaviour from DRBD. When your DRBD primary has crashed and now comes up again, you have to set up DRBD again by a sequence of commands like \family typewriter modprobe drbd; drbdadm up all; drbdadm primary all \family default or similar. In contrast, MARS needs only \family typewriter modprobe mars \family default (after \family typewriter /mars/ \family default has been mounted by \family typewriter /etc/fstab \family default ). The \emph on persistence \emph default of the symlinks residing in \family typewriter /mars/ \family default will automatically remember your previous state, even if some of your resources were primary while others were secondary (mixed operations). You don't need to take any action in order to \begin_inset Quotes eld \end_inset restore \begin_inset Quotes erd \end_inset a previous state, no matter how \begin_inset Quotes eld \end_inset complex \begin_inset Quotes erd \end_inset it was.
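\end_layout

\begin_layout Standard
The key -> value representation mentioned above relies on the fact that a symlink target is just a string and need not name an existing file. A minimal sketch in Python, run in a scratch directory rather than in /mars/; the helper names set_link and get_link are hypothetical and do not describe how marsadm works internally (in particular, no Lamport timestamps are set here):

```python
import os
import tempfile


def set_link(value, path):
    """Store value as the target of the symlink at path (create or replace)."""
    tmp = path + ".tmp"
    os.symlink(value, tmp)  # the target is an arbitrary string
    os.replace(tmp, path)   # atomically rename over any previous version


def get_link(path):
    """Read the stored value back without following the link."""
    return os.readlink(path)


scratch = tempfile.mkdtemp()
key = os.path.join(scratch, "mykey-A")

set_link("myvalue", key)
first = get_link(key)

set_link("some-other-value", key)
second = get_link(key)
```

Because the value lives in the directory entry itself, it survives a reboot together with the filesystem, which is the persistence property the text relies on.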
\end_layout \begin_layout Standard (Almost) all symlinks appearing in the \family typewriter /mars/ \family default directory tree are automatically replicated throughout the whole cluster. Thus the \family typewriter /mars/ \family default directory forms some kind of \emph on global namespace \emph default . \end_layout \begin_layout Standard Since the symlink replication works generically, you may use the \family typewriter /mars/userspace/ \family default directory in order to place your own symlinks there (for whatever purpose, which need not have to do with MARS). \end_layout \begin_layout Standard In order to avoid name clashes, each symlink created at node A should have the name A in its path name. Typically, internal MARS names follow the scheme \family typewriter /mars/ \emph on something \emph default /myname-A \family default , and you should follow the best practice of systematically using \family typewriter /mars/userspace/myname-A \family default or similar. As a result, each node will automatically get informed about the state at any other node, such as B, when the corresponding information is recorded on node B under the name \family typewriter /mars/userspace/myname-B \family default (context-dependent names). \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset Important: the convention of placing the \series bold creator host name \series default inside your symlink names should be used wherever possible. The name part is a kind of \begin_inset Quotes eld \end_inset ownership indicator \begin_inset Quotes erd \end_inset . It is crucial that no other host writes any symlink not \begin_inset Quotes eld \end_inset belonging \begin_inset Quotes erd \end_inset to it. Other hosts may read foreign symlinks as often as they want, but never modify them.
This way, your cluster nodes are able to \emph on communicate \emph default with each other via symlink updates. \end_layout \begin_layout Standard Although you may create (and change) your symlinks with userspace tools like \family typewriter ln -s \family default , you should use the following marsadm commands instead: \end_layout \begin_layout Itemize \family typewriter marsadm set-link myvalue /mars/userspace/mykey-A \end_layout \begin_layout Itemize \family typewriter marsadm delete-file /mars/userspace/mykey-A \end_layout \begin_layout Standard There are two reasons for this: first, the \family typewriter marsadm set-link \family default command will automatically use the Lamport clock for symlink creation, and therefore will avoid any errors resulting from a \begin_inset Quotes eld \end_inset wrong \begin_inset Quotes erd \end_inset system clock (as in \family typewriter ln -s \family default ). Second, the \family typewriter marsadm delete-file \family default (which also deletes symlinks) works on the \emph on whole cluster \emph default . \end_layout \begin_layout Standard What's the difference? If you try to remove your symlink locally by hand via \family typewriter rm -f \family default , you will be surprised: since the symlink has been replicated to other cluster nodes, it will be re-transferred from there and will be resurrected locally after some short time. This way, you cannot delete any object reliably, because your whole cluster (which may consist of many nodes) remembers all your state information and will resurrect it whenever \begin_inset Quotes eld \end_inset necessary \begin_inset Quotes erd \end_inset . \end_layout \begin_layout Standard In order to solve the deletion problem, MARS Light uses some internal deletion protocol using auxiliary symlinks residing in \family typewriter /mars/todo-global/. 
\family default The deletion protocol ensures that all replicas get deleted in the whole cluster, and only after that the auxiliary symlinks in \family typewriter /mars/todo-global/ \family default are also deleted eventually. \end_layout \begin_layout Standard You may change your already existing symlink via \family typewriter marsadm set-link some-other-value /mars/userspace/mykey-A \family default . The new value will be propagated in the cluster according to a \series bold timestamp comparison protocol \series default : whenever node B notices that A has a \emph on newer \emph default version of some symlink (according to the Lamport timestamp), it will replace its older version by the newer one. The opposite does \emph on not \emph default work: if B notices that A has an older version, nothing happens. This way, the timestamps of symlinks can only progress in forward direction, but never backwards in time. \end_layout \begin_layout Standard As a consequence, symlink updates made \begin_inset Quotes eld \end_inset by hand \begin_inset Quotes erd \end_inset via \family typewriter ln -s \family default may get lost when the local system clock is much earlier than the Lamport clock. \end_layout \begin_layout Standard When your cluster is fully connected by the network, the last timestamp will finally win everywhere. Only in case of network outages leading to \emph on network partitions \emph default , some information may be \emph on temporarily inconsistent \emph default , but only for the duration of the network outage. The timestamp comparison protocol in combination with the Lamport clock and with the persistence of the \family typewriter /mars/ \family default filesystem will automatically heal any temporary inconsistencies as soon as possible, even in case of temporary node shutdown.
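\end_layout

\begin_layout Standard
The timestamp comparison protocol described above can be sketched as a merge of two symlink tables. This is a simplified illustration, not the MARS implementation: each node is modelled as a dictionary from path to a pair of Lamport timestamp and value, the function name merge is hypothetical, the timestamps are small made-up integers, and the deletion protocol is not modelled.

```python
def merge(local, remote):
    """Adopt every remote entry that carries a strictly newer timestamp.

    Both arguments map symlink path -> (lamport_stamp, value).
    Older or equally old remote entries are ignored.
    """
    for path, (stamp, value) in remote.items():
        if path not in local or stamp > local[path][0]:
            local[path] = (stamp, value)
    return local


node_a = {"/mars/userspace/mykey-A": (20, "some-other-value")}
node_b = {
    "/mars/userspace/mykey-A": (10, "myvalue"),
    "/mars/userspace/mykey-B": (15, "b-state"),
}

merge(node_b, node_a)  # B notices A's newer version and replaces its own
merge(node_a, node_b)  # A ignores B's older copy, but learns about mykey-B
```

After both exchanges the two tables are identical, which mirrors the statement that the last timestamp finally wins everywhere once the nodes can communicate. Exchanging the tables again in any order changes nothing.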
\end_layout \begin_layout Standard The meaning of the internal MARS Light symlinks residing in \family typewriter /mars/ \family default is documented in section \begin_inset CommandInset ref LatexCommand ref reference "sec:Documentation-of-the" \end_inset . \end_layout \begin_layout Section Defending Overflow of \family typewriter /mars/ \begin_inset CommandInset label LatexCommand label name "sec:Defending-Overflow" \end_inset \end_layout \begin_layout Standard This section describes an important difference to DRBD. The metadata of DRBD is allocated \emph on statically \emph default at \emph on creation \emph default \emph on time \emph default of the resource. In contrast, the MARS transaction logfiles are allocated \emph on dynamically \emph default at \emph on runtime \emph default . \end_layout \begin_layout Standard This leads to a potential risk from the perspective of a sysadmin: what happens if the \family typewriter /mars/ \family default filesystem runs out of space? \end_layout \begin_layout Standard No risk, no fun. If you want a system which survives long-lasting network outages while keeping your replicas always consistent (anytime consistency), you \emph on need \emph default dynamic memory for that. It is \emph on impossible \emph default to solve that problem using static memory \begin_inset Foot status open \begin_layout Plain Layout The bitmaps used by DRBD don't preserve the \emph on order \emph default of write operations. They cannot do that, because their space is \begin_inset Formula $O(k)$ \end_inset for some constant \begin_inset Formula $k$ \end_inset . In contrast, MARS preserves the order. Preserving the order as such (even when only \emph on facts \emph default about the order were recorded without recording the actual data contents) requires \begin_inset Formula $O(n)$ \end_inset space where \begin_inset Formula $n$ \end_inset is infinitely growing over time. \end_layout \end_inset . 
\end_layout \begin_layout Standard Therefore, DRBD and MARS have different application areas. If you just want a simple system for mirroring your data over short distances like a crossover cable, DRBD will be a suitable choice. However, if you need to replicate over longer distances, or if you need higher levels of reliability even when multiple failures may accumulate (such as network loss during a \emph on re \emph default sync of DRBD), the transaction logs of MARS can solve that, but at some \emph on cost \emph default . \end_layout \begin_layout Subsection Countermeasures \end_layout \begin_layout Subsubsection Dimensioning of \family typewriter /mars/ \begin_inset CommandInset label LatexCommand label name "sub:Dimensioning-of-/mars/" \end_inset \end_layout \begin_layout Standard The first (and most important) measure against overflow of \family typewriter /mars/ \family default is simply to dimension it large enough to survive longer-lasting problems, at least one weekend. \end_layout \begin_layout Standard The recommended size is at least one dedicated disk, attached to a hardware RAID controller with BBU (see section \begin_inset CommandInset ref LatexCommand ref reference "sec:Preparation:-What-you" \end_inset ). During normal operation, only a small fraction of that size is needed, typically a few percent or even less than one percent. However, it is your \series bold safety margin \series default . Keep it large enough! \end_layout \begin_layout Subsubsection Monitoring \end_layout \begin_layout Standard The next (equally important) measure is \series bold monitoring in userspace \series default .
\end_layout \begin_layout Standard Following is a list of countermeasures both in userspace and in kernelspace, in the order of \begin_inset Quotes eld \end_inset defensive walling \begin_inset Quotes erd \end_inset : \end_layout \begin_layout Enumerate Regular userspace monitoring must throw an INFO if a certain freespace limit \begin_inset Formula $l_{1}$ \end_inset of \family typewriter /mars/ \family default is undershot. Typical values for \begin_inset Formula $l_{1}$ \end_inset are 30%. Typical actions are automated calls of \family typewriter marsadm log-rotate all \family default followed by \family typewriter marsadm log-delete-all all \family default . You have to implement that yourself in sysadmin space. \end_layout \begin_layout Enumerate Regular userspace monitoring must throw a WARNING if a certain freespace limit \begin_inset Formula $l_{2}$ \end_inset of \family typewriter /mars/ \family default is undershot. Typical values for \begin_inset Formula $l_{2}$ \end_inset are 20%. Typical actions are (in addition to \family typewriter log-rotate \family default and \family typewriter log-delete-all \family default ) alarming human supervisors via SMS and/or further stronger automated actions. \begin_inset Newline newline \end_inset \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset Frequently large space is occupied by files stemming from debugging output, or from other programs or processes. A hot candidate is \begin_inset Quotes eld \end_inset forgotten \begin_inset Quotes erd \end_inset removal of debugging output to \family typewriter /mars/ \family default . Sometimes, an \family typewriter rm -rf $(find /mars/ -name \begin_inset Quotes eld \end_inset *.log \begin_inset Quotes erd \end_inset ) \family default can work miracles. 
\begin_inset Newline newline \end_inset \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset Another source of space hogging is a \begin_inset Quotes eld \end_inset forgotten \begin_inset Quotes erd \end_inset \family typewriter pause-sync \family default or \family typewriter disconnect \family default . Therefore, a simple \family typewriter marsadm connect-global all \family default followed by \family typewriter marsadm resume-replay-global all \family default may also work miracles (if you didn't want to freeze some mirror deliberately). \begin_inset Newline newline \end_inset \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset If you just wanted to freeze a mirror at an outdated state for a very long time, you simply \emph on cannot \emph default do that without causing unbounded growth of space consumption in \family typewriter /mars/ \family default . Therefore, a \family typewriter marsadm leave-resource $res \family default at \emph on exactly that(!) \emph default secondary site where the mirror is frozen, can also work miracles. If you want to automate this in userspace, be careful. It is easy to get unintended effects when choosing the wrong site for \family typewriter leave-resource \family default . \begin_inset Newline newline \end_inset \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset Hint: you can / should start some of these measures already at the INFO level (see item 1), or even earlier. \end_layout \begin_layout Enumerate Regular userspace monitoring must throw an ERROR if a certain freespace limit \begin_inset Formula $l_{3}$ \end_inset of \family typewriter /mars/ \family default is undershot. Typical values for \begin_inset Formula $l_{3}$ \end_inset are 10%.
Typical actions are alarming the CEO via SMS and/or even stronger automated actions. For example, you may choose to automatically call \family typewriter marsadm leave-resource $res \family default on some or all secondary nodes, such that the primary will be left alone and now has a chance to really delete its logfiles because no other node can need them any longer. \end_layout \begin_layout Enumerate First-level kernelspace action, automatically executed when \family typewriter \begin_inset Flex URL status open \begin_layout Plain Layout /proc/sys/mars/required_free_space_4_gb \end_layout \end_inset \family default + \family typewriter \begin_inset Flex URL status open \begin_layout Plain Layout /proc/sys/mars/required_free_space_3_gb \end_layout \end_inset \family default + \family typewriter \begin_inset Flex URL status open \begin_layout Plain Layout /proc/sys/mars/required_free_space_2_gb \end_layout \end_inset \family default + \family typewriter \begin_inset Flex URL status open \begin_layout Plain Layout /proc/sys/mars/required_free_space_1_gb \end_layout \end_inset \family default is undershot: \begin_inset Newline newline \end_inset all locally secondary resources will stop fetching transaction logfiles. As a side effect, other nodes in the cluster may also become unable to delete their logfiles. This is a desperate action of the kernel module.
\end_layout \begin_layout Enumerate Second-level kernelspace action, automatically executed when \family typewriter \begin_inset Flex URL status open \begin_layout Plain Layout /proc/sys/mars/required_free_space_3_gb \end_layout \end_inset \family default + \family typewriter \begin_inset Flex URL status open \begin_layout Plain Layout /proc/sys/mars/required_free_space_2_gb \end_layout \end_inset \family default + \family typewriter \begin_inset Flex URL status open \begin_layout Plain Layout /proc/sys/mars/required_free_space_1_gb \end_layout \end_inset \family default is undershot: \begin_inset Newline newline \end_inset all locally secondary resources will start removing any logfiles which are no longer used locally. This is a more desperate action of the kernel module. \end_layout \begin_layout Enumerate Third-level kernelspace action, automatically executed when \family typewriter \begin_inset Flex URL status open \begin_layout Plain Layout /proc/sys/mars/required_free_space_2_gb \end_layout \end_inset \family default + \family typewriter \begin_inset Flex URL status open \begin_layout Plain Layout /proc/sys/mars/required_free_space_1_gb \end_layout \end_inset \family default is undershot: \begin_inset Newline newline \end_inset all locally primary resources are checked for logfiles which are no longer needed \emph on locally \emph default . Locally unneeded files are deleted even when some secondary needs them. As a consequence, some secondaries may get stuck (left in a consistent, but outdated state). In order to bring them up to date again, they will need a \family typewriter marsadm invalidate \family default later. This is an even more desperate action of the kernel module. You don't want to get there (except for testing).
\end_layout \begin_layout Enumerate Last desperate kernelspace action when all else has failed and \family typewriter \begin_inset Flex URL status open \begin_layout Plain Layout /proc/sys/mars/required_free_space_1_gb \end_layout \end_inset \family default is undershot: \begin_inset Newline newline \end_inset all locally primary resources will enter \series bold emergency mode \series default (see description below in section \begin_inset CommandInset ref LatexCommand ref reference "sub:Emergency-Mode" \end_inset ). This is the most desperate action of the kernel module. You don't want to get there (except for testing). \end_layout \begin_layout Standard In addition, the kernel module obeys a general global limit \family typewriter \begin_inset Flex URL status open \begin_layout Plain Layout /proc/sys/mars/required_total_space_0_gb \end_layout \end_inset + \family default the sum of all of the above limits. When the \emph on total size \emph default of \family typewriter /mars/ \family default falls below that sum, the kernel module refuses to start at all, because it assumes that it is senseless to try to operate MARS on a system with so little storage space. \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset The current level of emergency kernel actions may be viewed at any time via \family typewriter \begin_inset Flex URL status collapsed \begin_layout Plain Layout /proc/sys/mars/mars_emergency_mode \end_layout \end_inset \family default . \end_layout \begin_layout Subsubsection Throttling \end_layout \begin_layout Standard The last line of defense against overflow is \series bold throttling your performance pigs \series default . \end_layout \begin_layout Standard Motivation: in rare cases, some users with \family typewriter ssh \family default access can do \emph on very \emph default silly things. 
For example, some of them are creating their own backups via user-cron jobs, and they do it every 5 minutes. Some example guy created a zip archive (almost 1GB) by regularly copying his old zip archive into a new one, then appending deltas to the new one, and finally deleting the old archive. Every 5 minutes. Yes, every 5 minutes, although almost never any new files were added to the archive. Essentially, he copied over his archive, for nothing. This led to massive bulk write requests, for ridiculous reasons. \end_layout \begin_layout Standard In general, your hard disks (or even RAID systems) allow much higher write IO rates than you can ever transport over a standard TCP network from your primary site to your secondary, at least over longer distances (see use cases for MARS in chapter \begin_inset CommandInset ref LatexCommand ref reference "chap:Use-Cases-for" \end_inset ). Therefore, it is easy to create such a high write load that it will be \emph on impossible \emph default to replicate it over the network, \emph on by construction \emph default . \end_layout \begin_layout Standard Therefore, we \emph on need \emph default some mechanism for throttling bulk writers whenever the network is weaker than your IO subsystem. \end_layout \begin_layout Standard \noindent \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset Notice that DRBD will \emph on always \emph default throttle your writes whenever the network forms a bottleneck, due to its synchronous operation mode. In contrast, MARS allows for buffering of performance peaks in the transaction logfiles. \emph on Only when \emph default your buffer in \family typewriter /mars/ \family default runs short (cf. subsection \begin_inset CommandInset ref LatexCommand ref reference "sub:Dimensioning-of-/mars/" \end_inset ), MARS will start to throttle your application writes. 
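\end_layout \begin_layout Standard As a rough back-of-the-envelope illustration of this buffering (a simplified model which is not taken from the MARS implementation; all symbols and numbers are assumptions chosen for illustration): let \begin_inset Formula $B$ \end_inset be the space in \family typewriter /mars/ \family default which may fill up before throttling starts, \begin_inset Formula $w$ \end_inset a sustained application write rate, and \begin_inset Formula $n<w$ \end_inset the sustained rate at which logfile data can be shipped to the secondaries and deleted afterwards. Then the write peak can be absorbed for approximately \begin_inset Formula \[ t_{\mathrm{fill}}\approx\frac{B}{w-n}\qquad(w>n). \] \end_inset With hypothetical values of \begin_inset Formula $B=100$ \end_inset GB, \begin_inset Formula $w=50$ \end_inset MB/s and \begin_inset Formula $n=10$ \end_inset MB/s (taking 1 GB = 1000 MB), this gives 100000 MB / (40 MB/s) = 2500 s, i.e. roughly 42 minutes. The estimate assumes constant rates and ignores any overhead of the logfile format. 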
\end_layout \begin_layout Standard There are several tunables named \family typewriter /proc/sys/mars/write_throttle_* \family default with the following meaning: \end_layout \begin_layout Description \family typewriter write_throttle_start_percent \family default Whenever the used space in \family typewriter /mars/ \family default is below this threshold, no throttling will occur at all. Only when this threshold is exceeded, throttling will start \emph on slowly \emph default . A typical value is 60%. \end_layout \begin_layout Description \family typewriter write_throttle_end_percent \family default Maximum throttling will occur once this space threshold is reached, i.e. the throttling is now at its maximum effect. A typical value is 90%. When the actual used space in \family typewriter /mars/ \family default lies between \family typewriter write_throttle_start_percent \family default and \family typewriter write_throttle_end_percent \family default , the strength of throttling will be interpolated linearly between the extremes. In practice, this should lead to an equilibrium between new input flow into \family typewriter /mars/ \family default and output flow over the network to secondaries. \end_layout \begin_layout Description \family typewriter write_throttle_size_threshold_kb \family default (readonly) This parameter shows the internal strength calculation of the throttling. Only write \begin_inset Foot status open \begin_layout Plain Layout Read requests are never throttled at all. \end_layout \end_inset requests exceeding this size (in KB) are throttled at all. Typically, this will hurt the bulk performance pigs first, while leaving ordinary users (issuing small requests) unaffected. \end_layout \begin_layout Description \family typewriter write_throttle_ratelimit_kb \family default Set the global IO rate in KB/s for those write requests which are throttled. 
In case of strongest \begin_inset Foot status open \begin_layout Plain Layout In case of lighter throttling, the input flow into \family typewriter /mars/ \family default may be higher because small requests are not throttled. \end_layout \end_inset throttling, this parameter determines the input flow into \family typewriter /mars/ \family default . The default value is 5000 KB/s. Please adjust this value to your application needs and to your environment. \end_layout \begin_layout Description \family typewriter write_throttle_rate_kb \family default (readonly) Shows the current rate of exactly those requests which are actually throttled (in contrast to \emph on all \emph default requests). \end_layout \begin_layout Description \family typewriter write_throttle_cumul_kb \family default (logically readonly) Same as before, but the cumulative sum of all throttled requests since startup / reset. This value can be reset from userspace in order to prevent integer overflow. \end_layout \begin_layout Description \family typewriter write_throttle_count_ops \family default (logically readonly) Shows the cumulative number of throttled requests. This value can be reset from userspace in order to prevent integer overflow. \end_layout \begin_layout Description \family typewriter write_throttle_maxdelay_ms \family default Each request is delayed at most for this timespan. Smaller values will improve the responsiveness of your userspace application, but at the cost of potentially not delaying the requests sufficiently. \end_layout \begin_layout Description \family typewriter write_throttle_minwindow_ms \family default Set the minimum length of the measuring window. The measuring window is the timespan for which the average (throughput) rate is computed (see \family typewriter write_throttle_rate_kb \family default ). Lower values can increase the responsiveness of the controller algorithm, but at the cost of accuracy. 
\end_layout \begin_layout Description \family typewriter write_throttle_maxwindow_ms \family default This parameter must be set substantially greater than \family typewriter write_throttle_minwindow_ms \family default . In case the flow of throttled operations pauses for some natural reason (e.g. switched off, low load, etc), this parameter determines when a completely new rate calculation should be started over \begin_inset Foot status open \begin_layout Plain Layout Motivation: if requests would pause for one hour, the measuring window could also grow to an hour. Of course, that would lead to completely meaningless results. Two requests in one hour is \begin_inset Quotes eld \end_inset incorrect \begin_inset Quotes erd \end_inset from a human point of view: we just have to ensure that averages are computed with respect to a reasonable maximum time window on the order of 10s. \end_layout \end_inset . \end_layout \begin_layout Subsection Emergency Mode \begin_inset CommandInset label LatexCommand label name "sub:Emergency-Mode" \end_inset \end_layout \begin_layout Standard When \family typewriter /mars/ \family default is almost full and there is really absolutely no chance of getting rid of any local transaction logfile (or of freeing some space in any other way), there is only one exit strategy: stop creating new logfile data. \end_layout \begin_layout Standard This means that the ability for replication gets lost. \end_layout \begin_layout Standard When entering emergency mode, the kernel module will execute the following steps for all resources where the affected host is acting as a primary: \end_layout \begin_layout Enumerate Do a kind of \begin_inset Quotes eld \end_inset logrotate \begin_inset Quotes erd \end_inset , but create a \emph on hole \emph default in the sequence of transaction logfile numbers. The \begin_inset Quotes eld \end_inset new \begin_inset Quotes erd \end_inset logfile is left empty, i.e. no data is written to it (for now). 
The hole in the numbering will prevent any secondaries from applying any logfiles behind the hole (should they ever contain some data, e.g. because the emergency mode has been left again). This works because the secondaries are regularly checking the logfile numbers for contiguity, and they will refuse to apply anything which is not contiguous. As a result, the secondaries will be left in a consistent, but outdated state. \end_layout \begin_layout Enumerate The kernel module writes back all data present in the temporary memory buffer (see figure in section \begin_inset CommandInset ref LatexCommand ref reference "sec:The-Transaction-Logger" \end_inset ). This may lead to a (short) delay of user write requests until that has finished (typically fractions of a second or a few seconds). The reason is that the temporary memory buffer must not be increased in parallel during this phase (race conditions). \end_layout \begin_layout Enumerate After the temporary memory buffer is empty, all local IO requests (whether reads or writes) are directly going to the underlying disk. This has the same effect as if MARS were not present anymore. \end_layout \begin_layout Standard In order to leave emergency mode, the sysadmin should do the following steps: \end_layout \begin_layout Enumerate Free enough space. For example, delete any foreign files on \family typewriter /mars/ \family default which have nothing to do with MARS, or resize the \family typewriter /mars/ \family default filesystem, or whatever. \end_layout \begin_layout Enumerate If \family typewriter \begin_inset Flex URL status collapsed \begin_layout Plain Layout /proc/sys/mars/mars_reset_emergency \end_layout \end_inset \family default is not set, now it is time to set it. Normally, it should be already set. In consequence, the primary sides should continue transaction logging automatically. 
\end_layout \begin_layout Enumerate On the secondaries, use \family typewriter marsadm invalidate $res \family default in order to get your outdated mirrors up to date. This will lead to temporarily inconsistent mirrors, so don't do this on all secondaries in parallel, but sequentially step by step. This way, if you have more than 1 mirror, you will always retain at least one consistent, but outdated copy. \begin_inset Newline newline \end_inset \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset If you had only 1 mirror per resource before the overflow happened, you can now create a new one via \family typewriter marsadm join-resource $res \family default on a third node (provided that your storage space permits that after the cleanup). After the initial full sync has finished there, do a \family typewriter marsadm invalidate $res \family default on the outdated mirror. This way, you will always retain at least one consistent mirror somewhere. After all is up to date, you can delete the superfluous mirror by \family typewriter marsadm leave-resource $res \family default and reclaim the disk space from its underlying disk. \end_layout \begin_layout Chapter The Sysadmin Interface \family typewriter marsadm \begin_inset CommandInset label LatexCommand label name "chap:The-Sysadmin-Interface" \end_inset \end_layout \begin_layout Standard In general, the term \begin_inset Quotes eld \end_inset after a while \begin_inset Quotes erd \end_inset means that other cluster nodes will take notice of your actions according to the \begin_inset Quotes eld \end_inset eventually consistent \begin_inset Quotes erd \end_inset propagation protocol described in sections \begin_inset CommandInset ref LatexCommand ref reference "sec:The-Lamport-Clock" \end_inset and \begin_inset CommandInset ref LatexCommand ref reference "sec:The-Symlink-Tree" \end_inset . 
Please be aware that this \begin_inset Quotes eld \end_inset while \begin_inset Quotes erd \end_inset may last very long in case of network outages or bad firewall rules. \end_layout \begin_layout Standard In the following tables, column \begin_inset Quotes eld \end_inset Cmp \begin_inset Quotes erd \end_inset means compatibility with DRBD. Please note that 100% exact compatibility is not possible, because of the asynchronous communication paradigm. \end_layout \begin_layout Standard The following table documents common options which work with (almost) any command: \end_layout \begin_layout Standard \size scriptsize \begin_inset Tabular \begin_inset Text \begin_layout Plain Layout \size scriptsize Option \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Cmp \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Description \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize --dry-run \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Run the command without actually creating symlinks or touching files or executing rsync. This option \emph on should \emph default be used first at any dangerous command, in order to check what would happen. 
\end_layout \begin_layout Plain Layout \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset Don't use in scripts! Only use by hand! \end_layout \begin_layout Plain Layout \size scriptsize This option does not change the waiting logic. Many commands are waiting until the desired effect has taken place. However, with \family typewriter --dry-run \family default the desired effect will never happen, so the command may wait forever (or abort with a timeout). \end_layout \begin_layout Plain Layout \size scriptsize In addition, this option can lead to additional aborts of the commands due to unmet conditions, which cannot be met because the symlinks are not actually created / altered. \end_layout \begin_layout Plain Layout \size scriptsize Thus this option can give only a \series bold rough estimate \series default of what would happen later! \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize --force \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize almost \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Some preconditions are skipped, i.e. the command will / should work although some (more or less) vital preconditions are violated. 
\end_layout \begin_layout Plain Layout \size scriptsize Instead of giving \family typewriter --force \family default , you may alternatively prefix your command with \family typewriter force- \end_layout \begin_layout Plain Layout \begin_inset Graphics filename images/MatieresToxiques.png lyxscale 50 scale 17 \end_inset THIS OPTION IS DANGEROUS! \end_layout \begin_layout Plain Layout \size scriptsize Use it only when you are absolutely sure that you know what you are doing! \end_layout \begin_layout Plain Layout \size scriptsize Use it only as a last resort if the same command without \family typewriter --force \family default has failed \emph on for no good reason \emph default ! \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize --timeout=$seconds \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Some commands require response from either the local kernel module, or from other cluster nodes. In order to prevent infinite waiting in case of network outages or other problems, the command will fail after the given timeout has been reached. \end_layout \begin_layout Plain Layout \size scriptsize When $seconds is -1, the command will wait forever. 
\end_layout \begin_layout Plain Layout \size scriptsize When $seconds is 0, the command will not wait in case any precondition is not met, and abort without performing an action. \end_layout \begin_layout Plain Layout \size scriptsize The default timeout is 5s. \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize --host=$host \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize The command acts as if it were executed on another host $host. This option should not be used regularly, because the local information in the symlink tree may be outdated or even wrong. Additionally, some local information like remote sizes of physical devices (e.g. remote disks) is not present in the symlink tree at all, or is wrong (reflecting only the \emph on local \emph default state). \end_layout \begin_layout Plain Layout \begin_inset Graphics filename images/MatieresToxiques.png lyxscale 50 scale 17 \end_inset THIS OPTION IS DANGEROUS! \end_layout \begin_layout Plain Layout \size scriptsize Use it only for final destruction of dead cluster nodes, see section \begin_inset CommandInset ref LatexCommand ref reference "sub:Final-Destroy-of" \end_inset . 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize --ip=$ip \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize By default, \family typewriter marsadm \family default always uses the IP for \family typewriter $host \family default as stored in the symlink tree (directory \family typewriter /mars/ips/ \family default ). When such an IP entry does not (yet) exist (e.g. \family typewriter create-cluster \family default or \family typewriter join-cluster \family default ), all local network interfaces are automatically scanned for IPv4 addresses, and the first one is taken. This may lead to wrong decisions if you have multiple network interfaces. \end_layout \begin_layout Plain Layout \size scriptsize In order to override the automatic IP detection and to explicitly tell the IP address of your storage network, use this option. \end_layout \begin_layout Plain Layout \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset \size scriptsize Usually you will need this only at \family typewriter {create,join}-cluster \family default . 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize --verbose \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Some (few) commands will produce more verbose output. \end_layout \end_inset \end_layout \end_inset \end_inset \end_layout \begin_layout Section Cluster Operations \begin_inset CommandInset label LatexCommand label name "sec:Cluster-Operations" \end_inset \end_layout \begin_layout Standard \size scriptsize \begin_inset Tabular \begin_inset Text \begin_layout Plain Layout \size scriptsize Command / Params \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Cmp \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Description \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize create-cluster \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box 
Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: the \family typewriter /mars/ \family default filesystem must be mounted and it must be empty. The kernel module must not be loaded. \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: the initial symlink tree is created in \family typewriter /mars/ \family default . Additionally, the \family typewriter /mars/uuid \family default symlink is created for later distribution in the cluster. It uniquely identifies the cluster in the world. \end_layout \begin_layout Plain Layout \size scriptsize This must be called exactly once at the initial primary. \end_layout \begin_layout Plain Layout Hint: use the \family typewriter --ip= \family default option if you have multiple interfaces. \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize join-cluster \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $host \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: the \family typewriter /mars/ \family default filesystem 
must be mounted and it must be empty. The kernel module must not be loaded. The cluster must have been already created at another node \family typewriter $host \family default . A working ssh connection to $host must exist (without password). \family typewriter rsync \family default must be installed at all cluster nodes. \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: the initial symlink tree \family typewriter /mars/ \family default is replicated from the remote host \family typewriter $host \family default , and the local host has been added as another cluster member. \end_layout \begin_layout Plain Layout \size scriptsize This must be called exactly once at every initial secondary. \end_layout \begin_layout Plain Layout Hint: use the \family typewriter --ip= \family default option if you have multiple interfaces. \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize leave-cluster \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: the \family typewriter /mars/ \family default filesystem must be mounted and it must contain a valid MARS symlink tree produced by the other \family typewriter marsadm \family default commands. The kernel module must be loaded. 
The local node must no longer be a member of any resource (see \family typewriter marsadm leave-resource \family default ). \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: the local node is removed from the replicated symlink tree \family typewriter /mars/ \family default such that other nodes will cease to communicate with it after a while. The local \family typewriter /mars/ \family default filesystem may be finally destroyed. \end_layout \begin_layout Plain Layout \size scriptsize In case of a node loss (e.g. fire, water, ...) this may be used on another node $helper in order to finally remove $damaged from the cluster via the command \family typewriter marsadm leave-cluster --host=$damaged --force \family default . \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize wait-cluster \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize See section \begin_inset CommandInset ref LatexCommand ref reference "sub:Waiting" \end_inset . 
\end_layout \end_inset \end_layout \end_inset \end_inset \end_layout \begin_layout Section Resource Operations \begin_inset CommandInset label LatexCommand label name "sec:Resource-Operations" \end_inset \end_layout \begin_layout Standard Common precondition for all resource operations is that the \family typewriter /mars/ \family default filesystem is mounted, that it contains a valid MARS symlink tree produced by other \family typewriter marsadm \family default commands, that your current node is a member of the cluster, and that the kernel module is loaded. When communication is impossible due to network outages or bad firewall rules, most commands will succeed, but other cluster nodes may take a long time to notice your changes. \end_layout \begin_layout Subsection Resource Creation / Deletion / Modification \begin_inset CommandInset label LatexCommand label name "sub:Resource-Creation" \end_inset \end_layout \begin_layout Standard \size scriptsize \begin_inset Tabular \begin_inset Text \begin_layout Plain Layout \size scriptsize Command / Params \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Cmp \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Description \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize create-resource \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $disk_dev \begin_inset Newline newline \end_inset 
\begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset [$mars_name] \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset [$size] \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: the resource argument \family typewriter $res \family default must not denote an already existing resource in the cluster. The argument \family typewriter $disk_dev \family default must denote a usable local block device whose size is greater than zero. When the optional \family typewriter $mars_name \family default is given, that name must not already exist on the local node; when not given, \family typewriter $mars_name \family default defaults to \family typewriter $res \family default . When the optional \family typewriter $size \family default argument is given, it must be a number, optionally followed by one of the suffixes \family typewriter k \family default , \family typewriter m \family default , \family typewriter g \family default , or \family typewriter t \family default (denoting size factors in powers of two). The given size must not exceed the actual size of \family typewriter $disk_dev \family default . \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: the resource \family typewriter $res \family default is created, the initial role of the current node is primary. The corresponding symlink tree information is asynchronously distributed in the cluster (in the background). 
The device \family typewriter /dev/mars/$mars_name \family default should appear after a while. \end_layout \begin_layout Plain Layout \size scriptsize Notice: when \family typewriter $size \family default is strictly smaller than the size of \family typewriter $disk_dev \family default , you will unnecessarily waste some space. \end_layout \begin_layout Plain Layout \size scriptsize This must be called exactly once for any new resource. \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize join-resource \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $disk_dev \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset [$mars_name] \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: the resource argument \family typewriter $res \family default must denote an already existing resource in the cluster (i.e. its symlink tree information must have been received). The resource must have a designated primary. 
The local node must not already be a member of that resource. The argument \family typewriter $disk_dev \family default must denote a usable local block device whose size is greater than or equal to the logical size of the resource. When the optional \family typewriter $mars_name \family default is given, that name must not already exist on the local node; when not given, \family typewriter $mars_name \family default defaults to \family typewriter $res \family default . \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: the current node becomes a member of resource \family typewriter $res \family default , the initial role is secondary. The initial full sync should start after a while. \end_layout \begin_layout Plain Layout \size scriptsize Notice: when the size of $disk_dev is strictly greater than the size of the resource, you will unnecessarily waste some space. \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize leave-resource \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: the local node must be a member of the resource \family typewriter $res \family default ; its current 
role must be secondary. The disk must be detached. \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: the local node is no longer a member of \family typewriter $res \family default . \end_layout \begin_layout Plain Layout \size scriptsize Notice: as a side effect for other nodes, their log-delete may now become possible, since the current node no longer counts as a candidate for logfile application. \end_layout \begin_layout Plain Layout \size scriptsize Also notice that this command \emph on may \emph default lead to (but does not guarantee) split-brain resolution. \end_layout \begin_layout Plain Layout \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset \size scriptsize The contents of the disk are not changed by this command. Before issuing this command, check whether the disk is locally consistent! After this command, any symlinks indicating the consistency state are gone, and you will no longer be able to guess consistency properties. \end_layout \begin_layout Plain Layout \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset \size scriptsize When you are \emph on sure \emph default that the disk was consistent before (or is consistent now, after checking it manually), you may re-create a new resource out of it via \family typewriter create-resource \family default . \end_layout \begin_layout Plain Layout \size scriptsize In case of a node loss (e.g. fire, water, ...) this command may be used on another node $helper in order to finally remove all the resources of $damaged from the cluster via the command \family typewriter marsadm leave-resource $res --host=$damaged --force \family default . 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize delete-resource \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: the resource must be empty (i.e. all members must have left via \family typewriter leave-resource \family default ). This precondition is overridable by \family typewriter --force \family default , increasing the danger to the maximum! \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: all cluster members will eventually be forcefully removed from \family typewriter $res \family default . In case of network interruptions, the forced removal may take place far in the future. \end_layout \begin_layout Plain Layout \begin_inset Graphics filename images/MatieresToxiques.png lyxscale 50 scale 17 \end_inset THIS COMMAND IS \emph on VERY \emph default DANGEROUS! \end_layout \begin_layout Plain Layout \size scriptsize Use this only in desperate situations. You are forcefully using a sledgehammer, even without \family typewriter --force \family default ! 
The danger is that the \emph on true \emph default state of other cluster nodes need not be known in case of network problems. Even if it were known, it could be compromised by \series bold byzantine failures \series default . \end_layout \begin_layout Plain Layout \size scriptsize It is strongly advised to try this command with \family typewriter --dry-run \family default first. \end_layout \begin_layout Plain Layout \size scriptsize When combined with \family typewriter --force \family default , this command will definitely \series bold murder \series default other cluster nodes, possibly after a long while, and even when they are operating in primary mode / having split brains / etc. However, there is no guarantee that other cluster nodes will be \emph on really \emph default dead - it is possible that they remain only \emph on half \emph default \emph on dead \emph default . For example, a half dead node may continue to write data to \family typewriter /mars/ \family default and thus eventually lead to an overflow. \end_layout \begin_layout Plain Layout \begin_inset Graphics filename images/MatieresToxiques.png lyxscale 50 scale 17 \end_inset This command implies a forceful detach, possibly destroying consistency. \size scriptsize In particular, when a cluster node was operating in primary mode ( \family typewriter /dev/mars/mydata \family default being continuously in use), the forceful detach cannot be carried out until the device is completely unused. In the meantime, the current transaction logfile will be appended to, but the file \emph on might \emph default be already unlinked (orphan file filling up the disk). After the forceful detach, the underlying disk need not be consistent (although we do our best). Since this command deletes any symlinks which normally would indicate the consistency state, no guarantees about consistency can be given after this \emph on in general \emph default ! Always check consistency by hand! 
\end_layout \begin_layout Plain Layout \size scriptsize When possible / as soon as possible, check the local state on the other nodes in order to \emph on really \emph default shut down the resource everywhere (e.g. to \emph on really \emph default unuse the \family typewriter /dev/mars/mydata \family default device, etc). \end_layout \begin_layout Plain Layout \size scriptsize After this command, you \emph on should \emph default rebuild the resource under a different name, in order to avoid any clashes caused by unexpected resurrection of \begin_inset Quotes eld \end_inset dead \begin_inset Quotes erd \end_inset or \begin_inset Quotes eld \end_inset half-dead \begin_inset Quotes erd \end_inset nodes (beware of snapshots / restores on virtual machines!). MARS Light does its best to avoid problems even in case the new resource name should equal the old one, but there can be \emph on no guarantee \emph default in all possible failure scenarios / usage scenarios. \end_layout \begin_layout Plain Layout \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset \size scriptsize When possible, prefer \family typewriter leave-resource \family default over this! 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize wait-resource \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset {is-,}{attach, \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset primary, \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset device}{-off,} \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize See section \begin_inset CommandInset ref LatexCommand ref reference "sub:Waiting" \end_inset . 
\end_layout \end_inset \end_layout \end_inset \end_inset \end_layout \begin_layout Subsection Operation of the Resource \begin_inset CommandInset label LatexCommand label name "sub:Operation-of-the" \end_inset \end_layout \begin_layout Standard Common preconditions are the preconditions from section \begin_inset CommandInset ref LatexCommand ref reference "sec:Resource-Operations" \end_inset , plus the respective resource \family typewriter $res \family default must exist, and the local node must be a member of it. With the single exception of \family typewriter attach \family default itself, all other operations must be started in \family typewriter attached \family default state. \end_layout \begin_layout Standard When \family typewriter $res \family default has the special reserved value \family typewriter all \family default , the following operations will work on all resources where the current node is a member (analogously to DRBD). \end_layout \begin_layout Standard \size scriptsize \begin_inset Tabular \begin_inset Text \begin_layout Plain Layout \size scriptsize Command / Params \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Cmp \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Description \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize attach \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize yes \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout 
\size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: the local disk belonging to $res is not in use by anyone else. \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: MARS uses the local disk and is able to work with it (e.g. apply logfiles to it). \end_layout \begin_layout Plain Layout \size scriptsize Note: the local disk is opened in exclusive read-write mode. This should protect against most common misuse, such as opening the disk in parallel to MARS. \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize detach \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize yes \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: the local host is in secondary role, \family typewriter pause-sync \family default and \family typewriter pause-replay \family default have been given. \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: the local disk belonging to $res is no longer in use. 
\end_layout \begin_layout Plain Layout \begin_inset Graphics filename images/MatieresToxiques.png lyxscale 50 scale 17 \end_inset \size scriptsize WARNING! After this, you might use the underlying disk for other purposes, such as test-mounting it in \emph on readonly \emph default mode. \series bold Don't modify \series default its contents in any way! Not even by an \family typewriter fsck \family default ! Otherwise, you will have inconsistencies \emph on guaranteed \emph default . MARS has no way of knowing about any modifications to your disk that were not written via \family typewriter /dev/mars/* \family default . \end_layout \begin_layout Plain Layout \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset \size scriptsize In case you accidentally modified the underlying disk at the \emph on primary \emph default side, you may choose to resolve the inconsistencies by \family typewriter marsadm invalidate $res \family default on \emph on each \emph default secondary. 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize pause-sync \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Equivalent to \family typewriter pause-sync-local \family default . 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize pause-sync-local \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: none additionally. \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: any sync operation targeting the local disk (when not yet completed) is paused after a while. When completed, this operation will remember the switch state forever and become relevant if a sync is needed again (e.g. \family typewriter invalidate \family default or \family typewriter resize \family default ). 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize pause-sync-global \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Like \family typewriter *-local \family default , but operates on all members of the resource. 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize resume-sync \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Equivalent to \family typewriter resume-sync-local \family default . 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize resume-sync-local \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: none additionally. \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: any sync operation targeting the local disk (when not yet completed) is resumed after a while. When completed, this operation will remember the switch state forever and become relevant if a sync is needed again (e.g. \family typewriter invalidate \family default or \family typewriter resize \family default ). 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize resume-sync-global \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Like \family typewriter *-local \family default , but operates on all members of the resource. 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize pause-replay \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Equivalent to \family typewriter pause-replay-local \family default . 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize pause-replay-local \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: must be in secondary role. \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: any local apply operations of transaction logfiles to the local disk are paused at their current stage. \end_layout \begin_layout Plain Layout \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset \size scriptsize This works independently from \family typewriter {dis,}connect \family default . 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize pause-replay-global \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Like \family typewriter *-local \family default , but operates on all members of the resource. 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize resume-replay \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Equivalent to \family typewriter resume-replay-local \family default . 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize resume-replay-local \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: must be in secondary role. \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: any (parts of) locally existing transaction logfiles (whether replicated from other hosts or produced locally) are started for apply to the local disk, as far as they have not yet been applied. 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize resume-replay-global \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Like \family typewriter *-local \family default , but operates on all members of the resource. 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize connect \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Equivalent to \family typewriter connect-local \family default . 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize connect-local \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: must be in secondary role. \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: any (parts of) transaction logfiles which are present at another primary host will be transferred to the local \family typewriter /mars/ \family default storage as far as not yet present locally. \end_layout \begin_layout Plain Layout \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset \size scriptsize This works independently from \family typewriter {pause,resume}-replay \family default . 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize connect-global \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Like \family typewriter *-local \family default , but operates on all members of the resource. 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize disconnect \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Equivalent to \family typewriter disconnect-local \family default . 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize disconnect-local \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: must be in secondary role. \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: any transfer of (parts of) transaction logfiles which are present at another primary host to the local \family typewriter /mars/ \family default storage is paused at its current stage. \end_layout \begin_layout Plain Layout \begin_inset Graphics filename /usr/share/clipart/openclipart-0.18/electronics/bulb/lightbulb_brightlit_benj_.png lyxscale 12 scale 7 \end_inset \size scriptsize This works independently from \family typewriter {pause,resume}-replay \family default . 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize disconnect-global \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize partly \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Like \family typewriter *-local \family default , but operates on all members of the resource. 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize up \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize yes \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Equivalent to \family typewriter attach \family default followed by \family typewriter connect \family default followed by \family typewriter resume-replay \family default followed by \family typewriter resume-sync \family default . 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize down \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize yes \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Equivalent to \family typewriter pause-sync \family default followed by \family typewriter disconnect \family default followed by \family typewriter pause-replay \family default followed by \family typewriter detach \family default . 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize primary \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize almost \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: all relevant transaction logfiles must be either already locally present, or be fetchable (see \family typewriter connect \family default and \family typewriter resume-replay \family default ). When another host is currently primary, it must satisfy the preconditions of \family typewriter marsadm secondary \family default . \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: \family typewriter /dev/mars/$dev_name \family default appears and is usable; the current host is in primary role. \end_layout \begin_layout Plain Layout \size scriptsize When another host is currently primary, it is first asked to become secondary, and the command waits until it actually is secondary. After that, the local host is asked to become primary. 
Before actually becoming primary, all relevant logfiles are applied. Only after that, \family typewriter /dev/mars/$dev_name \family default will appear. When network transfers of the symlink tree are very slow (or currently impossible), this command may take a very long time. Therefore \family typewriter --force \family default will skip all checks depending on remote state. \end_layout \begin_layout Plain Layout \size scriptsize In case a split brain is detected, the local host will refuse to become primary without \family typewriter --force \family default . \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize secondary \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize almost \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: the local \family typewriter /dev/mars/$dev_name \family default is no longer in use (e.g. umounted). \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: \family typewriter /dev/mars/$dev_name \family default has disappeared; the current host is in secondary role. 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize wait-umount \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize See section \begin_inset CommandInset ref LatexCommand ref reference "sub:Waiting" \end_inset . 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize log-purge-all \begin_inset CommandInset label LatexCommand label name "log-purge-all$res" \end_inset \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: none additionally. \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: all locally known logfiles and version links are removed, whenever they are not / no longer reachable by any split brain version. \end_layout \begin_layout Plain Layout Rationale: remove hindering split-brain / \family typewriter leave-resource \family default leftovers. 
\end_layout \begin_layout Plain Layout \size scriptsize Use this only when split brain does not go away by means of \family typewriter leave-resource \family default (which should never happen, but could happen in very weird scenarios such as MARS running on virtual machines doing a restore of their snapshots, or otherwise unexpected resurrection of dead or half-dead nodes). \end_layout \begin_layout Plain Layout \begin_inset Graphics filename images/MatieresToxiques.png lyxscale 50 scale 17 \end_inset THIS IS POTENTIALLY DANGEROUS! \end_layout \begin_layout Plain Layout \size scriptsize This command \emph on might \emph default destroy some valuable logfiles / other information in case the local information is outdated or otherwise incorrect. MARS Light does its best to check everything, but there is no guarantee. \end_layout \begin_layout Plain Layout \size scriptsize Hint: use \family typewriter --dry-run \family default beforehand for checking! \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize resize \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset [$size] \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize almost \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" 
special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: all disks in the cluster participating in \family typewriter $res \family default must be physically larger than the logical resource size (e.g. by use of \family typewriter lvm \family default ). When the optional \family typewriter $size \family default argument is present, it must be smaller than the minimum of all physical sizes, but larger than the current logical size. \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: at the (future) primary (if any), the logical size of \family typewriter /dev/mars/$dev_name \family default will reflect the new size after a while. \end_layout \end_inset \end_layout \end_inset \end_inset \end_layout \begin_layout Subsection Logfile Operations \end_layout \begin_layout Standard \size scriptsize \begin_inset Tabular \begin_inset Text \begin_layout Plain Layout \size scriptsize Command / Params \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Cmp \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Description \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize log-rotate \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 
use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: the local node \family typewriter $host \family default must be primary at \family typewriter $res \family default . \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: after a while, a new transaction logfile \family typewriter /mars/resource-$res/log-$new_nr-$host \family default will be used instead of \family typewriter /mars/resource-$res/log-$old_nr-$host \family default where \family typewriter $new_nr \family default = \family typewriter $old_nr \family default + 1. \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize log-delete \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: the local node must be a member of \family typewriter $res \family default . 
\end_layout \begin_layout Plain Layout \size scriptsize Postcondition: when there exists an old transaction logfile \family typewriter /mars/resource-$res/log-$old_nr-$some_host \family default where \family typewriter $old_nr \family default is the minimum existing number and that logfile is no longer referenced by any of the symlinks \family typewriter /mars/resource-$res/replay-* \family default , that logfile is marked for deletion in the whole cluster. When no such logfile exists, nothing will happen. \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize log-delete-all \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Like \family typewriter log-delete \family default , but mark \emph on all \emph default currently unreferenced logfiles for deletion. 
\end_layout \end_inset \end_layout \end_inset \end_inset \end_layout \begin_layout Subsection Consistency Operations \end_layout \begin_layout Standard \size scriptsize \begin_inset Tabular \begin_inset Text \begin_layout Plain Layout \size scriptsize Command / Params \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Cmp \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Description \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize invalidate \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: the local node must be in secondary role at \family typewriter $res \family default . \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: the local disk is marked as inconsistent, and a fast fullsync will start after a while. Notice that \family typewriter marsadm {pause,resume}-sync \family default will influence whether the sync really starts. When the fullsync has finished successfully, the local node will be consistent again. 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize fake-sync \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: the local node must be in secondary role at \family typewriter $res \family default . \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: when a fullsync is running, it will stop after a while, and the local node will be \emph on marked \emph default as consistent, as if the fullsync had finished successfully. \end_layout \begin_layout Plain Layout \begin_inset Graphics filename images/MatieresToxiques.png lyxscale 50 scale 17 \end_inset \size scriptsize ONLY USE THIS IF YOU REALLY KNOW WHAT YOU ARE DOING! \begin_inset Newline newline \end_inset See the WARNING in section \begin_inset CommandInset ref LatexCommand ref reference "sec:Creating-and-Maintaining" \end_inset \begin_inset Newline newline \end_inset Use this only \emph on after \emph default having created a fresh filesystem inside \family typewriter /dev/mars/$res \family default . 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize set-replay \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \begin_inset Graphics filename images/MatieresToxiques.png lyxscale 50 scale 17 \end_inset \size scriptsize ONLY FOR ADVANCED HACKERS WHO KNOW WHAT THEY ARE DOING! \begin_inset Newline newline \end_inset This command is deliberately not documented. You need the competence level RTFS ( \begin_inset Quotes eld \end_inset read the fucking sources \begin_inset Quotes erd \end_inset ). 
\end_layout \end_inset \end_layout \end_inset \end_inset \end_layout \begin_layout Section Further Operations \end_layout \begin_layout Subsection Inspection Commands \end_layout \begin_layout Standard \size scriptsize \begin_inset Tabular \begin_inset Text \begin_layout Plain Layout \size scriptsize Command / Params \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Cmp \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Description \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize role \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize state \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize cstate \end_layout \end_inset \end_layout 
\end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout NYI \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize dstate \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout NYI \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize status \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout NYI \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize show-state \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain 
Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize show-info \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize dstate \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize show \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size 
scriptsize show-errors \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize cat \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \end_inset \end_layout \begin_layout Subsection Waiting \begin_inset CommandInset label LatexCommand label name "sub:Waiting" \end_inset \end_layout \begin_layout Standard \size scriptsize \begin_inset Tabular \begin_inset Text \begin_layout Plain Layout \size scriptsize Command / Params \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Cmp \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Description \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize wait-cluster \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status 
open \begin_layout Plain Layout \size scriptsize Precondition: the \family typewriter /mars/ \family default filesystem must be mounted and it must contain a valid MARS symlink tree produced by the other \family typewriter marsadm \family default commands. The kernel module must be loaded. \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: none. \end_layout \begin_layout Plain Layout \size scriptsize Wait until \emph on all \emph default nodes in the cluster have sent a message, or until timeout. The default timeout is 30 s (exceptionally) and may be changed by \family typewriter --timeout=$seconds \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize wait-resource \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset {is-,}{attach, \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset primary, \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset device}{-off,} \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" 
special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: the local node must be a member of the resource \family typewriter $res \family default . \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: none. \end_layout \begin_layout Plain Layout \size scriptsize Wait until the local node reaches a specified condition on \family typewriter $res \family default , or until timeout. The default timeout of 60 s may be changed by \family typewriter --timeout=$seconds \family default . The last argument denotes the condition. The condition is inverted if suffixed by \family typewriter -off \family default . When preceded by \family typewriter is- \family default (which is the most useful case), the command checks whether the condition has actually been reached; for example, \family typewriter is-primary \family default waits until the local node has actually become primary. When the \family typewriter is- \family default prefix is left off, the command only checks whether another \family typewriter marsadm \family default command has already been given which \emph on tries \emph default to achieve the intended result (typically, you may use this after the \family typewriter is- \family default variant has failed). 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize wait-connect \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize almost \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize This is an alias for \family typewriter wait-cluster \family default which waits only for the nodes belonging to \family typewriter $res \family default to become reachable (instead of waiting for the \emph on full \emph default cluster). 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize wait-umount \begin_inset Newline newline \end_inset \begin_inset ERT status open \begin_layout Plain Layout \backslash strut \backslash hfill \end_layout \end_inset $res \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Precondition: no additional preconditions. \end_layout \begin_layout Plain Layout \size scriptsize Postcondition: the local \family typewriter /dev/mars/$dev_name \family default is no longer in use (e.g. unmounted). \end_layout \end_inset \end_layout \end_inset \end_inset \end_layout \begin_layout Subsection Low-Level Helpers \end_layout \begin_layout Standard These commands are for advanced sysadmins only. The interface is not stable, i.e. the meaning may change at any time. 
\end_layout \begin_layout Standard \size scriptsize \begin_inset Tabular \begin_inset Text \begin_layout Plain Layout \size scriptsize Command / Params \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Cmp \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Description \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize set-link \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize delete-file \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \end_inset \end_layout \begin_layout Subsection Senseless Commands (from DRBD) \end_layout \begin_layout Standard \size scriptsize \begin_inset Tabular \begin_inset Text \begin_layout Plain Layout \size scriptsize Command / Params \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Cmp \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Description \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box 
Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize syncer \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize new-current-uuid \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize create-md \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize dump-md \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size 
scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize dump \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize get-gi \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize show-gi \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special 
"totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize outdate \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize adjust \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize yes \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize Implemented as NOP (not necessary with MARS). 
\end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize hidden-commands \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \end_layout \end_inset \end_inset \end_layout \begin_layout Subsection Forbidden Commands (from DRBD) \end_layout \begin_layout Standard These commands are not implemented because they would be dangerous in MARS context: \end_layout \begin_layout Standard \size scriptsize \begin_inset Tabular \begin_inset Text \begin_layout Plain Layout \size scriptsize Command / Params \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Cmp \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize Description \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize invalidate-remote \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize This is too dangerous in case you have multiple 
secondaries. A similar effect can be achieved with the \family typewriter --host= \family default option. \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \family typewriter \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "20col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \family typewriter \size scriptsize verify \end_layout \end_inset \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize no \end_layout \end_inset \begin_inset Text \begin_layout Plain Layout \size scriptsize \begin_inset Box Frameless position "t" hor_pos "c" has_inner_box 1 inner_pos "t" use_parbox 0 use_makebox 0 width "60col%" special "none" height "1in" height_special "totalheight" status open \begin_layout Plain Layout \size scriptsize This would cause unintended side effects due to races between logfile transfer / application and block-wise comparison of the underlying disks. However, MARS \family typewriter invalidate \family default will do the same as DRBD verify followed by DRBD resync, i.e. \family typewriter marsadm invalidate \family default will automatically correct any errors found; note that the fast-fullsync algorithm of MARS will minimize network traffic. \end_layout \end_inset \end_layout \end_inset \end_inset \end_layout \begin_layout Subsection Deprecated Operations \end_layout \begin_layout Chapter MARS for Developers \end_layout \begin_layout Standard This chapter is organized strictly top-down. \end_layout \begin_layout Standard If you are a sysadmin and want to learn about the internals (useful for debugging), the relevant information is at the beginning, and you don't need to dive into all technical details at the end (e.g., you may stop after reading the documentation on symlink trees or even use that documentation like an encyclopedia). 
\end_layout \begin_layout Standard If you are a kernel developer and want to contribute code to the MARS community, please read (almost) all of it. Due to the top-down organization, you will sometimes need to follow forward references in order to understand details. Therefore I recommend reading this chapter twice in two different reading modes: in the first reading pass, you just build a rough network of principles and structures in your brain (you don't want to grasp details, therefore don't strive for a full understanding). In the second pass, you exploit your knowledge from the first pass for a deeper understanding of the details. \end_layout \begin_layout Standard Alternatively, you may first read the first section about general architecture, and then start a bottom-up scan by first reading the last section about generic objects and aspects, and working in reverse \emph on section \emph default order (but read \emph on sub \emph default sections in-order) until you finally reach the kernel interfaces / symlink trees. \end_layout \begin_layout Section General Architecture \end_layout \begin_layout Standard The following pictures show some \begin_inset Quotes eld \end_inset zones of responsibility \begin_inset Quotes erd \end_inset , not necessarily a strict hierarchy (although the design tries to respect Dijkstra's famous layering rules from THE as much as possible). The construction principles follow the concepts of \series bold Instance Oriented Programming \series default (IOP) described in \begin_inset Flex URL status collapsed \begin_layout Plain Layout http://athomux.net/papers/paper_inst2.pdf \end_layout \end_inset . 
Please note that MARS Light is only instance-based \begin_inset Foot status open \begin_layout Plain Layout Similar to OOP, where \begin_inset Quotes eld \end_inset object-based \begin_inset Quotes erd \end_inset means a weaker form of \begin_inset Quotes eld \end_inset object-oriented \begin_inset Quotes erd \end_inset , the term \begin_inset Quotes eld \end_inset instance-based \begin_inset Quotes erd \end_inset means that the \emph on strategy \emph default brick layer need not be fully modularized according to the IOP principles, but the \emph on worker \emph default brick layer already is. \end_layout \end_inset , while MARS Full is planned to be fully instance-oriented. \end_layout \begin_layout Subsection MARS Light Architecture \end_layout \begin_layout Standard \noindent \align center \begin_inset Graphics filename images/mars-light-architecture.fig width 40col% \end_inset \end_layout \begin_layout Subsection MARS Full Architecture (planned) \end_layout \begin_layout Standard \noindent \align center \begin_inset Graphics filename images/mars-full-architecture.fig width 80col% \end_inset \end_layout \begin_layout Section Documentation of the Symlink Trees \begin_inset CommandInset label LatexCommand label name "sec:Documentation-of-the" \end_inset \end_layout \begin_layout Standard The \family typewriter /mars/ \family default symlink tree serves the following purposes, all at the same time: \end_layout \begin_layout Enumerate For \series bold communication \series default between cluster nodes, see sections \begin_inset CommandInset ref LatexCommand ref reference "sec:The-Lamport-Clock" \end_inset and \begin_inset CommandInset ref LatexCommand ref reference "sec:The-Symlink-Tree" \end_inset . This is in fact the \emph on only \emph default communication between cluster nodes (apart from the \emph on contents \emph default of transaction logfiles and sync data). 
\end_layout \begin_layout Enumerate \series bold \emph on Internal \emph default interface \series default between the kernel module and the userspace tool \family typewriter marsadm \family default . \end_layout \begin_layout Enumerate \series bold \emph on Internal \emph default persistent repository \series default which keeps state information between reboots (also in case of node crashes). It is in fact the \emph on only \emph default place where state information is kept. There is no other place like \family typewriter /etc/drbd.conf \family default . \end_layout \begin_layout Standard \begin_inset Graphics filename images/MatieresCorrosives.png lyxscale 50 scale 17 \end_inset Because of its internal character, its representation and semantics may change at any time without notice (e.g. via an \emph on internal \emph default upgrade procedure between major releases). It is \emph on not \emph default an external interface to the outer world. Don't build anything on it. \end_layout \begin_layout Standard However, knowledge of the symlink tree is useful for advanced sysadmins, for \series bold human inspection \series default and for \series bold debugging \series default . And, of course, for developers. \end_layout \begin_layout Standard As an \begin_inset Quotes eld \end_inset official \begin_inset Quotes erd \end_inset interface from outside, only the \family typewriter marsadm \family default command should be used. 
\end_layout \begin_layout Subsection Documentation of the MARS Light Symlink Tree \end_layout \begin_layout Section MARS Worker Bricks \end_layout \begin_layout Section MARS Strategy Bricks \end_layout \begin_layout Section The MARS Brick Infrastructure Layer \end_layout \begin_layout Section The Generic Brick Infrastructure Layer \end_layout \begin_layout Section The Generic Object and Aspect Infrastructure \end_layout \begin_layout Chapter \start_of_appendix GNU Free Documentation License \begin_inset CommandInset label LatexCommand label name "chap:GNU-FDL" \end_inset \end_layout \begin_layout Standard \noindent \family typewriter \size footnotesize \begin_inset ERT status open \begin_layout Plain Layout \backslash lstinputlisting{fdl.txt} \end_layout \end_inset \end_layout \end_body \end_document