Merge branch 'mars0.1.y' into mars0.1a.y

This commit is contained in:
Thomas Schoebel-Theuer 2018-07-10 19:00:57 +02:00
commit 71720650c3
11 changed files with 2078 additions and 31 deletions

View File

@ -285,6 +285,17 @@ Hint: branch 0.1a will get a merge from here, and then get the
(except Football related ones) will then go to 0.1b.
Finally, when 0.1a is stable, I will close this branch.
mars0.1stable60
* Major improvement: new option --ignore-sync allows primary
Handover without --force even when some sync is running
somewhere. Any running syncs will restart from scratch
(which might take some time, depending on LV size and
many more factors like the network).
* Minor fix: split-cluster did not work correctly when no
resources were existing anymore, at all.
* Doc: major update. More explanation on CAP theorem, and
on differences / commonalities with DRBD.
mars0.1stable59
* Major fix: "marsadm up" did not work when sync could not
be started. Now does "best effort".

View File

@ -44,7 +44,9 @@ Actions for inplace FS shrinking:
Actions for inplace FS extension:
./football.sh expand <resource> <percent>
./football.sh extend <resource> <percent>
Increase mounted filesystem size during operations.
Combined actions:
@ -168,6 +170,11 @@ General features:
# Set this to your convenience.
football_logdir="${football_logdir:-${logdir:-$HOME/football-logs}}"
## football_backup_dir
# In this directory, various backups are created.
# Intended for manual repair.
football_backup_dir="${football_backup_dir:-$football_logdir/backups}"
## screener
# When enabled, handover execution to the screener.
# Very useful for running Football in masses.
@ -201,6 +208,11 @@ General features:
# of the temporary shrink mirror filesystem.
rsync_opt_prepare="${rsync_opt_prepare:---exclude='.filemon2' --delete}"
## rsync_opt_hot
# This is only used at the final rsync, immediately before going
# online again.
rsync_opt_hot="${rsync_opt_hot:---delete}"
## rsync_nice
# Typically, the preparation steps are run with background priority.
rsync_nice="${rsync_nice:-nice -19}"
@ -215,6 +227,42 @@ General features:
# Number of rsync lines to skip in output (avoid overflow of logfiles).
rsync_skip_lines="${rsync_skip_lines:-1000}"
## use_tar
# Use incremental Gnu tar in place of rsync:
# 0 = don't use tar
# 1 = only use for the first (full) data transfer, then use rsync
# 2 = always use tar
# Experience: tar has better performance on local data than rsync, but
# it tends to produce false-positive failure return codes on online
# filesystems which are altered during tar.
# The combined mode 1 tries to find a good compromise between both
# alternatives.
use_tar="${use_tar:-1}"
## tar_exe
# Use this for activation of patched tar versions, such as the
# 1&1-internal patched spacetools-tar.
tar_exe="${tar_exe:-/bin/tar}"
## tar_options_src and tar_options_dst
# Here you may give different options for both sides of tar invocations
# (source and destination), such as verbosity options etc.
tar_options_src="${tar_options_src:-}"
tar_options_dst="${tar_options_dst:-}"
## tar_is_fixed
# Tell whether your tar version reports false-positive transfer errors,
# or not.
tar_is_fixed="${tar_is_fixed:-0}"
## tar_state_dir
# This directory is used for keeping incremental tar state information.
tar_state_dir="${tar_state_dir:-/var/tmp}"
## buffer_cmd
# Speed up tar by intermediate buffering.
buffer_cmd="${buffer_cmd:-buffer -m 16m -S 1024m || cat}"
## wait_timeout
# Avoid infinite loops upon waiting.
wait_timeout="${wait_timeout:-$(( 24 * 60 ))}" # Minutes
@ -223,6 +271,11 @@ General features:
# Some LVM versions are requiring this for unattended batch operations.
lvremove_opt="${lvremove_opt:--f}"
## automatic recovery options: enable_failure_*
enable_failure_restart_vm="${enable_failure_restart_vm:-1}"
enable_failure_recreate_cluster="${enable_failure_recreate_cluster:-0}"
enable_failure_rebuild_mars="${enable_failure_rebuild_mars:-1}"
## critical_status
# This is the "magic" exit code indicating _criticality_
# of a failed command.
@ -281,5 +334,394 @@ General features:
PLUGIN football-1and1config
1&1 specfic plugin for dealing with the cm3 clusters
and its concrete configuration .
and its concrete configuration.
## enable_1and1config
# ShaHoLin-specifc plugin for working with the infong platform
# (istore, icpu, infong) via 1&1-specific clustermanager cm3
# and related toolsets. Much of it is bound to a singleton database
# instance (clustermw & siblings).
enable_1and1config="${enable_1and1config:-$(if [[ "$0" =~ tetris ]]; then echo 1; else echo 0; fi)}"
PLUGIN football-cm3
1&1 specfic plugin for dealing with the cm3 cluster manager
and its concrete operating enviroment (singleton instance).
Current maximum cluster size limit:
Maximum #syncs running before migration can start:
Following marsadm --version must be installed:
Following mars kernel modules must be loaded:
Specific actions for plugin football-cm3:
./football.sh clustertool {GET|PUT} <url>
Call through to the clustertool via REST.
Useful for manual inspection and repair.
## enable_cm3
# ShaHoLin-specifc plugin for working with the infong platform
# (istore, icpu, infong) via 1&1-specific clustermanager cm3
# and related toolsets. Much of it is bound to a singleton database
# instance (clustermw & siblings).
enable_cm3="${enable_cm3:-$(if [[ "$0" =~ tetris ]]; then echo 1; else echo 0; fi)}"
## skip_resource_ping
# Enable this only for testing. Normally, a resource name denotes a
# container name == machine name which must be runnuing as a precondition,
# und thus must be pingable over network.
skip_resource_ping="${skip_resource_ping:-0}"
## date_lock
# Don't enter critical sections at certain days of the week,
# and/or during certain hours.
# This is a regex matching against "date +%u_%H"
date_lock="${date_lock:-}"
## check_ping_rounds
# Number of pings to try before a container is assumed to
# not respond.
check_ping_rounds="${check_ping_rounds:-5}"
## workaround_firewall
# Documentation of technical debt for later generations:
# This is needed since July 2017. In the many years before, no firewalling
# was effective at the replication network, because it is a physically
# separate network from the rest of the networking infrastructure.
# An attacker would first need to gain root access to the _hypervisor_
# (not only to the LXC container and/or to KVM) before gaining access to
# those physical replication network interfaces.
# Since about that time, which is about the same time when the requirements
# for Container Football had been communicated, somebody introduced some
# unnecessary firewall rules, based on "security arguments".
# These arguments were however explicitly _not_ required by the _real_
# security responsible person, and explicitly _not_ recommended by him.
# Now the problem is that it is almost politically impossible to get
# rid of suchalike "security feature".
# Until the problem is resolved, Container Football requires
# the _entire_ local firewall to be _temporarily_ shut down in order to
# allow marsadm commands over ssh to work.
# Notice: this is _not_ increasing the general security in any way.
# LONGTERM solution / TODO: future versions of mars should no longer
# depend on ssh.
# Then this "feature" can be turned off.
workaround_firewall="${workaround_firewall:-1}"
## ip_magic
# Similarly to workaround_firewall, this is needed since somebody
# introduced additional firewall rules also disabling sysadmin ssh
# connections at the _ordinary_ sysadmin network.
ip_magic="${ip_magic:-1}"
## do_split_cluster
# The current MARS branch 0.1a.y is not yet constructed for forming
# a BigCluster constisting of several thousands of machines.
# When a future version of mars0.1b.y (or 0.2.y) will allow this,
# this can be disabled.
do_split_cluster="${do_split_cluster:-1}"
## forbidden_hosts
# Regex for excluding hostnames from any Football actions.
# The script will fail when some of these is encountered.
forbidden_hosts="${forbidden_hosts:-}"
## forbidden_flavours
# Regex for excluding flavours from any Football actions.
# The script will fail when some of these is encountered.
forbidden_flavours="${forbidden_flavours:-}"
## forbidden_bz_ids
# PROVISIONARY regex for excluding certain bz_ids from any Football actions.
# NOTICE: bz_ids are deprecated and should not be used in future
# (technical debts).
# The script will fail when some of these is encountered.
forbidden_bz_ids="${forbidden_bz_ids:-}"
## clustertool_host
# URL prefix of the internal configuation database REST interface.
clustertool_host="${clustertool_host:-http://clustermw:3042}"
## clustertool_user
# Username for clustertool access.
# By default, scans for a *.password file (see next option).
clustertool_user="${clustertool_user:-$(get_cred_file "*.password" | head -1 | sed 's:.*/::g' | cut -d. -f1)}"
## clustertool_passwd_file
# Here you can supply the encrpted password.
# By default, a file $clustertool_user.password is used
# containing the encrypted password.
clustertool_passwd_file="${clustertool_passwd_file:-$(get_cred_file "$clustertool_user.password")}"
## clustertool_passwd
# Here you may override the password via config file.
# For security reasons, dont provide this at the command line.
clustertool_passwd="${clustertool_passwd:-$(< $clustertool_passwd_file)}" || echo "cannot read a password file *.password for clustermw: you MUST supply the credentials via default curl config files (see man page)"
## do_migrate
# Keep this enabled. Only disable for testing.
do_migrate="${do_migrate:-1}" # must be enabled; disable for dry-run testing
## always_migrate
# Only use for testing, or for special situation.
# This skip the test whether the resource has already migration.
always_migrate="${always_migrate:-0}" # only enable for testing
## check_segments
# 0 = disabled
# 1 = only display the segment names
# 2 = check for equality
# WORKAROUND, potentially harmful when used inadequately.
# The historical physical segment borders need to be removed for
# Container Football.
# Unfortunately, the subproject aiming to accomplish this did not
# proceed for one year now. In the meantime, Container Football can
# be only played within the ancient segment borders.
# After this big impediment is eventually resolved, this option
# should be switched off.
check_segments="${check_segments:-1}"
## enable_mod_deflate
# Internal, for support.
enable_mod_deflate="${enable_mod_deflate:-1}"
## enable_segment_move
# Seems to be needed by some other tooling.
enable_segment_move="${enable_segment_move:-1}"
## override_hwclass_id
# When necessary, override this from $include_dir/plugins/*.conf
override_hwclass_id="${override_hwclass_id:-}" # typically 25007
## override_hvt_id
# When necessary, override this from $include_dir/plugins/*.conf
override_hvt_id="${override_hvt_id:-}" # typically 8057 or 8059
## override_overrides
# When this is set and other override_* variables are not set,
# then try to _guess_ some values.
# No guarantees for correctness either.
override_overrides=${override_overrides:-1}
## iqn_base and iet_type and iscsi_eth and iscsi_tid
# Workaround: this is needed for _dynamic_ generation of iSCSI sessions
# bypassing the ordinary ones as automatically generated by the
# cm3 cluster manager (only at the old istore architecture).
# Notice: not needed for regular operations, only for testing.
# Normally, you dont want to shrink over a _shared_ 1MBit iSCSI line.
iqn_base="${iqn_base:-iqn.2000-01.info.test:test}"
iet_type="${iet_type:-blockio}"
iscsi_eth="${iscsi_eth:-eth1}"
iscsi_tid="${iscsi_tid:-4711}"
## monitis_downtime_script
# ShaHoLin-internal
monitis_downtime_script="${monitis_downtime_script:-}"
## monitis_downtime_duration
# ShaHoLin-internal
monitis_downtime_duration="${monitis_downtime_duration:-20}" # Minutes
## shaholin_finished_log
# ShaHoLin-specific logfile, reporting _only_ successful completion
# of an action.
shaholin_finished_log="${shaholin_finished_log:-$football_logdir/shaholin-finished.log}"
## ticket
# OPTIONAL: the meaning is ShaHoLin specific.
# This can be used for updating JIRA tickets.
# Can be set on the command line like "./tetris.sh $args --ticket=TECCM-4711
ticket="${ticket:-}"
## ticket_get_cmd
# Optional: when set, this script can be used for retrieving ticket IDs
# in place of commandline option --ticket=
ticket_get_cmd="${ticket_get_cmd:-}"
## ticket_update_cmd
# This can be used for calling an external command which updates
# the ticket(s) given by the $ticket parameter.
ticket_update_cmd="${ticket_update_cmd:-}"
## shaholin_action
# OPTIONAL: specific action script with parameters.
shaholin_action="${shaholin_action:-}"
PLUGIN football-basic
Generic driver for systemd-controlled MARS pools.
The current version supports only a flat model:
(1) There is a single "big cluster" at metadata level.
All cluster members are joined via merge-cluster.
All occurring names need to be globally unique.
(2) The network uses BGP or other means, thus any hypervisor
can (potentially) start any VM at any time.
(3) iSCSI or remote devices are not supported for now
(LocalSharding model). This may be extended in a future
release.
This plugin is exclusive-or with cm3.
Plugin specific actions:
./football.sh basic_add_host <hostname>
Manually add another host to the hostname cache.
## pool_cache_dir
# Directory for caching the pool status.
pool_cache_dir="${pool_cache_dir:-$script_dir/pool-cache}"
## initial_hostname_file
# This file must contain a list of storage and/or hypervisor hostnames
# where a /mars directory must exist.
# These hosts are then scanned for further cluster members,
# and the transitive closure of all host names is computed.
initial_hostname_file="${initial_hostname_file:-./hostnames.input}"
## hostname_cache
# This file contains the transitive closure of all host names.
hostname_cache="${hostname_cache:-$pool_cache_dir/hostnames.cache}"
## resources_cache
# This file contains the transitive closure of all resource names.
resources_cache="${resources_cache:-$pool_cache_dir/resources.cache}"
## res2hyper_cache
# This file contains the association between resources and hypervisors.
res2hyper_cache="${res2hyper_cache:-$pool_cache_dir/res2hyper.assoc}"
## enable_basic
# This plugin is exclusive-or with cm3.
enable_basic="${enable_basic:-$(if [[ "$0" =~ football ]]; then echo 1; else echo 0; fi)}"
## ssh_port
# Set this for separating sysadmin access from customer access
ssh_port="${ssh_port:-}"
## basic_mnt_dir
# Names the mountpoint directory at hypervisors.
# This must co-incide with the systemd mountpoints.
basic_mnt_dir="${basic_mnt_dir:-/mnt}"
PLUGIN football-downtime
Generic plugin for communication of customer downtime.
## downtime_cmd_{set,unset}
# External command for setting / unsetting (or communicating) a downtime
# Empty = don't do anything
downtime_cmd_set="${downtime_cmd_set:-}"
downtime_cmd_unset="${downtime_cmd_unset:-}"
PLUGIN football-motd
Generic plugin for motd. Communicate that Football is running
at login via motd.
## enable_motd
# whether to use the motd plugin.
enable_motd="${enable_motd:-0}"
## update_motd_cmd
# Distro-specific command for generating motd from several sources.
# Only tested for Debian Jessie at the moment.
update_motd_cmd="${update_motd_cmd:-update-motd}"
## download_motd_script and motd_script_dir
# When no script has been installed into /etc/update-motd.d/
# you can do it dynamically here, bypassing any "official" deployment
# methods. Use this only for testing!
# An example script (which should be deployed via your ordinary methods)
# can be found under $script_dir/update-motd.d/67-football-running
download_motd_script="${download_motd_script:-}"
motd_script_dir="${motd_script_dir:-/etc/update-motd.d}"
## motd_file
# This will contain the reported motd message.
# It is created by this plugin.
motd_file="${motd_file:-/var/motd/football.txt}"
## motd_color_on and motd_color_off
# ANSI escape sequences for coloring the generated motd message.
motd_color_on="${motd_color_on:-\\033[31m}"
motd_color_off="${motd_color_off:-\\033[0m}"
PLUGIN football-report
Generic plugin for communication of reports.
## report_cmd_{start,warning,failed,finished}
# External command which is called at start / failure / finish
# of Football.
# The following variables can be used (e.g. as parameters) when
# escaped with a backslash:
# $res = name of the resource (LV, container, etc)
# $primary = the current (old)
# $secondary_list = list of current (old) secondaries
# $target_primary = the target primary name
# $target_secondary = list of target secondaries
# $operation = the operation name
# $target_percent = the value used for shrinking
# $txt = some informative text from Football
# Further variables are possible by looking at the sourcecode, or by
# defining your own variables or functions externally or via plugins.
# Empty = don't do anything
report_cmd_start="${report_cmd_start:-}"
report_cmd_warning="${report_cmd_warning:-$script_dir/screener.sh notify "$res" warning "$txt"}"
report_cmd_failed="${report_cmd_failed:-}"
report_cmd_finished="${report_cmd_finished:-}"
PLUGIN football-waiting
Generic plugig, interfacing with screener: when this is used
by your script and enabled, then you will be able to wait for
"screener.sh continue" operations at certain points in your
script.
## enable_*_waiting
#
# When this is enabled, and when Football had been started by screener,
# then football will delay the start of several operations until a sysadmin
# does one of the following manually:
#
# a) ./screener.sh continue $session
# b) ./screener.sh resume $session
# c) ./screener.sh attach $session and press the RETURN key
# d) doing nothing, and $wait_timeout has exceeded
#
# CONVENTION: football resource names are used as screener session ids.
# This ensures that only 1 operation can be started for the same resource,
# and it simplifies the handling for junior sysadmins.
#
enable_startup_waiting="${enable_startup_waiting:-0}"
enable_handover_waiting="${enable_handover_waiting:-0}"
enable_migrate_waiting="${enable_migrate_waiting:-0}"
enable_shrink_waiting="${enable_shrink_waiting:-0}"
## enable_cleanup_delayed and wait_before_cleanup
# By setting this, you can delay the cleanup operations for some time.
# This way, you are keeping the old LV contents as a kind of "backup"
# for some limited time.
# HINT: dont set to wait_before_cleanuplarge values, because it can
# seriously slow down Football.
enable_cleanup_delayed="${enable_cleanup_delayed:-0}"
wait_before_cleanup="${wait_before_cleanup:-180}" # Minutes
## reduce_wait_msg
# Instead of reporting the waiting status once per minute,
# decrease the frequency of resporting.
# Warning: dont increase this too much. Do not exceed
# session_timeout/2 from screener. Because of the Nyquist criterion,
# stay on the safe side by setting session_timeout at least to _twice_
# the time than here.
reduce_wait_msg="${reduce_wait_msg:-60}" # Minutes
\end{verbatim}

View File

@ -43,7 +43,9 @@ Actions for inplace FS shrinking:
Actions for inplace FS extension:
./football.sh expand <resource> <percent>
./football.sh extend <resource> <percent>
Increase mounted filesystem size during operations.
Combined actions:
@ -126,7 +128,8 @@ General features:
PLUGIN football-1and1config
1&1 specfic plugin for dealing with the cm3 clusters
and its concrete configuration .
and its concrete configuration.
PLUGIN football-cm3
@ -141,6 +144,12 @@ PLUGIN football-cm3
Following mars kernel modules must be loaded:
Specific actions for plugin football-cm3:
./football.sh clustertool {GET|PUT} <url>
Call through to the clustertool via REST.
Useful for manual inspection and repair.
PLUGIN football-basic
@ -162,6 +171,11 @@ Plugin specific actions:
Manually add another host to the hostname cache.
PLUGIN football-downtime
Generic plugin for communication of customer downtime.
PLUGIN football-motd
Generic plugin for motd. Communicate that Football is running
@ -180,4 +194,5 @@ PLUGIN football-waiting
"screener.sh continue" operations at certain points in your
script.
\end{verbatim}

View File

@ -0,0 +1,17 @@
#FIG 3.2 Produced by xfig version 3.2.5c
Landscape
Center
Metric
A4
100.00
Single
-2
1200 2
2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 4
450 1800 1800 0 3150 1800 450 1800
2 1 0 3 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
1800 0 3150 1800
4 0 0 50 -1 18 12 0.0000 4 195 1515 2115 90 C = Consistency\001
4 0 0 50 -1 18 12 0.0000 4 195 1410 405 2115 A = Availability\001
4 0 0 50 -1 18 12 0.0000 4 195 2445 3060 2070 P = Partitioning Tolerance\001
4 0 4 50 -1 18 40 0.0000 4 480 435 450 2025 X\001

View File

@ -0,0 +1,17 @@
#FIG 3.2 Produced by xfig version 3.2.5c
Landscape
Center
Metric
A4
100.00
Single
-2
1200 2
2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 4
450 1800 1800 0 3150 1800 450 1800
2 1 0 3 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
3150 1800 450 1800
4 0 0 50 -1 18 12 0.0000 4 195 1515 2115 90 C = Consistency\001
4 0 0 50 -1 18 12 0.0000 4 195 1410 405 2115 A = Availability\001
4 0 0 50 -1 18 12 0.0000 4 195 2445 3060 2070 P = Partitioning Tolerance\001
4 0 4 50 -1 18 40 0.0000 4 480 435 1755 360 X\001

View File

@ -0,0 +1,16 @@
#FIG 3.2 Produced by xfig version 3.2.5c
Landscape
Center
Metric
A4
100.00
Single
-2
1200 2
2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 4
450 1800 1800 0 3150 1800 450 1800
2 1 0 3 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
1800 0 450 1800
4 0 0 50 -1 18 12 0.0000 4 195 1515 2115 90 C = Consistency\001
4 0 0 50 -1 18 12 0.0000 4 195 1410 405 2115 A = Availability\001
4 0 0 50 -1 18 12 0.0000 4 195 2445 3060 2070 P = Partitioning Tolerance\001

16
docu/images/cap-mars.fig Normal file
View File

@ -0,0 +1,16 @@
#FIG 3.2 Produced by xfig version 3.2.5c
Landscape
Center
Metric
A4
100.00
Single
-2
1200 2
2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 4
450 1800 1800 0 3150 1800 450 1800
2 1 0 3 0 7 50 -1 -1 0.000 0 0 -1 0 0 2
3150 1800 450 1800
4 0 0 50 -1 18 12 0.0000 4 195 1410 405 2115 A = Availability\001
4 0 0 50 -1 18 12 0.0000 4 195 2445 3060 2070 P = Partitioning Tolerance\001
4 0 0 50 -1 18 12 0.0000 4 195 1515 2115 90 C = Consistency\001

View File

@ -0,0 +1,14 @@
#FIG 3.2 Produced by xfig version 3.2.5c
Landscape
Center
Metric
A4
100.00
Single
-2
1200 2
2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 4
450 1800 1800 0 3150 1800 450 1800
4 0 0 50 -1 18 12 0.0000 4 195 1515 2115 90 C = Consistency\001
4 0 0 50 -1 18 12 0.0000 4 195 1410 405 2115 A = Availability\001
4 0 0 50 -1 18 12 0.0000 4 195 2445 3060 2070 P = Partitioning Tolerance\001

File diff suppressed because it is too large Load Diff

View File

@ -13,6 +13,10 @@ marsadm [<global_options>] view[-<macroname>] [<resource_name> | all ]
Use this only when you really know what you are doing!
Warning! This is dangerous! First try --dry-run.
Not combinable with 'all'.
--ignore-sync
Allow primary handover even when some sync is running somewhere.
This is less rude than --force because it checks for all else
preconditions.
--dry-run
Don't modify the symlink tree, but tell what would be done.
Use this before starting potentially harmful actions such as
@ -567,5 +571,4 @@ marsadm [<global_options>] view[-<macroname>] [<resource_name> | all ]
{sync,fetch,replay,work}-{rest,{almost-,threshold-,}reached,percent,permille,vector}
{sync,fetch,replay}-{rate,remain}
{time,real-time}
{tree,features}-version
\end{verbatim}

View File

@ -106,7 +106,7 @@ sub lwarn {
my $Id = '$Id$ ';
my $user_version = 0.1;
my $marsadm_version = 2.3; # some rough hint at newer features
my $marsadm_version = 2.4; # some rough hint at newer features
my $mars = "/mars";
my $host = `uname -n` or ldie "cannot determine my network node name\n";
chomp $host;
@ -114,6 +114,7 @@ check_id($host);
my $real_host = $host;
my $backup_dir = "$mars/backups-" . time();
my $force = 0;
my $ignore_sync = 0;
my $cron_mode = 0;
my $timeout = -1;
my $ip = "";
@ -2623,7 +2624,7 @@ sub split_cluster {
$cmd .= "shopt -s nullglob; ";
$cmd .= "for i in $mars/resource-*; do if ! [[ -e \$i/data-$peer ]] && ! [[ -e \$i/replay-$peer ]]; then rm -rf $backup_dir/\${i##*/}; mv \$i $backup_dir/; fi; done; ";
$cmd .= "mkdir -p $mars/ips; ";
my $sub_list = "{ for i in \$(ls $mars/resource-*/data-$peer | cut -d/ -f1-3 | sort -u); do (cd \$i; ls data-*); done; echo x-$peer; }";
my $sub_list = "{ for dir in $mars/resource-*/data-$peer; do (cd \${dir%/*} && for i in data-*; do echo \$i; done); done; echo x-$peer; }";
my $sub_cmd = "echo RESTORE IP \$j; cp -a $ips_backup/ip-\$j $mars/ips/";
$cmd .= "for j in \$($sub_list | cut -d- -f2- | sort -u); do $sub_cmd; done";
lprint "$cmd\n";
@ -3172,16 +3173,19 @@ sub primary_phase0 {
lprint "Current designated primary: $old\n";
if ($cmd eq "primary") {
if ($host ne $old) {
check_sync_finished($res, $host);
lprint "Allowing handover in cases of sync: ignore_sync=$ignore_sync\n" if $ignore_sync;
check_sync_finished($res, $host, $ignore_sync);
# also check that other secondaries won't loose their sync primary
my @names = glob("$mars/resource-$res/data-*");
# for k <= 2 replicas, the previous check must have been sufficient
if (scalar(@names) > 2) {
my $allow_anyway = ($force || $ignore_sync);
lprint "Allowing handover in cases of sync: force=$force ignore_sync=$ignore_sync\n" if $allow_anyway;
foreach my $name (@names) {
$name =~ m:/data-(.+):;
my $peer = $1;
next if ($peer eq $old || $peer eq $host);
check_sync_finished($res, $peer, $force);
check_sync_finished($res, $peer, $allow_anyway);
}
}
}
@ -6120,6 +6124,10 @@ marsadm [<global_options>] view[-<macroname>] [<resource_names> | all ]
Use this only when you really know what you are doing!
Warning! This is dangerous! First try --dry-run.
Not combinable with 'all'.
--ignore-sync
Allow primary handover even when some sync is running somewhere.
This is less rude than --force because it checks for all else
preconditions.
--dry-run
Don't modify the symlink tree, but tell what would be done.
Use this before starting potentially harmful actions such as
@ -6237,6 +6245,9 @@ foreach my $arg (@ARGV) {
if ($arg eq "--force" || $arg eq "-f") {
$force++;
next;
} elsif ($arg eq "--ignore-sync") {
$ignore_sync++;
next;
} elsif ($arg eq "--dry-run" || $arg eq "-d") {
$dry_run++;
next;