mirror of
https://github.com/schoebel/mars
synced 2025-01-12 01:29:50 +00:00
\begin{verbatim}
verbose=1
Usage:
  ./football.sh --help [--verbose]
    Show help
  ./football.sh --variable=<value>
    Override any shell variable

Actions for resource migration:

  ./football.sh migrate <resource> <target_primary> [<target_secondary>]
    Run the sequence
    migrate_prepare ; migrate_wait ; migrate_finish ; migrate_cleanup.

  Ditto, for testing individual phases:

  ./football.sh migrate_prepare <resource> <target_primary> [<target_secondary>]
    Allocate LVM space at the targets and start MARS replication.

  ./football.sh migrate_wait <resource> <target_primary> [<target_secondary>]
    Wait until MARS replication reports UpToDate.

  ./football.sh migrate_finish <resource> <target_primary> [<target_secondary>]
    Call hooks for handover to the targets.

  ./football.sh migrate_cleanup <resource>
    Remove old / currently unused LV replicas from MARS and deallocate
    from LVM.

Actions for in-place FS shrinking:

  ./football.sh shrink <resource> <percent>
    Run the sequence shrink_prepare ; shrink_finish ; shrink_cleanup.

  Ditto, for testing individual phases:

  ./football.sh shrink_prepare <resource> [<percent>]
    Allocate temporary LVM space (when possible) and create initial
    raw FS copy.
    Default percent value (when left out) is 85.

  ./football.sh shrink_finish <resource>
    Incrementally update the FS copy, swap old <=> new copy with
    small downtime.

  ./football.sh shrink_cleanup <resource>
    Remove old FS copy from LVM.

Actions for in-place FS extension:

  ./football.sh expand <resource> <percent>
  ./football.sh extend <resource> <percent>
    Increase mounted filesystem size during operations.

Combined actions:

  ./football.sh migrate+shrink <resource> <target_primary> [<target_secondary>] [<percent>]
    Similar to migrate ; shrink but produces less network traffic.
    Default percent value (when left out) is 85.

  ./football.sh migrate+shrink+back <resource> <tmp_primary> [<percent>]
    Migrate temporarily to <tmp_primary>, then shrink there,
    finally migrate back to old primary and secondaries.
    Default percent value (when left out) is 85.

Actions for (manual) repair in emergency situations:

  ./football.sh manual_handover <resource> <target_primary>
    This is useful in place of going to the machines and starting
    handover on their command line. You don't need to log in.
    All hooks (e.g. for downtime / reporting / etc) are automatically
    called.
    Notice: it will only work when there is already a replica
    at <target_primary>, and when further constraints such as
    clustermanager constraints allow it.
    For a full Football game between different clusters, use
    "migrate" instead.

  ./football.sh manual_migrate_config <resource> <target_primary> [<target_secondary>]
    Transfer only the cluster config, without changing the MARS replicas.
    This does not stop or restart any resources.
    Useful for reverting a failed migration.

  ./football.sh manual_config_update <hostname>
    Only update the cluster config, without changing anything else.
    Useful for manual repair of a failed migration.

  ./football.sh manual_merge_cluster <hostname1> <hostname2>
    Run "marsadm merge-cluster" for the given hosts.
    Hostnames must be from different (former) clusters.

  ./football.sh manual_split_cluster <hostname_list>
    Run "marsadm split-cluster" at the given hosts.
    Useful for fixing failed / asymmetric splits.
    Hint: provide _all_ hostnames which have formerly participated
    in the cluster.

  ./football.sh repair_vm <resource> <primary_candidate_list>
    Try to restart the VM <resource> on one of the given machines.
    Useful during unexpected customer downtime.

  ./football.sh repair_mars <resource> <primary_candidate_list>
    Before restarting the VM like in repair_vm, try to find a local
    LV where a stand-alone MARS resource can be found and built up.
    Use this only when the MARS resources are gone, and when you are
    desperate. Problem: this will likely create a MARS setup which is
    not usable for production, and therefore must be corrected later
    by hand. Use this only during an emergency situation in order to
    get the customers online again, while accepting the downsides of
    this command.

  ./football.sh manual_lock <item> <host_list>
  ./football.sh manual_unlock <item> <host_list>
    Manually lock or unlock an item at all of the given hosts, in
    an atomic fashion. In most cases, use "ALL" for the item.

Global maintenance:

  ./football.sh lv_cleanup <resource>

General features:

  - Instead of <percent>, an absolute amount of storage with suffix
    'k' or 'm' or 'g' can be given.

  - When <resource> is currently stopped, login to the container is
    not possible, and in turn the hypervisor node and primary storage node
    cannot be automatically determined. In such a case, the missing
    nodes can be specified via the syntax
        <resource>:<hypervisor>:<primary_storage>

  - The following LV suffixes are used (naming convention):
      -tmp       = currently emerging version for shrinking
      -preshrink = old version before shrinking took place

  - By adding the option --screener, you can hand over football execution
    to ./screener.sh.
    When some --enable_*_waiting is also added, then the critical
    sections involving customer downtime are temporarily halted until
    a sysadmin says "screener.sh continue $resource" or
    attaches to the session and presses the RETURN key.
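
# Example (illustrative sketch, not taken from football.sh): one way to
# split the <resource>:<hypervisor>:<primary_storage> syntax in bash.
# The function name split_resource_spec and the host names are hypothetical.
split_resource_spec()
{
    local spec="$1"
    local res hyper storage
    # Fields which are left out simply stay empty.
    IFS=: read -r res hyper storage <<< "$spec"
    echo "resource=$res hypervisor=$hyper primary_storage=$storage"
}
split_resource_spec "infong1234:icpu456:istore789"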

Configuration:

You can place shell variable definitions for overriding any
tunables into the following locations:

  football_includes=/usr/lib/mars/plugins /etc/mars/plugins /home/schoebel/mars/football-master.git/plugins /home/schoebel/.mars/plugins ./plugins

  football_confs=/usr/lib/mars/confs /etc/mars/confs /home/schoebel/mars/football-master.git/confs /home/schoebel/.mars/confs ./confs

  football_creds=/usr/lib/mars/creds /etc/mars/creds /home/schoebel/mars/football-master.git/creds /home/schoebel/mars/football-master.git /home/schoebel/.mars/creds ./creds

Filenames should match the following patterns:

  football-*.preconf   Here you may change paths and enable_* variables.
  football-*.conf      Intended for main parameters.
  football-*.postconf  For late overrides after sourcing modules.
  football-*.reconf    Modify runtime parameters during waits.
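
# Example (illustrative sketch; the file name and the values are assumptions,
# not shipped defaults): possible contents of a hypothetical
# ./confs/football-example.conf. The tunables in this help text are defined
# as name="${name:-default}", so a value that is already set when such a
# line is evaluated is kept.
limit_syncs=2
screener=0
# Evaluating the documented default afterwards leaves the override intact:
limit_syncs="${limit_syncs:-4}"
echo "limit_syncs=$limit_syncs screener=$screener"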

## football_includes
# List of directories where football-*.sh and football-*.conf
# files can be found.
football_includes="${football_includes:-/usr/lib/mars/plugins /etc/mars/plugins $script_dir/plugins $HOME/.mars/plugins ./plugins}"

## football_confs
# Another list of directories where football-*.conf files can be found.
# These are sourced in a second pass after $football_includes.
# Thus you can change this during the first pass.
football_confs="${football_confs:-/usr/lib/mars/confs /etc/mars/confs $script_dir/confs $HOME/.mars/confs ./confs}"

## football_creds
# List of directories where various credential files can be found.
football_creds="${football_creds:-/usr/lib/mars/creds /etc/mars/creds $script_dir/creds $script_dir $HOME/.mars/creds ./creds}"

## trap_signals
# List of signal names which should be trapped.
# Traps are important for housekeeping, e.g. automatic
# removal of locks.
trap_signals="${trap_signals:-SIGINT}"

## dry_run
# When set, actions are only simulated.
dry_run=${dry_run:-0}

## verbose
# Increase verbosity.
verbose=${verbose:-0}

## confirm
# Only for debugging: manually started operations can be
# manually checked and confirmed before actually starting operations.
confirm=${confirm:-1}

## force
# Normally, shrinking and extending will only be started if there
# is something to do.
# Enable this for debugging and testing: the check is then skipped.
force=${force:-0}

## debug_injection_point
# RTFS: don't set this unless you are a developer who knows what you are doing.
debug_injection_point="${debug_injection_point:-0}"

## football_logdir
# Where the logfiles should be created.
# HINT: after playing Football in masses for a while, your $logdir will
# be easily populated with hundreds or thousands of logfiles.
# Set this to your convenience.
football_logdir="${football_logdir:-${logdir:-$HOME/football-logs}}"

## football_backup_dir
# In this directory, various backups are created.
# Intended for manual repair.
football_backup_dir="${football_backup_dir:-$football_logdir/backups}"

## screener
# When enabled, delegate execution to the screener.
# Very useful for running Football in masses.
screener="${screener:-1}"

## min_space
# When testing / debugging with extremely small LVs, it may happen
# that mkfs refuses to create extremely small filesystems.
# Use this to ensure a minimum size.
min_space="${min_space:-20000000}"

## cache_repeat_lapse
# When using the waiting capabilities of screener, and when waits
# last very long, your dentry cache may become cold.
# Use this for repeated refreshes of the dentry cache after some time.
cache_repeat_lapse="${cache_repeat_lapse:-120}" # Minutes

## remote_ping
# Before using ssh, ping the target.
# This is only useful in special cases.
remote_ping="${remote_ping:-0}"

## ping_opts
# Options for ping checks.
ping_opts="${ping_opts:--W 1 -c 1}"

## ssh_opt
# Useful for customization to your ssh environment.
ssh_opt="${ssh_opt:--4 -A -o StrictHostKeyChecking=no -o ForwardX11=no -o KbdInteractiveAuthentication=no -o VerifyHostKeyDNS=no}"

## ssh_auth
# Useful for extra -i options.
ssh_auth="${ssh_auth:-}"

## rsync_opt
# The rsync options in general.
# IMPORTANT: some intermediate progress report is absolutely needed,
# because otherwise a false-positive TIMEOUT may be assumed when
# no output is generated for several hours.
rsync_opt="${rsync_opt:- -aSH --info=progress2,STATS}"

## rsync_opt_prepare
# Additional rsync options for preparation and updating
# of the temporary shrink mirror filesystem.
rsync_opt_prepare="${rsync_opt_prepare:---exclude='.filemon2' --delete}"

## rsync_opt_hot
# This is only used at the final rsync, immediately before going
# online again.
rsync_opt_hot="${rsync_opt_hot:---delete}"

## rsync_nice
# Typically, the preparation steps are run with background priority.
rsync_nice="${rsync_nice:-nice -19}"

## rsync_repeat_prepare and rsync_repeat_hot
# Tuning: increases the reliability of rsync and ensures that the dentry cache
# remains hot.
rsync_repeat_prepare="${rsync_repeat_prepare:-5}"
rsync_repeat_hot="${rsync_repeat_hot:-3}"

## rsync_skip_lines
# Number of rsync lines to skip in output (avoid overflow of logfiles).
rsync_skip_lines="${rsync_skip_lines:-1000}"

## use_tar
# Use incremental GNU tar in place of rsync:
# 0 = don't use tar
# 1 = only use for the first (full) data transfer, then use rsync
# 2 = always use tar
# Experience: tar has better performance on local data than rsync, but
# it tends to produce false-positive failure return codes on online
# filesystems which are altered during tar.
# The combined mode 1 tries to find a good compromise between both
# alternatives.
use_tar="${use_tar:-1}"
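
# Example (illustrative sketch, not the actual football.sh logic): how the
# three use_tar modes could select the copy tool for a given transfer pass.
# The function name select_copy_tool and the pass numbering are hypothetical.
select_copy_tool()
{
    local mode="$1" pass="$2"
    case "$mode" in
    2) echo tar ;;
    1) if (( pass == 1 )); then echo tar; else echo rsync; fi ;;
    *) echo rsync ;;
    esac
}
select_copy_tool 1 1   # first (full) transfer in the combined mode
select_copy_tool 1 2   # any later, incremental transfer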

## tar_exe
# Use this for activation of patched tar versions, such as the
# 1&1-internal patched spacetools-tar.
tar_exe="${tar_exe:-/bin/tar}"

## tar_options_src and tar_options_dst
# Here you may give different options for both sides of tar invocations
# (source and destination), such as verbosity options etc.
tar_options_src="${tar_options_src:-}"
tar_options_dst="${tar_options_dst:-}"

## tar_is_fixed
# Tell whether your tar version reports false-positive transfer errors,
# or not.
tar_is_fixed="${tar_is_fixed:-0}"

## tar_state_dir
# This directory is used for keeping incremental tar state information.
tar_state_dir="${tar_state_dir:-/var/tmp}"

## buffer_cmd
# Speed up tar by intermediate buffering.
buffer_cmd="${buffer_cmd:-buffer -m 16m -S 1024m || cat}"

## wait_timeout
# Avoid infinite loops upon waiting.
wait_timeout="${wait_timeout:-$(( 24 * 60 ))}" # Minutes

## lvremove_opt
# Some LVM versions require this for unattended batch operations.
lvremove_opt="${lvremove_opt:--f}"

## automatic recovery options: enable_failure_*
enable_failure_restart_vm="${enable_failure_restart_vm:-1}"
enable_failure_recreate_cluster="${enable_failure_recreate_cluster:-0}"
enable_failure_rebuild_mars="${enable_failure_rebuild_mars:-1}"

## critical_status
# This is the "magic" exit code indicating _criticality_
# of a failed command.
critical_status="${critical_status:-199}"

## serious_status
# This is the "magic" exit code indicating _seriousness_
# of a failed command.
serious_status="${serious_status:-198}"

## pre_hand or --pre-hand=
# Set this to do an ordinary handover to a new start position
# (in the source cluster) before doing anything else.
# This may be used for handover to a different datacenter,
# in order to minimize cross traffic between datacenters.
pre_hand="${pre_hand:-}"

## post_hand or --post-hand=
# Set this to do an ordinary handover to a final position
# (in the target cluster) after everything has successfully finished.
# This may be used to establish a uniform default running location.
post_hand="${post_hand:-}"

## tmp_suffix
# Only for experts.
tmp_suffix="${tmp_suffix:--tmp}"

## shrink_suffix_old
# Suffix for backup LVs. These are kept for some time until
# *_cleanup operations will remove them.
shrink_suffix_old="${shrink_suffix_old:--preshrink}"

## start_regex
# At which $operation the hook football_start
# should be called
start_regex="${start_regex:-^(migrate_prepare|migrate|migrate+|shrink_prepare|shrink)}"

## finished_regex
# At which $operation the hook football_finished
# should be called
finished_regex="${finished_regex:-^(migrate_finish|migrate|migrate+|shrink_finish|shrink)}"
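
# Example (illustrative sketch): checking operation names against the
# default start_regex with the bash =~ operator. How football.sh applies
# the regex internally is not shown in this help text; the helper name
# matches_example_regex is hypothetical. Since the pattern is anchored
# only at the start, any operation beginning with one of the alternatives
# matches here, e.g. migrate+shrink via the "migrate" alternative.
example_regex='^(migrate_prepare|migrate|migrate+|shrink_prepare|shrink)'
matches_example_regex()
{
    [[ "$1" =~ $example_regex ]]
}
for op in migrate migrate+shrink shrink_prepare expand; do
    if matches_example_regex "$op"; then
        echo "$op: matches"
    else
        echo "$op: no match"
    fi
done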

## lock_break_timeout
# When remote ssh commands are failing, remote locks may persist forever.
# Avoid deadlocks by breaking remote locks after this timeout has elapsed.
# NOTICE: this type of lock is only intended for short-term locking.
lock_break_timeout="${lock_break_timeout:-3600}" # seconds

## startup_when_locked
# When == 0:
#   Don't abort and don't wait when a lock is detected at startup.
# When == 1 and when enable_startup_waiting=1:
#   Wait until the lock is gone.
# When == 2:
#   Abort start of script execution when a lock is detected.
#   Later, when locks are set _during_ execution, they will
#   be obeyed when enable_*_waiting is set (instead), and will
#   lead to waits instead of aborts.
startup_when_locked="${startup_when_locked:-1}"

## resource_pre_check
# Useful for debugging of container problems.
# Normally not needed.
resource_pre_check="${resource_pre_check:-0}"

## condition_check_interval
# How often conditions should be re-evaluated.
condition_check_interval="${condition_check_interval:-180}" # Seconds

## limit_syncs
# Limit the number of actually running syncs by waiting
# until fewer than this number of syncs are running at any
# target host.
limit_syncs="${limit_syncs:-4}"

## limit_shrinks
# Limit the number of actually running shrinks by waiting
# until fewer than this number of shrinks are running at any
# target host.
limit_shrinks="${limit_shrinks:-1}"

## count_shrinks_by_tmp_mount
# Only count the temporary mounts.
# Otherwise, LVs are counted. The latter may yield false positives
# because LVs may be created in advance (e.g. at another cluster member).
count_shrinks_by_tmp_mount="${count_shrinks_by_tmp_mount:-1}"

## limit_mars_logfile
# Don't hand over when too much logfile data is missing at the
# new primary site.
limit_mars_logfile="${limit_mars_logfile:-1024}" # MiB

## optimize_dentry_cache
# Don't umount the temporary shrink space unnecessarily.
# Try to shut down the VM / container without umounting.
# Important for high speed.
optimize_dentry_cache="${optimize_dentry_cache:-1}"

## mkfs_cmd
# Tunable for creation of new filesystems.
mkfs_cmd="${mkfs_cmd:-mkfs.xfs -s size=4096 -d agcount=1024}"

## mount_opts
# Options for temporary mounts.
# Not used for ordinary clustermanager operations.
mount_opts="${mount_opts:--o rw,nosuid,noatime,attr2,inode64,usrquota}"

## reuse_mount
# Assume that already existing temporary mounts are the correct ones.
# This will speed up interrupted and repeated runs by a large factor.
reuse_mount="${reuse_mount:-1}"

## reuse_lv
# Assume that temporary LVs are reusable.
reuse_lv="${reuse_lv:-1}"

## reuse_lv_check
# When set, this command is executed for checking whether
# the LV can be reused.
reuse_lv_check="${reuse_lv_check:-xfs_db -c sb -c print -r}"

## do_quota
# Transfer xfs quota information.
# 0 = off
# 1 = global xfs quota transfer
# 2 = additionally local one
do_quota="${do_quota:-2}"

## xfs_dump_dir
# Temporary space for keeping xfs quota dumps.
xfs_dump_dir="${xfs_dump_dir:-$football_backup_dir/xfs-quota-$start_stamp}"

## xfs_quota_enable
# Command for re-enabling the quota system after shrink.
xfs_quota_enable="${xfs_quota_enable:-xfs_quota -x -c enable}"

## xfs_dump and xfs_restore
# Commands for transfer of xfs quota information.
xfs_dump="${xfs_dump:-xfs_quota -x -c dump}"
xfs_restore="${xfs_restore:-xfs_quota -x -c restore}"

## fs_resize_cmd
# Command for online filesystem expansion.
fs_resize_cmd="${fs_resize_cmd:-xfs_growfs -d}"

## migrate_two_phase
# This is useful when the new hardware has a better replication network,
# e.g. 10GBit uplink instead of 1GBit.
# Instead of starting two or more syncs in parallel on the old hardware,
# run the syncs in two phases:
# 1. migrate data to the new primary only.
# 1b. handover to new primary.
# 2. now start migration of data to the new secondaries, over the better
#    network attachment of the new hardware.
migrate_two_phase="${migrate_two_phase:-0}"

## migrate_always_all
# By default, migrate+shrink creates only 1 replica during the initial
# migration.
# When setting this, all replicas are created, which improves resilience,
# but worsens network performance.
migrate_always_all="${migrate_always_all:-0}"

## migrate_early_cleanup
# Early cleanup of old replicas when using migrate_always_all or
# migrate_two_phase.
# Only reasonable when combined with migrate+shrink.
# This is slightly less safe, but saves time when you want to
# decommission old hardware as fast as possible.
# Early cleanup of the old replicas will only be done when
# at least 2 replicas are available at the new (target) side.
# These two new replicas can be created either by
# a) migrate_always_all=1 or
# b) migrate_two_phase=1 or automatically selected (or not) via
# c) auto_two_phase=1
migrate_early_cleanup="${migrate_early_cleanup:-1}"

## user_name
# Normally automatically derived from ssh agent or from $LOGNAME.
# Please override this only when really necessary.
export user_name="${user_name:-$(get_real_ssh_user)}"
export user_name="${user_name:-$LOGNAME}"

## replace_ssh_id_file
# When set, replace current ssh user with this one.
# The new user should not have a passphrase.
# Useful for logging out the original user (interrupting the original
# ssh agent chain).
replace_ssh_id_file="${replace_ssh_id_file:-}"


PLUGIN football-1and1config

1&1-specific plugin for dealing with the cm3 clusters
and their concrete configuration.

## enable_1and1config
# ShaHoLin-specific plugin for working with the infong platform
# (istore, icpu, infong) via 1&1-specific clustermanager cm3
# and related toolsets. Much of it is bound to a singleton database
# instance (clustermw & siblings).
enable_1and1config="${enable_1and1config:-$(if [[ "$0" =~ tetris ]]; then echo 1; else echo 0; fi)}"

## runstack_host
# To be provided in a *.conf or *.preconf file.
runstack_host="${runstack_host:-}"

## runstack_cmd
# Command to be provided in a *.conf file.
runstack_cmd="${runstack_cmd:-}"

## runstack_ping
# Only call runstack when the container is pingable.
runstack_ping="${runstack_ping:-1}"

## dastool_host
# To be provided in a *.conf or *.preconf file.
dastool_host="${dastool_host:-}"

## dastool_cmd
# Command to be provided in a *.conf file.
dastool_cmd="${dastool_cmd:-}"

## update_host
# To be provided in a *.conf or *.preconf file.
update_host="${update_host:-}"

## update_cmd
# Command to be provided in a *.conf file.
update_cmd="${update_cmd:-}"


PLUGIN football-cm3

1&1-specific plugin for dealing with the cm3 cluster manager
and its concrete operating environment (singleton instance).

Current maximum cluster size limit:

Maximum #syncs running before migration can start:

The following marsadm --version must be installed:

The following mars kernel modules must be loaded:

Specific actions for plugin football-cm3:

  ./football.sh clustertool {GET|PUT} <url>
    Call through to the clustertool via REST.
    Useful for manual inspection and repair.

Specific features with plugin football-cm3:

  - Parameter syntax "cluster123" instead of "icpu456 icpu457"
    This is an alternate specification syntax, which is
    automatically replaced with the real machine names.
    It tries to minimize datacenter cross-traffic by
    taking the new $target_primary at the same datacenter
    location where the container is currently running.

## enable_cm3
# ShaHoLin-specific plugin for working with the infong platform
# (istore, icpu, infong) via 1&1-specific clustermanager cm3
# and related toolsets. Much of it is bound to a singleton database
# instance (clustermw & siblings).
enable_cm3="${enable_cm3:-$(if [[ "$0" =~ tetris ]]; then echo 1; else echo 0; fi)}"

## skip_resource_ping
# Enable this only for testing. Normally, a resource name denotes a
# container name == machine name which must be running as a precondition,
# and thus must be pingable over network.
skip_resource_ping="${skip_resource_ping:-0}"

## date_lock
# Don't enter critical sections at certain days of the week,
# and/or during certain hours.
# This is a regex matching against "date +%u_%H"
date_lock="${date_lock:-}"
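
# Example (illustrative sketch; the regex value is an assumption, not a
# recommended default): lock the whole weekend (%u = 6 or 7) and the hours
# 08..17 on weekdays (%u = 1..5). It is assumed here that a match means
# "locked"; the helper name is_date_locked is hypothetical.
example_date_lock='^([67]_[0-9][0-9]|[1-5]_(0[89]|1[0-7]))$'
is_date_locked()
{
    # Use the given stamp, or the current time when none is given.
    local stamp="${1:-$(date +%u_%H)}"
    [[ "$stamp" =~ $example_date_lock ]]
}
if is_date_locked; then echo "locked now"; else echo "not locked now"; fi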

## check_ping_rounds
# Number of pings to try before a container is assumed to
# not respond.
check_ping_rounds="${check_ping_rounds:-5}"

## additional_runstack
# Do an additional runstack after startup of the new container.
# In turn, this will only do something when source and target are
# different.
additional_runstack="${additional_runstack:-1}"

## workaround_firewall
# Documentation of technical debt for later generations:
# This has been needed since July 2017. In the many years before, no
# firewalling was in effect at the replication network, because it is a
# physically separate network from the rest of the networking infrastructure.
# An attacker would first need to gain root access to the _hypervisor_
# (not only to the LXC container and/or to KVM) before gaining access to
# those physical replication network interfaces.
# Since about that time, which is about the same time when the requirements
# for Container Football had been communicated, somebody introduced some
# unnecessary firewall rules, based on "security arguments".
# These arguments were however explicitly _not_ required by the _real_
# security responsible person, and explicitly _not_ recommended by him.
# Now the problem is that it is almost politically impossible to get
# rid of such a "security feature".
# Until the problem is resolved, Container Football requires
# the _entire_ local firewall to be _temporarily_ shut down in order to
# allow marsadm commands over ssh to work.
# Notice: this is _not_ increasing the general security in any way.
# LONGTERM solution / TODO: future versions of mars should no longer
# depend on ssh.
# Then this "feature" can be turned off.
workaround_firewall="${workaround_firewall:-1}"

## ip_magic
# Similarly to workaround_firewall, this is needed since somebody
# introduced additional firewall rules also disabling sysadmin ssh
# connections at the _ordinary_ sysadmin network.
ip_magic="${ip_magic:-1}"

## do_split_cluster
# The current MARS branch 0.1a.y is not yet constructed for forming
# a BigCluster consisting of several thousands of machines.
# When a future version of mars0.1b.y (or 0.2.y) allows this,
# this can be disabled.
do_split_cluster="${do_split_cluster:-1}"

## forbidden_hosts
# Regex for excluding hostnames from any Football actions.
# The script will fail when any of these is encountered.
forbidden_hosts="${forbidden_hosts:-}"

## forbidden_flavours
# Regex for excluding flavours from any Football actions.
# The script will fail when any of these is encountered.
forbidden_flavours="${forbidden_flavours:-}"

## forbidden_bz_ids
# PROVISIONAL regex for excluding certain bz_ids from any Football actions.
# NOTICE: bz_ids are deprecated and should not be used in future
# (technical debts).
# The script will fail when any of these is encountered.
forbidden_bz_ids="${forbidden_bz_ids:-}"

## auto_two_phase
# When this is set, override the global migrate_two_phase parameter
# at runtime by ShaHoLin-specific checks
auto_two_phase="${auto_two_phase:-1}"

## clustertool_host
# URL prefix of the internal configuration database REST interface.
# Set this via *.preconf config files.
clustertool_host="${clustertool_host:-}"

## clustertool_user
# Username for clustertool access.
# By default, scans for a *.password file (see next option).
clustertool_user="${clustertool_user:-$(get_cred_file "*.password" | head -1 | sed 's:.*/::g' | cut -d. -f1)}"

## clustertool_passwd_file
# Here you can supply the encrypted password.
# By default, a file $clustertool_user.password is used
# containing the encrypted password.
clustertool_passwd_file="${clustertool_passwd_file:-$(get_cred_file "$clustertool_user.password")}"

## clustertool_passwd
# Here you may override the password via config file.
# For security reasons, don't provide this at the command line.
clustertool_passwd="${clustertool_passwd:-$(< $clustertool_passwd_file)}" || echo "cannot read a password file *.password for clustermw: you MUST supply the credentials via default curl config files (see man page)"

## do_migrate
# Keep this enabled. Only disable for testing.
do_migrate="${do_migrate:-1}" # must be enabled; disable for dry-run testing

## always_migrate
# Only use for testing, or for special situations.
# This skips the test whether the resource has already been migrated.
always_migrate="${always_migrate:-0}" # only enable for testing

## check_segments
# 0 = disabled
# 1 = only display the segment names
# 2 = check for equality
# WORKAROUND, potentially harmful when used inappropriately.
# The historical physical segment borders need to be removed for
# Container Football.
# Unfortunately, the subproject aiming to accomplish this has not
# progressed for one year now. In the meantime, Container Football can
# only be played within the ancient segment borders.
# After this big impediment is eventually resolved, this option
# should be switched off.
check_segments="${check_segments:-1}"

## enable_mod_deflate
# Internal, for support.
enable_mod_deflate="${enable_mod_deflate:-1}"

## enable_segment_move
# Seems to be needed by some other tooling.
enable_segment_move="${enable_segment_move:-1}"

## override_hwclass_id
# When necessary, override this from $include_dir/plugins/*.conf
override_hwclass_id="${override_hwclass_id:-}" # typically 25007

## override_hvt_id
# When necessary, override this from $include_dir/plugins/*.conf
override_hvt_id="${override_hvt_id:-}" # typically 8057 or 8059

## override_overrides
# When this is set and other override_* variables are not set,
# then try to _guess_ some values.
# No guarantees for correctness.
override_overrides=${override_overrides:-1}

## iqn_base and iet_type and iscsi_eth and iscsi_tid
# Workaround: this is needed for _dynamic_ generation of iSCSI sessions
# bypassing the ordinary ones as automatically generated by the
# cm3 cluster manager (only at the old istore architecture).
# Notice: not needed for regular operations, only for testing.
# Normally, you don't want to shrink over a _shared_ 1MBit iSCSI line.
iqn_base="${iqn_base:-iqn.2000-01.info.test:test}"
iet_type="${iet_type:-blockio}"
iscsi_eth="${iscsi_eth:-eth1}"
iscsi_tid="${iscsi_tid:-4711}"

## monitis_downtime_script
# ShaHoLin-internal
monitis_downtime_script="${monitis_downtime_script:-}"

## monitis_downtime_duration
# ShaHoLin-internal
monitis_downtime_duration="${monitis_downtime_duration:-20}" # Minutes

## shaholin_customer_report_cmd
# Action script when the hardware has improved.
shaholin_customer_report_cmd="${shaholin_customer_report_cmd:-}"

## shaholin_src_cpus and shaholin_dst_cpus
shaholin_src_cpus="${shaholin_src_cpus:-4}"
shaholin_dst_cpus="${shaholin_dst_cpus:-32}"

## shaholin_finished_log
# ShaHoLin-specific logfile, reporting _only_ successful completion
# of an action.
shaholin_finished_log="${shaholin_finished_log:-$football_logdir/shaholin-finished.log}"

## shaholin_action
# OPTIONAL: specific action script with parameters.
shaholin_action="${shaholin_action:-}"

## auto_handover
# Load-balancing across locations.
# Works only together with the new syntax "cluster123".
# Depending on the number of syncs currently running, this
# will internally add --pre-hand and --post-hand options
# dynamically at runtime. This will spread much of the sync
# traffic to per-datacenter local behaviour.
# Notice: this may produce more total customer downtime when
# running a high parallelism degree.
# Thus it tries to reduce unnecessary handovers to other locations.
auto_handover="${auto_handover:-1}"
|
|
|
|
|
|
PLUGIN football-ticket

Generic plugin for creating and updating tickets,
e.g. Jira tickets.

You will need to hook in some external scripts which
then create / update the tickets.

Comment texts may be provided with the following naming conventions:

comment.$ticket_state.txt
comment.$ticket_phase.$ticket_state.txt

Directories where comments may reside:

football_creds=/usr/lib/mars/creds /etc/mars/creds /home/schoebel/mars/football-master.git/creds /home/schoebel/mars/football-master.git /home/schoebel/.mars/creds ./creds
football_confs=/usr/lib/mars/confs /etc/mars/confs /home/schoebel/mars/football-master.git/confs /home/schoebel/.mars/confs ./confs
football_includes=/usr/lib/mars/plugins /etc/mars/plugins /home/schoebel/mars/football-master.git/plugins /home/schoebel/.mars/plugins ./plugins

## enable_ticket
enable_ticket="${enable_ticket:-$(if [[ "$0" =~ tetris ]]; then echo 1; else echo 0; fi)}"

## ticket
# OPTIONAL: the meaning is installation specific.
# This can be used for identifying JIRA tickets.
# Can be set on the command line like "./tetris.sh $args --ticket=TECCM-4711"
ticket="${ticket:-}"

## ticket_get_cmd
# Optional: when set, this script can be used for retrieving ticket IDs
# in place of commandline option --ticket=
# Retrieval should be unique by resource names.
# You may use any defined bash variable by escaping it like
# \$res .
# Example: ticket_get_cmd="my-ticket-getter-script.pl \"\$res\""
ticket_get_cmd="${ticket_get_cmd:-}"

## ticket_create_cmd
# Optional: when set, this script can be used for creating new tickets.
# It will be called when $ticket_get_cmd does not retrieve anything.
# Example: ticket_create_cmd="my-ticket-create-script.pl \"\$res\" \"\$target_primary\""
# Afterwards, the new ticket needs to be retrievable via $ticket_get_cmd.
ticket_create_cmd="${ticket_create_cmd:-}"

## ticket_update_cmd
# This can be used for calling an external command which updates
# the ticket(s) given by the $ticket parameter.
# Example: ticket_update_cmd="my-script.pl \"\$ticket\" \"\$res\" \"\$ticket_phase\" \"\$ticket_state\""
ticket_update_cmd="${ticket_update_cmd:-}"

## ticket_require_comment
# Only update a ticket when a comment file exists in one of the
# directories $football_creds $football_confs $football_includes
ticket_require_comment="${ticket_require_comment:-1}"


PLUGIN football-basic

Generic driver for systemd-controlled MARS pools.
The current version supports only a flat model:
(1) There is a single "big cluster" at metadata level.
    All cluster members are joined via merge-cluster.
    All occurring names need to be globally unique.
(2) The network uses BGP or other means, thus any hypervisor
    can (potentially) start any VM at any time.
(3) iSCSI or remote devices are not supported for now
    (LocalSharding model). This may be extended in a future
    release.
This plugin is exclusive-or with cm3.

Plugin specific actions:

./football.sh basic_add_host <hostname>
Manually add another host to the hostname cache.

## pool_cache_dir
# Directory for caching the pool status.
pool_cache_dir="${pool_cache_dir:-$script_dir/pool-cache}"

## initial_hostname_file
# This file must contain a list of storage and/or hypervisor hostnames
# where a /mars directory must exist.
# These hosts are then scanned for further cluster members,
# and the transitive closure of all host names is computed.
initial_hostname_file="${initial_hostname_file:-./hostnames.input}"

## hostname_cache
# This file contains the transitive closure of all host names.
hostname_cache="${hostname_cache:-$pool_cache_dir/hostnames.cache}"

## resources_cache
# This file contains the transitive closure of all resource names.
resources_cache="${resources_cache:-$pool_cache_dir/resources.cache}"

## res2hyper_cache
# This file contains the association between resources and hypervisors.
res2hyper_cache="${res2hyper_cache:-$pool_cache_dir/res2hyper.assoc}"

## enable_basic
# This plugin is exclusive-or with cm3.
enable_basic="${enable_basic:-$(if [[ "$0" =~ football ]]; then echo 1; else echo 0; fi)}"

## ssh_port
# Set this for separating sysadmin access from customer access
ssh_port="${ssh_port:-}"

## basic_mnt_dir
# Names the mountpoint directory at hypervisors.
# This must coincide with the systemd mountpoints.
basic_mnt_dir="${basic_mnt_dir:-/mnt}"


PLUGIN football-downtime

Generic plugin for communication of customer downtime.

## downtime_cmd_{set,unset}
# External command for setting / unsetting (or communicating) a downtime
# Empty = don't do anything
downtime_cmd_set="${downtime_cmd_set:-}"
downtime_cmd_unset="${downtime_cmd_unset:-}"


PLUGIN football-motd

Generic plugin for motd. Communicate that Football is running
at login via motd.

## enable_motd
# Whether to use the motd plugin.
enable_motd="${enable_motd:-0}"

## update_motd_cmd
# Distro-specific command for generating motd from several sources.
# Only tested for Debian Jessie at the moment.
update_motd_cmd="${update_motd_cmd:-update-motd}"

## download_motd_script and motd_script_dir
# When no script has been installed into /etc/update-motd.d/
# you can do it dynamically here, bypassing any "official" deployment
# methods. Use this only for testing!
# An example script (which should be deployed via your ordinary methods)
# can be found under $script_dir/update-motd.d/67-football-running
download_motd_script="${download_motd_script:-}"
motd_script_dir="${motd_script_dir:-/etc/update-motd.d}"

## motd_file
# This will contain the reported motd message.
# It is created by this plugin.
motd_file="${motd_file:-/var/motd/football.txt}"

## motd_color_on and motd_color_off
# ANSI escape sequences for coloring the generated motd message.
motd_color_on="${motd_color_on:-\\033[31m}"
motd_color_off="${motd_color_off:-\\033[0m}"


PLUGIN football-report

Generic plugin for communication of reports.

## report_cmd_{start,warning,failed,finished}
# External command which is called at start / warning / failure / finish
# of Football.
# The following variables can be used (e.g. as parameters) when
# escaped with a backslash:
# $res = name of the resource (LV, container, etc)
# $primary = the current (old) primary
# $secondary_list = list of current (old) secondaries
# $target_primary = the target primary name
# $target_secondary = list of target secondaries
# $operation = the operation name
# $target_percent = the value used for shrinking
# $txt = some informative text from Football
# Further variables are possible by looking at the sourcecode, or by
# defining your own variables or functions externally or via plugins.
# Empty = don't do anything
report_cmd_start="${report_cmd_start:-}"
report_cmd_warning="${report_cmd_warning:-$script_dir/screener.sh notify \"\$res\" warning \"\$txt\"}"
report_cmd_failed="${report_cmd_failed:-}"
report_cmd_finished="${report_cmd_finished:-}"


PLUGIN football-waiting
|
|
|
|
Generic plugig, interfacing with screener: when this is used
|
|
by your script and enabled, then you will be able to wait for
|
|
"screener.sh continue" operations at certain points in your
|
|
script.
|
|
|
|
## enable_*_waiting
|
|
#
|
|
# When this is enabled, and when Football had been started by screener,
|
|
# then football will delay the start of several operations until a sysadmin
|
|
# does one of the following manually:
|
|
#
|
|
# a) ./screener.sh continue $session
|
|
# b) ./screener.sh resume $session
|
|
# c) ./screener.sh attach $session and press the RETURN key
|
|
# d) doing nothing, and $wait_timeout has exceeded
|
|
#
|
|
# CONVENTION: football resource names are used as screener session ids.
|
|
# This ensures that only 1 operation can be started for the same resource,
|
|
# and it simplifies the handling for junior sysadmins.
|
|
#
|
|
enable_startup_waiting="${enable_startup_waiting:-0}"
|
|
enable_handover_waiting="${enable_handover_waiting:-0}"
|
|
enable_migrate_waiting="${enable_migrate_waiting:-0}"
|
|
enable_shrink_waiting="${enable_shrink_waiting:-0}"
|
|
|
|
## enable_cleanup_delayed and wait_before_cleanup
|
|
# By setting this, you can delay the cleanup operations for some time.
|
|
# This way, you are keeping the old LV contents as a kind of "backup"
|
|
# for some limited time.
|
|
# HINT: dont set to wait_before_cleanuplarge values, because it can
|
|
# seriously slow down Football.
|
|
enable_cleanup_delayed="${enable_cleanup_delayed:-0}"
|
|
wait_before_cleanup="${wait_before_cleanup:-180}" # Minutes
|
|
|
|
## reduce_wait_msg
|
|
# Instead of reporting the waiting status once per minute,
|
|
# decrease the frequency of resporting.
|
|
# Warning: dont increase this too much. Do not exceed
|
|
# session_timeout/2 from screener. Because of the Nyquist criterion,
|
|
# stay on the safe side by setting session_timeout at least to _twice_
|
|
# the time than here.
|
|
reduce_wait_msg="${reduce_wait_msg:-60}" # Minutes
|
|
|
|
|
|
\end{verbatim}
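
The help text above asks for variables such as \texttt{\$res} to be escaped with a backslash inside the various \texttt{*\_cmd} settings. The following is a minimal sketch of why this matters, assuming that the command string is expanded late (here via \texttt{eval}; whether Football does exactly this internally is an assumption). The function \texttt{my\_report} is a hypothetical stand-in for an external reporting script.

\begin{verbatim}
#!/bin/bash
# Hypothetical stand-in for an external reporting script.
my_report() { echo "report: res=$1 state=$2 txt=$3"; }

# The backslashes keep $res and $txt literal at assignment time,
# so the string stored here still contains the variable names.
report_cmd_warning="my_report \"\$res\" warning \"\$txt\""

# These values only become known later, while an operation is running.
res="lv-0001"
txt="sync is slow"

# Assumption: the stored command string is expanded late, e.g. via eval.
out="$(eval "$report_cmd_warning")"
echo "$out"
\end{verbatim}

Without the backslashes, \texttt{\$res} and \texttt{\$txt} would already be expanded when the setting is assigned, typically to empty strings.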
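
The comment file conventions of the \texttt{football-ticket} plugin can be illustrated by a small lookup routine. This is only a sketch under stated assumptions: the function name \texttt{find\_comment\_file} is hypothetical, and the search order (directories in the listed order, and within each directory the phase-specific name before the generic one) is an assumption, not taken from the Football sources.

\begin{verbatim}
#!/bin/bash
# Hypothetical helper: print the first matching comment file, or fail.
find_comment_file()
{
    local phase="$1" state="$2" dir name
    # The three variables hold space-separated directory lists,
    # so they are deliberately left unquoted here.
    for dir in $football_creds $football_confs $football_includes; do
        # Assumption: within a directory, the phase-specific name
        # is preferred over the generic one.
        for name in "comment.$phase.$state.txt" "comment.$state.txt"; do
            if [[ -r "$dir/$name" ]]; then
                echo "$dir/$name"
                return 0
            fi
        done
    done
    return 1
}

# Demo with throwaway directories instead of the real search paths.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/creds" "$demo_dir/confs" "$demo_dir/plugins"
football_creds="$demo_dir/creds"
football_confs="$demo_dir/confs"
football_includes="$demo_dir/plugins"
echo "Migration finished." > "$demo_dir/confs/comment.migrate.finished.txt"

found="$(find_comment_file migrate finished)"
echo "$found"
\end{verbatim}

With \texttt{ticket\_require\_comment=1}, a failing lookup of this kind would correspond to the case where no ticket update is made.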