mars/docu/screener-verbose.help
2018-09-26 15:48:52 +02:00

406 lines
14 KiB
Plaintext

\begin{verbatim}
OVERRIDE verbose=1
./screener.sh: Run _unattended_ processes in screen sessions.
Useful for MASS automation, running hundreds of unattended
commands in parallel.
HINT: for running more than ~500 sessions in parallel, you might need
some system tuning (e.g. rlimits, kernel patches etc) for creating
a huge number of file descritor / sockets / etc.
ADVANTAGE: You may attach to individual screens, kill them, or continue
some waiting commands.
Synopsis:
./screener.sh --help [--verbose]
./screener.sh list-running
./screener.sh list-waiting
./screener.sh list-interrupted
./screener.sh list-illegal
./screener.sh list-timeouted
./screener.sh list-failed
./screener.sh list-critical
./screener.sh list-serious
./screener.sh list-done
./screener.sh list
./screener.sh list-archive
./screener.sh list-screens
./screener.sh run <file.csv> [<condition_list>]
./screener.sh start <screen_id> <cmd> <args...>
./screener.sh [<options>] <operation> <screen_id>
Inquiry operations:
./screener.sh list-screens
Equivalent to screen -ls
./screener.sh list-<type>
Show a list of currently running, waiting (for continuation), failed,
and done/completed screen sessions.
./screener.sh list
First show a list of currently running screens, then
for each <type> a list of (old) failed / completed / sessions
(and so on).
./screener.sh status <screen_id>
Like list-*, but filter <sceen_id> and dont report timestamps.
./screener.sh show <screen_id>
Show the last logfile of <screen_id> at standard output.
./screener.sh less <screen_id>
Show the last logfile of <screen_id> using "less -r".
MASS starting of screen sessions:
./screener.sh run <file.csv> <condition_list>
Commands are launched in screen sessions via "./screener.sh start" commands,
unless the same <screen_id> is already running,
or is in some error state, or is already done (see below).
The commands are given by a column with CSV header name
containing "command", or by the first column.
The <screen_id> needs to be given by a column with CSV header
name matching "screen_id|resource".
The number and type of commands to launch can be reduced via
any combination of the following filter conditions:
--max=<number>
Limit the number of _new_ sessions additionally started this time.
--<column_name>==<value>
Only select lines where an arbitrary CSV column (given by its
CSV header name in C identifier syntax) has the given value.
--<column_name>!=<value>
Only select lines where the colum has _not_ the given value.
--<column_name>=~<bash_regex>
Only select lines where the bash regular expression matches
at the given column.
--max-per=<number>
Limit the number per _distinct_ value of the column denoted by
the _next_ filter condition.
Example: ./screener.sh run test.csv --dry-run --max-per=2 --dst_network=~.
would launch only 2 Football processes per destination network.
Hint: filter conditions can be easily checked by giving --dry-run.
Start / restart / kill / continue screen sessions:
./screener.sh start <screen_id> <cmd> <args...>
Start a new screen session, running arbitrary <cmd> and <args...>
inside.
./screener.sh restart <screen_id>
Works only when the last command for <screen_id> failed.
This will restart the old <cmd> and its <args...> as before.
Use only when you want to repeat the same command once again.
./screener.sh kill <screen_id>
Terminate the running screen session forcibly.
./screener.sh continue
./screener.sh continue <screen_id> [<screen_id_list>]
./screener.sh continue <number>
Useful for MASS automation of processes involving critical sections
such as customer downtime.
When giving a numerical <number> argument, up to that number
of sessions are resumed (ordered by age).
When no further arugment is given, _all_ currently waiting sessions
are continued.
When --auto-attach is given, it will sequentially resume the
sessions to be continued. By default, unless --force_attach is set,
it uses "screen -r" skipping those sessions which are already
attached to somebody else.
This feature works only with prepared scripts which are creating
an empty flagfile
/home/schoebel/mars/mars-migration.git/screener-logdir-testing/running/$screen_id.waiting
whenever they want to wait for manual intervention (for whatever reason).
Afterwards, the script must be polling this flagfile for removal.
This screener operation simply removes the flagfile, such that
the script will then continue afterwards.
Example: look into ./football.sh
and search for occurrences of substring "call_hook start_wait".
./screener.sh wakeup
./screener.sh wakeup <screen_id> [<screen_id_list>]
./screener.sh wakeup <number>
Similar to continue, but refers to delayed commands waiting for
a timeout. This can be used to individually shorten the timeout
period.
Example: Football cleanup operations may be artificially delayed
before doing "lvremove", to keep some sort of 'backup' for a
limited time. When your project is under time pressure, these
delays may be hindering.
Use this for premature ending of such artificial delays.
./screener.sh up <...>
Do both continue and wakeup.
./screener.sh auto <...>
Equivalent to ./screener.sh --auto-attach up <...>
Remember that only session without current attachment will be
attached to.
Attach to a running session:
./screener.sh attach <screen_id>
This is equivalent to screen -x $screen_id
./screener.sh resume <screen_id>
This is equivalent to screen -r $screen_id
Communication:
./screener.sh notify <screen_id> <txt>
May be called from external scripts to send emails etc.
Locking (only when supported by <cmd>):
./screener.sh lock
./screener.sh unlock
./screener.sh lock <screen_id>
./screener.sh unlock <screen_id>
Cleanup / bookkeeping:
./screener.sh clear-critical <screen_id>
./screener.sh clear-serious <screen_id>
./screener.sh clear-interrupted <screen_id>
./screener.sh clear-illegal <screen_id>
./screener.sh clear-timeouted <screen_id>
./screener.sh clear-failed <screen_id>
Mark the status as "done" and move the logfile away.
./screener.sh purge [<days>]
This will remove all old logfiles which are older than
<days>. By default, the variable $screener_log_purge_period
will be used, which is currently set to '30'.
./screener.sh cron
You should call this regulary from a user cron job, in order
to purge old logfiles, or to detect hanging sessions, or to
automatically send pending emails, etc.
Options:
--variable
--variable=$value
These must come first, in order to prevent mixup with
options of <cmd> <args...>.
Allows overriding of any internal shell variable.
--help --verbose
Show all overridable shell variables, also for plugins.
## screener_includes
# List of directories where screener-*.conf files can be found.
screener_includes="${screener_includes:-/usr/lib/mars/plugins /etc/mars/plugins $script_dir/plugins $HOME/.mars/plugins ./plugins}"
## screener_confs
# Another list of directories where screener-*.conf files can be found.
# These are sourced in a second pass after $screener_includes.
# Thus you can change this during the first pass.
screener_confs="${screener_confs:-/usr/lib/mars/confs /etc/mars/confs $script_dir/confs $HOME/.mars/confs ./confs}"
## title
# Used as a title for startup of screen sessions, and later for
# display at list-*
title="${title:-}"
## auto_attach
# Upon start or upon continue/wakuep/up, attach to the
# (newly created or existing) session.
auto_attach="${auto_attach:-0}"
## auto_attach_grace
# Before attaching, wait this time in seconds.
# The user may abort within this sleep time by
# pressing Ctrl-C.
auto_attach_grace="${auto_attach_grace:-10}"
## force_attach
# Use "screen -x" instead of "screen -r" allowing
# shared sessions between different users / end terminals.
force_attach="${force_attach:-0}"
## drop_shell
# When a <cmd> fails, the screen session will not terminated immediately.
# Instead, an interactive bash is started, so can later attach and
# rectify any probllems.
# WARNING! only activate this if you regulary check for failed sessions
# and then manually attach to them. Don't use this when running hundreds
# or thousand in parallel.
drop_shell="${drop_shell:-0}"
## session_timeout
# Detect hanging sessions when they don't produce any output anymore
# for a longer time. Hanging sessions are then marked as either
# 'timeout' or 'critical'.
session_timeout="${session_timeout:-$(( 3600 * 3 ))}" # seconds
## screener_logdir or logdir
# Where the logfiles and all status information is kept.
export screener_logdir="${screener_logdir:-${logdir:-$HOME/screener-logs}}"
## screener_command_log
# This logfile will accumulate all relevant $0 command invocations,
# including timestamps and ssh agent identities.
# To switch off, use /dev/null here.
screener_command_log="${screener_command_log:-$screener_logdir/commands.log}"
## screener_cron_log
# Since "$0 cron" works silently, you won't notice any errors.
# This logfiles gives you a chance for checking any problems.
screener_cron_log="${screener_cron_log:-$screener_logdir/cron.log}"
## screener_log_purge_period
# $0 cron or $0 purge will automatically remove all old logfiles
# from $screener_logdir/*/ when this period is exceeded.
screener_log_purge_period="${screener_log_purge_period:-30}" # Days
## screener_log_purge_archive
# When set, the logfiles will be moved to $screener_logdir/archive/
# Otherwise they will be deleted.
screener_log_purge_archive="${screener_log_purge_archive:-1}"
## dry_run
# Dont actually start screen sessions when set.
dry_run="${dry_run:-0}"
## verbose
# increase speakiness.
verbose=${verbose:-0}
## debug
# Some additional debug messages.
debug="${debug:-0}"
## sleep
# Workaround races by keeping sessions open for a few seconds.
# This is useful for debugging of immediate script failures.
# You have some short time window for attaching.
# HINT: instead, just inspect the logfiles in $screener_logdir/*/*.log
sleep="${sleep:-3}"
## screen_cmd
# Customize the screen command (e.g. add some further options, etc).
screen_cmd="${screen_cmd:-screen}"
## use_screenlog
# Add the -L option. Not really useful when running thousands of
# parallel screen sessions, because the automatically generated filenames
# are crap, and cannot be set in advance.
# Useful for basic debugging of setup problems etc.
use_screenlog="${use_screenlog:-0}"
## waiting_txt and delay_txt and condition_txt
# RTFS Don't use this, unless you know what you are doing.
waiting_txt="${waiting_txt:-SCREENER_waiting_WAIT}"
delayed_txt="${delayed_txt:-SCREENER_delayed_WAIT}"
condition_txt="${condition_txt:-SCREENER_condition_WAIT}"
## critical_status
# This is the "magic" exit code indicating _criticality_
# of a failed command.
critical_status="${critical_status:-199}"
## serious_status
# This is the "magic" exit code indicating _seriosity_
# of a failed command.
serious_status="${serious_status:-198}"
## interrupted_status
# This is the "magic" exit code indicating a manual interruption
# (e.g. keypress Ctl-c)
interrupted_status="${interrupted_status:-190}"
## illegal_status
# This is the "magic" exit code indicating an illegal command
# (e.g. syntax error, illegal arguments, etc)
illegal_status="${illegal_status:-191}"
## timeouted_status
# This is the "magic" internal code indicating a
# hanging session (see $session_timeout).
timeouted_status="${timeouted_status:-195}"
## less_cmd
# Used at $0 less $id
less_cmd="${less_cmd:-less -r}"
## date_format
# Here you can customize the appearance of list-* commands
date_format="${date_format:-%Y-%m-%d %H:%M}"
## csv_delimit
# The delimiter used for CSV file parsing
csv_delim="${csv_delim:-;}"
## csv_cmd_fields
# Regex telling the field name for 'cmd'
csv_cmd_fields="${csv_cmd_fields:-command}"
## csv_id_fields
# Regex telling the field name for 'screen_id'
csv_id_fields="${csv_id_fields:-screen_id|resource}"
## csv_remove
# Regex for global removal of command options
csv_remove="${csv_remove:---screener}"
## user_name
# Normally automatically derived from ssh agent or from $LOGNAME.
# Please override this only when really necessary.
export user_name="${user_name:-$(ssh-add -l | grep -o '[^ ]+@[^ ]+' | sort -u | tail -1)}"
export user_name="${user_name:-$LOGNAME}"
## screener_break_timeout
# Avoid deadlocks by breaking a screener lock after this timeout has elapsed.
# NOTICE: these type of locks are only intended for short-term locking.
screener_break_timeout="${screener_break_timeout:-30}" # seconds
## tmp_dir and tmp_stub
# Where temporary files are residing
tmp_dir="${tmp_dir:-/tmp}"
tmp_stub="${tmp_stub:-$tmp_dir/screener.$$}"
Running hook: email_describe_plugin
PLUGIN screener-email
Generic plugin for sending emails (or SMS via gateways)
upon status changes, such as script failures.
## email_*
# List of email addresses.
# Empty = don't send emails.
email_critical="${email_critical:-}"
email_serious="${email_serious:-}"
email_failed="${email_failed:-}"
email_warning="${email_warning:-}"
email_waiting="${email_waiting:-}"
email_done="${email_done:-}"
## sms_*
# List of email addresses of SMS gateways.
# These may be distinct from email_*.
# Empty = don't send sms.
sms_critical="${sms_critical:-}"
sms_serious="${sms_serious:-}"
sms_failed="${sms_failed:-}"
sms_warning="${sms_warning:-}"
sms_waiting="${sms_waiting:-}"
sms_done="${sms_done:-}"
## email_cmd
# Command for email sending.
# Please include your gateways etc here.
email_cmd="${email_cmd:-mailx -S smtp=mx.nowhere.org:587 -S smpt-auth-user=test}"
## email_logfiles
# Whether to include logfiles in the body.
# Not used for sms_*.
email_logfiles="${email_logfiles:-1}"
\end{verbatim}