mars/kernel/Kconfig
2019-11-05 19:11:05 +01:00

#
# MARS configuration
#
config MARS
tristate "block storage replication MARS"
depends on BLOCK && PROC_SYSCTL && HIGH_RES_TIMERS
depends on !DEBUG_SLAB && !DEBUG_SG
default n
---help---
See https://github.com/schoebel/mars/docu/
Only compile as a module!
config MARS_CHECKS
bool "enable simple runtime checks in MARS"
depends on MARS
default y
---help---
These checks should be rather lightweight. Use them
for beta testing and for production systems where
safety is more important than performance.
In case of bugs in the reference counting, an automatic repair
is attempted, which lowers the risk of memory corruption.
Disable only if you need the very last bit of
performance.
If unsure, say Y here.
config MARS_DEBUG
bool "enable full runtime checks and some tracing in MARS"
depends on MARS
default n
---help---
Some of these checks and some additional error tracing may
consume noticeable amounts of memory.
OFF for production systems. ON for testing!
config MARS_DEBUG_DEFAULT
bool "turn on debug messages by default (may flood the logfiles)"
depends on MARS_DEBUG
default n
---help---
Normally OFF.
config MARS_DEBUG_MEM
bool "debug memory operations"
depends on MARS_DEBUG
default n
---help---
This adds considerable space and time overhead, but catches
many errors (including some that are not caught by kmemleak).
Use only for development and thorough testing!
config MARS_DEBUG_MEM_STRONG
bool "intensified debugging of memory operations"
depends on MARS_DEBUG_MEM
default y
---help---
Trace all block allocations, find more errors.
Adds some overhead.
Use for debugging of new bricks or for intensified
regression testing.
config MARS_DEBUG_ORDER0
bool "also debug order0 operations"
depends on MARS_DEBUG_MEM
default n
---help---
Turn even order 0 allocations into order 1 ones and provoke
heavy memory fragmentation problems from the buddy allocator,
but catch some additional memory problems.
Use only if you know what you are doing!
Normally OFF.
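#
# Illustration only (not part of the option definitions): following the
# "depends on" lines above, a .config fragment that enables the order-0
# debugging option would presumably need the whole dependency chain.
# CONFIG_MARS=m reflects the "Only compile as a module!" advice above.
# This is a sketch for a test build, not a production setting.
#
#   CONFIG_MARS=m
#   CONFIG_MARS_DEBUG=y
#   CONFIG_MARS_DEBUG_MEM=y
#   CONFIG_MARS_DEBUG_ORDER0=y
#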
config MARS_DEFAULT_PORT
int "port number where MARS is listening"
depends on MARS
default 7777
---help---
Best practice is to uniformly use the same port number
in a cluster. Therefore, this is a compile-time constant.
You may override this at insmod time via the mars_port= parameter.
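#
# Illustration only: overriding the compile-time port when loading the
# module. The module file name mars.ko and the value 7800 are assumptions
# chosen for this example; mars_port= is the parameter named in the help
# text above.
#
#   insmod mars.ko mars_port=7800
#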
config MARS_SEPARATE_PORTS
bool "use separate port numbers for traffic shaping"
depends on MARS
default y
---help---
When enabled, the following port assignments will be used:
CONFIG_MARS_DEFAULT_PORT : updates of symlinks
CONFIG_MARS_DEFAULT_PORT + 1 : replication of logfiles
CONFIG_MARS_DEFAULT_PORT + 2 : (initial) sync traffic
As a consequence, external traffic shaping may be used to
individually control the network speed for different types
of traffic.
Please don't hinder the symlink updates in any way -- they are
most vital, and they produce no mass traffic at all
(it's only some kind of meta-information traffic).
Say Y if you have a big datacenter.
Say N if you cannot afford a bigger hole in your firewall.
If unsure, say Y.
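#
# Worked example for MARS_SEPARATE_PORTS, assuming the default
# CONFIG_MARS_DEFAULT_PORT=7777 is left unchanged:
#
#   7777 : updates of symlinks
#   7778 : replication of logfiles
#   7779 : (initial) sync traffic
#
# A firewall or traffic shaper would then presumably need to cover all
# three ports instead of one.
#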
config MARS_IPv4_TOS
bool "use TOS / DSCP in IPv4"
depends on MARS_SEPARATE_PORTS
default y
---help---
Tag IP traffic differently for different ports.
In certain private networks, this can relieve certain
network bottlenecks.
config MARS_MEM_RETRY
bool "make MARS memory allocation more robust"
depends on MARS
default y
---help---
Normally ON. Switch off only for systems having very low memory.
config MARS_LOGDIR
string "absolute path to the logging directory"
depends on MARS
default "/mars"
---help---
Path to the directory where all MARS messages will reside.
Usually this is equal to the global /mars directory.
Logfiles and status files obey the following naming conventions:
0.debug.log
1.info.log
2.warn.log
3.error.log
4.fatal.log
5.total.log
Logfiles must already exist in order to be appended.
Logfiles can be rotated by renaming them and creating
a new empty file in place of the old one.
Status files follow the same rules, but .log is replaced
by .status, and they are created automatically. Their content
is, however, limited to the last few seconds or minutes.
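#
# Illustration only, assuming the default MARS_LOGDIR of /mars. Following
# the naming rule above, the status files would be named:
#
#   0.debug.status  1.info.status  2.warn.status
#   3.error.status  4.fatal.status 5.total.status
#
# Rotating one logfile by hand, as described above, could look like this
# (the name 3.error.log.old is a hypothetical choice):
#
#   mv /mars/3.error.log /mars/3.error.log.old
#   touch /mars/3.error.log
#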
config MARS_ROLLOVER_INTERVAL
int "rollover time of logging status files (in seconds)"
depends on MARS
default 3
---help---
May influence the system load; don't use values that are too low.
config MARS_SCAN_INTERVAL
int "re-scanning of symlinks in /mars/ (in seconds)"
depends on MARS
default 5
---help---
May influence the system load; don't use values that are too low.
config MARS_PROPAGATE_INTERVAL
int "network propagation delay of changes in /mars/ (in seconds)"
depends on MARS
default 5
---help---
May influence the system load; don't use values that are too low.
config MARS_SYNC_FLIP_INTERVAL
int "interrupt sync by logfile update after (seconds)"
depends on MARS
default 60
---help---
0 = OFF. Normally ON.
When disabled, application of logfiles may wait for
a very long time, until full sync has finished. As a
consequence, your /mars/ filesystem may run out
of space. When enabled, the applied logfiles can
be deleted, freeing space on /mars/. Therefore, you
will usually want this. However, you may increase
the time interval to increase throughput at the
expense of latency.
config MARS_NETIO_TIMEOUT
int "timeout for remote IO operations (in seconds)"
depends on MARS
default 30
---help---
In case of network hangs, don't wait forever, but rather
abort with -ENOTCONN.
When set to 0, wait forever (may lead to hanging operations
similar to NFS hard mounts).
config MARS_MEM_PREALLOC
bool "avoid memory fragmentation by preallocation"
depends on MARS
default y
---help---
Normally ON. Switch off only for EXPERIMENTS!
config MARS_FAST_FULLSYNC
bool "decrease network traffic at initial sync"
depends on MARS
default y
---help---
Normally ON.
When on, both sides will read the data, compute an MD5
checksum, and compare the checksums. Only if they
mismatch will the data actually be transferred over
the network. This may increase the IO traffic in exchange
for less network traffic. Usually it does no harm to re-read
the same data twice (only in case of mismatches) over bio
because RAID controllers will usually cache their data
for some time. In case of buffered aio reads from filesystems,
the data is cached by the kernel anyway.
config MARS_LOADAVG_LIMIT
bool "stall copy traffic when loadavg gets too high"
depends on MARS
default y
---help---
Normally ON.
config MARS_SHOW_CONNECTIONS
bool "show connection status symlinks"
depends on MARS
default n
---help---
Normally OFF.
When enabled, the status of all current network connections is
written to /mars/resource-$resource/local-$host/connect-*
using the following encoding:
-1 = connection is closed (not intended to start)
0 = connection is interrupted / not established
1 = connection is established at TCP level
Warning! The symlinks are not deleted by the kernel. You have
to clean up old symlinks yourself in userspace.
If you forget to clean up at regular intervals, you may
end up with thousands of symlinks accumulating in the
/mars/resource-$resource/local-$host/ directory over
time.
If unsure, say N here.
config MARS_LOGROT
bool "allow logrotate during operation"
depends on MARS
default y
---help---
Normally ON. Switch off only for EXPERIMENTS!
config MARS_MIN_SPACE_4
int "absolutely necessary free space in /mars/ (hard limit in GB)"
depends on MARS
default 2
---help---
HARDEST EMERGENCY LIMIT
When free space in /mars/ drops under this limit,
transaction logging to /mars/ will stop entirely,
even at all primary resources. All IO will directly go to the
underlying raw devices. The transaction logfile sequence numbers
will be disrupted, deliberately leaving holes in the sequence.
This is a last-resort desperate action of the kernel.
As a consequence, all secondaries will have no chance to
replay across that gap, even if they got the logfiles.
The secondaries will stop at the gap, left in an outdated,
but logically consistent state.
After the problem has been fixed, the secondaries must
start a full-sync in order to continue replication at the
recent state.
This is the hardest measure the kernel can take in order
to TRY to continue undisrupted operation at the primary side.
In general, you should avoid such situations at the admin level.
Please implement your own monitoring at the admin level, which warns
you and/or takes appropriate countermeasures much earlier.
Never rely on this emergency feature!
config MARS_MIN_SPACE_3
int "free space in /mars/ for primary logfiles (additional limit in GB)"
depends on MARS
default 2
---help---
MEDIUM EMERGENCY LIMIT
When free space in /mars/ drops under
MARS_MIN_SPACE_4 + MARS_MIN_SPACE_3,
older transaction logfiles will be deleted at primary resources.
As a consequence, the secondaries may no longer be able to
get a consecutive series of copies of logfiles.
As a result, they may get stuck somewhere in between at an
outdated, but logically consistent state.
This is a desperate action of the kernel.
After the problem has been fixed, some secondaries may need to
start a full-sync in order to continue replication at the
recent state.
In general, you should avoid such situations at the admin level.
Please implement your own monitoring at the admin level, which warns
you and/or takes appropriate countermeasures much earlier.
Never rely on this emergency feature!
config MARS_MIN_SPACE_2
int "free space in /mars/ for secondary logfiles (additional limit in GB)"
depends on MARS
default 2
---help---
MEDIUM EMERGENCY LIMIT
When free space in /mars/ drops under
MARS_MIN_SPACE_4 + MARS_MIN_SPACE_3 + MARS_MIN_SPACE_2,
older transaction logfiles will be deleted at secondary resources.
As a consequence, some local secondary resources
may get stuck somewhere in between at an
outdated, but logically consistent state.
This is a desperate action of the kernel.
After the problem has been fixed and the free space becomes
larger than MARS_MIN_SPACE_4 + MARS_MIN_SPACE_3 + MARS_MIN_SPACE_2
+ MARS_MIN_SPACE_2, the secondary tries to fetch the missing
logfiles from the primary again.
However, if the necessary logfiles have been deleted at the
primary side in the meantime, this may fail.
In general, you should avoid such situations at the admin level.
Please implement your own monitoring at the admin level, which warns
you and/or takes appropriate countermeasures much earlier.
Never rely on this emergency feature!
config MARS_MIN_SPACE_1
int "free space in /mars/ for replication (additional limit in GB)"
depends on MARS
default 2
---help---
LOWEST EMERGENCY LIMIT
When free space in /mars/ drops under MARS_MIN_SPACE_4
+ MARS_MIN_SPACE_3 + MARS_MIN_SPACE_2 + MARS_MIN_SPACE_1,
fetching of transaction logfiles will stop at local secondary
resources.
As a consequence, some local secondary resources
may get stuck somewhere in between at an
outdated, but logically consistent state.
This is a desperate action of the kernel.
After the problem has been fixed and the free space becomes
larger than MARS_MIN_SPACE_4 + MARS_MIN_SPACE_3 + MARS_MIN_SPACE_2
+ MARS_MIN_SPACE_2, the secondary will continue fetching its
copy of logfiles from the primary side.
In general, you should avoid such situations at the admin level.
Please implement your own monitoring at the admin level, which warns
you and/or takes appropriate countermeasures much earlier.
Never rely on this emergency feature!
config MARS_MIN_SPACE_0
int "total space needed in /mars/ (additional limit in GB)"
depends on MARS
default 12
---help---
Operational prerequisite.
In order to use MARS, the total space available in /mars/ must
be at least MARS_MIN_SPACE_4 + MARS_MIN_SPACE_3 + MARS_MIN_SPACE_2
+ MARS_MIN_SPACE_1 + MARS_MIN_SPACE_0.
If you cannot afford that amount of storage space, please use
DRBD in place of MARS.
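#
# Worked example, assuming all MARS_MIN_SPACE_* defaults above are left
# unchanged (2, 2, 2, 2 and 12 GB). The help texts above then translate
# into these thresholds for free space in /mars/ dropping below:
#
#   2+2+2+2 = 8 GB : fetching of logfiles stops at local secondaries
#   2+2+2   = 6 GB : older logfiles are deleted at secondary resources
#   2+2     = 4 GB : older logfiles are deleted at primary resources
#   2       = 2 GB : transaction logging stops entirely
#
# Total space required in /mars/ with these defaults: 8 + 12 = 20 GB.
#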
config MARS_LOGROT_AUTO
int "automatic logrotate when logfile exceeds size (in GB)"
depends on MARS_LOGROT
default 32
---help---
You could switch this off by setting it to 0. However, deletion
of really huge logfiles can take several minutes, or even substantial
fractions of hours (depending on the underlying filesystem).
Thus it is highly recommended to limit the logfile size to some
reasonable maximum size. Switch off only for experiments!
config MARS_PREFER_SIO
bool "prefer sio bricks instead of aio"
depends on MARS
default n
---help---
Normally OFF for production systems.
Use only as an alternative for testing.