mirror of https://github.com/schoebel/mars
dd1e4e1323
[small adaptations by Thomas Schoebel-Theuer, and some problems with LyX-specific file format fixed] |
||
---|---|---|
.. | ||
example | ||
mars | ||
scripts | ||
README | ||
default-main.conf |
README
GPLed software AS IS, sponsored by 1&1 Internet AG (www.1und1.de). The test suite is work in progress. The test suite was developed by Frank Liepold during his stay at 1&1. The email address frank.liepold@1und1.de will no longer work, since Frank has left 1&1 since May 2014. Currently, there is no dedicated successor for him :( Also note that Frank developed the test suite on his personal workstation, which had a slow network connection to the datacenters where the test hosts were running. Thus it happeend that ssh commands were given slowly. When you have very fast ssh connections, some races may show up which did not occur in the environment where Frank was working. As a workaround, you may place something like "sleep 3;" in front of any ssh command, but this will of course lead to a large slowdown (anyway, a near-full run of the testsuite takes almost 24h when started on my workstation). This should be fixed when a successor of Frank is available. ============================================================================= Contents -------- 1. Framework 1.1. Global settings 1.2. Running a test 1.2.1. Test output 1.2.2. Help to understand a test 1.3. Programming hints and conventions 1.4. Error handling 1.5. Signal handling and unexpected termination 1.6. Example 2. mars specific settings 2.1. Configuration of the local build environment 2.2. Configuration of the test hosts on which the tests are running 2.2.1. Access requirements 2.2.2. Installation kernel and mars module on the test hosts 2.3. Configuration of logical volumes 2.4. Resources 2.5. Starting the whole test suite via cronjob 2.6. Concurrent test runs 2.7. Firewalls 1. Framework ------------ 1.1. Global settings ------------------ The directory where this README resides (normally <git-repo>/test_suite) is called base directory in the following. If relative paths are given they refer to this base directory. The frame work of the test suite consists of the files README, default-main.conf and the subdirectory scripts. You specify your use case of the test suite by creating a subdirectory - e.g. my_use_case_dir - of the base directory. In this subdirectory a file global.conf must be created, containing at least the following two variables: global_user_dir must be set to .../test_suite/my_use_case_dir global_user_module_dir must be set to a subdirectory of global_user_dir. In global_user_module_dir the use case specific shell scripts (called modules) are to be located. We recommend global_user_module_dir=$global_user_dir/modules. 1.2. Running a test ------------------- Tests are executed by a call to scripts/start_test.sh from a subdirectory of $global_user_dir. The scope and configuration of a test is completely described by this subdirectory (we call it start directory) as follows: - All subdirectories of the start directory containing a (usually empty) file named i_am_a_testdirectory define one test case. In the following we call such a subdirectory a test directory. If a test for one test directory below the start directory fails, the tests belonging to the other test directories of the start directory will not be executed. Though only tests which are likely not to fail should be executed by only one call to start_test.sh. The simple and recommended way is to call start_test.sh for each test directory once (which means that the start directory consists of only one test directory and coincides with it). - The function set usable is defined in the following set of shell scripts: -- scripts/lib/lib*.sh -- scripts/modules/*.sh -- $global_user_module_dir/*.sh - The configuration of the test case is defined by a set of *.conf files which are included by start_test.sh in the following order: -- default-*.conf: These are the configuration files belonging to the scripts scripts/modules/*.sh which are also included bei start_test.sh and which define some library functions which are independent of the specific use case. -- $global_user_dir/global.conf: Configuration file for use case specific global variables which cannot be assigned to a certain user module. -- $global_user_dir/default-*.conf: These are the configuration files belonging to the scripts $global_user_module_dir/*.sh which are also included bei start_test.sh and which define all functions necessary for your use case. -- <subdirname>.conf where subdirname runs top down through the parent directories starting with the first subdirectory of a directory called root config directory to the test directory. This config root directory is set to the start directory (on the directory branch start directory -> test directory) by default. It may be changed by the option --config_root_dir=<my dir>. If the start directory coincides with the test directory the file <test directory name>.conf is included for convenience (in the strict sense there are no true subdirectories of the start directory residing above the test directory). The <subdirnam>.conf files may reside in the test directory or any of its parent directories (up to 20 levels). Examples: 1. start directory = test_cases/perf test directory = test_cases/perf/apply/no_parallel_writer Option --config_root_dir not given leads to including of apply.conf and no_parallel_writer.conf (in this order!) which must reside as mentioned above in the test directory or any of it's parent directories (up to 20 levels). 2. start directory = test_cases/perf/apply/no_parallel_writer test directory = test_cases/perf/apply/no_parallel_writer Option --config_root_dir=test_cases/perf leads to including the same *.conf files as above. 3. start directory = test_cases/perf/apply/no_parallel_writer test directory = test_cases/perf/apply/no_parallel_writer Option --config_root_dir given or not leads to including of no_parallel_writer.conf - In at least one of the included <subdirname>.conf files of a test you must set one of the variables prepare_list, setup_list, run_list, cleanup_list or finish_list. Each of these variables may be empty or contain a list of shell function names of functions defined in one of the set of shell scripts mentioned above. These functions are called by start_test.sh in the order they appear in the mentioned *list variables. The *list variables are evaluated in the order prepare_list, setup_list, run_list, cleanup_list, finish_list. 1.2.1. Test output ------------------ The output of start_test.sh is very verbose to ease the research in case of any errors. You are encouraged to use grep & Co. to shrink it as desired. The output consists of the following sections: - List titled "Sourcing libraries in <library directory>" of included scripts from <libraries directory>/lib*.sh - List titled "Sourcing modules and default configuration" of included scripts (= the set of shell scripts mentioned above) and corresponding included default-*.conf files. - Line titled "Scanning subdirectories of <start directory>" followed by a list of ignored or skipped subdirectories - Per test directory: - Line titled "Test directory <test directory> <date and time>" - List titled "Sourcing config files between <config_root_dir> and <test directory>" of included *.conf files corresponding to the directory structure. - List titled "Configuration variables" of all configuration variables set via all included *.conf file - Section titled "Creating lock files" - Section titled "Starting <test directory> <date and time>" In this section lines starting with "calling ..." mark the call to one of the functions listed in the aforementioned variables prepare_list, setup_list, run_list, cleanup_list and finish_list. The main part of this section consists of output lines of the function lib_vmsg. These lines have the following format: <date time> [[<callstack>]] <message> The stack level which the bash call stack is printed from (<callstack>) is configurable via variable main_min_stack_level. - In case of an error: Subsection titled "Callstack" (to stderr) - Subsection titled "Deleting lock files" - Subsection titled "General checks of error and log files" - Line titled "Failure <return code> <test directory> <date and time>" (to stderr) in case of failure, "Finished <test directory> <date and time>" (to stdout) otherwise - If all tests in all test directories of the start directory terminated successfully: Line titled "Finished start directory <start directory>" 1.2.2. Help to understand a test -------------------------------------- Calling start_test.sh --help from a test directory doesn't start a test but gives you the output mentioned in 1.2.1 without the sections produced during real test execution. In particular the section "Configuration variables" is printed. So you can determine which functions are called via the variable run_list. These functions should be commented extensively enough to be able to understand the test's purpose. 1.3. Programming hints and conventions -------------------------------------- - all global variables should be defined and explained in a default-<module_name>.conf file - the names of all global variables resp. all functions should have as prefix the module name of the default-*.conf file resp. of the *.sh file which they are defined in. - in case of an error certain cleanup functions may be called (e.g. if a network connection is cut during a test case it should be restored). The associative array main_error_recovery_functions has function names as index and function parameters as values. In lib_exit each function contained in main_error_recovery_functions is called with it's corresponding arguments. 1.4. Error handling ------------------- Instead of directly calling exit you should use lib_exit from lib/lib.sh, which prints a call stack and does further checks. If the variable lib_general_checks_after_every_test_function is set in global_user_dir/global.conf to a shell function contained in the use case specific modules, this function is called by lib_exit. 1.5. Signal handling and unexpected termination ----------------------------------------------- Normally signals given to start_test.sh should terminate all child processes but this can not be guaranteed. Thus you have to kill left over child processes manually. Thus we recommend that all scripts and programs you are starting within a test case should have a significant part of name - e.g. in the use case mars we used the string MARS as part of all script names. 1.6. Example ------------ In the base directory's subdirectory example you find a simple use case. The directory tree looks as follows: example/blue.conf example/color_tests.conf example/default-colors.conf example/default-lib_err.conf example/global.conf example/green.conf example/red.conf example/color_tests example/color_tests/blue example/color_tests/blue/i_am_a_testdirectory example/color_tests/green example/color_tests/green/i_am_a_testdirectory example/color_tests/red example/color_tests/red/i_am_a_testdirectory example/modules example/modules/colors.sh example/modules/lib_err.sh Three test case are defined because there are three files i_am_a_testdirectory. The used shell functions are defined in the scripts colors.sh and lib_err.sh with corresponding *.conf files default-colors.conf resp. default-lib_err.conf. The test case configurations are given via the directory tree and the *.conf files red.conf, green.conf and blue.conf. In global.conf we defined global_user_dir, global_user_module_dir, main_min_stack_level and lib_general_checks_after_every_test_function. To start all three tests do: cd base directory/example ../scripts/start_test.sh 2>&1 | tee /tmp/test_all.out Thoroughly studying the output file /tmp/my_test.out will clarify the working of the test frame work. The third test (red) is of special interest because it fails. Thus in this case you can study the call stack, the call to lib_general_checks_after_every_test_function (in our example set to lib_err_general_checks_after_every_test_case) and the use of main_error_recovery_functions (used in colors.sh). To start only the test case red do: cd base directory/example/colors/red ../../../scripts/start_test.sh --config_root_dir=../.. 2>&1 | tee /tmp/test_red.out Note that the option --config_root_dir is important in this case because it tells start_test.sh up to which level it should regard parent directories and the corresponding *.conf files (see section 1.2). 2. mars specific settings ------------------------- 2.1. Configuration of the local build environment ------------------------------------------------- To build the mars module and userspace tools you need: - a directory containing a git repository which contains the kernel sources variable : checkout_mars_kernel_src_directory to be specified in file : default-checkout_mars.conf - the branch from which to take the kernel sources variable : checkout_mars_kernel_git_branch to be specified in file : default-checkout_mars.conf - a directory containing the kernel sources against which the mars module should be built (may be the same as checkout_mars_kernel_src_directory) variable : make_mars_kernel_src_directory to be specified in file : default-make_mars.conf - a directory containing a git repository which contains the mars sources variable : checkout_mars_src_directory to be specified in file : default-checkout_mars.conf - the branch from which to take the mars sources variable : checkout_mars_git_branch to be specified in file : default-checkout_mars.conf - a directory containing the mars sources which is linked in to the kernel source directory make_mars_kernel_src_directory for making the mars module. May coincide with checkout_mars_src_directory variable : make_mars_src_directory to be specified in file : default-make_mars.conf 2.2. Configuration of the test hosts on which the tests are running ------------------------------------------------------------------- The test hosts are given in global_host_list (global.conf). The two values can be overwritten by the environment variables MARS_INITIAL_PRIMARY_HOST and MARS_INITIAL_SECONDARY_HOST 2.2.1. Access requirements -------------------------- The tests are executed by ssh commands from your work station to the test hosts. These commands must not prompt for a password. Though a id file (for ssh -i <id-file>) with null password must be specified in main_ssh_idfile_opt (default-main.conf) or given via the environment variable MARS_SSH_KEYFILE All test hosts must have ssh access to each other without prompt for passwords (this is required by marsadm join-cluster). On your work station you must have sudo permissions for "make install", "cp". And root of your work station must have ssh access (as root) without prompt for password to all test hosts (this is required, because local files belonging to root must be copied to the test hosts). 2.2.2. Installation kernel and mars module on the test hosts ------------------------------------------------------------ This is done by executing the tests build_test_environment/checkout build_test_environment/make/make_mars/grub build_test_environment/install_mars It's a must to read default-checkout_mars.conf, default-make_mars.conf, default-install_mars.conf and the *.conf files in the directories above otherwise you risk damage of the boot information on your work station and/or the test hosts. After these three tests (which you should call one after the other!) you can reboot the test hosts with the new kernel. 2.3. Configuration of logical volumes ------------------------------------- This is done by the test build_test_environment/lv_config The configuration options are contained in default-lv_config.conf 2.4. Resources -------------- Cluster and Resources are created by build_test_environment/cluster build_test_environment/resource/create_resource Initially the first host in global_host_list is the primary, the following hosts secondaries. If you do not change any configuration variables after the build_test_environment/resource/create_resource Test you will have created one resource lv-1-2 on all hosts with underlying data devices /dev/vg-mars/lv-1-2. /mars will be mounted on /dev/vg-mars/lv-6-100. For further information please read default-cluster.conf and default-resource.conf. 2.5. Starting the whole test suite via cronjob ---------------------------------------------- This can be done mars_test_cronjob.sh. The variable tests_to_execute contains all tests to be executed. See the documentation in the header. 2.6. Concurrent test runs ------------------------- To avoid concurrent run of tests on the same host each run creates temporary files (/tmp/test-suite_on.<host>, see main_lock_file_list in global.conf) and deletes them afterwards. It may happen that these files are left over unintentionally in some cases of unexpected termination. Then you must remove them manually before starting the next test run. 2.7. Firewalls -------------- Certain tests cut temporarily network connections by defining firewall rules. Normally even in case of failure these connections are restored. To be sure that no firewall rules prevent tests from running the flag net_clear_iptables_in_prepare_phase is set to 1 in default-net.conf. This flag leads to deletion of *all* iptable chains on the test hosts. If you do not want this behaviour set net_clear_iptables_in_prepare_phase=0.