From f2f17fbb860ba139c4ac8c75fca8f76993a307ba Mon Sep 17 00:00:00 2001
From: Zac Dover
Date: Fri, 26 Feb 2021 23:47:01 +1000
Subject: [PATCH] doc/dev: t8y interactive-on-error rewrite

This PR rewrites the section of the Teuthology
documentation that is about the --interactive-on-error
flag.

Signed-off-by: Zac Dover
---
 ...tion-testing-teuthology-debugging-tips.rst | 29 +++++++++++--------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-debugging-tips.rst b/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-debugging-tips.rst
index 5d4c45f1f59..c6601434812 100644
--- a/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-debugging-tips.rst
+++ b/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-debugging-tips.rst
@@ -114,26 +114,31 @@ failure, ask one of the team members for help.
 Debugging an issue using interactive-on-error
 ---------------------------------------------
 
-It is important to be able to reproduce an issue when investigating its cause.
-Run a job similar to the failed job, using the `interactive-on-error`_ mode in
-teuthology::
+When you encounter a job failure during testing, you should attempt to
+reproduce it. This is where ``--interactive-on-error`` comes in. This
+section explains how to use ``interactive-on-error`` and what it does.
+
+When you have verified that a job has failed, run the same job again in
+teuthology but add the `interactive-on-error`_ flag::
 
     ideepika@teuthology:~/teuthology$ ./virtualenv/bin/teuthology -v --lock --block $ --interactive-on-error
 
-For this job, use either `custom config.yaml`_ or the yaml file from
-the failed job. If you intend to use the yaml file from the failed job, copy
-``orig.config.yaml`` to your local dir and change the `testing priority`_
-accordingly, like so::
+Use either `custom config.yaml`_ or the yaml file from the failed job. If
+you use the yaml file from the failed job, copy ``orig.config.yaml`` to
+your local directory::
 
     ideepika@teuthology:~/teuthology$ cp /a/teuthology-2021-01-06_07:01:02-rados-master-distro-basic-smithi/5759282/orig.config.yaml test.yaml
     ideepika@teuthology:~/teuthology$ ./virtualenv/bin/teuthology -v --lock --block test.yaml --interactive-on-error
 
+If a job fails when the ``interactive-on-error`` flag is used, teuthology
+will lock the machines required by ``config.yaml``. Teuthology will halt
+the testing machines and hold them in the state that they were in at the
+time of the job failure. You will be put into an interactive python
+session. From there, you can ssh into the system to investigate the cause
+of the job failure.
 
-In the event of job failure, teuthology will lock the machines required by
-``config.yaml``. Teuthology will halt at an interactive python session.
-By sshing into the targets, we can investigate their ctx values. After we have
-investigated the system, we can manually terminate the session and let
-teuthology clean the session up.
+After you have investigated the failure, just terminate the session.
+Teuthology will then clean up the session and unlock the machines.
 
 Suggested Resources
 --------------------