74 Commits

Author SHA1 Message Date
Zack Cerza
e424d78c6b Be more verbose about log file locations
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-01-27 12:28:53 -06:00
Zack Cerza
50722a7d9c Symlink worker logs into job archive dir
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-01-24 10:19:43 -06:00
Zack Cerza
e8bb1654b2 call wait() on the teuthology-results Popen object
This ought to fix the issue where zombie teuthology-results processes
stick around.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-01-17 10:05:21 -06:00
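
A child process stays in the process table as a zombie (`<defunct>`) until its parent collects its exit status. A minimal sketch of the kind of fix this commit describes, assuming the helper is started with `subprocess.Popen`; the `run_and_reap` name and the placeholder child command are illustrative and not taken from the teuthology source:

```python
import subprocess
import sys


def run_and_reap(args):
    """Start a child process and wait for it so that it is reaped."""
    proc = subprocess.Popen(args)
    # wait() blocks until the child exits and collects its exit status,
    # which lets the kernel release the process-table entry.
    return proc.wait()


# Placeholder child; it stands in for the real helper script.
returncode = run_and_reap([sys.executable, "-c", "pass"])
```

Without the `wait()` call the parent would keep running while the finished child lingered as `<defunct>`.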
Zack Cerza
53fc2d93dd Log a warning when killing long-running jobs.
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-01-16 10:53:53 -06:00
Zack Cerza
769ef8a960 Kill jobs that run for over 3 days (configurable)
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-01-16 10:38:39 -06:00
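
The timeout check can be sketched as below. Only the three-day default comes from the commit subject; the `should_kill` helper and the `max_job_time` parameter name are hypothetical:

```python
import datetime

# Default taken from the commit subject; the constant name is an assumption.
DEFAULT_MAX_JOB_TIME = datetime.timedelta(days=3)


def should_kill(started, now, max_job_time=DEFAULT_MAX_JOB_TIME):
    """Return True once a job has run for longer than max_job_time."""
    return (now - started) > max_job_time
```

Passing a different `max_job_time` is what would make the limit configurable in this sketch.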
Zack Cerza
3cffea4917 Re-raise exceptions caught in the watchdog
2014-01-03 15:45:18 -06:00
Zack Cerza
f92174ff31 Strip stdout lines
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-01-03 15:01:31 -06:00
Zack Cerza
68b259fd00 Catch and log unhandled exceptions in the watchdog
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-01-03 14:56:46 -06:00
Zack Cerza
c6a9de0445 Add 'emperor' to list of branches with reporting
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-01-03 14:45:25 -06:00
Zack Cerza
d3afebe19c Be safer when calling ./bootstrap
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2014-01-03 11:55:13 -06:00
Zack Cerza
b4f524ebe4 Sleep once outside of the watchdog loop
Hopefully this will prevent the double-posting of jobs.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-31 14:25:05 -06:00
Zack Cerza
9a29c3ef71 Log calls to teuthology-report more verbosely
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-19 10:29:30 -06:00
Zack Cerza
b014c71829 Catch every exception here, for now.
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-19 10:29:30 -06:00
Zack Cerza
a0eb1a8e8c Use shell=True to call teuthology-report
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-16 14:22:22 -06:00
Zack Cerza
c22ee528b7 Catch OSError if script isn't in $PATH
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-16 13:34:37 -06:00
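
When `subprocess` is asked to run a program that cannot be found, it raises `OSError` (in Python 3, the `FileNotFoundError` subclass). A minimal sketch of catching that case; the `try_run` helper is hypothetical and the command name in the usage line is deliberately bogus:

```python
import subprocess
import sys


def try_run(args):
    """Run a command; return its exit code, or None if it cannot be started."""
    try:
        return subprocess.call(args)
    except OSError:
        # Raised when the executable is missing from $PATH (or not runnable).
        return None


missing = try_run(["no-such-command-for-this-sketch"])
```

Note that this only applies when the command is given as an argument list. With `shell=True` the shell itself starts successfully and reports the missing command through its own exit status.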
Zack Cerza
420fff6207 Revert "Use path when calling teuthology-report. …"
This reverts commit e4b5ab811e.
2013-12-16 11:43:06 -06:00
Sandon Van Ness
e4b5ab811e Use path when calling teuthology-report. …
The 'teuthology-report' command is probably not going to exist
in $PATH so get the location of the running command and assume it's
in the same path.

Signed-off-by: Sandon Van Ness <sandon@inktank.com>
2013-12-14 07:14:51 -08:00
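
The approach in this (later reverted) commit can be sketched as follows, assuming the running command's location is taken from `sys.argv[0]`; the `sibling_script` helper and the example path are illustrative only:

```python
import os
import sys


def sibling_script(name, argv0=None):
    """Build the path of a script assumed to sit next to the running command."""
    if argv0 is None:
        argv0 = sys.argv[0]
    bin_dir = os.path.dirname(os.path.abspath(argv0))
    return os.path.join(bin_dir, name)


# Example with a made-up install location.
path = sibling_script("teuthology-report", "/opt/venv/bin/teuthology-worker")
```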
Zack Cerza
2e2b8feba2 Skip the 'dead' report on old branches
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-13 10:48:52 -06:00
Zack Cerza
966dad544b Make sure to report all results.
If a just-finished job was using a teuthology branch not known to
contain the reporting feature, then report the job via the
teuthology-report script. Note that in some cases this will result in
double reporting but the extra load should be negligible.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-12 17:33:53 -06:00
Zack Cerza
3d23b9b205 Remove the child's stderr completely
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-12 15:45:58 -06:00
Zack Cerza
57574fefc1 Don't show child's stderr, but show archive path
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-10 13:19:56 -06:00
Zack Cerza
339b7c474a Add debug statements
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-10 10:06:39 -06:00
Zack Cerza
48b8ba4ad2 Create a DateTime object from the timestamp
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-09 16:57:11 -06:00
Zack Cerza
d7289f75e8 Auto-restart
If /tmp/teuthology-restart-workers is newer than the running process,
restart.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-09 15:01:33 -06:00
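
The restart check compares the sentinel file's modification time with the time the worker started. A minimal sketch, assuming the start time is recorded as a Unix timestamp; the `need_restart` helper is hypothetical, and only the file path comes from the commit message:

```python
import os

RESTART_FILE = "/tmp/teuthology-restart-workers"  # path from the commit message


def need_restart(process_start_time, restart_file=RESTART_FILE):
    """Return True if restart_file was modified after the process started."""
    try:
        return os.path.getmtime(restart_file) > process_start_time
    except OSError:
        # The sentinel file does not exist or cannot be read: no restart.
        return False
```

In this sketch, running `touch` on the file after the workers have started is enough to make every worker see a newer timestamp.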
Zack Cerza
856f83449c Implement a watchdog for queued jobs
This continually posts the run's status to the results server, if
configured, at an interval defaulting to 600 seconds.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-12-05 17:48:10 -06:00
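
The watchdog loop can be sketched as below. The 600-second default is from the commit message; the `watch` function, the `report` callback, and the injectable `sleep` parameter are assumptions made so the sketch is easy to exercise, and `proc` is expected to behave like a `subprocess.Popen` object:

```python
import time


def watch(proc, report, interval=600, sleep=time.sleep):
    """Call report() every `interval` seconds until the job process exits."""
    # poll() returns None while the child is still running.
    while proc.poll() is None:
        report()
        sleep(interval)
    return proc.returncode
```

Here `report` would be whatever posts the run's status to the results server.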
Zack Cerza
d8f98201ac Don't re-call logging.basicConfig()
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-11-06 16:04:39 -06:00
Zack Cerza
f28a7ebc2c Move imports to top-level
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-10-11 12:48:55 -05:00
Zack Cerza
8351a3abfa PEP-8
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-10-10 19:09:34 -05:00
Zack Cerza
1bf3a3dadb Move teuthology-worker's arg parsing to scripts/
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-10-10 19:09:34 -05:00
Zack Cerza
08efeb7b9e Store the job_id as a str, not an int.
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
2013-10-04 13:11:03 -05:00
Zack Cerza
d1deb6d579 Don't hardcode teuthology's git repo URL
2013-09-20 15:24:11 -05:00
Zack Cerza
4d2e3c2736 Make run_job merge job_config['config'] if needed
2013-09-16 13:14:52 -05:00
Zack Cerza
e83b5defe8 Use check_output() and log.exception()
This should help us figure out why our checkouts keep getting deleted.
2013-09-12 11:14:08 -05:00
Zack Cerza
0ad9c8751e Ensure teuthology_branch is stored in job_config
2013-09-11 15:40:14 -05:00
Zack Cerza
fe51db6fc0 Merge job_config and ctx.config
2013-09-11 15:14:58 -05:00
Zack Cerza
713fa52455 Add job id and actual archive dir to job config
Also add job id to info.yaml
2013-09-11 13:44:28 -05:00
Sage Weil
1a05f9d0a2 queue: fix stderr redirect
Signed-off-by: Sage Weil <sage@inktank.com>
2013-09-06 09:39:30 -07:00
Sage Weil
5cd2f08132 queue: include tube name in worker logs
Signed-off-by: Sage Weil <sage@inktank.com>
2013-09-06 09:19:30 -07:00
Zack Cerza
44401f9a0e Workers: only log child's stderr, not stdout
2013-08-29 16:12:36 -05:00
Zack Cerza
6175a133f4 Don't assume anything about the base path here.
2013-08-28 13:36:15 -05:00
Josh Durgin
232e3d32bc Fix undefined symbol errors
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
2013-08-27 15:58:14 -07:00
Sage Weil
c861e2d70b queue: only git fetch once per minute per branch
This takes 1-2 seconds and makes launching jobs slow.  Only do it once
every 60 seconds per branch.

Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-23 15:43:35 -07:00
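
One way to throttle fetches per branch is to remember when each branch was last fetched. This is a sketch under stated assumptions rather than the commit's actual code: the `should_fetch` helper and the in-memory dictionary are hypothetical, and only the 60-second interval comes from the commit message:

```python
import time

FETCH_INTERVAL = 60  # seconds, per the commit message

_last_fetch = {}  # branch name -> timestamp of the last fetch


def should_fetch(branch, now=None):
    """Return True at most once per FETCH_INTERVAL for each branch."""
    if now is None:
        now = time.time()
    last = _last_fetch.get(branch)
    if last is not None and now - last < FETCH_INTERVAL:
        return False
    _last_fetch[branch] = now
    return True
```

A dictionary only throttles within a single process. Sharing the limit across several worker processes would need a shared timestamp, for example the modification time of a file.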
Sage Weil
973d5aff1c queue: only let one worker update the teuthology checkouts at a time
Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-23 13:43:18 -07:00
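
Serializing checkout updates across worker processes can be done with an advisory file lock. A minimal sketch using `fcntl.flock` (Unix only); the `exclusive_lock` helper and the lock-file path are assumptions, not details from the commit:

```python
import contextlib
import fcntl


@contextlib.contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    with open(lock_path, "w") as lock_file:
        # Blocks until no other open file holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A worker would wrap its fetch and bootstrap steps in `with exclusive_lock(path):`, so a second worker waits until the first one has finished.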
Zack Cerza
eafd591ab1 Move git stuff to fetch_teuthology_branch()
2013-08-23 10:08:01 -05:00
Zack Cerza
307284c2ed Rewrite branch fetching.
2013-08-23 09:59:48 -05:00
Sage Weil
22fc733770 queue: fetch origin, not branch
Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-22 22:20:26 -07:00
Sage Weil
c39ec60d48 queue: only bootstrap new checkouts
Until we figure out why bootstrap is getting stuck like this:

 9851 pts/7    S      0:03 /home/teuthworker/teuthology-master/virtualenv/bin/python ./teuthology-master/virtualenv/bin/teuthology-worker -v --archive-dir /var/lib/teuthworker/archive --tube plana --log-dir /var/lib/teuthworker/archive/worker_logs
 2075 pts/7    Z      0:00  \_ [git] <defunct>
 2112 pts/7    Z      0:00  \_ [git] <defunct>
 2138 pts/7    Z      0:00  \_ [bootstrap] <defunct>
 9852 pts/7    S      0:03 /home/teuthworker/teuthology-master/virtualenv/bin/python ./teuthology-master/virtualenv/bin/teuthology-worker -v --archive-dir /var/lib/teuthworker/archive --tube plana --log-dir /var/lib/teuthworker/archive/worker_logs
 2153 pts/7    Z      0:00  \_ [git] <defunct>
 2177 pts/7    Z      0:00  \_ [git] <defunct>
 2264 pts/7    Z      0:00  \_ [bootstrap] <defunct>
 9853 pts/7    S      0:03 /home/teuthworker/teuthology-master/virtualenv/bin/python ./teuthology-master/virtualenv/bin/teuthology-worker -v --archive-dir /var/lib/teuthworker/archive --tube plana --log-dir /var/lib/teuthworker/archive/worker_logs
 2141 pts/7    Z      0:00  \_ [git] <defunct>
 2276 pts/7    Z      0:00  \_ [git] <defunct>
 2305 pts/7    Z      0:00  \_ [bootstrap] <defunct>
 9854 pts/7    S      0:03 /home/teuthworker/teuthology-master/virtualenv/bin/python ./teuthology-master/virtualenv/bin/teuthology-worker -v --archive-dir /var/lib/teuthworker/archive --tube plana --log-dir /var/lib/teuthworker/archive/worker_logs
 7448 pts/7    Z      0:00  \_ [git] <defunct>
 7449 pts/7    Z      0:00  \_ [git] <defunct>
 7450 pts/7    Z      0:00  \_ [bootstrap] <defunct>
 7452 pts/7    Z      0:00  \_ [teuthology-resu] <defunct>
 9855 pts/7    S      0:01 /home/teuthworker/teuthology-master/virtualenv/bin/python ./teuthology-master/virtualenv/bin/teuthology-worker -v --archive-dir /var/lib/teuthworker/archive --tube plana --log-dir /var/lib/teuthworker/archive/worker_logs
 7712 pts/7    Z      0:00  \_ [git] <defunct>
 7713 pts/7    Z      0:00  \_ [git] <defunct>
 7714 pts/7    Z      0:00  \_ [bootstrap] <defunct>
 7716 pts/7    Z      0:00  \_ [teuthology-resu] <defunct>
 9856 pts/7    S      0:03 /home/teuthworker/teuthology-master/virtualenv/bin/python ./teuthology-master/virtualenv/bin/teuthology-worker -v --archive-dir /var/lib/teuthworker/archive --tube plana --log-dir /var/lib/teuthworker/archive/worker_logs
 2316 pts/7    Z      0:00  \_ [bootstrap] <defunct>
 9857 pts/7    S      0:03 /home/teuthworker/teuthology-master/virtualenv/bin/python ./teuthology-master/virtualenv/bin/teuthology-worker -v --archive-dir /var/lib/teuthworker/archive --tube plana --log-dir /var/lib/teuthworker/archive/worker_logs
 2340 pts/7    Z      0:00  \_ [bootstrap] <defunct>
 9858 pts/7    S      0:01 /home/teuthworker/teuthology-master/virtualenv/bin/python ./teuthology-master/virtualenv/bin/teuthology-worker -v --archive-dir /var/lib/teuthworker/archive --tube plana --log-dir /var/lib/teuthworker/archive/worker_logs
23188 pts/7    Z      0:00  \_ [bootstrap] <defunct>
 9859 pts/7    S      0:03 /home/teuthworker/teuthology-master/virtualenv/bin/python ./teuthology-master/virtualenv/bin/teuthology-worker -v --archive-dir /var/lib/teuthworker/archive --tube plana --log-dir /var/lib/teuthworker/archive/worker_logs

Signed-off-by: Sage Weil <sage@inktank.com>
2013-08-22 22:14:41 -07:00
Zack Cerza
98160c5c99 Fix SyntaxError
2013-08-22 18:31:18 -05:00
Zack Cerza
a9df6c2a6a Worker shouldn't attempt to rebuild an existing virtualenv
2013-08-22 18:02:22 -05:00
Zack Cerza
c773060914 Use the ceph.com git mirror.
2013-08-22 15:51:39 -05:00