haproxy public development tree
Willy Tarreau 2bfefdbaef MAJOR: watchdog: implement a thread lockup detection mechanism
Since threads were introduced, we've naturally had a number of bugs
related to locking issues. In addition we've also got some issues
with corrupted lists in certain rare cases not necessarily involving
threads. Not only do these events cause a lot of trouble in production,
as it is very hard to detect that the process is stuck in a loop and
no longer delivers the service, but it's also often difficult (or too
late) to collect more debugging information.

The patch presented here implements a lockup detection mechanism, also
known as "watchdog". The principle is that (on systems supporting it),
each thread will have its own CPU timer which progresses as the thread
consumes CPU cycles, and when a deadline is met, a signal is delivered
(SIGALRM here since it doesn't interrupt gdb by default).
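
As an illustration of the principle above, here is a minimal sketch of a
per-thread CPU timer built only on POSIX calls (pthread_getcpuclockid(),
timer_create(), timer_settime()). The names wd_handler, wd_install_handler,
wd_arm_self and wd_last_tid are hypothetical and do not come from the patch;
passing the thread ID through sigev_value is an assumption about how the
handler can learn which timer fired.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <time.h>

/* Hypothetical: ID of the last thread whose CPU timer expired, -1 if none. */
static volatile sig_atomic_t wd_last_tid = -1;

/* SIGALRM handler: the thread ID travels in the signal's value field. */
static void wd_handler(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;
    if (si->si_code == SI_TIMER)
        wd_last_tid = si->si_value.sival_int;
}

/* Installs the handler with SA_SIGINFO so that si_value is available. */
int wd_install_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = wd_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}

/* Arms a one-shot timer on the calling thread's CPU clock. It expires once
 * the thread has consumed <deadline_ms> more milliseconds of CPU time.
 * Returns 0 on success, -1 on failure.
 */
int wd_arm_self(int tid, long deadline_ms, timer_t *out)
{
    clockid_t cid;
    struct sigevent sev;
    struct itimerspec its;

    if (pthread_getcpuclockid(pthread_self(), &cid) != 0)
        return -1;

    memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGALRM;
    sev.sigev_value.sival_int = tid;
    if (timer_create(cid, &sev, out) != 0)
        return -1;

    memset(&its, 0, sizeof(its));
    its.it_value.tv_sec = deadline_ms / 1000;
    its.it_value.tv_nsec = (deadline_ms % 1000) * 1000000L;
    return timer_settime(*out, 0, &its, NULL);
}
```

Because the timer follows the thread's CPU clock, a thread sleeping in
poll() does not make it progress; only a thread burning CPU cycles will
reach the deadline.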

The thread handling this signal (which is not necessarily the one which
triggered the timer) figures the thread ID from the signal arguments and
checks if it's really stuck by looking at the time spent since last exit
from poll() and by checking that the thread's scheduler is still alive
(so that even when dealing with configuration issues resulting in an insane
number of tasks being called in turn, it is not possible to accidentally
trigger it). Checking the scheduler's activity will usually result in a
second chance, thus doubling the detection time.

In order not to incorrectly flag a thread as being the cause of the
lockup, the thread_harmless_mask is checked : a thread could very well
be spinning on itself waiting for all other threads to join (typically
what happens when issuing "show sess"). In this case, once all threads
but one (or two) have joined, all the innocent ones are marked harmless
and will not trigger the timer. Only the ones not reacting will.

The deadline is set to one second, which already appears impossible to
reach, especially since it's 1 second of CPU usage, not elapsed time
with the CPU being preempted by other threads/processes/hypervisor. In
practice due to the scheduler's health verification it takes up to two
seconds to decide to panic.
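
The checks described in the three paragraphs above can be summarized by
the following sketch. It is not the code from the patch: the structure,
the function wd_check() and the verdict names are hypothetical, and it
assumes that the scheduler clears the "stuck" flag whenever it makes
progress, so that finding the flag still set means a full extra deadline
elapsed without any scheduler activity.

```c
#include <stdint.h>

#define WD_DEADLINE_NS 1000000000ULL   /* one second of CPU time */

enum wd_verdict { WD_OK, WD_SECOND_CHANCE, WD_PANIC };

/* Hypothetical per-thread state consulted by the signal handler. */
struct wd_thread {
    uint64_t prev_cpu_ns;  /* thread CPU time when it last left poll() */
    int stuck;             /* set by the watchdog, assumed to be cleared
                            * by the scheduler when it runs */
};

/* Decides what to do about thread <thr> when its timer fires. */
enum wd_verdict wd_check(struct wd_thread *th, unsigned int thr,
                         uint64_t now_cpu_ns, unsigned long harmless_mask)
{
    /* less than one second of CPU spent since poll(): not stuck */
    if (now_cpu_ns - th->prev_cpu_ns < WD_DEADLINE_NS)
        return WD_OK;

    /* the thread is only waiting for the other ones: innocent */
    if (harmless_mask & (1UL << thr))
        return WD_OK;

    /* first expiry: flag it and give the scheduler a second chance */
    if (!th->stuck) {
        th->stuck = 1;
        return WD_SECOND_CHANCE;
    }

    /* flag still set one deadline later: the scheduler never ran */
    return WD_PANIC;
}
```

With a one-second deadline, a really stuck thread gets WD_SECOND_CHANCE
on the first expiry and WD_PANIC on the next one, which matches the "up
to two seconds" figure given above.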

Once all conditions are met, the goal is to crash from the offending
thread. So if it's the current one, we call ha_panic(); otherwise the
signal is bounced to the offending thread, which deals with it. This
will result in all threads being woken up in turn to dump their context;
the whole state is emitted on stderr in the hope that it can be logged, and
the process aborts, leaving a chance for a core to be dumped and for a
service manager to restart it.
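
The routing step above (panic locally or bounce the signal) could look
like the sketch below. Only pthread_kill() is a real API here; the
wd_threads table, its size and the wd_route() function are hypothetical
names introduced for the illustration, and the call to ha_panic() is left
to the caller.

```c
#include <pthread.h>
#include <signal.h>

#define WD_MAX_THREADS 64

/* Hypothetical table of thread handles, filled when threads start. */
static pthread_t wd_threads[WD_MAX_THREADS];

/* Called from the SIGALRM handler running on thread <self> once thread
 * <offender> has been found guilty. Returns 1 when the caller is the
 * offender and must panic by itself, 0 when the signal was forwarded to
 * the offender, -1 on error.
 */
int wd_route(unsigned int offender, unsigned int self)
{
    if (offender >= WD_MAX_THREADS)
        return -1;

    if (offender == self)
        return 1;  /* caller is expected to call ha_panic() */

    /* bounce the signal so that the crash happens in the right thread */
    return pthread_kill(wd_threads[offender], SIGALRM) == 0 ? 0 : -1;
}
```

Crashing from the offending thread means that the core dump and the
backtrace point directly at the code which was looping.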

An alternative mechanism could be implemented for systems unable to
wake up a thread once its CPU clock reaches a deadline (e.g. FreeBSD).
Instead of waking the timer each and every deadline, it is possible to
use a standard timer which is reset each time we leave poll(). Since
the signal handler rechecks the CPU consumption this will also work.
However a totally idle process may trigger it from time to time which
may or may not confuse some debugging sessions. The same is true for
alarm() which could be another option for systems not having such a
broad choice of timers (but it seems that in this case they will not
have per-thread CPU measurements available either).

The feature is currently implemented only when threads are enabled in
order to keep the code clean, since the main purpose is to detect and
address inter-thread deadlocks. But if it proves useful for other
situations this condition might be relaxed.
2019-05-22 11:50:48 +02:00
.github/ISSUE_TEMPLATE DOC: add github issue templates 2019-01-17 22:53:55 +01:00
contrib MINOR/DOC: spoe-server: Add documentation 2019-05-13 17:43:47 +02:00
doc MINOR: threads: add a "stuck" flag to the thread_info struct 2019-05-22 11:50:48 +02:00
ebtree CLEANUP: fix typos in comments in ebtree 2018-11-18 22:23:15 +01:00
examples MEDIUM: Make 'option forceclose' actually warn 2019-05-16 18:02:03 +02:00
include MINOR: threads: add a timer_t per thread in thread_info 2019-05-22 11:50:48 +02:00
reg-tests REGTEST: extend the check duration on tls_health_checks and mark it slow 2019-05-17 17:16:20 +02:00
scripts DOC: fix "successful" typo 2019-05-18 08:25:29 +02:00
src MAJOR: watchdog: implement a thread lockup detection mechanism 2019-05-22 11:50:48 +02:00
tests CLEANUP: fix a misspell in tests/filltab25.c 2018-11-18 22:23:15 +01:00
.cirrus.yml BUILD: enable freebsd builds on cirrus-ci 2019-05-16 09:27:51 +02:00
.gitignore DOC: split the README into README + INSTALL 2018-12-16 22:30:57 +01:00
.travis.yml BUILD: travis-ci: make TMPDIR global variable in travis-ci 2019-05-11 06:07:47 +02:00
CHANGELOG [RELEASE] Released version 2.0-dev3 2019-05-15 16:51:48 +02:00
CONTRIBUTING DOC: Fix typos in README and CONTRIBUTING 2018-11-12 08:54:12 +01:00
INSTALL Revert "CLEANUP: wurfl: remove dead, broken and unmaintained code" 2019-04-23 10:34:43 +02:00
LICENSE
MAINTAINERS DOC: wurfl: added point of contact in MAINTAINERS file 2019-04-23 11:00:23 +02:00
Makefile MAJOR: watchdog: implement a thread lockup detection mechanism 2019-05-22 11:50:48 +02:00
README DOC: split the README into README + INSTALL 2018-12-16 22:30:57 +01:00
ROADMAP DOC: update the roadmap about priority queues 2018-08-10 17:12:04 +02:00
SUBVERS
VERDATE [RELEASE] Released version 2.0-dev3 2019-05-15 16:51:48 +02:00
VERSION [RELEASE] Released version 2.0-dev3 2019-05-15 16:51:48 +02:00

The HAProxy documentation has been split into a number of different files for
ease of use.

Please refer to the following files depending on what you're looking for :

  - INSTALL for instructions on how to build and install HAProxy
  - LICENSE for the project's license
  - CONTRIBUTING for the process to follow to submit contributions

The more detailed documentation is located in the doc/ directory :

  - doc/intro.txt for a quick introduction on HAProxy
  - doc/configuration.txt for the configuration's reference manual
  - doc/lua.txt for the Lua reference manual
  - doc/SPOE.txt for how to use the SPOE engine
  - doc/network-namespaces.txt for how to use network namespaces under Linux
  - doc/management.txt for the management guide
  - doc/regression-testing.txt for how to use the regression testing suite
  - doc/peers.txt for the peers protocol reference
  - doc/coding-style.txt for how to adopt HAProxy's coding style
  - doc/internals for developer-specific documentation (not all up to date)