haproxy public development tree
Go to file
Willy Tarreau f2cb169487 BUG/MAJOR: listener: fix thread safety in resume_listener()
resume_listener() can be called from a thread not part of the listener's
mask after a curr_conn has gone lower than a proxy's or the process' limit.
This results in fd_may_recv() being called unlocked if the listener is
bound to only one thread, and quickly locks up.

This patch solves this by creating a per-thread work_list dedicated to
listeners, and modifying resume_listener() so that it bounces the listener
to one of its owning thread's work_list and waking it up. This thread will
then call resume_listener() again and will perform the operation on the
file descriptor itself. It is important to do it this way so that the
listener's state cannot be modified while the listener is being moved,
otherwise multiple threads can take conflicting decisions and the listener
could be put back into the global queue if the listener was used at the
same time.

It seems like a slightly simpler approach would be possible if the locked
list API would provide the ability to return a locked element. In this
case the listener would be immediately requeued in dequeue_all_listeners()
without having to go through resume_listener() with its associated lock.

This fix must be backported to all versions having the lock-less accept
loop, which is as far as 1.8 since deadlock fixes involving this feature
had to be backported there. It is expected that the code should not differ
too much there. However, previous commit "MINOR: task: introduce work lists"
will be needed as well and should not present difficulties either. For 1.8,
the commits introducing thread_mask() and LIST_ADDED() will be needed as
well, either backporting my_flsl() or switching to my_ffsl() will be OK,
and some changes will have to be performed so that the init function is
properly called (and maybe the deinit one can be dropped).

In order to test for the fix, simply set up a multi-threaded frontend with
multiple bind lines each attached to a single thread (reproduced with 16
threads here), set up a very low maxconn value on the frontend, and inject
heavy traffic on all listeners in parallel with slightly more connections
than the configured limit ( typically +20%) so that it flips very
frequently. If the bug is still there, at some point (5-20 seconds) the
traffic will go much lower or even stop, either with spinning threads or
not.
2019-07-12 09:07:48 +02:00
.github/ISSUE_TEMPLATE DOC: add github issue templates 2019-01-17 22:53:55 +01:00
contrib DOC: contrib: spoa_server Add some hints for building spoa_server 2019-07-05 16:31:50 +02:00
doc DOC: Fix typos and grammer in configuration.txt 2019-07-11 10:25:53 +02:00
ebtree CLEANUP: fix typos in comments in ebtree 2018-11-18 22:23:15 +01:00
examples CLEANUP: removed obsolete examples an move a few to better places 2019-06-15 21:25:06 +02:00
include MINOR: task: introduce work lists 2019-07-12 09:07:48 +02:00
reg-tests BUG/MEDIUM: compression: Set Vary: Accept-Encoding for compressed responses 2019-06-17 18:51:43 +02:00
scripts CLEANUP: removed obsolete examples an move a few to better places 2019-06-15 21:25:06 +02:00
src BUG/MAJOR: listener: fix thread safety in resume_listener() 2019-07-12 09:07:48 +02:00
tests CLEANUP: fix a misspell in tests/filltab25.c 2018-11-18 22:23:15 +01:00
.cirrus.yml BUILD: enable freebsd builds on cirrus-ci 2019-05-16 09:27:51 +02:00
.gitignore DOC: create a BRANCHES file to explain the life cycle 2019-06-15 22:00:14 +02:00
.travis.yml BUILD: travis-ci: TFO and GETADDRINFO are now enabled by default 2019-06-15 23:24:47 +02:00
BRANCHES DOC: create a BRANCHES file to explain the life cycle 2019-06-15 22:00:14 +02:00
CHANGELOG [RELEASE] Released version 2.1-dev0 2019-06-16 21:49:47 +02:00
CONTRIBUTING DOC: Fix typos in CONTRIBUTING 2019-06-15 21:25:06 +02:00
INSTALL DOC: this is a development branch again. 2019-06-17 13:35:23 +02:00
LICENSE
MAINTAINERS DOC: wurfl: added point of contact in MAINTAINERS file 2019-04-23 11:00:23 +02:00
Makefile BUILD: makefile: do not rely on shell substitutions to determine git version 2019-06-22 08:28:32 +02:00
README DOC: create a BRANCHES file to explain the life cycle 2019-06-15 22:00:14 +02:00
ROADMAP DOC: update the outdated ROADMAP file 2019-06-15 21:59:54 +02:00
SUBVERS
VERDATE [RELEASE] Released version 2.0.0 2019-06-16 20:00:26 +02:00
VERSION [RELEASE] Released version 2.1-dev0 2019-06-16 21:49:47 +02:00

The HAProxy documentation has been split into a number of different files for
ease of use.

Please refer to the following files depending on what you're looking for :

  - INSTALL for instructions on how to build and install HAProxy
  - BRANCHES to understand the project's life cycle and what version to use
  - LICENSE for the project's license
  - CONTRIBUTING for the process to follow to submit contributions

The more detailed documentation is located into the doc/ directory :

  - doc/intro.txt for a quick introduction on HAProxy
  - doc/configuration.txt for the configuration's reference manual
  - doc/lua.txt for the Lua's reference manual
  - doc/SPOE.txt for how to use the SPOE engine
  - doc/network-namespaces.txt for how to use network namespaces under Linux
  - doc/management.txt for the management guide
  - doc/regression-testing.txt for how to use the regression testing suite
  - doc/peers.txt for the peers protocol reference
  - doc/coding-style.txt for how to adopt HAProxy's coding style
  - doc/internals for developer-specific documentation (not all up to date)