haproxy public development tree
Go to file
Olivier Houchard fc51f0f588 BUG/MEDIUM: fd/threads: fix a concurrency issue between add and rm on the same fd
There's a very hard-to-trigger bug in the FD list code where the
fd_add_to_fd_list() function assumes that if the FD it's trying to add
is already locked, it's in the process of being added. Unfortunately, it
can also be in the process of being removed. It is very hard to trigger
because it requires that one thread is removing the FD while another one
is adding it. First very few FDs run on multiple threads (listeners and
DNS), and second, it does not make sense to add and remove the FD at the
same time.

In practice the DNS code built on the older callback-only model does
perform bursts of fd_want_send() for all resolvers at once when it wants
to send a new query (dns_send_query()). And this is more likely to happen
when here are lots of resolutions in parallel and many resolvers, because
the dns_response_recv() callback can also trigger a series of queries on
all resolvers for each invalid response it receives. This means that it
really is perfectly possible to both stop and start in parallel during
short periods of time there.

This issue was not reported before 2.1, but 2.1 had the FD cache, built
on the exact same code base. It's very possible that the issue caused
exactly the opposite situation, where an event was occasionally lost,
causing a DNS retry that worked, and nobody noticing the problem in the
end. In 2.1 the lost entries are the updates asking for not polling for
writes anymore, and the effect is that the poller contiuously reports
writability on the socket when the issue happens.

This patch fixes bug #416 and must be backported as far as 1.8, and
absolutely requires that previous commit "MINOR: fd/threads: make
_GET_NEXT()/_GET_PREV() use the volatile attribute" is backported as
well otherwise it will make the issue worse.

Special thanks to Julien Pivotto for setting up a reliable reproducer
for this difficult issue.
2019-12-20 08:09:28 +01:00
.github/ISSUE_TEMPLATE DOC: Add GitHub issue config.yml 2019-11-03 15:36:06 +01:00
contrib BUG/MINOR: contrib/prometheus-exporter: decode parameter and value only 2019-11-27 11:51:35 +01:00
doc MINOR: http: add a new "replace-path" action 2019-12-19 09:24:57 +01:00
ebtree BUILD: ebtree: make eb_is_empty() and eb_is_dup() take a const 2019-10-02 15:24:19 +02:00
examples CLEANUP: removed obsolete examples an move a few to better places 2019-06-15 21:25:06 +02:00
include BUG/MAJOR: task: add a new TASK_SHARED_WQ flag to fix foreing requeuing 2019-12-19 14:42:22 +01:00
reg-tests REGTEST: run-regtests: implement #REQUIRE_BINARIES 2019-12-19 14:36:46 +01:00
scripts REGTEST: run-regtests: implement #REQUIRE_BINARIES 2019-12-19 14:36:46 +01:00
src BUG/MEDIUM: fd/threads: fix a concurrency issue between add and rm on the same fd 2019-12-20 08:09:28 +01:00
tests TESTS: Add a stress-test for mt_lists. 2019-09-23 18:16:08 +02:00
.cirrus.yml BUILD: CI: comment out cygwin build, upgrade various ssl libraries 2019-10-29 06:27:50 +01:00
.gitignore DOC: create a BRANCHES file to explain the life cycle 2019-06-15 22:00:14 +02:00
.travis.yml BUILD: CI: comment out cygwin build, upgrade various ssl libraries 2019-10-29 06:27:50 +01:00
BRANCHES DOC: create a BRANCHES file to explain the life cycle 2019-06-15 22:00:14 +02:00
CHANGELOG [RELEASE] Released version 2.2-dev0 2019-11-25 20:36:16 +01:00
CONTRIBUTING DOC: improve the wording in CONTRIBUTING about how to document a bug fix 2019-07-26 15:46:21 +02:00
INSTALL DOC: this is development again 2019-11-25 20:37:49 +01:00
LICENSE LICENSE: add licence exception for OpenSSL 2012-09-07 13:52:26 +02:00
MAINTAINERS DOC: wurfl: added point of contact in MAINTAINERS file 2019-04-23 11:00:23 +02:00
Makefile BUILD: reorder the objects in the makefile 2019-11-25 19:47:23 +01:00
README DOC: create a BRANCHES file to explain the life cycle 2019-06-15 22:00:14 +02:00
ROADMAP DOC: update the outdated ROADMAP file 2019-06-15 21:59:54 +02:00
SUBVERS BUILD: use format tags in VERDATE and SUBVERS files 2013-12-10 11:22:49 +01:00
VERDATE [RELEASE] Released version 2.1.0 2019-11-25 19:47:40 +01:00
VERSION [RELEASE] Released version 2.2-dev0 2019-11-25 20:36:16 +01:00

The HAProxy documentation has been split into a number of different files for
ease of use.

Please refer to the following files depending on what you're looking for :

  - INSTALL for instructions on how to build and install HAProxy
  - BRANCHES to understand the project's life cycle and what version to use
  - LICENSE for the project's license
  - CONTRIBUTING for the process to follow to submit contributions

The more detailed documentation is located into the doc/ directory :

  - doc/intro.txt for a quick introduction on HAProxy
  - doc/configuration.txt for the configuration's reference manual
  - doc/lua.txt for the Lua's reference manual
  - doc/SPOE.txt for how to use the SPOE engine
  - doc/network-namespaces.txt for how to use network namespaces under Linux
  - doc/management.txt for the management guide
  - doc/regression-testing.txt for how to use the regression testing suite
  - doc/peers.txt for the peers protocol reference
  - doc/coding-style.txt for how to adopt HAProxy's coding style
  - doc/internals for developer-specific documentation (not all up to date)