haproxy public development tree
Willy Tarreau 4173f4ea29 BUG/MINOR: conn_stream: do not confirm a connection from the frontend path
In issue #1468 it was reported that sometimes server-side connection
attempts were only validated after the "timeout connect" value, and
that would only happen with an H2 client. A long code analysis with the
output dumps showed only one possible call path: an I/O event on the
frontend while reading had just been disabled calls h2_wake() which in
turn wakes cs_conn_io_cb(), which tries cs_conn_process() and cs_notify(),
which sees that the other side is not blocked (already in CS_ST_CON)
and tries cs_chk_snd() on it. But on that side the connection had just
finished being set up and not yet woken the stream up, cs_notify()
would then call cs_conn_send() which succeeds and passes the connection
to CS_ST_RDY. The problem is that nothing new happened on the frontend
side so there's no reason to wake the stream up and the backend-side
conn_stream remains in CS_ST_RDY state with the stream never being
woken up.

Once the "timeout connect" strikes, process_stream() is woken up and
finds the connection finally set up, so it ignores the timeout and goes
on.

The number of conditions to meet to reproduce this is huge, which also
explains why the reporter says it's "occasional" and we were never able
to reproduce it in the lab. It needs at least reads to be disabled and
immediately re-enabled on the frontend side (e.g. buffer full) with an
I/O event reported before the poller had an opportunity to be disabled
but with no subscribe being reinstalled, so that sock_conn_iocb() has
no other choice but calling h2_wake(), and exactly at the same time
the backend connection must finish setting up so that it was not yet
reported by the poller, the data were sent and the polling for writes
disabled.

Several factors are to be considered here:
  - h2_wake() should probably not call h2_wake_some_streams() for
    ret >= 0 (common case), but only if some special event is reported
    for at least one stream; that part is sensitive though as in the
    past we managed to lose some rare cases (e.g. restart processing
    after a pause), and such wakeups are extremely rare so we'd better
    make that effort once in a while.

  - letting a lazy forward attempt on the frontend confirm a backend
    connection establishment is too smart to be reliable. That wasn't
    in fact the intent and it's inherited from the very old code where
    muxes didn't exist and where it was guaranteed that an event at this
    layer would wake everyone up.

Here the best thing to do is to refrain from attempting to forward data
until the connection is confirmed. This will let the poller report the
connect() event to the backend side which will process it as it should
and does in all other cases.
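
The idea can be illustrated with a toy model. This is only a sketch and
not the actual patch: the toy_* names, the two-field structure and the
"guard" parameter are hypothetical and exist purely for illustration;
only the CON and RDY states come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified states loosely mirroring CS_ST_CON / CS_ST_RDY. */
enum toy_state { TOY_ST_CON, TOY_ST_RDY };

/* Hypothetical stand-in for the backend-side conn_stream. */
struct toy_cs {
    enum toy_state state;
    bool stream_woken;   /* true once the owning stream was scheduled */
};

/* Backend path: the poller reports that connect() completed. The state
 * moves to RDY and the stream is woken up, as in all other cases.
 */
static void toy_backend_connect_event(struct toy_cs *be)
{
    assert(be != NULL);
    if (be->state == TOY_ST_CON) {
        be->state = TOY_ST_RDY;
        be->stream_woken = true;
    }
}

/* Frontend path: lazy forward attempt towards the backend. Returns true
 * if a send was attempted. With guard == false (old behaviour in this
 * model), a successful send silently confirms the connection and nobody
 * wakes the stream. With guard == true (behaviour after the fix in this
 * model), nothing is forwarded while the connection is unconfirmed.
 */
static bool toy_chk_snd(struct toy_cs *be, bool guard)
{
    assert(be != NULL);
    if (be->state == TOY_ST_CON) {
        if (guard)
            return false;          /* leave confirmation to the poller */
        be->state = TOY_ST_RDY;    /* confirmed, but stream not woken */
    }
    return true;
}
```

In this model, calling toy_chk_snd() without the guard leaves the
connection in TOY_ST_RDY with stream_woken still false, which is the
stuck situation described above. With the guard, the state stays at
TOY_ST_CON until toy_backend_connect_event() runs and wakes the stream.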

Thanks to Jimmy Crutchfield for having reported useful traces and
tested patches.

This will have to be backported to all stable branches after some
observation. Before 2.6 the function is stream_int_chk_snd_conn(),
and the flag to remove is SI_SB_CON.
2022-04-29 15:32:14 +02:00
.github CI: github actions: disable -Wno-deprecated 2022-04-11 19:05:03 +02:00
addons CLEANUP: tree-wide: Remove any ref to stream-interfaces 2022-04-13 15:10:16 +02:00
admin BUILD: halog: fix some incorrect signs in printf formats for integers 2022-04-12 08:40:38 +02:00
dev DEV: flags: No longer dump SI flags 2022-04-13 15:10:16 +02:00
doc MINOR: ssl: add a new global option "tune.ssl.hard-maxrecord" 2022-04-27 16:53:43 +02:00
examples MEDIUM: proxy: remove long-broken 'option http_proxy' 2021-07-18 19:35:32 +02:00
include BUG/MEDIUM: quic: Possible crash on STREAM frame loss 2022-04-28 16:22:40 +02:00
reg-tests REGTESTS: webstats: remove unused stats socket in /tmp 2022-04-26 16:15:23 +02:00
scripts SCRIPTS: announce-release: add shortened links to pending issues 2022-04-16 12:06:07 +02:00
src BUG/MINOR: conn_stream: do not confirm a connection from the frontend path 2022-04-29 15:32:14 +02:00
tests CLEANUP: assorted typo fixes in the code and comments 2021-08-16 12:37:59 +02:00
.cirrus.yml CI: cirrus: switch to FreeBSD-13.0 2022-04-12 07:59:06 +02:00
.gitattributes
.gitignore DOC: lua-api: Add documentation about lua filters 2021-08-15 20:56:44 +02:00
.mailmap DOC: update Tim's address in .mailmap 2021-09-16 09:14:14 +02:00
.travis.yml CI: travis-ci: temporarily disable arm64 builds 2021-08-07 07:28:15 +02:00
BRANCHES
CHANGELOG [RELEASE] Released version 2.6-dev7 2022-04-23 04:38:36 +02:00
CONTRIBUTING CLEANUP: assorted typo fixes in the code and comments 2021-08-16 12:37:59 +02:00
INSTALL DOC: install: document the fact that SSL engines are not enabled by default 2022-04-11 19:00:27 +02:00
LICENSE
MAINTAINERS
Makefile REORG: quic: use a dedicated module for qc_stream_desc 2022-04-21 11:05:27 +02:00
README
ROADMAP
SUBVERS
VERDATE [RELEASE] Released version 2.6-dev7 2022-04-23 04:38:36 +02:00
VERSION [RELEASE] Released version 2.6-dev7 2022-04-23 04:38:36 +02:00

The HAProxy documentation has been split into a number of different files for
ease of use.

Please refer to the following files depending on what you're looking for :

  - INSTALL for instructions on how to build and install HAProxy
  - BRANCHES to understand the project's life cycle and what version to use
  - LICENSE for the project's license
  - CONTRIBUTING for the process to follow to submit contributions

The more detailed documentation is located in the doc/ directory :

  - doc/intro.txt for a quick introduction on HAProxy
  - doc/configuration.txt for the configuration's reference manual
  - doc/lua.txt for the Lua's reference manual
  - doc/SPOE.txt for how to use the SPOE engine
  - doc/network-namespaces.txt for how to use network namespaces under Linux
  - doc/management.txt for the management guide
  - doc/regression-testing.txt for how to use the regression testing suite
  - doc/peers.txt for the peers protocol reference
  - doc/coding-style.txt for how to adopt HAProxy's coding style
  - doc/internals for developer-specific documentation (not all up to date)