haproxy public development tree
Willy Tarreau 5d7dcc2a8e OPTIM: epoll: always poll for recv if neither active nor ready
The cost of enabling polling in one direction with epoll is very high
because it requires one syscall per FD and per direction change. In
addition we don't know about input readiness until we either try to
recv() or enable polling and watch the result. With HTTP keep-alive,
both are equally expensive as it's very uncommon to see the server
instantly respond (unless it's a second stage of the same process on
localhost, which has become much less common with threads).

But when a connection is established it's also quite usual to have to
poll for sending (except on localhost or UNIX sockets where it almost
always instantly works). So this cost of polling could be factored out
with the second step if both were enabled together.

This is the idea behind this patch: always enable polling for Rx if
it's not ready and at least one direction is active.
This means that if it's not explicitly disabled, or if it was but in a
state that causes the loss of the information (rx ready cannot be
guessed), then let's take any opportunity for a polling change to
enable it at the same time, and learn about rx readiness for free.

In addition the FD never gets unregistered for Rx unless it's ready
and was blocked (buffer full). This avoids a lot of the flip-flop
behaviour at beginning and end of requests.
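
The two rules above can be illustrated with a minimal sketch. This is not
the code from the patch: the SK_* flag names and sk_wanted_events() are
hypothetical, and the sketch is only one possible reading of the rules. It
assumes that Rx stays polled unless it is both ready and not active (i.e.
blocked), and that Tx is polled only while a send is wanted but the socket
is not known to be writable.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical per-FD state bits, for illustration only. These are not
 * HAProxy's actual flag names.
 */
#define SK_RX_ACTIVE 0x1u  /* the upper layer wants to receive */
#define SK_RX_READY  0x2u  /* input is known to be available */
#define SK_TX_ACTIVE 0x4u  /* the upper layer wants to send */
#define SK_TX_READY  0x8u  /* the socket is known to be writable */
#define SK_ALL_BITS  0xfu

/* Return the epoll event mask wanted for an FD in state <state>. */
static uint32_t sk_wanted_events(unsigned int state)
{
	uint32_t ev = 0;

	assert((state & ~SK_ALL_BITS) == 0);

	/* no direction is active: nothing justifies polling */
	if (!(state & (SK_RX_ACTIVE | SK_TX_ACTIVE)))
		return 0;

	/* poll for Tx only when a send is wanted and would block */
	if ((state & SK_TX_ACTIVE) && !(state & SK_TX_READY))
		ev |= EPOLLOUT;

	/* poll for Rx unless it is ready and blocked (not active), so
	 * that any polling change also reports Rx readiness
	 */
	if (!(state & SK_RX_READY) || (state & SK_RX_ACTIVE))
		ev |= EPOLLIN;

	return ev;
}
```

With these assumptions, an FD that only wants to send right after connect()
(state SK_TX_ACTIVE) gets EPOLLIN|EPOLLOUT in a single registration, which
corresponds to the epoll_ctl(add, in|out) step described further down.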

On a test with 10k requests in keep-alive, the difference is quite
noticeable:

Before:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 83.67    0.010847           0     20078           epoll_ctl
 16.33    0.002117           0      2231           epoll_wait
  0.00    0.000000           0        20        20 connect
------ ----------- ----------- --------- --------- ----------------
100.00    0.012964                 22329        20 total

After:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.35    0.003351           1      2644           epoll_wait
  2.36    0.000082           4        20        20 connect
  1.29    0.000045           0        66           epoll_ctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.003478                  2730        20 total

It may also shorten the sequence that follows connect() as shown below,
saving one epoll_ctl() and one recvfrom():

           before              |            after
  -----------------------------+----------------------------
  - connect()                  |  - connect()
  - epoll_ctl(add,out)         |  - epoll_ctl(add, in|out)
  - sendto()                   |  - epoll_wait() = out
  - epoll_ctl(mod,in|out)      |  - send()
  - epoll_wait() = out         |  - epoll_wait() = in|out
  - recvfrom() = EAGAIN        |  - recvfrom() = OK
  - epoll_ctl(mod,in)          |  - recvfrom() = EAGAIN
  - epoll_wait() = in          |  - epoll_ctl(mod, in)
  - recvfrom() = OK            |  - epoll_wait()
  - recvfrom() = EAGAIN        |
  - epoll_wait()               |
    (...)
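
The first steps of the "after" column can be reproduced outside of HAProxy
with plain epoll calls. The sketch below is only an illustration: the helper
names (watch_both, poll_once, demo_after_sequence) are hypothetical, and a
UNIX socketpair stands in for the connected TCP socket so that no server is
needed. It registers the FD once for both directions, then checks that
epoll_wait() first reports "out" only, and "in|out" once the peer has
written, without any further epoll_ctl().

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Register <fd> for both directions with a single epoll_ctl() call.
 * Returns 0 on success, -1 on error.
 */
static int watch_both(int epfd, int fd)
{
	struct epoll_event ev = { 0 };

	ev.events = EPOLLIN | EPOLLOUT;
	ev.data.fd = fd;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Return the events reported right now for the only FD registered in
 * <epfd>, without blocking. Returns 0 if nothing is reported.
 */
static uint32_t poll_once(int epfd)
{
	struct epoll_event ev;

	if (epoll_wait(epfd, &ev, 1, 0) != 1)
		return 0;
	return ev.events;
}

/* Returns 0 if the observed events match the expected sequence,
 * -1 otherwise or on error.
 */
static int demo_after_sequence(void)
{
	uint32_t first = 0, second = 0;
	int sv[2];
	int epfd;
	int ret = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	epfd = epoll_create1(0);
	if (epfd >= 0 && watch_both(epfd, sv[0]) == 0) {
		/* nothing to read yet: only "out" is reported */
		first = poll_once(epfd);

		/* the peer responds: "in|out" is reported with no
		 * extra epoll_ctl() call
		 */
		if (write(sv[1], "x", 1) == 1) {
			second = poll_once(epfd);
			if (first == EPOLLOUT &&
			    second == (EPOLLIN | EPOLLOUT))
				ret = 0;
		}
	}

	if (epfd >= 0)
		close(epfd);
	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

The sketch uses level-triggered epoll, so "out" keeps being reported as long
as the socket is writable. That is why the sequence above ends with
epoll_ctl(mod, in) once the request has been sent.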

Now on a 10M req test on 16 threads with 2k concurrent conns and 415kreq/s,
we see 190k updates in total and only 14k epoll_ctl() calls.
2019-12-27 16:38:47 +01:00
.github/ISSUE_TEMPLATE
contrib BUG/MINOR: contrib/prometheus-exporter: decode parameter and value only 2019-11-27 11:51:35 +01:00
doc MINOR: http: add a new "replace-path" action 2019-12-19 09:24:57 +01:00
ebtree
examples
include MINOR: poller: do not call the IO handler if the FD is not active 2019-12-27 16:38:47 +01:00
reg-tests REGTEST: make the "set ssl cert" require version 2.1 2019-12-20 14:35:18 +01:00
scripts REGTEST: run-regtests: implement #REQUIRE_BINARIES 2019-12-19 14:36:46 +01:00
src OPTIM: epoll: always poll for recv if neither active nor ready 2019-12-27 16:38:47 +01:00
tests
.cirrus.yml
.gitignore
.travis.yml BUILD: travis-ci: reenable address sanitizer for clang builds 2019-12-26 06:30:21 +01:00
BRANCHES
CHANGELOG
CONTRIBUTING
INSTALL DOC: this is development again 2019-11-25 20:37:49 +01:00
LICENSE
MAINTAINERS
Makefile
README
ROADMAP
SUBVERS
VERDATE
VERSION

The HAProxy documentation has been split into a number of different files for
ease of use.

Please refer to the following files depending on what you're looking for:

  - INSTALL for instructions on how to build and install HAProxy
  - BRANCHES to understand the project's life cycle and what version to use
  - LICENSE for the project's license
  - CONTRIBUTING for the process to follow to submit contributions

The more detailed documentation is located in the doc/ directory:

  - doc/intro.txt for a quick introduction to HAProxy
  - doc/configuration.txt for the configuration reference manual
  - doc/lua.txt for the Lua reference manual
  - doc/SPOE.txt for how to use the SPOE engine
  - doc/network-namespaces.txt for how to use network namespaces under Linux
  - doc/management.txt for the management guide
  - doc/regression-testing.txt for how to use the regression testing suite
  - doc/peers.txt for the peers protocol reference
  - doc/coding-style.txt for how to adopt HAProxy's coding style
  - doc/internals for developer-specific documentation (not all up to date)