haproxy/include/types
Willy Tarreau 8d38805d3d MAJOR: task: make use of the scope-aware ebtree functions
Currently the task scheduler suffers from an O(n) lookup when
skipping tasks that are not for the current thread. The reason
is that eb32_lookup_ge() has no information about the current
thread so it always revisits many tasks for other threads before
finding its own tasks.

This is particularly visible with HTTP/2 since the number of
concurrent streams created at once causes long series of tasks
for the same stream in the scheduler. With only 10 connections
and 100 streams each, by running on two threads, the performance
drops from 640kreq/s to 11.2kreq/s! Lookup metrics show that for
only 200000 task lookups, 430 million skips had to be performed,
which means that each lookup has to visit 2150 nodes on average
(430,000,000 skips / 200,000 lookups).

This commit backports the principle of scope lookups for ebtrees
from the ebtree_v7 development tree. The idea is that each node
contains a mask indicating the union of the scopes for the nodes
below it, which is fed during insertion, and used during lookups.

Then during lookups, branches that do not contain any leaf matching
the requested scope are simply ignored. This perfectly matches a
thread mask, allowing a thread to only extract the tasks it cares
about from the run queue, and to always find them in O(log(n))
instead of O(n). Thus the scheduler uses tid_bit and
task->thread_mask as the ebtree scope here.
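
The scope principle described above can be sketched as follows. This is
only an illustration on a plain, unbalanced binary search tree: the names
scoped_node, scoped_insert() and scoped_lookup_ge() are hypothetical and
are neither the ebtree API nor the code added by this commit. Node removal,
which would have to refresh the masks, is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified node: a plain binary search tree, not the
 * real ebtree layout.
 */
struct scoped_node {
	struct scoped_node *left, *right;
	uint32_t key;            /* ordering key, e.g. a task's queue position */
	unsigned long scope;     /* this node's own scope, e.g. a thread mask */
	unsigned long subtree;   /* union of the scopes of this node and of all nodes below it */
};

/* Insert <n> below <*root>, feeding its scope into every node visited
 * on the way down. Equal keys are placed to the right.
 */
static void scoped_insert(struct scoped_node **root, struct scoped_node *n)
{
	assert(n->scope != 0);
	n->left = n->right = NULL;
	n->subtree = n->scope;
	while (*root) {
		(*root)->subtree |= n->scope;
		root = (n->key < (*root)->key) ? &(*root)->left : &(*root)->right;
	}
	*root = n;
}

/* Return the node with the smallest key >= <key> whose scope intersects
 * <scope>, or NULL if there is none.
 */
static struct scoped_node *scoped_lookup_ge(struct scoped_node *node,
                                            uint32_t key, unsigned long scope)
{
	struct scoped_node *found;

	if (!node || !(node->subtree & scope))
		return NULL; /* nothing below matches the scope: skip the whole branch */

	if (key <= node->key) {
		/* smaller candidates may exist on the left, try them first */
		found = scoped_lookup_ge(node->left, key, scope);
		if (found)
			return found;
		if (node->scope & scope)
			return node;
	}
	return scoped_lookup_ge(node->right, key, scope);
}
```

With a thread's bit passed as <scope>, a branch holding only other
threads' tasks is rejected by a single mask test instead of being walked
node by node.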

Doing this has recovered most of the performance, as can be seen on
the test below with two threads, 10 connections, 100 streams each,
and 1 million requests total :

                              Before     After    Gain
              test duration : 89.6s      4.73s     x19
    HTTP requests/s (DEBUG) : 11200     211300     x19
     HTTP requests/s (PROD) : 15900     447000     x28
             spin_lock time : 85.2s      0.46s    /185
            time per lookup : 13us       40ns     /325

Even when going to 6 threads (on 3 hyperthreaded CPU cores), the
performance stays around 284000 req/s, showing that the contention
is much lower.

A test showed that there's no benefit in using this for the wait queue
though.
2017-11-06 11:20:11 +01:00
acl.h
action.h MINOR: action: Add a function pointer in act_rule struct to check its validity 2017-10-31 11:36:12 +01:00
applet.h MEDIUM: cache: deliver objects from cache 2017-10-31 21:17:19 +01:00
arg.h
auth.h
backend.h MEDIUM: threads/lb: Make LB algorithms (lb_*.c) thread-safe 2017-10-31 13:58:31 +01:00
cache.h MEDIUM: cache: configuration parsing and initialization 2017-10-31 21:17:19 +01:00
capture.h
channel.h BUG/MEDIUM: filters: Fix channels synchronization in flt_end_analyze 2017-03-15 19:09:06 +01:00
checks.h MAJOR: connection : Split struct connection into struct connection and struct conn_stream. 2017-10-31 18:03:23 +01:00
cli.h MINOR: cli: add socket commands and config to prepend informational messages with severity 2017-09-13 13:37:59 +02:00
compression.h
connection.h MEDIUM: connection: add a destroy callback 2017-10-31 18:03:24 +01:00
counters.h
dns.h MEDIUM: thread/dns: Make DNS thread-safe 2017-10-31 13:58:33 +01:00
fd.h CLEANUP: threads: rename process_mask to thread_mask 2017-10-31 16:06:06 +01:00
filters.h MEDIUM: threads/filters: Add init/deinit callback per thread 2017-10-31 13:58:32 +01:00
freq_ctr.h
global.h MINOR: threads: Add thread-map config parameter in the global section 2017-10-31 13:58:33 +01:00
h1.h MINOR: h1: store the status code in the H1 message 2017-10-31 08:43:29 +01:00
hdr_idx.h
hlua.h MEDIUM: threads/lua: Cannot acces to the socket if we try to access from another thread. 2017-10-31 13:58:32 +01:00
lb_chash.h
lb_fas.h
lb_fwlc.h
lb_fwrr.h
lb_map.h MEDIUM: threads/lb: Make LB algorithms (lb_*.c) thread-safe 2017-10-31 13:58:31 +01:00
listener.h MEDIUM: threads/listeners: Make listeners thread-safe 2017-10-31 13:58:30 +01:00
log.h
mailers.h
map.h
obj_type.h MINOR: connection: introduce conn_stream 2017-10-31 18:03:23 +01:00
pattern.h MAJOR: threads/map: Make acls/maps thread safe 2017-10-31 13:58:32 +01:00
peers.h MAJOR: threads/peers: Make peers thread safe 2017-10-31 13:58:31 +01:00
pipe.h
port_range.h
proto_http.h MINOR: http: Mark the 425 code as "Too Early". 2017-10-27 10:53:32 +02:00
proto_udp.h
protocol.h MINOR: protocols: register the ->add function and stop calling them directly 2017-09-15 11:49:52 +02:00
proxy.h MINOR: threads/mailers: Add a lock to protect queues of email alerts 2017-10-31 13:58:33 +01:00
queue.h
sample.h
server.h MAJOR: threads/ssl: Make SSL part thread-safe 2017-10-31 13:58:32 +01:00
session.h MINOR: session: remove the list of streams from struct session 2017-10-08 22:32:05 +02:00
shctx.h MEDIUM: shctx: separate ssl and shctx 2017-10-31 03:49:40 +01:00
signal.h
spoe.h MEDIUM: thread/spoe: Make the SPOE thread-safe 2017-10-31 13:58:33 +01:00
ssl_sock.h MEDIUM: shctx: separate ssl and shctx 2017-10-31 03:49:40 +01:00
stats.h MEDIUM: stats: Add show json schema 2017-03-14 11:14:03 +01:00
stick_table.h MEDIUM: threads/stick-tables: handle multithreads on stick tables 2017-10-31 13:58:31 +01:00
stream_interface.h BUG/MEDIUM: stream: fix client-fin/server-fin handling 2017-03-21 15:04:43 +01:00
stream.h MINOR: session: remove the list of streams from struct session 2017-10-08 22:32:05 +02:00
task.h MAJOR: task: make use of the scope-aware ebtree functions 2017-11-06 11:20:11 +01:00
template.h
vars.h MEDIUM: thread/vars: Make vars thread-safe 2017-10-31 13:58:32 +01:00