1
0
mirror of http://git.haproxy.org/git/haproxy.git/ synced 2025-01-03 10:42:07 +00:00
Commit Graph

4373 Commits

Author SHA1 Message Date
MIZUTA Takeshi
b24bc0dfb6 MINOR: tcp: Support TCP keepalive parameters customization
It is now possible to customize TCP keepalive parameters.
These correspond to the socket options TCP_KEEPCNT, TCP_KEEPIDLE, TCP_KEEPINTVL
and are valid for the defaults, listen, frontend and backend sections.

This patch fixes GitHub issue .
2020-07-09 05:22:16 +02:00
Willy Tarreau
3b8f9b7b88 BUG/MEDIUM: lists: add missing store barrier in MT_LIST_ADD/MT_LIST_ADDQ
The torture test run for previous commit 787dc20 ("BUG/MEDIUM: lists: add
missing store barrier on MT_LIST_BEHEAD()") finally broke again after 34M
connections. It appeared that MT_LIST_ADD and MT_LIST_ADDQ were suffering
from the same missing barrier when restoring the original pointers before
giving up, when checking if the element was already added. This is indeed
something which seldom happens with the shared scheduler, in case two
threads simultaneously try to wake up the same tasklet.

With a store barrier there after reverting the pointers, the torture test
survived 750M connections on the NanoPI-Fire3, so it looks good this time.
Probably that MT_LIST_BEHEAD should be added to test-list.c since it seems
to be more sensitive to concurrent accesses with MT_LIST_ADDQ.

It's worth noting that there is no barrier between the last two pointers
update, while there is one in MT_LIST_POP and MT_LIST_BEHEAD, the latter
having shown to be needed, but I cannot demonstrate why we would need
one here. Given that the code seems solid here, let's stick to what is
shown to work.

This fix should be backported to 2.1, just for the sake of safety since
the issue couldn't be triggered there, but it could change with the
compiler or when backporting a fix for example.
2020-07-09 05:01:27 +02:00
Willy Tarreau
787dc20952 BUG/MEDIUM: lists: add missing store barrier on MT_LIST_BEHEAD()
When running multi-threaded tests on my NanoPI-Fire3 (8 A53 cores), I
managed to occasionally get either a bus error or a segfault in the
scheduler, but only when running at a high connection rate (injecting
on a tcp-request connection reject rule). The bug is rare and happens
around once per million connections. I could never reproduce it with
less than 4 threads nor on A72 cores.

Haproxy 2.1.0 would also fail there but not 2.1.7.

Every time the crash happened with the TL_URGENT task list corrupted,
though it was not immediately after the LIST_SPLICE() call, indicating
background activity survived the MT_LIST_BEHEAD() operation. This queue
is where the shared runqueue is transferred, and the shared runqueue
gets fast inter-thread tasklet wakeups from idle conn takeover and new
connections.

Comparing the MT_LIST_BEHEAD() and MT_LIST_DEL() implementations, it's
quite obvious that a few barriers are missing from the former, and these
will simply fail on weakly ordered caches.

Two store barriers were added before the break() on failure, to match
what is done on the normal path. Missing them almost always results in
a segfault which is quite rare but consistent (after ~3M connections).

The 3rd one before updating n->prev seems intuitively needed though I
coudln't make the code fail without it. It's present in MT_LIST_DEL so
better not be needlessly creative. The last one is the most important
one, and seems to be the one that helps a concurrent MT_LIST_ADDQ()
detect a late failure and try again. With this, the code survives at
least 30M connections.

Interestingly the exact same issue was addressed in 2.0-dev2 for MT_LIST_DEL
with commit 690d2ad4d ("BUG/MEDIUM: list: add missing store barriers when
updating elements and head").

This fix must be backported to 2.1 as MT_LIST_BEHEAD() is also used there.

It's only tagged as medium because it will only affect entry-level CPUs
like Cortex A53 (x86 are not affected), and requires load levels that are
very hard to achieve on such machines to trigger it. In practice it's
unlikely anyone will ever hit it.
2020-07-08 19:45:50 +02:00
Willy Tarreau
e3cb9978c2 MINOR: version: back to development, update status message
Update the status message and update INSTALL again.
2020-07-07 16:38:51 +02:00
Willy Tarreau
33205c23a7 [RELEASE] Released version 2.3-dev0
Released version 2.3-dev0 with the following main changes :
    - exact copy of 2.2.0
2020-07-07 16:35:28 +02:00
Willy Tarreau
44c47de81a MINOR: version: mention that it's an LTS release now
The new version is going to be LTS up to around Q2 2025.
2020-07-07 16:31:52 +02:00
William Lallemand
7d42ef5b22 WIP/MINOR: ssl: add sample fetches for keylog in frontend
OpenSSL 1.1.1 provides a callback registering function
SSL_CTX_set_keylog_callback, which allows one to receive a string
containing the keys to deciphers TLSv1.3.

Unfortunately it is not possible to store this data in binary form and
we can only get this information using the callback. Which means that we
need to store it until the connection is closed.

This patches add 2 pools, the first one, pool_head_ssl_keylog is used to
store a struct ssl_keylog which will be inserted as a ex_data in a SSL *.
The second one is pool_head_ssl_keylog_str which will be used to store
the hexadecimal strings.

To enable the capture of the keys, you need to set "tune.ssl.keylog on"
in your configuration.

The following fetches were implemented:

ssl_fc_client_early_traffic_secret,
ssl_fc_client_handshake_traffic_secret,
ssl_fc_server_handshake_traffic_secret,
ssl_fc_client_traffic_secret_0,
ssl_fc_server_traffic_secret_0,
ssl_fc_exporter_secret,
ssl_fc_early_exporter_secret
2020-07-06 19:08:03 +02:00
Ilya Shipitsin
46a030cdda CLEANUP: assorted typo fixes in the code and comments
This is 11th iteration of typo fixes
2020-07-06 14:34:32 +02:00
Willy Tarreau
b0be8ae2a8 CLEANUP: auth: fix useless self-include of auth-t.h
Since recent include cleanups auth-t.h ended up including itself.
2020-07-05 21:32:47 +02:00
Willy Tarreau
0c439d8956 BUILD: tools: make resolve_sym_name() return a const
Originally it was made to return a void* because some comparisons in the
code where it was used required a lot of casts. But now we don't need
that anymore. And having it non-const breaks the build on NetBSD 9 as
reported in issue .

So let's switch to const and adjust debug.c to accomodate this.
2020-07-05 20:26:04 +02:00
Olivier Houchard
a74bb7e26e BUG/MEDIUM: connections: Let the xprt layer know a takeover happened.
When we takeover a connection, let the xprt layer know. If it has its own
tasklet, and it is already scheduled, then it has to be destroyed, otherwise
it may run the new mux tasklet on the old thread.

Note that we only do this for the ssl xprt for now, because the only other
one that might wake the mux up is the handshake one, which is supposed to
disappear before idle connections exist.

No backport is needed, this is for 2.2.
2020-07-03 17:49:33 +02:00
Olivier Houchard
1662cdb0c6 BUG/MEDIUM: connections: Set the tid for the old tasklet on takeover.
In the various takeover() methods, make sure we schedule the old tasklet
on the old thread, as we don't want it to run on our own thread! This
was causing a very rare crash when building with DEBUG_STRICT, seeing
that either an FD's thread mask didn't match the thread ID in h1_io_cb(),
or that stream_int_notify() would try to queue a task with the wrong
tid_bit.

In order to reproduce this, it is necessary to maintain many connections
(typically 30k) at a high request rate flowing over H1+SSL between two
proxies, the second of which would randomly reject ~1% of the incoming
connection and randomly killing some idle ones using a very short client
timeout. The request rate must be adjusted so that the CPUs are nearly
saturated, but never reach 100%. It's easier to reproduce this by skipping
local connections and always picking from other threads. The issue
should happen in less than 20s otherwise it's necessary to restart to
reset the idle connections lists.

No backport is needed, takeover() is 2.2 only.
2020-07-03 17:49:23 +02:00
Willy Tarreau
43079e0731 MINOR: sched: split tasklet_wakeup() into tasklet_wakeup_on()
tasklet_wakeup() only checks tl->tid to know whether the task is
programmed to run on the current thread or on a specific thread. We'll
have to ease this selection in a subsequent patch, preferably without
modifying tl->tid, so let's have a new tasklet_wakeup_on() function
to specify the thread number to run on. That the logic has not changed
at all.
2020-07-03 17:19:47 +02:00
Emeric Brun
9f9b22c4f1 MINOR: log: add time second fraction field to rfc5424 log timestamp.
This patch adds the time second fraction in microseconds
as supported by the rfc.
2020-07-02 17:56:06 +02:00
Willy Tarreau
dab586c3a8 BUILD: debug: avoid build warnings with DEBUG_MEM_STATS
Some libcs define strdup() as a macro and cause redefine warnings to
be emitted, so let's first undefine all functions we redefine.
2020-07-02 10:25:01 +02:00
Dragan Dosen
1e3b16f74f MINOR: log-format: allow to preserve spacing in log format strings
Now it's possible to preserve spacing everywhere except in "log-format",
"log-format-sd" and "unique-id-format" directives, where spaces are
delimiters and are merged. That may be useful when the response payload
is specified as a log format string by "lf-file" or "lf-string", or even
for headers or anything else.

In order to merge spaces, a new option LOG_OPT_MERGE_SPACES is applied
exclusively on options passed to function parse_logformat_string().

This patch fixes an issue  ("http-request return log-format file
evaluation altering spacing of ASCII output/art").
2020-07-02 10:11:44 +02:00
Willy Tarreau
a6026a0c92 MINOR: debug: add a new "debug dev memstats" command
Now when building with -DDEBUG_MEM_STATS, some malloc/calloc/strdup/realloc
stats are kept per file+line number and may be displayed and even reset on
the CLI using "debug dev memstats". This allows to easily track potential
leakers or abnormal usages.
2020-07-02 09:14:48 +02:00
Willy Tarreau
76cc699017 MINOR: config: add a new tune.idle-pool.shared global setting.
Enables ('on') or disables ('off') sharing of idle connection pools between
threads for a same server. The default is to share them between threads in
order to minimize the number of persistent connections to a server, and to
optimize the connection reuse rate. But to help with debugging or when
suspecting a bug in HAProxy around connection reuse, it can be convenient to
forcefully disable this idle pool sharing between multiple threads, and force
this option to "off". The default is on.

This could have been nice to have during the idle connections debugging,
but it's not too late to add it!
2020-07-01 19:07:37 +02:00
Olivier Houchard
ff1d0929b8 MEDIUM: connections: Don't use a lock when moving connections to remove.
Make it so we don't have to take a lock while moving a connection from
the idle list to the toremove_list by taking advantage of the MT_LIST.
2020-07-01 17:09:19 +02:00
Olivier Houchard
f8f4c2ef60 CLEANUP: connections: rename the toremove_lock to takeover_lock
This lock was misnamed and a bit confusing. It's only used for takeover
so let's call it takeover_lock.
2020-07-01 17:09:10 +02:00
Olivier Houchard
bbee1f7e78 MINOR: list: Add MT_LIST_DEL_SAFE_NOINIT() and MT_LIST_ADDQ_NOCHECK()
Add two new macros, MT_LIST_DEL_SAFE_NOINIT makes sure we remove the
element from the list, without reinitializing its next and prev, and
MT_LIST_ADDQ_NOCHECK is similar to MT_LIST_ADDQ(), except it doesn't check
if the element is already in a list.
The goal is to be able to move an element from a list we're currently
parsing to another, keeping it locked in the meanwhile.
2020-07-01 17:04:00 +02:00
Willy Tarreau
eb8c2c69fa MEDIUM: sched: implement task_kill() to kill a task
task_kill() may be used by any thread to kill any task with less overhead
than a regular wakeup. In order to achieve this, it bypasses the priority
tree and inserts the task directly into the shared tasklets list, cast as
a tasklet. The task_list_size is updated to make sure it is properly
decremented after execution of this task. The task will thus be picked by
process_runnable_tasks() after checking the tree and sent to the TL_URGENT
list, where it will be processed and killed.

If the task is bound to more than one thread, its first thread will be the
one notified.

If the task was already queued or running, nothing is done, only the flag
is added so that it gets killed before or after execution. Of course it's
the caller's responsibility to make sur any resources allocated by this
task were already cleaned up or taken over.
2020-07-01 16:35:53 +02:00
Willy Tarreau
8a6049c268 MEDIUM: sched: create a new TASK_KILLED task flag
This flag, when set, will be used to indicate that the task must die.
At the moment this may only be placed by the task itself or by the
scheduler when placing it into the TL_NORMAL queue.
2020-07-01 16:35:49 +02:00
Willy Tarreau
364f25a688 MINOR: backend: don't always takeover from the same threads
The next thread walking algorithm in commit 566df309c ("MEDIUM:
connections: Attempt to get idle connections from other threads.")
proved to be sufficient for most cases, but it still has some rough
edges when threads are unevenly loaded. If one thread wakes up with
10 streams to process in a burst, it will mainly take over connections
from the next one until it doesn't have anymore.

This patch implements a rotating index that is stored into the server
list and that any thread taking over a connection is responsible for
updating. This way it starts mostly random and avoids always picking
from the same place. This results in a smoother distribution overall
and a slightly lower takeover rate.
2020-07-01 16:07:43 +02:00
Willy Tarreau
2f3f4d3441 MEDIUM: server: add a new pool-low-conn server setting
The problem with the way idle connections currently work is that it's
easy for a thread to steal all of its siblings' connections, then release
them, then it's done by another one, etc. This happens even more easily
due to scheduling latencies, or merged events inside the same pool loop,
which, when dealing with a fast server responding in sub-millisecond
delays, can really result in one thread being fully at work at a time.

In such a case, we perform a huge amount of takeover() which consumes
CPU and requires quite some locking, sometimes resulting in lower
performance than expected.

In order to fight against this problem, this patch introduces a new server
setting "pool-low-conn", whose purpose is to dictate when it is allowed to
steal connections from a sibling. As long as the number of idle connections
remains at least as high as this value, it is permitted to take over another
connection. When the idle connection count becomes lower, a thread may only
use its own connections or create a new one. By proceeding like this even
with a low number (typically 2*nbthreads), we quickly end up in a situation
where all active threads have a few connections. It then becomes possible
to connect to a server without bothering other threads the vast majority
of the time, while still being able to use these connections when the
number of available FDs becomes low.

We also use this threshold instead of global.nbthread in the connection
release logic, allowing to keep more extra connections if needed.

A test performed with 10000 concurrent HTTP/1 connections, 16 threads
and 210 servers with 1 millisecond of server response time showed the
following numbers:

   haproxy 2.1.7:           185000 requests per second
   haproxy 2.2:             314000 requests per second
   haproxy 2.2 lowconn 32:  352000 requests per second

The takeover rate goes down from 300k/s to 13k/s. The difference is
further amplified as the response time shrinks.
2020-07-01 15:23:15 +02:00
Willy Tarreau
35e30c9670 BUG/MINOR: server: fix the connection release logic regarding nearly full conditions
There was a logic bug in commit ddfe0743d ("MEDIUM: server: use the two
thresholds for the connection release algorithm"): instead of keeping
only our first idle connection when FDs become scarce, the condition was
inverted resulting in enforcing this constraint unless FDs are scarce.
This results in less idle connections than permitted to be kept under
normal condition.

No backport needed.
2020-07-01 14:14:29 +02:00
Willy Tarreau
daf8aa62a8 MINOR: pools: increase MAX_BASE_POOLS to 64
When not sharing pools (i.e. when building with -DDEBUG_DONT_SHARE_POOLS)
we have about 47 pools right now, while MAX_BASE_POOLS is only 32, meaning
that only the first 32 ones will benefit from a per-thread cache entry.
This totally kills performance when pools are not shared (roughly -20%).

Let's double the limit to gain some margin, and make it possible to set
it as a build option.

It might be useful to backport this to stable versions as they're likely
to be affected as well.
2020-06-30 14:29:02 +02:00
Willy Tarreau
ddfe0743d8 MEDIUM: server: use the two thresholds for the connection release algorithm
The algorithm improvement in bdb86bd ("MEDIUM: server: improve estimate
of the need for idle connections") is still not enough because there's
a hard limit between below and above the FD count, so it continues to
end up with many killed connections.

Here we're proceeding differently. Given that there are two configured
limits, a low and a high one, what we do is that we drop connections
when the high limit is reached (what's already done by the killing task
anyway), when we're between the low and the high threshold, we only keep
the connection if our idle entries are empty (with a preference for safe
ones), and below the low threshold, we keep any connection so as to give
them a chance of being reused or taken over by another thread.

Proceeding like this results in much less dropped connections, we
typically see a 99.3% reuse rate (76k conns for 10M requests over 200
servers and 4 threads, with 335k takeovers or 3%), and much less CPU
usage variations because there are no more bursts to try to kill extra
connections.

It should be possible to further improve this by counting the number
of threads exploiting a server and trying to optimize the amount of
per-thread idle connections so that it is approximately balanced among
the threads.
2020-06-29 21:54:38 +02:00
Willy Tarreau
e69282a03f BUG/MINOR: server: always count one idle slot for current thread
The idle server connection estimates brought in commit bdb86bd ("MEDIUM:
server: improve estimate of the need for idle connections") were committed
without the minimum of 1 idle conn needed for the current thread. The net
effect is that there are bursts of dropped connections when the load varies
because there's no provision for the last connection.

No backport needed, this is 2.2-dev.
2020-06-29 21:54:38 +02:00
Willy Tarreau
d59946e673 Revert "BUG/MEDIUM: lists: Lock the element while we check if it is in a list."
This reverts previous commit 347bbf79d20e1cff57075a8a378355dfac2475e2i.

The original code was correct. This patch resulted from a mistaken analysis
and breaks the scheduler:

 ########################## Starting vtest ##########################
 Testing with haproxy version: 2.2-dev11-90b7d9-23
 #    top  TEST reg-tests/lua/close_wait_lf.vtc TIMED OUT (kill -9)
 #    top  TEST reg-tests/lua/close_wait_lf.vtc FAILED (10.008) signal=9
 1 tests failed, 0 tests skipped, 88 tests passed

 Program terminated with signal SIGABRT, Aborted.
 [Current thread is 1 (Thread 0x7fb0dac2c700 (LWP 11292))]
 (gdb) bt
   0x00007fb0e7c143f8 in raise () from /lib64/libc.so.6
   0x00007fb0e7c15ffa in abort () from /lib64/libc.so.6
   0x000000000053f5d6 in ha_panic () at src/debug.c:269
   0x00000000005a6248 in wdt_handler (sig=14, si=<optimized out>, arg=<optimized out>) at src/wdt.c:119
   <signal handler called>
   0x00000000004fbccd in tasklet_wakeup (tl=0x1b5abc0) at include/haproxy/task.h:351
   listener_accept (fd=<optimized out>) at src/listener.c:999
   0x00000000004262df in fd_update_events (evts=<optimized out>, fd=6) at include/haproxy/fd.h:418
   _do_poll (p=<optimized out>, exp=<optimized out>, wake=<optimized out>) at src/ev_epoll.c:251
   0x0000000000548d0f in run_poll_loop () at src/haproxy.c:2949
  0x000000000054908b in run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:3067
  0x00007fb0e902b684 in start_thread () from /lib64/libpthread.so.0
  0x00007fb0e7ce5eed in clone () from /lib64/libc.so.6
 (gdb) up
   0x00000000004fbccd in tasklet_wakeup (tl=0x1b5abc0) at include/haproxy/task.h:351
 351                     if (MT_LIST_ADDQ(&task_per_thread[tl->tid].shared_tasklet_list, (struct mt_list *)&tl->list) == 1) {

If the commit above is ever backported, this one must be as well!
2020-06-29 21:54:37 +02:00
Olivier Houchard
347bbf79d2 BUG/MEDIUM: lists: Lock the element while we check if it is in a list.
In MT_LIST_ADDQ() and MT_LIST_ADD() we can't just check if the element is
already in a list, because there's a small race condition, it could be added
between the time we checked, and the time we actually set its next and prev.
So we have to lock it first.

This should be backported to 2.1.
2020-06-29 19:59:06 +02:00
Willy Tarreau
a9fcecbdf3 MINOR: stats: add the estimated need of concurrent connections per server
The max_used_conns value is used as an estimate of the needed number of
connections on a server to know how many to keep open. But this one is
not reported, making it hard to troubleshoot reuse issues. Let's export
it in the sessions/current column.
2020-06-29 16:29:11 +02:00
Willy Tarreau
bdb86bdaab MEDIUM: server: improve estimate of the need for idle connections
Starting with commit 079cb9a ("MEDIUM: connections: Revamp the way idle
connections are killed") we started to improve the way to compute the
need for idle connections. But the condition to keep a connection idle
or drop it when releasing it was not updated. This often results in
storms of close when certain thresholds are met, and long series of
takeover() when there aren't enough connections left for a thread on
a server.

This patch tries to improve the situation this way:
  - it keeps an estimate of the number of connections needed for a server.
    This estimate is a copy of the max over previous purge period, or is a
    max of what is seen over current period; it differs from max_used_conns
    in that this one is a counter that's reset on each purge period ;

  - when releasing, if the number of current idle+used connections is
    lower than this last estimate, then we'll keep the connection;

  - when releasing, if the current thread's idle conns head is empty,
    and we don't exceed the estimate by the number of threads, then
    we'll keep the connection.

  - when cleaning up connections, we consider the max of the last two
    periods to avoid killing too many idle conns when facing bursty
    traffic.

Thanks to this we can better converge towards a situation where, provided
there are enough FDs, each active server keeps at least one idle connection
per thread all the time, with a total number close to what was needed over
the previous measurement period (as defined by pool-purge-delay).

On tests with large numbers of concurrent connections (30k) and many
servers (200), this has quite smoothed the CPU usage pattern, increased
the reuse rate and roughly halved the takeover rate.
2020-06-29 16:29:10 +02:00
Willy Tarreau
b159132ea3 MINOR: activity: add per-thread statistics on FD takeover
The FD takeover operation might have certain impacts explaining
unexpected activities, so it's important to report such a counter
there. We thus count the number of times a thread has stolen an
FD from another thread.
2020-06-29 14:26:05 +02:00
Willy Tarreau
3bb617cfe0 MINOR: stats: add 3 new output values for the per-server idle conn state
The servers have internal states describing the status of idle connections,
unfortunately these were not exported in the stats. This patch adds the 3
following gauges:

 - idle_conn_cur : Current number of unsafe idle connections
 - safe_conn_cur : Current number of safe idle connections
 - used_conn_cur : Current number of connections in use
2020-06-29 14:26:05 +02:00
Willy Tarreau
20dc3cd4a6 MINOR: pools: move the LRU cache heads to thread_info
The LRU cache head was an array of list, which causes false sharing
between 4 to 8 threads in the same cache line. Let's move it to the
thread_info structure instead. There's no need to do the same for the
pool_cache[] array since it's already quite large (32 pointers each).

By doing this the request rate increased by 1% on a 16-thread machine.
2020-06-29 10:36:37 +02:00
Willy Tarreau
c03d7632a5 CLEANUP: pool: only include the type files from types
pool-t.h was mistakenly including the full-blown includes for threads,
lists and api instead of the types, and as such, CONFIG_HAP_LOCAL_POOLS
and CONFIG_HAP_LOCKLESS_POOLS were not visible everywhere.
2020-06-29 10:11:24 +02:00
Willy Tarreau
e4d1505c83 REORG: includes: create tinfo.h for the thread_info struct
The thread_info struct is convenient to store various per-thread info
without having to resort to a painful thread_local storage which is
slow and painful to initialize.

The problem is, by having this one in thread.h it's very difficult to
add more entries there because everyone already includes thread.h so
conversely thread.h cannot reference certain types.

There's no point in having this there, instead let's create a new pair
of files, tinfo{,-t}.h, which declare the structure. This way it will
become possible to extend them with other includes and have certain
files store their own types there.
2020-06-29 09:57:23 +02:00
Willy Tarreau
4d82bf5c2e MINOR: connection: align toremove_{lock,connections} and cleanup into idle_conns
We used to have 3 thread-based arrays for toremove_lock, idle_cleanup,
and toremove_connections. The problem is that these items are small,
and that this creates false sharing between threads since it's possible
to pack up to 8-16 of these values into a single cache line. This can
cause real damage where there is contention on the lock.

This patch creates a new array of struct "idle_conns" that is aligned
on a cache line and which contains all three members above. This way
each thread has access to its variables without hindering the other
ones. Just doing this increased the HTTP/1 request rate by 5% on a
16-thread machine.

The definition was moved to connection.{c,h} since it appeared a more
natural evolution of the ongoing changes given that there was already
one of them declared in connection.h previously.
2020-06-28 10:52:36 +02:00
Willy Tarreau
d79422a0ff BUG/MEDIUM: buffers: always allocate from the local cache first
It looked strange to see pool_evict_from_cache() always very present
on "perf top", but there was actually a reason to this: while b_free()
uses pool_free() which properly disposes the buffer into the local cache
and b_alloc_fast() allocates using pool_get_first() which considers the
local cache, b_alloc_margin() does not consider the local cache as it
only uses __pool_get_first() which only allocates from the shared pools.

The impact is that basically everywhere a buffer is allocated (muxes,
streams, applets), it's always picked from the shared pool (hence
involves locking) and is released to the local one and makes it grow
until it's required to trigger a flush using pool_evict_from_cache().
Buffers usage are thus not thread-local at all, and cause eviction of
a lot of possibly useful objects from the local caches.

Just fixing this results in a 10% request rate increase in an HTTP/1 test
on a 16-thread machine.

This bug was caused by recent commit ed891fd ("MEDIUM: memory: make local
pools independent on lockless pools") merged into 2.2-dev9, so not backport
is needed.
2020-06-28 10:45:35 +02:00
Willy Tarreau
4dc6c860b4 CLEANUP: buffers: remove unused buffer_wq_lock lock
Commit 2104659 ("MEDIUM: buffer: remove the buffer_wq lock") removed
usage of the lock but not the lock itself. It's totally unused, let's
remove it.
2020-06-28 10:45:34 +02:00
Anthonin Bonnefoy
85048f80c9 MINOR: http: Add support for http 413 status
Add 413 http "payload too large" status code. This will allow 413 to be
used in deny_status and errorfile.
2020-06-26 11:30:02 +02:00
Ilya Shipitsin
47d17182f4 CLEANUP: assorted typo fixes in the code and comments
This is 10th iteration of typo fixes
2020-06-26 11:27:28 +02:00
Ilya Shipitsin
f44d155515 BUILD: fix ssl_sample.c when building against BoringSSL
BoringSSL does not have X509_get_X509_PUBKEY
let our emulation level define that for BoringSSL as well

Build log:

src/ssl_sample.o: In function `smp_fetch_ssl_x_key_alg':
/home/travis/build/haproxy/haproxy/src/ssl_sample.c:592: undefined reference to `X509_get_X509_PUBKEY'
clang-7: error: linker command failed with exit code 1 (use -v to see invocation)
Makefile:860: recipe for target 'haproxy' failed
make: *** [haproxy] Error 1

travis-ci: https://travis-ci.com/github/haproxy/haproxy/jobs/351670996
2020-06-26 10:33:38 +02:00
Willy Tarreau
c54e5ad9cc MINOR: cfgparse: sanitize the output a little bit
With the rework of the config line parser, we've started to emit a dump
of the initial line underlined by a caret character indicating the error
location. But with extremely large lines it starts to take time and can
even cause trouble to slow terminals (e.g. over ssh), and this becomes
useless. In addition, control characters could be dumped as-is which is
bad, especially when the input file is accidently wrong (an executable).

This patch adds a string sanitization function which isolates an area
around the error position in order to report only that area if the string
is too large. The limit was set to 80 characters, which will result in
roughly 40 chars around the error being reported only, prefixed and suffixed
with "..." as needed. In addition, non-printable characters in the line are
now replaced with '?' so as not to corrupt the terminal. This way invalid
variable names, unmatched quotes etc will be easier to spot.

A typical output is now:

  [ALERT] 176/092336 (23852) : parsing [bad.cfg:8]: forbidden first char in environment variable name at position 811957:
    ...c$PATH$PATH$d(xlc`%?$PATH$PATH$dgc?T$%$P?AH?$PATH$PATH$d(?$PATH$PATH$dgc?%...
                                            ^
2020-06-25 09:43:27 +02:00
Willy Tarreau
e7723bddd7 MEDIUM: tasks: add a tune.sched.low-latency option
Now that all tasklet queues are scanned at once by run_tasks_from_lists(),
it becomes possible to always check for lower priority classes and jump
back to them when they exist.

This patch adds tune.sched.low-latency global setting to enable this
behavior. What it does is stick to the lowest ranked priority list in
which tasks are still present with an available budget, and leave the
loop to refill the tasklet lists if the trees got new tasks or if new
work arrived into the shared urgent queue.

Doing so allows to cut the latency in half when running with extremely
deep run queues (10k-100k), thus allowing forwarding of small and large
objects to coexist better. It remains off by default since it does have
a small impact on large traffic by default (shorter batches).
2020-06-24 12:21:26 +02:00
Willy Tarreau
59153fef86 MINOR: tasks: make run_tasks_from_lists() scan the queues itself
Now process_runnable_tasks is responsible for calculating the budgets
for each queue, dequeuing from the tree, and calling run_tasks_from_lists().
This latter one scans the queues, picking tasks there and respecting budgets.
Note that its name was updated with a plural "s" for this reason.
2020-06-24 12:21:26 +02:00
Willy Tarreau
ba48d5c8f9 MINOR: tasks: pass the queue index to run_task_from_list()
Instead of passing it a pointer to the queue, pass it the queue's index
so that it can perform all the work around current_queue and tl_class_mask.
2020-06-24 12:21:26 +02:00
Willy Tarreau
49f90bf148 MINOR: tasks: add a mask of the queues with active tasklets
It is neither convenient nor scalable to check each and every tasklet
queue to figure whether it's empty or not while we often need to check
them all at once. This patch introduces a tasklet class mask which gets
a bit 1 set for each queue representing one class of service. A single
test on the mask allows to figure whether there's still some work to be
done. It will later be usable to better factor the runqueue code.

Bits are set when tasklets are queued. They're cleared when queues are
emptied. It is possible that a queue is empty but has a bit if a tasklet
was added then removed, but this is not a problem as this is properly
checked for in run_tasks_from_list().
2020-06-24 12:21:26 +02:00
Willy Tarreau
c0a08ba2df MINOR: tasks: make current_queue an index instead of a pointer
It will be convenient to have the tasklet queue number soon, better make
current_queue an index rather than a pointer to the queue. When not currently
running (e.g. from I/O), the index is -1.
2020-06-24 12:21:26 +02:00