Commit Graph

10433 Commits

Author SHA1 Message Date
Willy Tarreau
09e0203ef4 BUG/MINOR: backend: do not try to install a mux when the connection failed
If si_connect() failed, do not try to install the mux nor to complete
the operations or add the connection to an idle list, and abort quickly
instead. No obvious side effects were identified, but continuing to
allocate some resources after something has already failed seems risky.

This was a result of a prior fix which already wanted to push this code
further : aa089d80b ("BUG/MEDIUM: server: Defer the mux init until after
xprt has been initialized.") but it ought to have pushed it even further
to maintain the error check just after si_connect().

To be backported to 2.0 and 1.9.
2019-07-18 16:49:11 +02:00
Willy Tarreau
69564b1c49 BUG/MEDIUM: http/htx: unbreak option http_proxy
The temporary connection used to hold the target connection's address
was missing a valid target, resulting in a 500 server error being
reported when trying to connect to a remote host. Strangely this
issue was introduced as a side effect of commit 2c52a2b9e ("MEDIUM:
connection: make mux->detach() release the connection") which at
first glance looks unrelated but solidly stops the bisection (note
that by default this part even crashes). It's suspected that the
error only happens when closing and destroys pending data in fact.

Given that this feature was broken very early during 1.8-rc1 development
it doesn't seem to be used often. This must be backported as far as 1.8.
2019-07-18 16:49:11 +02:00
Olivier Houchard
0ba6c85a0b BUG/MEDIUM: checks: Don't attempt to receive data if we already subscribed.
tcpcheck_main() might be called while we already attempted to subscribe, and
failed. There's no point in trying to call rcv_buf() again, and failing
would lead to us trying to subscribe again, which is not allowed.

This should be backported to 2.0 and 1.9.
2019-07-18 16:42:45 +02:00
Willy Tarreau
8280ea97a0 MINOR: applet: make appctx use their own pool
A long time ago, applets were seen as an alternative to connections,
and since their respective sizes were roughly equal it appeared wise
to share the same pool. Nowadays, connections got significantly larger
but applets are not that often used, except for the cache. However
applets are mostly complementary and not alternatives anymore, as
it's very possible not to have a back connection or to share one with
other streams.

The connections will soon lose their addresses and their size will
shrink so much that appctx won't fit anymore. Given that the old
benefits of sharing these pools have long disappeared, let's stop
doing this and have a dedicated pool for appctx.
2019-07-18 10:45:08 +02:00
Willy Tarreau
45726fd458 BUG/MINOR: dns: remove irrelevant dependency on a client connection
The do-resolve action tests for a client connection to the stream and
tries to get the client's address, otherwise it refrains from performing
the resolution. This really makes no sense at all and looks like an
earlier attempt at resolving the client's address to test that the
code was working. Further, it prevents the action from being used
from other places such as an autonomous applet for example, even if
at the moment this use case does not exist.

This patch simply removes the irrelevant test.

This can be backported to 2.0.
2019-07-17 14:11:57 +02:00
Jérôme Magnin
34ebb5cbab DOC: management: document cache_hits and cache_lookups in the CSV format
Counters for cache_hits and cache_lookups were added with commit
a1214a50 ("MINOR: cache: report the number of cache lookups and cache
hits") but not documented in management.txt.
2019-07-17 14:11:38 +02:00
Jérôme Magnin
708eb88845 DOC: management: document reuse and connect counters in the CSV format
Counters for connect and reuse were added in the stats with commit
f1573848 ("MINOR: backend: count the number of connect and reuse
per server and per backend") but not documented the CSV format in
management.txt
2019-07-17 09:40:46 +02:00
Willy Tarreau
db5140741d [RELEASE] Released version 2.1-dev1
Released version 2.1-dev1 with the following main changes :
    - BUG/MEDIUM: h2/htx: Update data length of the HTX when the cookie list is built
    - DOC: this is a development branch again.
    - MEDIUM: Make 'block' directive fatal
    - MEDIUM: Make 'redispatch' directive fatal
    - MEDIUM: Make '(cli|con|srv)timeout' directive fatal
    - MEDIUM: Remove 'option independant-streams'
    - MINOR: sample: Add sha2([<bits>]) converter
    - MEDIUM: server: server-state global file stored in a tree
    - BUG/MINOR: lua/htx: Make txn.req_req_* and txn.res_rep_* HTX aware
    - BUG/MINOR: mux-h1: Add the header connection in lower case in outgoing messages
    - BUG/MEDIUM: compression: Set Vary: Accept-Encoding for compressed responses
    - MINOR: htx: Add the function htx_change_blk_value_len()
    - BUG/MEDIUM: htx: Fully update HTX message when the block value is changed
    - BUG/MEDIUM: mux-h2: Reset padlen when several frames are demux
    - BUG/MEDIUM: mux-h2: Remove the padding length when a DATA frame size is checked
    - BUG/MEDIUM: lb_fwlc: Don't test the server's lb_tree from outside the lock
    - BUG/MAJOR: sample: Wrong stick-table name parsing in "if/unless" ACL condition.
    - BUILD: mworker: silence two printf format warnings around getpid()
    - BUILD: makefile: use :space: instead of digits to count commits
    - BUILD: makefile: adjust the sed expression of "make help" for solaris
    - BUILD: makefile: do not rely on shell substitutions to determine git version
    - BUG/MINOR: mworker-prog: Fix segmentation fault during cfgparse
    - BUG/MINOR: spoe: Fix memory leak if failing to allocate memory
    - BUG/MEDIUM: mworker: don't call the thread and fdtab deinit
    - BUG/MEDIUM: stream_interface: Don't add SI_FL_ERR the state is < SI_ST_CON.
    - BUG/MEDIUM: connections: Always add the xprt handshake if needed.
    - BUG/MEDIUM: ssl: Don't do anything in ssl_subscribe if we have no ctx.
    - BUG/MEDIUM: mworker/cli: command pipelining doesn't work anymore
    - BUG/MINOR: htx: Save hdrs_bytes when the HTX start-line is replaced
    - BUG/MAJOR: mux-h1: Don't crush trash chunk area when outgoing message is formatted
    - BUG/MINOR: memory: Set objects size for pools in the per-thread cache
    - BUG/MINOR: log: Detect missing sampling ranges in config
    - BUG/MEDIUM: proto_htx: Don't add EOM on 1xx informational messages
    - BUG/MEDIUM: mux-h1: Use buf_room_for_htx_data() to detect too large messages
    - BUG/MINOR: mux-h1: Make format errors during output formatting fatal
    - BUG/MEDIUM: ssl: Don't attempt to set alpn if we're not using SSL.
    - BUG/MEDIUM: mux-h1: Always release H1C if a shutdown for writes was reported
    - BUG/MINOR: mworker/cli: don't output a \n before the response
    - BUG/MEDIUM: checks: unblock signals in external checks
    - BUG/MINOR: mux-h1: Skip trailers for non-chunked outgoing messages
    - BUG/MINOR: mux-h1: Don't return the empty chunk on HEAD responses
    - BUG/MEDIUM: connections: Always call shutdown, with no linger.
    - BUG/MEDIUM: checks: Make sure the tasklet won't run if the connection is closed.
    - BUG/MINOR: contrib/prometheus-exporter: Don't use channel_htx_recv_max()
    - BUG/MINOR: hlua: Don't use channel_htx_recv_max()
    - BUG/MEDIUM: channel/htx: Use the total HTX size in channel_htx_recv_limit()
    - BUG/MINOR: hlua/htx: Respect the reserve when HTX data are sent
    - BUG/MINOR: contrib/prometheus-exporter: Respect the reserve when data are sent
    - BUG/MEDIUM: connections: Make sure we're unsubscribe before upgrading the mux.
    - BUG/MEDIUM: servers: Authorize tfo in default-server.
    - BUG/MEDIUM: sessions: Don't keep an extra idle connection in sessions.
    - MINOR: server: Add "no-tfo" option.
    - BUG/MINOR: contrib/prometheus-exporter: Don't try to add empty data blocks
    - MINOR: action: Add the return code ACT_RET_DONE for actions
    - BUG/MEDIUM: http/applet: Finish request processing when a service is registered
    - BUG/MEDIUM: lb_fas: Don't test the server's lb_tree from outside the lock
    - BUG/MEDIUM: mux-h1: Handle TUNNEL state when outgoing messages are formatted
    - BUG/MINOR: mux-h1: Don't process input or ouput if an error occurred
    - MINOR: stream-int: Factorize processing done after sending data in si_cs_send()
    - BUG/MEDIUM: stream-int: Don't rely on CF_WRITE_PARTIAL to unblock opposite si
    - DOC: contrib: spoa_server Add some hints for building spoa_server
    - DOC: Fix typo in intro.txt
    - BUG/MEDIUM: servers: Don't forget to set srv_cs to NULL if we can't reuse it.
    - BUG/MINOR: ssl: revert empty handshake detection in OpenSSL <= 1.0.2
    - MINOR: pools: release the pool's lock during the malloc/free calls
    - MINOR: pools: always pre-initialize allocated memory outside of the lock
    - MINOR: pools: make the thread harmless during the mmap/munmap syscalls
    - BUG/MEDIUM: fd/threads: fix excessive CPU usage on multi-thread accept
    - BUG/MINOR: server: Be really able to keep "pool-max-conn" idle connections
    - BUG/MEDIUM: checks: Don't attempt to read if we destroyed the connection.
    - BUG/MEDIUM: da: cast the chunk to string.
    - DOC: Fix typos and grammer in configuration.txt
    - CLEANUP: proto_tcp: Remove useless header inclusions.
    - BUG/MEDIUM: servers: Fix a race condition with idle connections.
    - MINOR: task: introduce work lists
    - BUG/MAJOR: listener: fix thread safety in resume_listener()
    - BUG/MEDIUM: mux-h1: Don't release h1 connection if there is still data to send
    - BUG/MINOR: mux-h1: Correctly report Ti timer when HTX and keepalives are used
    - BUG/MEDIUM: streams: Don't give up if we couldn't send the request.
    - BUG/MEDIUM: streams: Don't redispatch with L7 retries if redispatch isn't set.
    - BUG/MINOR: mux-pt: do not pretend there's more data after a read0
    - BUG/MEDIUM: tcp-check: unbreak multiple connect rules again
    - MEDIUM: mworker-prog: Add user/group options to program section
    - REGTESTS: checks: tcp-check connect to multiple ports
    - BUG/MEDIUM: threads: cpu-map designating a single thread/process are ignored
2019-07-16 19:15:28 +02:00
Willy Tarreau
7764a57d32 BUG/MEDIUM: threads: cpu-map designating a single thread/process are ignored
Since commit 81492c989 ("MINOR: threads: flatten the per-thread cpu-map"),
we don't keep the proc*thread matrix anymore to represent the full binding
possibilities, but only the proc and thread ones. The problem is that the
per-process binding is not the same for each thread and for the process,
and the proc[] array was assumed to store the per-proc first thread value
when doing this change. Worse, the logic present there tries to deal with
thread ranges and process ranges in a way which automatically exclused the
other possibility (since ranges cannot be used on both) but as such fails
to apply changes if neither the process nor the thread is expressed as a
range.

The real problem comes from the fact that specifying cpu-map 1/1 doesn't
yet reveal if the per-process mask or the per-thread mask needs to be
updated. In practice it's the thread one but then the current storage
doesn't allow to store the binding of the first thread of each other
process in nbproc>1 configurations.

When removing the proc*thread matrix, what ought to have been kept was
both the thread column for process 1 and the process line for threads 1,
but instead only the thread column was kept. This patch reintroduces the
storage of the configuration for the first thread of each process so that
it is again possible to store either the per-thread or per-process
configuration.

As a partial workaround for existing configurations, it is possible to
systematically indicate at least two processes or two threads at once
and map them by pairs or more so that at least two values are present
in the range. E.g :

  # set processes 1-4 to cpus 0-3 :

     cpu-map auto:1-4/1 0 1 2 3
  # or:
     cpu-map 1-2/1 0 1
     cpu-map 2-3/1 2 3

  # set threads 1-4 to cpus 0-3 :

     cpu-map auto:1/1-4 0 1 2 3
  # or :
     cpu-map 1/1-2 0 1
     cpu-map 3/3-4 2 3

This fix must be backported to 2.0.
2019-07-16 15:23:09 +02:00
Jérôme Magnin
885f64fb6d REGTESTS: checks: tcp-check connect to multiple ports
This test uses two sets of tcp-check connect port rules, with one
of the two ports being closed and expects the check to fail for both
backends at different steps. It aims at detecting regressions such as
the one fixed by 7df8ca62 (BUG/MEDIUM: tcp-check: unbreak multiple
connect rules again).
2019-07-16 10:20:52 +02:00
Andrew Heberle
9723696759 MEDIUM: mworker-prog: Add user/group options to program section
This patch adds "user" and "group" config options to the "program"
section so the configured command can be run as a different user.
2019-07-15 16:43:16 +02:00
Willy Tarreau
7df8ca6296 BUG/MEDIUM: tcp-check: unbreak multiple connect rules again
The last connect rule used to be ignored and that was fixed by commit
248f1173f ("BUG/MEDIUM: tcp-check: single connect rule can't detect
DOWN servers") during 1.9 development. However this patch went a bit
too far by not breaking out of the loop after a pending connect(),
resulting in a series of failed connect() to be quickly skipped and
only the last one to be taken into account.

Technically speaking the series is not exactly skipped, it's just that
TCP checks suffer from a design issue which is that there is no
distinction between a new rule and this rule's completion in the
"connect" rule handling code. As such, when evaluating TCPCHK_ACT_CONNECT
a new connection is created regardless of any previous connection in
progress, and the previous result is ignored. It seems that this issue
is mostly specific to the connect action if we refer to the comments at
the top of the function, so it might be possible to durably address it
by reworking the connect state.

For now this patch does something simpler, it restores the behaviour
before the commit above consisting in breaking out of the loop when
the connection is in progress and after skipping comment rules. This
way we fall back to the default code waiting for completion.

This patch must be backported as far as 1.8 since the commit above
was backported there. Thanks to Jérôme Magnin for reporting and
bisecting this issue.
2019-07-15 11:10:36 +02:00
Willy Tarreau
9cca8dfc0b BUG/MINOR: mux-pt: do not pretend there's more data after a read0
Commit 8706c8131 ("BUG/MEDIUM: mux_pt: Always set CS_FL_RCV_MORE.")
was a bit excessive in setting this flag, it refrained from removing
it after read0 unless it was on an empty call. The problem it causes
is that read0 is thus ignored on the first call :

  $ strace -tts200 -e trace=recvfrom,epoll_wait,sendto  ./haproxy -db -f tcp.cfg
  06:34:23.956897 recvfrom(9, "blah\n", 15360, 0, NULL, NULL) = 5
  06:34:23.956938 recvfrom(9, "", 15355, 0, NULL, NULL) = 0
  06:34:23.956958 recvfrom(9, "", 15355, 0, NULL, NULL) = 0
  06:34:23.957033 sendto(8, "blah\n", 5, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 5
  06:34:23.957229 epoll_wait(3, [{EPOLLIN|EPOLLHUP|EPOLLRDHUP, {u32=8, u64=8}}], 200, 0) = 1
  06:34:23.957297 recvfrom(8, "", 15360, 0, NULL, NULL) = 0

If CO_FL_SOCK_RD_SH is reported by the transport layer, it indicates the
read0 was already seen thus we must not try again and we must immedaitely
report it. The simple fix consists in removing the test on ret==0 :

  $ strace -tts200 -e trace=recvfrom,epoll_wait,sendto  ./haproxy -db -f tcp.cfg
  06:44:21.634835 recvfrom(9, "blah\n", 15360, 0, NULL, NULL) = 5
  06:44:21.635020 recvfrom(9, "", 15355, 0, NULL, NULL) = 0
  06:44:21.635056 sendto(8, "blah\n", 5, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 5
  06:44:21.635269 epoll_wait(3, [{EPOLLIN|EPOLLHUP|EPOLLRDHUP, {u32=8, u64=8}}], 200, 0) = 1
  06:44:21.635330 recvfrom(8, "", 15360, 0, NULL, NULL) = 0

The issue is minor, it only results in extra syscalls and CPU usage.
This fix should be backported to 2.0 and 1.9.
2019-07-15 06:47:54 +02:00
Olivier Houchard
4bd5867627 BUG/MEDIUM: streams: Don't redispatch with L7 retries if redispatch isn't set.
Move the logic to decide if we redispatch to a new server from
sess_update_st_cer() to a new inline function, stream_choose_redispatch(), and
use it in do_l7_retry() instead of just setting the state to SI_ST_REQ.
That way, when using L7 retries, we won't redispatch the request to another
server except if "option redispatch" is used.

This should be backported to 2.0.
2019-07-12 16:17:50 +02:00
Olivier Houchard
29cac3c5f7 BUG/MEDIUM: streams: Don't give up if we couldn't send the request.
In htx_request_forward_body(), don't give up if we failed to send the request,
and we have L7 retries activated. If we do, we will not retry when we should.

This should be backported to 2.0.
2019-07-12 16:17:50 +02:00
Dave Pirotte
234740f65d BUG/MINOR: mux-h1: Correctly report Ti timer when HTX and keepalives are used
When HTTP keepalives are used in conjunction with HTX, the Ti timer
reports the elapsed time since the beginning of the connection instead
of the end of the previous request as stated in the documentation. Th,
Tq and Tt also report incorrectly as a result.

When creating a new h1s, check if it is the first request on the
connection. If not, set the session create times to the current
timestamp rather than the initial session accept timestamp. This makes
the logged timers behave as stated in the documentation.

This fix should be backported to 1.9 and 2.0.
2019-07-12 16:14:12 +02:00
Christopher Faulet
37243bc61f BUG/MEDIUM: mux-h1: Don't release h1 connection if there is still data to send
When the h1 stream (h1s) is detached, If the connection is not really shutdown
yet and if there is still some data to send, the h1 connection (h1c) must not be
released. Otherwise, the remaining data are lost. This bug was introduced by the
commit 3ac0f430 ("BUG/MEDIUM: mux-h1: Always release H1C if a shutdown for
writes was reported").

Here is the conditions to release an h1 connection when the h1 stream is
detached :

  * An error or a shutdown write occurred on the connection
    (CO_FL_ERROR|CO_FL_SOCK_WR_SH)

  * an error, an h2 upgrade or full shutdown occurred on the h1 connection
    (H1C_F_CS_ERROR||H1C_F_UPG_H2C|H1C_F_CS_SHUTDOWN)

  * A shutdown write is pending on the h1 connection and there is no more data
    in the output buffer
    ((h1c->flags & H1C_F_CS_SHUTW_NOW) && !b_data(&h1c->obuf))

If one of these conditions is fulfilled, the h1 connection is
released. Otherwise, the release is delayed. If we are waiting to send remaining
data, a timeout is set.

This patch must be backported to 2.0 and 1.9. It fixes the issue #164.
2019-07-12 10:06:41 +02:00
Willy Tarreau
f2cb169487 BUG/MAJOR: listener: fix thread safety in resume_listener()
resume_listener() can be called from a thread not part of the listener's
mask after a curr_conn has gone lower than a proxy's or the process' limit.
This results in fd_may_recv() being called unlocked if the listener is
bound to only one thread, and quickly locks up.

This patch solves this by creating a per-thread work_list dedicated to
listeners, and modifying resume_listener() so that it bounces the listener
to one of its owning thread's work_list and waking it up. This thread will
then call resume_listener() again and will perform the operation on the
file descriptor itself. It is important to do it this way so that the
listener's state cannot be modified while the listener is being moved,
otherwise multiple threads can take conflicting decisions and the listener
could be put back into the global queue if the listener was used at the
same time.

It seems like a slightly simpler approach would be possible if the locked
list API would provide the ability to return a locked element. In this
case the listener would be immediately requeued in dequeue_all_listeners()
without having to go through resume_listener() with its associated lock.

This fix must be backported to all versions having the lock-less accept
loop, which is as far as 1.8 since deadlock fixes involving this feature
had to be backported there. It is expected that the code should not differ
too much there. However, previous commit "MINOR: task: introduce work lists"
will be needed as well and should not present difficulties either. For 1.8,
the commits introducing thread_mask() and LIST_ADDED() will be needed as
well, either backporting my_flsl() or switching to my_ffsl() will be OK,
and some changes will have to be performed so that the init function is
properly called (and maybe the deinit one can be dropped).

In order to test for the fix, simply set up a multi-threaded frontend with
multiple bind lines each attached to a single thread (reproduced with 16
threads here), set up a very low maxconn value on the frontend, and inject
heavy traffic on all listeners in parallel with slightly more connections
than the configured limit ( typically +20%) so that it flips very
frequently. If the bug is still there, at some point (5-20 seconds) the
traffic will go much lower or even stop, either with spinning threads or
not.
2019-07-12 09:07:48 +02:00
Willy Tarreau
64e6012eb9 MINOR: task: introduce work lists
Sometimes we need to delegate some list processing to a function running
on another thread. In this case the list element will simply be queued
into a dedicated self-locked list and the task responsible for this list
will be woken up, calling the associated function which will run over the
list.

This is what work_list does. Such lists will be dedicated to a limited
type of work but will significantly ease such remote handling. A function
is provided to create these per-thread lists, their tasks and to properly
bind each task to a distinct thread, so that the caller only has to store
the resulting pointer to the start of the structure.

These structures should not be abused though as each head will consume
4 pointers per thread, hence 32 bytes per thread or 2 kB for 64 threads.
2019-07-12 09:07:48 +02:00
Olivier Houchard
4be7190c10 BUG/MEDIUM: servers: Fix a race condition with idle connections.
When we're purging idle connections, there's a race condition, when we're
removing the connection from the idle list, to add it to the list of
connections to free, if the thread owning the connection tries to free it
at the same time.
To fix this, simply add a per-thread lock, that has to be hold before
removing the connection from the idle list, and when, in conn_free(), we're
about to remove the connection from every list. That way, we know for sure
the connection will stay valid while we remove it from the idle list, to add
it to the list of connections to free.
This should happen rarely enough that it shouldn't have any impact on
performances.
This has not been reported yet, but could provoke random segfaults.

This should be backported to 2.0.
2019-07-11 16:16:38 +02:00
Frédéric Lécaille
51596c166b CLEANUP: proto_tcp: Remove useless header inclusions.
I guess "sys/un.h" and "sys/stat.h" were included for debugging purposes when
"proto_tcp.c" was initially created. There are no more useful.
2019-07-11 10:40:20 +02:00
John Roesler
fb2fce1680 DOC: Fix typos and grammer in configuration.txt 2019-07-11 10:25:53 +02:00
David Carlier
7df4185f3c BUG/MEDIUM: da: cast the chunk to string.
in fetch mode, the output was incorrect, setting the type to string
explicitally.

This should be backported to all stable versions.
2019-07-11 10:20:09 +02:00
Olivier Houchard
bc89ad8d94 BUG/MEDIUM: checks: Don't attempt to read if we destroyed the connection.
In event_srv_chk_io(), only call __event_srv_chk_r() if we did not subscribe
for reading, and if wake_srv_chk() didn't return -1, as it would mean it
just destroyed the connection and the conn_stream, and attempting to use
those to recv data would lead to a crash.

This should be backported to 1.9 and 2.0.
2019-07-10 16:29:12 +02:00
Christopher Faulet
34ce7d075a BUG/MINOR: server: Be really able to keep "pool-max-conn" idle connections
The maximum number of idle connections for a server can be configured by setting
the server option "pool-max-conn". But when we try to add a connection in its
idle list, because of a wrong comparison, it may be rejected because there are
already "pool-max-conn - 1" idle connections.

This patch must be backported to 2.0 and 1.9.
2019-07-10 14:20:52 +02:00
Willy Tarreau
1dad3843dc BUG/MEDIUM: fd/threads: fix excessive CPU usage on multi-thread accept
While experimenting with potentially improved fairness and latency using
ticket locks on a Ryzen 16-thread/8-core, a very strange situation happened
a lot for some levels of traffic. Around 300k connections per second, no
more connections would be accepted on the multi-threaded listener but all
others would continue to work fine. All attempts to trace showed that the
threads were all in the trylock in the fd cache, or in the spinlock of
fd_update_events(), or in the one of fd_may_recv(). But as indicated this
was not a deadlock since the process continues to work fine.

After quite some investigation it appeared that the issue is caused by a
lack of fairness between the fdcache's trylock and these functions' spin
locks above. In fact, regardless of the success or failure of the fdcache's
attempt at grabbing the lock, the poller was calling fd_update_events()
which locks the FD once for something that can be done with a CAS, and
then calls fd_may_recv() with another lock for something that most often
didn't change. The high contention on these spinlocks leaves no chance to
any other thread to grab the lock using trylock(), and once this happens,
there is no thread left to process incoming connection events nor to stop
polling on the FD, leaving all threads at 100% CPU but partially operational.

This patch addresses the issue by using bit-test-and-set instead of the OR
in fd_may_recv() / fd_may_send() so that nothing is done if the FD was
already configured as expected. It does the same in fd_update_events()
using a CAS to check if the FD's events need to be changed at all or not.
With this patch applied, it became impossible to reproduce the issue, and
now there's no way to saturate all 16 CPUs with the load used for testing,
as no more than 1350-1400 were noticed at 300+kcps vs 1600.

Ideally this patch should go further and try to remove the remaining
incarnations of the fdlock as this seems possible, but it's difficult
enough to be done in a distinct patch that will not have to be backported.

It is possible that workloads involving a high connection rate may slightly
benefit from this patch and observe a slightly lower CPU usage even when
the service doesn't misbehave.

This patch must be backported to 2.0 and 1.9.
2019-07-09 10:41:24 +02:00
Willy Tarreau
85b2cae63c MINOR: pools: make the thread harmless during the mmap/munmap syscalls
These calls can take quite some time and leave the thread harmless so
it's better to mark it as such. This makes "show sess" respond way
faster during high loads running on processes build with DEBUG_UAF
since these calls are stressed a lot.
2019-07-09 10:40:33 +02:00
Willy Tarreau
828675421e MINOR: pools: always pre-initialize allocated memory outside of the lock
When calling mmap(), in general the system gives us a page but does not
really allocate it until we first dereference it. And it turns out that
this time is much longer than the time to perform the mmap() syscall.
Unfortunately, when running with memory debugging enabled, we mmap/munmap()
each object resulting in lots of such calls and a high contention on the
allocator. And the first accesses to the page being done under the pool
lock is extremely damaging to other threads.

The simple fact of writing a 0 at the beginning of the page after
allocating it and placing the POOL_LINK pointer outside of the lock is
enough to boost the performance by 8x in debug mode and to save the
watchdog from triggering on lock contention. This is what this patch
does.
2019-07-09 10:40:33 +02:00
Willy Tarreau
3e853ea74d MINOR: pools: release the pool's lock during the malloc/free calls
The malloc and free calls and especially the underlying mmap/munmap()
can occasionally take a huge amount of time and even cause the thread
to sleep. This is visible when haproxy is compiled with DEBUG_UAF which
causes every single pool allocation/free to allocate and release pages.
In this case, when using the locked pools, the watchdog can occasionally
fire under high contention (typically requesting 40000 1M objects in
parallel over 8 threads). Then, "perf top" shows that 50% of the CPU
time is spent in mmap() and munmap(). The reason the watchdog fires is
because some threads spin on the pool lock which is held by other threads
waiting on mmap() or munmap().

This patch modifies this so that the pool lock is released during these
syscalls. Not only this allows other threads to request try to allocate
their data in parallel, but it also considerably reduces the lock
contention.

Note that the locked pools are only used on small architectures where
high thread counts would not make sense, so this will not provide any
benefit in the general case. However it makes the debugging versions
way more stable, which is always appreciated.
2019-07-09 10:40:33 +02:00
Lukas Tribus
4979916134 BUG/MINOR: ssl: revert empty handshake detection in OpenSSL <= 1.0.2
Commit 54832b97 ("BUILD: enable several LibreSSL hacks, including")
changed empty handshake detection in OpenSSL <= 1.0.2 and LibreSSL,
from accessing packet_length directly (not available in LibreSSL) to
calling SSL_state() instead.

However, SSL_state() appears to be fully broken in both OpenSSL and
LibreSSL.

Since there is no possibility in LibreSSL to detect an empty handshake,
let's not try (like BoringSSL) and restore this functionality for
OpenSSL 1.0.2 and older, by reverting to the previous behavior.

Should be backported to 2.0.
2019-07-09 04:47:18 +02:00
Olivier Houchard
a1ab97316f BUG/MEDIUM: servers: Don't forget to set srv_cs to NULL if we can't reuse it.
In connect_server(), if there were already a CS assosciated with the stream,
but we can't reuse it, because the target is different (because we tried a
previous connection, it failed, and we use redispatch so we switched servers),
don't forget to set srv_cs to NULL. Otherwise, if we end up reusing another
connection, we would consider we already have a conn_stream, and we won't
create a new one, so we'd have a new connection but we would not be able to
use it.
This can explain frozen streams and connections stuck in CLOSE_WAIT when
using redispatch.

This should be backported to 1.9 and 2.0.
2019-07-08 16:32:58 +02:00
Alain Belkadi
ac52095098 DOC: Fix typo in intro.txt 2019-07-06 11:41:37 +02:00
Aleksandar Lazic
a71447539d DOC: contrib: spoa_server Add some hints for building spoa_server 2019-07-05 16:31:50 +02:00
Christopher Faulet
037b3ebd35 BUG/MEDIUM: stream-int: Don't rely on CF_WRITE_PARTIAL to unblock opposite si
In the function stream_int_notify(), when the opposite stream-interface is
blocked because there is no more room into the input buffer, if the flag
CF_WRITE_PARTIAL is set on this buffer, it is unblocked. It is a way to unblock
the reads on the other side because some data was sent.

But it is a problem during the fast-forwarding because only the stream is able
to remove the flag CF_WRITE_PARTIAL. So it is possible to have this flag because
of a previous send while the input buffer of the opposite stream-interface is
now full. In such case, the opposite stream-interface will be woken up for
nothing because its input buffer is full. If the same happens on the opposite
side, we will have a loop consumming all the CPU.

To fix the bug, the opposite side is now only notify if there is some available
room in its input buffer in the function si_cs_send(), so only if some data was
sent.

This patch must be backported to 2.0 and 1.9.
2019-07-05 14:26:15 +02:00
Christopher Faulet
86162db15c MINOR: stream-int: Factorize processing done after sending data in si_cs_send()
In the function si_cs_send(), what is done when an error occurred on the
connection or the conn_stream or when some successfully data was send via a pipe
or the channel's buffer may be factorized at the function. It slightly simplify
the function.

This patch must be backported to 2.0 and 1.9 because a bugfix depends on it.
2019-07-05 14:26:15 +02:00
Christopher Faulet
0e54d547f1 BUG/MINOR: mux-h1: Don't process input or ouput if an error occurred
It is useless to proceed if an error already occurred. Instead, it is better to
wait it will be catched by the stream or the connection, depending on which is
the first one to detect it.

This patch must be backported to 2.0.
2019-07-05 14:26:15 +02:00
Christopher Faulet
f8db73efbe BUG/MEDIUM: mux-h1: Handle TUNNEL state when outgoing messages are formatted
Since the commit 94b2c7 ("MEDIUM: mux-h1: refactor output processing"), the
formatting of outgoing messages is performed on the message state and no more on
the HTX blocks read. But the TUNNEL state was left out. So, the HTTP tunneling
using the CONNECT method or switching the protocol (for instance,
the WebSocket) does not work.

This issue was reported on Github. See #131. This patch must be backported to
2.0.
2019-07-05 14:26:15 +02:00
Christopher Faulet
16b2be93ad BUG/MEDIUM: lb_fas: Don't test the server's lb_tree from outside the lock
In the function fas_srv_reposition(), the server's lb_tree is tested from
outside the lock. So it is possible to remove it after the test and then call
eb32_insert() in fas_queue_srv() with a NULL root pointer, which is
invalid. Moving the test in the scope of the lock fixes the bug.

This issue was reported on Github, issue #126.

This patch must be backported to 2.0, 1.9 and 1.8.
2019-07-05 14:26:15 +02:00
Christopher Faulet
8f1aa77b42 BUG/MEDIUM: http/applet: Finish request processing when a service is registered
In the analyzers AN_REQ_HTTP_PROCESS_FE/BE, when a service is registered, it is
important to not interrupt remaining processing but just the http-request rules
processing. Otherwise, the part that handles the applets installation is
skipped.

Among the several effects, if the service is registered on a frontend (not a
listen), the forwarding of the request is skipped because all analyzers are not
set on the request channel. If the service does not depends on it, the response
is still produced and forwarded to the client. But the stream is infinitly
blocked because the request is not fully consumed. This issue was reported on
Github, see #151.

So this bug is fixed thanks to the new action return ACT_RET_DONE. Once a
service is registered, the action process_use_service() still returns
ACT_RET_STOP. But now, only rules processing is stopped. As a side effet, the
action http_action_reject() must now return ACT_RET_DONE to really stop all
processing.

This patch must be backported to 2.0. It depends on the commit introducing the
return code ACT_RET_DONE.
2019-07-05 14:26:14 +02:00
Christopher Faulet
2e4843d1d2 MINOR: action: Add the return code ACT_RET_DONE for actions
This code should be now used by action to stop at the same time the rules
processing and the possible following processings. And from its side, the return
code ACT_RET_STOP should be used to only stop rules processing.

So concretely, for TCP rules, there is no changes. ACT_RET_STOP and ACT_RET_DONE
are handled the same way. However, for HTTP rules, ACT_RET_STOP should now be
mapped on HTTP_RULE_RES_STOP and ACT_RET_DONE on HTTP_RULE_RES_DONE. So this
way, a action will have the possibilty to stop all processing or only rules
processing.

Note that changes about the TCP is done in this commit but changes about the
HTTP will be done in another one because it will fix a bug in the same time.

This patch must be backported to 2.0 because a bugfix depends on it.
2019-07-05 14:26:14 +02:00
Christopher Faulet
0c55a15ce1 BUG/MINOR: contrib/prometheus-exporter: Don't try to add empty data blocks
When the response buffer is full and nothing more can be inserted, it is
important to not try to insert an empty data block. Otherwise, when the function
channel_add_input() is called, the flag CF_READ_PARTIAL is set on the response
channel while nothing was read and the stream is uselessly woken up. Finally, we
have loop while the response buffer is full.

This patch must be backported to 2.0.
2019-07-05 14:26:14 +02:00
Frédéric Lécaille
1b9423d214 MINOR: server: Add "no-tfo" option.
Simple patch to add "no-tfo" option to "default-server" and "server" lines
to disable any usage of TCP fast open.

Must be backported to 2.0.
2019-07-04 14:45:52 +02:00
Olivier Houchard
cee0389088 BUG/MEDIUM: sessions: Don't keep an extra idle connection in sessions.
When deciding if we keep an idle connection in the session, check if
the number of connections currently in the session is >= the max allowed,
not >, or we'll keep an extra connection.

This should be backported to 1.9 and 2.0.
2019-07-04 14:28:18 +02:00
Olivier Houchard
8d82db70a5 BUG/MEDIUM: servers: Authorize tfo in default-server.
There's no reason to forbid using tfo with default-server, so allow it.

This should be backported to 2.0.
2019-07-04 13:34:25 +02:00
Olivier Houchard
2ab3dada01 BUG/MEDIUM: connections: Make sure we're unsubscribe before upgrading the mux.
Just calling conn_force_unsubscribe() from conn_upgrade_mux_fe() is not
enough, as there may be multiple XPRT involved. Instead, require that
any user of conn_upgrade_mux_fe() unsubscribe itself before calling it.
This should fix upgrading a TCP connection to HTX when using SSL.

This should be backported to 2.0.
2019-07-03 13:57:30 +02:00
Christopher Faulet
11921e6819 BUG/MINOR: contrib/prometheus-exporter: Respect the reserve when data are sent
The previous commit e6cdfe574 ("BUG/MINOR: contrib/prometheus-exporter: Don't
use channel_htx_recv_max()") is buggy. The buffer's reserve must be respected.

This patch must be backported to 2.0 and 1.9.
2019-07-03 11:47:20 +02:00
Christopher Faulet
9060fc02b5 BUG/MINOR: hlua/htx: Respect the reserve when HTX data are sent
The previous commit 7e145b3e2 ("BUG/MINOR: hlua: Don't use
channel_htx_recv_max()") is buggy. The buffer's reserve must be respected.

This patch must be backported to 2.0 and 1.9.
2019-07-03 11:47:20 +02:00
Christopher Faulet
621da6bafa BUG/MEDIUM: channel/htx: Use the total HTX size in channel_htx_recv_limit()
The receive limit of an HTX channel must be calculated against the total size of
the HTX message. Otherwise, the buffer may never be seen as full whereas the
receive limit is 0. Indeed, the function channel_htx_full() already takes care
to add a block size to the buffer's reserve (8 bytes). So if the function
channel_htx_recv_limit() also keep a block size free in addition to the buffer's
reserve, it means that at least 2 block size will be kept free but only one will
be taken into account, freezing the stream if the option http-buffer-request is
enabled.

This patch fixes the Github issue #136. It should be backported to 2.0 and
1.9. Thanks jaroslawr (Jarosław Rzeszótko) for his help.
2019-07-02 21:32:45 +02:00
Christopher Faulet
7e145b3e24 BUG/MINOR: hlua: Don't use channel_htx_recv_max()
The function htx_free_data_space() must be used intead. Otherwise, if there are
some output data not already forwarded, the maximum amount of data that may be
inserted into the buffer may be greater than what we can really insert.

This patch must be backported to 2.0 and 1.9.
2019-07-02 21:32:45 +02:00
Christopher Faulet
e6cdfe574e BUG/MINOR: contrib/prometheus-exporter: Don't use channel_htx_recv_max()
The function htx_free_data_space() must be used intead. Otherwise, if there are
some output data not already forwarded, the maximum amount of data that may be
inserted into the buffer may be greater than what we can really insert.

This patch must be backported to 2.0.
2019-07-02 21:08:26 +02:00