Commit Graph

9714 Commits

Author SHA1 Message Date
Willy Tarreau
03c6ab0cbb REGTEST: exclude osx and generic targets for 40be_2srv_odd_health_checks
As explained in the commit below, this test relies on TCP_DEFER_ACCEPT
which is not available everywhere, and as such fails on OSX as well :
15685c791 ("REGTEST: Exclude freebsd target for some reg tests.")
2019-04-25 08:39:48 +02:00
Tim Duesterhus
88c63a6e55 BUILD: extend Travis CI config to support more platforms
This commit extends the Travis CI configuration to build HAProxy
with gcc on Linux, clang on Mac and cleans up the build flag
configuration to be easier extendable.

Note: At the moment HAProxy fails on Travis for configurations
on OS X
2019-04-25 08:24:29 +02:00
Willy Tarreau
22d63a24d9 MINOR: applet: measure and report an appctx's call rate in "show sess"
Very similarly to previous commit doing the same for streams, we now
measure and report an appctx's call rate. This will help catch applets
which do not consume all their data and/or which do not properly report
that they're waiting for something else. Some of them like peers might
theorically be able to exhibit some occasional peeks when teaching a
full table to a nearby peer (e.g. the new replacement process), but
nothing close to what a bogus service can do so there is no risk of
confusion.
2019-04-24 16:04:23 +02:00
Willy Tarreau
2e9c1d2960 MINOR: stream: measure and report a stream's call rate in "show sess"
Quite a few times some bugs have made a stream task incorrectly
handle a complex combination of events, which was often reported as
"100% CPU", and was usually caused by the event not being properly
identified and flushed, and the stream's handler called in loops.

This patch adds a call rate counter to the stream struct. It's not
huge, it's really inexpensive (especially compared to the rest of the
processing function) and will easily help spot such tasks in "show sess"
output, possibly even allowing to kill them.

A future patch should probably consist in alerting when they're above a
certain threshold, possibly sending a dump and killing them. Some options
could also consist in aborting in order to get an analyzable core dump
and let a service manager restart a fresh new process.
2019-04-24 16:04:23 +02:00
Willy Tarreau
0212fadd65 MINOR: tasks/activity: report the context switch and task wakeup rates
It's particularly useful to spot runaway tasks to see this. The context
switch rate covers all tasklet calls (tasks and I/O handlers) while the
task wakeups only covers tasks picked from the run queue to be executed.
High values there will indicate either an intense traffic or a bug that
mades a task go wild.
2019-04-24 16:04:23 +02:00
Willy Tarreau
69b5a7f1a3 CLEANUP: task: report calls as unsigned in show sess
The "show sess" output used signed ints to report the number of calls,
which is confusing for runaway tasks where the call count can turn
negative.
2019-04-24 16:04:23 +02:00
Christopher Faulet
4904058661 BUG/MINOR: htx: Exclude TCP proxies when the HTX mode is handled during startup
When tests are performed on the HTX mode during HAProxy startup, only HTTP
proxies are considered. It is important because, since the commit 1d2b586cd
("MAJOR: htx: Enable the HTX mode by default for all proxies"), the HTX is
enabled on all proxies by default. But for TCP proxies, it is "deactivated".

This patch must be backported to 1.9.
2019-04-24 15:40:02 +02:00
Christopher Faulet
c1918d1a8f BUG/MAJOR: muxes: Use the HTX mode to find the best mux for HTTP proxies only
Since the commit 1d2b586cd ("MAJOR: htx: Enable the HTX mode by default for all
proxies"), the HTX is enabled by default for all proxies, HTTP and TCP, but also
CLI and HEALTH proxies. But when the best mux is retrieved, only HTTP and TCP
modes are checked. If the TCP mode is not explicitly set, it is considered as an
HTTP proxy. It is an hidden bug introduced when the option "http-use-htx" was
added. It has no effect until the commit 1d2b586cd. But now, when a stats socket
is created for the master process, the mux h1 is installed on all incoming
connections to the CLI proxy, leading to segfaults because HTX operations are
performed on raw buffers.

So to fix the buf, when a mux is installed, all proxies are considered as TCP
proxies, except HTTP ones. This way, CLI and HEALTH proxies will be handled as
TCP proxies.

This patch must be backported to 1.9 although it has no effect. It is safer to
not keep hidden bugs.
2019-04-24 15:40:02 +02:00
Willy Tarreau
274ba67862 BUG/MAJOR: lb/threads: fix AB/BA locking issue in round-robin LB
An occasional divide by zero in the round-robin scheduler was addressed
in commit 9df86f997 ("BUG/MAJOR: lb/threads: fix insufficient locking on
round-robin LB") by grabing the server's lock in fwrr_get_server_from_group().

But it happens that this is not the correct approach as it introduces a
case of AB/BA deadlock reported by Maksim Kupriianov. This happens when
a server weight changes from/to zero while another thread extracts this
server from the tree. The reason is that the functions used to manipulate
the state work under the server's lock and grab the LB lock while the ones
used in LB work under the LB lock and grab the server's lock when needed.

This commit mostly reverts the changes above and instead further completes
the locking analysis performed on this code to identify areas that really
need to be protected by the server's lock, since this is the only algorithm
which happens to have this requirement. This audit showed that in fact all
locations which require the server's lock are already protected by the LB
lock. This was not noticed the first time due to the server's lock being
taken instead and due to some functions misleadingly using atomic ops to
modify server fields which are under the LB lock protection (these ones
were now removed).

The change consists in not taking the server's lock anymore here, and
instead making sure that the aforementioned function which used to
suffer from the server's weight becoming zero only uses a copy of the
weight which was preliminary verified to be non-null (when the weight
is null, the server will be removed from the tree anyway so there is
no need to recalculate its position).

With this change, the code survived an injection at 200k req/s split
on two servers with weights changing 50 times a second.

This commit must be backported to 1.9 only.
2019-04-24 14:23:40 +02:00
Olivier Houchard
a28454ee21 BUG/MEDIUM: ssl: Return -1 on recv/send if we got EAGAIN.
In ha_ssl_read()/ha_ssl_write(), if we couldn't send/receive data because
we got EAGAIN, return -1 and not 0, as older SSL versions expect that.
This should fix the problems with OpenSSL < 1.1.0.
2019-04-24 12:06:08 +02:00
Christopher Faulet
371723b0c2 BUG/MINOR: spoe: Don't systematically wakeup SPOE stream in the applet handler
This can lead to wakeups in loop between the SPOE stream and the SPOE applets
waiting to receive agent messages (mainly AGENT-HELLO and AGENT-DISCONNECT).

This patch must be backported to 1.9 and 1.8.
2019-04-23 21:20:47 +02:00
Christopher Faulet
5e1a9d715e BUG/MEDIUM: stream: Fix the way early aborts on the client side are handled
A regression was introduced with the commit c9aecc8ff ("BUG/MEDIUM: stream:
Don't request a server connection if a shutw was scheduled"). Among other this,
it breaks the CLI when the shutr on the client side is handled with the client
data. To depend on the flag CF_SHUTW_NOW to not establish the server connection
when an error on the client side is detected is the right way to fix the bug,
because this flag may be set without any error on the client side.

So instead, we abort the request where the error is handled and only when the
backend stream-interface is in the state SI_ST_INI. This way, there is no
ambiguity on the reason why the abort accurred. The stream-interface is also
switched to the state SI_ST_CLO.

This patch must be backported to 1.9. If the commit c9aecc8ff is backported to
previous versions, this one MUST also be backported. Otherwise, it MAY be
backported to older versions that 1.9 with caution.
2019-04-23 21:20:47 +02:00
Frdric Lcaille
bed883abe8 BUG/MAJOR: stream: Missing DNS context initializations.
Fix some missing initializations wich came with 333939c commit (MINOR: action:
new '(http-request|tcp-request content) do-resolve' action). The DNS contexts of
streams which were allocated were not initialized by stream_new(). This leaded to
accesses to non-allocated memory when freeing these contexts with stream_free().
2019-04-23 20:24:11 +02:00
Willy Tarreau
ca8df4c074 REGTEST: make the "run-regtests" script search for tests in reg-tests by default
It happens almost daily to me that make regtests fails because the script
found a temporary, old, or broken VTC file that was lying in my work dir,
leaving me no place to hide it. This is a real pain as some tests take ages
to fail, so let's make this script only look up for tests where they are
expected to be stored, under reg-tests only. It remains possible to force
the location on the command line though.
2019-04-23 16:09:50 +02:00
Frdric Lcaille
b894f9230c REGTEST: adapt some reg tests after renaming.
Some reg tests and their dependencies have been renamed. They may be
referenced by the .vtc files. So, this patch modifies also the references
to these dependencies.
2019-04-23 15:37:11 +02:00
Frdric Lcaille
d7a8f14145 REGTEST: rename the reg test files.
We rename all the VTC files to avoid name collisions when importing/backporting.
2019-04-23 15:37:03 +02:00
Frdric Lcaille
dc1a3bd999 REGTEST: replace LEVEL option by a more human readable one.
This patch replaces LEVEL variable by REGTESTS_TYPES variable which is more
mnemonic and human readable. It is uses as a filter to run the reg tests scripts
where a commented #REGTEST_TYPE may be defined to designate their types.
Running the following command:

    $ REGTESTS_TYPES=slow,default

will start all the reg tests where REGTEST_TYPE is defines as 'slow' or 'default'.
Note that 'default' is also the default value of REGTEST_TYPE when not specified
dedicated to run all the current h*.vtc files. When REGTESTS_TYPES is not specified
there is no filter at all. All the tests are run.

This patches also defines REGTEST_TYPE with 'slow' value for all the s*.vtc files,
'bug' value for al the b*.vtc files, 'broken' value for all the k*.vtc files.
2019-04-23 15:14:52 +02:00
Frdric Lcaille
0bad840b4d MINOR: log: Extract some code to send syslog messages.
This patch extracts the code of __send_log() responsible of sending a syslog
message to a syslog destination represented as a logsrv struct to define
__do_send_log() function. __send_log() calls __do_send_log() for each syslog
destination of a proxy after having prepared some of its parameters.
2019-04-23 14:16:51 +02:00
Baptiste Assmann
333939c2ee MINOR: action: new '(http-request|tcp-request content) do-resolve' action
The 'do-resolve' action is an http-request or tcp-request content action
which allows to run DNS resolution at run time in HAProxy.
The name to be resolved can be picked up in the request sent by the
client and the result of the resolution is stored in a variable.
The time the resolution is being performed, the request is on pause.
If the resolution can't provide a suitable result, then the variable
will be empty. It's up to the admin to take decisions based on this
statement (return 503 to prevent loops).

Read carefully the documentation concerning this feature, to ensure your
setup is secure and safe to be used in production.

This patch creates a global counter to track various errors reported by
the action 'do-resolve'.
2019-04-23 11:41:52 +02:00
Baptiste Assmann
0b9ce82dfa MINOR: obj_type: new object type for struct stream
This patch creates a new obj_type for the struct stream in HAProxy.
2019-04-23 11:35:56 +02:00
Baptiste Assmann
db4c8521ca MINOR: dns: move callback affection in dns_link_resolution()
In dns.c, dns_link_resolution(), each type of dns requester is managed
separately, that said, the callback function is affected globaly (and
points to server type callbacks only).
This design prevents the addition of new dns requester type and this
patch aims at fixing this limitation: now, the callback setting is done
directly into the portion of code dedicated to each requester type.
2019-04-23 11:34:11 +02:00
Baptiste Assmann
dfd35fd71a MINOR: dns: dns_requester structures are now in a memory pool
dns_requester structure can be allocated at run time when servers get
associated to DNS resolution (this happens when SRV records are used in
conjunction with service discovery).
Well, this memory allocation is safer if managed in an HAProxy pool,
furthermore with upcoming HTTP action which can perform DNS resolution
at runtime.

This patch moves the memory management of the dns_requester structure
into its own pool.
2019-04-23 11:33:48 +02:00
paulborile
cd9b9bd3e4 MINOR: contrib: dummy wurfl library
This is dummy version of the Scientiamobile WURFL C API that can be used
to successfully build/run haproxy compiled with USE_WURFL=1.
It is marked as version 1.11.2.100 to distinguish it from any real version
of the lib. It has no external dependencies so it should work out of the
box by building it like this :

   $ make -C contrib/wurfl

In order to use it, simply reference this directory as the WURFL include
and library paths :

   $ make TARGET=<target> USE_WURFL=1 WURFL_INC=$PWD/contrib/wurfl WURFL_LIB=$PWD/contrib/wurfl
2019-04-23 11:00:23 +02:00
paulborile
7714b12604 MINOR: wurfl: enabled multithreading mode
Initially excluded multithreaded mode is completely supported (libwurfl is fully MT safe).
Internal tests now are run also with multithreading enabled.
2019-04-23 11:00:23 +02:00
paulborile
c81b9bf7b4 DOC: wurfl: added point of contact in MAINTAINERS file 2019-04-23 11:00:23 +02:00
paulborile
bad132c384 CLEANUP: wurfl: removed deprecated methods
last 2 major releases of libwurfl included a complete review of engine options with
the result of deprecating many features. The patch removes unecessary code and fixes
the documentation.
Can be backported on any version of haproxy.

[wt: must not be backported since it removes config keywords and would
 thus break existing configurations]
Signed-off-by: Willy Tarreau <w@1wt.eu>
2019-04-23 11:00:23 +02:00
paulborile
59d50145dc BUILD: wurfl: build fix for 1.9/2.0 code base
This applies the required changes for the new buffer API that came in 1.9.
This patch must be backported to 1.9.
2019-04-23 11:00:23 +02:00
Willy Tarreau
b518823f1b MINOR: wurfl: indicate in haproxy -vv the wurfl version in use
It also explicitly mentions that the library is the dummy one when it
is detected.

We have this output now :

$ ./haproxy  -vv |grep -i wurfl
Built with WURFL support (dummy library version 1.11.2.100)
2019-04-23 11:00:23 +02:00
Willy Tarreau
1426198efb BUILD: add USE_WURFL to the list of known build options
Since the removal of WURFL we've improved the build system to report
known build options, let's reference this option there as well.
2019-04-23 11:00:23 +02:00
Willy Tarreau
b3cc9f2887 Revert "CLEANUP: wurfl: remove dead, broken and unmaintained code"
This reverts commit 8e5e1e7bf0.

The following patches will fix this code and may be backported.
2019-04-23 10:34:43 +02:00
Emeric Brun
d0e095c2aa MINOR: ssl/cli: async fd io-handlers printable on show fd
This patch exports the async fd iohandlers and make them printable
doing a 'show fd' on cli.
2019-04-19 17:27:01 +02:00
Christopher Faulet
46451d6e04 MINOR: gcc: Fix a silly gcc warning in connect_server()
Don't know why it happens now, but gcc seems to think srv_conn may be NULL when
a reused connection is removed from the orphan list. It happens when HAProxy is
compiled with -O2 with my gcc (8.3.1) on fedora 29... Changing a little how
reuse parameter is tested removes the warnings. So...

This patch may be backported to 1.9.
2019-04-19 15:53:23 +02:00
Christopher Faulet
f48552f2c1 BUG/MINOR: da: Get the request channel to call CHECK_HTTP_MESSAGE_FIRST()
Since the commit 89dc49935 ("BUG/MAJOR: http_fetch: Get the channel depending on
the keyword used"), the right channel must be passed as argument when the macro
CHECK_HTTP_MESSAGE_FIRST is called.

This patch must be backported to 1.9.
2019-04-19 15:53:23 +02:00
Christopher Faulet
2db9dac4c8 BUG/MINOR: 51d: Get the request channel to call CHECK_HTTP_MESSAGE_FIRST()
Since the commit 89dc49935 ("BUG/MAJOR: http_fetch: Get the channel depending on
the keyword used"), the right channel must be passed as argument when the macro
CHECK_HTTP_MESSAGE_FIRST is called.

This patch must be backported to 1.9.
2019-04-19 15:53:23 +02:00
Christopher Faulet
c54e4b053d BUG/MEDIUM: stream: Don't request a server connection if a shutw was scheduled
If a shutdown for writes was performed on the client side (CF_SHUTW is set on
the request channel) while the server connection is still unestablished (the
stream-int is in the state SI_ST_INI), then it is aborted. It must also be
aborted when the shudown for write is pending (only CF_SHUTW_NOW is
set). Otherwise, some errors on the request channel can be ignored, leaving the
stream in an undefined state.

This patch must be backported to 1.9. It may probably be backported to all
suported versions, but it is unclear if the bug is visbile for older versions
than 1.9. So it is probably safer to wait bug reports on these versions to
backport this patch.
2019-04-19 15:53:23 +02:00
Christopher Faulet
e84289e585 BUG/MEDIUM: thread/http: Add missing locks in set-map and add-acl HTTP rules
Locks are missing in the rules "http-request set-map" and "http-response
add-acl" when an acl or map update is performed. Pattern elements must be
locked.

This patch must be backported to 1.9 and 1.8. For the 1.8, the HTX part must be
ignored.
2019-04-19 15:53:23 +02:00
Christopher Faulet
22c57bef56 BUG/MEDIUM: h1: Don't parse chunks CRLF if not enough data are available
As specified in the function comment, the function h1_skip_chunk_crlf() must not
change anything and return zero if not enough data are available. This must
include the case where there is no data at all. On this point, it must do the
same that other h1 parsing functions. This bug is made visible since the commit
91f77d599 ("BUG/MINOR: mux-h1: Process input even if the input buffer is
empty").

This patch must be backported to 1.9.
2019-04-19 15:53:23 +02:00
Baptiste Assmann
e1afd4fec6 MINOR: proto_tcp: tcp-request content: enable set-dst and set-dst-var
The set-dst and set dst-var are available at both 'tcp-request
connection' and 'http-request' but not at the layer in the middle.
This patch fixes this miss and enables both set-dst and set-dst-var at
'tcp-request content' layer.
2019-04-19 15:50:06 +02:00
Frédéric Lécaille
ffe30f708f REGTEST: Missing REQUIRE_VERSION declarations.
checks/s00001.vtc needs support for "srvrecord" which came with 1.8 version.
peers/s_basic_sync.vtc and s_tls_basic_sync.vtc need support for "server"
keyword usage in "peers" section which came with 2.0 version.
2019-04-19 15:48:41 +02:00
Willy Tarreau
78c5eec949 BUG/MINOR: acl: properly detect pattern type SMP_T_ADDR
Since 1.6-dev4 with commit b2f8f087f ("MINOR: map: The map can return
IPv4 and IPv6"), maps can return both IPv4 and IPv6 addresses, which
is represented as SMP_T_ADDR at the output of the map converter. But
the ACL parser only checks for either SMP_T_IPV4 or SMP_T_IPV6 and
requires to see an explicit matching method specified. Given that it
uses the same pattern parser for both address families, it implicitly
is also compatible with SMP_T_ADDR, which ought to have been added
there.

This fix should be backported as far as 1.6.
2019-04-19 11:45:20 +02:00
Willy Tarreau
aa5801bcaa BUG/MEDIUM: maps: only try to parse the default value when it's present
Maps returning an IP address (e.g. map_str_ip) support an optional
default value which must be parsed. Unfortunately the parsing code does
not check for this argument's existence and uncondtionally tries to
resolve the argument whenever the output is of type address, resulting
in segfaults at parsing time when no such argument is provided. This
patch adds the appropriate check.

This fix may be backported as far as 1.6.
2019-04-19 11:35:22 +02:00
Olivier Houchard
88698d966d MEDIUM: connections: Add a way to control the number of idling connections.
As by default we add all keepalive connections to the idle pool, if we run
into a pathological case, where all client don't do keepalive, but the server
does, and haproxy is configured to only reuse "safe" connections, we will
soon find ourself having lots of idling, unusable for new sessions, connections,
while we won't have any file descriptors available to create new connections.

To fix this, add 2 new global settings, "pool_low_ratio" and "pool_high_ratio".
pool-low-fd-ratio  is the % of fds we're allowed to use (against the maximum
number of fds available to haproxy) before we stop adding connections to the
idle pool, and destroy them instead. The default is 20. pool-high-fd-ratio is
the % of fds we're allowed to use (against the maximum number of fds available
to haproxy) before we start killing idling connection in the event we have to
create a new outgoing connection, and no reuse is possible. The default is 25.
2019-04-18 19:52:03 +02:00
Olivier Houchard
7c49d2e213 MINOR: fd: Add a counter of used fds.
Add a new counter, ha_used_fds, that let us know how many file descriptors
we're currently using.
2019-04-18 19:19:59 +02:00
Ilya Shipitsin
8a9d55bb9b MEDIUM: enable travis-ci builds
currently only xenial/clang build is enabled. osx and xenial/gcc
will be enabled later.

travis-ci is cloud based continuous integration, builds will be
started automatically if they are enabled for certain repo or fork.

Signed-off-by: Ilya Shipitsin <chipitsine@gmail.com>
2019-04-18 18:39:55 +02:00
Emeric Brun
0bbec0fa34 MINOR: peers: adds counters on show peers about tasks calls.
This patch adds a counter of calls on the orchestator peers task
and a counter on the tasks linked to applet i/o handler for
each peer.

Those two counters are useful to detect if a peer sync is active
or frozen.

This patch is related to the commit:
  "MINOR: peers: Add a new command to the CLI for peers."
and should be backported with it.
2019-04-18 18:24:25 +02:00
Olivier Houchard
66a7b3302a BUILD/medium: ssl: Fix build with OpenSSL < 1.1.0
Make sure it builds with OpenSSL < 1.1.0, a lot of the BIO_get/set methods
were introduced with OpenSSL 1.1.0, so fallback with the old way of doing
things if needed.
2019-04-18 15:58:58 +02:00
Olivier Houchard
a8955d57ed MEDIUM: ssl: provide our own BIO.
Instead of letting the OpenSSL code handle the file descriptor directly,
provide a custom BIO, that will use the underlying XPRT to send/recv data.
This will let us implement QUIC later, and probably clean the upper layer,
if/when the SSL code provide its own subscribe code, so that the upper layers
won't have to care if we're still waiting for the handshake to complete or not.
2019-04-18 14:56:24 +02:00
Olivier Houchard
e179d0e88f MEDIUM: connections: Provide a xprt_ctx for each xprt method.
For most of the xprt methods, provide a xprt_ctx.  This will be useful later
when we'll want to be able to stack xprts.
The init() method now has to create and provide the said xprt_ctx if needed.
2019-04-18 14:56:24 +02:00
Olivier Houchard
df35784600 MEDIUM: ssl: provide its own subscribe/unsubscribe function.
In order to prepare for the possibility of using different kinds of xprt
with ssl, make the ssl code provide its own subscribe and unsubscribe
functions, right now it just calls conn_subscribe and conn_unsubsribe.
2019-04-18 14:56:24 +02:00
Olivier Houchard
7b5fd1ec26 MEDIUM: connections: Move some fields from struct connection to ssl_sock_ctx.
Move xprt_st, tmp_early_data and sent_early_data from struct connection to
struct ssl_sock_ctx, as they are only used in the SSL code.
2019-04-18 14:56:24 +02:00