Commit Graph

7441 Commits

Author SHA1 Message Date
Christopher Faulet
fd83f0bfa4 BUG/MEDIUM: threads/queue: wake up other threads upon dequeue
The previous patch about queues (5cd4bbd7a "BUG/MAJOR: threads/queue: Fix
thread-safety issues on the queues management") revealed a performance drop when
multithreading is enabled (nbthread > 1). This happens when pending connections
handled by other theads are dequeued. If these other threads are blocked in the
poller, we have to wait the poller's timeout (or any I/O event) to process the
dequeued connections.

To fix the problem, at least temporarly, we "wake up" the threads by requesting
a synchronization. This may seem a bit overkill to use the sync point to do a
wakeup on threads, but it fixes this performance issue. So we can now think
calmly on the good way to address this kind of issues.

This patch should be backported in 1.8 with the commit 5cd4bbd7a ("BUG/MAJOR:
threads/queue: Fix thread-safety issues on the queues management").
2018-03-19 22:16:58 +01:00
Baptiste Assmann
2f3a56b4ff BUG/MINOR: tcp-check: use the server's service port as a fallback
When running tcp-check scripts, one must ensure we can establish a tcp
connection first.
When doing this action, HAProxy needs a TCP port configured either on
the server or on the check itself or on the connect rule itself.
For some reasons, the connect code did not evaluate the service port on
the server structure...

this patch fixes this error.

Backport status: 1.8
2018-03-19 13:55:55 +01:00
Baptiste Assmann
248f1173f2 BUG/MEDIUM: tcp-check: single connect rule can't detect DOWN servers
When tcpcheck is used to do TCP port monitoring only and the script is
composed by a single "tcp-check connect" rule (whatever port and ssl
options enabled), then the server can't be seen as DOWN.
Simple configuration to reproduce:

  backend b
    [...]
    option tcp-check
    tcp-check connect
    server s1 127.0.0.1:22 check

The main reason for this issue is that the piece of code which validates
that we're not at the end of the chained list (of rules) prevents
executing the validation of the establishment of the TCP connection.
Since validation is not executed, the rule is terminated and the report
says no errors were encountered, hence the server is UP all the time.

The workaround is simple: move the connection validation outsied the
CONNECT rule processing loop, into the main function.
That way, if the connection status is not CONNECTED, then HAProxy will
now add more time to wait for it. If the time is expired, an error is
now well reported.

Backport status: 1.8
2018-03-19 13:53:59 +01:00
Thierry FOURNIER
2986c0db88 CLEANUP: lua/syntax: lua is a name and not an acronym
This patch fix some first letter upercase for Lua messages.
2018-03-19 12:59:26 +01:00
Thierry FOURNIER
fd1e955a56 BUG/MINOR: lua: return bad error messages
The returned type is the type of the top of stack value and
not the type of the checked argument.

[wt: this can be backported to 1.8, 1.7 and 1.6]
2018-03-19 12:59:19 +01:00
Thierry FOURNIER
29a05c13d1 BUG/MINOR: spoa-example: unexpected behavior for more than 127 args
Buf is unsigned, so nbargs will be negative for more then 127 args.

Note that I cant test this bug because I cant put sufficient args
on the configuration line. It is just detected reading code.

[wt: this can be backported to 1.8 & 1.7]
2018-03-19 12:59:10 +01:00
Bernard Spil
13c53f8cc2 BUILD: ssl: Fix build with OpenSSL without NPN capability
OpenSSL can be built without NEXTPROTONEG support by passing
-no-npn to the configure script. This sets the
OPENSSL_NO_NEXTPROTONEG flag in opensslconf.h

Since NEXTPROTONEG is now considered deprecated, it is superseeded
by ALPN (Application Layer Protocol Next), HAProxy should allow
building withough NPN support.
2018-03-19 12:43:15 +01:00
Aurélien Nephtali
6a61e968ac BUG/MINOR: cli: Fix a crash when sending a command with too many arguments
This bug was introduced in 48bcfdab2 ("MEDIUM: dumpstat: make the CLI
parser understand the backslash as an escape char").

This should be backported to 1.8.

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-19 12:15:38 +01:00
Aurélien Nephtali
6e8a41d8fc BUG/MINOR: cli: Ensure all command outputs end with a LF
Since 200b0fac ("MEDIUM: Add support for updating TLS ticket keys via
socket"), 4147b2ef ("MEDIUM: ssl: basic OCSP stapling support."),
4df59e9 ("MINOR: cli: add socket commands and config to prepend
informational messages with severity") and 654694e1 ("MEDIUM: stats/cli:
add support for "set table key" to enter values"), commands
'set ssl tls-key', 'set ssl ocsp-response', 'set severity-output' and
'set table' do not always send an extra LF at the end of their outputs.

This is required as mentioned in doc/management.txt:

"Since multiple commands may be issued at once, haproxy uses the empty
line as a delimiter to mark an end of output for each command"

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-19 12:13:02 +01:00
Olivier Houchard
33e083c92e BUG/MINOR: seemless reload: Fix crash when an interface is specified.
When doing a seemless reload, while receiving the sockets from the old process
the new process will die if the socket has been bound to a specific
interface.
This happens because the code that tries to parse the informations bogusly
try to set xfer_sock->namespace, while it should be setting wfer_sock->iface.

This should be backported to 1.8.
2018-03-19 12:10:53 +01:00
Ilya Shipitsin
210eb259bf CLEANUP: dns: remove duplicate code in src/dns.c
issue was identified by cppcheck

[src/dns.c:2037] -> [src/dns.c:2041]: (warning) Variable 'appctx->st2' is reassigned a value before the old one has been used. 'break;' missing?
2018-03-19 12:09:16 +01:00
Philipp Kolmann
8bb4db5b0f TESTS: Add a testcase for multi-port + multi-server listener issue 2018-03-19 11:48:29 +01:00
Baptiste Assmann
1fa7d2acce BUG/MINOR: dns: don't downgrade DNS accepted payload size automatically
Automatic downgrade of DNS accepted payload size may have undesired side
effect, which could make a backend with all servers DOWN.

After talking with Lukas on the ML, I realized this "feature" introduces
more issues that it fixes problem.
The "best" way to handle properly big responses will be to implement DNS
over TCP.

To be backported to 1.8.
2018-03-19 11:41:52 +01:00
Christopher Faulet
5cd4bbd7ab BUG/MAJOR: threads/queue: Fix thread-safety issues on the queues management
The management of the servers and the proxies queues was not thread-safe at
all. First, the accesses to <strm>->pend_pos were not protected. So it was
possible to release it on a thread (for instance because the stream is released)
and to use it in same time on another one (because we redispatch pending
connections for a server). Then, the accesses to stream's information (flags and
target) from anywhere is forbidden. To be safe, The stream's state must always
be updated in the context of process_stream.

So to fix these issues, the queue module has been refactored. A lock has been
added in the pendconn structure. And now, when we try to dequeue a pending
connection, we start by unlinking it from the server/proxy queue and we wake up
the stream. Then, it is the stream reponsibility to really dequeue it (or
release it). This way, we are sure that only the stream can create and release
its <pend_pos> field.

However, be careful. This new implementation should be thread-safe
(hopefully...). But it is not optimal and in some situations, it could be really
slower in multi-threaded mode than in single-threaded one. The problem is that,
when we try to dequeue pending connections, we process it from the older one to
the newer one independently to the thread's affinity. So we need to wait the
other threads' wakeup to really process them. If threads are blocked in the
poller, this will add a significant latency. This problem happens when maxconn
values are very low.

This patch must be backported in 1.8.
2018-03-19 10:03:06 +01:00
Christopher Faulet
510c0d67ef BUG/MEDIUM: threads/unix: Fix a deadlock when a listener is temporarily disabled
When a listener is temporarily disabled, we start by locking it and then we call
.pause callback of the underlying protocol (tcp/unix). For TCP listeners, this
is not a problem. But listeners bound on an unix socket are in fact closed
instead. So .pause callback relies on unbind_listener function to do its job.

Unfortunatly, unbind_listener hold the listener's lock and then call an internal
function to unbind it. So, there is a deadlock here. This happens during a
reload. To fix the problemn, the function do_unbind_listener, which is lockless,
is now exported and is called when a listener bound on an unix socket is
temporarily disabled.

This patch must be backported in 1.8.
2018-03-16 11:19:07 +01:00
Cyril Bonté
4288c5a9d8 BUG/MINOR: force-persist and ignore-persist only apply to backends
>From the very first day of force-persist and ignore-persist features,
they only applied to backends, except that the documentation stated it
could also be applied to frontends.

In order to make it clear, the documentation is updated and the parser
will raise a warning if the keywords are used in a frontend section.

This patch should be backported up to the 1.5 branch.
2018-03-12 22:52:24 +01:00
Cyril Bonté
d400ab3a36 BUG/MEDIUM: fix a 100% cpu usage with cpu-map and nbthread/nbproc
Krishna Kumar reported a 100% cpu usage with a configuration using
cpu-map and a high number of threads,

Indeed, this minimal configuration to reproduce the issue :
  global
    nbthread 40
    cpu-map auto:1/1-40 0-39

  frontend test
    bind :8000

This is due to a wrong type in a shift operator (int vs unsigned long int),
causing an endless loop while applying the cpu affinity on threads. The same
issue may also occur with nbproc under FreeBSD. This commit addresses both
cases.

This patch must be backported to 1.8.
2018-03-12 22:52:24 +01:00
Aurélien Nephtali
b53e20826e BUG/MINOR: cli: Fix a typo in the 'set rate-limit' usage
The correct keyword is 'ssl-sessions' (vs. 'ssl-session').
The typo was introduced in 45c742be05 ('REORG: cli: move the "set
rate-limit" functions to their own parser').

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-12 07:49:08 +01:00
Aurélien Nephtali
bca08762d2 CLEANUP: cli: Remove a leftover debug message
This printf() was added in f886e3478d ("MINOR: cli: Add a command to
send listening sockets.").

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-12 07:49:05 +01:00
Aurélien Nephtali
76de95a4c0 CLEANUP: ssl: Remove a duplicated #include
openssl/x509.h is included twice since commit fc0421fde ("MEDIUM: ssl:
add support for SNI and wildcard certificates").

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-12 07:49:01 +01:00
Aurélien Nephtali
498a115727 BUG/MINOR: cli: Fix a crash when passing a negative or too large value to "show fd"
This bug is present since 7a4a0ac71d ("MINOR: cli: add a new "show fd"
command").

This should be backported to 1.8.

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-03-12 07:47:26 +01:00
Willy Tarreau
84b118f312 BUG/MEDIUM: h2: also arm the h2 timeout when sending
Right now the h2 idle timeout is only set when there is no stream. If we
fail to send because the socket buffers are full (generally indicating
the client has left), we also need to arm it so that we can properly
expire such connections, otherwise some failed transfers might leave
H2 connections pending forever.

Thanks to Thierry Fournier for the diag and the traces.

This patch needs to be backported to 1.8.
2018-03-08 18:43:56 +01:00
Willy Tarreau
c41b3e8dff DOC: buffers: clarify the purpose of the <from> pointer in offer_buffers()
This one is only used to compare pointers and NULL is permitted though
this is far from being clear.
2018-03-08 18:33:48 +01:00
Olivier Houchard
ec9516a6dc BUG/MINOR: unix: Don't mess up when removing the socket from the xfer_sock_list.
When removing the socket from the xfer_sock_list, we want to set
next->prev to prev, not to next->prev, which is useless.

This should be backported to 1.8.
2018-03-08 18:33:11 +01:00
Christopher Faulet
f9f6ed0a51 CLEANUP: .gitignore: Ignore binaries from the contrib directory
Some binaries were not ignored and polluted the "git status" output.
2018-03-06 17:33:05 +01:00
Emeric Brun
1738e86771 BUG/MINOR: session: Fix tcp-request session failure if handshake.
Some sample fetches check if session is established using
the flag CO_FL_CONNECTED. But in some cases, when a handshake
is performed this flag is set too late, after the process
of the tcp-request session rules.

This fix move the raising of the flag at the beginning of the
conn_complete_session function which processes the tcp-request
session rules.

This fix must be backported to 1.8 (and perhaps 1.7)
2018-03-06 14:04:45 +01:00
Willy Tarreau
b684e7a52c BUILD/MINOR: fix Lua build on Mac OS X (again)
Previous commit (13113d6 "MINOR/BUILD: fix Lua build on Mac OS X")
contains a typo, it uses "-export-dynamic" instead of "-export_dynamic"
(dash instead of underscore), despite what the commit message suggests,
and it obviously doesn't work. Thanks to Kirill A. Korinsky for reporting
it.

This patch should be backported on each version from 1.6 like the
aforementionned one above.
2018-03-05 15:39:39 +01:00
Thierry Fournier
13113d6abb MINOR/BUILD: fix Lua build on Mac OS X
Change gcc option syntax for Mac. -Wl,--export-dynamic is not
supported, use -Wl,-export_dynamic.

Thanks to Kirill A. Korinsky for the report.

This patch should be backported on each version from 1.6
2018-03-05 14:19:34 +01:00
Willy Tarreau
44e973f508 MEDIUM: h2: use a single buffer allocator
We used to have one buffer allocator per direction while we can never
block on two buffers at once. Let's have a single one and rely on the
connection's flags to know which one we're waitinf for.
2018-03-01 17:58:15 +01:00
Willy Tarreau
0a10de6066 MINOR: h2: provide and use h2s_detach() and h2s_free()
These ones save us from open-coding the cleanup functions on each and
every error path. The code was updated to use them with no functional
change.
2018-03-01 16:35:01 +01:00
Willy Tarreau
00dd07895a CLEANUP: h2: rename misleading h2c_stream_close() to h2s_close()
This function takes an h2c and an h2s but it never uses the h2c, which
is a bit confusing at some places in the code. Let's make it clear that
it only operates on the h2s instead by renaming it and removing the
unused h2c argument.
2018-03-01 16:31:34 +01:00
Tim Duesterhus
2788a39c07 MINOR: systemd: Add SystemD's SystemCallFilter option to the unit file
This option takes away system calls that are unneeded for haproxy's
operation and thus is a good defense in depth measure.
2018-03-01 15:57:15 +01:00
Tim Duesterhus
8a9659212e MINOR: systemd: Add SystemD's Protect*= options to the unit file
While the haproxy workers usually are running chrooted the master
process is not. This patch is a pretty safe defense in depth measure
to ensure haproxy cannot touch sensitive parts of the file system.

ProtectSystem takes non-boolean arguments in newer SystemD versions,
but setting those would leave older systems such as Ubuntu Xenial
unprotected. Distro maintainers and system administrators could
adapt the ProtectSystem value to the SystemD version they ship.
2018-03-01 15:57:15 +01:00
Tim Duesterhus
1ce8de2d93 MINOR: systemd: Add section for SystemD sandboxing to unit file
This commit adds a warning for settings that possibly provide better
sandboxing and explains their tradeoffs.
2018-03-01 15:57:15 +01:00
Emmanuel Hocdet
253c3b7516 MINOR: connection: add proxy-v2-options authority
This patch add option PP2_TYPE_AUTHORITY to proxy protocol v2 when a TLS
connection was negotiated. In this case, authority corresponds to the sni.
2018-03-01 11:38:32 +01:00
Emmanuel Hocdet
fa8d0f1875 MINOR: connection: add proxy-v2-options ssl-cipher,cert-sig,cert-key
This patch implement proxy protocol v2 options related to crypto information:
ssl-cipher (PP2_SUBTYPE_SSL_CIPHER), cert-sig (PP2_SUBTYPE_SSL_SIG_ALG) and
cert-key (PP2_SUBTYPE_SSL_KEY_ALG).
2018-03-01 11:38:28 +01:00
Emmanuel Hocdet
283e004a85 MINOR: ssl: add ssl_sock_get_cert_sig function
ssl_sock_get_cert_sig can be used to report cert signature short name
to log and ppv2 (RSA-SHA256).
2018-03-01 11:34:08 +01:00
Emmanuel Hocdet
96b7834e98 MINOR: ssl: add ssl_sock_get_pkey_algo function
ssl_sock_get_pkey_algo can be used to report pkey algorithm to log
and ppv2 (RSA2048, EC256,...).
Extract pkey information is not free in ssl api (lock/alloc/free):
haproxy can use the pkey information computed in load_certificate.
Store and use this information in a SSL ex_data when available,
compute it if not (SSL multicert bundled and generated cert).
2018-03-01 11:34:05 +01:00
Emmanuel Hocdet
ddc090bc55 MINOR: ssl: extract full pkey info in load_certificate
Private key information is used in switchctx to implement native multicert
selection (ecdsa/rsa/anonymous). This patch extract and store full pkey
information: dsa type and pkey size in bits. This can be used for switchctx
or to report pkey informations in ppv2 and log.
2018-03-01 11:33:18 +01:00
Emmanuel Hocdet
8c0c34b6e7 Revert "BUG/MINOR: send-proxy-v2: string size must include ('\0')"
This reverts commit 82913e4f79.
TLV string value should not be null-terminated.

This should be backported to 1.8.
2018-03-01 06:48:05 +01:00
Christopher Faulet
7d9f1ba246 BUG/MEDIUM: spoe: Remove idle applets from idle list when HAProxy is stopping
In the SPOE applet's handler, when an applet is switched from the state IDLE to
PROCESSING, it is removed for the list of idle applets. But when HAProxy is
stopping, this applet can be switched to DISCONNECT. In this case, we also need
to remove it from the list of idle applets. Else the applet is removed but still
present in the list. It could lead to a segmentation fault or an infinite loop,
depending the code path.
2018-02-28 16:20:33 +01:00
Christopher Faulet
ca6ef50661 BUG/MEDIUM: buffer: Fix the wrapping case in bi_putblk
When the block of data need to be split to support the wrapping, the start of
the second block of data was wrong. We must be sure to skup data copied during
the first memcpy.

This patch must be backported to 1.8.
2018-02-27 15:45:03 +01:00
Christopher Faulet
b2b279464c BUG/MEDIUM: buffer: Fix the wrapping case in bo_putblk
When the block of data need to be split to support the wrapping, the start of
the second block of data was wrong. We must be sure to skip data copied during
the first memcpy.

This patch must be backported to 1.8, 1.7, 1.6 and 1.5.
2018-02-27 15:45:03 +01:00
Willy Tarreau
35a62705df BUG/MEDIUM: h2: always consume any trailing data after end of output buffers
In case a stream tries to emit more data than advertised by the chunks
or content-length headers, the extra data remains in the channel's output
buffer until the channel's timeout expires. It can easily happen when
sending malformed error files making use of a wrong content-length or
having extra CRLFs after the empty chunk. It may also be possible to
forge such a bad response using Lua.

The H1 to H2 encoder must protect itself against this by marking the data
presented to it as consumed if it decides to discard them, so that the
sending stream doesn't wait for the timeout to trigger.

The visible effect of this problem is a huge memory usage and a high
concurrent connection count during benchmarks when using such bad data
(a typical place where this easily happens).

This fix must be backported to 1.8.
2018-02-27 15:37:25 +01:00
Christopher Faulet
929b52d8a1 BUG/MINOR: h2: Set the target of dbuf_wait to h2c
In h2_get_dbuf, when the buffer allocation was failing, dbuf_wait.target was
errornously set to the connection (h2c->conn) instead of the h2 connection
descriptor (h2c).

This patch must be backported to 1.8.
2018-02-26 17:33:16 +01:00
Yves Lafon
95317289e9 MINOR: stats: display the number of threads in the statistics.
Add the nbthread global variable to the output, matching nbproc.

This may be backported to 1.8
2018-02-26 11:53:46 +01:00
Willy Tarreau
364d745106 MINOR: debug/pools: make DEBUG_UAF also detect underflows
Since we use padding before the allocated page, it's trivial to place
the allocated address there and see if it gets mangled once we release
it.

This may be backported to stable releases already using DEBUG_UAF.
2018-02-22 14:18:45 +01:00
Willy Tarreau
5a9cce4653 BUG/MINOR: debug/pools: properly handle out-of-memory when building with DEBUG_UAF
Commit 158fa75 ("MINOR: pools: implement DEBUG_UAF to detect use after free")
implemented pool use-after-free detection, but the mmap() return value isn't
properly checked, preventing the call to pool_alloc_area() from returning
NULL. So on out-of-memory a mangled pointer is returned, causing a crash on
the pool_alloc() site instead of forcing a GC. It doesn't affect regular
operations however, just complicates complex bug investigations.

This fix should be backported to 1.8 and to 1.7.
2018-02-22 14:18:45 +01:00
Willy Tarreau
f161d0f51e BUG/MINOR: pools/threads: don't ignore DEBUG_UAF on double-word CAS capable archs
Since commit cf975d4 ("MINOR: pools/threads: Implement lockless memory
pools."), we support lockless pools. However the parts dedicated to
detecting use-after-free are not present in this part, making DEBUG_UAF
useless in this situation.

The present patch sets a new define CONFIG_HAP_LOCKLESS_POOLS when such
a compatible architecture is detected, and when pool debugging is not
requested, then makes use of this everywhere in pools and buffers
functions. This way enabling DEBUG_UAF will automatically disable the
lockless version.

No backport is needed as this is purely 1.9-dev.
2018-02-22 14:18:45 +01:00
Tim Duesterhus
5e64286bab CLEANUP: standard: Fix typo in IPv6 mask example
IPv6 addresses with two double colons are invalid.

This typo was introduced in commit 471851713a.
2018-02-21 05:07:35 +01:00