Commit Graph

9318 Commits

Author SHA1 Message Date
Christopher Faulet
18cca781f5 BUG/MINOR: config: Reinforce validity check when a process number is parsed
Now, in the function parse_process_number(), when a process number or a set of
processes is parsed, an error is triggered if an invalid character is found. It
means following syntaxes are not forbidden and will emit an alert during the
HAProxy startup:

  1a
  1/2
  1-2-3

This bug was reported on Github. See issue #36.

This patch may be backported to 1.9 and 1.8.
2019-02-07 21:23:58 +01:00
Christopher Faulet
11389018bc BUG/MAJOR: spoe: Don't try to get agent config during SPOP healthcheck
During SPOP healthchecks, a dummy appctx is used to create the HAPROXY-HELLO
frame and then to parse the AGENT-HELLO frame. No agent are attached to it. So
it is important to not rely on an agent during these stages. When HAPROXY-HELLO
frame is created, there is no problem, all accesses to an agent are
guarded. This is not true during the parsing of the AGENT-HELLO frame. Thus, it
is possible to crash HAProxy with a SPOA declaring the async or the pipelining
capability during a healthcheck.

This patch must be backported to 1.9 and 1.8.
2019-02-07 21:23:58 +01:00
Willy Tarreau
ff9c9140f4 MINOR: config: make MAX_PROCS configurable at build time
For some embedded systems, it's pointless to have 32- or even 64- large
arrays of processes when it's known that much fewer processes will be
used in the worst case. Let's introduce this MAX_PROCS define which
contains the highest number of processes allowed to run at once. It
still defaults to LONGBITS but may be lowered.
2019-02-07 15:10:19 +01:00
Willy Tarreau
980855bd95 BUG/MEDIUM: server: initialize the orphaned conns lists and tasks at the end
This also depends on the nbthread count, so it must only be performed after
parsing the whole config file. As a side effect, this removes some code
duplication between servers and server-templates.

This must be backported to 1.9.
2019-02-07 15:08:13 +01:00
Willy Tarreau
835daa119e BUG/MEDIUM: server: initialize the idle conns list after parsing the config
The idle conns lists are sized according to the number of threads. As such
they cannot be initialized during the parsing since nbthread can be set
later, as revealed by this simple config which randomly crashes when used.
Let's do this at the end instead.

    listen proxy
        bind :4445
        mode http
        timeout client 10s
        timeout server 10s
        timeout connect 10s
        http-reuse always
        server s1 127.0.0.1:8000

    global
        nbthread 8

This fix must be backported to 1.9 and 1.8.
2019-02-07 15:08:13 +01:00
Willy Tarreau
b0769b2ed6 BUG/MEDIUM: spoe: initialization depending on nbthread must be done last
The agent used to be configured depending on global.nbthread while nbthread
may be set after the agent is parsed. Let's move this part to the spoe_check()
function to make sure nbthread is always correct and arrays are appropriately
sized.

This fix must be backported to 1.9 and 1.8.
2019-02-07 15:08:13 +01:00
Willy Tarreau
b784b35ce8 BUG/MINOR: lua: initialize the correct idle conn lists for the SSL sockets
Commit 40a007cf2 ("MEDIUM: threads/server: Make connection list
(priv/idle/safe) thread-safe") made a copy-paste error when initializing
the Lua sockets, as the TCP one was initialized twice. Fortunately it has
no impact because the pointers are set to NULL after a memset(0) and are
not changed in between.

This must be backported to 1.9 and 1.8.
2019-02-07 15:08:13 +01:00
Willy Tarreau
3ddcf7643c BUG/MINOR: spoe: do not assume agent->rt is valid on exit
As reported by Christopher, we may call spoe_release_agent() when leaving
after an allocation failure or a config parse error. We must not assume
agent->rt is valid there as the allocation could have failed.

This should be backported to 1.9 and 1.8.
2019-02-07 15:08:13 +01:00
Bertrand Jacquin
4f03ab06a9 DOC: ssl: Stop documenting ciphers example to use
Since TLS ciphers are not well understand, it is very common pratice to
copy and paste parameters from documentation and use them as-is. Since RC4
should not be used anymore, it is wiser to link users to up to date
documnetation from Mozilla to avoid unsafe configuration in the wild.

Clarify the location of man pages for OpenSSL when missing.
2019-02-06 23:03:40 +01:00
Bertrand Jacquin
8cf7c1eb61 DOC: ssl: Clarify when pre TLSv1.3 cipher can be used
This is mainly driven by the fact TLSv1.3 will have a successor at some
point.
2019-02-06 23:03:38 +01:00
Willy Tarreau
1a0fe3becd BUG/MINOR: config: make sure to count the error on incorrect track-sc/stick rules
When commit 151e1ca98 ("BUG/MAJOR: config: verify that targets of track-sc
and stick rules are present") added a check for some process inconsistencies
between rules and their stick tables, some errors resulted in a "return 0"
statement, which is taken as "no error" in some cases. Let's fix this.

This must be backported to all versions using the above commit.
2019-02-06 10:25:07 +01:00
Christopher Faulet
f7679ad4db BUG/MAJOR: htx/backend: Make all tests on HTTP messages compatible with HTX
A piece of code about the HTX was lost this summer, after the "big merge"
(htx/http2/connection layer refactoring). I forgot to keep HTX changes in the
functions connect_server() and assign_server(). So, this patch fixes "uri",
"url_param" and "hdr" LB algorithms when the HTX is enabled. It also fixes
evaluation of the "sni" expression on server lines.

This issue was reported on github. See issue #32.

This patch must be backported in 1.9.
2019-02-06 10:20:01 +01:00
Willy Tarreau
2bdcfde426 BUG/MAJOR: spoe: verify that backends used by SPOE cover all their callers' processes
When a filter is installed on a proxy and references spoe, we must be
absolutely certain that the whole chain is valid on a given process
when running in multi-process mode. The problem here is that if a proxy
1 runs on process 1, referencing an SPOE agent itself based on a backend
running on process 2, this last one will be completely deinited on
process 1, and will thus cause random crashes when it gets messages
from this proess.

This patch makes sure that the whole chain is valid on all of the caller's
processes.

This fix must be backported to all spoe-enabled maintained versions. It
may potentially disrupt configurations which already randomly crash.
There hardly is any intermediary solution though, such configurations
need to be fixed.
2019-02-05 13:37:19 +01:00
Willy Tarreau
151e1ca989 BUG/MAJOR: config: verify that targets of track-sc and stick rules are present
Stick and track-sc rules may optionally designate a table in a different
proxy. In this case, a number of verifications are made such as validating
that this proxy actually exists. However, in multi-process mode, the target
table might indeed exist but not be bound to the set of processes the rules
will execute on. This will definitely result in a random behaviour especially
if these tables do require peer synchronization, because some tasks will be
started to try to synchronize form uninitialized areas.

The typical issue looks like this :

    peers my-peers
         peer foo ...

    listen proxy
         bind-process 1
         stick on src table ip
         ...

    backend ip
         bind-process 2
         stick-table type ip size 1k peers my-peers

While it appears obvious that the example above will not work, there are
less obvious situations, such as having bind-process in a defaults section
and having a larger set of processes for the referencing proxy than the
referenced one.

The present patch adds checks for such situations by verifying that all
processes from the referencing proxy are present on the other one in all
track-sc* and stick-* rules, and in sample fetch / converters referencing
another table so that sc_inc_gpc0() and similar are safe as well.

This fix must be backported to all maintained versions. It may potentially
disrupt configurations which already randomly crash. There hardly is any
intermediary solution though, such configurations need to be fixed.
2019-02-05 11:54:49 +01:00
Willy Tarreau
155acffc13 BUG/MINOR: task: close a tiny race in the inter-thread wakeup
__task_wakeup() takes care of a small race that exists between threads,
but it uses a store barrier that is not sufficient since apparently the
state read after clearing the leaf_p pointer sometimes is incorrect. This
results in missed wakeups between threads competing at a high rate. Let's
use a full barrier instead to serialize the operations.

This may be backported to 1.9 though it's extremely unlikely that this
bug will ever manifest itself there.
2019-02-04 14:21:35 +01:00
Willy Tarreau
ef6fd85623 BUG/MINOR: compression: properly report compression stats in HTX mode
When HTX support was added to HTTP compression, a set of counters was missed,
namely comp_in and comp_byp, resulting in no stats being available for compression.

This must be backported to 1.9.
2019-02-04 11:48:03 +01:00
Willy Tarreau
3d95717b58 MINOR: threads: make use of thread_mask() to simplify some thread calculations
By doing so it's visible that some fd_insert() calls were relying on
MAX_THREADS while all_threads_mask should have been more suitable.
2019-02-04 05:09:16 +01:00
Willy Tarreau
6daac19b3f MINOR: config: simplify bind_proc processing using proc_mask()
At a number of places we used to have null tests on bind_proc for
listeners and proxies. Let's simplify all these tests by always
having the proper bits reported via proc_mask().
2019-02-04 05:09:16 +01:00
Willy Tarreau
2415727a00 MINOR: global: add proc_mask() and thread_mask()
These two functions return either all_{proc,threads}_mask, or the argument.
This is used to default to all_proc_mask or all_threads_mask when not set
on bind_conf or proxies.
2019-02-04 05:09:15 +01:00
Willy Tarreau
a38a7175b1 MINOR: config: keep an all_proc_mask like we have all_threads_mask
This simplifies some mask comparisons at various places where
nbits(global.nbproc) was used.
2019-02-04 05:09:15 +01:00
Willy Tarreau
cafa56ecd6 MINOR: tools: improve the popcount() operation
We'll call popcount() more often so better use a parallel method
than an iterative one. One optimal design is proposed at the site
below. It requires a fast multiplication though, but even without
it will still be faster than the iterative one, and all relevant
64 bit platforms do have a multiply unit.

     https://graphics.stanford.edu/~seander/bithacks.html
2019-02-04 05:09:15 +01:00
Willy Tarreau
4ed84c96cf OPTIM: listener: optimize cache-line packing for struct listener
Some unused fields were placed early and some important ones were on
the second cache line. Let's move the proto_list and name closer to
the end of the structure to bring accept() and default_target() into
the first cache line.
2019-02-04 05:09:14 +01:00
Willy Tarreau
fc647360e0 CLEANUP: threads: use nbits to calculate the thread mask
It's pointless to do arithmetics by hand, we have a function for this.
2019-02-02 17:48:39 +01:00
Willy Tarreau
da9e939f3c CLEANUP: threads: fix misleading comment about all_threads_mask
This variable changed a bit after 1.8, it's never zero anymore.
2019-02-02 17:48:39 +01:00
Willy Tarreau
6b4a39adc4 BUG/MINOR: config: fix bind line thread mask validation
When no nbproc is specified, a computation leads to reading bind_thread[-1]
before checking if the thread mask is valid for a bind conf. It may either
report a false warning and compute a wrong mask, or miss some incorrect
configs.

This must be backported to 1.9 and possibly 1.8.
2019-02-02 17:46:24 +01:00
Willy Tarreau
bbcf2b9e0d BUG/MINOR: threads: fix the process range of thread masks
Commit 421f02e ("MINOR: threads: add a MAX_THREADS define instead of
LONGBITS") used a MAX_THREADS macros to fix threads limits. However,
one change was wrong as it affected the upper bound of the process
loop when setting threads masks. No backport is needed.
2019-02-02 13:18:01 +01:00
Olivier Houchard
32211a17eb BUG/MEDIUM: stream: Don't forget to free s->unique_id in stream_free().
In stream_free(), free s->unique_id. We may still have one, because it's
allocated in log.c::strm_log() no matter what, even if it's a TCP connection
and thus it won't get free'd by http_end_txn().
Failure to do so leads to a memory leak.

This should probably be backported to all maintained branches.
2019-02-01 18:17:36 +01:00
Willy Tarreau
053c15750b BUG/MEDIUM: mux-h2: always set :authority on request output
PiBa-NL reported that some servers don't fall back to the Host header when
:authority is absent. After studying all the combinations of Host and
:authority, it appears that we always have to send the latter, hence we
never need the former. In case of CONNECT method, the authority is retrieved
from the URI part, otherwise it's extracted from the Host field.

The tricky part is that we have to scan all headers for the Host header
before dumping other headers. This is due to the fact that we must emit
pseudo headers before other ones. One improvement could possibly be made
later in the request parser to search and emit the Host header immediately
if authority was not found. This would cost nothing on the vast marjority
of requests and make the lookup faster on output since Host would appear
first.

This fix must be backported to 1.9.
2019-02-01 16:47:46 +01:00
Willy Tarreau
5be92ff23f BUG/MEDIUM: mux-h2: always omit :scheme and :path for the CONNECT method
This is mandated by RFC7540 #8.3, these pseudo-headers must not be emitted
with the CONNECT method.

This must be backported to 1.9.
2019-02-01 16:47:46 +01:00
Willy Tarreau
1da41ecf5b BUG/MINOR: backend: check srv_conn before dereferencing it
Commit 3c4e19f42 ("BUG/MEDIUM: backend: always release the previous
connection into its own target srv_list") introduced a valid warning
about a null-deref risk since we didn't check conn_new()'s return value.

This patch must be backported to 1.9 with the patch above.
2019-02-01 16:47:46 +01:00
Olivier Houchard
9c4f08ae39 BUG/MINOR: tune.fail-alloc: Don't forget to initialize ret.
In mem_should_fail(), if we don't want to fail the allocation, either
because mem_fail_rate is 0, or because we're still initializing, don't
forget to initialize ret, or we may return a non-zero value, and fail
an allocation we didn't want to fail.
This should only affect users that compile with DEBUG_FAIL_ALLOC.
2019-02-01 16:30:08 +01:00
Willy Tarreau
3e451842dc BUG/MEDIUM: htx: check the HTX compatibility in dynamic use-backend rules
I would have sworn it was done, probably we lost it during the refactoring.
If a frontend is in HTX and the backend not (and conersely), this is
normally detected at config parsing time unless the rule is dynamic. In
this case we must abort with an error 500. The logs will report "RR"
(resource issue while processing request) with the frontend and the
backend assigned, so that it's possible to figure what was attempted.

This must be backported to 1.9.
2019-02-01 15:09:54 +01:00
Willy Tarreau
3c4e19f429 BUG/MEDIUM: backend: always release the previous connection into its own target srv_list
There was a bug reported in issue #19 regarding the fact that haproxy
could mis-route requests to the wrong server. It turns out that when
switching to another server, the old connection was put back into the
srv_list corresponding to the stream's target instead of this connection's
target. Thus if this connection was later picked, it was pointing to the
wrong server.

The patch fixes this and also clarifies the assignment to srv_conn->target
so that it's clear we don't change it when picking it from the srv_list.

This must be backported to 1.9 only.
2019-02-01 11:58:33 +01:00
Olivier Houchard
dc21ff778b MINOR: debug: Add an option that causes random allocation failures.
When compiling with DEBUG_FAIL_ALLOC, add a new option, tune.fail-alloc,
that gives the percentage of chances an allocation fails.
This is useful to check that allocation failures are always handled
gracefully.
2019-01-31 19:38:25 +01:00
Olivier Houchard
9c9da5ee89 MINOR: muxes: Don't bother to LIST_DEL(&conn->list) before calling conn_free().
conn_free() already removes the connection from any idle list, so there's no
need to do it in the mux code, just before calling conn_free().
2019-01-31 19:38:25 +01:00
Olivier Houchard
ff5dd74e25 MINOR: xref: Add missing barriers.
Add a few missing barriers in the xref code, it's unlikely to be a problem
for x86, but may be on architectures with weak memory ordering.
2019-01-31 19:38:25 +01:00
Willy Tarreau
8694978892 BUG/MEDIUM: mux-h2: properly consider the peer's advertised max-concurrent-streams
Till now we used to only rely on tune.h2.max-concurrent-streams but if
a peer advertises a lower limit this can cause streams to be reset or
even the conection to be killed. Let's respect the peer's value for
outgoing streams.

This patch should be backported to 1.9, though it depends on the following
ones :
    BUG/MINOR: server: fix logic flaw in idle connection list management
    MINOR: mux-h2: max-concurrent-streams should be unsigned
    MINOR: mux-h2: make sure to only check concurrency limit on the frontend
    MINOR: mux-h2: learn and store the peer's advertised MAX_CONCURRENT_STREAMS setting
2019-01-31 19:38:25 +01:00
Willy Tarreau
2e2083ae5b MINOR: mux-h2: learn and store the peer's advertised MAX_CONCURRENT_STREAMS setting
We used not to take it into account because we only used the configured
parameter everywhere. This patch makes sure we can actually learn the
value advertised by the peer. We still enforce our own limit on top of
it however, to make sure we can actually limit resources or stream
concurrency in case of suboptimal server settings.
2019-01-31 19:38:25 +01:00
Willy Tarreau
fa1d357f05 MINOR: mux-h2: make sure to only check concurrency limit on the frontend
h2_has_too_many_cs() was renamed to h2_frt_has_too_many_cs() to make it
clear it's only used to throttle the frontend connection, and the call
places were adjusted to only call this code on a front connection. In
practice it was already the case since the H2_CF_DEM_TOOMANY flag is
only set there. But now the ambiguity is removed.
2019-01-31 19:38:25 +01:00
Willy Tarreau
5a490b669e MINOR: mux-h2: max-concurrent-streams should be unsigned
We compare it to other unsigned values, let's make it unsigned as well.
2019-01-31 19:38:25 +01:00
Willy Tarreau
00f18a36b6 BUG/MINOR: server: fix logic flaw in idle connection list management
With variable connection limits, it's not possible to accurately determine
whether the mux is still in use by comparing usage and max to be equal due
to the fact that one determines the capacity and the other one takes care
of the context. This can cause some connections to be dropped before they
reach their stream ID limit.

It seems it could also cause some connections to be terminated with
streams still alive if the limit was reduced to match the newly computed
avail_streams() value, though this cannot yet happen with existing muxes.

Instead let's switch to usage reports and simply check whether connections
are both unused and available before adding them to the idle list.

This should be backported to 1.9.
2019-01-31 19:38:25 +01:00
Willy Tarreau
180590409f BUG/MEDIUM: mux-h2: do not close the connection on aborted streams
We used to rely on a hint that a shutw() or shutr() without data is an
indication that the upper layer had performed a tcp-request content reject
and really wanted to kill the connection, but sadly there is another
situation where this happens, which is failed keep-alive request to a
server. In this case the upper layer stream silently closes to let the
client retry. In our case this had the side effect of killing all the
connection.

Instead of relying on such hints, let's address the problem differently
and rely on information passed by the upper layers about the intent to
kill the connection. During shutr/shutw, this is detected because the
flag CS_FL_KILL_CONN is set on the connstream. Then only in this case
we send a GOAWAY(ENHANCE_YOUR_CALM), otherwise we only send the reset.

This makes sure that failed backend requests only fail frontend requests
and not the whole connections anymore.

This fix relies on the two previous patches adding SI_FL_KILL_CONN and
CS_FL_KILL_CONN as well as the fix for the connection close, and it must
be backported to 1.9 and 1.8, though the code in 1.8 could slightly differ
(cs is always valid) :

  BUG/MEDIUM: mux-h2: wait for the mux buffer to be empty before closing the connection
  MINOR: stream-int: add a new flag to mention that we want the connection to be killed
  MINOR: connstream: have a new flag CS_FL_KILL_CONN to kill a connection
2019-01-31 19:38:25 +01:00
Willy Tarreau
51d0a7e54c MINOR: connstream: have a new flag CS_FL_KILL_CONN to kill a connection
This is the equivalent of SI_FL_KILL_CONN but for the connstreams. It
will be set by the stream-interface during the various shutdown
operations.
2019-01-31 19:38:25 +01:00
Willy Tarreau
0f9cd7b196 MINOR: stream-int: add a new flag to mention that we want the connection to be killed
The new flag SI_FL_KILL_CONN is now set by the rare actions which
deliberately want the whole connection (and not just the stream) to be
killed. This is only used for "tcp-request content reject",
"tcp-response content reject", "tcp-response content close" and
"http-request reject". The purpose is to desambiguate the close from
a regular shutdown. This will be used by the next patches.
2019-01-31 19:38:25 +01:00
Willy Tarreau
4dbda620f2 BUG/MEDIUM: mux-h2: wait for the mux buffer to be empty before closing the connection
When finishing to respond on a stream, a shutw() is called (resulting
in either an end of stream or RST), then h2_detach() is called, and may
decide to kill the connection is a number of conditions are satisfied.
Actually one of these conditions is that a GOAWAY frame was already sent
or attempted to be sent. This one is wrong, because it can happen in at
least these two situations :
  - a shutw() sends a GOAWAY to obey tcp-request content reject
  - a graceful shutdown is pending

In both cases, the connection will be aborted with the mux buffer holding
some data. In case of a strong abort the client will not see the GOAWAY or
RST and might want to try again, which is counter-productive. In case of
the graceful shutdown, it could result in truncated data. It looks like a
valid candidate for the issue reported here :

    https://www.mail-archive.com/haproxy@formilux.org/msg32433.html

A backport to 1.9 and 1.8 is necessary.
2019-01-31 19:38:25 +01:00
Willy Tarreau
28e581b21c BUG/MINOR: stream: don't close the front connection when facing a backend error
In 1.5-dev13, a bug was introduced by commit e3224e870 ("BUG/MINOR:
session: ensure that we don't retry connection if some data were sent").
If a connection error is reported after some data were sent (and lost),
we used to accidently mark the front connection as being in error instead
of only the back one because the two direction flags were applied to the
same channel. This case is extremely rare with raw connections but can
happen a bit more often with multiplexed streams. This will result in
the error not being correctly reported to the client.

This patch can be backported to all supported versions.
2019-01-31 19:38:25 +01:00
Olivier Houchard
8788b4111c BUG/MEDIUM: connections: Don't forget to remove CO_FL_SESS_IDLE.
If we're adding a connection to the server orphan idle list, don't forget
to remove the CO_FL_SESS_IDLE flag, or we will assume later it's still
attached to a session.

This should be backported to 1.9.
2019-01-31 19:38:25 +01:00
Christopher Faulet
3949c9d90d BUG/MEDIUM: mux-h1: Don't add "transfer-encoding" if message-body is forbidden
When a HTTP/1.1 or above response is emitted to a client, if the flag
H1_MF_XFER_LEN is set whereas H1_MF_CLEN and H1_MF_CHNK are not, the header
"transfer-encoding" is added. It is a way to make HTX chunking consistent with
H2. But we must exclude all cases where the message-body is explicitly forbidden
by the RFC:

 * for all 1XX, 204 and 304 responses
 * for any responses to HEAD requests
 * for 200 responses to CONNECT requests

For these 3 cases, the flag H1_MF_XFER_LEN is set but H1_MF_CLEN and H1_MF_CHNK
not. And the header "transfer-encoding" must not be added.

See issue #27 on github for details about the bug.

This patch must be backported in 1.9.
2019-01-31 11:07:17 +01:00
Frédéric Lécaille
04636b7bac BUG/MEDIUM: peers: Peer addresses parsing broken.
This bug was introduced by 355b203 commit which prevented the peer
addresses to be parsed for the local peer of a "peers" section.
When adding "parse_addr" boolean parameter to parse_server(), this commit
missed the case where the syntax with "peer" keyword should still be
supported in addition to the new syntax with "server"+"bind" keyword.

May be backported as fas as 1.5.
2019-01-31 09:56:39 +01:00
Willy Tarreau
a9b7796862 MINOR: mux-h2: consistently rely on the htx variable to detect the mode
In h2_frt_transfer_data(), we support both HTX and legacy modes. The
HTX mode is detected from the proxy option and sets a valid pointer
into the htx variable. Better rely on this variable in all the function
rather than testing the option again. This way the code is clearer and
even the compiler knows this pointer is valid when it's dereferenced.

This should be backported to 1.9 if the b_is_null() patch is backported.
2019-01-31 08:07:17 +01:00