Commit Graph

13247 Commits

Author SHA1 Message Date
Maciej Zdeb
fcdfd857b3 MINOR: log: Logging HTTP path only with %HPO
This patch adds a new logging variable '%HPO' for logging HTTP path only
(without query string) from relative or absolute URI.

For example:
log-format "hpo=%HPO hp=%HP hu=%HU hq=%HQ"

GET /r/1 HTTP/1.1
=>
hpo=/r/1 hp=/r/1 hu=/r/1 hq=

GET /r/2?q=2 HTTP/1.1
=>
hpo=/r/2 hp=/r/2 hu=/r/2?q=2 hq=?q=2

GET http://host/r/3 HTTP/1.1
=>
hpo=/r/3 hp=http://host/r/3 hu=http://host/r/3 hq=

GET http://host/r/4?q=4 HTTP/1.1
=>
hpo=/r/4 hp=http://host/r/4 hu=http://host/r/4?q=4 hq=?q=4
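
The mapping in the examples above can be reproduced with a small sketch. This is only an illustration in Python, not HAProxy's C implementation: the function name `split_uri` is hypothetical, and it assumes an absolute URI always has the form scheme://authority/path.

```python
def split_uri(uri):
    """Return (hpo, hp, hq) for a request URI, mimicking the examples above."""
    # Separate the query string: hq keeps the leading '?', hp drops the query.
    qpos = uri.find("?")
    if qpos == -1:
        hp, hq = uri, ""
    else:
        hp, hq = uri[:qpos], uri[qpos:]

    # hpo additionally strips "scheme://authority" from an absolute URI.
    hpo = hp
    if not hp.startswith("/"):
        sep = hp.find("://")
        if sep != -1:
            slash = hp.find("/", sep + 3)
            # Assumption of this sketch: an absolute URI without a path yields "".
            hpo = hp[slash:] if slash != -1 else ""
    return hpo, hp, hq
```

For instance, `split_uri("http://host/r/4?q=4")` gives the hpo, hp and hq values of the last example.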
2020-12-01 09:32:44 +01:00
Willy Tarreau
c94431b308 [RELEASE] Released version 2.4-dev2
Released version 2.4-dev2 with the following main changes :
    - BUILD: Make DEBUG part of .build_opts
    - BUILD: Show the value of DEBUG= in haproxy -vv
    - CI: Set DEBUG=-DDEBUG_STRICT=1 in GitHub Actions
    - MINOR: stream: Add level 7 retries on http error 401, 403
    - CLEANUP: remove unused function "ssl_sock_is_ckch_valid"
    - BUILD: SSL: add BoringSSL guarding to "RAND_keep_random_devices_open"
    - BUILD: SSL: do not "update" BoringSSL version equivalent anymore
    - BUG/MEDIUM: http_act: Restore init of log-format list
    - DOC: better describes how to configure a fallback crt
    - BUG/MAJOR: filters: Always keep all offsets up to date during data filtering
    - MINOR: cache: Prepare helper functions for Vary support
    - MEDIUM: cache: Add the Vary header support
    - MINOR: cache: Add a process-vary option that can enable/disable Vary processing
    - BUG/CRITICAL: cache: Fix trivial crash by sending accept-encoding header
    - BUG/MAJOR: peers: fix partial message decoding
    - DOC: cache: Add new caching limitation information
    - DOC: cache: Add information about Vary support
    - DOC: better document the config file format and escaping/quoting rules
    - DOC: Clarify %HP description in log-format
    - CI: github actions: update LibreSSL to 3.3.0
    - CI: github actions: enable 51degrees feature
    - MINOR: fd/threads: silence a build warning with threads disabled
    - BUG/MINOR: tcpcheck: Don't forget to reset tcp-check flags on new kind of check
    - MINOR: tcpcheck: Don't handle anymore in-progress send rules in tcpcheck_main
    - BUG/MAJOR: tcpcheck: Allocate input and output buffers from the buffer pool
    - MINOR: tcpcheck: Don't handle anymore in-progress connect rules in tcpcheck_main
    - MINOR: config: Deprecate and ignore tune.chksize global option
    - MINOR: config: Add a warning if tune.chksize is used
    - REORG: tcpcheck: Move check option parsing functions based on tcp-check
    - MINOR: check: Always increment check health counter on CONPASS
    - MINOR: tcpcheck: Add support of L7OKC on expect rules error-status argument
    - DOC: config: Make disable-on-404 option clearer on transition conditions
    - DOC: config: Move req.hdrs and req.hdrs_bin in L7 samples fetches section
    - BUG/MINOR: http-fetch: Fix smp_fetch_body() when called from a health-check
    - MINOR: plock: use an ARMv8 instruction barrier for the pause instruction
    - MINOR: debug: add "debug dev sched" to stress the scheduler.
    - MINOR: debug: add a trivial PRNG for scheduler stress-tests
    - BUG/MEDIUM: lists: Lock the element while we check if it is in a list.
    - MINOR: task: remove tasklet_insert_into_tasklet_list()
    - MINOR: task: perform atomic counter increments only once per wakeup
    - MINOR: task: remove __tasklet_remove_from_tasklet_list()
    - BUG/MEDIUM: task: close a possible data race condition on a tasklet's list link
    - BUG/MEDIUM: local log format regression.
2020-12-01 08:15:26 +01:00
Emeric Brun
0237c4e3f5 BUG/MEDIUM: local log format regression.
Since 2.3, the default local log format always adds the hostname field.
This behavior change was due to log/sink re-work, because according
to rfc3164 the hostname field is mandatory.

This patch re-introduces a legacy "local" format which is analogous
to rfc3164 but with the hostname stripped. This is the new
default if logs are generated by haproxy.

To stay compliant with previous configurations, the option
"log-send-hostname" acts as if the default format is switched
to rfc3164.

This patch addresses the github issue #963

This patch should be backported in branches >= 2.3.
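
The difference between the two layouts can be sketched as follows. This is a simplified Python illustration and not HAProxy's code: the function name `format_log` and its parameters are hypothetical, and the exact header HAProxy emits may differ in detail.

```python
def format_log(pri, timestamp, tag, pid, msg, hostname=None):
    """Build a syslog-style line; omit the hostname for the legacy "local" format."""
    header = f"<{pri}>{timestamp} "
    if hostname is not None:
        # rfc3164-like layout: the hostname follows the timestamp.
        header += f"{hostname} "
    return f"{header}{tag}[{pid}]: {msg}"
```

Calling it without `hostname` models the "local" default, while passing one models what "log-send-hostname" is described as restoring.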
2020-12-01 06:58:42 +01:00
Willy Tarreau
4d6c594998 BUG/MEDIUM: task: close a possible data race condition on a tasklet's list link
In issue #958 Ashley Penney reported intermittent crashes on AWS's ARM
nodes which would not happen on x86 nodes. After investigation it turned
out that the Neoverse N1 CPU cores used in the Graviton2 CPU are much
more aggressive than the usual Cortex A53/A72/A55 or any x86 regarding
memory ordering.

The issue that was triggered there is that if a tasklet_wakeup() call
is made on a tasklet scheduled to run on a foreign thread and that
tasklet is just being dequeued to be processed, there can be a race at
two places:
  - if MT_LIST_TRY_ADDQ() happens between MT_LIST_BEHEAD() and
    LIST_SPLICE_END_DETACHED() if the tasklet is alone in the list,
    because the emptiness test matches ;

  - if MT_LIST_TRY_ADDQ() happens during LIST_DEL_INIT() in
    run_tasks_from_lists(), then depending on how LIST_DEL_INIT() ends
    up being implemented, it may even corrupt the adjacent nodes while
    they're being reused for the in-tree storage.

This issue was introduced in 2.2 when support for waking up remote
tasklets was added. Initially the attachment of a tasklet to a list
was enough to know its status and this used to be stable information.
Now it's not sufficient to rely on this anymore, thus we need to use
different information.

This patch solves this by adding a new task flag, TASK_IN_LIST, which
is atomically set before attaching a tasklet to a list, and is only
removed after the tasklet is detached from a list. It is checked
by tasklet_wakeup_on() so that it may only be done while the tasklet
is out of any list, and is cleared during the state switch when calling
the tasklet. Note that the flag is not set for pure tasks as it's not
needed.

However this introduces a new special case: the function
tasklet_remove_from_tasklet_list() needs to keep both states in sync
and cannot check both the state and the attachment to a list at the
same time. This function is already limited to being used by the thread
owning the tasklet, so in this case the test remains reliable. However,
just like its predecessors, this function is wrong by design and it
should probably be replaced with a stricter one, a lazy one, or be
totally removed (it's only used in checks to avoid calling a possibly
scheduled event, and when freeing a tasklet). Regardless, for now the
function exists so the flag is removed only if the deletion could be
done, which covers all cases we're interested in regarding the insertion.
This removal is safe against a concurrent tasklet_wakeup_on() since
MT_LIST_DEL() guarantees the atomic test, and will ultimately clear
the flag only if the task could be deleted, so the flag will always
reflect the last state.
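
The idea of the flag can be sketched in a few lines. This is a Python model for illustration only, not the scheduler's C code: the lock stands in for an atomic fetch-and-or, the bit value of TASK_IN_LIST is made up, and the `Tasklet` class and `run_one()` helper are hypothetical.

```python
import threading
from collections import deque

TASK_IN_LIST = 0x1  # hypothetical bit value, for illustration only


class Tasklet:
    def __init__(self):
        self.state = 0
        self._lock = threading.Lock()  # stands in for an atomic fetch-and-or

    def test_and_set_in_list(self):
        """Atomically set TASK_IN_LIST; return True only if it was clear before."""
        with self._lock:
            was_set = bool(self.state & TASK_IN_LIST)
            self.state |= TASK_IN_LIST
            return not was_set

    def clear_in_list(self):
        with self._lock:
            self.state &= ~TASK_IN_LIST


def tasklet_wakeup(tasklet, run_list):
    """Queue the tasklet only if this caller won the flag; otherwise do nothing."""
    if tasklet.test_and_set_in_list():
        run_list.append(tasklet)
        return True
    return False


def run_one(run_list):
    """Dequeue one tasklet and clear its flag once it is out of the list."""
    tasklet = run_list.popleft()
    tasklet.clear_in_list()
    return tasklet
```

A second wakeup issued while the tasklet is still queued becomes a no-op, so the decision no longer depends on inspecting the list pointers themselves.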

This should be carefully backported as far as 2.2 after some
observation period. This patch depends on previous patch
"MINOR: task: remove __tasklet_remove_from_tasklet_list()".
2020-11-30 18:17:59 +01:00
Willy Tarreau
2da4c316c2 MINOR: task: remove __tasklet_remove_from_tasklet_list()
This function is only used at a single place directly within the
scheduler in run_tasks_from_lists() and it really ought not be called
by anything else, regardless of what its comment says. Let's delete
it, move the two lines directly into the call place, and take this
opportunity to factor the atomic decrement on tasks_run_queue. A comment
was added on the remaining one tasklet_remove_from_tasklet_list() to
mention the risks in using it.
2020-11-30 18:17:44 +01:00
Willy Tarreau
c309dbdd99 MINOR: task: perform atomic counter increments only once per wakeup
In process_runnable_tasks(), we walk the run queue and pick tasks to
insert them into the local list. And for each of these operations we
perform a few increments, some of which are atomic, and they're even
performed under the runqueue's lock. This is useless inside the loop,
better do them at the end, since we don't use these values inside the
loop and they're not used anywhere else either during this time. The
only one is task_list_size which is accessed in parallel by other
threads performing remote tasklet wakeups, but it's already
approximative and is used to decide to get out of the loop when the
limit is reached. So now we compute it first as an initial budget
instead.
2020-11-30 18:17:44 +01:00
Willy Tarreau
a868c2920b MINOR: task: remove tasklet_insert_into_tasklet_list()
This function is only called at a single place and adds more confusion
than it removes. It also makes one think it could be used outside of
the scheduler while it must absolutely not. Let's just move its two
lines to the call place, making the code more readable there. In
addition this clearly shows that the preliminary LIST_INIT() is
useless since the entry is immediately overwritten.
2020-11-30 18:17:44 +01:00
Olivier Houchard
1f05324cbe BUG/MEDIUM: lists: Lock the element while we check if it is in a list.
In MT_LIST_TRY_ADDQ() and MT_LIST_TRY_ADD() we can't just check if the
element is already in a list, because there's a small race condition, it
could be added between the time we checked, and the time we actually set
its next and prev, so we have to lock it first.

This is required to address issue #958.

This should be backported to 2.3, 2.2 and 2.1.
2020-11-30 18:17:29 +01:00
Willy Tarreau
8a069eb9a4 MINOR: debug: add a trivial PRNG for scheduler stress-tests
Commit a5a447984 ("MINOR: debug: add "debug dev sched" to stress the
scheduler.") doesn't scale with threads because ha_random64() takes care
of being totally thread-safe for use with UUIDs. We don't need this for
the stress-testing functions, let's just implement a xorshift PRNG
instead. On 8 threads the performance jumped from 230k ctx/s with 96%
spent in ha_random64() to 14M ctx/s.
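
For reference, a xorshift generator fits in a few lines. The sketch below uses Marsaglia's classic 32-bit variant with shifts 13, 17 and 5; whether the patch uses these exact constants is an assumption, and the class name `XorShift32` is hypothetical. It is written in Python for illustration only.

```python
class XorShift32:
    """Tiny non-thread-safe PRNG; each thread would own its own instance."""

    def __init__(self, seed=2463534242):
        seed &= 0xFFFFFFFF
        if seed == 0:
            # An all-zero state would stay at zero forever.
            raise ValueError("xorshift state must be non-zero")
        self.state = seed

    def next(self):
        x = self.state
        x ^= (x << 13) & 0xFFFFFFFF  # keep the value within 32 bits
        x ^= x >> 17
        x ^= (x << 5) & 0xFFFFFFFF
        self.state = x
        return x
```

Since the state is private to each instance, no locking or atomic operation is needed, which is the property that matters for the stress test.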
2020-11-30 17:07:32 +01:00
Willy Tarreau
a5a4479849 MINOR: debug: add "debug dev sched" to stress the scheduler.
This command supports starting a bunch of tasks or tasklets, either on the
current thread (mask=0), all (default), or any set, either single-threaded
or multi-threaded, and possibly auto-scheduled.

These tasks/tasklets will randomly pick another one to wake it up. The
tasks only do it 50% of the time while tasklets always wake two tasks up,
in order to achieve roughly 50% load (since the target might already be
woken up).
2020-11-29 17:43:07 +01:00
Your Name
1e237d037b MINOR: plock: use an ARMv8 instruction barrier for the pause instruction
As suggested by @AGSaidi in issue #958, on ARMv8 it's convenient to use
an "isb" instruction in pl_cpu_relax() to improve fairness. Without it
I've met a few watchdog conditions on valid locks with 16 threads,
indicating that some threads couldn't manage to get it in 2 seconds. It
never happened again with it. In addition, the performance increased
by slightly more than 5% thanks to the reduced contention.

This should be backported as far as 2.2, possibly even 2.0.
2020-11-29 14:53:33 +01:00
Christopher Faulet
a9ffc41637 BUG/MINOR: http-fetch: Fix smp_fetch_body() when called from a health-check
res.body may be called from a health-check. It is probably never used, but it is
possible. In such a case, there is no channel. Thus we must not use it
unconditionally to set the flag SMP_F_MAY_CHANGE on the smp.

Now the condition tests the channel first. In addition, the flag is not set if the
payload is fully received.

This patch must be backported as far as 2.2.
2020-11-27 10:30:23 +01:00
Christopher Faulet
687a68e2d0 DOC: config: Move req.hdrs and req.hdrs_bin in L7 samples fetches section
req.hdrs and req.hdrs_bin are L7 sample fetches, not L6. They were in the wrong
section.

This patch may be backported as far as 1.8.
2020-11-27 10:30:23 +01:00
Christopher Faulet
fa8b89ac20 DOC: config: Make disable-on-404 option clearer on transition conditions
This option is only evaluated for running server. A stopped server becoming
up again but still replying 404s will stay stopped.
2020-11-27 10:30:23 +01:00
Christopher Faulet
83662b5431 MINOR: tcpcheck: Add support of L7OKC on expect rules error-status argument
L7OKC may now be used as an error status for an HTTP/TCP expect rule. Thus
it is for instance possible to write:

    option httpchk GET /isalive
    http-check expect status 200,404
    http-check expect status 200 error-status L7OKC

It is more or less the same as the disable-on-404 option, except that a
DOWN server that is up again but still replying 404 will be set to the NOLB
state, while it will stay in the DOWN state with the disable-on-404 option.
2020-11-27 10:30:23 +01:00
Christopher Faulet
1e527cbf53 MINOR: check: Always increment check health counter on CONPASS
Regarding the health counter, a check finished with the CONDPASS result is
now the same as with the PASSED result: the health counter is always
incremented. Before, it was only incremented if the health counter was not 0.

There is no change for the disable-on-404 option because it is only
evaluated for running or stopping servers, so with a health check counter
greater than 0. But it will make it possible to handle the (STOPPED -> STOPPING)
transition for servers.
2020-11-27 10:30:23 +01:00
Christopher Faulet
97b7bdfcf7 REORG: tcpcheck: Move check option parsing functions based on tcp-check
The parsing functions for the check options based on tcp-check rules (redis, spop,
smtp, http...) are moved away from check.c. Now, these functions are placed
in tcpcheck.c. These functions are only related to the tcpcheck ruleset
configured on a proxy and not to the health-check attached to a server.
2020-11-27 10:30:23 +01:00
Christopher Faulet
f8c869bac4 MINOR: config: Add a warning if tune.chksize is used
This option is now deprecated. It is recent, but it is now marked as
deprecated as far as 2.2. Thus, there is now a warning in 2.4 if this
option is still used. It will be removed in 2.5.

Because 2.3 is quite new, this patch may be backported to 2.3.
2020-11-27 10:30:23 +01:00
Christopher Faulet
bb9fb8b7f8 MINOR: config: Deprecate and ignore tune.chksize global option
This option is now ignored because I/O check buffers are now allocated using the
buffer pool. Thus, it is marked as deprecated in the documentation and ignored
during the configuration parsing. The field is also removed from the global
structure.

Because this option is ignored since a recent fix, backported as far as 2.2,
this patch should be backported too. Especially because it updates the
documentation.
2020-11-27 10:30:23 +01:00
Christopher Faulet
b1bb069c15 MINOR: tcpcheck: Don't handle anymore in-progress connect rules in tcpcheck_main
The special handling of in-progress connect rules at the beginning of
tcpcheck_main() function can be removed. Instead, at the beginning of the
tcpcheck_eval_connect() function, we test if there is already an existing
connection. In this case, it means we are waiting for a connection
establishment. In addition, before evaluating a new connect rule, we take
care to release any previous connection.
2020-11-27 10:29:41 +01:00
Christopher Faulet
b381a505c1 BUG/MAJOR: tcpcheck: Allocate input and output buffers from the buffer pool
Historically, the input and output buffers of a check are allocated by hand
during the startup, with a specific size (not necessarily the same as
other buffers). But since the recent refactoring of the checks to rely
exclusively on the tcp-checks and to use the underlying mux layer, this part
is totally buggy. Indeed, because these buffers are now passed to a mux,
they may be swapped if a zero-copy is possible. In fact, for now it is
only possible in h2_rcv_buf(). Thus the bug concretely only exists if a h2
health-check is performed. But, it is a latent bug for other muxes.

Another problem is the size of these buffers. Because it may differ from the
other buffer size, it might be a source of bugs.

Finally, for configurations with hundreds of thousands of servers, having 2
buffers per check always allocated may be an issue.

To fix the bug, we now allocate these buffers when required using the buffer
pool. Thus not-running checks don't waste memory and muxes may swap them if
possible. The only drawback is that the check buffers now always have the same
size as the buffers used by the streams. This indirectly deprecates the
"tune.chksize" global option.

In addition, the http-check regtest has been updated to perform some h2
health-checks.

Many thanks to @VigneshSP94 for its help on this bug.

This patch should solve the issue #936. It relies on the commit "MINOR:
tcpcheck: Don't handle anymore in-progress send rules in tcpcheck_main".
Both must be backported as far as 2.2.
2020-11-27 10:29:41 +01:00
Christopher Faulet
39066c2738 MINOR: tcpcheck: Don't handle anymore in-progress send rules in tcpcheck_main
The special handling of in-progress send rules at the beginning of
tcpcheck_main() function can be removed. Instead, at the beginning of the
tcpcheck_eval_send() function, we test if there is some data in the output
buffer. In this case, it means we are evaluating an unfinished send rule and
we can jump to the sending part, skipping the formatting part.

This patch is mandatory for a major fix on the checks and must be backported
as far as 2.2.
2020-11-27 10:08:21 +01:00
Christopher Faulet
1faf18ae39 BUG/MINOR: tcpcheck: Don't forget to reset tcp-check flags on new kind of check
When a new kind of check is found during the parsing of a proxy section (via
an option directive), we must reset tcpcheck flags for this proxy. It is
mandatory to not inherit some flags from a previously declared check (for
instance in the default section).

This patch must be backported as far as 2.2.
2020-11-27 10:08:18 +01:00
Willy Tarreau
5a7d6ebf2c MINOR: fd/threads: silence a build warning with threads disabled
Building with gcc-9.3.0 without threads may result in this warning:

In file included from include/haproxy/api-t.h:36,
                 from include/haproxy/api.h:33,
                 from src/fd.c:90:
src/fd.c: In function 'updt_fd_polling':
include/haproxy/fd.h:507:11: warning: array subscript 63 is above array bounds of 'int[1]' [-Warray-bounds]
  507 |  DISGUISE(write(poller_wr_pipe[tid], &c, 1));
include/haproxy/compiler.h:92:41: note: in definition of macro 'DISGUISE'
   92 | #define DISGUISE(v) ({ typeof(v) __v = (v); ALREADY_CHECKED(__v); __v; })
      |                                         ^
src/fd.c:113:5: note: while referencing 'poller_wr_pipe'
  113 | int poller_wr_pipe[MAX_THREADS]; // Pipe to wake the threads
      |     ^~~~~~~~~~~~~~

gcc is wrong but this time it cannot be blamed because it doesn't know
that the FD's thread_mask always has at least one bit set. Let's add
the test for all_threads_mask there. It will also remove that test and
drop the else block.
2020-11-26 22:28:41 +01:00
Ilya Shipitsin
b34aee8294 CI: github actions: enable 51degrees feature
2020-11-26 19:08:15 +01:00
Ilya Shipitsin
f500359708 CI: github actions: update LibreSSL to 3.3.0
2020-11-26 19:08:02 +01:00
Maciej Zdeb
21acc33266 DOC: Clarify %HP description in log-format
%HP is used to report HTTP request URI in logs, which might be relative
or absolute. Description in documentation should not suggest that it
behaves exactly the same as "path" sample fetch.

This is even more important after 30ee1efe67
because right now, when HTTP2 is a standard, %HP usually returns absolute
URI.

This might be backported as far as 2.1
2020-11-26 19:07:21 +01:00
Willy Tarreau
6f1129d14d DOC: better document the config file format and escaping/quoting rules
It's always a pain to figure how to proceed when special characters need
to be embedded inside arguments of an expression. Let's document the
configuration file format and how unquoting/unescaping works at each
level (top level and argument level) so that everyone hopefully finds
suitable reminders or examples for complex cases.

This is related to github issue #200 and addresses issues #712 and #966.
2020-11-26 18:50:12 +01:00
Remi Tricot-Le Breton
4f7308335e DOC: cache: Add information about Vary support
We do not skip all responses containing a Vary in the caching mechanism
anymore. Under certain conditions such responses might be cached.
2020-11-26 18:01:43 +01:00
Remi Tricot-Le Breton
d493bc863d DOC: cache: Add new caching limitation information
Responses that do not have an explicit expiration time or a validator
will not be cached anymore.

Must be backported if cc9bf2e ("MEDIUM: cache: Change caching
conditions") is backported.
2020-11-26 17:58:01 +01:00
Willy Tarreau
345ebcfc01 BUG/MAJOR: peers: fix partial message decoding
Another bug in the peers message parser was uncovered by last commit
1dfd4f106 ("BUG/MEDIUM: peers: fix decoding of multi-byte length in
stick-table messages"): the function return on incomplete message does
not check if the channel has a pending close before deciding to return
0. It did not hurt previously because the loop calling co_getblk() once
per character would have depleted the buffer and hit the end, causing
<0 to be returned and matching the condition. But now that we process
at once what is available this cannot be relied on anymore and it's
now clearly visible that the final check is missing.

What happens when this strikes is that if a peer connection breaks in
the middle of a message, the function will return 0 (missing data) but
the caller doesn't check for the closed buffer, subscribes to reads,
and the applet handler is immediately called again since some data are
still available. This is detected by the loop prevention and the process
dies complaining that an appctx is spinning.

This patch simply adds the check for closed channel. It must be
backported to the same versions as the fix above.
2020-11-26 17:12:47 +01:00
Tim Duesterhus
23b2945c1c BUG/CRITICAL: cache: Fix trivial crash by sending accept-encoding header
Since commit 3d08236cb3 HAProxy can be trivially
crashed remotely by sending an `accept-encoding` HTTP request header that
contains 16 commas.

This is because the `values` array in `accept_encoding_normalizer` accepts only
16 entries and it is not verified whether the end is reached during looping.

Fix this issue by checking the length. This patch also simplifies the ist
processing in the loop, because it manually calculated offsets and lengths,
when the ist API exposes perfectly safe functions to advance and truncate ists.

I wonder whether the accept_encoding_normalizer function is able to re-use some
existing function for parsing headers that may contain lists of values. I'll
leave this evaluation up to someone else, only patching the obvious crash.

This commit is 2.4-dev specific and was merged just a few hours ago. No
backport needed.
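
The kind of check described above can be sketched as follows. This is a Python illustration only, not the patched C function: `split_bounded` is a hypothetical name, and rejecting an oversized list (rather than truncating it) is an assumption of this sketch.

```python
MAX_VALUES = 16  # capacity of the fixed-size array described above


def split_bounded(header_value, max_values=MAX_VALUES):
    """Split a comma-separated header, refusing inputs with too many entries.

    Returns the list of values, or None when the list would not fit.
    """
    values = []
    for raw in header_value.split(","):
        if len(values) >= max_values:
            # The length check: stop before writing past the end of the array.
            return None
        values.append(raw.strip())
    return values
```

A value containing 16 commas splits into 17 entries, so it is refused here instead of overflowing a 16-entry array.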
2020-11-25 10:23:00 +01:00
Remi Tricot-Le Breton
754b2428d3 MINOR: cache: Add a process-vary option that can enable/disable Vary processing
The cache section's process-vary option takes a 0 or 1 value to disable
or enable the vary processing.
When disabled, a response containing such a header will never be cached.
When enabled, we will calculate a preliminary hash for a subset of request
headers on all the incoming requests (which might come with a cpu cost) which
will be used to build a secondary key for a given request (see RFC 7234#4.1).
The default value is 0 (disabled).
2020-11-24 16:52:57 +01:00
Remi Tricot-Le Breton
1785f3dd96 MEDIUM: cache: Add the Vary header support
Calculate a preliminary secondary key for every request we see so that
we can have a real secondary key if the response is cacheable and
contains a manageable Vary header.
The cache's ebtree is now allowed to have multiple entries with the same
primary key. Two of those entries will be distinguished thanks to
secondary keys stored in the cache_entry (based on hashes of a subset of
their headers).
When looking for an entry in the cache (cache_use), we still use the
primary key (built the same way as before), but in case of match, we
also need to check if the entry has a vary signature. If it has one, we
need to perform an extra check based on the newly built secondary key.
We will only be able to forge a response out of the cache if both the
primary and secondary keys match with one of our entries. Otherwise the
request will be forwarded to the server.
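
The lookup described above can be modelled with a dictionary holding several entries per primary key. This is a simplified Python sketch, not the ebtree-based C code: the `VaryCache` class and its methods are hypothetical, and keys are plain strings here instead of hashes.

```python
class VaryCache:
    def __init__(self):
        # primary key -> list of (secondary key or None, response)
        self._entries = {}

    def store(self, primary, response, secondary=None):
        self._entries.setdefault(primary, []).append((secondary, response))

    def lookup(self, primary, secondary):
        """Return a cached response, or None if the request must go to the server."""
        for entry_secondary, response in self._entries.get(primary, []):
            if entry_secondary is None:
                # No vary signature on this entry: the primary key is enough.
                return response
            if entry_secondary == secondary:
                # Both primary and secondary keys match.
                return response
        return None
```

A request whose secondary key matches none of the stored entries gets `None`, which corresponds to forwarding it to the server.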
2020-11-24 16:52:57 +01:00
Remi Tricot-Le Breton
3d08236cb3 MINOR: cache: Prepare helper functions for Vary support
The Vary functionality is based on a secondary key that needs to be
calculated for every request to which a server answers with a Vary
header. The Vary header, which can only be found in server responses,
determines which headers of the request need to be taken into account in
the secondary key. Since we do not want to have to store all the headers
of the request until we have the response, we will pre-calculate as many
sub-hashes as there are headers that we want to manage in a Vary
context. We will only focus on a subset of headers which are likely to
be mentioned in a Vary response (accept-encoding and referer for now).
Every managed header will have its own normalization function which is
in charge of transforming the header value into a core representation,
more robust to insignificant changes that could exist between multiple
clients. For instance, two accept-encoding values mentioning the same
encodings but in different orders should give the same hash.
This patch adds a function that parses a Vary header value and checks if
all the values belong to our supported subset. It also adds the
normalization functions for our two headers, as well as utility
functions that can prebuild a secondary key for a given request and
transform it into an actual secondary key after the vary signature is
determined from the response.
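
The normalization idea for accept-encoding can be sketched as follows. This is a Python illustration under simplifying assumptions, not the C normalizer: the function names are hypothetical, SHA-256 is an arbitrary hash choice, and parameters such as q-values are not interpreted.

```python
import hashlib


def normalize_accept_encoding(value):
    """Reduce an accept-encoding value to an order-insensitive canonical form."""
    tokens = {tok.strip().lower() for tok in value.split(",") if tok.strip()}
    return ",".join(sorted(tokens))


def sub_hash(value):
    """Hash the canonical form; SHA-256 is an arbitrary choice for this sketch."""
    canonical = normalize_accept_encoding(value)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this, "gzip, br" and "BR,gzip" produce the same sub-hash, so two clients listing the same encodings in a different order share one cache entry.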
2020-11-24 16:52:57 +01:00
Christopher Faulet
401e6dbff3 BUG/MAJOR: filters: Always keep all offsets up to date during data filtering
When at least one data filter is registered on a channel, the offsets of all
filters must be kept up to date. For data filters but also for others. It is
safer to do it in that way. Indirectly, this patch fixes 2 hidden bugs
revealed by the commit 22fca1f2c ("BUG/MEDIUM: filters: Forward all filtered
data at the end of http filtering").

The first one, the worst of both, happens at the end of http filtering when
at least one data filter is registered on the channel. We call the
http_end() callback function on the filters, when defined, to finish the
http filtering. But it is performed for all filters. Before the commit
22fca1f2c, the only risk was to call the http_end() callback function
unexpectedly on a filter. Now, we may have an overflow on the offset
variable, used at the end to forward all filtered data. Of course, from the
moment we forward an arbitrary huge amount of data, all kinds of bad things
may happen. So offset computation is performed for all filters and
http_end() callback function is called only for data filters.

The other one happens when a data filter alters the data of a channel: it
must update the offsets of all previous filters. But the offset of non-data
filters must be up to date, otherwise, here too we may have an integer
overflow.

Another way to fix these bugs is to always ignore non-data filters from the
offsets computation. But this patch is safer and probably easier to
maintain.

This patch must be backported in all versions where the above commit is. So
as far as 2.0.
2020-11-24 14:17:32 +01:00
Joao Morais
aa8fcc4692 DOC: better describes how to configure a fallback crt
A default certificate is always the first one declared in the bind line,
either from `crt` or from `crt-list` option. This commit updates the
description of how to configure a fallback certificate, clarifying that
it needs to be the first one of the bind line.

Should be merged as far as the first SNI filter implementation.
2020-11-24 13:23:06 +01:00
Maciej Zdeb
6dee9969b9 BUG/MEDIUM: http_act: Restore init of log-format list
Restore init of log-format list in parse_http_del_header which was
accidentally deleted by commit ebdd4c55da
(implementation of different header matching methods for
http-request/response del-header).

This is related to GitHub issue #909
2020-11-24 10:33:46 +01:00
Ilya Shipitsin
5bfe66366c BUILD: SSL: do not "update" BoringSSL version equivalent anymore
we have added all required fine guarding, no need to reduce
BoringSSL version back to 1.1.0 anymore, we do not depend on it
2020-11-24 09:54:44 +01:00
Ilya Shipitsin
d9a16dc0f2 BUILD: SSL: add BoringSSL guarding to "RAND_keep_random_devices_open"
"RAND_keep_random_devices_open" is OpenSSL specific and is not present
in other OpenSSL variants like LibreSSL or BoringSSL. BoringSSL recently
"updated" its internal openssl version to 1.1.1, and we temporarily set it
back to 1.1.0. As we are going to remove that hack, let us add proper
guarding.
2020-11-24 09:54:44 +01:00
Ilya Shipitsin
f04a89c549 CLEANUP: remove unused function "ssl_sock_is_ckch_valid"
"ssl_sock_is_ckch_valid" is not used anymore, let us remove it
2020-11-24 09:54:44 +01:00
Julien Pivotto
2de240a676 MINOR: stream: Add level 7 retries on http error 401, 403
Level-7 retries are only possible with a restricted number of HTTP
return codes. While it is usually not safe to retry on 401 and 403, I
came up with an authentication backend which was not synchronizing
authentication of users. While not perfect, being allowed to also retry
on those return codes is really helpful and acts as a hotfix until we
can fix the backend.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
2020-11-23 09:33:14 +01:00
Tim Duesterhus
9fee7e02d1 CI: Set DEBUG=-DDEBUG_STRICT=1 in GitHub Actions
This was missing when migrating from Travis.
2020-11-21 18:27:33 +01:00
Tim Duesterhus
c8d19702f4 BUILD: Show the value of DEBUG= in haproxy -vv
Previously this was not visible after building.
2020-11-21 18:27:33 +01:00
Tim Duesterhus
81e948e051 BUILD: Make DEBUG part of .build_opts
This forces a recompilation if the value of DEBUG= changes.
2020-11-21 18:27:33 +01:00
Willy Tarreau
1a38ffcb0f [RELEASE] Released version 2.4-dev1
Released version 2.4-dev1 with the following main changes :
    - MINOR: ist: Add istend() function to return a pointer to the end of the string
    - MINOR: sample: Add converters to parse FIX messages
    - REGTEST: converter: Add a regtest for fix converters
    - MINOR: sample: Add converts to parses MQTT messages
    - REGTEST: converter: Add a regtest for MQTT converters
    - MINOR: compat: automatically include malloc.h on glibc
    - MEDIUM: pools: call malloc_trim() from pool_gc()
    - MEDIUM: pattern: call malloc_trim() on pat_ref_reload()
    - MINOR: pattern: move the update revision to the pat_ref, not the expression
    - CLEANUP: pattern: delete the back refs at once during pat_ref_reload()
    - MINOR: pattern: new sflag PAT_SF_REGFREE indicates regex_free() is needed
    - MINOR: pattern: make the delete and prune functions more generic
    - MEDIUM: pattern: link all final elements from the reference
    - MEDIUM: pattern: change the pat_del_* functions to delete from the references
    - MINOR: pattern: remerge the list and tree deletion functions
    - MINOR: pattern: perform a single call to pat_delete_gen() under the expression
    - CLEANUP: acl: don't reference the generic pattern deletion function anymore
    - CLEANUP: pattern: remove pat_delete_fcts[] and pattern_head->delete()
    - MINOR: pattern: introduce pat_ref_delete_by_ptr() to delete a valid reference
    - MINOR: pattern: store a generation number in the reference patterns
    - MEDIUM: pattern: only match patterns that match the current generation
    - MINOR: pattern: add pat_ref_commit() to commit a previously inserted element
    - MINOR: pattern: implement pat_ref_load() to load a pattern at a given generation
    - MINOR: pattern: add pat_ref_purge_older() to purge old entries
    - MEDIUM: pattern: make pat_ref_prune() rely on pat_ref_purge_older()
    - MINOR: pattern: during reload, delete elements from the ref, not the expression
    - MINOR: pattern: prepare removal of a pattern from the list head
    - MEDIUM: pattern: turn the pattern chaining to single-linked list
    - CLEANUP: cfgparse: remove duplicate registration for transparent build options
    - BUG/MINOR: ssl: don't report 1024 bits DH param load error when it's higher
    - MINOR: http-htx: Add understandable errors for the errorfiles parsing
    - MINOR: ssl: instantiate stats module
    - MINOR: ssl: count client hello for stats
    - MINOR: ssl: add counters for ssl sessions
    - DOC: config: Fix a typo on ssl_c_chain_der
    - MINOR: server: remove idle lock in srv_cleanup_connections
    - BUILD: ssl: silence build warning on uninitialised counters
    - BUILD: http-htx: fix build warning regarding long type in printf
    - REGTEST: ssl: test wildcard and multi-type + exclusions
    - BUG/MEDIUM: ssl/crt-list: correctly insert crt-list line if crt already loaded
    - CI: Expand use of GitHub Actions for CI
    - REGTEST: ssl: mark reg-tests/ssl/ssl_crt-list_filters.vtc as broken
    - BUG/MINOR: pattern: a sample marked as const could be written
    - BUG/MINOR: lua: set buffer size during map lookups
    - MEDIUM: cache: Change caching conditions
    - BUG/MINOR: stats: free dynamically stats fields/lines on shutdown
    - BUG/MEDIUM: stats: prevent crash if counters not alloc with dummy one
    - MINOR: peers: Add traces to peer_treat_updatemsg().
    - BUG/MINOR: peers: Do not ignore a protocol error for dictionary entries.
    - BUG/MINOR: peers: Missing TX cache entries reset.
    - BUG/MEDIUM: peers: fix decoding of multi-byte length in stick-table messages
    - BUG/MINOR: http-fetch: Extract cookie value even when no cookie name
    - BUG/MINOR: http-fetch: Fix calls w/o parentheses of the cookie sample fetches
    - BUG/MEDIUM: check: reuse srv proto only if using same mode
    - MINOR: check: report error on incompatible proto
    - MINOR: check: report error on incompatible connect proto
    - BUG/MINOR: http-htx: Handle warnings when parsing http-error and http-errors
    - BUG/MAJOR: spoe: Be sure to remove all references on a released spoe applet
    - MINOR: spoe: Don't close connection in sync mode on processing timeout
    - BUG/MINOR: tcpcheck: Don't warn on unused rules if check option is after
    - MINOR: init: Fix the prototype for per-thread free callbacks
    - MINOR: config/mux-h2: Return ERR_ flags from init_h2() instead of a status
    - CLEANUP: config: Return ERR_NONE from config callbacks instead of 0
    - MINOR: cfgparse: tighten the scope of newnameserver variable, free it on error.
    - REGTEST: make ssl_client_samples and ssl_server_samples require version 2.2
    - REGTESTS: Add sample_fetches/cook.vtc
    - BUG/MEDIUM: filters: Forward all filtered data at the end of http filtering
    - BUG/MINOR: http-ana: Don't wait for the body of CONNECT requests
    - CLEANUP: flt-trace: Remove unused random-parsing option
    - MINOR: flt-trace: Add an option to inhibit trace messages
    - MINOR: flt-trace: Use a bitfield for the trace options
    - REGTESTS: Add a script to test the random forwarding with several filters
    - REGTESTS: mark the abns test as broken again
    - REGTESTS: converter: add url_dec test
    - CI: Stop hijacking the hosts file
    - CI: Make the h2spec workflow more consistent with the VTest workflow
    - CI: travis-ci: remove amd64, osx builds
    - CI: travis-ci: arm64 are not allowed to fail anymore
    - DOC: add missing 3.10 in the summary
    - MINOR: ssl: remove client hello counters
    - MEDIUM: stats: add counters for failed handshake
    - MINOR: ssl: create common ssl_ctx init
    - MEDIUM: cli/ssl: configure ssl on server at runtime
    - REGTEST: server/cli_set_ssl.vtc requires OpenSSL
    - DOC: coding-style: update a few rules about pointers
    - BUG/MINOR: ssl: segv on startup when AKID but no keyid
    - BUILD: ssl: use SSL_MODE_ASYNC macro instead of OPENSSL_VERSION
    - BUG/MEDIUM: http-ana: Don't eval http-after-response ruleset on empty messages
    - BUG/MEDIUM: ssl/crt-list: bundle support broken in crt-list
    - BUG/MEDIUM: ssl: error when no certificates are found
    - BUG/MINOR: ssl/crt-list: load bundle in crt-list only if activated
    - BUG/MEDIUM: ssl/crt-list: fix error when no file found
    - CI: Github Actions: enable prometheus exporter
    - CI: Github Actions: remove LibreSSL-3.0.2 builds
    - CI: Github Actions: enable BoringSSL builds
    - CI: travis-ci: remove builds migrated to GH actions
    - BUILD: makefile: enable crypt(3) for OpenBSD
    - CI: Github Action: run "apt-get update" before packages restore
    - BUILD: SSL: guard TLS13 ciphersuites with HAVE_SSL_CTX_SET_CIPHERSUITES
    - CI: Pass the github.event_name to matrix.py
    - CI: Clean up Windows CI
    - DOC: clarify how to create a fallback crt
    - CLEANUP: connection: do not use conn->owner when the session is known
    - BUG/MAJOR: connection: reset conn->owner when detaching from session list
    - REGTESTS: mark proxy_protocol_random_fail as broken
    - BUG/MINOR: http_htx: Fix searching headers by substring
    - MINOR: http_act: Add -m flag for del-header name matching method
2020-11-21 16:00:40 +01:00
Maciej Zdeb
ebdd4c55da MINOR: http_act: Add -m flag for del-header name matching method
This patch adds the -m flag, which allows specifying the header name
matching method when deleting headers from an HTTP request/response.
Currently beg, end, sub, str and reg are supported.

This is related to GitHub issue #909
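
As an illustration of the matching methods listed above, a minimal
configuration sketch might look like the following. The frontend name, the
port and the header names are hypothetical, and the "del-header <name> -m
<meth>" form is assumed from the flag description in this commit:

```
frontend fe_example                  # hypothetical frontend name
    bind :8080                       # hypothetical port
    mode http
    # delete every request header whose name begins with "x-internal-"
    http-request del-header x-internal- -m beg
    # delete every response header whose name contains "debug"
    http-response del-header debug -m sub
```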
2020-11-21 15:54:30 +01:00
Maciej Zdeb
302b9f8d7a BUG/MINOR: http_htx: Fix searching headers by substring
Function __http_find_header is used to search headers by name using specified
matching method. Matching by substring returned unexpected results due to wrong
length of substring supplied to strnistr function.

Also fixed the boolean condition by inverting it, as we're interested in
headers that contain the substring.

This patch should be backported as far as 2.2
2020-11-21 15:54:26 +01:00
Willy Tarreau
4137889911 REGTESTS: mark proxy_protocol_random_fail as broken
As indicated in issue #907 it very frequently fails on FreeBSD, and
looking at it, it's broken in multiple ways. It relies on log ordering
between two layers, the first one being allowed to support h2. Given
that on FreeBSD it usually ends up in timeouts, it's very likely that
for some reason one frontend logs before the other one or that curl
uses h2 instead of h1 there, and that the log instance waits forever
for its lines.

Usually it works fine when run locally though, so let's not remove it
and mark it as broken instead so that it can still be used when relevant.
2020-11-21 15:33:03 +01:00
Willy Tarreau
3aab17bd56 BUG/MAJOR: connection: reset conn->owner when detaching from session list
Baptiste reported a new crash affecting 2.3 which can be triggered
when using H2 on the backend, with http-reuse always and with tens
of clients doing close only. There are a few combined cases which cause
this to happen, but each time the issue is the same, an already freed
session is dereferenced in session_unown_conn().

Two cases were identified to cause this:
  - a connection referencing a session as its owner, which is detached
    from the session's list and is destroyed after this session ends.
    The test on conn->owner before calling session_unown_conn() is not
    sufficient as the pointer is not null but is not valid anymore.

  - a connection that never goes idle and that gets killed from the
    mux, where session_free() is called first, then conn_free() calls
    session_unown_conn() which scans the just freed session for older
    connections. This one is only triggered with DEBUG_UAF.

The reason for this session to be present here is that it's needed during
the connection setup, to be passed to conn_install_mux_be() to mux->init()
as the owning session, but it's never deleted afterwards. Furthermore, even
conn_session_free() doesn't delete this pointer after freeing the session
that lies there. Both do definitely result in a use-after-free that's more
easily triggered under DEBUG_UAF.

This patch makes sure that the owner is always deleted after detaching
or killing the session. However it is currently not possible to clear
the owner right after a synchronous init because the proxy protocol
apparently needs it (a reg test checks this), and if we leave it past
the connection setup with the session not attached anywhere, it's hard
to catch the right moment to detach it. This means that the session may
remain in conn->owner as long as the connection has never been added to
nor removed from the session's idle list. Given that this patch needs to
remain simple enough to be backported, instead it adds a workaround in
session_unown_conn() to detect that the element is already not attached
anywhere.

This fix absolutely requires previous patch "CLEANUP: connection: do not
use conn->owner when the session is known" otherwise the situation will
be even worse, as some places used to rely on conn->owner instead of the
session.

The fix could theoretically be backported as far as 1.8. However, the code
in this area has significantly changed along versions and there are more
risks of breaking working stuff than fixing real issues there. The issue
was really woken up in two steps during 2.3-dev when slightly reworking
the idle conns with commit 08016ab82 ("MEDIUM: connection: Add private
connections synchronously in session server list") and when adding
support for storing used H2 connections in the session and adding the
necessary call to session_unown_conn() in the muxes. But the same test
managed to crash 2.2 when built in DEBUG_UAF and patched like this,
proving that we used to already leave dangling pointers behind us:

|  diff --git a/include/haproxy/connection.h b/include/haproxy/connection.h
|  index f8f235c1a..dd30b5f80 100644
|  --- a/include/haproxy/connection.h
|  +++ b/include/haproxy/connection.h
|  @@ -458,6 +458,10 @@ static inline void conn_free(struct connection *conn)
|                          sess->idle_conns--;
|                  session_unown_conn(sess, conn);
|          }
|  +       else {
|  +               struct session *sess = conn->owner;
|  +               BUG_ON(sess && sess->origin != &conn->obj_type);
|  +       }
|
|          sockaddr_free(&conn->src);
|          sockaddr_free(&conn->dst);

It's uncertain whether an existing code path there can lead to dereferencing
conn->owner when it's bad, though certain suspicious memory corruption bugs
make one think it's a likely candidate. The patch should not be hard to
adapt there.

Backports to 2.1 and older are left to the discretion of the person
doing the backport.

A reproducer consists in this:

  global
    nbthread 1

  listen l
    bind :9000
    mode http
    http-reuse always
    server s 127.0.0.1:8999 proto h2

  frontend f
    bind :8999 proto h2
    mode http
    http-request return status 200

Then this will make it crash within 2-3 seconds:

  $ h1load -e -r 1 -c 10 http://0:9000/

If it does not, it might be that DEBUG_UAF was not used (it's harder then)
and it might be useful to restart.
2020-11-21 15:29:22 +01:00