The function really has the semantics of a realloc() except that it
also passes the old size to help with accounting. No need to special
case the free or malloc, realloc does everything we need.
If the session is not established, the applet handler could leave
with the applet detached from the ring. At next call, the attach
counter will be decreased again causing unpredectable behavior.
This patch should be backported on branches >=2.2
With commit a1f12746b ("MINOR: traces: add a new level "error" below
the "user" level") a new trace level was inserted, resulting in
shifting all exiting ones by one. But the levels reported in the
__trace() function were not updated accordingly, resulting in the
TRACE_LEVEL_DEVELOPER not to be properly reported anymore. This
patch fixes it by extending the number of levels to 6.
No backport is needed.
When many concurrent requests targeting the same resource were seen, the
cache could sometimes be filled by too many partial responses resulting
in the impossibility to cache a single one of them. This happened
because the actual tree insertion happened only after all the payload of
every response was seen. So until then, every response was added to the
cache because none of the streams knew that a similar request/response
was already being treated.
This patch consists in adding the cache_entry as soon as possible in the
tree (right after the first packet) so that the other responses do not
get cached as well (if they have the same primary key).
A "complete" flag is also added to the cache_entry so that we know if
all the payload is already stored in the entry or if it is still being
processed.
Turn the "Accept-Encoding" value to lower case before processing it.
Calculate the CRC on every token instead of a sorted concatenation of
them all (in order to avoir copying them) then XOR all the CRCs into a
single hash (while ignoring duplicates).
Lua allows registering multiple sample-fetches, converters, action, cli,
applet/services with the same name. This is absolutely useless since only
the first registration will be used. This patch sends a warning if the case
is encountered.
This pach could be backported until 1.8, with the 3 associated patches:
- MINOR: actions: Export actions lookup functions
- MINOR: actions: add a function returning a service pointer from its name
- MINOR: cli: add a function to look up a CLI service description
This function will be useful to check if the keyword is already registered.
Also add a define for the max number of args.
This will be needed by a next patch to fix a bug and will have to be
backported.
This function simply calls action_lookup() on the private service_keywords,
to look up a service name. This will be used to detect double registration
of a same service from Lua.
This will be needed by a next patch to fix a bug and will have to be
backported.
These functions will be useful to check if a keyword is already registered.
This will be needed by a next patch to fix a bug, and will need to be
backported.
Operation luaL_openlibs() and lua_prepend path are processed whithout
the safe context, so in case of failure Haproxy aborts or stops without
error message.
This patch could be backported until 1.8
"lua-load" doesn't check if the expected parameter is present. It tries to
open() directly the argument at second position. So if the filename is
omitted, it tries to load an empty filename.
This patch could be backported until 1.8
The stats on haproxy.org reported ~12k GOAWAY for ~34k connections, with
only 2 protocol errorss. It turns out that the GOAWAY frame counter added
in commit a8879238c ("MINOR: mux-h2: report detected error on stats")
matches a bit too many situations. First it counts those which are not
sent as well as failed retries, second it counts as errors the cases of
attempts to cleanly close, while it's titled "GOAWAY sent on detected
error". Let's address this by moving the counter up one line and excluding
the clean codes.
This can be backported to 2.3.
A number of traces could be added, and a few TRACE_PROTO were replaced
with TRACE_ERROR. The goal is to be able to enable error tracing only
to detect anomalies.
It looks like they're mostly correct as they don't seem to strike on
valid H2 traffic but are very verbose on h2spec.
Sometimes it would be nice to be able to only trace abnormal events such
as protocol errors. Let's add a new "error" level below the "user" level
for this. This will allow to add TRACE_ERROR() at various error points
and only see them.
Since commit a8879238c ("MINOR: mux-h2: report detected error on stats")
we now have some error stats on stream/connection level protocol errors,
but some were improperly marked as stream while they're connection, and
2 or 3 relevant ones were missing and have now been added.
This could be backported to 2.3.
This patch adds a new logging variable '%HPO' for logging HTTP path only
(without query string) from relative or absolute URI.
For example:
log-format "hpo=%HPO hp=%HP hu=%HU hq=%HQ"
GET /r/1 HTTP/1.1
=>
hpo=/r/1 hp=/r/1 hu=/r/1 hq=
GET /r/2?q=2 HTTP/1.1
=>
hpo=/r/2 hp=/r/2 hu=/r/2?q=2 hq=?q=2
GET http://host/r/3 HTTP/1.1
=>
hpo=/r/3 hp=http://host/r/3 hu=http://host/r/3 hq=
GET http://host/r/4?q=4 HTTP/1.1
=>
hpo=/r/4 hp=http://host/r/4 hu=http://host/r/4?q=4 hq=?q=4
Released version 2.4-dev2 with the following main changes :
- BUILD: Make DEBUG part of .build_opts
- BUILD: Show the value of DEBUG= in haproxy -vv
- CI: Set DEBUG=-DDEBUG_STRICT=1 in GitHub Actions
- MINOR: stream: Add level 7 retries on http error 401, 403
- CLEANUP: remove unused function "ssl_sock_is_ckch_valid"
- BUILD: SSL: add BoringSSL guarding to "RAND_keep_random_devices_open"
- BUILD: SSL: do not "update" BoringSSL version equivalent anymore
- BUG/MEDIUM: http_act: Restore init of log-format list
- DOC: better describes how to configure a fallback crt
- BUG/MAJOR: filters: Always keep all offsets up to date during data filtering
- MINOR: cache: Prepare helper functions for Vary support
- MEDIUM: cache: Add the Vary header support
- MINOR: cache: Add a process-vary option that can enable/disable Vary processing
- BUG/CRITICAL: cache: Fix trivial crash by sending accept-encoding header
- BUG/MAJOR: peers: fix partial message decoding
- DOC: cache: Add new caching limitation information
- DOC: cache: Add information about Vary support
- DOC: better document the config file format and escaping/quoting rules
- DOC: Clarify %HP description in log-format
- CI: github actions: update LibreSSL to 3.3.0
- CI: github actions: enable 51degrees feature
- MINOR: fd/threads: silence a build warning with threads disabled
- BUG/MINOR: tcpcheck: Don't forget to reset tcp-check flags on new kind of check
- MINOR: tcpcheck: Don't handle anymore in-progress send rules in tcpcheck_main
- BUG/MAJOR: tcpcheck: Allocate input and output buffers from the buffer pool
- MINOR: tcpcheck: Don't handle anymore in-progress connect rules in tcpcheck_main
- MINOR: config: Deprecate and ignore tune.chksize global option
- MINOR: config: Add a warning if tune.chksize is used
- REORG: tcpcheck: Move check option parsing functions based on tcp-check
- MINOR: check: Always increment check health counter on CONPASS
- MINOR: tcpcheck: Add support of L7OKC on expect rules error-status argument
- DOC: config: Make disable-on-404 option clearer on transition conditions
- DOC: config: Move req.hdrs and req.hdrs_bin in L7 samples fetches section
- BUG/MINOR: http-fetch: Fix smp_fetch_body() when called from a health-check
- MINOR: plock: use an ARMv8 instruction barrier for the pause instruction
- MINOR: debug: add "debug dev sched" to stress the scheduler.
- MINOR: debug: add a trivial PRNG for scheduler stress-tests
- BUG/MEDIUM: lists: Lock the element while we check if it is in a list.
- MINOR: task: remove tasklet_insert_into_tasklet_list()
- MINOR: task: perform atomic counter increments only once per wakeup
- MINOR: task: remove __tasklet_remove_from_tasklet_list()
- BUG/MEDIUM: task: close a possible data race condition on a tasklet's list link
- BUG/MEDIUM: local log format regression.
Since 2.3 default local log format always adds hostame field.
This behavior change was due to log/sink re-work, because according
to rfc3164 the hostname field is mandatory.
This patch re-introduce a legacy "local" format which is analog
to rfc3164 but with hostname stripped. This is the new
default if logs are generated by haproxy.
To stay compliant with previous configurations, the option
"log-send-hostname" acts as if the default format is switched
to rfc3164.
This patch addresses the github issue #963
This patch should be backported in branches >= 2.3.
In issue #958 Ashley Penney reported intermittent crashes on AWS's ARM
nodes which would not happen on x86 nodes. After investigation it turned
out that the Neoverse N1 CPU cores used in the Graviton2 CPU are much
more aggressive than the usual Cortex A53/A72/A55 or any x86 regarding
memory ordering.
The issue that was triggered there is that if a tasklet_wakeup() call
is made on a tasklet scheduled to run on a foreign thread and that
tasklet is just being dequeued to be processed, there can be a race at
two places:
- if MT_LIST_TRY_ADDQ() happens between MT_LIST_BEHEAD() and
LIST_SPLICE_END_DETACHED() if the tasklet is alone in the list,
because the emptiness tests matches ;
- if MT_LIST_TRY_ADDQ() happens during LIST_DEL_INIT() in
run_tasks_from_lists(), then depending on how LIST_DEL_INIT() ends
up being implemented, it may even corrupt the adjacent nodes while
they're being reused for the in-tree storage.
This issue was introduced in 2.2 when support for waking up remote
tasklets was added. Initially the attachment of a tasklet to a list
was enough to know its status and this used to be stable information.
Now it's not sufficient to rely on this anymore, thus we need to use
a different information.
This patch solves this by adding a new task flag, TASK_IN_LIST, which
is atomically set before attaching a tasklet to a list, and is only
removed after the tasklet is detached from a list. It is checked
by tasklet_wakeup_on() so that it may only be done while the tasklet
is out of any list, and is cleared during the state switch when calling
the tasklet. Note that the flag is not set for pure tasks as it's not
needed.
However this introduces a new special case: the function
tasklet_remove_from_tasklet_list() needs to keep both states in sync
and cannot check both the state and the attachment to a list at the
same time. This function is already limited to being used by the thread
owning the tasklet, so in this case the test remains reliable. However,
just like its predecessors, this function is wrong by design and it
should probably be replaced with a stricter one, a lazy one, or be
totally removed (it's only used in checks to avoid calling a possibly
scheduled event, and when freeing a tasklet). Regardless, for now the
function exists so the flag is removed only if the deletion could be
done, which covers all cases we're interested in regarding the insertion.
This removal is safe against a concurrent tasklet_wakeup_on() since
MT_LIST_DEL() guarantees the atomic test, and will ultimately clear
the flag only if the task could be deleted, so the flag will always
reflect the last state.
This should be carefully be backported as far as 2.2 after some
observation period. This patch depends on previous patch
"MINOR: task: remove __tasklet_remove_from_tasklet_list()".
This function is only used at a single place directly within the
scheduler in run_tasks_from_lists() and it really ought not be called
by anything else, regardless of what its comment says. Let's delete
it, move the two lines directly into the call place, and take this
opportunity to factor the atomic decrement on tasks_run_queue. A comment
was added on the remaining one tasklet_remove_from_tasklet_list() to
mention the risks in using it.
In process_runnable_tasks(), we walk the run queue and pick tasks to
insert them into the local list. And for each of these operations we
perform a few increments, some of which are atomic, and they're even
performed under the runqueue's lock. This is useless inside the loop,
better do them at the end, since we don't use these values inside the
loop and they're not used anywhere else either during this time. The
only one is task_list_size which is accessed in parallel by other
threads performing remote tasklet wakeups, but it's already
approximative and is used to decide to get out of the loop when the
limit is reached. So now we compute it first as an initial budget
instead.
This function is only called at a single place and adds more confusion
than it removes. It also makes one think it could be used outside of
the scheduler while it must absolutely not. Let's just move its two
lines to the call place, making the code more readable there. In
addition this clearly shows that the preliminary LIST_INIT() is
useless since the entry is immediately overwritten.
In MT_LIST_TRY_ADDQ() and MT_LIST_TRY_ADD() we can't just check if the
element is already in a list, because there's a small race condition, it
could be added between the time we checked, and the time we actually set
its next and prev, so we have to lock it first.
This is required to address issue #958.
This should be backported to 2.3, 2.2 and 2.1.
Commit a5a447984 ("MINOR: debug: add "debug dev sched" to stress the
scheduler.") doesn't scale with threads because ha_random64() takes care
of being totally thread-safe for use with UUIDs. We don't need this for
the stress-testing functions, let's just implement a xorshift PRNG
instead. On 8 threads the performance jumped from 230k ctx/s with 96%
spent in ha_random64() to 14M ctx/s.
This command supports starting a bunch of tasks or tasklets, either on the
current thread (mask=0), all (default), or any set, either single-threaded
or multi-threaded, and possibly auto-scheduled.
These tasks/tasklets will randomly pick another one to wake it up. The
tasks only do it 50% of the time while tasklets always wake two tasks up,
in order to achieve roughly 50% load (since the target might already be
woken up).
As suggested by @AGSaidi in issue #958, on ARMv8 its convenient to use
an "isb" instruction in pl_cpu_relax() to improve fairness. Without it
I've met a few watchdog conditions on valid locks with 16 threads,
indicating that some threads couldn't manage to get it in 2 seconds. I
never happened again with it. In addition, the performance increased
by slightly more than 5% thanks to the reduced contention.
This should be backported as far as 2.2, possibly even 2.0.
res.body may be called from a health-check. It is probably never used. But it is
possibe. In such case, there is no channel. Thus we must not use it
unconditionally to set the flag SMP_F_MAY_CHANGE on the smp.
Now the condition test the channel first. In addtion, the flag is not set if the
payload is fully received.
This patch must be backported as far as 2.2.
L7OKC may now be used as an error status for an HTTP/TCP expect rule. Thus
it is for instance possible to write:
option httpchk GET /isalive
http-check expect status 200,404
http-check expect status 200 error-status L7OKC
It is more or less the same than the disable-on-404 option except that if a
DOWN is up again but still replying a 404 will be set to NOLB state. While
it will stay in DOWN state with the disable-on-404 option.
Regarding the health counter, a check finished with the CONDPASS result is
now the same than with the PASSED result: The health counter is always
incemented. Before, it was only performed is the health counter was not 0.
There is no change for the disable-on-404 option because it is only
evaluated for running or stopping servers. So with an health check counter
greater than 0. But it will make possible to handle (STOPPED -> STOPPING)
transition for servers.
The parsing of the check options based on tcp-check rules (redis, spop,
smtp, http...) are moved aways from check.c. Now, these functions are placed
in tcpcheck.c. These functions are only related to the tcpcheck ruleset
configured on a proxy and not to the health-check attached to a server.
This option is now deprecated. It is recent, but it is now marked as
deprecated as far as 2.2. Thus, there is now a warning in the 2.4 if this
option is still used. It will be removed in 2.5.
Becaue the 2.3 is quite new, this patch may be backported to 2.3.
This option is now ignored because I/O check buffers are now allocated using the
buffer pool. Thus, it is marked as deprecated in the documentation and ignored
during the configuration parsing. The field is also removed from the global
structure.
Because this option is ignored since a recent fix, backported as fare as 2.2,
this patch should be backported too. Especially because it updates the
documentation.
The special handling of in-progress connect rules at the begining of
tcpcheck_main() function can be removed. Instead, at the begining of the
tcpcheck_eval_connect() function, we test is there is already an existing
connection. In this case, it means we are waiting for a connection
establishment. In addition, before evaluating a new connect rule, we take
care to release any previous connection.
Historically, the input and output buffers of a check are allocated by hand
during the startup, with a specific size (not necessarily the same than
other buffers). But since the recent refactoring of the checks to rely
exclusively on the tcp-checks and to use the underlying mux layer, this part
is totally buggy. Indeed, because these buffers are now passed to a mux,
they maybe be swapped if a zero-copy is possible. In fact, for now it is
only possible in h2_rcv_buf(). Thus the bug concretely only exists if a h2
health-check is performed. But, it is a latent bug for other muxes.
Another problem is the size of these buffers. because it may differ for the
other buffer size, it might be source of bugs.
Finally, for configurations with hundreds of thousands of servers, having 2
buffers per check always allocated may be an issue.
To fix the bug, we now allocate these buffers when required using the buffer
pool. Thus not-running checks don't waste memory and muxes may swap them if
possible. The only drawback is the check buffers have now always the same
size than buffers used by the streams. This deprecates indirectly the
"tune.chksize" global option.
In addition, the http-check regtest have been update to perform some h2
health-checks.
Many thanks to @VigneshSP94 for its help on this bug.
This patch should solve the issue #936. It relies on the commit "MINOR:
tcpcheck: Don't handle anymore in-progress send rules in tcpcheck_main".
Both must be backport as far as 2.2.
bla
The special handling of in-progress send rules at the begining of
tcpcheck_main() function can be removed. Instead, at the begining of the
tcpcheck_eval_send() function, we test is there is some data in the output
buffer. In this case, it means we are evaluating an unfinished send rule and
we can jump to the sending part, skipping the formatting part.
This patch is mandatory for a major fix on the checks and must be backported
as far as 2.2.
When a new kind of check is found during the parsing of a proxy section (via
an option directive), we must reset tcpcheck flags for this proxy. It is
mandatory to not inherit some flags from a previously declared check (for
instance in the default section).
This patch must be backported as far as 2.2.
Building with gcc-9.3.0 without threads may result in this warning:
In file included from include/haproxy/api-t.h:36,
from include/haproxy/api.h:33,
from src/fd.c:90:
src/fd.c: In function 'updt_fd_polling':
include/haproxy/fd.h:507:11: warning: array subscript 63 is above array bounds of 'int[1]' [-Warray-bounds]
507 | DISGUISE(write(poller_wr_pipe[tid], &c, 1));
include/haproxy/compiler.h:92:41: note: in definition of macro 'DISGUISE'
92 | #define DISGUISE(v) ({ typeof(v) __v = (v); ALREADY_CHECKED(__v); __v; })
| ^
src/fd.c:113:5: note: while referencing 'poller_wr_pipe'
113 | int poller_wr_pipe[MAX_THREADS]; // Pipe to wake the threads
| ^~~~~~~~~~~~~~
gcc is wrong but this time it cannot be blamed because it doesn't know
that the FD's thread_mask always has at least one bit set. Let's add
the test for all_threads_mask there. It will also remove that test and
drop the else block.
%HP is used to report HTTP request URI in logs, which might be relative
or absolute. Description in documentation should not suggest that it
behaves exactly the same as "path" sample fetch.
This is even more important after 30ee1efe67
because right now, when HTTP2 is a standard, %HP usually returns absolute
URI.
This might be backported as far as 2.1
It's always a pain to figure how to proceed when special characters need
to be embedded inside arguments of an expression. Let's document the
configuration file format and how unquoting/unescaping works at each
level (top level and argument level) so that everyone hopefully finds
suitable reminders or examples for complex cases.
This is related to github issue #200 and addresses issues #712 and #966.
Responses that do not have an explicit expiration time or a validator
will not be cached anymore.
Must be backported if cc9bf2e ("MEDIUM: cache: Change caching
conditions") is backported.
Another bug in the peers message parser was uncovered by last commit
1dfd4f106 ("BUG/MEDIUM: peers: fix decoding of multi-byte length in
stick-table messages"): the function return on incomplete message does
not check if the channel has a pending close before deciding to return
0. It did not hurt previously because the loop calling co_getblk() once
per character would have depleted the buffer and hit the end, causing
<0 to be returned and matching the condition. But now that we process
at once what is available this cannot be relied on anymore and it's
now clearly visible that the final check is missing.
What happens when this strikes is that if a peer connection breaks in
the middle of a message, the function will return 0 (missing data) but
the caller doesn't check for the closed buffer, subscribes to reads,
and the applet handler is immediately called again since some data are
still available. This is detected by the loop prevention and the process
dies complaining that an appctx is spinning.
This patch simply adds the check for closed channel. It must be
backported to the same versions as the fix above.
Since commit 3d08236cb3 HAProxy can be trivially
crashed remotely by sending an `accept-encoding` HTTP request header that
contains 16 commas.
This is because the `values` array in `accept_encoding_normalizer` accepts only
16 entries and it is not verified whether the end is reached during looping.
Fix this issue by checking the length. This patch also simplifies the ist
processing in the loop, because it manually calculated offsets and lengths,
when the ist API exposes perfectly safe functions to advance and truncate ists.
I wonder whether the accept_encoding_normalizer function is able to re-use some
existing function for parsing headers that may contain lists of values. I'll
leave this evaluation up to someone else, only patching the obvious crash.
This commit is 2.4-dev specific and was merged just a few hours ago. No
backport needed.
The cache section's process-vary option takes a 0 or 1 value to disable
or enable the vary processing.
When disabled, a response containing such a header will never be cached.
When enabled, we will calculate a preliminary hash for a subset of request
headers on all the incoming requests (which might come with a cpu cost) which
will be used to build a secondary key for a given request (see RFC 7234#4.1).
The default value is 0 (disabled).