As shown by previous patch series, having to free some common proxy
struct members twice (in free_proxy() and proxy_free_defaults()) is
error-prone: we often overlook one of the two free locations when
adding new features.
To prevent such bugs from being introduced in the future, and also avoid
code duplication, we now have a proxy_free_common() function to free all
proxy struct members that are common to all proxy types (either regular or
default ones).
This should greatly improve code maintenance related to proxy freeing
logic.
proxy header_unique_id wasn't cleaned up in proxy_free_defaults(),
resulting in small memory leak if "unique-id-header" was used on a
default proxy section.
It may be backported to all stable versions.
proxy conn_src.iface_name was only freed in proxy_free_defaults(), whereas
proxy conn_src.bind_hdr_name was only freed in free_proxy().
Because of that, using "source usesrc hdr_ip()" in a default proxy, or
"source interface" in a regular or default proxy would cause memory leaks
during deinit.
It may be backported to all stable versions.
proxy dyncookie_key wasn't cleaned up in free_proxy(), resulting in small
memory leak if "dynamic-cookie-key" was used on a regular or default
proxy.
It may be backported to all stable versions.
proxy check_{command,path} members (used for "external-check" feature)
weren't cleaned up in free_proxy(), resulting in small memory leak if
"external-check command" or "external-check path" were used on a regular
or default proxy.
It may be backported to all stable versions.
proxy email-alert settings weren't cleaned up in free_proxy(), resulting
in small memory leak if "email-alert to" or "email-alert from" were used
on a regular or default proxy.
It may be backported to all stable versions.
proxy log_tag wasn't cleaned up in free_proxy(), resulting in small
memory leak if "log-tag" was used on a regular or default proxy.
It may be backported to all stable versions.
proxy server_id_hdr_name member (used for "http-send-name-header" option)
wasn't cleaned up in free_proxy(), resulting in small memory leak if
"http-send-name-header" was used on a regular or default proxy.
This may be backported to all stable versions.
Warning message to indicate that the "http-send-name-header" option is
ignored for backend in "mode log" was referenced using its internal
struct wording instead of public name (as seen in the documentation).
Let's fix that.
It may be backported with c7783fb ("MINOR: log/backend: prevent
"http-send-name-header" use with LOG mode") in 2.9.
BoringSSL support is known to be broken since 2021, it was removed from
the CI at this time and never fixed.
(30ee2965b6)
Even the QUIC code for boringSSL was removed in 2022.
(e06f7459fa)
Instead of setting this flag on the ones used for the zero-copy negociation,
it is set on the connection flags used for xprt->rcv_buf()
call. Fortunately, there is no real consequence. The only visible effect is
the chunk size that is written on 8 bytes for no reason.
This patch is related to issue #2598. It must be backported to 3.0.
When data are transfered via zero-copy data forwarding, if some data were
already received, we try to immediately tranfer it during the negociation
step. If data are chunked and the chunk size is unknown, 10 bytes are reserved
to write the chunk size during the done step. However, when input data are
finally transferred, the offset is ignored. Data are copied into the output
buffer. But the first 10 bytes are then crushed by the chunk size. Thus the
chunk is truncated leading to a malformed message.
This patch should fix the issue #2598. It must be backported to 3.0.
This fixes an issue I've had where if a connection was idle for ~23s
it would get in a bad state. I don't understand this code, so I'm
not sure exactly why it was failing.
I discovered this by bisecting to identify the commit that caused the
regression between 2.9 and 3.0. The commit is
d2c3f8dde7: "MINOR: stconn/connection:
Move shut modes at the SE descriptor level" - a part of v3.0-dev8.
It seems to be an innocent renaming, so I looked through it and this
stood out as suspect:
- if (mode != CO_SHW_NORMAL)
+ if (mode & SE_SHW_NORMAL)
It looks like the not went missing here, so this patch reverses that
condition. It fixes my test.
I don't quite understand what this is doing or is for so I can't write
a regression test or decent commit message. Hopefully someone else
will be able to pick this up from where I've left it.
[CF: This inverts the condition to perform clean shutdowns. This means no
clean shutdown are performed when it should do. This patch must be
backported to 3.0]
quic_conn API for sending was recently refactored. The main objective
was to regroup the different functions present for both handshake and
application emission.
After this refactoring, an optimization was introduced to avoid calling
qc_send() if there was nothing new to emit. However, this prevent the Tx
buffer to be purged if previous sending was interrupted, until new
frames are finally available.
To fix this, simply remove the optimization. qc_send() is thus now
always called in quic_conn IO handlers.
The impact of this bug should be minimal as it happens only on sending
temporary error. However in this case, this could cause extra latency or
even a complete sending freeze in the worst scenario.
This must be backported up to 3.0.
qc_build_frms() is responsible to encode multiple frames in a single
QUIC packet. It accounts for room left in the buffer packet for each
newly encded frame.
An incorrect computation was performed when encoding a STREAM frame in a
single packet. Frame length was accounted twice which would reduce in
excess the buffer packet room. This caused the remaining built frames to
be reduced with the resulting packet not able to fill the whole MTU.
The impact of this bug should be minimal. It is only present when
multiple frames are encoded in a single packet after a STREAM. However
in this case datagrams built are smaller than expecting, which is
suboptimal for bandwith.
This should be backported up to 2.6.
The ClientHello callback for WolfSSL introduced in haproxy 2.9, seems
not to behave correctly with TLSv1.2.
In TLSv1.2, this is the cipher that is used to chose the authentication algorithm
(ECDSA or RSA), however an SSL client can send a signature algorithm.
In TLSv1.3, the authentication is not part of the ciphersuites, and
is selected using the signature algorithm.
The mistake in the code is that the signature algorithm in TLSv1.2 are
overwritting the auth that was selected using the ciphers.
This must be backported as far as 2.9.
The ClientHello Callback which is used for certificate selection uses
both the signature algorithms and the ciphers sent by the client.
However, when a client is announcing both ECDSA and RSA capabilities
with ECSDA ciphers that are not available on haproxy side and RSA
ciphers that are compatibles, the ECDSA certificate will still be used
but this will result in a "no shared cipher" error, instead of a
fallback on the RSA certificate.
For example, a client could send
'ECDHE-ECDSA-AES128-CCM:ECDHE-RSA-AES256-SHA and HAProxy could be
configured with only 'ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA'.
This patch fixes the issue by validating that at least one ECDSA cipher
is available on both side before chosing the ECDSA certificate.
This must be backported on all stable versions.
It may only happens when there is no data to forward but a last stream frame
must be sent with the FIN bit. It is not invalid, but it is useless to send
an empty H3 DATA frame in that case.
The previous fix (792a645ec2 ["BUG/MEDIUM: mux-quic: Unblock zero-copy
forwarding if the txbuf can be released"]) introduced a regression. The
zero-copy data forwarding must only be unblocked if it was blocked by the
producer, after a successful negotiation.
It is important because during a negotiation, the consumer may be blocked
for another reason. Because of the flow control for instance. In that case,
there is not necessarily a TX buffer. And it unexpected to try to release an
unallocated TX buf.
In addition, the same may happen while a TX buf is still in-use. In that
case, it must also not be released. So testing the TX buffer is not the
right solution.
To fix the issue, a new IOBUF flag was added (IOBUF_FL_FF_WANT_ROOM). It
must be set by the producer if it is blocked after a sucessful negotiation
because it needs more room. In that case, we know a buffer was provided by
the consummer. In done_fastfwd() callback function, it is then possible to
safely unblock the zero-copy data forwarding if this flag is set.
This patch must be backported to 3.0 with the commit above.
'lua_insert(lua->T, -lua_gettop(lua->T))' is actually used to rotate the
top value with the bottom one, thus the code was overkill and the comment
was actually misleading, let's fix that by using explicit equivalent form
(absolute index).
It may be backported with 5508db9a2 ("BUG/MINOR: hlua: fix unsafe
lua_tostring() usage with empty stack") to all stable versions to ease
code maintenance.
in hlua_ckch_commit_yield() and hlua_ckch_set(), when an error occurs,
we enter the error path and try to raise an error from the <err> msg
pointer which must be freed afterwards.
However, the fact that luaL_error() never returns was overlooked, because
of that <err> msg is never freed in such case.
To fix the issue, let's use hlua_pushfstring_safe() helper to push the
err on the lua stack and then free it before throwing the error using
lua_error().
It should be backported up to 2.6 with 30fcca18 ("MINOR: ssl/lua:
CertCache.set() allows to update an SSL certificate file")
Thanks to the previous commit, we may now assume that hlua_traceback()
won't LJMP, so it's safe to use it from unprotected environment without
any precautions.
Function is often used on error paths where no precaution is taken
against LJMP. Since the function is used on error paths (which include
out-of-memory error paths) the function lua_getinfo() could also raise
a memory exception, causing the process to crash or improper error
handling if the caller isn't prepared against that eventually. Since the
function is only used on rare events (error handling) and is lacking the
__LJMP prototype pefix, let's make it safe by protecting the lua_getinfo()
call so that hlua_traceback() callers may use it safely now (the function
will always succeed, output will be truncated in case of error).
This could be backported to all stable versions.
Following previous commit's logic: hlua_pusherror() is mainly used
from cleanup paths where the caller isn't protected against LJMPs.
Caller was tempted to think that the function was safe because func
prototype was lacking the __LJMP prefix.
Let's make the function really LJMP-safe by wrapping the sensitive calls
under lua_pcall().
This may be backported to all stable versions.
lua_pushfstring() is used in multiple cleanup paths (upon error) to
push the error message that will be raised by lua_error(). However this
is often done from an unprotected environment, or in the middle of a
cleanup sequence, thus we don't want the function to LJMP! (it may cause
various issues ranging from memory leaks to crashing the process..)
Hopefully this has very few chances of happening but since the use of
lua_pushfstring() is limited to error reporting here, it's ok to use our
own hlua_pushfstring_safe() implementation with a little overhead to
ensure that the function will never LJMP.
This could be backported to all stable versions.
In hlua_map_new(), when error occurs we use a combination of luaL_where,
lua_pushfstring and lua_concat to build the error string before calling
lua_error().
It turns out that we already have the hlua_pusherror() macro which is
exactly made for that purpose so let's use it.
It could be backported to all stable versions to ease code maintenance.
Ensure idle_timer task is allocated in qc_kill_conn() before waking it
up. It can be NULL if idle timer has already fired but MUX layer is
still present, which prevents immediate quic_conn release.
qc_kill_conn() is only used on send() syscall fatal error to notify
upper layer of an error and close the whole connection asap.
This crash occurence is pretty rare as it relies on timing issues. It
happens only if idle timer occurs before the MUX release (a bigger
client timeout is thus required) and any send() syscall detected error.
For now, it was only reproduced using GDB to interrupt haproxy longer
than the idle timeout.
This should be backported up to 2.6.
In done_fastfwd() callback function, if nothing was forwarding while the SD
is blocked, it means there is not enough space in the buffer to proceed. It
may be because there are data to be sent. But it may also be data already
sent waiting for an ack. In this case, no data to be sent by the mux. So the
quic stream is not woken up when data are finally removed from the
buffer. The data forwarding can thus be stuck. This happens when the stats
page is requested in QUIC/H3. Only applets are affected by this issue and
only with the QUIC multiplexer because it is the only mux with already sent
data in the TX buf.
To fix the issue, the idea is to release the txbuf if possible and then
unblock the SD to perform a new zero-copy data forwarding attempt. Doing so,
and thanks to the previous patch ("MEDIUM: applet: Be able to unblock
zero-copy data forwarding from done_fastfwd"), the applet will be woken up.
This patch should fix the issue #2584. It must be backported to 3.0.
This part is only experienced by applet. When an applet try to forward data
via an iobuf, it may decide to block for any reason even if there is free
space in the buffer. For instance, the stats applet don't procude data if
the buffer is almost full.
However, in this case, it could be good to let the consumer decide a new
attempt is possible because more space was made. So, if IOBUF_FL_FF_BLOCKED
flag is removed by the consumer when done_fastfwd() callback function is
called, the SE_FL_WANT_ROOM flag is removed on the producer sedesc. It is
only done for applets. And thanks to this change, the applet can be woken up
for a new attempt.
This patch is required for a fix on the QUIC multiplexer.
Interim responses are by definition bodyless. But we must not set the
corresponding HTX start-line flag, beecause the start-line of the final
response is still expected. Setting the flag above too early may lead the
multiplexer on the sending side to consider the message is finished after
the headers of the interim message.
It happens with the H2 multiplexer on frontend side if a "100-Continue" is
received from the server. The interim response is sent and
HTX_SL_F_BODYLESS_RESP flag is evaluated. Then, the headers of the final
response are sent with ES flag, because HTX_SL_F_BODYLESS_RESP flag was seen
too early, leading to a protocol error if the response has a body.
Thanks to grembo for this analysis.
This patch should fix the issue #2587. It must be backported as far as 2.9.
CLASS_CERTCACHE is used to declare CertCache global object, not Regex one
This copy-paste typo introduced was in 30fcca18 ("MINOR: ssl/lua:
CertCache.set() allows to update an SSL certificate file")
Using CertCache.set() from init context wasn't explicitly supported and
caused the process to crash:
crash.lua:
core.register_init(function()
CertCache.set{filename="reg-tests/ssl/set_cafile_client.pem", ocsp=""}
end)
crash.conf:
global
lua-load crash.lua
listen front
bind localhost:9090 ssl crt reg-tests/ssl/set_cafile_client.pem ca-file reg-tests/ssl/set_cafile_interCA1.crt verify none
./haproxy -f crash.conf
[NOTICE] (267993) : haproxy version is 3.0-dev2-640ff6-910
[NOTICE] (267993) : path to executable is ./haproxy
[WARNING] (267993) : config : missing timeouts for proxy 'front'.
| While not properly invalid, you will certainly encounter various problems
| with such a configuration. To fix this, please ensure that all following
| timeouts are set to a non-zero value: 'client', 'connect', 'server'.
[1] 267993 segmentation fault (core dumped) ./haproxy -f crash.conf
This is because in hlua_ckch_set/hlua_ckch_commit_yield, we always
consider that we're being called from a yield-capable runtime context.
As such, hlua_gethlua() is never checked for NULL and we systematically
try to wake hlua->task and yield every 10 instances.
In fact, if we're called from the body or init context (that is, during
haproxy startup), hlua_gethlua() will return NULL, and in this case we
shouldn't care about yielding because it is ok to commit all instances
at once since haproxy is still starting up.
Also, when calling CertCache.set() from a non-yield capable runtime
context (such as hlua fetch context), we kept doing as if the yield
succeeded, resulting in unexpected function termination (operation
would be aborted and the CertCache lock wouldn't be released). Instead,
now we explicitly state in the doc that CertCache.set() cannot be used
from a non-yield capable runtime context, and we raise a runtime error
if it is used that way.
These bugs were discovered by reading the code when trying to address
Svace report documented by @Bbulatov GH #2586.
It should be backported up to 2.6 with 30fcca18 ("MINOR: ssl/lua:
CertCache.set() allows to update an SSL certificate file")
As reported by @Bbulatov in GH #2586, stktable_data_ptr() return value is
used without checking it isn't NULL first, which may happen if the given
type is invalid or not stored in the table.
However, since date_type is set by table_prepare_data_request() right
before cli_io_handler_table() is invoked, date_type is not expected to
be invalid: table_prepare_data_request() normally checked that the type
is stored inside the table. Thus stktable_data_ptr() should not be failing
at this point, so we add a BUG_ON() to indicate that.
In ticket #785, people are still confused about how to use the crt-store
load parameters in a crt-list.
This patch adds an example.
This must be backported in 3.0
In GH issue #2586 @Bbulatov reported a theoretical null-deref in
env_expand() in case there's no memory anymore to expand an environment
variable. The function should return NULL in this case so that the only
caller (str2sa_range) sees it. In practice it may only happen during
boot thus is harmless but better fix it since it's easy. This can be
backported to all versions where this applies.
When parsing tcp-check expect-header, a copy-paste error in the error
message causes the name of the header to be reporetd as the invalid
format string instead of its value. This is really harmless but should
be backported to all versions to help users understand the cause of the
problem when this happens. This was reported in GH issue #2586 by
@Bbulatov.
In GH issue #2586 @Bbulatov reported a bug where the http-check
send-state flag is removed from options instead of options2 when
http-check is disabled. It only has an effect when this option is
set and http-check disabled, where it displays a warning indicating
this will be ignored. The option removed instead is srvtcpka when
this happens. It's likely that both options being so minor, nobody
ever faced it.
This can be backported to all versions.
This patch removes the old README file and replaces it with a more
modern markdown version which allows clickable links on the github page.
It also adds some of the Github Actions worfklow Status.
This patch includes the HAProxy png in the doc directory.