During zero-copy forwarding, we first try to forward data found in the input
buffer before trying to receive more data. These data must be subtracted from
the amount of data to forward (the count variable).
Otherwise, on an internal retry in h1_fastfwd(), we can be led to read
more data than expected. This is especially a problem at the end of a
chunk: an error is erroneously reported because more data than announced are
received.
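The accounting described above can be sketched as follows. This is only an
illustration, not the h1_fastfwd() code: struct ex_fwd_state and
ex_fastfwd_remaining() are hypothetical names, and the input buffer is reduced
to a byte counter.

```c
#include <stddef.h>

/* Hypothetical, simplified state: not HAProxy's real structures. */
struct ex_fwd_state {
    size_t buffered;   /* bytes already waiting in the input buffer */
    size_t forwarded;  /* total bytes handed over to the consumer */
};

/* Forward at most <count> bytes, draining the input buffer first.
 * Returns how many bytes may still be received afterwards.
 */
size_t ex_fastfwd_remaining(struct ex_fwd_state *st, size_t count)
{
    size_t from_buf = st->buffered < count ? st->buffered : count;

    st->buffered  -= from_buf;
    st->forwarded += from_buf;
    count         -= from_buf;   /* the subtraction that was missing */
    return count;
}
```

Without the last subtraction, a retry would still try to receive the full
amount even though part of it was already served from the buffer.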
This patch should fix the issue #2382. It must be backported to 2.9.
When the producer side (h1 for now) negotiates with the consumer side to
perform a zero-copy forwarding, we now consider the consumer side as blocked
if it is closed and this was reported to the SE via an end-of-stream or a
(pending) error.
It is performed before calling the ->nego_ff callback function, in se_nego_ff().
This way, all consumers are covered automatically. The aim of this patch is
to fix an issue with the QUIC mux. Indeed, it is unexpected to send a frame
on a closed stream. This triggers a BUG_ON(). Other muxes are not affected
but it remains useless to try to send data if the stream is closed.
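A minimal sketch of this early check, assuming a reduced consumer descriptor.
The EX_FL_* flag values, struct ex_consumer and ex_nego_ff() are hypothetical
and only mirror the idea; they are not the real SE flags or se_nego_ff().

```c
#include <stddef.h>

/* Hypothetical flag values, for illustration only. */
#define EX_FL_EOS          0x1u  /* end of stream reported */
#define EX_FL_ERROR        0x2u  /* error reported */
#define EX_FL_ERR_PENDING  0x4u  /* pending error reported */

struct ex_consumer {
    unsigned int flags;
    /* Per-mux negotiation callback, returns the number of bytes accepted. */
    size_t (*nego_ff)(struct ex_consumer *c, size_t count);
    int blocked;
};

/* Example callback of a consumer that accepts everything. */
size_t ex_accept_all(struct ex_consumer *c, size_t count)
{
    (void)c;
    return count;
}

/* Common entry point: a closed consumer is reported as blocked and its
 * callback is never called, whatever the mux.
 */
size_t ex_nego_ff(struct ex_consumer *c, size_t count)
{
    if (c->flags & (EX_FL_EOS | EX_FL_ERROR | EX_FL_ERR_PENDING)) {
        c->blocked = 1;
        return 0;
    }
    return c->nego_ff(c, count);
}
```

Doing the test in the common entry point rather than in each callback is what
makes every consumer benefit from it.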
This patch should fix the issue #2372. It must be backported to 2.9.
This bug arrived with this commit:
BUG/MINOR: quic: Wrong RETIRE_CONNECTION_ID sequence number chec
Every connection ID manipulation against the per-thread trees used to store the
connection IDs must be done under the trees' locks. These trees are accessed by
the low level connection identification code.
When receiving a RETIRE_CONNECTION_ID frame, the concerned connection ID must
be deleted from its underlying per-thread tree, but not without locking!
Add a WR lock around the ebmb_delete() call to do so.
Must be backported as far as 2.7.
Similarly to H3, hq-interop now uses zero-copy when dealing with an HTX
message with only a single data block. Exchange HTX and QCS buffer, and
use the HTX data block for HTTP payload. This is only possible if QCS
buffer is empty. Contrary to HTTP/3, no extra frame header is needed
before transferring HTTP payload.
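The buffer exchange can be pictured with the sketch below. struct ex_buf and
ex_xfer_zero_copy() are hypothetical stand-ins: the real code works on HTX
messages and QCS buffers and also has to point at the HTX data block, which
is not modeled here.

```c
#include <stddef.h>

/* Hypothetical, reduced buffer descriptor. */
struct ex_buf {
    char  *area;  /* storage */
    size_t size;  /* storage size */
    size_t data;  /* bytes currently stored */
};

/* Swap the source and destination buffers instead of copying the payload.
 * Only possible when the destination holds no data. Returns 1 if the swap
 * was done, 0 if a regular copy is required.
 */
int ex_xfer_zero_copy(struct ex_buf *src, struct ex_buf *dst)
{
    struct ex_buf tmp;

    if (dst->data != 0)
        return 0;

    tmp  = *dst;
    *dst = *src;
    *src = tmp;
    return 1;
}
```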
hq-interop is only implemented for testing purposes, so this change should
not be noticeable by users. However, it will be useful to be able to
test zero-copy transfer in QUIC interop testing.
Adjust HTTP/3 data emission. First, add HTX as an argument to the function,
as is already done for the other frame emission functions. Keep the buffer
argument as it is mandatory for zero-copy. Extend the related comments,
in particular to explain the purposes of both the HTX and buffer
arguments.
No functional change here. This should however be useful to port
equivalent code to the hq-interop protocol.
Add data level traces for each encoded H3 frame. Of notable interest,
traces will be useful to detect if standard emission, zero-copy or fast
forward is used. Also add the generic filter H3_EV_TX_FRAME to be able
to filter these messages.
When dealing with HTTP/1 responses with neither Content-Length nor chunked
encoding, flag QC_SF_UNKNOWN_PL_LENGTH is set on QCS. This prevents the
emission of a RESET_STREAM on shutw, instead resorting to a proper FIN
emission.
This code was duplicated both in H3 and hq-interop. Move it to the common
qcs_http_snd_buf() to factorize it.
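The effect of the flag can be sketched as below. The EX_* names and values
are hypothetical, and the RESET_STREAM branch for streams without the flag is
inferred from the description above rather than taken from the mux code.

```c
/* Hypothetical flag value, for illustration only. */
#define EX_SF_UNKNOWN_PL_LENGTH 0x1u

enum ex_shut_action { EX_SEND_FIN, EX_SEND_RESET_STREAM };

/* Flags to set on the stream for a response, depending on how its
 * payload length is announced.
 */
unsigned int ex_payload_flags(int has_content_length, int is_chunked)
{
    if (!has_content_length && !is_chunked)
        return EX_SF_UNKNOWN_PL_LENGTH;
    return 0;
}

/* Action taken on shutw for a stream whose FIN was not sent yet. */
enum ex_shut_action ex_on_shutw(unsigned int flags)
{
    if (flags & EX_SF_UNKNOWN_PL_LENGTH)
        return EX_SEND_FIN;          /* the close is the end of payload */
    return EX_SEND_RESET_STREAM;     /* the payload was cut short */
}
```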
qcc_app_ops is a set of callbacks used to unify application protocols
running over QUIC. This commit introduces some changes to clarify its
API:
* write a simple comment describing the purpose of each callback
* rename decode_qcs to rcv_buf as this name is more common and is
similar to the already existing snd_buf
* move finalize up as it is used during the connection init stage
All these changes are ported to the HTTP/3 layer. Function comments
have also been extended to highlight HTTP/3 special characteristics.
This function is similar to the previous one, but this time for the QCS
sending buffer.
Previously, each application layer redefined its own version of
mux_get_buf() which was used to allocate <qcs.tx.buf>. Unify it under a
single function renamed qcc_get_stream_txbuf().
Replace the qcs_get_buf() function, whose naming does not reflect its
purpose. Add a new function qcc_get_stream_rxbuf() which allocates
<qcs.rx.app_buf> if needed and returns the buffer pointer. This function is
reserved for the application protocol layer. This buffer is then accessed by
the stconn layer.
The other qcs_get_buf() invocations, which were in effect used for a local
buffer, are replaced by a plain b_alloc().
Since 1de44da ("MINOR: ext-check: add an option to preserve environment
variables"), it is now possible to provide an extra argument to
"external-check" directive. This makes it possible to support the
"preserve-env" option which differs from the default behavior.
However a mistake was made, because the config parser doesn't allow the
default configuration anymore: using external-check without argument will
trigger an error:
'external-check' only supports 'preserve-env' as an argument, found ''.
This is due to a small mistake in the code that makes the check
systematically report an error if the first argument is not equal to
"preserve-env". The check was modified so that the error is only reported
if an argument is provided, so that the default behavior is restored.
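The corrected check boils down to the logic below. This is a standalone
sketch: ex_parse_external_check() is a hypothetical helper, not the actual
config parser function.

```c
#include <string.h>

/* Parse the optional argument of "external-check".
 * Returns 0 on success and sets <*preserve_env>, or -1 if an argument is
 * present but is not "preserve-env".
 */
int ex_parse_external_check(const char *arg, int *preserve_env)
{
    *preserve_env = 0;

    if (arg == NULL || *arg == '\0')
        return 0;                    /* no argument: default behavior */

    if (strcmp(arg, "preserve-env") == 0) {
        *preserve_env = 1;
        return 0;
    }
    return -1;                       /* unknown argument */
}
```

The faulty version behaved as if the first test were missing, so an empty
argument fell through to the error case.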
This should fix GH #2380 and should be backported to 2.9 and potentially
further (anywhere 1de44da is, because a note about an optional backport
up to 2.6 was left in the original commit message).
Some regressions were introduced by 5fea59754b ("MEDIUM: map/acl:
Accelerate several functions using pat_ref_elt struct ->head list")
pat_ref_delete_by_id() fails to properly unlink and free the removed
reference because it bypasses the pat_ref_delete_by_ptr() made for
that purpose. This function is normally used everywhere the target
reference is set for removal, such as the pat_ref_delete() function
that matches pattern against a string. The call was probably skipped
by accident during the rewrite of the function.
With the above commit also comes another undesirable change:
both pat_ref_delete_by_id() and pat_ref_set_by_id() directly use the
<refelt> argument as a valid pointer (they do dereference it).
This is wrong, because <refelt> is unsafe and should be handled as an
ID, not a pointer (hence the function name). Indeed, the calling function
may directly pass user input from the CLI as <refelt> argument, so we must
first ensure that it points to a valid element before using it, else it is
probably invalid and we shouldn't touch it.
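Treating the argument as an ID rather than a pointer can be sketched as
follows, on a simple singly linked list. struct ex_elt and both helpers are
hypothetical; the real code walks the pat_ref element list.

```c
#include <stddef.h>

/* Hypothetical, reduced reference element. */
struct ex_elt {
    struct ex_elt *next;
    int value;
};

/* Return the element whose address equals <id>, or NULL if none does.
 * <id> is only compared, never dereferenced.
 */
struct ex_elt *ex_find_by_id(struct ex_elt *head, const void *id)
{
    struct ex_elt *e;

    for (e = head; e != NULL; e = e->next) {
        if ((const void *)e == id)
            return e;
    }
    return NULL;
}

/* Update an element designated by an untrusted ID.
 * Returns 1 on success, 0 if the ID matches no known element.
 */
int ex_set_by_id(struct ex_elt *head, const void *id, int value)
{
    struct ex_elt *e = ex_find_by_id(head, id);

    if (e == NULL)
        return 0;
    e->value = value;
    return 1;
}
```

The lookup is linear, which is the cost the message accepts for these
CLI-only operations.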
What this patch essentially does is revert pat_ref_set_by_id()
and pat_ref_delete_by_id() to their pre-5fea59754b behavior. This seems to
be the only optimization from that patch that doesn't apply.
Fortunately, after reviewing the changes with Fred, it seems that the two
functions are only involved in commands for manipulating maps or
ACLs on the CLI, so the "missed" opportunity to improve their performance
shouldn't matter much. Nonetheless, if we wanted to speed up the reference
lookup by ID, we could consider adding an eb64 tree for that specific
purpose that contains all pattern reference IDs (i.e. pointers) so that
eb lookup functions may be used instead of a linear list search.
The issue was raised by Marko Juraga as he failed to perform an ACL
removal by reference on the CLI on 2.9, which was known to work properly
on other versions.
It should be backported to 2.9.
Co-Authored-by: Frédéric Lécaille <flecaille@haproxy.com>
The PR which makes it possible to choose a certificate depending on the
ciphers and the signature algorithms was merged in WolfSSL. Let's activate
this code.
This could be backported to 2.9 only once the next WolfSSL release is
available (5.6.5). It will also need a check on the version.
This bug impacts only the OpenSSL QUIC compatibility module (USE_QUIC_OPENSSL_COMPAT).
This may happen only when the TLS stack has to be provided with more than 1024+1+5+16
bytes of CRYPTO data. In this case several TLS records have to be built in one
call to SSL_provide_quic_data(). A 5-byte header is created at the head
of these records. This header is used as AAD to cipher the record. But
the length of this AAD was counted twice: once here in
quic_tls_compat_create_record() (initialization):
adlen = quic_tls_compat_create_header(qc, rec, ad, 0);
and a second time here in the same function after quic_tls_tls_seal() returns:
ret = aad_len + outlen;
This addition is useless. Note that this bug could be reproduced when haproxy has
to authenticate the client.
Thank you to @vifino for having reported this issue in GH #2381.
Must be backported to 2.8.
"set severity-output" is one of those commands that change the appctx
state so that the next commands are affected.
Unfortunately the master CLI works with pipelining and server close
mode, which means the connection between the master and the worker is
closed after each response, so for the next command this is a new appctx
state.
To fix the problem, two new flags are added, ACCESS_MCLI_SEVERITY_STR and
ACCESS_MCLI_SEVERITY_NB, which are used to prefix each command sent to
the worker with the right "set severity-output" command.
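A rough sketch of this prefixing is shown below. The EX_* flag values and
ex_prefix_cmd() are hypothetical, and the exact text and separator sent to
the worker are assumptions made for the example; only the "number" and
"string" modes of "set severity-output" come from the CLI itself.

```c
#include <stdio.h>

/* Hypothetical flag values, for illustration only. */
#define EX_MCLI_SEVERITY_NB   0x1u
#define EX_MCLI_SEVERITY_STR  0x2u

/* Build in <out> the command forwarded to the worker: <cmd> preceded by
 * the severity setting remembered on the master side, if any.
 * Returns the resulting length, or -1 if <out> is too small.
 */
int ex_prefix_cmd(char *out, size_t size, unsigned int flags, const char *cmd)
{
    const char *prefix = "";
    int len;

    if (flags & EX_MCLI_SEVERITY_NB)
        prefix = "set severity-output number;";
    else if (flags & EX_MCLI_SEVERITY_STR)
        prefix = "set severity-output string;";

    len = snprintf(out, size, "%s%s", prefix, cmd);
    if (len < 0 || (size_t)len >= size)
        return -1;
    return len;
}
```

Since the state lives in flags on the master side, it survives the close of
the connection to the worker after each response.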
This patch fixes issue #2350.
It could be backported as far as 2.6.
All QUIC MUX functions which are callbacks for the stream layer use the
prefix qmux_strm_*. This was not the case for the fast-forward related
callbacks, which only used the qmux_* prefix.
Fix this by reusing the standard prefix to respect the QUIC MUX code
convention.
Implement the fast-forwarding callback for hq-interop.
This change should not be considered as functionally important. Indeed,
HTTP/0.9 is reserved for QUIC interop testing and should not be used
outside of it. However, implementing fast forwarding in this context is
useful as it will allow testing the MUX code sections for fast forward via
QUIC interop.
rep_ssl_hello_type was renamed to res.ssl_hello_type a long time ago.
This patch fixes a typo where an example was renamed to
"rep.ssl_hello_type" instead of "res.ssl_hello_type".
This fixes issues #2377 and #2379.
Must be backported to all maintained versions.
This bugfix is the same as the following one:
"BUG/MINOR: ssl_ckch: Wrong OCSP CID after modifying an SSL certficate"
where the OCSP CID had to be reset when updating a certificate.
Must be backported to 2.8.
This bug could be reproduced with the "set ssl cert" CLI command to update
a certificate. The OCSP CID is duplicated by ckchs_dup() which calls
ssl_sock_copy_cert_key_and_chain(). It should be computed again by
ssl_sock_load_ocsp(). This may be accomplished by resetting the new ckch
OCSP CID returned by ckchs_dup().
This bug may be related to GH #2319.
Must be backported to 2.8.
This patch allows the cli_io_handler_commit_cert() callback, called upon
a "commit ssl cert ..." command, to prefix the messages returned by the CLI
with the ones built by ha_warning() and ha_alert().
It would be worth backporting this commit to 2.8.
This bug could be reproduced by loading several certificates from a "bind"
line, with "server_ocsp.pem" as argument to the "crt" setting, and updating
the ECDSA certificate with the RSA one as follows:
echo -e "set ssl cert reg-tests/ssl/ocsp_update/multicert/server_ocsp.pem.ecdsa \
<<\n$(cat reg-tests/ssl/ocsp_update/multicert/server_ocsp.pem.rsa)\n" | socat - /tmp/stats
followed by a "commit ssl cert reg-tests/ssl/ocsp_update/multicert/server_ocsp.pem.ecdsa"
command. This could be detected by libasan as follows:
=================================================================
==507223==ERROR: AddressSanitizer: attempting double-free on 0x60200007afb0 in thread T3:
#0 0x7fabc6fb5527 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x54527)
#1 0x7fabc6ae8f8c in ossl_asn1_string_embed_free (/opt/quictls/lib/libcrypto.so.81.3+0xd4f8c)
#2 0x7fabc6af54e9 in ossl_asn1_primitive_free (/opt/quictls/lib/libcrypto.so.81.3+0xe14e9)
#3 0x7fabc6af5960 in ossl_asn1_template_free (/opt/quictls/lib/libcrypto.so.81.3+0xe1960)
#4 0x7fabc6af569f in ossl_asn1_item_embed_free (/opt/quictls/lib/libcrypto.so.81.3+0xe169f)
#5 0x7fabc6af58a4 in ASN1_item_free (/opt/quictls/lib/libcrypto.so.81.3+0xe18a4)
#6 0x46a159 in ssl_sock_free_cert_key_and_chain_contents src/ssl_ckch.c:723
#7 0x46aa92 in ckch_store_free src/ssl_ckch.c:869
#8 0x4704ad in cli_release_commit_cert src/ssl_ckch.c:1981
#9 0x962e83 in cli_io_handler src/cli.c:1140
#10 0xc1edff in task_run_applet src/applet.c:454
#11 0xaf8be9 in run_tasks_from_lists src/task.c:634
#12 0xafa2ed in process_runnable_tasks src/task.c:876
#13 0xa23c72 in run_poll_loop src/haproxy.c:3024
#14 0xa24aa3 in run_thread_poll_loop src/haproxy.c:3226
#15 0x7fabc69e7ea6 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x7ea6)
#16 0x7fabc6907a2e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xfba2e)
0x60200007afb0 is located 0 bytes inside of 3-byte region [0x60200007afb0,0x60200007afb3)
freed by thread T3 here:
#0 0x7fabc6fb5527 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x54527)
#1 0x7fabc6ae8f8c in ossl_asn1_string_embed_free (/opt/quictls/lib/libcrypto.so.81.3+0xd4f8c)
previously allocated by thread T2 here:
#0 0x7fabc6fb573f in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x5473f)
#1 0x7fabc6ae8d77 in ASN1_STRING_set (/opt/quictls/lib/libcrypto.so.81.3+0xd4d77)
Thread T3 created by T0 here:
#0 0x7fabc6f84bba in pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x23bba)
#1 0xc04f36 in setup_extra_threads src/thread.c:252
#2 0xa2761f in main src/haproxy.c:3917
#3 0x7fabc682fd09 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x23d09)
Thread T2 created by T0 here:
#0 0x7fabc6f84bba in pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x23bba)
#1 0xc04f36 in setup_extra_threads src/thread.c:252
#2 0xa2761f in main src/haproxy.c:3917
#3 0x7fabc682fd09 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x23d09)
SUMMARY: AddressSanitizer: double-free ??:0 __interceptor_free
==507223==ABORTING
Aborted
The OCSP CID stored in the impacted ckch data were freed but not reset to NULL,
leading to a subsequent double free.
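The usual remedy for this class of bug is to reset the pointer right after
releasing it, as sketched below. struct ex_ckch_data and ex_release_cid() are
hypothetical, and plain malloc()/free() stand in for the OpenSSL allocation
of the real OCSP CID.

```c
#include <stdlib.h>

/* Hypothetical, reduced certificate data. */
struct ex_ckch_data {
    unsigned char *ocsp_cid;
};

/* Release the CID and forget the pointer, so that a later release of the
 * same structure calls free(NULL), which is a no-op.
 */
void ex_release_cid(struct ex_ckch_data *data)
{
    free(data->ocsp_cid);
    data->ocsp_cid = NULL;
}
```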
Must be backported to 2.8.
Numbers of active and backup servers per backend were exported but the info
was not exported per-server. The main obstacle was that we were unable to
have a different name for the same metric in a different scope. Thanks to
the previous patch ("MINOR: promex: Add support for specialized
front/back/li/srv metric names") it is now possible. So the info is now
exported per-server.
This patch should fix the issue #2271.
Depending on the scope, metrics can have different names. For instance, the
numbers of active/backup servers are reported with the
"haproxy_backend_active_servers"/"haproxy_backend_backup_servers" metric names
in the backend scope while they should be
"haproxy_server_active"/"haproxy_server_backup" in the server scope.
To be able to support different names depending on the scope for the same
metric, arrays of ISTs were added, one per scope (front, back, listen,
server). These arrays only contain names overriding the default ones.
Note: the example above is not supported for now and is the reason for this
commit.
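The override mechanism can be sketched as a sparse table with a fallback.
Everything below is hypothetical (enum names, plain C strings instead of
ISTs, name suffixes without the "haproxy_<scope>_" part); it only shows the
lookup order.

```c
#include <stddef.h>

enum ex_scope  { EX_FRONT, EX_BACK, EX_LISTEN, EX_SERVER, EX_SCOPES };
enum ex_metric { EX_ACTIVE, EX_BACKUP, EX_METRICS };

/* Default names, used when a scope defines no override. */
const char *const ex_default_names[EX_METRICS] = {
    [EX_ACTIVE] = "active_servers",
    [EX_BACKUP] = "backup_servers",
};

/* Per-scope overrides: entries left unset are NULL. */
const char *const ex_scope_names[EX_SCOPES][EX_METRICS] = {
    [EX_SERVER] = { [EX_ACTIVE] = "active", [EX_BACKUP] = "backup" },
};

/* Name of <metric> in <scope>: the override if any, else the default. */
const char *ex_metric_name(enum ex_scope scope, enum ex_metric metric)
{
    const char *name = ex_scope_names[scope][metric];

    return name ? name : ex_default_names[metric];
}
```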
Because maps and lists of ACLs are no longer necessarily referenced by
filenames, CLI commands to manipulate them were updated accordingly. Instead
of "filename" we talk about "name" now.
The same was done in the Lua documentation.
Maps and lists of ACLs can now reference something else than regular files
and can have a prefix to set the type of the list (file, virtual file or
optional file). So, the configuration manual was updated accordingly.
Section 2.7. about the name format for maps and ACLs was added (the former
section 2.7. with some examples was moved to 2.8.) and references to map or
ACL files were updated.
Before this patch, it was not possible to use a list of patterns, map or
list of ACLs, without an existing file. However, it could be handy to just
use an ID, with no file on the disk. It is pretty useful for anyone
managing these lists dynamically. It could also be handy to try to load a
list from a file if it exists without failing if not. This way, it could be
possible to make a cold start without any file (instead of an empty file),
dynamically add and delete patterns, and dump the list to the file
periodically to reuse it on reload (via an external process).
In this patch, we use some prefixes to be able to use virtual or optional
files.
The default case remains unchanged. Regular files are used. A filename, with
no prefix, is used as reference, and it must exist on the disk. With the
prefix "file@", the same is performed. Internally this prefix is
skipped. Thus the same file, with or without the "file@" prefix, references
the same list of patterns.
To use a virtual map, the "virt@" prefix must be used. No file is read, even
if the following name looks like a file. It is just an ID. The prefix is part
of the ID and must always be used.
To use an optional file, i.e. a file that may or may not exist on the disk at
startup, the "opt@" prefix must be used. If the file exists, its content is
loaded. But HAProxy doesn't complain if not. The prefix is not part of the
ID. For a given file, optional files and regular files reference the same
list of patterns.
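The three prefix rules can be summarized by the sketch below.
ex_parse_ref() and enum ex_ref_type are hypothetical names; only the
prefixes and the "part of the ID or not" rules come from the text above.

```c
#include <string.h>

enum ex_ref_type { EX_REF_FILE, EX_REF_VIRT, EX_REF_OPT };

/* Classify <name> and return the ID under which the list is referenced.
 * "file@" and "opt@" are stripped from the ID, "virt@" is kept.
 */
const char *ex_parse_ref(const char *name, enum ex_ref_type *type)
{
    if (strncmp(name, "virt@", 5) == 0) {
        *type = EX_REF_VIRT;
        return name;               /* the prefix is part of the ID */
    }
    if (strncmp(name, "opt@", 4) == 0) {
        *type = EX_REF_OPT;
        return name + 4;           /* same ID as the regular file */
    }
    *type = EX_REF_FILE;
    if (strncmp(name, "file@", 5) == 0)
        return name + 5;           /* explicit prefix, skipped */
    return name;
}
```

With these rules, "opt@/path/a.map", "file@/path/a.map" and "/path/a.map"
all designate the same list, while "virt@/path/a.map" is a distinct one.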
This patch should fix the issue #2202.
It is only a small API refactoring. The filename is no longer used when
pat_ref_read_from_file_smp() or pat_ref_read_from_file() functions are
called. The filename was already used to create the reference on the list of
patterns. Thus, we now directly use info from this reference.
tune.cache.zero-copy-forwarding parameter can now be used to enable or
disable the zero-copy fast-forwarding for the cache applet only. It is
enabled ('on') by default. It can be disabled by setting the parameter to
'off'.
It is now possible to directly forward data to the opposite side from the
cache applet. To do so, dedicated functions were added to fast-forward the
payload part of the cached objects. Of course headers and trailers are still
sent via the channel's buffer, using the HTX.
When an object is delivered from the cache, once the applet reaches the
HTX_CACHE_DATA state, it declares it can fast-forward data. From this point,
all data are directly transferred to the opposite side.
We now save the body size of cached objects in the cache entry structure. In
addition, the cache applet tracks the body part already sent.
This will be mandatory to add support of endpoint-to-endpoint
fast-forwarding in the cache applet.
To be able to support endpoint-to-endpoint fast-forwarding (formerly called
mux-to-mux fast-forwarding), we cannot rely on data in the input channel to
compute amount of data the applet has produced.
The applet API is not really designed to know how many bytes are produced or
received at each call. Till now, it was not a problem because data always
passed through the channels. With E2E fast-forwarding, input data may be
immediately consumed. From the caller's point of view (task_run_applet), only
the total field of the input channel will change. So let's use
it now.
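The idea of relying on the total counter can be sketched like this. The
structure and function names are hypothetical and the channel is reduced to
two counters; this is not the task_run_applet() code.

```c
/* Hypothetical, reduced input channel. */
struct ex_channel {
    unsigned long long total;  /* bytes ever produced into the channel */
    unsigned long long data;   /* bytes currently buffered in the channel */
};

typedef void (*ex_applet_fn)(struct ex_channel *in);

/* Example handler: produces 100 bytes, 60 of which are fast-forwarded and
 * thus never show up as buffered data.
 */
void ex_handler(struct ex_channel *in)
{
    in->total += 100;
    in->data  += 40;
}

/* Run the applet once and return how many bytes it produced, computed
 * from the growth of <total> rather than from the buffered data.
 */
unsigned long long ex_run_applet(ex_applet_fn fn, struct ex_channel *in)
{
    unsigned long long before = in->total;

    fn(in);
    return in->total - before;
}
```

Measuring the buffered data instead would report 40 bytes here and miss the
60 bytes that were consumed immediately.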
Till now, it was not possible to notify that a producing applet is streaming
data. It means it was not possible to set CF_STREAMER and CF_STREAMER_FAST
on the input channel of an applet streaming data.
While it is not a big deal for most applets, it is interesting for the
cache. Because there are now dedicated functions to deal with these flags,
we can use them in task_run_applet() to set/unset these flags on the input
channel.
This patch relies on "MINOR: channel: Use dedicated functions to deal with
STREAMER flags".
For now, CF_STREAMER and CF_STREAMER_FAST flags are set in sc_conn_recv()
function. The logic is moved in dedicated functions.
First, the channel_check_idletimer() function is now responsible for checking
the channel's last read date against the idle timer value to be sure the
producer is still streaming data. Otherwise, it removes the STREAMER flags.
Then, the channel_check_xfer() function is responsible for checking the amount
of data transferred after a receive, to possibly update the STREAMER flags.
In sc_conn_recv(), we now use these functions.
Released version 2.9.0 with the following main changes :
- DOC: config: add missing colon to "bytes_out" sample fetch keyword (2)
- BUG/MINOR: cfgparse-listen: fix warning being reported as an alert
- DOC: config: add matrix entry for "max-session-srv-conns"
- DOC: config: fix monitor-fail typo
- DOC: config: add context hint for proxy keywords
- DEBUG: stream: Report lra/fsb values for front end back SC in stream dump
- REGTESTS: sample: Test the behavior of consecutive delimiters for the field converter
- BUG/MINOR: sample: Make the `word` converter compatible with `-m found`
- DOC: Clarify the differences between field() and word()
- BUG/MINOR: server/event_hdl: properly handle AF_UNSPEC for INETADDR event
- BUILD: http_htx: silence uninitialized warning on some gcc versions
- MINOR: acme.sh: don't use '*' in the filename for wildcard domain
- MINOR: global: Use a dedicated bitfield to customize zero-copy fast-forwarding
- MINOR: mux-pt: Add global option to enable/disable zero-copy forwarding
- MINOR: mux-h1: Add global option to enable/disable zero-copy forwarding
- MINOR: mux-h2: Add global option to enable/disable zero-copy forwarding
- MINOR: mux-quic: Add global option to enable/disable zero-copy forwarding
- MINOR: mux-quic: Disable zero-copy forwarding for send by default
- DOC: config: update the reminder on the HTTP model and add some terminology
- DOC: config: add a few more differences between HTTP/1 and 2+
- DOC: config: clarify session vs stream
- DOC: config: fix typo abandonned -> abandoned
- DOC: management: fix two latest typos (optionally, exception)
- BUG/MEDIUM: peers: fix partial message decoding
- DOC: management: update stream vs session
peer_recv_msg() may return because the message is incomplete without
checking if a shutdown is pending for the SC. The function relies on
co_getblk() to detect shutdowns. However, the message length decoding may be
interrupted if the multi-byte integer is incomplete. In this case, the SC
is not checked for shutdowns.
When this happens, this leads to an appctx spinning loop.
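The missing check can be illustrated with the sketch below. It is not the
peers code: ex_read_msg_len() is hypothetical and a fixed 2-byte big-endian
length stands in for the real variable-length integer, only to show where the
shutdown test belongs.

```c
#include <stddef.h>

enum ex_rc { EX_CLOSED = -1, EX_INCOMPLETE = 0, EX_READY = 1 };

/* Try to read a message length from <avail> bytes at <buf>.
 * <shut_pending> tells whether a shutdown was reported on the connection.
 */
enum ex_rc ex_read_msg_len(const unsigned char *buf, size_t avail,
                           int shut_pending, size_t *msg_len)
{
    if (avail < 2) {
        /* The length itself is truncated. If the peer is gone, no more
         * bytes will ever come: report the close instead of asking the
         * caller to wait, otherwise it would be woken up in a loop.
         */
        return shut_pending ? EX_CLOSED : EX_INCOMPLETE;
    }
    *msg_len = ((size_t)buf[0] << 8) | buf[1];
    return EX_READY;
}
```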
This patch should fix the issue #2373. It must be backported to 2.8.
No backport needed, these were introduced by latest commits 3dd55fa13
("MINOR: mworker/cli: implement hard-reload over the master CLI") and
cef29d370 ("MINOR: trace: define simple -dt argument").
It was really necessary to try to clear the confusion between sessions
and streams, so let's first lift the HTTP model part a little bit to
better consider new protocols, and explain what a stream is and how this
differs from the earlier sessions.
There is at least one bug for now in this part and it is still unstable. Thus
it is better to disable it by default for now. It can be enabled by setting
tune.quic.zero-copy-fwd-send to 'on'.
tune.quic.zero-copy-fwd-send can now be used to enable or disable the
zero-copy fast-forwarding for the QUIC mux only, for sends. For now, there
is no option to disable it for receives because it is not supported yet.
It is enabled ('on') by default.