Move quic_path struct from quic_conn-t.h to quic_cc-t.h and rename it to quic_cc_path.
Update the code accordingly.
Also move some inline functions related to the QUIC path to quic_cc.h.
Move quic_pkt_type(), quic_saddr_cpy(), quic_write_uint32(), max_available_room(),
max_stream_data_size(), quic_packet_number_length(), quic_packet_number_encode()
and quic_compute_ack_delay_us() to quic_tx.c because they are only used in this file.
Also move quic_ack_delay_ms() and quic_read_uint32() to quic_tx.c because they
are used only in this file.
Move quic_rx_packet_refinc() and quic_rx_packet_refdec() to quic_rx.h header.
Move qc_el_rx_pkts(), qc_el_rx_pkts_del() and qc_list_qel_rx_pkts() to quic_tls.h
header.
Move quic_cstream struct definition from quic_conn-t.h to quic_tls-t.h.
Its pool is also moved from the quic_conn module to quic_tls, as are
quic_cstream_new() and quic_cstream_free().
Fix build issues such as:
In file included from src/quic_tx.c:15:
include/haproxy/quic_tx.h:51:23: warning: ‘struct quic_rx_packet’
It is not clear why the compiler only warns about these missing header
inclusions now. It should have complained a long time ago, during the big
QUIC source code split.
Move quic_cid and quic_connection_id from quic_conn-t.h to the new quic_cid-t.h header.
Move the definitions of quic_stateless_reset_token_init(), quic_derive_cid(),
new_quic_cid(), quic_get_cid_tid() and retrieve_qc_conn_from_cid() to the new
quic_cid.c C file.
When an H2 stream is blocked during data fast-forwarding, we must take care
to remove the H2_SF_NOTIFIED flag. This was only performed when a data
fast-forward was actually attempted. However, if the H2 stream was blocked
for any reason, the flag was not removed. During our tests, we found it was
possible to block a connection indefinitely because one of its streams
remained in the send_list with the flag set. In this case, the stream was no
longer woken up to resume the sends, blocking all other streams.
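For illustration only, a minimal sketch of the intended behavior with
simplified stand-in types and a hypothetical helper name (not the actual
mux-h2 code):
  /* stand-in for the H2 stream, only the flags matter here */
  struct h2s_sketch {
      unsigned int flags;
  };
  #define H2_SF_NOTIFIED 0x00000400   /* illustrative value */

  /* called when leaving the fast-forward path */
  void h2s_done_ff_sketch(struct h2s_sketch *h2s, int ff_attempted)
  {
      (void)ff_attempted;
      /* the fix: always drop NOTIFIED here, even when no fast-forward
       * was attempted because the stream was blocked, so that a stream
       * sitting in the send_list can be woken up again later */
      h2s->flags &= ~H2_SF_NOTIFIED;
  }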
No backport needed.
When zero-copy data fast-forwarding is in use, if the opposite SC is blocked,
there is no reason to try to fast-forward more data. Worse, in some cases,
this can lead to a receive loop on the producer side while the consumer side
is blocked.
No backport needed.
CONNECTION_CLOSE_APP encoding is broken, which prevents the sending of
any packet carrying such a frame. This bug was always present in haproxy's
QUIC implementation. However, it was somewhat hidden by the previous code,
which always initialized all frame members to zero; this was sufficient to
ensure that CONNECTION_CLOSE_APP encoding was correct. The patch below
changed this behavior by removing this costly initialization step:
4cf784f38e
MINOR: quic: Avoid zeroing frame structures
Now, frame members must always be initialized individually according to
the type of frame in use. However, for CONNECTION_CLOSE_APP this was not
done, as qc_cc_build_frm() accessed the wrong union member, referring to a
CONNECTION_CLOSE instead.
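As an illustration of the bug class, a self-contained sketch with simplified
stand-in types (not the real quic_frame layout), showing how reading the
length through the wrong union member picks up an uninitialized field:
  #include <stdint.h>
  #include <stdio.h>

  struct cc {        /* CONNECTION_CLOSE-like member */
      uint64_t error_code;
      uint64_t frame_type;
      uint64_t reason_phrase_len;
  };
  struct cc_app {    /* CONNECTION_CLOSE_APP-like member */
      uint64_t error_code;
      uint64_t reason_phrase_len;
  };
  union frm {
      struct cc connection_close;
      struct cc_app connection_close_app;
  };

  int main(void)
  {
      union frm f;   /* no memset(): members must be set individually */

      f.connection_close_app.error_code = 0x0101;
      f.connection_close_app.reason_phrase_len = 0;

      /* Bug pattern: the builder reads the other union member, so this
       * "length" overlaps a field that was never initialized for
       * CONNECTION_CLOSE_APP and may be arbitrarily large. */
      printf("len read via wrong member: %llu\n",
             (unsigned long long)f.connection_close.reason_phrase_len);
      return 0;
  }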
This bug was detected when trying to generate an HTTP/3 error. The
CONNECTION_CLOSE_APP frame encoding failed due to an uninitialized
<reason_phrase_len> which was too big. This was reported by the
following trace:
"frame building error : qc@0x5555561b86c0 idle_timer_task@0x5555561e5050 flags=0x86038058 CONNECTION_CLOSE_APP"
This must be backported up to 2.6. This is necessary even if the above
commit is not, as the previous code is also buggy, albeit with a different
behavior.
It's the exact same as commit 0a7ab7067 ("OPTIM: mux-h2: don't allocate
more buffers per connections than streams"), but for the zero-copy case
this time. Previously it was only done on the regular snd_buf() path, but
this one is needed as well. A transfer on 16 parallel streams now consumes
half of the memory, and a single stream consumes much less.
An alternate approach would be worth investigating in the future, based
on the same principle as CF_STREAMER_FAST at the higher level: in short,
by monitoring how many mux buffers we write at once before refilling them,
we would get an idea of how many buffers are worth keeping at most, given
that anything beyond would just waste memory. Some tests show that
a single buffer already seems almost as good, except for single-stream
transfers, which is why it's worth spending more time on this.
Add an optional argument for "-dt". This argument is interpreted as a
list of trace statements separated by commas. For each statement, a
specific trace source name can be specified, or none to act on all
sources. Using colon separators, it is also possible to specify the
wanted level and verbosity.
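For example (illustrative values, assuming the source[:level[:verbosity]]
form described above):
  $ haproxy -f haproxy.cfg -dt                      # all sources, error level
  $ haproxy -f haproxy.cfg -dt h1,quic              # only these two sources
  $ haproxy -f haproxy.cfg -dt h1:data:complete     # explicit level and verbosity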
Extract the conversion of the level string argument to an integer value
into a dedicated internal function, trace_parse_level(). This function is
used for CLI "trace" parsing and will also be useful for the "-dt" process
argument.
Add the '-dt' haproxy process argument. This will automatically activate all
trace sources on stderr with the error level. This could be useful to
troubleshoot issues such as protocol violations.
The <pattern> field pointer of the pat_ref_elt structure has been replaced
by a zero-length array. As such, it is now pointless to check it against a
NULL address before printing it.
This type conversion was done in the following commit :
3ac9912837
OPTIM: pattern: save memory and time using ebst instead of ebis
The current patch is required to fix the following GCC warning:
CC src/map.o
src/map.c: In function ‘cli_io_handler_map_lookup’:
src/map.c:549:54: error: the comparison will always evaluate as ‘true’ for the address of ‘pattern’ will never be NULL [-Werror=address]
549 | if (pat->ref && pat->ref->pattern)
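A minimal stand-alone illustration of why GCC now flags such a test
(simplified struct, not the real pat_ref_elt):
  struct elt {
      int data;
      char pattern[0];   /* zero-length (flexible) array: part of the struct */
  };

  int has_pattern(const struct elt *e)
  {
      /* &e->pattern is always a valid, non-NULL address inside *e, so
       * this condition is always true and -Waddress warns about it. */
      return e->pattern != 0;
  }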
No need to backport it unless the above commit is.
In the pat_ref_elt struct, the pattern string is stored outside of the
node element, using a pointer to a strdup()'d copy. Not only does this
needlessly waste at least 16-24 bytes per entry (8 for the pointer, 8-16
for the allocator), it also makes the tree descent less efficient since
both the node and the string have to be visited for each layer (hence at
least two cache lines). Let's use an ebmb storage and place the pattern
right at the end of the pat_ref_elt, making it a variable-sized element
instead.
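Schematically, with simplified stand-in types (the real struct carries more
members and the ebtree node types differ), the change is:
  /* before: the pattern lives in a separate strdup()'d allocation */
  struct elt_before {
      void *tree_node;     /* placeholder: node keyed by a string pointer */
      char *pattern;       /* strdup()'d string, second allocation */
  };

  /* after: variable-sized element; the string is stored inline at the
   * end, one allocation holds everything and the key is read in place */
  struct elt_after {
      void *tree_node;     /* placeholder: node whose key is stored inline */
      char pattern[0];     /* allocated as malloc(sizeof(*e) + len + 1) */
  };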
The set-map test below jumps from 173 to 182 kreq/s/core, and the memory
usage drops from 356 MB to 324 MB:
http-request set-map(/dev/null) %[rand(1000000)] 1
This is even more visible with large maps: after loading 16M IP addresses
into a map, the process uses this amount of memory:
- 3.15 GB with haproxy-2.8
- 4.21 GB with haproxy-2.9-dev11
- 3.68 GB with this patch
So that's a net saving of 32 bytes per entry here, which cuts in half the
extra cost of the tree, and loading a large map takes about 20% less time.
task_drop_running() is used to remove the RUNNING bit and to check whether
the task woke itself up again while it was running. Thus each time
task_drop_running() marks itself as a caller, it in fact overwrites the
previous caller that woke up the task, as below:
Tasks activity over 10.439 sec till 0.000 sec ago:
function calls cpu_tot cpu_avg lat_tot lat_avg
task_run_applet 57895273 6.396m 6.628us 2.733h 170.0us <- run_tasks_from_lists@src/task.c:658 task_drop_running
It is better not to mark this function as a caller and to keep the original one:
Tasks activity over 13.834 sec till 0.000 sec ago:
function calls cpu_tot cpu_avg lat_tot lat_avg
task_run_applet 62424582 5.825m 5.599us 5.717h 329.7us <- sc_app_chk_rcv_applet@src/stconn.c:952 appctx_wakeup
It is not possible in H1, but in H2 (and probably H3) it is possible to have
trailers at the end of a message while a Content-Length was announced.
However, depending on whether the trailers are received with the last HTX
DATA block and whether zero-copy forwarding is used, a processing error may
be triggered, leading to a 500 internal error.
To fix the issue, when a content-length is announced and all the payload has
been processed, we switch the message to the H1_MSG_DONE state only if the
end-of-message was also reported (HTX_FL_EOM flag set). Otherwise, it is
switched to the H1_MSG_TRAILERS state to be able to properly ignore the
trailers, if any.
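A minimal sketch of the resulting state decision, using simplified stand-in
types and illustrative flag values (the real code lives in the H1 multiplexer
and handles many more cases):
  #include <stdio.h>

  enum h1m_state { H1_MSG_DATA, H1_MSG_TRAILERS, H1_MSG_DONE };
  #define HTX_FL_EOM 0x01   /* illustrative flag value */

  /* all announced Content-Length payload has been processed */
  enum h1m_state next_state(unsigned int htx_flags)
  {
      if (htx_flags & HTX_FL_EOM)
          return H1_MSG_DONE;       /* end of message already reported */
      return H1_MSG_TRAILERS;       /* trailers may follow; skip them */
  }

  int main(void)
  {
      printf("%d %d\n", next_state(HTX_FL_EOM), next_state(0));
      return 0;
  }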
The patch must be backported as far as 2.4. Be careful, this part was highly
refactored. The patch will have to be adapted to be backported.
The mworker mode never had a proper 'hard-stop' (-st) for the reload.
This was commonly used with the daemon mode, but it was never implemented
in mworker mode.
This patch fixes the problem by implementing a "hard-reload" command
over the master CLI. It does the same as the "reload" command, but
instead of waiting for the connections to stop in the previous process,
it immediately quits the previous process after binding.
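For example, assuming the master CLI socket was bound with
'-S /var/run/haproxy-master.sock':
  $ echo "hard-reload" | socat /var/run/haproxy-master.sock stdio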
This patch removes the code which selects the SSL certificate in the
OpenSSL Client Hello callback, and uses the ssl_sock_chose_sni_ctx()
function which does the same.
The biggest part of the function which remains is the extraction of the
servername, ciphers and sigalgs, because it is done manually by parsing
the TLS extensions.
This is not supposed to change anything functionally.
The certificate selection used in the WolfSSL cert_cb and in the OpenSSL
clienthello callback is the same; the function was duplicated to achieve
the same result.
This patch moves the selection code to a common function called
ssl_sock_chose_sni_ctx().
The servername string is still lowercased in the callback, however the
search for the first dot in the string (wildp) is done in
ssl_sock_chose_sni_ctx().
The function uses the same certificate selection algorithm as before; it
needs to know whether RSA or ECDSA is needed, the bind_conf used for the
lookup, and the servername string.
This patch moves the code for WolfSSL only.
PR https://github.com/wolfSSL/wolfssl/pull/6963 implements primitives to
extract ciphers and signature algorithms.
It allows choosing a certificate depending on the sigalgs and ciphers
presented by the client (RSA or ECDSA).
Since WolfSSL does not implement the clienthello callback, the patch
uses the certificate callback (SSL_CTX_set_cert_cb()).
The callback is inspired by our clienthello callback, however the
extraction of the client ciphers and sigalgs is simpler:
wolfSSL_get_sigalg_info() and wolfSSL_get_ciphersuite_info() are used.
This is not enabled by default yet as the PR was not merged.
Following the previous commit: in this patch we add the "syslog" output as
a possible return value for the Proxy.get_mode() function, since log
backends may now be enumerated from Lua with 9a74a6c ("MAJOR: log:
introduce log backends").
The Proxy.get_mode() function internally relies on proxy_mode_str() to
return the proxy mode. The current function description exhaustively lists
the possible outputs of the function. I can't tell whether that is relevant
or not, but it is subject to changes. Here this is the case: the
documentation indicates that the "health" mode may be returned, which
cannot happen since 77e0daef9 ("MEDIUM: proxy: remove obsolete "mode health"").
This should be backported up to 2.4
http_reuse_be_transparent.vtc relies on "transparent" proxy option which
is guarded by the USE_TPROXY ifdef at multiple places in the code.
Hence, executing the above test when haproxy was compiled without the
USE_TPROXY feature (ie: generic target) results in this kind of error:
*** h1 debug|[NOTICE] (1189756) : haproxy version is 2.9-dev1-8fc21e-807
*** h1 debug|[NOTICE] (1189756) : path to executable is ./haproxy
*** h1 debug|[ALERT] (1189756) : config : parsing [/tmp/vtc.1189751.18665e7b/h1/cfg:11]: option 'transparent' is not supported due to build options.
*** h1 debug|[ALERT] (1189756) : config : Error(s) found in configuration file : /tmp/vtc.1189751.18665e7b/h1/cfg
Now we skip the regtest if the TPROXY feature is missing.
In 6e0425b718 ("DOC: config: Add documentation about TCP/HTTP rules in
defaults section") an error was made: the restriction note about the
setting not being inherited from anonymous defaults sections was added
by mistake to the "timeout check" documentation. But it is wrong:
"timeout check" behaves like the other "timeout" directives for proxy
sections.
This should be backported up to 2.6.
There was no information about the supported sections for the
"max-session-srv-conns" proxy directive. A quick look at the code tells us
that it may be used in proxies with the FE capability set.
Maintain a proper px->lbprm.tot_weight for log backends. A server's weight
is considered to be 1 as long as the server is usable.
This will allow the stats page to correctly display the proxy status since
the check currently relies on proxy's lbprm.tot_weight variable.
server_rules declared using "use-server" keyword within a proxy are not
supported inside a log backend (with "mode log" set), so we report a
warning to the user and reset the setting.
Take the px->server_rules freeing part out of free_proxy() and make it
a dedicated helper function so that it becomes possible to use it from
anywhere.
There are multiple places inside free_proxy() where we need to perform
the exact same operation: freeing a logformat list, which includes freeing
every member.
To prevent code duplication, we add the free_logformat_list() function
that takes such a list as parameter and does all the freeing job on its
own.
This reverts commit 5884e46ec8c8231e73c68e1bdd345c75c9af97a0 since we
cannot perform the test during parsing as the effective proxy mode is
not yet known.
This is a leftover from 1e0093a317 ("MINOR: backend/balance: "balance"
requires TCP or HTTP mode").
Indeed, we cannot perform the test during parsing as the effective proxy
type is not yet known. Moreover, thanks to b61147fd ("MEDIUM: log/balance:
merge tcp/http algo with log ones") we could potentially benefit from
this setting even in log mode, but for now it is ignored by all
log-compatible load-balancing algorithms.
In this patch we fix the prototypes of the ipcmp() and ipcpy() functions so
that input pointers which are used exclusively for reads are declared as
const pointers. This way, the compiler can safely assume that those
variables won't be altered by the function.
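The shape of the change (parameter lists simplified/assumed here, not copied
from tools.h):
  #include <sys/socket.h>

  /* read-only inputs are now const-qualified */
  int ipcmp(const struct sockaddr_storage *ss1,
            const struct sockaddr_storage *ss2);
  struct sockaddr_storage *ipcpy(const struct sockaddr_storage *source,
                                 struct sockaddr_storage *dest);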
In this patch we add the support for a new SERVER event in the
event_hdl API.
SERVER_INETADDR is implemented as an advanced server event.
It is published each time the server's IP address or port is about to
change (i.e. from the CLI, DNS, Lua...).
SERVER_INETADDR data is an event_hdl_cb_data_server_inetaddr struct
that provides additional info related to the server inet addr change,
but can be cast as a regular event_hdl_cb_data_server struct if the
additional info is not needed.
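The "can be cast" property relies on embedding the regular server event data
as the first member; a simplified stand-in sketch (not the real event_hdl
layout):
  /* stand-in for event_hdl_cb_data_server */
  struct cb_data_server_sketch {
      int server_id;
      /* ... */
  };

  /* stand-in for event_hdl_cb_data_server_inetaddr */
  struct cb_data_server_inetaddr_sketch {
      struct cb_data_server_sketch server;  /* base data placed first */
      /* ... previous/next address and port info ... */
  };

  /* a handler that does not care about the extra info may keep reading
   * the payload as the regular server event data */
  int get_server_id(const void *cb_data)
  {
      return ((const struct cb_data_server_sketch *)cb_data)->server_id;
  }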
"log-balance" keyword was removed by b61147f ("MEDIUM: log/balance: merge
tcp/http algo with log ones") but it was still documented.
Removing "log-balance" references in the documentation where needed.
Released version 2.9-dev11 with the following main changes :
- BUG/MINOR: startup: set GTUNE_SOCKET_TRANSFER correctly
- BUG/MINOR: sock: mark abns sockets as non-suspendable and always unbind them
- BUILD: cache: fix build error on older compilers
- BUG/MAJOR: quic: complete thread migration before tcp-rules
- BUG/MEDIUM: quic: Possible crash for connections to be killed
- MINOR: quic: remove unneeded QUIC specific stopping function
- MINOR: acl: define explicit HTTP_3.0
- DEBUG: connection/flags: update flags for reverse HTTP
- BUILD: log: silence a build warning when threads are disabled
- MINOR: quic: Add traces to debug frames handling during retransmissions
- BUG/MEDIUM: quic: Possible crash during retransmissions and heavy load
- BUG/MINOR: quic: Possible leak of TX packets under heavy load
- BUG/MINOR: quic: Possible RX packet memory leak under heavy load
- BUG/MINOR: server: do not leak default-server in defaults sections
- DEBUG: tinfo: store the pthread ID and the stack pointer in tinfo
- MINOR: debug: start to create a new struct post_mortem
- MINOR: debug: add OS/hardware info to the post_mortem struct
- MINOR: debug: report in post_mortem whether a container was detected
- MINOR: debug: report in post_mortem if the container techno used is docker
- MINOR: debug: detect CPU model and store it in post_mortem
- MINOR: debug: report any detected hypervisor in post_mortem
- MINOR: debug: collect some boot-time info related to the process
- MINOR: debug: copy the thread info into the post_mortem struct
- MINOR: debug: dump the mapping of the libs into post_mortem
- MINOR: debug: add the ability to enter components in the post_mortem struct
- MINOR: init: add info about the main program to the post_mortem struct
- DOC: management: document "show dev"
- CLEANUP: assorted typo fixes in the code and comments
- CI: limit codespell checks to main repo, not forks
- DOC: 51d: updated 51Degrees repo URL for v3.2.10
- DOC: install: update the list of openssl versions
- MINOR: ext-check: add an option to preserve environment variables
- BUG/MEDIUM: mux-h1: Don't set CO_SFL_MSG_MORE flag on last fast-forward send
- MINOR: rhttp: rename proto_reverse_connect
- MINOR: rhttp: large renaming to use rhttp prefix
- MINOR: rhttp: add count of active conns per thread
- MEDIUM: rhttp: support multi-thread active connect
- MINOR: listener: allow thread kw for rhttp bind
- DOC: rhttp: replace maxconn by nbconn
- MINOR: log/balance: rename "log-sticky" to "sticky"
- MEDIUM: mux-quic: Add consumer-side fast-forwarding support
- MAJOR: h3: Implement zero-copy support to send DATA frame
When possible, we try to send DATA frames without copying data. To do so,
we swap the input buffer with the QCS tx buffer. This is only possible iff:
* There is only one HTX block of data at the beginning of the message
* Amount of data to send is equal to the size of the HTX data block
* The QCS tx buffer is empty
In this case, both buffers are swapped. The frame metadata is written at
the beginning of the buffer, before the data, where the HTX structure is
stored.
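As an illustration of the decision, a hedged sketch with simplified stand-in
types and a hypothetical helper (not the actual QUIC mux code):
  #include <stddef.h>

  struct sbuf {            /* minimal stand-in for a haproxy buffer */
      char  *area;
      size_t size;
      size_t data;
  };

  /* try to swap <in> (holding a single leading HTX data block of
   * <blk_size> bytes) with the QCS tx buffer <out>; returns 1 when the
   * zero-copy path could be used */
  int try_zero_copy_swap(struct sbuf *in, struct sbuf *out,
                         size_t blk_size, size_t to_send,
                         int single_leading_data_blk)
  {
      if (!single_leading_data_blk ||   /* one HTX data block at the start */
          to_send != blk_size ||        /* sending exactly that block */
          out->data != 0)               /* tx buffer must be empty */
          return 0;

      /* swap the buffer storages instead of copying the payload; the
       * frame header is then written in the headroom at the start of
       * the buffer, where the HTX structure used to be */
      struct sbuf tmp = *out;
      *out = *in;
      *in  = tmp;
      return 1;
  }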