In e709e1e ("MEDIUM: logs: buffer targets now rely on new sink_write")
we started using the sink API instead of using the ring_write function
directly.
But as indicated in the commit message, the maxlen parameter of the log
directive now only applies to the message part and not the complete
payload. I don't know what the original intent was (maybe minimizing code
changes) but it seems wrong, because the doc doesn't mention this special
case, and the result is that the ring->buffer output can differ from all
other log output types, making it very confusing.
One last issue with this is that log messages can end up being dropped at
runtime, only for the buffer target, and even if logsrv->maxlen is
correctly set (including default: 1024) because depending on the generated
header size the payload can grow bigger than the accepted sink size (sink
maxlen is not mandatory) and we have no simple way to detect this at
configuration time.
First, we partially revert e709e1e:
TARGET_BUFFER still leverages the proper sink API, but thanks to
"MINOR: sink: pass explicit maxlen parameter to sink_write()" we now
explicitly pass the logsrv->maxlen to the sink_write function in order
to stop writing as soon as either sink->maxlen or logsrv->maxlen is
reached.
This restores pre-e709e1e behavior with the added benefit from using the
high-level API, which includes automatically announcing dropped message
events.
Then, we also need to take the ending '\n' into account: it is not
explicitly set when generating the logline for TARGET_BUFFER, but it will
be forcefully added by the sink_forward_io_handler function from the tcp
handler applet when log messages from the buffer are forwarded to tcp
endpoints.
In current form, because the '\n' is added later in the chain, maxlen is
not being considered anymore, so the final log message could exceed maxlen
by 1 byte, which could make receiving servers unhappy in logging context.
To prevent this, we sacrifice 1 byte from the logsrv->maxlen to ensure
that the final message will never exceed log->maxlen, even if the '\n'
char is automatically appended later by the forwarding applet.
Thanks to this change TCP (over RING/BUFFER) target now behaves like
FD and UDP targets.
This commit depends on:
- "MINOR: sink: pass explicit maxlen parameter to sink_write()"
It may be backported as far as 2.2
[For 2.2 and 2.4 the patch does not apply automatically, the sink_write()
call must be updated by hand]
sink_write() currently relies on sink->maxlen to know when to stop
writing a given payload.
But it could be useful to pass a smaller, explicit value to sink_write()
to stop before the ring maxlen, for instance if the ring is shared between
multiple feeders.
sink_write() now takes an optional maxlen parameter:
if maxlen is > 0, then sink_write will stop writing at maxlen if maxlen
is smaller than ring->maxlen, else only ring->maxlen will be considered.
[for haproxy <= 2.7, patch must be applied by hand: that is:
__sink_write() and sink_write() should be patched to take maxlen into
account and function calls to sink_write() should use 0 as second argument
to keep original behavior]
A regression was introduced with 5464885 ("MEDIUM: log/sink: re-work
and merge of build message API.").
For UDP targets, a final '\n' is systematically inserted, but with the
rework of the build message API, it is inserted after the maxlen
limitation has been enforced, so this can lead to the final message
becoming maxlen+1. For strict syslog servers that only accept up to
maxlen characters, this could be a problem.
To fix the regression, we take the final '\n' into account prior to
building the message, like it was done before the rework of the API.
This should be backported up to 2.4.
When maxlen parameter exceeds sink size, a warning is generated and maxlen
is enforced to sink size. But the err_code is incorrectly set to ERR_ALERT
Indeed, being a "warning", ERR_WARN should be used here.
This may be backported as far as 2.2
When a ring section defines its size using the "size" directive with a
smaller size than the default one or smaller than the previous one,
a warning is generated to inform the user that the new size will be
ignored.
However the err_code is returned as FATAL, so this cause haproxy to
incorrectly abort the init sequence.
Changing the err_code to ERR_WARN so that this warning doesn't refrain
from successfully starting the process.
This should be backported as far as 2.4
Adding missing free for sft (string_forward_target) in sink_deinit(),
which resulted in minor leak for each declared ring target at deinit().
(either explicit and implicit rings are affected)
This may be backported up to 2.4.
_proxy_http_parse_7239_expr() helper used in proxy_http_parse_7239()
function may return ERR_ABORT in case of memory error. But the error check
used below is insufficient to catch ERR_ABORT so the function could keep
executing prior to returning ERR_ABORT, which may cause undefined
behavior. Hopefully no sensitive handling is performed in this case so
this bug has very limited impact, but let's fix it anyway.
We now use ERR_CODE mask instead of ERR_FATAL to check if err_code is set
to any kind of error combination that should prevent the function from
further executing.
This may be backported in 2.8 with b2bb9257d2 ("MINOR: proxy/http_ext:
introduce proxy forwarded option")
forward proxy server list created from sink_new_from_logsrv() is invalid
Indeed, srv->next is literally assigned to itself. This did not cause
issues during syslog handling because the sft was properly set, but it
will cause the free_proxy(sink->forward_px) at deinit to go wild since
free_proxy() will try to iterate through the proxy srv list to free
ressources, but because of the improper list initialization, double-free
and infinite-loop will occur.
This bug was revealed by 9b1d15f53a ("BUG/MINOR: sink: free forward_px on deinit()")
It must be backported as far as 2.4.
When a s-maxage cache-control directive is present, it overrides any
other max-age or expires value (see section 5.2.2.9 of RFC7234). So if
we have a max-age=0 alongside a strictly positive s-maxage, the response
should be cached.
This bug was raised in GitHub issue #2203.
The fix can be backported to all stable branches.
This bug was introduced by this commit:
MEDIUM: quic: Release encryption levels and packet number spaces asap
Add some checks before derefencing pointers to packet number spaces objects
to dump them from "show quic" command.
No backport needed.
Thierry Fournier reported an annoying side-effect when using the debug()
converter.
Consider the following examples:
[1] http-request set-var(txn.test) bool(true),ipmask(24)
[2] http-request redirect location /match if { bool(true),ipmask(32) }
When starting haproxy with [1] example we get:
config : parsing [test.conf:XX] : error detected in frontend 'fe' while parsing 'http-request set-var(txn.test)' rule : converter 'ipmask' cannot be applied.
With [2], we get:
config : parsing [test.conf:XX] : error detected in frontend 'fe' while parsing 'http-request redirect' rule : error in condition: converter 'ipmask' cannot be applied in ACL expression 'bool(true),ipmask(32)'.
Now consider the following examples which are based on [1] and [2]
but with the debug() sample conv inserted in-between those incompatible
sample types:
[1*] http-request set-var(txn.test) bool(true),debug,ipmask(24)
[2*] http-request redirect location /match if { bool(true),debug,ipmask(32) }
According to the documentation, "it is safe to insert the debug converter
anywhere in a chain, even with non-printable sample types".
Thus we don't expect any side-effect from using it within a chain.
However in current implementation, because of debug() returning SMP_T_ANY
type which is a pseudo type (only resolved at runtime), the sample
compatibility checks performed at parsing time are completely uneffective.
(haproxy will start and no warning will be emitted)
The indesirable effect of this is that debug() prevents haproxy from
warning you about impossible type conversions, hiding potential errors
in the configuration that could result to unexpected evaluation or logic
while serving live traffic. We better let haproxy warn you about this kind
of errors when it has the chance.
With our previous examples, this could cause some inconveniences. Let's
say for example that you are testing a configuration prior to deploying
it. When testing the config you might want to use debug() converter from
time to time to check how the conversion chain behaves.
Now after deploying the exact same conf, you might want to remove those
testing debug() statements since they are no longer relevant.. but
removing them could "break" the config and suddenly prevent haproxy from
starting upon reload/restart. (this is the expected behavior, but it
comes a bit too late because of debug() hiding errors during tests..)
To fix this, we introduce a new output type for sample expressions:
SMP_T_SAME - may only be used as "expected" out_type (parsing time)
for sample converters.
As it name implies, it is a way for the developpers to indicate that the
resulting converter's output type is guaranteed to match the type of the
sample that is presented on the converter's input side.
(converter may alter data content, but data type must not be changed)
What it does is that it tells haproxy that if switching to the converter
(by looking at the converter's input only, since outype is SAME) is
conversion-free, then the converter type can safely be ignored for type
compatibility checks within the chain.
debug()'s out_type is thus set to SMP_T_SAME instead of ANY, which allows
it to fully comply with the doc in the sense that it does not impact the
conversion chain when inserted between sample items.
Co-authored-by: Thierry Fournier <thierry.f.78@gmail.com>
Historically, the ADDR pseudo-type did not exist. So when IPV6 support
was added to existing IPV4 sample fetches (e.g.: src,dst,hdr_ip...) the
expected out_type in related sample definitions was left on IPV4 because
it was required to declare the out_type as the lowest common denominator
(the type that can be casted into all other ones) to make compatibility
checks at parse time work properly.
However, now that ADDR pseudo-type may safely be used as out_type since
("MEDIUM: sample: add missing ADDR=>? compatibility matrix entries"), we
can use ADDR for fetches that may output both IPV4 and IPV6 at runtime.
One added benefit on top of making the code less confusing is that
'haproxy -dKsmp' output will now show "addr" instead of "ipv4" for such
fetches, so the 'haproxy -dKsmp' output better complies with the fetches
signatures from the documentation.
out_ip fetch, which returns an ip according to the doc, was purposely
left as is (returning IPV4) since the smp_fetch_url_ip() implementation
forces output type to IPV4 anyway, and since this is an historical fetch
I prefer not to touch it to prevent any regression. However if
smp_fetch_url_ip() were to be fixed to also return IPV6 in the future,
then its expected out_type may be changed to ADDR as well.
Multiple notes in the code were updated to mention that the appropriate
pseudo-type may be used instead of the lowest common denominator for
out_type when available.
Since 1478aa7 ("MEDIUM: sample: Add IPv6 support to the ipmask converter")
ipmask converter may return IPV6 as well, so the sample definition must
be updated. (the output type is currently explicited to IPV4)
This has no functional impact: one visible effect will be that
converter's visible output type will change from "ipv4" to "addr" when
using 'haproxy -dKcnv'.
SMP_T_ADDR support was added in b805f71 ("MEDIUM: sample: let the cast
functions set their output type").
According to the above commit, it is made clear that the ADDR type is
a pseudo/generic type that may be used for compatibility checks but
that cannot be emitted from a fetch or converter.
With that in mind, all conversions from ADDR to other types were
explicitly disabled in the compatibility matrix.
But later, when map_*_ip functions were updated in b2f8f08 ("MINOR: map:
The map can return IPv4 and IPv6"), we started using ADDR as "expected"
output type for converters. This still complies with the original
description from b805f71, because it is used as the theoric output
type, and is never emitted from the converters themselves (only "real"
types such as IPV4 or IPV6 are actually being emitted at runtime).
But this introduced an ambiguity as well as a few bugs, because some
compatibility checks are being performed at config parse time, and thus
rely on the expected output type to check if the conversion from current
element to the next element in the chain is theorically supported.
However, because the compatibility matrix doesn't support ADDR to other
types it is currently impossible to use map_*_ip converters in the middle
of a chain (the only supported usage is when map_*_ip converters are
at the very end of the chain).
To illustrate this, consider the following examples:
acl test str(ok),map_str_ip(test.map) -m found # this will work
acl test str(ok),map_str_ip(test.map),ipmask(24) -m found # this will raise an error
Likewise, stktable_compatible_sample() check for stick tables also relies
on out_type[table_type] compatibility check, so map_*_ip cannot be used
with sticktables at the moment:
backend st_test
stick-table type string size 1m expire 10m store http_req_rate(10m)
frontend fe
bind localhost:8080
mode http
http-request track-sc0 str(test),map_str_ip(test.map) table st_test # raises an error
To fix this, and prevent future usage of ADDR as expected output type
(for either fetches or converters) from introducing new bugs, the
ADDR=>? part of the matrix should follow the ANY type logic.
That is, ADDR, which is a pseudo-type, must be compatible with itself,
and where IPV4 and IPV6 both support a backward conversion to a given
type, ADDR must support it as well. It is done by setting the relevant
ADDR entries to c_pseudo() in the compatibility matrix to indicate that
the operation is theorically supported (c_pseudo() will never be executed
because ADDR should not be emitted: this only serves as a hint for
compatibility checks at parse time).
This is what's being done in this commit, thanks to this the broken
examples documented above should work as expected now, and future
usage of ADDR as out_type should not cause any issue.
This function is used for ANY=>!ANY conversions in the compatibility
matrix to help differentiate between real NOOP (c_none) and pseudo
conversions that are theorically supported at config parse time but can
never occur at runtime,. That is, to explicit the fact that actual related
runtime operations (e.g.: ANY->IPV4) are not NOOP since they might require
some conversion to be performed depending on the input type.
When checking the conf we don't know the effective out types so
cast[pseudo type][pseudo type] is allowed in the compatibility matrix,
but at runtime we only expect cast[real type][(real type || pseudo type)]
because fetches and converters may not emit pseudo types, thus using
c_none() everywhere was too ambiguous.
The process will crash if c_pseudo() is invoked to help catch bugs:
crashing here means that a pseudo type has been encountered on a
converter's input at runtime (because it was emitted earlier in the
chain), which is not supported and results from a broken sample fetch
or converter implementation. (pseudo types may only be used as out_type
in sample definitions for compatibility checks at parsing time)
Both sample_parse_expr() and parse_acl_expr() implement some code
logic to parse sample conv list after respective fetch or acl keyword.
(Seems like the acl one was inspired by the sample one historically)
But there is clearly code duplication between the two functions, making
them hard to maintain.
Hopefully, the parsing logic between them has stayed pretty much the
same, thus the sample conv parsing part may be moved in a dedicated
helper parsing function.
This is what's being done in this commit, we're adding the new function
sample_parse_expr_cnv() which does a single thing: parse the converters
that are listed right after a sample fetch keyword and inject them into
an already existing sample expression.
Both sample_parse_expr() and parse_acl_expr() were adapted to now make
use of this specific parsing function and duplicated code parts were
cleaned up.
Although sample_parse_expr() remains quite complicated (numerous function
arguments due to contextual parsing data) the first goal was to get rid of
code duplication without impacting the current behavior, with the added
benefit that it may allow further code cleanups / simplification in the
future.
bc_{dst,src} sample fetches were introduced in 7d081f0 ("MINOR:
tcp_samples: Add samples to get src/dst info of the backend connection")
However a copy-pasting error was probably made here because they both
return IP addresses (not integers) like their fc_{src,dst} or src,dst
analogs.
Hopefully, this had no functional impact because integer can be converted
to both IPV4 and IPV6 types so the compatibility checks were not affected.
Better fix this to prevent confusion in code and "haproxy -dKsmp" output.
This should be backported up to 2.4 with 7d081f0.
The current limitation of the 'ocsp-update' option and the fact that it
can only be used in crt-lists was puzzling for some people so the doc
was amended to emphasize this specificity. A configuration extract was
added as well.
A few troubleshooting clues were added as well.
Must be backported in 2.8.
This patch fixes a misalignment in the 'ocsp-update' option description
and it splits the example log lines for readability.
Must be backported in 2.8.
This is to prevent the build from these gcc warning when compiling with -O1 option:
willy@wtap:haproxy$ sh make-native-quic-memstat src/quic_conn.o CPU_CFLAGS="-O1"
CC src/quic_conn.o
src/quic_conn.c: In function 'qc_prep_pkts':
src/quic_conn.c:3700:44: warning: potential null pointer dereference [-Wnull-dereference]
3700 | frms = &qel->pktns->tx.frms;
| ~~~^~~~~~~
src/quic_conn.c:3626:33: warning: 'end' may be used uninitialized in this function [-Wmaybe-uninitialized]
3626 | if (end - pos < QUIC_INITIAL_PACKET_MINLEN) {
| ~~~~^~~~~
This bug was introduced by this commit:
MINOR: quic: Remove pool_zalloc() from qc_new_conn().
If ->path is not initialized to NULL value, and if a QUIC connection object
allocation has failed (from qc_new_conn()), haproxy could crash in
quic_conn_prx_cntrs_update() when dereferencing this QUIC connection member.
No backport needed.
This bug was reported by GH #2200 (coverity issue) as follows:
*** CID 1516590: Resource leaks (RESOURCE_LEAK)
/src/quic_tls.c: 159 in quic_conn_enc_level_init()
153
154 LIST_APPEND(&qc->qel_list, &qel->list);
155 *el = qel;
156 ret = 1;
157 leave:
158 TRACE_LEAVE(QUIC_EV_CONN_CLOSE, qc);
>>> CID 1516590: Resource leaks (RESOURCE_LEAK)
>>> Variable "qel" going out of scope leaks the storage it points to.
159 return ret;
160 }
161
162 /* Uninitialize <qel> QUIC encryption level. Never fails. */
163 void quic_conn_enc_level_uninit(struct quic_conn *qc, struct quic_enc_level *qel)
164 {
This bug was introduced by this commit which has foolishly assumed the encryption
level memory would be released after quic_conn_enc_level_init() has failed. This
is no more possible because this object is dynamic and no more a static member
of the QUIC connection object.
Anyway, this patch modifies quic_conn_enc_level_init() to ensure this is
no more leak when quic_conn_enc_level_init() fails calling quic_conn_enc_level_uninit()
in case of memory allocation error.
quic_conn_enc_level_uninit() code was moved without modification only to be defined
before quic_conn_enc_level_init()
There is no need to backport this.
Released version 2.9-dev1 with the following main changes :
- BUG/MINOR: stats: Fix Lua's `get_stats` function
- MINOR: stats: protect against future stats fields omissions
- BUG/MINOR: stream: do not use client-fin/server-fin with HTX
- BUG/MINOR: quic: Possible crash when SSL session init fails
- CONTRIB: Add vi file extensions to .gitignore
- BUG/MINOR: spoe: Only skip sending new frame after a receive attempt
- BUG/MINOR: peers: Improve detection of config errors in peers sections
- REG-TESTS: stickiness: Delay haproxys start to properly resolv variables
- DOC: quic: fix misspelled tune.quic.socket-owner
- DOC: config: fix jwt_verify() example using var()
- DOC: config: fix rfc7239 converter examples (again)
- BUG/MINOR: cfgparse-tcp: leak when re-declaring interface from bind line
- BUG/MINOR: proxy: add missing interface bind free in free_proxy
- BUG/MINOR: proxy/server: free default-server on deinit
- BUG/MEDIUM: hlua: Use front SC to detect EOI in HTTP applets' receive functions
- BUG/MINOR: ssl: log message non thread safe in SSL Hanshake failure
- BUG/MINOR: quic: Wrong encryption level flags checking
- BUG/MINOR: quic: Address inversion in "show quic full"
- BUG/MINOR: server: inherit from netns in srv_settings_cpy()
- BUG/MINOR: namespace: missing free in netns_sig_stop()
- BUG/MINOR: quic: Missing initialization (packet number space probing)
- BUG/MINOR: quic: Possible crash in quic_conn_prx_cntrs_update()
- BUG/MINOR: quic: Possible endless loop in quic_lstnr_dghdlr()
- MINOR: quic: Remove pool_zalloc() from qc_new_conn()
- MINOR: quic: Remove pool_zalloc() from qc_conn_alloc_ssl_ctx()
- MINOR: quic: Remove pool_zalloc() from quic_dgram_parse()
- BUG/MINOR: quic: Missing transport parameters initializations
- BUG/MEDIUM: mworker: increase maxsock with each new worker
- BUG/MINOR: quic: ticks comparison without ticks API use
- BUG/MINOR: quic: Missing TLS secret context initialization
- DOC: Add tune.h2.be.* and tune.h2.fe.* options to table of contents
- DOC: Add tune.h2.max-frame-size option to table of contents
- DOC: Attempt to fix dconv parsing error for tune.h2.fe.initial-window-size
- REGTESTS: h1_host_normalization : Add a barrier to not mix up log messages
- MEDIUM: mux-h1: Split h1_process_mux() to make code more readable
- REORG: mux-h1: Rename functions to emit chunk size/crlf in the output buffer
- MINOR: mux-h1: Add function to append the chunk size to the output buffer
- MINOR: mux-h1: Add function to prepend the chunk crlf to the output buffer
- MEDIUM: filters/htx: Don't rely on HTX extra field if payload is filtered
- MEDIIM: mux-h1: Add splicing support for chunked messages
- REGTESTS: Add a script to test the kernel splicing with chunked messages
- CLEANUP: mux-h1: Remove useless __maybe_unused statement
- BUG/MINOR: http_ext: fix if-none regression in forwardfor option
- REGTEST: add an extra testcase for ifnone-forwardfor
- BUG/MINOR: mworker: leak of a socketpair during startup failure
- BUG/MINOR: quic: Prevent deadlock with CID tree lock
- MEDIUM: ssl: handle the SSL_ERROR_ZERO_RETURN during the handshake
- BUG/MINOR: ssl: SSL_ERROR_ZERO_RETURN returns CO_ER_SSL_EMPTY
- BUILD: mux-h1: silence a harmless fallthrough warning
- BUG/MEDIUM: quic: error checking buffer large enought to receive the retry tag
- MINOR: ssl: allow to change the server signature algorithm on server lines
- MINOR: ssl: allow to change the client-sigalgs on server lines
- BUG/MINOR: config: fix stick table duplicate name check
- BUG/MINOR: quic: Missing random bits in Retry packet header
- BUG/MINOR: quic: Wrong Retry paquet version field endianess
- BUG/MINOR: quic: Wrong endianess for version field in Retry token
- IMPORT: slz: implement a synchronous flush() operation
- MINOR: compression/slz: add support for a pure flush of pending bytes
- MINOR: quic: Move QUIC TLS encryption level related code (quic_conn_enc_level_init())
- MINOR: quic: Move QUIC encryption level structure definition
- MINOR: quic: Implement a packet number space identification function
- MINOR: quic: Move packet number space related functions
- MEDIUM: quic: Dynamic allocations of packet number spaces
- CLEANUP: quic: Remove qc_list_all_rx_pkts() defined but not used
- MINOR: quic: Add a pool for the QUIC TLS encryption levels
- MEDIUM: quic: Dynamic allocations of QUIC TLS encryption levels
- MINOR: quic: Reduce the maximum length of TLS secrets
- CLEANUP: quic: Remove two useless pools a low QUIC connection level
- MEDIUM: quic: Handle the RX in one pass
- MINOR: quic: Remove call to qc_rm_hp_pkts() from I/O callback
- CLEANUP: quic: Remove server specific about Initial packet number space
- MEDIUM: quic: Release encryption levels and packet number spaces asap
- CLEANUP: quic: Remove a useless test about discarded pktns (qc_handle_crypto_frm())
- MINOR: quic: Move the packet number space status at quic_conn level
- MINOR: quic: Drop packet with type for discarded packet number space.
- BUILD: quic: Add a DISGUISE() to please some compiler to qc_prep_hpkts() 1st parameter
- BUILD: debug: avoid a build warning related to epoll_wait() in debug code
Ilya reported in issue #2193 that the latest Fedora complains about us
passing NULL to epoll_wait() in the "debug dev fd" code to try to detect
an epoll FD. That was intentional to get the kernel's verifications and
make sure we're facing a poller, but since such a warning comes from the
libc, it's possible that it plans to replace the syscall with a wrapper
in the near future (e.g. epoll_pwait()), and that just hiding the NULL
(as was confirmed to work) might just postpone the problem.
Let's take another approach, instead we'll create a new dummy FD that
we'll try to remove from the epoll set using epoll_ctl(). Since we
created the FD we're certain it cannot be there. In this case (and
only in this case) epoll_ctl() will return -ENOENT, otherwise it will
typically return EINVAL or EBADF. It was verified that it works and
doesn't return false positives for other FD types. It should be
backported to the branches that contain a backport of the commit which
introduced the feature, apparently as far as 2.4:
5be7c198e ("DEBUG: cli: add a new "debug dev fd" expert command")
Some compiler could complain with such a warning:
src/quic_conn.c:3700:44: warning: potential null pointer dereference [-Wnull-dereference]
3700 | frms = &qel->pktns->tx.frms;
It could not figure out that <qel> could not be NULL at this location.
This is fixed calling qc_prep_hpkts() with a disguise 1st parameter.
This patch allows the low level packet parser to drop packets with type for discarded
packet number spaces. Furthermore, this prevents it from reallocating new encryption
levels and packet number spaces already released/discarded. When a packet number space
is discarded, it MUST NOT be reallocated.
As the packet number space discarding is done asap the type of packet received is
known, some packet number space discarding check may be safely removed from qc_try_rm_hp()
and qc_qel_may_rm_hp() which are called after having parse the packet header, and
is type.
As the packet number spaces and encryption level are dynamically allocated,
the information about the packet number space discarded status must be kept
somewhere else than in these objects.
quic_tls_discard_keys() is no more useful.
Modify quic_pktns_discard() to do the same job: flag the quic_conn object
has having discarded packet number space.
Implement quic_tls_pktns_is_disarded() to check if a packet number space is
discarded. Note the Application data packet number space is never discarded.
There is no need to check that the packet number space associated to the encryption
level to handle the CRYPTO frames is available when entering qc_handle_crypto_frm().
This has already been done by the caller: qc_treat_rx_pkts().
Release the memory allocated for the Initial and Handshake encryption level
under packet number spaces as soon as possible.
qc_treat_rx_pkts() has been modified to do that for the Initial case. This must
be done after all the Initial packets have been parsed and the underlying TLS
secrets have been flagged as to be discarded. As the Initial encryption level is
removed from the list attached to the quic_conn object inside a list_for_each_entry()
loop, this latter had to be converted into a list_for_each_entry_safe() loop.
The memory allocated by the Handshake encryption level and packet number spaces
is released just before leaving the handshake I/O callback (qc_conn_io_cb()).
As ->iel and ->hel pointer to Initial and Handshake encryption are reset to null
value when releasing the encryption levels memory, some check have been added
at several place before dereferencing them. Same thing for the ->ipktns and ->htpktns
pointer to packet number spaces.
Also take the opportunity of this patch to modify qc_dgrams_retransmit() to
use packet number space variables in place of encryption level variables to shorten
some statements. Furthermore this reflects the jobs it has to do: retransmit
UDP datagrams from packet number spaces.
quic_conn_io_cb() is the I/O handler callback used during the handshakes.
quic_conn_app_io_cb() is used after the handshakes. Both call qc_rm_hp_pkts()
before parsing the RX packets ordered by their packet numbers calling qc_treat_rx_pkts().
qc_rm_hp_pkts() is there to remove the header protection to reveal the packet
numbers, then reorder the packets by their packet numbers.
qc_rm_hp_pkts() may be safely directly called by qc_treat_rx_pkts(), which is
itself called by the I/O handlers.
Insert a loop around the existing code of qc_treat_rx_pkts() to handle all
the RX packets by encryption level for all the encryption levels allocated
for the connection, i.e. for all the encryption levels with secrets derived
from the ones provided by the TLS stack through ->set_encryption_secrets().
The maximum length of the secrets derived by the TLS stack is 384 bits.
This reduces the size of the objects provided by the "quic_tls_secret" pool by
16 bytes.
Should be backported as far as 2.6
Replace ->els static array of encryption levels by 4 pointers into the QUIC
connection object, quic_conn struct.
->iel denotes the Initial encryption level,
->eel the Early-Data encryption level,
->hel the Handshaske encryption level and
->ael the Application Data encryption level.
Add ->qel_list to this structure to list the encryption levels after having been
allocated. Modify consequently the encryption level object itself (quic_enc_level
struct) so that it might be added to ->qel_list QUIC connection list of
encryption levels.
Implement qc_enc_level_alloc() to initialize the value of a pointer to an encryption
level object. It is used to initialized the pointer newly added to the quic_conn
structure. It also takes a packet number space pointer address as argument to
initialize it if not already initialized.
Modify quic_tls_ctx_reset() to call it from quic_conn_enc_level_init() which is
called by qc_enc_level_alloc() to allocate an encryption level object.
Implement 2 new helper functions:
- ssl_to_qel_addr() to match and pointer address to a quic_encryption level
attached to a quic_conn object with a TLS encryption level enum value;
- qc_quic_enc_level() to match a pointer to a quic_encryption level attached
to a quic_conn object with an internal encryption level enum value.
This functions are useful to be called from ->set_encryption_secrets() and
->add_handshake_data() TLS stack called which takes a TLS encryption enum
as argument (enum ssl_encryption_level_t).
Replace all the use of the qc->els[] array element values by one of the newly
added ->[ieha]el quic_conn struct member values.
Very simple patch to define and declare a pool for the QUIC TLS encryptions levels.
It will be used to dynamically allocate such objects to be attached to the
QUIC connection object (quic_conn struct) and to remove from quic_conn struct the
static array of encryption levels (see ->els[]).
Add a pool to dynamically handle the memory used for the QUIC TLS packet number spaces.
Remove the static array of packet number spaces at QUIC connection level (struct
quic_conn) and add three new members to quic_conn struc as pointers to quic_pktns
struct, one by packet number space as follows:
->ipktns for Initial packet number space,
->hpktns for Handshake packet number space and
->apktns for Application packet number space.
Also add a ->pktns_list new member (struct list) to quic_conn struct to attach
the list of the packet number spaces allocated for the QUIC connection.
Implement ssl_to_quic_pktns() to map and retrieve the addresses of these pointers
from TLS stack encryption levels.
Modify quic_pktns_init() to initialize these members.
Modify ha_quic_set_encryption_secrets() and ha_quic_add_handshake_data() to
allocate the packet numbers and initialize the encryption level.
Implement quic_pktns_release() which takes pointers to pointers to packet number
space objects to release the memory allocated for a packet number space attached
to a QUIC connection and reset their address values.
Modify qc_new_conn() to allocation only the Initial packet number space and
Initial encryption level.
Modify QUIC loss detection API (quic_loss.c) to use the new ->pktns_list
list attached to a QUIC connection in place of a static array of packet number
spaces.
Replace at several locations the use of elements of an array of packet number
spaces by one of the three pointers to packet number spaces
haproxy/quic_tls-t.h is the correct place to quic_enc_level structure
definition.
Should be backported as far as 2.6 to ease any further backport to come.
While HTTP makes no promises of sub-message delivery, haproxy tries to
make reasonable efforts to be friendly to applications relying on this,
particularly though the "option http-no-delay" statement. However, it
was reported that when slz compression is being used, a few bytes can
remain pending for more data to complete them in the SLZ queue when
built on a 64-bit little endian architecture. This is because aligning
blocks on byte boundary is costly (requires to switch to literals and
to send 4 bytes of block size), so incomplete bytes are left pending
there until they reach at least 32 bits. On other architecture, the
queue is only 8 bits long.
Robert Newson from Apache's CouchDB project explained that the heartbeat
used by CouchDB periodically delivers a single LF character, that it used
to work fine before the change enlarging the queue for 64-bit platforms,
but only forwards once every 3 LF after the change. This was definitely
caused by this incomplete byte sequence queuing. Zlib is not affected,
and the code shows that ->flush() is always called. In the case of SLZ,
the called function is rfc195x_flush_or_finish() and when its "finish"
argument is zero, no flush is performed because there was no such flush()
operation.
The previous patch implemented a flush() operation in SLZ, and this one
makes rfc195x_flush_or_finish() call it when finish==0. This way each
delivered data block will now provoke a flush of the queue as is done
with zlib.
This may very slightly degrade the compression ratio, but another change
is needed to condition this on "option http-no-delay" only in order to
be consistent with other parts of the code.
This patch (and the preceeding slz one) should be backported at least to
2.6, but any further change to depend on http-no-delay should not.
In some cases it may be desirable for latency reasons to forcefully
flush the queue even if it results in suboptimal compression. In our
case the queue might contain up to almost 4 bytes, which need an EOB
and a switch to literal mode, followed by 4 bytes to encode an empty
message. This means that each call can add 5 extra bytes in the ouput
stream. And the flush may also result in the header being produced for
the first time, which can amount to 2 or 10 bytes (zlib or gzip). In
the worst case, a total of 19 bytes may be emitted at once upon a flush
with 31 pending bits and a gzip header.
This is libslz upstream commit cf8c4668e4b4216e930b56338847d8d46a6bfda9.
The 32-bits version field of the Retry paquet was inversed by the code. As this
field must be in the network byte order on the wire, this code has supposed that
the sender of the Retry packet will always be little endian. Hopefully this is
often the case on our Intel machines ;)
Must be backported as far as 2.6.
The 4 bits least significant bits of the first byte in a Retry packet must be
random. There are generated calling statistical_prng_range() with 16 as argument.
Must be backported as far as 2.6.
When a stick-table is defined within a peers section, the name is
prefixed with the peers section name. However when checking for
duplicate table names, the check was using the table name without
the prefix, and would thus never match.
Must be backported as far as 2.6.
This patch introduces the "client-sigalgs" keyword for the server line,
which allows to configure the list of server signature algorithms
negociated during the handshake. Also available as
"ssl-default-server-client-sigalgs" in the global section.
This patch introduces the "sigalgs" keyword for the server line, which
allows to configure the list of server signature algorithms negociated
during the handshake. Also available as "ssl-default-server-sigalgs" in
the global section.