Historically the client-fin and server-fin timeouts were made to allow
a connection closure to be effective quickly if the last data were sent
down a socket and the client didn't close, something that can happen
when the peer's FIN is lost and retransmits are blocked by a firewall
for example. This made complete sense in 1.5 for TCP and HTTP in close
mode. But nowadays with muxes, it's not done at the right layer anymore
and even the description doesn't match what is being done, because what
happens is that the stream will abort the whole transfer after it's done
sending to the mux and this timeout expires.
We've seen in GH issue 2095 that this can happen with very short timeout
values, and while this didn't trigger often before, now that the muxes
(h2 & quic) properly report an end of stream before even the first
sc_conn_sync_recv(), it seems that it can happen more often, and have
two undesirable effects:
- logging a timeout when that's not the case
- aborting the request channel, hence the server-side conn, possibly
before it had a chance to be put back to the idle list, causing
this connection to be closed and not reusable.
Unfortunately for TCP (mux_pt) this remains necessary because the mux
doesn't have a timeout task. So here we're adding tests to only do
this through an HTX mux. But to be really clean we should in fact
completely drop all of this and implement these timeouts in the mux
itself.
This needs to be backported to 2.8 where the issue was discovered,
and maybe carefully to older versions, though that is not sure at
all. In any case, using a higher timeout or removing client-fin in
HTTP proxies is sufficient to make the issue disappear.
As seen in commits 33a4461fa ("BUG/MINOR: stats: Fix Lua's `get_stats`
function") and a46b142e8 ("BUG/MINOR: Missing stat_field_names (since
f21d17bb)") it seems frequent to omit to update stats_fields[] when
adding a new ST_F_xxx entry. This breaks Lua's get_stats() and shows
a "(null)" in the header of "show stat", but that one is not detectable
to the naked eye anymore.
Let's add a reminder above the enum declaration about this, and a small
reg tests checking for the absence of "(null)". It was verified to fail
before the last patch above.
Lua's `get_stats` function stopped working in
4cfb0019e6, due to the addition a new field
ST_F_PROTO without a corresponding entry in `stat_fields`.
Fix the issue by adding the entry, like
a46b142e88 did previously for a different field.
This patch fixes GitHub Issue #2174, it should be backported to 2.8.
Released version 2.8.0 with the following main changes :
- MINOR: compression: Improve the way Vary header is added
- BUILD: makefile: search for SSL_INC/wolfssl before SSL_INC
- MINOR: init: pre-allocate kernel data structures on init
- DOC: install: add details about WolfSSL
- BUG/MINOR: ssl_sock: add check for ha_meth
- BUG/MINOR: thread: add a check for pthread_create
- BUILD: init: print rlim_cur as regular integer
- DOC: install: specify the minimum openssl version recommended
- CLEANUP: mux-quic: remove unneeded fields in qcc
- MINOR: mux-quic: remove nb_streams from qcc
- MINOR: quic: fix stats naming for flow control BLOCKED frames
- BUG/MEDIUM: mux-quic: only set EOI on FIN
- BUG/MEDIUM: threads: fix a tiny race in thread_isolate()
- DOC: config: fix rfc7239 converter examples
- DOC: quic: remove experimental status for QUIC
- CLEANUP: mux-quic: rename functions for mux_ops
- CLEANUP: mux-quic: rename internal functions
- BUG/MINOR: mux-h2: refresh the idle_timer when the mux is empty
- DOC: config: Fix bind/server/peer documentation in the peers section
- BUILD: Makefile: use -pthread not -lpthread when threads are enabled
- CLEANUP: doc: remove 21 totally obsolete docs
- DOC: install: mention the common strict-aliasing warning on older compilers
- DOC: install: clarify a few points on the wolfSSL build method
- MINOR: quic: Add QUIC connection statistical counters values to "show quic"
- EXAMPLES: update the basic-config-edge file for 2.8
- MINOR: quic/cli: clarify the "show quic" help message
- MINOR: version: mention that it's LTS now.
Add the total number of sent packets for each QUIC connection dumped by
"show quic". Also add the remaining counter values only if not null.
Must be backported to 2.7.
Let's make clear which commands goes into the wolfSSL directory and
which one in the haproxy directory. Also, let's add a paragraph in the
QUIC section explaining how to proceed with wolfSSL.
In the errors and warnings section about common issues, it's useful to
mention the strict-aliasing warning that was happening with gcc-4.4 that
may still be found on old systems, especially since it will probably take
ages to build there and the warning is harmless.
These were docs for very old design thoughts or internal subsystems
which are now totally irrelevant and even misleading. Those with some
outdated ideas mixed with useful stuff were kept though.
-pthread is normally the right way to enable threads, it involves -lpthread
at the end of the arguments, and also enables -D_REENTRANT=1. We normally
don't care about the subtle difference, but building with a static openssl
library that has threads enabled breaks because -lpthread is placed before
the SSL_LDFLAGS and openssl doesn't find pthread_atfork().
Let's change the flag to -pthread once for all, that's something we've
considered over the last decade without having a good reason to do it
since it didn't bring any value. Now at least it fixes a build issues,
this is a good reason. This doesn't need to be backported since it is
one of the consequences of the new more flexible build options in 2.8.
Documentation about bind and server directives in the peers section was
retrieved from the proxy part but there are some limitations, especially for
the bind directive. And the same is true for the peer directive. It is
forbidden to have several listening addresses. Multiple addresses or port
range are not allowed.
Here, only the documentation is fixed. The configuration parsing will be
improved later to trigger errors on bad uses.
In addition, it is also specified that unix socket are supported.
This patch partially fixes the issue #2066. It should be backported to all
stable versions.
There's a rare case where on long fat pipes, we can see the keep-alive
timeout trigger before the end of the transfer of the last large object,
and the connection closed a bit quickly after the end of the transfer
because a GOAWAY is queued. The data are not destroyed, except that
the WINDOW_UPDATES from the client arriving late while the last data
are being drained by the socket buffers may at some point trigger a
reset, and some clients might choke a bit too early on these. Let's
make sure we only arm the idle_start timestamp once the output buffer
is empty. Of course it will still not cover for the data pending in the
socket buffers but it will at least let those in the buffer leave in
peace. More elaborate options can be used to protect the data in the
kernel buffers, such as the one described in GH issue #5.
It's very likely that this old issue was emphasized by the following
commit in 2.6:
15a4733d5 ("BUG/MEDIUM: mux-h2: make use of http-request and keep-alive timeouts")
and the behavior probably changed again with this one in 2.8, which
was backported to 2.7 and scheduled for 2.6:
d38d8c6cc ("BUG/MEDIUM: mux-h2: make sure control frames do not refresh the idle timeout")
As such this patch should be backported to 2.6 after some observation
period.
This patch is similar to the previous one but for QUIC mux functions
used inside the mux code itself or application layer. Replace all
occurences of qc_* prefix by qcc_* or qcs_*. This should help to better
differentiate code between quic_conn and MUX.
This should be backported up to 2.7.
Rename all QUIC mux function exposed through mux_ops structure. Use the
prefix qmux_* or qmux_strm_*. The objective is to remove qc_* prefix
which should only be used in quic_conn layer.
This should be backported up to 2.7.
QUIC support can now be considered production-ready. As such, remove all
statements on the documentation concerning its experimental status.
Do not backport this one.
Some rfc7239 converter examples were not working and thus were misleading.
Fixing rfc7239_n2nn and rfc7239_n2np usage examples.
As both converters were introduced in 2.8, no backport needed.
Aurlien found a tiny race in thread_isolate() that can allow a thread
that was running under isolation to continue running while another one
enters isolation. The reason is that the check for harmless is only
done before winning the CAS, but since the previously isolated thread
doesn't wait for !rdv_request in thread_release(), it can effectively
continue its activities while the next one believes it's isolated. A
proper solution consists in looping once again in thread_isolate() to
recheck (and wait) for all threads to be isolated once the CAS is won.
The issue was introduced in 2.7 by commit 598cf3f22 ("MAJOR: threads:
change thread_isolate to support inter-group synchronization") so the
fix needs to be backported there.
Recently stconn flags were reviewed for QUIC mux to be conform with
other HTTP muxes. However, a mistake was made when dealing with a proper
stream FIN with both EOI and EOS set. This was done as RESET_STREAM
received after a FIN are ignored by QUIC mux and thus there is no
difference between EOI or EOI+EOS. However, analyzers may interpret EOS
as an interrupted request which result in a 400 HTTP error code.
To fix this, only set EOI on proper stream FIN. EOS is set when input is
interrupted (RESET_STREAM before FIN) or a STOP_SENDING is received
which prevent transfer to complete. In this last case, EOS must be
manually set too if FIN has been received before STOP_SENDING to go
directly from ERR_PENDING to final ERROR state.
This must be backported up to 2.7.
There was a misnaming in stats counter for *_BLOCKED frames in regard to
QUIC rfc convention. This patch fixes it to prevent future ambiguity :
- STREAMS_BLOCKED -> STREAM_DATA_BLOCKED
- STREAMS_DATA_BLOCKED_BIDI -> STREAMS_BLOCKED_BIDI
- STREAMS_DATA_BLOCKED_UNI -> STREAMS_BLOCKED_UNI
This should be backported up to 2.7.
Remove nb_streams field from qcc. It was not used outside of a BUG_ON()
statement to ensure we never have a negative count of streams. However
this is already checked with other fields.
This should be backported up to 2.7.
haproxy does not compile anymore on macOS+clang since 425d7ad ("MINOR:
init: pre-allocate kernel data structures on init"). This is due to
rlim_cur being printed uncasted using %lu format specifier, with rlim_cur
being stored as a rlim_t which is a typedef so its size may vary depending
on the system's architecture.
This is not the first time we need to dump rlim_cur in case of errors,
there are already multiple occurences in the init code. Everywhere this
happens, rlim is casted as a regular int and printed using the '%d'
format specifier, so we do the same here as well to fix the build issue.
No backport needed unless 425d7ad gets backported.
preload_libgcc_s() use pthread_create to create a thread and then call
pthread_join to use it, but it doesn't check if the option is successful.
So add a check to aviod potential crash.
in __ssl_sock_init, BIO_meth_new may failed and return NULL if
OPENSSL_zalloc failed. in this case, ha_meth will be NULL, and then
crash happens in BIO_meth_set_write. So, we add a check for ha_meth.
The Linux kernel maintains data structures to track a processes' open file
descriptors, and it expands these structures as necessary when FD usage grows
(at every FD=2^X starting at 64). However when threading is in use, during
expansion the kernel will pause (observed up to 47ms) while it waits for thread
synchronization (see https://bugzilla.kernel.org/show_bug.cgi?id=217366).
This change addresses the issue and avoids the random pauses by opening the
maximum file descriptor during initialization, so that expansion will not occur
while processing traffic.
Building with an install of wolfssl and openssl side-by-side breaks
because for wolfssl we need the two include levels and since some
names are in common, this results in some files being found in the
original openssl tree. Let's swap the two include paths so that all
that is related to wolfssl is found there first when needed.
No backport is needed.
When a message is compressed, A "Vary" header is added with
"accept-encoding" value. However, a new header is always added, regardless
there is already a Vary header or not. In addition, if there is already a
Vary header, there is no check on values to be sure "accept-encoding" value
is not already there. So it is possible to have it twice.
To improve this part, we now test Vary header values and "accept-encoding"
is only added if it was not found. In addition, "accept-encoding" value is
appended to the last Vary header found, if any. Otherwise, a new header is
added.
Released version 2.8-dev13 with the following main changes :
- DOC: add size format section to manual
- CLEANUP: mux-quic/h3: complete BUG_ON with comments
- MINOR: quic: remove return val of quic_aead_iv_build()
- MINOR: quic: use WARN_ON for encrypt failures
- BUG/MINOR: quic: handle Tx packet allocation failure properly
- MINOR: quic: fix alignment of oneline show quic
- MEDIUM: stconn/applet: Allow SF_SL_EOS flag alone
- MEDIUM: stconn: make the SE_FL_ERR_PENDING to ERROR transition systematic
- DOC: internal: add a bit of documentation for the stconn closing conditions
- DOC/MINOR: config: Fix typo in description for `ssl_bc` in configuration.txt
- BUILD: quic: re-enable chacha20_poly1305 for libressl
- MINOR: mux-quic: set both EOI EOS for stream fin
- MINOR: mux-quic: only set EOS on RESET_STREAM recv
- MINOR: mux-quic: report error on stream-endpoint earlier
- BUILD: makefile: fix build issue on GNU make < 3.82
- BUG/MINOR: mux-h2: Check H2_SF_BODY_TUNNEL on H2S flags and not demux frame ones
- MINOR: mux-h2: Set H2_SF_ES_RCVD flag when decoding the HEADERS frame
- MINOR: mux-h2: Add a function to propagate termination flags from h2s to SE
- BUG/MEDIUM: mux-h2: Propagate termination flags when frontend SC is created
- DEV: add a Lua helper script for SSL keys logging
- CLEANUP: makefile: don't display a dummy features list without a target
- BUILD: makefile: do not erase build options for some build options
- MINOR: quic: Add low level traces (addresses, DCID)
- BUG/MINOR: quic: Wrong token length check (quic_generate_retry_token())
- BUG/MINOR: quic: Missing Retry token length on receipt
- MINOR: quic: Align "show quic" command help information
- CLEANUP: quic: Indentation fix quic_rx_pkt_retrieve_conn()
- CLEANUP: quic: Useless tests in qc_rx_pkt_handle()
- MINOR: quic: Add some counters at QUIC connection level
- MINOR: quic: Add a counter for sent packets
- MINOR: hlua: hlua_smp2lua_str() may LJMP
- MINOR: hlua: hlua_smp2lua() may LJMP
- MINOR: hlua: hlua_arg2lua() may LJMP
- DOC: hlua: document hlua_lua2arg() function
- DOC: hlua: document hlua_lua2smp() function
- BUG/MINOR: hlua: unsafe hlua_lua2smp() usage
- BUILD: makefile: commit the tiny FreeBSD makefile stub
- BUILD: makefile: fix build options when building tools first
- BUILD: ist: do not put a cast in an array declaration
- BUILD: ist: use the literal declaration for ist_lc/ist_uc under TCC
- BUILD: compiler: systematically set USE_OBSOLETE_LINKER with TCC
- DOC: install: update reference to known supported versions
- SCRIPTS: publish-release: update the umask to keep group write access
Gcc 13 is known to work, OpenSSL 3.1 and wolfSSL as well. Add a few
hints about build errors when using QUIC + OpenSSL and warnings about
the dramatic OpenSSL 3.x performance regression.
TCC silently ignores the weak and section attributes, which ruins the
initcalls. Technically we're exactly in the same situation as with an
obsolete linker. Let's just automatically set the flag if TCC is
detected, this avoids surprises where the program compiles but does
not start.
No backport is needed.
TCC doesn't knoow about __attribute__((weak)), it silently ignores it.
We could add a "static" modifier there in this case but we already have
an alternate portable mode that is based on a slightly larger literal
for obsolete linkers (and non-ELF systems) which choke on weak. Let's
just add the test for tcc there and use it in this case.
No backport is needed.
TCC is upset by the declaration looking like:
const unsigned char ist_lc[256] __attribute__((weak)) = ((const unsigned char[256]){ ... });
It was written like this because it's expanded from the _IST_LC macro
but it's never used as-is, it's only used from ist_lc, which should be
the one containing the cast so that the macro only contains the list of
bytes that can be used in both places. And this assigns more consistent
roles to the lower and upper case macro/variable now, one is typed and
the other one not. No backport is needed.
Due to the test on the target introduced by commit 9577a152b ("BUILD:
makefile: do not erase build options for some build options"), if a
tool (e.g. halog) is build first before haproxy after a clean or a
fresh source extraction, the .build_opts file does not exist and
"make" complains since there's no such target. Make sure to define
the empty target for all "else" blocks there. No backport is needed.
The idea here is to try to detect the use of "make" instead of "gmake"
on FreeBSD. After having long tried, there's no way to construct a
condition that is common to both makefile languages and could serve as
a differentiator since there's simply no common word between the two
languages. However on FreeBSD (the main used BSD platform), "make" is
configured to look for BSDmakefile before the other ones. It allows us
to intercept it and explain to use gmake with an example of a roughly
converted make command line (we just strip "-J xx,xx" that systematically
gets inserted if "-j" is used). A few tricks are used, such as creating
a dummy target on the fly based on the requested one just to silence the
output, and always match "all" since it's used by default when no target
is specified. .DEFAULTS was initially used but finally dropped thanks to
this.
For example:
$ make -j$(getconf NPROCESSORS_ONLN) TARGET=freebsd USE_OPENSSL=1
Please use GNU make instead. It is often called gmake.
Example:
gmake -j 4 TARGET=freebsd USE_OPENSSL=1 all
It will often be sufficient to permit a copy-paste and to try again.
Note that the .gitignore was updated.
Fixing hlua_lua2smp() usage in hlua's code since it was assumed that
hlua_lua2smp() makes a standalone smp out of lua data, but it is not
the case.
This is especially true when dealing with lua strings (string is
extracted using lua_tolstring() which returns a pointer to lua string
memory location that may be reclaimed by lua at any time when no longer
used from lua's point of view). Thus, smp generated by hlua_lua2smp() may
only be used from the lua context where the call was initially made, else
it should be explicitly duplicated before exporting it out of lua's
context to ensure safe (standalone) usage.
This should be backported to all stable versions.
Add ->sent_pkt counter to quic_conn struct to count the packet at QUIC connection
level. Then, when the connection is released, the ->sent_pkt counter value
is added to the one for the listener.
Must be backported to 2.7.
Add some statistical counters to quic_conn struct from quic_counters struct which
are used at listener level to handle them at QUIC connection level. This avoid
calling atomic functions. Furthermore this will be useful soon when a counter will
be added for the total number of packets which have been sent which will be very
often incremented.
Some counters were not added, espcially those which count the number of QUIC errors
by QUIC error types. Indeed such counters would be incremented most of the time
only one time at QUIC connection level.
Implement quic_conn_prx_cntrs_update() which accumulates the QUIC connection level
statistical counters to the listener level statistical counters.
Must be backported to 2.7.