The principle that each mux stream should have an endpoint is not
guaranteed for closed streams that map to the dummy static streams.
Let's have a dummy endpoint for use with such streams. It only has
the DETACHED flag and a NULL conn_stream, and is referenced by all
the closed streams so that we can afford not to test h2s->endp when
trying to access the flags or the CS.
There is always an endpoint link in a stream, and this endpoint link
contains a pointer to the conn_stream it's attached to, so the one in
the h1 stream is always duplicate now. Let's always use endp->cs
instead and get rid of it.
Muxes and applets need to have both a pointer to the endpoint and to the
conn_stream. It would seem more natural that they only have a pointer to
the endpoint (that is always there) and that this one has an optional
pointer to the conn_stream. This would reduce the number of elements to
manipulate in lower level code. In addition, the conn_stream is not much
used from the lower layers (wake and exceptional events mostly).
The few applets that set CS_EP_EOI or CS_EP_ERROR used to set it on the
endpoint retrieved from the conn_stream while it's already available on
the appctx itself. Better use the appctx one to limit the unneeded
interactions between the two sides.
At a few places the endpoint pointer was retrieved from the conn_stream
while it's safer and more long-term proof to take it from the qcs. Let's
just do that.
At a few places the endpoint pointer was retrieved from the conn_stream
while it's safer and more long-term proof to take it from the fstrm.
Let's just do that.
At a few places the endpoint pointer was retrieved from the conn_stream
while it's safer and more long-term proof to take it from the context.
Let's just do that.
At a few places the endpoint pointer was retrieved from the conn_stream
while it's safer and more long-term proof to take it from the h2s. Let's
just do that.
At a few places the endpoint pointer was retrieved from the conn_stream
while it's safer and more long-term proof to take it from the h1s. Let's
just do that.
Wherever we need to report an error, we have an even easier access to
the endpoint than the conn_stream. Let's first adjust the API to use
the endpoint and rename the function accordingly to cs_ep_set_error().
Since the 2.5, for security reason, HTTP/1.0 GET/HEAD/DELETE requests with a
payload are rejected (See e136bd12a "MEDIUM: mux-h1: Reject HTTP/1.0
GET/HEAD/DELETE requests with a payload" for details). However it may be an
issue for old clients.
To avoid any compatibility issue with such clients,
"h1-accept-payload-with-any-method" global option was added. It must only be
set if there is a good reason to do so because it may lead to a request
smuggling attack on some servers or intermediaries.
This patch should solve the issue #1691. it may be backported to 2.5.
In wdt_handler(), does not try to trigger the watchdog if the
prev_cpu_time wasn't initialized.
This prevents an unexpected trigger of the watchdog when it wasn't
initialized yet. This case could happen in the master just after loading
the configuration. This would show a trace where the <diff> value is equal
to the <now> value in the trace, and the <poll> value would be 0.
For example:
Thread 1 is about to kill the process.
*>Thread 1 : id=0x0 act=1 glob=1 wq=0 rq=0 tl=0 tlsz=0 rqsz=0
stuck=1 prof=0 harmless=0 wantrdv=0
cpu_ns: poll=0 now=6005541706 diff=6005541706
curr_task=0
Thanks to Christian Ruppert for repporting the problem.
Could be backported in every stable versions.
Lua API Channel.remove() and HTTPMessage.remove() expects 1 to 3
arguments (counting the manipulated object), with offset and length
being the 2nd and 3rd argument, respectively.
hlua_{channel,http_msg}_del_data() incorrectly gets the 3rd argument as
offset, and 4th (nonexistent) as length. hlua_http_msg_del_data() also
improperly checks arguments. This patch fixes argument handling in both.
Must be backported to 2.5.
Implement a series of unit test to validate ncbuf. This is written with
a main function which can be compiled independently using the following
command-line :
$ gcc -DSTANDALONE -lasan -I./include -o ncbuf src/ncbuf.c
The first part tests is used to test ncb_add()/ncb_advance(). After each
call a loop is done on the buffer blocks which should ensure that the
gap infos are correct.
The second part generates random offsets and insert them until the
buffer is full. The buffer is then resetted and all random offsets are
re-inserted in the reverse order : the buffer should be full once again.
The generated binary takes arguments to change the tests execution.
"usage: ncbuf [-r] [-s bufsize] [-h bufhead] [-p <delay_msec>]"
A new function ncb_advance() is implemented. This is used to advance the
buffer head pointer. This will consume the front data while forming a
new gap at the end for future data.
On success NCB_RET_OK is returned. The operation can be rejected if a
too small new gap is formed in front of the buffer.
Define three different ways to proceed insertion. This configures how
overlapping data is treated.
- NCB_ADD_PRESERVE : in this mode, old data are kept during insertion.
- NCB_ADD_OVERWRT : new data will overwrite old ones.
- NCB_ADD_COMPARE : this mode adds a new test in check stage. The
overlapping old and new data must be identical or else the insertion
is not conducted. An error NCB_RET_DATA_REJ is used in this case.
The mode is specified with a new argument to ncb_add() function.
Implement a new function ncb_add() to insert data in ncbuf. This
operation is conducted in two stages. First, a simulation will be run to
ensure that insertion can be proceeded. If a gap is formed, either
before or after the new data, it must be big enough to store its header,
or else the insertion is aborted.
After this check stage, the insertion is conducted block by block with
the function pair ncb_fill_data_blk()/ncb_fill_gap_blk().
A new type ncb_ret is used as a return value. For the moment, only
success or gap-size error is used. It is planned to add new error types
in the future when insertion will be extended.
Relax the constraint for gap storage when this is the last block.
ncb_blk API functions will consider that if a gap is stored near the end
of the buffer, without the space to store its header, the gap will cover
entirely the buffer end.
For these special cases, the gap size/data size are not write/read
inside the gap to prevent an overflow. Such a gap is designed in
functions as "reduced gap" and will be flagged with the value
NCB_BK_F_FIN.
This should reduce the rejection on future add operation when receiving
data in-order. Without reduced gap handling, an insertion would be
rejected if it covers only partially the last buffer bytes, which can be
a very common case.
Implement two new functions to report the total data stored accross the
whole buffer and the data stored at a specific offset until the next gap
or the buffer end.
To facilitate implementation of these new functions and also future
add/delete operations, a new abstraction is introduced : ncb_blk. This
structure represents a block of either data or gap in the buffer. It
simplifies operation when moving forward in the buffer. The first buffer
block can be retrieved via ncb_blk_first(buf). The block at a specific
offset is accessed via ncb_blk_find(buf, off).
This abstraction is purely used in functions but not stored in the ncbuf
structure per-se. This is necessary to keep the minimal memory
footprint.
Define the new type ncbuf. It can be used as a buffer with
non-contiguous data and wrapping support.
To reduce as much as possible the memory footprint, size of data and
gaps are stored in the gaps themselves. This put some limitation on the
buffer usage. A reserved space is present just before the head to store
the size of the first data block. Also, add and delete operations will
be constrained to ensure minimal gap sizes are preserved.
The sizes stored in the gaps are represented by a custom type named
ncb_sz_t. This type is a typedef to easily change it : this has a
direct impact on the maximum buffer size (MAX(ncb_sz_t) - sizeof(ncb_sz_t))
and the minimal gap sizes (sizeof(ncb_sz_t) * 2)).
Currently, it is set to uint32_t.
Add send_stateless_reset() to send a stateless reset packet. It prepares
a packet to build a 1-RTT packet with quic_stateless_reset_token_cpy()
to copy a stateless reset token derived from the cluster secret with
the destination connection ID received as salt.
Also add QUIC_EV_STATELESS_RST new trace event to at least to have a trace
of the connection which are reset.
A server may send the stateless reset token associated to the current
connection from its transport parameters. So, let's copy it from
qc_lstnt_params_init().
The stateless reset token of a connection is generated from qc_new_conn() when
allocating the first connection ID. A QUIC server can copy it into its transport
parameters to allow the peer to reset the associated connection.
This latter is not easily reachable after having returned from qc_new_conn().
We want to be able to initialize the transport parameters from this function which
has an access to all the information to do so.
Extract the code used to initialize the transport parameters from qc_lstnr_pkt_rcv()
and make it callable from qc_new_conn(). qc_lstnr_params_init() is implemented
to accomplish this task for a haproxy listener.
Modify qc_new_conn() to reduce its the number of parameters.
The source address coming from Initial packets is also copied from qc_new_conn().
Add quic_stateless_reset_token_init() wrapper function around
quic_hkdf_extract_and_expand() function to derive the stateless reset tokens
attached to the connection IDs from "cluster-secret" configuration setting
and call it each time we instantiate a QUIC connection ID.
This function will have to call another one from quic_tls.[ch] soon.
As we do not want to include quic_tls.h from xprt_quic.h because
quic_tls.h already includes xprt_quic.h, let's moving it into
xprt_quic.c.
This is a wrapper function around OpenSSL HKDF API functions to
use the "extract-then-expand" HKDF mode as defined by rfc5869.
This function will be used to derived stateless reset tokens
from secrets ("cluster-secret" conf. keyword) and CIDs (as salts).
It could be usefull to set a ASCII secret which could be used for different
usages. For instance, it will be used to derive QUIC stateless reset tokens.
A ->time_received new member is added to quic_rx_packet to store the time the
packet are received. ->largest_time_received is added the the packet number
space structure to store this timestamp for the packet with a new largest
packet number to be acknowledged. QUIC_FL_PKTNS_NEW_LARGEST_PN new flag is
added to mark a packet number space as having to acknowledged a packet wih a
new largest packet number. In this case, the packet number space ack delay
must be recalculated.
Add quic_compute_ack_delay_us() function to compute the ack delay from the value
of the time a packet was received. Used only when a packet with a new largest
packet number.
The call to quic_dflt_transport_params_cpy() is already first done by
quic_transport_params_init() which is a good thing. But this function was also
called each time we parsed a transport parameters with quic_transport_param_decode(),
re-initializing to default values some of them. The transport parameters concerned
by this bug are the following:
- max_udp_payload_size
- ack_delay_exponent
- max_ack_delay
- active_connection_id_limit
So, let's remove this call to quic_dflt_transport_params_cpy() which has nothing
to do here!
As we do not have any task to be wake up by the poller after sendto() error,
we add an sendto() error counter to the quic_conn struct.
Dump its values from qc_send_ppkts().
There are two reasons we can reject the creation of an h2 stream on the
frontend:
- its creation would violate the MAX_CONCURRENT_STREAMS setting
- there's no more memory available
And on the backend it's almost the same except that the setting might
have be negotiated after trying to set up the stream.
Let's add traces for such a sitaution so that it's possible to know why
the stream was rejected (currently we only know it was rejected).
It could be nice to backport this to the most recent versions.
When a client doesn't respect the h2 MAX_CONCURRENT_STREAMS setting, we
rightfully send RST_STREAM to it so that the client closes. But the
max_id is only updated on the successful path of h2c_handle_stream_new(),
which may be reentered for partial frames or CONTINUATION frames, and as
a result we don't increment it if an extraneous stream ID is rejected.
Normally it doesn't have any consequence. But on a POST it can have some
if the DATA frame immediately follows the faulty HEADERS frame: with
max_id not incremented, the stream remains in IDLE state, and the DATA
frame now lands in an invalid state from a protocol's perspective, which
must lead to a connection error instead of a stream error.
This can be tested by modifying the code to send an arbitrarily large
MAX_CONCURRENT_STREAM setting and using h2load to send more concurrent
streams than configured: with a GET, only a tiny fraction of them will
report an error (e.g. 101 streams for 100 accepted will result in ~1%
failure), but when sending data, most of the streams will be reported
as failed because the connection will be closed. By updating the max_id
earlier, the stream is now considered as closed when the DATA frame
arrives and it's silently discarded.
This must be backported to all versions but only if the code is exactly
the same. Under no circumstance this ID may be updated for a partial frame
(i.e. only update it before or just after calling h2c_frt_steam_new()).
This patch adds a lock on the struct dgram_conn to ensure
that an other thread cannot trash a fd or alter its status
while the current thread processing it on for send/receive/connect
operations.
Starting with the 2.4 version this could cause a crash when a DNS
request is failing, setting the FD of the dgram structure to -1. If the
dgram structure is reused after that, a read access to fdtab[-1] is
attempted. The crash was only triggered when compiled with ASAN.
In previous versions the concurrency issue also exists but is less
likely to crash.
This patch must be backported until v2.4 and should be
adapt for v < 2.4.
The statefile before this patch can only parse lines within 512
characters, now as we made the value to 2000, it can support a
line of length of 2kB.
This patch fixes GitHub issue #1530.
It should be backported to all stable releases.
Some error reports are misleading on some recent versions of gcc because
it goes on to build for a very long time after it meets an error. Not
only this makes it hard to scroll back to the beginning of the error,
but it also hides the cause of the error when it's prominently printed
in a "#error" statement. This typically happens when building with QUIC
and without OPENSSL where there can be 4 pages of unknown types and such
errors after the "Must define USE_OPENSSL" suggestion.
The flag -Wfatal-errors serves exactly this purpose, to stop after the
first error, and it's supported on all the compilers we support, so let's
enable this now.
It turns out that gcc-3.4 doesn't build anymore (and it has probably been
the case since 2.4 or so). gcc-4.2 does build fine though, let's mark it
as the oldest supported one. Now that gcc-12 works, also update the most
recently known-to-work version.
... or how a bogus warning forces you to do tricky changes in your code
and fail on a length test condition! Fortunately it changed in the right
direction that immediately broke, due to a missing "> sizeof(path)" that
had to be added to the already ugly condition.
This fixes recent commit 393e42ae5 ("BUILD: ssl: work around bogus warning
in gcc 12's -Wformat-truncation"). It may have to be backported if that
one is backported.
When building without threads, gcc 12 says that there's a null-deref in
_HA_ATOMIC_INC() called from listener_accept(). It's just that the code
was originally written in an attempt not to always have a proxy for a
listener and that there are two places where the pointer is tested before
being used, so the compiler concludes that the pointer might be null
hence that other places are null-derefs.
In practice the pointer cannot be null there (and never has been), but
since that code was initially built that way and it's only a matter of
adding a pair of braces to shut it up, let's respect that initial
attempt in case one day we need it.
This one was also reported by Ilya in issue #1513, though with threads
enabled in his case.
This may have to be backported if users complain about new breakage with
gcc-12.
As was first reported by Ilya in issue #1513, compiling with gcc-12
adds warnings about size 0 around each BUG_ON() call due to the
ABORT_NOW() macro that tries to dereference pointer value 1.
The problem is known, seems to be complex inside gcc and could only
be worked around for now by adjusting a pointer limit so that the
warning still catches NULL derefs in the first page but not other
values commonly used in kernels and boot loaders:
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=91f7d7e1b
It's described in more details here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104657https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103768
And some projects had to work around it using various approaches,
some of which are described in the bugs reports above, plus another
one here:
https://mail.coreboot.org/hyperkitty/list/seabios@seabios.org/thread/HLK3BHP2T3FN6FZ46BIPIK3VD5FOU74Z/
In haproxy we can hide it by hiding the pointer in a DISGUISE() macro,
but this forces the pointer to be loaded into a register, so that
register is lost precisely where we want to get the maximum of them.
In our case we purposely use a low-value non-null pointer because:
- it's mandatory that this value fits within an unmapped page and
only the lowest one has this property
- we really want to avoid register loads for the address, as these
will be lost and will complicate the bug analysis, and they tend
to be used for large addresses (i.e. instruction length limit).
- the compiler may decide to optimize away the null deref when it
sees it (seen in the past already)
As such, the current workaround merged in gcc-12 is not effective for
us.
Another approach consists in using pragmas to silently disable
-Warray-bounds and -Wnull-dereference only for this part. The problem
is that pragmas cannot be placed into macros.
The resulting solution consists in defining a forced-inlined function
only to trigger the crash, and surround the dereference with pragmas,
themselves conditionned to gcc >= 5 since older versions don't
understand them (but they don't complain on the dereference at least).
This way the code remains the same even at -O0, without the stack
pointer being modified nor any address register being modified on
common archs (x86 at least). A variation could have been to rely on
__builtin_trap() but it's not everywhere and it behaves differently
on different platforms (undefined opcode or a nasty abort()) while
the segv remains uniform and effective.
This may need to be backported to older releases once users start to
complain about gcc-12 breakage.
As was first reported by Ilya in issue #1513, Gcc 12 incorrectly reports
a possible overflow from the concatenation of two strings whose size was
previously checked to fit:
src/ssl_crtlist.c: In function 'crtlist_parse_file':
src/ssl_crtlist.c:545:58: error: '%s' directive output may be truncated writing up to 4095 bytes into a region of size between 1 and 4096 [-Werror=format-truncation=]
545 | snprintf(path, sizeof(path), "%s/%s", global_ssl.crt_base, crt_path);
| ^~
src/ssl_crtlist.c:545:25: note: 'snprintf' output between 2 and 8192 bytes into a destination of size 4097
545 | snprintf(path, sizeof(path), "%s/%s", global_ssl.crt_base, crt_path);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It would be a bit concerning to disable -Wformat-truncation because it
might detect real programming mistakes at other places. The solution
adopted in this patch is absolutely ugly and error-prone, but it works,
it consists in integrating the snprintf() call in the error condition
and to test the result again. Let's hope a smarter compiler will not
warn that this test is absurd since guaranteed by the first condition...
This may have to be backported for those suffering from a compiler upgrade.