Just like for the cache applet, it is now possible to send response to the
opposite side using the zero-copy forwarding. Internal functions were
slightly updated but there is nothing special to say. Except the requested
size during the nego stage is not exact.
Till now, for chunked messages, the H1 mux used the size requested during
the zero-copy forwarding negotiation as the chunk size. And till now, this
was accurate because the requested size was indeed the chunk size on the
producer side.
But this will be a problem to implement the zero-copy forwarding on some
applets because the content size is not known during the nego but only when
it is produced. Thanks to previous patches, it is now possible to know the
requested size is not exact and we are able to reserve a larger space to
write the chunk size later, in h1_done_ff(), with some padding.
Now, during the zero-copy forwarding negotiation, when the requested size is
exact, we are now able to check if it is bigger than the expected one or
not. If it is indeed bigger than expeceted, the zero-copy forwarding is
disabled, the error will be triggered later on the normal sending path.
It is now possible to use a flag during zero-copy forwarding negotiation to
specify the requested size is exact, it means the producer really expect to
receive at least this amount of data.
It can be used by consumer to prepare some processing at this stage, based
on the requested size. For instance, in the H1 mux, it is used to write the
next chunk size.
It is now possible to impose the length to represent the chunk size in the
function used to prepended the chunk size in a buffer (so before the chunk
itself). It is thus possible to reserve a specific space for an unknown
chunk size and padding it with leading '0' to use all the space and avoid
holes.
During zero-copy forwarding negotiation, a pseudo flag was already used to
notify the consummer if the producer is able to use kernel splicing or not. But
this was not extensible. So, now we use a true bitfield to be able to pass flags
during the negotiation. NEGO_FF_FL_* flags may be used now.
Of course, for now, there is only one flags, the kernel splicing support on
producer side (NEGO_FF_FL_MAY_SPLICE).
The cache applet will be refactored to use its own buffer. Thus, for now,
the zero-copy forwarding support is removed and it will be reintrocuded
later.
The HTTP stat applets and all internal functions was adapted to use its own
buffers instead of the channels ones. The CLI part was not refactored yet,
thus there are still some access to channels in the file. But for the HTTP
part, we no longer use the channels at all.
To do so, the HTTP stats applet now uses default .rcv_buf and .snd_buf
callback function. In addition, it sets appctx flags instead of SE ones.
We no longer test the opposite stream-connector to detect aborted partial
post. Applets must not try to access to info ouside their scope. This make
the code more sensitive to changes and it is a common source of bug.
Tests on the sedesc flags at the begining of the I/O handler should be
enough.
Thanks to this patch, it is possible to an applet to directly send data to
the opposite endpoint. To do so, it must implement <fastfwd> appctx callback
function and set SE_FL_MAY_FASTFWD flag.
Everything will be handled by appctx_fastfwd() function. The applet is only
responsible to transfer data. If it sets <to_forward> value, it is used to
limit the amount of data to forward.
This patch introduces the support for the callback function responsible to
produce data via the zero-copy forwarding mechanism. There is no
implementation for now. But <to_forward> field was added in the appctx
structure to let an applet inform how much data it want to forward. It is
not mandatory but it will be used during the zero-copy forwarding
negociation.
We have indroduced flags to deal with end of input, end of stream and errors
at the applet level. With this patch we make the link with the endpoint
descriptor.
In appctx_rcv_buf(), applet flags are converted to SE flags.
There is no shutdown for reads and send with applets. Both are performed
when the appctx is released. So instead of 2 flags, like for
muxes/connections, only one flag is used. But the idea is the same:
acknowledge the event at the applet level.
The appctx state was never really used as a state. It is only used to know
when an applet should be freed on the next wakeup. This can be converted to
a flag and the state can be removed. This is what this patch does.
Dedicated appctx flags to report EOI, EOS and errors (pending or terminal) were
added with the functions to set these flags. It is pretty similar to what it
done on most of muxes.
Till now, we've extended the appctx state to add some flags. However, the
field name is misleading. So a bitfield was added to handle real flags. And
helper functions to manipulate this bitfield were added.
A dedicated function to run applets was introduced, in addition to the old
one, to deal with applets that use their own buffers. The main differnce
here is that this handler does not use channels at all. It performs a
synchronous send before calling the applet and performs a synchronous
receive just after.
No applets are plugged on this handler for now.
There is no tasklet to handle I/O subscriptions for applets, but functions
to deal with receives and sends from the SC layer were added. it meanse a
function to retrieve data from an applet with this synchronous version and a
function to push data to an applet wit this synchronous version.
It is pretty similar to the functions used for muxes but there are some
differences. So for now, we keep them separated.
Zero-copy forwarding is not supported for now. In addition, there is no
subscription mechanism.
In this patch, we add default functions to copy data from a channel to the
<inbuf> buffer of an applet (appctx_rcv_buf) and another on to copy data
from <outbuf> buffer of an applet to a channel (appctx_snd_buf).
These functions are not used for now, but they will be used by applets to
define their <rcv_buf> and <snd_buf> callback functions. Of course, it will
be possible for a specific applet to implement its own functions but these
ones should be good enough for most of applets. HTX and RAW buffers are
supported.
For now, it is not usable, but this patch introduce the support of callback
functions, in the applet structure, to exchange data between channels and
applets. It is pretty similar to callback functions defined by muxes.
It is the first patch of a series aimed to align applets on connections.
Here, dedicated buffers are added for applets. For now, buffers are
initialized and helpers function to deal with allocation are added. In
addition, flags to report allocation failures or full buffers are also
introduced. <inbuf> will be used to push data to the applet from the stream
and <outbuf> will be used to push data from the applet to the stream.
sc_attach_applet() was changed to be able to fail and callers were updated
accordingly. For now it cannot fail but if this changes, callers will be
prepared to handle errors.
In sc_attach_applet, an untyped pointer (void *) was used to attach a SC on
an applet. There is no reason to not use the right type here. So now a
pointer on an appctx is explicitly used.
IS_HXT_SC() macro is only usable if the stream-connector is attached to a
connection. It is a bit restrictive because this cannot work if the SC is
attached to an applet. So let's fix that be adding the support of applets
too.
wait_event structure was in connection header file because it is only used
by connections and muxes. But, this may change. For instance applets may be
good candidates to use it too. So, the structure is moved to the task header
file instead.
Avoid loss of precision when computing K cubic value.
Same issue when computing the congestion window value from cubic increase function
formula with possible integer varaiable wrap around.
Depends on this commit:
MINOR: quic: Code clarifications for QUIC CUBIC (RFC 9438)
Must be backported as far as 2.6.
The first version of our QUIC CUBIC implementation is confusing because relying on
TCP CUBIC linux kernel implementation and with references to RFC 8312 which is
obsoleted by RFC 9438 (August 2023) after our implementation. RFC 8312 is a little
bit hard to understand. RFC 9438 arrived with much more clarifications.
So, RFC 9438 is about "CUBIC for Fast Long-Distance Networks". Our implementation
for QUIC is not very well documented. As it was difficult to reread this
code, this patch adds only some comments at complicated locations and rename
some macros, variables without logic modifications at all.
So, the aim of this patch is to add first some comments and variables/macros renaming
to avoid embedding too much code modifications in the same big patch.
Some code modifications will come to adapt this CUBIC implementation to this new
RFC 9438.
Rename some macros:
CUBIC_BETA -> CUBIC_BETA_SCALED
CUBIC_C -> CUBIC_C_SCALED
CUBIC_BETA_SCALE_SHIFT -> CUBIC_SCALE_FACTOR_SHIFT (this is the scaling factor
which is used only for CUBIC_BETA_SCALED)
CUBIC_DIFF_TIME_LIMIT -> CUBIC_TIME_LIMIT
CUBIC_ONE_SCALED was added (scaled value of 1).
These cubic struct members were renamed:
->tcp_wnd -> ->W_est
->origin_point -> ->W_target
->epoch_start -> ->t_epoch
->remaining_tcp_inc -> remaining_W_est_inc
Local variables to quic_cubic_update() were renamed:
t -> elapsed_time
diff ->t
delta -> W_cubic_t
Add a grahpic curve about the CUBIC Increase function.
Add big copied & pasted RFC 9438 extracts in relation with the 3 different increase
function regions.
Same thing for the fast convergence.
Fix a typo about the reference to QUIC RFC 9002.
Must be backported as far as 2.6 to ease any further modifications to come.
This commit adds support for an optional second argument to BUG_ON(),
WARN_ON(), CHECK_IF(), that can be a constant string. When such an
argument is given, it will be printed on a second line after the
existing first message that contains the condition.
This can be used to provide more human-readable explanations about
what happened, such as "too low on memory" or "memory corruption
detected" that may help a user resolve the incident by themselves.
The ABORT_NOW() macro is not much used since we have BUG_ON(), but
there are situations where it makes sense, typically if the program
must always die regardless od DEBUG_STRICT, or if the condition must
always be evaluated (e.g. decompress something and check it).
It's not convenient not to have any hint about what happened there. But
providing too much info also results in wiping some registers, making
the trace less exploitable, so a compromise must be found.
What this patch does is to provide the support for an optional argument
to ABORT_NOW(). When an argument is passed (a string), then a message
will be emitted with the file name, line number, the message and a
trailing LF, before the stack dump and the crash. It should be used
reasonably, for example in functions that have multiple calls that need
to be more easily distinguished.
If we were to enable 'ocsp-update' on a certificate that does not have
an OCSP URI, we would exit ssl_sock_load_ocsp with a negative error code
which would raise a misleading error message ("<cert> has an OCSP URI
and OCSP auto-update is set to 'on' ..."). This patch simply fixes the
error message but an error is still raised.
This issue was raised in GitHub #2432.
It can be backported up to branch 2.8.
As seen in previous commit 59acb27001 ("BUILD: quic: Variable name typo
inside a BUG_ON()."), it can sometimes happen that with DEBUG forced
without DEBUG_STRICT, BUG_ON() statements are ignored. Sadly, it means
that typos there are not even build-tested.
This patch makes these statements reference sizeof(cond) to make sure
the condition is parsed. This doesn't result in any code being emitted,
but makes sure the expression is correct so that an issue such as the one
above will fail to build (which was verified).
This may be backported as it can help spot failed backports as well.
Since d480b7b ("MINOR: debug: make ABORT_NOW() store the caller's line
number when using abort"), building with 'DEBUG_USE_ABORT' fails with:
|In file included from include/haproxy/api.h:35,
| from include/haproxy/activity.h:26,
| from src/ev_poll.c:20:
|include/haproxy/thread.h: In function ‘ha_set_thread’:
|include/haproxy/bug.h:107:47: error: expected ‘;’ before ‘_with_line’
| 107 | #define ABORT_NOW() do { DUMP_TRACE(); abort()_with_line(__LINE__); } while (0)
| | ^~~~~~~~~~
|include/haproxy/bug.h:129:25: note: in expansion of macro ‘ABORT_NOW’
| 129 | ABORT_NOW(); \
| | ^~~~~~~~~
|include/haproxy/bug.h:123:9: note: in expansion of macro ‘__BUG_ON’
| 123 | __BUG_ON(cond, file, line, crash, pfx, sfx)
| | ^~~~~~~~
|include/haproxy/bug.h:174:30: note: in expansion of macro ‘_BUG_ON’
| 174 | # define BUG_ON(cond) _BUG_ON (cond, __FILE__, __LINE__, 3, "FATAL: bug ", "")
| | ^~~~~~~
|include/haproxy/thread.h:201:17: note: in expansion of macro ‘BUG_ON’
| 201 | BUG_ON(!thr->ltid_bit);
| | ^~~~~~
|compilation terminated due to -Wfatal-errors.
|make: *** [Makefile:1006: src/ev_poll.o] Error 1
This is because of a leftover: abort()_with_line(__LINE__);
^^
Fixing it by removing the extra parentheses after 'abort' since the
abort() call is now performed under abort_with_line() helper function.
This was raised by Ilya in GH #2440.
No backport is needed, unless the above commit gets backported.
Fetch will return true if the stream underwent a redispatch according to
"option redispatch" setting upon retries.
Documentation was added, and the "%rc" logformat alternative now mentions
the new fetch to properly emulate the logformat behavior.
This build issued was introduced by this previous commit which is a bugfix:
BUG/MINOR: quic: Wrong ack ranges handling when reaching the limit.
A BUG_ON() referenced <fist> variable in place of <first>.
Must be backported as far as 2.6 as the previous commit.
Acknowledgements ranges are used to build ACK frames. To avoid allocating too
much such objects, a limit was set to 32(QUIC_MAX_ACK_RANGES) by this commit:
MINOR: quic: Do not allocate too much ack ranges
But there is an inversion when removing the oldest range from its tree.
eb64_first() must be used in place of eb64_last(). Note that this patch
only does this modification in addition to rename <last> variable to <first>.
This bug leads such a h2load command to block when a request ends up not
being acknowledged by haproxy even if correctly served:
/opt/nghttp2/build/bin/h2load --alpn-list h3 -t 1 -c 1 -m 1 -n 100 \
https://127.0.0.1/?s=5m
There is a remaining question to be answered. In such a case, haproxy refuses to
reopen the stream, this is a good thing but should not haproxy ackownledge the
request (because correctly parsed again).
Note that to be easily reproduced, this setting had to be applied to the client
network interface:
tc qdisc add dev eth1 root netem delay 100ms 1s loss random
Must be backported as far as 2.6.
As noticed in this thread, some bogus configurations are not always easy
to spot: https://www.mail-archive.com/haproxy@formilux.org/msg44558.html
Here it was about config keywords being used in ACL patterns where strings
were expected, hence they're always valid.
Since we have the diag mode (-dD) we can perform some extra checks when
it's used, and emit them to suggest the user there might be an issue.
Here we detect a few common words (logic such as "and"/"or"/"||" etc),
C++/JS comments mistakenly used to try to isolate final args, and words
that have the exact name of a sample fetch or an ACL keyword. These checks
are only done in diag mode of course.
Final diags were added in 2.4 by commit 5a6926dcf ("MINOR: diag: create
cfgdiag module"), but it's called too late in the startup process,
because when "-c" is passed, the call is not made, while it's its primary
use case. Let's just move the call earlier.
Note that currently the check in this function is limited to verifying
unicity of server cookies in a backend, so it can be backported as far
as 2.4, but there is little value in insisting if it doesn't backport
easily.
Diag warnings were added in 2.4 by commit 7b01a8dbd ("MINOR: global:
define diagnostic mode of execution") but probably due to the split
function that checks for the mode, they did not reuse the emission of
the version string before the first warning, as was brought in 2.2 by
commit bebd21206 ("MINOR: init: report in "haproxy -c" whether there
were warnings or not"). The effet is that diag warnings are emitted
before the version string if there is no other warning nor error. Let's
just proceed like for the two other ones.
This can be backported to 2.4, though this is of very low importance.