Quote building for some log formats is performed directly under each
switch case statement, so it would become painful to add other conditions
to prevent the quotes from being generated when they are not supported by
the data encoding format (e.g. JSON).
Let's centralize and simplify quotes handling by adding LOGQUOTE_START()
and LOGQUOTE_END() helper macros. If a quotation is started and not
explicitly ended, it will be automatically ended at the end of the current
logformat node:
LOGQUOTE_START() sets the 'quote' variable to 1, so that LOGQUOTE_END() only
prints the ending quote when needed. LOGQUOTE_END() is systematically
called after each node switch-case (after each value). LOGQUOTE_START()
does nothing if LOG_OPT_QUOTE isn't set, and neither does LOGQUOTE_END().
Some rare cases such as %hsl (list of captured headers) required special
handling: in this case multiple quoted texts are generated for the same
field value so explicit LOGQUOTE_START() + LOGQUOTE_END() combination was
needed.
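The helper pair described above can be sketched as follows. This is only an
illustration of the pattern, not the actual macros: the 'struct out' context,
out_putc(), emit_value() and the LOG_OPT_QUOTE value are hypothetical names
introduced for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOG_OPT_QUOTE 0x01  /* hypothetical value, for illustration only */

/* Hypothetical output context: a bounded buffer plus the quoting state. */
struct out {
    char buf[128];
    size_t len;
    int options;  /* LOG_OPT_* bits of the current node */
    int quote;    /* 1 while a quotation is open */
};

/* Append one character, always keeping the buffer NUL-terminated. */
static void out_putc(struct out *o, char c)
{
    assert(o != NULL);
    if (o->len + 1 < sizeof(o->buf)) {
        o->buf[o->len++] = c;
        o->buf[o->len] = '\0';
    }
}

/* Open a quotation only when quoting is enabled for the node. */
#define LOGQUOTE_START(o) do {                  \
        if ((o)->options & LOG_OPT_QUOTE) {     \
            out_putc((o), '"');                 \
            (o)->quote = 1;                     \
        }                                       \
    } while (0)

/* Close the quotation only if one is currently open. */
#define LOGQUOTE_END(o) do {                    \
        if ((o)->quote) {                       \
            out_putc((o), '"');                 \
            (o)->quote = 0;                     \
        }                                       \
    } while (0)

/* Emit one node value. The trailing LOGQUOTE_END() plays the role of the
 * systematic call placed after the node switch-case.
 */
void emit_value(struct out *o, const char *value)
{
    assert(value != NULL);
    LOGQUOTE_START(o);
    while (*value)
        out_putc(o, *value++);
    LOGQUOTE_END(o);
}
```

Calling LOGQUOTE_END() a second time is harmless since 'quote' is already
back to 0, which is what makes the systematic call after each node safe.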
last_isspace variable is explicitly set to 0 in all cases except
LOG_FMT_SEPARATOR case. So we can actually simplify the code by setting
last_isspace to 0 by default and skipping the assignment for the
LOG_FMT_SEPARATOR case.
Add the ability to manually specify the desired output type after a custom
field name for logformat nodes. Forcing the type can be useful to ensure
the value is stored with the proper type representation (e.g. forcing a
numerical value to string to work around the limited resolution of JS
number types).
By default, type is set to SMP_T_SAME, which means the original type will
be preserved.
Currently supported types are: bool, str, sint
type_to_smp(type) does the reverse operation of smp_to_type[smp]: it takes
a type name as input string and tries to return the corresponding SMP_T_*
smp type or SMP_TYPES if not found.
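A minimal sketch of such a reverse lookup is shown below. The enum and the
name table are a reduced, hypothetical subset written for the example, they
do not reproduce the real definitions.

```c
#include <assert.h>
#include <string.h>

/* Reduced, hypothetical subset of sample types, for illustration only. */
enum smp_type {
    SMP_T_ANY = 0,
    SMP_T_SAME,
    SMP_T_BOOL,
    SMP_T_SINT,
    SMP_T_STR,
    SMP_TYPES   /* number of types, also used as the "not found" value */
};

/* Forward table: type -> name. Entries without a name stay NULL. */
static const char *const smp_to_type[SMP_TYPES] = {
    [SMP_T_BOOL] = "bool",
    [SMP_T_SINT] = "sint",
    [SMP_T_STR]  = "str",
};

/* Reverse lookup: name -> type, or SMP_TYPES when the name is unknown. */
int type_to_smp(const char *type)
{
    int smp;

    assert(type != NULL);
    for (smp = 0; smp < SMP_TYPES; smp++) {
        if (smp_to_type[smp] && strcmp(type, smp_to_type[smp]) == 0)
            break;
    }
    return smp;
}
```

With this sketch, type_to_smp("sint") yields SMP_T_SINT while an unsupported
name such as "float" yields SMP_TYPES, which the caller can report as an
error.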
Add the ability to specify a custom name (which will be used for
representation in verbose output types such as json) for logformat nodes.
For now, a custom name should be composed of characters [a-zA-Z0-9-_]*
Transient send errors are handled differently depending on whether a
connection or listener socket is used for QUIC transfers. In the first
case, proper poller subscription is used via fd_cant_send()/fd_want_send().
For the listener socket case, the error is ignored by the qc_snd_buf()
caller and the retransmission mechanism will re-emit the data.
For the listener socket, transient error code handling is buggy. It blindly
uses fd_cant_send() with the <qc.fd> member, which is set to -1 for listener
socket usage. This results in an invalid fdtab access, with a possible
crash or a modification of a totally unrelated FD.
This bug is simply fixed by using qc_test_fd() before using
fd_cant_send()/fd_want_send(). This ensures <qc.fd> is used only if
initialized which is only the case when using connection socket.
No crash was reported yet for this bug. However, it is reproducible by
using ASAN compilation and the following strace sendmsg() errno command
injection :
# strace -qq -yy -p $(pgrep haproxy) -f -e trace=%network \
-e inject=sendto,sendmsg:error=EAGAIN:when=20+20
This must be backported up to 2.7.
If some data are received for a lua socket while the lua script responsible
for consuming these data is not ready to do so, for instance because it is
sleeping, the applet is woken up in a loop because it never states it will
not consume these data yet.
To fix the issue, in the applet I/O handler, when there are outgoing data, we
always pretend the applet will not consume them. It is the responsibility of
the lua script to reactivate receives by calling the Socket.receive()
function.
This patch must be backported to every stable version. For 2.4 and older,
si_want_get()/si_cant_get() must be used instead of
applet_will_consume()/applet_wont_consume().
It is possible to create a lua socket without performing any connect. In
this case, the lua socket is released by the garbage collector.
However, the garbage collector does not release the applet, it wakes it
up. Since commit 751b59c40b ("BUG/MEDIUM: hlua: Initialize appctx used by a
lua socket on connect only"), the applet initialization is performed on
connect. So, here, it is possible to wake an uninitialized applet. It is an
unexpected case for the applet's I/O handler, leading to a segfault because
some resources are not initialized (the stream's target in this case).
So, now, in the lua socket GC function, we take care to immediately release
uninitialized applets. At worst, the release itself is delayed. But it is
safe because we are sure the applet's I/O handler will never be executed.
In addition, we take care to increment the GC counter when the lua socket is
created. This way, uninitialized lua sockets are released more quickly.
This patch should fix the issue #2451. It must be backported as far as 2.6.
When an error is triggered during the applet initialization, a dedicated
function is called to release it. Indeed, in this case, because the applet
was not initialized, the ->release callback must not be called. However,
because the init stage may be delayed to be performed during the first
applet wakeup, we must also take care to not rely on the default
appctx_free() function, to immediately release the applet. Otherwise, if the
error happens in a delayed init stage, the applet is never released.
This patch partially fixes issue #2451. It must be backported as far as
2.6.
Should have come with this previous commit:
MINOR: quic: Add a counter for reordered packets
Must be backported where the previous commit was backported.
This change adds a paragraph to the documentation regarding "userlists"
and the use of hashed password values.
It indicates what a user can do to address the high CPU cost of
having to calculate the hash at each request, such as reducing the
number of rounds or the cost complexity, if the algorithm allows for it.
I believe it is necessary to mention how the musl C library
impacts performance of hashing functions, as this has already led to a
few issues:
  https://github.com/haproxy/haproxy/issues/1298
  https://github.com/haproxy/haproxy/issues/2008
  https://github.com/haproxy/haproxy/issues/2251
The performance impact is significant enough to mention it.
Acked-by: Lukas Tribus <lukas@ltri.eu>
Currently haproxy does not implement dynamic table support for QPACK. As
such, the dynamic table capacity advertised via H3 SETTINGS is 0. When
receiving a non-null Set Dynamic Table Capacity instruction, immediately
close the connection using QPACK_ENCODER_STREAM_ERROR.
Prior to this patch, such instructions were simply ignored. This does not
conform to the QUIC specification.
This should be backported up to 2.6. Note that on 2.6 qcc_set_error()
must be replaced by function qcc_emit_cc_app().
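The Set Dynamic Table Capacity check described above can be sketched as
follows. The instruction layout (pattern 001 followed by a 5-bit prefix
integer) and the error code value come from RFC 9204. The function names
and return conventions are hypothetical and do not reflect the actual
parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QPACK_ENCODER_STREAM_ERROR 0x0201  /* RFC 9204 error code */

/* Decode a prefix-coded integer (RFC 7541 section 5.1) using <prefix_bits>
 * bits of the first byte. Returns the number of bytes consumed, or 0 if
 * the input is truncated or the value is too large.
 */
size_t decode_prefix_int(const uint8_t *buf, size_t len, int prefix_bits,
                         uint64_t *out)
{
    uint64_t max, val;
    size_t i = 0;
    int shift = 0;

    assert(buf != NULL && out != NULL);
    assert(prefix_bits >= 1 && prefix_bits <= 8);
    if (len == 0)
        return 0;
    max = (1u << prefix_bits) - 1;
    val = buf[i++] & max;
    if (val == max) {
        uint8_t b;

        /* continuation bytes carry 7 bits each, least significant first */
        do {
            if (i == len || shift > 56)
                return 0;
            b = buf[i++];
            val += (uint64_t)(b & 0x7f) << shift;
            shift += 7;
        } while (b & 0x80);
    }
    *out = val;
    return i;
}

/* Returns 0 if the instruction is acceptable (capacity of 0),
 * QPACK_ENCODER_STREAM_ERROR for a non-null capacity, or -1 if the input
 * is not a complete Set Dynamic Table Capacity instruction.
 */
int check_set_capacity(const uint8_t *buf, size_t len)
{
    uint64_t capacity;

    if (len == 0 || (buf[0] & 0xe0) != 0x20)
        return -1;
    if (!decode_prefix_int(buf, len, 5, &capacity))
        return -1;
    return capacity ? QPACK_ENCODER_STREAM_ERROR : 0;
}
```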
Close the connection using QPACK_DECODER_STREAM_ERROR when receiving an
invalid insert count increment. As haproxy does not use dynamic table,
this instruction must never be emitted by the peer.
Prior to this patch, haproxy silently ignored such an instruction, which
does not conform to the QUIC specification.
This should be backported up to 2.6. Note that on 2.6 qcc_set_error()
must be replaced by function qcc_emit_cc_app().
As specified in RFC 9000, a client must never emit a HANDSHAKE_DONE
frame. If this happens, the server must close the connection with error
PROTOCOL_VIOLATION.
Previously, such a frame was silently discarded on the server side. The
connection remained open, which does not conform to the specification.
This should be backported up to 2.6.
Ensure every frame type is handled in qc_parse_pkt_frms(). Add an
ABORT_NOW on the default case. This is safe as an unknown frame must
have been rejected beforehand by qc_parse_frm().
As specified by RFC 9000, the connection is closed on error if an unknown
QUIC frame type is received.
Previously, a frame with an unknown type was silently discarded. The
connection remained open, which does not conform to the specification.
This should be backported up to 2.6.
Global options to disable zero-copy forwarding are now tested outside the
callbacks responsible for performing the forwarding itself. It is cleaner
this way because we don't try zero-copy forwarding at all if at least one
side does not support it. It is equivalent to what was performed before,
but it is simpler this way.
There is a nego stage when a producer is ready to forward data to the other
side. At this stage, the zero-copy forwarding may be disabled if the
consumer does not support it. However, there is a flaw with this way to
proceed. If the channel buffer is not empty, we delay the zero-copy
forwarding to flush all data from the channel first. During this delay,
receives on the endpoint (at connection level for muxes), are blocked to be
sure to have the opportunity to switch on zero-copy forwarding. It is a
problem if the consumer cannot flush data from the channel's buffer, waiting
for more data for instance.
It is especially annoying with the CLI applet, because this scenario can
happen if a command is partially received, for instance without the LF at
the end. In this case, the CLI applet is blocked because it waits for more
data. The frontend connection is also blocked because the channel's data
must be flushed before trying to receive more data. Worse, this happens at
a stage where no timeout is armed. Thus the session is stuck indefinitely:
client aborts cannot be detected because receives are blocked, and the
applet cannot abort on its side because there are pending outgoing data. It
is clearly a situation where it is easy to consume all CLI slots.
To fix the issue, thanks to previous commits, we now check zero-copy
forwarding support on both sides before proceeding.
This patch relies on the following commits:
* MINOR: muxes: Announce support for zero-copy forwarding on consumer side
* MINOR: stconn: Add SE flag to announce zero-copy forwarding on consumer side
* MINOR: stconn: Rename SE_FL_MAY_FASTFWD and reorder bitfield
* CLEANUP: stconn: Move SE flags set by app layer at the end of the bitfield
All the series must be backported to 2.9.
It is unused for now, but the muxes announce their support for zero-copy
forwarding on the consumer side. All muxes except the FCGI one support
it.
The SE_FL_MAY_FASTFWD_CONS is added and it will be used by endpoints to
announce their support for the zero-copy forwarding on the consumer
side. The flag is not necessarily permanent. However, it will be used this
way for now.
To fix a bug, a flag to announce the capability to support the zero-copy
forwarding on the consumer side will be added on the SE descriptor. So the
old flag SE_FL_MAY_FASTFWD is renamed to indicate it concerns the producer
side. It is now SE_FL_MAY_FASTFWD_PROD. And to prepare addition of the new
flag, the bitfield is a bit reordered.
To fix a bug, some SE flags must be added or renamed. To avoid mixing flags
set by the endpoint and flags set by the app, the second set of flags is
moved to the end of the bitfield, leaving holes in the middle.
This reverts commits 0b93ff8c87 ("BUG/MEDIUM: stconn: Wake applets on
sending path if there is a pending shutdown") and 9e394d34e0 ("BUG/MINOR:
stconn: Don't report blocked sends during connection establishment") because
they were not the right fixes.
We must not wake an applet up when a shutdown is pending because it means
some output data are still blocked in the channel buffer. The applet does
not necessarily consume these data. In this case, the applet may be woken up
indefinitely, unless it explicitly reports that it won't consume data yet.
This patch must be backported as far as 2.8. For older versions, as far as
2.2, it may be backported. If so, a previous fix must be pushed to prevent
an HTTP applet from getting stuck. In http_ana.c, in http_end_request() and
http_end_response(), the call to channel_htx_truncate() on the request
channel in case of MSG_ERROR must be replaced by a call to
channel_htx_erase().
When a READ or a WRITE activity is reported on a channel, the corresponding
date is updated: the last-read-activity date (lra) is updated and the
first-send-block date (fsb) is reset. The event is also reported at the
channel level by setting the CF_READ_EVENT or CF_WRITE_EVENT flag. When one
of these flags is set, this prevents the update of the stream's task
expiration date from sc_notify(). It also prevents the corresponding timeout
from being reported from process_stream().
But it is a problem during fast-forwarding stage if no expiration date was
set by the stream. Only process_stream() resets these flags. So a first READ
or WRITE event will prevent any stream's expiration date update till a new
call to process_stream(). But with no expiration date, this will only happen
on shutdown/abort event, blocking the stream for a while.
It is for instance possible to block the stats applet or the cli applet if a
client does not consume the response. The stream may be blocked, the client
timeout is not respected and the stream can only be closed on a client
abort.
So now, we update the stream's expiration date, regardless of reported
READ/WRITE events. It is not a big deal because the lra and fsb dates are
properly updated. It also means an old READ/WRITE event will not prevent the
stream from reporting a timeout, which is expected too.
This patch must be backported as far as 2.8. On older versions, timeouts and
stream's expiration date are not updated in the same way and this works as
expected.
Thanks to the previous patch ("MEDIUM: applet: Add notion of shutdown for
write for applets"), it is no longer necessary to check SC flags to detect
shutdowns to interrupt the wait command. It is possible to remove this ugly
workaround. In addition, we only test the SE for shutdown because end of
stream and error are already checked by the CLI I/O handler. And it is no
longer necessary to remove output data from the channel's buffer because
shutdowns are not reported if there are remaining outgoing data.
Of course, if the "wait" command is backported, the commit above and this
one must be backported too.
In fact there are already flags on the SE to state that a shutdown for
reads or writes was performed. But for applets, this notion does not exist.
Both flags are set at the same time when the applet is released. But at the
SC level, there are functions to perform a shutdown (formerly the shutw) and
an abort (formerly the shutr). For applets, when a shutdown is performed on
the SC, if the applet is not immediately released, nothing is acknowledged
at the SE level.
With the old way to implement applets, this was not a real issue until
recently because applets accessed the channel/SC flags. It was thus possible
to catch the shutdowns. But the "wait" command on the CLI reveals the
flaw. Indeed, when this command is executed, nothing is read or sent. So, it
is not possible to detect the shutdowns. As a workaround, a dedicated test
on the SC flags was added at the end of the wait command I/O handler. But it
is pretty ugly.
With the new way to implement applets, there is no longer access to the
channel or SC. So we must add a way to acknowledge shutdowns in the SE.
This patch solves both sides of the issue. The shutw notion is added for
applets. Its only purpose is to set the SE_FL_SHWN flag. This flag is tested
by all applets, so it solves the issue quite simply.
Note that it is described as a bug fix but there is no real issue, just a
design flaw. However, if the "wait" command is backported, this patch must
be backported too. Unfortunately it will require an adaptation because there
are no appctx flags on older versions.
Both of these flags are set after releasing the applet, in
appctx_shut(). Concretely, it means the applet is shut down for reads and
writes. Once set, the applet's I/O handler is no longer called. Tests on
these flags are useless: there is no chance to match them.
This case does not exist yet with the H1 multiplexer, but applets may decide to
not produce data if there is not enough room in the destination buffer (the
applet's outbuf or the opposite SE buffer). It is true for the stats applets for
instance. However this case is not properly handled when the zero-copy
forwarding is in-use.
To fix the issue, the se_done_ff() function was modified to return the number of
bytes really forwarded and to subscribe for sends if nothing was forwarded while the
zero-copy forwarding was blocked by the producer. On the applet side, we take
care to block the zero-copy forwarding if the applet requests more room. At the
end, zero-copy forwarding is unblocked if something was forwarded.
This way, it is now possible for the stats applet to report a full buffer and
block the zero-copy forwarding, even if the buffer is not really full, by
requesting more room.
No backport needed.
An issue was introduced when zero-copy forwarding was added to the stats and
cache applets. There is no test to be sure the upper layer is ready to use
the zero-copy forwarding. So these applets refuse to deliver the response
into the applet's output buffer if the zero-copy forwarding is supported by
the opposite endpoint. It is especially an issue when a filter, like the
compression, is in-use on the response channel.
Because of this bug, the response is not delivered and the applet is woken
up in a loop to produce data.
To fix the issue, an appctx flag was added, APPCTX_FL_FASTFWD, to know when
the zero-copy forwarding is in-use. We rely on this flag to not fill the
outbuf in the applet's I/O handler.
No backport needed.
This slightly simplifies the stats applet. Because the CLI part was not
refactored yet to use the applet's buffers, there are 3 ways to produce
data:
* the HTX message for the HTTP stats when zero-copy forwarding is not
used
* raw data in the opposite endpoint buffer for the HTTP stats when
zero-copy forwarding is used
* the channel buffer when the CLI "show stat" command is evaluated
There is already a dedicated function that takes care of copying data to the
right place. There is now also a dedicated function to check if the output
buffer is almost full.
Commit 91b77c1632 ("MEDIUM: mux-h1: Support zero-copy forwarding for chunks with
an unknown size") was recently pushed but it contains 3 bugs. The first one is
during the nego. The extra size reserved for the CRLF at the end of the chunk
must not be added to the offset value. Indeed, the CRLF will be appended after
the data and not prepended to them.
The second one, still during the nego, is an integer overflow when the available
room in the output buffer is computed.
Finally, the last one is when the chunk itself is formatted. This part was
totally buggy if the output buffer was not empty at the beginning.
No backport needed.
A packet is considered as reordered when it is detected as lost because its
packet number is below the largest acknowledged packet number by at least
the packet reordering threshold value.
Add a new ->nb_reordered_pkt quic_loss struct member, at the same location
as the number of lost packets, to count such packets.
Should be backported to 2.6.
Let's say that the largest packet number acknowledged by the peer is #10.
When inspecting the packets not yet acknowledged to detect whether they are
lost, a packet is deemed lost at least if the difference between this
largest packet number and its own packet number is greater than or equal to
the packet reordering threshold as defined by RFC 9002. The latter must not
be less than QUIC_LOSS_PACKET_THRESHOLD (3). With such a value, packets #7
and older are detected as lost if not acknowledged, contrary to packets #8
or #9.
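The example above can be checked with a few lines of C. This is a sketch
of the packet threshold rule only (no time threshold), and the function
name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define QUIC_LOSS_PACKET_THRESHOLD 3

/* Return 1 if the unacknowledged packet <pn> is deemed lost according to
 * the packet threshold rule, given the largest acknowledged packet number.
 */
int pkt_lost_by_threshold(uint64_t largest_acked, uint64_t pn,
                          uint64_t threshold)
{
    assert(threshold >= QUIC_LOSS_PACKET_THRESHOLD);
    return pn <= largest_acked && largest_acked - pn >= threshold;
}
```

With a largest acknowledged packet number of 10 and a threshold of 3, the
function returns 1 for packet 7 and 0 for packets 8 and 9.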
So, packet loss detection is very sensitive to network conditions where
non-acknowledged packets are distant from each other in terms of packet
numbers.
Do not use this static value anymore for the packet reordering threshold
which is used as a criterion to detect packet loss. Instead, make it depend
on the difference between the number of the last transmitted packet and the
number of the oldest one among the packets which are still in flight before
being inspected to be deemed as lost.
Add new tune.quic.reorder-ratio setting to apply a ratio in percent to this
dynamic packet reorder threshold.
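One possible reading of this dynamic threshold is sketched below. The way
the ratio and the minimum value are combined here is an assumption made for
the example, and the function name is hypothetical. The actual computation
in the patch may differ.

```c
#include <assert.h>
#include <stdint.h>

#define QUIC_LOSS_PACKET_THRESHOLD 3

/* Compute a reordering threshold from the distance between the last
 * transmitted packet and the oldest in-flight one, scaled by <ratio_pct>
 * (assumed to be in the 0..100 range) and never below the minimum value.
 */
uint64_t reorder_threshold(uint64_t last_tx_pn, uint64_t oldest_inflight_pn,
                           unsigned int ratio_pct)
{
    uint64_t diff, thresh;

    assert(last_tx_pn >= oldest_inflight_pn);
    assert(ratio_pct <= 100);
    diff = last_tx_pn - oldest_inflight_pn;
    /* floor(diff * ratio / 100), split to avoid overflowing 64 bits */
    thresh = (diff / 100) * ratio_pct + (diff % 100) * ratio_pct / 100;
    if (thresh < QUIC_LOSS_PACKET_THRESHOLD)
        thresh = QUIC_LOSS_PACKET_THRESHOLD;
    return thresh;
}
```

For instance, with 100 packets between the oldest in-flight packet and the
last transmitted one and a ratio of 50, this sketch yields a threshold of
50 instead of the static value of 3.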
Should be backported to 2.6.
The new formula for K CUBIC which arrives with RFC 9438 is as follows:
K = cubic_root((W_max - cwnd_epoch) / C)
Note that W_max is c->last_w_max, and cwnd_epoch is c->cwnd when entering
quic_cubic_update() just after a congestion event.
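As an illustration of this formula, the sketch below computes K in
milliseconds with integer arithmetic only. It assumes window values
expressed in segments, C = 0.4 as in RFC 9438, and K = 0 when W_max does not
exceed cwnd_epoch. The function names and the integer cube root are
hypothetical helpers, not the implementation used by quic_cubic_update().

```c
#include <assert.h>
#include <stdint.h>

/* C = 0.4 expressed as a fraction (RFC 9438) */
#define CUBIC_C_NUM 4
#define CUBIC_C_DEN 10

/* Integer cube root: largest r such that r*r*r <= v (binary search). */
uint64_t icbrt(uint64_t v)
{
    uint64_t lo = 0, hi = 2642245;  /* floor(cbrt(2^64 - 1)) */

    while (lo < hi) {
        uint64_t mid = lo + (hi - lo + 1) / 2;

        if (mid * mid * mid <= v)
            lo = mid;
        else
            hi = mid - 1;
    }
    return lo;
}

/* K = cubic_root((W_max - cwnd_epoch) / C), returned in milliseconds.
 * Scaling the operand by 1000^3 before taking the root is equivalent to
 * multiplying the result in seconds by 1000.
 */
uint64_t cubic_k_ms(uint64_t w_max, uint64_t cwnd_epoch)
{
    uint64_t diff;

    if (w_max <= cwnd_epoch)
        return 0;
    diff = w_max - cwnd_epoch;
    assert(diff <= 1000000);  /* keeps the scaled value within 64 bits */
    return icbrt(diff * CUBIC_C_DEN * 1000000000ULL / CUBIC_C_NUM);
}
```

For W_max = 100 and cwnd_epoch = 70 segments, (100 - 70) / 0.4 = 75 and the
cube root of 75 is about 4.217 seconds, so the function returns 4217.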
Must be backported as far as 2.6.
The formula for K CUBIC calculation is as follows:
K = cubic_root(W_max * (1 - beta_quic) / C).
Note that this does not match the comment. But the aim of this patch is to not
hide a bug inside another patch to update this K CUBIC calculation.
The unit of C is bytes/s^3 (or segments/s^3). And we want to store K as
milliseconds. So, the conversion inside the cubic_root() to convert seconds into
milliseconds is wrong. The unit used here is bytes/(ms/1000)^3 or
bytes*1000^3/ms^3. That said, it is preferable to compute K as seconds, then
convert to milliseconds as done by this patch.
Must be backported as far as 2.6.
The CLI command "update ssl ocsp-response" was forcefully removing an
OCSP response from the update tree regardless of whether it used to be
in it beforehand or not. But since the main OCSP update task works by
removing the entry being currently updated from the update tree and then
reinserting it when the update process is over, it meant that in the CLI
command code we were modifying a structure that was already being used.
These concurrent accesses were not properly locked on the "regular"
update case because it was assumed that once an entry was removed from
the update tree, the update task was the only one able to work on it.
Rather than locking the whole update process, an "updating" flag was
added to the certificate_ocsp in order to prevent the "update ssl
ocsp-response" command from trying to update a response already being
updated.
An easy way to reproduce this crash was to perform two "simultaneous"
calls to "update ssl ocsp-response" on the same certificate. It would
then crash on an eb64_delete call in the main ocsp update task function.
This patch can be backported up to 2.8.
Released version 3.0-dev3 with the following main changes :
- DOC: configuration: clarify http-request wait-for-body
- BUG/MAJOR: ssl_sock: Always clear retry flags in read/write functions
- MINOR: h3: add traces for stream sending function
- BUG/MEDIUM: h3: do not crash on invalid response status code
- BUG/MEDIUM: qpack: allow 6xx..9xx status codes
- BUG/MEDIUM: quic: fix crash on invalid qc_stream_buf_free() BUG_ON
- CLEANUP: log: deinitialization of the log buffer in one function
- BUG/MINOR: h1: Don't support LF only at the end of chunks
- BUG/MEDIUM: h1: Don't support LF only to mark the end of a chunk size
- MINOR: ssl: add HAVE_SSL_0RTT constant
- MINOR: ssl: rename HA_OPENSSL_HAVE_0RTT_SUPPORT constant to HAVE_SSL_0RTT_QUIC
- MEDIUM: ssl/quic: always compile the ssl_conf.early_data test
- DOC: httpclient: add dedicated httpclient section
- BUG/MINOR: h1-htx: properly initialize the err_pos field
- BUG/MEDIUM: h1: always reject the NUL character in header values
- CLEANUP: h1: remove unused function h1_measure_trailers()
- BUG/MINOR: ssl/quic: fix 0RTT define
- MINOR: mux-quic: prepare for earlier flow control update
- MINOR: mux-quic: define a flow control related type
- MEDIUM: mux-quic: limit stream flow control on snd_buf
- MEDIUM: mux-quic: limit conn flow control on snd_buf
- MINOR: mux-quic: remove unneeded sent-offset fields
- MINOR: mux-quic: check fctl during STREAM frame build
- MAJOR: mux-quic: remove intermediary Tx buffer
- MEDIUM: mux-quic: simplify sending API
- MEDIUM: mux-quic: release Tx buf on too small room
- MEDIUM: mux-quic: properly handle conn Tx buf exhaustion
- MINOR: mux-quic: realign Tx buffer if possible
- CLEANUP: connection: remove obsolete comment in header file
- OPTIM: connection: progressive hash for conn_calculate_hash()
- MINOR: tcp_act: fix alphabetical ordering of tcp request content actions
- MINOR: tcp-act: Rename "set-{mark,tos}" to "set-fc-{mark,tos}"
- MINOR: hlua: Rename set_{tos, mark} to set_fc_{tos, mark}
- MEDIUM: tcp-act: <expr> support for set-fc-{mark,tos} actions
- MEDIUM: tcp-act/backend: support for set-bc-{mark,tos} actions
- MINOR: stats: Be able to access to registered stats modules from anywhere
- MEDIUM: stats: Be able to access a specific field into a stats module
- MINOR: promex: Add a param to override the description when a metric is dumped
- MINOR: promex: Add info in the promex context to dump extra counters
- MEDIUM: promex: Dump frontends extra counters if requested
- MEDIUM: promex: Dump backends extra counters if requested
- MEDIUM: promex: Dump servers extra counters if requested
- MEDIUM: promex: Dump listeners extra counters if requested
- DOC: promex: Add documentation about extra-counters
- MINOR: promex: Always limit the number of labels dumped for each metric
- MEDIUM: promex: Simplify the context using generic pointers for restart points
- MINOR: promex: Remove unsued htx parameter when a metric is dumped
- MEDIUM: promex: Add a registration mechanism to support modules
- MEDIUM: promex: Dump metrics of registered modules with a way to filter them
- MEDIUM: promex/stick-table: Dump stick-table metrics via a promex module
- MEDIUM: promex/resolvers: Dump resolvers metrics via a promex module
- MINOR: promex: Rename dump functions to use the right wording
- MINOR: promex: Always pass the final name and description to promex_dmp_ts()
- MEDIUM: promex: Add support for filters on metric names
- REGTESTS: promex: Adapt script to be less verbose
- MINOR: compiler: add a new DO_NOT_FOLD() macro to prevent code folding
- MINOR: debug: make sure calls to ha_crash_now() are never merged
- MINOR: debug: make ABORT_NOW() store the caller's line number when using abort
- BUG/MINOR: diag: always show the version before dumping a diag warning
- BUG/MINOR: diag: run the final diags before quitting when using -c
- MINOR: acl: add extra diagnostics about suspicious string patterns
- BUG/MINOR: quic: Wrong ack ranges handling when reaching the limit.
- BUILD: quic: Variable name typo inside a BUG_ON().
- DOC: config: fix typo for '%ms' log format alternative
- DOC: config: fix ordering for "txn.*" fetches
- MINOR: stream: add "txn.redispatch" fetch
- BUILD: debug: remove leftover parentheses in ABORT_NOW()
- MINOR: debug: make BUG_ON() catch build errors even without DEBUG_STRICT
- BUG/MINOR: ssl: Fix error message after ssl_sock_load_ocsp call
- MINOR: debug: support passing an optional message in ABORT_NOW()
- MINOR: debug: add an optional message argument to the BUG_ON() family
- DEBUG: make the "debug dev {debug|warn|check}" command print a message
- CLEANUP: quic: Code clarifications for QUIC CUBIC (RFC 9438)
- BUG/MINOR: quic: fix possible integer wrap around in cubic window calculation
- MINOR: quic: Stop using 1024th of a second.
- CI: github: abandon asan matrix.py helper
- CI: ssl: add yet another OpenSSL download fallback
- DOC: install: clarify WolfSSL chroot requirements
- MINOR: task: Move wait_event in the task header file
- MINOR: stconn: Be able to detect applets using HTX
- MINOR: stconn: Explicitly use an appctx to attach a stconn on it
- MINOR: stconn: Be prepared to handle error when a SC is attached to an applet
- MINOR: applet: Add dedicated IN/OUT buffers for appctx
- MINOR: applet: Add traces to debug receive/send and block/wake events
- MINOR: applet: Add support for callback functions to exchange data with channels
- MINOR: applet: Implement default functions to exchange data with channels
- MEDIUM: stconn: Add functions to handle applets I/O from the SC layer
- MEDIM: applet: Add the applet handler based on IN/OUT buffers
- MINOR: applet: Show IN/OUT buffers in trace messages when used
- MINOR: applet: Add flags on the appctx and stop abusing its state
- MINIOR: applet: Add flags to deal with ends of input, ends of stream and errors
- MINOR: applet: Remove appctx state field to only used the flags
- MINOR: applet: Add an appctx flag to report shutdown to applets
- MEDIUM: applet: Use appctx flags to report EOS/EOI/ERROR to SE
- MINOR: applet: Add callback function to deal with zero-copy forwarding
- MEDIUM: applet: Add support for zero-copy forwarding from an applet
- MINOR: applet: Automatically handle applets having more data for the stream
- MEDIUM: stats: Don't interrupt processing on partial post
- MAJOR: stats: Update HTTP stats applet to handle its own buffers
- MEDIUM: cache: Temporarily remove zero-copy forwarding support
- MAJOR: cache: Update HTTP cache applet to handle its own buffers
- MAJOR: cache: Send cached objects using zero-copy forwarding
- MINOR: stconn: Add support for flags during zero-copy forwarding negotiation
- MINOR: mux-h1: Be able to define the length of a chunk size when it is prepended
- MEDIUM: stconn: Nofify requested size during zero-copy forwarding nego is exact
- MINOR: mux-h1: Stop zero-copy forwarding during nego for too big requested size
- MEDIUM: mux-h1: Support zero-copy forwarding for chunks with an unknown size
- MAJOR: stats: Send stats dump over HTTP using zero-copy forwarding
- MEDIUM: applet: Simplify a bit API to exchange data with applets
- MINOR: cache: Remove unsed .data_sent field from the cache applet context
- MINOR: applet: Use an option to disable zero-copy forwarding for all applets
- MINOR: applet: Identify applets using their own buffers via a flag
- BUG/MINOR: ssl: Duplicate ocsp update mode when dup'ing ckch
- MINOR: ssl: Use OCSP_CERTID instead of ckch_store in ckch_store_build_certid
- BUG/MINOR: ssl: Clear the ckch instance when deleting a crt-list line
- BUG/MEDIUM: ocsp: Separate refcount per instance and per store
- BUG/MINOR: ssl: Destroy ckch instances before the store during deinit
- BUG/MINOR: ssl: Reenable ocsp auto-update after an "add ssl crt-list"
- REGTESTS: ssl: Add OCSP related tests
- REGTESTS: ssl: Fix empty line in cli command input
- DOC: install: recommend pcre2
- DOC: config: fix misplaced "txn.conn_retries"
- DOC: config: fix typos for "bytes_{in,out}"
- DOC: config: fix misplaced "bytes_{in,out}"
- DOC: config: add more custom log format table alternatives
- MINOR: stream: rename "txn.redispatch" to "txn.redispatched"
- MINOR: sample: implement bc_{be,srv}_queue samples
- BUG/MINOR: mux-h2: count rejected DATA frames against the connection's flow control
- MINOR: mux-h2: count excess of CONTINUATION frames as a glitch
- MINOR: mux-h2: count late reduction of INITIAL_WINDOW_SIZE as a glitch
- DOC: internal: update missing data types in peers-v2.0.txt
- MEDIUM: stick-tables: add a new stored type for glitch_cnt and glitch_rate
- MINOR: session: add the necessary functions to update the per-session glitches
- MEDIUM: mux-h2: update session trackers with number of glitches
- BUG/MINOR: server/cli: add missing LF at the end of certain notice/error lines
- BUG/MINOR: vars/cli: fix missing LF after "get var" output
- BUG/MEDIUM: cli: fix once for all the problem of missing trailing LFs
- MINOR: cli: make sure to always print a pending message after release()
- MINOR: cli: always reset the applet task's timeout
- MINOR: cli: add a new "wait" command to wait for a certain delay
- BUG/MINOR: applet: Always release empty appctx buffers after processing
- MINOR: server: split the server deletion code in two parts
- MINOR: cli/wait: make the wait command support a more detailed help message
- MINOR: cli/wait: also support an unrecoverable failure status
- MINOR: cli/wait: also pass up to 4 arguments to the external conditions
- MINOR: cli/wait: add a condition to wait on a server to become unused
- CI: Update to actions/cache@v4
- BUILD: address a few remaining calloc(size, n) cases
- BUG/MEDIUM: pool: fix rare risk of deadlock in pool_flush()
As reported by github user @JB0925 in issue #2427, there is a possible
crash in pool_flush(). The problem is that if the free_list is not empty
in the first test, and is empty at the moment the xchg() is performed,
for example because another thread called it in parallel, we place a
POOL_BUSY there that is never removed later, causing the next thread to
wait forever.
This was introduced in 2.5 with commit 2a4523f6f ("BUG/MAJOR: pools: fix
possible race with free() in the lockless variant"). It has probably
very rarely been detected, because:
- pool_flush() is only called when stopping is set
- the function does nothing if global pools are disabled, which is
the case on most modern systems with a fast memory allocator.
It's possible to reproduce it by modifying __task_free() to call
pool_flush() on 1% of the calls instead of only when stopping.
The fix is quite simple: it consists of moving the zeroing of the
entry into the break path, after verifying that the entry was not already
busy.
This must be backported wherever commit 2a4523f6f is.
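The locking pattern described above can be illustrated with a minimal sketch. It uses C11 atomics rather than the project's own atomic macros, the value chosen for POOL_BUSY is an assumption, and detach_free_list() is a hypothetical helper, not the actual pool_flush() code. The point it shows is that the head is reset to NULL on every path where the exchange did not return POOL_BUSY, including the one where the list was emptied between the first test and the exchange:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Marker stored in the list head while a thread owns it. The value is an
 * assumption made for this sketch.
 */
#define POOL_BUSY ((void *)1)

/* Hypothetical helper: atomically detaches the whole free list and returns
 * its former head (possibly NULL). The head is always left as NULL when we
 * were the ones who placed POOL_BUSY there.
 */
static void *detach_free_list(_Atomic(void *) *head)
{
    void *next;

    assert(head != NULL);
    next = atomic_load(head);
    for (;;) {
        /* another thread owns the head: wait for it to release it */
        while (next == POOL_BUSY)
            next = atomic_load(head);

        if (next == NULL)
            return NULL; /* nothing to flush, nothing was locked */

        next = atomic_exchange(head, POOL_BUSY);
        if (next != POOL_BUSY) {
            /* We placed POOL_BUSY ourselves, so we must remove it,
             * even if <next> turned out to be NULL because the list
             * was emptied in parallel.
             */
            atomic_store(head, NULL);
            return next;
        }
        /* someone else was already busy: loop and wait again */
    }
}
```

The race itself cannot be shown in a single thread; the sketch only makes the ownership rule explicit.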
In issue #2427 Ilya reports that gcc-14 rightfully complains about
sizeof() being placed in the left term of calloc(). There's no impact
but it's a bad pattern that gets copy-pasted over time. Let's fix the
few remaining occurrences (debug.c, halog, udp-perturb).
This can be backported to all branches, and the irrelevant parts dropped.
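As a reminder of the expected argument order, here is a minimal sketch of the corrected pattern. The struct item type and the alloc_items() helper are hypothetical and only serve the illustration:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical element type, for illustration only. */
struct item {
    int id;
    char name[16];
};

/* Allocates <n> zeroed items. calloc() expects (nmemb, size): the element
 * count goes first and sizeof() second. The reversed form,
 * calloc(sizeof(*items), n), allocates the same amount of memory but is the
 * pattern the compiler complains about.
 */
static struct item *alloc_items(size_t n)
{
    struct item *items;

    assert(n > 0);
    items = calloc(n, sizeof(*items));
    return items; /* NULL on allocation failure */
}
```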
The "wait" command now supports a condition, "srv-unused", which waits
for the designated server to become totally unused, indicating that it
is removable. Upon each wakeup it calls srv_check_for_deletion() to
determine whether the conditions are met, not yet met but recoverable, or
unrecoverable, and proceeds accordingly, never waiting for a final
decision longer than the configured delay.
The purpose is to make it possible to remove servers from the CLI after
waiting for their sessions to be terminated:
$ socat -t5 /path/to/socket - <<< "
disable server px/srv1
shutdown sessions server px/srv1
wait 2s srv-unused px/srv1
del server px/srv1"
Or even wait for connections to terminate by themselves:
$ socat -t70 /path/to/socket - <<< "
disable server px/srv1
wait 1m srv-unused px/srv1
del server px/srv1"
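The wait logic described above (re-evaluate on each wakeup, stop on a final decision, never exceed the delay) can be sketched as follows. All names here (wait_status, wait_for(), fake_srv, srv_unused()) are hypothetical, time is simulated with plain counters, and the condition callback artificially drains one session per call so that the example terminates:

```c
#include <assert.h>

/* Hypothetical outcome of a wait. */
enum wait_status {
    WAIT_DONE,    /* condition met */
    WAIT_RETRY,   /* not met yet, may succeed later (recoverable) */
    WAIT_FAIL,    /* will never succeed (unrecoverable) */
    WAIT_TIMEOUT, /* delay elapsed without a final decision */
};

/* Hypothetical condition callback: returns DONE, RETRY or FAIL. */
typedef enum wait_status (*wait_cond_fn)(void *ctx);

/* Evaluates <cond> once per simulated wakeup, every <step_ms>, starting at
 * <now_ms> and giving up once <deadline_ms> is reached.
 */
static enum wait_status wait_for(wait_cond_fn cond, void *ctx,
                                 unsigned int now_ms,
                                 unsigned int deadline_ms,
                                 unsigned int step_ms)
{
    assert(step_ms > 0);
    for (;;) {
        enum wait_status st = cond(ctx);

        if (st != WAIT_RETRY)
            return st; /* final decision: success or hard failure */
        if (now_ms >= deadline_ms)
            return WAIT_TIMEOUT;
        now_ms += step_ms; /* simulated sleep until next wakeup */
    }
}

/* Simulated server: "unused" once its session count reaches zero. */
struct fake_srv {
    int sessions;
    int exists;
};

static enum wait_status srv_unused(void *ctx)
{
    struct fake_srv *srv = ctx;

    if (!srv->exists)
        return WAIT_FAIL;
    if (srv->sessions > 0) {
        srv->sessions--; /* simulation only: one session ends per wakeup */
        return WAIT_RETRY;
    }
    return WAIT_DONE;
}
```

In this sketch a timeout and an unrecoverable failure are reported separately, so that a script can tell "try again later" apart from "this will never work".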
Conditions will need to have context, arguments, etc. from the command line.
Since these will vary with time (otherwise we wouldn't wait), let's just
pass them as text (possibly pre-processed). We're starting with 4 strings
that are expected to be allocated by strdup() and are always sent to free()
upon release.
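The ownership rule for these strings can be sketched as below. The wait_args structure, the WAIT_MAX_ARGS constant and both helpers are hypothetical names chosen for the illustration; only the "up to 4 strdup()'ed strings, always passed to free() on release" behaviour comes from the description above:

```c
#define _POSIX_C_SOURCE 200809L /* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define WAIT_MAX_ARGS 4 /* hypothetical name for the "4 strings" limit */

/* Hypothetical context holding the textual arguments of a condition. */
struct wait_args {
    char *args[WAIT_MAX_ARGS];
};

/* Frees every argument unconditionally; free(NULL) is a no-op, so unused
 * slots need no special casing.
 */
static void wait_args_release(struct wait_args *wa)
{
    for (int i = 0; i < WAIT_MAX_ARGS; i++) {
        free(wa->args[i]);
        wa->args[i] = NULL;
    }
}

/* Duplicates up to WAIT_MAX_ARGS strings into <wa>. Returns 0 on success,
 * or -1 on allocation failure, in which case <wa> is left fully released.
 */
static int wait_args_set(struct wait_args *wa, const char *const *in, int count)
{
    assert(count >= 0 && count <= WAIT_MAX_ARGS);
    for (int i = 0; i < WAIT_MAX_ARGS; i++)
        wa->args[i] = NULL;

    for (int i = 0; i < count; i++) {
        wa->args[i] = strdup(in[i]);
        if (!wa->args[i]) {
            wait_args_release(wa);
            return -1;
        }
    }
    return 0;
}
```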
Since we'll support waiting for an action to succeed or permanently
fail, we need the ability to return an unrecoverable failure. Let's
add CLI_WAIT_ERR_FAIL for this. A static error message may be placed
into ctx->msg to report to the user why the failure is unrecoverable.
We'll need to be able to verify whether or not a server may be deleted.
For now, both the verification and the action are performed in the same
function, at once under thread isolation. The goal here is to extract
the verification code into a new function that will perform these checks,
return a status between success/recoverable/non-recoverable failure, and
will also return a message for the caller.
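The split between verification and action can be sketched as follows. This is not the real server code: the fake_server fields, the check_status values, both function names and the rules deciding what is recoverable are assumptions made for the illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tri-state result of the verification step. */
enum check_status {
    CHECK_OK,    /* deletion may proceed */
    CHECK_RETRY, /* not removable yet, may become so (recoverable) */
    CHECK_FAIL,  /* cannot be removed in this state (non-recoverable) */
};

/* Simplified stand-in for a server; fields are illustrative only. */
struct fake_server {
    int in_maintenance;
    int cur_sess;
    int deleted;
};

/* Verification only: no side effect on <srv>. On return, <*msg> points to a
 * static message explaining the status, or is NULL on success.
 */
static enum check_status check_for_deletion(const struct fake_server *srv,
                                            const char **msg)
{
    assert(srv != NULL && msg != NULL);
    if (!srv->in_maintenance) {
        *msg = "server is not in maintenance";
        return CHECK_FAIL;
    }
    if (srv->cur_sess > 0) {
        *msg = "server still has active sessions";
        return CHECK_RETRY;
    }
    *msg = NULL;
    return CHECK_OK;
}

/* Action: reuses the verification, then performs the deletion.
 * Returns 0 on success, -1 otherwise with <*msg> set.
 */
static int delete_server(struct fake_server *srv, const char **msg)
{
    if (check_for_deletion(srv, msg) != CHECK_OK)
        return -1;
    srv->deleted = 1;
    return 0;
}
```

Keeping the check free of side effects is what lets another caller invoke it repeatedly before attempting the deletion itself.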
When an applet is using its own buffers, it is important to release them, if
empty, after processing to recycle unused buffers. This is not a leak, because
these buffers are necessarily released when the applet is released, but it
leads to an excess of buffer allocations.
No need to backport.
This makes it possible to insert delays between commands, e.g. to collect
the same set of metrics at a fixed interval. For example:
$ socat -t20 /path/to/socket <<< "show activity; wait 10s; show activity"
The goal will be to extend the feature to optionally support waiting on
certain conditions. For this reason the struct definitions and enums were
placed into cli-t.h.
The CLI applet doesn't make use of its timeout at all, only the stream
does. That's convenient because it allows any command's I/O handler to
trivially set a wakeup timer by simply touching the task's ->expire
field, and the I/O handler will automatically be woken up again. The
only condition for this is that we properly take care of clearing that
timeout whenever we finish processing a command and switch back to the
PROMPT state. That's what this patch does.
If a release handler produces a final message, it's currently left
pending in the CLI context and needs another I/O event to be dumped
because immediately after calling ->release, we check for states
OUTPUT and above and we wait until more data arrives.
This patch adds a continue statement to go back to the loop immediately
after leaving the release handler in order to attempt to emit the
output message.
At this point it's unclear whether any release handlers are producing
messages, so it's probably not needed to backport this.
Some commands are still missing their trailing LF, and a few were even
spotted in the past emitting more than one. The risk of missing this LF
is particularly high, especially when tests are run in non-interactive
mode where the output looks good at first glance. The problem is that
once run in interactive mode, the missing empty line means the command
never appears complete, and scripts can wait forever.
Let's tackle the problem at its root: messages emitted at the end must
always end with an LF and we know some miss it. Thus, in cli_output_msg()
we now start by removing the trailing LFs from the string, and we always
add exactly one. This way the trailing LFs added by already-correct
functions are silently ignored and all functions are now correct.
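The normalization described above can be sketched as a standalone helper. It is not the actual cli_output_msg() code: normalize_trailing_lf() is a hypothetical function that writes into a caller-supplied buffer, strips every trailing LF from the message and appends exactly one:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies <msg> into <out> (of <size> bytes) with all trailing LFs removed
 * and exactly one LF appended. Returns the resulting length, or -1 if <out>
 * is too small to hold the result and its terminating zero.
 */
static int normalize_trailing_lf(const char *msg, char *out, size_t size)
{
    size_t len;

    assert(msg != NULL && out != NULL);
    len = strlen(msg);

    /* drop every trailing LF, whether there were zero, one or many */
    while (len > 0 && msg[len - 1] == '\n')
        len--;

    if (len + 2 > size)
        return -1;

    memcpy(out, msg, len);
    out[len] = '\n'; /* always exactly one */
    out[len + 1] = '\0';
    return (int)(len + 1);
}
```

With this approach a message with no LF, one LF or several LFs all produce the same output, which is why callers no longer need to agree on a convention.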
This would need to be progressively backported to all supported versions
in order to address them all at once, though the risk of breaking a legacy
script relying on the wrong output is never zero. At first it should at
least go as far as the latest LTS (2.8), and maybe another one depending
on user demand. Note that it also requires the previous patch ("BUG/MINOR:
vars/cli: fix missing LF after "get var" output") because it fixes a test
for a bogus output for "get var" in a VTC.