The aim of the idle timeout is to silently closed the connection after a period
of inactivity depending on the "max_idle_timeout" transport parameters advertised
by the endpoints. We add a new task to implement this timer. Its expiry is
updated each time we received an ack-eliciting packet, and each time we send
an ack-eliciting packet if no other such packet was sent since we received
the last ack-eliciting packet. Such conditions may be implemented thanks
to QUIC_FL_CONN_IDLE_TIMER_RESTARTED_AFTER_READ new flag.
This packet number space flags were defined with the same value because
defined at different places in the file. Assemble them at the same location
with different values.
This bug could unvalidate the peer address after it was validated
during the handshake leading to the anti-amplication limit to be
enabled again after having been disabled. The situation could not
be unblocked (deadlock).
There is no need to use such a reference counter anymore since the QUIC
connections are always handled by the same thread.
quic_conn_drop() is removed. Its code is merged into quic_conn_release().
When we store the remote transport parameters, we compute the maximum idle
timeout for the connection which is the minimum of the two advertised
max_idle_timeout transport parameter values if both have non-null values, or the
maximum if one of the value is set and non-null.
Andrew Suffield reported in issue #1596 that we've had a bug in
session_accept_fd() since 2.4 with commit 1b3c931bf ("MEDIUM:
connections: Introduce a new XPRT method, start().") where an error
label is wrong and may cause the leak of the freshly allocated session
in case conn_xprt_start() returns < 0.
The code was checked there and the only two transport layers available
at this point are raw_sock and ssl_sock. The former doesn't provide a
->start() method hence conn_xprt_start() will always return zero. The
second does provide such a function, but it may only return <0 if the
underlying transport (raw_sock) has such a method and fails, which is
thus not the case.
So fortunately it is not possible to trigger this leak.
The patch above also touched the accept code in quic_sock() which was
mostly a plain copy of the session code, but there the move didn't
have this impact, and since then it was simplified and the next change
moved it to its final destination with the proper error label.
This should be backported as far as 2.4 as a long-term safety measure
(e.g. if in the future we have a reason for making conn_xprt_start()
to start failing), but will not have any positive nor negative effect
in the short term.
These two sample fetch methods report respectively the file name and the
line number where was located the last rule that was final. This is aimed
at being used on log-format lines to help admins figure what rule in the
configuration gave a final verdict, and help understand the condition
that led to the action.
For example, it's now possible to log the last matched rule by adding
this to the log-format:
... lr=%[last_rule_file]:%[last_rule_line]
A regtest is provided to test various combinations of final rules, some
even on top of each other from different rulesets.
When a tcp-{request,response} content or http-request/http-response
rule delivers a final verdict (deny, accept, redirect etc), the last
evaluated one will now be recorded in the stream. The purpose is to
permit to log the last one that performed a final action. For now
the log is not produced.
The distcc* sample fetch methods were surprisingly located within the
"internal state" section, while they in fact depend on L6 contents.
This can be backported to all versions where they appear.
In TCP, when a conn-stream is detached from a backend connection, the
connection must be always closed. It was only performed if an error or a
shutdown occurred or if there was no connection owner. But it is a problem,
because, since the 2.3, backend connections are always owned by a
session. This way it is possible to have idle connections attached to a
session instead of a server. But there is no idle connections in TCP. In
addition, when a session owns a connection it is responsible to close it
when it is released. But it only works for idle connections. And it only
works if the session is released.
Thus there is the place for bugs here. And indeed, a connection leak may
occur if a connection retry is performed because of a timeout. In this case,
the underlying connection is still alive and is waiting to be fully
established. Thus, when the conn-stream is detached from the connection, the
connection is not closed. Because the PT multiplexer is quite simple, there
is no timeout at this stage. We depend on the kenerl to be notified and
finally close the connection. With an unreachable server, orphan backend
connections may be accumulated for a while. It may be perceived as a leak.
Because there is no reason to keep such backend connections, we just close
it now. Frontend connections are still closed by the session or when an
error or a shutdown occurs.
This patch should fix the issue #1522. It must be backported as far as
2.0. Note that the 2.2 and 2.0 are not affected by this bug because there is
no owner for backend TCP connections. But it is probably a good idea to
backport the patch on these versions to avoid any future bugs.
Found manually, while creating the previous commits to turn `struct proxy`
members into ists.
There is an existing Coccinelle rule to replace this pattern by `istadv()` in
`ist.cocci`:
@@
struct ist i;
expression e;
@@
- i.ptr += e;
- i.len -= e;
+ i = istadv(i, e);
But apparently it is not smart enough to match ists that are stored in another
struct. It would be useful to make the existing rule more generic, so that it
might catch similar cases in the future.
The server_id_hdr_name is already processed as an ist in various locations lets
also just store it as such.
see 0643b0e7e ("MINOR: proxy: Make `header_unique_id` a `struct ist`") for a
very similar past commit.
The orgto_hdr_name is already processed as an ist in `http_process_request`,
lets also just store it as such.
see 0643b0e7e ("MINOR: proxy: Make `header_unique_id` a `struct ist`") for a
very similar past commit.
The fwdfor_hdr_name is already processed as an ist in `http_process_request`,
lets also just store it as such.
see 0643b0e7e ("MINOR: proxy: Make `header_unique_id` a `struct ist`") for a
very similar past commit.
The monitor_uri is already processed as an ist in `http_wait_for_request`, lets
also just store it as such.
see 0643b0e7e ("MINOR: proxy: Make `header_unique_id` a `struct ist`") for a
very similar past commit.
Channels buffer state is displayed in the strem trace messages. However,
because of a typo, the request buffer was used instead of the response one.
This patch should be backported as far as 2.2.
The response analyzer of the master CLI only handles read errors. So if
there is a write error, the session remains stuck because some outgoing data
are blocked in the channel and the response analyzer waits everything to be
sent. Because the maxconn is set to 10 for the master CLI, it may be
unresponsive if this happens to many times.
Now read and write errors, timeouts and client aborts are handled.
This patch should solve the issue #1512. It must be backported as far as
2.0.
In the I/O handler of the cache applet, we must update the underlying buffer
when the HTX message is loaded, using htx_from_buf() function instead of
htxbuf(). It is important because the applet will update the message by
adding new HTX blocks. This way, the state of the underlying buffer remains
consistant with the state of the HTX message.
It is especially important if HAProxy is compiled with "DEBUG_STRICT=2"
mode. Without this patch, channel_add_input() call crashed if the channel
was empty at the begining of the I/O handler.
Note that it is more a build/debug issue than a bug. But this patch may
prevent future bugs. For now it is safe because htx_to_buf() function is
systematically called, updating accordingly the underlying buffer.
This patch may be backported as far as 2.0.
For now, for a stream, request analyzers are set at 2 stages. The first one
is when the stream is created. The session's listener analyzers, if any, are
set on the request channel. In addition, some HTTP analyzers are set for HTX
streams (AN_REQ_WAIT_HTTP and AN_REQ_HTTP_PROCESS_FE). The second one is
when the backend is set on the stream. At the stage, request analyzers are
updated using the backend settings.
It is an issue for client applets because there is no listener attached to
the stream. In addtion, it may have no specific/dedicated backend. Thus,
several request analyzers are missing. Among others, the HTTP analyzers for
HTTP applets. The HTTP client is the only one affected for now.
To fix the bug, when a stream is created without a listener, we use the
frontend to set the request analyzers. Note that there is no issue with the
response channel because its analyzers are set when the server connection is
established.
This patch may be backported to all stable versions. Because only the HTTP
client is affected, it must at least be backported to 2.5. It is related to
the issue #1593.
This bug is the same than for the HTTP client. See "BUG/MINOR: httpclient:
Set conn-stream/channel EOI flags at the end of request" for details.
This patch must be backported as far as 2.0. But only CF_EOI must be set
because applets are not attached to a conn-stream on older versions.
This bug is the same than for the HTTP client. See "BUG/MINOR: httpclient:
Set conn-stream/channel EOI flags at the end of request" for details.
Note that because a filter is always attached to the stream when the cache
is used, there is no issue because there is no direct forwarding in this
case. Thus the stream analyzers are able to see the HTX_FL_EOM flag on the
HTX messge.
This patch must be backported as far as 2.0. But only CF_EOI must be set
because applets are not attached to a conn-stream on older versions.
This bug is the same than for the HTTP client. See "BUG/MINOR: httpclient:
Set conn-stream/channel EOI flags at the end of request" for details.
This patch must be backported as far as 2.0. But only CF_EOI must be set
because applets are not attached to a conn-stream on older versions.
This bug is the same than for the HTTP client. See "BUG/MINOR: httpclient:
Set conn-stream/channel EOI flags at the end of request" for details.
This patch must be backported as far as 2.0. But only CF_EOI must be set
because applets are not attached to a conn-stream on older versions.
In HTX, HTX_FL_EOM flag is added on the message to notifiy the end of the
message was received. In addition, the producer must set CS_FL_EOI flag on
the conn-stream. If it is a mux, the stream-interface is responsible to set
CF_EOI flag on the input channel. But, for now, if the producer is an
applet, in addition to the conn-stream flag, it must also set the channel
one.
These flags are used to notify the stream that the message is finished and
no more data are expected. It is especially important when the message
itself it directly forwarded from one side to the other. Because in this
case, the stream has no way to see the HTX_FL_EOM flag on the
message. Otherwise, the stream will detect a client or a server abort,
depending on the side.
For the HTTP client, it is not really easy to diagnose this error because
there is also another bug hiding this one. All HTTP request analyzers are
not set on the input channel. This will be fixed by another patch.
This patch must be backported to 2.5. It is related to the issue #1593.
Supporting kFreebsd previously led to FreeBSD (< 14) build breakage:
In file included from src/cpuset.c:5:
In file included from include/haproxy/cpuset.h:4:
include/haproxy/cpuset-t.h:46:2: error: unknown type name 'cpu_set_t'; did you mean 'cpuset_t'?
CPUSET_REPR cpuset;
^~~~~~~~~~~
cpuset_t
include/haproxy/cpuset-t.h:21:22: note: expanded from macro 'CPUSET_REPR'
# define CPUSET_REPR cpu_set_t
^
In commit e9ed63e548 dark mode support was added to the stats page. The
initial commit does not include dark mode color overwrites for the
.socket CSS class. This commit colors socket rows the same way as
backends that acre active but do not have a health check defined.
This fixes an issue where reading information from socket lines became
really hard in dark mode due to suboptimal coloring of the cell
background and the font in it.
Change the return value to success in qc_handle_bidi_strm_frm for two
specific cases :
* if STREAM frame is an already received offset
* if application decoding failed
This ensures that the packet is not dropped and properly acknowledged.
Previous to this fix, the return code was set to error which prevented
the ACK to be generated.
The impact of the bug might be noticeable in environment with packet
loss and retransmission. Due to haproxy not generating ACK for packets
containing STREAM frames with already received offset, the client will
probably retransmit them again, which will worsen the network
transmission.
The "show sess" cli command only handles "http" or "tcp" as a fallback
mode, replace this by a call to proxy_mode_str() to show all the modes.
Could be backported in every maintained versions.
Some users with very large numbers of connections have been facing
extremely long malloc_trim() calls on reload that managed to trigger
the watchdog! That's a bit counter-productive. It's even possible
that some implementations are not perfectly reliable or that their
trimming time grows quadratically with the memory used. Instead of
constantly trying to work around these issues, let's offer an option
to disable this mechanism, since nobody had been complaining in the
past, and this was only meant to be an improvement.
This should be backported to 2.4 where trimming on reload started to
appear.
Around limits for QUIC integer encoding, this functions could return
wrong values which lead to qc_build_frms() to prepare wrong CRYPTO (less chances)
or STREAM frames (more chances). qc_do_build_pkt() could build wrong packets
with bad CRYPTO/STREAM frames which could not be decoded by the peer.
In such a case ngtcp2 closes the connection with an ENCRYPTION_ERROR error
in a transport CONNECTION_CLOSE frame.
This function returns the maximum integer which may be encoded with a number of
bytes passed as parameter. Useful to precisely compute the number of bytes which
may used to fulfill a buffer with lengths as QUIC enteger encoded prefixes for the
number of following bytes.
When in congestion avoidance state and when acknowledging an <acked> number bytes
we must increase the congestion window by at most one datagram (<path->mtu>)
by congestion window. So thanks to this patch we apply a ratio to the current
number of acked bytes : <acked> * <path->mtu> / <cwnd>.
So, when <cwnd> bytes are acked we precisely increment <cwnd> by <path->mtu>.
Furthermore we take into an account the number of remaining acknowledged bytes
each time we increment the window by <acked> storing their values in the algorithm
struct state (->remain_acked) so that it might be take into an account at the
next ACK event.
This function returns the remaining number of bytes which can be sent on the
network before fulfilling the congestion window. There is a counter for
the number of prepared data and another one for the really in flight number
of bytes (in_flight). These variable have been mixed up.
Since the persistent congestion detection is done out of the congestion
controllers, there is no need to pass them information through quic_cc_event struct.
We remove its useless members. Also remove qc_cc_loss_event() which is no more used.
We establish the persistent congestion out of any congestion controller
to improve the algorithms genericity. This path characteristic detection may
be implemented regarless of the underlying congestion control algorithm.
Send congestion (loss) event using directly quic_cc_event(), so without
qc_cc_loss_event() wrapper function around quic_cc_event().
Take the opportunity of this patch to shorten "newest_time_sent" member field
of quic_cc_event to "time_sent".
We want to be able to make the congestion controllers re-enter the slow
start state outside of the congestion controllers themselves. So,
we add a callback ->slow_start() to do so.
Define this callback for NewReno algorithm.
QUIC connection path in flight bytes is a variable which should not be manipulated
by the congestion controller. This latter aim is to compute the congestion window.
So, we pass it as less as parameters as possible to do so.
kFreeBSD needs to be treated as a distinct target from FreeBSD
since the underlying system libc is the GNU one. Thus, relying
only on __GLIBC__ no longer suffice.
- freebsd-glibc new target, key difference is including crypt.h
and linking to libdl like linux.
- cpu affinity available but the api is still the FreeBSD's.
- enabling auxiliary data access only for Linux.
Patch based on preliminary work done by @bigon.
closes#1555
Implement the locally flow-control streams limit for opened
bidirectional streams. Add a counter which is used to count the total
number of closed streams. If this number is big enough, emit a
MAX_STREAMS frame to increase the limit of remotely opened bidirectional
streams.
This is the first commit to implement QUIC flow-control. A series of
patches should follow to complete this.
This is required to be able to handle more than 100 client requests.
This should help to validate the Multiplexing interop test.
This commit should fix the possible transfer interruption caused by the
previous commit. The MUX always retry to send frames if there is
remaining data after a send call on the transport layer. This is useful
if the transport layer is not blocked on the sending path.
In the future, the transport layer should retry by itself the send
operation if no blocking condition exists. The MUX layer will always
subscribe to retry later if remaining frames are reported which indicate
a blocking on the transport layer.
Modify the STREAM emission in qc_send. Use the new transport function
qc_send_app_pkts to directly send the list of constructed frames. This
allows to remove the tasklet wakeup on the quic_conn and should reduce
the latency.
If not all frames are send after the transport call, subscribe the MUX
on the lower layer to be able to retry. Currently there is a bug because
the transport layer does not retry to send frames in excess after a
successful sendto. This might cause the transfer to be interrupted.