Historically the client-fin and server-fin timeouts were made to allow
a connection closure to be effective quickly if the last data were sent
down a socket and the client didn't close, something that can happen
when the peer's FIN is lost and retransmits are blocked by a firewall
for example. This made complete sense in 1.5 for TCP and HTTP in close
mode. But nowadays with muxes, it's not done at the right layer anymore
and even the description doesn't match what is being done, because what
happens is that the stream will abort the whole transfer after it's done
sending to the mux and this timeout expires.
We've seen in GH issue 2095 that this can happen with very short timeout
values, and while this didn't trigger often before, now that the muxes
(h2 & quic) properly report an end of stream before even the first
sc_conn_sync_recv(), it seems that it can happen more often, and have
two undesirable effects:
- logging a timeout when that's not the case
- aborting the request channel, hence the server-side conn, possibly
before it had a chance to be put back to the idle list, causing
this connection to be closed and not reusable.
Unfortunately for TCP (mux_pt) this remains necessary because the mux
doesn't have a timeout task. So here we're adding tests to only do
this through an HTX mux. But to be really clean we should in fact
completely drop all of this and implement these timeouts in the mux
itself.
This needs to be backported to 2.8 where the issue was discovered,
and maybe carefully to older versions, though that is not sure at
all. In any case, using a higher timeout or removing client-fin in
HTTP proxies is sufficient to make the issue disappear.
As seen in commits 33a4461fa ("BUG/MINOR: stats: Fix Lua's `get_stats`
function") and a46b142e8 ("BUG/MINOR: Missing stat_field_names (since
f21d17bb)") it seems frequent to omit to update stats_fields[] when
adding a new ST_F_xxx entry. This breaks Lua's get_stats() and shows
a "(null)" in the header of "show stat", but that one is not detectable
to the naked eye anymore.
Let's add a reminder above the enum declaration about this, and a small
reg tests checking for the absence of "(null)". It was verified to fail
before the last patch above.
This patch is similar to the previous one but for QUIC mux functions
used inside the mux code itself or application layer. Replace all
occurences of qc_* prefix by qcc_* or qcs_*. This should help to better
differentiate code between quic_conn and MUX.
This should be backported up to 2.7.
There was a misnaming in stats counter for *_BLOCKED frames in regard to
QUIC rfc convention. This patch fixes it to prevent future ambiguity :
- STREAMS_BLOCKED -> STREAM_DATA_BLOCKED
- STREAMS_DATA_BLOCKED_BIDI -> STREAMS_BLOCKED_BIDI
- STREAMS_DATA_BLOCKED_UNI -> STREAMS_BLOCKED_UNI
This should be backported up to 2.7.
Remove nb_streams field from qcc. It was not used outside of a BUG_ON()
statement to ensure we never have a negative count of streams. However
this is already checked with other fields.
This should be backported up to 2.7.
The Linux kernel maintains data structures to track a processes' open file
descriptors, and it expands these structures as necessary when FD usage grows
(at every FD=2^X starting at 64). However when threading is in use, during
expansion the kernel will pause (observed up to 47ms) while it waits for thread
synchronization (see https://bugzilla.kernel.org/show_bug.cgi?id=217366).
This change addresses the issue and avoids the random pauses by opening the
maximum file descriptor during initialization, so that expansion will not occur
while processing traffic.
TCC silently ignores the weak and section attributes, which ruins the
initcalls. Technically we're exactly in the same situation as with an
obsolete linker. Let's just automatically set the flag if TCC is
detected, this avoids surprises where the program compiles but does
not start.
No backport is needed.
TCC doesn't knoow about __attribute__((weak)), it silently ignores it.
We could add a "static" modifier there in this case but we already have
an alternate portable mode that is based on a slightly larger literal
for obsolete linkers (and non-ELF systems) which choke on weak. Let's
just add the test for tcc there and use it in this case.
No backport is needed.
TCC is upset by the declaration looking like:
const unsigned char ist_lc[256] __attribute__((weak)) = ((const unsigned char[256]){ ... });
It was written like this because it's expanded from the _IST_LC macro
but it's never used as-is, it's only used from ist_lc, which should be
the one containing the cast so that the macro only contains the list of
bytes that can be used in both places. And this assigns more consistent
roles to the lower and upper case macro/variable now, one is typed and
the other one not. No backport is needed.
Add ->sent_pkt counter to quic_conn struct to count the packet at QUIC connection
level. Then, when the connection is released, the ->sent_pkt counter value
is added to the one for the listener.
Must be backported to 2.7.
Add some statistical counters to quic_conn struct from quic_counters struct which
are used at listener level to handle them at QUIC connection level. This avoid
calling atomic functions. Furthermore this will be useful soon when a counter will
be added for the total number of packets which have been sent which will be very
often incremented.
Some counters were not added, espcially those which count the number of QUIC errors
by QUIC error types. Indeed such counters would be incremented most of the time
only one time at QUIC connection level.
Implement quic_conn_prx_cntrs_update() which accumulates the QUIC connection level
statistical counters to the listener level statistical counters.
Must be backported to 2.7.
Thierry Fournier reported a build breakage with the ubiquitous make
3.81, LDFLAGS were ignored. This is caused by the declaration of the
collect_opt_flags macro that is defined with an "=" sign, something
that only appeared in 3.82 and that is not necessary. With it removed,
the build now works fine at least from 3.80 to 4.3.
No backport is needed since this makefile cleanup appeared in 2.8.
During a code audit of the various situations that promote ERR_PENDING to
ERROR, it appeared that:
- all muxes use se_fl_set_error() to set it, which chooses either based
on EOI/EOS presence ;
- EOI/EOS that arrive late after ERR_PENDING were not systematically
upgraded to ERROR
This results in confusion about how such ERROR or ERR_PENDING ought to
be handled, which is not quite desirable.
This patch adds a test to se_fl_set() to detect if we're setting EOI or
EOS while ERR_PENDING is present, or the other way around so that any
sequence of EOI/EOS <-> ERR_PENDING results in ERROR being set. This
way there will no longer be possible situations where ERROR is missing
while the other ones are set.
quic_aead_iv_build() should never fail unless we call it with buffers of
different size. This never happens in the code as every input buffers
are of size QUIC_TLS_IV_LEN.
Remove the return value and add a BUG_ON() to prevent future misusage.
This is especially useful to remove one error handling on the sending
patch via quic_packet_encrypt().
This should be backported up to 2.7.
Right now there's no way to enforce a specific value of now_ms upon
startup in order to compensate for the time it takes to load a config,
specifically when dealing with the health check startup. For this we'd
need to force the now_offset value to compensate for the last known
value of the current date. This patch exposes a function to do exactly
this.
Just like we have the uptime in "show info", let's add the boot time.
It's trivial to collect as it's just the difference between the ready
date and the start date, and will allow users to monitor this element
in order to take action before it starts becoming problematic. Here
the boot time is reported in milliseconds, so this allows to even
observe sub-second anomalies in startup delays.
Some huge configs take a significant amount of time to start and this
can cause some trouble (e.g. health checks getting delayed and grouped,
process not responding to the CLI etc). For example, some configs might
start fast in certain environments and slowly in other ones just due to
the use of a wrong DNS server that delays all libc's resolutions. Let's
first start by measuring it by keeping a copy of the most recently known
ready date, once before calling check_config_validity() and then refine
it when leaving this function. A last call is finally performed just
before deciding to split between master and worker processes, and it covers
the whole boot. It's trivial to collect and even allows to get rid of a
call to clock_update_date() in function check_config_validity() that was
used in hope to better schedule future events.
Uninline and move qc_attach_sc() function to implementation source file.
This will be useful for next commit to add traces in it.
This should be backported up to 2.7.
MUX is responsible to put EOS on stream when read channel is closed.
This happens if underlying connection is closed or a RESET_STREAM is
received. FIN STREAM is ignored in this case.
For connection closure, simply check for CO_FL_SOCK_RD_SH.
For RESET_STREAM reception, a new flag QC_CF_RECV_RESET has been
introduced. It is set when RESET_STREAM is received, unless we already
received all data. This is conform to QUIC RFC which allows to ignore a
RESET_STREAM in this case. During RESET_STREAM processing, input buffer
is emptied so EOS can be reported right away on recv_buf operation.
This should be backported up to 2.7.
Fix the openssl build with older openssl version by disabling the new
ssl_c_r_dn fetch.
This also disable the ssl_client_samples.vtc file for OpenSSL version
older than 1.1.1
This patch addresses #1514, adds the ability to fetch DN of the root
ca that was in the chain when client certificate was verified during SSL
handshake.
qc_stream_buf_alloc() can fail for two reasons :
* limit of Tx buffer per connection reached
* allocation failure
The first case is properly treated. A flag QC_CF_CONN_FULL is set on the
connection to interrupt emission. It is cleared when a buffer became
available after in order ACK reception and the MUX tasklet is woken up.
The allocation failure was handled with the same mechanism which in this
case is not appropriate and could lead to a connection transfer freeze.
Instead, prefer to close the connection with a QUIC internal error code.
To differentiate the two causes, qc_stream_buf_alloc() API was changed
to return the number of available buffers to the caller.
This must be backported up to 2.6.
Remove QUIC MUX function qcs_http_handle_standalone_fin(). The purpose
of this function was only used when receiving an empty STREAM frame with
FIN bit. Besides, it was called by each application protocol which could
have different approach and render the function purpose unclear.
Invocation of qcs_http_handle_standalone_fin() have been replaced by
explicit code in both H3 and HTTP/0.9 module. In the process, use
htx_set_eom() to reliably put EOM on the HTX message.
This should be backported up to 2.7, along with the previous patch which
introduced htx_set_eom().
Implement a new HTX utility function htx_set_eom(). If the HTX message
is empty, it will first add a dummy EOT block. This is a small trick
needed to ensure readers will detect the HTX buffer as not empty and
retrieve the EOM flag.
Replace the H2 code related by a htx_set_eom() invocation. QUIC also has
the same code which will be replaced in the next commit.
This should be backported up to 2.7 before the related QUIC patch.
This provides more consistency between the master and the worker. When
"prompt timed" is passed on the master, the timed mode is toggled. When
enabled, for a master it will show the master process' uptime, and for
a worker it will show this worker's uptime. Example:
master> prompt timed
[0:00:00:50] master> show proc
#<PID> <type> <reloads> <uptime> <version>
11940 master 1 [failed: 0] 0d00h02m10s 2.8-dev11-474c14-21
# workers
11955 worker 0 0d00h00m59s 2.8-dev11-474c14-21
# old workers
11942 worker 1 0d00h02m10s 2.8-dev11-474c14-21
# programs
[0:00:00:58] master> @!11955
[0:00:01:03] 11955> @!11942
[0:00:02:17] 11942> @
[0:00:01:10] master>
Entering "prompt timed" toggles reporting of the process' uptime in
the prompt, which will report days, hours, minutes and seconds since
it was started. As discussed with Tim in issue #2145, this can be
convenient to roughly estimate the time between two outputs, as well
as detecting that a process failed to be reloaded for example.
Adding http_free_redirect_rule() function to free a single redirect rule
since it may be required to free rules outside of free_proxy() function.
This patch is required for an upcoming bugfix.
[for 2.2, free_proxy function did not exist (first seen in 2.4), thus
http_free_redirect_rule() needs to be deducted from haproxy.c deinit()
function if the patch is required]
A xref is added between the endpoint descriptors. It is created when the
server endpoint is attached to the SC and it is destroyed when an endpoint
is detached.
This xref is not used for now. But it will be useful to retrieve info about
an endpoint for the opposite side. It is also the warranty there is still a
endpoint attached on the other side.
When "optioon socket-stats" is used in a frontend, its listeners have
their own stats and will appear in the stats page. And when the stats
page has "stats show-legends", then a tooltip appears on each such
socket with ip:port and ID. The problem is that since QUIC arrived, it
was not possible to distinguish the TCP listeners from the QUIC ones
because no protocol indication was mentioned. Now we add a "proto"
legend there with the protocol name, so we can see "tcp4" or "quic6"
and figure how the socket is bound.
Following previous patch, error notification from quic_conn has been
adjusted to rely on standard connection flags. Most notably, CO_FL_ERROR
on the connection instance when a fatal error is detected.
Check for CO_FL_ERROR is implemented by qc_send(). If set the new flag
QC_CF_ERR_CONN will be set for the MUX instance. This flag is similar to
the local error flag and will abort most of the futur processing. To
ensure stream upper layer is also notified, qc_wake_some_streams()
called by qc_process() will put the stream on error if this new flag is
set.
This should be backported up to 2.7.
When an error is detected at quic-conn layer, the upper MUX must be
notified. Previously, this was done relying on quic_conn flag
QUIC_FL_CONN_NOTIFY_CLOSE set and the MUX wake callback called on
connection closure.
Adjust this mechanism to use an approach more similar to other transport
layers in haproxy. On error, connection flags are updated with
CO_FL_ERROR, CO_FL_SOCK_RD_SH and CO_FL_SOCK_WR_SH. The MUX is then
notified when the error happened instead of just before the closing. To
reflect this change, qc_notify_close() has been renamed qc_notify_err().
This function must now be explicitely called every time a new error
condition arises on the quic_conn layer.
To ensure MUX send is disabled on error, qc_send_mux() now checks
CO_FL_SOCK_WR_SH. If set, the function returns an error. This should
prevent the MUX from sending data on closing or draining state.
To complete this patch, MUX layer must now check for CO_FL_ERROR
explicitely. This will be the subject of the following commit.
This should be backported up to 2.7.
Add traces for when an upper layer stream is woken up by the MUX. This
should help to diagnose frozen stream issues.
This should be backported up to 2.7.
As discussed a few times over the years, it's quite difficult to know
how often we stop accepting connections because the global maxconn was
reached. This is not easy to know because when we reach the limit we
stop accepting but we don't know if incoming connections are pending,
so it's not possible to know how many were delayed just because of this.
However, an interesting equivalent metric consist in counting the number
of times an accepted incoming connection resulted in the limit being
reached. I.e. "we've accepted the last one for now". That doesn't imply
any other one got delayed but it's a factual indicator that something
might have been delayed. And by counting the number of such events, it
becomes easier to know whether some limits need to be adjusted because
they're reached often, or if it's exceptionally rare.
The metric is reported as a counter in show info and on the stats page
in the info section right next to "maxconn".
Now in "show info" we have a TotalWarnings field that reports the total
number of warnings issued since the process started. It's also reported
in the the stats page next to the uptime.
LIST_DELETE doesn't affect the previous pointers of the stored element.
This can sometimes hide bugs when such a pointer is reused by accident
in a LIST_NEXT() or equivalent after having been detached for example, or
ia another LIST_DELETE is performed again, something that LIST_DEL_INIT()
is immune to. By compiling with -DDEBUG_LIST, we'll replace a freshly
detached list element with two invalid pointers that will cause a crash
in case of accidental misuse. It's not enabled by default.
->others member of tp_version_information structure pointed to a buffer in the
TLS stack used to parse the transport parameters. There is no garantee that this
buffer is available until the connection is released.
Do not dump the available versions selected by the client anymore, but displayed the
chosen one (selected by the client for this connection) and the negotiated one.
Must be backported to 2.7 and 2.6.
A recent series of patch were introduced to streamline error generation
by QUIC MUX. However, a regression was introduced : every error
generated by the MUX was built as CONNECTION_CLOSE_APP frame, whereas it
should be only for H3/QPACK errors.
Fix this by adding an argument <app> in qcc_set_error. When false, a
standard CONNECTION_CLOSE is used as error.
This bug was detected by QUIC tracker with the following tests
"stop_sending" and "server_flow_control" which requires a
CONNECTION_CLOSE frame.
This must be backported up to 2.7.
The comment for the stconn structure was still referencing the SHUTR/SHUTW
flags. These flags were replaced and we now use ABRT/SHUT flags in
comments. The comment itself was slightly updated to be accurate.
When sc_need_room() is called, the caller cannot request more free space
than a minimum value to be sure it is always possible to unblock it. it is a
safety guard to not freeze any SC on NEED_ROOM condition. At worse it will
lead to some wakeups un excess at the edge.
To keep things simple, the following minimum is used:
(global.tune.bufsize - global.tune.maxrewrite - sizeof(struct htx))
Most of the function in quic_frame.c and quic_frame.h manipulate <buf> buffer
position variables which have nothing to see with struct buffer variables.
Rename them to <pos>
Should be backported to 2.7.
Commit e83f937cc ("MEDIUM: quic: use a global CID trees list") uses a
local variable "tree" used only for locks, but when threads are disabled
it spews a warning about this unused variable.