Commit Graph

6980 Commits

Author SHA1 Message Date
Willy Tarreau 4ad1c9635a BUG/MINOR: stream: do not use client-fin/server-fin with HTX
Historically the client-fin and server-fin timeouts were made to allow
a connection closure to be effective quickly if the last data were sent
down a socket and the client didn't close, something that can happen
when the peer's FIN is lost and retransmits are blocked by a firewall
for example. This made complete sense in 1.5 for TCP and HTTP in close
mode. But nowadays with muxes, it's not done at the right layer anymore
and even the description doesn't match what is being done, because what
happens is that the stream will abort the whole transfer after it's done
sending to the mux and this timeout expires.

We've seen in GH issue 2095 that this can happen with very short timeout
values, and while this didn't trigger often before, now that the muxes
(h2 & quic) properly report an end of stream before even the first
sc_conn_sync_recv(), it seems that it can happen more often, and have
two undesirable effects:
  - logging a timeout when that's not the case
  - aborting the request channel, hence the server-side conn, possibly
    before it had a chance to be put back to the idle list, causing
    this connection to be closed and not reusable.

Unfortunately for TCP (mux_pt) this remains necessary because the mux
doesn't have a timeout task. So here we're adding tests to only do
this through an HTX mux. But to be really clean we should in fact
completely drop all of this and implement these timeouts in the mux
itself.

This needs to be backported to 2.8 where the issue was discovered,
and maybe carefully to older versions, though that is not sure at
all. In any case, using a higher timeout or removing client-fin in
HTTP proxies is sufficient to make the issue disappear.
2023-06-02 16:33:40 +02:00
Willy Tarreau ae0f8be011 MINOR: stats: protect against future stats fields omissions
As seen in commits 33a4461fa ("BUG/MINOR: stats: Fix Lua's `get_stats`
function") and a46b142e8 ("BUG/MINOR: Missing stat_field_names (since
f21d17bb)") it seems frequent to omit to update stats_fields[] when
adding a new ST_F_xxx entry. This breaks Lua's get_stats() and shows
a "(null)" in the header of "show stat", but that one is not detectable
to the naked eye anymore.

Let's add a reminder above the enum declaration about this, and a small
reg tests checking for the absence of "(null)". It was verified to fail
before the last patch above.
2023-06-02 08:39:53 +02:00
Willy Tarreau cb6a35fdc1 [RELEASE] Released version 2.9-dev0
Released version 2.9-dev0 with the following main changes :
    - MINOR: version: mention that it's development again
2023-05-31 16:29:19 +02:00
Willy Tarreau 9dc8308a67 MINOR: version: mention that it's development again
This essentially reverts b9b6e94474.
2023-05-31 16:28:34 +02:00
Willy Tarreau b9b6e94474 MINOR: version: mention that it's LTS now.
The version will be maintained up to around Q2 2028. Let's
also update the INSTALL file to mention this.
2023-05-31 16:23:56 +02:00
Amaury Denoyelle d68f8b5a4a CLEANUP: mux-quic: rename internal functions
This patch is similar to the previous one but for QUIC mux functions
used inside the mux code itself or application layer. Replace all
occurences of qc_* prefix by qcc_* or qcs_*. This should help to better
differentiate code between quic_conn and MUX.

This should be backported up to 2.7.
2023-05-30 15:45:55 +02:00
Amaury Denoyelle 6d6ee0dc0b MINOR: quic: fix stats naming for flow control BLOCKED frames
There was a misnaming in stats counter for *_BLOCKED frames in regard to
QUIC rfc convention. This patch fixes it to prevent future ambiguity :

- STREAMS_BLOCKED -> STREAM_DATA_BLOCKED
- STREAMS_DATA_BLOCKED_BIDI -> STREAMS_BLOCKED_BIDI
- STREAMS_DATA_BLOCKED_UNI -> STREAMS_BLOCKED_UNI

This should be backported up to 2.7.
2023-05-26 17:17:00 +02:00
Amaury Denoyelle 087c5f041b MINOR: mux-quic: remove nb_streams from qcc
Remove nb_streams field from qcc. It was not used outside of a BUG_ON()
statement to ensure we never have a negative count of streams. However
this is already checked with other fields.

This should be backported up to 2.7.
2023-05-26 17:17:00 +02:00
Amaury Denoyelle 7b41dfd834 CLEANUP: mux-quic: remove unneeded fields in qcc
Remove fields from qcc structure which are unused.

This should be backported up to 2.7.
2023-05-26 17:17:00 +02:00
Patrick Hemmer 425d7ad89d MINOR: init: pre-allocate kernel data structures on init
The Linux kernel maintains data structures to track a processes' open file
descriptors, and it expands these structures as necessary when FD usage grows
(at every FD=2^X starting at 64). However when threading is in use, during
expansion the kernel will pause (observed up to 47ms) while it waits for thread
synchronization (see https://bugzilla.kernel.org/show_bug.cgi?id=217366).

This change addresses the issue and avoids the random pauses by opening the
maximum file descriptor during initialization, so that expansion will not occur
while processing traffic.
2023-05-26 09:28:18 +02:00
Willy Tarreau b298882acc BUILD: compiler: systematically set USE_OBSOLETE_LINKER with TCC
TCC silently ignores the weak and section attributes, which ruins the
initcalls. Technically we're exactly in the same situation as with an
obsolete linker. Let's just automatically set the flag if TCC is
detected, this avoids surprises where the program compiles but does
not start.

No backport is needed.
2023-05-24 21:37:06 +02:00
Willy Tarreau eced142aa8 BUILD: ist: use the literal declaration for ist_lc/ist_uc under TCC
TCC doesn't knoow about __attribute__((weak)), it silently ignores it.
We could add a "static" modifier there in this case but we already have
an alternate portable mode that is based on a slightly larger literal
for obsolete linkers (and non-ELF systems) which choke on weak. Let's
just add the test for tcc there and use it in this case.

No backport is needed.
2023-05-24 21:33:34 +02:00
Willy Tarreau 4e8720ab78 BUILD: ist: do not put a cast in an array declaration
TCC is upset by the declaration looking like:

  const unsigned char ist_lc[256] __attribute__((weak)) = ((const unsigned char[256]){ ... });

It was written like this because it's expanded from the _IST_LC macro
but it's never used as-is, it's only used from ist_lc, which should be
the one containing the cast so that the macro only contains the list of
bytes that can be used in both places. And this assigns more consistent
roles to the lower and upper case macro/variable now, one is typed and
the other one not. No backport is needed.
2023-05-24 21:27:39 +02:00
Frédéric Lécaille 12a815ad19 MINOR: quic: Add a counter for sent packets
Add ->sent_pkt counter to quic_conn struct to count the packet at QUIC connection
level. Then, when the connection is released, the ->sent_pkt counter value
is added to the one for the listener.

Must be backported to 2.7.
2023-05-24 16:30:11 +02:00
Frédéric Lécaille bdd64fd71d MINOR: quic: Add some counters at QUIC connection level
Add some statistical counters to quic_conn struct from quic_counters struct which
are used at listener level to handle them at QUIC connection level. This avoid
calling atomic functions. Furthermore this will be useful soon when a counter will
be added for the total number of packets which have been sent which will be very
often incremented.

Some counters were not added, espcially those which count the number of QUIC errors
by QUIC error types. Indeed such counters would be incremented most of the time
only one time at QUIC connection level.

Implement quic_conn_prx_cntrs_update() which accumulates the QUIC connection level
statistical counters to the listener level statistical counters.

Must be backported to 2.7.
2023-05-24 16:30:11 +02:00
Willy Tarreau 1e1c28873c BUILD: makefile: fix build issue on GNU make < 3.82
Thierry Fournier reported a build breakage with the ubiquitous make
3.81, LDFLAGS were ignored. This is caused by the declaration of the
collect_opt_flags macro that is defined with an "=" sign, something
that only appeared in 3.82 and that is not necessary. With it removed,
the build now works fine at least from 3.80 to 4.3.

No backport is needed since this makefile cleanup appeared in 2.8.
2023-05-24 15:51:03 +02:00
Ilya Shipitsin 97c344dae0 BUILD: quic: re-enable chacha20_poly1305 for libressl
this reverts d2be9d4c48

LibreSSL implements EVP_chacha20_poly1305() with EVP_CIPHER for every
released version starting with 3.6.0
2023-05-23 19:20:36 +02:00
Willy Tarreau b7209d42d9 MEDIUM: stconn: make the SE_FL_ERR_PENDING to ERROR transition systematic
During a code audit of the various situations that promote ERR_PENDING to
ERROR, it appeared that:
  - all muxes use se_fl_set_error() to set it, which chooses either based
    on EOI/EOS presence ;
  - EOI/EOS that arrive late after ERR_PENDING were not systematically
    upgraded to ERROR

This results in confusion about how such ERROR or ERR_PENDING ought to
be handled, which is not quite desirable.

This patch adds a test to se_fl_set() to detect if we're setting EOI or
EOS while ERR_PENDING is present, or the other way around so that any
sequence of EOI/EOS <-> ERR_PENDING results in ERROR being set. This
way there will no longer be possible situations where ERROR is missing
while the other ones are set.
2023-05-23 16:17:04 +02:00
Amaury Denoyelle 5eadc27623 MINOR: quic: remove return val of quic_aead_iv_build()
quic_aead_iv_build() should never fail unless we call it with buffers of
different size. This never happens in the code as every input buffers
are of size QUIC_TLS_IV_LEN.

Remove the return value and add a BUG_ON() to prevent future misusage.
This is especially useful to remove one error handling on the sending
patch via quic_packet_encrypt().

This should be backported up to 2.7.
2023-05-22 11:17:18 +02:00
Willy Tarreau 5345490b8e MINOR: clock: provide a function to automatically adjust now_offset
Right now there's no way to enforce a specific value of now_ms upon
startup in order to compensate for the time it takes to load a config,
specifically when dealing with the health check startup. For this we'd
need to force the now_offset value to compensate for the last known
value of the current date. This patch exposes a function to do exactly
this.
2023-05-17 09:33:54 +02:00
Willy Tarreau 5723b382ed MINOR: stats: report the boot time in "show info"
Just like we have the uptime in "show info", let's add the boot time.
It's trivial to collect as it's just the difference between the ready
date and the start date, and will allow users to monitor this element
in order to take action before it starts becoming problematic. Here
the boot time is reported in milliseconds, so this allows to even
observe sub-second anomalies in startup delays.
2023-05-17 09:33:54 +02:00
Willy Tarreau da4aa6905c MINOR: clock: measure the total boot time
Some huge configs take a significant amount of time to start and this
can cause some trouble (e.g. health checks getting delayed and grouped,
process not responding to the CLI etc). For example, some configs might
start fast in certain environments and slowly in other ones just due to
the use of a wrong DNS server that delays all libc's resolutions. Let's
first start by measuring it by keeping a copy of the most recently known
ready date, once before calling check_config_validity() and then refine
it when leaving this function. A last call is finally performed just
before deciding to split between master and worker processes, and it covers
the whole boot. It's trivial to collect and even allows to get rid of a
call to clock_update_date() in function check_config_validity() that was
used in hope to better schedule future events.
2023-05-17 09:33:54 +02:00
Amaury Denoyelle 1a2faef92f MINOR: mux-quic: uninline qc_attach_sc()
Uninline and move qc_attach_sc() function to implementation source file.
This will be useful for next commit to add traces in it.

This should be backported up to 2.7.
2023-05-16 17:53:45 +02:00
Amaury Denoyelle 3cb78140cf MINOR: mux-quic: properly report end-of-stream on recv
MUX is responsible to put EOS on stream when read channel is closed.
This happens if underlying connection is closed or a RESET_STREAM is
received. FIN STREAM is ignored in this case.

For connection closure, simply check for CO_FL_SOCK_RD_SH.

For RESET_STREAM reception, a new flag QC_CF_RECV_RESET has been
introduced. It is set when RESET_STREAM is received, unless we already
received all data. This is conform to QUIC RFC which allows to ignore a
RESET_STREAM in this case. During RESET_STREAM processing, input buffer
is emptied so EOS can be reported right away on recv_buf operation.

This should be backported up to 2.7.
2023-05-16 17:53:45 +02:00
William Lallemand d0c363486c BUILD: ssl: get0_verified chain is available on libreSSL
Define HAVE_SSL_get0_verified_chain when it's using libreSSL >= 3.3.6.
2023-05-15 15:16:15 +02:00
William Lallemand 6e0c39d7ac BUILD: ssl: ssl_c_r_dn fetches uses functiosn only available since 1.1.1
Fix the openssl build with older openssl version by disabling the new
ssl_c_r_dn fetch.

This also disable the ssl_client_samples.vtc file for OpenSSL version
older than 1.1.1
2023-05-15 12:07:52 +02:00
Abhijeet Rastogi df97f472fa MINOR: ssl: add new sample ssl_c_r_dn
This patch addresses #1514, adds the ability to fetch DN of the root
ca that was in the chain when client certificate was verified during SSL
handshake.
2023-05-15 10:48:05 +02:00
Amaury Denoyelle 6c501ed23b BUG/MINOR: mux-quic: differentiate failure on qc_stream_desc alloc
qc_stream_buf_alloc() can fail for two reasons :
* limit of Tx buffer per connection reached
* allocation failure

The first case is properly treated. A flag QC_CF_CONN_FULL is set on the
connection to interrupt emission. It is cleared when a buffer became
available after in order ACK reception and the MUX tasklet is woken up.

The allocation failure was handled with the same mechanism which in this
case is not appropriate and could lead to a connection transfer freeze.
Instead, prefer to close the connection with a QUIC internal error code.

To differentiate the two causes, qc_stream_buf_alloc() API was changed
to return the number of available buffers to the caller.

This must be backported up to 2.6.
2023-05-12 16:26:20 +02:00
Amaury Denoyelle 93dd23cab4 MINOR: mux-quic: remove dedicated function to handle standalone FIN
Remove QUIC MUX function qcs_http_handle_standalone_fin(). The purpose
of this function was only used when receiving an empty STREAM frame with
FIN bit. Besides, it was called by each application protocol which could
have different approach and render the function purpose unclear.

Invocation of qcs_http_handle_standalone_fin() have been replaced by
explicit code in both H3 and HTTP/0.9 module. In the process, use
htx_set_eom() to reliably put EOM on the HTX message.

This should be backported up to 2.7, along with the previous patch which
introduced htx_set_eom().
2023-05-12 15:50:30 +02:00
Amaury Denoyelle 25cf19d5c8 MINOR: htx: add function to set EOM reliably
Implement a new HTX utility function htx_set_eom(). If the HTX message
is empty, it will first add a dummy EOT block. This is a small trick
needed to ensure readers will detect the HTX buffer as not empty and
retrieve the EOM flag.

Replace the H2 code related by a htx_set_eom() invocation. QUIC also has
the same code which will be replaced in the next commit.

This should be backported up to 2.7 before the related QUIC patch.
2023-05-12 15:29:28 +02:00
Willy Tarreau ea07715ccf MINOR: master/cli: also implement the timed prompt on the master CLI
This provides more consistency between the master and the worker. When
"prompt timed" is passed on the master, the timed mode is toggled. When
enabled, for a master it will show the master process' uptime, and for
a worker it will show this worker's uptime. Example:

  master> prompt timed
  [0:00:00:50] master> show proc
  #<PID>          <type>          <reloads>       <uptime>        <version>
  11940           master          1 [failed: 0]   0d00h02m10s     2.8-dev11-474c14-21
  # workers
  11955           worker          0               0d00h00m59s     2.8-dev11-474c14-21
  # old workers
  11942           worker          1               0d00h02m10s     2.8-dev11-474c14-21
  # programs

  [0:00:00:58] master> @!11955
  [0:00:01:03] 11955> @!11942
  [0:00:02:17] 11942> @
  [0:00:01:10] master>
2023-05-11 16:38:52 +02:00
Willy Tarreau 225555711f MINOR: cli: add an option to display the uptime in the CLI's prompt
Entering "prompt timed" toggles reporting of the process' uptime in
the prompt, which will report days, hours, minutes and seconds since
it was started. As discussed with Tim in issue #2145, this can be
convenient to roughly estimate the time between two outputs, as well
as detecting that a process failed to be reloaded for example.
2023-05-11 16:38:52 +02:00
Aurelien DARRAGON 31b23aef38 CLEANUP: acl: discard prune_acl_cond() function
Thanks to previous commit, we have no more use for prune_acl_cond(),
let's remove it to prevent code duplication.
2023-05-11 15:37:04 +02:00
Aurelien DARRAGON 7abc9224a6 MINOR: proxy: add http_free_redirect_rule() function
Adding http_free_redirect_rule() function to free a single redirect rule
since it may be required to free rules outside of free_proxy() function.

This patch is required for an upcoming bugfix.

[for 2.2, free_proxy function did not exist (first seen in 2.4), thus
http_free_redirect_rule() needs to be deducted from haproxy.c deinit()
function if the patch is required]
2023-05-11 15:37:04 +02:00
Christopher Faulet 7542fb43d6 MINOR: stconn: Add a cross-reference between SE descriptor
A xref is added between the endpoint descriptors. It is created when the
server endpoint is attached to the SC and it is destroyed when an endpoint
is detached.

This xref is not used for now. But it will be useful to retrieve info about
an endpoint for the opposite side. It is also the warranty there is still a
endpoint attached on the other side.
2023-05-11 15:37:04 +02:00
Willy Tarreau 4cfb0019e6 MINOR: stats: report the listener's protocol along with the address in stats
When "optioon socket-stats" is used in a frontend, its listeners have
their own stats and will appear in the stats page. And when the stats
page has "stats show-legends", then a tooltip appears on each such
socket with ip:port and ID. The problem is that since QUIC arrived, it
was not possible to distinguish the TCP listeners from the QUIC ones
because no protocol indication was mentioned. Now we add a "proto"
legend there with the protocol name, so we can see "tcp4" or "quic6"
and figure how the socket is bound.
2023-05-11 14:52:56 +02:00
Amaury Denoyelle 5f67b17a59 MEDIUM: mux-quic: adjust transport layer error handling
Following previous patch, error notification from quic_conn has been
adjusted to rely on standard connection flags. Most notably, CO_FL_ERROR
on the connection instance when a fatal error is detected.

Check for CO_FL_ERROR is implemented by qc_send(). If set the new flag
QC_CF_ERR_CONN will be set for the MUX instance. This flag is similar to
the local error flag and will abort most of the futur processing. To
ensure stream upper layer is also notified, qc_wake_some_streams()
called by qc_process() will put the stream on error if this new flag is
set.

This should be backported up to 2.7.
2023-05-11 14:12:48 +02:00
Amaury Denoyelle b2e31d33f5 MEDIUM: quic: streamline error notification
When an error is detected at quic-conn layer, the upper MUX must be
notified. Previously, this was done relying on quic_conn flag
QUIC_FL_CONN_NOTIFY_CLOSE set and the MUX wake callback called on
connection closure.

Adjust this mechanism to use an approach more similar to other transport
layers in haproxy. On error, connection flags are updated with
CO_FL_ERROR, CO_FL_SOCK_RD_SH and CO_FL_SOCK_WR_SH. The MUX is then
notified when the error happened instead of just before the closing. To
reflect this change, qc_notify_close() has been renamed qc_notify_err().
This function must now be explicitely called every time a new error
condition arises on the quic_conn layer.

To ensure MUX send is disabled on error, qc_send_mux() now checks
CO_FL_SOCK_WR_SH. If set, the function returns an error. This should
prevent the MUX from sending data on closing or draining state.

To complete this patch, MUX layer must now check for CO_FL_ERROR
explicitely. This will be the subject of the following commit.

This should be backported up to 2.7.
2023-05-11 14:04:51 +02:00
Amaury Denoyelle 2d5c3f5cd1 MINOR: mux-quic: add traces for stream wake
Add traces for when an upper layer stream is woken up by the MUX. This
should help to diagnose frozen stream issues.

This should be backported up to 2.7.
2023-05-11 14:04:51 +02:00
Willy Tarreau 9615102b01 MINOR: stats: report the number of times the global maxconn was reached
As discussed a few times over the years, it's quite difficult to know
how often we stop accepting connections because the global maxconn was
reached. This is not easy to know because when we reach the limit we
stop accepting but we don't know if incoming connections are pending,
so it's not possible to know how many were delayed just because of this.
However, an interesting equivalent metric consist in counting the number
of times an accepted incoming connection resulted in the limit being
reached. I.e. "we've accepted the last one for now". That doesn't imply
any other one got delayed but it's a factual indicator that something
might have been delayed. And by counting the number of such events, it
becomes easier to know whether some limits need to be adjusted because
they're reached often, or if it's exceptionally rare.

The metric is reported as a counter in show info and on the stats page
in the info section right next to "maxconn".
2023-05-11 13:51:31 +02:00
Willy Tarreau 3c4a297d2b MINOR: stats: report the total number of warnings issued
Now in "show info" we have a TotalWarnings field that reports the total
number of warnings issued since the process started. It's also reported
in the the stats page next to the uptime.
2023-05-11 12:02:21 +02:00
Willy Tarreau 29dcc5e559 DEBUG: list: add DEBUG_LIST to purposely corrupt list heads after delete
LIST_DELETE doesn't affect the previous pointers of the stored element.
This can sometimes hide bugs when such a pointer is reused by accident
in a LIST_NEXT() or equivalent after having been detached for example, or
ia another LIST_DELETE is performed again, something that LIST_DEL_INIT()
is immune to. By compiling with -DDEBUG_LIST, we'll replace a freshly
detached list element with two invalid pointers that will cause a crash
in case of accidental misuse. It's not enabled by default.
2023-05-11 11:33:35 +02:00
Frédéric Lécaille b971696296 BUG/MINOR: quic: Possible crash when dumping version information
->others member of tp_version_information structure pointed to a buffer in the
TLS stack used to parse the transport parameters. There is no garantee that this
buffer is available until the connection is released.

Do not dump the available versions selected by the client anymore, but displayed the
chosen one (selected by the client for this connection) and the negotiated one.

Must be backported to 2.7 and 2.6.
2023-05-10 13:26:37 +02:00
Amaury Denoyelle 58721f2192 BUG/MINOR: mux-quic: fix transport VS app CONNECTION_CLOSE
A recent series of patch were introduced to streamline error generation
by QUIC MUX. However, a regression was introduced : every error
generated by the MUX was built as CONNECTION_CLOSE_APP frame, whereas it
should be only for H3/QPACK errors.

Fix this by adding an argument <app> in qcc_set_error. When false, a
standard CONNECTION_CLOSE is used as error.

This bug was detected by QUIC tracker with the following tests
"stop_sending" and "server_flow_control" which requires a
CONNECTION_CLOSE frame.

This must be backported up to 2.7.
2023-05-09 18:42:34 +02:00
Christopher Faulet 557146ccc8 DOC: stconn: Update comments about ABRT/SHUT for stconn structure
The comment for the stconn structure was still referencing the SHUTR/SHUTW
flags. These flags were replaced and we now use ABRT/SHUT flags in
comments. The comment itself was slightly updated to be accurate.
2023-05-09 16:36:45 +02:00
Christopher Faulet e59f7583ee MEDIUM: stconn: Be sure to always be able to unblock a SC that needs room
When sc_need_room() is called, the caller cannot request more free space
than a minimum value to be sure it is always possible to unblock it. it is a
safety guard to not freeze any SC on NEED_ROOM condition. At worse it will
lead to some wakeups un excess at the edge.

To keep things simple, the following minimum is used:

  (global.tune.bufsize - global.tune.maxrewrite - sizeof(struct htx))
2023-05-09 11:53:28 +02:00
Frédéric Lécaille 1bc6e318f0 CLEANUP: quic: Rename several <buf> variables in quic_frame.(c|h)
Most of the function in quic_frame.c and quic_frame.h manipulate <buf> buffer
position variables which have nothing to see with struct buffer variables.
Rename them to <pos>

Should be backported to 2.7.
2023-05-09 10:48:40 +02:00
Frédéric Lécaille d19a02a40e CLEANUP: quic: No more used q_buf structure
This definition is no more used.

Should be backported to 2.7.
2023-05-09 10:48:40 +02:00
Willy Tarreau 652d1712dd BUILD: quic: fix build warning when threads are disabled
Commit e83f937cc ("MEDIUM: quic: use a global CID trees list") uses a
local variable "tree" used only for locks, but when threads are disabled
it spews a warning about this unused variable.
2023-05-07 15:06:22 +02:00
Willy Tarreau dd9f921b3a CLEANUP: fix a few reported typos in code comments
These are only the few relevant changes among those reported here:

  https://github.com/haproxy/haproxy/actions/runs/4856148287/jobs/8655397661
2023-05-07 07:07:44 +02:00