Commit Graph

17486 Commits

Author SHA1 Message Date
Frédéric Lécaille
1231d3c179 MINOR: quic: Drop 0-RTT packets without secrets
If we received 0-RTT packets and no secrets were provided by the TLS stack
we must drop them.
2022-04-29 16:46:56 +02:00
Amaury Denoyelle
74cf237ecd MEDIUM: quic: do not ack packet with invalid STREAM
If the MUX cannot handle immediately nor buffer a STREAM frame, the
packet containing it must not be acknowledge. This is in conformance
with the RFC9000.

qcc_recv() return codes have been adjusted to differentiate an invalid
frame with an already fully received offset which must be acknowledged.
2022-04-29 16:16:19 +02:00
Amaury Denoyelle
d46e335683 MEDIUM: quic: do not ACK packet with STREAM if MUX not present
If a packet contains a STREAM frame but the MUX is not allocated, the
frame cannot be enqueued. According to the RFC9000, we must not
acknowledge the packet under this condition.

This may prevents a bug with firefox which keeps trying on refreshing
the web page. This issue has already been detected before closing state
implementation : haproxy wasn't emitted CONNECTION_CLOSE and keeps
acknowledge STREAM frames despite not handle them.

In the future, it might be necessary to respond with a CONNECTION_CLOSE
if the MUX has already been freed.
2022-04-29 16:15:47 +02:00
Willy Tarreau
4173f4ea29 BUG/MINOR: conn_stream: do not confirm a connection from the frontend path
In issue #1468 it was reported that sometimes server-side connection
attempts were only validated after the "timeout connect" value, and
that would only happen with an H2 client. A long code analysis with the
output dumps showed only one possible call path: an I/O event on the
frontend while reading had just been disabled calls h2_wake() which in
turns wakes cs_conn_io_cb(), which tries cs_conn_process() and cs_notify(),
which sees that the other side is not blocked (already in CS_ST_CON)
and tries cs_chk_snd() on it. But on that side the connection had just
finished to be set up and not yet woken the stream up, cs_notify()
would then call cs_conn_send() which succeeds and passes the connection
to CS_ST_RDY. The problem is that nothing new happened on the frontend
side so there's no reason to wake the stream up and the backend-side
conn_stream remains in CS_ST_RDY state with the stream never being
woken up.

Once the "timeout connect" strikes, process_stream() is woken up and
finds the connection finally setup, so it ignores the timeout and goes
on.

The number of conditions to meet to reproduce this is huge, which also
explains why the reporter says it's "occasional" and we were never able
to reproduce it in the lab. It needs at least reads to be disabled and
immediately re-enabled on the frontend side (e.g. buffer full) with an
I/O even reported before the poller had an opportunity to be disabled
but with no subscribe being reinstalled, so that sock_conn_iocb() has
no other choice but calling h2_wake(), and exactly at the same time
the backend connection must finish to set up so that it was not yet
reported by the poller, the data were sent and the polling for writes
disabled.

Several factors are to be considered here:
  - h2_wake() should probably not call h2_wake_some_streams() for
    ret >= 0 (common case), but only if some special event is reported
    for at least one stream; that part is sensitive though as in the
    past we managed to lose some rare cases (e.g. restart processing
    after a pause), and such wakeups are extremely rare so we'd better
    make that effort once in a while.

  - letting a lazy forward attempt on the frontend confirm a backend
    connection establishment is too smart to be reliable. That wasn't
    in fact the intent and it's inherited from the very old code where
    muxes didn't exist and where it was guaranteed that an even at this
    layer would wake everyone up.

Here the best thing to do is to refrain from attempting to forward data
until the connection is confirmed. This will let the poller report the
connect() event to the backend side which will process it as it should
and does in all other cases.

Thanks to Jimmy Crutchfield for having reported useful traces and
tested patches.

This will have to be backported to all stable branches after some
observation. Before 2.6 the function is stream_int_chk_snd_conn(),
and the flag to remove is SI_SB_CON.
2022-04-29 15:32:14 +02:00
Christopher Faulet
0055d5693e MINOR: httpclient: Don't use co_set_data() to decrement output
The use of co_set_data() should be strictly limited to setting the amount of
existing data to be transmitted. It ought not be used to decrement the
output after the data have left the buffer, because doing so involves
performing incorrect calculations using co_data() that still comprises data
that are not in the buffer anymore. Let's use c_rew() for this, which is
made exactly for this purpose, i.e. decrement c->output by as much as
requested.
2022-04-29 14:12:42 +02:00
Christopher Faulet
6b4f1f64a8 BUG/MINOR: httpclient: Count metadata in size to transfer via htx_xfer_blks()
When HTX blocks are transfer from the HTTP client context to the request
channel, via htx_xfer_blks() function, the metadata must also be counted, in
addition to the data size. Otherwise, expected payload size will not be
copied because the metadata of an HTX block (8 bytes) will be reserved. And
if the payload size is lower than 8 bytes, nothing will be copied. Thus only
a zero-copy will be able to copy the payload.

This issue is 2.6-specific, no backport is needed.
2022-04-29 14:12:42 +02:00
Christopher Faulet
534645d6c0 BUG/MEDIUM: httpclient: Fix loop consuming HTX blocks from the response channel
When the HTTP client consumes the response, it loops on the HTX message to
copy blocks content and it removes blocks by calling htx_remove_blk(). But
this function removes a block and returns the next one in the HTX
message. The result must be used instead of using htx_get_next(). It is
especially important because the block used in htx_get_next() loop was
removed. It only works because the message is not defragmented during the
loop.

In addition, the loop on the response was simplified to iter on blocks
instead of positions.

This patch must be backported to 2.5.
2022-04-29 14:12:42 +02:00
Christopher Faulet
a6c4a48341 BUG/MEDIUM: conn-stream: Don't erase endpoint flags on reset
Only CS_EP_ERROR flag is now removed from the endpoint when a reset is
performed. When a new the endpoint is allocated, flags are preserved. It is
the caller responsibility to remove other flags, depending on its need.

Concretly, during a connection retry or a L7 retry, we must preserve
flags. In tcpcheck and the CLI, we reset flags.

This patch is 2.6-specific. No backport needed.
2022-04-29 14:12:42 +02:00
William Lallemand
04994de642 BUG/MINOR: httpclient/ssl: use the correct verify constant
The SSL_SERVER_VERIFY_* constants were incorrectly set on the httpclient
server verify. The right constants are SSL_SOCK_VERIFY_* .

This could cause issues when using "httpclient-ssl-verify" or when the
SSL certificates can't be loaded.

No backport needed
2022-04-28 19:35:21 +02:00
Frédéric Lécaille
3e26698f89 MINOR: quic: Drop 0-RTT packets if not allowed
Drop the 0-RTT packets for a listener without early data configuration enabled.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
4646cf3b70 CLEANUP: quic: Rely on the packet length set by qc_lstnr_pkt_rcv()
This function is used to parse the QUIC packets carried by a UDP datagram.
When a correct packet could be found, the ->len RX packet structure value
is set to the packet length value. On the contrary, it is set to the remaining
number of bytes in the UDP datagram if no correct QUIC packet could be found.
So, there is no need to make this function return a status value. It allows
the caller to parse any QUIC packet carried by a UDP datagram without this.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
87373e7269 BUG/MINOR: quic: Missing Initial packet length check
Any client Initial packet carried in a datagram smaller than QUIC_INITIAL_PACKET_MINLEN(200)
bytes must be discarded. This does not mean we must discard the entire datagram.
So we must at least try to parse the packet length before dropping the packet
and return its length from qc_lstnr_pkt_rcv().
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
77cb38d22d BUG/MEDIUM: quic: Possible crash on STREAM frame loss
A crash is possible under such circumtances:
    - The congestion window is drastically reduced to its miniaml value
    when a quic listener is experiencing extreme packet loss ;
    - we enqueue several STREAM frames to be resent and some of them could not be
    transmitted ;
    - some of the still in flight are acknowledged and trigger the
    stream memory releasing ;
    - when we come back to send the remaing STREAM frames, haproxy
    crashes when it tries to build them.

To fix this issue, we mark the STREAM frame as lost when detected as lost.
Then a lookup if performed for the stream the STREAM frames are attached
to before building them. They are released if the stream is no more available
or the data range of the frame is consumed.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
dafbde6c8c MINOR: quic: Wake up the mux to probe with new data
When we have to probe the peer, we must first try to send new data. This is done
here waking up the mux after having set the number of maximum number of datagrams
to send to QUIC_MAX_NB_PTO_DGRAMS (2). Of course, this is only the case if the
mux was subscribed to SEND events.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
d8b798d7ef BUG/MINOR: quic: Traces fix about remaining frames upon packet build failure 2022-04-28 16:22:40 +02:00
Frédéric Lécaille
834399c24a BUG/MINOR: quic: Avoid sending useless PADDING frame
This may happen in rare cases with extreme packet loss (30% for both TX and RX)
which leads the congestion window to decrease down to its minimal value (two
datagrams). Under such circumtances, no ack-eliciting frame can be added to
a packet by qc_build_frms(). In this case we must cancel the packet building
process if there is no ACK or probe (PING frame) to send.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
573b56b774 BUG/MINOR: quic: Wrong returned status by qc_build_frms()
This function must return a successful status as soon as it could be build
a frame to be embedded by a packet. This behavior was broken by the last
modifications. This was due to a dangerous "ret = 1" statement inside
a loop. This statement must be reach only if we go out of a switch/case
after a "break" statement.
Add comments to mention this information.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
337108ecda MINOR: quic: Do not send ACK frames when probing
When we are probing, we do not receive packets, furthermore all ACK frames have
already been sent. This is useless to send ACK when probing the peer. This
modification does not reset the flag which marks the connection as requiring an
ACK frame to be sent. If this is the case, this will be taken into an account
by after the probing process.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
7aef5f4c3f MEDIUM: quic: Enable the new datagram probing process
Make the two I/O handlers quic_conn_io_cb() and quic_conn_app_io_cb() call
qc_dgrams_retransmit() after probing retransmissions need was detected by
the timer task (qc_process_timer()).
We must modify qc_prep_pkts() to support QUIC_TLS_ENC_LEVEL_NONE as <next_tel>
parameter when called from qc_dgrams_retransmit().
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
da342556c3 MEDIUM: quic: Mark copies of acknowledged frames as acknowledged
We call qc_release_frm() to do so from this function everywhere a frame
is released.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
1809c33d6e MINOR: quic: Mark packets as probing with old data
When probing retranmissions with old data are needed for the connection we
mark the packets as probing with old data to track them when acknowledged:
we do not resend frames with old data when lost.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
3e3a621447 MINOR: quic: old data distinction for qc_send_app_pkt()
Modify qc_send_app_pkt() to distinguish the case where it sends new data
against the case where it sends old data during probing retransmissions.
We add <old_data> boolean parameter to this function to do so. The mux
never directly send old data when probing retransmissions are needed by
the connection.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
96367158ab MEDIUM: quic: qc_requeue_nacked_pkt_tx_frms() rework
This function is used to requeue the TX frames from TX packets which have
been detected as lost. The modifications consist in avoiding resending frames from
duplicated frames used to probe the peer. This is useless. Only the original
frames loss must be taken into an account because detected as lost before
the retransmitted frames. If these latter are also detected as lost, other
duplicated frames would have been retransmitted before their loss detection.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
e248e378c8 MEDIUM: quic: Retransmission functions rework
qc_prep_fast_retrans() and qc_prep_hdshk_fast_retrans() are modified to
take two list of frames as parameters. Two lists are needed for
qc_prep_hdshk_fast_retrans() to build datagrams with two packets during
handshake. qc_prep_fast_retrans() needs two lists of frames to be used
to send two datagrams with one list by datagram.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
a9568411e4 MEDIUM: quic: New functions for probing rework
We want to be able to resend frames from list of frames during handshakes to
resend datagrams with the same frames as during the first transmissions.
This leads to decrease drasctically the chances of frame fragmentation due to
variable lengths of the packet fields. Furthermore the frames were not duplicated
when retransmitted from a packet to another one. This must be the case only during
packet loss dectection.

qc_dup_pkt_frms() is there to duplicate the frames from an input list to an output
list. A distinction is made between STREAM frames and the other ones because we
can rely on the "acknowledged offset" the aim of which is to track the number
of bytes which were acknowledged from TX STREAM frames.

qc_release_frm() in addition to release the frame passed as parameter, also mark
the duplicate STREAM frames as acknowledeged.

qc_send_hdshk_pkts() is the qc_send_app_pkts() counterpart to send datagrams from
at most two list of frames to be able to coalesced packets from two different
packet number spaces

qc_dgrams_retransmit() is there to probe the peer with datagrams depending on the
need of the packet number spaces which must be flag with QUIC_FL_PKTNS_PROBE_NEEDED
by the PTO timer task (qc_process_timer()).
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
3ef729a643 MINOR: quic: process_timer() rework
Add QUIC_FL_CONN_RETRANS_NEEDED connection flag definition to mark a quic_conn
struct as needing a retranmission.
Add QUIC_FL_PKTNS_PROBE_NEEDED to mark a packet number space as needing a
datagram probing.
Set these flags from process_timer() to trigger datagram probings.
Do not initiate anymore datagrams probing from any quic encryption level.
This will be done from the I/O handlers (quic_conn_io_cb() during handshakes and
quic_conn_app_io_cb() after handshakes).
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
e87b3ee9f5 MINOR: quic: Add traces about TX frame memory releasing
Add such traces in qc_treat_acked_tx_frm(). This should be helpful to track memory
leak issues for TX frames.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
b44cbc68a6 MINOR: quic: Do not retransmit frames from coalesced packets
Add QUIC_FL_TX_PACKET_COALESCED flag to mark a TX packet as coalesced with others
to build a datagram.
Ensure we do not directly retransmit frames from such coalesced packets. They must
be retransmitted from their packet number spaces to avoid duplications.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
b917191817 MINOR: quic: Prepare quic_frame struct duplication
We want to track the frames which have been duplicated during retransmissions so
that to avoid uselessly retransmitting frames which would already have been
acknowledged. ->origin new member is there to store the frame from which a copy
was done, ->reflist is a list to store the frames which are copies.
Also ensure all the frames are zeroed and that their ->reflist list member is
initialized.
Add QUIC_FL_TX_FRAME_ACKED flag definition to mark a TX frame as acknowledged.
2022-04-28 16:22:40 +02:00
Frédéric Lécaille
fc88844d2c MINOR: quic: Improve qc_prep_pkts() flexibility
We want to be able to chosse the list of frames we want to prepare in packets
to be send. This is to modify the retransmission process (to come).
2022-04-28 16:22:40 +02:00
Amaury Denoyelle
03cc62c840 MINOR: quic: decode as much STREAM as possible
Add a loop in the bidi STREAM function. This will call repeatdly
qcc_decode_qcs() and dequeue buffered frames.

This is useful when reception of more data is interrupted because the
MUX buffer was full. qcc_decode_qcs() has probably free some space so it
is useful to immediatly retry reception of buffered frames of the qcs
tree.

This may fix occurences of stalled Rx transfers with large payload.
Note however that there is still room for improvment. The conn-stream
layer is not able at this moment to retrigger demuxing. This is because
the mux io-handler does not treat Rx : this may continue to cause
stalled tranfers.
2022-04-28 16:10:10 +02:00
Amaury Denoyelle
48f01bda86 MINOR: h3: support DATA demux if buffer full
Previously, h3 layer was not able to demux a DATA frame if not fully
received in the Rx buffer. This causes evident limitation and prevents
to be able to demux a frame bigger than the buffer.

Improve h3_data_to_htx() to support partial frame demuxing. The demux
state is preserved in the h3s new fields : this is useful to keep the
current type and length of the demuxed frame.
2022-04-28 15:44:19 +02:00
Amaury Denoyelle
67e92d3652 MINOR: h3: implement h3 stream context
Define a new structure h3s used to provide context for a H3 stream. This
structure is allocated and stored in the qcs thanks to previous commit
which provides app-layer context storage.

For now, h3s is empty. It will soon be completed to be able to support
stateful demux : this is required to be able to demux an incomplete
frame if the rx buffer is full.
2022-04-28 15:44:19 +02:00
Amaury Denoyelle
47447af1ef MINOR: mux-quic: add a app-layer context in qcs
Define 2 new callback for qcc_app_ops : attach and detach. They are
called when a qcs instance is respectively allocated and freed. If
implemented, they can allocate a custom context stored in the new
abstract field ctx of qcs.

For now, h3 and hq-interop does not use these new callbacks. They will
be soon implemented by the h3 layer to allocate a context used for
stateful demuxing.

This change is required to support the demuxing of H3 frames bigger than
a buffer.
2022-04-28 15:44:19 +02:00
Amaury Denoyelle
314578a54f MINOR: h3: change frame demuxing API
Edit the functions used for HEADERS and DATA parsing. They now return
the number of bytes handled.

This change will help to demux H3 frames bigger than the buffer.
2022-04-28 15:44:19 +02:00
Amaury Denoyelle
3df8ca0a4d MINOR: mux-quic: partially copy Rx frame if almost full buf
Improve the reception for STREAM frames. In qcc_recv(), if the frame is
bigger than the remaining space in rx buffer, do not reject it wholly.
Instead, copy as much data as possible. The rest of the data is
buffered.

This is necessary to handle H3 frames bigger than a buffer. The H3 code
does not demux until the frame is complete or the buffer is full.
Without this, the transfer on payload larger than the Rx buffer can
rapidly freeze.
2022-04-28 15:42:21 +02:00
Amaury Denoyelle
30f23f53d2 BUG/MEDIUM: h3: fix use-after-free on mux Rx buffer wrapping
Handle wrapping buffer in h3_data_to_htx(). If data is wrapping, first
copy the contiguous data, then copy the data in front of the buffer.

Note that h3_headers_to_htx() is not able to handle wrapping data. For
the moment, a BUG_ON was added as a reminder. This cas never happened,
most probably because HEADERS is the first frame of the stream.
2022-04-28 15:18:20 +02:00
Amaury Denoyelle
0fa14a69e8 BUG/MINOR: h3: fix incomplete POST requests
Always set HTX flag HTX_SL_F_XFER_LEN for http/3. This is correct
becuase the size of H3 requests is always known thanks to the protocol
framing.

This may fix occurences of incomplete POST requests when the client side
of the connection has been closed before.
2022-04-28 15:14:25 +02:00
Amaury Denoyelle
44d0912f7b MINOR: mux-quic: count local flow-control stream limit on reception
Add new qcs fields to count the sum of bytes received for each stream.
This is necessary to enforce flow-control for reception on the peer.

For the moment, the implementation is partial. No MAX_STREAM_DATA or
FLOW_CONTROL_ERROR are emitted. BUG_ON statements are here as a
remainder.

This means that for the moment we do not support POST payloads greater
that the initial max-stream-data announced (256k currently).

At least, we now ensure that we never buffer a frame which overflows the
flow-control limit : this ensures that the memory consumption per stream
should stay under control.
2022-04-28 14:56:14 +02:00
Amaury Denoyelle
17014a64bf BUG/MINOR: mux-quic: fix leak if cs alloc failure
qcs and its field are not properly freed if the conn-stream allocation
fail in qcs_new(). Fix this by having a proper deinit code with a
dedicated label.
2022-04-28 14:47:53 +02:00
Amaury Denoyelle
408d226aa1 MINOR: mux-quic: remove unused bogus qcc_get_stream()
qcc_get_stream() was used when qcs and qc_stream_desc shared the same
node-tree. This is not the case anymore since
  e4301da5ed
  MINOR: quic-stream: use distinct tree nodes for quic stream and qcs

Now this function is broken as the qcc tree only contains qcs.
Thankfully it is unused so it can be removed without impact.
2022-04-28 14:47:53 +02:00
Amaury Denoyelle
fe8f555f5c MINOR: mux-quic: adjust comment on emission function
Comments were not properly edited since the splitting of functions for
stream emission. Also "payload" argument has been renamed to "in" as it
better reflects the function purpose.
2022-04-28 14:47:21 +02:00
Amaury Denoyelle
b50f311c50 BUG/MINOR: mux-quic: fix build in release mode
Fix build when not using DEBUG_STRICT. 'ret' is reported as unused as it
is only tested in a BUG_ON statement.
2022-04-28 14:42:39 +02:00
Willy Tarreau
faafe4bf16 CLEANUP: connections/deinit: destroy the idle_conns tasks
This adds a deinit_idle_conns() function that's called on deinit to
release the per-thread idle connection management tasks. The global
task was already taken care of.
2022-04-27 18:54:08 +02:00
Willy Tarreau
e01b08db6a CLEANUP: listeners/deinit: release accept queue tasklets on deinit
There was no function to release these ones, they were only created
so the patch adds an accept_queue_deinit() call.
2022-04-27 18:42:47 +02:00
Willy Tarreau
226866e1bb CLEANUP: deinit: release the config postparsers
These ones were not released either, it just requires to export the list
("postparsers") and it makes valgrind happy.
2022-04-27 18:07:24 +02:00
Willy Tarreau
65009ebde1 CLEANUP: deinit: release the pre-check callbacks
The freeing of pre-check callbacks was missing when this feature was
recently added with commit b53eb8790 ("MINOR: init: add the pre-check
callback"), let's do it to make valgrind happy.
2022-04-27 18:02:54 +02:00
Willy Tarreau
d941146583 CLEANUP: chunks: release trash also in deinit
Tim reported in issue #1676 that just like startup_logs, trash buffers
are not released on deinit since they're thread-local, but valgrind
notices it when quitting before creating threads like "-c -f ...". Let's
just subscribe the function to deinit in addition to threads' end.

The two "free(x);x=NULL;" in free_trash_buffers_per_thread() were
also simplified using ha_free().
2022-04-27 17:55:41 +02:00
Willy Tarreau
032e700e8b CLEANUP: errors: also call deinit_errors_buffers() on deinit()
Tim reported in issue #1676 that we don't release startup logs if we
warn during startup and quit before creating threads (e.g. -c -f ...).
Let's subscribe deinit_errors_buffers() to both thread's end and
deinit. That's OK since it uses both per-thread and global variables,
and is idempotent.
2022-04-27 17:50:53 +02:00
Thomas Prckl
10243938db MINOR: ssl: add a new global option "tune.ssl.hard-maxrecord"
Low footprint client machines may not have enough memory to download a
complete 16KB TLS record at once. With the new option the maximum
record size can be defined on the server side.

Note: Before limiting the the record size on the server side, a client should
consider using the TLS Maximum Fragment Length Negotiation Extension defined
in RFC6066.

This patch fixes GitHub issue #1679.
2022-04-27 16:53:43 +02:00