The httpclient.resolvers.prefer global keyword allows one to configure an
ipv4 or ipv6 preference when resolving. This can be useful in
environments where ipv6 is not supported.
By default the httpclient uses the resolvers section whose ID is
"default"; the httpclient.resolvers.id global option allows one to
configure another section to use.
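For example, both keywords could be combined in the global section like this
(illustrative snippet; the resolvers section name and addresses are arbitrary):

    global
        httpclient.resolvers.id     hc_dns
        httpclient.resolvers.prefer ipv6

    resolvers hc_dns
        nameserver ns1 192.0.2.53:53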
The hard_error_ssl flag is set when SSL is explicitly configured for the
httpclient.
If no configuration was made, the features are simply disabled and no
alert is emitted.
Clean up the error handling in the initialization so that we rely on the
err_code value and use memprintf() to set the errmsg before printing it at
the end of the functions.
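The intended pattern looks roughly like this (simplified sketch, not the
literal httpclient code; the function name is made up):

    static int httpclient_precheck_sketch(void)
    {
        int err_code = ERR_NONE;
        char *errmsg = NULL;
        struct proxy *px;

        px = alloc_new_proxy("<HTTPCLIENT>", PR_CAP_LISTEN, &errmsg);
        if (!px) {
            err_code |= ERR_ALERT | ERR_FATAL;
            goto err;
        }

        if (!find_resolvers_by_id("default")) {
            memprintf(&errmsg, "no \"default\" resolvers section found");
            err_code |= ERR_ALERT | ERR_FATAL;
            goto err;
        }
        return err_code;

     err:
        /* the error message is only printed once, at the end */
        if (errmsg) {
            ha_alert("httpclient: %s.\n", errmsg);
            ha_free(&errmsg);
        }
        return err_code;
    }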
httpclient_set_dst() allows one to set the destination address instead
of using the one in the URL or resolving one from the host.
This function also supports other socket types such as sockpair@, unix@,
or anything that could be used on a server line.
In order to still support this behavior, the address must be set on the
backend in this particular case because the frontend connection does not
support anything other than ipv4 or ipv6.
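For example (hedged sketch: the exact signatures should be checked against the
tree, and the caller pointer and addresses are placeholders):

    struct httpclient *hc;

    hc = httpclient_new(caller, HTTP_METH_GET, ist("https://www.example.com/"));
    if (hc) {
        /* bypass the URL/Host resolution and talk to a local unix
         * socket, or to a fixed IP:port, exactly as on a server line */
        httpclient_set_dst(hc, "unix@/run/backend.sock");
    }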
To allow the http-request set-dst action to work for the httpclient DNS
resolving, some changes needed to be made:
- The destination address needs to be set in the frontend (s->csf->dst)
  instead of the backend (s->csb->dst) to be able to use
  tcp_action_req_set_dst()
- SRV_F_MAPPORTS needs to be set on the proxy in order to allow the port
  change in alloc_dst_address()
The httpclient_resolve_init() function adds http-request actions which do
the resolving using the Host header of the HTTP request.
The parse_http_req_cond() function is used directly over an array of http
rules.
The do-resolve rule uses the "default" resolvers section. If this
section does not exist in the configuration, resolving is disabled.
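The generated rules are roughly equivalent to what one could write by hand in
a frontend, for example (illustrative sketch; the variable name is arbitrary):

    http-request do-resolve(txn.hc_ip,default,ipv4) hdr(Host),host_only
    http-request set-dst var(txn.hc_ip)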
The httpclient DNS resolver will need a more efficient URL parser which
splits the URL into parts and does not try to resolve anything.
httpclient_spliturl() uses the http_uri_parser in order to split the URL
into several ists.
The resulting host part is then passed to str2ip2(), which fails if it is
not an IP address, allowing us to resolve the domain later.
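A simplified sketch of that logic (the helper names exist in the tree, but
signatures and error handling are approximations):

    struct http_uri_parser parser = http_uri_parser_init(url);
    struct ist scheme = http_parse_scheme(&parser);  /* must be parsed first */
    struct ist authority = http_parse_authority(&parser, 1);
    struct sockaddr_storage ss;
    char host[256];

    /* only accept a raw IPv4/IPv6 address here: a plain hostname is kept
     * as-is so that it can be resolved later by the do-resolve rule */
    if (authority.len && authority.len < sizeof(host)) {
        memcpy(host, authority.ptr, authority.len);
        host[authority.len] = '\0';
        if (!str2ip2(host, &ss, 0)) {
            /* not an IP address: the domain will be resolved later */
        }
    }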
We rely on the largest ID which was used to open streams to know if the
stream we received STREAM frames for is closed or not. If closed, we return the
same status as the one for a STREAM frame which was already received on
an open stream.
It is possible that we continue to receive retransmitted STREAM frames after
the mux has been released. We rely on the ->rx.streams[].nb_streams counter
to check that the stream was closed. If not, at this time we drop the packet.
This is required to handle the retransmitted frame types when the mux is
released.
We add a counter for the number of streams which were opened or closed by the mux.
After the mux has been released, we can rely on this counter to know if the STREAM
frames are retransmitted ones or not.
That's similar to what was done for conn_streams and connections. The
flags were only set exactly when the relevant pointers were allocated,
so better test the pointer than the flag and stop setting the flag.
Just like for the conn_stream, now that these addresses are dynamically
allocated, there is no single case where the pointer is set without the
corresponding flag, and the flag is used as a permission to dereference
the pointer. Let's just replace the test of the flag with a test of the
pointer and remove all flag assignments. This makes the code clearer
(especially in "if" conditions) and saves the need for future code to
think about properly setting the flag after setting the pointer.
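Schematically, the change described above is of this form (sketch; the flag
name and the callee are only for illustration):

    /* before: a flag had to be maintained along with the pointer */
    if (cs->flags & CS_FL_ADDR_TO_SET)
        use_address(cs->dst);

    /* after: the pointer itself tells whether the address exists */
    if (cs->dst)
        use_address(cs->dst);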
Some of the protocol-level ->connect() functions currently dereference
the connection's destination address while others test it and return an
error. There's normally no more non-bogus code path that calls such
functions without a valid destination address on the connection, so
let's unify these functions and just place a BUG_ON() there, and drop
the useless test that's supposed to return an internal error.
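Schematically, in each protocol-level ->connect() function (sketch):

    /* before: a defensive check that no valid code path can trigger anymore */
    if (!conn->dst) {
        conn->flags |= CO_FL_ERROR;
        return SF_ERR_INTERNAL;
    }

    /* after: a missing destination address is a bug, not a runtime error */
    BUG_ON(!conn->dst);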
These flags indicate that the ->src or ->dst field in the conn_stream
is not null, which is something the caller already sees (and even tests
from the two sets of functions that set them). They impose some burden
because an agent trying to set a source or destination has to manually
set the flags in addition to setting the pointer, so they provide no
value anymore; let's drop them.
This flag is no longer needed now that it must always match the presence
of a destination address on the backend conn_stream. Worse, before the
previous patch, if it were to be accidentally removed while the address is
present, it could result in a leak of that address since alloc_dst_address()
would first be called to flush it.
Its usage has a long history where addresses were stored in an area shared
with the connection, but as this is no longer the case, there's no reason
for putting this burden onto application-level code that should not focus
on setting obscure flags.
The only place where that made a small difference is in the dequeuing code
in case of queue redistribution, because previously the code would first
clear the flag, and only later when trying to deal with the queue, would
release the address. It's not even certain whether there would exist a
code path going to connect_server() without calling pendconn_dequeue()
first (e.g. retries on queue timeout maybe?).
Now the pendconn_dequeue() code will rely on SF_ASSIGNED to decide to
clear and release the address, since that flag is always set while in
a server's queue, and its clearance implies that we don't want to keep
the address. At least it remains consistent and there's no more risk of
leaking it.
These functions dynamically allocate a source or destination address but
start by clearing the previous one. There's a non-zero risk of leaking
addresses there in case of misuse. Better have them do nothing if the
address was already allocated.
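The resulting pattern is simply (sketch; the function name is made up and
stands for the source/destination allocation helpers):

    static int alloc_dst_address_sketch(struct conn_stream *cs)
    {
        /* do nothing if an address is already there: it may have been
         * set on purpose by the caller and must not be leaked */
        if (cs->dst)
            return 1;

        return sockaddr_alloc(&cs->dst, NULL, 0) != NULL;
    }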
The HTTP/3 implementation must ignore unknown frame types to support
protocol evolution. Clients can deliberately use unknown types to test
that the server is conformant: this principle is called greasing.
The quiche client uses greasing on the H3 frame type with a zero-length
frame. This revealed a bug in the H3 parsing code which caused the transfer
to be interrupted. Fix this by removing the break statement on the ret
variable. Now the parsing loop is only interrupted if the input buffer is
empty or the demux is blocked.
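The fixed demux loop thus looks roughly like this (simplified sketch, not the
literal h3.c code; the frame header decoding, the per-type handler and the
"blocked" condition are assumed):

    while (b_data(b) && !demux_blocked) {
        /* <ftype> and <flen> are decoded from the frame header here */
        switch (ftype) {
        case H3_FT_DATA:
            ret = h3_data_to_htx(qcs, b, flen, fin);
            break;
        default:
            /* unknown frame type (possibly GREASE, possibly zero-length):
             * simply skip its payload */
            ret = flen;
            break;
        }
        b_del(b, ret);
        /* no unconditional "break" on ret here: only an empty buffer or
         * a blocked demux ends the loop */
    }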
This should fix frozen HTTP/3 transfers with the quiche client. Thanks
to Lucas Pardue from Cloudflare for his report on the bug. Frédéric
Lecaille quickly found the source of the problem, which helped me write
this patch.
If the request channel buffer is full, H3 demuxing must be interrupted
on the stream until some read is performed. This condition is reported
if the HTX stream buffer qcs.rx.app_buf is full.
In this case, the qcs instance is marked with a new flag QC_SF_DEM_FULL.
This flag causes the H3 demuxing to be interrupted. It is cleared when
the HTX buffer is read by the conn-stream layer through the rcv_buf
operation.
When the flag is cleared, the MUX tasklet is woken up. However, as the MUX
iocb does not handle Rx for the moment, this is useless. It must be fixed
to prevent possible freezes on POST transfers.
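Both sides of the mechanism can be sketched as follows (QC_SF_DEM_FULL is the
real flag; the surrounding code is simplified):

    /* demux side: the HTX buffer is full, stop demuxing this stream */
    if (!htx_free_data_space(htx)) {
        qcs->flags |= QC_SF_DEM_FULL;
        goto out;
    }

    /* rcv_buf side: data were read by the conn-stream, resume demuxing */
    if (qcs->flags & QC_SF_DEM_FULL) {
        qcs->flags &= ~QC_SF_DEM_FULL;
        tasklet_wakeup(qcs->qcc->wait_event.tasklet);
    }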
In practice, for the moment the HTX buffer is never full as the current
Rx code is limited by the quic-conn receive buffer size and the
incomplete flow-control implementation. So for now this patch is not
testable under the current conditions.
Released version 2.6-dev8 with the following main changes :
- BUG/MINOR: quic: fix use-after-free with trace on ACK consume
- BUG/MINOR: rules: Forbid captures in defaults section if used by a backend
- BUG/MEDIUM: rules: Be able to use captures defined in defaults section
- BUG/MINOR: rules: Fix check_capture() function to use the right rule arguments
- BUG/MINOR: http-act: make release_http_redir() more robust
- BUG/MINOR: sample: add missing use_backend/use-server contexts in smp_resolve_args
- MINOR: sample: don't needlessly call c_none() in sample_fetch_as_type()
- MINOR: sample: make the bool type cast to bin
- MEDIUM: backend: add new "balance hash <expr>" algorithm
- MINOR: init: add global setting "fd-hard-limit" to bound system limits
- BUILD: pollers: use an initcall to register the pollers
- BUILD: xprt: use an initcall to register the transport layers
- BUILD: thread: use initcall instead of a constructor
- BUILD: http: remove the two unused constructors in rules and ana
- CLEANUP: compression: move the default setting of maxzlibmem to defaults
- MINOR: tree-wide: always consider EWOULDBLOCK in addition to EAGAIN
- BUG/MINOR: connection: "connection:close" header added despite 'close-spread-time'
- MINOR: fd: add functions to set O_NONBLOCK and FD_CLOEXEC
- CLEANUP: tree-wide: use fd_set_nonblock() and fd_set_cloexec()
- CLEANUP: tree-wide: remove 25 occurrences of unneeded fcntl.h
- REGTESTS: fix the race conditions in be2dec.vtc ad field.vtc
- REGTESTS: webstats: remove unused stats socket in /tmp
- MEDIUM: httpclient: disable SSL when the ca-file couldn't be loaded
- BUG/MINOR: httpclient/lua: error when the httpclient_start() fails
- BUG/MINOR: ssl: free the cafile entries on deinit
- BUG/MINOR: ssl: memory leak when trying to load a directory with ca-file
- MEDIUM: httpclient: re-enable the verify by default
- BUG/MEDIUM: ssl/cli: fix yielding in show_cafile_detail
- BUILD: compiler: properly distinguish weak and global symbols
- MINOR: connection: Add way to disable active connection closing during soft-stop
- BUG/MEDIUM: http-ana: Fix memleak in redirect rules with ignore-empty option
- CLEANUP: Destroy `http_err_chunks` members during deinit
- BUG/MINOR: resolvers: Fix memory leak in resolvers_deinit()
- MINOR: Call deinit_and_exit(0) for `haproxy -vv`
- BUILD: fd: disguise the fd_set_nonblock/cloexec result
- BUG/MINOR: pools: make sure to also destroy shared pools in pool_destroy_all()
- MINOR: ssl: add a new global option "tune.ssl.hard-maxrecord"
- CLEANUP: errors: also call deinit_errors_buffers() on deinit()
- CLEANUP: chunks: release trash also in deinit
- CLEANUP: deinit: release the pre-check callbacks
- CLEANUP: deinit: release the config postparsers
- CLEANUP: listeners/deinit: release accept queue tasklets on deinit
- CLEANUP: connections/deinit: destroy the idle_conns tasks
- BUG/MINOR: mux-quic: fix build in release mode
- MINOR: mux-quic: adjust comment on emission function
- MINOR: mux-quic: remove unused bogus qcc_get_stream()
- BUG/MINOR: mux-quic: fix leak if cs alloc failure
- MINOR: mux-quic: count local flow-control stream limit on reception
- BUG/MINOR: h3: fix incomplete POST requests
- BUG/MEDIUM: h3: fix use-after-free on mux Rx buffer wrapping
- MINOR: mux-quic: partially copy Rx frame if almost full buf
- MINOR: h3: change frame demuxing API
- MINOR: mux-quic: add a app-layer context in qcs
- MINOR: h3: implement h3 stream context
- MINOR: h3: support DATA demux if buffer full
- MINOR: quic: decode as much STREAM as possible
- MINOR: quic: Improve qc_prep_pkts() flexibility
- MINOR: quic: Prepare quic_frame struct duplication
- MINOR: quic: Do not retransmit frames from coalesced packets
- MINOR: quic: Add traces about TX frame memory releasing
- MINOR: quic: process_timer() rework
- MEDIUM: quic: New functions for probing rework
- MEDIUM: quic: Retransmission functions rework
- MEDIUM: quic: qc_requeue_nacked_pkt_tx_frms() rework
- MINOR: quic: old data distinction for qc_send_app_pkt()
- MINOR: quic: Mark packets as probing with old data
- MEDIUM: quic: Mark copies of acknowledged frames as acknowledged
- MEDIUM: quic: Enable the new datagram probing process
- MINOR: quic: Do not send ACK frames when probing
- BUG/MINOR: quic: Wrong returned status by qc_build_frms()
- BUG/MINOR: quic: Avoid sending useless PADDING frame
- BUG/MINOR: quic: Traces fix about remaining frames upon packet build failure
- MINOR: quic: Wake up the mux to probe with new data
- BUG/MEDIUM: quic: Possible crash on STREAM frame loss
- BUG/MINOR: quic: Missing Initial packet length check
- CLEANUP: quic: Rely on the packet length set by qc_lstnr_pkt_rcv()
- MINOR: quic: Drop 0-RTT packets if not allowed
- BUG/MINOR: httpclient/ssl: use the correct verify constant
- BUG/MEDIUM: conn-stream: Don't erase endpoint flags on reset
- BUG/MEDIUM: httpclient: Fix loop consuming HTX blocks from the response channel
- BUG/MINOR: httpclient: Count metadata in size to transfer via htx_xfer_blks()
- MINOR: httpclient: Don't use co_set_data() to decrement output
- BUG/MINOR: conn_stream: do not confirm a connection from the frontend path
- MEDIUM: quic: do not ACK packet with STREAM if MUX not present
- MEDIUM: quic: do not ack packet with invalid STREAM
- MINOR: quic: Drop 0-RTT packets without secrets
- CLEANUP: quic: Remaining fprintf() debug trace
- MINOR: quic: moving code for QUIC loss detection
- BUG/MINOR: quic: Missing time threshold multiplifier for loss delay computation
- CI: github actions: update LibreSSL to 3.5.2
- SCRIPTS: announce-release: add URL of dev packages
It seems this multiplier ended up in oblivion. Indeed, a multiplier must be
applied to the loss delay, expressed as an RTT multiplier: 9/8.
So, some packets were detected as lost too soon, leading them to be
retransmitted too early!
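For reference, this is the RFC 9002 time-threshold computation the multiplier
belongs to (illustrative C; the field names do not match the haproxy
structures):

    /* a packet is declared lost only once it is older than
     * 9/8 * max(smoothed_rtt, latest_rtt) relative to a newly acked one */
    unsigned int loss_delay = MAX(srtt, latest_rtt);

    loss_delay += loss_delay >> 3;  /* apply the 9/8 multiplier */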
If the MUX can neither immediately handle nor buffer a STREAM frame, the
packet containing it must not be acknowledged. This is in conformance
with RFC 9000.
qcc_recv() return codes have been adjusted to differentiate an invalid
frame from an already fully received offset, which must be acknowledged.
If a packet contains a STREAM frame but the MUX is not allocated, the
frame cannot be enqueued. According to RFC 9000, we must not
acknowledge the packet under this condition.
This may prevent a bug with Firefox which kept retrying to refresh
the web page. This issue had already been detected before the closing-state
implementation: haproxy wasn't emitting CONNECTION_CLOSE and kept
acknowledging STREAM frames despite not handling them.
In the future, it might be necessary to respond with a CONNECTION_CLOSE
if the MUX has already been freed.
In issue #1468 it was reported that sometimes server-side connection
attempts were only validated after the "timeout connect" value, and
that would only happen with an H2 client. A long code analysis with the
output dumps showed only one possible call path: an I/O event on the
frontend while reading had just been disabled calls h2_wake() which in
turn wakes cs_conn_io_cb(), which tries cs_conn_process() and cs_notify(),
which sees that the other side is not blocked (already in CS_ST_CON)
and tries cs_chk_snd() on it. But on that side the connection had just
finished being set up and had not yet woken the stream up, so cs_notify()
would then call cs_conn_send(), which succeeds and passes the connection
to CS_ST_RDY. The problem is that nothing new happened on the frontend
side so there's no reason to wake the stream up and the backend-side
conn_stream remains in CS_ST_RDY state with the stream never being
woken up.
Once the "timeout connect" strikes, process_stream() is woken up and
finds the connection finally setup, so it ignores the timeout and goes
on.
The number of conditions to meet to reproduce this is huge, which also
explains why the reporter says it's "occasional" and we were never able
to reproduce it in the lab. It needs at least reads to be disabled and
immediately re-enabled on the frontend side (e.g. buffer full) with an
I/O event reported before the poller had an opportunity to be disabled
but with no subscribe being reinstalled, so that sock_conn_iocb() has
no other choice but to call h2_wake(), and exactly at the same time
the backend connection must finish being set up so that it was not yet
reported by the poller, the data were sent and the polling for writes
disabled.
Several factors are to be considered here:
- h2_wake() should probably not call h2_wake_some_streams() for
ret >= 0 (common case), but only if some special event is reported
for at least one stream; that part is sensitive though as in the
past we managed to lose some rare cases (e.g. restart processing
after a pause), and such wakeups are extremely rare so we'd better
make that effort once in a while.
- letting a lazy forward attempt on the frontend confirm a backend
connection establishment is too smart to be reliable. That wasn't
in fact the intent and it's inherited from the very old code where
muxes didn't exist and where it was guaranteed that an event at this
layer would wake everyone up.
Here the best thing to do is to refrain from attempting to forward data
until the connection is confirmed. This will let the poller report the
connect() event to the backend side which will process it as it should
and does in all other cases.
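The principle of the fix can be summed up as follows (sketch, not the literal
condition; the state names are the 2.6 ones from the analysis above):

    /* frontend-triggered chk_snd path: do not try to forward while the
     * backend conn_stream is still connecting; let the poller's connect()
     * event on the backend side confirm the connection instead */
    if (cs->state == CS_ST_CON)
        return;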
Thanks to Jimmy Crutchfield for having reported useful traces and
tested patches.
This will have to be backported to all stable branches after some
observation. Before 2.6 the function is stream_int_chk_snd_conn(),
and the flag to remove is SI_SB_CON.
The use of co_set_data() should be strictly limited to setting the amount of
existing data to be transmitted. It ought not be used to decrement the
output after the data have left the buffer, because doing so involves
performing incorrect calculations using co_data() that still comprises data
that are not in the buffer anymore. Let's use c_rew() for this, which is
made exactly for this purpose, i.e. decrement c->output by as much as
requested.
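In other words (sketch):

    /* wrong: rebuilds the output from co_data(), which still counts data
     * that already left the buffer */
    co_set_data(chn, co_data(chn) - bytes);

    /* right: simply rewind the output counter by <bytes> */
    c_rew(chn, bytes);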
When HTX blocks are transferred from the HTTP client context to the request
channel, via the htx_xfer_blks() function, the metadata must also be counted,
in addition to the data size. Otherwise, the expected payload size will not be
copied because the metadata of an HTX block (8 bytes) will be reserved. And
if the payload size is lower than 8 bytes, nothing will be copied. Thus only
a zero-copy will be able to copy the payload.
This issue is 2.6-specific, no backport is needed.
When the HTTP client consumes the response, it loops on the HTX message to
copy the block contents and it removes blocks by calling htx_remove_blk(). But
this function removes a block and returns the next one in the HTX
message. Its result must be used instead of calling htx_get_next(). This is
especially important because the block used in the htx_get_next() loop was
removed; it only worked because the message is not defragmented during the
loop.
In addition, the loop on the response was simplified to iterate on blocks
instead of positions.
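The consuming loop thus becomes, schematically:

    /* iterate on blocks; htx_remove_blk() already returns the next one */
    blk = htx_get_head_blk(htx);
    while (blk) {
        /* ... copy the block's content to the destination buffer ... */
        blk = htx_remove_blk(htx, blk);
    }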
This patch must be backported to 2.5.
Only the CS_EP_ERROR flag is now removed from the endpoint when a reset is
performed. When a new endpoint is allocated, flags are preserved. It is
the caller's responsibility to remove other flags, depending on its needs.
Concretely, during a connection retry or an L7 retry, we must preserve
flags. In tcpcheck and the CLI, we reset flags.
This patch is 2.6-specific. No backport needed.
The SSL_SERVER_VERIFY_* constants were incorrectly set on the httpclient
server verify. The right constants are SSL_SOCK_VERIFY_*.
This could cause issues when using "httpclient-ssl-verify" or when the
SSL certificates can't be loaded.
No backport needed.
This function is used to parse the QUIC packets carried by a UDP datagram.
When a correct packet could be found, the ->len RX packet structure value
is set to the packet length value. Otherwise, when no correct QUIC packet
could be found, it is set to the remaining number of bytes in the UDP datagram.
So, there is no need to make this function return a status value. This allows
the caller to parse any QUIC packet carried by a UDP datagram without this
status.
Any client Initial packet carried in a datagram smaller than QUIC_INITIAL_PACKET_MINLEN(200)
bytes must be discarded. This does not mean we must discard the entire datagram.
So we must at least try to parse the packet length before dropping the packet
and return its length from qc_lstnr_pkt_rcv().
A crash is possible under such circumstances:
- the congestion window is drastically reduced to its minimal value
  when a quic listener is experiencing extreme packet loss;
- we enqueue several STREAM frames to be resent and some of them could not be
  transmitted;
- some of the frames still in flight are acknowledged and trigger the
  stream memory releasing;
- when we come back to send the remaining STREAM frames, haproxy
  crashes when it tries to build them.
To fix this issue, we mark the STREAM frames as lost when detected as lost.
Then a lookup is performed for the stream the STREAM frames are attached
to before building them. They are released if the stream is no longer
available or if the data range of the frame has already been consumed.
When we have to probe the peer, we must first try to send new data. This is done
here by waking up the mux after having set the maximum number of datagrams
to send to QUIC_MAX_NB_PTO_DGRAMS (2). Of course, this is only the case if the
mux was subscribed to SEND events.
This may happen in rare cases with extreme packet loss (30% for both TX and RX)
which leads the congestion window to decrease down to its minimal value (two
datagrams). Under such circumstances, no ack-eliciting frame can be added to
a packet by qc_build_frms(). In this case we must cancel the packet building
process if there is no ACK or probe (PING frame) to send.
This function must return a successful status as soon as it could build
a frame to be embedded in a packet. This behavior was broken by the last
modifications. This was due to a dangerous "ret = 1" statement inside
a loop. This statement must be reached only if we go out of a switch/case
after a "break" statement.
Add comments to mention this information.
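Schematically, the correct structure is the following (sketch, not the literal
qc_build_frms() code; the helper is hypothetical):

    list_for_each_entry(cf, frms, list) {
        switch (cf->type) {
        case QUIC_FT_CRYPTO:
            if (!qc_build_one_frame(cf))  /* hypothetical helper */
                goto leave;               /* do not report success */
            break;
        default:
            break;
        }
        /* reached only after a "break" out of the switch above, i.e.
         * when a frame could effectively be built */
        ret = 1;
    }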
When we are probing, we do not receive packets; furthermore, all ACK frames
have already been sent. It is useless to send ACKs when probing the peer. This
modification does not reset the flag which marks the connection as requiring an
ACK frame to be sent. If this is the case, this will be taken into account
after the probing process.
Make the two I/O handlers quic_conn_io_cb() and quic_conn_app_io_cb() call
qc_dgrams_retransmit() after a probing retransmission need was detected by
the timer task (qc_process_timer()).
We must modify qc_prep_pkts() to support QUIC_TLS_ENC_LEVEL_NONE as <next_tel>
parameter when called from qc_dgrams_retransmit().