To ease the fast forwarding and the infinite forwarding on HTX proxies, 2
functions have been added to let the channel be almost aware of the way data are
stored in its buffer. By calling these functions instead of legacy ones, we are
sure to forward the right amount of data.
Now, the function htx_from_buf() will set the buffer's length to its size
automatically. In return, the caller should call htx_to_buf() at the end to be
sure to leave the buffer hosting the HTX message in the right state.
Alternatively, the caller can use the function htxbuf() to get the HTX message
without any update on the underlying buffer.
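
The pairing described above can be sketched with toy types. This is only an
illustration, not the HAProxy code: struct toy_buf, struct toy_htx and the
toy_*() helpers are hypothetical stand-ins for the buffer, the HTX message,
htx_from_buf(), htx_to_buf() and htxbuf(). The rule applied by the closing call
(an empty message leaves an empty buffer, otherwise the buffer is reported as
full) is an assumption drawn from the description.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the real HAProxy definitions. */
struct toy_buf { char *area; size_t size; size_t data; };
struct toy_htx { size_t used_bytes; };

/* Read-only style access: cast the area and leave the buffer untouched,
 * in the spirit of htxbuf(). */
static inline struct toy_htx *toy_htxbuf(struct toy_buf *b)
{
    return (struct toy_htx *)b->area;
}

/* Writable access: the whole area now belongs to the HTX message, so the
 * buffer length is set to its size, in the spirit of htx_from_buf(). */
static inline struct toy_htx *toy_htx_from_buf(struct toy_buf *b)
{
    assert(b->area != NULL && b->size >= sizeof(struct toy_htx));
    b->data = b->size;
    return (struct toy_htx *)b->area;
}

/* Closing call: put the buffer back in a state matching the message,
 * in the spirit of htx_to_buf() (assumed rule, see above). */
static inline void toy_htx_to_buf(struct toy_htx *htx, struct toy_buf *b)
{
    b->data = htx->used_bytes ? b->size : 0;
}
```

A caller that modifies the message brackets its work between the two calls,
while a caller that only inspects it can use the cast-only helper.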
On the server side, we must test the request headers to deduce if we are able to do
keepalive or not. Otherwise, by default, the keepalive will be enabled on the
server's connection, whatever the client said.
After 8706c8131 ("BUG/MEDIUM: mux_pt: Always set CS_FL_RCV_MORE."), a
side effect caused failed receives to mark the buffer as missing room,
a flag that no other place can remove since it's empty. Ideally we need
a separate flag to mean "failed to deliver data by lack of room", but
in the meantime, at the very least, we must not mark an empty buffer
as blocked.
No backport is needed.
In order to properly deal with unaligned contents, the output data are
currently copied into a temporary buffer, to be copied into the mux's
output buffer at the end. The new buffer API allows several buffers to
share the same data area, so we're using this here to make the temporary
buffer point to the same area as the output buffer when that one is
empty. This is enough to avoid the copy at the end, only pointers and
lengths have to be adjusted. In addition the output buffer's head is
advanced by the HTX header size so that the remaining copy is aligned.
By doing this we improve the large object performance by an extra 10%,
which is 64% above the 1.9-dev9 state. It's worth noting that there are
no more calls to __memcpy_sse2_unaligned() now.
Since this code deals with various block types, it appears difficult to
adjust it to be smart enough to even avoid the first copy. However a
distinct approach could consist in trying to detect a single-block
HTX message and jump to dedicated code in this case.
When transferring large objects, most calls are made between a full
buffer and an empty buffer. In this case there is a large opportunity
for performing zero-copy calls, with a few exceptions : the input data
must fit into the output buffer, and the data need to be properly
aligned and formatted to let the HTX header fit before and the HTX
block(s) fit after.
This patch does two things :
1) it makes sure that we prepare an empty input buffer before a recv()
call so that it appears as holding an HTX block at the front, which is
removed afterwards. This way the data received using recv() are placed
exactly at the target position in the input buffer for a later cast to
HTX.
2) when receiving data in h1_process_data(), if it appears that the input
buffer can be cast to an HTX buffer and the target buffer is empty,
then the buffers are swapped, an HTX header is prepended in front of the
data area, and the HTX block is appended to reference this data block.
In practice, this ensures that in most cases when transferring large files,
calls to h1_rcv_buf() are made using zero copy and a little bit of buffer
preparation (~40 bytes to be written).
Doing this adds an extra 13% performance boost on top of the previous patch,
resulting in a total speed-up of 50% on large transfers.
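
The two steps above can be sketched as follows. This is a simplified
illustration under stated assumptions, not the mux-h1 code: struct zc_buf, the
zc_*() helpers and ZC_HDR_ROOM (a made-up size reserved for the HTX header) are
hypothetical, both buffers are assumed to have the same size, and the writing of
the header and of the block descriptor after the swap is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy buffer: NOT the HAProxy struct buffer. */
struct zc_buf { char *area; size_t size; size_t head; size_t data; };

/* Assumed room to reserve in front of the payload for the HTX header. */
#define ZC_HDR_ROOM 24

/* Step 1: before recv() on an empty input buffer, advance the head so the
 * received bytes land right after the room reserved for the header. */
static inline void zc_prepare_input(struct zc_buf *in)
{
    assert(in->size > ZC_HDR_ROOM);
    if (in->data == 0)
        in->head = ZC_HDR_ROOM;
}

/* Step 2: if the output is empty and the input payload sits at the expected
 * offset, exchange the two buffers instead of copying the payload.
 * Returns 1 when the swap happened, 0 when a regular copy is needed. */
static inline int zc_try_swap(struct zc_buf *in, struct zc_buf *out)
{
    struct zc_buf tmp;

    if (out->data != 0 || in->data == 0)
        return 0;
    if (in->head != ZC_HDR_ROOM || in->size != out->size)
        return 0;

    tmp = *out;
    *out = *in;   /* output now owns the area holding the payload */
    *in = tmp;    /* input gets the former (empty) output area */
    in->head = 0;
    in->data = 0;
    return 1;
}
```

After a successful swap only the small header area in front of the payload
remains to be written, which is where the few tens of bytes of preparation
mentioned above come from.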
Just by using this buffer room estimation for the demux buffer, the large
object performance has increased by up to 33%. This is mostly due to fewer
recv() calls and unaligned copies.
The small HTX overhead is enough to make the system perform multiple
reads and unaligned memory copies. Here we provide a function whose
purpose is to reduce the apparent room in a buffer by the size of the
overhead for DATA blocks, which is the struct htx plus 2 blocks (one
for DATA, one for the end of message so that small blocks can fit at
once). The muxes using HTX will be encouraged to use this one instead
of b_room() to compute the available buffer room and avoid filling
their demux buf with more data than can fit at once into the HTX
buffer.
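
As a rough sketch of the computation described above, assuming hypothetical and
simplified layouts (struct sk_htx, struct sk_htx_blk and struct sk_buf are not
the real HAProxy structures, and the helper names are made up for the example):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified layouts; the real structures differ. */
struct sk_htx_blk { uint32_t addr; uint32_t info; };
struct sk_htx { uint32_t size, data, used, tail, front, wrap; };
struct sk_buf { size_t size; size_t data; };

/* Raw free space, in the spirit of b_room(). */
static inline size_t sk_room(const struct sk_buf *b)
{
    assert(b->data <= b->size);
    return b->size - b->data;
}

/* Free space usable for payload once the HTX header and two block
 * descriptors (one DATA, one end of message) have been set aside.
 * Returns 0 when the overhead alone does not fit. */
static inline size_t sk_room_for_htx_data(const struct sk_buf *b)
{
    size_t overhead = sizeof(struct sk_htx) + 2 * sizeof(struct sk_htx_blk);
    size_t room = sk_room(b);

    return room > overhead ? room - overhead : 0;
}
```

A mux would pass the reduced value as the maximum read size, so that what it
receives in one go can later be moved into an HTX buffer of the same size
without a second pass.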
This one is used a lot during transfers, let's avoid resetting its
size when there are already data in the buffer since it implies the
size is correct.
When using the mux_pt, as we can't know if there's more data to be read,
always set CS_FL_RCV_MORE, and only remove it if we got an error or a shutr
and rcv_buf() returned 0.
If the ibuf only contains a small amount of data, realign it
before calling rcv_buf(), as it's probably going to be cheaper
to do so than to do 2 calls to recv().
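
A minimal sketch of this idea, with a toy non-wrapping buffer. struct ra_buf,
ra_realign_if_small() and the threshold parameter are hypothetical; the source
does not say what counts as a "small amount".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy linear buffer: <data> bytes are stored starting at offset <head>. */
struct ra_buf { char *area; size_t size; size_t head; size_t data; };

/* Move the pending bytes back to the start of the area when there are few
 * of them, so that the next receive gets one large contiguous free region.
 * Returns 1 if the buffer was realigned, 0 otherwise. */
static inline int ra_realign_if_small(struct ra_buf *b, size_t threshold)
{
    assert(b->head + b->data <= b->size); /* this sketch ignores wrapping */

    if (b->head == 0 || b->data > threshold)
        return 0;

    memmove(b->area, b->area + b->head, b->data);
    b->head = 0;
    return 1;
}
```

Copying a handful of bytes is expected to cost less than the extra system call
that a split free region would otherwise require.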
It's incorrect to send more bytes than requested, because some filters
(e.g. compression) might intentionally hold on some blocks, so DATA
blocks must not be processed past the advertised byte count. It is not
the case for headers however.
No backport is needed.
If we're blocking on mux full, mux busy or whatever, we must get out of
the loop. In legacy mode this problem doesn't exist as we can normally
return 0 but here it's not a sufficient condition to stop sending, so
we must inspect the blocking flags as well.
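
The loop condition can be illustrated with a toy model. The flag names, struct
lp_mux and the lp_*() helpers below are hypothetical; the point is only that a
zero-sized block legitimately returns 0 without being blocked, so the loop has
to look at the blocking flags rather than at the return value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical blocking flags, loosely modelled on "mux full"/"mux busy". */
#define LP_F_MFULL   0x1u
#define LP_F_MBUSY   0x2u
#define LP_F_BLOCKED (LP_F_MFULL | LP_F_MBUSY)

struct lp_mux { unsigned int flags; size_t room; };

/* Try to emit one block of <len> bytes. Returns the bytes sent; sets a
 * blocking flag and returns 0 when the block does not fit. */
static inline size_t lp_send_block(struct lp_mux *m, size_t len)
{
    if (len > m->room) {
        m->flags |= LP_F_MFULL;
        return 0;
    }
    m->room -= len;
    return len;
}

/* Send blocks in order. A return of 0 from lp_send_block() is not enough to
 * stop (empty blocks return 0 too), so the blocking flags are inspected. */
static inline size_t lp_send_all(struct lp_mux *m, const size_t *blk,
                                 size_t nblk, size_t *done)
{
    size_t total = 0;

    assert(done != NULL);
    *done = 0;
    while (*done < nblk) {
        total += lp_send_block(m, blk[*done]);
        if (m->flags & LP_F_BLOCKED)
            break;
        (*done)++;
    }
    return total;
}
```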
No backport is needed.
The way htx_xfer_blks() was used is wrong: if we receive data, we must
report everything we found, not just the header blocks. This was causing
the EOM to be postponed and some fast responses (or errors) to be incorrectly
delayed.
No backport is needed.
In h2_snd_buf(), when running with HTX, make sure we return the amount of
data the caller specified if we emptied the buffer, as this is what the
caller expects, and it will lead it to properly consider the buffer to be
empty.
With the current design, there is always an H1 stream attached to the mux. So
after the conn_stream is detached, if we don't create a new H1 stream in
h1_process(), it is important to release the mux.
In h1_recv(), return 1 if we have data available, or if h1_recv_allowed()
failed, to be sure h1_process() is called. Also don't subscribe if our buffer
is full.
Don't always wake the tasklets subscribed to recv or send events as soon as
we had any I/O event, and don't call the wake() method if there was no
subscription. Instead, wake the recv tasklet if we received data in h2_recv(),
and wake the send tasklet if we were able to send data in h2_send() and the
buffer is not full anymore.
Only call the data_cb->wake() method if we get an error/a read 0, just in
case the stream was not subscribed to receive events.
Of course, the flag FLT_CFG_FL_HTX must be used and not
STRM_FLT_FL_HAS_FILTERS. "Fortunately", these 2 flags have the same value, so
everything worked as expected.
When reaching h2_shutr/h2_shutw, as we may have generated an empty frame,
a goaway or a rst, make sure we wake the I/O tasklet, or we may not send
what we just generated.
Also in h2_shutw(), don't forget to return if all went well, we don't want
to subscribe the h2s to wait events.
The previous code was only stopping the listeners in the master, not the
entire proxy.
Since we now have a polling loop in the master, there might be some side
effects, as some things are still initialized there. For example, the
checks were still running.
When ssl_bc_alpn was meant to be added, a typo slipped in and as a result
ssl_fc_alpn behaved as ssl_bc_alpn, and ssl_bc_alpn was not a valid keyword.
This patch aims at fixing this.
This only happens for connections using the h1 mux. We must be sure to force the
version to HTTP/1.1 when the version of the message is 1.1 or above. It is
important for H2 messages to not send an invalid version string (HTTP/2.0) to
peers.
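
The rule can be sketched as a small helper. h1v_output_version() is a
hypothetical name, and passing versions below 1.1 through unchanged is an
assumption; the text above only specifies what happens for 1.1 and above.

```c
#include <assert.h>
#include <stddef.h>

/* Pick the version string to emit on an H1 connection for a message whose
 * version is <major>.<minor>. Anything at or above 1.1 (including 2.0) is
 * announced as HTTP/1.1; lower versions are returned as given (assumed). */
static inline const char *h1v_output_version(const char *vsn,
                                             unsigned int major,
                                             unsigned int minor)
{
    assert(vsn != NULL);

    if (major > 1 || (major == 1 && minor >= 1))
        return "HTTP/1.1";
    return vsn;
}
```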
Released version 1.9-dev9 with the following main changes :
- BUILD/MINOR: ssl: fix build with non-alpn/non-npn libssl
- BUG/MINOR: mworker: Do not attempt to close(2) fd -1
- BUILD: compression: fix build error with DEFAULT_MAXZLIBMEM
- MINOR: compression: always create the compression pool
- BUG/MEDIUM: mworker: fix FD leak upon reload
- BUILD: htx: fix fprintf format inconsistency on 32-bit platforms
- BUILD: buffers: buf.h requires unistd to get ssize_t on libmusl
- MINOR: initcall: introduce a way to register init functions to call at boot
- MINOR: init: process all initcalls in order at boot time
- MEDIUM: init: convert all trivial registration calls to initcalls
- MINOR: thread: provide a set of lock initialisers
- MINOR: threads: add new macros to declare self-initializing locks
- MEDIUM: init: use self-initializing spinlocks and rwlocks
- MINOR: initcall: apply initcall to all register_build_opts() calls
- MINOR: initcall: use initcalls for most post_{check,deinit} and per_thread*
- MINOR: initcall: use initcalls for section parsers
- MINOR: memory: add a callback function to create a pool
- MEDIUM: init: use initcall for all fixed size pool creations
- MEDIUM: memory: use pool_destroy_all() to destroy all pools on deinit()
- MEDIUM: initcall: use initcalls for a few initialization functions
- MEDIUM: memory: make the pool cache an array and not a thread_local
- MINOR: ssl: free ctx when libssl doesn't support NPN
- BUG/MINOR: proto_htx: only mark connections private if NTLM is detected
- MINOR: h2: make struct h2_ops static
- BUG/MEDIUM: mworker: avoid leak of client socket
- REORG: mworker: declare master variable in global.h
- BUG/MEDIUM: listeners: CLOEXEC flag is not correctly set
- CLEANUP: http: Fix typo in init_http's comment
- BUILD: Makefile: Disable -Wcast-function-type if it exists.
- BUG/MEDIUM: h2: Don't bogusly error if the previous stream was closed.
- REGTEST/MINOR: script: add run-regtests.sh script
- REGTEST: Add a basic test for the cache.
- BUG/MEDIUM: mux_pt: Don't forget to unsubscribe() on attach.
- BUG/MINOR: ssl: ssl_sock_parse_clienthello ignores session id
- BUG/MEDIUM: connections: Wake the stream once the mux is chosen.
- BUG/MEDIUM: connections: Don't forget to detach the connection from the SI.
- BUG/MEDIUM: stream_interface: Don't check if the handshake is done.
- BUG/MEDIUM: stream_interface: Make sure we read all the data available.
- BUG/MEDIUM: h2: Call h2_process() if there's an error on the connection.
- REGTEST: Fix several issues.
- REGTEST: lua: check socket functionality from a lua-task
- BUG/MEDIUM: session: Remove the session from the session_list in session_free.
- BUG/MEDIUM: streams: Don't assume we have a CS in sess_update_st_con_tcp.
- BUG/MEDIUM: connections: Don't assume we have a mux in connect_server().
- BUG/MEDIUM: connections: Remove the connection from the idle list before destroy.
- BUG/MEDIUM: session: properly clean the outgoing connection before freeing.
- BUG/MEDIUM: mux_pt: Don't try to send if handshake is not done.
- MEDIUM: connections: Put H2 connections in the idle list if http-reuse always.
- MEDIUM: h2: Destroy a connection with no stream if it has no owner.
- MAJOR: sessions: Store multiple outgoing connections in the session.
- MEDIUM: session: Steal owner-less connections on end of transaction.
- MEDIUM: server: Be smarter about deciding to reuse the last server.
- BUG/MEDIUM: Special-case http_proxy when dealing with outgoing connections.
- BUG/MINOR: cfgparse: Fix transition between 2 sections with the same name
- BUG/MINOR: http: Use out buffer instead of trash to display error snapshot
- BUG/MINOR: htx: Fix block size calculation when a start-line is added/replaced
- BUG/MINOR: mux-h1: Fix processing of "Connection: " header on outgoing messages
- BUG/MEDIUM: mux-h1: Reset the H1 parser when an outgoing message is processed
- BUG/MINOR: proto_htx: Send outgoing data to client to start response processing
- BUG/MINOR: htx: Stop a header or a start line lookup on the first EOH or EOM
- BUG/MINOR: connection: report mux modes when HTX is supported
- MINOR: htx: add a function to cut the beginning of a DATA block
- MEDIUM: conn_stream: Add a way to get mux's info on a CS from the upper layer
- MINOR: mux-h1: Implement get_cs_info() callback
- MINOR: stream: Rely on CS's info if it exists and fallback on session's ones
- MINOR: proto_htx: Use conn_stream's info to set t_idle duration when possible
- MINOR: mux-h1: Don't rely on the stream anymore in h1_set_srv_conn_mode()
- MINOR: mux-h1: Write last chunk and trailers if not found in the HTX message
- MINOR: mux-h1: Be prepared to fail when EOM is added during trailers parsing
- MINOR: mux-h1: Subscribe to send in h1_snd_buf() when not all data have been sent
- MINOR: mux-h1: Consume channel's data in a loop in h1_snd_buf()
- MEDIUM: mux-h1: Add keep-alive outgoing connections in connections list
- MINOR: htx: Add function to add an HTX block just before another one
- MINOR: htx: Add function to iterate on an HTX message using HTX blocks
- MINOR: htx: Add a function to find the HTX block corresponding to a data offset
- MINOR: stats: Don't add end-of-data marker and trailers in the HTX response
- MEDIUM: htx: Change htx_sl to be a struct instead of an union
- MINOR: htx: Add the start-line offset for the HTX message in the HTX structure
- MEDIUM: htx: Don't rely on h1_sl anymore except during H1 header parsing
- MINOR: proto-htx: Use the start-line flags to set the HTTP message ones
- MINOR: htx: Add BODYLESS flags on the HTX start-line and the HTTP message
- MINOR: proto_htx: Use full HTX messages to send 100-Continue responses
- MINOR: proto_htx: Use full HTX messages to send 103-Early-Hints responses
- MINOR: proto_htx: Use full HTX messages to send 401 and 407 responses
- MINOR: proto_htx: Send valid HTX message when redir mode is enabled on a server
- MINOR: proto_htx: Send valid HTX message to send 30x responses
- MEDIUM: proto_htx: Convert all HTTP error messages into HTX
- MINOR: mux-h1: Process conn_mode on the EOH when no connection header is found
- MINOR: mux-h1: Change client conn_mode on an explicit close for the response
- MINOR: mux-h1: Capture bad H1 messages
- MAJOR: filters: Adapt filters API to be compatible with the HTX representation
- MEDIUM: proto_htx/filters: Add data filtering during the forwarding
- MINOR: flt_trace: Adapt to be compatible with the HTX representation
- MEDIUM: compression: Adapt to be compatible with the HTX representation
- MINOR: h2: implement H2->HTX request header frame transcoding
- MEDIUM: mux-h2: register mux for both HTTP and HTX modes
- MEDIUM: mux-h2: make h2_rcv_buf() support HTX transfers
- MEDIUM: mux-h2: make h2_snd_buf() HTX-aware
- MEDIUM: mux-h2: add basic H2->HTX transcoding support for headers
- MEDIUM: mux-h2: implement emission of H2 headers frames from HTX blocks
- MEDIUM: mux-h2: implement the emission of DATA frames from HTX DATA blocks
- MEDIUM: mux-h2: support passing H2 DATA frames to HTX blocks
- BUG/MINOR: cfgparse: Fix the call to post parser of the last sections parsed
- BUG/MEDIUM: mux-h2: don't lose the first response header in HTX mode
- BUG/MEDIUM: mux-h2: remove the HTX EOM block on H2 response headers
- MINOR: listener: the mux_proto entry in the bind_conf is const
- MINOR: connection: create conn_get_best_mux_entry()
- MINOR: server: the mux_proto entry in the server is const
- MINOR: config: make sure to associate the proper mux to bind and servers
- MINOR: hpack: add ":path" to the list of common header fields
- MINOR: h2: add new functions to produce an HTX message from an H2 response
- MINOR: mux-h2: mention that the mux is compatible with both sides
- MINOR: mux-h2: implement an outgoing stream allocator : h2c_bck_stream_new()
- MEDIUM: mux-h2: start to create the outgoing mux
- MEDIUM: mux-h2: implement encoding of H2 request on the backend side
- MEDIUM: mux-h2: make h2_frt_decode_headers() direction-agnostic
- MEDIUM: mux-h2: make h2_process_demux() capable of processing responses as well
- MEDIUM: mux-h2: Implement h2_attach().
- MEDIUM: mux-h2: Don't bother flagging outgoing connections as TOOMANY.
- REGTEST: Fix LEVEL 4 script 0 of "connection" module.
- MINOR: connection: Fix a comment.
- MINOR: mux: add a "max_streams" method.
- MEDIUM: servers: Add a way to keep idle connections alive.
- CLEANUP: fix typos in the htx subsystem
- CLEANUP: Fix typo in the chunk headers file
- CLEANUP: Fix typos in the h1 subsystem
- CLEANUP: Fix typos in the h2 subsystem
- CLEANUP: Fix a typo in the mini-clist header
- CLEANUP: Fix a typo in the proto_htx subsystem
- CLEANUP: Fix typos in the proto_tcp subsystem
- CLEANUP: Fix a typo in the signal subsystem
- CLEANUP: Fix a typo in the session subsystem
- CLEANUP: Fix a typo in the queue subsystem
- CLEANUP: Fix typos in the shctx subsystem
- CLEANUP: Fix typos in the socket pair protocol subsystem
- CLEANUP: Fix typos in the map management functions
- CLEANUP: Fix typo in the fwrr subsystem
- CLEANUP: Fix typos in the cli subsystem
- CLEANUP: Fix typo in the 51d subsystem
- CLEANUP: Fix a typo in the base64 subsystem
- CLEANUP: Fix a typo in the connection subsystem
- CLEANUP: Fix a typo in the protocol header file
- CLEANUP: Fix a typo in the checks header file
- CLEANUP: Fix typos in the file descriptor subsystem
- CLEANUP: Fix a typo in the listener subsystem
- BUG/MINOR: lb-map: fix unprotected update to server's score
- BUILD: threads: fix minor build warnings when threads are disabled
These potential null-deref warnings are emitted on gcc 7 and above
when threads are disabled due to the use of objt_server() after an
existing validity test. Let's switch to __objt_server() since we
know the pointer is valid, it will not confuse the compiler.
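
The difference between the two accessors can be illustrated with toy types.
The ot_*() names and structures below are hypothetical and only mimic the
pattern: a checked variant that may return NULL, and an unchecked one for
callers that have already validated the object type.

```c
#include <assert.h>
#include <stddef.h>

/* Toy object-type tagging, NOT the HAProxy obj_type definitions. */
enum ot_type { OT_NONE = 0, OT_SERVER };
struct ot_server { int id; enum ot_type obj_type; };

/* Unchecked variant: the caller guarantees <t> points into a server. */
static inline struct ot_server *ot_server_unchecked(enum ot_type *t)
{
    assert(t != NULL);
    return (struct ot_server *)((char *)t - offsetof(struct ot_server, obj_type));
}

/* Checked variant: returns NULL for a NULL pointer or another type. Because
 * it can return NULL, a compiler may warn about a possible null dereference
 * at call sites, even when an earlier test has already ruled it out. */
static inline struct ot_server *ot_server_checked(enum ot_type *t)
{
    if (!t || *t != OT_SERVER)
        return NULL;
    return ot_server_unchecked(t);
}
```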
Some of these may be backported to 1.8.
The loop trying to figure the best server is theoretically capable of
finishing the loop with best == NULL, causing the HA_ATOMIC_SUB()
to fail there. However for this to happen the list should be empty,
which is avoided at the beginning of the function. As it is, the
function still remains at risk so better address this now.
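
A minimal sketch of the defensive pattern, with hypothetical names
(struct pk_server, pk_pick_best()) and a plain decrement standing in for
HA_ATOMIC_SUB(); it is not the lb-map code.

```c
#include <assert.h>
#include <stddef.h>

struct pk_server { int score; };

/* Return the server with the highest score and charge it one unit, or NULL
 * if no candidate was found. The explicit test on <best> keeps the update
 * safe even if the loop body never runs. */
static inline struct pk_server *pk_pick_best(struct pk_server *srv, size_t n)
{
    struct pk_server *best = NULL;
    size_t i;

    assert(srv != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (!best || srv[i].score > best->score)
            best = &srv[i];
    }

    if (!best)
        return NULL;   /* nothing to update, nothing to dereference */

    best->score -= 1;  /* stand-in for the atomic subtraction */
    return best;
}
```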
This patch should be backported to 1.8.