Connection headers are now sanitized during the parsing and the formatting. This
means "close" and "keep-alive" values are always removed but right flags are
set. This way, client side and server side are independent of each other. On the
input side, after the parsing, neither "close" nor "keep-alive" values
remain. So on the output side, if we found one of these values in a connection
headers, it means it was explicitly added by HAProxy. So it overwrites the other
rules, if applicable. Always sanitizing the output is also a way to simplifiy
conditions to update the connection header. Concretly, only additions of "close"
or "keep-alive" values remain, depending the case.
No need to backport this patch.
The flag H1_MF_CLEAN_CONN_HDR has been added to let the H1 parser sanitize
connection headers. It means it will remove all "close" and "keep-alive" values
during the parsing. One noticeable effect is that connection headers may be
unfolded. In practice, this is not a problem because it is not frequent to have
multiple values for the connection headers.
If this flag is set, during the parsing The function
h1_parse_next_connection_header() is called in a loop instead of
h1_parse_conection_header().
No need to backport this patch
On the client side, as far as possible, we will try to keep connection
alive. So, in most of cases, this header will be removed. So it is better to not
add it at all. If finally the connection must be closed, the header will be
added by the mux h1.
No need to backport this patch.
Because of previous changes on http tunneling, the synchronization of the
transaction can be simplified. Only the check on intermediate messages remains
and it only concerns the response path.
This patch must be backported to 1.9. It is not strictly speaking required but
it will ease futur backports.
In HTX, the HTTP tunneling does not work if h1 and h2 are mixed (an h1 client
sending requests to an h2 server or this opposite) because the h1 multiplexer
always adds an EOM before switching it to tunnel mode. The h2 multiplexer
interprets it as an end of stream, closing the stream as for any other
transaction.
To make it works again, we need to swith to the tunnel mode without emitting any
EOM blocks. Because of that, HTX analyzers have been updated to switch the
transaction to tunnel mode before end of the message (because there is no end of
message...).
To be consistent, the protocol switching is also handled the same way even
though the 101 responses are not supported in h2.
This patch must be backported to 1.9.
Because the option http-tunnel is now ignored in HTX, there is no longer any
need to adjust the transaction mode in HTX analyzers. A channel can still be
switch to the tunnel mode for legitimate cases (HTTP CONNECT or switching
protocols). So the function htx_adjust_conn_mode() is now useless.
This patch must be backported to 1.9. It is not strictly speaking required but
it will ease futur backports.
The option http-tunnel disables any HTTP processing past the first
transaction. In HTX, it works for full h1 transactions. As for the legacy HTTP,
it is a workaround, but it works. But it is impossible to make it works with an
h2 connection. In such case, it has no effect, the stream is closed at the end
of the transaction. So to avoid any inconsistancies between h1 and h2
connections, this option is now always ignored when the HTX is enabled. It is
also a good opportinity to deprecate an old and ugly option. A warning is
emitted during HAProxy startup to encourage users to remove this option.
Note that in legacy HTTP, this option only works with full h1 transactions
too. If an h2 connection is established on a frontend with this option enabled,
it will have no effect at all. But we keep it for the legacy HTTP for
compatibility purpose. It will be removed with the legacy HTTP.
So to be short, if you have to really (REALLY) use it, it will only work for
legacy HTTP frontends with H1 clients.
The documentation has been updated accordingly.
This patch must be backported to 1.9. It is not strictly speaking required but
it will ease futur backports.
If there is a data block when a header block is added in a HTX message, its
payload will be inserted after the data block payload. But its index will be
moved before the EOH block. So at this stage, if a new data block is added, we
will try to append its payload to the last data block (because it is also the
tail). Thus the payload of the further header block will be crushed.
This cannot happens if the payloads wrap thanks to the previous fix. But it
happens when the tail is not the front too. So now, in this case, we add a new
block instead of appending.
This patch must be backported in 1.9.
When a header is added or when a data block is added before another one, the
blocks position may be changed (but not their payloads position). For instance,
when a header is added, we move the block just before the EOH, if any. When the
payloads wraps, it is pretty annoying because we loose the last inserted
block. It is neither the tail nor the head. And it is not the front either.
It is a design problem. Waiting for fixing this problem, we force a
defragmentation in such case. Anyway, it should be pretty rare, so it's not
really critical.
This patch must be backported to 1.9.
When a message or a fragment is encoded, the date the frame processing starts
must be set if it is undefined. The test on tv_request field was wrong.
This patch must be backported to 1.9.
If the maximum frame size is very small with a large message or argument name,
it is possible to be unable to encode anything. In such case, it is important to
stop processing returning an error otherwise we will retry in loop to encode the
message, failing each time because of the too small frame size.
This patch must be backported to 1.9 and 1.8.
If a SPOE applet is already attached to a stream to handle its messages, we must
not queue them. Otherwise it could be handled by another applet leading to
errors. This happens with fragmented messages only. When the first framgnent is
sent, the SPOE applet sending it is attached to the stream. It should be used to
send all other fragments.
This patch must be backported to 1.9 and 1.8.
Seeing the size of each ring helps understand which threads are
overloaded and why some of them are less often elected than others
by the multi-queue load balancer.
The "show activity" command reports the number of incoming connections
dispatched per thread but doesn't report the number of connections
received by each thread. It is important to be able to monitor this
value as it can show that for whatever reason a smaller set of threads
is receiving the connections and dispatching them to all other ones.
It is not acceptable that the accept queues are handled with a normal
priority since they are supposed to quickly dispatch the incoming
traffic, resulting in tasks which will have their respective nice
values and place in the queue. Let's renice the accept ring tasks
to -1024.
No backport is needed, this is strictly 2.0.
The run queue offset computed from the nice value depends on the run
queue size, but for the first task to enter the run queue, this size
is zero and the task gets queued just as if its nice value was zero as
well. This is problematic for example for the CLI socket if another
higher priority task gets queued immediately after as it can steal its
place.
This patch simply adds one to the rq_size value to make sure the nice
is never multiplied by zero. The way the offset is calculated is
questionable anyway these days, since with the newer scheduler it seems
that just using the nice value as an offset should work (possibly damped
by the task's number of calls).
This fix must be backported to 1.9. It may possibly be backported to
older versions if it proves to make the CLI more interactive.
It is possible to hit a fairness issue in the scheduler when a local
task runs for a long time (i.e. process_stream() returns running), and
a global task wants to run on the same thread and remains in the global
queue. What happens in this case is that the condition to extract tasks
from the global queue will rarely be satisfied for very low task counts
since whatever non-null queue size multiplied by a thread count >1 is
always greater than the small remaining number of tasks in the queue.
In theory another thread should pick the task but we do have some mono
threaded tasks in the global queue as well during inter-thread wakeups.
Note that this can only happen with task counts lower than the thread
counts, typically one task in each queue for more than two threads.
This patch works around the problem by allowing a very small unfairness,
making sure that we can always pick at least one task from the global
queue even if there is already one in the local queue.
A better approach will consist in scanning the two trees in parallel
and always pick the best task. This will be more complex and will
constitute a separate patch.
This fix must be backported to 1.9.
If the interface is not in state SI_ST_CON or SI_ST_EST, don't bother
trying to send/recv data, we can't do it anyway, and if we're in SI_ST_TAR,
that may lead to adding the SI_FL_ERR flag back on the stream_interface,
while we don't want it.
This should be backported to 1.9.
In process_stream(), only try again when there's the SI_FL_ERR flag and we're
in a connected state, otherwise we can loop forever.
It used to work because si_update_both() bogusly removed the SI_FL_ERR flag,
and it would never be set at this point. Now it does, so take that into
account.
Many, many thanks to Maciej Zdeb for reporting the problem, and helping
investigating it.
This should be backported to 1.9.
For LOG_FMT_TS (%Ts), the tm variable is not used, so save some cycles
on the call to get_gmtime.
Backport: 1.9 1.8
Signed-off-by: Robin H. Johnson <rjohnson@digitalocean.com>
Pavlos Parissis reported an interesting case where some map identifiers
were not assigned (appearing as -1 in show map). It turns out that it
only happens for log-format expressions parsed in check_config_validity()
that involve maps (log-format, use_backend, unique-id-header), as in the
sample configuration below :
frontend foo
bind :8001
unique-id-format %[src,map(addr.lst)]
log-format %[src,map(addr.lst)]
use_backend %[src,map(addr.lst)]
The reason stems from the initial introduction of unique IDs in 1.5 via
commit af5a29d5f ("MINOR: pattern: Each pattern is identified by unique
id.") : the unique_id assignment was done before calling
check_config_validity() so all maps loaded after this call are not
properly configured. From what the function does, it seems they will not
be able to use a cache, will not have a unique_id assigned and will not
be updatable from the CLI.
This fix must be backported to all supported versions.
Use cpuset_getaffinity() to implement thread_cpus_enabled() on FreeBSD, so
that we can know the number of CPUs available, and automatically launch as
much threads if nbthread isn't specified.
When creating a new initcall, don't forget to define the symbols, as it may
not be done automatically and that would lead to undefined symbols.
This should be backported to 1.9.
In commit d7704b534, we introduced and expiration flag on the stream interface,
which is used for the connect, the queue and the turn around. Because the
turn around state isn't an error, the flag was reset in process_stream(), and
later in commit cff6411f9 when introducing the SI_FL_ERR flag, the cleanup
of the flag at this place was erroneously generalized.
To fix this, the SI_FL_EXP flag is only cleared at the end of the turn around
state, and nobody should clear the stream interface flags anymore.
This should be backported to 1.9, it has no known impact on older versions.
As si_update_both() sets prev_state to state for each stream_interface, if
we want to check it changed, copy it before calling si_update_both().
This should be backported to 1.9.
Don't inconditionally remove the SI_FL_ERR code in si_update_both(), which
is called at the end of process_stream(). Doing so was a bug that was there
since the flag was introduced, because we were always setting si->flags to
SI_FL_NONE, however we don't want to lose that one, except if we will retry
connecting, so only remove it in sess_update_st_cer().
This should be backported to 1.9.
It can happen in some cases that the last block of an H2 transfer over
HTX is truncated. This was tracked down to a leftover of an earlier
implementation of htx_xfer_blks() causing the computed size of a block
to be incorrectly calculated if a data block doesn't completely fit into
the target buffer. In practice it causes the EOM block to be attempted to
be emitted with a wrong size and the message to be truncated. One way to
reproduce this is to chain two haproxy instances in h1->h2->h1 with
httpterm as the server and h2load as the client, making many requests
between 8 and 10kB over a single connection. Usually one of the very
first requests will fail.
This fix must be backported to 1.9.
Modify h2c_restart_reading() to add a new parameter, to let it know if it
should consider if the buffer isn't empty when retrying to read or not, and
call h2c_restart_reading() using 0 as a parameter from h2_process_demux().
If we're leaving h2_process_demux() with a non-empty buffer, it means the
frame is incomplete, and we're waiting for more data, and if we already
subscribed, we'll be waken when more data are available.
Failing to do so means we'll be waken up in a loop until more data are
available.
This should be backported to 1.9.
The deinit took place in only peer_session_release, but in the a case of a
previous call to peer_session_forceshutdown, the session cursors
won't be reset, resulting in a bad state for new session of the same
peer. For instance, a table definition message could be dropped and
so all update messages will be dropped by the remote peer.
This patch move the deinit processing directly in the force shutdown
funtion. Killed session remains in "ST_END" state but ref on peer was
reset to NULL and deinit will be skipped on session release function.
The session release continue to assure the deinit for "active" sessions.
This patch should be backported on all stable version since proto
peers v2.
To do so, the server does not send anything. Instead it waits 2ms before
closing. The client, on its side, will wait for a response. So it will be
blocked. Becauase the client timeout is set to 1ms, HAProxy should always close
the client connection because it times out.
Because HAProxy may decide to close 301 responses, as others internal responses,
it is safer to use a different client for these requests. This is not the
purpose of this test to verify the keep-alive in such cases.
The way unexpected bodies are handled for responses to HEAD requests differs from
the legacy HTTP to the HTX. While it is dropped wih the legacy HTTP, in HTX, it
is parsed as the response to the next request. So, in HTX, a 502 error is
returned to the client and the connexion is closed.
This test has been modified to pass in both mode.
Size reported in logs may differ between legacy HTTP and HTX, at least for
now. So in the regtest http-capture/h00000.vtc, we need to relax the regex
matching the log message.
Because we try to forward infinitly message body, when its state is set to DONE,
we must be sure to reset to_foward value of the corresponding
channel. Otherwise, some errors can be errornously triggered.
No need to backport this patch.
Displays a prefix for every addresses in 'show cli sockets'.
It could be 'unix@', 'ipv4@', 'ipv6@', 'abns@' or 'sockpair@'.
Could be backported in 1.9 and 1.8.
The 'show cli sockets' was not handling the abns sockets. This is a
problem since it uses the AF_UNIX family, it displays nothing
in the path column because the path starts by \0.
Should be backported to 1.9 and 1.8.
This patch implements the external binary support in the master worker.
To configure an external process, you need to use the program section,
for example:
program dataplane-api
command ./dataplane_api
Those processes are launched at the same time as the workers.
During a reload of HAProxy, those processes are dealing with the same
sequence as a worker:
- the master is re-executed
- the master sends a USR1 signal to the program
- the master launches a new instance of the program
During a stop, or restart, a SIGTERM is sent to the program.
The children variable is still used in haproxy, it is not required
anymore since we have the information about the current workers in the
mworker_proc linked list.
The oldpids array is also replaced by this linked list when we
generated the arguments for the master reexec.