This converter was recently introduced by commit ed0d24e ("MINOR:
sample: add len converter").
As found by Cyril, it causes an issue in "http-request capture"
statements. The non-obvious problem is that an old syntax for sample
expressions and converters used to support a series of words, each
representing a converter. This used to be how the "stick" directives
were created initially. By having a converter called "len", a
statement such as "http-request capture foo len 10" considers "len"
as a converter and not as the capture length.
This obsolete syntax needs to be changed in 1.9 but it's too late
for other versions. It's worth noting that the same problem can
happen if converters are registered on the fly using Lua. Other
language keywords that currently have to be avoided in converters
include "id", "table", "if", "unless".
Randomly, haproxy could fail to start when a "http-request capture"
action is defined, without any change to the configuration. The issue
depends on the memory content, which may raise a fatal error like :
unable to find capture id 'xxxx' referenced by http-request capture
rule
Commit fd608dd2 already prevents the condition to happen, but this one
should be included for completeness and to reclect the code on the
response side.
The issue was introduced recently by commit 29730ba5 and should only be
backported to haproxy 1.8.
Adrian Williams reported that several balancing methods were broken and
sent all requests to one backend. This is a regression in haproxy 1.8 where
the server score was not correctly recalculated.
This fix must be backported to the 1.8 branch.
When an end of stream has been reported, we should not try to receive again
as the mux layer might not be prepared to this and could report unexpected
errors.
This is more of a strengthening measure that follows the introduction of
conn_stream that came in 1.8. It's desired to backport this into 1.8 though
it's uncertain at this time whether it may have caused real issues.
Commit 4974561 ("BUG/MEDIUM: h2: enforce the per-connection stream limit")
implemented a stream limit enforcement on the connection but it was not
correctly done as it would count streams still known by the connection,
which includes the lingering ones that are already marked close. We need
to count only the non-closed ones, which this patch does. The effect is
that some streams are rejected a bit before the limit.
This fix needs to be backported to 1.8.
The HTTP forwarding engine needs to disable lingering on requests in
case the connection to the server has to be suddenly closed due to
http-server-close being used, so that we don't accumulate lethal
TIME_WAIT sockets on the outgoing side. A problem happens when the
server doesn't advertise a response size, because the response
message quickly goes through the MSG_DONE and MSG_TUNNEL states,
and once the client has transferred all of its data, it turns to
MSG_DONE and immediately sets NOLINGER and closes before the server
has a chance to respond. The problem is that this destroys some of
the pending DATA being uploaded, the server doesn't receive all of
them, detects an error and closes.
This early NOLINGER is inappropriate in this situation because it
happens before the response is transmitted. This state transition
to MSG_TUNNEL doesn't happen when the response size is known since
we stay in MSG_DATA (and related states) during all the transfer.
Given that the issue is only related to connections not advertising
a response length and that by definition these connections cannot be
reused, there's no need for NOLINGER when the response's transfer
length is not known, which can be verified when entering the CLOSED
state. That's what this patch does.
This fix needs to be backported to 1.8 and very likely to 1.7 and
older as it affects the very rare case where a client immediately
closes after the last uploaded byte (typically a script). However
given that the risk of occurrence in HTTP/1 is extremely low, it is
probably wise to wait before backporting it before 1.8.
Tunnelled responses are those without a content-length nor a chunked
encoding. They are specially dealt with in the current code but the
behaviour is not correct. The fact that the chunk size is left to zero
with a state artificially set to CHUNK_SIZE validates the test on
whether or not to set the end of stream flag. Thus the first DATA
frame always carries the ES flag and subsequent ones remain blocked.
This patch fixes it in two ways :
- update h1m->curr_len to the size of the current buffer so that it
is properly subtracted later to find the real end ;
- don't set the state to CHUNK_SIZE when there's no content-length
and instead set it to CHUNK_SIZE only when there's chunking.
This fix needs to be backported to 1.8.
We used to switch the stream's state to HREM when seeing and ES bit on
the DATA frame before actually being able to process that frame, possibly
resulting in the DATA frame being processed after the stream was seen as
half-closed and possibly being rejected. The state must not change before
the frame is really processed.
Also fixes a harmless typo in the flag name which should have DATA and
not HEADERS in its name (but all values are equal).
Must be backported to 1.8.
Since last commit it's not required that the DATA frames are complete anymore
so better start with what we have. Only the HEADERS frame requires this. This
may be backported as part of the upload fixes.
We currently have a problem with DATA frames when they don't fit into
the destination buffer. While it was imagined that in theory this never
happens, in practice it does when "option http-buffer-request" is set,
because the headers don't leave the target buffer before trying to read
so if the frame is full, there's never enough room.
This fix consists in reading what can be read from the frame and advancing
the input buffer. Once the contents left are only the padding, the frame
is completely processed. This also solves another problem we had which is
that it was possible to fill a request buffer beyond its reserve because
the <count> argument was not respected in h2_rcv_buf(). Thus it's possible
that some POST requests sent at once with a headers+body filling exactly a
buffer could result in "400 bad req" when trying to add headers.
This fix must be backported to 1.8.
We'll try to process partial frames and for this we need to know the
padding length. The first step requires to extract it during the parsing
and store it in the demux context in the connection. Till now it was only
processed at once.
Even after previous commit ("BUG/MEDIUM: h2: work around a connection
API limitation") there is still a problem with some requests. Sometimes
when polling for more request data while some pending data lies in the
buffer, there's no way to enter h2_recv() because the FD is not marked
ready for reading.
We need to slightly change the approach and make h2_recv() only receive
from the buffer and h2_wake() always attempt to demux if the demux is not
blocked.
However, if the connection is already being polled for reading, it will
not wake up from polling. For this reason we need to cheat and also
pretend a request for sending data, which ensures that as soon as any
direction may move, we can continue to demux. This shows that in the
long term we probably need a better way to resume an interrupted
operation at the mux level.
With this fix, no more hangups happen during uploads. Note that this
time the setup required to provoke the hangups was a bit complex :
- client is "curl" running on local host, uploading 1.7 MB of
data via haproxy
- haproxy running on local host, forwarding to a remote server
through a 100 Mbps only switch
- timeouts disabled on haproxy
- remote server made of thttpd executing a cgi reading request data
through "dd bs=10" to slow down everything.
With such a setup, around 3-5% of the connections would hang up.
This fix needs to be backported to 1.8.
The connection API permits us to enable or disable receiving on a
connection. The underlying FD layer arranges this with the polling
and the fd cache. In practice, if receiving was allowed and an end
of buffer was reached, the FD is subscribed to the polling. If later
we want to process pending data from the buffer, we have to enable
receiving again, but since it's already enabled (in polled mode),
nothing happens and the pending data remain stuck until a new event
happens on the connection to wake the FD up. This is a limitation of
the internal connection API which is not very friendly to the new mux
architecture.
The visible effect is that certain uploads to slow servers experience
truncation on timeout on their last blocks because nothing new comes
from the connection to wake it up while it's being polled.
In order to work around this, there are two solutions :
- either cheat on the connection so that conn_update_xprt_polling()
always performs a call to fd_may_recv() after fd_want_recv(), that
we can trigger from the mux by always calling conn_xprt_stop_recv()
before conn_xprt_want_recv(), but that's a bit tricky and may have
side effects on other parts (eg: SSL)
- or we refrain from receiving in the mux as soon as we're busy on
anything else, regardless of whether or not some room is available
in the receive buffer.
This patch takes the second approach above. This way once we read some
data, as soon as we detect that we're stuck, we immediately stop receiving.
This ensures the event doesn't go into polled mode for this period and
that as soon as we're unstuck we can continue. In fact this guarantees
that we can only wait on one side of the mux for a given direction. A
future improvement of the connection layer should make it possible to
resume processing of an interrupted receive operation.
This fix must be backported to 1.8.
In order to allow demuxing when the dmux buffer is full, we need to
enable data receipt in multiple conditions. Since the conditions are a
bit complex, they have been delegated to a new function h2_recv_allowed()
which follows these rules :
- if an error or a shutdown was detected on the connection and the buffer
is empty, we must not attempt to receive
- if the demux buf failed to be allocated, we must not try to receive and
we know there is nothing pending
- if the buffer is not full, we may attempt to receive
- if no flag indicates a blocking condition, we may attempt to receive
- otherwise must may not attempt
No more truncated payloads are detected in tests anymore, which seems to
indicate that the issue was worked around. A better connection API will
have to be created for new versions to make this stuff simpler and more
intuitive.
This fix needs to be backported to 1.8 along with the rest of the patches
related to CS_FL_RCV_MORE.
If we can't demux pending data due to a stream buffer full condition, we
now set CS_FL_RCV_MORE on the conn_stream so that the stream layer knows
it must call back as soon as possible to restart demuxing. Without this,
some uploaded payloads are truncated if the server does not consume them
fast enough and buffers fill up.
Note that this is still not enough to solve the problem, some changes are
required on the recv() and update_poll() paths to allow to restart reading
even with a buffer full condition.
This patch must be backported to 1.8.
When a stream interface tries to read data from a mux using rcv_buf(),
sometimes it sees 0 as the return value and concludes that there's no
more data while there are, resulting in the connection being polled for
more data and no new attempt being made at reading these pending data.
Now it will automatically check for flag CS_FL_RCV_MORE to know if the
mux really did not have anything available or was not able to provide
these data by lack of room in the destination buffer, and will set
SI_FL_WAIT_ROOM accordingly. This will ensure that once current data
lying in the buffer are forwarded to the other side, reading chk_rcv()
will be called to re-enable reading.
It's important to note that in practice it will rely on the mux's
update_poll() function to re-enable reading and that where the calls
are placed in the stream interface, it's not possible to perform a
new synchronous rcv_buf() call. Thus a corner case remains where the
mux cannot receive due to a full buffer or any similar condition, but
needs to be able to wake itself up to deliver pending data. This is a
limitation of the current connection/conn_stream API which will likely
need a new event subscription to at least call ->wake() asynchronously
(eg: mux->{kick,restart,touch,update} ?).
For now the affected mux (h2 only) will have to take care of the extra
logic to carefully enable polling to restart processing incoming data.
This patch relies on previous one (MINOR: conn_stream: add new flag
CS_FL_RCV_MORE to indicate pending data) and both must be backported to
1.8.
Due to the nature of multiplexed protocols, it will often happen that
some operations are only performed on full frames, preventing any partial
operation from being performed. HTTP/2 is one such example. The current
MUX API causes a problem here because the rcv_buf() function has no way
to let the stream layer know that some data could not be read due to a
lack of room in the buffer, but that data are definitely present. The
problem with this is that the stream layer might not know it needs to
call the function again after it has made some room. And if the frame
in the buffer is not followed by any other, nothing will move anymore.
This patch introduces a new conn_stream flag CS_FL_RCV_MORE whose purpose
is to indicate on the stream that more data than what was received are
already available for reading as soon as more room will be available in
the buffer.
This patch doesn't make use of this flag yet, it only declares it. It is
expected that other similar flags may come in the future, such as reports
of pending end of stream, errors or any such event that might save the
caller from having to poll, or simply let it know that it can take some
actions after having processed data.
The thread patches adds refcount for notifications. The notifications are
used with the Lua cosocket. These refcount free the notifications when
the session is cleared. In the Lua task case, it not have sessions, so
the nofications are never cleraed.
This patch adds a garbage collector for signals. The garbage collector
just clean the notifications for which the end point is disconnected.
This patch should be backported in 1.8
This variable was used by the wrapper which was removed in
a6cfa9098e. The correct way to do seamless reload is now to enable
"expose-fd listeners" on the stat socket.
In register_name, before locking the var_names array, we check the variable name
validity. So if we try to register an invalid or empty name, we need to return
without unlocking it (because it was never locked).
This patch must be backported in 1.8.
This avoids possible 100% cpu usage deadlock on a EMAIL_ALERTS_LOCK and
avoids sending lots of emails when 'option log-health-checks' is used.
It is avoided to change the server state and possibly queue a new email
while processing the email alert by setting check->status to
HCHK_STATUS_UNKNOWN which will exit the set_server_check_status(..) early.
This needs to be backported to 1.8.
Commit 4cfede87a3 removed
`exit-on-failure` in favor of `no-exit-on-failure`, but failed
to update references to the former in user facing messages.
This should be backported to haproxy 1.8.
Commit 9470d2c ("BUG/MINOR: h2: try to abort closed streams as
soon as possible") tried to address the situations where a stream
is closed by the client, but caused a side effect which is that in
some cases, a regularly closed stream reports an error to the stream
layer. The reason is that we purposely matched H2_SS_CLOSED in the
test for H2_SS_ERROR to report this so that we can check for RST,
but it accidently catches certain end of transfers as well. This
results in valid requests to report flags "CD" in the logs.
Instead, let's roll back to detecting H2_SS_ERROR and explicitly check
for a received RST. This way we can correctly abort transfers without
mistakenly reporting errors in normal situations.
This fix needs to be backported to 1.8 as the fix above was merged into
1.8.1.
Since peers were ported to an applet in 1.5, an issue appeared which
is that certain attempts to close an outgoing connection are a bit
"too nice". Specifically, protocol errors and stream timeouts result
in a clean shutdown to be sent, waiting for the other side to confirm.
This is particularly problematic in the case of timeouts since by
definition the other side will not confirm as it has disappeared.
As found by Fred, this issue was further emphasized in 1.8 by commit
f9ce57e ("MEDIUM: connection: make conn_sock_shutw() aware of
lingering") which causes clean shutdowns not to be sent if the fd is
marked as linger_risk, because now even a clean timeout will not be
sent on an idle peers session, and the other one will have nothing
to respond to.
The solution here is to set NOLINGER on the outgoing stream interface
to ensure we always close whenever we attempt a simple shutdown.
However it is important to keep in mind that this also underlines
some weaknesses of the shutr/shutw processing inside process_stream()
and that all this part needs to be reworked to clearly consider the
abort case, and to stop the confusion between linger_risk and NOLINGER.
This fix needs to be backported as far as 1.5 (all versions are affected).
However, during testing of the backport it was found that 1.5 never tries
to close the peers connection on timeout, so it suffers for another issue.
The new admin state was not correctly commited in this case.
Checks were fully disabled but the server was not marked in MAINT state.
It results with a server definitely stucked on the DOWN state.
This patch should be backported on haproxy 1.8
The number of async fd is computed considering the maxconn, the number
of sides using ssl and the number of engines using async mode.
This patch should be backported on haproxy 1.8
There's a nasty case related to signaling all processes via SIGUSR1.
Since the master process still holds the peers sockets, the old process
trying to connect to the new one to teach it its tables has a risk to
connect to the master instead, which will not do anything, causing the
old process to hang instead of quitting.
This patch ensures we correctly close the peers in the master process
on startup, just like it is done for proxies. Ultimately we would rather
have a complete list of listeners to avoid such issues. But that's a bit
trickier as it would require using unbind_all() and avoiding side effects
the master could cause to other processes (like unlinking unix sockets).
To be backported to 1.8.
Since the split of the shctx and the ssl cache, we lost the ability to
disable the cache with tune.ssl.cachesize 0.
Worst than that, when using this configuration, haproxy segfaults during
the configuration parsing.
Must be backported to 1.8.
In hpack_dht_make_room(), we try to fulfill this rule form RFC7541#4.4 :
"It is not an error to attempt to add an entry that is larger than the
maximum size; an attempt to add an entry larger than the maximum size
causes the table to be emptied of all existing entries and results in
an empty table."
Unfortunately it is not consistent with the way it's used in
hpack_dht_insert() as this last one will consider a success as a
confirmation it can copy the header into the table, and a failure as
an indexing error. This results in the two following issues :
- if a client sends too large a header into an empty table, this
header may overflow the table. Fortunately, most clients send
small headers like :authority first, and never mark headers that
don't fit into the table as indexable since it is counter-productive ;
- if a client sends too large a header into a populated table, the
operation fails after the table is totally flushed and the request
is not processed.
This patch fixes the two issues at once :
- a header not fitting into an empty table is always a sign that it
will never fit ;
- not fitting into the table is not an error
Thanks to Yves Lafon for reporting detailed traces demonstrating this
issue. This fix must be backported to 1.8.
This is a regression in the commit 29730ba5 ("MINOR: action: Add a functions to
check http capture rules"). We must check the capture id only when an id is
defined.
This patch must be backported in 1.8.
This is explicitly forbidden by 7540#8.1.2, and may be used to bypass
some of the other filters, so they must be blocked early. It removes
another issue reported by h2spec.
To backport to 1.8.
h2spec rightfully outlines that we used not to reject these ones, and
they may cause trouble if presented, especially "upgrade".
Must be backported to 1.8.
At the moment there's only ":status". Let's block it early when parsing
the request. Otherwise it would be blocked by the HTTP/1 code anyway.
This silences another h2spec issue.
To backport to 1.8.
We currently don't use stream dependencies, but as reported by h2spec,
the spec requires that we reject streams that depend on themselves in
HEADERS frames.
To backport to 1.8.
h2spec reports that we unfortunately didn't enforce the per-connection
stream limit that we advertise. It's important to ensure it's never
crossed otherwise it's cheap for a client to create many streams. This
requires the addition of a stream count. The h2c struct could be cleaned
up a bit, just like the h2_detach() function where an "if" block doesn't
make sense anymore since it's always true.
To backport to 1.8.
As reported by h2spec, the h2->h1 gateway doesn't verify that ":path"
is not empty. This is harmless since the H1 parser will reject such a
request, but better fix it anyway.
To backport to 1.8.
The purpose here is to be able to signal receipt of RST_STREAM to
streams when they start to provide a response so that the response
can be aborted ASAP. Given that RST_STREAM immediately switches the
stream to the CLOSED state, we must check for CLOSED in addition to
the existing ERROR check.
To be backported to 1.8.
The h2spec test suite reveals that a GOAWAY frame received after the
last stream doesn't cause an immediate close, because we count on the
last stream to quit to do so. By simply setting the last_sid to the
received value in case it was not set, we can ensure to properly close
an idle connection during h2_wake().
To be backported to 1.8.
Due to a typo in the request maximum length calculation, we count the
request path twice instead of counting it added to the method's length.
This has two effects, the first one being that a path cannot be larger
than half a buffer, and the second being that the method's length isn't
properly checked. Due to the way the temporary buffers are used internally,
it is quite difficult to meet this condition. In practice, the only
situation where this can cause a problem is when exactly one of either
the method or the path are compressed and the other ones is sent as a
literal.
Thanks to Yves Lafon for providing useful traces exhibiting this issue.
To be backported to 1.8.
h2spec reports that we used to support a dynamic table size update
anywhere in the header block but it's only allowed before other
headers (cf RFC7541#4.2.1). In practice we don't use these for now
since we only use literals in responses.
To backport to 1.8.
If the hpack decoder sees an invalid header index, it emits value
"### ERR ###" that was used during debugging instead of rejecting the
block. This is harmless, and was detected by h2spec.
To backport to 1.8.