The check_status field in the CSV stats output is conditionally prefixed
with "* " if a check is currently underway. This can trip tools that
parse the CSV output and compare against a well known list of values.
This commit just adds this bit to the documentation.
Unfortunatly, a regression bug was introduced in the commit 1486b0ab
("BUG/MEDIUM: http: Switch HTTP responses in TUNNEL mode when body length is
undefined"). HTTP responses with undefined body length are blocked until timeout
when the compression is enabled. This bug was fixed in commit 69744d92
("BUG/MEDIUM: http: Fix blocked HTTP/1.0 responses when compression is
enabled").
The bug is still the same. We do not forward response data because we are
waiting for the synchronization between the HTTP request and the response.
To fix the bug, conditions to infinitly forward channel data has been slightly
relaxed. Now, it is done if there is no more analyzer registered on the channel
or if _FLT_END analyzer is still there but without the flag CF_FLT_ANALYZE. This
last condition is only possible when a channel is waiting the end of the other
side. So, fundamentally, it means that no one is analyzing the channel
anymore. This is a transitional state during a sync phase.
This patch must be backported in 1.7.
Patch "MINOR: ssl: support ssl-min-ver and ssl-max-ver with crt-list"
introduce ssl_methods in struct ssl_bind_conf. struct bind_conf have now
ssl_methods and ssl_conf.ssl_methods (unused). It's error-prone. This patch
remove the duplicate structure to avoid any confusion.
As per a recent mailing list discussion, suggesting specific cipher
settings is not too helpful, because they depend on a lot of factors,
ranging from client capabilities, available TLS libraries, new
security research, and others.
To avoid the documentation from become stale -- and potentially
wrong/dangerous, this commit adds links to Mozilla's well-reknowned
TLS blog, as well as to their configuration generator.
After careful inspection, this flag is set at exactly two places :
- once in the health-check receive callback after receipt of a
response
- once in the stream interface's shutw() code where CF_SHUTW is
always set on chn->flags
The flag was checked in the checks before deciding to send data, but
when it is set, the wake() callback immediately closes the connection
so the CO_FL_SOCK_WR_SH flag is also set.
The flag was also checked in si_conn_send(), but checking the channel's
flag instead is enough and even reveals that one check involving it
could never match.
So it's time to remove this flag and replace its check with a check of
CF_SHUTW in the stream interface. This way each layer is responsible
for its shutdown, this will ease insertion of the mux layer.
This flag is both confusing and wrong. It is supposed to report the
fact that the data layer has received a shutdown, but in fact this is
reported by CO_FL_SOCK_RD_SH which is set by the transport layer after
this condition is detected. The only case where the flag above is set
is in the stream interface where CF_SHUTR is also set on the receiving
channel.
In addition, it was checked in the health checks code (while never set)
and was always test jointly with CO_FL_SOCK_RD_SH everywhere, except in
conn_data_read0_pending() which incorrectly doesn't match the second
time it's called and is fortunately protected by an extra check on
(ic->flags & CF_SHUTR).
This patch gets rid of the flag completely. Now conn_data_read0_pending()
accurately reports the fact that the transport layer has detected the end
of the stream, regardless of the fact that this state was already consumed,
and the stream interface watches ic->flags&CF_SHUTR to know if the channel
was already closed by the upper layer (which it already used to do).
The now unused conn_data_read0() function was removed.
The session may need to enforce a timeout when waiting for a handshake.
Till now we used a trick to avoid allocating a pointer, we used to set
the connection's owner to the task and set the task's context to the
session, so that it was possible to circle between all of them. The
problem is that we'll really need to pass the pointer to the session
to the upper layers during initialization and that the only place to
store it is conn->owner, which is squatted for this trick.
So this patch moves the struct task* into the session where it should
always have been and ensures conn->owner points to the session until
the data layer is properly initialized.
Historically listeners used to have a handler depending on the upper
layer. But now it's exclusively process_stream() and nothing uses it
anymore so it can safely be removed.
Currently a task is allocated in session_new() and serves two purposes :
- either the handshake is complete and it is offered to the stream via
the second arg of stream_new()
- or the handshake is not complete and it's diverted to be used as a
timeout handler for the embryonic session and repurposed once we land
into conn_complete_session()
Furthermore, the task's process() function was taken from the listener's
handler in conn_complete_session() prior to being replaced by a call to
stream_new(). This will become a serious mess with the mux.
Since it's impossible to have a stream without a task, this patch removes
the second arg from stream_new() and make this function allocate its own
task. In session_accept_fd(), we now only allocate the task if needed for
the embryonic session and delete it later.
The ->init() callback of the connection's data layer was only used to
complete the session's initialisation since sessions and streams were
split apart in 1.6. The problem is that it creates a big confusion in
the layers' roles as the session has to register a dummy data layer
when waiting for a handshake to complete, then hand it off to the
stream which will replace it.
The real need is to notify that the transport has finished initializing.
This should enable a better splitting between these layers.
This patch thus introduces a connection-specific callback called
xprt_done_cb() which informs about handshake successes or failures. With
this, data->init() can disappear, CO_FL_INIT_DATA as well, and we don't
need to register a dummy data->wake() callback to be notified of errors.
The stream interface chk_snd() code checks if the connection has already
subscribed to write events in order to avoid attempting a useless write()
which will fail. But it used to check both the CO_FL_CURR_WR_ENA and the
CO_FL_DATA_WR_ENA flags, while the former may only be present without the
latterif either the other side just disabled writing did not synchronize
yet (which is harmless) or if it's currently performing a handshake, which
is being checked by the next condition and will be better dealt with by
properly subscribing to the data events.
This code was added back in 1.5-dev20 to limit the number of useless calls
to splice() but both flags were checked at once while only CO_FL_DATA_WR_ENA
was needed. This bug seems to have no impact other than making code changes
more painful. This fix may be backported down to 1.5 though is unlikely to
be needed there.
Till now connections used to rely exclusively on file descriptors. It
was planned in the past that alternative solutions would be implemented,
leading to member "union t" presenting sock.fd only for now.
With QUIC, the connection will need to continue to exist but will not
rely on a file descriptor but a connection ID.
So this patch introduces a "connection handle" which is either a file
descriptor or a connection ID, to replace the existing "union t". We've
now removed the intermediate "struct sock" which was never used. There
is no functional change at all, though the struct connection was inflated
by 32 bits on 64-bit platforms due to alignment.
Haproxy doesn't need this anymore, we're wasting cycles checking for
a Connection header in order to add "Connection: close" only in the
1.1 case so that haproxy sees it and removes it. All tests were run
in 1.0 and 1.1, with/without the request header, and in the various
keep-alive/close modes, with/without compression, and everything works
fine. It's worth noting that this header was inherited from the stats
applet and that the same cleanup probably ought to be done there as
well.
In the HTTP applet, we have to parse the response headers provided by
the application and to produce a response. strcasecmp() is expensive,
and chunk_append() even more as it uses a format string.
Here we check the string length before calling strcasecmp(), which
results in strcasecmp() being called only on the relevant header in
practise due to very few collisions on the name lengths, effectively
dividing the number of calls by 3, and we replace chunk_appendf()
with memcpy() as we already know the string lengths.
Doing just this makes the "hello-world" applet 5% faster, reaching
41400 requests/s on a core i5-3320M.
Commit 4850e51 ("BUG/MAJOR: lua: Do not force the HTTP analysers in
use-services") fixed a bug in how services are used in Lua, but this
fix broke the ability for Lua services to support keep-alive.
The cause is that we branch to a service while we have not yet set the
body analysers on the request nor the response, and when we start to
deal with the response we don't have any request analyser anymore. This
leads the response forward engine to detect an error and abort. It's
very likely that this also causes some random truncation of responses
though this has not been observed during the tests.
The root cause is not the Lua part in fact, the commit above was correct,
the problem is the implementation of the "use-service" action. When done
in an HTTP request, it bypasses the load balancing decisions and the
connect() phase. These ones are normally the ones preparing the request
analysers to parse the body when keep-alive is set. This should be dealt
with in the main process_use_service() function in fact.
That's what this patch does. If process_use_service() is called from the
http-request rule set, it enables the XFER_BODY analyser on the request
(since the same is always set on the response). Note that it's exactly
what is being done on the stats page which properly supports keep-alive
and compression.
This fix must be backported to 1.7 and 1.6 as the breakage appeared in 1.6.3.
The header's value was parsed with atoi() then compared against -1,
meaning that all the unparsable stuff returning zero was not considered
and that all multiples of 2^32 + 0xFFFFFFFF would continue to emit a
chunk.
Now instead we parse the value using a long long, only accept positive
values and consider all unparsable values as incorrect and switch to
either close or chunked encoding. This is more in line with what a
client (including haproxy's parser) would expect.
This may be backported as a cleanup to stable versions, though it's
really unlikely that Lua applications are facing side effects of this.
The following Lua code causes emission of a final chunk after the body,
which is wrong :
core.register_service("send204", "http", function(applet)
applet:set_status(204)
applet:start_response()
end)
Indeed, responses with status codes 1xx, 204 and 304 do not contain any
body and the message ends immediately after the empty header (cf RFC7230)
so by emitting a 0<CR><LF> we're disturbing keep-alive responses. There's
a workaround against this for now which consists in always emitting
"Content-length: 0" but it may not be cool with 304 when clients use
the headers to update their cache.
This fix must be backported to stable versions back to 1.6.
Commit d1aa41f ("BUG/MAJOR: lua: properly dequeue hlua_applet_wakeup()
for new scheduler") tried to address the side effects of the scheduler
changes on Lua, but it was not enough. Having some Lua code send data
in chunks separated by one second each clearly shows busy polling being
done.
The issue was tracked down to hlua_applet_wakeup() being woken up on
timer expiration, and returning itself without clearing the timeout,
causing the task to be re-inserted with an expiration date in the past,
thus firing again. In the past it was not a problem, as returning NULL
was enough to clear the timer. Now we can't rely on this anymore so
it's important to clear this timeout.
No backport is needed, this issue is specific to 1.8-dev and results
from an incomplete fix in the commit above.
Since commit 9d8dbbc ("MINOR: dns: Maximum DNS udp payload set to 8192") it's
possible to specify a packet size, but passing too large a size or a negative
size is not detected and results in memset() being performed over a 2GB+ area
upon receipt of the first DNS response, causing runtime crashes.
We now check that the size is not smaller than the smallest packet which is
the DNS header size (12 bytes).
No backport is needed.
Since the DNS layer split and the use of obj_type structure, we did not
updated propoerly the code used to compute the interval between 2
resolutions.
A nasty loop was then created when:
- resolver's hold.valid is shorter than servers' check.inter
- a valid response is available in the DNS cache
A task was woken up for a server's resolution. The servers pick up the IP
in the cache and returns without updating the 'last update' timestamp of
the resolution (which is normal...). Then the task is woken up again for
the same server.
The fix simply computes now properly the interval between 2 resolutions
and the cache is used properly while a new resolution is triggered if
the data is not fresh enough.
a reader pointer comparison to the end of the buffer was performed twice
while once is obviously enough.
backport status: this patch can be backported into HAProxy 1.6 (with some
modification. Please contact me)
For troubleshooting purpose, it may be important to know when a server
got its fqdn updated by a SRV record.
This patch makes HAProxy to report such events through stderr and logs.
RFC 6891 states that if a DNS client announces "big" payload size and
doesn't receive a response (because some equipments on the path may
block/drop UDP fragmented packets), then it should try asking for
smaller responses.
Following up DNS extension introduction, this patch aims at making the
computation of the maximum number of records in DNS response dynamic.
This computation is based on the announced payload size accepted by
HAProxy.
This patch fixes a bug where some servers managed by SRV record query
types never ever recover from a "no resolution" status.
The problem is due to a wrong function called when breaking the
server/resolution (A/AAAA) relationship: this is performed when a server's SRV
record disappear from the SRV response.
Contrary to 64-bits libCs where size_t type size is 8, on systems with 32-bits
size of size_t is 4 (the size of a long) which does not equal to size of uint64_t type.
This was revealed by such GCC warnings on 32bits systems:
src/flt_spoe.c:2259:40: warning: passing argument 4 of spoe_decode_buffer from
incompatible pointer type
if (spoe_decode_buffer(&p, end, &str, &sz) == -1)
^
As the already existing code using spoe_decode_buffer() already use such pointers to
uint64_t, in place of pointer to size_t ;), most of this code is in contrib directory,
this simple patch modifies the prototype of spoe_decode_buffer() so that to use a
pointer to uint64_t in place of a pointer to size_t, uint64_t type being the type
finally required for decode_varint().
The two macros EXPECT_LF_HERE and EAT_AND_JUMP_OR_RETURN were exported
for use outside the HTTP parser. They now take extra arguments to avoid
implicit pointers and jump labels. These will be used to reimplement a
minimalist HTTP/1 parser in the H1->H2 gateway.
This test file covers the various functions provided by ist.h. It allows
both to test them for absence of regression, and to observe the code
emitted at different optimization levels.
For HPACK we'll need to perform a lot of string manipulation between the
dynamic headers table and the output stream, and we need an efficient way
to deal with that, considering that the zero character is not an end of
string marker here. It turns out that gcc supports returning structs from
functions and is able to place up to two words directly in registers when
-freg-struct is used, which is the case by default on x86 and armv8. On
other architectures the caller reserves some stack space where the callee
can write, which is equivalent to passing a pointer to the return value.
So let's implement a few functions to deal with this as the resulting code
will be optimized on certain architectures where retrieving the length of
a string will simply consist in reading one of the two returned registers.
Extreme care was taken to ensure that the compiler gets maximum opportunities
to optimize out every bit of unused code. This is also the reason why no
call to regular string functions (such as strlen(), memcmp(), memcpy() etc)
were used. The code involving them is often larger than when they are open
coded. Given that strings are usually very small, especially when manipulating
headers, the time spent calling a function optimized for large vectors often
ends up being higher than the few cycles needed to count a few bytes.
An issue was met with __builtin_strlen() which can automatically convert
a constant string to its constant length. It doesn't accept NULLs and there
is no way to hide them using expressions as the check is made before the
optimizer is called. On gcc 4 and above, using an intermediary variable
is enough to hide it. On older versions, calls to ist() with an explicit
NULL argument will issue a warning. There is normally no reason to do this
but taking care of it the best possible still seems important.
We now refrain from clearing a session's variables, counters, and from
releasing it as long as at least one stream references it. For now it
never happens but with H2 this will be mandatory to avoid double frees.
Now each stream is added to the session's list of streams, so that it
will be possible to know all the streams belonging to a session, and
to know if any stream is still attached to a sessoin.
These two functions respectively copy a memory area onto the chunk, and
append the contents of a memory area over a chunk. They are convenient
to prepare binary output data to be sent and will be used for HTTP/2.
The "hold obsolete" timer is used to prevent HAProxy from moving a server to
an other IP or from considering the server as DOWN if the IP currently
affected to this server has not been seen for this period of time in DNS
responses.
That said, historically, HAProxy used to update servers as soon as the IP
has disappeared from the response. Current default timeout break this
historical behavior and may change HAProxy's behavior when people will
upgrade to 1.8.
This patch changes the default value to 0 to keep backward compatibility.
Edns extensions may be used to negotiate some settings between a DNS
client and a server.
For now we only use it to announce the maximum response payload size accpeted
by HAProxy.
This size can be set through a configuration parameter in the resolvers
section. If not set, it defaults to 512 bytes.
The function srv_set_fqdn() is used to update a server's fqdn and set
accordingly its DNS resolution.
Current implementation prevents a server whose update is triggered by a
SRV record from being linked to an existing resolution in the cache (if
applicable).
This patch aims at fixing this.
Current code implementation prevents multiple backends from relying on
the same SRV resolution. Actually, only the first backend which triggers
the resolution gets updated.
This patch makes HAProxy to process the whole list of the 'curr'
requesters to apply the changes everywhere (hence, the cache also applies
to SRV records...)
This function is particularly useful when debugging DNS resolution at
run time in HAProxy.
SRV records must be read differently, hence we have to update this
function.
DNS SRV records uses "dns name compression" to store the target name.
"dns compression" principle is simple. Let's take the name below:
3336633266663038.red.default.svc.cluster.local.
It can be stored "as is" in the response or it can be compressed like
this:
3336633266663038<POINTER>
and <POINTER> would point to the string
'.red.default.svc.cluster.local.' availble in the question section for
example.
This mechanism allows storing much more data in a single DNS response.
This means the flag "record->data_len" which stores the size of the
record (hence the whole string, uncompressed) can't be used to move the
pointer forward when reading responses. We must use the "offset" integer
which means the real number of bytes occupied by the target name.
If we don't do that, we can properly read the first SRV record, then we
loose alignment and we start reading unrelated data (still in the
response) leading to a false negative error treated as an "invalid"
response...