When using the peers feature a race condition could prevent
a connection from being properly counted. When this connection
exits it is being "uncounted" nonetheless, leading to a possible
underflow (-1) of the conn_curr stick table entry in the following
scenario :
- Connect to peer A (A=1, B=0)
- Peer A sends 1 to B (A=1, B=1)
- Kill connection to A (A=0, B=1)
- Connect to peer B (A=0, B=2)
- Peer A sends 0 to B (A=0, B=0)
- Peer B sends 0/2 to A (A=?, B=0)
- Kill connection to B (A=?, B=-1)
- Peer B sends -1 to A (A=-1, B=-1)
This fix may be backported to all supported branches.
The most significant change from 1.8 to >=1.9 is the buffer
data structure, using the new field and fixing along side
a little hidden compilation warning.
This must be backported to 1.9.
When an argument <draws> is present, it must be an integer value one
or greater, indicating the number of draws before selecting the least
loaded of these servers. It was indeed demonstrated that picking the
least loaded of two servers is enough to significantly improve the
fairness of the algorithm, by always avoiding to pick the most loaded
server within a farm and getting rid of any bias that could be induced
by the unfair distribution of the consistent list. Higher values N will
take away N-1 of the highest loaded servers at the expense of performance.
With very high values, the algorithm will converge towards the leastconn's
result but much slower. The default value is 2, which generally shows very
good distribution and performance. This algorithm is also known as the
Power of Two Random Choices and is described here :
http://www.eecs.harvard.edu/~michaelm/postscripts/handbook2001.pdf
Since all of them are exclusive, let's move them to an union instead
of eating memory with the sum of all of them. We're using a transparent
union to limit the code changes.
Doing so reduces the struct lbprm from 392 bytes to 372, and thanks
to these changes, the struct proxy is now down to 6480 bytes vs 6624
before the changes (144 bytes saved per proxy).
This one is a proxy option which can be inherited from defaults even
if the LB algo changes. Move it out of the lb_chash struct so that we
don't need to keep anything separate between these structs. This will
allow us to merge them into an union later. It even takes less room
now as it fills a hole and removes another one.
The algo-specific settings move from the proxy to the LB algo this way :
- uri_whole => arg_opt1
- uri_len_limit => arg_opt2
- uri_dirs_depth1 => arg_opt3
Some algorithms require a few extra options (up to 3). Let's provide
some room in lbprm to store them, and make sure they're passed from
defaults to backends.
These ones used to rely on separate variables called hh_name/hh_len
but they are exclusive with the former. Let's use the same variable
which becomes a generic argument name and length for the LB algorithm.
There are a few instances where the lookup algo is tested against
BE_LB_LKUP_CHTREE using a binary "AND" operation while this macro
is a value among a set, and not a bit. The test happens to work
because the value is exactly 4 and no bit overlaps with the other
possible values but this is a latent bug waiting for a new LB algo
to appear to strike. At the moment the only other algo sharing a bit
with it is the "first" algo which is never supported in the same code
places.
This fix should be backported to maintained versions for safety if it
passes easily, otherwise it's not important as it will not fix any
visible issue.
The "balance uri" options "whole", "len" and "depth" were not properly
inherited from the defaults sections. In addition, "whole" and "len"
were not even reset when parsing "uri", meaning that 2 subsequent
"balance uri" statements would not have the expected effect as the
options from the first one would remain for the second one.
This may be backported to all maintained versions.
At a few places in the code we used to rely on this variable to guess
what LB algo was in place. This is wrong because if the defaults section
presets "balance url_param foo" and a backend uses "balance roundrobin",
these locations will still see this url_param_name set and consider it.
The harm is limited, as this only causes the beginning of the request
body to be buffered. And in general this is a bad practice which prevents
us from cleaning the lbprm stuff. Let's explicitly check the LB algo
instead.
This may be backported to all currently maintained versions.
Openssl switched from aes128 to aes256 since may 2016 to compute
tls ticket secrets used by default. But Haproxy still handled only
128 bits keys for both tls key file and CLI.
This patch permit the user to set aes256 keys throught CLI or
the key file (80 bytes encoded in base64) in the same way that
aes128 keys were handled (48 bytes encoded in base64):
- first 16 bytes for the key name
- next 16/32 bytes for aes 128/256 key bits key
- last 16/32 bytes for hmac 128/256 bits
Both sizes are now supported (but keys from same file must be
of the same size and can but updated via CLI only using a key of
the same size).
Note: This feature need the fix "dec func ignores padding for output
size checking."
This patch fixes missing allocation checks loading tls key file
and avoid memory leak in some error cases.
This patch should be backport on branches 1.9 and 1.8
Decode function returns an error even if the ouptut buffer is
large enought because the padding was not considered. This
case was never met with current code base.
In h1_process(), if we have no associated stream, and the connection got a
shutw, then destroy it, it is unusable and it may be our last chance to do
so.
This should be backported to 1.9.
When using a check to send email, avoid having an associated server, so that
we don't modify the server state if we fail to send an email.
Also revert back to initialize the check status to HCHK_STATUS_INI, now that
set_server_check_status() stops early if there's no server, we shouldn't
get in a mail loop anymore.
This should be backported to 1.9.
Instead of assuming we have a server, store the proxy directly in struct
check, and use it instead of s->server.
This should be a no-op for now, but will be useful later when we change
mail checks to avoid having a server.
This should be backported to 1.9.
The cache uses the first 32 bits of the uri's hash as the key to reference
the object in the cache. It makes a special case of the value zero to mean
that the object is not in the cache anymore. The problem is that when an
object hashes as zero, it's still inserted but the eb32_delete() call is
skipped, resulting in the object still being chained in the memory area
while the block has been reclaimed and used for something else. Then when
objects which were chained below it (techically any object since zero is
at the root) are deleted, the walk through the upper object may encounter
corrupted values where valid pointers were expected.
But while this should only happen statically once on 4 billion, the problem
gets worse when the cache-use conditions don't match the cache-store ones,
because cache-store runs with an uninitialized key, which can create objects
that will never be found by the lookup code, or worse, entries with a zero
key preventing eviction of the tree node and resulting in a crash. It's easy
to accidently end up on such a config because the request rules generally
can't be used to decide on the response :
http-request cache-use cache if { path_beg /images }
http-response cache-store cache
In this test, mixing traffic with /images/$RANDOM and /foo/$RANDOM will
result in random keys being inserted, some of them possibly being zero,
and crashes will quickly happen.
The fix consists in 1) always initializing the transaction's cache_hash
to zero, and 2) never storing a response for which the hash has not been
calculated, as indicated by the value zero.
It is worth noting that objects hashing as value zero will never be cached,
but given that there's only one chance among 4 billion that this happens,
this is totally harmless.
This fix must be backported to 1.9 and 1.8.
When mux->init() fails, session_free() will call it again to unregister
it while it was already done, resulting in null derefs or use-after-free.
This typically happens on out-of-memory conditions during H1 or H2 connection
or stream allocation.
This fix must be backported to 1.9.
Document a bit better than allow-0rtt can trivially be used for replay attacks,
and so should only be used when it's safe to replay a request.
This should probably be backported to 1.8 and 1.9.
When using early data, disable the OpenSSL anti-replay protection, and set
the max amount of early data we're ready to accept, based on the size of
buffers, or early data won't work with the released OpenSSL 1.1.1.
This should be backported to 1.8.
When initializing server-template all of the servers after the first
have srv->idle_orphan_conns initialized within server_template_init()
The first server does not have this initialized and when http-reuse
is active this causes a segmentation fault when accessed from
srv_add_to_idle_list(). This patch removes the check for
srv->tmpl_info.prefix within server_finalize_init() and allows
the first server within a server-template to have srv->idle_orphan_conns
properly initialized.
This should be backported to 1.9.
In the function hlua_applet_htx_send_yield(), there already was a test to
respect the reserve but the wrong function was used to get the available space
for data in the HTX buffer. Instead of calling htx_free_space(), the function
htx_free_data_space() must be used. But in fact, there is no reason to bother
with that anymore because the function channel_htx_recv_max() has been added for
this purpose.
The result of this bug is that the call to htx_add_data() failed unexpectedly
while the amount of written data was incremented, leading the applet to think
all data was sent. To prevent any futher bugs, a test has been added to yield if
we are not able to write data into the channel buffer.
This patch must be backported to 1.9.
Tim Dsterhus reported a possible crash in the H2 HEADERS frame decoder
when the PRIORITY flag is present. A check is missing to ensure the 5
extra bytes needed with this flag are actually part of the frame. As per
RFC7540#4.2, let's return a connection error with code FRAME_SIZE_ERROR.
Many thanks to Tim for responsibly reporting this issue with a working
config and reproducer. This issue was assigned CVE-2018-20615.
This fix must be backported to 1.9 and 1.8.
channel_truncate() is not aware of the underlying format of the messages. So if
there are some outgoing data in the channel when called, it does some unexpected
operations on the channel's buffer. So the HTX version, channel_htx_truncate(),
must be used. The same is true for channel_erase(). It resets the buffer but not
the HTX message. So channel_htx_erase() must be used instead. This patch is
flagged as a bug, but as far as we know, it was never hitted.
This patch should be backported to 1.9. If so, following patch must be
backported too:
* MINOR: channel/htx: Add the HTX version of channel_truncate/erase
The function channel_htx_truncate() can now be used on HTX buffer to truncate
all incoming data, keeping outgoing one intact. This function relies on the
function channel_htx_erase() and htx_truncate().
This patch may be backported to 1.9. If so, the patch "MINOR: channel/htx: Add
the HTX version of channel_truncate()" must also be backported.
When the reg tests fail, it may be useful to display additional information
coming from varnishtest, especially when this latter aborts.
In such case, the test output may be made of lines prefixed by "* diag"
string.
We need to check if any compression filter precedes the cache filter. This is
only possible when the compression is configured in the frontend while the cache
filter is configured on the backend (via a cache-store action or
explicitly). This case cannot be detected during HAProxy startup. So in such
cases, the cache is disabled.
The patch must be backported to 1.9.
On legacy HTTP streams, it is forbidden to use the compression with the
cache. When the compression filter is explicitly specified, the detection works
as expected and such configuration are rejected at startup. But it does not work
when the compression filter is implicitly defined. To fix the bug, the implicit
declaration of the compression filter is checked first, before calling .check()
callback of each filters.
This patch should be backported to 1.9.
Since the commit 9666720c8 ("BUG/MEDIUM: compression: Use the right buffer
pointers to compress input data"), the compression can be done twice. The first
time on the frontend and the second time on the backend. This may happen by
configuring the compression in a default section.
To fix the bug, when the response is checked to know if it should be compressed
or not, if the flag HTTP_MSGF_COMPRESSING is set, the compression is not
performed. It means it is already handled by a previous compression filter.
Thanks to Pieter (PiBa-NL) to report this bug.
This patch must be backported to 1.9.
Now, h1_shutr() only do a shutdown read and try to set the flag
H1C_F_CS_SHUTDOWN if shutdown write was already performed. On its side,
h1_shutw(), if all conditions are met, do the same for the shutdown write. The
real connection close is done when the mux h1 is released, in h1_release().
The flag H1C_F_CS_SHUTW was renamed to H1C_F_CS_SHUTDOWN to be less ambiguous.
This patch may be backported to 1.9.
In h1_shutr(), to fully close the connection, we must be sure the shutdown write
was already performed on the connection. So we know rely on connection flags
instead of conn_stream flags. If CO_FL_SOCK_WR_SH is already set when h1_shutr()
is called, we can do a full connection close. Otherwise, we just do the shutdown
read.
Without this patch, it is possible to close the connection too early with some
outgoing data in the output buf.
This patch must be backported to 1.9.
This script runs two tests. One with "httpchk" over SSL/TLS and another
one with "check-ssl" option. As varnishtest does not support SSL/TLS
we use two haproxy processes to run these tests. h2 haproxy process
be2 and be4 backends declare one server each wich are the frontend
of h1 haproxy process. We check the layer6/7 checks thanks to syslog
messages.
Signed-off-by: Frdric Lcaille <flecaille@haproxy.com>
This test verifies the mailers section works properly by checking that
it sends the proper amount of mails when health-checks are changing and
or marking a server up/down
The test currently fails on all versions of haproxy i tried with varying
results:
- 1.9.0 produces thousands of mails.
- 1.8.14 only sends 1 mail, needs a 200ms 'timeout mail' to succeed
- 1.7.11 only sends 1 mail, needs a 200ms 'timeout mail' to succeed
- 1.6 only sends 1 mail, (does not have the 'timeout mail' setting implemented)