This patch adds a new stats socket command to modify server
FQDNs at run time.
Its syntax:
set server <backend>/<server> fqdn <FQDN>
This patch also adds FQDNs to the server state file, at the end
of each line for backward compatibility ("-" if not present).
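For example, assuming a stats socket bound at /var/run/haproxy.sock (the
path and backend/server names below are only illustrative), the command
can be issued like this :
  $ echo "set server bk_app/srv1 fqdn app.example.com" | \
        socat stdio /var/run/haproxy.sock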
Sometimes it's convenient to be able to execute a command directly on
the stream, whether we're connecting or accepting an incoming connection.
New command 'X' makes this possible. It simply calls execvp() on the
next arguments and branches stdin/stdout/stderr on the socket. Optionally
it's possible to limit the passed FDs to any combination of them by
appending 'i', 'o', 'e' after the X. In any case the program ends just
after executing this command.
Examples :
- chargen server
tcploop 8001 L A Xo cat /dev/zero
- telnet server
tcploop 8001 L W N A X /usr/sbin/in.telnetd
This patch replaces the calls to the TLSvX_X_client/server_method()
functions with the new TLS_client/server_method(), and uses the new
functions SSL_set_min_proto_version and SSL_set_max_proto_version,
setting them to the wanted protocol version when 'force-' statements
are used.
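For instance, forcing TLSv1.2 with the OpenSSL 1.1.0 API could look like
the following sketch (a generic illustration, not the actual HAProxy
code; the SSL_CTX_* setters shown have per-connection SSL_* equivalents) :
  #include <openssl/ssl.h>

  int main(void)
  {
      SSL_CTX *ctx = SSL_CTX_new(TLS_client_method());

      /* pin both bounds to TLSv1.2, as a 'force-tlsv12' statement would */
      SSL_CTX_set_min_proto_version(ctx, TLS1_2_VERSION);
      SSL_CTX_set_max_proto_version(ctx, TLS1_2_VERSION);

      SSL_CTX_free(ctx);
      return 0;
  }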
The sample fetch returns all headers including the final empty line
(the line break ending the header block). This final empty line is used
to determine whether the block of headers is truncated or not.
These encoding functions are generic and can be used in other contexts
than SPOE. This patch moves the functions spoe_encode_varint and
spoe_decode_varint from SPOE to common, and removes the "spoe_" prefix.
These functions will be used for encoding values in a new binary sample fetch.
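To illustrate the idea, here is a minimal generic (LEB128-style) varint
encoder; the exact wire format used by these functions may differ :
  #include <stdint.h>
  #include <stddef.h>

  /* encode <v> 7 bits per byte, MSB set on all bytes but the last;
   * returns the number of bytes written to <buf> */
  static size_t varint_encode(uint64_t v, unsigned char *buf)
  {
      size_t n = 0;

      while (v >= 0x80) {
          buf[n++] = (unsigned char)(v | 0x80);
          v >>= 7;
      }
      buf[n++] = (unsigned char)v;
      return n;
  }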
When the header proto/spoe.h is included by several files of the same
project, the compiler complains that the symbols have multiple definitions:
src/flt_spoe.o: In function `spoe_encode_varint':
~/git/haproxy/include/proto/spoe.h:45: multiple definition of `spoe_encode_varint'
src/proto_http.o:~/git/haproxy/include/proto/spoe.h:45: first defined here
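This is the usual consequence of defining a non-static function body in a
header; a generic illustration (not the actual header) :
  /* a header included by several .c files: every translation unit
   * emits the symbol, and the linker reports "multiple definition" */
  int helper(void) { return 0; }

  /* marking it static inline (or moving the body into a single .c
   * file) gives each unit its own local copy instead */
  static inline int helper_ok(void) { return 0; }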
In chash_get_server_hash(), we find the nearest server entries both
before and after the request hash. If the next and prev entries both
point to the same server, the function would exit early and return that
server, to save work.
Before hash-balance-factor this was a valid optimization -- one of nsrv
and psrv would definitely be chosen, so if they are the same there's no
need to choose between them. But with hash-balance-factor it's possible
that adding another request to that server would overload it
(chash_server_is_eligible returns false) and we go further around the
ring. So it's not valid to return before checking for that.
This commit simply removes the early return, as it provided only minimal
savings even when it was correct.
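Schematically (a simplified sketch with hypothetical types and helpers,
not the actual source) :
  struct chash_node { struct chash_node *next; int eligible; };

  /* stand-in for the real chash_server_is_eligible() load check */
  static int is_eligible_sketch(const struct chash_node *n)
  {
      return n->eligible;
  }

  static struct chash_node *pick_sketch(struct chash_node *nsrv)
  {
      /* before the fix, "if (nsrv == psrv) return nsrv;" here skipped
       * the check below and could return an overloaded server */
      while (!is_eligible_sketch(nsrv))
          nsrv = nsrv->next; /* walk further around the ring */
      return nsrv;
  }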
The priv context is not cleaned when we set a new priv context.
This is caused by a stupid swap between two parameters of the
luaL_unref() function.
Workaround: use set_priv only once while processing a stream.
This patch should be backported to versions 1.7 and 1.6.
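For reference, the correct argument order is shown below (old_priv_ref is
a hypothetical variable holding a value obtained from luaL_ref()) :
  #include <lua.h>
  #include <lauxlib.h>

  static void release_priv(lua_State *L, int old_priv_ref)
  {
      /* signature: void luaL_unref(lua_State *L, int t, int ref).
       * The table index (here the registry) comes before the
       * reference to release; swapping the last two arguments
       * releases the wrong reference and leaks the previous priv
       * context. */
      luaL_unref(L, LUA_REGISTRYINDEX, old_priv_ref);
  }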
Idea from Aleksandar Lazic: add explanation/links about layer4
tcp-request connection or content reject to "block" keyword.
Add http-request cross ref. to "tcp-request content".
This patch adds a server_template_init() function used to initialize servers
from server templates. It is called just after a 'server-template' line has
been parsed.
This patch makes backend sections support the new 'server-template' keyword.
Such 'server-template' objects are parsed like a 'server' object by the
parse_server() function, but their first arguments are as follows:
server-template <ID prefix> <nb | range> <ip | fqdn>:<port> ...
The remaining arguments are the same as for 'server' lines.
With such server template declarations, servers may be allocated with IDs
built from <ID prefix> and <nb | range> arguments.
For instance declaring:
server-template foo 1-5 google.com:80 ...
or
server-template foo 5 google.com:80 ...
would be equivalent to declaring:
server foo1 google.com:80 ...
server foo2 google.com:80 ...
server foo3 google.com:80 ...
server foo4 google.com:80 ...
server foo5 google.com:80 ...
This patch moves the code responsible for finalizing server initializations
after a 'server' line has been fully parsed (health check, agent check and SNI
expression initializations) from parse_server() to new functions.
This patch moves the code responsible for copying default server settings
to a new server instance from the parse_server() function to new defsrv_*_cpy()
functions, which may be used both while parsing 'server' lines and during the
server template initializations to come.
These defsrv_*_cpy() functions reference nothing but default server settings.
The 'resolvers' setting was not duplicated from the default server settings
to new server instances when parsing 'server' lines.
The fix is simple: strdup() the default server's resolvers <id> string
argument after having allocated a new server when parsing 'server' lines.
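Schematically (a fragment sketch; the field names are assumptions based
on the setting name) :
  /* copy the resolvers section id from the default server, if any */
  if (curproxy->defsrv.resolvers_id)
      newsrv->resolvers_id = strdup(curproxy->defsrv.resolvers_id);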
This patch must be backported to 1.7 and 1.6.
This bug occurs when a redirect rule is applied during request analysis on a
persistent connection, on a proxy without any server, that is, in a frontend
section or in a listen/backend section with no "server" line.
Because the transaction processing is shortened, no server can be selected to
perform the connection. So if we try to establish it, this fails and a 503 error
is returned, while a 3XX was already sent. So, in this case, HAProxy generates 2
replies and only the first one is expected.
Here is the configuration snippet to easily reproduce the problem:
listen www
bind :8080
mode http
timeout connect 5s
timeout client 3s
timeout server 6s
redirect location /
A simple HTTP/1.1 request without body will trigger the bug:
$ telnet 0 8080
Trying 0.0.0.0...
Connected to 0.
Escape character is '^]'.
GET / HTTP/1.1
HTTP/1.1 302 Found
Cache-Control: no-cache
Content-length: 0
Location: /
HTTP/1.0 503 Service Unavailable
Cache-Control: no-cache
Connection: close
Content-Type: text/html
<html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>
Connection closed by foreign host.
[wt: only 1.8-dev is impacted though the bug is present in older ones]
In server_parse_sni_expr(), we use the "proxy" global variable, when we
should probably be using "px" given as an argument.
It happens to work by accident right now, but may not in the future.
[wt: better backport it]
Haproxy relies on signed integer wraparound on overflow. However, this is
really undefined behavior, so the C compiler is allowed to do whatever it
wants; clang does exactly that, which causes problems when the timer goes
from <= INT_MAX to > INT_MAX, and explains the various hangs reported on
FreeBSD every 49.7 days. To make sure we get the intended behavior, use
-fwrapv for now. A proper fix is to switch everything to unsigned; that
will happen later, but this is simpler and more likely to be backported
to the stable branches.
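The problem can be illustrated with a sketch of a circular-time comparison
(hypothetical simplified code, not the real timer implementation) :
  /* "now - deadline" is meant to wrap so that the sign of the
   * difference tells whether the deadline has passed. With signed
   * int, the subtraction overflows once a tick crosses INT_MAX,
   * which is undefined behavior, so the compiler may mis-optimize
   * the test. -fwrapv forces two's-complement wraparound and
   * restores the intended meaning. */
  static int tick_is_expired_sketch(int now, int deadline)
  {
      return (now - deadline) >= 0;
  }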
Many thanks to David King, Mark S, Dave Cottlehuber, Slawa Olhovchenkov,
Piotr Pawel Stefaniak, and any other I may have forgotten for reporting that
and investigating.
Overall we do have an issue with the severity of a number of errors. Most
fatal errors are reported with ERR_FATAL (which prevents startup) and not
ERR_ABORT (which stops parsing ASAP), but check_config_validity() is still
called on ERR_FATAL, and will most of the time report bogus errors. This
is what caused smp_resolve_args() to be called on a number of unparsable
ACLs, and it also is what reports incorrect ordering or unresolvable
section names when certain entries could not be properly parsed.
This patch stops this domino effect by simply aborting before trying to
further check and resolve the configuration when it is already known that
there are fatal errors.
A concrete example comes from this config :
userlist users :
user foo insecure-password bar
listen foo
bind :1234
mode htttp
timeout client 10S
timeout server 10s
timeout connect 10s
stats uri /stats
stats http-request auth unless { http_auth(users) }
http-request redirect location /index.html if { path / }
It contains a colon after the userlist name, a typo in the client timeout
value, and another one in "mode http", causing some other configuration
elements not to be properly handled.
Previously it would confusingly report :
[ALERT] 108/114851 (20224) : parsing [err-report.cfg:1] : 'userlist' cannot handle unexpected argument ':'.
[ALERT] 108/114851 (20224) : parsing [err-report.cfg:6] : unknown proxy mode 'htttp'.
[ALERT] 108/114851 (20224) : parsing [err-report.cfg:7] : unexpected character 'S' in 'timeout client'
[ALERT] 108/114851 (20224) : Error(s) found in configuration file : err-report.cfg
[ALERT] 108/114851 (20224) : parsing [err-report.cfg:11] : unable to find userlist 'users' referenced in arg 1 of ACL keyword 'http_auth' in proxy 'foo'.
[WARNING] 108/114851 (20224) : config : missing timeouts for proxy 'foo'.
| While not properly invalid, you will certainly encounter various problems
| with such a configuration. To fix this, please ensure that all following
| timeouts are set to a non-zero value: 'client', 'connect', 'server'.
[WARNING] 108/114851 (20224) : config : 'stats' statement ignored for proxy 'foo' as it requires HTTP mode.
[WARNING] 108/114851 (20224) : config : 'http-request' rules ignored for proxy 'foo' as they require HTTP mode.
[ALERT] 108/114851 (20224) : Fatal errors found in configuration.
The "requires HTTP mode" errors are just pollution resulting from the
improper spelling of this mode earlier. The unresolved reference to the
userlist is caused by the extra colon on the declaration, and the warning
regarding the missing timeouts is caused by the wrong character.
Now it more accurately reports :
[ALERT] 108/114900 (20225) : parsing [err-report.cfg:1] : 'userlist' cannot handle unexpected argument ':'.
[ALERT] 108/114900 (20225) : parsing [err-report.cfg:6] : unknown proxy mode 'htttp'.
[ALERT] 108/114900 (20225) : parsing [err-report.cfg:7] : unexpected character 'S' in 'timeout client'
[ALERT] 108/114900 (20225) : Error(s) found in configuration file : err-report.cfg
[ALERT] 108/114900 (20225) : Fatal errors found in configuration.
Despite not really being a fix, this patch should be backported at least
to 1.7, possibly even 1.6 and 1.5, since it hardens the config parser
against certain bad situations like the recently reported use-after-free
and the last null dereference.
Stephan Zeisberg reported another dirty abort case which can be triggered
with this simple config (where file "d" doesn't exist) :
backend b1
stats auth a:b
acl auth_ok http_auth(c) -f d
This issue was brought in 1.5-dev9 by commit 34db108 ("MAJOR: acl: make use
of the new argument parsing framework") when prune_acl_expr() started to
release arguments. The arg pointer is set to NULL but not its length.
Because of this, later in smp_resolve_args(), the argument is still seen
as valid (since only a test on the length is made as in all other places),
and the NULL pointer is dereferenced.
This patch properly clears the lengths as well, so that these tests no
longer consider the freed arguments valid.
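Schematically (a fragment sketch following the arg API, simplified) :
  /* clearing only the pointer leaves the length non-zero, so the
   * argument still looks valid to smp_resolve_args(); clear both */
  arg->data.str.str = NULL;
  arg->data.str.len = 0;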
This fix needs to be backported to 1.7, 1.6, and 1.5.
No valid keyword could be parsed anymore if provided after the 'source'
keyword. This was due to the fact that 'source' takes a variable number
of arguments. So, as its parser srv_parse_source() is the only one which
may know how many arguments were provided after the 'source' keyword, it
updates the 'cur_arg' variable (the index, in the line, of the current
argument to be parsed), which is a good thing. But this variable was then
also incremented by one (to skip the 'source' keyword), leading to a
wrong index. This patch removes this extra increment.
Should have come with dba9707 commit.
The 'usesrc' setting is not permitted on 'server' lines if not provided
after a 'source' setting. This is now also the case on 'default-server'
lines.
Without this patch, the parse_server() parser reported 'usesrc' as an
unknown keyword.
Should have come with dba9707 commit.
Make the systemd wrapper check if HAPROXY_STATS_SOCKET is set.
If set, its value will be used as an argument to the "-x" option, which
makes haproxy ask the stats socket for any listening sockets, in order
to achieve reloads with no incoming connection lost.
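For example, in the systemd unit file (the socket path below is only an
illustration) :
  [Service]
  Environment=HAPROXY_STATS_SOCKET=/var/run/haproxy.sock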
When running with multiple processes, if some proxies are only assigned
to some of the processes, the other processes just close the file
descriptors of the listening sockets. However, we may still have to
provide those sockets when reloading, so instead we now try hard to
pretend those proxies are dead while keeping their sockets open.
A new global option, "no-reused-socket", has been added to restore the
old behavior of closing the sockets not bound to this process.
Try to reuse any socket from the old process, provided by the "-x" flag,
before binding a new one, assuming it is compatible.
"Compatible" here means same address and port, same namespace if any,
same interface if any, and that the following flags are the same :
LI_O_FOREIGN, LI_O_V6ONLY and LI_O_V4V6.
Also change tcp_bind_listener() to always enable/disable socket options,
instead of doing so only when they appear in the configuration file, as
an option may have been removed: e.g. TCP_FASTOPEN may have been set in
the old process and removed from the new configuration, so we have to
disable it.
Add the "-x" flag, that takes a path to a unix socket as an argument. If
used, haproxy will connect to the socket, and asks to get all the
listening sockets from the old process. Any failure is fatal.
This is needed to get seamless reloads on linux.
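A typical seamless-reload invocation could then look like this (all
paths are illustrative) :
  haproxy -f /etc/haproxy/haproxy.cfg \
          -x /var/run/haproxy.sock \
          -sf $(cat /var/run/haproxy.pid)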
Add a new command that will send all the listening sockets and their
properties via the stats socket.
This is a first step towards working around the Linux problem when
reloading haproxy.
luaL_newstate() uses malloc() to initialize its first objects, and only
after this do we replace the allocator. This creates trouble when
replacing the standard memory allocators during debugging sessions, since
the new allocator is used to realloc() an area previously allocated with
the default malloc().
Lua provides lua_newstate() in addition to luaL_newstate(), which takes
an allocator for the initial malloc. This is exactly what we need; this
patch does that and fixes the problem. The now useless call to
lua_setallocf() could be removed.
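For reference, a sketch of the change (hap_lua_alloc is a hypothetical
allocator name; the lua_Alloc callback signature is the standard one) :
  #include <stdlib.h>
  #include <lua.h>

  /* standard lua_Alloc callback: free when nsize is 0, otherwise
   * (re)allocate; ud and osize are unused here */
  static void *hap_lua_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
  {
      (void)ud; (void)osize;
      if (nsize == 0) {
          free(ptr);
          return NULL;
      }
      return realloc(ptr, nsize);
  }

  int main(void)
  {
      /* before: L = luaL_newstate(); lua_setallocf(L, hap_lua_alloc, NULL);
       * after : every allocation, including the very first ones, goes
       * through the custom allocator */
      lua_State *L = lua_newstate(hap_lua_alloc, NULL);

      lua_close(L);
      return 0;
  }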
This has no impact outside of debugging sessions and there's no need to
backport this.
This reverts commit 266b1a8 ("MEDIUM: server: Inherit CLI weight changes and
agent-check weight responses") from Michal Idzikowski, which is still broken.
It stops propagating weights at the first error encountered, leaving servers
in a random state depending on which LB algorithms are used on the other
servers tracking the one experiencing the weight change. It's unclear what
the best way to address this is, but we cannot leave the servers in an
inconsistent state between farms. For example :
backend site1
mode http
balance uri
hash-type consistent
server s1 127.0.0.1:8001 weight 10 track servers/s1
backend site2
mode http
balance uri
server s1 127.0.0.1:8001 weight 10 track servers/s1
backend site3
mode http
balance uri
hash-type consistent
server s1 127.0.0.1:8001 weight 10 track servers/s1
backend servers
server s1 127.0.0.1:8001 weight 10 check inter 1s
The weight change is applied on "servers/s1". It tries to propagate
to the servers tracking it, which are site1/s1, site2/s1 and site3/s1.
Let's say that "weight 50%" is requested. The servers are linked in
reverse-order, so the change is applied to "servers/s1", then to
"site3/s1", then to "site2/s1" and this one fails and rejects the
change. The change is aborted and never propagated to "site1/s1",
which keeps the server in a different state from "site3/s1". At the
very least, in case of error, the changes should probably be unrolled.
Also, the error reported on the CLI (when changing the weight from the CLI)
simply says :
Backend is using a static LB algorithm and only accepts weights '0%' and '100%'.
without any indication of which backend is faulty.
Let's revert this change for now, as initially feared it will definitely
cause more harm than good and at least needs to be revisited. It was never
backported to any stable branch so no backport is needed.
In case of error it's very difficult to properly unroll the list of
unresolved args, because the error can appear on any argument and all
of them share the same memory area, pointed to by one or multiple links
from the global args list. The problem is that until now the arguments
themselves were released but not unlinked from the list, causing all
forms of corruption in deinit() when quitting on the error path if an
argument couldn't be properly parsed.
A few attempts at trying to selectively spot the appropriate list entries
to kill before releasing the shared area have only resulted in complicating
the code and pushing the issue further.
Here instead we use a simple conservative approach : prune_acl_expr()
only tries to free the argument array if none of the arguments were
unresolved, which means that none of them was added to the arg list.
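Schematically (a fragment sketch; the loop is hypothetical but follows
the arg API) :
  /* free the shared argument array only when no argument is still
   * referenced from the global unresolved-args list */
  int unresolved = 0;
  struct arg *arg;

  for (arg = expr->args; arg && arg->type != ARGT_STOP; arg++)
      if (arg->unresolved)
          unresolved = 1;

  if (!unresolved)
      free(expr->args);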
It's unclear what a better approach would be. We could imagine that
args would point to their own location in the shared list but given
that this extra cost and complexity would be added exclusively in
order to cleanly release everything when we're exiting due to a config
parse error, this seems quite overkill.
This bug was noticed on 1.7 and likely affects 1.6 and 1.5, so the fix
should be backported. It's not easy to reproduce it, as the reproducers
randomly work depending on how memory is allocated. One way to do it is
to use parsable and non-parsable patterns on an ACL making use of args.
Big thanks to Stephan Zeisberg for reporting this problem with a working
reproducer.
If make_arg_list() fails to process an argument after having queued an
unresolvable one, it frees the allocated argument list but doesn't remove
the referenced args from the arg list. This causes a use after free or a
double free if the same location was reused, during the deinit phase upon
exit after reporting the error.
Since it's not easy to properly unlink all elements, we only release the
args block if none of them was queued in the list.
When an agent-check or CLI command executes a relative weight change, this
patch propagates it to the tracking servers, allowing many backends running
on the same server to be grouped underneath. Additionally, in setups with
many src IPs, many backends can share one checker's state, so there won't
be unnecessary health checks.
[wt: Note: this will induce some behaviour change on some setups]
Take care of arg_list_clone() returning NULL in arg_list_add(), since the
former returns NULL on allocation failure. It's only used during parsing
so the impact is very low.
Can be backported to 1.7, 1.6 and 1.5.
The error doesn't prevent checking for other errors after an invalid
character was detected in an ACL name. Better to quit ASAP, to avoid the
risk of emitting garbled and confusing error messages if something else
fails on the same line.
This should be backported to 1.7, 1.6 and 1.5.
AF_INET address family was always used to create sockets to connect
to name servers. This prevented any connection over IPv6 from working.
This fix must be backported to 1.7 and 1.6.
Commit 0ebb511 ("MINOR: tools: add a generic hexdump function for debugging")
introduced debug_hexdump() which is used to dump a memory area during
debugging sessions. This function can start at an unaligned offset and
uses a signed comparison to know where to start dumping from. But the
operation mixes signed and unsigned, making the test incorrect and causing
the following warnings to be emitted under Clang :
src/standard.c:3775:14: warning: comparison of unsigned expression >= 0 is
always true [-Wtautological-compare]
if (b + j >= 0 && b + j < len)
~~~~~ ^ ~
Make "j" signed instead. At the moment this function is not used at all
so there's no impact. Thanks to Dmitry Sivachenko for reporting it. No
backport is needed.
Commit 05ee213 ("MEDIUM: stats: Add JSON output option to show (info|stat)")
used to pass argument "uri" to the aforementioned function, which doesn't
take any. It's probably a leftover from multiple iterations of the same
patchset. Spotted by Dmitry Sivachenko. No backport is needed.
IP*_BINDANY is not defined under this system, thus it is necessary to
guard access to those fields, since CONFIG_HAP_TRANSPARENT is not
defined.
[wt: problem introduced late in 1.8-dev. The same fix was also reported
by Steven Davidovitz]
Each time we generate a dynamic cookie, we try to make sure the same
cookie hasn't already been generated for another server; it's very
unlikely, but it may happen.
We only have to check that for the servers in the same proxy, there is no
need to check the others. In addition, the code was buggy and would always
check in the first proxy of the proxy list.
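Schematically, the uniqueness check only needs to walk the current
proxy's own server list (a simplified sketch; px, newsrv and tmpcookie
are hypothetical names) :
  struct server *srv;

  /* look for a collision among the servers of the current proxy only,
   * instead of always starting from the head of the global proxy list */
  for (srv = px->srv; srv; srv = srv->next) {
      if (srv != newsrv && srv->cookie &&
          strcmp(srv->cookie, tmpcookie) == 0)
          break; /* collision: generate a new cookie and retry */
  }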