Commit d4448bc8 brought support for parsing port ranges, but invalid
characters are not properly handled and can result in a crash while
parsing the configuration if an invalid character is present in the
port, because the return value is set to NULL then dereferenced.
An invalid copy-paste called it NR_splice instead of NR_accept4.
This does not lead to real issues because if this define is used,
then the code cannot compile since NR_accept4 is still missing.
Add new tunable "tune.ssl.maxrecord".
Over SSL/TLS, the client can decipher the data only once it has received
a full record. With large records, it means that clients might have to
download up to 16kB of data before starting to process them. Limiting the
record size can improve page load times on browsers located over high
latency or low bandwidth networks. It is suggested to find optimal values
which fit into 1 or 2 TCP segments (generally 1448 bytes over Ethernet
with TCP timestamps enabled, or 1460 when timestamps are disabled), keeping
in mind that SSL/TLS add some overhead. Typical values of 1419 and 2859
gave good results during tests. Use "strace -e trace=write" to find the
best value.
This trick was first suggested by Mike Belshe :
http://www.belshe.com/2010/12/17/performance-and-the-tls-record-size/
Then requested again by Ilya Grigorik who provides some hints here :
http://ofps.oreilly.com/titles/9781449344764/_transport_layer_security_tls.html#ch04_00000101
When parsing the config, we now use str2sa_range() to detect when
ranges or port offsets were improperly used. Among the new checks
are "log", "source", "addr", "usesrc" which previously didn't check
for extra parameters.
Right now we have multiple methods for parsing IP addresses in the
configuration. This is quite painful. This patch aims at adapting
str2sa_range() to make it support all formats, so that the callers
perform the appropriate tests on the return values. str2sa() was
changed to simply return str2sa_range().
The output values are now the following ones (taken from the comment
on top of the function).
Converts <str> to a locally allocated struct sockaddr_storage *, and a port
range or offset consisting in two integers that the caller will have to
check to find the relevant input format. The following format are supported :
String format | address | port | low | high
addr | <addr> | 0 | 0 | 0
addr: | <addr> | 0 | 0 | 0
addr:port | <addr> | <port> | <port> | <port>
addr:pl-ph | <addr> | <pl> | <pl> | <ph>
addr:+port | <addr> | <port> | 0 | <port>
addr:-port | <addr> |-<port> | <port> | 0
The detection of a port range or increment by the caller is made by
comparing <low> and <high>. If both are equal, then port 0 means no port
was specified. The caller may pass NULL for <low> and <high> if it is not
interested in retrieving port ranges.
Note that <addr> above may also be :
- empty ("") => family will be AF_INET and address will be INADDR_ANY
- "*" => family will be AF_INET and address will be INADDR_ANY
- "::" => family will be AF_INET6 and address will be IN6ADDR_ANY
- a host name => family and address will depend on host name resolving.
This is the same as -uc except that instead of counting URLs, it
counts source addresses. The reported times are request times and
not response times.
The code becomes heavily ugly, the url struct is being abused to
store an address, and there are no more bit fields available. The
code needs a major revamp.
If an address is improperly formated on a bind or server address
and haproxy is built for using getaddrinfo, then a crash may occur
upon the call to freeaddrinfo().
Thanks to Jon Meredith for helping me patch this for SmartOS,
I am not a C/GDB wizard.
Support for server side TFO was actually introduced in linux-3.7,
linux-3.6 just has client support.
This patch fixes documentation and a code comment about the
kernel requirement. It also fixes a wrong tfo related code
comment in src/proto_tcp.c.
Currently when cross-compiling, it's generally necessary to force
PCREDIR which the Makefile automatically appends /include and /lib to.
Unfortunately on most 64-bit linux distros, the lib path is instead
/lib64, which is really annoying to fix in the makefile.
So now we're computing PCRE_INC and PCRE_LIB from PCREDIR and using
these ones instead. If one wants to force paths individually, it is
possible to set them instead of setting PCREDIR. The old behaviour
of not passing anything to the compiler when PCREDIR is forced to blank
is conserved.
Commit a2b9dad introduced use of isblank() which is not present everywhere
and requires -std=c99 or various other defines. Here on gcc-3.4 with glibc
2.3 it emits a warning though it works :
src/checks.c: In function 'event_srv_chk_r':
src/checks.c:1007: warning: implicit declaration of function 'isblank'
This macro matches only 2 values, better replace it with the explicit match.
Support a agent health check performed by opening a TCP socket to a
pre-defined port and reading an ASCII string. The string should have one of
the following forms:
* An ASCII representation of an positive integer percentage.
e.g. "75%"
Values in this format will set the weight proportional to the initial
weight of a server as configured when haproxy starts.
* The string "drain".
This will cause the weight of a server to be set to 0, and thus it will
not accept any new connections other than those that are accepted via
persistence.
* The string "down", optionally followed by a description string.
Mark the server as down and log the description string as the reason.
* The string "stopped", optionally followed by a description string.
This currently has the same behaviour as down (iii).
* The string "fail", optionally followed by a description string.
This currently has the same behaviour as down (iii).
A agent health check may be configured using "option lb-agent-chk".
The use of an alternate check-port, used to obtain agent heath check
information described above as opposed to the port of the service,
may be useful in conjunction with this option.
e.g.
option lb-agent-chk
server http1_1 10.0.0.10:80 check port 10000 weight 100
Signed-off-by: Simon Horman <horms@verge.net.au>
Detect:
* Empty weight string, including no digits before '%' in relative
weight string
* Trailing garbage, including between the last integer and '%'
in relative weights
The motivation for this is to allow the weight string to be safely
logged if successfully parsed by this function
Signed-off-by: Simon Horman <horms@verge.net.au>
Allow relative weights greater than 100%,
capping the absolute value to 256 which is
the largest supported absolute weight.
Signed-off-by: Simon Horman <horms@verge.net.au>
Break out set weight processing code.
This is in preparation for reusing the code.
Also, remove duplicate check in nested if clauses.
{px->lbprm.algo & BE_LB_PROP_DYN) is checked by
the immediate outer if clause, so there is no need
to check it a second time.
Signed-off-by: Simon Horman <horms@verge.net.au>
This corrects what appears to be logic errors in cut_crlf().
I assume that the intention of this function is to truncate a
string at the first cr or lf. However, currently lf are ignored.
Also use '\0' instead of 0 as the null character, a cosmetic change.
Cc: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Simon Horman <horms@verge.net.au>
[WT: this fix may be backported to 1.4 too]
Currently, to reload haproxy configuration, you have to use "-sf".
There is a problem with this way of doing things. First of all, in the systemd world,
reload commands should be "oneshot" ones, which means they should not be the new main
process but rather a tool which makes a call to it and then exits. With the current approach,
the reload command is the new main command and moreover, it makes the previous one exit.
Systemd only tracks the main program, seeing it ending, it assumes it either finished or failed,
and kills everything remaining as a grabage collector. We then end up with no haproxy running
at all.
This patch adds wrapper around haproxy, no changes at all have been made into it,
so it's not intrusive and doesn't change anything for other hosts. What this wrapper does
is basically launching haproxy as a child, listen to the SIGUSR2 (not to conflict with
haproxy itself) signal, and spawing a new haproxy with "-sf" as a child to relay the
first one.
Signed-off-by: Marc-Antoine Perennou <Marc-Antoine@Perennou.com>
This patch adds a new option "-Ds" which is exactly like "-D", but instead of
forking n times to get n jobs running and then exiting, prefers to wait for all the
children it just created. With this done, haproxy becomes more systemd-compliant,
without changing anything for other systems.
Signed-off-by: Marc-Antoine Perennou <Marc-Antoine@Perennou.com>
The current documentation of the bind option "interface" can be misleading
(as seen on the ML recently).
This patch tries to address misunderstandings by :
- avoiding the words listen or bind in the behavior description, using
"restrict to interface" instead
- using a different sentence construction (partially stolen from
"man 7 socket": SO_BINDTODEVICE)
- "defragmentation": moving behavior related explanations to the beginning
and restrictions, use-cases and requirements to the end.
When observe layer7 is enabled on a server, a response may cause a server
to be marked down while a check is in progress. When the check finally
completes, the connection is not properly released in process_chk() because
the server states makes it think that no check was in progress due to the
lastly reported failure.
When a new check gets scheduled, it reuses the same connection structure
which is reinitialized. When the server finally closes the previous
connection, epoll_wait() notifies conn_fd_handler() which sees that the
old connection is still referenced by fdtab[fd], but it can not do anything
with this fd which does not match conn->t.sock.fd. So epoll_wait() keeps
reporting this fd forever.
The solution is to always make process_chk() always take care of closing
the connection and not make it rely on the connection layer to so.
Special thanks go to James Cole and Finn Arne Gangstad who encountered
the issue almost at the same time and took care of reporting a very
detailed analysis with rich information to help understand the issue.
Commit 2b0108ad accidently got rid of the ability to emit a "-" for
empty log fields. This can happen for captured request and response
cookies, as well as for fetches. Since we don't want to have this done
for headers however, we set the default log method when parsing the
format. It is still possible to force the desired mode using +M/-M.
Openssl needs to access /dev/urandom to initialize its internal random
number generator. It does so when it needs a random for the first time,
which fails if it is a handshake performed after the chroot(), causing
all SSL incoming connections to fail.
This fix consists in calling RAND_bytes() to produce a random before
the chroot, which will in turn open /dev/urandom before it's too late,
and avoid the issue.
If the random generator fails to work while processing the config,
haproxy now fails with an error instead of causing SSL connections to
fail at runtime.
This new option ensures that there is no possible fallback to a default
certificate if the client does not provide an SNI which is explicitly
handled by a certificate.
In select_compression_response_header(), some tests are rather confusing
as the "fail" label is used to deinitialize the compression context for
the session while it's branched only before initialization succeeds. The
test is always false here and the dereferencing of the comp_algo pointer
which might be null is also confusing. Remove that code which is not needed
anymore since commit ec3e3890 got rid of the latest issues.
Reported-by: Dinko Korunic <dkorunic@reflected.net>
Commit 290e63aa moved the unix parameters out of the global stats socket
to the bind_conf struct. As such the stats admin level was also moved
overthere, but it remained in the stats global section where it was not
used, except by a nasty memcpy() used to initialize the ux struct in the
bind_conf with too large data. Fortunately, the extra data copied were
the previous level over the new level so it did not have any impact, but
it could have been worse.
This bug is 1.5 specific, no backport is needed.
Reported-by: Dinko Korunic <dkorunic@reflected.net>
A test is obviously wrong in uri_auth(). If strdup(pass) returns an error
while strdup(user) passes, the NULL pointer is still stored into the
structure. If the user returns the NULL instead, the allocated memory is
not released before returning the error.
The issue was present in 1.4 so the fix should be backported.
Reported-by: Dinko Korunic <dkorunic@reflected.net>
This function may write the \0 one char too far in the static array.
There is no effect right now as the function has never been used except
maybe in code that was never released. Out-of-tree code might possibly
be affected though (hence the MEDIUM flag).
No backport is needed.
Reported-by: Dinko Korunic <dkorunic@reflected.net>
sig is checked for < 0 or > MAX_SIGNAL, but the signal array is
MAX_SIGNAL in size. At the moment, MAX_SIGNAL is 256. If a system supports
more than MAX_SIGNAL signals, then sending signal MAX_SIGNAL to the process
will corrupt one integer in its memory and might also crash the process.
This bug is also present in 1.4 and 1.3, and the fix must be backported.
Reported-by: Dinko Korunic <dkorunic@reflected.net>
srv cannot be null in http_perform_server_redirect(), as it's taken
from the stream interface's target which is always valid for a
server-based redirect, and it was already dereferenced above, so in
practice, gcc already removes the test anyway.
Reported-by: Dinko Korunic <dkorunic@reflected.net>
utoa_pad() is directly fed into tmplog, which is checked for NULL.
First, when NULLs are possible, they should be put into a temp variable
in order to preserve tmplog, and second, this return value can never be
NULL because the value passed is tv_usec/1000 (between "0" and "999")
with a 4-char output. However better fix the check in case this code gets
improperly copy-pasted for another usage later.
Reported-by: Dinko Korunic <dkorunic@reflected.net>
Currently s->listener is set for all sessions, but this may not remain
the case forever so we already check s->listener for validity. On check
was missed.
Reported-by: Dinko Korunic <dkorunic@reflected.net>
health_adjust() checks for incorrect bounds for the status argument.
With current code, the argument is always a constant from the valid
enum so there is no impact and the check is basically a NOP. However
users running local patches (eg: new checks) might want to recheck
their code.
This fix should be backported to 1.4 which introduced the issue.
Reported-by: Dinko Korunic <dkorunic@reflected.net>
logsrv->minlvl gets the numeric log level from the equivalent string.
Upon error, ->level was checked due to a wrong copy-paste. The effect
is that a wrong name will silently be ignored and due to minlvl=-1,
will act as if the option was not set.
No backport is needed, this is 1.5-specific.
Reported-by: Dinko Korunic <dkorunic@reflected.net>
An error caused by an invalid port does not cause the raddr string to
be freed. This is harmless at the moment since we exit, but may have
an impact later if we ever support hot config changes.
Reported-by: Dinko Korunic <dkorunic@reflected.net>
The wrong variable is checked after a calloc() so a memory shortage would
result in a segfault while loading the config instead of a clean error.
This fix may be backported to 1.4 and 1.3 which are both affected.
Reported-by: Dinko Korunic <dkorunic@reflected.net>
epoll_wait() takes a number of returned events, not the number of
fds to consider. We must not pass it the number of the smallest fd,
as it leads to value zero being used, which is invalid in epoll_wait().
The effect may sometimes be observed with peers sections trying to
connect and causing 2-seconds CPU loops upon a soft reload because
epoll_wait() immediately returns -1 EINVAL instead of waiting for the
timeout to happen.
This fix should be backported to 1.4 too (into ev_epoll and ev_sepoll).
If a peers section contains several instances of the local peer name, only
the first one was considered and the next ones were silently ignored. This
can cause some trouble to debug such a configuration. Now the extra entries
are rejected with an error message indicating where the first occurrence was
found.
Without it, haproxy will retain the group membership of root, which may
give more access than intended to the process. For example, haproxy would
still be in the wheel group on Fedora 18, as seen with :
# haproxy -f /etc/haproxy/haproxy.cfg
# ps a -o pid,user,group,command | grep hapr
3545 haproxy haproxy haproxy -f /etc/haproxy/haproxy.cfg
4356 root root grep --color=auto hapr
# grep Group /proc/3545/status
Groups: 0 1 2 3 4 6 10
# getent group wheel
wheel❌10:root,misc
[WT: The issue has been investigated by independent security research team
and realized by itself not being able to allow security exploitation.
Additionally, dropping groups is not allowed to unprivileged users,
though this mode of deployment is quite common. Thus a warning is
emitted in this case to inform the user. The fix could be backported
into all supported versions as the issue has always been there. ]
Due to a typo in the peers section lookup code, the last declared peers
section was used instead of the one matching the requested name. This bug
has been there since the very first commit on peers section (1.5-dev2).