Negatives timeouts doesn't have sense. A negative timeout doesn't cause
a crash, but the connection expires before the system try to extablish it.
This patch should be backported in all versions from 1.6
The output of these function indicates that one element is pushed in
the stack, but no element is set in the stack. Actually, if anyone
read the value returned by this function, is gets "something"
present in the stack.
This patch is a complement of these one: 119a5f10e4
The LuaSocket documentation tell anything about the returned value,
but the effective code set an integer of value one.
316a9455b9/src/timeout.c (L172)
Thanks to Tim for the bug report.
This patch should be backported in all version from 1.6
issue was identified by cppcheck
[src/map.c:372] -> [src/map.c:376]: (warning) Variable 'appctx->st2' is reassigned a value before the old one has been used. 'break;' missing?
[src/map.c:433] -> [src/map.c:437]: (warning) Variable 'appctx->st2' is reassigned a value before the old one has been used. 'break;' missing?
[src/map.c:555] -> [src/map.c:559]: (warning) Variable 'appctx->st2' is reassigned a value before the old one has been used. 'break;' missing?
[src/stream.c:3264] -> [src/stream.c:3268]: (warning) Variable 'appctx->st2' is reassigned a value before the old one has been used. 'break;' missing?
Signed-off-by: Ilya Shipitsin <chipitsine@gmail.com>
When a freshly created session is rejected, for any reason, during the accept in
the function "session_accept_fd", the variable "actconn" is decreased twice. The
first time when the rejected session is released, then in the function
"listener_accpect", because of the failure. So it is possible to have an
negative value for actconn. Note that, in this case, we will also have a negatve
value for the current number of connections on the listener rejecting the
session (actconn and l->nbconn are in/decreased in same time).
It is easy to reproduce the bug with this small configuration:
global
stats socket /tmp/haproxy
listen test
bind *:12345
tcp-request connection reject if TRUE
A "show info" on the stat socket, after a connection attempt, will show a very
high value (the unsigned representation of -1).
To fix the bug, if the function "session_accept_fd" returns an error, it
decrements the right counters and "listener_accpect" leaves them untouched.
This patch must be backported in 1.8.
There are some corner cases where this could happen by accident. Since
the spec explicitly forbids this (RFC7540#5.4.2), let's add a test in
the two only functions which make the RST to avoid this. Thanks to user
klzgrad for reporting this problem. Usually it is expected to be harmless
but may result in browsers issuing a warning.
This fix must be backported to 1.8.
Recent fixes made to process partial frames broke the flow control on
DATA frames, as the padding is not considered anymore, only the actual
data is. Let's simply take account of the padding once the transfer
ends. The probability to meet this bug is low because, when used, padding
is small and it can require a large number of padded transfers before the
window is completely depleted.
Thanks to user klzgrad for reporting this bug and confirming the fix.
This fix must be backported to 1.8.
This patch add option crc32c (PP2_TYPE_CRC32C) to proxy protocol v2.
It compute the checksum of proxy protocol v2 header as describe in
"doc/proxy-protocol.txt".
Commit 4815c8c ("MAJOR: fd/threads: Make the fdcache mostly lockless.")
made the fd cache lockless, but after a few iterations, a subtle part was
lost, consisting in setting the bit on the fd_cache_mask immediately when
adding an event. Now it was done only when the cache started to process
events, but the problem it causes is that fd_cache_mask isn't reliable
anymore as an indicator of presence of events to be processed with no
delay outside of fd_process_cached_events(). This results in some spurious
delays when processing inter-thread wakeups between tasks. Just restoring
the flag when the event is added is enough to fix the problem.
Kudos to Christopher for spotting this one!
No backport is needed as this is only in the development version.
Some time ago, integer overflows detection stopped working in the timer
code on recent compliers and were addressed by commit 73bdb32 ("BUG/MAJOR:
Use -fwrapv."). By then it was thought that -fno-strict-overflow was not
needed as implied, but it resulted from a misinterpretation of the doc,
as this one is still needed to disable pointer overflow optimization that
is automatically enabled at -O2/-O3/-Os.
Unfortunately the compiler happily removes overflow checks without the
slightest warning so it's not trivial to guess the extent of this issue
without comparing the emitted asm code. By checking the emitted assembly
code with and without the option, it was found that the only affected
location was the reported one, in ssl_sock_parse_clienthello(), where
the test can never fail on any system where the highest userland pointer
is at least 64kB away from wrapping (ie all 32/64 bit OS in field), so
there it is harmless.
This patch must be backported to all maintained versions.
Special thanks to Ilya Shipitsin for reporting this issue.
This is a recurring pain when using certain unix domain sockets or when
sending to temporarily unroutable addresses, if the process remains in
the foreground, the console is full of error which it's impossible to
do anything about. It's even worse when the process is remote, or when
run from a serial console which will slow the whole process down. Let's
send them only once now to warn about a possible config issue, and not
pollute the system nor slow everything down.
The previous patch about queues (5cd4bbd7a "BUG/MAJOR: threads/queue: Fix
thread-safety issues on the queues management") revealed a performance drop when
multithreading is enabled (nbthread > 1). This happens when pending connections
handled by other theads are dequeued. If these other threads are blocked in the
poller, we have to wait the poller's timeout (or any I/O event) to process the
dequeued connections.
To fix the problem, at least temporarly, we "wake up" the threads by requesting
a synchronization. This may seem a bit overkill to use the sync point to do a
wakeup on threads, but it fixes this performance issue. So we can now think
calmly on the good way to address this kind of issues.
This patch should be backported in 1.8 with the commit 5cd4bbd7a ("BUG/MAJOR:
threads/queue: Fix thread-safety issues on the queues management").
When running tcp-check scripts, one must ensure we can establish a tcp
connection first.
When doing this action, HAProxy needs a TCP port configured either on
the server or on the check itself or on the connect rule itself.
For some reasons, the connect code did not evaluate the service port on
the server structure...
this patch fixes this error.
Backport status: 1.8
When tcpcheck is used to do TCP port monitoring only and the script is
composed by a single "tcp-check connect" rule (whatever port and ssl
options enabled), then the server can't be seen as DOWN.
Simple configuration to reproduce:
backend b
[...]
option tcp-check
tcp-check connect
server s1 127.0.0.1:22 check
The main reason for this issue is that the piece of code which validates
that we're not at the end of the chained list (of rules) prevents
executing the validation of the establishment of the TCP connection.
Since validation is not executed, the rule is terminated and the report
says no errors were encountered, hence the server is UP all the time.
The workaround is simple: move the connection validation outsied the
CONNECT rule processing loop, into the main function.
That way, if the connection status is not CONNECTED, then HAProxy will
now add more time to wait for it. If the time is expired, an error is
now well reported.
Backport status: 1.8
Buf is unsigned, so nbargs will be negative for more then 127 args.
Note that I cant test this bug because I cant put sufficient args
on the configuration line. It is just detected reading code.
[wt: this can be backported to 1.8 & 1.7]
OpenSSL can be built without NEXTPROTONEG support by passing
-no-npn to the configure script. This sets the
OPENSSL_NO_NEXTPROTONEG flag in opensslconf.h
Since NEXTPROTONEG is now considered deprecated, it is superseeded
by ALPN (Application Layer Protocol Next), HAProxy should allow
building withough NPN support.
This bug was introduced in 48bcfdab2 ("MEDIUM: dumpstat: make the CLI
parser understand the backslash as an escape char").
This should be backported to 1.8.
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
Since 200b0fac ("MEDIUM: Add support for updating TLS ticket keys via
socket"), 4147b2ef ("MEDIUM: ssl: basic OCSP stapling support."),
4df59e9 ("MINOR: cli: add socket commands and config to prepend
informational messages with severity") and 654694e1 ("MEDIUM: stats/cli:
add support for "set table key" to enter values"), commands
'set ssl tls-key', 'set ssl ocsp-response', 'set severity-output' and
'set table' do not always send an extra LF at the end of their outputs.
This is required as mentioned in doc/management.txt:
"Since multiple commands may be issued at once, haproxy uses the empty
line as a delimiter to mark an end of output for each command"
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
When doing a seemless reload, while receiving the sockets from the old process
the new process will die if the socket has been bound to a specific
interface.
This happens because the code that tries to parse the informations bogusly
try to set xfer_sock->namespace, while it should be setting wfer_sock->iface.
This should be backported to 1.8.
issue was identified by cppcheck
[src/dns.c:2037] -> [src/dns.c:2041]: (warning) Variable 'appctx->st2' is reassigned a value before the old one has been used. 'break;' missing?
Automatic downgrade of DNS accepted payload size may have undesired side
effect, which could make a backend with all servers DOWN.
After talking with Lukas on the ML, I realized this "feature" introduces
more issues that it fixes problem.
The "best" way to handle properly big responses will be to implement DNS
over TCP.
To be backported to 1.8.
The management of the servers and the proxies queues was not thread-safe at
all. First, the accesses to <strm>->pend_pos were not protected. So it was
possible to release it on a thread (for instance because the stream is released)
and to use it in same time on another one (because we redispatch pending
connections for a server). Then, the accesses to stream's information (flags and
target) from anywhere is forbidden. To be safe, The stream's state must always
be updated in the context of process_stream.
So to fix these issues, the queue module has been refactored. A lock has been
added in the pendconn structure. And now, when we try to dequeue a pending
connection, we start by unlinking it from the server/proxy queue and we wake up
the stream. Then, it is the stream reponsibility to really dequeue it (or
release it). This way, we are sure that only the stream can create and release
its <pend_pos> field.
However, be careful. This new implementation should be thread-safe
(hopefully...). But it is not optimal and in some situations, it could be really
slower in multi-threaded mode than in single-threaded one. The problem is that,
when we try to dequeue pending connections, we process it from the older one to
the newer one independently to the thread's affinity. So we need to wait the
other threads' wakeup to really process them. If threads are blocked in the
poller, this will add a significant latency. This problem happens when maxconn
values are very low.
This patch must be backported in 1.8.
When a listener is temporarily disabled, we start by locking it and then we call
.pause callback of the underlying protocol (tcp/unix). For TCP listeners, this
is not a problem. But listeners bound on an unix socket are in fact closed
instead. So .pause callback relies on unbind_listener function to do its job.
Unfortunatly, unbind_listener hold the listener's lock and then call an internal
function to unbind it. So, there is a deadlock here. This happens during a
reload. To fix the problemn, the function do_unbind_listener, which is lockless,
is now exported and is called when a listener bound on an unix socket is
temporarily disabled.
This patch must be backported in 1.8.
>From the very first day of force-persist and ignore-persist features,
they only applied to backends, except that the documentation stated it
could also be applied to frontends.
In order to make it clear, the documentation is updated and the parser
will raise a warning if the keywords are used in a frontend section.
This patch should be backported up to the 1.5 branch.
Krishna Kumar reported a 100% cpu usage with a configuration using
cpu-map and a high number of threads,
Indeed, this minimal configuration to reproduce the issue :
global
nbthread 40
cpu-map auto:1/1-40 0-39
frontend test
bind :8000
This is due to a wrong type in a shift operator (int vs unsigned long int),
causing an endless loop while applying the cpu affinity on threads. The same
issue may also occur with nbproc under FreeBSD. This commit addresses both
cases.
This patch must be backported to 1.8.
The correct keyword is 'ssl-sessions' (vs. 'ssl-session').
The typo was introduced in 45c742be05 ('REORG: cli: move the "set
rate-limit" functions to their own parser').
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
This printf() was added in f886e3478d ("MINOR: cli: Add a command to
send listening sockets.").
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
openssl/x509.h is included twice since commit fc0421fde ("MEDIUM: ssl:
add support for SNI and wildcard certificates").
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
This bug is present since 7a4a0ac71d ("MINOR: cli: add a new "show fd"
command").
This should be backported to 1.8.
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
Right now the h2 idle timeout is only set when there is no stream. If we
fail to send because the socket buffers are full (generally indicating
the client has left), we also need to arm it so that we can properly
expire such connections, otherwise some failed transfers might leave
H2 connections pending forever.
Thanks to Thierry Fournier for the diag and the traces.
This patch needs to be backported to 1.8.
When removing the socket from the xfer_sock_list, we want to set
next->prev to prev, not to next->prev, which is useless.
This should be backported to 1.8.
Some sample fetches check if session is established using
the flag CO_FL_CONNECTED. But in some cases, when a handshake
is performed this flag is set too late, after the process
of the tcp-request session rules.
This fix move the raising of the flag at the beginning of the
conn_complete_session function which processes the tcp-request
session rules.
This fix must be backported to 1.8 (and perhaps 1.7)
Previous commit (13113d6 "MINOR/BUILD: fix Lua build on Mac OS X")
contains a typo, it uses "-export-dynamic" instead of "-export_dynamic"
(dash instead of underscore), despite what the commit message suggests,
and it obviously doesn't work. Thanks to Kirill A. Korinsky for reporting
it.
This patch should be backported on each version from 1.6 like the
aforementionned one above.
Change gcc option syntax for Mac. -Wl,--export-dynamic is not
supported, use -Wl,-export_dynamic.
Thanks to Kirill A. Korinsky for the report.
This patch should be backported on each version from 1.6
We used to have one buffer allocator per direction while we can never
block on two buffers at once. Let's have a single one and rely on the
connection's flags to know which one we're waitinf for.
This function takes an h2c and an h2s but it never uses the h2c, which
is a bit confusing at some places in the code. Let's make it clear that
it only operates on the h2s instead by renaming it and removing the
unused h2c argument.
While the haproxy workers usually are running chrooted the master
process is not. This patch is a pretty safe defense in depth measure
to ensure haproxy cannot touch sensitive parts of the file system.
ProtectSystem takes non-boolean arguments in newer SystemD versions,
but setting those would leave older systems such as Ubuntu Xenial
unprotected. Distro maintainers and system administrators could
adapt the ProtectSystem value to the SystemD version they ship.