The new "rate-limit sessions" statement sets a limit on the number of
new connections per second on the frontend. As it is extremely accurate
(about 0.1%), it is efficient at limiting resource abuse or DoS.
There is a problem when an instance is marked "disabled". Its ports are
still bound but will not be unbound upon termination. This causes processes
to accumulate during soft restarts, and might even cause failures to restart
new ones due to the inability to bind to the same port.
The ideal solution would be to bind all ports at the end of the configuration
parsing. An acceptable workaround is to unbind all listeners of disabled
proxies. This is what the current patch does.
(cherry picked from commit a944218e9c)
(cherry picked from commit 8cfebbb82b87345bade831920177077e7d25840a)
During a configuration reload, haproxy tried to pause all proxies.
Unfortunately, it also tried to pause backends, which would fail
and cause trouble to the new process since the port was still bound.
(backported from commit eab5c70f93)
(cherry picked from commit ac1ca38e9b07422e21b5b4778918d243768e5498)
It should be stated as a rule that a C file should never
include types/xxx.h when proto/xxx.h exists, as it gives
less exposure to declaration conflicts (one of which was
caught and fixed here) and it complicates the file headers
for nothing.
Only types/global.h, types/capture.h and types/polling.h
have been found to be valid includes from C files.
This is the first attempt at moving all internal parts from
using struct timeval to integer ticks. Those provides simpler
and faster code due to simplified operations, and this change
also saved about 64 bytes per session.
A new header file has been added : include/common/ticks.h.
It is possible that some functions should finally not be inlined
because they're used quite a lot (eg: tick_first, tick_add_ifset
and tick_is_expired). More measurements are required in order to
decide whether this is interesting or not.
Some function and variable names are still subject to change for
a better overall logics.
We now insert tasks in a certain sequence in the run queue.
The sorting key currently is the arrival order. It will now
be possible to apply a "nice" value to any task so that it
goes forwards or backwards in the run queue.
The calls to wake_expired_tasks() and maintain_proxies()
have been moved to the main run_poll_loop(), because they
had nothing to do in process_runnable_tasks().
The task_wakeup() function is not inlined anymore, as it was
only used at one place.
The qlist member of the task structure has been removed now.
The run_queue list has been replaced for an integer indicating
the number of tasks in the run queue.
The first implementation of the monotonic clock did not verify
forward jumps. The consequence is that a fast changing time may
expire a lot of tasks. While it does seem minor, in fact it is
problematic because most machines which boot with a wrong date
are in the past and suddenly see their time jump by several
years in the future.
The solution is to check if we spent more apparent time in
a poller than allowed (with a margin applied). The margin
is currently set to 1000 ms. It should be large enough for
any poll() to complete.
Tests with randomly jumping clock show that the result is quite
accurate (error less than 1 second at every change of more than
one second).
If the system date is set backwards while haproxy is running,
some scheduled events are delayed by the amount of time the
clock went backwards. This is particularly problematic on
systems where the date is set at boot, because it seldom
happens that health-checks do not get sent for a few hours.
Before switching to use clock_gettime() on systems which
provide it, we can at least ensure that the clock is not
going backwards and maintain two clocks : the "date" which
represents what the user wants to see (mostly for logs),
and an internal date stored in "now", used for scheduled
events.
This patch implements ability to set the current state of one server
by tracking another one. It:
- adds two variables: *tracknext, *tracked to struct server
- implements findserver(), similar to findproxy()
- adds "track" keyword accepting both "proxy/server" and "server" (assuming current proxy)
- verifies if both checks and tracking is not enabled at the same time
- changes set_server_down() to notify tracking server
- creates set_server_up(), set_server_disabled(), set_server_enabled() by
moving the code from process_chk() and adding notifications
- changes stats to show a name of tracked server instead of Chk/Dwn/Dwntime(html)
or by adding new variable (csv)
Changes from the previuos version:
- it is possibile to track independently of the declaration order
- one extra comma bug is fixed
- new condition to check if there is no disable-on-404 inconsistency
This patch adds two new variables: fastinter and downinter.
When server state is:
- non-transitionally UP -> inter (no change)
- transitionally UP (going down), unchecked or transitionally DOWN (going up) -> fastinter
- down -> downinter
It allows to set something like:
server sr6 127.0.51.61:80 cookie s6 check inter 10000 downinter 20000 fastinter 500 fall 3 weight 40
In the above example haproxy uses 10000ms between checks but as soon as
one check fails fastinter (500ms) is used. If server is down
downinter (20000) is used or fastinter (500ms) if one check pass.
Fastinter is also used when haproxy starts.
New "timeout.check" variable was added, if set haproxy uses it as an additional
read timeout, but only after a connection has been already established. I was
thinking about using "timeout.server" here but most people set this
with an addition reserve but still want checks to kick out laggy servers.
Please also note that in most cases check request is much simpler
and faster to handle than normal requests so this timeout should be smaller.
I also changed the timeout used for check connections establishing.
Changes from the previous version:
- use tv_isset() to check if the timeout is set,
- use min("timeout connect", "inter") but only if "timeout check" is set
as this min alone may be to short for full (connect + read) check,
- debug code (fprintf) commented/removed
- documentation
Compile tested only (sorry!) as I'm currently traveling but changes
are rather small and trivial.
In order to offer DoS protection, it may be required to lower the maximum
accepted time to receive a complete HTTP request without affecting the client
timeout. This helps protecting against established connections on which
nothing is sent. The client timeout cannot offer a good protection against
this abuse because it is an inactivity timeout, which means that if the
attacker sends one character every now and then, the timeout will not
trigger. With the HTTP request timeout, no matter what speed the client
types, the request will be aborted if it does not complete in time.
Add the "backlog" parameter to frontends, to give hints to
the system about the approximate listen backlog desired size.
In order to protect against SYN flood attacks, one solution is
to increase the system's SYN backlog size. Depending on the
system, sometimes it is just tunable via a system parameter,
sometimes it is not adjustable at all, and sometimes the system
relies on hints given by the application at the time of the
listen() syscall. By default, HAProxy passes the frontend's
maxconn value to the listen() syscall. On systems which can
make use of this value, it can sometimes be useful to be able
to specify a different value, hence this backlog parameter.
A new "timeout" keyword replaces old "{con|cli|srv}timeout", and
provides the ability to independantly set the following timeouts :
- client
- tarpit
- queue
- connect
- server
- appsession
Additionally, the "clitimeout", "contimeout" and "srvtimeout" values
are supported but deprecated. No warning is emitted yet when they are
used since the option is very new.
Other timeouts should follow soon now.
AIX does not know about MSG_DONTWAIT. Fortunately, nearly all sockets
are already set to O_NONBLOCK, so it's not even required to change the
code. It was only necessary to add this fcntl to the log socket which
lacked it. The MSG_DONTWAIT value has been defined to zero when unset
in order to make the code cleaner and more portable.
Also, on AIX, "hz" is defined, which causes a problem with one function
parameter in time.c. It's enough to rename the parameter there. Last,
fix a missing #include <string.h> in proxy.c.
It is very convenient for SNMP monitoring to have unique process ID,
proxy ID and server ID. Those have been added to the CSV outputs.
The numbers start at 1. 0 is reserved. For servers, 0 means that the
reported name is not a server name but half a proxy (FRONTEND/BACKEND).
A remaining hidden "-" in the CSV output has been eliminated too.
Proxy listeners were very special and not very easy to manipulate.
A proto_tcp file has been created with all that is required to
manage TCPv4/TCPv6 as raw protocols, and provide generic listeners.
The code of start_proxies() and maintain_proxies() now looks less
like spaghetti. Also, event_accept will need a serious lifting in
order to use more of the information provided by the listener.
It's not easy to report useful information to help the user quickly
fix a configuration. This patch :
- removes the word "listener" in favor of "proxy" as it has been
used since the beginning ;
- ensures that the same function (hence the same words) will be
used to report capabilities of a proxy being declared and an
existing proxy ;
- avoid the term "conflicting capabilities" in favor of "overlapping
capabilities" which is more exact.
- just report that the same name is reused in case of warnings
This patch:
- adds proxy_mode_str() similar to proxy_type_str()
- adds a generic findproxy function used with default_backend/setbe/use_backed
- rewrite default_backend/senbe/use_backed to use introduced findproxy()
- relaxes duplicated proxy check
- changes capabilities displaying from "%X" to "%s" with a call to proxy_type_str()
The stream_sock_* functions had to know about sessions just in
order to get the server's address for a connect() operation. This
is not desirable, particularly for non-IP protocols (eg: PF_UNIX).
Put a pointer to the peer's sockaddr_storage or sockaddr address
in the fdtab structure so that we never need to look further.
With this small change, the stream_sock.c file is now 100% protocol
independant.
The following patch will give the ability to tweak socket linger mode.
You can use this option with "option nolinger" inside fronted or backend
configuration declaration.
This will help in environments where lots of FIN_WAIT sockets are
encountered.
When we're interrupted by another instance, it is very likely
that the other one will need some memory. Now we know how to
free what is not used, so let's do it.
Also only free non-null pointers. Previously, pool_destroy()
did implicitly check for this case which was incidentely
needed.
The timeout functions were difficult to manipulate because they were
rounding results to the millisecond. Thus, it was difficult to compare
and to check what expired and what did not. Also, the comparison
functions were heavy with multiplies and divides by 1000. Now, all
timeouts are stored in timevals, reducing the number of operations
for updates and leading to cleaner and more efficient code.
Now fdtab can contain the FD_POLL_* events so that the pollers
which can fill them can give userful information to readers and
writers about the precise condition of wakeup.
It may be dangerous to play with fdtab before doing fd_insert()
because this last one is responsible for growing maxfd as needed.
Call fd_insert() before instead.
Some pollers such as kqueue lose their FD across fork(), meaning that
the registered file descriptors are lost too. Now when the proxies are
started by start_proxies(), the file descriptors are not registered yet,
leaving enough time for the fork() to take place and to get a new pollfd.
It will be the first call to maintain_proxies that will register them.
The notion of capabilities has been added to the proxy so that we
know whether a proxy supports frontend, backend, or rulesets. Given
this, some parameters are optionnal, some are ignored with a warning
and others are forbidden. It is now possible to write valid two level
configs without binding to dummy address/ports.
The nbconn attribute in the proxies was not relevant anymore because
a frontend A may use backend B and both of them must account for their
respective connections. For this reason, there now are two separate
counters for frontend and backend connections.
The stats page has been updated to reflect the backend, but a separate
line entry for the frontend with error counts would be good.
Note that as of now, beconn may be higher than maxconn, because maxconn
applies to the frontend, while beconn may be increased due to sessions
passed from another frontend.
The files are now stored under :
- include/haproxy for the generic includes
- include/types.h for the structures needed within prototypes
- include/proto.h for function prototypes and inline functions
- src/*.c for the C files
Most include files are now covered by LGPL. A last move still needs
to be done to put inline functions under GPL and not LGPL.
Version has been set to 1.3.0 in the code but some control still
needs to be done before releasing.