Commit Graph

66 Commits

Author SHA1 Message Date
Willy Tarreau
171819b5d7 [MINOR] tcp: src_count acl does not have a permanent result
This ACL's count can change along the session's life because it depends
on other sessions' activity. Switch it to volatile since any session
could appear while evaluating the ACLs.
2010-08-10 18:04:11 +02:00
Willy Tarreau
fb35620e87 [MEDIUM] session: support "tcp-request content" rules in backends
Sometimes it's necessary to be able to perform some "layer 6" analysis
in the backend. TCP request rules were not available till now, although
documented in the diagram. Enable them in backend now.
2010-08-10 14:10:58 +02:00
Willy Tarreau
f535683123 [BUG] config: report the correct proxy type in tcp-request errors
A copy-paste typo caused a wrong proxy's type to be reported in case of
parsing errors.
2010-06-14 18:40:26 +02:00
Willy Tarreau
6a984fa7c1 [CLEANUP] proto_tcp: make the config parser a little bit more flexible
We'll need to let the tcp-request parser able to delegate parsing of
track-counters to a commun function, let's prepare it.
2010-06-14 16:44:27 +02:00
Willy Tarreau
cb18364ca7 [MEDIUM] stick_table: separate storage and update of session entries
When an entry already exists, we just need to update its expiration
timer. Let's have a dedicated function for that instead of spreading
open code everywhere.

This change also ensures that an update of an existing sticky session
really leads to an update of its expiration timer, which was apparently
not the case till now. This point needs to be checked in 1.4.
2010-06-14 15:10:26 +02:00
Willy Tarreau
a975b8f381 [MINOR] tcp: add per-source connection rate limiting
This change makes use of the stick-tables to keep track of any source
address activity. Two ACLs make it possible to check the count of an
entry or update it and act accordingly. The typical usage will be to
reject a TCP request upon match of an excess value.
2010-06-14 15:10:25 +02:00
Willy Tarreau
2799e98a36 [MINOR] frontend: count denied TCP requests separately
It's very disturbing to see the "denied req" counter increase without
any other session counter moving. In fact, we can't count a rejected
TCP connection as "denied req" as we have not yet instanciated any
session at all. Let's use a new counter for that.
2010-06-14 10:53:20 +02:00
Willy Tarreau
a5c0ab200b [MEDIUM] frontend: check for LI_O_TCP_RULES in the listener
The new LI_O_TCP_RULES listener option indicates that some TCP rules
must be checked upon accept on this listener. It is now checked by
the frontend and the L4 rules are evaluated only in this case. The
flag is only set when at least one tcp-req rule is present in the
frontend.

The L4 rules check function has now been moved to proto_tcp.c where
it ought to be.
2010-06-14 10:53:13 +02:00
Willy Tarreau
1a68794418 [MEDIUM] config: parse tcp layer4 rules (tcp-request accept/reject)
These rules currently only support the "accept" and "reject" actions.
They will apply on pure layer 4 and will not support any content.
2010-06-14 10:53:12 +02:00
Willy Tarreau
eb472685cb [MEDIUM] separate protocol-level accept() from the frontend's
For a long time we had two large accept() functions, one for TCP
sockets instanciating proxies, and another one for UNIX sockets
instanciating the stats interface.

A lot of code was duplicated and both did not work exactly the same way.

Now we have a stream_sock layer accept() called for either TCP or UNIX
sockets, and this function calls the frontend-specific accept() function
which does the rest of the frontend-specific initialisation.

Some code is still duplicated (session & task allocation, stream interface
initialization), and might benefit from having an intermediate session-level
accept() callback to perform such initializations. Still there are some
minor differences that need to be addressed first. For instance, the monitor
nets should only be checked for proxies and not for other connection templates.

Last, we renamed l->private as l->frontend. The "private" pointer in
the listener is only used to store a frontend, so let's rename it to
eliminate this ambiguity. When we later support detached listeners
(eg: FTP), we'll add another field to avoid the confusion.
2010-06-14 10:53:11 +02:00
Willy Tarreau
03fa5df64a [CLEANUP] rename client -> frontend
The 'client.c' file now only contained frontend-specific functions,
so it has naturally be renamed 'frontend.c'. Same for client.h. This
has also been an opportunity to remove some cross references from
files that should not have depended on it.

In the end, this file should contain a protocol-agnostic accept()
code, which would initialize a session, task, etc... based on an
accept() from a lower layer. Right now there are still references
to TCP.
2010-06-14 10:53:10 +02:00
Willy Tarreau
645513ade8 [CLEANUP] client: move some ACLs away to their respective locations
Some ACLs in the client ought to belong to proto_tcp, or protocols.
This file should only contain frontend-specific information and will
be renamed that way in next commit.
2010-06-14 10:53:10 +02:00
Willy Tarreau
44b90cc4d8 [CLEANUP] tcp: move some non tcp-specific layer6 processing out of proto_tcp
Some functions which act on generic buffer contents without being
tcp-specific were historically in proto_tcp.c. This concerns ACLs
and RDP cookies. Those have been moved away to more appropriate
locations. Ideally we should create some new files for each layer6
protocol parser. Let's do that later.
2010-06-14 10:53:09 +02:00
Willy Tarreau
06457871a4 [CLEANUP] acl: use 'L6' instead of 'L4' in ACL flags relying on contents
Just like we do on health checks, we should consider that ACLs that make
use of buffer data are layer 6 and not layer 4, because we'll soon have
to distinguish between pure layer 4 ACLs (without any buffer) and these
ones.
2010-06-14 10:53:09 +02:00
Willy Tarreau
23968d898a [BUG] tcp: dropped connections must be counted as "denied" not "failed"
This probably was a copy-paste typo from the initial tcp-request feature.
This must be backported to 1.4 and possibly 1.3.
2010-05-28 18:10:31 +02:00
Willy Tarreau
c4262961f8 [MEDIUM] acl: add tree-based lookups of exact strings
Now if some ACL patterns are loaded from a file and the operation is
an exact string match, the data will be arranged in a tree, yielding
a significant performance boost on large data sets. Note that this
only works when case is sensitive.

A new dedicated function, acl_lookup_str(), has been created for this
matching. It is called for every possible input data to test and it
looks the tree up for the data. Since the keywords are loosely typed,
we would have had to add a new columns to all keywords to adjust the
function depending on the type. Instead, we just compare on the match
function. We call acl_lookup_str() when we could use acl_match_str().
The tree lookup is performed first, then the remaining patterns are
attempted if the tree returned nothing.

A quick test shows that when matching a header against a list of 52000
network names, haproxy uses 68% of one core on a core2-duo 3.2 GHz at
42000 requests per second, versus 66% without any rule, which means
only a 2% CPU increase for 52000 rules. Doing the same test without
the tree leads to 100% CPU at 6900 requests/s. Also it was possible
to run the same test at full speed with about 50 sets of 52000 rules
without any measurable performance drop.
2010-05-13 21:37:45 +02:00
Willy Tarreau
090466c91a [MINOR] add new tproxy flags for dynamic source address binding
This patch adds a new TPROXY bind type, TPROXY_DYN, to indicate to the
TCP connect function that we want to bind to the address passed in
argument.
2010-03-30 09:59:44 +02:00
Willy Tarreau
b1d67749db [MEDIUM] backend: move the transparent proxy address selection to backend
The transparent proxy address selection was set in the TCP connect function
which is not the most appropriate place since this function has limited
access to the amount of parameters which could produce a source address.

Instead, now we determine the source address in backend.c:connect_server(),
right after calling assign_server_address() and we assign this address in
the session and pass it to the TCP connect function. This cannot be performed
in assign_server_address() itself because in some cases (transparent mode,
dispatch mode or http_proxy mode), we assign the address somewhere else.

This change will open the ability to bind to addresses extracted from many
other criteria (eg: from a header).
2010-03-30 09:59:43 +02:00
Willy Tarreau
ef6494cb8c [CLEANUP] config: use build_acl_cond() instead of parse_acl_cond()
This allows to clean up the code a little bit by moving some of the
ACL internals out of the config parser.
2010-01-28 17:12:36 +01:00
Willy Tarreau
e803de2c6b [MINOR] add the ability to force kernel socket buffer size.
Sometimes we need to be able to change the default kernel socket
buffer size (recv and send). Four new global settings have been
added for this :
   - tune.rcvbuf.client
   - tune.rcvbuf.server
   - tune.sndbuf.client
   - tune.sndbuf.server

Those can be used to reduce kernel memory footprint with large numbers
of concurrent connections, and to reduce risks of write timeouts with
very slow clients due to excessive kernel buffering.
2010-01-22 11:49:41 +01:00
Willy Tarreau
7c3c54177a [MAJOR] buffers: automatically compute the maximum buffer length
We used to apply a limit to each buffer's size in order to leave
some room to rewrite headers, then we used to remove this limit
once the session switched to a data state.

Proceeding that way becomes a problem with keepalive because we
have to know when to stop reading too much data into the buffer
so that we can leave some room again to process next requests.

The principle we adopt here consists in only relying on to_forward+send_max.
Indeed, both of those data define how many bytes will leave the buffer.
So as long as their sum is larger than maxrewrite, we can safely
fill the buffers. If they are smaller, then we refrain from filling
the buffer. This means that we won't risk to fill buffers when
reading last data chunk followed by a POST request and its contents.

The only impact identified so far is that we must ensure that the
BF_FULL flag is correctly dropped when starting to forward. Right
now this is OK because nobody inflates to_forward without using
buffer_forward().
2009-12-22 10:06:34 +01:00
Krzysztof Piotr Oledzki
97f07b832f [MEDIUM] Decrease server health based on http responses / events, version 3
Implement decreasing health based on observing communication between
HAProxy and servers.

Changes in this version 2:
 - documentation
 - close race between a started check and health analysis event
 - don't force fastinter if it is not set
 - better names for options
 - layer4 support

Changes in this version 3:
 - add stats
 - port to the current 1.4 tree
2009-12-16 00:29:27 +01:00
Willy Tarreau
8d5d77efc3 [OPTIM] move some rarely used fields out of fdtab
Some rarely information are stored in fdtab, making it larger for no
reason (source port ranges, remote address, ...). Such information
lie there because the checks can't find them anywhere else. The goal
will be to move these information to the stream interface once the
checks make use of it.

For now, we move them to an fdinfo array. This simple change might
have improved the cache hit ratio a little bit because a 0.5% of
performance increase has measured.
2009-10-18 08:17:33 +02:00
Willy Tarreau
cb6cd43725 [MINOR] tcp: add support for the defer_accept bind option
This can ensure that data is readily available on a socket when
we accept it, but a bug in the kernel ignores the timeout so the
socket can remain pending as long as the client does not talk.
Use with care.
2009-10-13 07:34:14 +02:00
Krzysztof Piotr Oledzki
aeebf9ba65 [MEDIUM] Collect & provide separate statistics for sockets, v2
This patch allows to collect & provide separate statistics for each socket.
It can be very useful if you would like to distinguish between traffic
generate by local and remote users or between different types of remote
clients (peerings, domestic, foreign).

Currently no "Session rate" is supported, but adding it should be possible
if we found it useful.
2009-10-04 18:56:02 +02:00
Krzysztof Piotr Oledzki
052d4fd07d [CLEANUP] Move counters to dedicated structures
Move counters from "struct proxy" and "struct server"
to "struct pxcounters" and "struct svcounters".

This patch should make no functional change.
2009-10-04 18:32:39 +02:00
Willy Tarreau
520d95e42b [MAJOR] buffers: split BF_WRITE_ENA into BF_AUTO_CONNECT and BF_AUTO_CLOSE
The BF_WRITE_ENA buffer flag became very complex to deal with, because
it was used to :
  - enable automatic connection
  - enable close forwarding
  - enable data forwarding

The last point was not very true anymore since we introduced ->send_max,
but still the test remained everywhere. This was causing issues such as
impossibility to connect without forwarding data, impossibility to prevent
closing when data was forwarded, etc...

This patch clarifies the situation by getting rid of this multi-purpose
flag and replacing it with :
  - data forwarding based only on ->send_max || ->pipe ;
  - a new BF_AUTO_CONNECT flag to allow automatic connection and only
    that ;
  - ability to perform an automatic connection when ->send_max or ->pipe
    indicate that data is waiting to leave the buffer ;
  - a new BF_AUTO_CLOSE flag to let the producer automatically set the
    BF_SHUTW_NOW flag when it gets a BF_SHUTR.

During this cleanup, it was discovered that some tests were performed
twice, or that the BF_HIJACK flag was still tested, which is not needed
anymore since ->send_max replcaed it. These places have been fixed too.

These cleanups have also revealed a few areas where the other flags
such as BF_EMPTY are not cleanly used. This will be an opportunity for
a second patch.
2009-09-19 21:14:54 +02:00
Dmitry Sivachenko
caf58986fb [BUILD] compilation of haproxy-1.4-dev2 on FreeBSD
Please consider the following patches. They are required to
compile haproxy-1.4-dev2 on FreeBSD.

Summary:
1) include <sys/types.h> before <netinet/tcp.h>
2) Use IPPROTO_TCP instead of SOL_TCP
(they are both defined as 6, TCP protocol number)
2009-08-30 14:45:19 +02:00
Willy Tarreau
9650f37628 [MEDIUM] move connection establishment from backend to the SI.
The connection establishment was completely handled by backend.c which
normally just handles LB algos. Since it's purely TCP, it must move to
proto_tcp.c. Also, instead of calling it directly, we now call it via
the stream interface, which will later help us unify session handling.
2009-08-16 17:46:15 +02:00
Willy Tarreau
c9fce2fee8 [BUILD] fix build for systems without SOL_TCP
Andrew Azarov reported that haproxy-1.4-dev1 does not build
under FreeBSD 7.2 because SOL_TCP is not defined. So add a
check for its definition before using it. This only impacts
network optimisations anyway.
2009-08-16 14:13:47 +02:00
Willy Tarreau
606ad73e73 [BUG] config: tcp-request content only accepts "if" or "unless"
As reported by Maik Broemme, if something different from "if" or
"unless" was specified after "tcp-request content accept", the
condition would silently remain void. The parser must obviously
complain since this typically corresponds to a forgotten "if".
2009-07-14 21:17:05 +02:00
Willy Tarreau
1a211943f6 [MINOR] acl: don't complain anymore when using L7 acls in TCP
Since TCP can now check contents using L7 acls, we must not
complain anymore.
2009-07-14 13:53:17 +02:00
Emeric Brun
647caf1ebc [MEDIUM] add support for RDP cookie persistence
The new statement "persist rdp-cookie" enables RDP cookie
persistence. The RDP cookie is then extracted from the RDP
protocol, and compared against available servers. If a server
matches the RDP cookie, then it gets the connection.
2009-07-14 12:50:40 +02:00
Emeric Brun
bede3d0ef4 [MINOR] acl: add support for matching of RDP cookies
The RDP protocol is quite simple and documented, which permits
an easy detection and extraction of cookies. It can be useful
to match the MSTS cookie which can contain the username specified
by the client.
2009-07-14 12:50:39 +02:00
Willy Tarreau
51d5dad90a [MINOR] allow TCP inspection rules to make use of HTTP ACLs
Since we can call the HTTP parser from TCP inspection rules, it makes
sense to be able to use the HTTP ACLs with it. That way, we can decide
from a TCP frontend to take a switching decision based on full layer7
decoding. This might be useful to perform layer7 content switching from
a layer4 frontend in fact. For instance, we might want to be able to
detect http/https on a frontend, but still switch to backend X or Y
depending on the Host header. Note that it is mandatory to wait for
an HTTP request otherwise the ACLs will randomly match.
2009-07-12 10:10:05 +02:00
Willy Tarreau
a9fb08317f [MINOR] report in the proxies the requirements for ACLs
This patch propagates the ACL conditions' "requires" bitfield
to the proxies. This makes it possible to know exactly what a
proxy might have to support for any request, which helps knowing
whether we have to allocate some space for certain types of
structures or not (eg: the hdr_idx struct).

The concept might be extended to a lot more types of information,
such as detecting whether we need to allocate some space for some
request ACLs which need a result in the response, etc...
2009-07-10 23:09:39 +02:00
Willy Tarreau
3a816293e9 [MEDIUM] session: tell analysers what bit they were called for
Some stream analysers might become generic enough to be called
for several bits. So we cannot have the analyser bit hard coded
into the analyser itself. Let's make the caller inform the callee.
2009-07-07 10:55:49 +02:00
Willy Tarreau
5d707e1aaa [MEDIUM] stream_sock: don't close prematurely when nolinger is set
When the nolinger option is used, we must not close too fast because
some data might be left unsent. Instead we must proceed with a normal
shutdown first, then a close. Also, we want to avoid merging FIN with
the last segment if nolinger is set, because if that one gets lost,
there is no chance for it to be retransmitted.
2009-06-28 11:09:07 +02:00
Willy Tarreau
be1b91842a [MEDIUM] add support for TCP MSS adjustment for listeners
Sometimes it can be useful to limit the advertised TCP MSS on
incoming connections, for instance when requests come through
a VPN or when the system is running with jumbo frames enabled.

Passing the "mss <value>" arguments to a "bind" line will set
the value. This works under Linux >= 2.6.28, and maybe a few
earlier ones, though due to an old kernel bug most of earlier
versions will probably ignore it. It is also possible that some
other OSes will support this.
2009-06-14 18:48:19 +02:00
Willy Tarreau
fb14edc215 [MEDIUM] stream_sock: implement tcp-cork for use during shutdowns on Linux
Setting TCP_CORK on a socket before sending the last segment enables
automatic merging of this segment with the FIN from the shutdown()
call. Playing with TCP_CORK is not easy though as we have to track
the status of the TCP_NODELAY flag since both are mutually exclusive.
Doing so saves one more packet per session and offers about 5% more
performance.

There is no reason not to do it, so there is no associated option.
2009-06-14 15:24:37 +02:00
Willy Tarreau
9ea05a790f [MEDIUM] implement option tcp-smart-accept at the frontend
This option disables TCP quick ack upon accept. It is also
automatically enabled in HTTP mode, unless the option is
explicitly disabled with "no option tcp-smart-accept".

This saves one packet per connection which can bring reasonable
amounts of bandwidth for servers processing small requests.
2009-06-14 12:07:01 +02:00
Willy Tarreau
8e80e0bc4c [BUG] fix parser crash on unconditional tcp content rules
Since 1.3.17, a config containing one of the following lines would
crash the parser :

    tcp content reject
    tcp content accept

This is because a check is performed on the condition which is not
specified. The obvious fix consists in checkinf for a condition
first.
2009-05-10 12:22:39 +02:00
Willy Tarreau
61d188920e [MINOR] improve reporting of misplaced acl/reqxxx rules
Now we can detect improper ordering of "block", "reqxxx", "reqadd",
"redirect" and "use_backend", and warn the user accordingly.
2009-03-31 10:49:21 +02:00
Willy Tarreau
86ef7dc98d [MINOR] tcp_request: let the caller take care of errors and timeouts
tcp_request is not meant to decide how an error or a timeout has to
be handled. It must just apply it rules. Now that the error checks
have been added to the session, we don't need to check them anymore
in tcp_request_inspect(), which will only consider the shutdown which
may be the result of such an error.

That makes a lot more sense since tcp_request is not really waiting
for a request.
2009-03-15 22:55:47 +01:00
Willy Tarreau
5af24efee9 [CLEANUP] config: catch and report some possibly wrong rule ordering
There are some configurations in which redirect rules are declared
after use_backend rules. We can also find "block" rules after any
of these ones. The processing sequence is :
  - block
  - redirect
  - use_backend

So as of now we try to detect wrong ordering to warn the user about
a possibly undesired behaviour.
2009-03-15 15:23:16 +01:00
Willy Tarreau
d869b24119 [MINOR] tcp-inspect: permit the use of no-delay inspection
Sometimes it may make sense to be able to immediately apply a verdict
without waiting at all. It was not possible because no inspect-delay
meant no inspection at all. This is now fixed.
2009-03-15 14:43:58 +01:00
Willy Tarreau
604e83097f [BUG] interface binding: length must include the trailing zero
The interface length passed to the setsockopt(SO_BINDTODEVICE) must
include the trailing \0. Otherwise it will randomly fail.
2009-03-06 00:48:23 +01:00
Willy Tarreau
5e6e204d1c [MINOR] add support for bind interface name
By appending "interface <name>" to a "bind" line, it is now possible
to specifically bind to a physical interface name. Note that this
currently only works on Linux and requires root privileges.
2009-02-04 17:19:29 +01:00
Willy Tarreau
03d60bbaf9 [OPTIM] buffer: replace rlim by max_len
In the buffers, the read limit used to leave some place for header
rewriting was set by a pointer to the end of the buffer. Not only
this required subtracts at every place in the code, but this will
also soon not be usable anymore when we want to support keepalive.

Let's replace this with a length limit, comparable to the buffer's
length. This has also sightly reduced the code size.
2009-01-09 11:14:39 +01:00
Willy Tarreau
b5654f6ff4 [MINOR] move the listener reference from fd to session
The listener referenced in the fd was only used to check the
listener state upon session termination. There was no guarantee
that the FD had not been reassigned by the moment it was processed,
so this was a bit racy. Having it in the session is more robust.
2008-12-07 16:45:10 +01:00