                           H A - P r o x y
                           ---------------
                             version 1.1.27
                              willy tarreau
                               2003/10/27

============
| Abstract |
============

HA-Proxy is a TCP/HTTP reverse proxy which is particularly suited for high
availability environments. Indeed, it can :
  - route HTTP requests depending on statically assigned cookies ;
  - spread the load among several servers while assuring server persistence
    through the use of HTTP cookies ;
  - switch to backup servers in the event a main one fails ;
  - accept connections to special ports dedicated to service monitoring ;
  - stop accepting connections without breaking existing ones ;
  - add/modify/delete HTTP headers both ways ;
  - block requests matching a particular pattern.

It needs very few resources. Its event-driven architecture allows it to
easily handle thousands of simultaneous connections on hundreds of instances
without risking the system's stability.

====================
| Start parameters |
====================

There are only a few command line options :

    -f <configuration file>
    -n <high limit for the total number of simultaneous connections>
    -N <high limit for the per-proxy number of simultaneous connections>
    -d starts in foreground with debugging mode enabled
    -D starts in daemon mode
    -p <pidfile> asks the process to write down each of its children's
                 pids to this file in daemon mode.
    -s shows statistics (only if compiled in)
    -l shows even more statistics (implies '-s')

The maximal number of connections per proxy is used as the default parameter
for each instance for which the 'maxconn' parameter is not set in the
'listen' section.

The maximal number of total connections limits the number of connections
used by the whole process if the 'maxconn' parameter is not set in the
'global' section.

The debugging mode has the same effect as the 'debug' option in the 'global'
section. When the proxy runs in this mode, it dumps every connection,
disconnection, timestamp, and HTTP header to stdout. This should NEVER be
used in an init script since it will prevent the system from starting up.

Statistics are only available if compiled in with the 'STATTIME' option.
They are only used during code optimization phases.

======================
| Configuration file |
======================

Structure
=========

The configuration file parser ignores empty lines, spaces and tabs. Anything
between a sharp ('#') not following a backslash ('\'), and the end of a line
constitutes a comment and is ignored too.

The configuration file is segmented in sections. A section begins whenever
one of these 3 keywords is encountered :

  - 'global'
  - 'listen'
  - 'defaults'

Every parameter refers to the section beginning at the last one of these 3
keywords.


1) Global parameters
====================

Global parameters affect the whole process behaviour. They are all set in
the 'global' section. There may be several 'global' sections if needed, but
their parameters will only be merged. Allowed parameters in 'global' section
include the following ones :

  - log <address> <facility> [max_level]
  - maxconn <number>
  - uid <identifier>
  - gid <identifier>
  - chroot <directory>
  - nbproc <number>
  - daemon
  - debug
  - quiet
  - pidfile <file>

1.1) Event logging
------------------
Most events are logged : start, stop, servers going up and down, connections
and errors. Each event generates a syslog message which can be sent to up to
2 servers. The syntax is :

    log <address> <facility> [max_level]

Connections are logged at level "info". Service initialization and servers
going up are logged at level "notice", termination signals are logged at
"warning", and definitive service termination, as well as loss of servers,
are logged at level "alert". The optional <max_level> parameter filters
outgoing messages : only messages at least as severe as this level are sent.
Level can take one of these 8 values :

    emerg, alert, crit, err, warning, notice, info, debug

For backwards compatibility with versions 1.1.16 and earlier, the default
level value is "debug" if not specified.

Permitted facilities are :
    kern, user, mail, daemon, auth, syslog, lpr, news,
    uucp, cron, auth2, ftp, ntp, audit, alert, cron2,
    local0, local1, local2, local3, local4, local5, local6, local7

According to RFC3164, messages are truncated to 1024 bytes before being
emitted.

Example :
---------
    global
        log 192.168.2.200 local3
        log 127.0.0.1     local4 notice

1.2) Limiting the number of connections
---------------------------------------
It is possible and recommended to limit the global number of per-process
connections. Since one connection includes both a client and a server, the
maximum number of TCP sessions will be about twice this number. It's
important to understand this when trying to find the best values for
'ulimit -n' before starting the proxy. To anticipate the number of sockets
needed, all these parameters must be counted :

  - 1 socket per incoming connection
  - 1 socket per outgoing connection
  - 1 socket per address/port/proxy tuple.
  - 1 socket per server being health-checked
  - 1 socket for all logs

In simple configurations where each proxy only listens on one address/port,
set the limit of file descriptors (ulimit -n) to
(2 * maxconn + nbproxies + nbservers + 1). In a future release, haproxy may
be able to set this value itself.

1.3) Drop of privileges
-----------------------
In order to reduce the risk and consequences of attacks, in the event that a
not yet identified vulnerability is successfully exploited, it's possible to
lower the process privileges and even isolate it in a riskless directory.

In the 'global' section, the 'uid' parameter sets a numerical user
identifier which the process will switch to after binding its listening
sockets. The value '0', which normally represents the super-user, here
indicates that the UID must not change during startup. It's the default
behaviour. The 'gid' parameter does the same for the group identifier.

Using generic accounts such as 'nobody' is strongly discouraged, because it
has the same consequences as using 'root' if other services use them.

The 'chroot' parameter makes the process isolate itself in an empty
directory just before switching its UID. This type of isolation (chroot) can
sometimes be worked around on certain OS (Linux, Solaris), provided that the
attacker has gained 'root' privileges and has the ability to use or create a
directory. For this reason, it's essential to use a dedicated directory and
not to share one between several services of different nature. To make
isolation more resistant, it's recommended to use an empty directory without
any access rights, and to change the UID of the process so that it cannot do
anything there.

Note: in the event that such a vulnerability is exploited, it's most likely
that first attempts would kill the process due to 'Segmentation Fault', 'Bus
Error' or 'Illegal Instruction' signals.
Even though it's true that isolating the server reduces the risks of
intrusion, it's sometimes useful to find why a process dies, via the
analysis of a 'core' file, although this is very rare (the last bug of this
sort was fixed in 1.1.9). For security reasons, most systems disable the
generation of core files when a process changes its UID. So the two
workarounds are either to start the process from a restricted user account,
which will not be able to chroot itself, or to start it as root and not
change the UID. In both cases the core will be either in the start or the
chroot directories. Do not forget to allow core dumps prior to starting the
process :

    # ulimit -c unlimited

Example :
---------

    global
        uid     30000
        gid     30000
        chroot  /var/chroot/haproxy

1.4) Startup modes
------------------
The service can start in several different modes :
  - foreground / background
  - quiet / normal / debug

The default mode is normal, foreground, which means that the program doesn't
return once started. NEVER EVER use this mode in a system startup script, or
the system won't boot. It needs to be started in background, so that it
returns immediately after forking. That's accomplished by the 'daemon'
option in the 'global' section, which is the equivalent of the '-D' command
line argument.

Moreover, certain alert messages are still sent to the standard output even
in 'daemon' mode. To make them disappear, simply add the 'quiet' option in
the 'global' section. This option has no command-line equivalent.

Last, the 'debug' mode, enabled with the 'debug' option in the 'global'
section, and which is the equivalent of the '-d' option, allows deep
TCP/HTTP analysis, with timestamped display of each connection,
disconnection, and HTTP headers for both ways. This mode is incompatible
with 'daemon' and 'quiet' modes for obvious reasons.

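Whatever the startup mode, 'ulimit -n' has to be raised before the proxy is
launched. The descriptor estimate given in section 1.2 can be sketched in
Python as follows. This is only an illustration of the arithmetic : the
function name 'estimate_fd_limit' and its parameter names are hypothetical
and are not part of haproxy.

```python
def estimate_fd_limit(maxconn, nbproxies, nbservers):
    """Estimate the 'ulimit -n' value for a simple configuration.

    Assumes each proxy listens on a single address/port, as in the
    formula from section 1.2:
        2 * maxconn   one client socket and one server socket per connection
        nbproxies     one listening socket per proxy
        nbservers     one socket per health-checked server
        1             one socket shared by all logs
    """
    return 2 * maxconn + nbproxies + nbservers + 1


if __name__ == "__main__":
    # Hypothetical setup: 2000 connections, 3 proxies, 6 checked servers.
    print(estimate_fd_limit(maxconn=2000, nbproxies=3, nbservers=6))
```

Proxies listening on several addresses or on port ranges need one extra
descriptor per additional address/port couple, which this simple form does
not count.
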
1.5) Increasing the overall processing power
--------------------------------------------
On multi-processor systems, it may seem to be a shame to use only one
processor, even though the load needed to saturate a recent processor is far
above common usage. Anyway, for very specific needs, the proxy can start
several processes between which the operating system will spread the
incoming connections. The number of processes is controlled by the 'nbproc'
parameter in the 'global' section. It defaults to 1, and obviously works
only in 'daemon' mode.

Example :
---------

    global
        daemon
        quiet
        nbproc  2

1.6) Helping process management
-------------------------------
Haproxy now supports the notion of pidfile. If the '-p' command line
argument, or the 'pidfile' global option is followed with a file name, this
file will be removed, then filled with all children's pids, one per line
(only in daemon mode). This file is NOT within the chroot, which makes it
possible to work with a read-only chroot. It will be owned by the user
starting the process, and will have permissions 0644.

Example :
---------

    global
        daemon
        quiet
        nbproc  2
        pidfile /var/run/haproxy-private.pid

    # to stop only those processes among others :
    # kill $(</var/run/haproxy-private.pid)


2) Declaration of a listening service
=====================================

Service sections start with the 'listen' keyword :

    listen <instance_name> [ <IP_address>:<port_range>[,...] ]

- <instance_name> is the name of the instance. This name will be reported in
  logs, so it is good to have it reflect the proxied service. No uniqueness
  test is done on this name, and it's not mandatory for it to be unique, but
  it is highly recommended.

- <IP_address> is the IP address the proxy binds to. Empty address, '*' and
  '0.0.0.0' all mean that the proxy listens to all valid addresses on the
  system.

- <port_range> is either a unique port, or a port range for which the proxy
  will accept connections for the IP address specified above. This range can
  be :
    - a numerical port (ex: '80')
    - a dash-delimited port range explicitly stating the lower and upper
      bounds (ex: '2000-2100'), which are included in the range.
  Particular care must be taken with port ranges, because every <addr:port>
  couple consumes one socket (= a file descriptor), so it's easy to eat lots
  of descriptors with a simple range. The <IP_address:port> couple must be
  used only once among all instances running on a same system. Please note
  that attaching to ports lower than 1024 needs particular privileges to
  start the program, which are independent of the 'uid' parameter.

- the <IP_address>:<port_range> couple may be repeated indefinitely to
  require the proxy to listen to other addresses and/or ports. To achieve
  this, simply separate them with a comma.

Examples :
---------
    listen http_proxy :80
    listen x11_proxy 127.0.0.1:6000-6009
    listen smtp_proxy 127.0.0.1:25,127.0.0.1:587
    listen ldap_proxy :389,:663

In the event that all addresses do not fit the line width, it's preferable
to move secondary addresses to other lines with the 'bind' keyword. If this
keyword is used, it's not even necessary to specify the first address on the
'listen' line, which sometimes makes multiple configuration handling easier :

    bind [ <IP_address>:<port_range>[,...] ]

Examples :
----------
    listen http_proxy
        bind :80,:443
        bind 10.0.0.1:10080,10.0.0.1:10443

2.1) Inhibiting a service
-------------------------
A service may be disabled for maintenance reasons, without needing to
comment out the whole section, simply by specifying the 'disabled' keyword
in the section to be disabled :

    listen smtp_proxy 0.0.0.0:25
        disabled

Note: the 'enabled' keyword makes it possible to enable a service which has
      been disabled previously by a default configuration.

2.2) Modes of operation
-----------------------
A service can work in 3 distinct modes :
  - TCP
  - HTTP
  - monitoring

TCP mode
--------
In this mode, the service relays TCP connections as soon as they're
established, towards one or several servers. No processing is done on the
stream. It's only an association of source(addr:port) ->
destination(addr:port). To use this mode, you must specify 'mode tcp' in the
'listen' section. This is the default mode.

Example :
---------
    listen smtp_proxy 0.0.0.0:25
        mode tcp

HTTP mode
---------
In this mode, the service relays TCP connections towards one or several
servers, once it has enough information to decide, which normally means that
all HTTP headers have been read. Some of them may be scanned for a cookie or
a pattern matching a regex. To use this mode, specify 'mode http' in the
'listen' section.

Example :
---------
    listen http_proxy 0.0.0.0:80
        mode http

Health-checking mode
--------------------
This mode provides a way for external components to check the proxy's
health. It is meant to be used with intelligent load-balancers which can use
send/expect scripts to check for all of their servers' availability. This
mode simply accepts the connection, returns the word 'OK' and closes it. If
the 'option httpchk' is set, then the reply will be 'HTTP/1.0 200 OK' with
no data, so that it can be tested from a tool which supports HTTP
health-checks. To enable it, simply specify 'health' as the working mode :

Example :
---------
    # simple response : 'OK'
    listen health_check 0.0.0.0:60000
        mode health

    # HTTP response : 'HTTP/1.0 200 OK'
    listen http_health_check 0.0.0.0:60001
        mode health
        option httpchk

2.3) Limiting the number of simultaneous connections
----------------------------------------------------
The 'maxconn' parameter allows a proxy to refuse connections above a certain
number of simultaneous ones. When the limit is reached, it simply stops
listening, but the system may still be accepting them because of the backlog
queue. These connections will be processed later, when other ones have freed
some slots. This provides a serialization effect which helps very fragile
servers resist high loads. See further for system limitations.

Example :
---------
    listen tiny_server 0.0.0.0:80
        maxconn 10

2.4) Soft stop
--------------
It is possible to stop services without breaking existing connections by
sending the SIGUSR1 signal to the process.
All services are then put into soft-stop state, which means that they will
refuse to accept new connections, except for those which have a non-zero
value in the 'grace' parameter, in which case they will still accept
connections for the specified amount of time, in milliseconds. This makes it
possible to tell a load-balancer that the service is failing, while still
doing the job during the time it needs to detect it.

Note: active connections are never killed. In the worst case, the user will
have to wait for all of them to close or to time-out, or simply kill the
process normally (SIGTERM). The default 'grace' value is '0'.

Example :
---------
    # enter soft stop after 'killall -USR1 haproxy'
    # the service will still run 10 seconds after the signal
    listen http_proxy 0.0.0.0:80
        mode http
        grace 10000

    # this port is dedicated to a load-balancer, and must fail immediately
    listen health_check 0.0.0.0:60000
        mode health
        grace 0

2.5) Connections expiration time
--------------------------------
It is possible (and recommended) to configure several time-outs on TCP
connections. Three independent timers are adjustable with values specified
in milliseconds. A session will be terminated if any one of these timers
expires.

  - the time we accept to wait for data from the client, or for the client
    to accept data : 'clitimeout' :

        # client time-out set to 2mn30.
        clitimeout 150000

  - the time we accept to wait for data from the server, or for the server
    to accept data : 'srvtimeout' :

        # server time-out set to 30s.
        srvtimeout 30000

  - the time we accept to wait for a connection to establish on a server :
    'contimeout' :

        # we give up if the connection does not complete within 4 seconds
        contimeout 4000

Notes :
-------
  - 'contimeout' and 'srvtimeout' are meaningless for 'health' mode servers ;
  - under high loads, or with a saturated or defective network, it's
    possible that some packets get lost.
    Since the first TCP retransmit only happens after 3 seconds, a time-out
    equal to, or lower than 3 seconds cannot compensate for a packet loss.
    A 4 seconds time-out seems a reasonable minimum which will considerably
    reduce connection failures.

2.6) Attempts to reconnect
--------------------------
After a connection failure to a server, it is possible to retry, potentially
on another server. This is useful if health-checks are too rare and you
don't want the clients to see the failures. The number of attempts to
reconnect is set by the 'retries' parameter.

Example :
---------
    # we can retry 3 times max after a failure
    retries 3

2.7) Address of the dispatch server (deprecated)
------------------------------------------------
The server which will be sent all new connections is defined by the
'dispatch' parameter, in the form <address>:<port>. It generally is
dedicated to unknown connections and will assign them a cookie, in case of
HTTP persistence mode, or simply is a single server in case of generic TCP
proxy. This old mode is only provided for backwards compatibility, but does
not make it possible to check the state of remote servers, and has a rather
limited usage. All new setups should switch to 'balance' mode. The principle
of the dispatcher is to be able to perform the load balancing itself, but
work only on new clients so that the server doesn't need to be a big machine.

Example :
---------
    # all new connections go there
    dispatch 192.168.1.2:80

Note :
------
This parameter is meaningless for 'health' servers, and is incompatible with
'balance' mode.

2.8) Outgoing source address
----------------------------
It is often necessary to bind to a particular address when connecting to
some remote hosts. This is done via the 'source' parameter which is a
per-proxy parameter. A newer version may make it possible to set different
sources to reach different servers. The syntax is
'source <address>[:<port>]', where <address> is a valid local address (or
'0.0.0.0' or '*' or empty to let the system choose), and <port> is an
optional parameter allowing the user to force the source port for very
specific needs. If the port is not specified or is '0', the system will
choose a free port. Note that as of version 1.1.18, the servers health
checks are also performed from the same source.

Examples :
----------
    listen http_proxy *:80
        # all connections take 192.168.1.200 as source address
        source 192.168.1.200:0

    listen rlogin_proxy *:513
        # use address 192.168.1.200 and the reserved port 900 (needs to be root)
        source 192.168.1.200:900

2.9) Setting the cookie name
----------------------------
In HTTP mode, it is possible to look for a particular cookie which will
contain the identifier of the server which should handle the connection. The
cookie name is set via the 'cookie' parameter.

Example :
---------
    listen http_proxy :80
        mode http
        cookie SERVERID

It is possible to change the cookie behaviour to get a smarter persistence,
depending on applications. It is notably possible to delete or modify a
cookie emitted by a server, insert a cookie identifying the server in an
HTTP response and even add a header to tell upstream caches not to cache
this response.

Examples :
----------

To remove the cookie for direct accesses (ie when the server matches the one
which was specified in the client cookie) :

    cookie SERVERID indirect

To replace the cookie value with the one assigned to the server if any (no
cookie will be created if the server does not provide one, nor if the
configuration does not provide one).
This lets the application put the cookie exactly on certain pages (eg:
successful authentication) :

    cookie SERVERID rewrite

To create a new cookie and assign the server identifier to it (in this case,
all servers should be associated with a valid cookie, since a server with no
cookie value will simply cause the cookie to be deleted from the client's
browser) :

    cookie SERVERID insert

To insert a cookie and ensure that no upstream cache will store it, add the
'nocache' option :

    cookie SERVERID insert nocache

To insert a cookie only after a POST request, add 'postonly' after 'insert'.
This has the advantage that there's no risk of caching, and that all pages
seen before the POST one can still be cached :

    cookie SERVERID insert postonly

Notes :
-----------
- it is possible to combine 'insert' with 'indirect' or 'rewrite' to adapt
  to applications which already generate the cookie with an invalid content.

- in the case where 'insert' and 'indirect' are both specified, the cookie
  is never transmitted to the server, since it wouldn't understand it. This
  is the most application-transparent mode.

- it is particularly recommended to use 'nocache' in 'insert' mode if any
  upstream HTTP/1.0 cache is likely to cache the result, because this may
  lead to many clients going to the same server, or even worse, some clients
  having their server changed while retrieving a page from the cache.

- when the application is well known and controlled, the best method is to
  only add the persistence cookie on a POST form because it's up to the
  application to select which page it wants the upstream servers to cache.
  In this case, you would use 'insert postonly indirect'.

2.10) Associating a cookie value with a server
----------------------------------------------
In HTTP mode, it's possible to associate a cookie value to each server. This
was initially used in combination with 'dispatch' mode to handle direct
accesses but it is now the standard way of doing the load balancing. The
syntax is :

    server <identifier> <address>:<port> cookie <value>

- <identifier> is any name which can be used to identify the server in the
  logs.
- <address>:<port> specifies where the server is bound.
- <value> is the value to put in or to read from the cookie.

Example : the 'SERVERID' cookie can be either 'server01' or 'server02'
---------
    listen http_proxy :80
        mode http
        cookie SERVERID
        dispatch 192.168.1.100:80
        server web1 192.168.1.1:80 cookie server01
        server web2 192.168.1.2:80 cookie server02

Warning : the syntax has changed since version 1.0 !
---------

3) Autonomous load balancer
===========================

The proxy can perform the load-balancing itself, both in TCP and in HTTP
modes. This is the most interesting mode, which obsoletes the old 'dispatch'
mode described above. It has advantages such as server health monitoring,
multiple port binding and port mapping. To use this mode, the 'balance'
keyword is used, followed by the selected algorithm. As of version 1.1.23,
only 'roundrobin' is available, which is also the default value if
unspecified. In this mode, there will be no dispatch address, but the proxy
needs at least one server.

Example : same as the last one, with internal load balancer
---------

    listen http_proxy :80
        mode http
        cookie SERVERID
        balance roundrobin
        server web1 192.168.1.1:80 cookie server01
        server web2 192.168.1.2:80 cookie server02

Since version 1.1.22, it is possible to automatically determine on which
port the server will get the connection, depending on the port the client
connected to. Indeed, there now are 4 possible combinations for the server's
<port> field:

  - unspecified or '0' :
    the connection will be sent to the same port as the one on which the
    proxy received the client connection itself.

  - numerical value (the only one supported in versions earlier than 1.1.22) :
    the connection will always be sent to the specified port.

  - '+' followed by a numerical value :
    the connection will be sent to the same port as the one on which the
    proxy received the connection, plus this value.

  - '-' followed by a numerical value :
    the connection will be sent to the same port as the one on which the
    proxy received the connection, minus this value.

Examples :
----------

    # same as previous example
    listen http_proxy :80
        mode http
        cookie SERVERID
        balance roundrobin
        server web1 192.168.1.1 cookie server01
        server web2 192.168.1.2 cookie server02

    # simultaneous relaying of ports 80, 81 and 8080-8089
    listen http_proxy :80,:81,:8080-8089
        mode http
        cookie SERVERID
        balance roundrobin
        server web1 192.168.1.1 cookie server01
        server web2 192.168.1.2 cookie server02

    # relaying of TCP ports 25, 389 and 663 to ports 1025, 1389 and 1663
    listen http_proxy :25,:389,:663
        mode tcp
        balance roundrobin
        server srv1 192.168.1.1:+1000
        server srv2 192.168.1.2:+1000

3.1) Server monitoring
----------------------
It is possible to check the servers status by trying to establish TCP
connections or even sending HTTP requests to them. A server which fails to
reply to health checks as expected will not be used by the load balancing
algorithms. To enable monitoring, add the 'check' keyword on a server line.
It is possible to specify the interval between tests (in milliseconds) with
the 'inter' parameter, the number of failures tolerated before declaring
that the server has gone down with the 'fall' parameter, and the number of
valid checks needed for the server to be considered fully up again with the
'rise' parameter. Since version 1.1.22, it is also possible to send checks
to a different port (mandatory when none is specified) with the 'port'
parameter. The default values are the following ones :

  - inter : 2000
  - rise  : 2
  - fall  : 3
  - port  : default server port

The default mode consists in establishing TCP connections only. But in
certain types of application failures, it often happens that the server
continues to accept connections because the system does it itself while the
application is running an endless loop, or is completely stuck.
So version 1.1.16 introduced HTTP health checks, which only performed simple
lightweight requests and analysed the response. Now, as of version 1.1.23,
it is possible to change the HTTP method, the URI, and the HTTP version
string (which even makes it possible to send headers with a dirty trick). To
enable HTTP health-checks, use 'option httpchk'.

By default, requests use the 'OPTIONS' method because it's very light and
easy to filter from logs, and are sent to '/'. Only HTTP responses 2xx and
3xx are considered valid ones, and only if they come before the time to send
a new request is reached ('inter' parameter). If some servers block this
type of request, 3 other forms help to forge a request (the first line below
is the default form) :

  - option httpchk               -> OPTIONS / HTTP/1.0
  - option httpchk URI           -> OPTIONS <URI> HTTP/1.0
  - option httpchk METH URI      -> <METH> <URI> HTTP/1.0
  - option httpchk METH URI VER  -> <METH> <URI> <VER>

See examples below.

Since version 1.1.17, it is possible to specify backup servers. These
servers are only solicited when no other server is available. This may only
be useful to serve a maintenance page, or define one active and one backup
server (seldom used in TCP mode). To make a server a backup one, simply add
the 'backup' option on its line. These servers also support cookies, so if a
cookie is specified for a backup server, clients assigned to this server
will stick to it even when the other ones come back. Conversely, if no
cookie is assigned to such a server, the clients will get their cookies
removed (empty cookie = removal), and will be balanced against other servers
once they come back. Please note that there is no load-balancing among
backup servers. If there are several backup servers, the second one will
only be used when the first one dies, and so on.

Since version 1.1.17, it is also possible to visually check the status of
all servers at once. For this, you just have to send a SIGHUP signal to the
proxy. The servers status will be dumped into the logs at the 'notice'
level, as well as on <stderr> if not closed.
For this reason, it's always a good idea to have one local log server at the
'notice' level.

Examples :
----------
    # same setup as in paragraph 3) with TCP monitoring
    listen http_proxy 0.0.0.0:80
        mode http
        cookie SERVERID
        balance roundrobin
        server web1 192.168.1.1:80 cookie server01 check
        server web2 192.168.1.2:80 cookie server02 check inter 500 rise 1 fall 2

    # same with HTTP monitoring via 'OPTIONS / HTTP/1.0'
    listen http_proxy 0.0.0.0:80
        mode http
        cookie SERVERID
        balance roundrobin
        option httpchk
        server web1 192.168.1.1:80 cookie server01 check
        server web2 192.168.1.2:80 cookie server02 check inter 500 rise 1 fall 2

    # same with HTTP monitoring via 'OPTIONS /index.html HTTP/1.0'
    listen http_proxy 0.0.0.0:80
        mode http
        cookie SERVERID
        balance roundrobin
        option httpchk /index.html
        server web1 192.168.1.1:80 cookie server01 check
        server web2 192.168.1.2:80 cookie server02 check inter 500 rise 1 fall 2

    # same with HTTP monitoring via 'HEAD /index.jsp? HTTP/1.1\r\nHost: www'
    listen http_proxy 0.0.0.0:80
        mode http
        cookie SERVERID
        balance roundrobin
        option httpchk HEAD /index.jsp? HTTP/1.1\r\nHost:\ www
        server web1 192.168.1.1:80 cookie server01 check
        server web2 192.168.1.2:80 cookie server02 check inter 500 rise 1 fall 2

    # automatic insertion of a cookie in the server's response, and automatic
    # deletion of the cookie in the client request, while asking upstream caches
    # not to cache replies.
    listen web_appl 0.0.0.0:80
        mode http
        cookie SERVERID insert nocache indirect
        balance roundrobin
        server web1 192.168.1.1:80 cookie server01 check
        server web2 192.168.1.2:80 cookie server02 check

    # same with off-site application backup and local error pages server
    listen web_appl 0.0.0.0:80
        mode http
        cookie SERVERID insert nocache indirect
        balance roundrobin
        server web1 192.168.1.1:80 cookie server01 check
        server web2 192.168.1.2:80 cookie server02 check
        server web-backup 192.168.2.1:80 cookie server03 check backup
        server web-excuse 192.168.3.1:80 check backup

    # SMTP+TLS relaying with health-checks and backup servers
    listen http_proxy :25,:587
        mode tcp
        balance roundrobin
        server srv1 192.168.1.1 check port 25 inter 30000 rise 1 fall 2
        server srv2 192.168.1.2 backup

3.2) Redistribute connections in case of failure
------------------------------------------------
In HTTP mode, if a server designated by a cookie does not respond, the
clients may stick to it indefinitely because they cannot flush the cookie,
so they will not be able to access the service anymore. Specifying
'redispatch' will allow the proxy to break their persistence and
redistribute them to working servers.

Example :
---------
    listen http_proxy 0.0.0.0:80
        mode http
        cookie SERVERID
        dispatch 192.168.1.100:80
        server web1 192.168.1.1:80 cookie server01
        server web2 192.168.1.2:80 cookie server02
        redispatch # send back to dispatch in case of connection failure

Up to, and including version 1.1.16, this parameter only applied to
connection failures. Since version 1.1.17, it also applies to servers which
have been detected as failed by the health check mechanism. Indeed, a server
may be broken but still accept connections, so handling connection failures
alone would not cover every case.
But it is possible to keep the old behaviour, that is, make a client insist
on trying to connect to a server even if it is said to be down, by setting
the 'persist' option :

    listen http_proxy 0.0.0.0:80
        mode http
        option persist
        cookie SERVERID
        dispatch 192.168.1.100:80
        server web1 192.168.1.1:80 cookie server01
        server web2 192.168.1.2:80 cookie server02
        redispatch # send back to dispatch in case of connection failure


4) Additional features
======================

Other features are available. They are transparent mode, event logging and
header rewriting/filtering.

4.1) Transparent mode
---------------------
In HTTP mode, the 'transparent' keyword makes it possible to intercept
sessions which are routed through the system hosting the proxy. This mode
was implemented as a replacement for the 'dispatch' mode, since connections
without cookie will be sent to the original address while known cookies will
be sent to the servers. This mode implies that the system can redirect
sessions to a local port.

Example :
---------
    listen http_proxy 0.0.0.0:65000
        mode http
        transparent
        cookie SERVERID
        server server01 192.168.1.1:80
        server server02 192.168.1.2:80

    # iptables -t nat -A PREROUTING -i eth0 -p tcp -d 192.168.1.100 \
      --dport 80 -j REDIRECT --to-ports 65000

Note :
------
If the port is left unspecified on the server, the port the client connected
to will be used. This makes it possible to relay a full port range without
using transparent mode or thousands of file descriptors, provided that the
system can redirect sessions to local ports.

Example :
---------
    # redirect all ports to local port 65000, then forward to the server on the
    # original port.
    listen http_proxy 0.0.0.0:65000
        mode tcp
        server server01 192.168.1.1 check port 60000
        server server02 192.168.1.2 check port 60000

    # iptables -t nat -A PREROUTING -i eth0 -p tcp -d 192.168.1.100 \
      -j REDIRECT --to-ports 65000

4.2) Event logging
------------------

4.2.1) Log levels
-----------------
TCP and HTTP connections can be logged with information such as date, time,
source IP address, destination address, connection duration, response times,
HTTP request, the HTTP return code, number of bytes transmitted, the conditions
in which the session ended, and even exchanged cookie values, to track a
particular user's problems for example. All messages are sent to up to two
syslog servers. Consult section 1.1 for more info about log facilities. The
syntax follows :

    log <address_1> <facility_1> [max_level_1]
    log <address_2> <facility_2> [max_level_2]
or
    log global

Note :
------
The particular syntax 'log global' means that the same log configuration as the
'global' section will be used.

Example :
---------
    listen http_proxy 0.0.0.0:80
        mode http
        log 192.168.2.200 local3
        log 192.168.2.201 local4

4.2.2) Log format
-----------------
By default, connections are logged at the TCP level, as soon as the session is
established between the client and the proxy. By enabling the 'tcplog' option,
the proxy will wait until the session ends to generate an enhanced log
containing more information such as session duration and its state during the
disconnection.

Another option, 'httplog', provides more detailed information about HTTP
contents, such as the request and some cookies. If an external component
establishes frequent connections to check the service, logs may fill up with
useless lines. It is therefore possible not to log any session which did not
transfer any data, by setting the 'dontlognull' option. This only affects
sessions which are established then closed.
Example :
---------
    listen http_proxy 0.0.0.0:80
        mode http
        option httplog
        option dontlognull
        log 192.168.2.200 local3

4.2.3) Timing events
--------------------
Timers are a great help in troubleshooting network problems. All values are
reported in milliseconds (ms). In HTTP mode, four control points are reported
under the form 'Tq/Tc/Tr/Tt' :

  - Tq: total time to get the client request.
    It's the time elapsed between the moment the client connection was accepted
    and the moment the proxy received the last HTTP header. The value '-1'
    indicates that the end of headers (empty line) has never been seen.

  - Tc: total time to establish the TCP connection to the server.
    It's the time elapsed between the moment the proxy sent the connection
    request, and the moment it was acknowledged, or between the TCP SYN packet
    and the matching SYN/ACK in return. The value '-1' means that the
    connection was never established.

  - Tr: server response time. It's the time elapsed between the moment the
    TCP connection was established to the server and the moment it sent its
    complete response header. It purely shows its request processing time,
    without the network overhead due to the data transmission. The value '-1'
    means that the end of the response headers (empty line) was never seen.

  - Tt: total session duration time, between the moment the proxy accepted it
    and the moment both ends were closed. From this one, we can deduce Td, the
    data transmission time, by subtracting other timers when valid :

        Td = Tt - (Tq + Tc + Tr)

    Timers with '-1' values have to be excluded from this equation.

In TCP mode ('option tcplog'), only Tc and Tt are reported.

These timers provide valuable clues about the causes of problems. Since the TCP
protocol defines retransmit delays of 3, 6, 12... seconds, we know for sure
that timers close to multiples of 3s are nearly always related to packets lost
due to network problems (wires or negotiation).
Moreover, if a timer is close to a timeout value specified in the
configuration, it often means that a session has been aborted on time-out.

Most common cases :

  - If Tq is close to 3000, a packet has probably been lost between the client
    and the proxy.
  - If Tc is close to 3000, a packet has probably been lost between the server
    and the proxy during the server connection phase. This one should always be
    very low (less than a few tens).
  - If Tr is nearly always lower than 3000 except some rare values which seem
    to be the average increased by 3000, there are probably some packets lost
    between the proxy and the server.
  - If Tt is often slightly higher than a time-out, it's often because the
    client and the server use HTTP keep-alive and the session is maintained
    after the response ends. See further for how to disable HTTP keep-alive.

Other cases ('xx' means any value to be ignored) :

    -1/xx/xx/Tt : the client was not able to send its complete request in time,
                  or it aborted it too early.
    Tq/-1/xx/Tt : the connection could not be established on the server. Either
                  it refused it or it timed out after Tt-Tq ms.
    Tq/Tc/-1/Tt : the server has accepted the connection but did not return a
                  complete response in time, or it closed its connection
                  unexpectedly, after Tt-(Tq+Tc) ms.

4.2.4) Session state at disconnection
-------------------------------------
TCP and HTTP logs provide a session completion indicator. It's a 4-character
(2 in TCP) field preceding the HTTP request, and indicating :

  - On the first character, a code reporting the first event which caused the
    session to terminate :

        C : the TCP session was aborted by the client.
        S : the TCP session was aborted by the server, or the server refused it.
        P : the session was aborted prematurely by the proxy, either because of
            an internal error, or because a DENY filter was matched.
        c : the client time-out expired first.
        s : the server time-out expired first.
        - : normal session completion.
  - on the second character, the HTTP session state when it was closed :

        R : waiting for complete REQUEST from the client
        C : waiting for CONNECTION to establish on the server
        H : waiting for complete HEADERS from the server
        D : the session was in the DATA phase
        L : the proxy was still transmitting LAST data to the client while the
            server had already finished.
        - : normal session completion after end of data transfer.

  - the third character tells whether the persistence cookie was provided by
    the client (only in HTTP mode) :

        N : the client provided NO cookie.
        I : the client provided an INVALID cookie matching no known server.
        D : the client provided a cookie designating a server which was DOWN,
            so either the 'persist' option was used and the client was sent to
            this server, or it was not set and the client was redispatched to
            another server.
        V : the client provided a valid cookie, and was sent to the associated
            server.
        - : does not apply (no cookie set in configuration).

  - the last character reports what operations were performed on the
    persistence cookie returned by the server (only in HTTP mode) :

        N : NO cookie was provided by the server.
        P : a cookie was PROVIDED by the server and transmitted as-is.
        I : no cookie was provided by the server, and one was INSERTED by the
            proxy.
        D : the cookie provided by the server was DELETED by the proxy.
        R : the cookie provided by the server was REWRITTEN by the proxy.
        - : does not apply (no cookie set in configuration).

The 'capture' keyword makes it possible to capture and log information
exchanged between clients and servers. As of version 1.1.23, only cookies can
be captured, which makes it easy to track a complete user session. The syntax
is :

    capture cookie <cookie_prefix> len <capture_length>

The FIRST cookie whose name starts with <cookie_prefix> will be captured, and
logged as 'NAME=value', without exceeding <capture_length> characters (64 max).
When the cookie name is fixed and known, it's preferable to suffix '=' to it to
ensure that no other cookie will be logged.
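
The prefix matching described above can be illustrated with a short Python
sketch. It is not haproxy's code: the function name 'capture_cookie' is
hypothetical, and it assumes that the length limit applies to the whole
'NAME=value' string and that cookies are separated by ';' in the header.

```python
def capture_cookie(cookie_header, cookie_prefix, capture_length):
    """Return the first 'NAME=value' pair starting with cookie_prefix, or '-'."""
    # The manual states a hard upper bound of 64 characters.
    capture_length = min(capture_length, 64)
    for pair in cookie_header.split(";"):
        pair = pair.strip()
        # Plain prefix match: 'vgnvisitor' also matches 'vgnvisitor2=...'.
        if pair.startswith(cookie_prefix):
            # Assumption: the cap applies to name and value together.
            return pair[:capture_length]
    return "-"


header = "vgnvisitor2=abc; vgnvisitor=xyz"
print(capture_cookie(header, "vgnvisitor", 32))   # matches the first cookie
print(capture_cookie(header, "vgnvisitor=", 32))  # matches only the exact name
```

With the bare prefix, the unrelated 'vgnvisitor2' cookie is picked up first,
which is why suffixing '=' is recommended when the name is fixed.
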
Examples :
----------
    # capture the first cookie whose name starts with "ASPSESSION"
    capture cookie ASPSESSION len 32

    # capture the first cookie whose name is exactly "vgnvisitor"
    capture cookie vgnvisitor= len 32

In the logs, the field preceding the completion indicator contains the cookie
value as sent by the server, preceded by the cookie value as sent by the
client. Each of these fields is replaced with '-' when no cookie was seen.

4.2.5) Examples of logs
-----------------------
- haproxy[674]: 127.0.0.1:33319 [15/Oct/2003:08:31:57] relais-http Srv1 6559/7/147/6723 200 243 - - ---- "HEAD / HTTP/1.0"
  => long request (6.5s) entered by hand through 'telnet'. The server replied
     in 147 ms, and the session ended normally ('----')

- haproxy[18113]: 127.0.0.1:34548 [15/Oct/2003:15:18:55] relais-http -1/-1/-1/8490 -1 0 - - CR-- ""
  => the client never completed its request and aborted itself ('C---') after
     8.5s, while the proxy was waiting for the request headers ('-R--').
     Nothing was sent to the server.

- haproxy[18113]: 127.0.0.1:34549 [15/Oct/2003:15:19:06] relais-http -1/-1/-1/50001 408 0 - - cR-- ""
  => The client never completed its request, which was aborted by the time-out
     ('c---') after 50s, while the proxy was waiting for the request headers
     ('-R--'). Nothing was sent to the server, but the proxy could send a 408
     return code to the client.

- haproxy[18989]: 127.0.0.1:34550 [15/Oct/2003:15:24:28] relais-tcp Srv1 0/5007 0 cD
  => This is a 'tcplog' entry. Client-side time-out ('c-') occurred after 5s.

- haproxy[18989]: 10.0.0.1:34552 [15/Oct/2003:15:26:31] relais-http Srv1 3183/-1/-1/11215 503 0 - - SC-- "HEAD / HTTP/1.0"
  => The request took 3s to complete (probably a network problem), and the
     connection to the server failed ('SC--') after 4 attempts of 2 seconds
     (config says 'retries 3'), then a 503 error code was sent to the client.
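
When reading many such entries, the timer field and the completion indicator
can be decoded mechanically. The following Python sketch is only an
illustration of the rules given in sections 4.2.3 and 4.2.4: the function
names 'parse_timers' and 'describe_state' are hypothetical, and the fields
are expected to be extracted from the log line beforehand.

```python
# Meanings taken from section 4.2.4 of this document.
TERMINATION = {
    "C": "aborted by the client",
    "S": "aborted or refused by the server",
    "P": "aborted prematurely by the proxy",
    "c": "client time-out expired first",
    "s": "server time-out expired first",
    "-": "normal session completion",
}
STATE = {
    "R": "waiting for complete REQUEST from the client",
    "C": "waiting for CONNECTION to establish on the server",
    "H": "waiting for complete HEADERS from the server",
    "D": "session was in the DATA phase",
    "L": "transmitting LAST data to the client",
    "-": "normal completion after end of data transfer",
}


def parse_timers(field):
    """Split an HTTP 'Tq/Tc/Tr/Tt' field and derive Td, all in milliseconds."""
    values = [int(v) for v in field.split("/")]
    if len(values) != 4:
        raise ValueError("expected four timers: Tq/Tc/Tr/Tt")
    timers = dict(zip(("Tq", "Tc", "Tr", "Tt"), values))
    # Td = Tt - (Tq + Tc + Tr), excluding timers reported as -1.
    timers["Td"] = timers["Tt"] - sum(v for v in values[:3] if v >= 0)
    return timers


def describe_state(flags):
    """Explain the first two characters of the completion indicator."""
    return TERMINATION[flags[0]], STATE[flags[1]]


print(parse_timers("6559/7/147/6723")["Td"])  # 10
print(describe_state("SC--"))
```

For the first log entry above, Td is 10 ms: nearly all of the 6.7 seconds
were spent waiting for the request typed by hand.
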

4.3) HTTP header manipulation
-----------------------------
In HTTP mode, it is possible to rewrite, add or delete some of the request and
response headers based on regular expressions. It is also possible to block a
request or a response if a particular header matches a regular expression,
which is enough to stop most elementary protocol attacks, and to protect
against information leaks from the internal network. But there is a limitation
to this : since haproxy's HTTP engine knows nothing about keep-alive, only
headers passed during the first request of a TCP session will be seen. All
subsequent headers will be considered data only and not analyzed. Furthermore,
haproxy doesn't touch data contents, it stops at the end of headers.

The syntax is :
    reqadd    <string>             to add a header to the request
    reqrep    <search> <replace>   to modify the request
    reqirep   <search> <replace>   same, but ignoring the case
    reqdel    <search>             to delete a header in the request
    reqidel   <search>             same, but ignoring the case
    reqallow  <search>             definitely allow a request if a header matches <search>
    reqiallow <search>             same, but ignoring the case
    reqdeny   <search>             denies a request if a header matches <search>
    reqideny  <search>             same, but ignoring the case
    reqpass   <search>             ignore a header matching <search>
    reqipass  <search>             same, but ignoring the case

    rspadd    <string>             to add a header to the response
    rsprep    <search> <replace>   to modify the response
    rspirep   <search> <replace>   same, but ignoring the case
    rspdel    <search>             to delete a header in the response
    rspidel   <search>             same, but ignoring the case

<search> is a POSIX regular expression (regex) which supports grouping through
parentheses (without the backslash). Spaces and other delimiters must be
prefixed with a backslash ('\') to avoid confusion with a field delimiter.
Other characters may be prefixed with a backslash to change their meaning :

  \t     for a tab
  \r     for a carriage return (CR)
  \n     for a new line (LF)
  \      to mark a space and differentiate it from a delimiter
  \#     to mark a sharp and differentiate it from a comment
  \\     to use a backslash in a regex
  \\\\   to use a backslash in the text (*2 for regex, *2 for haproxy)
  \xXX   to write the ASCII hex code XX as in the C language

<replace> contains the string to be used to replace the largest portion of text
matching the regex. It can make use of the special characters above, and can
reference a substring delimited by parentheses in the regex, by the group
numerical order from 1 to 9. In this case, you would write a backslash ('\')
immediately followed by one digit indicating the group position.

<string> represents the string which will systematically be added after the
last header line. It can also use special characters above.

Notes :
-------
  - the first line is considered as a header, which makes it possible to
    rewrite or filter HTTP request URIs or response codes.
  - 'reqrep' is the equivalent of 'cliexp' in version 1.0, and 'rsprep' is the
    equivalent of 'srvexp' in 1.0. Those names are still supported but
    deprecated.
  - for performance reasons, the number of characters added to a request or to
    a response is limited to 4096 since version 1.1.5 (it was 256 before). This
    value is easy to modify in the code if needed (#define). If it is too short
    on occasional uses, it is possible to gain some space by removing some
    useless headers before adding new ones.
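
To get a feel for how a search/replace pair with group references behaves, the
mechanism can be approximated in Python. This is only a sketch: Python's 're'
module is not a POSIX regex engine (though it behaves the same way for simple
patterns like these), the helper names are hypothetical, and only the '\ '
escape is translated. The rule strings are borrowed from this document's own
examples.

```python
import re


def unescape_spaces(text):
    # In the configuration file, '\ ' marks a literal space inside a field.
    return text.replace("\\ ", " ")


def apply_rep(search, replace, lines, ignore_case=False):
    """Apply one reqrep/reqirep-like rule to each header line."""
    flags = re.IGNORECASE if ignore_case else 0
    pattern = re.compile(unescape_spaces(search), flags)
    template = unescape_spaces(replace)
    result = []
    for line in lines:
        match = pattern.search(line)
        if match:
            # match.expand() substitutes \1 .. \9 with the captured groups.
            line = line[:match.start()] + match.expand(template) + line[match.end():]
        result.append(line)
    return result


request = [
    "GET http://www.free.fr/index.html HTTP/1.0",
    "proxy-connection: keep-alive",
]
# The first line (the request line) is treated like any other header.
request = apply_rep(r"^(GET\ .*)(.free.fr)(.*)", r"\1.online.fr\3", request)
request = apply_rep(r"^Proxy-Connection:.*", r"Proxy-Connection:\ close",
                    request, ignore_case=True)
print("\n".join(request))
```

Lines which do not match the regex are passed through unchanged, so each rule
can safely be applied to every header line in turn.
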

Examples :
----------
    ###### a few examples ######

    # rewrite 'online.fr' instead of 'free.fr' for GET and POST requests
    reqrep      ^(GET\ .*)(.free.fr)(.*) \1.online.fr\3
    reqrep      ^(POST\ .*)(.free.fr)(.*) \1.online.fr\3

    # force proxy connections to close
    reqirep     ^Proxy-Connection:.* Proxy-Connection:\ close
    # rewrite locations
    rspirep     ^(Location:\ )([^:]*://[^/]*)(.*) \1\3

    ###### A full configuration being used on production ######

    # Every header should end with a colon followed by one space.
    reqideny    ^[^:\ ]*[\ ]*$

    # block Apache chunk exploit
    reqideny    ^Transfer-Encoding:[\ ]*chunked
    reqideny    ^Host:\ apache-

    # block annoying worms that fill the logs...
    reqideny    ^[^:\ ]*\ .*(\.|%2e)(\.|%2e)(%2f|%5c|/|\\\\)
    reqideny    ^[^:\ ]*\ ([^\ ]*\ [^\ ]*\ |.*%00)
    reqideny    ^[^:\ ]*\ .*