                     ----------------------
                             HAProxy
                      Configuration Manual
                     ----------------------
                          version 1.9
                         willy tarreau
                           2018/09/29


This document covers the configuration language as implemented in the version
specified above. It does not provide any hints, examples, or advice. For such
documentation, please refer to the Reference Manual or the Architecture Manual.
The summary below is meant to help you find sections by name and navigate
through the document.

Note to documentation contributors :
    This document is formatted with 80 columns per line, with even number of
    spaces for indentation and without tabs. Please follow these rules strictly
    so that it remains easily printable everywhere. If a line needs to be
    printed verbatim and does not fit, please end each line with a backslash
    ('\') and continue on next line, indented by two characters. It is also
    sometimes useful to prefix all output lines (logs, console outputs) with 3
    closing angle brackets ('>>>') in order to emphasize the difference between
    inputs and outputs when they may be ambiguous. If you add sections, please
    update the summary below for easier searching.


Summary
-------

1.    Quick reminder about HTTP
1.1.      The HTTP transaction model
1.2.      HTTP request
1.2.1.        The request line
1.2.2.        The request headers
1.3.      HTTP response
1.3.1.        The response line
1.3.2.        The response headers

2.    Configuring HAProxy
2.1.      Configuration file format
2.2.      Quoting and escaping
2.3.      Environment variables
2.4.      Time format
2.5.      Examples

3.    Global parameters
3.1.      Process management and security
3.2.      Performance tuning
3.3.      Debugging
3.4.      Userlists
3.5.      Peers
3.6.      Mailers

4.    Proxies
4.1.      Proxy keywords matrix
4.2.      Alphabetically sorted keywords reference

5.    Bind and server options
5.1.      Bind options
5.2.      Server and default-server options
5.3.      Server DNS resolution
5.3.1.        Global overview
5.3.2.        The resolvers section

6.    HTTP header manipulation

7.    Using ACLs and fetching samples
7.1.      ACL basics
7.1.1.        Matching booleans
7.1.2.        Matching integers
7.1.3.        Matching strings
7.1.4.        Matching regular expressions (regexes)
7.1.5.        Matching arbitrary data blocks
7.1.6.        Matching IPv4 and IPv6 addresses
7.2.      Using ACLs to form conditions
7.3.      Fetching samples
7.3.1.        Converters
7.3.2.        Fetching samples from internal states
7.3.3.        Fetching samples at Layer 4
7.3.4.        Fetching samples at Layer 5
7.3.5.        Fetching samples from buffer contents (Layer 6)
7.3.6.        Fetching HTTP samples (Layer 7)
7.4.      Pre-defined ACLs

8.    Logging
8.1.      Log levels
8.2.      Log formats
8.2.1.        Default log format
8.2.2.        TCP log format
8.2.3.        HTTP log format
8.2.4.        Custom log format
8.2.5.        Error log format
8.3.      Advanced logging options
8.3.1.        Disabling logging of external tests
8.3.2.        Logging before waiting for the session to terminate
8.3.3.        Raising log level upon errors
8.3.4.        Disabling logging of successful connections
8.4.      Timing events
8.5.      Session state at disconnection
8.6.      Non-printable characters
8.7.      Capturing HTTP cookies
8.8.      Capturing HTTP headers
8.9.      Examples of logs

9.    Supported filters
9.1.      Trace
9.2.      HTTP compression
9.3.      Stream Processing Offload Engine (SPOE)

10.   Cache
10.1.     Limitation
10.2.     Setup
10.2.1.       Cache section
10.2.2.       Proxy section


1. Quick reminder about HTTP
----------------------------

When HAProxy is running in HTTP mode, both the request and the response are
fully analyzed and indexed, thus it becomes possible to build matching criteria
on almost anything found in the contents.

However, it is important to understand how HTTP requests and responses are
formed, and how HAProxy decomposes them. It will then become easier to write
correct rules and to debug existing configurations.


1.1. The HTTP transaction model
-------------------------------

The HTTP protocol is transaction-driven. This means that each request will lead
to one and only one response. Traditionally, a TCP connection is established
from the client to the server, a request is sent by the client through the
connection, the server responds, and the connection is closed.
A new request will involve a new connection :

  [CON1] [REQ1] ... [RESP1] [CLO1] [CON2] [REQ2] ... [RESP2] [CLO2] ...

In this mode, called the "HTTP close" mode, there are as many connection
establishments as there are HTTP transactions. Since the connection is closed
by the server after the response, the client does not need to know the content
length.

Due to the transactional nature of the protocol, it was possible to improve it
to avoid closing a connection between two subsequent transactions. In this mode
however, it is mandatory that the server indicates the content length for each
response so that the client does not wait indefinitely. For this, a special
header is used: "Content-length". This mode is called the "keep-alive" mode :

  [CON] [REQ1] ... [RESP1] [REQ2] ... [RESP2] [CLO] ...

Its advantages are a reduced latency between transactions, and less processing
power required on the server side. It is generally better than the close mode,
but not always because the clients often limit their concurrent connections to
a smaller value.

Another improvement in the communications is the pipelining mode. It still uses
keep-alive, but the client does not wait for the first response to send the
second request. This is useful for fetching a large number of images composing
a page :

  [CON] [REQ1] [REQ2] ... [RESP1] [RESP2] [CLO] ...

This can obviously have a tremendous benefit on performance because the network
latency is eliminated between subsequent requests. Many HTTP agents do not
correctly support pipelining since there is no way to associate a response with
the corresponding request in HTTP. For this reason, it is mandatory for the
server to reply in the exact same order as the requests were received.

The next improvement is the multiplexed mode, as implemented in HTTP/2. This
time, each transaction is assigned a single stream identifier, and all streams
are multiplexed over an existing connection.
Many requests can be sent in parallel by the client, and responses can arrive
in any order since they also carry the stream identifier.

By default HAProxy operates in keep-alive mode with regard to persistent
connections: for each connection it processes each request and response, and
leaves the connection idle on both sides between the end of a response and the
start of a new request. When it receives HTTP/2 connections from a client, it
processes all the requests in parallel and leaves the connection idling,
waiting for new requests, just as if it was a keep-alive HTTP connection.

HAProxy supports 4 connection modes :
  - keep alive    : all requests and responses are processed (default)
  - tunnel        : only the first request and response are processed,
                    everything else is forwarded with no analysis.
  - server close  : the server-facing connection is closed after the response.
  - close         : the connection is actively closed after end of response.

For HTTP/2, the connection mode resembles more the "server close" mode : given
the independence of all streams, there is currently no place to hook the idle
server connection after a response, so it is closed after the response. HTTP/2
is only supported for incoming connections, not on connections going to
servers.


1.2. HTTP request
-----------------

First, let's consider this HTTP request :

  Line     Contents
  number
     1     GET /serv/login.php?lang=en&profile=2 HTTP/1.1
     2     Host: www.mydomain.com
     3     User-agent: my small browser
     4     Accept: image/jpeg, image/gif
     5     Accept: image/png


1.2.1. The Request line
-----------------------

Line 1 is the "request line". It is always composed of 3 fields :

  - a METHOD      : GET
  - a URI         : /serv/login.php?lang=en&profile=2
  - a version tag : HTTP/1.1

All of them are delimited by what the standard calls LWS (linear white spaces),
which are commonly spaces, but can also be tabs or line feeds/carriage returns
followed by spaces/tabs. The method itself cannot contain any colon (':') and
is limited to alphabetic letters.
All those various combinations make it desirable that HAProxy performs the
splitting itself rather than leaving it to the user to write a complex or
inaccurate regular expression.

The URI itself can have several forms :

  - A "relative URI" :

      /serv/login.php?lang=en&profile=2

    It is a complete URL without the host part. This is generally what is
    received by servers, reverse proxies and transparent proxies.

  - An "absolute URI", also called a "URL" :

      http://192.168.0.12:8080/serv/login.php?lang=en&profile=2

    It is composed of a "scheme" (the protocol name followed by '://'), a host
    name or address, optionally a colon (':') followed by a port number, then
    a relative URI beginning at the first slash ('/') after the address part.
    This is generally what proxies receive, but a server supporting HTTP/1.1
    must accept this form too.

  - a star ('*') : this form is only accepted in association with the OPTIONS
    method and is not relayable. It is used to inquire about a next hop's
    capabilities.

  - an address:port combination : 192.168.0.12:80
    This is used with the CONNECT method, which is used to establish TCP
    tunnels through HTTP proxies, generally for HTTPS, but sometimes for
    other protocols too.

In a relative URI, two sub-parts are identified. The part before the question
mark is called the "path". It is typically the relative path to static objects
on the server. The part after the question mark is called the "query string".
It is mostly used with GET requests sent to dynamic scripts and is very
specific to the language, framework or application in use.

HTTP/2 doesn't convey a version information with the request, so the version is
assumed to be the same as the one of the underlying protocol (i.e. "HTTP/2").
However, haproxy natively processes HTTP/1.x requests and headers, so requests
received over an HTTP/2 connection are transcoded to HTTP/1.1 before being
processed. This explains why they still appear as "HTTP/1.1" in haproxy's logs
as well as in server logs.


1.2.2. The request headers
--------------------------

The headers start at the second line. They are composed of a name at the
beginning of the line, immediately followed by a colon (':'). Traditionally,
an LWS is added after the colon but that's not required. Then come the values.
Multiple identical headers may be folded into one single line, delimiting the
values with commas, provided that their order is respected. This is commonly
encountered in the "Cookie:" field. A header may span over multiple lines if
the subsequent lines begin with an LWS. In the example in 1.2, lines 4 and 5
define a total of 3 values for the "Accept:" header.

Contrary to a common misconception, header names are not case-sensitive, and
their values are not either if they refer to other header names (such as the
"Connection:" header). In HTTP/2, header names are always sent in lower case,
as can be seen when running in debug mode.

The end of the headers is indicated by the first empty line. People often say
that it's a double line feed, which is not exact, even if a double line feed
is one valid form of empty line.

Fortunately, HAProxy takes care of all these complex combinations when indexing
headers, checking values and counting them, so there is no reason to worry
about the way they could be written, but it is important not to accuse an
application of being buggy if it does unusual, valid things.

Important note:
   As suggested by RFC7231, HAProxy normalizes headers by replacing line breaks
   in the middle of headers by LWS in order to join multi-line headers. This
   is necessary for proper analysis and helps less capable HTTP parsers to work
   correctly and not to be fooled by such complex constructs.


1.3. HTTP response
------------------

An HTTP response looks very much like an HTTP request. Both are called HTTP
messages.
Let's consider this HTTP response :

  Line     Contents
  number
     1     HTTP/1.1 200 OK
     2     Content-length: 350
     3     Content-Type: text/html

As a special case, HTTP supports so called "Informational responses" as status
codes 1xx. These messages are special in that they don't convey any part of the
response, they're just used as sort of a signaling message to ask a client to
continue to post its request for instance. In the case of a status 100 response
the requested information will be carried by the next non-100 response message
following the informational one. This implies that multiple responses may be
sent to a single request, and that this only works when keep-alive is enabled
(1xx messages are HTTP/1.1 only). HAProxy handles these messages and is able to
correctly forward and skip them, and only process the next non-100 response. As
such, these messages are neither logged nor transformed, unless explicitly
stated otherwise. Status 101 messages indicate that the protocol is changing
over the same connection and that haproxy must switch to tunnel mode, just as
if a CONNECT had occurred. Then the Upgrade header would contain additional
information about the type of protocol the connection is switching to.


1.3.1. The response line
------------------------

Line 1 is the "response line". It is always composed of 3 fields :

  - a version tag : HTTP/1.1
  - a status code : 200
  - a reason      : OK

The status code is always 3-digit. The first digit indicates a general status :
 - 1xx = informational message to be skipped (e.g. 100, 101)
 - 2xx = OK, content is following   (e.g. 200, 206)
 - 3xx = OK, no content following   (e.g. 302, 304)
 - 4xx = error caused by the client (e.g. 401, 403, 404)
 - 5xx = error caused by the server (e.g. 500, 502, 503)

Please refer to RFC7231 for the detailed meaning of all such codes. The
"reason" field is just a hint, but is not parsed by clients. Anything can be
found there, but it's a common practice to respect the well-established
messages.
It can be composed of one or multiple words, such as "OK", "Found", or
"Authentication Required".

HAProxy may emit the following status codes by itself :

  Code  When / reason
  200   access to stats page, and when replying to monitoring requests
  301   when performing a redirection, depending on the configured code
  302   when performing a redirection, depending on the configured code
  303   when performing a redirection, depending on the configured code
  307   when performing a redirection, depending on the configured code
  308   when performing a redirection, depending on the configured code
  400   for an invalid or too large request
  401   when an authentication is required to perform the action (when
        accessing the stats page)
  403   when a request is forbidden by a "block" ACL or "reqdeny" filter
  408   when the request timeout strikes before the request is complete
  500   when haproxy encounters an unrecoverable internal error, such as a
        memory allocation failure, which should never happen
  502   when the server returns an empty, invalid or incomplete response, or
        when an "rspdeny" filter blocks the response.
  503   when no server was available to handle the request, or in response to
        monitoring requests which match the "monitor fail" condition
  504   when the response timeout strikes before the server responds

The error 4xx and 5xx codes above may be customized (see "errorloc" in section
4.2).


1.3.2. The response headers
---------------------------

Response headers work exactly like request headers, and as such, HAProxy uses
the same parsing function for both. Please refer to paragraph 1.2.2 for more
details.


2. Configuring HAProxy
----------------------

2.1. Configuration file format
------------------------------

HAProxy's configuration process involves 3 major sources of parameters :

  - the arguments from the command-line, which always take precedence
  - the "global" section, which sets process-wide parameters
  - the proxies sections which can take the form of "defaults", "listen",
    "frontend" and "backend".
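
As an illustration only, the three sources listed above can be seen together in
the following minimal sketch. It uses only keywords and command-line arguments
mentioned elsewhere in this manual ("daemon", "-db", "-f"). The file name
"example.cfg", the proxy name "example", the server name "srv1" and the
addresses are hypothetical :

    # example.cfg
    global
        daemon                  # process-wide : fork into background

    defaults
        mode http
        timeout connect 5s
        timeout client  50s
        timeout server  50s

    listen example
        bind *:8080
        server srv1 127.0.0.1:8000

    # "-db" on the command line takes precedence over "daemon" in the file
    $ haproxy -db -f example.cfg
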

The configuration file syntax consists of lines beginning with a keyword
referenced in this manual, optionally followed by one or several parameters
delimited by spaces.


2.2. Quoting and escaping
-------------------------

HAProxy's configuration introduces a quoting and escaping system similar to
many programming languages. The configuration file supports 3 types: escaping
with a backslash, weak quoting with double quotes, and strong quoting with
single quotes.

If spaces have to be entered in strings, then they must be escaped by preceding
them by a backslash ('\') or by quoting them. Backslashes also have to be
escaped by doubling or strong quoting them.

Escaping is achieved by preceding a special character by a backslash ('\'):

  \    to mark a space and differentiate it from a delimiter
  \#   to mark a hash and differentiate it from a comment
  \\   to use a backslash
  \'   to use a single quote and differentiate it from strong quoting
  \"   to use a double quote and differentiate it from weak quoting

Weak quoting is achieved by using double quotes (""). Weak quoting prevents
the interpretation of:

       space as a parameter separator
  '    single quote as a strong quoting delimiter
  #    hash as a comment start

Weak quoting permits the interpretation of variables. If you want to use a
non-interpreted dollar within a double quoted string, you should escape it with
a backslash ("\$"); this does not work outside weak quoting.

Interpretation of escaping and special characters are not prevented by weak
quoting.

Strong quoting is achieved by using single quotes (''). Inside single quotes,
nothing is interpreted, it's the efficient way to quote regexes.

Quoted and escaped strings are replaced in memory by their interpreted
equivalent, it allows you to perform concatenation.

  Example:
      # these are equivalent:
      log-format %{+Q}o\ %t\ %s\ %{-Q}r
      log-format "%{+Q}o %t %s %{-Q}r"
      log-format '%{+Q}o %t %s %{-Q}r'
      log-format "%{+Q}o %t"' %s %{-Q}r'
      log-format "%{+Q}o %t"' %s'\ %{-Q}r

      # these are equivalent:
      reqrep "^([^\ :]*)\ /static/(.*)"     \1\ /\2
      reqrep "^([^ :]*)\ /static/(.*)"     '\1 /\2'
      reqrep "^([^ :]*)\ /static/(.*)"     "\1 /\2"
      reqrep "^([^ :]*)\ /static/(.*)"     "\1\ /\2"


2.3. Environment variables
--------------------------

HAProxy's configuration supports environment variables. Those variables are
interpreted only within double quotes. Variables are expanded during the
configuration parsing. Variable names must be preceded by a dollar ("$") and
optionally enclosed with braces ("{}") similarly to what is done in Bourne
shell. Variable names can contain alphanumerical characters or the character
underscore ("_") but should not start with a digit.

  Example:

      bind "fd@${FD_APP1}"

      log "${LOCAL_SYSLOG}:514" local0 notice   # send to local server

      user "$HAPROXY_USER"

A special variable $HAPROXY_LOCALPEER is defined at the startup of the process
which contains the name of the local peer. (See "-L" in the management guide.)


2.4. Time format
----------------

Some parameters involve values representing time, such as timeouts. These
values are generally expressed in milliseconds (unless explicitly stated
otherwise) but may be expressed in any other unit by suffixing the unit to the
numeric value. It is important to consider this because it will not be repeated
for every keyword. Supported units are :

  - us : microseconds. 1 microsecond = 1/1000000 second
  - ms : milliseconds. 1 millisecond = 1/1000 second. This is the default.
  - s  : seconds. 1s = 1000ms
  - m  : minutes. 1m = 60s = 60000ms
  - h  : hours.   1h = 60m = 3600s = 3600000ms
  - d  : days.    1d = 24h = 1440m = 86400s = 86400000ms


2.5. Examples
-------------

    # Simple configuration for an HTTP proxy listening on port 80 on all
    # interfaces and forwarding requests to a single backend "servers" with a
    # single server "server1" listening on 127.0.0.1:8000
    global
        daemon
        maxconn 256

    defaults
        mode http
        timeout connect 5000ms
        timeout client 50000ms
        timeout server 50000ms

    frontend http-in
        bind *:80
        default_backend servers

    backend servers
        server server1 127.0.0.1:8000 maxconn 32


    # The same configuration defined with a single listen block. Shorter but
    # less expressive, especially in HTTP mode.
    global
        daemon
        maxconn 256

    defaults
        mode http
        timeout connect 5000ms
        timeout client 50000ms
        timeout server 50000ms

    listen http-in
        bind *:80
        server server1 127.0.0.1:8000 maxconn 32


Assuming haproxy is in $PATH, test these configurations in a shell with:

    $ sudo haproxy -f configuration.conf -c


3. Global parameters
--------------------

Parameters in the "global" section are process-wide and often OS-specific. They
are generally set once and for all and do not need to be changed once correct.
Some of them have command-line equivalents.

The following keywords are supported in the "global" section :

 * Process management and security
   - ca-base
   - chroot
   - crt-base
   - cpu-map
   - daemon
   - description
   - deviceatlas-json-file
   - deviceatlas-log-level
   - deviceatlas-separator
   - deviceatlas-properties-cookie
   - external-check
   - gid
   - group
   - hard-stop-after
   - log
   - log-tag
   - log-send-hostname
   - lua-load
   - nbproc
   - nbthread
   - node
   - pidfile
   - presetenv
   - resetenv
   - uid
   - ulimit-n
   - user
   - setenv
   - stats
   - ssl-default-bind-ciphers
   - ssl-default-bind-ciphersuites
   - ssl-default-bind-options
   - ssl-default-server-ciphers
   - ssl-default-server-ciphersuites
   - ssl-default-server-options
   - ssl-dh-param-file
   - ssl-server-verify
   - unix-bind
   - unsetenv
   - 51degrees-data-file
   - 51degrees-property-name-list
   - 51degrees-property-separator
   - 51degrees-cache-size
   - wurfl-data-file
   - wurfl-information-list
   - wurfl-information-list-separator
   - wurfl-engine-mode
   - wurfl-cache-size
   - wurfl-useragent-priority

 * Performance tuning
   - max-spread-checks
   - maxconn
   - maxconnrate
   - maxcomprate
   - maxcompcpuusage
   - maxpipes
   - maxsessrate
   - maxsslconn
   - maxsslrate
   - maxzlibmem
   - noepoll
   - nokqueue
   - nopoll
   - nosplice
   - nogetaddrinfo
   - noreuseport
   - spread-checks
   - server-state-base
   - server-state-file
   - ssl-engine
   - ssl-mode-async
   - tune.buffers.limit
   - tune.buffers.reserve
   - tune.bufsize
   - tune.chksize
   - tune.comp.maxlevel
   - tune.h2.header-table-size
   - tune.h2.initial-window-size
   - tune.h2.max-concurrent-streams
   - tune.http.cookielen
   - tune.http.logurilen
   - tune.http.maxhdr
   - tune.idletimer
   - tune.lua.forced-yield
   - tune.lua.maxmem
   - tune.lua.session-timeout
   - tune.lua.task-timeout
   - tune.lua.service-timeout
   - tune.maxaccept
   - tune.maxpollevents
   - tune.maxrewrite
   - tune.pattern.cache-size
   - tune.pipesize
   - tune.rcvbuf.client
   - tune.rcvbuf.server
   - tune.recv_enough
   - tune.runqueue-depth
   - tune.sndbuf.client
   - tune.sndbuf.server
   - tune.ssl.cachesize
   - tune.ssl.lifetime
   - tune.ssl.force-private-cache
   - tune.ssl.maxrecord
   - tune.ssl.default-dh-param
   - tune.ssl.ssl-ctx-cache-size
   - tune.ssl.capture-cipherlist-size
   - tune.vars.global-max-size
   - tune.vars.proc-max-size
   - tune.vars.reqres-max-size
   - tune.vars.sess-max-size
   - tune.vars.txn-max-size
   - tune.zlib.memlevel
   - tune.zlib.windowsize

 * Debugging
   - debug
   - quiet


3.1. Process management and security
------------------------------------

ca-base <dir>
  Assigns a default directory to fetch SSL CA certificates and CRLs from when a
  relative path is used with "ca-file" or "crl-file" directives. Absolute
  locations specified in "ca-file" and "crl-file" prevail and ignore "ca-base".

chroot <jail dir>
  Changes current directory to <jail dir> and performs a chroot() there before
  dropping privileges. This increases the security level in case an unknown
  vulnerability would be exploited, since it would make it very hard for the
  attacker to exploit the system. This only works when the process is started
  with superuser privileges. It is important to ensure that <jail dir> is both
  empty and non-writable to anyone.

cpu-map [auto:]<process-set>[/<thread-set>] <cpu-set>...
  On Linux 2.6 and above, it is possible to bind a process or a thread to a
  specific CPU set. This means that the process or the thread will never run on
  other CPUs. The "cpu-map" directive specifies CPU sets for process or thread
  sets. The first argument is a process set, optionally followed by a thread
  set. These sets have the format

    all | odd | even | number[-[number]]

  <number> must be a number between 1 and 32 or 64, depending on the machine's
  word size. Any process IDs above nbproc and any thread IDs above nbthread are
  ignored. It is possible to specify a range with two such numbers delimited by
  a dash ('-'). It also is possible to specify all processes at once using
  "all", only odd numbers using "odd" or even numbers using "even", just like
  with the "bind-process" directive. The second and forthcoming arguments are
  CPU sets. Each CPU set is either a unique number between 0 and 31 or 63 or a
  range with two such numbers delimited by a dash ('-').
  Multiple CPU numbers or ranges may be specified, and the processes or threads
  will be allowed to bind to all of them. Obviously, multiple "cpu-map"
  directives may be specified. Each "cpu-map" directive will replace the
  previous ones when they overlap. A thread will be bound on the intersection
  of its mapping and the one of the process on which it is attached. If the
  intersection is null, no specific binding will be set for the thread.

  Ranges can be partially defined. The higher bound can be omitted. In such
  case, it is replaced by the corresponding maximum value, 32 or 64 depending
  on the machine's word size.

  The prefix "auto:" can be added before the process set to let HAProxy
  automatically bind a process or a thread to a CPU by incrementing
  process/thread and CPU sets. To be valid, both sets must have the same
  size. No matter the declaration order of the CPU sets, it will be bound from
  the lowest to the highest bound. Having a process and a thread range with the
  "auto:" prefix is not supported. Only one range is supported, the other one
  must be a fixed number.

  Examples:
      cpu-map 1-4 0-3  # bind processes 1 to 4 on the first 4 CPUs

      cpu-map 1/all 0-3 # bind all threads of the first process on the
                        # first 4 CPUs

      cpu-map 1- 0-    # will be replaced by "cpu-map 1-64 0-63"
                       # or "cpu-map 1-32 0-31" depending on the machine's
                       # word size.

      # all these lines bind the process 1 to the cpu 0, the process 2 to cpu 1
      # and so on.
      cpu-map auto:1-4   0-3
      cpu-map auto:1-4   0-1 2-3
      cpu-map auto:1-4   3 2 1 0

      # all these lines bind the thread 1 to the cpu 0, the thread 2 to cpu 1
      # and so on.
      cpu-map auto:1/1-4   0-3
      cpu-map auto:1/1-4   0-1 2-3
      cpu-map auto:1/1-4   3 2 1 0

      # bind each process to exactly one CPU using all/odd/even keyword
      cpu-map auto:all   0-63
      cpu-map auto:even  0-31
      cpu-map auto:odd   32-63

      # invalid cpu-map because process and CPU sets have different sizes.
      cpu-map auto:1-4   0    # invalid
      cpu-map auto:1     0-3  # invalid

      # invalid cpu-map because automatic binding is used with a process range
      # and a thread range.
      cpu-map auto:all/all  0    # invalid
      cpu-map auto:all/1-4  0    # invalid
      cpu-map auto:1-4/all  0    # invalid

crt-base <dir>
  Assigns a default directory to fetch SSL certificates from when a relative
  path is used with "crtfile" directives. Absolute locations specified after
  "crtfile" prevail and ignore "crt-base".

daemon
  Makes the process fork into background. This is the recommended mode of
  operation. It is equivalent to the command line "-D" argument. It can be
  disabled by the command line "-db" argument. This option is ignored in
  systemd mode.

deviceatlas-json-file <path>
  Sets the path of the DeviceAtlas JSON data file to be loaded by the API.
  The path must be a valid JSON data file and accessible by HAProxy process.

deviceatlas-log-level <value>
  Sets the level of information returned by the API. This directive is
  optional and set to 0 by default if not set.

deviceatlas-separator <char>
  Sets the character separator for the API properties results. This directive
  is optional and set to | by default if not set.

deviceatlas-properties-cookie <name>
  Sets the client cookie's name used for the detection if the DeviceAtlas
  Client-side component was used during the request. This directive is optional
  and set to DAPROPS by default if not set.

external-check
  Allows the use of an external agent to perform health checks.
  This is disabled by default as a security precaution.
  See "option external-check".

gid <number>
  Changes the process' group ID to <number>. It is recommended that the group
  ID is dedicated to HAProxy or to a small set of similar daemons. HAProxy must
  be started with a user belonging to this group, or with superuser privileges.
  Note that if haproxy is started from a user having supplementary groups, it
  will only be able to drop these groups if started with superuser privileges.
  See also "group" and "uid".

hard-stop-after