[DOC] added documentation about HTTP header manipulations

This section has been inserted before the logging section.
Willy Tarreau 2008-01-17 20:35:34 +01:00
parent 303c035725
commit ced27013b6

@ -854,7 +854,7 @@ capture cookie <name> len <length>
capture cookie ASPSESSION len 32
See also : "capture request header", "capture response header" as well as
-section 2.5 about logging.
+section 2.6 about logging.
capture request header <name> len <length>
@ -891,7 +891,7 @@ capture request header <name> len <length>
capture request header X-Forwarded-For len 15
capture request header Referer len 15
-See also : "capture cookie", "capture response header" as well as section 2.5
+See also : "capture cookie", "capture response header" as well as section 2.6
about logging.
@ -927,7 +927,7 @@ capture response header <name> len <length>
capture response header Content-length len 9
capture response header Location len 15
-See also : "capture cookie", "capture request header" as well as section 2.5
+See also : "capture cookie", "capture request header" as well as section 2.6
about logging.
@ -1706,7 +1706,7 @@ no option dontlognull
If this option has been enabled in a "defaults" section, it can be disabled
in a specific instance by prepending the "no" keyword before it.
-See also : "log", "monitor-net", "monitor-uri" and section 2.5 about logging.
+See also : "log", "monitor-net", "monitor-uri" and section 2.6 about logging.
option forceclose
@ -1905,7 +1905,7 @@ option httplog
This option may be set either in the frontend or the backend.
-See also : section 2.5 about logging.
+See also : section 2.6 about logging.
option logasap
@ -1926,7 +1926,7 @@ no option logasap
"Content-Length" response header so that the logs at least indicate how many
bytes are expected to be transferred.
-See also : "option httplog", "capture response header", and section 2.5 about
+See also : "option httplog", "capture response header", and section 2.6 about
logging.
@ -2160,7 +2160,7 @@ option tcplog
This option may be set either in the frontend or the backend.
-See also : "option httplog", and section 2.5 about logging.
+See also : "option httplog", and section 2.6 about logging.
option tcpsplice [ experimental ]
@ -2245,7 +2245,7 @@ reqadd <string>
Arguments :
<string> is the complete line to be added. Any space or known delimiter
must be escaped using a backslash ('\'). Please refer to section
-2.6 about HTTP header manipulation for more information.
+2.5 about HTTP header manipulation for more information.
A new line consisting of <string> followed by a line feed will be added after
the last header of an HTTP request.
@ -2254,7 +2254,7 @@ reqadd <string>
and not to traffic generated by HAProxy, such as health-checks or error
responses.
-See also: "rspadd" and section 2.6 about HTTP header manipulation
+See also: "rspadd" and section 2.5 about HTTP header manipulation
reqallow <search>
@ -2285,7 +2285,7 @@ reqiallow <search> (ignore case)
reqiallow ^Host:\ www\.
reqideny ^Host:\ .*\.local
-See also: "reqdeny", "acl", "block" and section 2.6 about HTTP header
+See also: "reqdeny", "acl", "block" and section 2.5 about HTTP header
manipulation
@ -2316,7 +2316,7 @@ reqidel <search> (ignore case)
reqidel ^X-Forwarded-For:.*
reqidel ^Cookie:.*SERVER=
-See also: "reqadd", "reqrep", "rspdel" and section 2.6 about HTTP header
+See also: "reqadd", "reqrep", "rspdel" and section 2.5 about HTTP header
manipulation
@ -2340,6 +2340,10 @@ reqideny <search> (ignore case)
headers. Keep in mind that URLs in request line are case-sensitive while
header names are not.
A denied request will generate an "HTTP 403 forbidden" response once the
complete request has been parsed. This is consistent with what is practised
using ACLs.
It is easier, faster and more powerful to use ACLs to write access policies.
Reqdeny, reqallow and reqpass should be avoided in new designs.
@ -2348,7 +2352,7 @@ reqideny <search> (ignore case)
reqideny ^Host:\ .*\.local
reqiallow ^Host:\ www\.
-See also: "reqallow", "rspdeny", "acl", "block" and section 2.6 about HTTP
+See also: "reqallow", "rspdeny", "acl", "block" and section 2.5 about HTTP
header manipulation
@ -2380,7 +2384,7 @@ reqipass <search> (ignore case)
reqideny ^Host:\ .*\.local
reqiallow ^Host:\ www\.
-See also: "reqallow", "reqdeny", "acl", "block" and section 2.6 about HTTP
+See also: "reqallow", "reqdeny", "acl", "block" and section 2.5 about HTTP
header manipulation
@ -2401,7 +2405,7 @@ reqirep <search> <string> (ignore case)
must be escaped using a backslash ('\'). References to matched
pattern groups are possible using the common \N form, with N
being a single digit between 0 and 9. Please refer to section
-2.6 about HTTP header manipulation for more information.
+2.5 about HTTP header manipulation for more information.
Any line matching extended regular expression <search> in the request (both
the request line and header lines) will be completely replaced with <string>.
@ -2419,7 +2423,7 @@ reqirep <search> <string> (ignore case)
# replace "www.mydomain.com" with "www" in the host name.
reqirep ^Host:\ www.mydomain.com Host:\ www
-See also: "reqadd", "reqdel", "rsprep" and section 2.6 about HTTP header
+See also: "reqadd", "reqdel", "rsprep" and section 2.5 about HTTP header
manipulation
@ -2439,7 +2443,9 @@ reqitarpit <search> (ignore case)
A request containing any line which matches extended regular expression
<search> will be tarpitted, which means that it will connect to nowhere, will
-be kept open for a pre-defined time, then will return an HTTP error 500. The
+be kept open for a pre-defined time, then will return an HTTP error 500 so
+that the attacker does not suspect it has been tarpitted. The status 500 will
+be reported in the logs, but the completion flags will indicate "PT". The
delay is defined by "timeout tarpit", or "timeout connect" if the former is
not set.
@ -2455,7 +2461,7 @@ reqitarpit <search> (ignore case)
reqipass ^User-Agent:.*(Mozilla|MSIE)
reqitarpit ^User-Agent:
-See also: "reqallow", "reqdeny", "reqpass", and section 2.6 about HTTP header
+See also: "reqallow", "reqdeny", "reqpass", and section 2.5 about HTTP header
manipulation
@ -2466,7 +2472,7 @@ rspadd <string>
Arguments :
<string> is the complete line to be added. Any space or known delimiter
must be escaped using a backslash ('\'). Please refer to section
-2.6 about HTTP header manipulation for more information.
+2.5 about HTTP header manipulation for more information.
A new line consisting of <string> followed by a line feed will be added after
the last header of an HTTP response.
@ -2475,7 +2481,7 @@ rspadd <string>
and not to traffic generated by HAProxy, such as health-checks or error
responses.
-See also: "reqadd" and section 2.6 about HTTP header manipulation
+See also: "reqadd" and section 2.5 about HTTP header manipulation
rspdel <search>
@ -2505,7 +2511,7 @@ rspidel <search> (ignore case)
# remove the Server header from responses
rspidel ^Server:.*
-See also: "rspadd", "rsprep", "reqdel" and section 2.6 about HTTP header
+See also: "rspadd", "rsprep", "reqdel" and section 2.5 about HTTP header
manipulation
@ -2529,9 +2535,9 @@ rspideny <search> (ignore case)
case-sensitive.
Main use of this keyword is to prevent sensitive information leak and to
-block the response before it reaches the client. If a response is denied,
-it will be replaced with an HTTP 502 error so that the client never gets
-the sensitive data.
+block the response before it reaches the client. If a response is denied, it
+will be replaced with an HTTP 502 error so that the client never retrieves
+any sensitive data.
It is easier, faster and more powerful to use ACLs to write access policies.
Rspdeny should be avoided in new designs.
@ -2540,7 +2546,7 @@ rspideny <search> (ignore case)
# Ensure that no content type matching ms-word will leak
rspideny ^Content-type:.*/ms-word
-See also: "reqdeny", "acl", "block" and section 2.6 about HTTP header
+See also: "reqdeny", "acl", "block" and section 2.5 about HTTP header
manipulation
@ -2562,7 +2568,7 @@ rspirep <search> <string> (ignore case)
must be escaped using a backslash ('\'). References to matched
pattern groups are possible using the common \N form, with N
being a single digit between 0 and 9. Please refer to section
-2.6 about HTTP header manipulation for more information.
+2.5 about HTTP header manipulation for more information.
Any line matching extended regular expression <search> in the response (both
the response line and header lines) will be completely replaced with
@ -2578,7 +2584,7 @@ rspirep <search> <string> (ignore case)
# replace "Location: 127.0.0.1:8080" with "Location: www.mydomain.com"
rspirep ^Location:\ 127.0.0.1:8080 Location:\ www.mydomain.com
-See also: "rspadd", "rspdel", "reqrep" and section 2.6 about HTTP header
+See also: "rspadd", "rspdel", "reqrep" and section 2.5 about HTTP header
manipulation
@ -3823,7 +3829,103 @@ weight <weight>
adjustments.
-2.5) Logging
+2.5) HTTP header manipulation
-----------------------------
In HTTP mode, it is possible to rewrite, add or delete some of the request and
response headers based on regular expressions. It is also possible to block a
request or a response if a particular header matches a regular expression,
which is enough to stop most elementary protocol attacks, and to protect
against information leak from the internal network. But there is a limitation
to this : since HAProxy's HTTP engine does not support keep-alive, only headers
passed during the first request of a TCP session will be seen. All subsequent
headers will be considered data only and not analyzed. Furthermore, HAProxy
never touches data contents; it stops analysis at the end of headers.
This section covers common usage of the following keywords, described in detail
in section 2.2.1 :
- reqadd <string>
- reqallow <search>
- reqiallow <search>
- reqdel <search>
- reqidel <search>
- reqdeny <search>
- reqideny <search>
- reqpass <search>
- reqipass <search>
- reqrep <search> <replace>
- reqirep <search> <replace>
- reqtarpit <search>
- reqitarpit <search>
- rspadd <string>
- rspdel <search>
- rspidel <search>
- rspdeny <search>
- rspideny <search>
- rsprep <search> <replace>
- rspirep <search> <replace>
With all these keywords, the same conventions are used. The <search> parameter
is a POSIX extended regular expression (regex) which supports grouping through
parentheses (without the backslash). Spaces and other delimiters must be
prefixed with a backslash ('\') to avoid confusion with a field delimiter.
Other characters may be prefixed with a backslash to change their meaning :
\t for a tab
\r for a carriage return (CR)
\n for a new line (LF)
\ to mark a space and differentiate it from a delimiter
\# to mark a sharp and differentiate it from a comment
\\ to use a backslash in a regex
\\\\ to use a backslash in the text (*2 for regex, *2 for haproxy)
\xXX to write the ASCII hex code XX as in the C language
The <replace> parameter contains the string to be used to replace the largest
portion of text matching the regex. It can make use of the special characters
above, and can reference a substring which is delimited by parentheses in the
regex, by writing a backslash ('\') immediately followed by one digit from 0 to
9 indicating the group position (0 designating the entire line). This practice
is very common among users of the "sed" program.
The <string> parameter represents the string which will systematically be added
after the last header line. It can also use the special character sequences
above.
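The conventions above can be illustrated with a minimal sketch. The proxy name,
addresses and the X-Served-By header are hypothetical and only serve the
illustration; the rules themselves use nothing beyond the <search>, <replace>
and <string> syntax described in this section.

```haproxy
# Hypothetical proxy name and addresses, for illustration only.
listen www 0.0.0.0:80
    mode http

    # <search> holds two groups; each literal space is escaped so that it is
    # not taken as a field delimiter. In <replace>, \1 is the method and \2 is
    # everything that followed "/static/" on the request line.
    # "GET /static/img/a.png HTTP/1.0" becomes "GET /img/a.png HTTP/1.0".
    reqrep  ^([^\ ]*)\ /static/(.*)    \1\ /\2

    # Case-insensitive removal of a complete header line.
    reqidel ^X-Forwarded-For:.*

    # <string> is added after the last request header; the space is escaped.
    reqadd  X-Proto:\ HTTP

    # The same conventions apply to responses (X-Served-By is a made-up name).
    rspidel ^Server:.*
    rspadd  X-Served-By:\ proxy-1

    server srv1 192.168.0.10:8080
```

Note that the unescaped run of spaces between the two arguments of reqrep is
what separates <search> from <replace>.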
Notes related to these keywords :
---------------------------------
- these keywords are not always convenient to allow/deny based on header
contents. It is strongly recommended to use ACLs with the "block" keyword
instead, resulting in far more flexible and manageable rules.
- lines are always considered as a whole. It is not possible to reference
a header name only or a value only. This is important because of the way
headers are written (notably the number of spaces after the colon).
- the first line is always considered as a header, which makes it possible to
rewrite or filter HTTP requests URIs or response codes, but in turn makes
it harder to distinguish between headers and request line. The regex prefix
^[^\ \t]*[\ \t] matches any HTTP method followed by a space, and the prefix
^[^ \t:]*: matches any header name followed by a colon.
- for performance reasons, the number of characters added to a request or to
a response is limited at build time to values between 1 and 4 kB. This
should normally be far more than enough for most usages. If it is
occasionally too short, it is possible to gain some space by removing some
useless headers before adding new ones.
- keywords beginning with "reqi" and "rspi" are the same as their counterparts
without the 'i' letter except that they ignore case when matching patterns.
- when a request passes through a frontend then a backend, all req* rules
from the frontend will be evaluated, then all req* rules from the backend
will be evaluated. The reverse path is applied to responses.
- req* statements are applied after "block" statements, so that "block" is
always the first one, but before "use_backend" in order to permit rewriting
before switching.
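As a rough sketch of how these notes combine, the fragment below uses an ACL
with "block" for the access policy, rewrites the request line in the frontend,
then switches backends on the rewritten path. All proxy names, paths, addresses
and the X-Internal header are hypothetical, and the ACL criteria (hdr_end,
path_beg) are assumed to be available in the version in use.

```haproxy
# Hypothetical names, paths and addresses, for illustration only.
frontend public
    bind 0.0.0.0:80
    mode http

    # Evaluated first: "block" with an ACL, preferred over reqdeny/reqallow.
    acl local_host hdr_end(host) -i .local
    block if local_host

    # Evaluated next: frontend req* rules. The prefix ^[^\ \t]*[\ \t] matches
    # the method and the space after it, so only the request line is rewritten.
    reqrep ^([^\ \t]*[\ \t])/old/(.*)    \1/new/\2

    # Evaluated after the rewrite, so the switch sees the new path.
    acl is_new path_beg /new/
    use_backend app_new if is_new
    default_backend app

backend app_new
    mode http
    # Backend req* rules run after the frontend ones.
    reqidel ^X-Internal:.*
    # On the return path, backend rsp* rules run before the frontend ones.
    rspidel ^Server:.*
    server new1 192.168.0.20:8080

backend app
    mode http
    server app1 192.168.0.10:8080
```

Keeping the policy in ACLs and reserving req*/rsp* rules for rewriting tends to
make the evaluation order easier to follow.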
2.6) Logging
------------
[to do]