Commit Graph

1292 Commits

Author SHA1 Message Date
Willy Tarreau
d32c399747 MINOR: stats: report correct throttling percentage for servers in slowstart
The column used to report the throttle percentage when a server is in
slowstart is based on the time only. This is wrong, because server weights
in slowstart are updated at most once a second, so the reported value is
wrong at least fo rone second during each step, which means all the time
when using short delays (< 20s).

The second point is that it's disturbing to see a weight < 100% without
any throttle at the end of the period (during the last second), because
the effective weight has not yet been updated.

Instead, we now compute the exact ratio between eweight and uweight and
report it. It's always accurate and describes the value being used instead
of using only the date.

It can be backported to 1.4 though it's not particularly important.
2013-11-21 15:30:45 +01:00
Willy Tarreau
004e045f31 BUG/MAJOR: server: weight calculation fails for map-based algorithms
A crash was reported by Igor at owind when changing a server's weight
on the CLI. Lukas Tribus could reproduce a related bug where setting
a server's weight would result in the new weight being multiplied by
the initial one. The two bugs are the same.

The incorrect weight calculation results in the total farm weight being
larger than what was initially allocated, causing the map index to be out
of bounds on some hashes. It's easy to reproduce using "balance url_param"
with a variable param, or with "balance static-rr".

It appears that the calculation is made at many places and is not always
right and not always wrong the same way. Thus, this patch introduces a
new function "server_recalc_eweight()" which is dedicated to this task
of computing ->eweight from many other elements including uweight and
current time (for slowstart), and all users now switch to use this
function.

The patch is a bit large but the code was not trivially fixable in a way
that could guarantee this situation would not occur anymore. The fix is
much more readable and has been verified to work with all algorithms,
with both consistent and map-based hashes, and even with static-rr.

Slowstart was tested as well, just like enable/disable server.

The same bug is very likely present in 1.4 as well, so the patch will
probably need to be backported eventhough it will not apply as-is.

Thanks to Lukas and Igor for the information they provided to reproduce it.
2013-11-21 15:09:02 +01:00
Simon Horman
125d099662 MEDIUM: Move health element to struct check
This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-11-19 09:36:07 +01:00
Simon Horman
cd5d7b678e MEDIUM: Add state to struct check
Add state to struct check. This is currently used to store one bit,
CHK_RUNNING, which is set if a check is running and clear otherwise.
This bit was previously SRV_CHK_RUNNING of the state element of struct
server.

This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.

Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
2013-11-19 09:36:04 +01:00
Simon Horman
4a741432be MEDIUM: Paramatise functions over the check of a server
Paramatise the following functions over the check of a server

* set_server_down
* set_server_up
* srv_getinter
* server_status_printf
* set_server_check_status
* set_server_disabled
* set_server_enabled

Generally the server parameter of these functions has been removed.
Where it is still needed it is obtained using check->server.

This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.
By paramatising these functions they may act on each of the checks
without further significant modification.

Explanation of the SSP_O_HCHK portion of this change:

* Prior to this patch SSP_O_HCHK serves a single purpose which
  is to tell server_status_printf() weather it should print
  the details of the check of a server or not.

  With the paramatisation that this patch adds there are two cases.
  1) Printing the details of the check in which case a
     valid check parameter is needed.
  2) Not printing the details of the check in which case
     the contents check parameter are unused.

  In case 1) we could pass SSP_O_HCHK and a valid check and;
  In case 2) we could pass !SSP_O_HCHK and any value for check
  including NULL.

  If NULL is used for case 2) then SSP_O_HCHK becomes supurfulous
  and as NULL is used for case 2) SSP_O_HCHK has been removed.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-11-19 09:35:54 +01:00
Simon Horman
28b5ffc76f MEDIUM: Move result element to struct check
Move result element from struct server to struct check
This allows check results to be independent of the check's server.

This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-11-19 09:35:52 +01:00
Simon Horman
6618300e13 MEDIUM: Split up struct server's check element
This is in preparation for associating a agent check
with a server which runs as well as the server's existing check.

The split has been made by:
* Moving elements of struct server's check element that will
  be shared by both checks into a new check_common element
  of struct server.
* Moving the remaining elements to a new struct check and
  making struct server's check element a struct check.
* Adding a server element to struct check, a back-pointer
  to the server element it is a member of.
  - At this time the server could be obtained using
    container_of, however, this will not be so easy
    once a second struct check element is added to struct server
    to accommodate an agent health check.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-11-19 09:35:48 +01:00
Simon Horman
c69d547638 CLEANUP: Remove unused 'last_slowstart_change' field from struct peer
This was inadvertently added by "MEDIUM: checks: Add agent health check".
It appears to have never been used.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-11-19 08:04:59 +01:00
Simon Horman
a360844735 CLEANUP: Make parameters of srv_downtime and srv_getinter const
The parameters of srv_downtime and srv_getinter are not modified
and thus may be const.

Signed-off-by: Simon Horman <horms@verge.net.au>
2013-11-19 08:04:58 +01:00
Willy Tarreau
a0f4271497 MEDIUM: backend: add support for the wt6 hash
This function was designed for haproxy while testing other functions
in the past. Initially it was not planned to be used given the not
very interesting numbers it showed on real URL data : it is not as
smooth as the other ones. But later tests showed that the other ones
are extremely sensible to the server count and the type of input data,
especially DJB2 which must not be used on numeric input. So in fact
this function is still a generally average performer and it can make
sense to merge it in the end, as it can provide an alternative to
sdbm+avalanche or djb2+avalanche for consistent hashing or when hashing
on numeric data such as a source IP address or a visitor identifier in
a URL parameter.
2013-11-14 16:37:50 +01:00
Bhaskar Maddala
b6c0ac94a4 MEDIUM: backend: Implement avalanche as a modifier of the hashing functions.
Summary:
Avalanche is supported not as a native hashing choice, but a modifier
on the hashing function. Note that this means that possible configs
written after 1.5-dev4 using "hash-type avalanche" will get an informative
error instead. But as discussed on the mailing list it seems nobody ever
used it anyway, so let's fix it before the final 1.5 release.

The default values were selected for backward compatibility with previous
releases, as discussed on the mailing list, which means that the consistent
hashing will still apply the avalanche hash by default when no explicit
algorithm is specified.

Examples
  (default) hash-type map-based
	Map based hashing using sdbm without avalanche

  (default) hash-type consistent
	Consistent hashing using sdbm with avalanche

Additional Examples:

  (a) hash-type map-based sdbm
	Same as default for map-based above
  (b) hash-type map-based sdbm avalanche
	Map based hashing using sdbm with avalanche
  (c) hash-type map-based djb2
	Map based hashing using djb2 without avalanche
  (d) hash-type map-based djb2 avalanche
	Map based hashing using djb2 with avalanche
  (e) hash-type consistent sdbm avalanche
	Same as default for consistent above
  (f) hash-type consistent sdbm
	Consistent hashing using sdbm without avalanche
  (g) hash-type consistent djb2
	Consistent hashing using djb2 without avalanche
  (h) hash-type consistent djb2 avalanche
	Consistent hashing using djb2 with avalanche
2013-11-14 16:37:50 +01:00
Bhaskar
98634f0c7b MEDIUM: backend: Enhance hash-type directive with an algorithm options
Summary:
In testing at tumblr, we found that using djb2 hashing instead of the
default sdbm hashing resulted is better workload distribution to our backends.

This commit implements a change, that allows the user to specify the hash
function they want to use. It does not limit itself to consistent hashing
scenarios.

The supported hash functions are sdbm (default), and djb2.

For a discussion of the feature and analysis, see mailing list thread
"Consistent hashing alternative to sdbm" :

      http://marc.info/?l=haproxy&m=138213693909219

Note: This change does NOT make changes to new features, for instance,
applying an avalance hashing always being performed before applying
consistent hashing.
2013-11-14 16:37:50 +01:00
Thierry FOURNIER
a054d410db BUILD/MINOR: missing header file
In the header file "types/proto_http.h", the list are used
but the header file "mini-clist.h" is not included.
2013-10-23 15:53:56 +02:00
Thierry FOURNIER
de6617b486 MINOR: http: some exported functions were not in the header file
Export the following functions:
 - find_hdr_value_end
 - http_header_match2
 - http_remove_header2
 - http_header_add_tail2
2013-10-23 12:21:38 +02:00
Thierry FOURNIER
ef37a66628 CLEANUP: The function "regex_exec" needs the string length but in many case they expect null terminated char.
If haproxy is compiled with the USE_PCRE_JIT option, the length of the
string is used. If it is compiled without this option the function doesn't
use the length and expects a null terminated string.

The prototype of the function is ambiguous, and depends on the
compilation option. The developer can think that the length is always
used, and many bugs can be created.

This patch makes sure that the length is used. The regex_exec function
adds the final '\0' if it is needed.
2013-10-23 12:19:51 +02:00
Thierry FOURNIER
ed5a4aefae CLEANUP: regex: Create regex_comp function that compiles regex using compilation options
The current file "regex.h" define an abstraction for the regex. It
provides the same struct name and the same "regexec" function for the
3 regex types supported: standard libc, basic pcre and jit pcre.

The regex compilation function is not provided by this file. If the
developper wants to use regex, he must write regex compilation code
containing "#define *JIT*".

This patch provides a unique regex compilation function according to
the compilation options.

In addition, the "regex.h" file checks the presence of the "#define
PCRE_CONFIG_JIT" when "USE_PCRE_JIT" is enabled. If this flag is not
present, the pcre lib doesn't support JIT and "#error" is emitted.
2013-10-14 14:42:50 +02:00
Thierry FOURNIER
e28f1ecf2b BUILD/MINOR: missing header file
In the header file "common/regex.h", the C keyword NULL is used. This
keyword is referenced into the header file "stdlib.h", but this is not
included.
2013-10-10 11:38:35 +02:00
Godbach
2b8fd54287 DOC: fix typo in comments
Hi Willy,

There is a patch to fix typo in comments, please check the attachment
for you information.

The commit log is as below:

commit 9824d1b3740ac2746894f1aa611c795366c84210
Author: Godbach <nylzhaowei@gmail.com>
Date:   Mon Sep 30 11:05:42 2013 +0800

    DOC: fix typo in comments

      0x20000000 -> 0x40000000
      vuf -> buf
      ethod -> Method

    Signed-off-by: Godbach <nylzhaowei@gmail.com>

--
Best Regards,
Godbach

From 9824d1b3740ac2746894f1aa611c795366c84210 Mon Sep 17 00:00:00 2001
From: Godbach <nylzhaowei@gmail.com>
Date: Mon, 30 Sep 2013 11:05:42 +0800
Subject: [PATCH] DOC: fix typo in comments

  0x20000000 -> 0x40000000
  vuf -> buf
  ethod -> Method

Signed-off-by: Godbach <nylzhaowei@gmail.com>
2013-10-01 09:49:21 +02:00
Willy Tarreau
cc1e04b1e8 MINOR: tcp: add new "close" action for tcp-response
This new action immediately closes the connection with the server
when the condition is met. The first such rule executed ends the
rules evaluation. The main purpose of this action is to force a
connection to be finished between a client and a server after an
exchange when the application protocol expects some long time outs
to elapse first. The goal is to eliminate idle connections which
take signifiant resources on servers with certain protocols.
2013-09-11 23:28:51 +02:00
Willy Tarreau
3a925c155d MEDIUM: stick-tables: flush old entries upon soft-stop
When a process with large stick tables is replaced by a new one and remains
present until the last connection finishes, it keeps these data in memory
for nothing since they will never be used anymore by incoming connections,
except during syncing with the new process. This is especially problematic
when dealing with long session protocols such as WebSocket as it becomes
possible to stack many processes and eat a lot of memory.

So the idea here is to know if a table still needs to be synced or not,
and to purge all unused entries once the sync is complete. This means that
after a few hundred milliseconds when everything has been synchronized with
the new process, only a few entries will remain allocated (only the ones
held by sessions during the restart) and all the remaining memory will be
freed.

Note that we carefully do that only after the grace period is expired so as
not to impact a possible proxy that needs to accept a few more connections
before leaving.

Doing this required to add a sync counter to the stick tables, to know how
many peer sync sessions are still in progress in order not to flush the entries
until all synchronizations are completed.
2013-09-04 17:54:01 +02:00
Evan Broder
be55431f9f MINOR: ssl: Add statement 'verifyhost' to "server" statements
verifyhost allows you to specify a hostname that the remote server's
SSL certificate must match. Connections that don't match will be
closed with an SSL error.
2013-09-01 07:55:49 +02:00
Willy Tarreau
9f09521f2d BUG/MEDIUM: unique_id: HTTP request counter must be unique!
The HTTP request counter is incremented non atomically, which means that
many requests can log the same ID. Let's increment it when it is consumed
so that we avoid this case.

This bug was reported by Patrick Hemmer. It's 1.5-specific and does not
need to be backported.
2013-08-13 17:52:20 +02:00
Willy Tarreau
47060b6ae0 MINOR: cli: make it possible to enter multiple values at once with "set table"
The "set table" statement allows to create new entries with their respective
values. Till now it was limited to a single data type per line, requiring as
many "set table" statements as the desired data types to be set. Since this
is only a parser limitation, this patch gets rid of it. It also allows the
creation of a key with no data types (all reset to their default values).
2013-08-01 21:17:19 +02:00
Willy Tarreau
b4c8493a9f MINOR: session: make the number of stick counter entries more configurable
In preparation of more flexibility in the stick counters, make their
number configurable. It still defaults to 3 which is the minimum
accepted value. Changing the value alone is not sufficient to get
more counters, some bitfields still need to be updated and the TCP
actions need to be updated as well, but this update tries to be
easier, which is nice for experimentation purposes.
2013-08-01 21:17:14 +02:00
Willy Tarreau
cadd8c9ec3 MINOR: payload: split smp_fetch_rdp_cookie()
This function is also called directly from backend.c, so let's stop
building fake args to call it as a sample fetch, and have a lower
layer more generic function instead.
2013-08-01 21:17:13 +02:00
Willy Tarreau
ef38c39287 MEDIUM: sample: systematically pass the keyword pointer to the keyword
We're having a lot of duplicate code just because of minor variants between
fetch functions that could be dealt with if the functions had the pointer to
the original keyword, so let's pass it as the last argument. An earlier
version used to pass a pointer to the sample_fetch element, but this is not
the best solution for two reasons :
  - fetch functions will solely rely on the keyword string
  - some other smp_fetch_* users do not have the pointer to the original
    keyword and were forced to pass NULL.

So finally we're passing a pointer to the keyword as a const char *, which
perfectly fits the original purpose.
2013-08-01 21:17:13 +02:00
Godbach
a34bdc0ea4 BUG/MEDIUM: server: set the macro for server's max weight SRV_UWGHT_MAX to SRV_UWGHT_RANGE
The max weight of server is 256 now, but SRV_UWGHT_MAX is still 255. As a result,
FWRR will not work well when server's weight is 256. The description is as below:

There are some macros related to server's weight in include/types/server.h:
    #define SRV_UWGHT_RANGE 256
    #define SRV_UWGHT_MAX   (SRV_UWGHT_RANGE - 1)
    #define SRV_EWGHT_MAX   (SRV_UWGHT_MAX   * BE_WEIGHT_SCALE)

Since weight of server can be reach to 256 and BE_WEIGHT_SCALE equals to 16,
the max eweight of server should be 256*16 = 4096, it will exceed SRV_EWGHT_MAX
which equals to SRV_UWGHT_MAX*BE_WEIGHT_SCALE = 255*16 = 4080. When a server
with weight 256 is insterted into FWRR tree during initialization, the key value
of this server should be SRV_EWGHT_MAX - s->eweight = 4080 - 4096 = -16 which
is closed to UINT_MAX in unsigned type, so the server with highest weight will
be not elected as the first server to process request.

In addition, it is a better choice to compare with SRV_UWGHT_MAX than a magic
number 256 while doing check for the weight. The max number of servers for
round-robin algorithm is also updated.

Signed-off-by: Godbach <nylzhaowei@gmail.com>
2013-07-22 09:29:34 +02:00
Lukas Tribus
67db8df12b MEDIUM: http: add IPv6 support for "set-tos"
As per RFC3260 #4 and BCP37 #4.2 and #5.2, the IPv6 counterpart of TOS
is "traffic class".

Add support for IPv6 traffic class in "set-tos" by moving the "set-tos"
related code to the new inline function inet_set_tos(), handling IPv4
(IP_TOS), IPv6 (IPV6_TCLASS) and IPv4-mapped sockets (IP_TOS, like
::ffff:127.0.0.1).

Also define - if missing - the IN6_IS_ADDR_V4MAPPED() macro in
include/common/compat.h for compatibility.
2013-06-23 18:01:38 +02:00
Willy Tarreau
dc13c11c1e BUG/MEDIUM: prevent gcc from moving empty keywords lists into BSS
Benoit Dolez reported a failure to start haproxy 1.5-dev19. The
process would immediately report an internal error with missing
fetches from some crap instead of ACL names.

The cause is that some versions of gcc seem to trim static structs
containing a variable array when moving them to BSS, and only keep
the fixed size, which is just a list head for all ACL and sample
fetch keywords. This was confirmed at least with gcc 3.4.6. And we
can't move these structs to const because they contain a list element
which is needed to link all of them together during the parsing.

The bug indeed appeared with 1.5-dev19 because it's the first one
to have some empty ACL keyword lists.

One solution is to impose -fno-zero-initialized-in-bss to everyone
but this is not really nice. Another solution consists in ensuring
the struct is never empty so that it does not move there. The easy
solution consists in having a non-null list head since it's not yet
initialized.

A new "ILH" list head type was thus created for this purpose : create
an Initialized List Head so that gcc cannot move the struct to BSS.
This fixes the issue for this version of gcc and does not create any
burden for the declarations.
2013-06-21 23:29:02 +02:00
Godbach
430f291a99 CLEANUP: session: remove event_accept() which was not used anymore
Remove event_accept() in include/proto/proto_http.h and use correct function
name in other two files instead of event_accept().

Signed-off-by: Godbach <nylzhaowei@gmail.com>
2013-06-20 08:07:35 +02:00
Willy Tarreau
be4a3eff34 MEDIUM: counters: use sc0/sc1/sc2 instead of sc1/sc2/sc3
It was a bit inconsistent to have gpc start at 0 and sc start at 1,
so make sc start at zero like gpc. No previous release was issued
with sc3 anyway, so no existing setup should be affected.
2013-06-17 15:04:07 +02:00
Thierry FOURNIER
7eeb435494 CLEANUP: protect checks.h from multiple inclusions
The types/checks.h include file isn't protected against multiple inclusions,
so let's surround it with "#ifndef _TYPES_CHECKS_H/#endif" to fix this.
2013-06-14 19:53:02 +02:00
Thierry FOURNIER
d3879e8b57 CLEANUP: fix missing include <string.h> in proto/listener.h
The file proto/listener.h makes use of strdup() but doesn't include
<string.h> so it's sensible to include file ordering.
2013-06-14 19:52:17 +02:00
Willy Tarreau
4f0d919bd4 MEDIUM: tcp: add "tcp-request connection expect-proxy layer4"
This configures the client-facing connection to receive a PROXY protocol
header before any byte is read from the socket. This is equivalent to
having the "accept-proxy" keyword on the "bind" line, except that using
the TCP rule allows the PROXY protocol to be accepted only for certain
IP address ranges using an ACL. This is convenient when multiple layers
of load balancers are passed through by traffic coming from public
hosts.
2013-06-11 20:40:55 +02:00
Willy Tarreau
51347ed94c MEDIUM: http: add the "set-mark" action on http-request/http-response rules
"set-mark" is used to set the Netfilter MARK on all packets sent to the
client to the value passed in <mark> on platforms which support it. This
value is an unsigned 32 bit value which can be matched by netfilter and
by the routing table. It can be expressed both in decimal or hexadecimal
format (prefixed by "0x"). This can be useful to force certain packets to
take a different route (for example a cheaper network path for bulk
downloads). This works on Linux kernels 2.6.32 and above and requires
admin privileges.
2013-06-11 19:34:13 +02:00
Willy Tarreau
42cf39e3b9 MEDIUM: http: add support for "set-tos" in http-request/http-response
This manipulates the TOS field of the IP header of outgoing packets sent
to the client. This can be used to set a specific DSCP traffic class based
on some request or response information. See RFC2474, 2597, 3260 and 4594
for more information.
2013-06-11 19:04:37 +02:00
Willy Tarreau
9a355ec257 MEDIUM: http: add support for action "set-log-level" in http-request/http-response
Some users want to disable logging for certain non-important requests such as
stats requests or health-checks coming from another equipment. Other users want
to log with a higher importance (eg: notice) some special traffic (POST requests,
authenticated requests, requests coming from suspicious IPs) or some abnormally
large responses.

This patch responds to all these needs at once by adding a "set-log-level" action
to http-request/http-response. The 8 syslog levels are supported, as well as "silent"
to disable logging.
2013-06-11 17:50:26 +02:00
Willy Tarreau
abcd5145f8 MEDIUM: log: add a log level override value in struct session
This log level will be used in a further patch to change the log level
depending on the request or response.
2013-06-11 17:50:26 +02:00
Willy Tarreau
f4c43c13be MEDIUM: http: add the "set-nice" action to http-request and http-response
This new action changes the nice factor of the task processing the current
request.
2013-06-11 17:50:26 +02:00
Willy Tarreau
e365c0b92b MEDIUM: http: add a new "http-response" ruleset
Some actions were clearly missing to process response headers. This
patch adds a new "http-response" ruleset which provides the following
actions :
  - allow : stop evaluating http-response rules
  - deny : stop and reject the response with a 502
  - add-header : add a header in log-format mode
  - set-header : set a header in log-format mode
2013-06-11 16:06:12 +02:00
Willy Tarreau
2b57cb8f30 MEDIUM: protocol: implement a "drain" function in protocol layers
Since commit cfd97c6f was merged into 1.5-dev14 (BUG/MEDIUM: checks:
prevent TIME_WAITs from appearing also on timeouts), some valid health
checks sometimes used to show some TCP resets. For example, this HTTP
health check sent to a local server :

  19:55:15.742818 IP 127.0.0.1.16568 > 127.0.0.1.8000: S 3355859679:3355859679(0) win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
  19:55:15.742841 IP 127.0.0.1.8000 > 127.0.0.1.16568: S 1060952566:1060952566(0) ack 3355859680 win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
  19:55:15.742863 IP 127.0.0.1.16568 > 127.0.0.1.8000: . ack 1 win 257
  19:55:15.745402 IP 127.0.0.1.16568 > 127.0.0.1.8000: P 1:23(22) ack 1 win 257
  19:55:15.745488 IP 127.0.0.1.8000 > 127.0.0.1.16568: FP 1:146(145) ack 23 win 257
  19:55:15.747109 IP 127.0.0.1.16568 > 127.0.0.1.8000: R 23:23(0) ack 147 win 257

After some discussion with Chris Huang-Leaver, it appeared clear that
what we want is to only send the RST when we have no other choice, which
means when the server has not closed. So we still keep SYN/SYN-ACK/RST
for pure TCP checks, but don't want to see an RST emitted as above when
the server has already sent the FIN.

The solution against this consists in implementing a "drain" function at
the protocol layer, which, when defined, causes as much as possible of
the input socket buffer to be flushed to make recv() return zero so that
we know that the server's FIN was received and ACKed. On Linux, we can make
use of MSG_TRUNC on TCP sockets, which has the benefit of draining everything
at once without even copying data. On other platforms, we read up to one
buffer of data before the close. If recv() manages to get the final zero,
we don't disable lingering. Same for hard errors. Otherwise we do.

In practice, on HTTP health checks we generally find that the close was
pending and is returned upon first recv() call. The network trace becomes
cleaner :

  19:55:23.650621 IP 127.0.0.1.16561 > 127.0.0.1.8000: S 3982804816:3982804816(0) win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
  19:55:23.650644 IP 127.0.0.1.8000 > 127.0.0.1.16561: S 4082139313:4082139313(0) ack 3982804817 win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
  19:55:23.650666 IP 127.0.0.1.16561 > 127.0.0.1.8000: . ack 1 win 257
  19:55:23.651615 IP 127.0.0.1.16561 > 127.0.0.1.8000: P 1:23(22) ack 1 win 257
  19:55:23.651696 IP 127.0.0.1.8000 > 127.0.0.1.16561: FP 1:146(145) ack 23 win 257
  19:55:23.652628 IP 127.0.0.1.16561 > 127.0.0.1.8000: F 23:23(0) ack 147 win 257
  19:55:23.652655 IP 127.0.0.1.8000 > 127.0.0.1.16561: . ack 24 win 257

This change should be backported to 1.4 which is where Chris encountered
this issue. The code is different, so probably the tcp_drain() function
will have to be put in the checks only.
2013-06-10 20:33:23 +02:00
Willy Tarreau
04ff9f105f MINOR: http: add full-length header fetch methods
The req.hdr and res.hdr fetch methods do not work well on headers which
are allowed to contain commas, such as User-Agent, Date or Expires.
More specifically, full-length matching is impossible if a comma is
present.

This patch introduces 4 new fetch functions which are designed to work
with these full-length headers :
  - req.fhdr, req.fhdr_cnt
  - res.fhdr, res.fhdr_cnt

These ones do not stop at commas and permit to return full-length header
values.
2013-06-10 18:39:42 +02:00
Willy Tarreau
570f221cbb MINOR: log: add a new flag 'L' for locally processed requests
People who use "option dontlog-normal" are bothered with redirects and
stats being logged and reported as errors in the logs ("PR" = proxy
blocked the request).

This patch introduces a new flag 'L' for when a request is locally
processed, that is not considered as an error by the log filters. That
way we know a request was intercepted and processed by haproxy without
logging the line when "option dontlog-normal" is in effect.
2013-06-10 16:42:09 +02:00
Willy Tarreau
bf43f6eb32 MINOR: defaults: allow REQURI_LEN and CAPTURE_LEN to be redefined
Some users want to change these default settings but it was not
convenient.
2013-06-10 10:30:09 +02:00
Willy Tarreau
379357af58 BUG/MAJOR: http: always ensure response buffer has some room for a response
Since 1.5-dev12 and commit 3bf1b2b8 (MAJOR: channel: stop relying on
BF_FULL to take action), the HTTP parser switched to channel_full()
instead of BF_FULL to decide whether a buffer had enough room to start
parsing a request or response. The problem is that channel_full()
intentionally ignores outgoing data, so a corner case exists where a
large response might still be left in a response buffer with just a
few bytes left (much less than the reserve), enough to accept a second
response past the last data, but not enough to permit the HTTP processor
to add some headers. Since all the processing relies on this space being
available, we can get some random crashes when clients pipeline requests.

The analysis of a core from haproxy configured with 20480 bytes buffers
shows this : with enough "luck", when sending back the response for the
first request, the client is slow, the TCP window is congested, the socket
buffers are full, and haproxy's buffer fills up. We still have 20230 bytes
of response data in a 20480 response buffer. The second request is sent to
the server which returns 214 bytes which fit in the small 250 bytes left
in this buffer. And the buffer arrangement makes it possible to escape all
the controls in http_wait_for_response() :

    |<------ response buffer = 20480 bytes ------>|
    [ 2/2  | 3 | 4 |          1/2                 ]
           ^ start of circular buffer

      1/2 = beginning of previous response (18240)
      2/2 = end of previous response       (1990)
        3 = current response               (214)
        4 = free space                     (36)

  - channel_full() returns false (20230 bytes are going to leave)
  - the response headers does not wrap at the end of the buffer
  - the remaining linear room after the headers is larger than the
    reserve, because it's the previous response which wraps :
  => response is processed

Header rewriting causes it to reach 260 bytes, 10 bytes larger than what
the buffer could hold. So all computations during header addition are
wrong and lead to the corruption we've observed.

All the conditions are very hard to meet (which explains why it took
almost one year for this bug to show up) and are almost impossible to
reproduce on purpose on a test platform. But the bug is clearly there.

This issue was reported by Dinko Korunic who kindly devoted a lot of
time to provide countless traces and cores, and to experiment with
troubleshooting patches to knock the bug down. Thanks Dinko!

No backport is needed, but all 1.5-dev versions between dev12 and dev18
included must be upgraded. A workaround consists in setting option
forceclose to prevent pipelined requests from being processed.
2013-06-08 13:14:17 +02:00
Willy Tarreau
ba2ffd18b5 MEDIUM: counters: add a new "gpc0_rate" counter in stick-tables
This counter is special in that instead of reporting the gpc0 cumulative
count, it returns its increase rate over the configured period.
2013-05-29 15:54:14 +02:00
Willy Tarreau
e25c917af8 MEDIUM: counters: add support for tracking a third counter
We're often missin a third counter to track base, src and base+src at
the same time. Here we introduce track_sc3 to have this third counter.
It would be wise not to add much more counters because that slightly
increases the session size and processing time though the real issue
is more the declaration of the keywords in the code and in the doc.
2013-05-29 00:37:16 +02:00
Willy Tarreau
d5ca9abb0d MINOR: counters: make it easier to extend the amount of tracked counters
By properly affecting the flags and values, it becomes easier to add
more tracked counters, for example for experimentation. It also slightly
reduces the code and the number of tests. No counters were added with
this patch.
2013-05-28 17:43:40 +02:00
Pieter Baauw
1eb7592bba MINOR: tproxy: add support for OpenBSD
OpenBSD uses (SOL_SOCKET, SO_BINDANY) to enable transparent
proxy on a socket.

This patch adds support for the relevant setsockopt() calls.
2013-05-11 08:03:50 +02:00
Pieter Baauw
ff30b6667b MINOR: tproxy: add support for FreeBSD
FreeBSD uses (IPPROTO_IP, IP_BINDANY) and (IPPROTO_IPV6, IPV6_BINDANY)
to enable transparent proxy on a socket.

This patch adds support for the relevant setsockopt() calls.
2013-05-11 08:03:43 +02:00