Commit a1cc38 introduced a regression which was easy to trigger till ad4cd58
(snapshots 20120222 to 20120311 included). The bug was still present after
that but harder to trigger.
The bug is caused by the use of two distinct log buffers due to intermediary
changes. The issue happens when an HTTP request is logged just after a TCP
request during the same second and the HTTP request is too large for the buffer.
In this case, it happens that the HTTP request is logged into the TCP buffer
instead and that length controls can't detect anything.
Starting with bddd4f, the issue is still possible when logging too large an
HTTP request just after a send_log() call (typically a server status change).
We owe a big thanks to Sander Klein for testing several snapshots and more
specifically for taking significant risks in production by letting the buggy
version crash several times in order to provide an exploitable core ! The bug
could not have been found without this precious help. Thank you Sander !
This fix does not need to be backported, it did not affect any released version.
Merge http_sess_log() and tcp_sess_log() to sess_log() and move it to
log.c
A new field in logformat_type define if you can use a logformat
variable in TCP or HTTP mode.
doc: log-format in tcp mode
Note that due to the way log buffer allocation currently works, trying to
log an HTTP request without "option httplog" is still not possible. This
will change in the near future.
A number of offset computation functions use struct buffer* arguments
and return integers without modifying the input. Using consts helps
simplifying some operations in callers.
The principle behind this load balancing algorithm was first imagined
and modeled by Steen Larsen then iteratively refined through several
work sessions until it would totally address its original goal.
The purpose of this algorithm is to always use the smallest number of
servers so that extra servers can be powered off during non-intensive
hours. Additional tools may be used to do that work, possibly by
locally monitoring the servers' activity.
The first server with available connection slots receives the connection.
The servers are choosen from the lowest numeric identifier to the highest
(see server parameter "id"), which defaults to the server's position in
the farm. Once a server reaches its maxconn value, the next server is used.
It does not make sense to use this algorithm without setting maxconn. Note
that it can however make sense to use minconn so that servers are not used
at full load before starting new servers, and so that introduction of new
servers requires a progressively increasing load (the number of servers
would more or less follow the square root of the load until maxconn is
reached). This algorithm ignores the server weight, and is more beneficial
to long sessions such as RDP or IMAP than HTTP, though it can be useful
there too.
http_sess_log now use the logformat linked list to make the log
string, snprintf is not used for speed issue.
CLF mode also uses logformat.
NOTE: as of now, empty fields in CLF now are "" not "-" anymore.
parse_logformat_string: parse the string, detect the type: text,
separator or variable
parse_logformat_var: dectect variable name
parse_logformat_var_args: parse arguments and flags
add_to_logformat_list: add to the logformat linked list
send_log function is now splited in 3 functions
* hdr_log: generate the syslog header
* send_log: send a syslog message with a printf format string
* __send_log: send a syslog message
It was reported that a server configured with a zero weight would
sometimes still take connections from the backend queue. This issue is
real, it happens this way :
1) the disabled server accepts a request with a cookie
2) many cookie-less requests accumulate in the backend queue
3) when the disabled server completes its request, it checks its own
queue and the backend's queue
4) the server takes a pending request from the backend queue and
processes it. In response, the server's cookie is assigned to
the client, which ensures that some requests will continue to
be served by this server, leading back to point 1 above.
The fix consists in preventing a zero-weight server from dequeuing pending
requests from the backend. Making use of srv_is_usable() in such tests makes
the tests more robust against future changes.
This fix must be backported to 1.4 and 1.3.
In a config where server "s1" is marked disabled and "s2" tracks "s1",
s2 appears disabled on the stats but is still inserted into the LB farm
because the tracking is resolved too late in the configuration process.
We now resolve tracked servers before building LB maps and we also mark
the tracking server in maintenance mode, which previously was not done,
causing half of the issue.
Last point is that we also protect srv_is_usable() against electing a
server marked for maintenance. This is not absolutely needed but is a
safe choice and makes a lot of sense.
This fix must be backported to 1.4.
New option "http-send-name-header" specifies the name of a header which
will hold the server name in outgoing requests. This is the name of the
server the connection is really sent to, which means that upon redispatches,
the header's value is updated so that it always matches the server's name.
The new function does not return IP addresses but header values instead,
so that the caller is free to make what it want of them. The conversion
is not quite clean yet, as the previous test which considered that address
0.0.0.0 meant "no address" is still used. A different IP parsing function
should be used to take this into account.
In commit 4b517ca93a (MEDIUM: buffers:
add some new primitives and rework existing ones), we forgot to check
if buffer_max_len() < l.
No backport is needed.
A number of primitives were missing for buffer management, and some
of them were particularly awkward to use. Specifically, the functions
used to compute free space could not always be used depending what was
wrapping in the buffers. Some documentation has been added about how
the buffers work and their properties. Some functions are still missing
such as a buffer replacement which would support wrapping buffers.
It makes no sense to have one pointer to the hdr_idx pool in each proxy
struct since these pools do not depend on the proxy. Let's have a common
pool instead as it is already the case for other types.
Stream interfaces used to distinguish between client and server addresses
because they were previously of different types (sockaddr_storage for the
client, sockaddr_in for the server). This is not the case anymore, and this
distinction is confusing at best and has caused a number of regressions to
be introduced in the process of converting everything to full-ipv6. We can
now remove this and have a much cleaner code.
This patch introduces hdr_len, path_len and url_len for matching these
respective parts lengths against integers. This can be used to detect
abuse or empty headers.
We already had the ability to kill a connection, but it was only
for the checks. Now we can do this for any session, and for this we
add a specific flag "K" to the logs.
The stats socket now allows the admin to disable, enable or shutdown a frontend.
This can be used when a bug is discovered in a configuration and it's desirable
to fix it but the rules in place don't allow to change a running config. Thus it
becomes possible to kill the frontend to release the port and start a new one in
a separate process.
This can also be used to temporarily make haproxy return TCP resets to incoming
requests to pretend the service is not bound. For instance, this may be useful
to quickly flush a very deep SYN backlog.
The frontend check and lookup code was factored with the "set maxconn" usage.
This one enforces a per-process connection rate limit, regardless of what
may be set per frontend. It can be a way to limit the CPU usage of a process
being severely attacked.
The side effect is that the global process connection rate is now measured
for each incoming connection, so it will be possible to report it.
This was revealed with one of the very latest patches which caused
the listener_queue not to be initialized on the stats socket frontend.
And in fact a number of other ones were missing too. This is getting so
boring that now we'll always make use of the same function to initialize
any proxy. Doing so has even saved about 500 bytes on the binary due to
the avoided code redundancy.
No backport is needed.
This function is finally not needed anymore, as it has been replaced with
a per-proxy task that is scheduled when some limits are encountered on
incoming connections or when the process is stopping. The savings should
be noticeable on configs with a large number of proxies. The most important
point is that the rate limiting is now enforced in a clean and solid way.
When a listeners encounters a resource shortage, it currently stops until
one re-enables it. This is far from being perfect as it does not yet handle
the case where the single connection from the listener is rejected (eg: the
stats page).
Now we'll have a special status for resource limited listeners and we'll
queue them into one or multiple lists. That way, each time we have to stop
a listener because of a resource shortage, we can enqueue it and change its
state, so that it is dequeued once more resources are available.
This patch currently does not change any existing behaviour, it only adds
the basic building blocks for doing that.
Managing listeners state is difficult because they have their own state
and can at the same time have theirs dictated by their proxy. The pause
is not done properly, as the proxy code is fiddling with sockets. By
introducing new functions such as pause_listener()/resume_listener(), we
make it a bit more obvious how/when they're supposed to be used. The
listen_proxies() function was also renamed to resume_proxies() since
it's only used for pause/resume.
This patch is the first in a series aiming at getting rid of the maintain_proxies
mess. In the end, proxies should not call enable_listener()/disable_listener()
anymore.
Patch af5149 introduced an issue which can be detected only on out of
memory conditions : a LIST_DEL() may be performed on an uninitialized
struct member instead of a LIST_INIT() during the accept() phase,
causing crashes and memory corruption to occur.
This issue was detected and diagnosed by the Exceliance R&D team.
This is 1.5-specific and very recent, so no existing deployment should
be impacted.
The motivation for this is to allow iteration of all the connections
of a server without the expense of iterating over the global list
of connections.
The first use of this will be to implement an option to close connections
associated with a server when is is marked as being down or in maintenance
mode.
* The declaration of peer_session_create() does
not match its definition. As it is only
used inside of peers.c make it static.
* Make the declaration of peers_register_table()
match its definition.
* Also, make all functions in peers.c that
are not also in peers.h static
Since IPv6 is a different type than IPv4, the pattern fetch functions
src6 and dst6 were added. IPv6 stick-tables can also fetch IPv4 addresses
with src and dst. In this case, the IPv4 addresses are mapped to their
IPv6 counterpart, according to RFC 4291.
Since the latest additions to buffer_forward(), it became too large for
inlining, so let's uninline it. The code size drops by 3kB. Should be
backported to 1.4 too.
Despite much care around handling the content-length as a 64-bit integer,
forwarding was broken on 32-bit platforms due to the 32-bit nature of
the ->to_forward member of the "buffer" struct. The issue is that this
member is declared as a long, so while it works OK on 64-bit platforms,
32-bit truncate the content-length to the lower 32-bits.
One solution could consist in turning to_forward to a long long, but it
is used a lot in the critical path, so it's not acceptable to perform
all buffer size computations on 64-bit there.
The fix consists in changing the to_forward member to a strict 32-bit
integer and ensure in buffer_forward() that only the amount of bytes
that can fit into it is considered. Callers of buffer_forward() are
responsible for checking that their data were taken into account. We
arbitrarily ensure we never consider more than 2G at once.
That's the way it was intended to work on 32-bit platforms except that
it did not.
This issue was tracked down hard at Exosec with Bertrand Jacquin,
Thierry Fournier and Julien Thomas. It remained undetected for a long
time because files larger than 4G are almost always transferred in
chunked-encoded format, and most platforms dealing with huge contents
these days run on 64-bit.
The bug affects all 1.5 and 1.4 versions, and must be backported.
And also rename "req_acl_rule" "http_req_rule". At the beginning that
was a bit confusing to me, especially the "req_acl" list which in fact
holds what we call rules. After some digging, it appeared that some
part of the code is 100% HTTP and not just related to authentication
anymore, so let's move that part to HTTP and keep the auth-only code
in auth.c.
It's very annoying that frontend and backend stats are merged because we
don't know what we're observing. For instance, if a "listen" instance
makes use of a distinct backend, it's impossible to know what the bytes_out
means.
Some points take care of not updating counters twice if the backend points
to the frontend, indicating a "listen" instance. The thing becomes more
complex when we try to add support for server side keep-alive, because we
have to maintain a pointer to the backend used for last request, and to
update its stats. But we can't perform such comparisons anymore because
the counters will not match anymore.
So in order to get rid of this situation, let's have both frontend AND
backend stats in the "struct proxy". We simply update the relevant ones
during activity. Some of them are only accounted for in the backend,
while others are just for frontend. Maybe we can improve a bit on that
later, but the essential part is that those counters now reflect what
they really mean.
This patch turns internal server addresses to sockaddr_storage to
store IPv6 addresses, and makes the connect() function use it. This
code already works but some caveats with getaddrinfo/gethostbyname
still need to be sorted out while the changes had to be merged at
this stage of internal architecture changes. So for now the config
parser will not emit an IPv6 address yet so that user experience
remains unchanged.
This change should have absolutely zero user-visible effect, otherwise
it's a bug introduced during the merge, that should be reported ASAP.
This one has been removed and is now totally superseded by ->target.
To get the server, one must use target_srv(&s->target) instead of
s->srv now.
The function ensures that non-server targets still return NULL.
This is in fact where those parts belong to. The old data_state was replaced
by applet.state and is now initialized when the applet is registered. It's
worth noting that the applet does not need to know the session nor the
buffer anymore since everything is brought by the stream interface.
It is possible that having a separate applet struct would simplify the
code but that's not a big deal.