/*
 * include/proto/session.h
 * This file defines everything related to sessions.
 *
 * Copyright (C) 2000-2010 Willy Tarreau - w@1wt.eu
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation, version 2.1
 * exclusively.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 */

#ifndef _PROTO_SESSION_H
#define _PROTO_SESSION_H

#include <common/config.h>
#include <common/memory.h>
#include <types/session.h>
#include <proto/fd.h>
#include <proto/freq_ctr.h>
#include <proto/stick_table.h>
#include <proto/task.h>

extern struct pool_head *pool2_session;
extern struct list sessions;
extern struct list buffer_wq;

extern struct data_cb sess_conn_cb;

int session_accept(struct listener *l, int cfd, struct sockaddr_storage *addr);

/* perform minimal initializations, report 0 in case of error, 1 if OK. */
int init_session();

/* kill a session and set the termination flags to <why> (one of SN_ERR_*) */
void session_shutdown(struct session *session, int why);

void session_process_counters(struct session *s);
void sess_change_server(struct session *sess, struct server *newsrv);
struct task *process_session(struct task *t);
void default_srv_error(struct session *s, struct stream_interface *si);
struct stkctr *smp_fetch_sc_stkctr(struct session *l4, const struct arg *args, const char *kw);
int parse_track_counters(char **args, int *arg,
                         int section_type, struct proxy *curpx,
                         struct track_ctr_prm *prm,
                         struct proxy *defpx, char **err);

/* Update the session's backend and server time stats */
void session_update_time_stats(struct session *s);
void __session_offer_buffers(int rqlimit);
static inline void session_offer_buffers();
int session_alloc_work_buffer(struct session *s);
void session_release_buffers(struct session *s);
int session_alloc_recv_buffer(struct channel *chn);

/* sets the stick counter's entry pointer */
static inline void stkctr_set_entry(struct stkctr *stkctr, struct stksess *entry)
{
	stkctr->entry = caddr_from_ptr(entry, 0);
}

/* returns the entry pointer from a stick counter */
static inline struct stksess *stkctr_entry(struct stkctr *stkctr)
{
	return caddr_to_ptr(stkctr->entry);
}

/* returns the two flags from a stick counter */
static inline unsigned int stkctr_flags(struct stkctr *stkctr)
{
	return caddr_to_data(stkctr->entry);
}

/* sets up to two flags at a time on a composite address */
static inline void stkctr_set_flags(struct stkctr *stkctr, unsigned int flags)
{
	stkctr->entry = caddr_set_flags(stkctr->entry, flags);
}

/* clears up to two flags at a time on a composite address */
static inline void stkctr_clr_flags(struct stkctr *stkctr, unsigned int flags)
{
	stkctr->entry = caddr_clr_flags(stkctr->entry, flags);
}
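
/* Illustrative sketch: assuming the caddr_* macros pack the stksess pointer
 * and up to two flag bits into one word using the low-order bits freed by
 * pointer alignment, the entry and its flags can be read and updated
 * independently, e.g. :
 *
 *	stkctr_set_entry(&s->stkctr[0], ts);
 *	stkctr_set_flags(&s->stkctr[0], STKCTR_TRACK_CONTENT);
 *	if (stkctr_flags(&s->stkctr[0]) & STKCTR_TRACK_CONTENT)
 *		stksess_kill_if_expired(s->stkctr[0].table, stkctr_entry(&s->stkctr[0]));
 */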

/* Remove the refcount from the session to the tracked counters, and clear the
 * pointer to ensure this is only performed once. The caller is responsible for
 * ensuring that the pointer is valid first.
 */
static inline void session_store_counters(struct session *s)
{
	void *ptr;
	int i;

	for (i = 0; i < MAX_SESS_STKCTR; i++) {
		if (!stkctr_entry(&s->stkctr[i]))
			continue;
		ptr = stktable_data_ptr(s->stkctr[i].table, stkctr_entry(&s->stkctr[i]), STKTABLE_DT_CONN_CUR);
		if (ptr)
			stktable_data_cast(ptr, conn_cur)--;
		stkctr_entry(&s->stkctr[i])->ref_cnt--;
		stksess_kill_if_expired(s->stkctr[i].table, stkctr_entry(&s->stkctr[i]));
		stkctr_set_entry(&s->stkctr[i], NULL);
	}
}

/* Remove the refcount from the session counters tracked at the content level if
 * any, and clear the pointer to ensure this is only performed once. The caller
 * is responsible for ensuring that the pointer is valid first.
 */
static inline void session_stop_content_counters(struct session *s)
{
	void *ptr;
	int i;

	for (i = 0; i < MAX_SESS_STKCTR; i++) {
		if (!stkctr_entry(&s->stkctr[i]))
			continue;

		if (!(stkctr_flags(&s->stkctr[i]) & STKCTR_TRACK_CONTENT))
			continue;

		ptr = stktable_data_ptr(s->stkctr[i].table, stkctr_entry(&s->stkctr[i]), STKTABLE_DT_CONN_CUR);
		if (ptr)
			stktable_data_cast(ptr, conn_cur)--;
		stkctr_entry(&s->stkctr[i])->ref_cnt--;
		stksess_kill_if_expired(s->stkctr[i].table, stkctr_entry(&s->stkctr[i]));
		stkctr_set_entry(&s->stkctr[i], NULL);
	}
}

/* Increase total and concurrent connection count for stick entry <ts> of table
 * <t>. The caller is responsible for ensuring that <t> and <ts> are valid
 * pointers, and for calling this only once per connection.
 */
static inline void session_start_counters(struct stktable *t, struct stksess *ts)
{
	void *ptr;

	ptr = stktable_data_ptr(t, ts, STKTABLE_DT_CONN_CUR);
	if (ptr)
		stktable_data_cast(ptr, conn_cur)++;

	ptr = stktable_data_ptr(t, ts, STKTABLE_DT_CONN_CNT);
	if (ptr)
		stktable_data_cast(ptr, conn_cnt)++;

	ptr = stktable_data_ptr(t, ts, STKTABLE_DT_CONN_RATE);
	if (ptr)
		update_freq_ctr_period(&stktable_data_cast(ptr, conn_rate),
		                       t->data_arg[STKTABLE_DT_CONN_RATE].u, 1);
	if (tick_isset(t->expire))
		ts->expire = tick_add(now_ms, MS_TO_TICKS(t->expire));
}
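
/* Worked example (illustrative configuration, not implied by this file): with
 * "stick-table type ip size 1m expire 30s store conn_rate(10s)", the table is
 * expected to hold 10000 ms in data_arg[STKTABLE_DT_CONN_RATE].u, so the call
 * above counts one connection event over a 10 second sliding period, and the
 * final tick_add() pushes the entry's expiration 30 seconds into the future.
 */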

/* Enable tracking of session counters on stksess <ts> by stick counter <ctr>.
 * The caller is responsible for ensuring that <t> and <ts> are valid pointers.
 * Some controls are performed to ensure the state can still change.
 */
static inline void session_track_stkctr(struct stkctr *ctr, struct stktable *t, struct stksess *ts)
{
	if (stkctr_entry(ctr))
		return;

	ts->ref_cnt++;
	ctr->table = t;
	stkctr_set_entry(ctr, ts);
	session_start_counters(t, ts);
}
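
/* Illustrative sketch of a caller (hypothetical helper name): a rule such as
 * "tcp-request content track-sc0 src" would resolve the stick entry for the
 * source address and attach it to counter slot 0, roughly :
 *
 *	struct stksess *ts = lookup_or_create_entry(t, &key);
 *	if (ts)
 *		session_track_stkctr(&s->stkctr[0], t, ts);
 *
 * Calling it again on a counter which is already tracking is a no-op thanks
 * to the early return on stkctr_entry(ctr), so the refcount is taken once.
 */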

/* Increase the number of cumulated HTTP requests in the tracked counters */
static void inline session_inc_http_req_ctr(struct session *s)
{
	void *ptr;
	int i;

	for (i = 0; i < MAX_SESS_STKCTR; i++) {
		if (!stkctr_entry(&s->stkctr[i]))
			continue;

		ptr = stktable_data_ptr(s->stkctr[i].table, stkctr_entry(&s->stkctr[i]), STKTABLE_DT_HTTP_REQ_CNT);
		if (ptr)
			stktable_data_cast(ptr, http_req_cnt)++;

		ptr = stktable_data_ptr(s->stkctr[i].table, stkctr_entry(&s->stkctr[i]), STKTABLE_DT_HTTP_REQ_RATE);
		if (ptr)
			update_freq_ctr_period(&stktable_data_cast(ptr, http_req_rate),
			                       s->stkctr[i].table->data_arg[STKTABLE_DT_HTTP_REQ_RATE].u, 1);
	}
}

/* Increase the number of cumulated HTTP requests in the backend's tracked counters */
static void inline session_inc_be_http_req_ctr(struct session *s)
{
	void *ptr;
	int i;

	for (i = 0; i < MAX_SESS_STKCTR; i++) {
		if (!stkctr_entry(&s->stkctr[i]))
			continue;

		if (!(stkctr_flags(&s->stkctr[i]) & STKCTR_TRACK_BACKEND))
			continue;

		ptr = stktable_data_ptr(s->stkctr[i].table, stkctr_entry(&s->stkctr[i]), STKTABLE_DT_HTTP_REQ_CNT);
		if (ptr)
			stktable_data_cast(ptr, http_req_cnt)++;

		ptr = stktable_data_ptr(s->stkctr[i].table, stkctr_entry(&s->stkctr[i]), STKTABLE_DT_HTTP_REQ_RATE);
		if (ptr)
			update_freq_ctr_period(&stktable_data_cast(ptr, http_req_rate),
			                       s->stkctr[i].table->data_arg[STKTABLE_DT_HTTP_REQ_RATE].u, 1);
	}
}

/* Increase the number of cumulated failed HTTP requests in the tracked
 * counters. Only 4xx requests should be counted here so that we can
 * distinguish between errors caused by client behaviour and other ones.
 * Note that even 404 are interesting because they're generally caused by
 * vulnerability scans.
 */
static void inline session_inc_http_err_ctr(struct session *s)
{
	void *ptr;
	int i;

	for (i = 0; i < MAX_SESS_STKCTR; i++) {
		if (!stkctr_entry(&s->stkctr[i]))
			continue;

		ptr = stktable_data_ptr(s->stkctr[i].table, stkctr_entry(&s->stkctr[i]), STKTABLE_DT_HTTP_ERR_CNT);
		if (ptr)
			stktable_data_cast(ptr, http_err_cnt)++;

		ptr = stktable_data_ptr(s->stkctr[i].table, stkctr_entry(&s->stkctr[i]), STKTABLE_DT_HTTP_ERR_RATE);
		if (ptr)
			update_freq_ctr_period(&stktable_data_cast(ptr, http_err_rate),
			                       s->stkctr[i].table->data_arg[STKTABLE_DT_HTTP_ERR_RATE].u, 1);
	}
}

static void inline session_add_srv_conn(struct session *sess, struct server *srv)
{
	sess->srv_conn = srv;
	LIST_ADD(&srv->actconns, &sess->by_srv);
}

static void inline session_del_srv_conn(struct session *sess)
{
	if (!sess->srv_conn)
		return;

	sess->srv_conn = NULL;
	LIST_DEL(&sess->by_srv);
}

static void inline session_init_srv_conn(struct session *sess)
{
	sess->srv_conn = NULL;
	LIST_INIT(&sess->by_srv);
}

static inline void session_offer_buffers()
{
	int avail;

	if (LIST_ISEMPTY(&buffer_wq))
		return;

	/* all sessions will need 1 buffer, so we can stop waking up sessions
	 * once we have enough of them to eat all the buffers. Note that we
	 * don't really know if they are sessions or just other tasks, but
	 * that's a rough estimate. Similarly, for each cached event we'll need
	 * 1 buffer. If no buffer is currently used, always wake up the number
	 * of tasks we can offer a buffer based on what is allocated, and in
	 * any case at least one task per two reserved buffers.
	 */
	avail = pool2_buffer->allocated - pool2_buffer->used - global.tune.reserved_bufs / 2;

	if (avail > (int)run_queue)
		__session_offer_buffers(avail);
}
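
/* Worked example of the estimate above (numbers purely illustrative): with
 * 1000 buffers allocated, 950 in use and a reserve of 20 (buffers.reserve),
 * avail = 1000 - 950 - 20 / 2 = 40. If fewer than 40 tasks already sit in
 * the run queue, __session_offer_buffers(40) is called so that the number of
 * waiters woken up stays bounded by the buffers they could actually obtain.
 */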

#endif /* _PROTO_SESSION_H */

/*
 * Local variables:
 * c-indent-level: 8
 * c-basic-offset: 8
 * End:
 */