haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2025-03-03 10:01:27 +00:00

Willy Tarreau a24adf0795 MAJOR: session: only wake up as many sessions as available buffers permit

We've already experimented with three wake up algorithms when releasing buffers : the first naive one used to wake up far too many sessions, causing many of them not to get any buffer. The second approach which was still in use prior to this patch consisted in waking up either 1 or 2 sessions depending on the number of FDs we had released. And this was still inaccurate. The third one tried to cover the accuracy issues of the second and took into consideration the number of FDs the sessions would be willing to use, but most of the time we ended up waking up too many of them for nothing, or deadlocking by lack of buffers.

This patch completely removes the need to allocate two buffers at once. Instead it splits allocations into critical and non-critical ones and implements a reserve in the pool for this. The deadlock situation happens when all buffers are allocated for requests pending in a maxconn-limited server queue, because then there's no more way to allocate buffers for responses, and these responses are critical to release the server's connection in order to release the pending requests. In fact maxconn on a server creates a dependence between sessions and particularly between the oldest session's responses and the latest session's requests. Thus, it is mandatory to get a free buffer for a response in order to release a server connection, which in turn permits releasing a request buffer.

Since we definitely have non-symmetrical buffers, we need to implement this logic in the buffer allocation mechanism. What this commit does is implement a reserve of buffers which can only be allocated for responses and that will never be allocated for requests. This is made possible by the requester indicating how much margin it wants to leave after the allocation succeeds. Thus it is a cooperative allocation mechanism : the requester (process_session() in general) prefers not to get a buffer in order to respect others' need for response buffers. The session management code always knows if a buffer will be used for requests or responses, so that is not difficult :

  - either there's an applet on the initiator side and we really need the request buffer (since currently the applet is called in the context of the session)

  - or we have a connection and we really need the response buffer (in order to support building and sending an error message back)

This reserve ensures that we don't take all allocatable buffers for requests waiting in a queue. The downside is that all the extra buffers are really allocated to ensure they can be allocated. But with small values it is not an issue.

With this change, we don't observe any more deadlocks even when running with maxconn 1 on a server under severely constrained memory conditions.

The code becomes a bit tricky, it relies on the scheduler's run queue to estimate how many sessions are already expected to run so that it doesn't wake up everyone with too few resources. A better solution would probably consist in having two queues, one for urgent requests and one for normal requests. A failed allocation for a session dealing with an error, a connection event, or the need for a response (or request when there's an applet on the left) would go to the urgent request queue, while other requests would go to the other queue. Urgent requests would be served from 1 entry in the pool, while the regular ones would be served only according to the reserve. Despite not yet having this, it works remarkably well.

This mechanism is quite efficient, we don't perform too many wake up calls anymore. For 1 million sessions elapsed during massive memory contention, we observe about 4.5M calls to process_session() compared to 4.0M without memory constraints. Previously we used to observe up to 16M calls, which roughly means 12M failures.

During a test run under high memory constraints (limit enforced to 27 MB instead of the 58 MB normally needed), performance used to drop by 53% prior to this patch. Now with this patch instead it increases by about 1.5%.

The best effect of this change is that by limiting the memory usage to about 2/3 to 3/4 of what is needed by default, it's possible to increase performance by up to about 18% mainly due to the fact that pools are reused more often and remain hot in the CPU cache (observed on regular HTTP traffic with 20k objects, buffers.limit = maxconn/10, buffers.reserve = limit/2).

Below is an example of a scenario which used to cause a deadlock previously :
  - connection is received
  - two buffers are allocated in process_session() then released
  - one is allocated when receiving an HTTP request
  - the second buffer is allocated then released in process_session() for request parsing then connection establishment.
  - poll() says we can send, so the request buffer is sent and released
  - process_session() gets notified that the connection is now established and allocates two buffers then releases them
  - all other sessions do the same till one cannot get the request buffer without hitting the margin
  - and now the server responds. stream_interface allocates the response buffer and manages to get it since it's higher priority being for a response.
  - but process_session() cannot allocate the request buffer anymore

  => We could end up with all buffers used by responses so that none may be allocated for a request in process_session().

When the applet processing leaves the session context, the test will have to be changed so that we always allocate a response buffer regardless of the left side (eg: H2->H1 gateway). A final improvement would consist in being able to only retry the failed I/O operation without waking up a task, but to date all experiments to achieve this have proven not to be reliable enough.

2014-12-24 23:47:33 +01:00
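
The margin-based allocation described in the commit message can be illustrated with a minimal sketch. This is not the code from the patch: the names `buf_pool`, `pool_take()`, `pool_release()`, `take_request_buf()` and `take_response_buf()` are hypothetical, and the pool only counts buffers instead of holding memory. It assumes a request allocation must leave the whole reserve free, while a response allocation may use the last free buffer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy pool: it only counts buffers, it holds no memory. */
struct buf_pool {
    unsigned int limit; /* total number of buffers that may be allocated */
    unsigned int used;  /* number of buffers currently allocated */
};

/* Take one buffer only if at least <margin> buffers remain free
 * after the allocation. Returns true on success, false otherwise.
 */
static bool pool_take(struct buf_pool *p, unsigned int margin)
{
    unsigned int avail;

    assert(p->used <= p->limit);
    avail = p->limit - p->used;
    if (avail <= margin)
        return false; /* taking one would eat into the margin */
    p->used++;
    return true;
}

/* Give one buffer back to the pool. */
static void pool_release(struct buf_pool *p)
{
    assert(p->used > 0);
    p->used--;
}

/* Non-critical allocation (assumption): must leave <reserve> buffers free. */
static bool take_request_buf(struct buf_pool *p, unsigned int reserve)
{
    return pool_take(p, reserve);
}

/* Critical allocation (assumption): may use the last free buffer. */
static bool take_response_buf(struct buf_pool *p)
{
    return pool_take(p, 0);
}
```

With a limit of 4 and a reserve of 2 in this sketch, two request allocations succeed and a third is refused, while response allocations can still draw on the two remaining buffers. That is the property the commit message relies on to keep responses flowing when requests are queued behind a maxconn-limited server.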
..
acl.h	MINOR: pattern: store configuration reference for each acl or map pattern.	2014-03-17 18:06:07 +01:00
arg.h	MAJOR: sample: maintain a per-proxy list of the fetch args to resolve	2013-04-03 02:13:02 +02:00
auth.h	MEDIUM: pattern: The match function browse itself the list or the tree.	2014-03-17 18:06:07 +01:00
backend.h	MAJOR: checks: add support for a new "drain" administrative mode	2014-05-23 14:29:11 +02:00
channel.h	MEDIUM: channel: do not report full when buf_empty is present on a channel	2014-12-24 23:47:32 +01:00
checks.h	MEDIUM: checks: simplify server up/down/nolb transitions	2014-05-23 14:29:11 +02:00
compression.h	MINOR: compression: CPU usage limit	2012-11-21 02:15:16 +01:00
connection.h	MAJOR: namespace: add Linux network namespace support	2014-11-21 07:51:57 +01:00
cttproxy.h
dumpstats.h	MEDIUM: stats: reimplement HTTP keep-alive on the stats page	2014-04-24 17:24:56 +02:00
fd.h	MAJOR: polling: centralize calls to I/O callbacks	2014-11-21 20:37:32 +01:00
freq_ctr.h	MINOR: freq_ctr: introduce a new averaging method	2014-06-17 17:15:51 +02:00
frontend.h
hdr_idx.h
lb_chash.h
lb_fas.h
lb_fwlc.h
lb_fwrr.h
lb_map.h
listener.h	CLEANUP: fix missing include <string.h> in proto/listener.h	2013-06-14 19:52:17 +02:00
log.h	MEDIUM: log: support a user-configurable max log line length	2014-06-27 18:13:53 +02:00
map.h	MINOR: map: export parse output sample functions	2013-12-12 15:44:05 +01:00
obj_type.h	MINOR: obj: introduce a new type appctx	2013-12-09 15:40:22 +01:00
pattern.h	BUG/MEDIUM: pattern: don't load more than once a pattern list.	2014-11-24 15:40:16 +01:00
payload.h	MINOR: payload: split smp_fetch_rdp_cookie()	2013-08-01 21:17:13 +02:00
peers.h
pipe.h
port_range.h
proto_http.h	BUG/MEDIUM: http: adjust close mode when switching to backend	2014-09-30 18:44:22 +02:00
proto_tcp.h	MEDIUM: listener: implement a per-protocol pause() function	2014-07-08 01:13:34 +02:00
proto_uxst.h	BUG/MEDIUM: unix: completely unbind abstract sockets during a pause()	2014-07-08 01:13:35 +02:00
protocol.h
proxy.h	MEDIUM: proxy: create a tree to store proxies by name	2014-03-15 07:48:35 +01:00
queue.h	REORG: checks: put the functions in the appropriate files !	2014-05-22 11:27:00 +02:00
raw_sock.h
sample.h	MINOR: configuration: File and line propagation	2014-03-17 18:06:08 +01:00
server.h	BUG/MINOR: server: move the directive #endif to the end of file	2014-07-29 11:03:14 +02:00
session.h	MAJOR: session: only wake up as many sessions as available buffers permit	2014-12-24 23:47:33 +01:00
shctx.h	BUG/MAJOR: ssl: Fallback to private session cache if current lock mode is not supported.	2014-05-08 22:46:32 +02:00
signal.h
ssl_sock.h	BUILD: ssl: use OPENSSL_NO_OCSP to detect OCSP support	2014-12-09 20:49:22 +01:00
stick_table.h	MEDIUM: stick-table: make it easier to register extra data types	2014-07-15 19:14:52 +02:00
stream_interface.h	MINOR: stream-int: retrieve session pointer from stream-int	2014-12-24 23:47:31 +01:00
task.h	MINOR: task: release the task pool when stopping	2014-11-13 16:57:19 +01:00
template.h