Commit Graph

778 Commits

Willy Tarreau
3d241e78a1 MEDIUM: args: use #define to specify the number of bits used by arg types and counts
This is in order to add new types. This patch does not change anything
else. Two remaining (harmless) occurrences of a count of 8 instead of 7
were fixed by this patch : empty_arg_list[] and the for() loop counting
args.
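
For illustration, a minimal sketch of the kind of defines involved (names and
widths below are assumptions, not necessarily those of the patch):

    /* Sketch only: the argument count lives in the low bits of the mask and
     * each argument type occupies a fixed-width field above it, so adding new
     * types only requires enlarging ARGT_BITS. */
    #define ARGM_BITS   4                       /* bits used for the arg count */
    #define ARGM_MASK   ((1 << ARGM_BITS) - 1)
    #define ARGT_BITS   4                       /* bits used for one arg type */
    #define ARGT_MASK   ((1 << ARGT_BITS) - 1)
    #define ARGM_NBARGS ((32 - ARGM_BITS) / ARGT_BITS)  /* max args in a 32-bit mask */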
2015-01-22 14:24:53 +01:00
Willy Tarreau
319f745ba0 MINOR: channel: rename bi_erase() to channel_truncate()
It applies to the channel and it doesn't erase outgoing data, only
pending unread data, which is strictly equivalent to what recv()
does with MSG_TRUNC, so the new name is more accurate and intuitive.
2015-01-14 20:32:59 +01:00
Willy Tarreau
b5051f8742 MINOR: channel: rename bi_avail() to channel_recv_max()
This name more accurately reminds that it applies to a channel and not
to a buffer, and that what is returned may be used as a max number of
bytes to pass to recv().
2015-01-14 20:26:54 +01:00
Willy Tarreau
3f5096ddf2 MINOR: channel: rename buffer_max_len() to channel_recv_limit()
buffer_max_len() is ambiguous and misleading since the function in fact
considers the channel. The new name more accurately designates the size
limit for received data.
2015-01-14 20:21:43 +01:00
Willy Tarreau
a4178192b9 MINOR: channel: rename buffer_reserved() to channel_reserved()
This applies to the channel, not the buffer, so let's fix this name.
Warning, the function's name happens to be the same as the old one
which was mistakenly used during 1.5.
2015-01-14 20:21:12 +01:00
Willy Tarreau
3889fffe92 MINOR: channel: rename channel_full() to !channel_may_recv()
This function's name was poorly chosen and is confusing to the point of
being suspiciously used at some places. The operations it does always
consider the ability to forward pending input data before receiving new
data. This is not obvious at all, especially at some places where it was
used when consuming outgoing data to know if the buffer has any chance
to ever get the missing data. The code needs to be re-audited with that
in mind. Care must be taken with existing code since the polarity of the
function was switched with the renaming.
2015-01-14 18:41:33 +01:00
Willy Tarreau
ba0902ede4 CLEANUP: channel: rename channel_reserved -> channel_is_rewritable
channel_reserved is confusingly named. It is used to know whether or
not the rewrite area is left intact for situations where we want to
ensure we can use it before proceeding. Let's rename it to fix this
confusion.
2015-01-14 18:41:33 +01:00
Willy Tarreau
9c06ee4ccf BUG/MEDIUM: channel: don't schedule data in transit for leaving until connected
Option http-send-name-header is still hurting. If a POST request has to be
redispatched when this option is used, and the next server's name is larger
than the initial one, and the POST body fills the buffer, it becomes
impossible to rewrite the server's name in the buffer when redispatching.
In 1.4 this is even worse : the process may crash because of a negative size
computation for the memmove().

The only solution to fix this is to refrain from eating the reserve before
we're certain that we won't modify the buffer anymore. And the condition for
that is that the connection is established.

This patch introduces "channel_may_send()" which helps to detect whether it's
safe to eat the reserve or not. This condition is used by channel_in_transit()
introduced by recent patches.
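
As an illustration, such a test could be as simple as the sketch below,
assuming the channel still carries a pointer to its consumer stream
interface (field and state names here are assumptions):

    /* Sketch: the reserve may only be eaten once the outgoing connection
     * is established, i.e. no rewrite/redispatch can happen anymore. */
    static inline int channel_may_send(const struct channel *chn)
    {
        return chn->cons->state == SI_ST_EST;
    }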

This patch series must be backported into 1.5, and a simpler version must be
backported into 1.4 where fixing the bug is much easier since there were no
channels by then. Note that in 1.4 the severity is major.
2015-01-14 16:08:45 +01:00
Willy Tarreau
27bb0e14a8 MEDIUM: channel: make bi_avail() use channel_in_transit()
This ensures that we rely on a sane computation for the buffer size.
2015-01-14 15:57:24 +01:00
Willy Tarreau
fe57834955 MEDIUM: channel: make buffer_reserved() use channel_in_transit()
This ensures that we rely on a sane computation for the buffer size.
2015-01-14 15:57:21 +01:00
Willy Tarreau
1a4484dec8 MINOR: channel: add channel_in_transit()
This function returns the amount of bytes in transit in a channel's buffer,
which is the amount of outgoing data plus the amount of incoming data bound
to the forward limit.
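
A simplified sketch of that computation (ignoring the infinite-forward
special case the real code has to handle):

    /* Sketch: outgoing data plus the part of incoming data already
     * scheduled for forwarding. */
    static inline int channel_in_transit(const struct channel *chn)
    {
        unsigned int fwd = chn->buf->i;

        if (fwd > chn->to_forward)
            fwd = chn->to_forward;      /* only the part bound to the forward limit */
        return chn->buf->o + fwd;
    }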
2015-01-14 13:51:48 +01:00
Willy Tarreau
bb3f994f1a BUG/MINOR: channel: compare to_forward with buf->i, not buf->size
We know that all incoming data are going to be purged if to_forward
is greater than their amount, not only if it is greater than the buffer
size. This bug has no direct impact on this version, but it participates
in some bugs affecting http-send-name-header since 1.4. This fix will
have to be backported down to 1.4, albeit in a different form.
2015-01-14 13:50:24 +01:00
Willy Tarreau
0428a146c0 BUG/MEDIUM: channel: fix possible integer overflow on reserved size computation
The buffer_max_len() function is subject to an integer overflow in this
computation :

    int ret = global.tune.maxrewrite - chn->to_forward - chn->buf->o;

  - chn->to_forward may be up to 2^31 - 1
  - chn->buf->o may be up to chn->buf->size
  - global.tune.maxrewrite is by definition smaller than chn->buf->size

Thus here we can end up subtracting nearly (2^31 + buf->o) from something
slightly positive ; the subtraction underflows and wraps around, leaving
ret much larger than expected.

Fortunately in 1.5 and 1.6, this is only used by bi_avail() which itself
is used by applets which do not set high values for to_forward so this
problem does not happen there. However in 1.4 the equivalent computation
was used to limit the size of a read and can result in a read overflow
when combined with the nasty http-send-name-header feature.

This fix must be backported to 1.5 and 1.4.
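
Expressed with channel_in_transit(), an overflow-safe version of that
computation could look like the sketch below (illustrative only, the
actual fix may differ in details):

    /* Sketch: never mix a huge to_forward into a signed subtraction;
     * compute the amount in transit first and clamp the result at zero. */
    int transit = channel_in_transit(chn);
    int ret     = global.tune.maxrewrite;

    ret = (transit < ret) ? ret - transit : 0;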
2015-01-14 12:04:34 +01:00
Willy Tarreau
3c23a85550 CLEANUP: session: remove session_from_task()
Since commit 3dd6a25 ("MINOR: stream-int: retrieve session pointer from
stream-int"), we can get the session from the task, so let's get rid of
this less obvious function.
2014-12-28 12:19:57 +01:00
Willy Tarreau
b034b2598d MEDIUM: channel: implement a zero-copy buffer transfer
bi_swpbuf() swaps the buffer passed as argument with the one attached to
the channel, but only if the latter is empty. The idea is to avoid a
copy when buffers can simply be swapped.
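
The operation amounts to a conditional pointer swap, roughly along these
lines (sketch; the real function's exact signature may differ):

    /* Sketch: give the caller's buffer to the channel and hand the
     * channel's (empty) buffer back, avoiding a copy of the payload. */
    static inline void bi_swpbuf(struct channel *chn, struct buffer **buf)
    {
        struct buffer *tmp;

        if (!buffer_empty(chn->buf))
            return;                 /* data already present: a swap would lose it */
        tmp      = chn->buf;
        chn->buf = *buf;
        *buf     = tmp;
    }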
2014-12-24 23:47:33 +01:00
Willy Tarreau
a24adf0795 MAJOR: session: only wake up as many sessions as available buffers permit
We've already experimented with three wake up algorithms when releasing
buffers : the first naive one used to wake up far too many sessions,
causing many of them not to get any buffer. The second approach which
was still in use prior to this patch consisted in waking up either 1
or 2 sessions depending on the number of FDs we had released. And this
was still inaccurate. The third one tried to cover the accuracy issues
of the second and took into consideration the number of FDs the sessions
would be willing to use, but most of the time we ended up waking up too
many of them for nothing, or deadlocking by lack of buffers.

This patch completely removes the need to allocate two buffers at once.
Instead it splits allocations into critical and non-critical ones and
implements a reserve in the pool for this. The deadlock situation happens
when all buffers are allocated for requests pending in a maxconn-limited
server queue, because then there's no way left to allocate buffers for
responses, and these responses are critical to release the servers'
connections and thus the pending requests. In fact, maxconn on a server
creates a dependence between sessions, and particularly between the
oldest session's responses and the latest session's requests. Thus, it is
mandatory to get a free buffer for a response in order to release a
server connection, which in turn permits a request buffer to be released.

Since we definitely have non-symmetrical buffers, we need to implement
this logic in the buffer allocation mechanism. What this commit does is
implement a reserve of buffers which can only be allocated for responses
and that will never be allocated for requests. This is made possible by
the requester indicating how much margin it wants to leave after the
allocation succeeds. Thus it is a cooperative allocation mechanism : the
requester (process_session() in general) prefers not to get a buffer in
order to respect others' need for response buffers. The session management
code always knows whether a buffer will be used for requests or responses, so
that is not difficult :

  - either there's an applet on the initiator side and we really need
    the request buffer (since currently the applet is called in the
    context of the session)

  - or we have a connection and we really need the response buffer (in
    order to support building and sending an error message back)

This reserve ensures that we don't take all allocatable buffers for
requests waiting in a queue. The downside is that all the extra buffers
are really allocated up front in order to guarantee their availability. But
with small values it is not an issue.
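
The cooperative part may be pictured as a margin passed by the requester
to the allocator, along the lines of the sketch below (names and the
helper are hypothetical, the pool details differ in the real code):

    /* Sketch: only grant a buffer if at least <margin> buffers remain
     * allocatable afterwards, so request-side allocations always leave
     * room for response-side ones (which pass margin == 0). */
    struct buffer *b_alloc_margin(struct buffer **buf, int margin)
    {
        if (buffers_still_allocatable() <= margin)  /* hypothetical helper */
            return NULL;        /* would eat the reserve: let responses go first */
        *buf = pool_alloc2(pool2_buffer);
        return *buf;
    }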

With this change, we don't observe any more deadlocks even when running
with maxconn 1 on a server under severely constrained memory conditions.

The code becomes a bit tricky : it relies on the scheduler's run queue to
estimate how many sessions are already expected to run so that it doesn't
wake up everyone with too few resources. A better solution would probably
consist in having two queues, one for urgent requests and one for normal
requests. A failed allocation for a session dealing with an error, a
connection event, or the need for a response (or request when there's an
applet on the left) would go to the urgent request queue, while other
requests would go to the other queue. Urgent requests would be served
from 1 entry in the pool, while the regular ones would be served only
according to the reserve. Despite not yet having this, it works
remarkably well.

This mechanism is quite efficient, we don't perform too many wake up calls
anymore. For 1 million sessions elapsed during massive memory contention,
we observe about 4.5M calls to process_session() compared to 4.0M without
memory constraints. Previously we used to observe up to 16M calls, which
roughly means 12M failures.

During a test run under high memory constraints (limit enforced to 27 MB
instead of the 58 MB normally needed), performance used to drop by 53% prior
to this patch. Now with this patch instead it *increases* by about 1.5%.

The best effect of this change is that by limiting the memory usage to about
2/3 to 3/4 of what is needed by default, it's possible to increase performance
by up to about 18% mainly due to the fact that pools are reused more often
and remain hot in the CPU cache (observed on regular HTTP traffic with 20k
objects, buffers.limit = maxconn/10, buffers.reserve = limit/2).

Below is an example of scenario which used to cause a deadlock previously :
  - connection is received
  - two buffers are allocated in process_session() then released
  - one is allocated when receiving an HTTP request
  - the second buffer is allocated then released in process_session()
    for request parsing then connection establishment.
  - poll() says we can send, so the request buffer is sent and released
  - process session gets notified that the connection is now established
    and allocates two buffers then releases them
  - all other sessions do the same till one cannot get the request buffer
    without hitting the margin
  - and now the server responds. stream_interface allocates the response
    buffer and manages to get it since it's higher priority being for a
    response.
  - but process_session() cannot allocate the request buffer anymore

  => We could end up with all buffers used by responses so that none may
     be allocated for a request in process_session().

When the applet processing leaves the session context, the test will have
to be changed so that we always allocate a response buffer regardless of
the left side (eg: H2->H1 gateway). A final improvement would consist in
being able to only retry the failed I/O operation without waking up a
task, but to date all experiments to achieve this have proven not to be
reliable enough.
2014-12-24 23:47:33 +01:00
Willy Tarreau
bf883e0aa7 MAJOR: session: implement a wait-queue for sessions who need a buffer
When a session_alloc_buffers() fails to allocate one or two buffers,
it subscribes the session to buffer_wq, and waits for another session
to release buffers. It's then removed from the queue and woken up with
TASK_WOKEN_RES, and can attempt its allocation again.

We decide to try to wake as many waiters as we release buffers so
that if we release 2 and two waiters each need only one, they both get
their chance. We must never come to the situation where we don't wake
enough tasks up.
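
The wake-up policy can be sketched as follows, using the buffer_wq
mentioned above (list handling simplified, field names are assumptions):

    /* Sketch: wake up as many waiting sessions as buffers just released. */
    void session_offer_buffers(int nbuf)
    {
        struct session *sess;

        while (nbuf-- > 0 && !LIST_ISEMPTY(&buffer_wq)) {
            sess = LIST_ELEM(buffer_wq.n, struct session *, buffer_wait);
            LIST_DEL(&sess->buffer_wait);
            LIST_INIT(&sess->buffer_wait);
            task_wakeup(sess->task, TASK_WOKEN_RES);
        }
    }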

It's common to release buffers after the completion of an I/O callback,
which can happen even if the I/O could not be performed due to a partial
memory allocation failure. In this situation, we don't want to move
out of the wait queue the session that was just added, otherwise it
will never get any buffer. Thus, we only force ourselves out of the
queue when freeing the session.

Note: at the moment, since session_alloc_buffers() is not used, no task
is subscribed to the wait queue.
2014-12-24 23:47:33 +01:00
Willy Tarreau
656859d478 MEDIUM: session: implement a basic atomic buffer allocator
This patch introduces session_alloc_recv_buffer(), session_alloc_buffers()
and session_release_buffers() whose purpose will be to allocate missing
buffers and release unneeded ones around process_session() and during
I/O operations.

I/O callbacks only need a single buffer for recv operations and none
for send. However we still want to ensure that we don't pick the last
buffer. That's what session_alloc_recv_buffer() is for.

This allocator is atomic in that it either ensures we can get both buffers
or fails. Here, if any of the buffers is not ready and cannot be
allocated, the operation is cancelled. The purpose is to guarantee that
we don't enter the deadlock where all buffers are allocated by the
same side of all sessions.

A queue will have to be implemented for failed allocations. For now
they're just reported as failures.
2014-12-24 23:47:32 +01:00
Willy Tarreau
4428a29e52 MEDIUM: channel: do not report full when buf_empty is present on a channel
Till now we'd consider a buffer full even if it had size==0, as is the case
when the channel still points to buf_empty. Now we change this : if buf_wanted
is present, it means that we have already tried to allocate a buffer but
failed. Thus the buffer must be
considered full so that we stop trying to poll for reads on it. Otherwise if
it's empty, it's buf_empty and we report !full since we may allocate it on
the fly.
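
The decision boils down to two special cases checked before the usual
fullness computation (sketch):

    /* Sketch: placeholder buffers change the meaning of "full". */
    if (chn->buf == &buf_wanted)
        return 1;   /* allocation already failed once: report full, stop polling for reads */
    if (chn->buf == &buf_empty)
        return 0;   /* no buffer yet, but one may be allocated on the fly: not full */
    /* otherwise fall through to the regular size/reserve computation */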
2014-12-24 23:47:32 +01:00
Willy Tarreau
2a4b54359b MEDIUM: buffer: always assign a dummy empty buffer to channels
Channels are now created with a valid pointer to a buffer before the
buffer is allocated. This buffer is a global one called "buf_empty" and
of size zero. Thus it prevents any activity from being performed on
the buffer and still ensures that chn->buf may always be dereferenced.
b_free() also resets the buffer to &buf_empty, and was split into
b_drop() which does not reset the buffer.
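
In outline (sketch; exact declarations may differ):

    /* Sketch: a zero-sized global buffer which is always safe to dereference. */
    struct buffer buf_empty = { .p = buf_empty.data };

    static inline void b_drop(struct buffer *buf)
    {
        pool_free2(pool2_buffer, buf);      /* release without touching the channel */
    }

    static inline void b_free(struct buffer **buf)
    {
        b_drop(*buf);
        *buf = &buf_empty;                  /* leave a valid, empty placeholder behind */
    }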
2014-12-24 23:47:32 +01:00
Willy Tarreau
474cf54a97 MINOR: buffer: reset a buffer in b_reset() and not channel_init()
We'll soon need to be able to switch buffers without touching the
channel, so let's move buffer initialization out of channel_init().
We had the same in compression.c.
2014-12-24 23:47:31 +01:00
Willy Tarreau
3dd6a25323 MINOR: stream-int: retrieve session pointer from stream-int
sess_from_si() does this via the owner (struct task). It works because
all stream ints belong to a task nowadays.
2014-12-24 23:47:31 +01:00
Lukas Tribus
e4e30f7d52 BUILD: ssl: use OPENSSL_NO_OCSP to detect OCSP support
Since commit 656c5fa7e8 ("BUILD: ssl: disable OCSP when using
boringssl") the OCSP code is bypassed when OPENSSL_IS_BORINGSSL
is defined. The correct thing to do here is to use OPENSSL_NO_OCSP
instead, which is defined for this exact purpose in
openssl/opensslfeatures.h.
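
The guard then reduces to the standard feature macro (sketch):

    /* Sketch: rely on the feature macro rather than on the library's
     * identity, so any OCSP-less build is handled the same way. */
    #ifndef OPENSSL_NO_OCSP
    #include <openssl/ocsp.h>
    /* OCSP stapling support compiled in */
    #endif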

This makes haproxy forward compatible if boringssl ever introduces
full OCSP support, with the additional benefit that it links fine
against an OCSP-disabled openssl.

Signed-off-by: Lukas Tribus <luky-37@hotmail.com>
2014-12-09 20:49:22 +01:00
Thierry FOURNIER
315ec4217f BUG/MEDIUM: pattern: don't load more than once a pattern list.
A memory optimization can use the same pattern expression for many
identical pattern lists (same parse method, index method and index_smp
method).

The pattern expression is returned by "pattern_new_expr", but this
function doesn't indicate whether the returned expression is already in use.

So the caller reloads the list of patterns on top of the existing ones.
This behavior is not a problem with tree-indexed patterns, but it makes
list-indexed patterns grow.

This fix adds a "reuse" flag returned by "pattern_new_expr". If the flag
is set, the caller assumes that the patterns are already loaded.

This fix must be backported into 1.5.
2014-11-24 15:40:16 +01:00
Willy Tarreau
5be2f35231 MAJOR: polling: centralize calls to I/O callbacks
In order for HTTP/2 not to eat too much memory, we'll have to support
on-the-fly buffer allocation, since most streams will have an empty
request buffer at some point. Supporting allocation on the fly means
being able to sleep inside I/O callbacks if a buffer is not available.

Till now, the I/O callbacks were called from two locations :
  - when processing the cached events
  - when processing the polled events from the poller

This change cleans up the design a bit further than what was started in
1.5. It now ensures that we never call any iocb from the poller itself
and that instead, events learned by the poller are put into the cache.
The benefit is important in terms of stability : we don't have to care
anymore about the risk that new events are added into the poller while
processing its events, and we're certain that updates are processed at
a single location.

To achieve this, we now modify all the fd_* functions so that instead of
creating updates, they add/remove the fd to/from the cache depending on
its state, and only create an update when the polling status reaches a
state where it will have to change. Since the pollers make use of these
functions to notify readiness (using fd_may_recv/fd_may_send), the cache
is always up to date with the poller.
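
In spirit, the readiness helpers end up looking like the sketch below
(flag and helper names are assumptions):

    /* Sketch: record readiness in the cache; a polling update is only
     * queued when the polled state really has to change. */
    static inline void fd_may_recv(int fd)
    {
        if (fdtab[fd].state & FD_EV_READY_R)
            return;                         /* already known ready: nothing to do */
        fdtab[fd].state |= FD_EV_READY_R;
        fd_update_cache(fd);                /* the iocb will run from the cache loop */
    }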

Creating updates only when the polling status needs to change saves a
significant amount of work for the pollers : a benchmark showed that on
a typical TCP proxy test, the amount of updates per connection dropped
from 11 to 1 on average. This also means that the update list is smaller
and has more chances of not thrashing too many CPU cache lines. The first
observed benefit is a net 2% performance gain on the connection rate.

A second benefit is that when a connection is accepted, it's only when
we're processing the cache, and the recv event is automatically added
into the cache *after* the current one, resulting in this event being
processed immediately during the same loop. Previously we used to have
a second run over the updates to detect if new events were added to
catch them before waking up tasks.

The next gain will be offered by the next steps on this subject consisting
in implementing an I/O queue containing all cached events ordered by priority
just like the run queue, and to be able to leave some events pending there
as long as needed. That will allow us *not* to perform some FD processing
if it's not the proper time for this (typically keep waiting for a buffer
to be allocated if none is available for a recv()). And by only processing
a small bunch of them, we'll allow priorities to take place even at the I/O
level.

As a result of this change, functions fd_alloc_or_release_cache_entry()
and fd_process_polled_events() have disappeared, and the code dedicated
to checking for new fd events after the callback during the poll() loop
was removed as well. Despite the patch looking large, it's mostly a
change of which function is called upon fd_*() and almost nothing was
added.
2014-11-21 20:37:32 +01:00
KOVACS Krisztian
b3e54fe387 MAJOR: namespace: add Linux network namespace support
This patch makes it possible to create binds and servers in separate
namespaces.  This can be used to proxy between multiple completely independent
virtual networks (with possibly overlapping IP addresses) and a
non-namespace-aware proxy implementation that supports the proxy protocol (v2).

The setup is something like this:

net1 on VLAN 1 (namespace 1) -\
net2 on VLAN 2 (namespace 2) -- haproxy ==== proxy (namespace 0)
net3 on VLAN 3 (namespace 3) -/

The proxy is configured to make server connections through haproxy and to
send the expected source/target addresses to haproxy using the proxy
protocol.

The network namespace setup on the haproxy node is something like this:

= 8< =
$ cat setup.sh
ip netns add 1
ip link add link eth1 type vlan id 1
ip link set eth1.1 netns 1
ip netns exec 1 ip addr add 192.168.91.2/24 dev eth1.1
ip netns exec 1 ip link set eth1.$id up
...
= 8< =

= 8< =
$ cat haproxy.cfg
frontend clients
  bind 127.0.0.1:50022 namespace 1 transparent
  default_backend scb

backend server
  mode tcp
  server server1 192.168.122.4:2222 namespace 2 send-proxy-v2
= 8< =

A bind line creates the listener in the specified namespace, and connections
originating from that listener also have their network namespace set to
that of the listener.

A server line either forces the connection to be made in a specified
namespace or may use the namespace from the client-side connection if that
was set.
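
The core of the mechanism is creating the socket while the thread
temporarily sits in the target namespace, roughly as below (sketch;
the function name and error handling are simplified assumptions):

    /* Sketch: enter the target network namespace, create the socket there,
     * then switch back to the default namespace. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <sys/socket.h>

    int socket_in_netns(int ns_fd, int default_ns_fd, int domain, int type, int protocol)
    {
        int sock;

        if (setns(ns_fd, CLONE_NEWNET) < 0)
            return -1;
        sock = socket(domain, type, protocol);
        setns(default_ns_fd, CLONE_NEWNET);     /* always switch back */
        return sock;
    }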

For more documentation please read the documentation included in the patch
itself.

Signed-off-by: KOVACS Tamas <ktamas@balabit.com>
Signed-off-by: Sarkozi Laszlo <laszlo.sarkozi@balabit.com>
Signed-off-by: KOVACS Krisztian <hidden@balabit.com>
2014-11-21 07:51:57 +01:00
Willy Tarreau
eb11889f1e MINOR: task: release the task pool when stopping
When we're stopping, we're not going to create new tasks anymore, so
let's release the task pool upon each task_free() in order to reduce
memory fragmentation.
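
The change amounts to something like the sketch below (the pool flush
call is an assumption):

    /* Sketch: while stopping, hand freed tasks straight back so the
     * allocator can release the memory instead of keeping it fragmented. */
    static inline void task_free(struct task *t)
    {
        pool_free2(pool2_task, t);
        if (unlikely(stopping))
            pool_flush2(pool2_task);
    }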
2014-11-13 16:57:19 +01:00
Willy Tarreau
4e21ff9244 BUG/MEDIUM: http: adjust close mode when switching to backend
Commit 179085c ("MEDIUM: http: move Connection header processing earlier")
introduced a regression : the backend's HTTP mode is not considered anymore
when setting the session's HTTP mode, because wait_for_request() is only
called once, when the frontend receives the request (or when the frontend
is in TCP mode, when the backend receives the request).

The net effect is that in some situations when the frontend and the backend
do not work in the same mode (eg: keep-alive vs close), the backend's mode
is ignored.

This patch moves all that processing to a dedicated function, which is
called from the original place, as well as from session_set_backend()
when switching from an HTTP frontend to an HTTP backend in different
modes.

This fix must be backported to 1.5.
2014-09-30 18:44:22 +02:00
Dave McCowan
328fb58d74 MEDIUM: connection: add new bit in Proxy Protocol V2
There are two sample fetches to get information about the presence of a
client certificate :
  - ssl_fc_has_crt is true if a certificate is present in the current
    connection ;
  - ssl_c_used is true if a certificate is present in the session.
If a session has stopped and resumed, then ssl_c_used could be true, while
ssl_fc_has_crt is false.

In the client byte of the TLS TLV of Proxy Protocol V2, there is only one
bit to indicate whether a certificate is present on the connection.  The
attached patch adds a second bit to indicate the presence for the session.

This maintains backward compatibility.
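
In terms of the client field of the TLS TLV, this amounts to one more bit
next to the existing ones (sketch; the exact values here are assumptions):

    /* Sketch of the client bits in the PP2 SSL TLV. */
    #define PP2_CLIENT_SSL        0x01   /* client connected over SSL/TLS */
    #define PP2_CLIENT_CERT_CONN  0x02   /* certificate presented on this connection */
    #define PP2_CLIENT_CERT_SESS  0x04   /* certificate presented at least once over the session */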

[wt: this should be backported to 1.5 to help maintain compatibility
 between versions]
2014-08-23 07:35:29 +02:00
Lukas Tribus
656c5fa7e8 BUILD: ssl: disable OCSP when using boringssl
Google's boringssl doesn't currently support OCSP, so
disable it if detected.

OCSP support may be reintroduced as per:
https://code.google.com/p/chromium/issues/detail?id=398677

In that case we can simply revert this commit.

Signed-off-by: Lukas Tribus <luky-37@hotmail.com>
2014-08-18 14:33:48 +02:00
Godbach
e468d55998 BUG/MINOR: server: move the directive #endif to the end of file
If a source file includes proto/server.h twice or more, redefinition errors will
be triggered for such inline functions as server_throttle_rate(),
server_is_draining(), srv_adm_set_maint() and so on. Just move the #endif
directive to the end of the file to solve this issue.
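
In other words, the include guard must wrap the inline functions as well
(sketch of the intended layout):

    #ifndef _PROTO_SERVER_H
    #define _PROTO_SERVER_H

    /* inline helpers such as server_throttle_rate(), server_is_draining(),
     * srv_adm_set_maint(), ... must all sit inside the guard */

    #endif /* _PROTO_SERVER_H */   /* moved to the very end of the file */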

Signed-off-by: Godbach <nylzhaowei@gmail.com>
2014-07-29 11:03:14 +02:00
Willy Tarreau
09448f7d7c MEDIUM: http: add the track-sc* actions to http-request rules
Add support for http-request track-sc, similar to what is done in
tcp-request for backends. A new act_prm field was added to HTTP
request rules to store the track params (table, counter). Just
like for TCP rules, the table is resolved while checking for
config validity. The code was mostly copied from the TCP code
with the exception that here we also count the HTTP request count
and rate by hand. Probably that something could be factored out in
the future.

It seems like tracking flags should be improved to mark each hook
which tracks a key so that we can have some check points where to
increase counters of the past if not done yet, a bit like what is done
for TRACK_BACKEND.
2014-07-16 17:26:40 +02:00
Willy Tarreau
edee1d60b7 MEDIUM: stick-table: make it easier to register extra data types
Some users want to add their own data types to stick tables. We don't
want to use a linked list here for performance reasons, so we need to
continue to use an indexed array. This patch allows one to reserve a
compile-time-defined number of extra data types by setting the new
macro STKTABLE_EXTRA_DATA_TYPES to anything greater than zero, keeping
in mind that anything larger will slightly inflate the memory consumed
by stick tables (not per entry though).

Then calling stktable_register_data_store() with the new keyword will
either register a new keyword or fail if the desired entry was already
taken or the keyword already registered.

Note that this patch does not dictate how the data will be used, it only
offers the possibility to create new keywords and have an index to
reference them in the config and in the tables. The caller will not be
able to use stktable_data_cast() and will have to explicitly cast the
stable pointers to the expected types. It can be used for experimentation
as well.
2014-07-15 19:14:52 +02:00
Willy Tarreau
e12704bfc7 MINOR: session: export the function 'smp_fetch_sc_stkctr'
This one is sometimes useful outside of this file.
2014-07-15 19:09:56 +02:00
Thierry FOURNIER
055b9d5c63 MINOR: http: export the function 'smp_fetch_base32'
It's sometimes useful outside of proto_http.c.
2014-07-15 19:09:36 +02:00
Willy Tarreau
8fed9037cd MEDIUM: stick-table: implement lookup from a sample fetch
Currently we have stktable_fetch_key() which fetches a sample according
to an expression and returns a stick table key, but we also need a function
which does only the second half of it from a known sample. So let's cut the
function in two and introduce smp_to_stkey() to perform this lookup. The
first function was adapted to make use of it in order to avoid code
duplication.
2014-07-10 16:43:44 +02:00
Willy Tarreau
fd0e008d9d BUG/MEDIUM: unix: completely unbind abstract sockets during a pause()
Abstract namespace sockets ignore the shutdown() call and do not make
it possible to temporarily stop listening. The issue it causes is that
during a soft reload, the new process cannot bind, complaining that the
address is already in use.

This change registers a new pause() function for unix sockets and
completely unbinds the abstract ones since it's possible to rebind
them later. It requires the two previous patches as well as preceding
fixes.
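
The unix-specific pause handler then boils down to the sketch below
(function names are assumptions):

    /* Sketch: path-based sockets need nothing special, abstract ones must
     * be fully unbound so the new process can take over the address. */
    static int uxst_pause_listener(struct listener *l)
    {
        if (((struct sockaddr_un *)&l->addr)->sun_path[0])
            return 1;               /* regular socket: nothing to do */
        unbind_listener(l);         /* abstract socket: release it completely */
        return 0;
    }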

This fix should be backported into 1.5 since the issue appears there.
2014-07-08 01:13:35 +02:00
Willy Tarreau
092d865c53 MEDIUM: listener: implement a per-protocol pause() function
In order to fix the abstract socket pause mechanism during soft restarts,
we'll need to proceed differently depending on the socket protocol. The
pause_listener() function already supports some protocol-specific handling
for the TCP case.

This commit makes this cleaner by adding a new ->pause() function to the
protocol struct, which, if defined, may be used to pause a listener of a
given protocol.

For now, only TCP has been adapted, with the specific code moved from
pause_listener() to tcp_pause_listener().
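
Schematically (sketch; member and return conventions are assumptions):

    /* Sketch: the protocol may provide its own pause method. */
    struct protocol {
        /* ... existing members ... */
        int (*pause)(struct listener *l);   /* <0: failed, 0: fully unbound, >0: paused */
    };

    int pause_listener(struct listener *l)
    {
        if (l->proto->pause && l->proto->pause(l) < 0)
            return 0;               /* protocol-specific pause failed */
        /* ... generic state handling ... */
        return 1;
    }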
2014-07-08 01:13:34 +02:00
Willy Tarreau
18324f574f MEDIUM: log: support a user-configurable max log line length
With all the goodies supported by logformat, people find that the limit
of 1024 chars for log lines is too short. Some servers do not support
larger lines and can simply drop them, so changing the default value is
not always the best choice.

This patch takes a different approach. Log line length is specified per
log server on the "log" line, with a value between 80 and 65535. That
way it's possible to satisfy all needs, even with some fat local servers
and small remote ones.
2014-06-27 18:13:53 +02:00
Willy Tarreau
b5975defba MINOR: stick-table: make stktable_fetch_key() indicate why it failed
stktable_fetch_key() does not indicate whether it returns NULL because
the input sample was not found or because it's unstable. It causes trouble
with track-sc* rules. Just like with sample_fetch_string(), we want it to
be able to give more information to the caller about what it found. Thus,
now we use the pointer to a sample passed by the caller, and fill it with
the information we have about the sample. That way, even if we return NULL,
the caller has the ability to check whether a sample was found and if it is
still changing or not.
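
On the caller's side the distinction can then be used as in the sketch
below (the exact parameter list is an assumption):

    /* Sketch: a NULL key with SMP_F_MAY_CHANGE set means "not found yet",
     * so track-sc processing may wait for more data instead of giving up. */
    struct sample smp;
    struct stktable_key *key;

    memset(&smp, 0, sizeof(smp));
    key = stktable_fetch_key(t, px, sess, txn, opt, expr, &smp);
    if (!key && (smp.flags & SMP_F_MAY_CHANGE))
        return 0;   /* sample exists but is still changing: retry later */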
2014-06-25 17:17:53 +02:00
Emeric Brun
0abf836ecb BUG/MINOR: ssl: Fix external function in order not to return a pointer on an internal trash buffer.
'ssl_sock_get_common_name' applied to a connection was also renamed
'ssl_sock_get_remote_common_name'. Currently, this function is only used
with protocol PROXYv2 to retrieve the client certificate's common name.
A further usage could be to retrieve the server certificate's common name
on an outgoing connection.
2014-06-24 22:39:16 +02:00
Emeric Brun
4147b2ef10 MEDIUM: ssl: basic OCSP stapling support.
The support is all based on static responses. This doesn't add any
request / response logic to HAProxy, but allows a way to update
information through the socket interface.

Currently certificates specified using "crt" or "crt-list" on "bind" lines
are loaded as PEM files.
For each PEM file, haproxy checks for the presence of a file at the same path
suffixed by ".ocsp". If such a file is found, support for the TLS Certificate
Status Request extension (also known as "OCSP stapling") is automatically
enabled. The content of this file is optional. If not empty, it must contain
a valid OCSP Response in DER format. In order to be valid an OCSP Response
must comply with the following rules: it has to indicate a good status,
it has to be a single response for the certificate of the PEM file, and it
has to be valid at the moment of addition. If these rules are not respected
the OCSP Response is ignored and a warning is emitted. In order to identify
which certificate an OCSP Response applies to, the issuer's certificate is
necessary. If the issuer's certificate is not found in the PEM file, it will
be loaded from a file at the same path as the PEM file suffixed by ".issuer"
if it exists, otherwise it will fail with an error.

It is possible to update an OCSP Response from the unix socket using:

  set ssl ocsp-response <response>

This command is used to update an OCSP Response for a certificate (see "crt"
on "bind" lines). Same controls are performed as during the initial loading of
the response. The <response> must be passed as a base64 encoded string of the
DER encoded response from the OCSP server.

Example:
  openssl ocsp -issuer issuer.pem -cert server.pem \
               -host ocsp.issuer.com:80 -respout resp.der
  echo "set ssl ocsp-response $(base64 -w 10000 resp.der)" | \
               socat stdio /var/run/haproxy.stat

This feature is automatically enabled on openssl 0.9.8h and above.

This work was performed jointly by Dirkjan Bussink of GitHub and
Emeric Brun of HAProxy Technologies.
2014-06-18 18:28:56 +02:00
Sasha Pachev
218f064f55 MEDIUM: http: add actions "replace-header" and "replace-values" in http-req/resp
This patch adds two new actions to http-request and http-response rulesets :
  - replace-header : replace a whole header line, suited for headers
                     which might contain commas
  - replace-value  : replace a single header value, suited for headers
                     defined as lists.

The match consists in a regex, and the replacement string takes a log-format
and supports back-references.
2014-06-17 18:34:32 +02:00
Willy Tarreau
4bfc580dd3 MEDIUM: session: maintain per-backend and per-server time statistics
Using the last rate counters, we now compute the queue, connect, response
and total times per server and per backend with a 95% accuracy over the last
1024 samples. The operation is cheap so we don't need to condition it.
2014-06-17 17:15:56 +02:00
Willy Tarreau
2438f2b984 MINOR: freq_ctr: introduce a new averaging method
While the current functions report average event counts per period, we are
also interested in average values per event. For this we use a different
method. The principle is to rely on a long tail which sums the new value
with a fraction of the previous value, resulting in a sliding window of
infinite length depending on the precision we're interested in.

The idea is that we always keep (N-1)/N of the sum and add the new sampled
value. The sum over N values can be computed with a simple program for a
constant value 1 at each iteration :

    N
  ,---
   \       N - 1              e - 1
    >  ( --------- )^x ~= N * -----
   /         N                  e
  '---
  x = 1

Note: I'm not sure how to demonstrate this but at least this is easily
verified with a simple program : the sum equals N * 0.632120 for any N
moderately large (tens to hundreds).

Inserting a constant sample value V here simply results in :

   sum = V * N * (e - 1) / e

However we don't want to integrate over a small period but over an infinite
one. Let's cut the infinity into P periods of N values. Each period M is
exactly the same
as period M-1 with a factor of ((N-1)/N)^N applied. A test shows that given a
large N :

     N - 1           1
  ( ------- )^N ~=  ---
       N             e

Our sum is now a sum of each factor times  :

   N*P                                     P
  ,---                                   ,---
   \         N - 1               e - 1    \     1
    >  v ( --------- )^x ~= VN * -----  *  >   ---
   /           N                   e      /    e^x
  '---                                   '---
  x = 1                                  x = 0

For P "large enough", in tests we get this :

    P
  ,---
   \     1        e
    >   --- ~=  -----
   /    e^x     e - 1
  '---
  x = 0

This simplifies the sum above :

   N*P
  ,---
   \         N - 1
    >  v ( --------- )^x = VN
   /           N
  '---
  x = 1

So basically, by summing values and applying an (N-1)/N factor to the last
result, we just get N times the values over the long term, so we can recover
the constant value V by dividing by N.
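
A self-contained sketch of that update rule (names below are illustrative):

    /* Sketch: keep (N-1)/N of the running sum and add the new sample.
     * Dividing the sum by N recovers the average over the sliding window;
     * with N a power of two both operations stay cheap. */
    static inline unsigned int swrate_add(unsigned int *sum, unsigned int n, unsigned int v)
    {
        return *sum = *sum - *sum / n + v;
    }

    static inline unsigned int swrate_avg(unsigned int sum, unsigned int n)
    {
        return (sum + n - 1) / n;   /* round up so small sums don't read as zero */
    }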

A value added at the entry of the sliding window of N values will thus be
reduced to 1/e or 36.7% after N terms have been added. After a second batch,
it will only be 1/e^2, or 13.5%, and so on. So practically speaking, each
old period of N values represents only a quickly fading ratio of the global
sum :

  period    ratio
    1       36.7%
    2       13.5%
    3       4.98%
    4       1.83%
    5       0.67%
    6       0.25%
    7       0.09%
    8       0.033%
    9       0.012%
   10       0.0045%

So after 10N samples, the initial value has already faded out by a factor of
22026, which is quite fast. If the sliding window is 1024 samples wide, it
means that a sample will only count for 1/22k of its initial value after 10k
samples went after it, which results in half of the value it would represent
using an arithmetic mean. The benefit of this method is that it's very cheap
in terms of computations when N is a power of two. This is very well suited
to record response times as large values will fade out faster than with an
arithmetic mean and will depend on sample count and not time.

Demonstrating all the above assumptions with maths instead of a program is
left as an exercise for the reader.
2014-06-17 17:15:51 +02:00
Willy Tarreau
bfc7b7acd8 MAJOR: checks: add support for a new "drain" administrative mode
This patch adds support for a new "drain" mode. So now we have 3 admin
modes for a server :
  - READY
  - DRAIN
  - MAINT

The drain mode disables load balancing but leaves the server up. It can
coexist with maint, except that maint has precedence. It is also inherited
from tracked servers, so just like maint, it's represented with 2 bits.
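
Just like maint, drain can then be represented by one "forced" bit and one
"inherited" bit (sketch; the exact values are assumptions):

    /* Sketch of the administrative flags; READY is the absence of both modes. */
    #define SRV_ADMF_FMAINT  0x01   /* maintenance forced on this server */
    #define SRV_ADMF_IMAINT  0x02   /* maintenance inherited from a tracked server */
    #define SRV_ADMF_MAINT   (SRV_ADMF_FMAINT | SRV_ADMF_IMAINT)
    #define SRV_ADMF_FDRAIN  0x04   /* drain forced on this server */
    #define SRV_ADMF_IDRAIN  0x08   /* drain inherited from a tracked server */
    #define SRV_ADMF_DRAIN   (SRV_ADMF_FDRAIN | SRV_ADMF_IDRAIN)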

New functions were designed to set/clear each flag and to propagate the
changes to tracking servers when relevant, and to log the changes. Existing
functions srv_set_adm_maint() and srv_set_adm_ready() were replaced to make
use of the new functions.

Currently the drain mode is not yet used, however the whole logic was tested
with all combinations of set/clear of both flags in various orders to catch
all corner cases.
2014-05-23 14:29:11 +02:00
Willy Tarreau
8eb7784634 MINOR: server: implement srv_set_stopping()
This function was taken from check_set_server_drain(). It does not
consider health checks at all and only sets a server to stopping
provided it's not in maintenance and is not currently stopped. The
resulting state will be STOPPING. The state change is propagated
to tracked servers.

For now the function is not used, but the goal is to split health
checks status from server status and to be able to change a server's
state regardless of health checks statuses.
2014-05-23 14:29:11 +02:00
Willy Tarreau
dbd5e78f5b MINOR: server: implement srv_set_running()
This function was taken from check_set_server_up(). It does not consider
health checks at all and only sets a server up provided it's not in
maintenance. The resulting state may be either RUNNING or STARTING
depending on the presence of a slowstart or not. The state change is
propagated to tracked servers.

For now the function is not used, but the goal is to split health
checks status from server status and to be able to change a server's
state regardless of health checks statuses.
2014-05-23 14:29:11 +02:00
Willy Tarreau
e7d1ef16bf MINOR: server: implement srv_set_stopped()
This function was extracted from check_set_server_down(). It only
manipulates the server state and does not consider the health checks
at all, nor does it modify their status. It takes a reason message to
report in logs, however it passes NULL when recursing through the
trackers chain.

For now the function is not used, but the goal is to split health
checks status from server status and to be able to change a server's
state regardless of health checks statuses.
2014-05-23 14:29:11 +02:00
Willy Tarreau
bda92271e6 MINOR: server: make the status reporting function support a reason
srv_adm_append_status() was renamed srv_append_status() since it's no
longer dedicated to maintenance mode. It now supports a reason which, if
not null, is appended to the output string.
2014-05-23 14:29:11 +02:00