haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2025-03-05 10:58:14 +00:00

Author	SHA1	Message	Date
Willy Tarreau	73796535a9	REORG/MEDIUM: channel: only use chn_prod / chn_cons to find stream-interfaces The purpose of these two macros will be to pass via the session to find the relevant stream interfaces so that we don't need to store the ->cons nor ->prod pointers anymore. Currently they're only defined so that all references could be removed. Note that many places need a second pass of clean up so that we don't have any chn_prod(&s->req) anymore and only &s->si[0] instead, and conversely for the 3 other cases.	2015-03-11 20:41:47 +01:00
Willy Tarreau	50fe03be78	CLEANUP: stream-int: add si_opposite() to find the other stream interface At a few places we need to find one stream interface from the other one. Instead of passing via the channel, we simply use the session as an intermediary, which simply results in applying an offset to the pointer.	2015-03-11 20:41:47 +01:00
Willy Tarreau	4e4292b9af	CLEANUP: stream-int: add si_ib/si_ob to dereference the buffers This makes the code cleaner and is more intuitive to use.	2015-03-11 20:41:46 +01:00
Willy Tarreau	819d332dfd	MEDIUM: stream-int: remove any reference to the owner si->owner is not used anymore now, so let's remove any reference to it.	2015-03-11 20:41:46 +01:00
Willy Tarreau	07373b8660	MEDIUM: stream-int: use si_task() to retrieve the task from the stream int We go back to the session to get the owner. Here again it's very easy and is just a matter of relative offsets. Since the owner always exists and always points to the session's task, we can remove some unneeded tests.	2015-03-11 20:41:46 +01:00
Willy Tarreau	aefd79004c	MEDIUM: stream-int: make si_sess() use the stream int's side This one relies on the SI's side to find the pointer to the session. That way the stream interface doesn't have to look at the task's context anymore.	2015-03-11 20:41:46 +01:00
Willy Tarreau	a2df3fa251	MEDIUM: stream-interface: remove now unused pointers to channels Everyone must now use si_ic() / si_oc() to find the relevant channels, the pointers have been totally removed.	2015-03-11 20:41:46 +01:00
Willy Tarreau	0b2fb7f9a3	MAJOR: stream-int: only rely on SI_FL_ISBACK to find the requested channel In order to plan removal of si->ib / si->ob, we now check the side of the stream interface and find the session, then the requested channel. In practice it's just an offset applied to the pointer based on the flag.	2015-03-11 20:41:46 +01:00
Willy Tarreau	a5f5d8dc69	MEDIUM: stream-int: add a flag indicating which side the SI is on This new flag "SI_FL_ISBACK" is set only on the back SI and is cleared on the front SI. That way it's possible only by looking at the SI to know what side it is.	2015-03-11 20:41:46 +01:00
Willy Tarreau	2bb4a96f8f	REORG/MEDIUM: stream-int: introduce si_ic/si_oc to access channels We'll soon remove direct references to the channels from the stream interface since everything belongs to the same session, so let's first not dereference si->ib / si->ob anymore and use macros instead.	2015-03-11 20:41:46 +01:00
Willy Tarreau	a27dc19eda	CLEANUP: remove now unused channel pool The channels are now part of the struct session. Their pool is not needed anymore.	2015-03-11 20:41:46 +01:00
Willy Tarreau	22ec1eadd0	REORG/MAJOR: move session's req and resp channels back into the session The channels were pointers to outside structs and this is not needed anymore since the buffers have moved, but this complicates operations. Move them back into the session so that both channels and stream interfaces are always allocated for a session. Some places (some early sample fetch functions) used to validate that a channel was NULL prior to dereferencing it. Now instead we check if chn->buf is NULL and we force it to remain NULL until the channel is initialized.	2015-03-11 20:41:46 +01:00
Thierry FOURNIER	2694a1a3c8	MINOR: lua: fetches and converters can return an empty string in place of nil In some cases we don't want to know if a fetch or converter fails. We just want a valid string. After this patch, we have two sets of fetches and two sets of converters. There are: txn.f, txn.sf, txn.c, txn.sc. The version prefixed by 's' always returns strings for any type, and returns an empty string in the error case or when the data are not available. This is particularly useful when manipulating headers or cookies.	2015-03-11 20:26:49 +01:00
Thierry FOURNIER	594afe76e4	MINOR: lua: wrapper for converters This patch implements a wrapper to give access to the converters in the Lua code. The converters are used with the transaction. The automatically created functions are prefixed by "conv_".	2015-03-11 19:55:10 +01:00
Thierry FOURNIER	8fd1376014	MINOR: converters: add function to browse converters This patch adds a function to browse each converter. This is used with Lua for using the converters with a wrapper.	2015-03-11 19:55:10 +01:00
Thierry FOURNIER	bb53c7b687	MEDIUM: lua: create a namespace for the fetches HAProxy proposes many sample fetches. It is possible that the automatic registration of the sample fetches causes a collision with an existing Lua function. This patch sets a namespace for the sample fetches.	2015-03-11 19:55:10 +01:00
Thierry FOURNIER	d2b597aa10	BUG/MEDIUM: lua: segfault with buffer_replace2 The function buffer_contig_space() returns the contiguous space available to add data (at the end of the input side) while the function hlua_channel_send_yield() needs to insert data starting at p. Here we introduce a new function bi_space_for_replace() which returns the amount of space that can be inserted at the head of the input side with one of the buffer_replace* functions. This patch proposes a function that returns the space available after buf->p.	2015-03-09 18:12:59 +01:00
Thierry FOURNIER	53e08ecc41	BUG/MEDIUM: lua: the Lua process is not waked up after sending data on requests side If we are writing in the request buffer, we are not woken up when the data are forwarded because it is useless. The request analyzers are woken up only when data is incoming. So, if the request buffer is full, we set the WAKE_ON_WRITE flag.	2015-03-09 17:47:52 +01:00
Thierry FOURNIER	ef6a2115fd	BUG/MEDIUM: lua: fix infinite loop about channel Before this patch, each yield in a Lua action set a flag to be woken up when some activity was detected on the response channel. This behavior causes a loop in the analyzer process. This patch sets the wake-up on response buffer activity only if we really want to be woken up on this activity.	2015-03-09 17:47:52 +01:00
Thierry FOURNIER	bd1f1325e1	MINOR: lua: add the struct session in the lua channel struct This is used later to modify some flags in the session.	2015-03-09 17:47:52 +01:00
Thierry FOURNIER	4abd3ae184	MINOR: lua: adds "forced yield" flag This flag indicates that the current yield is returned by the Lua execution task control. If this flag is set, the current task may quit but will be set in the run queue to be re-executed immediately. This patch modifies the "hlua_yieldk()" function: it adds an argument that contains a field containing yield options.	2015-03-04 17:58:52 +01:00
Thierry FOURNIER	c42c1ae885	MEDIUM: lua: each yielding function returns a wake up time. This is used to ensure that the task doesn't become a zombie when the Lua returns a yield. The yield wrapper ensures that a timer used for waking up the task will be set. The timer is reset to TICK_ETERNITY if the Lua execution is done.	2015-03-04 17:58:52 +01:00
Thierry FOURNIER	bd41349831	MINOR: lua: set skeleton for Lua execution expiration This first patch makes it possible to configure the Lua execution expiration. This expiration is configured but it is not yet available, it will be added in a future patch.	2015-03-04 17:58:52 +01:00
Thierry FOURNIER	a097fdfb62	MINOR: lua: use bitfield and macro in place of integer and enum In the future, the Lua execution must return scheduling information. We want more than one flag, so I convert an integer used with an enum into an integer used as a bitfield.	2015-03-04 17:58:52 +01:00
Thierry FOURNIER	a718b29b6d	MINOR: lua: remove some #define The #define compilation directives are centralized in the hlua include files. This makes it possible to remove some #ifdef from the haproxy main code.	2015-03-04 17:58:52 +01:00
Thierry FOURNIER	5c49aeb1b0	MINOR: remove unused declaration. This declaration is removed in the patch "Lua initialisation on demand". commit id `05ac42455f`	2015-03-04 17:58:52 +01:00
Thierry FOURNIER	5a6d3fdf51	MINOR: lua: channel: add "channel" class The channel class permits manipulation of channels. A channel is a FIFO buffer between the client and the server. This class provides functions to read, write, forward, destroy and alter data between the input and the output of the buffer.	2015-02-28 23:12:36 +01:00
Thierry FOURNIER	7e7ac32dad	MEDIUM: lua: socket: add "socket" class for TCP I/O This patch adds the TCP I/O functionality. The class implemented provides the same functions as the "lua socket" project. This provides network compatibility with another LUA project. The documentation is located here: http://w3.impa.br/~diego/software/luasocket/tcp.html	2015-02-28 23:12:35 +01:00
Thierry FOURNIER	5b8608f1ed	MINOR: lua: core: add sleep functions This version of sleep is based on a coroutine. A sleeping task is started and a signal is registered. This sleep version must disappear to be replaced by a version using the internal timers.	2015-02-28 23:12:35 +01:00
Thierry FOURNIER	258d8aafa6	MINOR: lua: add bindings for tcp and http actions This patch adds the runtime environment for http and tcp actions. It provides also the function for action registering.	2015-02-28 23:12:35 +01:00
Thierry FOURNIER	fa0e5dd217	MINOR: lua: register and execute sample-fetches in LUA This patch makes it possible to write LUA sample fetches. Note that all the fetches declared through LUA are automatically prefixed by "lua.".	2015-02-28 23:12:35 +01:00
Thierry FOURNIER	d0fa538fe3	MINOR: lua: txn: import existing sample-fetches in the class TXN This patch adds the browsing of all the HAProxy fetches and creates associated LUA functions. The HAProxy internal fetches can be used in LUA through the class "TXN". Note that the symbols "-", "+" and "." in the name of current sample fetch are rewritten as "_" in LUA because ".", "-" and "+" are operators.	2015-02-28 23:12:35 +01:00
Thierry FOURNIER	65f34c6367	MINOR: lua: txn: create class TXN associated with the transaction. This class of functions gives access to all the functions associated with the transaction like http header, HAProxy internal fetches, etc ... This patch puts the skeleton of this class. The class will be enhanced later.	2015-02-28 23:12:34 +01:00
Thierry FOURNIER	a4a0f3d7c8	MINOR: lua: post initialisation bindings This system makes it possible to execute some lua functions after HAProxy completes its initialisation. These functions are executed between the end of the configuration parsing and check and the start of the scheduler.	2015-02-28 23:12:34 +01:00
Thierry FOURNIER	2ba18a2aa6	MINOR: lua: core: create "core" class and object This object provides main HAProxy functions. This first version creates an empty object. It will be enhanced later.	2015-02-28 23:12:34 +01:00
Thierry FOURNIER	9ff7e6e3b2	MEDIUM: lua: "com" signals This system makes it possible to send signals between lua tasks. A main lua stack can register the signal in a coprocess. When the coprocess finishes its job, it sends a signal, and the associated task is woken. If the main lua execution stack stops (with or without errors), the list of pending signals is purged.	2015-02-28 23:12:33 +01:00
Thierry FOURNIER	380d0930bd	MINOR: lua: add runtime execution context The functions added make it possible to execute the LUA stack in HAProxy. They provide all the runtime environment and initialise the main LUA stack.	2015-02-28 23:12:33 +01:00
Thierry FOURNIER	6f1fd48ef1	MEDIUM: lua: lua integration in the build and init system. This is the first step of the lua integration. We add the useful files in the HAProxy project. These files contain the main includes, the Makefile options and empty initialisation function. It is the LUA skeleton.	2015-02-28 23:12:33 +01:00
Thierry FOURNIER	ca16b03813	MINOR: channel: functions to get data from a buffer without copy We now have functions to retrieve one block and one line from either the input or the output part of a buffer. They return up to two (pointer,length) values in case the buffer wraps.	2015-02-28 23:12:33 +01:00
Thierry FOURNIER	bc4c1ac6ad	MEDIUM: http/tcp: permit to resume http and tcp custom actions Later, the processing of some actions needs to be interrupted and resumed later. This patch makes it possible to resume the actions. The actions that need to run with the resume mode are not yet available. They will be soon with the Lua patches. So the code added by this patch is untestable for the moment. The list of "tcp_exec_req_rules" cannot resume because it is called by the unresumable function "accept_session".	2015-02-28 23:12:33 +01:00
Thierry FOURNIER	549aac8d0b	MEDIUM: buffer: make bo_putblk/bo_putstr/bo_putchk return the number of bytes copied. This is not used yet. Planned for LUA.	2015-02-28 23:12:32 +01:00
Thierry FOURNIER	cc87a11842	MEDIUM: tcp: add register keyword system. This patch introduces an action keyword registration system for TCP rulesets similar to what is available for HTTP rulesets. This system will be useful with lua.	2015-02-28 23:12:32 +01:00
Thierry FOURNIER	ac836baad1	MINOR: includes: fix a lot of missing or useless includes These modifications are done for resolving cross-dependent includes in the upcoming LUA code. <proto/channel.h> misses <types/channel.h>. <types/acl.h> doesn't use <types/session.h> because the session is already declared in the file as undefined pointer. appsession.c misses <unistd.h> to use "write()". Declare undefined pointer "struct session" for <types/proxy.h> and <types/queue.h>. These includes don't need the detail of this struct.	2015-02-28 23:12:32 +01:00
Thierry FOURNIER	49f45af9aa	MINOR: global: export many symbols. The functions "val_payload_lv" and "val_hdr" are useful with lua. The lua automatic binding for sample fetches needs to compare check functions. The "arg_type_names" permits displaying error messages.	2015-02-28 23:12:32 +01:00
Thierry FOURNIER	4d9a1d1a5c	MINOR: sample: add function for browsing samples. This function is useful with the incoming lua functions.	2015-02-28 23:12:32 +01:00
Thierry FOURNIER	58639a0ef3	MINOR: global: export function and permits to not resolve DNS names Exports the commonly used function str2ip. The function str2ip2 is created and makes it possible to not resolve DNS names.	2015-02-28 23:12:32 +01:00
Thierry FOURNIER	f41a809dc9	MINOR: sample: add private argument to the struct sample_fetch This private argument is added to prepare the integration of the lua fetches.	2015-02-28 23:12:31 +01:00
Thierry FOURNIER	68a556e282	MINOR: converters: give the session pointer as converter argument Some usages of the converters need to know the attached session. The Lua needs the session for retrieving its running context. This patch adds the "session" as an argument of the converters prototype.	2015-02-28 23:12:31 +01:00
Thierry FOURNIER	1edc971919	MINOR: converters: add a "void *private" argument to converters This permits to store specific configuration pointer. It is useful with future Lua integration.	2015-02-28 23:12:31 +01:00
Thierry FOURNIER	b83862dd74	MEDIUM: channel: wake up any request analyzer on response activity This behavior already exists for the "WAIT_HTTP" analyzer, this patch just extends the system to any analyzer that would be woken up on response activity.	2015-02-28 23:12:31 +01:00
Thierry FOURNIER	bb2ae64b82	MEDIUM: protocol: automatically pick the proto associated to the connection. When the destination IP is dynamically set, we can't use the "target" to define the proto. This patch ensures that we always use the protocol associated with the address family. The proto field was removed from the server and check structs.	2015-02-28 23:12:31 +01:00
Willy Tarreau	b550d009ca	MEDIUM: protocol: use a family array to index the protocol handlers Instead of walking over a list, we now have a direct mapping between protocol families and their respective handlers. This will allow fast lookups.	2015-02-28 23:12:31 +01:00
Thierry FOURNIER	9cf7c4b9df	MAJOR: poll: only rely on wake_expired_tasks() to compute the wait delay Currently, HAProxy uses the functions "process_runnable_tasks" and "wake_expired_tasks" to get the next task which can expire. If a task is added with "task_schedule" or another method during the execution of another task, the expiration of this new task is not taken into account, and the execution of this task can be too late. Currently, HAProxy does not seem to be sensitive to this bug. This fix moves the call to process_runnable_tasks() before the timeout calculation and ensures that all wakeups are processed together. Only wake_expired_tasks() needs to return a timeout now.	2015-02-28 23:12:30 +01:00
Nenad Merdanovic	05552d4b98	MEDIUM: Add support for configurable TLS ticket keys Until now, the TLS ticket keys couldn't have been configured and shared between multiple instances or multiple servers running HAProxy. The result was that if a request got a TLS ticket from one instance/server and it hits another one afterwards, it will have to go through the full SSL handshake and negotiation. This patch enables adding a ticket file to the bind line, which will be used for all SSL contexts created from that bind line. We can use the same file on all instances or servers to mitigate this issue and have consistent TLS tickets assigned. Clients will no longer have to negotiate every time they change the handling process. Signed-off-by: Nenad Merdanovic <nmerdan@anine.io>	2015-02-28 23:10:22 +01:00
Willy Tarreau	501260bf67	MEDIUM: task: always ensure that the run queue is consistent As found by Thierry Fournier, if a task manages to kill another one and if this other task is the next one in the run queue, we can do whatever including crashing, because the scheduler restarts from the saved next task. For now, there is no such concept of a task killing another one, but with Lua it will come. A solution consists in always performing the lookup of the first task in the scheduler's loop, but it's expensive and costs around 2% of the performance. Another solution consists in keeping a global next run queue node and ensuring that when this task gets removed, it updates this pointer to the next one. This allows to simplify the code a bit and in the end to slightly increase the performance (0.3-0.5%). The mechanism might still be usable if we later migrate to a multi-threaded scheduler.	2015-02-23 16:07:01 +01:00
Thierry FOURNIER	70fd7480f9	BUG/MINOR: ARG6 and ARG7 don't fit in a 32 bits word The patch "MEDIUM: args: increase arg type to 5 bits and limit arg count to 5" (`dbc79d0a`) increased the number of types supported, but forgot to remove the ARG6/ARG7 macros.	2015-02-20 14:34:16 +01:00
Willy Tarreau	2a3fb1c8bb	MINOR: ssl/server: add the "no-ssl-reuse" server option This option disables SSL session reuse when SSL is used to communicate with the server. It will force the server to perform a full handshake for every new connection. It's probably only useful for benchmarking, troubleshooting, and for paranoid users.	2015-02-06 18:04:08 +01:00
Simon Horman	64e3416662	MEDIUM: Allow suppression of email alerts by log level This patch adds a new option which allows configuration of the maximum log level of messages for which email alerts will be sent. The default is alert which is more restrictive than the current code which sends email alerts for all priorities. That behaviour may be configured using the new configuration option to set the maximum level to notice or greater. email-alert level notice Signed-off-by: Simon Horman <horms@verge.net.au>	2015-02-06 07:59:58 +01:00
Willy Tarreau	2af207a5f5	MEDIUM: tcp: implement tcp-ut bind option to set TCP_USER_TIMEOUT On Linux since 2.6.37, it's possible to set the socket timeout for pending outgoing data, with an accuracy of 1 millisecond. This is pretty handy to deal with dead connections to clients and/or servers. For now we only implement it on the frontend side (bind line) so that when a client disappears from the net, we're able to quickly get rid of its connection and possibly release a server connection. This can be useful with long-lived connections where an application level timeout is not suited because long pauses are expected (remote terminals, connection pools, etc). Thanks to Thijs Houtenbos and John Eckersberg for the suggestion.	2015-02-04 00:54:40 +01:00
Simon Horman	0ba0e4ac07	MEDIUM: Support sending email alerts Signed-off-by: Simon Horman <horms@verge.net.au>	2015-02-03 00:24:16 +01:00
Simon Horman	9dc4996344	MEDIUM: Allow configuration of email alerts This currently does nothing beyond parsing the configuration and storing in the proxy as there is no implementation of email alerts. Signed-off-by: Simon Horman <horms@verge.net.au>	2015-02-03 00:24:16 +01:00
Simon Horman	0d16a4011e	MEDIUM: Add parsing of mailers section Add mailer and mailers structures and allow parsing of a mailers section into those structures. These structures will subsequently be freed as it is not yet possible to reference them in the configuration. Signed-off-by: Simon Horman <horms@verge.net.au>	2015-02-03 00:24:16 +01:00
Simon Horman	e16c1b3f3d	MEDIUM: Attach tcpcheck_rules to check This is to allow checks to be established whose tcpcheck_rules are not those of its proxy. Signed-off-by: Simon Horman <horms@verge.net.au>	2015-02-03 00:24:16 +01:00
Simon Horman	41f5876750	MEDIUM: Move proto and addr fields struct check The motivation for this is to make checks more independent of each other to allow further reuse of their infrastructure. For now server->check and server->agent still always use the same values for the addr and proto fields so this patch should not introduce any behavioural changes. Signed-off-by: Simon Horman <horms@verge.net.au>	2015-02-03 00:24:16 +01:00
Simon Horman	bfb5d33fe6	MEDIUM: Add free_check() helper Add free_check() helper to free the memory allocated by init_check(). Signed-off-by: Simon Horman <horms@verge.net.au>	2015-02-03 00:24:15 +01:00
Simon Horman	b1900d55df	MEDIUM: Refactor init_check and move to checks.c Refactor init_check so that an error string is returned rather than alerts being printed by it. Also move init_check to checks.c and provide a prototype to allow it to be used from multiple C files. Signed-off-by: Simon Horman <horms@verge.net.au>	2015-02-03 00:24:15 +01:00
Willy Tarreau	a0dc23f093	MEDIUM: http: implement http-request set-{method,path,query,uri} This commit implements the following new actions : - "set-method" rewrites the request method with the result of the evaluation of format string <fmt>. There should be very few valid reasons for having to do so as this is more likely to break something than to fix it. - "set-path" rewrites the request path with the result of the evaluation of format string <fmt>. The query string, if any, is left intact. If a scheme and authority are found before the path, they are left intact as well. If the request doesn't have a path ("*"), this one is replaced with the format. This can be used to prepend a directory component in front of a path for example. See also "set-query" and "set-uri". Example : # prepend the host name before the path http-request set-path /%[hdr(host)]%[path] - "set-query" rewrites the request's query string which appears after the first question mark ("?") with the result of the evaluation of format string <fmt>. The part prior to the question mark is left intact. If the request doesn't contain a question mark and the new value is not empty, then one is added at the end of the URI, followed by the new value. If a question mark was present, it will never be removed even if the value is empty. This can be used to add or remove parameters from the query string. See also "set-path" and "set-uri". Example : # replace "%3D" with "=" in the query string http-request set-query %[query,regsub(%3D,=,g)] - "set-uri" rewrites the request URI with the result of the evaluation of format string <fmt>. The scheme, authority, path and query string are all replaced at once. This can be used to rewrite hosts in front of proxies, or to perform complex modifications to the URI such as moving parts between the path and the query string. See also "set-path" and "set-query". All of them are handled by the same parser and the same exec function, which is why they're merged all together. For once, instead of adding even more entries to the huge switch/case, we used the new facility to register action keywords. A number of the existing ones should probably move there as well.	2015-01-23 20:27:41 +01:00
Willy Tarreau	15a53a4384	MEDIUM: regex: add support for passing regex flags to regex_exec_match() This function (and its sister regex_exec_match2()) abstract the regex execution but make it impossible to pass flags to the regex engine. Currently we don't use them but we'll need to support REG_NOTBOL soon (to indicate that we're not at the beginning of a line). So let's add support for this flag and update the API accordingly.	2015-01-22 14:24:53 +01:00
Willy Tarreau	469477879c	MINOR: args: implement a new arg type for regex : ARGT_REG This one will be used when a regex is expected. It is automatically resolved after the parsing and compiled into a regex. Some optional flags are supported in the type-specific flags that should be set by the optional arg checker. One is used during the regex compilation : ARGF_REG_ICASE to ignore case.	2015-01-22 14:24:53 +01:00
Willy Tarreau	085dafac5f	MINOR: args: add type-specific flags for each arg in a list These flags are meant to be used by arg checkers to pass out-of-band information related to some args. A typical use is to indicate how a regex is expected to be compiled/matched based on other arguments. These flags are initialized to zero by default and it is up to the args checkers to set them if needed.	2015-01-22 14:24:53 +01:00
Willy Tarreau	dbc79d0aed	MEDIUM: args: increase arg type to 5 bits and limit arg count to 5 We'll soon need to add new argument types, and we don't use the current limit of 7 arguments, so let's increase the arg type size to 5 bits and reduce the arg count to 5 (3 max are used today).	2015-01-22 14:24:53 +01:00
Willy Tarreau	3d241e78a1	MEDIUM: args: use #define to specify the number of bits used by arg types and counts This is in order to add new types. This patch does not change anything else. Two remaining (harmless) occurrences of a count of 8 instead of 7 were fixed by this patch : empty_arg_list[] and the for() loop counting args.	2015-01-22 14:24:53 +01:00
Willy Tarreau	324f07f6dd	MEDIUM: backend: add the crc32 hash algorithm for load balancing Since we have it available, let's make it usable for load balancing, it comes at no cost except 3 lines of documentation.	2015-01-20 19:48:14 +01:00
Willy Tarreau	c829ee48c7	MINOR: hash: add new function hash_crc32 This function will be used to perform CRC32 computations. This one was loosely inspired from crc32b found here, and focuses on size and speed at the same time : http://www.hackersdelight.org/hdcodetxt/crc.c.txt Much faster table-based versions exist but are pointless for our usage here, this hash already sustains gigabit speed which is far faster than what we'd ever need. Better preserve the CPU's cache instead.	2015-01-20 19:48:05 +01:00
Willy Tarreau	d025648f7c	MAJOR: init: automatically set maxconn and/or maxsslconn when possible If a memory size limit is enforced using "-m" on the command line and one or both of maxconn / maxsslconn are not set, instead of using the build-time values, haproxy now computes the number of sessions that can be allocated depending on a number of parameters among which : - global.maxconn (if set) - global.maxsslconn (if set) - maxzlibmem - tune.ssl.cachesize - presence of SSL in at least one frontend (bind lines) - presence of SSL in at least one backend (server lines) - tune.bufsize - tune.cookie_len The purpose is to ensure that haproxy will not run out of memory when maxing out all parameters. If neither maxconn nor maxsslconn are used, it will consider that 100% of the sessions involve SSL on sides where it's supported. That means that it will typically optimize maxconn for SSL offloading or SSL bridging on all connections. This generally means that the simple act of enabling SSL in a frontend or in a backend will significantly reduce the global maxconn but in exchange of that, it will guarantee that it will not fail. All metrics may be enforced using #defines to accommodate variations in SSL libraries or various allocation sizes.	2015-01-15 21:45:22 +01:00
Willy Tarreau	d92aa5c44a	MINOR: global: report information about the cost of SSL connections An SSL connection takes some memory when it exists and during handshakes. We measured up to 16kB for an established endpoint, and up to 76 extra kB during a handshake. The SSL layer stores these values into the global struct during initialization. If other SSL libs are used, it's easy to change these values. Anyway they'll only be used as gross estimates in order to guess the max number of SSL conns that can be established when memory is constrained and the limit is not set.	2015-01-15 21:34:39 +01:00
Willy Tarreau	fce03113fa	MINOR: global: always export some SSL-specific metrics We'll need to know the number of SSL connections, their use and their cost soon. In order to avoid getting tons of ifdefs everywhere, always export SSL information in the global section. We add two flags to know whether or not SSL is used in a frontend and in a backend.	2015-01-15 21:32:40 +01:00
Willy Tarreau	3ca1a883f9	MINOR: tools: add new round_2dig() function to round integers This function rounds down an integer to the closest value having only 2 significant digits.	2015-01-15 19:02:27 +01:00
Willy Tarreau	319f745ba0	MINOR: channel: rename bi_erase() to channel_truncate() It applies to the channel and it doesn't erase outgoing data, only pending unread data, which is strictly equivalent to what recv() does with MSG_TRUNC, so that new name is more accurate and intuitive.	2015-01-14 20:32:59 +01:00
Willy Tarreau	b5051f8742	MINOR: channel: rename bi_avail() to channel_recv_max() This name more accurately reminds that it applies to a channel and not to a buffer, and that what is returned may be used as a max number of bytes to pass to recv().	2015-01-14 20:26:54 +01:00
Willy Tarreau	3f5096ddf2	MINOR: channel: rename buffer_max_len() to channel_recv_limit() Buffer_max_len() is ambiguous and misleading since it considers the channel. The new name more accurately designates the size limit for received data.	2015-01-14 20:21:43 +01:00
Willy Tarreau	a4178192b9	MINOR: channel: rename buffer_reserved() to channel_reserved() This applies to the channel, not the buffer, so let's fix this name. Warning, the function's name happens to be the same as the old one which was mistakenly used during 1.5.	2015-01-14 20:21:12 +01:00
Willy Tarreau	3889fffe92	MINOR: channel: rename channel_full() to !channel_may_recv() This function's name was poorly chosen and is confusing to the point of being suspiciously used at some places. The operations it does always consider the ability to forward pending input data before receiving new data. This is not obvious at all, especially at some places where it was used when consuming outgoing data to know if the buffer has any chance to ever get the missing data. The code needs to be re-audited with that in mind. Care must be taken with existing code since the polarity of the function was switched with the renaming.	2015-01-14 18:41:33 +01:00
Willy Tarreau	ba0902ede4	CLEANUP: channel: rename channel_reserved -> channel_is_rewritable channel_reserved is confusingly named. It is used to know whether or not the rewrite area is left intact for situations where we want to ensure we can use it before proceeding. Let's rename it to fix this confusion.	2015-01-14 18:41:33 +01:00
Willy Tarreau	9c06ee4ccf	BUG/MEDIUM: channel: don't schedule data in transit for leaving until connected Option http-send-name-header is still hurting. If a POST request has to be redispatched when this option is used, and the next server's name is larger than the initial one, and the POST body fills the buffer, it becomes impossible to rewrite the server's name in the buffer when redispatching. In 1.4, this is worse, the process may crash because of a negative size computation for the memmove(). The only solution to fix this is to refrain from eating the reserve before we're certain that we won't modify the buffer anymore. And the condition for that is that the connection is established. This patch introduces "channel_may_send()" which helps to detect whether it's safe to eat the reserve or not. This condition is used by channel_in_transit() introduced by recent patches. This patch series must be backported into 1.5, and a simpler version must be backported into 1.4 where fixing the bug is much easier since there were no channels by then. Note that in 1.4 the severity is major.	2015-01-14 16:08:45 +01:00
Willy Tarreau	27bb0e14a8	MEDIUM: channel: make bi_avail() use channel_in_transit() This ensures that we rely on a sane computation for the buffer size.	2015-01-14 15:57:24 +01:00
Willy Tarreau	fe57834955	MEDIUM: channel: make buffer_reserved() use channel_in_transit() This ensures that we rely on a sane computation for the buffer size.	2015-01-14 15:57:21 +01:00
Willy Tarreau	1a4484dec8	MINOR: channel: add channel_in_transit() This function returns the amount of bytes in transit in a channel's buffer, which is the amount of outgoing data plus the amount of incoming data bound to the forward limit.	2015-01-14 13:51:48 +01:00
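As a rough illustration of the computation this commit message describes, the sketch below adds the outgoing bytes to the incoming bytes capped by the forward limit. The types `struct sk_buffer` and `struct sk_channel` and the helper `sk_in_transit()` are simplified stand-ins invented for this example, not HAProxy's actual definitions.

```c
#include <assert.h>

/* Simplified stand-ins; field names follow the commit messages
 * (i = incoming bytes, o = outgoing bytes), the types are assumptions. */
struct sk_buffer {
	unsigned int i;    /* pending input data, not yet scheduled */
	unsigned int o;    /* output data already scheduled for leaving */
	unsigned int size; /* total buffer size */
};

struct sk_channel {
	struct sk_buffer buf;
	unsigned int to_forward; /* bytes still allowed to be forwarded */
};

/* Bytes in transit: outgoing data plus incoming data bounded by to_forward. */
unsigned int sk_in_transit(const struct sk_channel *chn)
{
	unsigned int fwd;

	assert(chn->buf.i + chn->buf.o <= chn->buf.size);
	fwd = chn->buf.i < chn->to_forward ? chn->buf.i : chn->to_forward;
	return chn->buf.o + fwd;
}
```

Comparing `to_forward` against `buf.i` rather than against the buffer size is the same bound that the "compare to_forward with buf->i" fix in this series relies on.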
Willy Tarreau	bb3f994f1a	BUG/MINOR: channel: compare to_forward with buf->i, not buf->size We know that all incoming data are going to be purged if to_forward is greater than them, not only if greater than the buffer size. This bug has no direct impact on this version, but it participates in some bugs affecting http-send-name-header since 1.4. This fix will have to be backported down to 1.4 albeit in a different form.	2015-01-14 13:50:24 +01:00
Willy Tarreau	0428a146c0	BUG/MEDIUM: channel: fix possible integer overflow on reserved size computation The buffer_max_len() function is subject to an integer overflow in this calculus : int ret = global.tune.maxrewrite - chn->to_forward - chn->buf->o; - chn->to_forward may be up to 2^31 - 1 - chn->buf->o may be up to chn->buf->size - global.tune.maxrewrite is by definition smaller than chn->buf->size Thus here we can subtract (2^31 + buf->o) (highly negative) from something slightly positive, and result in ret being larger than expected. Fortunately in 1.5 and 1.6, this is only used by bi_avail() which itself is used by applets which do not set high values for to_forward so this problem does not happen there. However in 1.4 the equivalent computation was used to limit the size of a read and can result in a read overflow when combined with the nasty http-send-name-header feature. This fix must be backported to 1.5 and 1.4.	2015-01-14 12:04:34 +01:00
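The overflow described above can be illustrated with a small sketch. It evaluates the naive expression in a 64-bit type to show that the result may not fit in an `int`, then shows one possible clamped variant that compares before subtracting. The function names and the clamping strategy are assumptions for illustration; this is not the actual fix applied in HAProxy.

```c
#include <assert.h>
#include <limits.h>

/* Evaluate "maxrewrite - to_forward - o" in 64 bits so the true
 * mathematical result can be inspected without overflowing. */
long long sk_naive_reserve_wide(int maxrewrite, unsigned int to_forward,
                                unsigned int o)
{
	return (long long)maxrewrite - (long long)to_forward - (long long)o;
}

/* Returns 1 if the value is representable as an int, 0 otherwise. */
int sk_fits_in_int(long long v)
{
	return v >= INT_MIN && v <= INT_MAX;
}

/* Hypothetical overflow-safe variant: the part of the reserve that is still
 * untouched, clamped to zero, computed with comparisons before subtracting. */
int sk_safe_reserve(int maxrewrite, unsigned int to_forward, unsigned int o)
{
	unsigned int reserve;

	assert(maxrewrite >= 0);
	reserve = (unsigned int)maxrewrite;
	if (to_forward >= reserve)
		return 0;
	reserve -= to_forward;
	if (o >= reserve)
		return 0;
	return (int)(reserve - o);
}
```

With `to_forward` at 2^31 - 1 and a 16 kB buffer full of outgoing data, the wide result falls below `INT_MIN`, which is the case the commit message warns about.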
Willy Tarreau	75abcb3106	MINOR: config: extend the default max hostname length to 64 and beyond Some users reported that the default max hostname length of 32 is too short in some environments. This patch does two things : - it relies on the system's max hostname length as found in MAXHOSTNAMELEN if it is set. This is the most logical thing to do as the system libs generally present the appropriate value supported by the system. This value is 64 on Linux and 256 on Solaris, to give a few examples. - otherwise it defaults to 64 It is still possible to override this value by defining MAX_HOSTNAME_LEN at build time. After some observation time, this patch may be backported to 1.5 if it does not cause any build issue, as it is harmless and may help some users.	2015-01-14 11:52:34 +01:00
Willy Tarreau	094af4e16e	MINOR: logs: add a new per-proxy "log-tag" directive This is equivalent to what was done in commit `48936af` ("[MINOR] log: ability to override the syslog tag") but this time instead of doing this globally, it does it per proxy. The purpose is to be able to use a separate log tag for various proxies (eg: make it easier to route log messages depending on the customer).	2015-01-07 15:03:42 +01:00
Willy Tarreau	3c23a85550	CLEANUP: session: remove session_from_task() Since commit `3dd6a25` ("MINOR: stream-int: retrieve session pointer from stream-int"), we can get the session from the task, so let's get rid of this less obvious function.	2014-12-28 12:19:57 +01:00
Cyril Bonté	ac92a065d7	MINOR: checks: update dynamic environment variables in external checks commit `9ede66b0` introduced an environment variable (HAPROXY_SERVER_CURCONN) that was supposed to be dynamically updated, but it was set only once, during its initialization. Most of the code provided in this previous patch has been rewritten in order to easily update the environment variables without reallocating memory during each check. Now, HAPROXY_SERVER_CURCONN will contain the current number of connections on the server at the time of the check.	2014-12-28 01:22:56 +01:00
Willy Tarreau	b034b2598d	MEDIUM: channel: implement a zero-copy buffer transfer bi_swpbuf() swaps the buffer passed in argument with the one attached to the channel, but only if this last one is empty. The idea is to avoid a copy when buffers can simply be swapped.	2014-12-24 23:47:33 +01:00
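The swap-instead-of-copy idea can be sketched as follows. The types `struct sk_buf` and `struct sk_chn`, the helper `sk_swap_buf()`, and its return convention are all hypothetical; the real bi_swpbuf() may use a different signature.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins, assumptions made for this sketch only. */
struct sk_buf { unsigned int i, o; };   /* pending input / output bytes */
struct sk_chn { struct sk_buf *buf; };  /* channel owning one buffer */

/* Exchange *other with the channel's buffer, but only when the channel's
 * buffer holds no data. Returns 1 if the swap happened, 0 otherwise. */
int sk_swap_buf(struct sk_chn *chn, struct sk_buf **other)
{
	struct sk_buf *old;

	assert(chn != NULL && chn->buf != NULL);
	assert(other != NULL && *other != NULL);

	if (chn->buf->i + chn->buf->o != 0)
		return 0; /* channel still has data: the caller must copy */

	old = chn->buf;
	chn->buf = *other; /* channel now owns the caller's data */
	*other = old;      /* caller gets the empty buffer back */
	return 1;
}
```

Only pointers move, so the cost is independent of how many bytes the buffer holds.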
Willy Tarreau	33cb065348	MINOR: config: implement global setting tune.buffers.limit This setting is used to limit memory usage without causing the alloc failures caused by "-m". Unexpectedly, tests have shown a performance boost of up to about 18% on HTTP traffic when limiting the number of buffers to about 10% of the amount of concurrent connections. tune.buffers.limit <number> Sets a hard limit on the number of buffers which may be allocated per process. The default value is zero which means unlimited. The minimum non-zero value will always be greater than "tune.buffers.reserve" and should ideally always be about twice as large. Forcing this value can be particularly useful to limit the amount of memory a process may take, while retaining a sane behaviour. When this limit is reached, sessions which need a buffer wait for another one to be released by another session. Since buffers are dynamically allocated and released, the waiting time is very short and not perceptible provided that limits remain reasonable. In fact sometimes reducing the limit may even increase performance by increasing the CPU cache's efficiency. Tests have shown good results on average HTTP traffic with a limit to 1/10 of the expected global maxconn setting, which also significantly reduces memory usage. The memory savings come from the fact that a number of connections will not allocate 2*tune.bufsize. It is best not to touch this value unless advised to do so by an haproxy core developer.	2014-12-24 23:47:33 +01:00
Willy Tarreau	a24adf0795	MAJOR: session: only wake up as many sessions as available buffers permit We've already experimented with three wake up algorithms when releasing buffers : the first naive one used to wake up far too many sessions, causing many of them not to get any buffer. The second approach which was still in use prior to this patch consisted in waking up either 1 or 2 sessions depending on the number of FDs we had released. And this was still inaccurate. The third one tried to cover the accuracy issues of the second and took into consideration the number of FDs the sessions would be willing to use, but most of the time we ended up waking up too many of them for nothing, or deadlocking by lack of buffers. This patch completely removes the need to allocate two buffers at once. Instead it splits allocations into critical and non-critical ones and implements a reserve in the pool for this. The deadlock situation happens when all buffers are allocated for requests pending in a maxconn-limited server queue, because then there's no more way to allocate buffers for responses, and these responses are critical to release the server's connection in order to release the pending requests. In fact maxconn on a server creates a dependence between sessions and particularly between oldest session's responses and latest session's requests. Thus, it is mandatory to get a free buffer for a response in order to release a server connection which will permit to release a request buffer. Since we definitely have non-symmetrical buffers, we need to implement this logic in the buffer allocation mechanism. What this commit does is implement a reserve of buffers which can only be allocated for responses and that will never be allocated for requests. This is made possible by the requester indicating how much margin it wants to leave after the allocation succeeds. 
Thus it is a cooperative allocation mechanism : the requester (process_session() in general) prefers not to get a buffer in order to respect other's need for response buffers. The session management code always knows if a buffer will be used for requests or responses, so that is not difficult : - either there's an applet on the initiator side and we really need the request buffer (since currently the applet is called in the context of the session) - or we have a connection and we really need the response buffer (in order to support building and sending an error message back) This reserve ensures that we don't take all allocatable buffers for requests waiting in a queue. The downside is that all the extra buffers are really allocated to ensure they can be allocated. But with small values it is not an issue. With this change, we don't observe any more deadlocks even when running with maxconn 1 on a server under severely constrained memory conditions. The code becomes a bit tricky, it relies on the scheduler's run queue to estimate how many sessions are already expected to run so that it doesn't wake up everyone with too few resources. A better solution would probably consist in having two queues, one for urgent requests and one for normal requests. A failed allocation for a session dealing with an error, a connection event, or the need for a response (or request when there's an applet on the left) would go to the urgent request queue, while other requests would go to the other queue. Urgent requests would be served from 1 entry in the pool, while the regular ones would be served only according to the reserve. Despite not yet having this, it works remarkably well. This mechanism is quite efficient, we don't perform too many wake up calls anymore. For 1 million sessions elapsed during massive memory contention, we observe about 4.5M calls to process_session() compared to 4.0M without memory constraints. 
Previously we used to observe up to 16M calls, which roughly means 12M failures. During a test run under high memory constraints (limit enforced to 27 MB instead of the 58 MB normally needed), performance used to drop by 53% prior to this patch. Now with this patch instead it increases by about 1.5%. The best effect of this change is that by limiting the memory usage to about 2/3 to 3/4 of what is needed by default, it's possible to increase performance by up to about 18% mainly due to the fact that pools are reused more often and remain hot in the CPU cache (observed on regular HTTP traffic with 20k objects, buffers.limit = maxconn/10, buffers.reserve = limit/2). Below is an example of a scenario which used to cause a deadlock previously : - connection is received - two buffers are allocated in process_session() then released - one is allocated when receiving an HTTP request - the second buffer is allocated then released in process_session() for request parsing then connection establishment. - poll() says we can send, so the request buffer is sent and released - process session gets notified that the connection is now established and allocates two buffers then releases them - all other sessions do the same till one cannot get the request buffer without hitting the margin - and now the server responds. stream_interface allocates the response buffer and manages to get it since it's higher priority being for a response. - but process_session() cannot allocate the request buffer anymore => We could end up with all buffers used by responses so that none may be allocated for a request in process_session(). When the applet processing leaves the session context, the test will have to be changed so that we always allocate a response buffer regardless of the left side (eg: H2->H1 gateway). 
A final improvement would consist in being able to only retry the failed I/O operation without waking up a task, but to date all experiments to achieve this have proven not to be reliable enough.	2014-12-24 23:47:33 +01:00
Willy Tarreau	bf883e0aa7	MAJOR: session: implement a wait-queue for sessions who need a buffer When a session_alloc_buffers() fails to allocate one or two buffers, it subscribes the session to buffer_wq, and waits for another session to release buffers. It's then removed from the queue and woken up with TASK_WAKE_RES, and can attempt its allocation again. We decide to try to wake as many waiters as we release buffers so that if we release 2 and two waiters need only one, they both have their chance. We must never come to the situation where we don't wake enough tasks up. It's common to release buffers after the completion of an I/O callback, which can happen even if the I/O could not be performed due to half a failure on memory allocation. In this situation, we don't want to move out of the wait queue the session that was just added, otherwise it will never get any buffer. Thus, we only force ourselves out of the queue when freeing the session. Note: at the moment, since session_alloc_buffers() is not used, no task is subscribed to the wait queue.	2014-12-24 23:47:33 +01:00
Willy Tarreau	656859d478	MEDIUM: session: implement a basic atomic buffer allocator This patch introduces session_alloc_recv_buffer(), session_alloc_buffers() and session_release_buffers() whose purpose will be to allocate missing buffers and release unneeded ones around the process_session() and during I/O operations. I/O callbacks only need a single buffer for recv operations and none for send. However we still want to ensure that we don't pick the last buffer. That's what session_alloc_recv_buffer() is for. This allocator is atomic in that it always ensures we can get 2 buffers or fails. Here, if any of the buffers is not ready and cannot be allocated, the operation is cancelled. The purpose is to guarantee that we don't enter into the deadlock where all buffers are allocated by the same size of all sessions. A queue will have to be implemented for failed allocations. For now they're just reported as failures.	2014-12-24 23:47:32 +01:00
Willy Tarreau	f4718e8ec0	MEDIUM: buffer: implement b_alloc_margin() This function is used to allocate a buffer and ensure that we leave some margin after it in the pool. The function is not obvious. While we allocate only one buffer, we want to ensure that at least two remain available after our allocation. The purpose is to ensure we'll never enter a deadlock where all sessions allocate exactly one buffer, and none of them will be able to allocate the second buffer needed to build a response in order to release the first one. We also take care of remaining fast in the fast path by first checking whether or not there is enough margin, in which case we only rely on b_alloc_fast() which is guaranteed to succeed. Otherwise we take the slow path using pool_refill_alloc().	2014-12-24 23:47:32 +01:00
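The fast-path/slow-path logic described in this commit message can be sketched with a toy pool that only counts entries. `struct sk_pool`, its `budget` field, and `sk_alloc_margin()` are hypothetical names for this illustration; the real function works on HAProxy's buffer pool through b_alloc_fast() and pool_refill_alloc().

```c
#include <assert.h>
#include <stddef.h>

/* Toy pool: only counts entries, no real memory involved (an assumption
 * made to keep the sketch short). */
struct sk_pool {
	unsigned int cached; /* entries immediately available in the pool */
	unsigned int budget; /* entries the system could still provide */
};

/* Take one entry while leaving at least <margin> entries available after
 * the allocation. Returns 1 on success, 0 if the margin cannot be kept. */
int sk_alloc_margin(struct sk_pool *pool, unsigned int margin)
{
	unsigned int missing;

	assert(pool != NULL);

	/* Fast path: enough entries are cached, just pick one. */
	if (pool->cached > margin) {
		pool->cached--;
		return 1;
	}

	/* Slow path: refill the pool up to margin + 1 entries first. */
	missing = margin + 1 - pool->cached;
	if (missing > pool->budget)
		return 0; /* refusing here is what avoids the deadlock */

	pool->budget -= missing;
	pool->cached += missing;
	pool->cached--;
	return 1;
}
```

A caller that is denied a buffer under this scheme has not failed outright: the remaining entries stay available for whoever needs them to make progress.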
Willy Tarreau	620bd6c88e	MINOR: buffer: implement b_alloc_fast() This function allocates a buffer and replaces buf with this buffer. If no memory is available, &buf_wanted is used instead. No control is made to check if buf already pointed to another buffer. The allocated buffer is returned, or NULL in case no memory is available. The difference with b_alloc() is that this function only picks from the pool and never calls malloc(), so it can fail even if some memory is available. It is the caller's job to refill the buffer pool if needed.	2014-12-24 23:47:32 +01:00
Willy Tarreau	4428a29e52	MEDIUM: channel: do not report full when buf_empty is present on a channel Till now we'd consider a buffer full even if it had size==0 due to pointing to buf.size. Now we change this : if buf_wanted is present, it means that we have already tried to allocate a buffer but failed. Thus the buffer must be considered full so that we stop trying to poll for reads on it. Otherwise if it's empty, it's buf_empty and we report !full since we may allocate it on the fly.	2014-12-24 23:47:32 +01:00
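The distinction between the two dummy buffers can be sketched like this. All names (`sk_buf_empty`, `sk_buf_wanted`, `sk_channel_full()`) are stand-ins invented for the example, and the fullness test for a real buffer deliberately ignores the rewrite reserve that the actual code takes into account.

```c
#include <assert.h>
#include <stddef.h>

struct sk_buffer { unsigned int i, o, size; };

/* Two distinct zero-sized sentinels: one for "no buffer allocated yet",
 * one for "an allocation was attempted and failed". */
struct sk_buffer sk_buf_empty  = { 0, 0, 0 };
struct sk_buffer sk_buf_wanted = { 0, 0, 0 };

struct sk_channel { struct sk_buffer *buf; };

/* Returns 1 if the channel must be considered full, 0 otherwise. */
int sk_channel_full(const struct sk_channel *chn)
{
	assert(chn != NULL && chn->buf != NULL);

	if (chn->buf == &sk_buf_empty)
		return 0; /* a buffer may still be allocated on the fly */
	if (chn->buf == &sk_buf_wanted)
		return 1; /* allocation already failed: stop polling for reads */

	/* Real buffer: simplified test, the reserve is not modelled here. */
	return chn->buf->i + chn->buf->o >= chn->buf->size;
}
```

Because both sentinels have size zero, only their addresses tell them apart, which is why the check compares pointers and not sizes.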
Willy Tarreau	f2f7d6b27b	MEDIUM: buffer: add a new buf_wanted dummy buffer to report failed allocations Doing so ensures that even when no memory is available, we leave the channel in a sane condition. There's a special case in proto_http.c regarding the compression, we simply pre-allocate the tmpbuf to point to the dummy buffer. Not reusing &buf_empty for this allows the rest of the code to differentiate an empty buffer that's not used from an empty buffer that results from a failed allocation which has the same semantics as a buffer full.	2014-12-24 23:47:32 +01:00
Willy Tarreau	2a4b54359b	MEDIUM: buffer: always assign a dummy empty buffer to channels Channels are now created with a valid pointer to a buffer before the buffer is allocated. This buffer is a global one called "buf_empty" and of size zero. Thus it prevents any activity from being performed on the buffer and still ensures that chn->buf may always be dereferenced. b_free() also resets the buffer to &buf_empty, and was split into b_drop() which does not reset the buffer.	2014-12-24 23:47:32 +01:00
Willy Tarreau	7dfca9daec	MINOR: buffer: only use b_free to release buffers We don't call pool_free2(pool2_buffers) anymore, we only call b_free() to do the job. This ensures that we can start to centralize the releasing of buffers.	2014-12-24 23:47:32 +01:00
Willy Tarreau	e583ea583a	MEDIUM: buffer: use b_alloc() to allocate and initialize a buffer b_alloc() now allocates a buffer and initializes it to the size specified in the pool minus the size of the struct buffer itself. This ensures that callers do not need to care about buffer details anymore. Also this never applies memory poisoning, which is slow and useless on buffers.	2014-12-24 23:47:32 +01:00
Willy Tarreau	474cf54a97	MINOR: buffer: reset a buffer in b_reset() and not channel_init() We'll soon need to be able to switch buffers without touching the channel, so let's move buffer initialization out of channel_init(). We had the same in compression.c.	2014-12-24 23:47:31 +01:00
Willy Tarreau	3dd6a25323	MINOR: stream-int: retrieve session pointer from stream-int sess_from_si() does this via the owner (struct task). It works because all stream ints belong to a task nowadays.	2014-12-24 23:47:31 +01:00
Willy Tarreau	a885f6dc65	MEDIUM: memory: improve pool_refill_alloc() to pass a refill count Till now this function would only allocate one entry at a time. But with dynamic buffers we'll like to allocate the number of missing entries to properly refill the pool. Let's modify it to take a minimum amount of available entries. This means that when we know we need at least a number of available entries, we can ask to allocate all of them at once. It also ensures that we don't move the pointers back and forth between the caller and the pool, and that we don't call pool_gc2() for each failed malloc. Instead, it's called only once and the malloc is only allowed to fail once.	2014-12-24 23:47:31 +01:00
Willy Tarreau	0262241e26	MINOR: memory: cut pool allocator in 3 layers pool_alloc2() used to pick the entry from the pool, fall back to pool_refill_alloc(), and to perform the poisoning itself, which pool_refill_alloc() was also doing. While this led to optimal code size, it imposes memory poisoning on the buffers as well, which is extremely slow on large buffers. This patch cuts the allocator in 3 layers : - a layer to pick the first entry from the pool without falling back to pool_refill_alloc() : pool_get_first() - a layer to allocate a dirty area by falling back to pool_refill_alloc() but never performing the poisoning : pool_alloc_dirty() - pool_alloc2() which calls the latter and optionally poisons the area No functional changes were made.	2014-12-24 23:47:31 +01:00
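The three layers listed in this commit message can be sketched over a toy free list. The `sk_` names, the `struct sk_pool` layout, and the direct use of malloc() in place of pool_refill_alloc() are assumptions made for this illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy pool: released objects are chained through their first bytes. */
struct sk_pool {
	void *free_list;           /* head of the list of released objects */
	size_t size;               /* object size, at least sizeof(void *) */
	int poison_enabled;        /* non-zero to fill new objects */
	unsigned char poison_byte; /* fill value used when poisoning */
};

/* Layer 1: pick the first cached entry, never fall back to the system. */
void *sk_pool_get_first(struct sk_pool *p)
{
	void *obj = p->free_list;

	if (obj)
		p->free_list = *(void **)obj;
	return obj;
}

/* Layer 2: return a "dirty" area, falling back to the system allocator
 * when the pool is empty, but never poisoning. */
void *sk_pool_alloc_dirty(struct sk_pool *p)
{
	void *obj = sk_pool_get_first(p);

	if (!obj) {
		assert(p->size >= sizeof(void *));
		obj = malloc(p->size);
	}
	return obj;
}

/* Layer 3: layer 2 plus optional poisoning of the whole area. */
void *sk_pool_alloc(struct sk_pool *p)
{
	void *obj = sk_pool_alloc_dirty(p);

	if (obj && p->poison_enabled)
		memset(obj, p->poison_byte, p->size);
	return obj;
}

/* Put an object back at the head of the free list. */
void sk_pool_free(struct sk_pool *p, void *obj)
{
	if (!obj)
		return;
	*(void **)obj = p->free_list;
	p->free_list = obj;
}
```

Callers handling large objects such as buffers would use layer 2 directly and skip the memset, which is the cost the commit message wants to avoid.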
Willy Tarreau	e430e77dfd	CLEANUP: memory: replace macros pool_alloc2/pool_free2 with functions Using inline functions here makes the code more readable and reduces its size by about 2 kB.	2014-12-24 23:47:31 +01:00
Willy Tarreau	62405a2155	CLEANUP: memory: remove dead code The very old pool management code has not been used for the last 7 years and is still polluting the file. Get rid of it now.	2014-12-24 23:47:31 +01:00
Willy Tarreau	3dd717cd5d	CLEANUP: lists: remove dead code Remove the code dealing with the old dual-linked lists imported from librt that has remained unused for the last 8 years. Now everything uses the linux-like circular lists instead.	2014-12-24 23:47:31 +01:00
Godbach	f2dd68d0e0	DOC: fix a few typos include/types/proto_http.h: hwen -> when include/types/server.h: SRV_ST_DOWN -> SRV_ST_STOPPED src/backend.c: prefer-current-server -> prefer-last-server Signed-off-by: Godbach <nylzhaowei@gmail.com>	2014-12-10 05:34:55 +01:00
Lukas Tribus	e4e30f7d52	BUILD: ssl: use OPENSSL_NO_OCSP to detect OCSP support Since commit `656c5fa7e8` ("BUILD: ssl: disable OCSP when using boringssl") the OCSP code is bypassed when OPENSSL_IS_BORINGSSL is defined. The correct thing to do here is to use OPENSSL_NO_OCSP instead, which is defined for this exact purpose in openssl/opensslfeatures.h. This makes haproxy forward compatible if boringssl ever introduces full OCSP support with the additional benefit that it links fine against an OCSP-disabled openssl. Signed-off-by: Lukas Tribus <luky-37@hotmail.com>	2014-12-09 20:49:22 +01:00
Willy Tarreau	23a5c396ec	DEBUG: pools: apply poisoning on every allocated pool Till now, when memory poisoning was enabled, it used to be done only after a calloc(). But sometimes it's not enough to detect unexpected sharing, so let's ensure that we now poison every allocation once it's in place. Note that enabling poisoning significantly hurts performance (it can typically halve the overall performance).	2014-11-25 13:48:43 +01:00
Thierry FOURNIER	315ec4217f	BUG/MEDIUM: pattern: don't load more than once a pattern list. A memory optimization can use the same pattern expression for many equal pattern lists (same parse method, index method and index_smp method). The pattern expression is returned by "pattern_new_expr", but this function doesn't indicate whether the returned pattern is already in use. So, the caller function reloads the list of patterns in addition to the existing patterns. This behavior is not a problem with tree-indexed patterns, but it grows the list-indexed patterns. This fix adds a "reuse" flag to the return of the function "pattern_new_expr". If the flag is set, I suppose that the patterns are already loaded. This fix must be backported into 1.5.	2014-11-24 15:40:16 +01:00
Willy Tarreau	5be2f35231	MAJOR: polling: centralize calls to I/O callbacks In order for HTTP/2 not to eat too much memory, we'll have to support on-the-fly buffer allocation, since most streams will have an empty request buffer at some point. Supporting allocation on the fly means being able to sleep inside I/O callbacks if a buffer is not available. Till now, the I/O callbacks were called from two locations : - when processing the cached events - when processing the polled events from the poller This change cleans up the design a bit further than what was started in 1.5. It now ensures that we never call any iocb from the poller itself and that instead, events learned by the poller are put into the cache. The benefit is important in terms of stability : we don't have to care anymore about the risk that new events are added into the poller while processing its events, and we're certain that updates are processed at a single location. To achieve this, we now modify all the fd_* functions so that instead of creating updates, they add/remove the fd to/from the cache depending on its state, and only create an update when the polling status reaches a state where it will have to change. Since the pollers make use of these functions to notify readiness (using fd_may_recv/fd_may_send), the cache is always up to date with the poller. Creating updates only when the polling status needs to change saves a significant amount of work for the pollers : a benchmark showed that on a typical TCP proxy test, the amount of updates per connection dropped from 11 to 1 on average. This also means that the update list is smaller and has more chances of not thrashing too many CPU cache lines. The first observed benefit is a net 2% performance gain on the connection rate. 
A second benefit is that when a connection is accepted, it's only when we're processing the cache, and the recv event is automatically added into the cache after the current one, resulting in this event being processed immediately during the same loop. Previously we used to have a second run over the updates to detect if new events were added to catch them before waking up tasks. The next gain will be offered by the next steps on this subject consisting in implementing an I/O queue containing all cached events ordered by priority just like the run queue, and to be able to leave some events pending there as long as needed. That will allow us not to perform some FD processing if it's not the proper time for this (typically keep waiting for a buffer to be allocated if none is available for a recv()). And by only processing a small bunch of them, we'll allow priorities to take place even at the I/O level. As a result of this change, functions fd_alloc_or_release_cache_entry() and fd_process_polled_events() have disappeared, and the code dedicated to checking for new fd events after the callback during the poll() loop was removed as well. Despite the patch looking large, it's mostly a change of what function is called upon fd_*() and almost nothing was added.	2014-11-21 20:37:32 +01:00
KOVACS Krisztian	b3e54fe387	MAJOR: namespace: add Linux network namespace support This patch makes it possible to create binds and servers in separate namespaces. This can be used to proxy between multiple completely independent virtual networks (with possibly overlapping IP addresses) and a non-namespace-aware proxy implementation that supports the proxy protocol (v2). The setup is something like this: net1 on VLAN 1 (namespace 1) -\ net2 on VLAN 2 (namespace 2) -- haproxy ==== proxy (namespace 0) net3 on VLAN 3 (namespace 3) -/ The proxy is configured to make server connections through haproxy and sending the expected source/target addresses to haproxy using the proxy protocol. The network namespace setup on the haproxy node is something like this: = 8< = $ cat setup.sh ip netns add 1 ip link add link eth1 type vlan id 1 ip link set eth1.1 netns 1 ip netns exec 1 ip addr add 192.168.91.2/24 dev eth1.1 ip netns exec 1 ip link set eth1.$id up ... = 8< = = 8< = $ cat haproxy.cfg frontend clients bind 127.0.0.1:50022 namespace 1 transparent default_backend scb backend server mode tcp server server1 192.168.122.4:2222 namespace 2 send-proxy-v2 = 8< = A bind line creates the listener in the specified namespace, and connections originating from that listener also have their network namespace set to that of the listener. A server line either forces the connection to be made in a specified namespace or may use the namespace from the client-side connection if that was set. For more documentation please read the documentation included in the patch itself. Signed-off-by: KOVACS Tamas <ktamas@balabit.com> Signed-off-by: Sarkozi Laszlo <laszlo.sarkozi@balabit.com> Signed-off-by: KOVACS Krisztian <hidden@balabit.com>	2014-11-21 07:51:57 +01:00
Christian Ruppert	de898712a0	MEDIUM: regex: Use pcre_study always when PCRE is used, regardless of JIT pcre_study() has been around long before JIT has been added. It also seems to affect the performance in some cases (positive). Below I've attached some test restults. The test is based on http://sljit.sourceforge.net/regex_perf.html (see bottom). It has been modified to just test pcre_study vs. no pcre_study. Note: This test does not try to match specific header it's instead run over a larger text with more and less complex patterns to make the differences more clear. % ./runtest 'mark.txt' loaded. (Length: 19665221 bytes) ----------------- Regex: 'Twain' [pcre-nostudy] time: 14 ms (2388 matches) [pcre-study] time: 21 ms (2388 matches) ----------------- Regex: '^Twain' [pcre-nostudy] time: 109 ms (100 matches) [pcre-study] time: 109 ms (100 matches) ----------------- Regex: 'Twain$' [pcre-nostudy] time: 14 ms (127 matches) [pcre-study] time: 16 ms (127 matches) ----------------- Regex: 'Huck[a-zA-Z]+\|Finn[a-zA-Z]+' [pcre-nostudy] time: 695 ms (83 matches) [pcre-study] time: 26 ms (83 matches) ----------------- Regex: 'a[^x]{20}b' [pcre-nostudy] time: 90 ms (12495 matches) [pcre-study] time: 91 ms (12495 matches) ----------------- Regex: 'Tom\|Sawyer\|Huckleberry\|Finn' [pcre-nostudy] time: 1236 ms (3015 matches) [pcre-study] time: 34 ms (3015 matches) ----------------- Regex: '.{0,3}(Tom\|Sawyer\|Huckleberry\|Finn)' [pcre-nostudy] time: 5696 ms (3015 matches) [pcre-study] time: 5655 ms (3015 matches) ----------------- Regex: '[a-zA-Z]+ing' [pcre-nostudy] time: 1290 ms (95863 matches) [pcre-study] time: 1167 ms (95863 matches) ----------------- Regex: '^[a-zA-Z]{0,4}ing[^a-zA-Z]' [pcre-nostudy] time: 136 ms (4507 matches) [pcre-study] time: 134 ms (4507 matches) ----------------- Regex: '[a-zA-Z]+ing$' [pcre-nostudy] time: 1334 ms (5360 matches) [pcre-study] time: 1214 ms (5360 matches) ----------------- Regex: '^[a-zA-Z ]{5,}$' [pcre-nostudy] time: 198 ms (26236 
matches) [pcre-study] time: 197 ms (26236 matches) ----------------- Regex: '^.{16,20}$' [pcre-nostudy] time: 173 ms (4902 matches) [pcre-study] time: 175 ms (4902 matches) ----------------- Regex: '([a-f](.[d-m].){0,2}[h-n]){2}' [pcre-nostudy] time: 1242 ms (68621 matches) [pcre-study] time: 690 ms (68621 matches) ----------------- Regex: '([A-Za-z]awyer\|[A-Za-z]inn)[^a-zA-Z]' [pcre-nostudy] time: 1215 ms (675 matches) [pcre-study] time: 952 ms (675 matches) ----------------- Regex: '"[^"]{0,30}[?!\.]"' [pcre-nostudy] time: 27 ms (5972 matches) [pcre-study] time: 28 ms (5972 matches) ----------------- Regex: 'Tom.{10,25}river\|river.{10,25}Tom' [pcre-nostudy] time: 705 ms (2 matches) [pcre-study] time: 68 ms (2 matches) In some cases it's more or less the same, but when it's faster, then it is by a huge margin. It always depends on the pattern, the string(s) to match against etc. Signed-off-by: Christian Ruppert <c.ruppert@babiel.com>	2014-11-18 13:26:18 +01:00
Cyril Bonté	9ce1311ebc	BUG/MEDIUM: checks: fix conflicts between agent checks and ssl healthchecks Lasse Birnbaum Jensen reported an issue when agent checks are used at the same time as standard healthchecks when SSL is enabled on the server side. The symptom is that agent checks try to communicate in SSL while it should manage raw data. This happens because the transport layer is shared between all kind of checks. To fix the issue, the transport layer is now stored in each check type, allowing to use SSL healthchecks when required, while an agent check should always use the raw_sock implementation. The fix must be backported to 1.5.	2014-11-16 00:53:12 +01:00
Willy Tarreau	eb11889f1e	MINOR: task: release the task pool when stopping When we're stopping, we're not going to create new tasks anymore, so let's release the task pool upon each task_free() in order to reduce memory fragmentation.	2014-11-13 16:57:19 +01:00
Emeric Brun	2c86cbf753	MINOR: ssl: add statement to force some ssl options in global. Adds global statements 'ssl-default-server-options' and 'ssl-default-bind-options' to force on 'server' and 'bind' lines some ssl options. Currently available options are 'no-sslv3', 'no-tlsv10', 'no-tlsv11', 'no-tlsv12', 'force-sslv3', 'force-tlsv10', 'force-tlsv11', 'force-tlsv12', and 'no-tls-tickets'. Example: global ssl-default-server-options no-sslv3 ssl-default-bind-options no-sslv3	2014-10-30 17:06:29 +01:00
Thierry FOURNIER	317e1c4f1e	MINOR: sample: add "json" converter This converter escapes a string for use as a json/ascii escaped string. It can read UTF-8 with different behaviors on errors and encode it in json/ascii. json([<input-code>]) Escapes the input string and produces an ASCII output string ready to use as a JSON string. The converter tries to decode the input string according to the <input-code> parameter. It can be "ascii", "utf8", "utf8s", "utf8p" or "utf8ps". The "ascii" decoder never fails. The "utf8" decoder detects 3 types of errors: - bad UTF-8 sequence (lone continuation byte, bad number of continuation bytes, ...) - invalid range (the decoded value is within a UTF-8 prohibited range), - code overlong (the value is encoded with more bytes than necessary). The UTF-8 JSON encoding can produce a "too long value" error when the UTF-8 character is greater than 0xffff because the JSON string escape specification only authorizes 4 hex digits for the value encoding. The UTF-8 decoder exists in 4 variants designated by a combination of two suffix letters : "p" for "permissive" and "s" for "silently ignore". The behaviors of the decoders are : - "ascii" : never fails ; - "utf8" : fails on any detected errors ; - "utf8s" : never fails, but removes characters corresponding to errors ; - "utf8p" : accepts and fixes the overlong errors, but fails on any other error ; - "utf8ps" : never fails, accepts and fixes the overlong errors, but removes characters corresponding to the other errors. This converter is particularly useful for building properly escaped JSON for logging to servers which consume JSON-formatted traffic logs. Example: capture request header user-agent len 150 capture request header Host len 15 log-format {"ip":"%[src]","user-agent":"%[capture.req.hdr(1),json]"} Input request from client 127.0.0.1: GET / HTTP/1.0 User-Agent: Very "Ugly" UA 1/2 Output log: {"ip":"127.0.0.1","user-agent":"Very \"Ugly\" UA 1\/2"}	2014-10-26 06:41:12 +01:00
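The escaping shown in the commit's example output can be sketched in C for the plain "ascii" case. This is a minimal illustration, not HAProxy's implementation: the function name `json_escape_ascii` is hypothetical, and the choice to emit `\u00XX` for control characters and for bytes above 0x7e is an assumption made here, not something the commit message specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not HAProxy code): escape an ASCII string so it can
 * be placed between double quotes in a JSON document. Returns the escaped
 * length, or -1 if <out> is too small. Assumption: '"', '\\' and '/' get a
 * backslash (matching the commit's sample output), anything outside the
 * printable ASCII range becomes \u00XX.
 */
static int json_escape_ascii(const char *in, char *out, size_t outsz)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);

    for (; *in; in++) {
        unsigned char c = (unsigned char)*in;
        char tmp[7];   /* "\uXXXX" plus the NUL written by snprintf() */
        size_t n;

        if (c == '"' || c == '\\' || c == '/') {
            tmp[0] = '\\';
            tmp[1] = (char)c;
            n = 2;
        } else if (c < 0x20 || c > 0x7e) {
            n = (size_t)snprintf(tmp, sizeof(tmp), "\\u%04x", (unsigned int)c);
        } else {
            tmp[0] = (char)c;
            n = 1;
        }

        /* always keep one byte for the trailing NUL */
        if (o + n >= outsz)
            return -1;
        memcpy(out + o, tmp, n);
        o += n;
    }

    if (o >= outsz)
        return -1;
    out[o] = '\0';
    return (int)o;
}
```

Fed the User-Agent value from the commit's example, this sketch produces the same escaped text as the sample log line; the UTF-8 variants ("utf8", "utf8s", "utf8p", "utf8ps") are deliberately left out.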
Willy Tarreau	4e21ff9244	BUG/MEDIUM: http: adjust close mode when switching to backend Commit `179085c` ("MEDIUM: http: move Connection header processing earlier") introduced a regression : the backend's HTTP mode is not considered anymore when setting the session's HTTP mode, because wait_for_request() is only called once, when the frontend receives the request (or when the frontend is in TCP mode, when the backend receives the request). The net effect is that in some situations when the frontend and the backend do not work in the same mode (eg: keep-alive vs close), the backend's mode is ignored. This patch moves all that processing to a dedicated function, which is called from the original place, as well as from session_set_backend() when switching from an HTTP frontend to an HTTP backend in different modes. This fix must be backported to 1.5.	2014-09-30 18:44:22 +02:00
Willy Tarreau	3986b9c140	MEDIUM: config: report it when tcp-request rules are misplaced A config where a tcp-request rule appears after an http-request rule might seem valid but it is not. So let's report a warning about this since this case is hard to detect by the naked eye.	2014-09-16 15:43:24 +02:00
Willy Tarreau	9dc1c61c43	BUG/CRITICAL: http: don't update msg->sov once data start to leave the buffer Commit `bb2e669` ("BUG/MAJOR: http: correctly rewind the request body after start of forwarding") was incorrect/incomplete. It used to rely on CF_READ_ATTACHED to stop updating msg->sov once data start to leave the buffer, but this is unreliable because since commit `a6eebb3` ("[BUG] session: clear BF_READ_ATTACHED before next I/O") merged in 1.5-dev1, this flag is only ephemeral and is cleared once all analysers have seen it. So we can start updating msg->sov again each time we pass through this place with new data. With a sufficiently large amount of data, it is possible to make msg->sov wrap and validate the if() condition at the top, causing the buffer to advance by about 2GB and crash the process. Note that the offset cannot be controlled by the attacker because it is a sum of millions of small random sizes depending on how many bytes were read by the server and how many were left in the buffer, only because of the speed difference between reading and writing. Also, nothing is written, the invalid pointer resulting from this operation is only read. Many thanks to James Dempsey for reporting this bug and to Chris Forbes for narrowing down the faulty area enough to make its root cause analysable. This fix must be backported to haproxy 1.5.	2014-09-02 16:48:54 +02:00
Willy Tarreau	7346acb6f1	MINOR: log: add a new field "%lc" to implement a per-frontend log counter Sometimes it would be convenient to have a log counter so that from a log server we know whether some logs were lost or not. The frontend's log counter serves exactly this purpose. It's incremented each time a traffic log is produced. If a log is disabled using "http-request set-log-level silent", the counter will not be incremented. However, admin logs are not accounted for. Also, if logs are filtered out before being sent to the server because of a minimum level set on the log line, the counter will be increased anyway. The counter is 32-bit, so it will wrap, but that's not an issue considering that 4 billion logs are rarely in the same file, let alone close to each other.	2014-08-28 15:08:14 +02:00
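On the log server side, the wrapping counter can still reveal gaps if the comparison is done in unsigned 32-bit arithmetic. The sketch below is only an illustration of that idea: `logs_lost_between` is a hypothetical name, and it assumes logs from one frontend arrive in order and that fewer than 2^32 lines are lost between two received ones.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical receiver-side helper: given the "%lc" value of the previous
 * log line and of the current one, return how many lines are missing in
 * between. The unsigned subtraction is modular, so the result stays correct
 * when the 32-bit counter wraps.
 */
static uint32_t logs_lost_between(uint32_t prev, uint32_t cur)
{
    return (uint32_t)(cur - prev - 1u);
}

/* Usage examples, including one that crosses the wrap point. */
void check_log_counter_examples(void)
{
    assert(logs_lost_between(5, 6) == 0);            /* consecutive lines */
    assert(logs_lost_between(5, 9) == 3);            /* 6, 7 and 8 missing */
    assert(logs_lost_between(0xffffffffu, 0) == 0);  /* clean wrap */
}
```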
Willy Tarreau	4edd6836fc	OPTIM/MINOR: proxy: reduce struct proxy by 48 bytes on 64-bit archs Just by moving a few struct members around, we can avoid 32-bit holes between 64-bit pointers and shrink the struct size by 48 bytes. That's not huge but that's for free, so let's do it.	2014-08-28 15:08:14 +02:00
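The effect described above can be reproduced with a toy structure. The two structs below are hypothetical and unrelated to the real struct proxy; they only show how interleaving 32-bit members with 64-bit pointers creates padding holes that disappear once the members are grouped.

```c
#include <assert.h>

/* Hypothetical layout with holes: on a typical 64-bit ABI each int is
 * followed by 4 bytes of padding so that the next pointer stays aligned.
 */
struct conf_with_holes {
    void *a;
    int   x;
    void *b;
    int   y;
};

/* Same members, pointers first: the two ints now share one 8-byte slot. */
struct conf_reordered {
    void *a;
    void *b;
    int   x;
    int   y;
};

/* Reordering never makes the struct larger; on common 64-bit ABIs it
 * typically shrinks from 32 to 24 bytes.
 */
static_assert(sizeof(struct conf_reordered) <= sizeof(struct conf_with_holes),
              "grouping members by size must not grow the struct");
```

On 32-bit targets both layouts usually have the same size, which matches the commit's note that the gain applies to 64-bit archs.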
Dave McCowan	328fb58d74	MEDIUM: connection: add new bit in Proxy Protocol V2 There are two sample commands to get information about the presence of a client certificate: ssl_fc_has_crt is true if there is a certificate present in the current connection; ssl_c_used is true if there is a certificate present in the session. If a session has stopped and resumed, then ssl_c_used could be true, while ssl_fc_has_crt is false. In the client byte of the TLS TLV of Proxy Protocol V2, there is only one bit to indicate whether a certificate is present on the connection. The attached patch adds a second bit to indicate the presence for the session. This maintains backward compatibility. [wt: this should be backported to 1.5 to help maintain compatibility between versions]	2014-08-23 07:35:29 +02:00
Lukas Tribus	656c5fa7e8	BUILD: ssl: disable OCSP when using boringssl Google's boringssl doesn't currently support OCSP, so disable it if detected. OCSP support may be reintroduced as per: https://code.google.com/p/chromium/issues/detail?id=398677 In that case we can simply revert this commit. Signed-off-by: Lukas Tribus <luky-37@hotmail.com>	2014-08-18 14:33:48 +02:00
Godbach	e468d55998	BUG/MINOR: server: move the directive #endif to the end of file If a source file includes proto/server.h twice or more, redefinition errors will be triggered for such inline functions as server_throttle_rate(), server_is_draining(), srv_adm_set_maint() and so on. Just move #endif directive to the end of file to solve this issue. Signed-off-by: Godbach <nylzhaowei@gmail.com>	2014-07-29 11:03:14 +02:00
Willy Tarreau	09448f7d7c	MEDIUM: http: add the track-sc* actions to http-request rules Add support for http-request track-sc, similar to what is done in tcp-request for backends. A new act_prm field was added to HTTP request rules to store the track params (table, counter). Just like for TCP rules, the table is resolved while checking for config validity. The code was mostly copied from the TCP code with the exception that here we also count the HTTP request count and rate by hand. Something could probably be factored out in the future. It seems like tracking flags should be improved to mark each hook which tracks a key so that we can have some check points where to increase counters of the past if not done yet, a bit like is done for TRACK_BACKEND.	2014-07-16 17:26:40 +02:00
Willy Tarreau	5ed1bbfc75	CLEANUP: session: move the stick counters declarations to stick_table.h They're really not appropriate in session.h as they always require a stick table, and I'm having a hard time finding them each time I need to.	2014-07-16 17:26:40 +02:00
Willy Tarreau	edee1d60b7	MEDIUM: stick-table: make it easier to register extra data types Some users want to add their own data types to stick tables. We don't want to use a linked list here for performance reasons, so we need to continue to use an indexed array. This patch allows one to reserve a compile-time-defined number of extra data types by setting the new macro STKTABLE_EXTRA_DATA_TYPES to anything greater than zero, keeping in mind that anything larger will slightly inflate the memory consumed by stick tables (not per entry though). Then calling stktable_register_data_store() with the new keyword will either register a new keyword or fail if the desired entry was already taken or the keyword already registered. Note that this patch does not dictate how the data will be used, it only offers the possibility to create new keywords and have an index to reference them in the config and in the tables. The caller will not be able to use stktable_data_cast() and will have to explicitly cast the stable pointers to the expected types. It can be used for experimentation as well.	2014-07-15 19:14:52 +02:00
Willy Tarreau	e12704bfc7	MINOR: session: export the function 'smp_fetch_sc_stkctr' This one is sometimes useful outside of this file.	2014-07-15 19:09:56 +02:00
Thierry FOURNIER	055b9d5c63	MINOR: http: export the function 'smp_fetch_base32' It's sometimes useful outside of proto_http.c.	2014-07-15 19:09:36 +02:00
Willy Tarreau	65d805fdfc	BUILD: fix dependencies between config and compat.h compat.h only depends on the system, and config needs compat, not the opposite. global.h was fixed to explicitly include standard.h for LONGBITS.	2014-07-15 19:09:36 +02:00
Willy Tarreau	bb2e669f9e	BUG/MAJOR: http: correctly rewind the request body after start of forwarding Daniel Dubovik reported an interesting bug showing that the request body processing was still not 100% fixed. If a POST request contained short enough data to be forwarded at once before trying to establish the connection to the server, we had no way to correctly rewind the body. The first visible case is that balancing on a header does not always work on such POST requests since the header cannot be found. But there are even nastier implications which are that http-send-name-header would apply to the wrong location and possibly even affect part of the request's body due to an incorrect rewinding. There are two options to fix the problem : - first one is to force the HTTP_MSG_F_WAIT_CONN flag on all hash-based balancing algorithms and http-send-name-header, but there's always a risk that any new algorithm forgets to set it ; - the second option is to account for the amount of skipped data before the connection establishes so that we always know the position of the request's body relative to the buffer's origin. The second option is much more reliable and fits very well in the spirit of the past changes to fix forwarding. Indeed, at the moment we have msg->sov which points to the start of the body before headers are forwarded and which equals zero afterwards (so it still points to the start of the body before forwarding data). A minor change consists in always making it point to the start of the body even after data have been forwarded. It means that it can get a negative value (so we need to change its type to signed). In order to avoid wrapping, we only do this as long as the other side of the buffer is not connected yet. Doing this definitely fixes the issues above for the requests. Since the response cannot be rewound we don't need to perform any change there. This bug was introduced/remained unfixed in 1.5-dev23 so the fix must be backported to 1.5.	2014-07-10 19:29:45 +02:00
Willy Tarreau	8fed9037cd	MEDIUM: stick-table: implement lookup from a sample fetch Currently we have stktable_fetch_key() which fetches a sample according to an expression and returns a stick table key, but we also need a function which does only the second half of it from a known sample. So let's cut the function in two and introduce smp_to_stkey() to perform this lookup. The first function was adapted to make use of it in order to avoid code duplication.	2014-07-10 16:43:44 +02:00
Dan Dubovik	bd57a9f977	BUG/MEDIUM: backend: Update hash to use unsigned int throughout When we were generating a hash, it was done using an unsigned long. When the hash was used to select a backend, it was sent as an unsigned int. This made it difficult to predict which backend would be selected. This patch updates get_hash, and the hash methods to use an unsigned int, to remain consistent throughout the codebase. This fix should be backported to 1.5 and probably in part to 1.4.	2014-07-08 22:00:21 +02:00
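The mismatch can be illustrated with fixed-width types. This is a sketch under assumptions, not the code touched by the patch: `pick_wide` and `pick_narrow` are hypothetical names, a plain modulo stands in for the real server selection, and `uint64_t` / `uint32_t` stand in for `unsigned long` / `unsigned int` on a 64-bit platform.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical selection using the full-width hash value. */
static unsigned int pick_wide(uint64_t hash, unsigned int nbsrv)
{
    assert(nbsrv > 0);
    return (unsigned int)(hash % nbsrv);
}

/* Hypothetical selection after the hash was passed as a 32-bit value:
 * the upper bits were silently dropped by the conversion.
 */
static unsigned int pick_narrow(uint32_t hash, unsigned int nbsrv)
{
    assert(nbsrv > 0);
    return (unsigned int)(hash % nbsrv);
}
```

With a hash of 0x100000003 and 7 servers, the wide computation selects index 0 while the truncated one selects index 3. Someone predicting the result from the wide value would therefore guess the wrong server, which is why the patch keeps a single type throughout.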
Willy Tarreau	fd0e008d9d	BUG/MEDIUM: unix: completely unbind abstract sockets during a pause() Abstract namespace sockets ignore the shutdown() call and do not make it possible to temporarily stop listening. The issue it causes is that during a soft reload, the new process cannot bind, complaining that the address is already in use. This change registers a new pause() function for unix sockets and completely unbinds the abstract ones since it's possible to rebind them later. It requires the two previous patches as well as preceding fixes. This fix should be backported into 1.5 since the issue appears there.	2014-07-08 01:13:35 +02:00
Willy Tarreau	092d865c53	MEDIUM: listener: implement a per-protocol pause() function In order to fix the abstract socket pause mechanism during soft restarts, we'll need to proceed differently depending on the socket protocol. The pause_listener() function already supports some protocol-specific handling for the TCP case. This commit makes this cleaner by adding a new ->pause() function to the protocol struct, which, if defined, may be used to pause a listener of a given protocol. For now, only TCP has been adapted, with the specific code moved from pause_listener() to tcp_pause_listener().	2014-07-08 01:13:34 +02:00
Willy Tarreau	18324f574f	MEDIUM: log: support a user-configurable max log line length With all the goodies supported by logformat, people find that the limit of 1024 chars for log lines is too short. Some servers do not support larger lines and can simply drop them, so changing the default value is not always the best choice. This patch takes a different approach. Log line length is specified per log server on the "log" line, with a value between 80 and 65535. That way it's possible to satisfy all needs, even with some fat local servers and small remote ones.	2014-06-27 18:13:53 +02:00
Willy Tarreau	4e957907aa	MINOR: log: make MAX_SYSLOG_LEN overridable at build time This value was set in log.h without any #ifndef around, so when one wanted to change it, a patch was needed. Let's move it to defaults.h with the usual #ifndef so that it's easier to change it.	2014-06-27 18:13:53 +02:00
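The "usual #ifndef" pattern mentioned above looks roughly like the following. It is a generic sketch rather than the contents of defaults.h: the default of 1024 is taken from the neighbouring commit's mention of a 1024-char limit, and `logline` / `logline_capacity` are hypothetical names used only to show the macro in use.

```c
#include <assert.h>
#include <stddef.h>

/* Only provide a default when the build did not set one already, e.g. by
 * passing -DMAX_SYSLOG_LEN=2048 to the compiler. The value 1024 is an
 * assumption based on the limit quoted in the neighbouring commit.
 */
#ifndef MAX_SYSLOG_LEN
#define MAX_SYSLOG_LEN 1024
#endif

static_assert(MAX_SYSLOG_LEN > 0, "MAX_SYSLOG_LEN must be positive");

/* Hypothetical buffer sized from the macro. */
static char logline[MAX_SYSLOG_LEN];

/* Hypothetical accessor reporting how many bytes a log line may occupy. */
size_t logline_capacity(void)
{
    return sizeof(logline);
}
```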
Willy Tarreau	b5975defba	MINOR: stick-table: make stktable_fetch_key() indicate why it failed stktable_fetch_key() does not indicate whether it returns NULL because the input sample was not found or because it's unstable. It causes trouble with track-sc* rules. Just like with sample_fetch_string(), we want it to be able to give more information to the caller about what it found. Thus, now we use the pointer to a sample passed by the caller, and fill it with the information we have about the sample. That way, even if we return NULL, the caller has the ability to check whether a sample was found and if it is still changing or not.	2014-06-25 17:17:53 +02:00
Emeric Brun	0abf836ecb	BUG/MINOR: ssl: Fix external function in order not to return a pointer on an internal trash buffer. 'ssl_sock_get_common_name' applied to a connection was also renamed 'ssl_sock_get_remote_common_name'. Currently, this function is only used with protocol PROXYv2 to retrieve the client certificate's common name. A further usage could be to retrieve the server certificate's common name on an outgoing connection.	2014-06-24 22:39:16 +02:00
Simon Horman	98637e5bff	MEDIUM: Add external check Add an external check which makes use of an external process to check the status of a server.	2014-06-20 07:10:07 +02:00
Emeric Brun	c8b27b6c68	MEDIUM: ssl: add 300s supported time skew on OCSP response update. OCSP_MAX_RESPONSE_TIME_SKEW can be set to a different value at compilation (default is 300 seconds).	2014-06-19 14:37:30 +02:00
Thierry FOURNIER	f4e6129e30	MINOR: missing regex.h include	2014-06-19 14:29:32 +02:00

1832 Commits