haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2025-02-19 12:16:59 +00:00

Author	SHA1	Message	Date
Willy Tarreau	a9fb08317f	[MINOR] report in the proxies the requirements for ACLs This patch propagates the ACL conditions' "requires" bitfield to the proxies. This makes it possible to know exactly what a proxy might have to support for any request, which helps knowing whether we have to allocate some space for certain types of structures or not (eg: the hdr_idx struct). The concept might be extended to a lot more types of information, such as detecting whether we need to allocate some space for some request ACLs which need a result in the response, etc...	2009-07-10 23:09:39 +02:00
Willy Tarreau	1d0dfb155d	[MAJOR] http: complete splitting of the remaining stages The HTTP processing has been split into 7 steps, one of which is no longer HTTP-specific (content-switching). That way, it becomes possible to use "use_backend" rules in TCP mode. A new "use_server" directive should follow soon.	2009-07-07 15:10:31 +02:00
Willy Tarreau	d787e6648c	[MEDIUM] http: split request waiter from request processor We want to split several steps in HTTP processing so that we can call individual analysers depending on what processing we want to perform. The first step consists in splitting the part that waits for a request from the rest.	2009-07-07 10:14:51 +02:00
Willy Tarreau	dc340a900d	[MEDIUM] splice: set the capability on each stream_interface The splice code did not consider compatibility between both ends of the connection. Now we set different capabilities on each stream interface, depending on what the protocol can splice to/from. Right now, only TCP is supported. Thanks to this, we're now able to automatically detect when splice() is not implemented and automatically disable it on one end instead of reporting errors to the upper layer.	2009-06-28 23:10:19 +02:00
Willy Tarreau	5d707e1aaa	[MEDIUM] stream_sock: don't close prematurely when nolinger is set When the nolinger option is used, we must not close too fast because some data might be left unsent. Instead we must proceed with a normal shutdown first, then a close. Also, we want to avoid merging FIN with the last segment if nolinger is set, because if that one gets lost, there is no chance for it to be retransmitted.	2009-06-28 11:09:07 +02:00
Willy Tarreau	be1b91842a	[MEDIUM] add support for TCP MSS adjustment for listeners Sometimes it can be useful to limit the advertised TCP MSS on incoming connections, for instance when requests come through a VPN or when the system is running with jumbo frames enabled. Passing the "mss <value>" arguments to a "bind" line will set the value. This works under Linux >= 2.6.28, and maybe a few earlier ones, though due to an old kernel bug most of earlier versions will probably ignore it. It is also possible that some other OSes will support this.	2009-06-14 18:48:19 +02:00
Willy Tarreau	d88edf2e52	[MEDIUM] implement tcp-smart-connect option at the backend This new option enables combining of request buffer data with the initial ACK of an outgoing TCP connection. Doing so saves one packet per connection which is quite noticeable on workloads mostly consisting in small objects. The option is not enabled by default.	2009-06-14 15:48:17 +02:00
Willy Tarreau	fb14edc215	[MEDIUM] stream_sock: implement tcp-cork for use during shutdowns on Linux Setting TCP_CORK on a socket before sending the last segment enables automatic merging of this segment with the FIN from the shutdown() call. Playing with TCP_CORK is not easy though as we have to track the status of the TCP_NODELAY flag since both are mutually exclusive. Doing so saves one more packet per session and offers about 5% more performance. There is no reason not to do it, so there is no associated option.	2009-06-14 15:24:37 +02:00
Willy Tarreau	9ea05a790f	[MEDIUM] implement option tcp-smart-accept at the frontend This option disables TCP quick ack upon accept. It is also automatically enabled in HTTP mode, unless the option is explicitly disabled with "no option tcp-smart-accept". This saves one packet per connection which can bring reasonable amounts of bandwidth for servers processing small requests.	2009-06-14 12:07:01 +02:00
Willy Tarreau	84b57dae4a	[MINOR] config: track "no option"/"option" changes Sometimes we would want to implement implicit default options, but for this we need to be able to disable them, which requires to keep track of "no option" settings. With this change, an option explicitly disabled in a defaults section will still be seen as explicitly disabled. There should be no regression as nothing makes use of this yet.	2009-06-14 11:10:45 +02:00
Willy Tarreau	c6f4ce8fc4	[MEDIUM] add support for binding to source port ranges during connect Some users are already hitting the 64k source port limit when connecting to servers. The system usually maintains a list of unused source ports, regardless of the source IP they're bound to. So in order to go beyond the 64k concurrent connections, we have to manage the source ip:port lists ourselves. The solution consists in assigning a source port range to each server and use a free port in that range when connecting to that server, either for a proxied connection or for a health check. The port must then be put back into the server's range when the connection is closed. This mechanism is used only when a port range is specified on a server. It makes it possible to reach 64k connections per server, possibly all from the same IP address. Right now it should be more than enough even for huge deployments.	2009-06-10 12:23:32 +02:00
Willy Tarreau	13a34bd110	[MINOR] compute the max of sessions/s on fe/be/srv Some users want to keep the max sessions/s seen on servers, frontends and backends for capacity planning. It's easy to grab it while the session count is updated, so let's keep it.	2009-05-10 18:52:49 +02:00
Willy Tarreau	f7edefa413	[MINOR] implement per-logger log level limitation Some people are using haproxy in a shared environment where the system logger by default sends alert and emerg messages to all consoles, which happens when all servers go down on a backend for instance. These people can not always change the system configuration and would like to limit the outgoing messages level in order not to disturb the local users. The addition of an optional 4th field on the "log" line permits exactly this. The minimal log level ensures that all outgoing logs will have at least this level. So the logs are not filtered out, just set to this level.	2009-05-10 17:20:05 +02:00
Benoit	affb481f1a	[MEDIUM] add support for "balance hdr(name)" There is a patch made by me that allow for balancing on any http header field. [WT: made minor changes: - turned 'balance header name' into 'balance hdr(name)' to match more closely the ACL syntax for easier future convergence - renamed the proxy structure fields header_* => hh_* - made it possible to use the domain name reduction to any header, not only "host" since it makes sense to do it with other ones. Otherwise patch looks good. /WT]	2009-05-10 15:50:15 +02:00
Willy Tarreau	c9bd0cc224	[MINOR] add options dontlog-normal and log-separate-errors Some big traffic sites have trouble dealing with logs and tend to disable them. Here are two new options to help cope with massive logs. - dontlog-normal only disables logging for 100% successful connections, other ones will still be logged - log-separate-errors will cause non-100% successful connections to be logged at level "err" instead of level "info" so that a properly configured syslog daemon can send them to a different file for longer conservation.	2009-05-10 11:57:02 +02:00
Willy Tarreau	8f38bd0497	[MINOR] add basic signal handling functions These functions will be used to deliver asynchronous signals in order to make the signal handling functions more robust. The goal is to keep the same interface to signal handlers.	2009-05-10 09:24:23 +02:00
Maik Broemme	2850cb42b6	[MINOR] add X-Original-To: header I have attached a patch which will add on every http request a new header 'X-Original-To'. If you have HAProxy running in transparent mode with a big number of SQUID servers behind it, it is very nice to have the original destination ip as a common header to make decisions based on it. The whole thing is configurable with a new option 'originalto'. I have updated the sourcecode as well as the documentation. The 'haproxy-en.txt' and 'haproxy-fr.txt' files are untouched, due to lack of my french language knowledge. ;) Also the patch adds this header for IPv4 only. I haven't any IPv6 test environment running here and don't know if getsockopt() with SO_ORIGINAL_DST will work on IPv6. If someone knows it and wants to test it I can modify the diff. Feel free to ask me questions or things which should be changed. :) --Maik	2009-05-01 16:22:33 +02:00
Willy Tarreau	3b88d441e9	[MINOR] switch all stat counters to 64-bit The byte counters have long been 64-bit to avoid overflows. But with several sites nowadays, we see session counters wrap around every 10 days or so. So it was time to switch counters to 64-bit, including error and warning counters which can theoretically rise as fast as session counters even if in practice there is very low risk. The performance impact should not be noticeable since those counters are only updated once per session. The stats output has been carefully checked for proper types on both 32- and 64-bit platforms.	2009-04-11 20:44:08 +02:00
Willy Tarreau	32a4ec0ed7	[MEDIUM] http: add options to ignore invalid header names Sometimes it is required to let invalid requests pass because applications sometimes take time to be fixed and other servers do not care. Thus we provide two new options : option accept-invalid-http-request (for the frontend) option accept-invalid-http-response (for the backend) When those options are set, invalid requests or responses do not cause a 403/502 error to be generated.	2009-04-02 21:36:34 +02:00
Willy Tarreau	3884cbaae6	[MINOR] show sess: report number of calls to each task For debugging purposes, it can be useful to know how many times each task has been called.	2009-03-28 17:54:35 +01:00
Willy Tarreau	1b194fe03e	[OPTIM] buffer: new BF_READ_DONTWAIT flag reduces EAGAIN rates When the reader does not expect to read lots of data, it can set BF_READ_DONTWAIT on the request buffer. When it is set, the stream_sock_read callback will not try to perform multiple reads, it will return after only one, and clear the flag. That way, we can immediately return when waiting for an HTTP request without trying to read again. On pure request/responses schemes such as monitor-uri or redirects, this has completely eliminated the EAGAIN occurrences and the epoll_ctl() calls, resulting in a performance increase of about 10%. Similar effects should be observed once we support HTTP keep-alive since we'll immediately disable reads once we get a full request.	2009-03-21 21:57:30 +01:00
Willy Tarreau	6f4a82c7af	[OPTIM] stream_sock: don't retry to read after a large read If we get very large data at once, it's almost certain that it's worthless trying to read again, because we got everything we could get. Doing this has made all -EAGAIN disappear from splice reads. The threshold has been put in the global tunable structures so that if we one day want to make it accessible from user config, it will be easy to do so.	2009-03-21 20:43:57 +01:00
Willy Tarreau	d0a201b35c	[CLEANUP] task: distinguish between clock ticks and timers Timers are unsigned and used as tree positions. Ticks are signed and used as absolute date within current time frame. While the two are normally equal (except zero), it's important not to confuse them in the code as they are not interchangeable. We add two inline functions to turn each one into the other. The comments have also been moved to the proper location, as it was not easy to understand what was a tick and what was a timer unit.	2009-03-08 15:58:07 +01:00
Willy Tarreau	26c250683f	[MEDIUM] minor update to the task api: let the scheduler queue itself All the tasks callbacks had to requeue the task themselves, and update a global timeout. This was not convenient at all. Now the API has been simplified. The tasks callbacks only have to update their expire timer, and return either a pointer to the task or NULL if the task has been deleted. The scheduler will take care of requeuing the task at the proper place in the wait queue.	2009-03-08 09:38:41 +01:00
Willy Tarreau	4726f53794	[OPTIM] task: don't unlink a task from a wait queue when waking it up In many situations, we wake a task on an I/O event, then queue it exactly where it was. This is a real waste because we delete/insert tasks into the wait queue for nothing. The only reason for this is that there was only one tree node in the task struct. By adding another tree node, we can have one tree for the timers (wait queue) and one tree for the priority (run queue). That way, we can have a task both in the run queue and wait queue at the same time. The wait queue now really holds timers, which is what it was designed for. The net gain is at least 1 delete/insert cycle per session, and up to 2-3 depending on the workload, since we save one cycle each time the expiration date is not changed during a wake up.	2009-03-08 07:59:18 +01:00
Willy Tarreau	3a7d20781d	[MEDIUM] implement "rate-limit sessions" for the frontend The new "rate-limit sessions" statement sets a limit on the number of new connections per second on the frontend. As it is extremely accurate (about 0.1%), it is efficient at limiting resource abuse or DoS.	2009-03-05 23:48:25 +01:00
Willy Tarreau	7f062c4193	[MEDIUM] measure and report session rate on frontend, backends and servers With this change, all frontends, backends, and servers maintain a session counter and a timer to compute a session rate over the last second. This value will be very useful because it varies instantly and can be used to check thresholds. This value is also reported in the stats in a new "rate" column.	2009-03-05 18:43:00 +01:00
Willy Tarreau	74808cb907	[MEDIUM] implement error dump on unix socket with "show errors" The new "show errors" command sent on a unix socket will dump all captured request and response errors for all proxies. It is also possible to bound the log to frontends and backends whose ID is passed as an optional parameter. The output provides information about frontend, backend, server, session ID, source address, error type, and error position along with a complete dump of the request or response which has caused the error. If a new error scratches the one currently being reported, then the dump is aborted with a warning message, and processing goes on to next error.	2009-03-04 15:53:18 +01:00
Willy Tarreau	f073a83b1d	[MEDIUM] store a complete dump of request and response errors in proxies Each proxy instance, either frontend or backend, now has some room dedicated to storing a complete dated request or response in case of parsing error. This will make it possible to consult errors in order to find the exact cause, which is particularly important for troubleshooting faulty applications.	2009-03-04 10:26:38 +01:00
Willy Tarreau	0b9c02c861	[MEDIUM] implement bind-process to limit service presence by process The "bind-process" keyword lets the admin select which instances may run on which process (in multi-process mode). It makes it easier to more evenly distribute the load across multiple processes by avoiding having too many listen to the same IP:ports.	2009-02-04 22:05:05 +01:00
Willy Tarreau	c76721da57	[MEDIUM] add support for source interface binding at the server level Add support for "interface <name>" after the "source" statement on the server line.	2009-02-04 20:20:58 +01:00
Willy Tarreau	d53f96b3f0	[MEDIUM] add support for source interface binding Specifying "interface <name>" after the "source" statement allows one to bind to a specific interface for proxy<->server traffic. This makes it possible to use multiple links to reach multiple servers, and to force traffic to pass via an interface different from the one the system would have chosen based on the routing table.	2009-02-04 18:46:54 +01:00
Willy Tarreau	5e6e204d1c	[MINOR] add support for bind interface name By appending "interface <name>" to a "bind" line, it is now possible to specifically bind to a physical interface name. Note that this currently only works on Linux and requires root privileges.	2009-02-04 17:19:29 +01:00
Willy Tarreau	3ab68cf0ae	[MEDIUM] splice: add the global "nosplice" option Setting "nosplice" in the global section will disable the use of TCP splicing (both tcpsplice and linux 2.6 splice). The same will be achieved using the "-dS" parameter on the command line.	2009-01-25 16:03:28 +01:00
Willy Tarreau	43b78999ec	[MEDIUM] move global tuning options to the global structure The global tuning options right now only concern the polling mechanisms, and they are not in the global struct itself. It's not very practical to add other options so let's move them to the global struct and remove types/polling.h which was not used for anything else.	2009-01-25 15:42:27 +01:00
Willy Tarreau	3eba98aa57	[MEDIUM] splice: make use of pipe pools Using pipe pools makes pipe management a lot easier. It also allows to remove quite a bunch of #ifdefs in areas which depended on the presence or not of support for kernel splicing. The buffer now holds a pointer to a pipe structure which is always NULL except if there are still data in the pipe. When it needs to use that pipe, it dynamically allocates it from the pipe pool. When the data is consumed, the pipe is immediately released. That way, there is no need anymore to care about pipe closure upon session termination, nor about pipe creation when trying to use splice(). Another immediate advantage of this method is that it considerably reduces the number of pipes needed to use splice(). Tests have shown that even with 0.2 pipe per connection, almost all sessions can use splice(), because the same pipe may be used by several consecutive calls to splice().	2009-01-25 13:56:13 +01:00
Willy Tarreau	982b6e37e4	[MEDIUM] introduce pipe pools A new data type has been added : pipes. Some pre-allocated empty pipes are maintained in a pool for users such as splice which use them a lot for very short times. Pipes are allocated using get_pipe() and released using put_pipe(). Pipes which are released with pending data are immediately killed. The struct pipe is small (16 to 20 bytes) and may even be further reduced by unifying ->data and ->next. It would be nice to have a dedicated cleanup task which would watch for the pipes usage and destroy a few of them from time to time.	2009-01-25 13:49:53 +01:00
Willy Tarreau	259de1b702	[MINOR] introduce structures required to support Linux kernel splicing When CONFIG_HAP_LINUX_SPLICE is defined, the buffer structure will be slightly enlarged to support information needed for kernel splicing on Linux. A first attempt consisted in putting this information into the stream interface, but in the long term, it appeared really awkward. This version puts the information into the buffer. The platform-dependant part is conditionally added and will only enlarge the buffers when compiled in. One new flag has also been added to the buffers: BF_KERN_SPLICING. It indicates that the application considers it is appropriate to use splicing to forward remaining data.	2009-01-18 21:56:21 +01:00
Willy Tarreau	66aa61f76b	[MEDIUM] splice: add configuration options and set global.maxpipes Three new options have been added when CONFIG_HAP_LINUX_SPLICE is set : - splice-request - splice-response - splice-auto They are used to enable splicing per frontend/backend. They are also supported in defaults sections. The "splice-auto" option is meant to automatically turn splice on for buffers marked as fast streamers. This should save quite a bunch of file descriptors. It was required to add a new "options2" field to the proxy structure because the original "options" is full. When global.maxpipes is not set, it is automatically adjusted to the max of the sums of all frontend's and backend's maxconns for those which have at least one splice option enabled.	2009-01-18 21:44:07 +01:00
Willy Tarreau	3ec79b9c42	[MINOR] global.maxpipes: add the ability to reserve file descriptors for pipes This will be needed to use linux's splice() syscall.	2009-01-18 20:39:42 +01:00
Willy Tarreau	03d60bbaf9	[OPTIM] buffer: replace rlim by max_len In the buffers, the read limit used to leave some place for header rewriting was set by a pointer to the end of the buffer. Not only did this require subtractions at every place in the code, but it will also soon not be usable anymore when we want to support keepalive. Let's replace this with a length limit, comparable to the buffer's length. This has also slightly reduced the code size.	2009-01-09 11:14:39 +01:00
Willy Tarreau	0abebcc0fb	[MEDIUM] i/o: rework ->to_forward and ->send_max The way the buffers and stream interfaces handled ->to_forward was really not handy for multiple reasons. Now we've moved its control to the receive-side of the buffer, which is also responsible for keeping send_max up to date. This makes more sense as it now becomes possible to send some pre-formatted data followed by forwarded data. The following explanation has also been added to buffer.h to clarify the situation. Right now, tests show that the I/O is behaving extremely well. Some work will have to be done to adapt existing splice code though. /* Note about the buffer structure The buffer contains two length indicators, one to_forward counter and one send_max limit. First, it must be understood that the buffer is in fact split in two parts : - the visible data (->data, for ->l bytes) - the invisible data, typically in kernel buffers forwarded directly from the source stream sock to the destination stream sock (->splice_len bytes). Those are used only during forward. In order not to mix data streams, the producer may only feed the invisible data with data to forward, and only when the visible buffer is empty. The consumer may not always be able to feed the invisible buffer due to platform limitations (lack of kernel support). Conversely, the consumer must always take data from the invisible data first before ever considering visible data. There is no limit to the size of data to consume from the invisible buffer, as platform-specific implementations will rarely leave enough control on this. So any byte fed into the invisible buffer is expected to reach the destination file descriptor, by any means. However, it's the consumer's responsibility to ensure that the invisible data has been entirely consumed before consuming visible data. This must be reflected by ->splice_len. This is very important as this and only this can ensure strict ordering of data between buffers. The producer is responsible for decreasing ->to_forward and increasing ->send_max. The ->to_forward parameter indicates how many bytes may be fed into either data buffer without waking the parent up. The ->send_max parameter says how many bytes may be read from the visible buffer. Thus it may never exceed ->l. This parameter is updated by any buffer_write() as well as any data forwarded through the visible buffer. The consumer is responsible for decreasing ->send_max when it sends data from the visible buffer, and ->splice_len when it sends data from the invisible buffer. A real-world example consists in part in an HTTP response waiting in a buffer to be forwarded. We know the header length (300) and the amount of data to forward (content-length=9000). The buffer already contains 1000 bytes of data after the 300 bytes of headers. Thus the caller will set ->send_max to 300 indicating that it explicitly wants to send those data, and set ->to_forward to 9000 (content-length). This value must be normalised immediately after updating ->to_forward : since there are already 1300 bytes in the buffer, 300 of which are already counted in ->send_max, and that size is smaller than ->to_forward, we must update ->send_max to 1300 to flush the whole buffer, and reduce ->to_forward to 8000. After that, the producer may try to feed the additional data through the invisible buffer using a platform-specific method such as splice(). */	2009-01-09 10:15:03 +01:00
Willy Tarreau	dcef33fa9b	[MINOR] add the splice_len member to the buffer struct in preparation of splice support In preparation of splice support, let's add the splice_len member to the buffer struct. An earlier implementation made it conditional, which made the whole logics very complex due to a large number of ifdefs. Now BF_EMPTY is only set once both buf->l and buf->splice_len are null. Splice_len is initialized to zero during buffer creation and is currently not changed, so the whole logics remains unaffected. When splice gets merged, splice_len will reflect the number of bytes in flight out of the buffer but not yet sent, typically in a pipe for the Linux case.	2009-01-09 10:15:02 +01:00
Willy Tarreau	6b66f3e4f6	[MAJOR] implement autonomous inter-socket forwarding If an analyser sets buf->to_forward to a given value, that many data will be forwarded between the two stream interfaces attached to a buffer without waking the task up. The same applies once all analysers have been released. This saves a large amount of calls to process_session() and a number of task_dequeue/queue.	2009-01-09 10:15:02 +01:00
Willy Tarreau	3ffeba1f67	[MEDIUM] enable inter-stream_interface wakeup calls By letting the producer tell the consumer there is data to check, and the consumer tell the producer there is some space left again, we can cut in half the number of session wakeups. This is also an important starting point for future splicing support.	2008-12-28 11:09:02 +01:00
Willy Tarreau	b0ef735c71	[MINOR] add flags to indicate when a stream interface is waiting for space/data It will soon be required to know when a stream interface is waiting for buffer data or buffer room. Let's add two flags for that.	2008-12-28 11:08:03 +01:00
Willy Tarreau	86491c3164	[MEDIUM] indicate when we don't care about read timeout Sometimes we don't care about a read timeout, for instance, from the client when waiting for the server, but we still want the client to be able to read. Till now it was done by artificially forcing the read timeout to ETERNITY. But this will cause trouble when we want the low level stream sock to communicate without waking the session up. So we add a BF_READ_NOEXP flag to indicate that when the read timeout is to be set, it might have to be set to ETERNITY. Since BF_READ_ENA was not used, we replaced this flag.	2008-12-28 11:06:40 +01:00
Willy Tarreau	f890dc9003	[MEDIUM] add a send limit to a buffer For keep-alive, line-mode protocols and splicing, we will need to limit the sender to process a certain amount of bytes. The limit is automatically set to the buffer size when analysers are detached from the buffer.	2008-12-28 10:58:52 +01:00
Willy Tarreau	0140f2553c	[MINOR] redirect: add support for "set-cookie" and "clear-cookie" It is now possible to set or clear a cookie during a redirection. This is useful for logout pages, or for protecting against some DoSes. Check the documentation for the options supported by the "redirect" keyword. (cherry-picked from commit 4af993822e880d8c932f4ad6920db4c9242b0981)	2008-12-07 23:46:38 +01:00
Willy Tarreau	79da4697ca	[MINOR] redirect: add support for the "drop-query" option If "drop-query" is present on a "redirect" line using the "prefix" mode, then the returned Location header will be the request URI without the query-string. This may be used on some login/logout pages, or when it must be decided to redirect the user to a non-secure server. (cherry-picked from commit f2d361ccd73aa16538ce767c766362dd8f0a88fd)	2008-12-07 23:42:01 +01:00
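
The buffer note in commit 0abebcc0fb walks through a numeric example of normalising ->send_max and ->to_forward. The arithmetic can be sketched as below. This is only a small Python model of the rule as the commit message states it, not HAProxy's C code: the Buffer class and the schedule_forward() helper are hypothetical names introduced for illustration, and only the fields l, send_max and to_forward come from the message.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    """Hypothetical stand-in for the visible part of the buffer."""
    l: int = 0           # bytes currently held in the visible buffer
    send_max: int = 0    # bytes the consumer may send from the visible buffer
    to_forward: int = 0  # bytes that may still be forwarded without a wakeup


def schedule_forward(buf: Buffer, nbytes: int) -> Buffer:
    """Ask for nbytes to be forwarded, then normalise immediately.

    Data already sitting in the visible buffer but not yet counted in
    send_max is moved from to_forward into send_max, so that send_max
    never exceeds l.
    """
    buf.to_forward += nbytes
    pending = buf.l - buf.send_max          # buffered but not yet scheduled
    moved = min(pending, buf.to_forward)
    buf.send_max += moved
    buf.to_forward -= moved
    return buf


# Example from the commit message: 300 bytes of headers already scheduled,
# 1000 bytes of body already buffered, content-length of 9000.
buf = Buffer(l=1300, send_max=300)
schedule_forward(buf, 9000)
print(buf.send_max, buf.to_forward)
```

With the values from the message, the model ends with send_max at 1300 and to_forward at 8000, which matches the figures given in the commit.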

260 Commits