Instead of just using the conn_stream wait_list, give the stream_interface
its own. Once the conn_stream has its own buffers, the stream_interface
may have to wait on them.
This adds the set-priority-class and set-priority-offset actions to
http-request and tcp-request content. At this point they are not used
yet (wiring them up is the purpose of the next commit), but all the
logic to set and clear the values is there.
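A hypothetical configuration using these actions could look like this
(the ACLs and expressions are purely illustrative):

    frontend fe_main
        bind :8080
        # lower class values are dequeued first
        http-request set-priority-class int(-5) if { path_beg /checkout }
        # push suspected crawlers slightly back within their class
        http-request set-priority-offset int(50) if { req.hdr(user-agent) -m sub bot }
        default_backend be_app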
We'll need trees to manage the queues by priorities. This change replaces
the list with a tree based on a single key. It's effectively a list but
allows us to get rid of the list management right now.
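As a rough sketch of the change, using HAProxy's eb32 API (field names
illustrative):

    #include <eb32tree.h>

    struct pendconn {
        struct eb32_node node;       /* was: struct list list */
        /* ... */
    };

    static void pendconn_enqueue(struct eb_root *queue, struct pendconn *p)
    {
        p->node.key = 0;             /* single constant key: FIFO for now */
        eb32_insert(queue, &p->node);
    }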
We store the queue index in the stream and check it on dequeueing to
figure out how many entries were processed in between. This way we'll be
able to count the elements that may later be added before ours.
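The counting principle, as a small sketch (field names illustrative):

    /* the queue keeps a monotonic index bumped on each dequeue; the
     * stream records the value seen when it was enqueued */
    struct queue  { unsigned int idx; };
    struct stream { unsigned int queue_idx; };

    static unsigned int processed_since_enqueue(const struct queue *q,
                                                const struct stream *s)
    {
        /* unsigned arithmetic makes wrapping harmless */
        return q->idx - s->queue_idx;
    }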
The current name is misleading as it implies a queue size, but the value
instead indicates a position in the queue.
The value is only the queue size at the exact moment the element is enqueued.
Soon we will gain the ability to insert anywhere into the queue, at which
point the clarity of the name will matter more.
Multiplexers are not necessarily associated with an ALPN. ALPN is a TLS
extension, so it is not always defined or used. Instead, we now rather speak
of a multiplexer's protocols. So in this patch there are no significant
changes; some structures and functions are just renamed.
Now, a multiplexer can specify whether it can be installed on incoming
connections (ALPN_SIDE_FE), on outgoing connections (ALPN_SIDE_BE) or both
(ALPN_SIDE_BOTH). These flags are compatible with the proxies' ones.
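As an illustration (the exact values are not guaranteed, but the flags are
expected to mirror the proxies' capability bits):

    #define ALPN_SIDE_FE    0x01   /* usable on incoming connections */
    #define ALPN_SIDE_BE    0x02   /* usable on outgoing connections */
    #define ALPN_SIDE_BOTH  (ALPN_SIDE_FE | ALPN_SIDE_BE)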
To be symmetrical with the recv() part, we now handle retryable and partial
transmissions using an intermediary buffer in the conn_stream. For now it's
only set to BUF_NULL and never allocated nor used.
It cannot yet be used as-is without risking data loss on close, since
conn_streams need to be orphaned for this.
This is a partial revert of the commit deccd1116 ("MEDIUM: mux: make
mux->snd_buf() take the byte count in argument"). It is a requirement to do
zero-copy transfers. This will be mandatory when the TX buffer of the
conn_stream will be used.
Now data are consumed by mux->snd_buf() and not only sent, so it needs to
update the buffer state. On its side, the caller must be aware that the
buffer can be replaced by an empty or unallocated one.
As a side effect of this change, the function co_set_data() is now only
responsible for updating the channel's output, by updating the ->output
field.
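A minimal sketch of the resulting helper (the real implementation may
differ in details):

    struct channel { size_t output; /* ... */ };

    /* co_set_data() now only refreshes the channel's output count */
    static inline void co_set_data(struct channel *chn, size_t output)
    {
        chn->output = output;
    }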
Add a new pipe, one per thread, so that we can write on it to wake a thread
sleeping in a poller, and use it to wake threads supposed to take care of a
task, if they are all sleeping.
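The technique is the classic self-pipe trick; a minimal illustration
(names are illustrative, not HAProxy's):

    #include <unistd.h>

    static int wake_pipe[2];    /* created with pipe(), one pair per thread */

    static void wake_thread(void)
    {
        char c = 'w';
        /* a failed write only means the pipe is already full, i.e. a
         * wakeup is already pending, so the result can be ignored */
        (void)write(wake_pipe[1], &c, 1);
    }

    /* the poller watches wake_pipe[0] for readability and drains it */
    static void drain_wakeups(void)
    {
        char buf[64];
        while (read(wake_pipe[0], buf, sizeof(buf)) > 0)
            ;                   /* assumes the fd was made non-blocking */
    }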
This lock was necessary to manipulate the pendconn element between
concurrent places, but was causing great difficulties in the list walk
by having to iterate over multiple entries instead of being able to
safely pick the first one (in fact the first element was always the
right one but the locking model was hard to prove).
Here since we know we can always rely on the queue's locks, we take
the queue's lock every time we need to modify the element. In practice
it was already the case everywhere except in pendconn_dequeue() which
only works on an element that was already detached. This function had
to be protected against the risk of meeting an incompletely detached
element (which could be unlinked but not yet assigned). By taking the
queue lock around the LIST_ISEMPTY test, it's enough to ensure that a
concurrent thread either hasn't begun or has completed the operation.
The true benefit really is in pendconn_process_next_strm() where we
can again safely work with the first element of each queue. This will
significantly simplify next updates to this code.
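A simplified sketch of the pendconn_dequeue() protection described above
(helper names illustrative; see the next commit for how the right lock is
picked):

    static int pendconn_is_detached(struct pendconn *p)
    {
        int detached;

        pendconn_queue_lock(p);              /* server or proxy queue */
        detached = LIST_ISEMPTY(&p->list);   /* fully unlinked? */
        pendconn_queue_unlock(p);
        return detached;
    }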
The pendconn struct uses ->px and ->srv to designate where the element is
queued. There is something confusing regarding threads though, because we
have to lock the appropriate queue before inserting/removing elements, and
this queue may only be determined by looking at ->srv (if it's not NULL
it's the server, otherwise use the proxy). But pendconn_grab_from_px() and
pendconn_process_next_strm() both assign this ->srv field, making it
complicated to know what queue to lock before manipulating the element,
which is exactly why we have the pendconn_lock in the first place.
This commit introduces pendconn->target which is the target server that
the two aforementioned functions will set when assigning the server.
Thanks to this, the server pointer may always be relied on to determine
what queue to use.
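A sketch of the resulting lock selection (helper name illustrative, lock
macros as used elsewhere in HAProxy):

    static inline void pendconn_queue_lock(struct pendconn *p)
    {
        if (p->srv)
            HA_SPIN_LOCK(SERVER_LOCK, &p->srv->lock);
        else
            HA_SPIN_LOCK(PROXY_LOCK, &p->px->lock);
    }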
By removing the reason code for the wakeup we can gain 8 extra bits to
encode the task's state. The reason code was never used at all and is
wrong by design since subsequent calls will OR this value anyway. Let's
say goodbye to it and make room for more precious bits. The woken bits
were moved to the higher byte so that the most important bits can stay
grouped together.
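The resulting layout looks along these lines (values illustrative): the
low byte carries the task's own state bits while the high byte carries
the TASK_WOKEN_* bits, which are OR'ed on each wakeup:

    #define TASK_RUNNING      0x0001
    #define TASK_QUEUED       0x0004
    /* ... */
    #define TASK_WOKEN_TIMER  0x0200
    #define TASK_WOKEN_IO     0x0400
    #define TASK_WOKEN_MSG    0x1000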
In order to reorganize the connection layers, recv() operations will
need to be retryable and to support partial transfers. This requires
an intermediary buffer to hold the data coming from the mux. After a
few attempts, it turns out that this buffer is best placed inside the
conn_stream itself. For now it's only set to buf_empty and it will be
up to the caller to allocate it if required.
Totally nuke the "send" method; instead, the upper layer decides when it's
time to send data and, if it's not possible, uses the new subscribe() method
to be called when it can send data again.
Add a new "subscribe" method for connection, conn_stream and mux, so that
the upper layers can subscribe to them, to be called when an event happens.
Right now, the only event implemented is "SUB_CAN_SEND", where the upper
layer can register to be called back when it is possible to send data.
The connection and conn_stream got a new "send_wait_list" entry, which
required moving a few struct members around to maintain an efficient
cache alignment (and this actually slightly improved performance).
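A sketch of the new entry point (the flag value is illustrative):

    #define SUB_CAN_SEND  0x00000001  /* wake me up when sending is possible */

    struct mux_ops {
        /* ... */
        int (*subscribe)(struct conn_stream *cs, int event_type, void *param);
    };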
Now all the code used to manipulate chunks uses a struct buffer instead.
The functions are still called "chunk*", and some of them will progressively
move to the generic buffer handling code as they are cleaned up.
Now the buffers only contain the header and a pointer to the storage
area which can be anywhere. This will significantly simplify buffer
swapping and will make it possible to map chunks on buffers as well.
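The resulting layout is roughly the following:

    struct buffer {
        size_t size;   /* buffer size in bytes */
        char  *area;   /* points to <size> bytes of storage */
        size_t data;   /* amount of data after head */
        size_t head;   /* start offset of remaining data */
    };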
The buf_empty variable was removed, as now it's enough to have size==0
and area==NULL to designate the empty buffer (thus a non-allocated head
is the empty buffer by default). buf_wanted for now is indicated by
size==0 and area==(void *)1.
The channels and the checks now embed the buffer's head, and the only
pointer is to the storage area. This slightly increases the unallocated
buffer size (3 extra ints for the empty buffer) but considerably
simplifies dynamic buffer management. It will also later make it possible
to detach unused checks.
The way the struct buffer is arranged has proven quite efficient on a
number of tests, which makes sense given that size is always accessed
and often first, followed by the other ones.
Since we never access this field directly anymore, but only through the
channel's wrappers, it can now move to the channel. The buffers are now
completely free from the distinction between input and output data.
With this flag we introduce the notion of "dry" vs "wet" buffers : some
demultiplexers like the H2 mux require as much room as possible for some
operations that are not retryable like decoding a headers frame. For this
they need to know if the buffer is congested with data scheduled for
leaving soon or not. Since the new API will not provide this information
in the buffer itself, the caller must indicate it. We never need to know
the amount of such data, just the fact that the buffer is not in its
optimal condition to be used for receipt. This "CO_RFL_BUF_WET" flag is
used to mention that such outgoing data are still pending in the buffer
and that a sensitive receiver should better let it "dry" before using it.
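A sketch of how a caller may set the flag (the call site is illustrative):

    static size_t receive_into_channel(struct conn_stream *cs,
                                       struct channel *chn, size_t count)
    {
        /* tell the mux whether outgoing data still congest the buffer */
        int flags = co_data(chn) ? CO_RFL_BUF_WET : 0;

        return cs->conn->mux->rcv_buf(cs, &chn->buf, count, flags);
    }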
The mux and transport rcv_buf() now takes a "flags" argument, just like
the snd_buf() one or like the equivalent syscall lower part. The upper
layers will use this to pass some information such as indicating whether
the buffer is free from outgoing data or if the lower layer may allocate
the buffer itself.
It also returns a size_t. This is in order to clean the API. Note
that the H2 mux still uses some ints in the functions called from
h2_rcv_buf(), though it's not really a problem given that H2 frames
are smaller. It may deserve a general cleanup later though.
Just like we have a size_t for xprt->snd_buf(), we adjust to use size_t
for rcv_buf()'s count argument and return value. It also removes the
ambiguity related to the possibility to see a negative value there.
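The resulting prototypes, as described (modulo cosmetic details):

    /* mux level */
    size_t (*rcv_buf)(struct conn_stream *cs, struct buffer *buf,
                      size_t count, int flags);
    /* transport level */
    size_t (*rcv_buf)(struct connection *conn, struct buffer *buf,
                      size_t count, int flags);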
This way the mux doesn't need to modify the buffer's metadata anymore
nor to know the output's size. The mux->snd_buf() function now takes a
const buffer and it's up to the caller to update the buffer's state.
The return type was updated to return a size_t to comply with the count
argument.
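The resulting mux prototype, as described above (the const qualifier per
the text):

    size_t (*snd_buf)(struct conn_stream *cs, const struct buffer *buf,
                      size_t count, int flags);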
This way the senders don't need to modify the buffer's metadata anymore
nor to know about the output's split point. This way the functions can
take a const buffer and it's clearer who's in charge of updating the
buffer after a send. That's why the buffer realignment is now performed
by the caller of the transport's snd_buf() functions.
The return type was updated to return a size_t to comply with the count
argument.
Commit 200b0fa ("MEDIUM: Add support for updating TLS ticket keys via
socket") introduced support for updating TLS ticket keys from the CLI,
but missed a small corner case : if multiple bind lines reference the
same tls_keys file, the same reference is used (as expected), but during
the clean shutdown, it will lead to a double free when destroying the
bind_conf contexts since none of the lines knows if others still use
it. The impact is very low however, mostly a core and/or a message in
the system's log upon old process termination.
Let's introduce some basic refcounting to prevent this from happening,
so that only the last bind_conf frees it.
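The pattern is plain refcounting, along these lines (field placement and
helper name illustrative):

    struct tls_keys_ref {
        /* ... */
        int refcount;    /* number of bind_confs still referencing this */
    };

    static void bind_conf_release_tls_keys(struct tls_keys_ref *ref)
    {
        if (--ref->refcount == 0)
            free_tls_keys_ref(ref);    /* hypothetical free helper */
    }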
Thanks to Janusz Dziemidowicz and Thierry Fournier for both reporting
the same issue with an easy reproducer.
This fix needs to be backported from 1.6 to 1.8.
By default, HAProxy's DNS resolution at runtime ensures that there is no
IP address duplication in a backend (for servers being resolved by the
same hostname).
There are a few cases where people want, on purpose, to disable this
feature.
This patch introduces a couple of new server side options for this purpose:
"resolve-opts allow-dup-ip" or "resolve-opts prevent-dup-ip".
This patch adds a warning if an http-(request|response) (add|set)-header
rewrite fails to change the respective header in a request or response.
This usually happens when tune.maxrewrite is not sufficient to hold all
the headers that should be added.
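When the warning is emitted, raising the reserve is the usual remedy, for
example:

    global
        tune.maxrewrite 2048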
There's no real reason to have a specific scheduler for applets anymore, so
nuke it and just use tasks. This comes with some benefits, the first one
being that applets cannot induce high latencies anymore since they share
nice values with other tasks. Later it will be possible to configure the
applets' nice value. The second benefit is that the applet scheduler was
not very thread-friendly, having a big lock around it in anticipation of this
change. Thus applet-intensive workloads should now scale much better with
threads.
Some more improvement is possible now : some applets also use a task to
handle timers and timeouts. These ones could now be simplified to use only
one task.
Introduce tasklets, lightweight tasks. They have no notion of priority,
they are just run as soon as possible, and will probably be used for I/O
later.
For the moment they're used to replace the temporary thread-local list
that was used in the scheduler. The first part of the struct is common
with tasks so that tasks can be cast to tasklets and queued in this list.
Once a task is in the tasklet list, it has its leaf_p set to 0x1 so that
it cannot accidentally be confused as not being in the queue.
Pure tasklets are identifiable by their nice value of -32768 (which is
normally not possible).
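The shape of the new struct is roughly the following (the common part must
stay first so that a tasklet can be cast to a task; exact fields may
differ):

    struct tasklet {
        /* common part shared with struct task */
        unsigned short state;
        short nice;          /* -32768 identifies a pure tasklet */
        unsigned int calls;
        struct task *(*process)(struct task *t, void *ctx, unsigned short state);
        void *context;
        /* tasklet-specific part */
        struct list list;
    };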
In preparation for thread-specific runqueues, change the task API so that
the callback takes 3 arguments: the task itself, the context, and the state;
these were previously retrieved from the task. This will allow these elements to
change atomically in the scheduler while the application uses the copied
value, and even to have NULL tasks later.
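The new callback signature thus becomes:

    struct task *(*process)(struct task *t, void *ctx, unsigned short state);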
The function hlua_ctx_resume() returns fewer text messages and more error
codes. These error codes allow the caller to return an appropriate
message to the user.
The polled_mask is only used in the pollers, and removing it from the
struct fdtab makes the struct fit in one 64-byte cacheline again on a
64-bit machine, so make it a separate array.
With the old model, any fd shared by multiple threads, such as listeners
or dns sockets, would only be updated on one thread, which could lead
to missed events or spurious wakeups.
To avoid this, add a global list for fds that are shared, using the same
implementation as the fd cache, and only remove entries from this list
when every thread has updated its poller.
[wt: this will need to be backported to 1.8 but differently so this patch
must not be backported as-is]
For large farms where servers are regularly added or removed, picking
a random server from the pool can ensure faster load transitions than
when using round-robin and less traffic surges on the newly added
servers than when using leastconn.
This commit introduces "balance random". It internally uses a random number
as the key to the consistent hashing mechanism, thus all features available
in consistent hashing such as weights and bounded load via hash-balance-
factor are usable. It is extremely convenient because one common concern
when using random is what happens when a server is hammered a bit too
much. Here that can trivially be avoided, like in the configuration below :
    backend bk0
        balance random
        hash-balance-factor 110
        server-template s 1-100 127.0.0.1:8000 check inter 1s
Note that while "balance random" internally relies on a hash algorithm,
it holds the same properties as round-robin and as such is compatible with
reusing an existing server connection with "option prefer-last-server".
In order to use arbitrary data in the CLI (multiple lines or group of words
that must be considered as a whole, for example), it is now possible to add a
payload to the commands. To do so, the first line needs to end with a special
pattern: <<\n. Everything that follows will be left untouched by the CLI parser
and will be passed to the commands parsers.
Per-command support will need to be added to take advantage of this
feature.
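Once a command supports it, a session could look like this (the command
name is purely hypothetical; an empty line terminates the payload):

    > hypothetical-command <<
    first line of the payload
    second line of the payload
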
Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
In addition to metrics about the time spent in the SPOE, the following
counters have been added:
* applets : number of SPOE applets.
* idles : number of idle applets.
* nb_sending : number of streams waiting to send data.
* nb_waiting : number of streams waiting for an ack.
* nb_processed : number of events/groups processed by the SPOE (from the
stream point of view).
* nb_errors : number of errors during the processing (from the stream point of
view).
Log messages have been updated to report these counters. The following
pattern has been added at the end of the log message:
... <idles>/<applets> <nb_sending>/<nb_waiting> <nb_error>/<nb_processed>
Now it is possible to configure a logger in a spoe-agent section using a "log"
line, as for a proxy. "no log", "log global" and "log <address> ..." syntaxes
are supported.
With "log global" line, the global list of loggers are copied into the proxy's
struct. The list coming from the default section is also copied when a frontend
or a backend section is parsed. So it is possible to have duplicate entries in
the proxy's list. For instance, with this following config, all messages will be
logged twice:
    global
        log 127.0.0.1 local0 debug
        daemon

    defaults
        mode http
        log global
        option httplog

    frontend front-http
        log global
        bind *:8888
        default_backend back-http

    backend back-http
        server www 127.0.0.1:8000