haproxy

mirror of http://git.haproxy.org/git/haproxy.git/ synced 2025-02-26 15:40:32 +00:00

Author	SHA1	Message	Date
Tim Duesterhus	7fec021537	MEDIUM: proxy_protocol: Convert IPs to v6 when protocols are mixed http-request set-src possibly creates a situation where src and dst are from different address families. Convert both addresses to IPv6 to avoid a PROXY UNKNOWN. This patch should be backported to haproxy 1.8.	2018-07-30 11:23:30 +02:00
Willy Tarreau	c477b6fcc9	BUG/MEDIUM: threads: unbreak "bind" referencing an incorrect thread number The "process" directive on "bind" lines supports process references and thread references. No check is performed on the thread number validity, so that if a listener is only bound to non-existent threads, the traffic will never be processed. It easily happens when setting one bind line per thread with an incorrect (or reduced) thread count. No warning appears and some random connections are never served. It also happens when setting thread references with threads support disabled at build time. This patch makes use of the all_threads_mask variable to detect if some referenced threads don't exist, to emit a warning and fix this. This patch needs to be backported to 1.8, just like the previous one which it depends on (MINOR: threads: move "nbthread" parsing to hathreads.c).	2018-07-30 11:10:46 +02:00
Willy Tarreau	0ccd32285f	MINOR: threads: move "nbthread" parsing to hathreads.c The purpose is to make sure that all variables which directly depend on this nbthread argument are set at the right moment. For now only all_threads_mask needs to be set. It used to be set while calling thread_sync_init() which is called too late for certain checks. The same function handles threads and non-threads, which removes the need for some thread-specific knowledge from cfgparse.c.	2018-07-30 11:10:46 +02:00
Willy Tarreau	5e954e1f27	BUG/MEDIUM: threads: properly fix nbthreads == MAX_THREADS While moving Olivier's patch for nbthread==MAX_THREADS in commit `3e12304` ("BUG/MINOR: threads: Handle nbthread == MAX_THREADS.") to hathreads.c, I missed one place resulting in the computed thread mask being used as the thread count, which is worse than the initial bug. Let's fix it properly this time. This fix must be backported to 1.8 just like the other one.	2018-07-30 11:10:26 +02:00
Olivier Houchard	3e12304ae0	BUG/MINOR: threads: Handle nbthread == MAX_THREADS. If nbthread is MAX_THREADS, the shift operation needed to compute all_threads_mask fails in thread_sync_init(). Instead pass a number of threads to this function and let it compute the mask without overflowing. This should be backported to 1.8.	2018-07-27 17:18:22 +02:00
Willy Tarreau	85d9b84eb1	BUILD/MINOR: threads: unbreak build with threads disabled Depending on the optimization level, gcc may complain that wake_thread() uses an invalid array index for poller_wr_pipe[] when called from __task_wakeup(). Normally the condition to get there never happens, but it's simpler to ifdef out this part of the code which is only used to wake other threads up. No backport is needed, this was brought by the recent introduction of the ability to wake a sleeping thread.	2018-07-27 17:18:22 +02:00
Willy Tarreau	c786768dba	BUG/MINOR: config: stick-table is not supported in defaults section Thierry discovered that the following config crashes haproxy while parsing the config (it's probably the smallest crasher) : defaults stick-table type ip size 1M And indeed it does because it looks for the current proxy's name which it does not have as it's the default one. This affects all versions since 1.6. This fix must be backported to all versions back to 1.6.	2018-07-27 10:26:22 +02:00
Willy Tarreau	a2b5181e7a	BUG/MEDIUM: h2: prevent orphaned streams from blocking a connection forever Some h2 connections remaining in CLOSE_WAIT state forever have been reported for a while. Thanks to detailed captures provided by Milan Petruzelka, the sequence where this happens became clearer : 1) multiple streams compete for the mux and are queued in the send_list 2) at this point the mux has to emit a GOAWAY for any reason (for example because it received a bad message) 3) the streams are woken up, notified about the error 4) h2_detach() is called for each of them 5) the CS are detached from the H2S 6) since the streams are marked as blocked for some room, they are orphaned and nothing more is done on them. 7) at this point, any activity on the connection goes through h2_wake() which sees the connection in ERROR2 state, tries again to release the streams, cannot, and stops polling (thus even connection errors cannot be detected anymore). => from this point, no more events can be received on the connection, and the streams remain orphaned forever. This patch makes sure that we never return without doing anything once an error was met. It has to act both on the h2_detach() side (for h2 streams being detached after the error was emitted) and on the h2_wake() side (for errors reported after h2s have already been orphaned). Many thanks to Milan Petruzelka and Janusz Dziemidowicz for their awesome work on this issue, collecting traces and testing patches, and to Olivier Doucet for extra testing and confirming the fix. This fix must be backported to 1.8.	2018-07-27 09:55:14 +02:00
Emmanuel Hocdet	ebabd8768a	MINOR: ssl: BoringSSL matches OpenSSL 1.1.0 Since BoringSSL 3b2ff028, the API now correctly matches OpenSSL 1.1.0. The patch reverts part of haproxy `019f9b10`: "Fix BoringSSL call and openssl-compat.h/#define occordingly.". This will not break openssl/libressl compat.	2018-07-27 09:43:40 +02:00
Willy Tarreau	3ea2490b48	BUG/MEDIUM: threads/sync: use sched_yield when available There is a corner case with the sync point which can significantly degrade performance. The reason is that it forces all threads to busy spin there, and that if there are less CPUs available than threads, this busy activity from some threads will force others to wait longer in epoll() or to simply be scheduled out while doing something else, and will increase the time needed to reach the sync point. Given that the sync point is not expected to be stressed that much, better call sched_yield() while waiting there to release the CPU and offer it to waiting threads. On a simple test with 4 threads bound to two cores using "maxconn 1" on the server line, the performance was erratic before the recent scheduler changes (between 40 and 200 conn/s with hundreds of ms response time), and it jumped to 7200 with 12ms response time with this fix applied. It should be backported to 1.8 since 1.8 is affected as well.	2018-07-27 07:54:08 +02:00
Olivier Houchard	ecfe673f61	MINOR: threads/queue: Get rid of THREAD_WANT_SYNC in the queue code. Now that we can wake one thread sleeping in the poller, we don't have to use THREAD_WANT_SYNC any more. This gives a significant performance boost on highly contended accesses (servers with maxconn 1), showing a jump from 21k to 31k conn/s on a test involving 8 threads.	2018-07-26 20:55:02 +02:00
Olivier Houchard	79321b95a8	MINOR: pollers: Add a way to wake a thread sleeping in the poller. Add a new pipe, one per thread, so that we can write on it to wake a thread sleeping in a poller, and use it to wake threads supposed to take care of a task, if they are all sleeping.	2018-07-26 19:09:50 +02:00
Olivier Houchard	eba0c0b51d	MINOR: tasks: Make global_tasks_mask volatile. In order to make sure modifications are noticed by other threads when needed, make global_tasks_mask volatile.	2018-07-26 19:09:50 +02:00
Olivier Houchard	9b03c0c9a7	MINOR: tasks: Make active_tasks_mask volatile. To be sure we have the relevant information, make active_tasks_mask volatile.	2018-07-26 19:09:50 +02:00
Willy Tarreau	3201e4e428	MEDIUM: queue: get rid of the pendconn lock This lock was necessary to manipulate the pendconn element between concurrent places, but was causing great difficulties in the list walk by having to iterate over multiple entries instead of being able to safely pick the first one (in fact the first element was always the right one but the locking model was hard to prove). Here since we know we can always rely on the queue's locks, we take the queue's lock every time we need to modify the element. In practice it was already the case everywhere except in pendconn_dequeue() which only works on an element that was already detached. This function had to be protected against the risk of meeting an incompletely detached element (which could be unlinked but not yet assigned). By taking the queue lock around the LIST_ISEMPTY test, it's enough to ensure that a concurrent thread either didn't begin or had completed the operation. The true benefit really is in pendconn_process_next_strm() where we can again safely work with the first element of each queue. This will significantly simplify next updates to this code.	2018-07-26 17:32:51 +02:00
Willy Tarreau	7c6f8a2b0d	MINOR: queue: implement pendconn queue locking functions The new pendconn_queue_lock() and pendconn_queue_unlock() functions are made to make it more convenient to lock or unlock the pendconn queue either at the proxy or the server depending on pendconn->srv. This way it is possible to remove the open-coding of these locks at various places. These ones have been used in pendconn_unlink() and pendconn_add(), thus significantly simplifying the logic there.	2018-07-26 17:32:51 +02:00
Willy Tarreau	88930dd364	MINOR: queue: use a distinct variable for the assigned server and the queue The pendconn struct uses ->px and ->srv to designate where the element is queued. There is something confusing regarding threads though, because we have to lock the appropriate queue before inserting/removing elements, and this queue may only be determined by looking at ->srv (if it's not NULL it's the server, otherwise use the proxy). But pendconn_grab_from_px() and pendconn_process_next_strm() both assign this ->srv field, making it complicated to know what queue to lock before manipulating the element, which is exactly why we have the pendconn_lock in the first place. This commit introduces pendconn->target which is the target server that the two aforementioned functions will set when assigning the server. Thanks to this, the server pointer may always be relied on to determine what queue to use.	2018-07-26 17:32:51 +02:00
Willy Tarreau	c1a60d6218	MINOR: queue: make sure pendconn->strm->pend_pos is always valid pendconn_add() used to assign strm->pend_pos very late, after unlocking the queue, so that a watching thread could see a random value in pendconn->strm->pend_pos even while holding the lock on the element and the queue itself. While there's currently nothing wrong with this, it costs nothing to arrange it and will simplify code analysis later.	2018-07-26 17:32:51 +02:00
Willy Tarreau	6bdd05c0ef	DOC: queue: document the expected locking model for the server's queue The locking model is not trivial and is worth documenting to avoid seeing apparent bugs everywhere while they are not.	2018-07-26 17:32:51 +02:00
Willy Tarreau	d0ad4a87f0	MEDIUM: queue: make pendconn_free() work on the stream instead Now pendconn_free() takes a stream, checks that pend_pos is set, clears it, and uses pendconn_unlink() to complete the job. It's cleaner and centralizes all the bookkeeping work in pendconn_unlink() only and ensures that there's a single place where the stream's position in the queue is manipulated.	2018-07-26 17:32:51 +02:00
Willy Tarreau	9624faec86	MINOR: queue: centralize dequeuing code a bit better For now the pendconns may be dequeued at two places : - pendconn_unlink(), which operates on a locked queue - pendconn_free(), which operates on an unlocked queue and frees everything. Some changes are coming to the queue and we'll need to be able to be a bit stricter regarding the places where we dequeue to keep the accounting accurate. This first step renames the locked function __pendconn_unlink() as it's for use by those aware of it, and introduces a new general purpose pendconn_unlink() function which automatically grabs the necessary locks before calling the former, and pendconn_cond_unlink() which additionally checks the pointer and the presence in the queue.	2018-07-26 17:32:48 +02:00
Olivier Houchard	77551ee8a7	BUG/MEDIUM: tasks: make __task_unlink_rq responsible for the rqueue size. As __task_wakeup() is responsible for increasing rqueue_local[tid]/global_rqueue_size, make __task_unlink_rq responsible for decreasing it, as process_runnable_tasks() isn't the only one that removes tasks from runqueues.	2018-07-26 16:33:29 +02:00
Olivier Houchard	76e45181b2	MINOR: tasks: Add a flag that tells if we're in the global runqueue. Now that we have bits available in task->state, add a flag that tells if we're in the global runqueue or not.	2018-07-26 16:33:10 +02:00
Willy Tarreau	f0cea1ee3f	MINOR: tasks: extend the state bits from 8 to 16 and remove the reason By removing the reason code for the wakeup we can gain 8 extra bits to encode the task's state. The reason code was never used at all and is wrong by design since subsequent calls will OR this value anyway. Let's say it goodbye and leave the room for more precious bits. The woken bits were moved to the higher byte so that the most important bits can stay grouped together.	2018-07-26 16:13:00 +02:00
Willy Tarreau	ad8bd2467c	MINOR: signal: don't pass the signal number anymore as the wakeup reason This is never used and would even be wrong since the reasons are ORed so two signals would be turned into a third value, just like if any other reason was used at the same time.	2018-07-26 16:12:48 +02:00
Olivier Houchard	c4aac9effe	BUG/MEDIUM: tasks: Make sure there's no task left before considering inactive. We may remove the thread's bit in active_tasks_mask despite tasks for that thread still being present in the global runqueue. To fix that, introduce global_tasks_mask, and set the corresponding bits when we add a task to the runqueue.	2018-07-26 15:40:22 +02:00
Willy Tarreau	189ea856a7	BUG/MEDIUM: tasks: use atomic ops for active_tasks_mask We don't have the lock anymore so we need to protect it.	2018-07-26 15:16:43 +02:00
Olivier Houchard	e85ee7b663	BUG/MEDIUM: tasks: Decrement rqueue_size at the right time. We need to decrement rqueue_size when we remove a task from rqueue_local, not when we remove it from the task list, or we'd also decrement it for any tasklet that was never in the rqueue in the first place.	2018-07-26 15:00:58 +02:00
Willy Tarreau	9a77186cb0	BUG/MEDIUM: tasks: make sure we pick all tasks in the run queue Commit `09eeb76` ("BUG/MEDIUM: tasks: Don't forget to increase/decrease tasks_run_queue.") addressed a count issue in the run queue and uncovered another issue with the way the tasks are dequeued from the global run queue. The number of tasks to pick is computed using an integral divide, which results in up to nbthread-1 tasks never being run. The fix simply consists in getting rid of the divide and checking the task count in the loop. No backport is needed, this is 1.9-specific.	2018-07-26 14:24:46 +02:00
Olivier Houchard	306e653331	BUG/MINOR: servers: Don't make "server" in a frontend fatal. When parsing the configuration, if "server", "default-server" or "server-template" are found in a frontend, we first warn that it will be ignored, only to be considered a fatal error later. Be true to our word, and just ignore it. This should be backported to 1.8 and 1.7.	2018-07-24 17:13:54 +02:00
Willy Tarreau	055ba4f505	BUG/MEDIUM: stats: don't ask for more data as long as we're responding The stats applet is still a bit hackish. It uses the HTTP txn to parse the POST contents. Due to this it pretends not having parsed the request from the buffer so that the HTTP parser continues to work fine on these data. This comes with a side effect : the request lies pending in the channel's buffer, and because of this, stream_int_update_applet() always wakes the applet up. It's very visible when retrieving a large stats page over a slow link as haproxy eats 100% of the CPU waiting for the data to leave. While the proper long term solution definitely is to consume these data and parse the body from the applet, changing this is not suitable for a fix. What this patch does instead is to disable request polling as long as there are pending data in the response buffer. Given that for almost all cases, the applet remains busy sending data, this is at least enough to ensure that we don't wake up for the pending request data while we're waiting for the client to receive these data. Now a 5k backend stats page is dumped at 1% CPU over a 10 Mbps link instead of 100%, using 1500 epoll_wait() calls instead of 80000. Note that the previous fix (BUG/MEDIUM: stream-int: don't immediately enable reading when the buffer was reportedly full) is necessary for the effects of the fix to be noticed since both bugs have the exact same effect. This fix must be backported at least as far as 1.5.	2018-07-24 17:13:32 +02:00
Willy Tarreau	171d5f203a	BUG/MEDIUM: stream-int: don't immediately enable reading when the buffer was reportedly full There is a long-time issue which affects some applets, at least the stats applet. If a large stats page is read over a slow link, regularly the channel's buffer contains too many response data to allow another round of ci_putblk() to copy a new message. In this case the applet calls si_applet_cant_put() to mention that it failed to emit data into the channel's buffer, and wants to be called only once some room is made. The problem is that stream_int_update(), which is called from process_stream(), will clear this flag whenever it sees there's some spare room in the channel's buffer. It causes the applet to be woken again immediately. This is very visible when reading a large stats page over a slow link, because in this case haproxy will run at 100% CPU and strace shows mostly epoll_wait(0). It is very likely that some other applets like CLI, Lua, peers or SPOE have also been affected but that the effects were less noticeable because they were mixed with traffic. Ideally stream_int_update() should not touch these flags, but changing this would require a very careful auditing of all users. Instead here what we do is that we respect the flag if the channel still has output data. This way the flag will automatically disappear once the buffer is empty, and the applet function will be called only when input data remains, if at all. This patch alone is not enough to observe the behaviour change on the stats page because another bug takes over, addressed by the next patch (BUG/MEDIUM: stats: don't ask for more data as long as we're responding). When both are applied, dumping stats for 5k backends over a 10 Mbps link takes 1% CPU instead of 100%, with 1.5k epoll_wait() calls instead of 80k. This fix should be backported at least as far as 1.5.	2018-07-24 17:12:38 +02:00
Willy Tarreau	616ac81dec	MINOR: h2: add the error code and the max/last stream IDs to "show fd" This is intended to help debugging H2 in the field.	2018-07-24 14:12:42 +02:00
Willy Tarreau	7cc040cc74	DOC: add more design feedback on the new layering model Introduce the distinction between structured messages and raw data, and how to make them coexist in a buffer. This is still a design draft.	2018-07-23 17:29:37 +02:00
Willy Tarreau	842ed9b1cb	MEDIUM: h2: use the default conn_stream's receive function This removes h2_rcv_buf() now that the generic code can handle it fine.	2018-07-20 19:37:12 +02:00
Willy Tarreau	39d68508c3	MINOR: h2: make use of CS_FL_REOS to indicate that end of stream was seen This allows h2_rcv_buf() not to depend anymore on h2s at all and to become generic.	2018-07-20 19:35:14 +02:00
Willy Tarreau	2df65e7194	MEDIUM: h2: don't call data_cb->recv() anymore Now we simply call data_cb->wake() which will automatically perform the recv() call if required.	2018-07-20 19:31:36 +02:00
Willy Tarreau	2a761dcf0d	MEDIUM: h2: perform a single call to the data layer in demux() Instead of calling the data layer from each individual frame processing function, we now call it from demux. This requires to know the h2s that was created inside h2c_frt_handle_headers(), which is why the pointer is now returned. This results in a small performance boost from 58k to 60k POST requests/s compared to -master, thanks to half the number of si_cs_recv_cb() calls and 66% calls to si_cs_wake_cb(). It's interesting to note that all calls to data_cb->recv() are now always immediately followed by a call to data_cb->wake(). The next step should be to let the ->wake handler perform the recv() call itself. For this it will be useful to have some info on the CS to indicate whether or not it is ready to be read (ie: contains a non-empty input buffer).	2018-07-20 19:30:03 +02:00
Willy Tarreau	7999bfbfd3	MEDIUM: buffers: make b_xfer() automatically swap buffers when possible Whenever it's possible to avoid a copy, b_xfer() will simply swap the buffer's heads without touching the data. This has brought the performance back from 140 kH/s to 202 kH/s on the test case.	2018-07-20 19:21:43 +02:00
Willy Tarreau	a56a6def91	MEDIUM: h2: move headers and data frame decoding to their respective parsers Now we entirely process the input frame before transferring it above, so that h2_rcv_buf() doesn't have to "speak" h2 anymore.	2018-07-20 19:21:43 +02:00
Willy Tarreau	454b57b347	MEDIUM: h2: centralize transfer of decoded frames in h2_rcv_buf() We still call the parser but it should soon not be needed anymore. The decode functions don't need the buffer nor the max size anymore. They must also not touch the CS_FL_EOS or CS_FL_RCV_MORE flags either, so this is done within h2_rcv_buf() after transmission. The "flags" argument to h2_frt_decode_headers() and h2_frt_transfer_data() has been removed since it's not used anymore.	2018-07-20 19:21:43 +02:00
Willy Tarreau	d755ea6c7d	MEDIUM: h2: make h2_frt_transfer_data() copy via an intermediary buffer The purpose here is also to ensure we can split the lower from the top layers. The way the CS_FL_MSG_MORE flag is set was updated so that it's set or cleared upon exit depending on the buffer's remaining contents.	2018-07-20 19:21:43 +02:00
Willy Tarreau	937f760e1e	MEDIUM: h2: make h2_frt_decode_headers() use an intermediary buffer The purpose is to decode to a temporary buffer and then to copy this buffer to the caller. This double-copy definitely has an impact on performance, the test code goes down from 220k to 140k req/s, but this memcpy() will disappear soon. The test on CO_RFL_BUF_WET has become irrelevant now since we only use the cs' rxbuf, so we cannot be blocked by "output" data that has to be forwarded first. Thus instead we don't start until the rxbuf is empty (it will be drained from any input data once the stream processes it).	2018-07-20 19:21:43 +02:00
Willy Tarreau	0b559071dd	MINOR: h2: make each H2 stream support an intermediary input buffer The purpose is to decode to a temporary buffer and then to copy this buffer to the caller upon request to avoid having to process frames on the fly when called from the higher level. For now the buffer is only initialized on stream creation via cs_new() and allocated if the buffer_wait's callback is called.	2018-07-20 19:21:43 +02:00
Willy Tarreau	67b1e78f68	MEDIUM: stream-int: automatically call si_cs_recv_cb() if the cs has data on wake() If the cs has data pending or shutdown and the input channel is still waiting for reads, let's simply call the recv() function from the wake() callback. This will allow the lower layers to simply wake the upper one up without having to consider the recv() nor anything else.	2018-07-20 19:21:43 +02:00
Willy Tarreau	11c9aa424e	MEDIUM: conn_stream: add cs_recv() as a default rcv_buf() function This function is generic and is able to automatically transfer data from a conn_stream's rx buffer to the destination buffer. It does this automatically if the mux doesn't define another rcv_buf() function.	2018-07-20 19:21:43 +02:00
Willy Tarreau	5e1cc5ea83	MINOR: conn_stream: add an rx buffer to the conn_stream In order to reorganize the connection layers, recv() operations will need to be retryable and to support partial transfers. This requires an intermediary buffer to hold the data coming from the mux. After a few attempts, it turns out that this buffer is best placed inside the conn_stream itself. For now it's only set to buf_empty and it will be up to the caller to allocate it if required.	2018-07-20 19:21:43 +02:00
Willy Tarreau	a3f7efe009	MINOR: conn_stream: add a new CS_FL_REOS flag This flag indicates that the mux layer has already detected an end of stream which will become CS_FL_EOS during a recv() once the rx buffer is empty.	2018-07-20 19:21:43 +02:00
Willy Tarreau	9382cdd8e1	DOC: add some design notes about the new layering model This explains how streams and connection should interact.	2018-07-20 19:21:43 +02:00
Willy Tarreau	f148888d19	MINOR: buffers: add b_xfer() to transfer data between buffers Instead of open-coding buffer-to-buffer transfers using blocks, let's have a dedicated function for this. It also adjusts the buffer counts.	2018-07-20 19:21:43 +02:00
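
Commit 7fec021537 converts mixed-family src/dst addresses to IPv6 before emitting the PROXY header. The message does not say which IPv6 form is used; the sketch below assumes the standard IPv4-mapped form (`::ffff:a.b.c.d`), and `v4_to_v6_mapped()` is a hypothetical helper name, not a function from the haproxy tree.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper (not haproxy's own code): write the IPv4-mapped
 * IPv6 form ::ffff:a.b.c.d of `v4` into `v6`. */
static void v4_to_v6_mapped(const struct in_addr *v4, struct in6_addr *v6)
{
    assert(v4 != NULL && v6 != NULL);
    memset(v6, 0, sizeof(*v6));                /* bytes 0..9 stay zero */
    v6->s6_addr[10] = 0xff;                    /* bytes 10..11 are ff:ff */
    v6->s6_addr[11] = 0xff;
    memcpy(&v6->s6_addr[12], &v4->s_addr, 4);  /* s_addr is already in network order */
}
```

Once both endpoints are expressed in the same family, a sender can describe the connection as TCP6 instead of falling back to UNKNOWN.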
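
Commits 3e12304ae0 and c477b6fcc9 both revolve around all_threads_mask: the first avoids an overflowing shift when nbthread equals MAX_THREADS, the second uses the mask to spot "bind" lines that reference threads which do not exist. A minimal sketch of both ideas follows. It assumes one mask bit per thread in an `unsigned long`; the function names are illustrative, not haproxy's.

```c
#include <assert.h>
#include <limits.h>

/* Assumption for this sketch: one bit per thread in an unsigned long. */
#define MAX_THREADS ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Mask with the low `nbthread` bits set. The naive (1UL << nbthread) - 1
 * is undefined when nbthread == MAX_THREADS, so shift an all-ones word
 * right instead: the shift count then stays in 0..MAX_THREADS-1. */
static unsigned long all_threads_mask_for(int nbthread)
{
    assert(nbthread >= 1 && nbthread <= MAX_THREADS);
    return ~0UL >> (MAX_THREADS - nbthread);
}

/* Bits of `bind_mask` that designate threads absent from `all_mask`.
 * A non-zero result means the listener references missing threads. */
static unsigned long missing_threads(unsigned long bind_mask, unsigned long all_mask)
{
    return bind_mask & ~all_mask;
}
```

With 4 threads configured, a listener bound only to threads 5 and 6 yields a non-zero `missing_threads()` result, which is the kind of condition the commit describes warning about.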
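
Commit 79321b95a8 adds one pipe per thread so that a thread sleeping in the poller can be woken by a write. The sketch below shows the general self-pipe technique with poll(); it is not haproxy's implementation, and `struct waker` and its functions are hypothetical names.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical per-thread wake-up channel. */
struct waker {
    int rd; /* polled by the sleeping thread */
    int wr; /* written to by whoever wants to wake it */
};

static int waker_init(struct waker *w)
{
    int fds[2];

    assert(w != NULL);
    if (pipe(fds) < 0)
        return -1;
    /* Non-blocking on both ends: draining never stalls, and a full
     * pipe simply means a wake-up is already pending. */
    fcntl(fds[0], F_SETFL, fcntl(fds[0], F_GETFL) | O_NONBLOCK);
    fcntl(fds[1], F_SETFL, fcntl(fds[1], F_GETFL) | O_NONBLOCK);
    w->rd = fds[0];
    w->wr = fds[1];
    return 0;
}

/* Called from another thread: make the read end readable. */
static void waker_wake(const struct waker *w)
{
    char c = 0;
    ssize_t ret = write(w->wr, &c, 1);
    (void)ret; /* a failed write on a full pipe is harmless here */
}

/* Sleep up to timeout_ms; return 1 if woken through the pipe, else 0. */
static int waker_wait(const struct waker *w, int timeout_ms)
{
    struct pollfd pfd = { .fd = w->rd, .events = POLLIN };
    char buf[64];

    if (poll(&pfd, 1, timeout_ms) <= 0)
        return 0;
    while (read(w->rd, buf, sizeof(buf)) > 0)
        ; /* drain so the next poll() sleeps again */
    return 1;
}
```

In a real event loop the read end would sit in the same poll set as the sockets, so that a single wait covers both I/O and cross-thread wake-ups.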
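
Commit 9a77186cb0 notes that an integral divide can leave up to nbthread-1 tasks in the global run queue. The arithmetic can be illustrated with a simplified model, assuming each thread picks an equal share `queued / nbthread`; the function names are made up for this sketch.

```c
#include <assert.h>

/* Share of the global queue each thread would pick with an integer divide. */
static int picked_per_thread(int queued, int nbthread)
{
    assert(queued >= 0 && nbthread > 0);
    return queued / nbthread;
}

/* Tasks nobody picks: the remainder of the divide, at most nbthread - 1. */
static int stranded_tasks(int queued, int nbthread)
{
    return queued - picked_per_thread(queued, nbthread) * nbthread;
}
```

With 3 queued tasks and 4 threads, every share is 0 and all 3 tasks stay behind, which is why the fix drops the divide and checks the task count inside the loop.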
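
Commits f148888d19 and 7999bfbfd3 introduce b_xfer() and then let it swap buffers instead of copying when possible. The sketch below illustrates that idea on a simplified linear buffer. `struct sbuf` and `sbuf_xfer()` are hypothetical and much simpler than haproxy's buffers (no wrapping, both storage areas assumed allocated).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified linear buffer, for illustration only. */
struct sbuf {
    char  *area; /* storage */
    size_t size; /* capacity of area */
    size_t data; /* bytes in use, starting at area[0] */
};

/* Move up to `count` bytes from src to dst; return the number moved. */
static size_t sbuf_xfer(struct sbuf *dst, struct sbuf *src, size_t count)
{
    assert(dst && src && dst->area && src->area);

    if (count > src->data)
        count = src->data;

    /* Fast path: dst is empty and everything in src is wanted, so
     * exchange the two descriptors and leave the bytes where they are. */
    if (dst->data == 0 && count == src->data) {
        struct sbuf tmp = *dst;
        *dst = *src;
        *src = tmp;
        return count;
    }

    /* Slow path: copy what fits, then shift the remainder of src down. */
    size_t room = dst->size - dst->data;
    if (count > room)
        count = room;
    memcpy(dst->area + dst->data, src->area, count);
    memmove(src->area, src->area + count, src->data - count);
    dst->data += count;
    src->data -= count;
    return count;
}
```

The fast path costs a few pointer assignments regardless of payload size, which is consistent with the throughput recovery the commit message reports for its test case.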


7796 Commits