As found by Thierry Fournier, if a task manages to kill another one and
that other task is the next one in the run queue, anything can happen,
including a crash, because the scheduler restarts from the saved next
task. For now, there is no such concept of a task killing another one,
but with Lua it will come.
A solution consists in always performing the lookup of the first task in
the scheduler's loop, but it's expensive and costs around 2% of the
performance.
Another solution consists in keeping a global next run queue node and
ensuring that when this task gets removed, it updates this pointer to
the next one. This makes it possible to simplify the code a bit and, in
the end, to slightly increase performance (0.3-0.5%). The mechanism
might still
be usable if we later migrate to a multi-threaded scheduler.
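As a rough illustration of that second approach (a minimal sketch with
assumed names: rq_next, the task's rq node and __task_unlink_rq() are
placeholders here, not the exact code):

    /* Keep a global pointer to the next run queue node to visit, and make
     * sure that removing a task never leaves this pointer dangling.
     */
    static struct eb32_node *rq_next;        /* cached "next" node, may be NULL */

    static inline void __task_unlink_rq(struct task *t)
    {
        if (rq_next == &t->rq)
            rq_next = eb32_next(rq_next);    /* advance before the node vanishes */
        eb32_delete(&t->rq);
    }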
All files referencing the previous ebtree code were changed to point
to the new one in the ebtree directory. A makefile variable (EBTREE_DIR)
is also available to use files from another directory.
The ability to build the libebtree library temporarily remains disabled
because it can have an impact on some existing toolchains and does not
appear worth it in the medium term if we add support for multi-criteria
stickiness for instance.
I noticed that in __eb32_insert, if the tree is empty
(root->b[EB_LEFT] == NULL), node.bit is not defined.
However, in __task_queue there are checks:
- if (last_timer->node.bit < 0)
- if (task->wq.node.bit < last_timer->node.bit)
which might rely upon an undefined value.
This is how I see it:
1. We insert an eb32_node in an empty wait queue tree for a task (called
by process_runnable_tasks()):
Inserting into empty wait queue &task->wq = 0x72a87c8, last_timer
pointer: (nil)
2. Then we set the last timer to the same address:
Setting last_timer: (nil) to: 0x72a87c8
3. We get a new task to be inserted in the queue (again called by
process_runnable_tasks()), before __task_unlink_wq() is called for
the previous task.
4. At this point we still have last_timer set to 0x72a87c8, but since
it was inserted in an empty tree, its node.bit was never set, so the
checks above compare against an undefined value.
The bug has no effect right now because the check for equality is still
made, so the next timer will still be queued at the right place anyway,
without any possible side-effect. But it's a pending bug waiting for a
small change somewhere to strike.
Iliya Polihronov
Cristian Ditoiu reported a major regression when testing 1.3.19 at
transfer.ro. It would crash within a few minutes while 1.3.15.10
was OK. He offered to help so we could run gdb and debug the crash
live. We finally found that the crash was the result of a regression
introduced by recent fix 814c978fb6
(task: fix possible timer drift after update), which made it possible
for a tree walk to start from a detached task when that task had its
expiration disabled due to a missing timeout.
The trivial fix below has been extensively tested and confirmed not
to crash anymore.
Special thanks to Cristian who spontaneously provided a lot of help
and trust to debug this issue which at first glance looked impossible
after reading the code and traces, but took less than an hour to spot
and fix when caught live in gdb! That's really appreciated!
When the scheduler detected that a task was misplaced in the timer
queue, it used to requeue it at the right place. Unfortunately, it did
not check whether it would still call that task from its new place.
This resulted in some tasks not getting called on timeout once in
a while, causing a minor drift for repetitive timers. This effect
was only observable with slow health checks and without any activity
because no other task would cause the scheduler to be immediately
called again.
In practice, it does not affect any real-world configuration, but
it's still better to fix it.
It's sometimes useful at least for statistics to keep a task count.
It's easy to do by forcing the rare task creators to always use the
same functions to create/destroy a task.
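A minimal sketch of the idea, assuming a global counter and the usual
pool allocator (the pool name and exact prototypes are illustrative):

    unsigned int nb_tasks = 0;

    static inline struct task *task_new(void)
    {
        struct task *t = pool_alloc2(pool2_task);
        if (t)
            nb_tasks++;                 /* count every living task */
        return t;
    }

    static inline void task_free(struct task *t)
    {
        pool_free2(pool2_task, t);
        nb_tasks--;
    }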
If a task wants to stay in the run queue, it is possible. It just
needs to wake itself up. We just want to ensure that a reniced
task will be processed at the right instant.
The top of a duplicate tree is not where bit == -1 but at the most
negative bit. This was causing tasks to be queued in reverse order
within duplicates. While this is not dramatic, it's incorrect and
might lead to longer than expected duplicate depths under some
circumstances.
When there are niced tasks, we would only process #tasks/4 per
turn, without taking care to still run the tasks when #tasks was below
4, leaving those tasks waiting for a few other tasks to push them over
the threshold. The fix simply consists in processing (#tasks+3)/4 per
turn.
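In code form, the change amounts to something like this (variable names
assumed):

    max_processed = run_queue;
    if (niced_tasks)
        /* (x+3)/4 rounds up, so 1 to 3 queued tasks still get one slot */
        max_processed = (max_processed + 3) / 4;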
Since we're now able to search from a precise expiration date in
the timer tree using ebtree 4.1, we don't need to maintain 4 trees
anymore. Not only does this simplify the code a lot, but it also
ensures that we can always look 24 days back and ahead, which
doubles the range of the previous scheduler. Indeed, while still
based on absolute values, the timer tree is now effectively relative
to <now>, since we can always start the search at <now> minus 2^31
milliseconds.
The run queue uses the exact same principle and is now simpler and
a bit faster to process. With these changes alone, an overall 0.5%
performance gain was observed.
Tests were performed on the few wrapping cases and everything works
as expected.
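The look-back principle can be sketched as follows (TIMER_LOOK_BACK and
the surrounding lines are illustrative assumptions, not the exact
implementation):

    #define TIMER_LOOK_BACK  (1U << 31)   /* 2^31 ms, about 24.8 days */

    struct eb32_node *eb;

    /* start scanning 2^31 ms before <now>; unsigned wrapping makes this
     * work wherever <now> sits in the 32-bit timer space.
     */
    eb = eb32_lookup_ge(&timers, now_ms - TIMER_LOOK_BACK);
    if (!eb)
        eb = eb32_first(&timers);   /* we wrapped past the highest key */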
Most of the time, task_queue() will immediately return. By extracting
the preliminary checks and putting them in an inline function, we can
significantly reduce the number of calls to the function itself, and
most of the tests can be optimized away due to the caller's context.
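The idea roughly looks like this (a simplified sketch; the real checks
in the tree may differ):

    static inline void task_queue(struct task *task)
    {
        /* fast path: nothing to queue when no expiration date is set */
        if (!tick_isset(task->expire))
            return;
        __task_queue(task);   /* slow path: real insertion in the wait queue */
    }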
Another minor improvement in process_runnable_tasks() consisted in
taking advantage of the processor's branch prediction unit by making
a special case of the process_session() callback, which is by far the
most common one.
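That special case boils down to something like this (sketch only):

    if (likely(t->process == process_session))
        t = process_session(t);   /* direct call, easy for the predictor */
    else
        t = t->process(t);        /* generic indirect call for other tasks */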
All this improved performance by about 1%, mainly during the call
from process_runnable_tasks().
Timers are unsigned and used as tree positions. Ticks are signed and
used as absolute dates within the current time frame. While the two are
normally equal (except zero), it's important not to confuse them in
the code as they are not interchangeable.
We add two inline functions to turn each one into the other.
The comments have also been moved to the proper location, as it was
not easy to understand what was a tick and what was a timer unit.
All the task callbacks had to requeue the task themselves and update
a global timeout. This was not convenient at all. Now the API has been
simplified: the task callbacks only have to update their expire timer,
and return either a pointer to the task or NULL if the task has been
deleted. The scheduler will take care of requeuing the task at the
proper place in the wait queue.
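Under the new API, a callback therefore looks roughly like the sketch
below (helper names such as task_delete(), task_free(), tick_add() and
MS_TO_TICKS() are assumptions here and may differ; the condition is
hypothetical):

    struct task *process_example(struct task *t)
    {
        if (work_is_finished(t)) {     /* hypothetical termination test */
            task_delete(t);
            task_free(t);
            return NULL;               /* tell the scheduler the task is gone */
        }
        /* only update the expiration date; the scheduler requeues the task */
        t->expire = tick_add(now_ms, MS_TO_TICKS(1000));
        return t;
    }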
We don't need to remove then add tasks in the wait queue every time we
update a timeout. We only need to do that when the new timeout is earlier
than the previous one. We can rely on wake_expired_tasks() to perform the
proper checks and bounce the misplaced tasks in the rare case where this
happens. The motivation behind this is that we very rarely hit timeouts,
so we save a lot of CPU cycles by moving the tasks very rarely. This now
means we can also find tasks with expiration date set to eternity in the
queue, and that is not a problem.
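In other words, the queuing helper boils down to a comparison like the
one below (a sketch; the helper used to test whether the task is already
queued is an assumption):

    /* requeue only when the new date is strictly earlier than the queued
     * one; later or equal dates are left for wake_expired_tasks() to
     * bounce in the rare case where they actually fire.
     */
    if (!task_in_wq(task) || tick_is_lt(task->expire, task->wq.key))
        __task_queue(task);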
In many situations, we wake a task on an I/O event, then queue it
exactly where it was. This is a real waste because we delete/insert
tasks into the wait queue for nothing. The only reason for this is
that there was only one tree node in the task struct.
By adding another tree node, we can have one tree for the timers
(wait queue) and one tree for the priority (run queue). That way,
we can have a task both in the run queue and wait queue at the
same time. The wait queue now really holds timers, which is what
it was designed for.
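The relevant part of struct task then looks like this (field names as
used in these messages, other members omitted and types simplified):

    struct task {
        struct eb32_node wq;     /* position in the wait queue (timer) */
        struct eb32_node rq;     /* position in the run queue (priority) */
        int state;               /* TASK_* flags */
        int expire;              /* next expiration date, as a tick */
        struct task *(*process)(struct task *t);  /* the callback */
        void *context;           /* opaque data for the callback */
        int nice;                /* scheduling bonus/penalty */
    };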
The net gain is at least 1 delete/insert cycle per session, and up
to 2-3 depending on the workload, since we save one cycle each time
the expiration date is not changed during a wake up.
A bug was introduced with the ebtree-based scheduler. It seldom causes
some timeouts to last longer than required when they hit an expiration
date which is the same as the last queued date and which is also part
of a duplicate tree without being at the top of that tree. In this
case, the task will not be expired until after the duplicate tree has
been flushed.
It is easier to reproduce by setting a very short client timeout (1s)
and sending connections and waiting for them to expire with the 408
status. Then in parallel, inject at about 1kh/s. The bug causes the
connections to sometimes wait longer than 1s before timing out.
The cause was the use of eb_insert_dup() on wrong nodes, as this
function is designed to work only on the top of the dup tree. The
solution consists in updating last_timer only when its bit is -1,
and using it only if its bit is still -1 (top of a dup tree).
The fix has not reduced performance because it only affects the case
where this bug could fire, which is extremely rare.
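A condensed sketch of the fixed fast path inside __task_queue() (the
tree root name is assumed and a few safety checks are omitted):

    /* only trust last_timer while it is the top of a dup tree (bit == -1) */
    if (last_timer && last_timer->node.bit == -1 &&
        last_timer->key == task->wq.key) {
        eb_insert_dup(&last_timer->node, &task->wq.node);
        return;
    }
    eb32_insert(&timers, &task->wq);
    if (task->wq.node.bit == -1)
        last_timer = &task->wq;   /* remember it only as a dup tree top */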
It is frequently necessary to know the reason why a task is
running. Some flags have been added
so that a task now knows if it got woken up due to I/O
completion, timeout, etc...
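For illustration, the flags are along these lines (names and values here
are indicative only):

    #define TASK_WOKEN_IO     0x01   /* woken up on I/O completion */
    #define TASK_WOKEN_TIMER  0x02   /* woken up because a timer expired */
    #define TASK_WOKEN_MSG    0x04   /* woken up by another task */

    /* the reason is passed at wakeup time and accumulated in the task state */
    task_wakeup(t, TASK_WOKEN_IO);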
A test has shown that more than 16% of the calls to task_wakeup()
could be avoided because the task is already woken up. So make it
inline and move the test to the inline part.
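A sketch of the resulting split (the state test and helper names are
assumptions):

    static inline struct task *task_wakeup(struct task *t, unsigned int f)
    {
        if (task_in_rq(t)) {          /* already queued to run: cheap exit */
            t->state |= f;            /* just record the additional reason */
            return t;
        }
        return __task_wakeup(t, f);   /* real insertion into the run queue */
    }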
It should be stated as a rule that a C file should never
include types/xxx.h when proto/xxx.h exists, as this gives
less exposure to declaration conflicts (one of which was
caught and fixed here) and avoids needlessly complicating
the file headers.
Only types/global.h, types/capture.h and types/polling.h
have been found to be valid includes from C files.
This is the first attempt at moving all internal parts from
using struct timeval to integer ticks. These provide simpler
and faster code due to simplified operations, and this change
also saved about 64 bytes per session.
A new header file has been added : include/common/ticks.h.
It is possible that some functions should finally not be inlined
because they're used quite a lot (eg: tick_first, tick_add_ifset
and tick_is_expired). More measurements are required in order to
decide whether this is interesting or not.
Some function and variable names are still subject to change for
better overall logic.
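As an example of how these helpers combine (signatures as currently
understood, and still subject to the renaming mentioned above):

    /* combine an optional timeout with the task's current expiration date */
    int next = tick_add_ifset(now_ms, timeout);       /* eternity if timeout unset */
    t->expire = tick_first(t->expire, next);          /* keep the earliest of both */
    int expired = tick_is_expired(t->expire, now_ms); /* has the date already passed? */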
When queuing a timer, it's very likely that an expiration date is
equal to that of the previously queued timer, due to time rounding
to the millisecond. Optimizing for this case provides a noticeable
1% performance boost.
The run queue scheduler now considers task->nice to queue a task and
to pick a task out of the queue. This makes it possible to boost the
access to statistics (both via HTTP and UNIX socket). The UNIX socket
receives twice as large a boost as the HTTP one because it is more
sensitive.
We now insert tasks in a certain sequence in the run queue.
The sorting key currently is the arrival order. It will now
be possible to apply a "nice" value to any task so that it
goes forwards or backwards in the run queue.
The calls to wake_expired_tasks() and maintain_proxies()
have been moved to the main run_poll_loop(), because they
had nothing to do in process_runnable_tasks().
The task_wakeup() function is not inlined anymore, as it was
only used at one place.
The qlist member of the task structure has been removed now.
The run_queue list has been replaced with an integer indicating
the number of tasks in the run queue.
The wait queues now rely on 4 trees for past, present and future
timers. The computations are cleaner and more reliable. The
wake_expired_tasks function has become simpler. Also, a bug
previously introduced in task_queue() by the first introduction
of eb_trees has been fixed (the eb->key was never updated).
The ultree code has been removed in favor of a simpler and
cleaner ebtree implementation. The eternity queue does not
need to exist anymore, and the pool_tree64 has been removed.
The ebtree node is stored in the task itself. The qlist list
header is still used by the run-queue, but will be able to
disappear once the run-queue uses ebtree too.
GCC4 is stupid (unbelievable news!).
When some code uses __builtin_expect(x != 0, 1), it really performs
the check of x != 0 then tests that the result is not zero! This is
a double check when only one was expected. Some performance drops
of 10% in the HTTP parser code have been observed due to this bug.
GCC 3.4 is fine though.
A solution consists in expecting that the tested value is 1. In
this case, it emits the correct code, but it still does not seem
optimal. Finally, the best solution is to ignore likely() and to
pray for the compiler to emit correct code. However, we still have
to fix unlikely() to remove the test there too, and to fix all
code which passed pointers there so that it passes integers instead.
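The resulting macros look approximately like this (treat it as a sketch
of the reasoning above rather than the exact code):

    #if __GNUC__ < 4
    #define likely(x)   (__builtin_expect((x) != 0, 1))
    #define unlikely(x) (__builtin_expect((x) != 0, 0))
    #else
    /* gcc 4: drop likely() entirely and remove the "!= 0" test from
     * unlikely(); callers are expected to pass integer expressions.
     */
    #define likely(x)   (x)
    #define unlikely(x) (__builtin_expect((unsigned long)(x), 0))
    #endif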
wake_expired_tasks() used a hint to avoid scanning the tree in most cases,
but it looks like the hint is more expensive than reaching the first node
in the tree. Disable it for now.
The timeout functions were difficult to manipulate because they were
rounding results to the millisecond. Thus, it was difficult to compare
and to check what expired and what did not. Also, the comparison
functions were heavy with multiplies and divides by 1000. Now, all
timeouts are stored in timevals, reducing the number of operations
for updates and leading to cleaner and more efficient code.
The time subsystem really needs fixing. It was still possible
that some tasks with an expiration date less than a millisecond in
the future caused a busy loop around poll() waiting for the
timeout to happen.
The fact that TV_ETERNITY was 0 was very awkward because it
required that comparison functions handled the special case.
Now it is ~0 and all comparisons are performed on unsigned
values, so that it is naturally greater than any other value.
A performance gain of about 2-5% has been noticed.
The rbtree-based wait queue consumes a lot of CPU. Use the ul2tree
instead. Lots of cleanups and code reorganizations made it possible
to reduce the task struct and simplify the code a bit.
The new rbtree-based scheduler makes heavy use of tv_cmp2(), and
this function becomes a huge CPU eater. Refine it a little bit in
order to slightly reduce CPU usage.
This patch from Sin Yu makes use of an rbtree for the wait queue,
which will solve the slowdown problem encountered when timeouts
are heterogeneous in the configuration. The next step will be to
turn maintain_proxies() into a per-proxy task so that we won't
have to scan them all after each poll() loop.
The files are now stored under:
- include/haproxy for the generic includes
- include/types.h for the structures needed within prototypes
- include/proto.h for function prototypes and inline functions
- src/*.c for the C files
Most include files are now covered by LGPL. A last move still needs
to be done to put inline functions under GPL and not LGPL.
Version has been set to 1.3.0 in the code but some control still
needs to be done before releasing.