Commit Graph

20 Commits

Author SHA1 Message Date
Willy Tarreau eab777c32e BUG/MINOR: time: frequency counters are not totally accurate
When a frontend is rate-limited to 1000 connections per second, the
effective rate measured from the client is 999/s, and connections
experience an average response time of 99.5 ms with a standard
deviation of 2 ms.

The reason for this inaccuracy is that when computing frequency
counters, we use one part of the previous value proportional to the
number of milliseconds remaining in the current second. But even the
last millisecond still uses a part of the past value, which is wrong :
since we have a 1ms resolution, the last millisecond must be dedicated
only to filling the current second.

So we slightly adjust the algorithm to use 999/1000 of the past value
during the first millisecond, and 0/1000 of the past value during the
last millisecond.  We also slightly improve the computation by computing
the remaining time instead of the current time in tv_update_date(), so
that we don't have to negate the value in each frequency counter.

Now with the fix, the connection rate measured by both the client and
haproxy is a steady 1000/s, the average response time measured is 99.2ms
and more importantly, the standard deviation has been divided by 3 to
0.6 millisecond.

This fix should also be backported to 1.4 which has the same issue.
2012-12-29 21:50:07 +01:00
William Lallemand 421f5b5882 MINOR: Date and time fonctions that don't use snprintf
Also move human_time() to standard.c since it's not related to
timeval calculations.
2012-02-09 17:03:28 +01:00
Willy Tarreau 45a1251515 [MEDIUM] poll: add a measurement of idle vs work time
We now measure the work and idle times in order to report the idle
time in the stats. It's expected that we'll be able to use it at
other places later.
2011-09-10 18:01:41 +02:00
Willy Tarreau 755905857a [MINOR] add curr_sec_ms and curr_sec_ms_scaled for current second.
Several algorithms will need to know the millisecond value within
the current second. Instead of doing a divide every time it is needed,
it's better to compute it when it changes, which is when now and now_ms
are recomputed.

curr_sec_ms_scaled is the same multiplied by 2^32/1000, which will be
useful to compute some ratios based on the position within last second.
2009-03-05 16:56:16 +01:00
Willy Tarreau 776cd87e32 [MINOR] time: add __usec_to_1024th to convert usecs to 1024th of second
This function performs a fast conversion from usec to 1024th of a second,
and will be useful for many fast sub-second computations.
2009-03-05 00:34:01 +01:00
Willy Tarreau e6313a37d6 [MINOR] introduce now_ms, the current date in milliseconds
This new time value will be used to compute timeouts and wait queue
positions. The operation is made once for all when time is retrieved.
A future improvement might consist in having it in ticks of 1/1024
second and to convert all timeouts into ticks.
2008-06-29 13:47:25 +02:00
Willy Tarreau b0b37bcd65 [MEDIUM] further improve monotonic clock by check forward jumps
The first implementation of the monotonic clock did not verify
forward jumps. The consequence is that a fast changing time may
expire a lot of tasks. While it does seem minor, in fact it is
problematic because most machines which boot with a wrong date
are in the past and suddenly see their time jump by several
years in the future.

The solution is to check if we spent more apparent time in
a poller than allowed (with a margin applied). The margin
is currently set to 1000 ms. It should be large enough for
any poll() to complete.

Tests with randomly jumping clock show that the result is quite
accurate (error less than 1 second at every change of more than
one second).
2008-06-23 14:00:57 +02:00
Willy Tarreau b7f694f20e [MEDIUM] implement a monotonic internal clock
If the system date is set backwards while haproxy is running,
some scheduled events are delayed by the amount of time the
clock went backwards. This is particularly problematic on
systems where the date is set at boot, because it seldom
happens that health-checks do not get sent for a few hours.

Before switching to use clock_gettime() on systems which
provide it, we can at least ensure that the clock is not
going backwards and maintain two clocks : the "date" which
represents what the user wants to see (mostly for logs),
and an internal date stored in "now", used for scheduled
events.
2008-06-22 17:18:02 +02:00
Willy Tarreau c8f24f8ec1 [BUILD] fix 2 minor issues on AIX
AIX does not know about MSG_DONTWAIT. Fortunately, nearly all sockets
are already set to O_NONBLOCK, so it's not even required to change the
code.  It was only necessary to add this fcntl to the log socket which
lacked it.  The MSG_DONTWAIT value has been defined to zero when unset
in order to make the code cleaner and more portable.

Also, on AIX, "hz" is defined, which causes a problem with one function
parameter in time.c. It's enough to rename the parameter there. Last,
fix a missing #include <string.h> in proxy.c.
2007-11-30 18:38:35 +01:00
Krzysztof Oledzki 85130941e7 [MEDIUM] stats: report server and backend cumulated downtime
Hello,

This patch implements new statistics for SLA calculation by adding new
field 'Dwntime' with total down time since restart (both HTTP/CSV) and
extending status field (HTTP) or inserting a new one (CSV) with time
showing how long each server/backend is in a current state. Additionaly,
down transations are also calculated and displayed for backends, so it is
possible to know how many times selected backend was down, generating "No
server is available to handle this request." error.

New information are presentetd in two different ways:
   - for HTTP: a "human redable form", one of "100000d 23h", "23h 59m" or
      "59m 59s"
   - for CSV: seconds

I believe that seconds resolution is enough.

As there are more columns in the status page I decided to shrink some
names to make more space:
   - Weight -> Wght
   - Check -> Chk
   - Down -> Dwn

Making described changes I also made some improvements and fixed some
small bugs:
   - don't increment s->health above 's->rise + s->fall - 1'. Previously it
     was incremented an then (re)set to 's->rise + s->fall - 1'.
   - do not set server down if it is down already
   - do not set server up if it is up already
   - fix colspan in multiple places (mostly introduced by my previous patch)
   - add missing "status" header to CSV
   - fix order of retries/redispatches in server (CSV)
   - s/Tthen/Then/
   - s/server/backend/ in DATA_ST_PX_BE (dumpstats.c)

Changes from previous version:
  - deal with negative time intervales
  - don't relay on s->state (SRV_RUNNING)
  - little reworked human_time + compacted format (no spaces). If needed it
    can be used in the future for other purposes by optionally making "cnt"
    as an argument
  - leave set_server_down mostly unchanged
  - only little reworked "process_chk: 9"
  - additional fields in CSV are appended to the rigth
  - fix "SEC" macro
  - named arguments (human_time, be_downtime, srv_downtime)

Hope it is OK. If there are only cosmetic changes needed please fill free
to correct it, however if there are some bigger changes required I would
like to discuss it first or at last to know what exactly was changed
especially since I already put this patch into my production server. :)

Thank you,

Best regards,

 				Krzysztof Oledzki
2007-10-22 21:36:23 +02:00
Willy Tarreau 0481c20e66 [MINOR] add new tv_* functions
The most useful, tv_add_ifset only adds the increment if it is set. It
is designed for use in expiration computation.
2007-05-13 16:03:27 +02:00
Willy Tarreau 49fa3a1453 [MINOR] time.h: added a few tv_* functions to manipulate timevals
A few more tv_* functions will be needed to switch from a millisecond-based
scheduler to a struct timeval-based one.
2007-05-12 22:29:44 +02:00
Willy Tarreau 42aae5c7cf [MEDIUM] many cleanups in the time functions
Now, functions whose name begins with '__tv_' are inlined. Also,
'tv_ms' is used as a prefix for functions using milliseconds.
2007-04-29 17:43:56 +02:00
Willy Tarreau 8d7d1497e0 [MEDIUM] implement and use tv_cmp2_le instead of tv_cmp2_ms
tv_cmp2_ms handles multiple combinations of tv1 and tv2, but only
one form is used: (tv1 <= tv2). So it is overkill to use it everywhere.
A new function designed to do exactly this has been written for that
purpose: tv_cmp2_le. Also, removed old unused tv_* functions.
2007-04-29 13:44:43 +02:00
Willy Tarreau a6a6a93e56 [MAJOR] changed TV_ETERNITY to ~0 instead of 0
The fact that TV_ETERNITY was 0 was very awkward because it
required that comparison functions handled the special case.
Now it is ~0 and all comparisons are performed on unsigned
values, so that it is naturally greater than any other value.

A performance gain of about 2-5% has been noticed.
2007-04-29 13:44:24 +02:00
Willy Tarreau 5e8f066961 [MINOR] slightly optimize time calculation for rbtree
The new rbtree-based scheduler makes heavy use of tv_cmp2(), and
this function becomes a huge CPU eater. Refine it a little bit in
order to slightly reduce CPU usage.
2007-02-12 00:59:08 +01:00
Willy Tarreau fb278677e2 [MEDIUM] use regparm on a few tv_* functions
Some of the tv_* functions are called very often. Passing their
arguments as registers is quite faster. This can be disabled
by setting CONFIG_HAP_DISABLE_REGPARM.
2006-10-15 15:38:50 +02:00
Willy Tarreau b17916e89b [CLEANUP] add a few "const char *" where appropriate
As suggested by Markus Elfring, a few "const char *" have replaced
some "char *" declarations where a function is not expected to
modify a value. It does not change the code but it helps detecting
coding errors.
2006-10-15 15:17:57 +02:00
Willy Tarreau e3ba5f0aaa [CLEANUP] included common/version.h everywhere 2006-06-29 18:54:54 +02:00
Willy Tarreau 2dd0d4799e [CLEANUP] renamed include/haproxy to include/common 2006-06-29 17:53:05 +02:00