Commit Graph

10 Commits

Author SHA1 Message Date
Willy Tarreau 627505d36a MINOR: freq_ctr: add swrate_add_scaled() to work with large samples
Some samples representing time will cover more than one sample at once
if they are units of time per time. For this we'd need to have the
ability to loop over swrate_add() multiple times but that would be
inefficient. By developing the function elevated to power N, it's
visible that some coefficients quickly disappear and that those which
remain at the first order more or less compensate each other.

Thus a simplified version of this function was added to provide a single
value for a given number of samples. Tests with multiple values, window
sizes and sample sizes have shown that it is possible to make it remain
surprisingly accurate (typical error < 0.2% over various large window
and sample sizes, even samples representing up to 1/4 of the window).
2018-10-22 08:13:57 +02:00
Emeric Brun 6e0128630b BUG/MAJOR: threads/freq_ctr: fix lock on freq counters.
The wrong bit was set to keep the lock on freq counter update. And the read
functions were re-worked to use volatile.

Moreover, when a freq counter is updated, it is now rotated only if the current
counter is in the past (now.tv_sec > ctr->curr_sec). It is important with
threads because the current time (now) is thread-local. So, rounded to the
second, the time may vary by more or less 1 second. So a freq counter rotated by
one thread may be see 1 second in the future. In this case, it is updated but
not rotated.
2017-10-31 13:58:33 +01:00
Christopher Faulet 94b712337d MEDIUM: threads/freq_ctr: Make the frequency counters thread-safe
When a frequency counter must be updated, we use the curr_sec/curr_tick fields
as a lock, by setting the MSB to 1 in a compare-and-swap to lock and by reseting
it to unlock. And when we need to read it, we loop until the counter is
unlocked. This way, the frequency counters are thread-safe without any external
lock. It is important to avoid increasing the size of many structures (global,
proxy, server, stick_table).
2017-10-31 13:58:32 +01:00
Christopher Faulet de2075fd21 MINOR: freq_ctr: Return the new value after an update
This will ease threads support integration.
2017-09-05 11:55:07 +02:00
Willy Tarreau 3758581e19 BUG/MINOR: freq-ctr: make swrate_add() support larger values
Reinhard Vicinus reported that the reported average response times cannot
be larger than 16s due to the double multiply being performed by
swrate_add() which causes an overflow very quickly. Indeed, with N=512,
the highest average value is 16448.

One solution proposed by Reinhard is to turn to long long, but this
involves 64x64 multiplies and 64->32 divides, which are extremely
expensive on 32-bit platforms.

There is in fact another way to avoid the overflow without using larger
integers, it consists in avoiding the multiply using the fact that
x*(n-1)/N = x-(x/N).

Now it becomes possible to store average values as large as 8.4 millions,
which is around 2h18mn.

Interestingly, this improvement also makes the code cheaper to execute
both on 32 and on 64 bit platforms :

Before :

00000000 <swrate_add>:
   0:   8b 54 24 04             mov    0x4(%esp),%edx
   4:   8b 0a                   mov    (%edx),%ecx
   6:   89 c8                   mov    %ecx,%eax
   8:   c1 e0 09                shl    $0x9,%eax
   b:   29 c8                   sub    %ecx,%eax
   d:   8b 4c 24 0c             mov    0xc(%esp),%ecx
  11:   c1 e8 09                shr    $0x9,%eax
  14:   01 c8                   add    %ecx,%eax
  16:   89 02                   mov    %eax,(%edx)

After :

00000020 <swrate_add>:
  20:   8b 4c 24 04             mov    0x4(%esp),%ecx
  24:   8b 44 24 0c             mov    0xc(%esp),%eax
  28:   8b 11                   mov    (%ecx),%edx
  2a:   01 d0                   add    %edx,%eax
  2c:   81 c2 ff 01 00 00       add    $0x1ff,%edx
  32:   c1 ea 09                shr    $0x9,%edx
  35:   29 d0                   sub    %edx,%eax
  37:   89 01                   mov    %eax,(%ecx)

This fix may be backported to 1.6.
2016-11-25 11:55:10 +01:00
Willy Tarreau 2438f2b984 MINOR: freq_ctr: introduce a new averaging method
While the current functions report average event counts per period, we are
also interested in average values per event. For this we use a different
method. The principle is to rely on a long tail which sums the new value
with a fraction of the previous value, resulting in a sliding window of
infinite length depending on the precision we're interested in.

The idea is that we always keep (N-1)/N of the sum and add the new sampled
value. The sum over N values can be computed with a simple program for a
constant value 1 at each iteration :

    N
  ,---
   \       N - 1              e - 1
    >  ( --------- )^x ~= N * -----
   /         N                  e
  '---
  x = 1

Note: I'm not sure how to demonstrate this but at least this is easily
verified with a simple program, the sum equals N * 0.632120 for any N
moderately large (tens to hundreds).

Inserting a constant sample value V here simply results in :

   sum = V * N * (e - 1) / e

But we don't want to integrate over a small period, but infinitely. Let's
cut the infinity in P periods of N values. Each period M is exactly the same
as period M-1 with a factor of ((N-1)/N)^N applied. A test shows that given a
large N :

     N - 1           1
  ( ------- )^N ~=  ---
       N             e

Our sum is now a sum of each factor times  :

   N*P                                     P
  ,---                                   ,---
   \         N - 1               e - 1    \     1
    >  v ( --------- )^x ~= VN * -----  *  >   ---
   /           N                   e      /    e^x
  '---                                   '---
  x = 1                                  x = 0

For P "large enough", in tests we get this :

    P
  ,---
   \     1        e
    >   --- ~=  -----
   /    e^x     e - 1
  '---
  x = 0

This simplifies the sum above :

   N*P
  ,---
   \         N - 1
    >  v ( --------- )^x = VN
   /           N
  '---
  x = 1

So basically by summing values and applying the last result an (N-1)/N factor
we just get N times the values over the long term, so we can recover the
constant value V by dividing by N.

A value added at the entry of the sliding window of N values will thus be
reduced to 1/e or 36.7% after N terms have been added. After a second batch,
it will only be 1/e^2, or 13.5%, and so on. So practically speaking, each
old period of N values represents only a quickly fading ratio of the global
sum :

  period    ratio
    1       36.7%
    2       13.5%
    3       4.98%
    4       1.83%
    5       0.67%
    6       0.25%
    7       0.09%
    8       0.033%
    9       0.012%
   10       0.0045%

So after 10N samples, the initial value has already faded out by a factor of
22026, which is quite fast. If the sliding window is 1024 samples wide, it
means that a sample will only count for 1/22k of its initial value after 10k
samples went after it, which results in half of the value it would represent
using an arithmetic mean. The benefit of this method is that it's very cheap
in terms of computations when N is a power of two. This is very well suited
to record response times as large values will fade out faster than with an
arithmetic mean and will depend on sample count and not time.

Demonstrating all the above assumptions with maths instead of a program is
left as an exercise for the reader.
2014-06-17 17:15:51 +02:00
Willy Tarreau 2970b0bedf [MINOR] freq_ctr: add new types and functions for periods different from 1s
Some freq counters will have to work on periods different from 1 second.
The original freq counters rely on the period to be exactly one second.
The new ones (freq_ctr_period) let the user define the period in ticks,
and all computations are operated over that period. When reading a value,
it indicates the amount of events over that period too.
2010-08-10 14:01:09 +02:00
Willy Tarreau 78ff5d0a9e [MINOR] include time.h from freq_ctr.h as is uses "now". 2009-10-01 11:05:26 +02:00
Willy Tarreau 79584225e5 [OPTIM] rate-limit: cleaner behaviour on low rates and reduce consumption
The rate-limit was applied to the smoothed value which does a special
case for frequencies below 2 events per period. This caused irregular
limitations when set to 1 session per second.

The proper way to handle this is to compute the number of remaining
events that can occur without reaching the limit. This is what has
been added. It also has the benefit that the frequency calculation
is now done once when entering event_accept(), before the accept()
loop, and not once per accept() loop anymore, thus saving a few CPU
cycles during very high loads.

With this fix, rate limits of 1/s are perfectly respected.
2009-03-06 09:18:27 +01:00
Willy Tarreau 7f062c4193 [MEDIUM] measure and report session rate on frontend, backends and servers
With this change, all frontends, backends, and servers maintain a session
counter and a timer to compute a session rate over the last second. This
value will be very useful because it varies instantly and can be used to
check thresholds. This value is also reported in the stats in a new "rate"
column.
2009-03-05 18:43:00 +01:00