Commit Graph

49 Commits

Author SHA1 Message Date
Willy Tarreau
0a70688016 BUG/MINOR: halog: -ad/-ac report the correct number of output lines
There was a lines_out++ left from earlier code, causing each input
line to be counted as an output line.

This fix also affects 1.4 and should be backported.
2012-10-10 13:43:17 +02:00
Willy Tarreau
8a09b663a8 MINOR: halog: sort output by cookie code
It's sometimes useful to have the output sorted by cookie code to see
the ratios of NI vs VN for example. This is now possible with -cc.
2012-10-10 10:27:18 +02:00
Baptiste
61aaad06e8 CONTRIB: halog: sort URLs by avg bytes_read or total bytes_read
The patch attached to this mail brings ability to sort URLs by
averaged bytes read and total bytes read in HALog tool.
In most cases, bytes read is also the object size.
The purpose of this patch is to know which URL consume the most
bandwith, in average or in total.
It may be interesting as well to know the standard deviation (ecart
type in french) for some counters (like bytes_read).

The results:
- Sorting by average bytes read per URL:
./halog -uba <~/tmp/haproxy.log | column -t | head
2246 lines in, 302 lines out, 194 parsing errors
18    0    5101     283    5101   283    126573  2278327  /lib/exe/js.php
1     0    1        1      1      1      106734  106734   /wp-admin/images/screenshots/theme-customizer.png
2     0    2        1      2      1      106511  213022   /wp-admin/css/wp-admin.css
1     0    1        1      1      1      96698   96698    /wp-admin/images/screenshots/captions-1.png
1     0    1        1      1      1      73165   73165    /wp-admin/images/screenshots/flex-header-1.png
4     0    0        0      0      0      64832   259328   /cuisine/wp-content/plugins/stats/open-flash-chart.swf
1     0    0        0      0      0      48647   48647    /wp-admin/images/screenshots/flex-header-3.png
1     0    0        0      0      0      44046   44046    /wp-admin/images/screenshots/captions-2.png
1     0    1        1      1      1      38830   38830    /wp-admin/images/screenshots/flex-header-2.png

- Sorting by total bytes read per URL:
./halog -ubt <~/tmp/haproxy.log | column -t | head
2246 lines in, 302 lines out, 194 parsing errors
18    0    5101     283    5101   283    126573  2278327  /lib/exe/js.php
60    0    14387    239    14387  239    10081   604865   /lib/exe/css.php
64    2    8820     137    8819   142    7742    495524   /doku.php
14    0    250      17     250    17     24045   336632   /wp-admin/load-scripts.php
71    0    6422     90     6422   90     4048    287419   /wp-admin/
4     0    0        0      0      0      64832   259328   /cuisine/wp-content/plugins/stats/open-flash-chart.swf
2     0    2        1      2      1      106511  213022   /wp-admin/css/wp-admin.css
31    3    5423     174    5040   180    6804    210931   /index
10    0    429      42     429    42     18009   180093   /cuisine/files/2011/10/tarte_figue_amande-e1318281546905-225x300.jpg
2012-09-09 08:44:01 +02:00
Willy Tarreau
f8c95d2a25 OPTIM: halog: improve cold-cache behaviour when loading a file
Using posix_fadvise() it is possible to tell the system that we're
going to read a whole file at once. The kernel then doubles the
read-ahead size for this file. On Linux with an SSD, this has improved
cold-cache performance by around 20%. Hot-cache is not affected at all.
2012-06-12 09:16:56 +02:00
Willy Tarreau
419a598eae OPTIM: halog: make use of memchr() on platforms which provide a fast one
glibc-2.11 on x86_64 provides a machine-specific memchr() which is faster
than the generic C implementation by around 40%, so let's make it possible
to use it instead of the hand-coded version.
2012-06-12 08:52:22 +02:00
Willy Tarreau
8ad4193100 CLEANUP: halog: make clean should also remove .o files 2012-06-12 07:59:16 +02:00
Willy Tarreau
de5dc0509c MINOR: halog: use the more recent dual-mode fgets2 implementation
This version implements both 32 and 64 bit versions at once, it
avoids the need to have two separate output files. It also improves
efficiency on i386 platforms by adding a little bit of assembly where
gcc isn't efficient.
2012-06-09 11:22:27 +02:00
Willy Tarreau
7de211c88b MINOR: add a new function call tracer for debugging purposes
This feature relies on GCC's ability to call helpers at function entry/exit
points. We define these helpers to quickly dump the minimum info into a trace
file that can be converted to a human readable format using a script in the
contrib/trace directory. This has only been implemented in the GNU makefile
for now on as it is unsure whether it's supported on all OSes.

The feature is enabled by building with "TRACE=1". The performance impact is
huge, so this feature should only be used when debugging. To limit the loss
of performance, fprintf() has been disabled and the output is hand-crafted
and emitted using fwrite(), resulting in doubling the performance. Using the
TSC instead of gettimeofday() also doubles the performance. Around 1200 conns/s
may be achieved on a Pentium-M 1.7 GHz which leads to around 50 MB/s of traces.

The entry and exits of all functions will be dumped into a file designated
by the HAPROXY_TRACE environment variable, or by default "trace.out". If the
trace file name is empty or "/dev/null", then traces are disabled. If
opening the trace file fails, then stderr is used. If HAPROXY_TRACE_FAST is
used, then the time is taken from the global <now> variable. Last, if
HAPROXY_TRACE_TSC is used, then the machine's TSC is used instead of the
real time (almost twice as fast).

The output format is :

  <sec.usec> <level> <caller_ptr> <dir> <callee_ptr>
or :
  <tsc> <level> <caller_ptr> <dir> <callee_ptr>

where <dir> is '>' when entering a function and '<' when leaving.

The awk script in contrib/trace provides a nicer indented output :

6f74989e6f8 ->->->   run_poll_loop > signal_process_queue [src/haproxy.c:1097:0x804bd69] > [include/proto/signal.h:32:0x8049cd0]
6f74989eb00          run_poll_loop < signal_process_queue [src/haproxy.c:1097:0x804bd69] < [include/proto/signal.h:32:0x8049cd0]
6f74989ef44 ->->->   run_poll_loop > wake_expired_tasks [src/haproxy.c:1100:0x804bd72] > [src/task.c:123:0x8055060]
6f74989f3a6 ->->->->   wake_expired_tasks > eb32_lookup_ge [src/task.c:128:0x8055091] > [ebtree/eb32tree.c:138:0x80a8c70]
6f74989f7e9            wake_expired_tasks < eb32_lookup_ge [src/task.c:128:0x8055091] < [ebtree/eb32tree.c:138:0x80a8c70]
6f74989fc0d ->->->->   wake_expired_tasks > eb32_first [src/task.c:134:0x80550d5] > [ebtree/eb32tree.h:55:0x8054ad0]
6f7498a003d ->->->->->   eb32_first > eb_first [ebtree/eb32tree.h:56:0x8054af1] > [ebtree/ebtree.h:520:0x8054a10]
6f7498a0436 ->->->->->->   eb_first > eb_walk_down [ebtree/ebtree.h:521:0x8054a33] > [ebtree/ebtree.h:442:0x80549a0]
6f7498a0843 ->->->->->->->   eb_walk_down > eb_gettag [ebtree/ebtree.h:445:0x80549d6] > [ebtree/ebtree.h:418:0x80548e0]
6f7498a0c2b                  eb_walk_down < eb_gettag [ebtree/ebtree.h:445:0x80549d6] < [ebtree/ebtree.h:418:0x80548e0]
6f7498a1042 ->->->->->->->   eb_walk_down > eb_untag [ebtree/ebtree.h:447:0x80549e2] > [ebtree/ebtree.h:412:0x80548a0]
6f7498a1498                  eb_walk_down < eb_untag [ebtree/ebtree.h:447:0x80549e2] < [ebtree/ebtree.h:412:0x80548a0]
6f7498a18c6 ->->->->->->->   eb_walk_down > eb_root_to_node [ebtree/ebtree.h:448:0x80549e7] > [ebtree/ebtree.h:432:0x8054960]
6f7498a1cd4                  eb_walk_down < eb_root_to_node [ebtree/ebtree.h:448:0x80549e7] < [ebtree/ebtree.h:432:0x8054960]
6f7498a20c4                eb_first < eb_walk_down [ebtree/ebtree.h:521:0x8054a33] < [ebtree/ebtree.h:442:0x80549a0]
6f7498a24b4              eb32_first < eb_first [ebtree/eb32tree.h:56:0x8054af1] < [ebtree/ebtree.h:520:0x8054a10]
6f7498a289c            wake_expired_tasks < eb32_first [src/task.c:134:0x80550d5] < [ebtree/eb32tree.h:55:0x8054ad0]
6f7498a2c8c          run_poll_loop < wake_expired_tasks [src/haproxy.c:1100:0x804bd72] < [src/task.c:123:0x8055060]
6f7498a3095 ->->->   run_poll_loop > process_runnable_tasks [src/haproxy.c:1103:0x804bd7a] > [src/task.c:190:0x8055150]

A nice improvement would possibly consist in trying to get the function's
arguments in the stack and to dump a few more infor for some well-known
functions (eg: the session's status for process_session).
2012-05-26 00:12:37 +02:00
Willy Tarreau
9bb0e2042e MINOR: contrib/iprange: add a network IP range to mask converter
This tool has remained uncommitted in my development tree for almost a year.
Just minor polish and commit.

It can be used to convert some geolocation IP lists to ACLs.
2012-04-02 21:44:05 +02:00
Willy Tarreau
615674cdec MINOR: halog: add some help on the command line 2012-01-23 08:17:59 +01:00
Willy Tarreau
e1a908c369 OPTIM: halog: keep a fast path for the lines-count only
Using "halog -c" is still something quite common to perform on logs,
but unfortunately since the recent added controls, it was sensibly
slowed down due to the parsing of the accept date field.

Now we use a specific loop for the case where nothing is needed from
the input, and this sped up the line counting by 2.5x. A 2.4 GHz Xeon
now counts lines at a rate of 2 GB of logs per second.
2012-01-03 09:28:05 +01:00
Willy Tarreau
08911ff896 MINOR: halog: add support for matching queued requests
-Q outputs all requests which went through at least one queue.
-QS outputs all requests which went through a server queue.
2011-10-13 13:28:36 +02:00
Willy Tarreau
6ee71754e2 BUILD: halog: make halog build on solaris
Solaris' "rm" command does not support -v. Also, specify CC=gcc
because "cc" generally is not gcc there.
2011-09-16 15:03:37 +02:00
Willy Tarreau
f9042060c9 [OPTIM] halog: add assembly version of the field lookup code
Gcc tries to be a bit too smart in these small loops and the result is
that on i386 we waste a lot of time there. By recoding these loops in
assembly, we save up to 23% total processing time on i386! The savings
on x86_64 are much lower, probably because there are more registers and
gcc has to do less tricks. However, those savings vary a lot between gcc
versions and even cause harm on some of them (eg: 4.4) because gcc does
not know how to optimize the code once inlined.

However, by recoding field_start() in C to try to match the assembly
code as much as possible, we can significantly reduce its execution
time without risking the negative impacts. Thus, the assembly version
is less interesting there but still worth being used on some compilers.
2011-09-10 12:39:30 +02:00
Willy Tarreau
31a02e9c5b [OPTIM] halog: make fgets parse more bytes by blocks
By adding a "landing area" at the end of the buffer, it becomes safe to
parse more bytes at once. On 32-bit this makes fgets run about 4% faster
but it does not save anything on 64-bit.
2011-09-10 10:46:39 +02:00
Willy Tarreau
96c148b0d2 [MINOR] halog: do not consider byte 0x8A as end of line
A bug in the algorithm used to find an LF in multiple bytes at once
made byte 0x80 trigger detection of byte 0x00, thus 0x8A matches byte
0x0A. In practice, this issue never happens since byte 0x8A won't be
displayed in logs (or it will be encoded). This could still possibly
happen in mixed logs.
2011-09-09 08:21:55 +02:00
Willy Tarreau
61a40c7402 [MINOR] halog: support backslash-escaped quotes
Some syslog servers escape quotes, which make the resulting logs unusable
for URL processing since the parser looks for the first field beginning
with a quote. It now supports also fields starting with backslash and
quote in order to address this. No performance impact was measured.
2011-09-06 08:11:27 +02:00
Willy Tarreau
d3007ffa6f [MINOR] halog: add -hs/-HS to filter by HTTP status code range
The code was merged with the error code checking which is very similar and
which shares the same information. The new test adds about 1% slowdown to
error checking but makes it more reliable when facing wrongly formated
status codes.
2011-09-05 02:09:24 +02:00
Herv COMMOWICK
927cdddf9c [MINOR] halog: add support for termination code matching (-tcn/-TCN)
It is now possible to filter by termination code with -tcn <termcode>, to be
able to track one kind of errors, for example after counting it with -tc.
Use -TCN <termcode> gives you the opposite.
2011-08-10 18:04:50 +02:00
Willy Tarreau
14389e7036 [OPTIM] halog: remove support for tab delimiters in input data
Haproxy does not use tabs when sending logs, and checking for them
wastes no less than 4% of CPU cycles. Better get rid of these tests.
2011-07-11 06:48:04 +02:00
Willy Tarreau
a2b39fb5c5 [OPTIM] halog: remove many 'if' by using a function pointer for the filters
There were too many filters, we were losing time in all the "if" statements.
By moving all the filters to independant functions, we made the code cleaner
and slightly faster (3%).

One minor bug was found, the -tc and -st options did not report the number
of output lines, but always zero.
2011-07-11 06:48:04 +02:00
Willy Tarreau
26deaf51d9 [OPTIM] halog: check once for correct line format and reuse the pointer
Almost all filters first check the line format, which takes a lot of code
and requires parsing back and forth. By centralizing this test, we can
save about 15-20 more percent of performance for all filters.

Also, the test was wrong, it was checking that the source IP address was
starting with a digit, which is not always true with local IPv6 addresses.
Instead, we now check that the next field (accept field) starts with an
opening bracket and is followed by a digit between 0 and 3 (day of the
month). Doing this has contributed a 2% speedup because all other field
calculations were relative to a closer field.
2011-07-11 06:48:04 +02:00
Willy Tarreau
758a6ea46c [OPTIM] halog: cache some common fields positions
Since many fields are relative and some are used a lot, try to cache them
the first time they're used in order to avoid skipping them twice. The
status counts with HTTP pre-check enabled has sped up by 40%.
2011-07-11 06:48:03 +02:00
Willy Tarreau
df6f0d1e49 [MINOR] halog: gain back performance before SKIP_CHAR fix
The SKIP_CHAR fix caused a measurable performance drop. Since we can
consider all chars below 0x20 as delimiters, we can avoid a cache lookup
which requires a char to pointer conversion.
2011-07-11 06:48:03 +02:00
Willy Tarreau
70c428f7c6 [MINOR] halog: add support for HTTP log matching (-H)
Now it's possible to restrict analysis to HTTP-looking logs when passing -H.
-H -v gives the opposite (most likely TCP logs).
2011-07-11 06:48:03 +02:00
Willy Tarreau
c82570edec [MINOR] halog: make SKIP_CHAR stop on field delimiters
The SKIP_CHAR() macro did not consider field delimiters, causing the timer parser
to be able to search timers at wrong places when fed with TCP logs.
2011-07-11 06:48:02 +02:00
Willy Tarreau
812e7a73b2 [BUG] halog: correctly handle truncated last line
If last line is truncated (eg: truncated file), then halog would loop on
it forever.
2011-07-11 06:48:02 +02:00
Willy Tarreau
24bcb4f2ff [CONTRIB] halog: minor speed improvement in timer parser
The timer parser looks for the next slash after the last timer, which is
very far away. Those 4 occurrences have been fixed to match the way it's
done in URL sorting, which is faster. Average speed gain is 5-6% on -srv
and -pct.
(cherry picked from commit 3555671c93695f48c02ef05c8bb228523f17ca20)
2010-10-30 19:04:37 +02:00
Willy Tarreau
abe45b6bb3 [CONTRIB] halog: report per-url counts, errors and times
Using -u{,c,e,t,a,to,ao} it is possible to get per-URL statistics, sorted by
URL, request count, error count, total time, avg time, total time on OK requests,
avg time on OK requests.

Since it has to parse URLs and store a number of fields, it's quite slower
than other methods, but still correct for production usage (typically 800000
lines or 270 MB per second on a 2 GHz system).

Results are sorted in reverse order so that it's easy to catch them by piping
the output to the "head" command.
(cherry picked from commit 15ce7f56d15f839ce824279b84ffe14c58e41fda)
2010-10-30 19:04:37 +02:00
Krzysztof Piotr Oledzki
6190b7d9dc [CONTRIB] Update Cacti Tempates
This patch adds new haproxy_socket.xml template and updates
haproxy_backend.xml and haproxy_frontend.xml templates.
(cherry picked from commit 67cd1d55b5513e4186f021a7014e9442fd7a710f)
2010-10-30 19:04:36 +02:00
Krzysztof Piotr Oledzki
3989c4b0d4 [CONTRIB] Update haproxy.pl
This patch adds support for Sockets and several
new variables available in the 1.4 branch.
(cherry picked from commit d049c84fdc9e35472a3db87e45069afd92bee01d)
2010-10-30 19:04:36 +02:00
Mathieu Trudel
7cb62f8877 [CONTRIB] add templates for Cacti.
Hi,

I've attached the templates I've built for monitoring backends and
frontends of haproxy.

To install these, you will need to copy the XML files from the contrib/
directory of the haproxy distribution into a directory that Cacti can
reach, and edit the Data Queries "HaProxy Backends" and "HAProxy
Frontends" accordingly (the "XML Path" field. It's also dependant on
having a version of net-snmp that supports embedded Perl, and including
the "perl do 'path_to_haproxy.pl';" directive in your snmpd.conf file.

As for what is created:

- For the devices, you have two new data queries to choose from, they
can be added from the Devices page for each device, at the very end in
the drop-down box, then click "Add". The data queries are called
"HaProxy Backends" and "HAProxy Frontends".

- From "HaProxy Backends": in the new graphs page, you can choose which
backend to graph, and create one of two graphs:
	- Haproxy backend traffic:  ingress and egress bytes.
	- Haproxy backend sessions:  total sessions with _reponse_ errors.

- From "HAProxy Frontends": in the new graphs page again, you can choose
which frontend to graph, which will include aggregated data for the
backends behind it, obviously. You can create one of two graphs:
	- Haproxy frontend traffic:  ingress and egress bytes.
	- Haproxy frontend sessions:  total sessions with _request_ errors.

In the graphs and data sources, limits are set to reasonably high values
to support up to nearly 10G traffic, and up to 10000 concurrent
connections.

/ Matt
(cherry picked from commit f63090f2e85cdb7448071cdceb2eb5fabd2b9320)
2010-10-30 19:04:35 +02:00
Willy Tarreau
5417081c79 [MINOR] halog: skip non-traffic logs for -st and -tc
Those were reporting stupid results in presence of administrative logs.
2010-09-13 22:50:49 +02:00
Willy Tarreau
d8fc1103a5 [MINOR] halog: add '-tc' to sort by termination codes
This output lists all encountered termination codes by number of
occurrences.
2010-09-12 17:56:16 +02:00
Willy Tarreau
d220106092 [CONTRIB] halog: report per-server status codes, errors and response times
It's sometimes very useful to be able to monitor a production status in real
time by comparing servers behaviours. Now halog is able to do this when called
with "-srv". It reports various fields for each server found in a log, including
statuses, total reqs, valid reqs, percent of valid reqs, average connection time,
average response time.
2010-06-04 14:37:01 +02:00
Willy Tarreau
d2c142c7ee [OPTIM] halog: speed up fgets2-64 by about 10%
This version uses more 64-bit lookups and two 32-bit lookups
to converge faster. This saves about 10% performance.
2010-05-05 12:22:08 +02:00
Willy Tarreau
2651ac3302 [OPTIM] halog: minor speedup by using unlikely()
By moving the filter-specific code out of the loop, we can slightly
speed it up (3%).
2010-05-05 12:20:19 +02:00
Willy Tarreau
1769a18f62 [OPTIM] halog: use a faster zero test in fgets()
A new idea came up to detect the presence of a null byte in a word.
It saves several operations compared to the previous one, and eliminates
the jumps (about 6 instructions which can run 2-by-2 in parallel).

This sole optimisation improved the line count speed by about 30%.
2010-05-04 11:04:54 +02:00
Willy Tarreau
0f423a7073 [MINOR] halog: add support for statisticts on status codes
Using "-st", halog outputs number of requests by status codes.
2010-05-03 10:56:43 +02:00
Krzysztof Piotr Oledzki
4a3323b83d [CONTRIB] add base64rev-gen.c that was used to generate the base64rev table.
There is no offcial reverse table for base64, so a short
program is required to generate one.
2010-01-31 19:14:07 +01:00
Willy Tarreau
910ba4bb8b [BUG] halog: fix segfault in case of empty log in PCT mode
(cherry picked from commit fe362fe476)
2010-01-28 10:07:26 +01:00
Willy Tarreau
db40a1c8bd [BUILD] halog: make without arch-specific optimizations 2010-01-28 10:07:07 +01:00
Willy Tarreau
0b9da8dd45 [BUILD] halog: insufficient include path in makefile 2010-01-02 12:23:30 +01:00
Willy Tarreau
45cb4fb640 [MEDIUM] build: switch ebtree users to use new ebtree version
All files referencing the previous ebtree code were changed to point
to the new one in the ebtree directory. A makefile variable (EBTREE_DIR)
is also available to use files from another directory.

The ability to build the libebtree library temporarily remains disabled
because it can have an impact on some existing toolchains and does not
appear worth it in the medium term if we add support for multi-criteria
stickiness for instance.
2009-10-26 21:10:04 +01:00
Willy Tarreau
5bdfd968ed [CONTRIB] halog: support searching by response time
Also support inverting search criteria when specified uppercase
2009-10-14 20:37:29 +02:00
Jan-Frode Myklebust
6b6a53db5f [CONTRIB] selinux policy for haproxy
Here's an selinux policy for haproxy. The patch is built and lightly
tested with haproxy-1.3.15.7-1.fc10.i386 on Fedora9, and haproxy-1.2.18
on RHEL5.
2009-03-21 10:15:00 +01:00
Willy Tarreau
214c203c00 [CONTRIB] halog: faster fgets() and add support for percentile reporting
A new fgets implementation saves about 25-50% of the time spent parsing
the logs.

Percentile calculation has been added for timers using -pct.
2009-03-09 00:42:39 +01:00
Willy Tarreau
72c285345a [CONTRIB] halog: fast log parser for haproxy
halog can search errors, count lines, sort by accept date, look for
traffic holes and large connection counts at output graph plots of
timers.
2009-03-09 00:34:11 +01:00
Krzysztof Piotr Oledzki
2c6962c3c0 [MAJOR] proto_uxst rework -> SNMP support
Currently there is a ~16KB limit for a data size passed via unix socket.
It is caused by a trivial bug ttat is going to fixed soon, however
in most cases there is no need to dump a full stats.

This patch makes possible to select a scope of dumped data by extending
current "show stat" to "show stat [<iid> <type> <sid>]":
 - iid is a proxy id, -1 to dump all proxies
 - type selects type of dumpable objects: 1 for frontend, 2 for backend, 4 for
   server, -1 for all types. Values can be ORed, for example:
     1+2=3   -> frontend+backend.
     1+2+4=7 -> frontend+backend+server.
 - sid is a service id, -1 to dump everything from the selected proxy.

To do this I implemented a new session flag (SN_STAT_BOUND), added three
variables in data_ctx.stats (iid, type, sid), modified dumpstats.c and
completely revorked the process_uxst_stats: now it waits for a "\n"
terminated string, splits args and uses them. BTW: It should be quite easy
to add new commands, for example to enable/disable servers, the only problem
I can see is a not very lucky config name (*stats* socket). :|

During the work I also fixed two bug:
 - s->flags were not initialized for proto_uxst
 - missing comma if throttling not enabled (caused by a stupid change in
     "Implement persistent id for proxies and servers")

Other changes:
 - No more magic type valuse, use STATS_TYPE_FE/STATS_TYPE_BE/STATS_TYPE_SV
 - Don't memset full s->data_ctx (it was clearing s->data_ctx.stats.{iid/type/sid},
    instead initialize stats.sv & stats.sv_st (stats.px and stats.px_st were already
    initialized)

With all that changes it was extremely easy to write a short perl plugin
for a perl-enabled net-snmp (also included in this patch).

29385 is my PEN (Private Enterprise Number) and I'm willing to donate
the SNMPv2-SMI::enterprises.29385.106.* OIDs for HAProxy if there
is nothing assigned already.
2008-03-04 06:32:16 +01:00