Very early in the connection rework process leading to v1.5-dev12, commit
56a77e5 ("MEDIUM: connection: complete the polling cleanups") marked the
end of use for this flag, which has never been set since, but it was still
being tested. Let's kill it now.
This is a service that speaks the SPOE protocol and uses the Mod Defender (a
NAXSI clone) functionality to detect HTTP attacks. It returns an HTTP
status code indicating whether the request is suspicious or not, based on
NAXSI rules. The returned code can be used in HAProxy rules
to decide whether the HTTP request should be blocked/rejected.
Adding Type=forking to the unit file ensures better monitoring by
systemd. With this option, "systemctl start" is able to report an error
if the tool failed to start.
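A minimal sketch of the relevant unit-file section (paths illustrative) :
  [Service]
  Type=forking
  ExecStart=/usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid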
Fix the linker flags since 3rd-party libraries are not in
/usr/lib.
libfuzzy also needs to be added.
Also undef LIST_HEAD coming from event2, which conflicts with haproxy's.
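A sketch of the workaround (the exact headers and include order used by the
contrib code are assumptions) :
  #include <event2/event.h>      /* may pull in a sys/queue.h-style LIST_HEAD */
  #undef LIST_HEAD               /* clashes with haproxy's own LIST_HEAD macro */
  #include <common/mini-clist.h> /* haproxy's list implementation */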
Sometimes it's convenient to be able to execute a command directly on
the stream, whether we're connecting or accepting an incoming connection.
The new command 'X' makes this possible. It simply calls execvp() on the
next arguments and branches stdin/stdout/stderr onto the socket. Optionally,
the FDs passed to the command can be limited to any combination of these by
appending 'i', 'o', 'e' after the X. In any case the program ends right
after executing this command.
Examples :
- chargen server
tcploop 8001 L A Xo cat /dev/zero
- telnet server
tcploop 8001 L W N A X /usr/sbin/in.telnetd
These encoding functions do generic work and can be used in contexts
other than SPOE. This patch moves the functions spoe_encode_varint
and spoe_decode_varint from spoe to common, and removes the spoe prefix.
These functions will be used to encode values in a new binary sample fetch.
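For reference, a sketch of this varint scheme as described in doc/SPOE.txt
(bounds checking simplified; the in-tree functions' signatures differ) :
  #include <stdint.h>

  /* Encode <val>: values below 240 use one byte; larger values start with
   * a byte whose 4 high bits are set and which carries the 4 low bits of
   * the value, then 7-bit groups with the top bit set on all but the last
   * byte. Returns the number of bytes written, or -1 if <buf> is too small.
   */
  static int encode_varint(uint64_t val, unsigned char *buf, unsigned char *end)
  {
          unsigned char *p = buf;

          if (p >= end)
                  return -1;

          if (val < 240) {
                  *p++ = val;
                  return p - buf;
          }

          *p++ = (unsigned char)val | 240;
          val = (val - 240) >> 4;
          while (val >= 128) {
                  if (p >= end)
                          return -1;
                  *p++ = (unsigned char)val | 128;
                  val = (val - 128) >> 7;
          }
          if (p >= end)
                  return -1;
          *p++ = val;
          return p - buf;
  }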
Make the systemd wrapper check if HAPROXY_STATS_SOCKET is set.
If set, it will use it as an argument to the "-x" option, which makes
haproxy ask for all the listening sockets over the stats socket, in order
to achieve reloads with no lost connections.
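For example, an override file could contain (socket path illustrative) :
  [Service]
  Environment=HAPROXY_STATS_SOCKET=/var/run/haproxy/admin.sock
The wrapper would then append "-x /var/run/haproxy/admin.sock" to the
haproxy command line.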
Passing NULL to recv() is Linux-centric, and we're not focused on
performance here but on portability and reproducibility. Don't use NULL,
use the trash buffer instead. It may lead to multiple recv() calls for
large blocks, but as a benefit it will be possible to see the contents
with strace.
Now, when a payload is fragmented, the first frame must define the frame type
and the following ones must use the special type SPOE_FRM_T_UNSET. This way, it
is easy to know whether a fragment is the first one or not. Of course, all
frames must still share the same stream-id and frame-id.
Update the SPOA example accordingly.
If an agent wants to abort the processing of a fragmented NOTIFY frame before
receiving all fragments, it can send an ACK frame at any time with the ABORT
bit set (and of course, the FIN bit too).
Besides this change, the SPOE_FRM_ERR_FRAMEID_NOTFOUND error flag has been
added. It is set when an unknown ACK frame is received.
Now, we can use the option '-c' to enable the support of a capability. By
default, all capabilities are disabled. For example:
$> ./spoa -c async -c pipelining
In addition, it is also possible to set the maximum frame size supported by
your agent (-m) and to add a delay in frame processing (-t).
Just got this while cross-compiling :
tcploop.c: In function 'tcp_recv':
tcploop.c:444:48: error: 'INT_MAX' undeclared (first use in this function)
tcploop.c:444:48: note: each undeclared identifier is reported only once for each function it appears in
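The fix is to include the header that defines INT_MAX :
  #include <limits.h>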
Jarno Huuskonen reported that ip6range doesn't build anymore on
CentOS 7 (and possibly other distros) due to "in6_u" not being known.
Using s6_addr32 instead of in6_u.u6_addr32 apparently works fine, and
it's also what the Lua code uses, so it should be OK.
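A tiny sketch of the change (values illustrative) :
  #include <netinet/in.h>
  #include <arpa/inet.h>

  struct in6_addr a;
  /* in6_u.u6_addr32 is a glibc-internal union name; s6_addr32 is the
   * commonly available accessor */
  a.s6_addr32[0] = htonl(0x20010db8);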
This patch may be backported to 1.6.
This one jumps back to the oldest post-fork and post-accept action,
so it allows recv(), pause() and send() to run in loops after a fork()
and an accept() for example. This is handy for bugs that reproduce
once in a while, or to keep idle connections working.
By passing "S:<string>" instead of S<size> it's possible to send
a pre-defined string, which is convenient to write HTTP requests or
responses.
Example : produce two responses, one in keep-alive, one not, for ab :
./tcploop 8001 L W N2 A R S:"HTTP/1.0 200 OK\r\nConnection: keep-alive\r\nContent-length: 50\r\n\r\n0123456789.123456789.123456789.123456789.123456789" R S:"HTTP/1.0 200 OK\r\nContent-length: 50\r\n\r\n0123456789.123456789.123456789.123456789.123456789"
With 20 such keep-alive responses and 10 parallel processes, ab achieves
350kreq/s, so it should be possible to get precise timings.
This is helpful to show what state we're dealing with. The pid is
written, optionally followed by the time in 3 different formats
(relative/absolute) depending on the command line option (-t, -tt, -ttt).
Fork is a very convenient way to deal with independent yet properly
timed connections. It's particularly useful here for accept(), and
ensures that any accepted FD will automatically be released. The
principle is that when we hit a fork command, the parent restarts
evaluating the actions from the beginning and the child continues
to evaluate the next actions. Listen and connect are skipped if the
connection is already established. Fork() is amazingly cheap on
Linux, 21k forked connections per second are handled on a single
core, and 38k on two cores.
For now it's not possible to have two different code paths so in order
to have both a listener and a connector, two distinct commands are
still needed.
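A hedged usage sketch (assuming the fork action is bound to the 'F'
command) : a server that forks one child per accepted connection, each
child reading a block then replying :
  ./tcploop 8001 L A F R S:"pong\r\n"
At F, the parent restarts from the beginning (L is skipped since the socket
already listens) and accepts the next connection, while the child goes on
with R and S.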
netcat, nc6 and socat are only partially convenient as reproducers for
state machine bugs; when it comes to adding delays, forcing resets, or
waiting for data to be ACKed, they become useless.
The purpose of this utility is to be able to easily script some TCP
operations such as connect, accept, send, receive, shutdown and of
course pauses.
A new "option spop-check" statement has been added to enable server health
checks based on SPOP HELLO handshake. SPOP is the protocol used by SPOE filters
to talk to servers.
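A hedged configuration sketch (names and addresses illustrative) :
  backend spoe-agents
      mode tcp
      option spop-check
      server agent1 192.168.1.10:12345 check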
This is a very simple service that implements a "random" IP reputation
service. It will return random scores for all checked IP addresses. It only
shows you how to implement an IP reputation service, or similar services,
using the SPOE.
Users can set the locations of the haproxy.cfg and pid files by providing
a systemd overwrite file /etc/systemd/system/haproxy.service.d/overwrite.conf
with the following content:
[Service]
Environment=CONFIG=/etc/foobar/haproxy.cfg
The goal is to have a collection of quick-n-dirty utilities that make
debugging easier and that can easily be modified when needed. The first
utility in this series is called "flags". For a given numeric argument,
it reports the various known combinations of flags for channels, streams
and so on. This way it's easy to copy-paste values from the CLI or from
gdb and immediately know what state a stream-interface or connection is
in.
By default systemd will send SIGTERM to all processes in the service's
control group. In our case, this includes the wrapper, the master
process and all worker processes.
Since commit c54bdd2a the wrapper actually catches SIGTERM and survives,
only to see the master process getting killed by systemd; it regards this
as an error, placing the unit in a failed state during "systemctl stop".
Since the wrapper now handles SIGTERM by itself, we switch the kill mode
to 'mixed', which means that systemd will deliver the initial SIGTERM to
the wrapper only, and if the actual haproxy processes don't exit after a
given amount of time (default: 90s), a SIGKILL is sent to all remaining
processes in the control group. See systemd.kill(5) for more
information.
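The unit-file side of this change boils down to (sketch) :
  [Service]
  KillMode=mixed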
This should also be backported to 1.5.
Dmitry Sivachenko reported that "uint" doesn't build on FreeBSD 10.
On Linux it's defined in sys/types.h and indicated as "old". Just
get rid of the very few occurrences.
The last commit provides time-based filtering. Unfortunately, it wastes
90% of the time calling the expensive time()/localtime()/mktime()
functions.
This patch does 3 things :
  - call time()/localtime() only once to initialize the correct
    struct timeinfo ;
  - call mktime() only when the time has changed, regardless of
    the current second ;
  - manually add the current second to the cached result.
Doing just this is enough to multiply the parsing speed by 8.
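A sketch of the caching approach (function and variable names are
hypothetical, not the actual halog code) :
  #include <string.h>
  #include <time.h>

  /* Convert a parsed log date to a timestamp, calling the expensive
   * mktime() only when something above the seconds has changed.
   * Assumes <tm> is fully initialized by the parser. */
  static time_t tm_to_ts(const struct tm *tm)
  {
          static struct tm last;    /* last converted time, seconds zeroed */
          static time_t last_ts;    /* mktime() result for <last> */
          struct tm trunc = *tm;

          trunc.tm_sec = 0;
          if (memcmp(&trunc, &last, sizeof(trunc))) {
                  last = trunc;
                  last_ts = mktime(&trunc);
          }
          return last_ts + tm->tm_sec;
  }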
I wanted to make a graph with the average answer time in nagios, taking only
the last 5 min of the log. Filtering the log before using halog was too
slow, so I added that filter to halog.
The patch attached to this mail is a proposal to add a new option :
-time [min][:max]
The values are the min and/or max timestamps of the lines to be used
for the stats. The date and time of the log lines, between '[' and ']', are
converted to a timestamp and compared to these values.
Here is an example of usage :
cat /var/log/haproxy.log | ./halog -srv -H -q -time $(date --date '-5 min' +%s)
This is the same as -uc except that instead of counting URLs, it
counts source addresses. The reported times are request times and
not response times.
The code becomes heavily ugly, the url struct is being abused to
store an address, and there are no more bit fields available. The
code needs a major revamp.
The iprange tool is handy for transforming network range formats, but
it's common to need a tool for running quick checks on the output.
The tool now supports a list of addresses on the command line, and it
will only output those which match. It's absolutely inefficient but is
handy for debugging.
Commit 667c905f introduced parameter -m to halog, which limits the size
of the output. Unfortunately it is completely broken in that it doesn't
check whether the limit was previously set, and it also prevents a
simple counting operation from returning anything when no limit is set.
Note that the -gt and -pct outputs behave differently in face of this
limit, since they count the valid output lines BEFORE actually producing
the data, so the limit really applies to valid input lines.
Sometimes it's useful to limit the output to a number of lines, for
example when output is already sorted (eg: 10 slowest URLs, ...). Now
we can use -m for this.
There was a lines_out++ left from earlier code, causing each input
line to be counted as an output line.
This fix also affects 1.4 and should be backported.
Using posix_fadvise() it is possible to tell the system that we're
going to read a whole file at once. The kernel then doubles the
read-ahead size for this file. On Linux with an SSD, this has improved
cold-cache performance by around 20%. Hot-cache is not affected at all.
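The call presumably looks like this (stream name illustrative; the
SEQUENTIAL advice is what doubles Linux's read-ahead window) :
  #include <stdio.h>
  #include <fcntl.h>

  /* advise the kernel that the whole file will be read sequentially */
  posix_fadvise(fileno(f), 0, 0, POSIX_FADV_SEQUENTIAL);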
glibc-2.11 on x86_64 provides a machine-specific memchr() which is faster
than the generic C implementation by around 40%, so let's make it possible
to use it instead of the hand-coded version.
This version implements both the 32- and 64-bit versions at once, which
avoids the need for two separate output files. It also improves
efficiency on i386 platforms by adding a little bit of assembly where
gcc isn't efficient.
This feature relies on GCC's ability to call helpers at function entry/exit
points. We define these helpers to quickly dump the minimum info into a trace
file that can be converted to a human-readable format using a script in the
contrib/trace directory. This has only been implemented in the GNU makefile
for now, as it is unclear whether it's supported on all OSes.
The feature is enabled by building with "TRACE=1". The performance impact is
huge, so this feature should only be used when debugging. To limit the loss
of performance, fprintf() has been disabled and the output is hand-crafted
and emitted using fwrite(), resulting in doubling the performance. Using the
TSC instead of gettimeofday() also doubles the performance. Around 1200 conns/s
may be achieved on a Pentium-M 1.7 GHz which leads to around 50 MB/s of traces.
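For reference, a minimal sketch of the mechanism this relies on (GCC's
-finstrument-functions hooks; the real implementation differs, e.g. it
timestamps the events and tracks the call depth) :
  /* gcc calls these at every function entry/exit when building with
   * -finstrument-functions; they must not be instrumented themselves. */
  __attribute__((no_instrument_function))
  void __cyg_profile_func_enter(void *this_fn, void *call_site)
  {
          /* dump "<time> <level> <call_site> > <this_fn>" using fwrite() */
  }

  __attribute__((no_instrument_function))
  void __cyg_profile_func_exit(void *this_fn, void *call_site)
  {
          /* dump "<time> <level> <call_site> < <this_fn>" */
  }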
The entry and exits of all functions will be dumped into a file designated
by the HAPROXY_TRACE environment variable, or by default "trace.out". If the
trace file name is empty or "/dev/null", then traces are disabled. If
opening the trace file fails, then stderr is used. If HAPROXY_TRACE_FAST is
used, then the time is taken from the global <now> variable. Last, if
HAPROXY_TRACE_TSC is used, then the machine's TSC is used instead of the
real time (almost twice as fast).
The output format is :
<sec.usec> <level> <caller_ptr> <dir> <callee_ptr>
or :
<tsc> <level> <caller_ptr> <dir> <callee_ptr>
where <dir> is '>' when entering a function and '<' when leaving.
The awk script in contrib/trace provides a nicer indented output :
6f74989e6f8 ->->-> run_poll_loop > signal_process_queue [src/haproxy.c:1097:0x804bd69] > [include/proto/signal.h:32:0x8049cd0]
6f74989eb00 run_poll_loop < signal_process_queue [src/haproxy.c:1097:0x804bd69] < [include/proto/signal.h:32:0x8049cd0]
6f74989ef44 ->->-> run_poll_loop > wake_expired_tasks [src/haproxy.c:1100:0x804bd72] > [src/task.c:123:0x8055060]
6f74989f3a6 ->->->-> wake_expired_tasks > eb32_lookup_ge [src/task.c:128:0x8055091] > [ebtree/eb32tree.c:138:0x80a8c70]
6f74989f7e9 wake_expired_tasks < eb32_lookup_ge [src/task.c:128:0x8055091] < [ebtree/eb32tree.c:138:0x80a8c70]
6f74989fc0d ->->->-> wake_expired_tasks > eb32_first [src/task.c:134:0x80550d5] > [ebtree/eb32tree.h:55:0x8054ad0]
6f7498a003d ->->->->-> eb32_first > eb_first [ebtree/eb32tree.h:56:0x8054af1] > [ebtree/ebtree.h:520:0x8054a10]
6f7498a0436 ->->->->->-> eb_first > eb_walk_down [ebtree/ebtree.h:521:0x8054a33] > [ebtree/ebtree.h:442:0x80549a0]
6f7498a0843 ->->->->->->-> eb_walk_down > eb_gettag [ebtree/ebtree.h:445:0x80549d6] > [ebtree/ebtree.h:418:0x80548e0]
6f7498a0c2b eb_walk_down < eb_gettag [ebtree/ebtree.h:445:0x80549d6] < [ebtree/ebtree.h:418:0x80548e0]
6f7498a1042 ->->->->->->-> eb_walk_down > eb_untag [ebtree/ebtree.h:447:0x80549e2] > [ebtree/ebtree.h:412:0x80548a0]
6f7498a1498 eb_walk_down < eb_untag [ebtree/ebtree.h:447:0x80549e2] < [ebtree/ebtree.h:412:0x80548a0]
6f7498a18c6 ->->->->->->-> eb_walk_down > eb_root_to_node [ebtree/ebtree.h:448:0x80549e7] > [ebtree/ebtree.h:432:0x8054960]
6f7498a1cd4 eb_walk_down < eb_root_to_node [ebtree/ebtree.h:448:0x80549e7] < [ebtree/ebtree.h:432:0x8054960]
6f7498a20c4 eb_first < eb_walk_down [ebtree/ebtree.h:521:0x8054a33] < [ebtree/ebtree.h:442:0x80549a0]
6f7498a24b4 eb32_first < eb_first [ebtree/eb32tree.h:56:0x8054af1] < [ebtree/ebtree.h:520:0x8054a10]
6f7498a289c wake_expired_tasks < eb32_first [src/task.c:134:0x80550d5] < [ebtree/eb32tree.h:55:0x8054ad0]
6f7498a2c8c run_poll_loop < wake_expired_tasks [src/haproxy.c:1100:0x804bd72] < [src/task.c:123:0x8055060]
6f7498a3095 ->->-> run_poll_loop > process_runnable_tasks [src/haproxy.c:1103:0x804bd7a] > [src/task.c:190:0x8055150]
A nice improvement would possibly consist in trying to fetch the function's
arguments from the stack and to dump a bit more info for some well-known
functions (eg: the session's status for process_session).
This tool has remained uncommitted in my development tree for almost a year.
Just minor polish and commit.
It can be used to convert some geolocation IP lists to ACLs.
Using "halog -c" is still something quite common to perform on logs,
but unfortunately since the recent added controls, it was sensibly
slowed down due to the parsing of the accept date field.
Now we use a specific loop for the case where nothing is needed from
the input, and this sped up the line counting by 2.5x. A 2.4 GHz Xeon
now counts lines at a rate of 2 GB of logs per second.
Gcc tries to be a bit too smart in these small loops and the result is
that on i386 we waste a lot of time there. By recoding these loops in
assembly, we save up to 23% total processing time on i386! The savings
on x86_64 are much lower, probably because there are more registers and
gcc has to do fewer tricks. However, those savings vary a lot between gcc
versions and even cause harm on some of them (eg: 4.4) because gcc does
not know how to optimize the code once inlined.
However, by recoding field_start() in C to try to match the assembly
code as much as possible, we can significantly reduce its execution
time without risking the negative impacts. Thus, the assembly version
is less interesting there but still worth being used on some compilers.
By adding a "landing area" at the end of the buffer, it becomes safe to
parse more bytes at once. On 32-bit this makes fgets run about 4% faster
but it does not save anything on 64-bit.
A bug in the algorithm used to find an LF in multiple bytes at once
made byte 0x80 trigger detection of byte 0x00, so that byte 0x8A matched
byte 0x0A. In practice, this issue never happens since byte 0x8A won't be
displayed in logs (or it will be encoded). It could still possibly
happen in mixed logs.
Some syslog servers escape quotes, which makes the resulting logs unusable
for URL processing since the parser looks for the first field beginning
with a quote. The parser now also supports fields starting with a backslash
and a quote in order to address this. No performance impact was measured.
The code was merged with the error code checking, which is very similar and
shares the same information. The new test adds about a 1% slowdown to
error checking but makes it more reliable when facing wrongly formatted
status codes.
It is now possible to filter by termination code with -tcn <termcode>, to be
able to track one kind of error, for example after counting them with -tc.
Using -TCN <termcode> gives you the opposite.
There were too many filters; we were losing time in all the "if" statements.
By moving all the filters to independent functions, we made the code cleaner
and slightly faster (3%).
One minor bug was found, the -tc and -st options did not report the number
of output lines, but always zero.
Almost all filters first check the line format, which takes a lot of code
and requires parsing back and forth. By centralizing this test, we can
save a further 15-20% of performance for all filters.
Also, the test was wrong: it was checking that the source IP address
started with a digit, which is not always true with local IPv6 addresses.
Instead, we now check that the next field (the accept field) starts with an
opening bracket followed by a digit between 0 and 3 (the first digit of the
day of the month). Doing this contributed a 2% speedup because all other
field calculations then became relative to a closer field.
Since many fields are relative and some are used a lot, try to cache them
the first time they're used in order to avoid skipping them twice. The
status counts with HTTP pre-check enabled have sped up by 40%.
The SKIP_CHAR fix caused a measurable performance drop. Since we can
consider all chars below 0x20 as delimiters, we can avoid a cache lookup
which requires a char to pointer conversion.
The timer parser looks for the next slash after the last timer, which is
very far away. Those 4 occurrences have been fixed to match the way it's
done in URL sorting, which is faster. Average speed gain is 5-6% on -srv
and -pct.
(cherry picked from commit 3555671c93695f48c02ef05c8bb228523f17ca20)
Using -u{,c,e,t,a,to,ao} it is possible to get per-URL statistics, sorted by
URL, request count, error count, total time, avg time, total time on OK requests,
avg time on OK requests.
Since it has to parse URLs and store a number of fields, it's quite slower
than other methods, but still correct for production usage (typically 800000
lines or 270 MB per second on a 2 GHz system).
Results are sorted in reverse order so that it's easy to catch them by piping
the output to the "head" command.
(cherry picked from commit 15ce7f56d15f839ce824279b84ffe14c58e41fda)
This patch adds new haproxy_socket.xml template and updates
haproxy_backend.xml and haproxy_frontend.xml templates.
(cherry picked from commit 67cd1d55b5513e4186f021a7014e9442fd7a710f)
This patch adds support for Sockets and several
new variables available in the 1.4 branch.
(cherry picked from commit d049c84fdc9e35472a3db87e45069afd92bee01d)
Hi,
I've attached the templates I've built for monitoring backends and
frontends of haproxy.
To install these, you will need to copy the XML files from the contrib/
directory of the haproxy distribution into a directory that Cacti can
reach, and edit the Data Queries "HaProxy Backends" and "HAProxy
Frontends" accordingly (the "XML Path" field). It is also dependent on
having a version of net-snmp that supports embedded Perl, and on including
the "perl do 'path_to_haproxy.pl';" directive in your snmpd.conf file.
As for what is created:
- For the devices, you have two new data queries to choose from, they
can be added from the Devices page for each device, at the very end in
the drop-down box, then click "Add". The data queries are called
"HaProxy Backends" and "HAProxy Frontends".
- From "HaProxy Backends": in the new graphs page, you can choose which
backend to graph, and create one of two graphs:
- Haproxy backend traffic: ingress and egress bytes.
- Haproxy backend sessions: total sessions with _response_ errors.
- From "HAProxy Frontends": in the new graphs page again, you can choose
which frontend to graph, which will include aggregated data for the
backends behind it, obviously. You can create one of two graphs:
- Haproxy frontend traffic: ingress and egress bytes.
- Haproxy frontend sessions: total sessions with _request_ errors.
In the graphs and data sources, limits are set to reasonably high values
to support up to nearly 10G traffic, and up to 10000 concurrent
connections.
/ Matt
(cherry picked from commit f63090f2e85cdb7448071cdceb2eb5fabd2b9320)
It's sometimes very useful to be able to monitor a production status in real
time by comparing server behaviours. Now halog is able to do this when called
with "-srv". It reports various fields for each server found in the log, including
statuses, total reqs, valid reqs, percent of valid reqs, average connection time,
average response time.
A new idea came up to detect the presence of a null byte in a word.
It saves several operations compared to the previous one, and eliminates
the jumps (about 6 instructions which can run 2-by-2 in parallel).
This sole optimisation improved the line count speed by about 30%.
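The commit does not show the expression, but the classic branchless
detector in this family works like this (a sketch; not necessarily the
exact expression used here) :
  /* true if any of the 4 bytes of <x> is zero: the subtraction sets a
   * byte's top bit when it wraps below zero, and ~x masks out bytes
   * whose top bit was already set */
  #define HAS_ZERO(x) ((((x) - 0x01010101U) & ~(x)) & 0x80808080U)

  /* to look for '\n', XOR first so that LF bytes become zero */
  #define HAS_LF(x)   HAS_ZERO((x) ^ 0x0A0A0A0AU)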
All files referencing the previous ebtree code were changed to point
to the new one in the ebtree directory. A makefile variable (EBTREE_DIR)
is also available to use files from another directory.
The ability to build the libebtree library temporarily remains disabled
because it can have an impact on some existing toolchains and does not
appear worth it in the medium term if we add support for multi-criteria
stickiness for instance.
Currently there is a ~16KB limit on the data size passed via the unix socket.
It is caused by a trivial bug that is going to be fixed soon; however,
in most cases there is no need to dump the full stats.
This patch makes it possible to select the scope of the dumped data by extending
the current "show stat" to "show stat [<iid> <type> <sid>]" (example below):
- iid is a proxy id, -1 to dump all proxies
- type selects type of dumpable objects: 1 for frontend, 2 for backend, 4 for
server, -1 for all types. Values can be ORed, for example:
1+2=3 -> frontend+backend.
1+2+4=7 -> frontend+backend+server.
- sid is a service id, -1 to dump everything from the selected proxy.
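For example, to dump the frontend, backend and server stats of proxy 2 only
(socket path illustrative) :
  echo "show stat 2 7 -1" | socat stdio UNIX-CONNECT:/var/run/haproxy.stat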
To do this I implemented a new session flag (SN_STAT_BOUND), added three
variables in data_ctx.stats (iid, type, sid), modified dumpstats.c and
completely reworked the process_uxst_stats: now it waits for a "\n"
terminated string, splits args and uses them. BTW: it should be quite easy
to add new commands, for example to enable/disable servers; the only problem
I can see is the not very fortunate config name (*stats* socket). :|
During the work I also fixed two bugs:
- s->flags were not initialized for proto_uxst
- missing comma if throttling not enabled (caused by a stupid change in
  "Implement persistent id for proxies and servers")
Other changes:
- No more magic type values, use STATS_TYPE_FE/STATS_TYPE_BE/STATS_TYPE_SV
- Don't memset the full s->data_ctx (it was clearing s->data_ctx.stats.{iid,type,sid});
  instead initialize stats.sv & stats.sv_st (stats.px and stats.px_st were already
  initialized)
With all these changes it was extremely easy to write a short perl plugin
for a perl-enabled net-snmp (also included in this patch).
29385 is my PEN (Private Enterprise Number) and I'm willing to donate
the SNMPv2-SMI::enterprises.29385.106.* OIDs for HAProxy if there
is nothing assigned already.