Commit Graph

161 Commits

Author SHA1 Message Date
Thierry FOURNIER
01e0974b5a MINOR: samples: add xx-hash functions
This patch adds the support of xx-hash 32 and 64-bits functions.
2016-12-26 12:45:04 +01:00
Willy Tarreau
97108e08ce CLEANUP: sample: report "converter" instead of "conv method" in error messages
This was inherited from the very early stick-tables code but it's about
time to produce understandable error messages :-)
2016-11-25 07:36:22 +01:00
Thierry FOURNIER / OZON.IO
a69c912187 CLEANUP: log-format: useless file and line in json converter
The caller must log location information, so this information is
provided two times in the log line. The error log is like this:

   [ALERT] 327/011513 (14291) : parsing [o3.conf:38]: 'http-response
   set-header': Sample fetch <method,json(rrr)> failed with : invalid
   args in conv method 'json' : Unexpected input code type at file
   'o3.conf', line 38. Allowed value are 'ascii', 'utf8', 'utf8s',
   'utf8p' and 'utf8ps'.

This patch removes the second location indication, the the same error
becomes:

   [ALERT] 327/011637 (14367) : parsing [o3.conf:38]: 'http-response
   set-header': Sample fetch <method,json(rrr)> failed with : invalid
   args in conv method 'json' : Unexpected input code type. Allowed
   value are 'ascii', 'utf8', 'utf8s', 'utf8p' and 'utf8ps'.
2016-11-24 18:54:25 +01:00
Christopher Faulet
f7e4e7e096 MAJOR: spoe: Add an experimental Stream Processing Offload Engine
SPOE makes possible the communication with external components to retrieve some
info using an in-house binary protocol, the Stream Processing Offload Protocol
(SPOP). In the long term, its aim is to allow any kind of offloading on the
streams. This first version, besides being experimental, won't do lot of
things. The most important today is to validate the protocol design and lay the
foundations of what will, one day, be a full offload engine for the stream
processing.

So, for now, the SPOE can offload the stream processing before "tcp-request
content", "tcp-response content", "http-request" and "http-response" rules. And
it only supports variables creation/suppression. But, in spite of these limited
features, we can easily imagine to implement a SSO solution, an ip reputation
service or an ip geolocation service.

Internally, the SPOE is implemented as a filter. So, to use it, you must use
following line in a proxy proxy section:

  frontend my-front
      ...
      filter spoe [engine <name>] config <file>
      ...

It uses its own configuration file to keep the HAProxy configuration clean. It
is also a easy way to disable it by commenting out the filter line.

See "doc/SPOE.txt" for all details about the SPOE configuration.
2016-11-09 22:57:01 +01:00
Christopher Faulet
476e5d0e03 REORG: sample: move code to release a sample expression in sample.c
This code has been moved from haproxy.c to sample.c and the function
release_sample_expr can now be called from anywhere to release a sample
expression. This function will be used by the stream processing offload engine
(SPOE).
2016-11-09 22:57:00 +01:00
Willy Tarreau
2235b261b6 OPTIM: http: move all http character classs tables into a single one
We used to have 7 different character classes, each was 256 bytes long,
resulting in almost 2kB being used in the L1 cache. It's as cheap to
test a bit than to check the byte is not null, so let's store a 7-bit
composite value and check for the respective bits there instead.

The executable is now 4 kB smaller and the performance on small
objects increased by about 1% to 222k requests/second with a config
involving 4 http-request rules including 1 header lookup, one header
replacement, and 2 variable assignments.
2016-11-05 15:58:08 +01:00
Willy Tarreau
f0645dce4f MINOR: sample: use smp_make_rw() in upper/lower converters
There's no point in always duplicating the sample, just ensure it's
writable, as was done prior to the smp_dup() change. This should be
backported to 1.6 to avoid a performance regression caused by this
change (about 30% more time for upper/lower due to the copy).
2016-08-09 14:31:25 +02:00
Willy Tarreau
ad63582eb9 BUG/MEDIUM: samples: make smp_dup() always duplicate the sample
Vedran Furac reported a strange problem where the "base" sample fetch
would not always work for tracking purposes.

In fact, it happens that commit bc8c404 ("MAJOR: stick-tables: use sample
types in place of dedicated types") merged in 1.6 exposed a fundamental
bug related to the way samples use chunks as strings. The problem is that
chunks convey a base pointer, a length and an optional size, which may be
zero when unknown or when the chunk is allocated from a read-only location.
The sole purpose of this size is to know whether or not the chunk may be
appended new data. This size cause some semantics issue in the sample,
which has its own SMP_F_CONST flag to indicate read-only contents.

The problem was emphasized by the commit above because it made use of new
calls to smp_dup() to convert a sample to a table key. And since smp_dup()
would only check the SMP_F_CONST flag, it would happily return read-write
samples indicating size=0.

So some tests were added upon smp_dup() return to ensure that the actual
length is smaller than size, but this in fact made things even worse. For
example, the "sni" server directive does some bad stuff on many occasions
because it limits len to size-1 and effectively sets it to -1 and writes
the zero byte before the beginning of the string!

It is therefore obvious that smp_dup() needs to be modified to take this
nature of the chunks into account. It's not enough but is needed. The core
of the problem comes from the fact that smp_dup() is called for 5 distinct
needs which are not always fulfilled :

  1) duplicate a sample to keep a copy of it during some operations
  2) ensure that the sample is rewritable for a converter like upper()
  3) ensure that the sample is terminated with a \0
  4) set a correct size on the sample
  5) grow the sample in case it was extracted from a partial chunk

Case 1 is not used for now, so we can ignore it. Case 2 indicates the wish
to modify the sample, so its R/O status must be removed if any, but there's
no implied requirement that the chunk becomes larger. Case 3 is used when
the sample has to be made compatible with libc's str* functions. There's no
need to make it R/W nor to duplicate it if it is already correct. Case 4
can happen when the sample's size is required (eg: before performing some
changes that must fit in the buffer). Case 5 is more or less similar but
will happen when the sample by be grown but we want to ensure we're not
bound by the current small size.

So the proposal is to have different functions for various operations. One
will ensure a sample is safe for use with str* functions. Another one will
ensure it may be rewritten in place. And smp_dup() will have to perform an
inconditional duplication to guarantee at least #5 above, and implicitly
all other ones.

This patch only modifies smp_dup() to make the duplication inconditional. It
is enough to fix both the "base" sample fetch and the "sni" server directive,
and all use cases in general though not always optimally. More patches will
follow to address them more optimally and even better than the current
situation (eg: avoid a dup just to add a \0 when possible).

The bug comes from an ambiguous design, so its roots are old. 1.6 is affected
and a backport is needed. In 1.5, the function already existed but was only
used by two converters modifying the data in place, so the bug has no effect
there.
2016-08-09 14:03:23 +02:00
Herve COMMOWICK
8dfe863fbf DOC: fix json converter example and error message 2016-08-07 08:08:18 +02:00
Willy Tarreau
5f6e9054b9 BUILD: fix build on Solaris 11
htonll()/ntohll() already exist on Solaris 11 with a different declaration,
causing a build error as reported by Jonathan Fisher. They used to exist on
OSX with a #define which allowed us to detect them. It was a bad idea to give
these functions a name subject to conflicts like this. Simply rename them
my_htonll()/my_ntohll() to definitely get rid of the conflict.

This patch must be backported to 1.6.
2016-05-26 07:15:57 +02:00
David Carlier
64a16ab19c BUG/MEDIUM: sample: initialize the pointer before parse_binary call.
parse_binary line 2025 checks the nullity of binstr parameter.
Other calls of parse_binary properly zeroify this parameter.
[wt: this could result in random failures of the const parser]
2016-04-12 11:08:24 +02:00
Vincent Bernat
02779b6263 CLEANUP: uniformize last argument of malloc/calloc
Instead of repeating the type of the LHS argument (sizeof(struct ...))
in calls to malloc/calloc, we directly use the pointer
name (sizeof(*...)). The following Coccinelle patch was used:

@@
type T;
T *x;
@@

  x = malloc(
- sizeof(T)
+ sizeof(*x)
  )

@@
type T;
T *x;
@@

  x = calloc(1,
- sizeof(T)
+ sizeof(*x)
  )

When the LHS is not just a variable name, no change is made. Moreover,
the following patch was used to ensure that "1" is consistently used as
a first argument of calloc, not the last one:

@@
@@

  calloc(
+ 1,
  ...
- ,1
  )
2016-04-03 14:17:42 +02:00
Willy Tarreau
6204cd9f27 BUG/MAJOR: vars: always retrieve the stream and session from the sample
This is the continuation of previous patch called "BUG/MAJOR: samples:
check smp->strm before using it".

It happens that variables may have a session-wide scope, and that their
session is retrieved by dereferencing the stream. But nothing prevents them
from being used from a streamless context such as tcp-request connection,
thus crashing the process. Example :

    tcp-request connection accept if { src,set-var(sess.foo) -m found }

In order to fix this, we have to always ensure that variable manipulation
only happens via the sample, which contains the correct owner and context,
and that we never use one from a different source. This results in quite a
large change since a lot of functions are inderctly involved in the call
chain, but the change is easy to follow.

This fix must be backported to 1.6, and requires the last two patches.
2016-03-10 17:28:04 +01:00
Willy Tarreau
7560dd4b6a MINOR: sample: always set a new sample's owner before evaluating it
Some functions like sample_conv_var2smp(), var_get_byname(), and
var_set_byname() directly or indirectly need to access the current
stream and/or session and must find it in the sample itself and not
as a distinct argument. Thus we first need to call smp_set_owner()
prior to each such calls.
2016-03-10 16:42:58 +01:00
Willy Tarreau
1777ea63e0 MINOR: sample: add a new helper to initialize the owner of a sample
Since commit 6879ad3 ("MEDIUM: sample: fill the struct sample with the
session, proxy and stream pointers") merged in 1.6-dev2, the sample
contains the pointer to the stream and sample fetch functions as well
as converters use it heavily. This requires from a lot of call places
to initialize 4 fields, and it was even forgotten at a few places.

This patch provides a convenient helper to initialize all these fields
at once, making it easy to prepare a new sample from a previous one for
example.

A few call places were cleaned up to make use of it. It will be needed
by further fixes.

At one place in the Lua code, it was moved earlier because we used to
call sample casts with a non completely initialized sample, which is
not clean eventhough at the moment there are no consequences.
2016-03-10 16:42:58 +01:00
Dragan Dosen
0b85ecee53 MEDIUM: logs: add a new RFC5424 log-format for the structured-data
This patch adds a new RFC5424-specific log-format for the structured-data
that is automatically send by __send_log() when the sender is in RFC5424
mode.

A new statement "log-format-sd" should be used in order to set log-format
for the structured-data part in RFC5424 formatted syslog messages.
Example:

    log-format-sd [exampleSDID@1234\ bytes=\"%B\"\ status=\"%ST\"]
2015-09-28 14:01:27 +02:00
Thierry FOURNIER
136f9d34a9 MINOR: samples: rename union from "data" to "u"
The union name "data" is a little bit heavy while we read the source
code because we can read "data.data.sint". The rename from "data" to "u"
makes the read easiest like "data.u.sint".
2015-08-20 17:13:46 +02:00
Thierry FOURNIER
8c542cac07 MEDIUM: samples: Use the "struct sample_data" in the "struct sample"
This patch remove the struct information stored both in the struct
sample_data and in the striuct sample. Now, only thestruct sample_data
contains data, and the struct sample use the struct sample_data for storing
his own data.
2015-08-20 17:13:46 +02:00
Thierry FOURNIER
cc4d1716a2 MINOR: sample: Add ipv6 to ipv4 and sint to ipv6 casts
The RFC4291 says that when the IPv6 adress have the followin form:
0000::ffff:a.b.c.d, if can be converted to an IPv4 adress. This patch
enable this conversion in casts.

As the sint can be casted as ipv4, and ipv4 can be casted as ipv6, we
can directly cast sint as ipv6 using the RFC4291.
2015-08-11 14:14:10 +02:00
Thierry FOURNIER
5d86fae234 MEDIUM: vars/sample: operators can use variables as parameter
This patch allow the existing operators to take a variable as parameter.
This is useful to add the content of two variables. This patch modify
the behavior of operators.
2015-07-22 00:48:24 +02:00
Thierry FOURNIER
00c005c726 MEDIUM: sample: switch to saturated arithmetic
This patch check calculus for overflow and returns capped values.
This permits to protect against integer overflow in certain operations
involving ratios, percentages, limits or anything. That can sometimes
be critically important with some operations (eg: content-length < X).
2015-07-22 00:48:24 +02:00
Thierry FOURNIER
bf65cd4d77 MAJOR: arg: converts uint and sint in sint
This patch removes the 32 bits unsigned integer and the 32 bit signed
integer. It replaces these types by a unique type 64 bit signed.
2015-07-22 00:48:23 +02:00
Thierry FOURNIER
07ee64ef4d MAJOR: sample: converts uint and sint in 64 bits signed integer
This patch removes the 32 bits unsigned integer and the 32 bit signed
integer. It replaces these types by a unique type 64 bit signed.

This makes easy the usage of integer and clarify signed and unsigned use.
With the previous version, signed and unsigned are used ones in place of
others, and sometimes the converter loose the sign. For example, divisions
are processed with "unsigned", if one entry is negative, the result is
wrong.

Note that the integer pattern matching and dotted version pattern matching
are already working with signed 64 bits integer values.

There is one user-visible change : the "uint()" and "sint()" sample fetch
functions which used to return a constant integer have been replaced with
a new more natural, unified "int()" function. These functions were only
introduced in the latest 1.6-dev2 so there's no impact on regular
deployments.
2015-07-22 00:48:23 +02:00
Thierry FOURNIER
fac9ccfb70 BUG/MINOR: http/sample: gmtime/localtime can fail
The man said that gmtime() and localtime() can return a NULL value.
This is not tested. It appears that all the values of a 32 bit integer
are valid, but it is better to check the return of these functions.

However, if the integer move from 32 bits to 64 bits, some 64 values
can be unsupported.
2015-07-20 12:21:35 +02:00
Willy Tarreau
28d976d5ee MINOR: args: add new context for servers
We'll have to support fetch expressions and args on server lines for
"usesrc", "usedst", "sni", etc...
2015-07-09 11:39:33 +02:00
Adis Nezirovic
79beb248b9 CLEANUP: sample: generalize sample_fetch_string() as sample_fetch_as_type()
This modification makes possible to use sample_fetch_string() in more places,
where we might need to fetch sample values which are not plain strings. This
way we don't need to fetch string, and convert it into another type afterwards.

When using aliased types, the caller should explicitly check which exact type
was returned (e.g. SMP_T_IPV4 or SMP_T_IPV6 for SMP_T_ADDR).

All usages of sample_fetch_string() are converted to use new function.
2015-07-06 16:17:25 +02:00
Dragan Dosen
93b38d9191 MEDIUM: 51Degrees code refactoring and cleanup
Moved 51Degrees code from src/haproxy.c, src/sample.c and src/cfgparse.c
into a separate files src/51d.c and include/import/51d.h.

Added two new functions init_51degrees() and deinit_51degrees(), updated
Makefile and other code reorganizations related to 51Degrees.
2015-06-30 10:43:03 +02:00
Thierry FOURNIER
cc103299c7 MINOR: samples: add samples which returns constants
This patch adds sample which returns constants values. This is useful
for intialising variables.
2015-06-13 23:01:37 +02:00
Thierry FOURNIER
9687c77c91 MINOR: debug: add a special converter which display its input sample content.
This converter displays its input sample type and content. It is useful
for debugging some complex configurations.
2015-06-13 23:01:36 +02:00
Thierry FOURNIER
9c627e84b2 MEDIUM: sample: Add type any
This type is used to accept any type of sample as input, and prevent
any automatic "cast". It runs like the type "ADDR" which accept the
type "IPV4" and "IPV6".
2015-06-13 22:59:14 +02:00
Thierry FOURNIER
0f811440d5 BUG/MINOR: sample: wrong conversion of signed values
The signed values are casted as unsigned before conversion. This patch
use the good converters according with the sample type.

Note: it depends on previous patch to parse signed ints.
2015-06-13 22:59:14 +02:00
Thierry FOURNIER
4c2479e1c4 BUG/MINOR: debug: display (null) in place of "meth"
The array which contains names of types, miss the METH entry.

[wt: should be backported to 1.5 as well]
2015-06-09 10:58:14 +02:00
Thomas Holmes
4d441a759c MEDIUM: sample: add trie support to 51Degrees
Trie or pattern algorithm is used depending on what 51Degrees source
files are provided to MAKE.
2015-06-02 19:30:53 +02:00
Thomas Holmes
951d44d24d MEDIUM: sample: add fiftyone_degrees converter.
It takes up to 5 string arguments that are to be 51Degrees property names.
It will then create a chunk with values detected based on the request header
supplied (this should be the User-Agent).
2015-06-02 14:00:25 +02:00
Willy Tarreau
f63386ad27 CLEANUP: da: move the converter registration to da.c
There's no reason to put it into sample.c, it's better to register it
locally in da.c, it removes a number of ifdefs and exports.
2015-06-02 13:42:12 +02:00
David Carlier
4542b10ae1 MEDIUM: sample: add the da-csv converter
This diff declares the deviceatlas module and can accept up to 5
property names for the API lookup.

[wt: this should probably be moved to its own file using the keyword
      registration mechanism]
2015-06-02 13:24:50 +02:00
Willy Tarreau
e2dc1fa8ca MEDIUM: stick-table: remove the now duplicate find_stktable() function
Since proxy_tbl_by_name() already does the same job, let's not keep
duplicate functions and use this one only.
2015-05-26 12:08:07 +02:00
Willy Tarreau
9e0bb1013e CLEANUP: proxy: make the proxy lookup functions more user-friendly
First, findproxy() was renamed proxy_find_by_name() so that its explicit
that a name is required for the lookup. Second, we give this function
the ability to search for tables if needed. Third we now provide inline
wrappers to pass the appropriate PR_CAP_* flags and to explicitly look
up a frontend, backend or table.
2015-05-26 11:24:42 +02:00
Thierry FOURNIER
0786d05a04 MEDIUM: sample: change the prototype of sample-fetches functions
This patch removes the "opt" entry from the prototype of the
sample-fetches fucntions. This permits to remove some weight
in the prototype call.
2015-05-11 20:03:08 +02:00
Thierry FOURNIER
1d33b882d2 MINOR: sample: fill the struct sample with the options.
Options are relative to the sample. Each sample fetched is associated with
fetch options or fetch flags.

This patch adds the 'opt' vaue in the sample struct. This permits to reduce
the sample-fetch function prototype. In other way, the converters will have
more detail about the origin of the sample.
2015-05-11 20:02:11 +02:00
Thierry FOURNIER
0a9a2b8cec MEDIUM: sample change the prototype of sample-fetches and converters functions
This patch removes the structs "session", "stream" and "proxy" from
the sample-fetches and converters function prototypes.

This permits to remove some weight in the prototype call.
2015-05-11 20:01:42 +02:00
Thierry FOURNIER
6879ad31a5 MEDIUM: sample: fill the struct sample with the session, proxy and stream pointers
Some sample analyzer (sample-fetch or converters) needs to known the proxy,
session and stream attached to the sampel. The sample-fetches and the converters
function pointers cannot be called without these 3 pointers filled.

This patch permits to reduce the sample-fetch and the converters called
prototypes, and provides a new mean to add information for this type of
functions.
2015-05-11 20:00:03 +02:00
Willy Tarreau
192252e2d8 MAJOR: sample: pass a pointer to the session to each sample fetch function
Many such function need a session, and till now they used to dereference
the stream. Once we remove the stream from the embryonic session, this
will not be possible anymore.

So as of now, sample fetch functions will be called with this :

   - sess = NULL,  strm = NULL                     : never
   - sess = valid, strm = NULL                     : tcp-req connection
   - sess = valid, strm = valid, strm->txn = NULL  : tcp-req content
   - sess = valid, strm = valid, strm->txn = valid : http-req / http-res
2015-04-06 11:37:25 +02:00
Willy Tarreau
15e91e1b36 MAJOR: sample: don't pass l7 anymore to sample fetch functions
All of them can now retrieve the HTTP transaction *if it exists* from
the stream and be sure to get NULL there when called with an embryonic
session.

The patch is a bit large because many locations were touched (all fetch
functions had to have their prototype adjusted). The opportunity was
taken to also uniformize the call names (the stream is now always "strm"
instead of "l4") and to fix indent where it was broken. This way when
we later introduce the session here there will be less confusion.
2015-04-06 11:35:53 +02:00
Willy Tarreau
87b09668be REORG/MAJOR: session: rename the "session" entity to "stream"
With HTTP/2, we'll have to support multiplexed streams. A stream is in
fact the largest part of what we currently call a session, it has buffers,
logs, etc.

In order to catch any error, this commit removes any reference to the
struct session and tries to rename most "session" occurrences in function
names to "stream" and "sess" to "strm" when that's related to a session.

The files stream.{c,h} were added and session.{c,h} removed.

The session will be reintroduced later and a few parts of the stream
will progressively be moved overthere. It will more or less contain
only what we need in an embryonic session.

Sample fetch functions and converters will have to change a bit so
that they'll use an L5 (session) instead of what's currently called
"L4" which is in fact L6 for now.

Once all changes are completed, we should see approximately this :

   L7 - http_txn
   L6 - stream
   L5 - session
   L4 - connection | applet

There will be at most one http_txn per stream, and a same session will
possibly be referenced by multiple streams. A connection will point to
a session and to a stream. The session will hold all the information
we need to keep even when we don't yet have a stream.

Some more cleanup is needed because some code was already far from
being clean. The server queue management still refers to sessions at
many places while comments talk about connections. This will have to
be cleaned up once we have a server-side connection pool manager.
Stream flags "SN_*" still need to be renamed, it doesn't seem like
any of them will need to move to the session.
2015-04-06 11:23:56 +02:00
Thierry FOURNIER
8fd1376014 MINOR: converters: add function to browse converters
This patch adds a fucntion to browse each converter. This
is used with Lua for using the converters with a wrapper.
2015-03-11 19:55:10 +01:00
Thierry FOURNIER
4d9a1d1a5c MINOR: sample: add function for browsing samples.
This function is useful with the incoming lua functions.
2015-02-28 23:12:32 +01:00
Thierry FOURNIER
f41a809dc9 MINOR: sample: add private argument to the struct sample_fetch
The add of this private argument is to prepare the integration
of the lua fetchs.
2015-02-28 23:12:31 +01:00
Thierry FOURNIER
68a556e282 MINOR: converters: give the session pointer as converter argument
Some usages of the converters need to know the attached session. The Lua
needs the session for retrieving his running context. This patch adds
the "session" as an argument of the converters prototype.
2015-02-28 23:12:31 +01:00
Thierry FOURNIER
1edc971919 MINOR: converters: add a "void *private" argument to converters
This permits to store specific configuration pointer. It is useful
with future Lua integration.
2015-02-28 23:12:31 +01:00
Willy Tarreau
9770787e70 MEDIUM: samples: provide basic arithmetic and bitwise operators
This commit introduces a new category of converters. They are bitwise and
arithmetic operators which support performing basic operations on integers.
Some bitwise operations are supported (and, or, xor, cpl) and some arithmetic
operations are supported (add, sub, mul, div, mod, neg). Some comparators
are provided (odd, even, not, bool) which make it possible to report a match
without having to write an ACL.

The detailed list of new operators as they appear in the doc is :

add(<value>)
  Adds <value> to the input value of type unsigned integer, and returns the
  result as an unsigned integer.

and(<value>)
  Performs a bitwise "AND" between <value> and the input value of type unsigned
  integer, and returns the result as an unsigned integer.

bool
  Returns a boolean TRUE if the input value of type unsigned integer is
  non-null, otherwise returns FALSE. Used in conjunction with and(), it can be
  used to report true/false for bit testing on input values (eg: verify the
  presence of a flag).

cpl
  Takes the input value of type unsigned integer, applies a twos-complement
  (flips all bits) and returns the result as an unsigned integer.

div(<value>)
  Divides the input value of type unsigned integer by <value>, and returns the
  result as an unsigned integer. If <value> is null, the largest unsigned
  integer is returned (typically 2^32-1).

even
  Returns a boolean TRUE if the input value of type unsigned integer is even
  otherwise returns FALSE. It is functionally equivalent to "not,and(1),bool".

mod(<value>)
  Divides the input value of type unsigned integer by <value>, and returns the
  remainder as an unsigned integer. If <value> is null, then zero is returned.

mul(<value>)
  Multiplies the input value of type unsigned integer by <value>, and returns
  the product as an unsigned integer. In case of overflow, the higher bits are
  lost, leading to seemingly strange values.

neg
  Takes the input value of type unsigned integer, computes the opposite value,
  and returns the remainder as an unsigned integer. 0 is identity. This
  operator is provided for reversed subtracts : in order to subtract the input
  from a constant, simply perform a "neg,add(value)".

not
  Returns a boolean FALSE if the input value of type unsigned integer is
  non-null, otherwise returns TRUE. Used in conjunction with and(), it can be
  used to report true/false for bit testing on input values (eg: verify the
  absence of a flag).

odd
  Returns a boolean TRUE if the input value of type unsigned integer is odd
  otherwise returns FALSE. It is functionally equivalent to "and(1),bool".

or(<value>)
  Performs a bitwise "OR" between <value> and the input value of type unsigned
  integer, and returns the result as an unsigned integer.

sub(<value>)
  Subtracts <value> from the input value of type unsigned integer, and returns
  the result as an unsigned integer. Note: in order to subtract the input from
  a constant, simply perform a "neg,add(value)".

xor(<value>)
  Performs a bitwise "XOR" (exclusive OR) between <value> and the input value
  of type unsigned integer, and returns the result as an unsigned integer.
2015-01-27 15:41:13 +01:00
Willy Tarreau
d817e468bf BUG/MINOR: sample: fix case sensitivity for the regsub converter
Two commits ago in 7eda849 ("MEDIUM: samples: add a regsub converter to
perform regex-based transformations"), I got caught for the second time
with the inverted case sensitivity usage of regex_comp(). So by default
it is case insensitive and passing the "i" flag makes it case sensitive.
I forgot to recheck that case before committing the cleanup. No harm
anyway, nobody had the time to use it.
2015-01-23 20:27:41 +01:00
Willy Tarreau
7eda849dce MEDIUM: samples: add a regsub converter to perform regex-based transformations
We can now replace matching regex parts with a string, a la sed. Note
that there are at least 3 different behaviours for existing sed
implementations when matching 0-length strings. Here is the result
of the following operation on each implementationt tested :

  echo 'xzxyz' | sed -e 's/x*y*/A/g'

  GNU sed 4.2.1       => AzAzA
  Perl's sed 5.16.1   => AAzAAzA
  Busybox v1.11.2 sed => AzAz

The psed behaviour was adopted because it causes the least exceptions
in the code and seems logical from a certain perspective :

  - "x"  matches x*y*  => add "A" and skip "x"
  - "z"  matches x*y*  => add "A" and keep "z", not part of the match
  - "xy" matches x*y*  => add "A" and skip "xy"
  - "z"  matches x*y*  => add "A" and keep "z", not part of the match
  - ""   matches x*y*  => add "A" and stop here

Anyway, given the incompatibilities between implementations, it's unlikely
that some processing will rely on this behaviour.

There currently is one big limitation : the configuration parser makes it
impossible to pass commas or closing parenthesis (or even closing brackets
in log formats). But that's still quite usable to replace certain characters
or character sequences. It will become more complete once the config parser
is reworked.
2015-01-22 14:24:53 +01:00
Willy Tarreau
469477879c MINOR: args: implement a new arg type for regex : ARGT_REG
This one will be used when a regex is expected. It is automatically
resolved after the parsing and compiled into a regex. Some optional
flags are supported in the type-specific flags that should be set by
the optional arg checker. One is used during the regex compilation :
ARGF_REG_ICASE to ignore case.
2015-01-22 14:24:53 +01:00
Willy Tarreau
8059977d3e MINOR: samples: provide a "crc32" converter
This converter hashes a binary input sample into an unsigned 32-bit quantity
using the CRC32 hash function. Optionally, it is possible to apply a full
avalanche hash function to the output if the optional <avalanche> argument
equals 1. This converter uses the same functions as used by the various hash-
based load balancing algorithms, so it will provide exactly the same results.
It is provided for compatibility with other software which want a CRC32 to be
computed on some input keys, so it follows the most common implementation as
found in Ethernet, Gzip, PNG, etc... It is slower than the other algorithms
but may provide a better or at least less predictable distribution.
2015-01-20 19:48:08 +01:00
Vincent Bernat
1228dc0e7a BUG/MEDIUM: sample: fix random number upper-bound
random() will generate a number between 0 and RAND_MAX. POSIX mandates
RAND_MAX to be at least 32767. GNU libc uses (1<<31 - 1) as
RAND_MAX.

In smp_fetch_rand(), a reduction is done with a multiply and shift to
avoid skewing the results. However, the shift was always 32 and hence
the numbers were not distributed uniformly in the specified range. We
fix that by dividing by RAND_MAX+1. gcc is smart enough to turn that
into a shift:

    0x000000000046ecc8 <+40>:    shr    $0x1f,%rax
2014-12-10 22:45:34 +01:00
Emeric Brun
c9a0f6d023 MINOR: samples: add the word converter.
word(<index>,<delimiters>)
  Extracts the nth word considering given delimiters from an input string.
  Indexes start at 1 and delimiters are a string formatted list of chars.
2014-11-25 14:48:39 +01:00
Emeric Brun
f399b0debf MINOR: samples: adds the field converter.
field(<index>,<delimiters>)
  Extracts the substring at the given index considering given delimiters from
  an input string. Indexes start at 1 and delimiters are a string formatted
  list of chars.
2014-11-24 17:44:02 +01:00
Emeric Brun
54c4ac8417 MINOR: samples: adds the bytes converter.
bytes(<offset>[,<length>])
  Extracts a some bytes from an input binary sample. The result is a
  binary sample starting at an offset (in bytes) of the original sample
  and optionnaly truncated at the given length.
2014-11-24 17:44:02 +01:00
Willy Tarreau
0f30d26dbf MINOR: sample: add a few basic internal fetches (nbproc, proc, stopping)
Sometimes, either for debugging or for logging we'd like to have a bit
of information about the running process. Here are 3 new fetches for this :

nbproc : integer
  Returns an integer value corresponding to the number of processes that were
  started (it equals the global "nbproc" setting). This is useful for logging
  and debugging purposes.

proc : integer
  Returns an integer value corresponding to the position of the process calling
  the function, between 1 and global.nbproc. This is useful for logging and
  debugging purposes.

stopping : boolean
  Returns TRUE if the process calling the function is currently stopping. This
  can be useful for logging, or for relaxing certain checks or helping close
  certain connections upon graceful shutdown.
2014-11-24 17:44:02 +01:00
Emeric Brun
4b9e80268e BUG/MINOR: samples: fix unnecessary memcopy converting binary to string. 2014-11-24 17:44:02 +01:00
Thierry FOURNIER
317e1c4f1e MINOR: sample: add "json" converter
This converter escapes string to use it as json/ascii escaped string.
It can read UTF-8 with differents behavior on errors and encode it in
json/ascii.

json([<input-code>])
  Escapes the input string and produces an ASCII ouput string ready to use as a
  JSON string. The converter tries to decode the input string according to the
  <input-code> parameter. It can be "ascii", "utf8", "utf8s", "utf8"" or
  "utf8ps". The "ascii" decoder never fails. The "utf8" decoder detects 3 types
  of errors:
   - bad UTF-8 sequence (lone continuation byte, bad number of continuation
     bytes, ...)
   - invalid range (the decoded value is within a UTF-8 prohibited range),
   - code overlong (the value is encoded with more bytes than necessary).

  The UTF-8 JSON encoding can produce a "too long value" error when the UTF-8
  character is greater than 0xffff because the JSON string escape specification
  only authorizes 4 hex digits for the value encoding. The UTF-8 decoder exists
  in 4 variants designated by a combination of two suffix letters : "p" for
  "permissive" and "s" for "silently ignore". The behaviors of the decoders
  are :
   - "ascii"  : never fails ;
   - "utf8"   : fails on any detected errors ;
   - "utf8s"  : never fails, but removes characters corresponding to errors ;
   - "utf8p"  : accepts and fixes the overlong errors, but fails on any other
                error ;
   - "utf8ps" : never fails, accepts and fixes the overlong errors, but removes
                characters corresponding to the other errors.

  This converter is particularly useful for building properly escaped JSON for
  logging to servers which consume JSON-formated traffic logs.

  Example:
     capture request header user-agent len 150
     capture request header Host len 15
     log-format {"ip":"%[src]","user-agent":"%[capture.req.hdr(1),json]"}

  Input request from client 127.0.0.1:
     GET / HTTP/1.0
     User-Agent: Very "Ugly" UA 1/2

  Output log:
     {"ip":"127.0.0.1","user-agent":"Very \"Ugly\" UA 1\/2"}
2014-10-26 06:41:12 +01:00
Willy Tarreau
6bcb0a84e7 BUG/MAJOR: tcp: fix a possible busy spinning loop in content track-sc*
As a consequence of various recent changes on the sample conversion,
a corner case has emerged where it is possible to wait forever for a
sample in track-sc*.

The issue is caused by the fact that functions relying on sample_process()
don't all exactly work the same regarding the SMP_F_MAY_CHANGE flag and
the output result. Here it was possible to wait forever for an output
sample from stktable_fetch_key() without checking the SMP_OPT_FINAL flag.
As a result, if the client connects and closes without sending the data
and haproxy expects a sample which is capable of coming, it will ignore
this impossible case and will continue to wait.

This change adds control for SMP_OPT_FINAL before waiting for extra data.
The various relevant functions have been better documented regarding their
output values.

This fix must be backported to 1.5 since it appeared there.
2014-07-30 08:56:35 +02:00
Willy Tarreau
bbfd1a25ee MINOR: sample: allow integers to cast to binary
Doing so finally allows to apply the hex converter to integers as well.
Note that all integers are represented in 32-bit, big endian so that their
conversion remains human readable and portable. A later improvement to the
hex converter could be to make it trim leading zeroes, and/or to only report
a number of least significant bytes.
2014-07-15 21:36:15 +02:00
Willy Tarreau
23ec4ca1bb MINOR: sample: add new converters to hash input
From time to time it's useful to hash input data (scramble input, or
reduce the space needed in a stick table). This patch provides 3 simple
converters allowing use of the available hash functions to hash input
data. The output is an unsigned integer which can be passed into a header,
a log or used as an index for a stick table. One nice usage is to scramble
source IP addresses before logging when there are requirements to hide them.
2014-07-15 21:36:15 +02:00
Willy Tarreau
9700e5c914 MINOR: sample: allow IP address to cast to binary
IP addresses are a perfect example of fixed size data which we could
cast to binary, still it was not allowed by lack of cast function,
eventhough the opposite was allowed in ACLs. Make that possible both
in sample expressions and in stick tables.
2014-07-15 21:36:15 +02:00
Willy Tarreau
0dbfdbaef1 MINOR: samples: add two converters for the date format
This patch adds two converters :

   ltime(<format>[,<offset>])
   utime(<format>[,<offset>])

Both use strftime() to emit the output string from an input date. ltime()
provides local time, while utime() provides the UTC time.
2014-07-10 16:43:44 +02:00
Willy Tarreau
6c616e0b96 BUG/MAJOR: sample: correctly reinitialize sample fetch context before calling sample_process()
We used to only clear flags when reusing the static sample before calling
sample_process(), but that's not enough because there's a context in samples
that can be used by some fetch functions such as auth, headers and cookies,
and not reinitializing it risks that a pointer of a different type is used
in the wrong context.

An example configuration which triggers the case consists in mixing hdr()
and http_auth_group() which both make use of contexts :

     http-request add-header foo2 %[hdr(host)],%[http_auth_group(foo)]

The solution is simple, initialize all the sample and not just the flags.
This fix must be backported into 1.5 since it was introduced in 1.5-dev19.
2014-06-25 17:12:08 +02:00
Willy Tarreau
3a4ac422ce MINOR: tcp: prepare support for the "capture" action
A few minor entries will be needed to capture sample fetches in requests
or responses. This patch just prepares the code for this.
2014-06-13 16:32:48 +02:00
Willy Tarreau
5b4bf70a95 MINOR: sample: improve sample_fetch_string() to report partial contents
Currently, all callers to sample_fetch_string() call it with SMP_OPT_FINAL.
Now we improve it to support the case where this option is not set, and to
make it return the original sample as-is. The purpose is to let the caller
check the SMP_F_MAY_CHANGE flag in the result and know that it should wait
to get complete contents. Currently this has no effect on existing code.
2014-06-13 16:32:48 +02:00
Emeric Brun
53d1a98270 MINOR: ssl: adds sample converter base64 for binary type.
The new converter encode binary type sample to base64 string.

i.e. : ssl_c_serial,base64
2014-04-30 22:31:11 +02:00
Thierry FOURNIER
eeaa951726 MINOR: configuration: File and line propagation
This patch permits to communicate file and line of the
configuration file at the configuration parser.
2014-03-17 18:06:08 +01:00
Thierry FOURNIER
d437314979 MEDIUM: sample/http_proto: Add new type called method
The method are actuelly stored using two types. Integer if the method is
known and string if the method is not known. The fetch is declared as
UINT, but in some case it can provides STR.

This patch create new type called METH. This type contain interge for
known method and string for the other methods. It can be used with
automatic converters.

The pattern matching can expect method.

During the free or prune function, http_meth pettern is freed. This
patch initialise the freed pointer to NULL.
2014-03-17 18:06:07 +01:00
Thierry FOURNIER
7654c9ff44 MEDIUM: sample: Remove types SMP_T_CSTR and SMP_T_CBIN, replace it by SMP_F_CONST flags
The operations applied on types SMP_T_CSTR and SMP_T_STR are the same,
but the check code and the declarations are double, because it must
declare action for SMP_T_C* and SMP_T_*. The declared actions and checks
are the same. this complexify the code. Only the "conv" functions can
change from "C*" to "*"

Now, if a function needs to modify input string, it can call the new
function smp_dup(). This one duplicate data in a trash buffer.
2014-03-17 18:06:07 +01:00
Thierry FOURNIER
0e9af55700 MINOR: sample: dont call the sample cast function "c_none"
If the cast function to execute is c_none, dont execute it and return
true. The function c_none, do nothing. This save a call.
2014-03-17 18:06:07 +01:00
Willy Tarreau
1cf8f08c17 MINOR: sample: move smp_to_type to sample.c
This way it can be exported and reused anywhere else to report type names.
2014-03-17 18:06:06 +01:00
Thierry FOURNIER
e87cac16cc MEDIUM: sample: change the behavior of the bin2str cast
The bin2str cast gives the hexadecimal representation of the binary
content when it is used as string. This was inherited from the
stick-table casts without realizing that it was a mistake. Indeed,
it breaks string processing on binary contents, preventing any _reg,
_beg, etc from working.

For example, with an HTTP GET request, the fetch "req.payload(0,3)"
returns the 3 bytes "G", "E", and "T" in binary. If this fetch is
used with regex, it is automatically converted to "474554" and the
regex is applied on this string, so it never matches.

This commit changes the cast so that bin2str does not convert the
contents anymore, and returns a string type. The contents can thus
be matched as is, and the NULL character continues to mark the end
of the string to avoid any issue with some string-based functions.

This commit could almost have been marked as a bug fix since it
does what the doc says.

Note that in case someone would rely on the hex encoding, then the
same behaviour could be achieved by appending ",hex" after the sample
fetch function (brought by previous patch).
2014-03-17 17:31:46 +01:00
Thierry FOURNIER
2f49d6d17b MINOR: sample: add hex converter
This new filter converts BIN type to its hexadecimal
representation in STR type. It is used to keep the
compatibility with the original bin2str cast.

It will be useful when bin2str changes to copy the
string as-is without encoding anymore.
2014-03-17 16:39:18 +01:00
Willy Tarreau
84310e2e73 MINOR: sample: add a rand() sample fetch to return a sample.
Sometimes it can be useful to generate a random value, at least
for debugging purposes, but also to take routing decisions or to
pass such a value to a backend server.
2014-02-14 11:59:04 +01:00
Thierry FOURNIER
60bb020d70 BUG/MINOR: sample: The c_str2int converter does not fail if the entry is not an integer
If the string not start with a number, the converter fails. In other, it
converts a maximum of characters to a number and stop to the first
character that not match a number.
2014-01-27 19:04:18 +01:00
Willy Tarreau
689a1df0a1 BUG/MEDIUM: sample: simplify and fix the argument parsing
Some errors may be reported about missing mandatory arguments when some
sample fetch arguments are marked as mandatory and implicit (eg: proxy
names such as in table_cnt or be_conn).

In practice the argument parser already handles all the situations very
well, it's just that the sample fetch parser want to go beyond its role
and starts some controls that it should not do. Simply removing these
useless controls lets make_arg_list() create the correct argument types
when such types are encountered.

This regression was introduced by the recent use of sample_parse_expr()
in ACLs which makes use of its own argument parser, while previously
the arguments were parsed in the ACL function itself. No backport is
needed.
2013-12-13 01:33:33 +01:00
Willy Tarreau
975c1784c8 MINOR: sample: make sample_parse_expr() use memprintf() to report parse errors
Doing so ensures that we're consistent between all the functions in the whole
chain. This is important so that we can extract the argument parsing from this
function.
2013-12-12 23:16:54 +01:00
Thierry FOURNIER
fd1399091e BUG/MEDIUM: sample: conversion from str to ipv6 may read data past end
Applying inet_pton() to input contents is not reliable because the
function requires a zero-terminated string. While inet_pton() will
stop when contents do not match an IPv6 address anymore, it could
theorically read past the end of a buffer if the data to be converted
was at the end of a buffer (this cannot happen right now thanks to
the reserve at the end of the buffer). At least the conversion does
not work.

Fix this by using buf2ip6() instead, which copies the string into a
padded aread.

This bug came with recent commit b805f71 (MEDIUM: sample: let the
cast functions set their output type), no backport is needed.
2013-12-11 22:03:00 +01:00
Willy Tarreau
3d536ac378 BUG/MINOR: acl: fix sample expression error reporting
ACL parse errors are not easy to understand since recent commit 348971e
(MEDIUM: acl: use the fetch syntax 'fetch(args),conv(),conv()' into the
ACL keyword) :

[ALERT] 339/154717 (26437) : parsing [check-bug.cfg:10] : error detected while parsing a 'stats admin' rule : unknown ACL or sample keyword 'env(a,b,c)': invalid arg 2 in fetch method 'env' : end of arguments expected at position 2, but got ',b,c'..

This error is only relevant to sample fetch keywords, so the new form is
a bit easier to understand :

[ALERT] 339/160011 (26626) : parsing [check-bug.cfg:12] : error detected while parsing a 'stats admin' rule : invalid arg 2 in fetch method 'env' : end of arguments expected at position 2, but got ',b,c' in sample expression 'env(a,b,c),upper'.

No backport is needed.
2013-12-06 16:02:46 +01:00
Willy Tarreau
67ff7e0af3 BUG/MEDIUM: acl: fix regression introduced by latest converters support
Since commit 348971e (MEDIUM: acl: use the fetch syntax
'fetch(args),conv(),conv()' into the ACL keyword), ACLs wait on input
that may change. This is visible in the configuration below :

        tcp-request inspect-delay 3s
        tcp-request content accept if REQ_CONTENT

Nothing will pass before the end of the timer. This is because
historically, sample_process() was dedicated to stick tables where
it was absolutely necessary to wait for a stable sample. Now samples
are used by many other things and we can't afford this. So let's move
this check to the stick tables after the call to sample_process()
instead.

This is post-1.5-dev19 work, no backport is required.
2013-12-05 02:23:13 +01:00
Thierry FOURNIER
d18cd0f110 MEDIUM: http: The redirect strings follows the log format rules.
We handle "http-request redirect" with a log-format string now, but we
leave "redirect" unaffected.

Note that the control of the special "/" case is move from the runtime
execution to the configuration parsing. If the format rule list is
empty, the build_logline() function does nothing.
2013-12-02 23:31:33 +01:00
Thierry FOURNIER
b805f71d1b MEDIUM: sample: let the cast functions set their output type
This patch allows each sample cast function to specify the sample
output type. The goal is to be able to emit an output type IPv4 or
IPv6 depending on what is found in the input if the next converter
is able to process them both.

The patch also adds a new pseudo type called "ADDR". This type is an
alias for IPV4 and IPV6 which is only used as an input type by converters
who want to express their compatibility with both address formats. It may
not be emitted.

The goal is to unify as much as possible the processing of IPv4 and IPv6
in order not to add extra keywords for the maps which act as converters,
but will match samples like ACLs do with their patterns.
2013-12-02 23:31:33 +01:00
Thierry FOURNIER
9c1d67ecbd MINOR: sample: provide the original sample_conv descriptor struct to the argument checker function.
Note that this argument checker is still unused but will be used by
maps.
2013-12-02 23:31:32 +01:00
Thierry FOURNIER
348971ea28 MEDIUM: acl: use the fetch syntax 'fetch(args),conv(),conv()' into the ACL keyword
If the acl keyword is a "fetch", the dedicated parsing function
"sample_parse_expr()" is used. Otherwise, the acl parsing function
"parse_acl_expr()" is extended to understand the syntax of a series
of converters placed after the "fetch" keyword.

Before this patch, each acl uses a "struct sample_fetch" and executes
it with the "<fetch>->process()" function. Now, the dedicated function
"sample_process()" is called.

These syntax are now avalaible:

   acl bad req.hdr(host),lower -m str www
   http-request redirect prefix /go-away if bad

   acl bad hdr_beg(host),lower www
   http-request redirect prefix /go-away if bad
2013-12-02 23:31:32 +01:00
Thierry FOURNIER
8af6ff12b5 MINOR: sample: export sample_casts
just export the sample cast matrix "sample_casts" to prepare the
generic sample conversion parser.
2013-12-02 23:31:32 +01:00
Thierry FOURNIER
1c0054fe83 BUG/MINOR: arg: fix error reporting for add-header/set-header sample fetch arguments
The 'add-header %[samples]' parsing errors associated to http-request
and http-response are displayed with the wrong keyword.

Configuration entry:

   http-request set-header mon-header %[res.hdr(user-agent)]

Original error message:

   [WARNING] 323/150920 (16559) : parsing [haproxy.conf:36] : 'log-format' : sample fetch <res.hdr ...

After commit error message:

   [WARNING] 323/150929 (16580) : parsing [haproxy.conf:36] : 'http-request' : sample fetch <res.hdr ...
2013-11-28 18:25:18 +01:00
Willy Tarreau
ef38c39287 MEDIUM: sample: systematically pass the keyword pointer to the keyword
We're having a lot of duplicate code just because of minor variants between
fetch functions that could be dealt with if the functions had the pointer to
the original keyword, so let's pass it as the last argument. An earlier
version used to pass a pointer to the sample_fetch element, but this is not
the best solution for two reasons :
  - fetch functions will solely rely on the keyword string
  - some other smp_fetch_* users do not have the pointer to the original
    keyword and were forced to pass NULL.

So finally we're passing a pointer to the keyword as a const char *, which
perfectly fits the original purpose.
2013-08-01 21:17:13 +02:00
Willy Tarreau
6236d3abe4 MINOR: sample: add a new "date" fetch to return the current date
Returns the current date as the epoch (number of seconds since 01/01/1970).
If an offset value is specified, then it is a number of seconds that is added
to the current date before returning the value. This is particularly useful
to compute relative dates, as both positive and negative offsets are allowed.
2013-07-25 15:00:37 +02:00
Willy Tarreau
5b8ad22228 CLEANUP: acl: move the 3 remaining sample fetches to samples.c
There is no more reason for having "always_true", "always_false" and "env"
in acl.c while they're the most basic sample fetch keywords, so let's move
them to sample.c where it's easier to find them.
2013-07-25 15:00:37 +02:00
Willy Tarreau
18387e2e48 MINOR: sample: fix sample_process handling of unstable data
sample_process() used to return NULL on changing data, regardless of the
SMP_OPT_FINAL flag. Let's change this so that it is now possible to
include such data in logs or HTTP headers. Also, one unconvenient
thing was that it used to always set the sample flags to zero, making
it incompatible with ACLs which may need to call it multiple times. Only
do this for locally-allocated samples.
2013-07-25 15:00:37 +02:00
Willy Tarreau
833cc79434 MEDIUM: sample: handle comma-delimited converter list
We now support having a comma-delimited converter list, which can start
right after the fetch keyword. The immediate benefit is that it allows
to use converters in log-format expressions, for example :

   set-header source-net %[src,ipmask(24)]

The parser is also slightly improved and should be more resilient against
configuration errors. Also, optional arguments in converters were mistakenly
not allowed till now, so this was fixed.
2013-07-25 15:00:37 +02:00
Willy Tarreau
dc13c11c1e BUG/MEDIUM: prevent gcc from moving empty keywords lists into BSS
Benoit Dolez reported a failure to start haproxy 1.5-dev19. The
process would immediately report an internal error with missing
fetches from some crap instead of ACL names.

The cause is that some versions of gcc seem to trim static structs
containing a variable array when moving them to BSS, and only keep
the fixed size, which is just a list head for all ACL and sample
fetch keywords. This was confirmed at least with gcc 3.4.6. And we
can't move these structs to const because they contain a list element
which is needed to link all of them together during the parsing.

The bug indeed appeared with 1.5-dev19 because it's the first one
to have some empty ACL keyword lists.

One solution is to impose -fno-zero-initialized-in-bss to everyone
but this is not really nice. Another solution consists in ensuring
the struct is never empty so that it does not move there. The easy
solution consists in having a non-null list head since it's not yet
initialized.

A new "ILH" list head type was thus created for this purpose : create
an Initialized List Head so that gcc cannot move the struct to BSS.
This fixes the issue for this version of gcc and does not create any
burden for the declarations.
2013-06-21 23:29:02 +02:00
Willy Tarreau
a4312fa28e MAJOR: sample: maintain a per-proxy list of the fetch args to resolve
While ACL args were resolved after all the config was parsed, it was not the
case with sample fetch args because they're almost everywhere now.

The issue is that ACLs now solely rely on sample fetches, so their args
resolving doesn't work anymore. And many fetches involving a server, a
proxy or a userlist don't work at all.

The real issue is that at the bottom layers we have no information about
proxies, line numbers, even ACLs in order to report understandable errors,
and that at the top layers we have no visibility over the locations where
fetches are referenced (think log node).

After failing multiple unsatisfying solutions attempts, we now have a new
concept of args list. The principle is that every proxy has a list head
which contains a number of indications such as the config keyword, the
context where it's used, the file and line number, etc... and a list of
arguments. This list head is of the same type as the elements, so it
serves as a template for adding new elements. This way, it is filled from
top to bottom by the callers with the information they have (eg: line
numbers, ACL name, ...) and the lower layers just have to duplicate it and
add an element when they face an argument they cannot resolve yet.

Then at the end of the configuration parsing, a loop passes over each
proxy's list and resolves all the args in sequence. And this way there is
all necessary information to report verbose errors.

The first immediate benefit is that for the first time we got very precise
location of issues (arg number in a keyword in its context, ...). Second,
in order to do this we had to parse log-format and unique-id-format a bit
earlier, so that was a great opportunity for doing so when the directives
are encountered (unless it's a default section). This way, the recorded
line numbers for these args are the ones of the place where the log format
is declared, not the end of the file.

Userlists report slightly more information now. They're the only remaining
ones in the ACL resolving function.
2013-04-03 02:13:02 +02:00
Willy Tarreau
bf8e251077 MINOR: sample: provide a function to report the name of a sample check point
We need to put names on places where samples are used in order to emit warnings
and errors. Let's do that now.
2013-04-03 02:13:00 +02:00
Willy Tarreau
80aca90ad2 MEDIUM: samples: use new flags to describe compatibility between fetches and their usages
Samples fetches were relying on two flags SMP_CAP_REQ/SMP_CAP_RES to describe
whether they were compatible with requests rules or with response rules. This
was never reliable because we need a finer granularity (eg: an HTTP request
method needs to parse an HTTP request, and is available past this point).

Some fetches are also dependant on the context (eg: "hdr" uses request or
response depending where it's involved, causing some abiguity).

In order to solve this, we need to precisely indicate in fetches what they
use, and their users will have to compare with what they have.

So now we have a bunch of bits indicating where the sample is fetched in the
processing chain, with a few variants indicating for some of them if it is
permanent or volatile (eg: an HTTP status is stored into the transaction so
it is permanent, despite being caught in the response contents).

The fetches also have a second mask indicating their validity domain. This one
is computed from a conversion table at registration time, so there is no need
for doing it by hand. This validity domain consists in a bitmask with one bit
set for each usage point in the processing chain. Some provisions were made
for upcoming controls such as connection-based TCP rules which apply on top of
the connection layer but before instantiating the session.

Then everywhere a fetch is used, the bit for the control point is checked in
the fetch's validity domain, and it becomes possible to finely ensure that a
fetch will work or not.

Note that we need these two separate bitfields because some fetches are usable
both in request and response (eg: "hdr", "payload"). So the keyword will have
a "use" field made of a combination of several SMP_USE_* values, which will be
converted into a wider list of SMP_VAL_* flags.

The knowledge of permanent vs dynamic information has disappeared for now, as
it was never used. Later we'll probably reintroduce it differently when
dealing with variables. Its only use at the moment could have been to avoid
caching a dynamic rate measurement, but nothing is cached as of now.
2013-04-03 02:12:56 +02:00