The pointer <regstr> is only used to compare and identify the original
regex string with the patterns. Now the patterns have a reference map
containing this original string. It is useless to store this value two
times.
Before this patch, this function try to add values in best effort. If
the parsing iof the value fail, the operation continue until the end.
Now, this function stop on the first error and left the pattern in
coherant state.
This patch adds new display type. This display returns allocated string,
when the string is flush into buffers, it is freed. This permit to
return the content of "memprintf(err, ...)" messages.
The pat_ref_add functions has changed to return error.
The format of the acl file are not the same than the format of the map
files. In some case, the same file can be used, but this is ambiguous
for the user because the patterns are not the expected.
The acl and map function do the same work with the file parsing. This
patch merge these code in only one.
Note that the function map_read_entries_from_file() in the file "map.c"
is moved to the the function pat_ref_read_from_file_smp() in the file
"pattern.c". The code of this function is not modified, only the the
name and the arguments order has changed.
Each pattern displayed is associated to the value of his pattern
reference. This value can be used for deleting the entry. It is useful
with complex regex: the users are not forced to write the regex with all
the amiguous chars and escaped chars on the CLI.
The find_smp search the smp using the value of the pat_ref_elt pointer.
The pat_find_smp_* are no longer used. The function pattern_find_smp()
known all pattern indexation, and can be found
All the pattern delete function can use her reference to the original
"struct pat_ref_elt" to find the element to be remove. The functions
pat_del_list_str() and pat_del_meth() were deleted because after
applying this modification, they have the same code than pat_del_list_ptr().
Before this patch, the "get map/acl" function try to convert and display
the sample. This behavior is not efficient because some type like the
regex cannot be reversed and displayed as string.
This patch display the original stored reference.
Now, each pattern entry known the original "struct pat_ref_elt" from
that was built. This patch permit to delete each pattern entry without
confusion. After this patch, each reference can use his pointer to be
targeted.
The function Pattern_add() is only used by pat_ref_push(). This patch
remove the function pattern_add() and merge his code in the function
pat_ref_push().
The pattern reference are stored with two identifiers: the unique_id and
the reference.
The reference identify a file. Each file with the same name point to the
same reference. We can register many times one file. If the file is
modified, all his dependencies are also modified. The reference can be
used with map or acl.
The unique_id identify inline acl. The unique id is unique for each acl.
You cannot force the same id in the configuration file, because this
repport an error.
The format of the acl and map listing through the "socket" has changed
for displaying these new ids.
This patch extract the expect_type variable from the "struct pattern" to
"struct pattern_head". This variable is set during the declaration of
ACL and MAP. With this change, the function "pat_parse_len()" become
useless and can be replaced by "pat_parse_int()".
Implicit ACLs by default rely on the fetch's output type, so let's simply do
the same for all other ones. It has been verified that they all match.
Sometimes the same pattern file is used with the same index, parse and
parse_smp functions. If this two condition are true, these two pattern
are identical and the same struct can be used.
This patch add the following socket command line options:
show acl [<id>]
clear acl <id>
get acl <id> <pattern>
del acl <id> <pattern>
add acl <id> <pattern>
The system used for maps is backported in the pattern functions.
Some functions needs to change the sample associated to pattern. This
new pointer permit to return the a pointer to the sample pointer. The
caller can use or change the value.
This commit adds a delete function for patterns. It looks up all
instances of the pattern to delete and deletes them all. The fetch
keyword declarations have been extended to point to the appropriate
delete function.
This commit adds second tree node in the pattern struct and use it to
index IPv6 addresses. This commit report feature used in the list. If
IPv4 not match the tree, try to convert the IPv4 address in IPv6 with
prefixing the IPv4 address by "::ffff", after this operation, the match
function try lookup in the IPv6 tree. If the IPv6 sample dont match the
IPv6 tree, try to convert the IPv6 addresses prefixed by "2002:IPv4",
"::ffff:IPv4" and "::0000:IPv4" in IPv4 address. after this operation,
the match function try lookup in the IPv4 tree.
The match function known the format of the pattern. The pattern can be
stored in a list or in a tree. The pattern matching function use itself
the good entry point and indexation type.
Each pattern matching function return the struct pattern that match. If
the flag "fill" is set, the struct pattern is filled, otherwise the
content of this struct must not be used.
With this feature, the general pattern matching function cannot have
exceptions for building the "struct pattern".
The original get map display function set the comma separator after each
word displayed. This is not efficient because we cannot knew if the
displayed word is the last.
This new system set the comma separator before the displayed word, and
independant "\n" is set a the end of the function.
Before this commit, the pattern_exec_match() function returns the
associate sample, the associate struct pattern or the associate struct
pattern_tree. This is complex to use, because we can check the type of
information returned.
Now the function return always a "struct pattern". If <fill> is not set,
only the value of the pointer can be used as boolean (NULL or other). If
<fill> is set, you can use the <smp> pointer and the pattern
information.
If information must be duplicated, it is stored in trash buffer.
Otherwise, the pattern can point on existing strings.
The method are actuelly stored using two types. Integer if the method is
known and string if the method is not known. The fetch is declared as
UINT, but in some case it can provides STR.
This patch create new type called METH. This type contain interge for
known method and string for the other methods. It can be used with
automatic converters.
The pattern matching can expect method.
During the free or prune function, http_meth pettern is freed. This
patch initialise the freed pointer to NULL.
The operations applied on types SMP_T_CSTR and SMP_T_STR are the same,
but the check code and the declarations are double, because it must
declare action for SMP_T_C* and SMP_T_*. The declared actions and checks
are the same. this complexify the code. Only the "conv" functions can
change from "C*" to "*"
Now, if a function needs to modify input string, it can call the new
function smp_dup(). This one duplicate data in a trash buffer.
The pattern parse functions put the parsed result in a "struct pattern"
without memory allocation. If the pattern must reference the input data
without changes, the pattern point to the parsed string. If buffers are
needed to store translated data, it use th trash buffer. The indexation
function that allocate the memory later if it is needed.
Before this patch, the indexation function check the declared patttern
matching function and index the data according with this function. This
is not useful to add some indexation mode.
This commit adds dedicated indexation function. Each struct pattern is
associated with one indexation function. This function permit to index
data according with the type of pattern and with the type of match.
This commit separes the "struct list" used for the chain the "struct
pattern" which contain the pattern data. Later, this change will permit
to manipulate lists ans trees with the same "struct pattern".
Each pattern parser take only one string. This change is reported to the
function prototype of the function "pattern_register()". Now, it is
called with just one string and no need to browse the array of args.
After the previous patches, the "pat_parse_strcat()" function disappear,
and the "pat_parse_int()" and "pat_parse_dotted_ver()" functions dont
use anymore the "opaque" argument, and take only one string on his
input.
So, after this patch, each pattern parser no longer use the opaque
variable and take only one string as input. This patch change the
prototype of the pattern parsing functions.
Now, the "char **args" is replaced by a "char *arg", the "int *opaque"
is removed and these functions return 1 in succes case, and 0 if fail.
The goal of these patch is to simplify the prototype of
"pat_pattern_*()" functions. I want to replace the argument "char
**args" by a simple "char *arg" and remove the "opaque" argument.
"pat_parse_int()" and "pat_parse_dotted_ver()" are the unique pattern
parser using the "opaque" argument and using more than one string
argument of the char **args. These specificities are only used with ACL.
Other systems using this pattern parser (MAP and CLI) just use one
string for describing a range.
This two functions can read a range, but the min and the max must y
specified. This patch extends the syntax to describe a range with
implicit min and max. This is used for operators like "lt", "le", "gt",
and "ge". the syntax is the following:
":x" -> no min to "x"
"x:" -> "x" to no max
This patch moves the parsing of the comparison operator from the
functions "pat_parse_int()" and "pat_parse_dotted_ver()" to the acl
parser. The acl parser read the operator and the values and build a
volatile string readable by the functions "pat_parse_int()" and
"pat_parse_dotted_ver()". The transformation is done with these rules:
If the parser is "pat_parse_int()":
"eq x" -> "x"
"le x" -> ":x"
"lt x" -> ":y" (with y = x - 1)
"ge x" -> "x:"
"gt x" -> "y:" (with y = x + 1)
If the parser is "pat_parse_dotted_ver()":
"eq x.y" -> "x.y"
"le x.y" -> ":x.y"
"lt x.y" -> ":w.z" (with w.z = x.y - 1)
"ge x.y" -> "x.y:"
"gt x.y" -> "w.z:" (with w.z = x.y + 1)
Note that, if "y" is not present, assume that is "0".
Now "pat_parse_int()" and "pat_parse_dotted_ver()" accept only one
pattern and the variable "opaque" is no longer used. The prototype of
the pattern parsers can be changed.
This patch remove the limit of 32 groups. It also permit to use standard
"pat_parse_str()" function in place of "pat_parse_strcat()". The
"pat_parse_strcat()" is no longer used and its removed. Before this
patch, the groups are stored in a bitfield, now they are stored in a
list of strings. The matching is slower, but the number of groups is
low and generally the list of allowed groups is short.
The fetch function "smp_fetch_http_auth_grp()" used with the name
"http_auth_group" return valid username. It can be used as string for
displaying the username or with the acl "http_auth_group" for checking
the group of the user.
Maybe the names of the ACL and fetch methods are no longer suitable, but
I keep the current names for conserving the compatibility with existing
configurations.
The function "userlist_postinit()" is created from verification code
stored in the big function "check_config_validity()". The code is
adapted to the new authentication storage system and it is moved in the
"src/auth.c" file. This function is used to check the validity of the
users declared in groups and to check the validity of groups declared
on the "user" entries.
This resolve function is executed before the check of all proxy because
many acl needs solved users and groups.
The ACL keyword returned by find_acl_kw() is checked for having a
valid ->parse() function. This dates back 2007 when ACLs were reworked
in order to differenciate old and new keywords. This check is
inappropriate and confusing since all keywords have a parser now.
The bin2str cast gives the hexadecimal representation of the binary
content when it is used as string. This was inherited from the
stick-table casts without realizing that it was a mistake. Indeed,
it breaks string processing on binary contents, preventing any _reg,
_beg, etc from working.
For example, with an HTTP GET request, the fetch "req.payload(0,3)"
returns the 3 bytes "G", "E", and "T" in binary. If this fetch is
used with regex, it is automatically converted to "474554" and the
regex is applied on this string, so it never matches.
This commit changes the cast so that bin2str does not convert the
contents anymore, and returns a string type. The contents can thus
be matched as is, and the NULL character continues to mark the end
of the string to avoid any issue with some string-based functions.
This commit could almost have been marked as a bug fix since it
does what the doc says.
Note that in case someone would rely on the hex encoding, then the
same behaviour could be achieved by appending ",hex" after the sample
fetch function (brought by previous patch).
This new filter converts BIN type to its hexadecimal
representation in STR type. It is used to keep the
compatibility with the original bin2str cast.
It will be useful when bin2str changes to copy the
string as-is without encoding anymore.
The binary samples are sometimes copied as is into http headers.
A sample can contain bytes unallowed by the http rfc concerning
header content, for example if it was extracted from binary data.
The resulting http request can thus be invalid.
This issue does not yet happen because haproxy currently (mistakenly)
hex-encodes binary data, so it is not really possible to retrieve
invalid HTTP chars.
The solution consists in hex-encoding all non-printable chars prefixed
by a '%' sign.
No backport is needed since existing code is not affected yet.