Commit Graph

153 Commits

Author SHA1 Message Date
Aurelien DARRAGON
23814a44e5 CLEANUP: tools: fix vma_set_name() function comment
There was a typo in the example provided in vma_set_name(): maps named
using the function will show up as "type:name", not "type.name", updating
the comment to reflect the current behavior.
2024-05-24 12:07:07 +02:00
Aurelien DARRAGON
51a8f134ef DEBUG: tools: add vma_set_name() helper
Following David Carlier's work in 98d22f21 ("MEDIUM: shctx: Naming shared
memory context"), let's provide an helper function to set a name hint on
a virtual memory area (ie: anonymous map created using mmap(), or memory
area returned by malloc()).

Naming will only occur if available, and naming errors will be ignored.
The function takes mandatory <type> and <name> parameterss to build the
map name as follow: "type:name". When looking at /proc/<pid>/maps, vma
named using this helper function will show up this way (provided that
the kernel has prtcl support for PR_SET_VMA_ANON_NAME):

example, using mmap + MAP_SHARED|MAP_ANONYMOUS:
  7364c4fff000-736508000000 rw-s 00000000 00:01 3540  [anon_shmem:type:name]
Another example, using mmap + MAP_PRIVATE|MAP_ANONYMOUS or using
glibc/malloc() above MMAP_THRESHOLD:
  7364c4fff000-736508000000 rw-s 00000000 00:01 3540  [anon:type:name]
2024-05-21 17:54:58 +02:00
Tim Duesterhus
6610f656ea DOC: Update UUID references to RFC 9562
When support for UUIDv7 was added in commit
aab6477b67
the specification still was a draft.

It has since been published as RFC 9562.

This patch updates all UUID references from the obsoleted RFC 4122 and the
draft for RFC 9562 to the published RFC 9562.
2024-05-15 11:40:08 +02:00
Aurelien DARRAGON
0e2aea8224 CLEANUP: tools/cbor: rename cbor_encode_ctx struct members
Rename e_byte_fct to e_fct_byte and e_fct_byte_ctx to e_fct_ctx, and
adjust some comments to make it clear that e_fct_ctx is here to provide
additional user-ctx to the custom cbor encode function pointers.

For now, only e_fct_byte function may be provided, but we could imagine
having e_fct_int{16,32,64}() one day to speed up the encoding when we
know we can encode multiple bytes at a time, but for now it's not worth
the hassle.
2024-04-29 14:47:37 +02:00
Aurelien DARRAGON
810303e3e6 MINOR: tools: add cbor encode helpers
Add cbor helpers to encode strings (bytes/text) and integers according to
RFC8949, also add cbor_encode_ctx struct to pass encoding options such as
how to encode a single byte.
2024-04-26 18:39:32 +02:00
Tim Duesterhus
aab6477b67 MINOR: Add ha_generate_uuid_v7
This function generates a version 7 UUID as per
draft-ietf-uuidrev-rfc4122bis-14.
2024-04-24 08:23:56 +02:00
Tim Duesterhus
c6cea750a9 MINOR: tools: Rename ha_generate_uuid to ha_generate_uuid_v4
This is in preparation of adding support for other UUID versions.
2024-04-24 08:23:56 +02:00
Willy Tarreau
16e3655fbd REORG: pool: move the area dump with symbol resolution to tools.c
This function is particularly useful to dump unknown areas watching
for opportunistic symbols, so let's move it to tools.c so that we can
reuse it a little bit more.
2024-04-12 18:01:20 +02:00
Aurelien DARRAGON
8226e92eb0 BUG/MINOR: tools/log: invalid encode_{chunk,string} usage
encode_{chunk,string}() is often found to be used this way:

  ret = encode_{chunk,string}(start, stop...)
  if (ret == NULL || *ret != '\0') {
	//error
  }
  //success

Indeed, encode_{chunk,string} will always try to add terminating NULL byte
to the output string, unless no space is available for even 1 byte.
However, it means that for the caller to be able to spot an error, then it
must provide a buffer (here: start) which is already initialized.

But this is wrong: not only this is very tricky to use, but since those
functions don't return NULL on failure, then if the output buffer was not
properly initialized prior to calling the function, the caller will
perform invalid reads when checking for failure this way. Moreover, even
if the buffer is initialized, we cannot reliably tell if the function
actually failed this way because if the buffer was previously initialized
with NULL byte, then the caller might think that the call actually
succeeded (since the function didn't return NULL and didn't update the
buffer).

Also, sess_build_logline() relies lf_encode_{chunk,string}() functions
which are in fact wrappers for encode_{chunk,string}() functions and thus
exhibit the same error handling mechanism. It turns out that
sess_build_logline() makes unsafe use of those functions because it uses
the error-checking logic mentionned above while buffer (tmplog) is not
guaranteed to be initialized when entering the function. This may
ultimately cause malfunctions or invalid reads if the output buffer is
lacking space.

To fix the issue once and for all and prevent similar bugs from being
introduced, we make it so encode_{string, chunk} and escape_string()
(based on encode_string()) now explicitly return NULL on failure
(when the function failed to write at least the ending NULL byte)

lf_encode_{string,chunk}() helpers had to be patched as well due to code
duplication.

This should be backported to all stable versions.

[ada: for 2.4 and 2.6 the patch won't apply as-is, it might be helpful to
 backport ae1e14d65 ("CLEANUP: tools: removing escape_chunk() function")
 first, considering it's not very relevant to maintain a dead function]
2024-04-09 17:35:45 +02:00
Brooks Davis
c03a023882 MINOR: tools: use public interface for FreeBSD get_exec_path()
Where possible (FreeBSD 13+), use the public, documented interface to
the ELF auxiliary argument vector: elf_aux_info().

__elf_aux_vector is a private interface exported so that the runtime
linker can set its value during process startup and not intended for
public consumption.  In FreeBSD 15 it has been removed from libc and
moved to libsys.
2024-03-11 19:00:37 +01:00
Willy Tarreau
7151076522 BUG/MINOR: tools: seed the statistical PRNG slightly better
Thomas Baroux reported a very interesting issue. "balance random" would
systematically assign the same server first upon restart. That comes from
its use of statistical_prng() which is only seeded with the thread number,
and since at low loads threads are assigned to incoming connections in
round robin order, practically speaking, the same thread always gets the
same request and will produce the same random number.

We already have a much better RNG that's also way more expensive, but we
can use it at boot time to seed the PRNG instead of using the thread ID
only.

This needs to be backported to 2.4.
2024-03-01 16:25:39 +01:00
Emeric Brun
ef02dba7bc BUG/MEDIUM: cli: some err/warn msg dumps add LR into CSV output on stat's CLI
The initial purpose of CSV stats through CLI was to make it easely
parsable by scripts. But in some specific cases some error or warning
messages strings containing LF were dumped into cells of this CSV.

This made some parsing failure on several tools. In addition, if a
warning or message contains to successive LF, they will be dumped
directly but double LFs tag the end of the response on CLI and the
client may consider a truncated response.

This patch extends the 'csv_enc_append' and 'csv_enc' functions used
to format quoted string content according to RFC  with an additionnal
parameter to convert multi-lines strings to one line: CRs are skipped,
and LFs are replaced with spaces. In addition and optionally, it is
also possible to remove resulting trailing spaces.

The call of this function to fill strings into stat's CSV output is
updated to force this conversion.

This patch should be backported on all supported branches (issue was
already present in v2.0)
2024-01-24 08:38:59 +01:00
Frdric Lcaille
ff8db5a85d BUG/MINOR: config: Stopped parsing upon unmatched environment variables
When an environment variable could not be matched by getenv(), the
current character to be parsed by parse_line() from <in> variable
is the trailing double quotes. If nothing is done in such a case,
this character is skipped by parse_line(), then the following spaces
are parsed as an empty argument.

To fix this, skip the double quotes character and the following spaces
to make <in> variable point to the next argument to be parsed.

Thank you to @sigint2 for having reported this issue in GH #2367.

Must be backported as far as 2.4.
2023-11-30 16:48:41 +01:00
Amaury Denoyelle
86e5c607d1 MINOR: rhttp: mark reverse HTTP as experimental
Mark the reverse HTTP feature as experimental. This will allow to adjust
if needed the configuration mechanism with future developments without
maintaining retro-compatibility.

Concretely, each config directives linked to it now requires to specify
first global expose-experimental-directives before. This is the case for
the following directives :
- rhttp@ prefix uses in bind and server lines
- nbconn bind keyword
- attach-srv tcp rule

Each documentation section refering to these keywords are updated to
highlight this new requirement.

Note that this commit has duplicated on several places the code from the
global function check_kw_experimental(). This is because the latter only
work with cfg_keyword type. This is not adapted with bind_kw or
action_kw types. This should be improve in a future patch.
2023-11-30 15:04:27 +01:00
Aurelien DARRAGON
24da4d3ee7 MINOR: tools: use const for read only pointers in ip{cmp,cpy}
In this patch we fix the prototype for ipcmp() and ipcpy() functions so
that input pointers that are used exclusively for reads are used as const
pointers. This way, the compiler can safely assume that those variables
won't be altered by the function.
2023-11-24 16:27:55 +01:00
Amaury Denoyelle
55e78ff7e1 MINOR: rhttp: large renaming to use rhttp prefix
Previous commit renames 'proto_reverse_connect' module to 'proto_rhttp'.
This commits follows this by replacing various custom prefix by 'rhttp_'
to make the code uniform.

Note that 'reverse_' prefix was kept in connection module. This is
because if a new reversable protocol not based on HTTP is implemented,
it may be necessary to reused the same connection function which are
protocol agnostic.
2023-11-23 17:40:01 +01:00
Ilya Shipitsin
80813cdd2a CLEANUP: assorted typo fixes in the code and comments
This is 37th iteration of typo fixes
2023-11-23 16:23:14 +01:00
Aurelien DARRAGON
12582eb8e5 MINOR: tools: make str2sa_range() directly return type hints
str2sa_range() already allows the caller to provide <proto> in order to
get a pointer on the protocol matching with the string input thanks to
5fc9328a ("MINOR: tools: make str2sa_range() directly return the protocol")

However, as stated into the commit message, there is a trick:
   "we can fail to return a protocol in case the caller
    accepts an fqdn for use later. This is what servers do and in this
    case it is valid to return no protocol"

In this case, we're unable to return protocol because the protocol lookup
depends on both the [proto type + xprt type] and the [family type] to be
known.

While family type might not be directly resolved when fqdn is involved
(because family type might be discovered using DNS queries), proto type
and xprt type are already known. As such, the caller might be interested
in knowing those address related hints even if the address family type is
not yet resolved and thus the matching protocol cannot be looked up.

Thus in this patch we add the optional net_addr_type (custom type)
argument to str2sa_range to enable the caller to check the protocol type
and transport type when the function succeeds.
2023-11-10 17:49:57 +01:00
Amaury Denoyelle
e05edf71df MINOR: cfgparse: rename "rev@" prefix to "rhttp@"
'rev@' was used to specify a bind/server used with reverse HTTP
transport. This notation was deemed not explicit enough. Rename it
'rhttp@' instead.
2023-10-20 14:44:37 +02:00
Aurelien DARRAGON
72514a4467 MEDIUM: tools/ip: v4tov6() and v6tov4() rework
v4tov6() and v6tov4() helper function were initially implemented in
4f92d3200 ("[MEDIUM] IPv6 support for stick-tables").

However, since ceb4ac9c3 ("MEDIUM: acl: support IPv6 address matching")
support for legacy ip6 to ip4 conversion formats were added, with the
parsing logic directly performed in acl_match_ip (which later became
pat_match_ip)

The issue is that the original v6tov4() function which is used for sample
expressions handling lacks those additional formats, so we could face
inconsistencies whether we rely on ip4/ip6 conversions from an acl context
or an expression context.

To unify ip4/ip6 automatic mapping behavior, we reworked v4tov6 and v6tov4
functions so that they now behave like in pat_match_ip() function.

Note: '6to4 (RFC3056)' and 'RFC4291 ipv4 compatible address' formats are
still supported for legacy purposes despite being deprecated for a while
now.
2023-09-21 09:50:55 +02:00
Willy Tarreau
1f2433fb6a MINOR: tools: add function read_line_to_trash() to read a line of a file
This function takes on input a printf format for the file name, making
it particularly suitable for /proc or /sys entries which take a lot of
numbers. It also automatically trims the trailing CR and/or LF chars.
2023-09-08 16:25:19 +02:00
Amaury Denoyelle
5db6dde058 MINOR: proto: define dedicated protocol for active reverse connect
A new protocol named "reverse_connect" is created. This will be used to
instantiate connections that are opened by a reverse bind.

For the moment, only a minimal set of callbacks are defined with no real
work. This will be extended along the next patches.
2023-08-24 17:02:37 +02:00
Thierry Fournier
65f18d65a3 BUG/MINOR: config: Lenient port configuration parsing
Configuration parsing allow port like 8000/websocket/. This is
a nonsense and allowing this syntax may hide to the user something
not corresponding to its intent.

This patch should not be backported because it could break existing
configurations
2023-07-11 20:58:28 +02:00
Thierry Fournier
111351eebb BUG/MINOR: config: Remove final '\n' in error messages
Because error messages are displayed with appending final '\n', it's
useless to add '\n' in the message.

This patch should be backported.
2023-07-11 20:58:28 +02:00
Willy Tarreau
997ad155fe BUG/MINOR: tools: check libssl and libcrypto separately
The lib compatibility checks introduced in 2.8-dev6 with commit c3b297d5a
("MEDIUM: tools: further relax dlopen() checks too consider grouped
symbols") were partially incorrect in that they check at the same time
libcrypto and libssl. But if loading a library that only depends on
libcrypto, the ssl-only symbols will be missing and this might present
an inconsistency. This is what is observed on FreeBSD 13.1 when
libcrypto is being loaded, where it sees two symbols having disappeared.

The fix consists in splitting the checks for libcrypto and libssl.

No backport is needed, unless the patch above finally gets backported.
2023-04-23 09:46:15 +02:00
Willy Tarreau
c3b297d5a4 MEDIUM: tools: further relax dlopen() checks too consider grouped symbols
There's a recurring issue regarding shared library loading from Lua. If
the imported library is linked with a different version of openssl but
doesn't use it, the check will trigger and emit a warning. In practise
it's not necessarily a problem as long as the API is the same, because
all symbols are replaced and the library will use the included ssl lib.

It's only a problem if the library comes with a different API because
the dynamic linker will only replace known symbols with ours, and not
all. Thus the loaded lib may call (via a static inline or a macro) a
few different symbols that will allocate or preinitialize structures,
and which will then pass them to the common symbols coming from a
different and incompatible lib, exactly what happens to users of Lua's
luaossl when building haproxy with quictls and without rebuilding
luaossl.

In order to better address this situation, we now define groups of
symbols that must always appear/disappear in a consistent way. It's OK
if they're all absent from either haproxy or the lib, it means that one
of them doesn't use them so there's no problem. But if any of them is
defined on any side, all of them must be in the exact same state on the
two sides. The symbols are represented using a bit in a mask, and the
mask of the group of other symbols they're related to. This allows to
check 64 symbols, this should be OK for a while. The first ones that
are tested for now are symbols whose combination differs between
openssl versions 1.0, 1.1, and 3.0 as well as quictls. Thus a difference
there will indicate upcoming trouble, but no error will mean that we're
running on a seemingly compatible API and that all symbols should be
replaced at once.

The same mechanism could possibly be used for pcre/pcre2, zlib and the
few other optional libs that may occasionally cause runtime issues when
used by dependencies, provided that discriminatory symbols are found to
distinguish them. But in practice such issues are pretty rare, mainly
because loading standard libs via Lua on a custom build of haproxy is
not pretty common.

In the event that further symbol compatibility issues would be reported
in the future, backporting this patch as well as the following series
might be an acceptable solution given that the scope of changes is very
narrow (the malloc stuff is needed so that the malloc/free tests can be
dropped):

  BUG/MINOR: illegal use of the malloc_trim() function if jemalloc is used
  MINOR: pools: make sure 'no-memory-trimming' is always used
  MINOR: pools: intercept malloc_trim() instead of trying to plug holes
  MEDIUM: pools: move the compat code from trim_all_pools() to malloc_trim()
  MINOR: pools: export trim_all_pools()
  MINOR: pattern: use trim_all_pools() instead of a conditional malloc_trim()
  MINOR: tools: relax dlopen() on malloc/free checks
2023-03-22 17:30:28 +01:00
Willy Tarreau
58912b8d92 MINOR: tools: relax dlopen() on malloc/free checks
Now that we can provide a safe malloc_trim() we don't need to detect
anymore that some dependencies use a different set of malloc/free
functions than ours because they will use the same as those we're
seeing, and we control their use of malloc_trim(). The comment about
the incompatibility with DEBUG_MEM_STATS is not true anymore either
since the feature relies on macros so we're now OK.

This will stop catching libraries linked against glibc's allocator
when haproxy is natively built with jemalloc. This was especially
annoying since dlopen() on a lib depending on jemalloc() tends to
fail on TLS issues.
2023-03-22 17:30:28 +01:00
Willy Tarreau
8f6da64641 MINOR: quic_sock: un-statify quic_conn_sock_fd_iocb()
This one is printed as the iocb in the "show fd" output, and arguably
this wasn't very convenient as-is:
    293 : st=0x000123(cl heopI W:sRa R:sRA) ref=0 gid=1 tmask=0x8 umask=0x0 prmsk=0x8 pwmsk=0x0 owner=0x7f488487afe0 iocb=0x50a2c0(main+0x60f90)

Let's unstatify it and export it so that the symbol can now be resolved
from the various points that need it.
2023-03-10 14:30:01 +01:00
Willy Tarreau
40725a4eb0 MINOR: listener: also support "quic+" as an address prefix
While we do support quic4@ and quic6@ for listening addresses, it was
not possible to specify that we want to use an FD inherited from the
parent with QUIC. It's just a matter of making it possible to enable
a dgram-type socket and a stream-type transport, so let's add this.

Now it becomes possible to write "quic+fd@12", "quic+ipv4@addr" etc.
2023-01-16 14:00:51 +01:00
Amaury Denoyelle
21e611dc89 MINOR: tools: add port for ipcmp as optional criteria
Complete ipcmp() function with a new argument <check_port>. If this
argument is true, the function will compare port values besides IP
addresses and return true only if both are identical.

This commit will simplify QUIC connection migration detection. As such,
it should be backported to 2.7.
2022-12-02 14:45:43 +01:00
Willy Tarreau
08093cc0fa CLEANUP: tools: do not needlessly include xxhash nor cli from tools.h
These includes brought by commit 9c76637ff ("MINOR: anon: add new macros
and functions to anonymize contents") resulted in an increase of exactly
20% of the number of lines to build. These include are not needed there,
only tools.c needs xxhash.h.
2022-11-24 08:30:48 +01:00
Aurelien DARRAGON
e3177af465 CLEANUP: tools: extra check in utoa_pad
Removing useless check in utoa_pad().

This was reported by Ilya with the help of cppcheck.
2022-11-22 16:27:52 +01:00
Willy Tarreau
7de8de0bf8 BUILD: tools: use __fallthrough in url_decode()
This avoids one build warning when preprocessing happens before compiling
with gcc >= 7.
2022-11-14 11:14:02 +01:00
Willy Tarreau
94ab139266 BUG/MEDIUM: config: count line arguments without dereferencing the output
Previous commit 8a6767d26 ("BUG/MINOR: config: don't count trailing spaces
as empty arg (v2)") was still not enough. As reported by ClusterFuzz in
issue 52049 (https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=52049),
there remains a case where for the sake of reporting the correct argument
count, the function may produce virtual args that span beyond the end of
the output buffer if that one is too short. That's what's happening with
a config file of one empty line followed by a large number of args.

This means that what args[] points to cannot be relied on and that a
different approach is needed. Since no output is produced for spaces and
comments, we know that args[arg] continues to point to out+outpos as long
as only comments or spaces are found, which is what we're interested in.

As such it's safe to check the last arg's pointer against the one before
the trailing zero was emitted, in order to decide to count one final arg.

No backport is needed, unless the commit above is backported.
2022-10-03 09:24:26 +02:00
Erwan Le Goas
8a6767d266 BUG/MINOR: config: don't count trailing spaces as empty arg (v2)
In parse_line(), spaces increment the arg count and it is incremented
again on '#' or end of line, resulting in an extra empty arg at the
end of arg's list. The visible effect is that the reported arg count
is in excess of 1. It doesn't seem to affect regular function but
specialized ones like anonymisation depends on this count.

This is the second attempt for this problem, here the explanation :

When called for the first line, no <out> was allocated yet so it's NULL,
letting the caller realloc a larger line if needed. However the words are
parsed and their respective args[arg] are filled with out+position, which
means that while the first arg is NULL, the other ones are no and fail the
test that was meant to avoid dereferencing a NULL. Let's simply check <out>
instead of <args> since the latter is always derived from the former and
cannot be NULL without the former also being.

This may need to be backported to stable versions.
2022-09-30 15:21:20 +02:00
Christopher Faulet
015bbc298f MINOR: tools: Impprove hash_ipanon to not hash FD-based addresses
"stdout" and "stderr" are not hashed. In the same spirit, "fd@" and
"sockpair@" prefixes are not hashed too. There is no reason to hash such
address and it may be useful to diagnose bugs.

No backport needed, except if anonymization mechanism is backported.
2022-09-29 11:53:08 +02:00
Christopher Faulet
7e50e4b9cc MINOR: tools: Impprove hash_ipanon to support dgram sockets and port offsets
Add PA_O_DGRAM and PA_O_PORT_OFS options when str2sa_range() is called. This
way dgram sockets and addresses with port offsets are supported.

No backport needed, except if anonymization mechanism is backported.
2022-09-29 11:46:35 +02:00
Erwan Le Goas
5eef1588a1 MINOR: tools: modify hash_ipanon in order to use it in cli
Add a parameter hasport to return a simple hash or ipstring when
ipstring has no port. Doesn't hash if scramble is null. Add
option PA_O_PORT_RESOLVE to str2sa_range. Add a case UNIX.
Those modification permit to use hash_ipanon in cli section
in order to dump the same anonymization of address in the
configuration file and with CLI.

No backport needed, except if anonymization mechanism is backported.
2022-09-29 10:53:14 +02:00
Christopher Faulet
c5daf2801a Revert "BUG/MINOR: config: don't count trailing spaces as empty arg"
This reverts commit 5529424ef1.

Since this patch, HAProxy crashes when the first line of the configuration
file contains more than one parameter because, on the first call of
parse_line(), the output line is not allocated. Thus elements in the
arguments array may point on invalid memory area.

It may be considered as a bug to reference invalid memory area and should be
fixed. But for now, it is safer to revert this patch

If the reverted commit is backported, this one must be backported too.
2022-09-28 18:40:50 +02:00
Erwan Le Goas
5529424ef1 BUG/MINOR: config: don't count trailing spaces as empty arg
In parse_line(), spaces increment the arg count and it is incremented
again on '#' or end of line, resulting in an extra empty arg at the
end of arg's list. The visible effect is that the reported arg count
is in excess of 1. It doesn't seem to affect regular function but
specialized ones like anonymisation depends on this count.

This may need to be backported to stable versions.
2022-09-28 15:16:29 +02:00
Erwan Le Goas
d2605cf0e5 BUG/MINOR: anon: memory illegal accesses in tools.c with hash_anon and hash_ipanon
chipitsine reported in github issue #1872 that in function hash_anon and
hash_ipanon, index_hash can be equal to NB_L_HASH_WORD and can reach an
inexisting line table, the table is initialized hash_word[NB_L_HASH_WORD][20];
so hash_word[NB_L_HASH_WORD] doesn't exist.

No backport needed, except if anonymization mechanism is backported.
2022-09-22 15:44:13 +02:00
Aurelien DARRAGON
ae1e14d65b CLEANUP: tools: removing escape_chunk() function
Func is not used anymore. See e3bde807d.
2022-09-20 16:25:30 +02:00
Aurelien DARRAGON
c5bff8e550 BUG/MINOR: log: improper behavior when escaping log data
Patrick Hemmer reported an improper log behavior when using
log-format to escape log data (+E option):
Some bytes were truncated from the output:

- escape_string() function now takes an extra parameter that
  allow the caller to specify input string stop pointer in
  case the input string is not guaranteed to be zero-terminated.
- Minors checks were added into lf_text_len() to make sure dst
  string will not overflow.
- lf_text_len() now makes proper use of escape_string() function.

This should be backported as far as 1.8.
2022-09-20 16:25:30 +02:00
Erwan Le Goas
9c76637fff MINOR: anon: add new macros and functions to anonymize contents
These macros and functions will be used to anonymize strings by producing
a short hash. This will allow to match config elements against dump elements
without revealing the original data. This will later be used to anonymize
configuration parts and CLI commands output. For now only string, identifiers
and addresses are supported, but the model is easily extensible.
2022-09-17 11:24:53 +02:00
Willy Tarreau
0dc9e6dca2 DEBUG: tools: provide a tree dump function for ebmbtrees as well
It's convenient for debugging IP trees. However we're not dumping the
full keys, for the sake of simplicity, only the 4 first bytes are dumped
as a u32 hex value. In practice this is sufficient for debugging. As a
reminder since it seems difficult to recover the command each time it's
needed, the output is converted to an image using dot from Graphviz:

    dot -o a.png -Tpng dump.txt
2022-08-01 11:59:15 +02:00
Willy Tarreau
5b3cd9561b BUG/MEDIUM: tools: avoid calling dlsym() in static builds (try 2)
The first approach in commit 288dc1d8e ("BUG/MEDIUM: tools: avoid calling
dlsym() in static builds") relied on dlopen() but on certain configs (at
least gcc-4.8+ld-2.27+glibc-2.17) it used to catch situations where it
ought not fail.

Let's have a second try on this using dladdr() instead. The variable was
renamed "build_is_static" as it's exactly what's being detected there.
We could even take it for reporting in -vv though that doesn't seem very
useful. At least the variable was made global to ease inspection via the
debugger, or in case it's useful later.

Now it properly detects a static build even with gcc-4.4+glibc-2.11.1 and
doesn't crash anymore.
2022-07-18 14:03:54 +02:00
Willy Tarreau
288dc1d8ee BUG/MEDIUM: tools: avoid calling dlsym() in static builds
Since 2.4 with commit 64192392c ("MINOR: tools: add functions to retrieve
the address of a symbol"), we can resolve symbols. However some old glibc
crash in dlsym() when the program is statically built.

Fortunately even on these old libs we can detect lack of support by
calling dlopen(NULL). Normally it returns a handle to the current
program, but on a static build it returns NULL. This is sufficient to
refrain from calling dlsym() (which will be of very limited use anyway),
so we check this once at boot and use the result when needed.

This may be backported to 2.4. On stable versions, be careful to place
the init code inside an if/endif guard that checks for DL support.
2022-07-16 13:49:34 +02:00
Willy Tarreau
c7a8a3c7bd MINOR: intops: add a function to return a valid bit position from a mask
Sometimes we need to be able to signal one thread among a mask, without
caring much about which bit will be picked. At the moment we use ffsl()
for this but this sometimes results in imbalance at certain specific
places where the same first thread in a set is always the same one that
is selected.

Another approach would consist in using the rank finding function but it
requires a popcount and a setup phase, and possibly a modulo operation
depending on the popcount, which starts to be very expensive.

Here we take a different approach. The idea is an input bit value is
passed, from 0 to LONGBITS-1, and that as much as possible we try to
pick the bit matching it if it is set. Otherwise we look at a mirror
position based on a decreasing power of two, and jump to the side
that still has bits left. In 6 iterations it ends up spotting one bit
among 64 and the operations are very cheap and optimizable. This method
has the benefit that we don't care where the holes are located in the
mask, thus it shows a good distribution of output bits based on the
input ones. A long-time test shows an average of 16 cycles, or ~4ns
per lookup at 3.8 GHz, which is about twice as fast as using the rank
finding function.

Just like for that one, the code was stored into tools.c since we don't
have a C file for intops.
2022-06-21 20:29:57 +02:00
Willy Tarreau
177aed56dc MEDIUM: debug: detect redefinition of symbols upon dlopen()
In order to better detect the danger caused by extra shared libraries
which replace some symbols, upon dlopen() we now compare a few critical
symbols such as malloc(), free(), and some OpenSSL symbols, to see if
the loaded library comes with its own version. If this happens, a
warning is emitted and TAINTED_REDEFINITION is set. This is important
because some external libs might be linked against different libraries
than the ones haproxy was linked with, and most often this will end
very badly (e.g. an OpenSSL object is allocated by haproxy and freed
by such libs).

Since the main source of dlopen() calls is the Lua lib, via a "require"
statement, it's worth trying to show a Lua call trace when detecting a
symbol redefinition during dlopen(). As such we emit a Lua backtrace if
Lua is detected as being in use.
2022-06-19 17:58:32 +02:00
Willy Tarreau
40dde2d5c1 MEDIUM: debug: add a tainted flag when a shared library is loaded
Several bug reports were caused by shared libraries being loaded by other
libraries or some Lua code. Such libraries could define alternate symbols
or include dependencies to alternate versions of a library used by haproxy,
making it very hard to understand backtraces and analyze the issue.

Let's intercept dlopen() and set a new TAINTED_SHARED_LIBS flag when it
succeeds, so that "show info" makes it visible that some external libs
were added.

The redefinition is based on the definition of RTLD_DEFAULT or RTLD_NEXT
that were previously used to detect that dlsym() is usable, since we need
it as well. This should be sufficient to catch most situations.
2022-06-19 17:58:32 +02:00