Commit Graph

4011 Commits

Author SHA1 Message Date
Willy Tarreau
638698da37 BUILD: stream-int: fix a few includes dependencies
The stream-int code doesn't need to load server.h as it doesn't use
servers at all. However removing this one reveals that proxy.h was
lacking types/checks.h that used to be silently inherited from
types/server.h loaded before in stream_interface.h.
2020-03-11 14:15:33 +01:00
Willy Tarreau
855796bdc8 BUG/MAJOR: list: fix invalid element address calculation
Ryan O'Hara reported that haproxy breaks on fedora-32 using gcc-10
(pre-release). It turns out that constructs such as:

    while (item != head) {
         item = LIST_ELEM(item.n);
    }

loop forever, never matching <item> to <head> despite a printf there
showing them equal. In practice the problem is that the LIST_ELEM()
macro is wrong, it assigns the subtract of two pointers (an integer)
to another pointer through a cast to its pointer type. And GCC 10 now
considers that this cannot match a pointer and silently optimizes the
comparison away. A tested workaround for this is to build with
-fno-tree-pta. Note that older gcc versions even with -ftree-pta do
not exhibit this rather surprizing behavior.

This patch changes the test to instead cast the null-based address to
an int to get the offset and subtract it from the pointer, and this
time it works. There were just a few places to adjust. Ideally
offsetof() should be used but the LIST_ELEM() API doesn't make this
trivial as it's commonly called with a typeof(ptr) and not typeof(ptr*)
thus it would require to completely change the whole API, which is not
something workable in the short term, especially for a backport.

With this change, the emitted code is subtly different even on older
versions. A code size reduction of ~600 bytes and a total executable
size reduction of ~1kB are expected to be observed and should not be
taken as an anomaly. Typically this loop in dequeue_proxy_listeners() :

   	while ((listener = MT_LIST_POP(...)))

used to produce this code where the comparison is performed on RAX
while the new offset is assigned to RDI even though both are always
identical:

  53ded8:       48 8d 78 c0             lea    -0x40(%rax),%rdi
  53dedc:       48 83 f8 40             cmp    $0x40,%rax
  53dee0:       74 39                   je     53df1b <dequeue_proxy_listeners+0xab>

and now produces this one which is slightly more efficient as the
same register is used for both purposes:

  53dd08:       48 83 ef 40             sub    $0x40,%rdi
  53dd0c:       74 2d                   je     53dd3b <dequeue_proxy_listeners+0x9b>

Similarly, retrieving the channel from a stream_interface using si_ic()
and si_oc() used to cause this (stream-int in rdi):

    1cb7:       c7 47 1c 00 02 00 00    movl   $0x200,0x1c(%rdi)
    1cbe:       f6 47 04 10             testb  $0x10,0x4(%rdi)
    1cc2:       74 1c                   je     1ce0 <si_report_error+0x30>
    1cc4:       48 81 ef 00 03 00 00    sub    $0x300,%rdi
    1ccb:       81 4f 10 00 08 00 00    orl    $0x800,0x10(%rdi)

and now causes this:

    1cb7:       c7 47 1c 00 02 00 00    movl   $0x200,0x1c(%rdi)
    1cbe:       f6 47 04 10             testb  $0x10,0x4(%rdi)
    1cc2:       74 1c                   je     1ce0 <si_report_error+0x30>
    1cc4:       81 8f 10 fd ff ff 00    orl    $0x800,-0x2f0(%rdi)

There is extremely little chance that this fix wakes up a dormant bug as
the emitted code effectively does what the source code intends.

This must be backported to all supported branches (dropping MT_LIST_ELEM
and the spoa_example parts as needed), since the bug is subtle and may
not always be visible even when compiling with gcc-10.
2020-03-11 14:12:51 +01:00
Olivier Houchard
1d117e3dcd BUG/MEDIUM: mt_lists: Make sure we set the deleted element to NULL;
In MT_LIST_DEL_SAFE(), when the code was changed to use a temporary variable
instead of using the provided pointer directly, we shouldn't have changed
the code that set the pointer to NULL, as we really want the pointer
provided to be nullified, otherwise other parts of the code won't know
we just deleted an element, and bad things will happen.

This should be backported to 2.1.
2020-03-10 17:45:05 +01:00
Willy Tarreau
9a0dfa5298 CLEANUP: remove the now unused common/syscall.h
It was added 9 years ago to implement USE_MY_SPLICE on some libcs where
syscall() was bogus. It's about time to get rid of this.
2020-03-10 07:28:46 +01:00
Willy Tarreau
06c63aec95 CLEANUP: remove support for USE_MY_SPLICE
The splice() syscall has been supported in glibc since version 2.5 issued
in 2006 and is present on supported systems so there's no need for having
our own arch-specific syscall definitions anymore.
2020-03-10 07:23:41 +01:00
Willy Tarreau
3858b122a6 CLEANUP: remove support for USE_MY_EPOLL
This was made to support epoll on patched 2.4 kernels, and on early 2.6
using alternative libcs thanks to the arch-specific syscall definitions.
All the features we support have been around since 2.6.2 and present in
glibc since 2.3.2, neither of which are found in field anymore. Let's
simply drop this and use epoll normally.
2020-03-10 07:08:10 +01:00
Willy Tarreau
618ac6ea52 CLEANUP: drop support for USE_MY_ACCEPT4
The accept4() syscall has been present for a while now, there is no more
reason for maintaining our own arch-specific syscall implementation for
systems lacking it in libc but having it in the kernel.
2020-03-10 07:02:46 +01:00
Willy Tarreau
c3e926bf3b CLEANUP: remove support for Linux i686 vsyscalls
This was introduced 10 years ago to squeeze a few CPU cycles per syscall
on 32-bit x86 machines and was already quite old by then, requiring to
explicitly enable support for this in the kernel. We don't even know if
it still builds, let alone if it works at all on recent kernels! Let's
completely drop this now.
2020-03-10 06:55:52 +01:00
William Lallemand
6763016866 BUG/MINOR: ssl/cli: sni_ctx' mustn't always be used as filters
Since commit 244b070 ("MINOR: ssl/cli: support crt-list filters"),
HAProxy generates a list of filters based on the sni_ctx in memory.
However it's not always relevant, sometimes no filters were configured
and the CN/SAN in the new certificate are not the same.

This patch fixes the issue by using a flag filters in the ckch_inst, so
we are able to know if there were filters or not. In the late case it
uses the CN/SAN of the new certificate to generate the sni_ctx.

note: filters are still only used in the crt-list atm.
2020-03-09 17:32:04 +01:00
William Lallemand
0a52846603 CLEANUP: ssl: is_default is a bit in ckch_inst
The field is_default becomes a bit in the ckch_inst structure.
2020-03-09 17:32:04 +01:00
Miroslav Zagorac
d7dc67ba1d CLEANUP: remove unused code in 'my_ffsl/my_flsl' functions
Shifting the variable 'a' one bit to the right has no effect on the
result of the functions.
2020-03-09 14:47:27 +01:00
Willy Tarreau
ee3bcddef7 MINOR: tools: add a generic function to generate UUIDs
We currently have two UUID generation functions, one for the sample
fetch and the other one in the SPOE filter. Both were a bit complicated
since they were made to support random() implementations returning an
arbitrary number of bits, and were throwing away 33 bits every 64. Now
we don't need this anymore, so let's have a generic function consuming
64 bits at once and use it as appropriate.
2020-03-08 18:04:16 +01:00
Willy Tarreau
52bf839394 BUG/MEDIUM: random: implement a thread-safe and process-safe PRNG
This is the replacement of failed attempt to add thread safety and
per-process sequences of random numbers initally tried with commit
1c306aa84d ("BUG/MEDIUM: random: implement per-thread and per-process
random sequences").

This new version takes a completely different approach and doesn't try
to work around the horrible OS-specific and non-portable random API
anymore. Instead it implements "xoroshiro128**", a reputedly high
quality random number generator, which is one of the many variants of
xorshift, which passes all quality tests and which is described here:

   http://prng.di.unimi.it/

While not cryptographically secure, it is fast and features a 2^128-1
period. It supports fast jumps allowing to cut the period into smaller
non-overlapping sequences, which we use here to support up to 2^32
processes each having their own, non-overlapping sequence of 2^96
numbers (~7*10^28). This is enough to provide 1 billion randoms per
second and per process for 2200 billion years.

The implementation was made thread-safe either by using a double 64-bit
CAS on platforms supporting it (x86_64, aarch64) or by using a local
lock for the time needed to perform the shift operations. This ensures
that all threads pick numbers from the same pool so that it is not
needed to assign per-thread ranges. For processes we use the fast jump
method to advance the sequence by 2^96 for each process.

Before this patch, the following config:
    global
        nbproc 8

    frontend f
        bind :4445
        mode http
        log stdout format raw daemon
        log-format "%[uuid] %pid"
        redirect location /

Would produce this output:
    a4d0ad64-2645-4b74-b894-48acce0669af 12987
    a4d0ad64-2645-4b74-b894-48acce0669af 12992
    a4d0ad64-2645-4b74-b894-48acce0669af 12986
    a4d0ad64-2645-4b74-b894-48acce0669af 12988
    a4d0ad64-2645-4b74-b894-48acce0669af 12991
    a4d0ad64-2645-4b74-b894-48acce0669af 12989
    a4d0ad64-2645-4b74-b894-48acce0669af 12990
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12987
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12992
    82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12986
    (...)

And now produces:
    f94b29b3-da74-4e03-a0c5-a532c635bad9 13011
    47470c02-4862-4c33-80e7-a952899570e5 13014
    86332123-539a-47bf-853f-8c8ea8b2a2b5 13013
    8f9efa99-3143-47b2-83cf-d618c8dea711 13012
    3cc0f5c7-d790-496b-8d39-bec77647af5b 13015
    3ec64915-8f95-4374-9e66-e777dc8791e0 13009
    0f9bf894-dcde-408c-b094-6e0bb3255452 13011
    49c7bfde-3ffb-40e9-9a8d-8084d650ed8f 13014
    e23f6f2e-35c5-4433-a294-b790ab902653 13012

There are multiple benefits to using this method. First, it doesn't
depend anymore on a non-portable API. Second it's thread safe. Third it
is fast and more proven than any hack we could attempt to try to work
around the deficiencies of the various implementations around.

This commit depends on previous patches "MINOR: tools: add 64-bit rotate
operators" and "BUG/MEDIUM: random: initialize the random pool a bit
better", all of which will need to be backported at least as far as
version 2.0. It doesn't require to backport the build fixes for circular
include files dependecy anymore.
2020-03-08 10:09:02 +01:00
Willy Tarreau
7a40909c00 MINOR: tools: add 64-bit rotate operators
This adds rotl64/rotr64 to rotate a 64-bit word by an arbitrary number
of bits. It's mainly aimed at being used with constants.
2020-03-08 00:42:18 +01:00
Willy Tarreau
0fbf28a05b Revert "BUG/MEDIUM: random: implement per-thread and per-process random sequences"
This reverts commit 1c306aa84d.

It breaks the build on all non-glibc platforms. I got confused by the
man page (which possibly is the most confusing man page I've ever read
about a standard libc function) and mistakenly understood that random_r
was portable, especially since it appears in latest freebsd source as
well but not in released versions, and with a slightly different API :-/

We need to find a different solution with a fallback. Among the
possibilities, we may reintroduce this one with a fallback relying on
locking around the standard functions, keeping fingers crossed for no
other library function to call them in parallel, or we may also provide
our own PRNG, which is not necessarily more difficult than working
around the totally broken up design of the portable API.
2020-03-07 11:24:39 +01:00
Willy Tarreau
1c306aa84d BUG/MEDIUM: random: implement per-thread and per-process random sequences
As mentioned in previous patch, the random number generator was never
made thread-safe, which used not to be a problem for health checks
spreading, until the uuid sample fetch function appeared. Currently
it is possible for two threads or processes to produce exactly the
same UUID. In fact it's extremely likely that this will happen for
processes, as can be seen with this config:

    global
        nbproc 8

    frontend f
        bind :4445
        mode http
        log stdout daemon format raw
        log-format "%[uuid] %pid"
        redirect location /

It typically produces this log:

  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30645
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30641
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30644
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30639
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30646
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30645
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30639
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30643
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30646
  b6773fdd-678f-4d04-96f2-4fb11ad15d6b 30646
  551ce567-0bfb-4bbd-9b58-cdc7e9365325 30642
  07764439-c24d-4e6f-a5a6-0138be59e7a8 30642

What this patch does is to use a distinct per-thread and per-process
seed to make sure the same sequences will not appear, and will then
extend these seeds by "burning" a number of randoms that depends on
the global random seed, the thread ID and the process ID. This adds
roughly 20 extra bits of randomness, resulting in 52 bits total per
thread and per process.

It only takes a few milliseconds to burn these randoms and given
that threads start with a different seed, we know they will not
catch each other. So these random extra bits are essentially added
to ensure randomness between boots and cluster instances.

This replaces all uses of random() with ha_random() which uses the
thread-local state.

This must be backported as far as 2.0 or any version having the
UUID sample-fetch function since it's the main victim here.

It's important to note that this patch, in addition to depending on
the previous one "BUG/MEDIUM: init: initialize the random pool a bit
better", also depends on the preceeding build fixes to address a
circular dependency issue in the include files that prevented it
from building. Part or all of these patches may need to be backported
or adapted as well.
2020-03-07 06:11:15 +01:00
Willy Tarreau
6c3a681bd6 BUG/MEDIUM: random: initialize the random pool a bit better
Since the UUID sample fetch was created, some people noticed that in
certain virtualized environments they manage to get exact same UUIDs
on different instances started exactly at the same moment. It turns
out that the randoms were only initialized to spread the health checks
originally, not to provide "clean" randoms.

This patch changes this and collects more randomness from various
sources, including existing randoms, /dev/urandom when available,
RAND_bytes() when OpenSSL is available, as well as the timing for such
operations, then applies a SHA1 on all this to keep a 160 bits random
seed available, 32 of which are passed to srandom().

It's worth mentioning that there's no clean way to pass more than 32
bits to srandom() as even initstate() provides an opaque state that
must absolutely not be tampered with since known implementations
contain state information.

At least this allows to have up to 4 billion different sequences
from the boot, which is not that bad.

Note that the thread safety was still not addressed, which is another
issue for another patch.

This must be backported to all versions containing the UUID sample
fetch function, i.e. as far as 2.0.
2020-03-07 06:11:11 +01:00
Willy Tarreau
5a421a8f49 BUILD: listener: types/listener.h must not include standard.h
It's only a type definition, this header is not needed and causes
some circular dependency issues.
2020-03-07 06:07:18 +01:00
Willy Tarreau
c7f64e7a58 BUILD: freq_ctr: proto/freq_ctr needs to include common/standard.h
This is needed for div_64_32() which is there and currently accidently
inherited via global.h!
2020-03-07 06:07:18 +01:00
Willy Tarreau
f23e029409 BUILD: global: must not include common/standard.h but only types/freq_ctr.h
This one was accidently inherited and used to work but causes a circular
dependency.
2020-03-07 06:07:18 +01:00
Willy Tarreau
8dd0d55efe BUILD: ssl: include mini-clist.h
We use some list definitions and we don't include this header which
is in fact accidently inherited from others, causing a circular
dependency issue.
2020-03-07 06:07:18 +01:00
Willy Tarreau
a8561db936 BUILD: buffer: types/{ring.h,checks.h} should include buf.h, not buffer.h
buffer.h relies on proto/activity because it contains some code and not
just type definitions. It must not be included from types files. It
should probably also be split in two if it starts to include a proto.
This causes some circular dependencies at other places.
2020-03-07 06:07:18 +01:00
Christopher Faulet
d8f0e073dd MINOR: lua: Remove the flag HLUA_TXN_HTTP_RDY
This flag was used in some internal functions to be sure the current stream is
able to handle HTTP content. It was introduced when the legacy HTTP code was
still there. Now, It is possible to rely on stream's flags to be sure we have an
HTX stream.

So the flag HLUA_TXN_HTTP_RDY can be removed. Everywhere it was tested, it is
replaced by a call to the IS_HTX_STRM() macro.

This patch is mandatory to allow the support of the filters written in lua.
2020-03-06 14:13:00 +01:00
Christopher Faulet
1cdceb9365 MINOR: htx: Add a function to return a block at a specific offset
The htx_find_offset() function may be used to look for a block at a specific
offset in an HTX message, starting from the message head. A compound result is
returned, an htx_ret structure, with the found block and the position of the
offset in the block. If the offset is ouside of the HTX message, the returned
block is NULL.
2020-03-06 14:12:59 +01:00
Christopher Faulet
251f4917c3 MINOR: buf: Add function to insert a string at an absolute offset in a buffer
The b_insert_blk() function may now be used to insert a string, given a pointer
and the string length, at an absolute offset in a buffer, moving data between
this offset and the buffer's tail just after the end of the inserted string. The
buffer's length is automatically updated. This function supports wrapping. All
the string is copied or nothing. So it returns 0 if there are not enough space
to perform the copy. Otherwise, the number of bytes copied is returned.
2020-03-06 14:12:59 +01:00
Carl Henrik Lunde
f91ac19299 OPTIM: startup: fast unique_id allocation for acl.
pattern_finalize_config() uses an inefficient algorithm which is a
problem with very large configuration files. This affects startup, and
therefore reload time. When haproxy is deployed as a router in a
Kubernetes cluster the generated configuration file may be large and
reloads are frequently occuring, which makes this a significant issue.

The old algorithm is O(n^2)
* allocate missing uids - O(n^2)
* sort linked list - O(n^2)

The new algorithm is O(n log n):
* find the user allocated uids - O(n)
* store them for efficient lookup - O(n log n)
* allocate missing uids - n times O(log n)
* sort all uids - O(n log n)
* convert back to linked list - O(n)

Performance examples, startup time in seconds:

    pat_refs old     new
    1000      0.02   0.01
    10000     2.1    0.04
    20000    12.3    0.07
    30000    27.9    0.10
    40000    52.5    0.14
    50000    77.5    0.17

Please backport to 1.8, 2.0 and 2.1.
2020-03-06 08:11:58 +01:00
Tim Duesterhus
a17e66289c MEDIUM: stream: Make the unique_id member of struct stream a struct ist
The `unique_id` member of `struct stream` now is a `struct ist`.
2020-03-05 20:21:58 +01:00
Tim Duesterhus
0643b0e7e6 MINOR: proxy: Make header_unique_id a struct ist
The `header_unique_id` member of `struct proxy` now is a `struct ist`.
2020-03-05 19:58:22 +01:00
Tim Duesterhus
9576ab7640 MINOR: ist: Add struct ist istdup(const struct ist)
istdup() performs the equivalent of strdup() on a `struct ist`.
2020-03-05 19:53:12 +01:00
Tim Duesterhus
35005d01d2 MINOR: ist: Add struct ist istalloc(size_t) and void istfree(struct ist*)
`istalloc` allocates memory and returns an `ist` with the size `0` that points
to this allocation.

`istfree` frees the pointed memory and clears the pointer.
2020-03-05 19:52:07 +01:00
Tim Duesterhus
e296d3e5f0 MINOR: ist: Add int isttest(const struct ist)
`isttest` returns whether the `.ptr` is non-null.
2020-03-05 19:52:07 +01:00
Tim Duesterhus
241e29ef9c MINOR: ist: Add IST_NULL macro
`IST_NULL` is equivalent to an `struct ist` with `.ptr = NULL` and
`.len = 0`.
2020-03-05 19:52:07 +01:00
William Lallemand
cfca1422c7 MINOR: ssl: reach a ckch_store from a sni_ctx
It was only possible to go down from the ckch_store to the sni_ctx but
not to go up from the sni_ctx to the ckch_store.

To allow that, 2 pointers were added:

- a ckch_inst pointer in the struct sni_ctx
- a ckckh_store pointer in the struct ckch_inst
2020-03-05 11:28:42 +01:00
William Lallemand
38df1c8006 MINOR: ssl/cli: support crt-list filters
Generate a list of the previous filters when updating a certificate
which use filters in crt-list. Then pass this list to the function
generating the sni_ctx during the commit.

This feature allows the update of the crt-list certificates which uses
the filters with "set ssl cert".

This function could be probably replaced by creating a new
ckch_inst_new_load_store() function which take the previous sni_ctx list as
an argument instead of the char **sni_filter, avoiding the
allocation/copy during runtime for each filter. But since are still
handling the multi-cert bundles, it's better this way to avoid code
duplication.
2020-03-05 11:27:53 +01:00
Tim Duesterhus
127a74dd48 MINOR: stream: Add stream_generate_unique_id function
Currently unique IDs for a stream are generated using repetitive code in
multiple locations, possibly allowing for inconsistent behavior.
2020-03-05 07:23:00 +01:00
Willy Tarreau
899e5f69a1 MINOR: debug: use our own backtrace function on clang+x86_64
A test on FreeBSD with clang 4 to 8 produces this on a call to a
spinning loop on the CLI:

  call trace(5):
  |       0x53e2bc [eb 16 48 63 c3 48 c1 e0]: wdt_handler+0x10c
  |    0x800e02cfe [e8 5d 83 00 00 8b 18 8b]: libthr:pthread_sigmask+0x53e

with our own function it correctly produces this:

  call trace(20):
  |       0x53e2dc [eb 16 48 63 c3 48 c1 e0]: wdt_handler+0x10c
  |    0x800e02cfe [e8 5d 83 00 00 8b 18 8b]: libthr:pthread_sigmask+0x53e
  |    0x800e022bf [48 83 c4 38 5b 41 5c 41]: libthr:pthread_getspecific+0xdef
  | 0x7ffffffff003 [48 8d 7c 24 10 6a 00 48]: main+0x7fffffb416f3
  |    0x801373809 [85 c0 0f 84 6f ff ff ff]: libc:__sys_gettimeofday+0x199
  |    0x801373709 [89 c3 85 c0 75 a6 48 8b]: libc:__sys_gettimeofday+0x99
  |    0x801371c62 [83 f8 4e 75 0f 48 89 df]: libc:gettimeofday+0x12
  |       0x51fa0a [48 89 df 4c 89 f6 e8 6b]: ha_thread_dump_all_to_trash+0x49a
  |       0x4b723b [85 c0 75 09 49 8b 04 24]: mworker_cli_sockpair_new+0xd9b
  |       0x4b6c68 [85 c0 75 08 4c 89 ef e8]: mworker_cli_sockpair_new+0x7c8
  |       0x532f81 [4c 89 e7 48 83 ef 80 41]: task_run_applet+0xe1

So let's add clang+x86_64 to the list of platforms that will use our
simplified version. As a bonus it will not require to link with
-lexecinfo on FreeBSD and will work out of the box when passing
USE_BACKTRACE=1.
2020-03-04 12:04:07 +01:00
Willy Tarreau
13faf16e1e MINOR: debug: improve backtrace() on aarch64 and possibly other systems
It happens that on aarch64 backtrace() only returns one entry (tested
with gcc 4.7.4, 5.5.0 and 7.4.1). Probably that it refrains from unwinding
the stack due to the risk of hitting a bad pointer. Here we can use
may_access() to know when it's safe, so we can actually unwind the stack
without taking risks. It happens that the faulting function (the one
just after the signal handler) is not listed here, very likely because
the signal handler uses a special stack and did not create a new frame.

So this patch creates a new my_backtrace() function in standard.h that
either calls backtrace() or does its own unrolling. The choice depends
on HA_HAVE_WORKING_BACKTRACE which is set in compat.h based on the build
target.
2020-03-04 12:04:07 +01:00
Emmanuel Hocdet
842e94ee06 MINOR: ssl: add "ca-verify-file" directive
It's only available for bind line. "ca-verify-file" allows to separate
CA certificates from "ca-file". CA names sent in server hello message is
only compute from "ca-file". Typically, "ca-file" must be defined with
intermediate certificates and "ca-verify-file" with certificates to
ending the chain, like root CA.

Fix issue #404.
2020-03-04 11:53:11 +01:00
Willy Tarreau
eb8b1ca3eb MINOR: tools: add resolve_sym_name() to resolve function pointers
We use various hacks at a few places to try to identify known function
pointers in debugging outputs (show threads & show fd). Let's centralize
this into a new function dedicated to this. It already knows about the
functions matched by "show threads" and "show fd", and when built with
USE_DL, it can rely on dladdr1() to resolve other functions. There are
some limitations, as static functions are not resolved, linking with
-rdynamic is mandatory, and even then some functions will not necessarily
appear. It's possible to do a better job by rebuilding the whole symbol
table from the ELF headers in memory but it's less portable and the gains
are still limited, so this solution remains a reasonable tradeoff.
2020-03-03 18:18:40 +01:00
Willy Tarreau
762fb3ec8e MINOR: tools: add new function dump_addr_and_bytes()
This function dumps <n> bytes from <addr> in hex form into buffer <buf>
enclosed in brackets after the address itself, formatted on 14 chars
including the "0x" prefix. This is meant to be used as a prefix for code
areas. For example: "0x7f10b6557690 [48 c7 c0 0f 00 00 00 0f]: "
It relies on may_access() to know if the bytes are dumpable, otherwise "--"
is emitted. An optional prefix is supported.
2020-03-03 17:46:37 +01:00
Willy Tarreau
27d00c0167 MINOR: task: export run_tasks_from_list
This will help refine debug traces.
2020-03-03 15:26:10 +01:00
Willy Tarreau
3ebd55ee51 MINOR: haproxy: export run_poll_loop
This will help refine debug traces.
2020-03-03 15:26:10 +01:00
Willy Tarreau
1827845a3d MINOR: haproxy: export main to ease access from debugger
Better just export main instead of declaring it as extern, it's cleaner
and may be usable elsewhere.
2020-03-03 15:26:10 +01:00
Willy Tarreau
1ed3781e21 MINOR: fd: merge the read and write error bits into RW error
We always set them both, which makes sense since errors at the FD level
indicate a terminal condition for the socket that cannot be recovered.
Usually this is detected via a write error, but sometimes such an error
may asynchronously be reported on the read side. Let's simplify this
using only the write bit and calling it RW since it's used like this
everywhere, and leave the R bit spare for future use.
2020-02-28 07:42:29 +01:00
Willy Tarreau
a135ea63a6 CLEANUP: fd: remove some unneeded definitions of FD_EV_* flags
There's no point in trying to be too generic for these flags as the
read and write sides will soon differ a bit. Better explicitly define
the flags for each direction without trying to be direction-agnostic.
this clarifies the code and removes some defines.
2020-02-28 07:42:29 +01:00
Willy Tarreau
f80fe832b1 CLEANUP: fd: remove the FD_EV_STATUS aggregate
This was used only by fd_recv_state() and fd_send_state(), both of
which are unused. This will not work anymore once recv and send flags
start to differ, so let's remove this.
2020-02-28 07:42:29 +01:00
Jerome Magnin
967d3cc105 BUG/MINOR: http_ana: make sure redirect flags don't have overlapping bits
commit c87e46881 ("MINOR: http-rules: Add a flag on redirect rules to know the
rule direction") introduced a new flag for redirect rules, but its value has
bits in common with REDIRECT_FLAG_DROP_QS, which makes us enter this code path
in http_apply_redirect_rule(), which will then drop the query string.
To fix this, just give REDIRECT_FLAG_FROM_REQ its own unique value.

This must be backported where c87e468816 is backported.

This should fix issue 521.
2020-02-27 23:44:41 +01:00
Willy Tarreau
2104659cd5 MEDIUM: buffer: remove the buffer_wq lock
This lock was only needed to protect the buffer_wq list, but now we have
the mt_list for this. This patch simply turns the buffer_wq list to an
mt_list and gets rid of the lock.

It's worth noting that the whole buffer_wait thing still looks totally
wrong especially in a threaded context: the wakeup_cb() callback is
called synchronously from any thread and may end up calling some
connection code that was not expected to run on a given thread. The
whole thing should probably be reworked to use tasklets instead and be
a bit more centralized.
2020-02-26 10:39:36 +01:00
William Lallemand
e0f3fd5b4c CLEANUP: ssl: move issuer_chain tree and definition
Move the cert_issuer_tree outside the global_ssl structure since it's
not a configuration variable. And move the declaration of the
issuer_chain structure in types/ssl_sock.h
2020-02-25 15:06:40 +01:00
Willy Tarreau
226ef26056 MINOR: compiler: add new alignment macros
This commit adds ALWAYS_ALIGN(), MAYBE_ALIGN() and ATOMIC_ALIGN() to
be placed as delimitors inside structures to force alignment to a
given size. These depend on the architecture's capabilities so that
it is possible to always align, align only on archs not supporting
unaligned accesses at all, or only on those not supporting them for
atomic accesses (e.g. before a lock).
2020-02-25 10:34:43 +01:00
Willy Tarreau
908071171b BUILD: general: always pass unsigned chars to is* functions
The isalnum(), isalpha(), isdigit() etc functions from ctype.h are
supposed to take an int in argument which must either reflect an
unsigned char or EOF. In practice on some platforms they're implemented
as macros referencing an array, and when passed a char, they either cause
a warning "array subscript has type 'char'" when lucky, or cause random
segfaults when unlucky. It's quite unconvenient by the way since none of
them may return true for negative values. The recent introduction of
cygwin to the list of regularly tested build platforms revealed a lot
of breakage there due to the same issues again.

So this patch addresses the problem all over the code at once. It adds
unsigned char casts to every valid use case, and also drops the unneeded
double cast to int that was sometimes added on top of it.

It may be backported by dropping irrelevant changes if that helps better
support uncommon platforms. It's unlikely to fix bugs on platforms which
would already not emit any warning though.
2020-02-25 08:16:33 +01:00
Willy Tarreau
03e7853581 BUILD: remove obsolete support for -mregparm / USE_REGPARM
This used to be a minor optimization on ix86 where registers are scarce
and the calling convention not very efficient, but this platform is not
relevant enough anymore to warrant all this dirt in the code for the sake
of saving 1 or 2% of performance. Modern platforms don't use this at all
since their calling convention already defaults to using several registers
so better get rid of this once for all.
2020-02-25 07:41:47 +01:00
Tim Duesterhus
1d48ba91d7 CLEANUP: net_helper: Do not negate the result of unlikely
This patch turns the double negation of 'not unlikely' into 'likely'
and then turns the negation of 'not smaller' into 'greater or equal'
in an attempt to improve readability of the condition.

[wt: this was not a bug but purposely written like this to improve code
 generation on older compilers but not needed anymore as described here:
 https://www.mail-archive.com/haproxy@formilux.org/msg36392.html ]
2020-02-25 07:30:49 +01:00
Tim Duesterhus
927063b892 CLEANUP: conn: Do not pass a pointer to likely
Move the `!` inside the likely and negate it to unlikely.

The previous version should not have caused issues, because it is converted
to a boolean / integral value before being passed to __builtin_expect(), but
it's certainly unusual.

[wt: this was not a bug but purposely written like this to improve code
 generation on older compilers but not needed anymore as described here:
 https://www.mail-archive.com/haproxy@formilux.org/msg36392.html ]
2020-02-25 07:30:49 +01:00
Willy Tarreau
89ee79845c MINOR: compiler: drop special cases of likely/unlikely for older compilers
We used to special-case the likely()/unlikely() macros for a series of
early gcc 4.x compilers which used to produce very bad code when using
__builtin_expect(x,1), which basically used to build an integer (0 or 1)
from a condition then compare it to integer 1. This was already fixed in
5.x, but even now, looking at the code produced by various flavors of 4.x
this bad behavior couldn't be witnessed anymore. So let's consider it as
fixed by now, which will allow to get rid of some ugly tricks at some
specific places. A test on 4.7.4 shows that the code shrinks by about 3kB
now, thanks to some tests being inlined closer to the call place and the
unlikely case being moved to real functions. See the link below for more
background on this.

Link: https://www.mail-archive.com/haproxy@formilux.org/msg36392.html
2020-02-25 07:29:55 +01:00
Willy Tarreau
0e2686762f MINOR: compiler: move CPU capabilities definition from config.h and complete them
These ones are irrelevant to the config but rather to the platform, and
as such are better placed in compiler.h.

Here we take the opportunity for declaring a few extra capabilities:
 - HA_UNALIGNED         : CPU supports unaligned accesses
 - HA_UNALIGNED_LE      : CPU supports unaligned accesses in little endian
 - HA_UNALIGNED_FAST    : CPU supports fast unaligned accesses
 - HA_UNALIGNED_ATOMIC  : CPU supports unaligned accesses in atomics

This will help remove a number of #ifdefs with arch-specific statements.
2020-02-21 16:32:57 +01:00
Jerome Magnin
9dde0b2d31 MINOR: ist: add an iststop() function
Add a function that finds a character in an ist and returns an
updated ist with the length of the portion of the original string
that doesn't contain the char.

Might be backported to 2.1
2020-02-21 11:47:25 +01:00
Willy Tarreau
716bec2dc6 MINOR: connection: introduce a new receive flag: CO_RFL_READ_ONCE
This flag is currently supported by raw_sock to perform a single recv()
attempt and avoid subscribing. Typically on the request and response
paths with keep-alive, with short messages we know that it's very likely
that the first message is enough.
2020-02-21 11:22:45 +01:00
Willy Tarreau
5d4d1806db CLEANUP: connection: remove the definitions of conn_xprt_{stop,want}_{send,recv}
This marks the end of the transition from the connection polling states
introduced in 1.5-dev12 and the subscriptions in that arrived in 1.9.
The socket layer can now safely use its FD while all upper layers rely
exclusively on subscriptions. These old functions were removed. Some may
deserve some renaming to improved clarty though. The single call to
conn_xprt_stop_both() was dropped in favor of conn_cond_update_polling()
which already does the same.
2020-02-21 11:21:12 +01:00
Willy Tarreau
d1d14c3157 MINOR: connection: remove the last calls to conn_xprt_{want,stop}_*
The last few calls to conn_xprt_{want,stop}_{recv,send} in the central
connection code were replaced with their strictly exact equivalent fd_*,
adding the call to conn_ctrl_ready() when it was missing.
2020-02-21 11:21:12 +01:00
Willy Tarreau
19bc201c9f MEDIUM: connection: remove the intermediary polling state from the connection
Historically we used to require that the connections held the desired
polling states for the data layer and the socket layer. Then with muxes
these were more or less merged into the transport layer, and now it
happens that with all transport layers having their own state, the
"transport layer state" as we have it in the connection (XPRT_RD_ENA,
XPRT_WR_ENA) is only an exact copy of the undelying file descriptor
state, but with a delay. All of this is causing some difficulties at
many places in the code because there are still some locations which
use the conn_want_* API to remain clean and only rely on connection,
and count on a later collection call to conn_cond_update_polling(),
while others need an immediate action and directly use the FD updates.

Since our updates are now much cheaper, most of them being only an
atomic test-and-set operation, and since our I/O callbacks are deferred,
there's no benefit anymore in trying to "cache" the transient state
change in the connection flags hoping to cancel them before they
become an FD event. Better make such calls transparent indirections
to the FD layer instead and get rid of the deferred operations which
needlessly complicate the logic inside.

This removes flags CO_FL_XPRT_{RD,WR}_ENA and CO_FL_WILL_UPDATE.
A number of functions related to polling updates were either greatly
simplified or removed.

Two places were using CO_FL_XPRT_WR_ENA as a hint to know if more data
were expected to be sent after a PROXY protocol or SOCKSv4 header. These
ones were simply replaced with a check on the subscription which is
where we ought to get the autoritative information from.

Now the __conn_xprt_want_* and their conn_xprt_want_* counterparts
are the same. conn_stop_polling() and conn_xprt_stop_both() are the
same as well. conn_cond_update_polling() only causes errors to stop
polling. It also becomes way more obvious that muxes should not at
all employ conn_xprt_{want|stop}_{recv,send}(), and that the call
to __conn_xprt_stop_recv() in case a mux failed to allocate a buffer
is inappropriate, it ought to unsubscribe from reads instead. All of
this definitely requires a serious cleanup.
2020-02-21 11:21:12 +01:00
Christopher Faulet
727a3f1ca3 MINOR: http-htx: Add a function to retrieve the headers size of an HTX message
http_get_hdrs_size() function may now be used to get the bytes held by headers
in an HTX message. It only works if the headers were not already
forwarded. Metadata are not counted here.
2020-02-18 11:19:57 +01:00
Willy Tarreau
a71667c07d BUG/MINOR: tools: also accept '+' as a valid character in an identifier
The function is_idchar() was added by commit 36f586b ("MINOR: tools:
add is_idchar() to tell if a char may belong to an identifier") to
ease matching of sample fetch/converter names. But it lacked support
for the '+' character used in "base32+src" and "url32+src". A quick
way to figure the list of supported sample fetch+converter names is
to issue the following command:

   git grep '"[^"]*",.*SMP_T_.*SMP_USE_'|cut -f2 -d'"'|sort -u

No more entry is reported once searching for characters not covered
by is_idchar().

No backport is needed.
2020-02-17 06:37:40 +01:00
Willy Tarreau
e3b57bf92f MINOR: sample: make sample_parse_expr() able to return an end pointer
When an end pointer is passed, instead of complaining that a comma is
missing after a keyword, sample_parse_expr() will silently return the
pointer to the current location into this return pointer so that the
caller can continue its parsing. This will be used by more complex
expressions which embed sample expressions, and may even permit to
embed sample expressions into arguments of other expressions.
2020-02-14 19:02:06 +01:00
Willy Tarreau
80b53ffb1c MEDIUM: arg: make make_arg_list() stop after its own arguments
The main problem we're having with argument parsing is that at the
moment the caller looks for the first character looking like an end
of arguments (')') and calls make_arg_list() on the sub-string inside
the parenthesis.

Let's first change the way it works so that make_arg_list() also
consumes the parenthesis and returns the pointer to the first char not
consumed. This will later permit to refine each argument parsing.

For now there is no functional change.
2020-02-14 19:02:06 +01:00
Willy Tarreau
d4ad669051 MINOR: chunk: implement chunk_strncpy() to copy partial strings
This does like chunk_strcpy() except that the maximum string length may
be limited by the caller. A trailing zero is always appended. This is
particularly handy to extract portions of strings to put into the trash
for use with libc functions requiring a nul-terminated string.
2020-02-14 19:02:06 +01:00
Willy Tarreau
36f586b694 MINOR: tools: add is_idchar() to tell if a char may belong to an identifier
This function will simply be used to find the end of config identifiers
(proxies, servers, ACLs, sample fetches, converters, etc).
2020-02-14 19:02:06 +01:00
Ilya Shipitsin
88a2f0304c CLEANUP: ssl: remove unused functions in openssl-compat.h
functions SSL_SESSION_get0_id_context, SSL_CTX_get_default_passwd_cb,
SSL_CTX_get_default_passwd_cb_userdata are not used anymore
2020-02-14 16:15:00 +01:00
Willy Tarreau
160ad9e38a CLEANUP: mini-clist: simplify nested do { while(1) {} } while (0)
While looking for other occurrences of do { continue; } while (0) I
found these few leftovers in mini-clist where an outer loop was made
around "do { } while (0)" then another loop was placed inside just to
handle the continue. Let's clean this up by just removing the outer
one. Most of the patch is only the inner part of the loop that is
reindented. It was verified that the resulting code is the same.
2020-02-11 10:27:04 +01:00
Christopher Faulet
7716cdf450 MINOR: lua: Get the action return code on the stack when an action finishes
When an action successfully finishes, the action return code (ACT_RET_*) is now
retrieve on the stack, ff the first element is an integer. In addition, in
hlua_txn_done(), the value ACT_RET_DONE is pushed on the stack before
exiting. Thus, when a script uses this function, the corresponding action still
finishes with the good code. Thanks to this change, the flag HLUA_STOP is now
useless. So it has been removed.

It is a mandatory step to allow a lua action to return any action return code.
2020-02-06 15:13:03 +01:00
Christopher Faulet
07a718e712 CLEANUP: lua: Remove consistency check for sample fetches and actions
It is not possible anymore to alter the HTTP parser state from lua sample
fetches or lua actions. So there is no reason to still check for the parser
state consistency.
2020-02-06 15:13:03 +01:00
Christopher Faulet
4a2c142779 MEDIUM: http-rules: Support extra headers for HTTP return actions
It is now possible to append extra headers to the generated responses by HTTP
return actions, while it is not based on an errorfile. For return actions based
on errorfiles, these extra headers are ignored. To define an extra header, a
"hdr" argument must be used with a name and a value. The value is a log-format
string. For instance:

  http-request status 200 hdr "x-src" "%[src]" hdr "x-dst" "%[dst]"
2020-02-06 15:13:03 +01:00
Christopher Faulet
24231ab61f MEDIUM: http-rules: Add the return action to HTTP rules
Thanks to this new action, it is now possible to return any responses from
HAProxy, with any status code, based on an errorfile, a file or a string. Unlike
the other internal messages generated by HAProxy, these ones are not interpreted
as errors. And it is not necessary to use a file containing a full HTTP
response, although it is still possible. In addition, using a log-format string
or a log-format file, it is possible to have responses with a dynamic
content. This action can be used on the request path or the response path. The
only constraint is to have a responses smaller than a buffer. And to avoid any
warning the buffer space reserved to the headers rewritting should also be free.

When a response is returned with a file or a string as payload, it only contains
the content-length header and the content-type header, if applicable. Here are
examples:

  http-request return content-type image/x-icon file /var/www/favicon.ico  \
      if { path /favicon.ico }

  http-request return status 403 content-type text/plain    \
      lf-string "Access denied. IP %[src] is blacklisted."  \
      if { src -f /etc/haproxy/blacklist.lst }
2020-02-06 15:12:54 +01:00
Christopher Faulet
6d0c3dfac6 MEDIUM: http: Add a ruleset evaluated on all responses just before forwarding
This patch introduces the 'http-after-response' rules. These rules are evaluated
at the end of the response analysis, just before the data forwarding, on ALL
HTTP responses, the server ones but also all responses generated by
HAProxy. Thanks to this ruleset, it is now possible for instance to add some
headers to the responses generated by the stats applet. Following actions are
supported :

   * allow
   * add-header
   * del-header
   * replace-header
   * replace-value
   * set-header
   * set-status
   * set-var
   * strict-mode
   * unset-var
2020-02-06 14:55:34 +01:00
Christopher Faulet
ef70e25035 MINOR: http-ana: Add a function for forward internal responses
Operations performed when internal responses (redirect/deny/auth/errors) are
returned are always the same. The http_forward_proxy_resp() function is added to
group all of them under a unique function.
2020-02-06 14:55:34 +01:00
Christopher Faulet
72c7d8d040 MINOR: http-ana: Rely on http_reply_and_close() to handle server error
The http_server_error() function now relies on http_reply_and_close(). Both do
almost the same actions. In addtion, http_server_error() sets the error flag and
the final state flag on the stream.
2020-02-06 14:55:34 +01:00
Christopher Faulet
c87e468816 MINOR: http-rules: Add a flag on redirect rules to know the rule direction
HTTP redirect rules can be evaluated on the request or the response path. So
when a redirect rule is evaluated, it is important to have this information
because some specific processing may be performed depending on the direction. So
the REDIRECT_FLAG_FROM_REQ flag has been added. It is set when applicable on the
redirect rule during the parsing.

This patch is mandatory to fix a bug on redirect rule. It must be backported to
all stable versions.
2020-02-06 14:55:34 +01:00
Christopher Faulet
a4168434a7 MINOR: dns: Dynamically allocate dns options to reduce the act_rule size
<.arg.dns.dns_opts> field in the act_rule structure is now dynamically allocated
when a do-resolve rule is parsed. This drastically reduces the structure size.
2020-02-06 14:55:34 +01:00
Christopher Faulet
7651362e52 MINOR: htx/channel: Add a function to copy an HTX message in a channel's buffer
The channel_htx_copy_msg() function can now be used to copy an HTX message in a
channel's buffer. This function takes care to not overwrite existing data.

This patch depends on the commit "MINOR: htx: Add a function to append an HTX
message to another one". Both are mandatory to fix a bug in
http_reply_and_close() function. Be careful to backport both first.
2020-02-06 14:55:16 +01:00
Christopher Faulet
0ea0c86753 MINOR: htx: Add a function to append an HTX message to another one
the htx_append_msg() function can now be used to append an HTX message to
another one. All the message is copied or nothing. If an error occurs during the
copy, all changes are rolled back.

This patch is mandatory to fix a bug in http_reply_and_close() function. Be
careful to backport it first.
2020-02-06 14:54:47 +01:00
Olivier Houchard
1c7c0d6b97 BUG/MAJOR: memory: Don't forget to unlock the rwlock if the pool is empty.
In __pool_get_first(), don't forget to unlock the pool lock if the pool is
empty, otherwise no writer will be able to take the lock, and as it is done
when reloading, it leads to an infinite loop on reload.

This should be backported with commit 04f5fe87d3d3a222b89420f8c1231461f55ebdeb
2020-02-03 13:05:31 +01:00
Olivier Houchard
04f5fe87d3 BUG/MEDIUM: memory: Add a rwlock before freeing memory.
When using lockless pools, add a new rwlock, flush_pool. read-lock it when
getting memory from the pool, so that concurrenct access are still
authorized, but write-lock it when we're about to free memory, in
pool_flush() and pool_gc().
The problem is, when removing an item from the pool, we unreference it
to get the next one, however, that pointer may have been free'd in the
meanwhile, and that could provoke a crash if the pointer has been unmapped.
It should be OK to use a rwlock, as normal operations will still be able
to access the pool concurrently, and calls to pool_flush() and pool_gc()
should be pretty rare.

This should be backported to 2.1, 2.0 and 1.9.
2020-02-01 18:08:34 +01:00
Willy Tarreau
b30a153cd1 MINOR: task: detect self-wakeups on tl==sched->current instead of TASK_RUNNING
This is exactly what we want to detect (a task/tasklet waking itself),
so let's use the proper condition for this.
2020-01-31 17:45:10 +01:00
Willy Tarreau
bb238834da MINOR: task: permanently flag tasklets waking themselves up
Commit a17664d829 ("MEDIUM: tasks: automatically requeue into the bulk
queue an already running tasklet") tried to inflict a penalty to
self-requeuing tasks/tasklets which correspond to those involved in
large, high-latency data transfers, for the benefit of all other
processing which requires a low latency. However, it turns out that
while it ought to do this on a case-by-case basis, basing itself on
the RUNNING flag isn't accurate because this flag doesn't leave for
tasklets, so we'd rather need a distinct flag to tag such tasklets.

This commit introduces TASK_SELF_WAKING to mark tasklets acting like
this. For now it's still set when TASK_RUNNING is present but this
will have to change. The flag is kept across wakeups.
2020-01-31 17:45:10 +01:00
Willy Tarreau
a17664d829 MEDIUM: tasks: automatically requeue into the bulk queue an already running tasklet
When a tasklet re-runs itself such as in this chain:

   si_cs_io_cb -> si_cs_process -> si_notify -> si_chk_rcv

then we know it can easily clobber the run queue and harm latency. Now
what the scheduler does when it detects this is that such a tasklet is
automatically placed into the bulk list so that it's processed with the
remaining CPU bandwidth only. Thanks to this the CLI becomes instantly
responsive again even under heavy stress at 50 Gbps over 40kcon and
100% CPU on 16 threads.
2020-01-30 19:03:31 +01:00
Willy Tarreau
a62917b890 MEDIUM: tasks: implement 3 different tasklet classes with their own queues
We used to mix high latency tasks and low latency tasklets in the same
list, and to even refill bulk tasklets there, causing some unfairness
in certain situations (e.g. poll-less transfers between many connections
saturating the machine with similarly-sized in and out network interfaces).

This patch changes the mechanism to split the load into 3 lists depending
on the task/tasklet's desired classes :
  - URGENT: this is mainly for tasklets used as deferred callbacks
  - NORMAL: this is for regular tasks
  - BULK: this is for bulk tasks/tasklets

Arbitrary ratios of max_processed are picked from each of these lists in
turn, with the ability to complete in one list from what was not picked
in the previous one. After some quick tests, the following setup gave
apparently good results both for raw TCP with splicing and for H2-to-H1
request rate:

  - 0 to 75% for urgent
  - 12 to 50% for normal
  - 12 to what remains for bulk

Bulk is not used yet.
2020-01-30 18:59:33 +01:00
Willy Tarreau
911db9bd29 MEDIUM: connection: use CO_FL_WAIT_XPRT more consistently than L4/L6/HANDSHAKE
As mentioned in commit c192b0ab95 ("MEDIUM: connection: remove
CO_FL_CONNECTED and only rely on CO_FL_WAIT_*"), there is a lack of
consistency on which flags are checked among L4/L6/HANDSHAKE depending
on the code areas. A number of sample fetch functions only check for
L4L6 to report MAY_CHANGE, some places only check for HANDSHAKE and
many check both L4L6 and HANDSHAKE.

This patch starts to make all of this more consistent by introducing a
new mask CO_FL_WAIT_XPRT which is the union of L4/L6/HANDSHAKE and
reports whether the transport layer is ready or not.

All inconsistent call places were updated to rely on this one each time
the goal was to check for the readiness of the transport layer.
2020-01-23 16:34:26 +01:00
Willy Tarreau
4450b587dd MINOR: connection: remove CO_FL_SSL_WAIT_HS from CO_FL_HANDSHAKE
Most places continue to check CO_FL_HANDSHAKE while in fact they should
check CO_FL_HANDSHAKE_NOSSL, which contains all handshakes but the one
dedicated to SSL renegotiation. In fact the SSL layer should be the
only one checking CO_FL_SSL_WAIT_HS, so as to avoid processing data
when a renegotiation is in progress, but other ones randomly include it
without knowing. And ideally it should even be an internal flag that's
not exposed in the connection.

This patch takes CO_FL_SSL_WAIT_HS out of CO_FL_HANDSHAKE, uses this flag
consistently all over the code, and gets rid of CO_FL_HANDSHAKE_NOSSL.

In order to limit the confusion that has accumulated over time, the
CO_FL_SSL_WAIT_HS flag which indicates an ongoing SSL handshake,
possibly used by a renegotiation was moved after the other ones.
2020-01-23 16:34:26 +01:00
Willy Tarreau
c192b0ab95 MEDIUM: connection: remove CO_FL_CONNECTED and only rely on CO_FL_WAIT_*
Commit 477902bd2e ("MEDIUM: connections: Get ride of the xprt_done
callback.") broke the master CLI for a very obscure reason. It happens
that short requests immediately terminated by a shutdown are properly
received, CS_FL_EOS is correctly set, but in si_cs_recv(), we refrain
from setting CF_SHUTR on the channel because CO_FL_CONNECTED was not
yet set on the connection since we've not passed again through
conn_fd_handler() and it was not done in conn_complete_session(). While
commit a8a415d31a ("BUG/MEDIUM: connections: Set CO_FL_CONNECTED in
conn_complete_session()") fixed the issue, such accident may happen
again as the root cause is deeper and actually comes down to the fact
that CO_FL_CONNECTED is lazily set at various check points in the code
but not every time we drop one wait bit. It is not the first time we
face this situation.

Originally this flag was used to detect the transition between WAIT_*
and CONNECTED in order to call ->wake() from the FD handler. But since
at least 1.8-dev1 with commit 7bf3fa3c23 ("BUG/MAJOR: connection: update
CO_FL_CONNECTED before calling the data layer"), CO_FL_CONNECTED is
always synchronized against the two others before being checked. Moreover,
with the I/Os moved to tasklets, the decision to call the ->wake() function
is performed after the I/Os in si_cs_process() and equivalent, which don't
care about this transition either.

So in essence, checking for CO_FL_CONNECTED has become a lazy wait to
check for (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN), but that always
relies on someone else having synchronized it.

This patch addresses it once for all by killing this flag and only checking
the two others (for which a composite mask CO_FL_WAIT_L4L6 was added). This
revealed a number of inconsistencies that were purposely not addressed here
for the sake of bisectability:

  - while most places do check both L4+L6 and HANDSHAKE at the same time,
    some places like assign_server() or back_handle_st_con() and a few
    sample fetches looking for proxy protocol do check for L4+L6 but
    don't care about HANDSHAKE ; these ones will probably fail on TCP
    request session rules if the handshake is not complete.

  - some handshake handlers do validate that a connection is established
    at L4 but didn't clear CO_FL_WAIT_L4_CONN

  - the ->ctl method of mux_fcgi, mux_pt and mux_h1 only checks for L4+L6
    before declaring the mux ready while the snd_buf function also checks
    for the handshake's completion. Likely the former should validate the
    handshake as well and we should get rid of these extra tests in snd_buf.

  - raw_sock_from_buf() would directly set CO_FL_CONNECTED and would only
    later clear CO_FL_WAIT_L4_CONN.

  - xprt_handshake would set CO_FL_CONNECTED itself without actually
    clearing CO_FL_WAIT_L4_CONN, which could apparently happen only if
    waiting for a pure Rx handshake.

  - most places in ssl_sock that were checking CO_FL_CONNECTED don't need
    to include the L4 check as an L6 check is enough to decide whether to
    wait for more info or not.

It also becomes obvious when reading the test in si_cs_recv() that caused
the failure mentioned above that once converted it doesn't make any sense
anymore: having CS_FL_EOS set while still waiting for L4 and L6 to complete
cannot happen since for CS_FL_EOS to be set, the other ones must have been
validated.

Some of these parts will still deserve further cleanup, and some of the
observations above may induce some backports of potential bug fixes once
totally analyzed in their context. The risk of breaking existing stuff
is too high to blindly backport everything.
2020-01-23 14:41:37 +01:00
Olivier Houchard
477902bd2e MEDIUM: connections: Get ride of the xprt_done callback.
The xprt_done_cb callback was used to defer some connection initialization
until we're connected and the handshake are done. As it mostly consists of
creating the mux, instead of using the callback, introduce a conn_create_mux()
function, that will just call conn_complete_session() for frontend, and
create the mux for backend.
In h2_wake(), make sure we call the wake method of the stream_interface,
as we no longer wakeup the stream task.
2020-01-22 18:56:05 +01:00
Olivier Houchard
8af03b396a MEDIUM: streams: Always create a conn_stream in connect_server().
In connect_server(), when creating a new connection for which we don't yet
know the mux (because it'll be decided by the ALPN), instead of associating
the connection to the stream_interface, always create a conn_stream. This way,
we have less special-casing needed. Store the conn_stream in conn->ctx,
so that we can reach the upper layers if needed.
2020-01-22 18:55:59 +01:00
Emmanuel Hocdet
6b5b44e10f BUG/MINOR: ssl: ssl_sock_load_pem_into_ckch is not consistent
"set ssl cert <filename> <payload>" CLI command should have the same
result as reload HAproxy with the updated pem file (<filename>).
Is not the case, DHparams/cert-chain is kept from the previous
context if no DHparams/cert-chain is set in the context (<payload>).

This patch should be backport to 2.1
2020-01-22 15:55:55 +01:00
Adis Nezirovic
1a693fc2fd MEDIUM: cli: Allow multiple filter entries for "show table"
For complex stick tables with many entries/columns, it can be beneficial
to filter using multiple criteria. The maximum number of filter entries
can be controlled by defining STKTABLE_FILTER_LEN during build time.

This patch can be backported to older releases.
2020-01-22 14:33:17 +01:00
Ilya Shipitsin
056c629531 BUG/MINOR: ssl: fix build on development versions of openssl-1.1.x
while working on issue #429, I encountered build failures with various
non-released openssl versions, let us improve ssl defines, switch to
features, not versions, for EVP_CTRL_AEAD_SET_IVLEN and
EVP_CTRL_AEAD_SET_TAG.

No backport is needed as there is no valid reason to build a stable haproxy
version against a development version of openssl.
2020-01-22 07:54:52 +01:00
Willy Tarreau
2086365f51 CLEANUP: pattern: remove the pat_time definition
It was inherited from acl_time, introduced in 1.3.10 by commit a84d374367
("[MAJOR] new framework for generic ACL support") and was never ever used.
Let's simply drop it now.
2020-01-22 07:44:36 +01:00
Tim Duesterhus
6a0dd73390 CLEANUP: Consistently unsigned int for bitfields
Signed bitfields of size `1` hold the values `0` and `-1`, but are
usually assigned `1`, possibly leading to subtle bugs when the value
is explicitely compared against `1`.
2020-01-22 07:28:39 +01:00
Baptiste Assmann
13a9232ebc MEDIUM: dns: use Additional records from SRV responses
Most DNS servers provide A/AAAA records in the Additional section of a
response, which correspond to the SRV records from the Answer section:

  ;; QUESTION SECTION:
  ;_http._tcp.be1.domain.tld.     IN      SRV

  ;; ANSWER SECTION:
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A1.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A8.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A5.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A6.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A4.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A3.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A2.domain.tld.
  _http._tcp.be1.domain.tld. 3600 IN      SRV     5 500 80 A7.domain.tld.

  ;; ADDITIONAL SECTION:
  A1.domain.tld.          3600    IN      A       192.168.0.1
  A8.domain.tld.          3600    IN      A       192.168.0.8
  A5.domain.tld.          3600    IN      A       192.168.0.5
  A6.domain.tld.          3600    IN      A       192.168.0.6
  A4.domain.tld.          3600    IN      A       192.168.0.4
  A3.domain.tld.          3600    IN      A       192.168.0.3
  A2.domain.tld.          3600    IN      A       192.168.0.2
  A7.domain.tld.          3600    IN      A       192.168.0.7

SRV record support was introduced in HAProxy 1.8 and the first design
did not take into account the records from the Additional section.
Instead, a new resolution is associated to each server with its relevant
FQDN.
This behavior generates a lot of DNS requests (1 SRV + 1 per server
associated).

This patch aims at fixing this by:
- when a DNS response is validated, we associate A/AAAA records to
  relevant SRV ones
- set a flag on associated servers to prevent them from running a DNS
  resolution for said FADN
- update server IP address with information found in the Additional
  section

If no relevant record can be found in the Additional section, then
HAProxy will failback to running a dedicated resolution for this server,
as it used to do.
This behavior is the one described in RFC 2782.
2020-01-22 07:19:54 +01:00
Christopher Faulet
2f5339079b MINOR: proxy/http-ana: Add support of extra attributes for the cookie directive
It is now possible to insert any attribute when a cookie is inserted by
HAProxy. Any value may be set, no check is performed except the syntax validity
(CTRL chars and ';' are forbidden). For instance, it may be used to add the
SameSite attribute:

    cookie SRV insert attr "SameSite=Strict"

The attr option may be repeated to add several attributes.

This patch should fix the issue #361.
2020-01-22 07:18:31 +01:00
Christopher Faulet
554c0ebffd MEDIUM: http-rules: Support an optional error message in http deny rules
It is now possible to set the error message to use when a deny rule is
executed. It may be a specific error file, adding "errorfile <file>" :

  http-request deny deny_status 400 errorfile /etc/haproxy/errorfiles/400badreq.http

It may also be an error file from an http-errors section, adding "errorfiles
<name>" :

  http-request deny errorfiles my-errors  # use 403 error from "my-errors" section

When defined, this error message is set in the HTTP transaction. The tarpit rule
is also concerned by this change.
2020-01-20 15:18:46 +01:00
Christopher Faulet
473e880a25 MINOR: http-ana: Add an error message in the txn and send it when defined
It is now possible to set the error message to return to client in the HTTP
transaction. If it is defined, this error message is used instead of proxy's
errors or default errors.
2020-01-20 15:18:46 +01:00