Given the following example configuration:
backend foo
mode http
use-server %[str(x)] if { always_true }
server x example.com:80
Running a configuration check with valgrind reports:
==19376== 170 (40 direct, 130 indirect) bytes in 1 blocks are definitely lost in loss record 281 of 347
==19376== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19376== by 0x5091AC: add_sample_to_logformat_list (log.c:511)
==19376== by 0x50A5A6: parse_logformat_string (log.c:671)
==19376== by 0x4957F2: check_config_validity (cfgparse.c:2588)
==19376== by 0x54442D: init (haproxy.c:2129)
==19376== by 0x421E42: main (haproxy.c:3169)
After this patch is applied the leak is gone as expected.
This is a very minor leak that can only be observed if deinit() is called,
shortly before the OS will free all memory of the process anyway. No
backport needed.
Given the following example configuration:
backend foo
mode http
use-server x if { always_true }
server x example.com:80
Running a configuration check with valgrind reports:
==18650== 14 bytes in 1 blocks are definitely lost in loss record 3 of 345
==18650== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==18650== by 0x649E489: strdup (strdup.c:42)
==18650== by 0x4A5438: cfg_parse_listen (cfgparse-listen.c:1548)
==18650== by 0x494C59: readcfgfile (cfgparse.c:2049)
==18650== by 0x5450B5: init (haproxy.c:2029)
==18650== by 0x421E42: main (haproxy.c:3168)
After this patch is applied the leak is gone as expected.
This is a very minor leak that can only be observed if deinit() is called,
shortly before the OS will free all memory of the process anyway. No
backport needed.
Given the following example configuration:
frontend foo
mode http
bind *:8080
unique-id-header x
Running a configuration check with valgrind reports:
==17621== 2 bytes in 1 blocks are definitely lost in loss record 1 of 341
==17621== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==17621== by 0x649E489: strdup (strdup.c:42)
==17621== by 0x4A87F1: cfg_parse_listen (cfgparse-listen.c:2747)
==17621== by 0x494C59: readcfgfile (cfgparse.c:2049)
==17621== by 0x545095: init (haproxy.c:2029)
==17621== by 0x421E42: main (haproxy.c:3167)
After this patch is applied the leak is gone as expected.
This is a very minor leak that can only be observed if deinit() is called,
shortly before the OS will free all memory of the process anyway. No
backport needed.
Given the following example configuration:
resolvers test
nameserver test 127.0.0.1:53
listen foo
bind *:8080
server foo example.com resolvers test
Running a configuration check within valgrind reports:
==21995== 5 bytes in 1 blocks are definitely lost in loss record 1 of 30
==21995== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21995== by 0x5726489: strdup (strdup.c:42)
==21995== by 0x4B2CFB: parse_server (server.c:2163)
==21995== by 0x4680C1: cfg_parse_listen (cfgparse-listen.c:534)
==21995== by 0x459E33: readcfgfile (cfgparse.c:2167)
==21995== by 0x50778D: init (haproxy.c:2021)
==21995== by 0x418262: main (haproxy.c:3133)
==21995==
==21995== 12 bytes in 1 blocks are definitely lost in loss record 3 of 30
==21995== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21995== by 0x5726489: strdup (strdup.c:42)
==21995== by 0x4AC666: srv_prepare_for_resolution (server.c:1606)
==21995== by 0x4B2EBD: parse_server (server.c:2081)
==21995== by 0x4680C1: cfg_parse_listen (cfgparse-listen.c:534)
==21995== by 0x459E33: readcfgfile (cfgparse.c:2167)
==21995== by 0x50778D: init (haproxy.c:2021)
==21995== by 0x418262: main (haproxy.c:3133)
with one more leak unrelated to `struct server`. After applying this
patch the leak is gone as expected.
This is a very minor leak that can only be observed if deinit() is called,
shortly before the OS will free all memory of the process anyway. No
backport needed.
Given the following example configuration:
frontend foo
mode http
bind *:8080
unique-id-format x
Running a configuration check with valgrind reports:
==30712== 42 (40 direct, 2 indirect) bytes in 1 blocks are definitely lost in loss record 18 of 39
==30712== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==30712== by 0x4ED7E9: add_to_logformat_list (log.c:462)
==30712== by 0x4EEE28: parse_logformat_string (log.c:720)
==30712== by 0x47B09A: check_config_validity (cfgparse.c:3046)
==30712== by 0x52881D: init (haproxy.c:2121)
==30712== by 0x41F382: main (haproxy.c:3126)
After this patch is applied the leak is gone as expected.
This is a very minor leak that can only be observed if deinit() is called,
shortly before the OS will free all memory of the process anyway. No
backport needed.
As reported in issue #724, openbsd fails to build in haproxy.c
due to a faulty comma in the middle of a warning message. This code
is only compiled when RLIMIT_AS is not defined, which seems to be
rare these days.
This may be backported to older versions as the problem was likely
introduced when strict limits were added.
Enables ('on') or disables ('off') sharing of idle connection pools between
threads for a same server. The default is to share them between threads in
order to minimize the number of persistent connections to a server, and to
optimize the connection reuse rate. But to help with debugging or when
suspecting a bug in HAProxy around connection reuse, it can be convenient to
forcefully disable this idle pool sharing between multiple threads, and force
this option to "off". The default is on.
This could have been nice to have during the idle connections debugging,
but it's not too late to add it!
Commit d645574 ("MINOR: soft-stop: let the first stopper only signal
other threads") introduced a minor mistake which is that when a stopping
thread signals all other threads, it also signals itself. When
single-threaded, the process constantly wakes up while waiting for
last connections to exit. Let's reintroduce the lost mask to avoid
this.
No backport is needed, this is 2.2-dev only.
Move the ckch_deinit() and crtlist_deinit() call to ssl_sock.c,
also unlink the SNI from the ckch_inst because they are free'd before in
ssl_sock_free_all_ctx().
Add some functions to deinit the whole crtlist and ckch architecture.
It will free all crtlist, crtlist_entry, ckch_store, ckch_inst and their
associated SNI, ssl_conf and SSL_CTX.
The SSL_CTX in the default_ctx and initial_ctx still needs to be free'd
separately.
Initial default settings for maxconn/maxsock/maxpipes were rearranged
in commit a409f30d0 ("MINOR: init: move the maxsock calculation code
to compute_ideal_maxsock()") but as a side effect, the calculated
maxpipes value was not stored anymore into global.maxpipes. This
resulted in splicing being disabled unless there is an explicit
maxpipes setting in the global section.
This patch just stores the calculated ideal value as planned in the
computation and as was done before the patch above.
This is strictly 2.2, no backport is needed.
Nowadays signals cause tasks to be woken up. The historic code still
processes signals after tasks, which forces a second round in the loop
before they can effectively be processed. Let's move the signal queue
handling between wake_expired_tasks() and process_runnable_tasks() where
it makes much more sense.
localpeer <name>
Sets the local instance's peer name. It will be ignored if the "-L"
command line argument is specified or if used after "peers" section
definitions. In such cases, a warning message will be emitted during
the configuration parsing.
This option will also set the HAPROXY_LOCALPEER environment variable.
See also "-L" in the management guide and "peers" section in the
configuration manual.
For an unknown reason in commit bb1b63c079 I placed the compiler version
output in haproxy.c instead of version.c. Better have it in version.c which
is more suitable to this sort of things.
We cannot simply `release_sample_expr(rule->arg.vars.expr)` for a
`struct act_rule`, because `rule->arg` is a union that might not
contain valid `vars`. This leads to a crash on a configuration using
`http-request redirect` and possibly others:
frontend http
mode http
bind 127.0.0.1:80
http-request redirect scheme https
Instead a `struct act_rule` has a `release_ptr` that must be used
to properly free any additional storage allocated.
This patch fixes a regression in commit ff78fcdd7f.
It must be backported to whereever that patch is backported.
It has be verified that the configuration above no longer crashes.
It has also been verified that the configuration in ff78fcdd7f
does not leak.
Commit 0a3b43d9c ("MINOR: haproxy: Make use of deinit_and_exit() for
clean exits") introduced this build warning:
src/haproxy.c: In function 'main':
src/haproxy.c:3775:1: warning: control reaches end of non-void function [-Wreturn-type]
}
^
This is because the new deinit_and_exit() is not marked as "noreturn"
so depending on the optimizations, the noreturn attribute of exit() will
either leak through it and silence the warning or not and confuse the
compiler. Let's just add the attribute to fix this.
No backport is needed, this is purely 2.2.
Given the following example configuration:
frontend foo
bind *:8080
mode http
http-request set-var(txn.foo) str(bar)
Running a configuration check within valgrind reports:
==23665== Memcheck, a memory error detector
==23665== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==23665== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==23665== Command: ./haproxy -c -f ./crasher.cfg
==23665==
[WARNING] 165/002941 (23665) : config : missing timeouts for frontend 'foo'.
| While not properly invalid, you will certainly encounter various problems
| with such a configuration. To fix this, please ensure that all following
| timeouts are set to a non-zero value: 'client', 'connect', 'server'.
Warnings were found.
Configuration file is valid
==23665==
==23665== HEAP SUMMARY:
==23665== in use at exit: 314,008 bytes in 87 blocks
==23665== total heap usage: 160 allocs, 73 frees, 1,448,074 bytes allocated
==23665==
==23665== 132 (48 direct, 84 indirect) bytes in 1 blocks are definitely lost in loss record 15 of 28
==23665== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==23665== by 0x4A2612: sample_parse_expr (sample.c:876)
==23665== by 0x54DF84: parse_store (vars.c:766)
==23665== by 0x528BDF: parse_http_req_cond (http_rules.c:95)
==23665== by 0x469F36: cfg_parse_listen (cfgparse-listen.c:1339)
==23665== by 0x459E33: readcfgfile (cfgparse.c:2167)
==23665== by 0x5074FD: init (haproxy.c:2021)
==23665== by 0x418262: main (haproxy.c:3126)
==23665==
==23665== LEAK SUMMARY:
==23665== definitely lost: 48 bytes in 1 blocks
==23665== indirectly lost: 84 bytes in 2 blocks
==23665== possibly lost: 0 bytes in 0 blocks
==23665== still reachable: 313,876 bytes in 84 blocks
==23665== suppressed: 0 bytes in 0 blocks
==23665== Reachable blocks (those to which a pointer was found) are not shown.
==23665== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==23665==
==23665== For counts of detected and suppressed errors, rerun with: -v
==23665== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
After this patch is applied the leak is gone as expected.
This is a very minor leak that can only be observed if deinit() is called,
shortly before the OS will free all memory of the process anyway. No
backport needed.
Particularly cleanly deinit() after a configuration check to clean up the
output of valgrind which reports "possible losses" without a deinit() and
does not with a deinit(), converting actual losses into proper hard losses
which makes the whole stuff easier to analyze.
As an example, given an example configuration of the following:
frontend foo
bind *:8080
mode http
Running `haproxy -c -f cfg` within valgrind will report 4 possible losses:
$ valgrind --leak-check=full ./haproxy -c -f ./example.cfg
==21219== Memcheck, a memory error detector
==21219== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==21219== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==21219== Command: ./haproxy -c -f ./example.cfg
==21219==
[WARNING] 165/001100 (21219) : config : missing timeouts for frontend 'foo'.
| While not properly invalid, you will certainly encounter various problems
| with such a configuration. To fix this, please ensure that all following
| timeouts are set to a non-zero value: 'client', 'connect', 'server'.
Warnings were found.
Configuration file is valid
==21219==
==21219== HEAP SUMMARY:
==21219== in use at exit: 1,436,631 bytes in 130 blocks
==21219== total heap usage: 153 allocs, 23 frees, 1,447,758 bytes allocated
==21219==
==21219== 7 bytes in 1 blocks are possibly lost in loss record 5 of 54
==21219== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21219== by 0x5726489: strdup (strdup.c:42)
==21219== by 0x468FD9: bind_conf_alloc (listener.h:158)
==21219== by 0x468FD9: cfg_parse_listen (cfgparse-listen.c:557)
==21219== by 0x459DF3: readcfgfile (cfgparse.c:2167)
==21219== by 0x5056CD: init (haproxy.c:2021)
==21219== by 0x418232: main (haproxy.c:3121)
==21219==
==21219== 14 bytes in 1 blocks are possibly lost in loss record 9 of 54
==21219== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21219== by 0x5726489: strdup (strdup.c:42)
==21219== by 0x468F9B: bind_conf_alloc (listener.h:154)
==21219== by 0x468F9B: cfg_parse_listen (cfgparse-listen.c:557)
==21219== by 0x459DF3: readcfgfile (cfgparse.c:2167)
==21219== by 0x5056CD: init (haproxy.c:2021)
==21219== by 0x418232: main (haproxy.c:3121)
==21219==
==21219== 128 bytes in 1 blocks are possibly lost in loss record 35 of 54
==21219== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21219== by 0x468F90: bind_conf_alloc (listener.h:152)
==21219== by 0x468F90: cfg_parse_listen (cfgparse-listen.c:557)
==21219== by 0x459DF3: readcfgfile (cfgparse.c:2167)
==21219== by 0x5056CD: init (haproxy.c:2021)
==21219== by 0x418232: main (haproxy.c:3121)
==21219==
==21219== 608 bytes in 1 blocks are possibly lost in loss record 46 of 54
==21219== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21219== by 0x4B953A: create_listeners (listener.c:576)
==21219== by 0x4578F6: str2listener (cfgparse.c:192)
==21219== by 0x469039: cfg_parse_listen (cfgparse-listen.c:568)
==21219== by 0x459DF3: readcfgfile (cfgparse.c:2167)
==21219== by 0x5056CD: init (haproxy.c:2021)
==21219== by 0x418232: main (haproxy.c:3121)
==21219==
==21219== LEAK SUMMARY:
==21219== definitely lost: 0 bytes in 0 blocks
==21219== indirectly lost: 0 bytes in 0 blocks
==21219== possibly lost: 757 bytes in 4 blocks
==21219== still reachable: 1,435,874 bytes in 126 blocks
==21219== suppressed: 0 bytes in 0 blocks
==21219== Reachable blocks (those to which a pointer was found) are not shown.
==21219== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==21219==
==21219== For counts of detected and suppressed errors, rerun with: -v
==21219== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 0 from 0)
Re-running the same command with the patch applied will not report any
losses any more:
$ valgrind --leak-check=full ./haproxy -c -f ./example.cfg
==22124== Memcheck, a memory error detector
==22124== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==22124== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==22124== Command: ./haproxy -c -f ./example.cfg
==22124==
[WARNING] 165/001503 (22124) : config : missing timeouts for frontend 'foo'.
| While not properly invalid, you will certainly encounter various problems
| with such a configuration. To fix this, please ensure that all following
| timeouts are set to a non-zero value: 'client', 'connect', 'server'.
Warnings were found.
Configuration file is valid
==22124==
==22124== HEAP SUMMARY:
==22124== in use at exit: 313,864 bytes in 82 blocks
==22124== total heap usage: 153 allocs, 71 frees, 1,447,758 bytes allocated
==22124==
==22124== LEAK SUMMARY:
==22124== definitely lost: 0 bytes in 0 blocks
==22124== indirectly lost: 0 bytes in 0 blocks
==22124== possibly lost: 0 bytes in 0 blocks
==22124== still reachable: 313,864 bytes in 82 blocks
==22124== suppressed: 0 bytes in 0 blocks
==22124== Reachable blocks (those to which a pointer was found) are not shown.
==22124== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==22124==
==22124== For counts of detected and suppressed errors, rerun with: -v
==22124== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
It might be worth investigating what exactly HAProxy does to lose pointers
to the start of those 4 memory areas and then to be able to still free them
during deinit(). If HAProxy is able to free them, they ideally should be
"still reachable" and not "possibly lost".
This patch fixes all the leftovers from the include cleanup campaign. There
were not that many (~400 entries in ~150 files) but it was definitely worth
doing it as it revealed a few duplicates.
There's no point splitting the file in two since only cfgparse uses the
types defined there. A few call places were updated and cleaned up. All
of them were in C files which register keywords.
There is nothing left in common/ now so this directory must not be used
anymore.
This one was not easy because it was embarking many includes with it,
which other files would automatically find. At least global.h, arg.h
and tools.h were identified. 93 total locations were identified, 8
additional includes had to be added.
In the rare files where it was possible to finalize the sorting of
includes by adjusting only one or two extra lines, it was done. But
all files would need to be rechecked and cleaned up now.
It was the last set of files in types/ and proto/ and these directories
must not be reused anymore.
extern struct dict server_name_dict was moved from the type file to the
main file. A handful of inlined functions were moved at the bottom of
the file. Call places were updated to use server-t.h when relevant, or
to simply drop the entry when not needed.
The files remained mostly unchanged since they were OK. However, half of
the users didn't need to include them, and about as many actually needed
to have it and used to find functions like srv_currently_usable() through
a long chain that broke when moving the file.
This one is particularly difficult to split because it provides all the
functions used to manipulate a proxy state and to retrieve names or IDs
for error reporting, and as such, it was included in 73 files (down to
68 after cleanup). It would deserve a small cleanup though the cut points
are not obvious at the moment given the number of structs involved in
the struct proxy itself.
The current state of the logging is a real mess. The main problem is
that almost all files include log.h just in order to have access to
the alert/warning functions like ha_alert() etc, and don't care about
logs. But log.h also deals with real logging as well as log-format and
depends on stream.h and various other things. As such it forces a few
heavy files like stream.h to be loaded early and to hide missing
dependencies depending where it's loaded. Among the missing ones is
syslog.h which was often automatically included resulting in no less
than 3 users missing it.
Among 76 users, only 5 could be removed, and probably 70 don't need the
full set of dependencies.
A good approach would consist in splitting that file in 3 parts:
- one for error output ("errors" ?).
- one for log_format processing
- and one for actual logging.
It was moved without any change, however many callers didn't need it at
all. This was a consequence of the split of proto_http.c into several
parts that resulted in many locations to still reference it.
Almost no change except moving the cli_kw struct definition after the
defines. Almost all users had both types&proto included, which is not
surprizing since this code is old and it used to be the norm a decade
ago. These places were cleaned.
Initially it looked like this could have been placed into auth.h or
stats.h but it's not the case as it's what makes the link between them
and the HTTP layer. However the file needed to be split in two. Quite
a number of call places were dropped because these were mostly leftovers
from the early days where the stats and cli were packed together.
The files were moved almost as-is, just dropping arg-t and auth-t from
acl-t but keeping arg-t in acl.h. It was useful to revisit the call places
since a handful of files used to continue to include acl.h while they did
not need it at all. Struct stream was only made a forward declaration
since not otherwise needed.
The cfg_peers external declaration was moved to the main file instead
of the type one. A few types were still missing from the proto, causing
warnings in the functions prototypes (proxy, stick_table).
The type file is becoming a mess, half of it is for the proxy protocol,
another good part describes conn_streams and mux ops, it would deserve
being split again. At least it was reordered so that elements are easier
to find, with the PP-stuff left at the end. The MAX_SEND_FD macro was moved
to compat.h as it's said to be the value for Linux.
The TASK_IS_TASKLET() macro was moved to the proto file instead of the
type one. The proto part was a bit reordered to remove a number of ugly
forward declaration of static inline functions. About a tens of C and H
files had their dependency dropped since they were not using anything
from task.h.
global.h was one of the messiest files, it has accumulated tons of
implicit dependencies and declares many globals that make almost all
other file include it. It managed to silence a dependency loop between
server.h and proxy.h by being well placed to pre-define the required
structs, forcing struct proxy and struct server to be forward-declared
in a significant number of files.
It was split in to, one which is the global struct definition and the
few macros and flags, and the rest containing the functions prototypes.
The UNIX_MAX_PATH definition was moved to compat.h.
This one is particularly tricky to move because everyone uses it
and it depends on a lot of other types. For example it cannot include
arg-t.h and must absolutely only rely on forward declarations to avoid
dependency loops between vars -> sample_data -> arg. In order to address
this one, it would be nice to split the sample_data part out of sample.h.
It was moved as-is, except for extern declaration of pattern_reference.
A few C files used to include it but didn't need it anymore after having
been split apart so this was cleaned.
One function prototype makes reference to struct mworker_proc which was
not defined there but in global.h instead. This definition, along with
the PROC_O_* fields were moved to mworker-t.h instead.
The STATS_DEFAULT_REALM and STATS_DEFAULT_URI were moved to defaults.h.
It was required to include types/pattern.h and types/sample.h since they
are mentioned in function prototypes.
It would be wise to merge this with uri_auth.h later.
A few includes were missing in each file. A definition of
struct polled_mask was moved to fd-t.h. The MAX_POLLERS macro was
moved to defaults.h
Stdio used to be silently inherited from whatever path but it's needed
for list_pollers() which takes a FILE* and which can thus not be
forward-declared.
And also rename standard.c to tools.c. The original split between
tools.h and standard.h dates from version 1.3-dev and was mostly an
accident. This patch moves the files back to what they were expected
to be, and takes care of not changing anything else. However this
time tools.h was split between functions and types, because it contains
a small number of commonly used macros and structures (e.g. name_desc)
which in turn cause the massive list of includes of tools.h to conflict
with the callers.
They remain the ugliest files of the whole project and definitely need
to be cleaned and split apart. A few types are defined there only for
functions provided there, and some parts are even OS-specific and should
move somewhere else, such as the symbol resolution code.
The protocol.h files are pretty low in the dependency and (sadly) used
by some files from common/. Almost nothing was changed except lifting a
few comments.
Regex are essentially included for myregex_t but it turns out that
several of the C files didn't include it directly, relying on the
one included by their own .h. This has been cleanly addressed so
that only the type is included by H files which need it, and adding
the missing includes for the other ones.
The type was moved out as it's used by standard.h for netns_entry.
Instead of just being a forward declaration when not used, it's an
empty struct, which makes gdb happier (the resulting stripped executable
is the same).
The pretty confusing "buffer.h" was in fact not the place to look for
the definition of "struct buffer" but the one responsible for dynamic
buffer allocation. As such it defines the struct buffer_wait and the
few functions to allocate a buffer or wait for one.
This patch moves it renaming it to dynbuf.h. The type definition was
moved to its own file since it's included in a number of other structs.
Doing this cleanup revealed that a significant number of files used to
rely on this one to inherit struct buffer through it but didn't need
anything from this file at all.
This moves types/activity.h to haproxy/activity-t.h and
proto/activity.h to haproxy/activity.h.
The macros defining the bit field values for the profiling variable
were moved to the type file to be more future-proof.
Now the file is ready to be stored into its final destination. A few
minor reorderings were performed to keep the file properly organized,
making the various sections more visible (cache & lockless).
In addition and to stay consistent, memory.c was renamed to pool.c.
This one is included almost everywhere and used to rely on a few other
.h that are not needed (unistd, stdlib, standard.h). It could possibly
make sense to split it into multiple parts to distinguish operations
performed on timers and the internal time accounting, but at this point
it does not appear much important.
This splits the hathreads.h file into types+macros and functions. Given
that most users of this file used to include it only to get the definition
of THREAD_LOCAL and MAXTHREADS, the bare minimum was placed into thread-t.h
(i.e. types and macros).
All the thread management was left to haproxy/thread.h. It's worth noting
the drop of the trailing "s" in the name, to remove the permanent confusion
that arises between this one and the system implementation (no "s") and the
makefile's option (no "s").
For consistency, src/hathreads.c was also renamed thread.c.
A number of files were updated to only include thread-t which is the one
they really needed.
Some future improvements are possible like replacing empty inlined
functions with macros for the thread-less case, as building at -O0 disables
inlining and causes these ones to be emitted. But this really is cosmetic.
Half of the users of this include only need the type definitions and
not the manipulation macros nor the inline functions. Moves the various
types into mini-clist-t.h makes the files cleaner. The other one had all
its includes grouped at the top. A few files continued to reference it
without using it and were cleaned.
In addition it was about time that we'd rename that file, it's not
"mini" anymore and contains a bit more than just circular lists.
This file is to openssl what compat.h is to the libc, so it makes sense
to move it to haproxy/. It could almost be part of api.h but given the
amount of openssl stuff that gets loaded I fear it could increase the
build time.
Note that this file contains lots of inlined functions. But since it
does not depend on anything else in haproxy, it remains safe to keep
all that together.
All files that were including one of the following include files have
been updated to only include haproxy/api.h or haproxy/api-t.h once instead:
- common/config.h
- common/compat.h
- common/compiler.h
- common/defaults.h
- common/initcall.h
- common/tools.h
The choice is simple: if the file only requires type definitions, it includes
api-t.h, otherwise it includes the full api.h.
In addition, in these files, explicit includes for inttypes.h and limits.h
were dropped since these are now covered by api.h and api-t.h.
No other change was performed, given that this patch is large and
affects 201 files. At least one (tools.h) was already freestanding and
didn't get the new one added.
When HAProxy is started with a '--' option, all following parameters are
considered configuration files. You can't add new options after a '--'.
The current reload system of the master-worker adds extra options at the
end of the arguments list. Which is a problem if HAProxy was started wih
'--'.
This patch fixes the issue by copying the new option at the beginning of
the arguments list instead of the end.
This patch must be backported as far as 1.8.
There is no reason the -S option can't take an argument which starts with
a -. This limitation must only be used for options that take a
non-finite list of parameters (-sf/-st)
This can be backported only if the previous patch which fixes
copy_argv() is backported too.
Could be backported as far as 1.9.
There is no reason the -x option can't take an argument which starts with
a -. This limitation must only be used for options that take a
non-finite list of parameters (-sf/-st)
This can be backported only if the previous patch which fixes
copy_argv() is backported too.
Could be backported as far as 1.8.
The copy_argv() function, which is used to copy and remove some of the
arguments of the command line in order to re-exec() the master process,
is poorly implemented.
The function tries to remove the -x and the -sf/-st options but without
taking into account that some of the options could take a parameter
starting with a dash.
In issue #644, haproxy starts with "-L -xfoo" which is perfectly
correct. However, the re-exec is done without "-xfoo" because the master
tries to remove the "-x" option. Indeed, the copy_argv() function does
not know how much arguments an option can have, and just assume that
everything starting with a dash is an option. So haproxy is exec() with
"-L" but without a parameter, which is wrong and leads to the exit of
the master, with usage().
To fix this issue, copy_argv() must know how much parameters an option
takes, and copy or skip the parameters correctly.
This fix is a first step but it should evolve to a cleaner way of
declaring the options to avoid deduplication of the parsing code, so we
avoid new bugs.
Should be backported with care as far as 1.8, by removing the options
that does not exists in the previous versions.
When the first thread stops and wakes others up, it's possible some of
them will also start to wake others in parallel. Let's make give this
notification task to the very first one instead since it's enough and
can reduce the amount of needless (though harmless) wakeup calls.
Currently the soft-stop can lead to old processes remaining alive for as
long as two seconds after receiving a soft-stop signal. What happens is
that when receiving SIGUSR1, one thread (usually the first one) wakes up,
handles the signal, sets "stopping", goes into runn_poll_loop(), and
discovers that stopping is set, so its also sets itself in the
stopping_thread_mask bit mask. After this it sees that other threads are
not yet willing to stop, so it continues to wait.
From there, other threads which were waiting in poll() expire after one
second on poll timeout and enter run_poll_loop() in turn. That's already
one second of wait time. They discover each in turn that they're stopping
and see that other threads are not yet stopping, so they go back waiting.
After the end of the first second, all threads know they're stopping and
have set their bit in stopping_thread_mask. It's only now that those who
started to wait first wake up again on timeout to discover that all other
ones are stopping, and can now quit. One second later all threads will
have done it and the process will quit.
This is effectively strictly larger than one second and up to two seconds.
What the current patch does is simple, when the first thread stops, it sets
its own bit into stopping_thread_mask then wakes up all other threads to do
also set theirs. This kills the first second which corresponds to the time
to discover the stopping state. Second, when a thread exists, it wakes all
other ones again because some might have gone back sleeping waiting for
"jobs" to go down to zero (i.e. closing the last connection). This kills
the last second of wait time.
Thanks to this, as SIGUSR1 now acts instantly again if there's no active
connection, or it stops immediately after the last connection has left if
one was still present.
This should be backported as far as 2.0.
HTTP health-checks are now internally based on tcp-checks. Of course all the
configuration parsing of the "http-check" keyword and the httpchk option has
been rewritten. But the main changes is that now, as for tcp-check ruleset, it
is possible to perform several send/expect sequences into the same
health-checks. Thus the connect rule is now also available from HTTP checks, jst
like set-var, unset-var and comment rules.
Because the request defined by the "option httpchk" line is used for the first
request only, it is now possible to set the method, the uri and the version on a
"http-check send" line.
The options and directives related to the configuration of checks in a backend
may be defined after the servers declarations. So, initialization of the check
of each server must not be performed during configuration parsing, because some
info may be missing. Instead, it must be done during the configuration validity
check.
Thus, callback functions are registered to be called for each server after the
config validity check, one for the server check and another one for the server
agent-check. In addition deinit callback functions are also registered to
release these checks.
This patch should be backported as far as 1.7. But per-server post_check
callback functions are only supported since the 2.1. And the initcall mechanism
does not exist before the 1.9. Finally, in 1.7, the code is totally
different. So the backport will be harder on older versions.
This options is used to force a non-SSL connection to check a SSL server or to
invert a check-ssl option inherited from the default section. The use_ssl field
in the check structure is used to know if a SSL connection must be used
(use_ssl=1) or not (use_ssl=0). The server configuration is used by default.
The problem is that we cannot distinguish the default case (no specific SSL
check option) and the case of an explicit non-SSL check. In both, use_ssl is set
to 0. So the server configuration is always used. For a SSL server, when
no-check-ssl option is set, the check is still performed using a SSL
configuration.
To fix the bug, instead of a boolean value (0=TCP, 1=SSL), we use a ternary value :
* 0 = use server config
* 1 = force SSL
* -1 = force non-SSL
The same is done for the server parameter. It is not really necessary for
now. But it is a good way to know is the server no-ssl option is set.
In addition, the PR_O_TCPCHK_SSL proxy option is no longer used to set use_ssl
to 1 for a check. Instead the flag is directly tested to prepare or destroy the
server SSL context.
This patch should be backported as far as 1.8.
The 'http-check send' directive have been added to add headers and optionnaly a
payload to the request sent during HTTP healthchecks. The request line may be
customized by the "option httpchk" directive but there was not official way to
add extra headers. An old trick consisted to hide these headers at the end of
the version string, on the "option httpchk" line. And it was impossible to add
an extra payload with an "http-check expect" directive because of the
"Connection: close" header appended to the request (See issue #16 for details).
So to make things official and fully support payload additions, the "http-check
send" directive have been added :
option httpchk POST /status HTTP/1.1
http-check send hdr Content-Type "application/json;charset=UTF-8" \
hdr X-test-1 value1 hdr X-test-2 value2 \
body "{id: 1, field: \"value\"}"
When a payload is defined, the Content-Length header is automatically added. So
chunk-encoded requests are not supported yet. For now, there is no special
validity checks on the extra headers.
This patch is inspired by Kiran Gavali's work. It should fix the issue #16 and
as far as possible, it may be backported, at least as far as 1.8.
This patch adds the sysname, release, version and machine fields from
the uname results to the version output. It intentionally leaves out the
machine name, because it is usually not useful and users might not want to
expose their machine names for privacy reasons.
May be backported if it is considered useful for debugging.
Some portability issues were met a few times in the past depending on
compiler versions, but this one was not reported in haproxy -vv output
while it's trivial to add it. This patch tries to be the most accurate
by explicitly reporting the clang version if detected, otherwise the
gcc version.
Since some systems switched to service managers which hide all warnings
by default, some users are not aware of some possibly important warnings
and get caught too late with errors that could have been detected earlier.
This patch adds a new global keyword, "zero-warning" and an equivalent
command-line option "-dW" to refuse to start in case any warning is
detected. It is recommended to use these with configurations that are
managed by humans in order to catch mistakes very early.
This helps quickly checking if the config produces any warning. For
this we reuse the "warned" bit field to add a new WARN_ANY bit that is
set by ha_warning(). The rest of the bit field was also cleaned from
unused bits.
In run_thread_poll_loop() we test both for (global_tasks_mask & tid_bit)
and thread_has_tasks(), but the former is useless since this test is
already part of the latter.
Commit 4b3f27b ("BUG/MINOR: haproxy/threads: try to make all threads
leave together") improved the soft-stop synchronization but it left a
small race open because it looks at tasks_run_queue, which can drop
to zero then back to one while another thread picks the task from the
run queue to insert it into the tasklet_list. The risk is very low but
not null. In addition the condition didn't consider the possible presence
of signals in the queue.
This patch moves the stopping detection just after the "wake" calculation
which already takes care of the various queues' sizes and signals. It
avoids needlessly duplicating these tests.
The bug was discovered during a code review but will probably never be
observed. This fix may be backported to 2.1 and 2.0 along with the commit
above.
Revamp the server connection lists. We know have 3 lists :
- idle_conns, which contains idling connections
- safe_conns, which contains idling connections that are safe to use even
for the first request
- available_conns, which contains connections that are not idling, but can
still accept new streams (those are HTTP/2 or fastcgi, and are always
considered safe).
It's more generic and versatile than the previous shut_your_big_mouth_gcc()
that was used to silence annoying warnings as it's not limited to ignoring
syscalls returns only. This allows us to get rid of the aforementioned
function and the shut_your_big_mouth_gcc_int variable, that started to
look ugly in multi-threaded environments.
There's a small issue with soft stop combined with the incoming
connection load balancing. A thread may dispatch a connection to
another one at the moment stopping=1 is set, and the second one could
stop by seeing (jobs - unstoppable_jobs) == 0 in run_poll_loop(),
without ever picking these connections from the queue. This is
visible in that it may occasionally cause a connection drop on
reload since no remaining thread will ever pick that connection
anymore.
In order to address this, this patch adds a stopping_thread_mask
variable by which threads acknowledge their willingness to stop
when their runqueue is empty. And all threads will only stop at
this moment, so that if finally some late work arrives in the
thread's queue, it still has a chance to process it.
This should be backported to 2.1 and 2.0.
Surprizingly the variable was never initialized, though on most
platforms it's zeroed at boot, and it is relatively harmless
anyway since in the worst case the bits are updated around poll().
This was introduced by commit 79321b95a8 and needs to be backported
as far as 1.9.
Remove the list of private connections from server, it has been largely
unused, we only inserted connections in it, but we would never actually
use it.
When a maximum memory setting is passed to haproxy and maxconn is not set
and ulimit-n is not set, it is expected that maxconn will be set to the
highest value permitted by this memory setting, possibly affecting the
FD limit.
When maxconn was changed to be deduced from the current process's FD limit,
the automatic setting above was partially lost because it now remains
limited to the current FD limit in addition to being limited to the
memory usage. For unprivileged processes it does not change anything,
but for privileged processes the difference is important. Indeed, the
previous behavior ensured that the new FD limit could be enforced on
the process as long as the user had the privilege to do so. Now this
does not happen anymore, and some people rely on this for automatic
sizing in VM environments.
This patch implements the ability to verify if the setting will be
enforceable on the process or not. First it computes maxconn based on
the memory limits alone, then checks if the process is willing to accept
them, otherwise tries again by respecting the process' hard limit.
Thanks to this we now have the best of the pre-2.0 behavior and the
current one, in that privileged users will be able to get as high a
maxconn as they need just based on the memory limit, while unprivileged
users will still get as high a setting as permitted by the intersection
of the memory limit and the process' FD limit.
Ideally, after some observation period, this patch along with the
previous one "MINOR: init: move the maxsock calculation code to
compute_ideal_maxsock()" should be backported to 2.1 and 2.0.
Thanks to Baptiste for raising the issue.
The maxsock value is currently derived from global.maxconn and a few other
settings, some of which also depend on global.maxconn. This makes it
difficult to check if a limit is already too high or not during the maxconn
automatic sizing.
Let's move this code into a new function, compute_ideal_maxsock() which now
takes a maxconn in argument. It performs the same operations and returns
the maxsock value if global.maxconn were to be set to that value. It now
replaces the previous code to compute maxsock.
This is the replacement of failed attempt to add thread safety and
per-process sequences of random numbers initally tried with commit
1c306aa84d ("BUG/MEDIUM: random: implement per-thread and per-process
random sequences").
This new version takes a completely different approach and doesn't try
to work around the horrible OS-specific and non-portable random API
anymore. Instead it implements "xoroshiro128**", a reputedly high
quality random number generator, which is one of the many variants of
xorshift, which passes all quality tests and which is described here:
http://prng.di.unimi.it/
While not cryptographically secure, it is fast and features a 2^128-1
period. It supports fast jumps allowing to cut the period into smaller
non-overlapping sequences, which we use here to support up to 2^32
processes each having their own, non-overlapping sequence of 2^96
numbers (~7*10^28). This is enough to provide 1 billion randoms per
second and per process for 2200 billion years.
The implementation was made thread-safe either by using a double 64-bit
CAS on platforms supporting it (x86_64, aarch64) or by using a local
lock for the time needed to perform the shift operations. This ensures
that all threads pick numbers from the same pool so that it is not
needed to assign per-thread ranges. For processes we use the fast jump
method to advance the sequence by 2^96 for each process.
Before this patch, the following config:
global
nbproc 8
frontend f
bind :4445
mode http
log stdout format raw daemon
log-format "%[uuid] %pid"
redirect location /
Would produce this output:
a4d0ad64-2645-4b74-b894-48acce0669af 12987
a4d0ad64-2645-4b74-b894-48acce0669af 12992
a4d0ad64-2645-4b74-b894-48acce0669af 12986
a4d0ad64-2645-4b74-b894-48acce0669af 12988
a4d0ad64-2645-4b74-b894-48acce0669af 12991
a4d0ad64-2645-4b74-b894-48acce0669af 12989
a4d0ad64-2645-4b74-b894-48acce0669af 12990
82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12987
82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12992
82d5f6cd-f6c1-4f85-a89c-36ae85d26fb9 12986
(...)
And now produces:
f94b29b3-da74-4e03-a0c5-a532c635bad9 13011
47470c02-4862-4c33-80e7-a952899570e5 13014
86332123-539a-47bf-853f-8c8ea8b2a2b5 13013
8f9efa99-3143-47b2-83cf-d618c8dea711 13012
3cc0f5c7-d790-496b-8d39-bec77647af5b 13015
3ec64915-8f95-4374-9e66-e777dc8791e0 13009
0f9bf894-dcde-408c-b094-6e0bb3255452 13011
49c7bfde-3ffb-40e9-9a8d-8084d650ed8f 13014
e23f6f2e-35c5-4433-a294-b790ab902653 13012
There are multiple benefits to using this method. First, it doesn't
depend anymore on a non-portable API. Second it's thread safe. Third it
is fast and more proven than any hack we could attempt to try to work
around the deficiencies of the various implementations around.
This commit depends on previous patches "MINOR: tools: add 64-bit rotate
operators" and "BUG/MEDIUM: random: initialize the random pool a bit
better", all of which will need to be backported at least as far as
version 2.0. It doesn't require to backport the build fixes for circular
include files dependecy anymore.
This reverts commit 1c306aa84d.
It breaks the build on all non-glibc platforms. I got confused by the
man page (which possibly is the most confusing man page I've ever read
about a standard libc function) and mistakenly understood that random_r
was portable, especially since it appears in latest freebsd source as
well but not in released versions, and with a slightly different API :-/
We need to find a different solution with a fallback. Among the
possibilities, we may reintroduce this one with a fallback relying on
locking around the standard functions, keeping fingers crossed for no
other library function to call them in parallel, or we may also provide
our own PRNG, which is not necessarily more difficult than working
around the totally broken up design of the portable API.