The url_decode() function used by the url_dec converter and a few other
call points is ambiguous on its processing of the '+' character which
itself isn't stable in the spec. This one belongs to the reserved
characters for the query string but not for the path nor the scheme,
in which it must be left as-is. It's only in argument strings that
follow the application/x-www-form-urlencoded encoding that it must be
turned into a space, that is, in query strings and POST arguments.
The problem is that the function is used to process full URLs and
paths in various configs, and to process query strings from the stats
page for example.
This patch updates the function to differentiate the situation where
it's parsing a path and a query string. A new argument indicates if a
query string should be assumed, otherwise it's only assumed after seeing
a question mark.
The various locations in the code making use of this function were
updated to take care of this (most call places were using it to decode
POST arguments).
The url_dec converter is usually called on path or url samples, so it
needs to remain compatible with this and will default to parsing a path
and turning the '+' to a space only after a question mark. However in
situations where it would explicitly be extracted from a POST or a
query string, it now becomes possible to enforce the decoding by passing
a non-null value in argument.
It seems to be what was reported in issue #585. This fix may be
backported to older stable releases.
Ryan O'Hara reported that haproxy breaks on fedora-32 using gcc-10
(pre-release). It turns out that constructs such as:
while (item != head) {
item = LIST_ELEM(item.n);
}
loop forever, never matching <item> to <head> despite a printf there
showing them equal. In practice the problem is that the LIST_ELEM()
macro is wrong, it assigns the subtract of two pointers (an integer)
to another pointer through a cast to its pointer type. And GCC 10 now
considers that this cannot match a pointer and silently optimizes the
comparison away. A tested workaround for this is to build with
-fno-tree-pta. Note that older gcc versions even with -ftree-pta do
not exhibit this rather surprizing behavior.
This patch changes the test to instead cast the null-based address to
an int to get the offset and subtract it from the pointer, and this
time it works. There were just a few places to adjust. Ideally
offsetof() should be used but the LIST_ELEM() API doesn't make this
trivial as it's commonly called with a typeof(ptr) and not typeof(ptr*)
thus it would require to completely change the whole API, which is not
something workable in the short term, especially for a backport.
With this change, the emitted code is subtly different even on older
versions. A code size reduction of ~600 bytes and a total executable
size reduction of ~1kB are expected to be observed and should not be
taken as an anomaly. Typically this loop in dequeue_proxy_listeners() :
while ((listener = MT_LIST_POP(...)))
used to produce this code where the comparison is performed on RAX
while the new offset is assigned to RDI even though both are always
identical:
53ded8: 48 8d 78 c0 lea -0x40(%rax),%rdi
53dedc: 48 83 f8 40 cmp $0x40,%rax
53dee0: 74 39 je 53df1b <dequeue_proxy_listeners+0xab>
and now produces this one which is slightly more efficient as the
same register is used for both purposes:
53dd08: 48 83 ef 40 sub $0x40,%rdi
53dd0c: 74 2d je 53dd3b <dequeue_proxy_listeners+0x9b>
Similarly, retrieving the channel from a stream_interface using si_ic()
and si_oc() used to cause this (stream-int in rdi):
1cb7: c7 47 1c 00 02 00 00 movl $0x200,0x1c(%rdi)
1cbe: f6 47 04 10 testb $0x10,0x4(%rdi)
1cc2: 74 1c je 1ce0 <si_report_error+0x30>
1cc4: 48 81 ef 00 03 00 00 sub $0x300,%rdi
1ccb: 81 4f 10 00 08 00 00 orl $0x800,0x10(%rdi)
and now causes this:
1cb7: c7 47 1c 00 02 00 00 movl $0x200,0x1c(%rdi)
1cbe: f6 47 04 10 testb $0x10,0x4(%rdi)
1cc2: 74 1c je 1ce0 <si_report_error+0x30>
1cc4: 81 8f 10 fd ff ff 00 orl $0x800,-0x2f0(%rdi)
There is extremely little chance that this fix wakes up a dormant bug as
the emitted code effectively does what the source code intends.
This must be backported to all supported branches (dropping MT_LIST_ELEM
and the spoa_example parts as needed), since the bug is subtle and may
not always be visible even when compiling with gcc-10.
ST_F_CHECK_DURATION is now part of exported server metrics, named
haproxy_server_check_duration_seconds and expressed in seconds. For a given
server, this value is exported only if the healthcheck is finished (the status
is greater or equal to HCHK_STATUS_CHECKED).
This patch fixes the issue #519. It may be backported as fat as 2.0.
Historically we used to require that the connections held the desired
polling states for the data layer and the socket layer. Then with muxes
these were more or less merged into the transport layer, and now it
happens that with all transport layers having their own state, the
"transport layer state" as we have it in the connection (XPRT_RD_ENA,
XPRT_WR_ENA) is only an exact copy of the undelying file descriptor
state, but with a delay. All of this is causing some difficulties at
many places in the code because there are still some locations which
use the conn_want_* API to remain clean and only rely on connection,
and count on a later collection call to conn_cond_update_polling(),
while others need an immediate action and directly use the FD updates.
Since our updates are now much cheaper, most of them being only an
atomic test-and-set operation, and since our I/O callbacks are deferred,
there's no benefit anymore in trying to "cache" the transient state
change in the connection flags hoping to cancel them before they
become an FD event. Better make such calls transparent indirections
to the FD layer instead and get rid of the deferred operations which
needlessly complicate the logic inside.
This removes flags CO_FL_XPRT_{RD,WR}_ENA and CO_FL_WILL_UPDATE.
A number of functions related to polling updates were either greatly
simplified or removed.
Two places were using CO_FL_XPRT_WR_ENA as a hint to know if more data
were expected to be sent after a PROXY protocol or SOCKSv4 header. These
ones were simply replaced with a check on the subscription which is
where we ought to get the autoritative information from.
Now the __conn_xprt_want_* and their conn_xprt_want_* counterparts
are the same. conn_stop_polling() and conn_xprt_stop_both() are the
same as well. conn_cond_update_polling() only causes errors to stop
polling. It also becomes way more obvious that muxes should not at
all employ conn_xprt_{want|stop}_{recv,send}(), and that the call
to __conn_xprt_stop_recv() in case a mux failed to allocate a buffer
is inappropriate, it ought to unsubscribe from reads instead. All of
this definitely requires a serious cleanup.
This is convenient when processing large dumps, it allows to copy-paste
values to inspect from one window to another, or to directly transfer
a "show fd"/"show stream" output through sed. In order to do this, simply
pass "-" alone instead of the value and they will all be read one line at
a time from stdin. For example, in order to quickly print the different
set of connection flags from "show fd", this is sufficient:
sed -ne 's/^.* cflg=\([^ ]*\).*/\1/p' | contrib/debug/flags conn -
It's often convenient, for example to dump two channels or two stream-int
at once. Now all input values are decoded and the value is recalled before
the dump when there is more than one to display.
It's often confusing to have a whole dump on the screen while only
checking for a set of task or stream flags, and appending "|grep ^chn"
isn't very convenient to repeat the opeation. Instead let's add the
ability to filter the output as certain types only by prepending their
name(s) before the value.
Commit 477902bd2e ("MEDIUM: connections: Get ride of the xprt_done
callback.") broke the master CLI for a very obscure reason. It happens
that short requests immediately terminated by a shutdown are properly
received, CS_FL_EOS is correctly set, but in si_cs_recv(), we refrain
from setting CF_SHUTR on the channel because CO_FL_CONNECTED was not
yet set on the connection since we've not passed again through
conn_fd_handler() and it was not done in conn_complete_session(). While
commit a8a415d31a ("BUG/MEDIUM: connections: Set CO_FL_CONNECTED in
conn_complete_session()") fixed the issue, such accident may happen
again as the root cause is deeper and actually comes down to the fact
that CO_FL_CONNECTED is lazily set at various check points in the code
but not every time we drop one wait bit. It is not the first time we
face this situation.
Originally this flag was used to detect the transition between WAIT_*
and CONNECTED in order to call ->wake() from the FD handler. But since
at least 1.8-dev1 with commit 7bf3fa3c23 ("BUG/MAJOR: connection: update
CO_FL_CONNECTED before calling the data layer"), CO_FL_CONNECTED is
always synchronized against the two others before being checked. Moreover,
with the I/Os moved to tasklets, the decision to call the ->wake() function
is performed after the I/Os in si_cs_process() and equivalent, which don't
care about this transition either.
So in essence, checking for CO_FL_CONNECTED has become a lazy wait to
check for (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN), but that always
relies on someone else having synchronized it.
This patch addresses it once for all by killing this flag and only checking
the two others (for which a composite mask CO_FL_WAIT_L4L6 was added). This
revealed a number of inconsistencies that were purposely not addressed here
for the sake of bisectability:
- while most places do check both L4+L6 and HANDSHAKE at the same time,
some places like assign_server() or back_handle_st_con() and a few
sample fetches looking for proxy protocol do check for L4+L6 but
don't care about HANDSHAKE ; these ones will probably fail on TCP
request session rules if the handshake is not complete.
- some handshake handlers do validate that a connection is established
at L4 but didn't clear CO_FL_WAIT_L4_CONN
- the ->ctl method of mux_fcgi, mux_pt and mux_h1 only checks for L4+L6
before declaring the mux ready while the snd_buf function also checks
for the handshake's completion. Likely the former should validate the
handshake as well and we should get rid of these extra tests in snd_buf.
- raw_sock_from_buf() would directly set CO_FL_CONNECTED and would only
later clear CO_FL_WAIT_L4_CONN.
- xprt_handshake would set CO_FL_CONNECTED itself without actually
clearing CO_FL_WAIT_L4_CONN, which could apparently happen only if
waiting for a pure Rx handshake.
- most places in ssl_sock that were checking CO_FL_CONNECTED don't need
to include the L4 check as an L6 check is enough to decide whether to
wait for more info or not.
It also becomes obvious when reading the test in si_cs_recv() that caused
the failure mentioned above that once converted it doesn't make any sense
anymore: having CS_FL_EOS set while still waiting for L4 and L6 to complete
cannot happen since for CS_FL_EOS to be set, the other ones must have been
validated.
Some of these parts will still deserve further cleanup, and some of the
observations above may induce some backports of potential bug fixes once
totally analyzed in their context. The risk of breaking existing stuff
is too high to blindly backport everything.
The failed_secu counter is only used for the servers stats. It is used to report
the number of denied responses. On proxies, the same info is stored in the
denied_resp counter. So, it is more consistent to use the same field for
servers.
ST_F_CHECK_STATUS and ST_F_CHECK_CODE are now part of exported server metrics:
* haproxy_server_check_status
* haproxy_server_check_code
The heathcheck status is an integer corresponding to HCHK_STATUS value.
These ones used to serve as a set of switches between CO_FL_SOCK_* and
CO_FL_XPRT_*, and now that the SOCK layer is gone, they're always a
copy of the last know CO_FL_XPRT_* ones that is resynchronized before
I/O events by calling conn_refresh_polling_flags(), and that are pushed
back to FDs when detecting changes with conn_xprt_polling_changes().
While these functions are not particularly heavy, what they do is
totally redundant by now because the fd_want_*/fd_stop_*() actions
already perform test-and-set operations to decide to create an entry
or not, so they do the exact same thing that is done by
conn_xprt_polling_changes(). As such it is pointless to call that
one, and given that the only reason to keep CO_FL_CURR_* is to detect
changes there, we can now remove them.
Even if this does only save very few cycles, this removes a significant
complexity that has been responsible for many bugs in the past, including
the last one affecting FreeBSD.
All tests look good, and no performance regressions were observed.
we were decoding all substring and then parsing; this could lead to
consider & and = in decoding result as delimiters where it should not.
this patch reverses the order by first parsing and then decoding each key
and value separately.
we also stop parsing after number sign (#).
This patch should be backported to 2.1 and 2.0
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
By passing the parameter "no-maint" in the query-string, it is now possible to
ignore servers in maintenance. It means that the metrics for servers in this
state will not be exported.
Now, the prometheus exporter parses the HTTP query-string to filter or to adapt
the exported metrics. In this first version, it is only possible select the
scopes of metrics to export. To do so, one or more parameters with "scope" as
name must be passed in the query-string, with one of those values: global,
frontend, backend, server or '*' (means all). A scope parameter with no value
means to filter out all scopes (nothing is returned). The scope parameters are
parsed in their appearance order in the query-string. So an empty scope will
reset all scopes already parsed. But it can be overridden by following scope
parameters in the query-string. By default everything is exported.
The filtering can also be done on prometheus scraping configuration, but general
aim is to optimise the source of data to improve load and scraping time. This is
particularly true for huge configuration with thousands of backends and servers.
Also note that this configuration was possible on the previous official haproxy
exporter but with even more parameters to select the needed metrics. Here we
thought it was sufficient to simply avoid a given type of metric. However, more
filters are still possible.
Thanks to William Dauchy. This patch is based on his work.
This adds two extra metrics per server, one for the current number of idle
connections and one for the configured limit :
* haproxy_server_idle_connections_current
* haproxy_server_idle_connections_limit
The following metrics have been renamed without the "_http" part :
* http_queue_time_average_seconds => queue_time_average_seconds
* http_connect_time_average_seconds => connect_time_average_seconds
* http_response_time_average_seconds => response_time_average_seconds
* http_total_time_average_seconds => total_time_average_seconds
These metrics are reported per backend and per server and are not specific to
HTTP sessions.
Now, for the sessions, the maximum times (queue, connect, response, total) are
reported in addition of the averages over the last 1024 connections. These
metrics are reported per backend and per server. Here are the metrics name :
* haproxy_backend_max_queue_time_seconds
* haproxy_backend_max_connect_time_seconds
* haproxy_backend_max_response_time_seconds
* haproxy_backend_max_total_time_seconds
and
* haproxy_server_max_queue_time_seconds
* haproxy_server_max_connect_time_seconds
* haproxy_server_max_response_time_seconds
* haproxy_server_max_total_time_seconds
This patch is related to #272.
The rcsid variable is static an unused, causing a build warning. Let's
just add __attribute__((unused)) to shut the warning.
This may be backported to 2.0.
The metrics QTIME, CTIME, RTIME and TTIME are now returned in seconds using a
float representation instead of in milliseconds. So these metrics are now
consistent with their announced type and respect Prometheus naming conventions.
This patch fixes the issue #288. It may be backported to 2.0. If so, the
previous patch, introducing the support for float fields in stats is mantatory
and should be backported first.
Now, following status are reported for servers:0=DOWN, 1=UP, 2=MAINT, 3=DRAIN,
4=NOLB.
It is linked to the github issue #255. Thanks to Mickaël Martin. If needed, this
patch may be backported to 2.0.
This simple program prepares a TCP connection between two ends and
allows to perform various operations on them such as send, recv, poll,
shutdown, close, reset, etc. It takes care of remaining particularly
silent to help inspection via strace, though it can also be verbose
and report status, errno, and poll events. It delays acceptation of
the incoming server-side connection so that it's even possible to
test the poll status on a listener with a pending connection, or
to close the connection without accepting it and inspect the effect
on the client.
Actions are executed in the command line order as they are parsed,
they may be grouped using commas when they are performed on the same
socket.
Example showing a successful recv() of pending data before a pending error:
$ ./poll -v -l pol,acc,pol -c snd,shw -s pol,rcv,pol,rcv,pol,snd,lin,clo -c pol,rcv,pol,rcv,pol
#### BEGIN ####
cmd #1 stp #1: do_pol(3): ret=1 ev=0x1 (IN)
cmd #1 stp #2: do_acc(3): ret=5
cmd #1 stp #3: do_pol(3): ret=0 ev=0
cmd #2 stp #1: do_snd(4): ret=3
cmd #2 stp #2: do_shw(4): ret=0
cmd #3 stp #1: do_pol(5): ret=1 ev=0x2005 (IN OUT RDHUP)
cmd #3 stp #2: do_rcv(5): ret=3
cmd #3 stp #3: do_pol(5): ret=1 ev=0x2005 (IN OUT RDHUP)
cmd #3 stp #4: do_rcv(5): ret=0
cmd #3 stp #5: do_pol(5): ret=1 ev=0x2005 (IN OUT RDHUP)
cmd #3 stp #6: do_snd(5): ret=3
cmd #3 stp #7: do_lin(5): ret=0
cmd #3 stp #8: do_clo(5): ret=0
cmd #4 stp #1: do_pol(4): ret=1 ev=0x201d (IN OUT ERR HUP RDHUP)
cmd #4 stp #2: do_rcv(4): ret=3
cmd #4 stp #3: do_pol(4): ret=1 ev=0x201d (IN OUT ERR HUP RDHUP)
cmd #4 stp #4: do_rcv(4): ret=-1 (Connection reset by peer)
cmd #4 stp #5: do_pol(4): ret=1 ev=0x2015 (IN OUT HUP RDHUP)
#### END ####
Prometheus protocol defines HELP and TYPE as a token after the '#' and
the space after the '#' is necessary.
This is expected in the prometheus python client for example
(a8f5c80f65/prometheus_client/parser.py (L194))
and the missing space is breaking the parsing of metrics' type.
This patch must be backported to 2.0.
The old module proto_http does not exist anymore. All code dedicated to the HTTP
analysis is now grouped in the file proto_htx.c. So, to finish the polishing
after removing the legacy HTTP code, proto_htx.{c,h} files have been moved in
http_ana.{c,h} files.
In addition, all HTX analyzers and related functions prefixed with "htx_" have
been renamed to start with "http_" instead.
Many flags of the HTTP transction (TX_*) are now unused and useless. So the
flags TX_WAIT_CLEANUP, TX_HDR_CONN_*, TX_CON_CLO_SET and TX_CON_KAL_SET were
removed. Most of TX_CON_WANT_* were also removed. Only TX_CON_WANT_TUN has been
kept.
First of all, all legacy HTTP analyzers and all functions exclusively used by
them were removed. So the most of the functions in proto_http.{c,h} were
removed. Only functions to deal with the HTTP transaction have been kept. Then,
http_msg and hdr_idx modules were entirely removed. And finally the structure
http_msg was lightened of all its useless information about the legacy HTTP. The
structure hdr_ctx was also removed because unused now, just like unused states
in the enum h1_state. Note that the memory pool "hdr_idx" was removed and
"http_txn" is now smaller.
When the response buffer is full and nothing more can be inserted, it is
important to not try to insert an empty data block. Otherwise, when the function
channel_add_input() is called, the flag CF_READ_PARTIAL is set on the response
channel while nothing was read and the stream is uselessly woken up. Finally, we
have loop while the response buffer is full.
This patch must be backported to 2.0.
The previous commit e6cdfe574 ("BUG/MINOR: contrib/prometheus-exporter: Don't
use channel_htx_recv_max()") is buggy. The buffer's reserve must be respected.
This patch must be backported to 2.0 and 1.9.
The function htx_free_data_space() must be used intead. Otherwise, if there are
some output data not already forwarded, the maximum amount of data that may be
inserted into the buffer may be greater than what we can really insert.
This patch must be backported to 2.0.
The following example files awere removed as irrelevant by this
time :
auth.cfg check.conf ssl.cfg haproxy.spec
The following scripts were removed as having been unused for more
than a decade :
debug2ansi debug2html debugfind check init.haproxy stats_haproxy.sh
seemless_reload.txt was moved to doc/ where it's more suitable.
haproxy.vim was moved to contrib/syntax-highlight/
scripts/create-release was updated not to try to update haproxy.spec
anymore.
The INSTALL guide, the Lua doc and the Prometheus exporter's README all
used to reference "linux2628", "linux26" or even "linux". These were all
updated to consistently reflect "linux-glibc" instead. The default options
were updated there as well so that it should build cleanly on most distros.
When built with the dummy 51Degrees library for testing, the output will
include "(dummy library)" to ensure it is clear that this is this is not
the API.
This way the directory structure remains the same as with the real lib and
one can apply the same build options regardless of where the lib is stored,
removing any possible confusion.
These are intended for use by HAProxy developers to ensure any changes
did not affect the 51Degrees implementation. The 51Degrees module can be
enabled and used by using the source in contrib/51d. This will run
without breaking, but will not return any meaningful information.
This is ideal for testing HAProxy core code, and other modules alongside
51Degrees, but should never be used as an actual module as it does
nothing.
The example configuration uses sess.ip_score however this variable
is not referenced within the example scripts. This patch adds support
for sess.ip_score to the python + lua scripts and generates a
random number between 1 and 100.
This type of blocks is useless because transition between data and trailers is
obvious. And when there is no trailers, the end-of-message is still there to
know when data end for chunked messages.
Since recent changes on the way HTX data blocks are added in an HTX message, we
must now be sure the prometheus service add its own blocks in one time. Indeed,
the function htx_add_data() may now decide to only copy a part of data. So
instead, we must call htx_add_data_atonce() instead.
The idle_pct thread-local variable was moved to struct thread_info by
commit 81036f2 ("MINOR: time: move the cpu, mono, and idle time to
thread_info") but not updated in service-prometheus.c, thus breaking
it.
No backport is needed. This fixes GH issue #110.
Add session flags, and add a new flag, SESS_FL_PREFER_LAST, to be set when
we use NTLM authentication, and we should reuse the last connection. This
should fix using NTLM with HTX. This totally replaces TX_PREFER_LAST.
This should be backported to 1.9.
The current coverage of the dummy library was limited because the callbacks
passed to wurfl_lookup() were not called. Now we do call them with one existing
and one non-existing headers to make sure that ha_wurfl_retrieve_header() is
covered by the tests as well.
I will replace thread by processes. Note that, I keep the pthread_key
system for identifiying process in the same way that threads. Note
also that I keep commented out the original thread code because I hope
to reactivate it.
The patch "MINOR: systemd: Make use of master socket in systemd unit"
introduces an environment file in /etc/default.
Unfortunatly this is not supported on redhat-based system, so we add
/etc/sysconfig/haproxy for that.
Unless the EXTRAOPTS variable is overriden in /etc/default/haproxy
the unit file will use the master socket by default.
This patch may be backported to 1.9 and depends on
MINOR: systemd: Use the variables from /etc/default/haproxy.
This will allow seamless upgrades from the sysvinit system while respecting
any changes the users may have made. It will also make local configuration
easier than overriding the systemd unit file.
Note by Tim:
This GPL-2 licensed patch was taken from the Debian project at [1].
It was slightly modified to cleanly apply, because HAProxy's default unit
file does not include rsyslog.service as an 'After' dependency. Also the
subject line was modified to include the proper subsystem and severity.
This patch may be backported to 1.9.
[1] https://salsa.debian.org/haproxy-team/haproxy/blob/master/debian/patches/haproxy.service-use-environment-variables.patch
Co-authored-by: Tim Duesterhus <tim@bastelstu.be>
I discovered this bug when running OWASP regression tests against HAProxy +
modsecurity-spoa (it's a POC to evaluate how it is working). I found out that
modsecurity spoa will crash when the request doesn't have any Host header.
See the pull request #86 on github for details.
This patch must be backported to 1.9 and 1.8.
This is dummy version of the Scientiamobile WURFL C API that can be used
to successfully build/run haproxy compiled with USE_WURFL=1.
It is marked as version 1.11.2.100 to distinguish it from any real version
of the lib. It has no external dependencies so it should work out of the
box by building it like this :
$ make -C contrib/wurfl
In order to use it, simply reference this directory as the WURFL include
and library paths :
$ make TARGET=<target> USE_WURFL=1 WURFL_INC=$PWD/contrib/wurfl WURFL_LIB=$PWD/contrib/wurfl
Some metrics have been renamed and their type adapted to be more usable in
Prometheus:
* haproxy_process_uptime_seconds -> haproxy_process_start_time_seconds
* haproxy_process_max_memory -> haproxy_process_max_memory_bytes
* haproxy_process_pool_allocated_total -> haproxy_process_pool_allocated_bytes
* haproxy_process_pool_used_total -> haproxy_process_pool_used_bytes
* haproxy_process_ssl_cache_lookups -> haproxy_process_ssl_cache_lookups_total
* haproxy_process_ssl_cache_misses -> haproxy_process_ssl_cache_misses_total
No backport needed. See issue #81 on github.
Following metrics have been removed:
* haproxy_frontend_connections_rate_current (ST_F_CONN_RATE)
* haproxy_frontend_http_requests_rate_current (ST_F_REQ_RATE)
* haproxy_*_current_session_rate (ST_F_RATE)
These rates can be deduced using the total value with this kind of formula:
rate(haproxy_frontend_connections_total[1m])
No backport needed. See issue #81 on github.
I found on an (old) AIX 5.1 machine that stdint.h didn't exist while
inttypes.h which is expected to include it does exist and provides the
desired functionalities.
As explained here, stdint being just a subset of inttypes for use in
freestanding environments, it's probably always OK to switch to inttypes
instead:
https://pubs.opengroup.org/onlinepubs/009696799/basedefs/stdint.h.html
Also it's even clearer here in the autoconf doc :
https://www.gnu.org/software/autoconf/manual/autoconf-2.61/html_node/Header-Portability.html
"The C99 standard says that inttypes.h includes stdint.h, so there's
no need to include stdint.h separately in a standard environment.
Some implementations have inttypes.h but not stdint.h (e.g., Solaris
7), but we don't know of any implementation that has stdint.h but not
inttypes.h"
Since the flag EOI was added on channels, some hidden bugs in the prometheus
exporter now leads to error. the visible effect is that responses are
truncated.
So first of all, channel_add_input() must be called when the response headers
and the EOM block are added. To be sure to correctly update the response channel
(especially to_forward value). Then the request must really be fully
consumed. And finally, the return clause in the switch has been replaced by a
break. It was totally wrong to skip the end of the function in the states
PROMEX_DONE and PROMEX_ERROR. (Note that PROMEX_ERROR was never used, so it was
replaced by PROMEX_END just to ease reading the code).
No need to backport this patch, the Prometheus exporter does not exist in early
versions.
It has been developped as a service applet. Internally, it is called
"promex". To build HAProxy with the promex service, you should use the Makefile
variable "EXTRA_OBJS". To be used, it must be enabled in the configuration with
an "http-request" rule and the corresponding HTTP proxy must enable the HTX
support. For instance:
frontend test
mode http
...
option http-use-htx
http-request use-service prometheus-exporter if { path /metrics }
...
See contrib/prometheus-exporter/README for details.
These flags haven't been used for a while. SF_TUNNEL was reintroduced
by commit d62b98c6e ("MINOR: stream: don't set backend's nor response
analysers on SF_TUNNEL") to handle the two-level streams needed to
deal with the first model for H2, and was not removed after this model
was abandonned. SF_INITIALIZED was only set. SF_CONN_TAR was never
referenced at all.
This generates the tables and indexes which will be used by the HPACK
encoder. The headers are sorted by length, then by statistical frequency,
then by direction (preference for responses), then by name, then by index.
The purpose is to speed up their lookup.
The SI_FL_WANT_PUT flag is used in an awkward way, sometimes it's
set by the stream-interface to mean "I have something to deliver",
sometimes it's cleared by the channel to say "I don't want you to
send what you have", and it has to be set back once CF_DONT_READ
is cleared. This will have to be split between SI_FL_RX_WAIT_EP
and SI_FL_RXBLK_CHAN. This patch only replaces all uses of the
flag with its natural (but negated) replacement SI_FL_RX_WAIT_EP.
The code is expected to be strictly equivalent. The now unused flag
was completely removed.
The plan is to have the following flags to describe why a stream interface
doesn't produce data :
- SI_FL_RXBLK_CHAN : the channel doesn't want it to receive
- SI_FL_RXBLK_BUFF : waiting for a buffer allocation to complete
- SI_FL_RXBLK_ROOM : more room is required in the channel to receive
- SI_FL_RXBLK_SHUT : input now closed, nothing new will come
- SI_FL_RX_WAIT_EP : waiting for the endpoint to produce more data
Applets like the CLI which consume complete commands at once and produce
large chunks of responses will for example be able to stop being woken up
by clearing SI_FL_WANT_GET and setting SI_FL_RXBLK_ROOM when the rx buffer
is full. Once called they will unblock WANT_GET. The flags were moved
together in readable form with the Rx bits using 2 hex digits and still
have some room to do a similar operation on the Tx path later, with the
WAIT_EP flag being represented alone on a digit.
This flag is not enough to describe all blocking situations, as can be
seen in each case we remove it. The muxes has taught us that using multiple
blocking flags in parallel will be much easier, so let's start to do this
now. This patch only renames this flags in order to make next changes more
readable.
Commit 53216e7db ("MEDIUM: connections: Don't directly mess with the
polling from the upper layers.") removed the CS_FL_DATA_RD_ENA and
CS_FL_DATA_WR_ENA flags without updating flags.c, thus breaking the
build. This patch also adds flag CL_FL_NOT_FIRST which was brought
by commit 08088e77c.
This fixes a typo in the README of the peers section of this subsystem
and 2 typos in code comments. Groupped together as cleanup to avoid too
many 1 char patches.
The behaviour of the flag CF_WRITE_PARTIAL was modified by commit
95fad5ba4 ("BUG/MAJOR: stream-int: don't re-arm recv if send fails") due
to a situation where it could trigger an immediate wake up of the other
side, both acting in loops via the FD cache. This loss has caused the
need to introduce CF_WRITE_EVENT as commit c5a9d5bf, to replace it, but
both flags express more or less the same thing and this distinction
creates a lot of confusion and complexity in the code.
Since the FD cache now acts via tasklets, the issue worked around in the
first patch no longer exists, so it's more than time to kill this hack
and to restore CF_WRITE_PARTIAL's semantics (i.e.: there has been some
write activity since we last left process_stream).
This patch mostly reverts the two commits above. Only the part making
use of CF_WROTE_DATA instead of CF_WRITE_PARTIAL to detect the loss of
data upon connection setup was kept because it's more accurate and
better suited.
Now all the code used to manipulate chunks uses a struct buffer instead.
The functions are still called "chunk*", and some of them will progressively
move to the generic buffer handling code as they are cleaned up.
Chunks are only a subset of a buffer (a non-wrapping version with no head
offset). Despite this we still carry a lot of duplicated code between
buffers and chunks. Replacing chunks with buffers would significantly
reduce the maintenance efforts. This first patch renames the chunk's
fields to match the name and types used by struct buffers, with the goal
of isolating the code changes from the declaration changes.
Most of the changes were made with spatch using this coccinelle script :
@rule_d1@
typedef chunk;
struct chunk chunk;
@@
- chunk.str
+ chunk.area
@rule_d2@
typedef chunk;
struct chunk chunk;
@@
- chunk.len
+ chunk.data
@rule_i1@
typedef chunk;
struct chunk *chunk;
@@
- chunk->str
+ chunk->area
@rule_i2@
typedef chunk;
struct chunk *chunk;
@@
- chunk->len
+ chunk->data
Some minor updates to 3 http functions had to be performed to take size_t
ints instead of ints in order to match the unsigned length here.
The master process will exit with the status of the last worker. When
the worker is killed with SIGTERM, it is expected to get 143 as an
exit status. Therefore, we consider this exit status as normal from a
systemd point of view. If it happens when not stopping, the systemd
unit is configured to always restart, so it has no adverse effect.
This has mostly a cosmetic effect. Without the patch, stopping HAProxy
leads to the following status:
â— haproxy.service - HAProxy Load Balancer
Loaded: loaded (/lib/systemd/system/haproxy.service; disabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Fri 2018-06-22 20:35:42 CEST; 8min ago
Docs: man:haproxy(1)
file:/usr/share/doc/haproxy/configuration.txt.gz
Process: 32715 ExecStart=/usr/sbin/haproxy -Ws -f $CONFIG -p $PIDFILE $EXTRAOPTS (code=exited, status=143)
Process: 32714 ExecStartPre=/usr/sbin/haproxy -f $CONFIG -c -q $EXTRAOPTS (code=exited, status=0/SUCCESS)
Main PID: 32715 (code=exited, status=143)
After the patch:
â— haproxy.service - HAProxy Load Balancer
Loaded: loaded (/lib/systemd/system/haproxy.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: man:haproxy(1)
file:/usr/share/doc/haproxy/configuration.txt.gz
When the connection is closed by HAProxy, the status code provided in the
DISCONNECT frame is lost. By retransmitting it in the agent's reply, we are sure
to have it in the SPOE logs.
This patch may be backported in 1.8.
When the connection is closed by HAProxy, the status code provided in the
DISCONNECT frame is lost. By retransmitting it in the agent's reply, we are sure
to have it in the SPOE logs.
This patch may be backported in 1.8.
When the connection is closed by HAProxy, the status code provided in the
DISCONNECT frame is lost. By retransmitting it in the agent's reply, we are sure
to have it in the SPOE logs.
This patch may be backported in 1.8.
The commit c4dcaff3 ("BUG/MEDIUM: spoe: Flags are not encoded in network order")
introduced an incompatibility with older agents. So the major version of the
SPOP is increased to make the situation unambiguous. And because before the fix,
the protocol is buggy, the support of the version 1.0 is removed to be sure to
not continue to support buggy agents.
The agents in the contrib folder (spoa_example, modsecurity and mod_defender)
are also updated to announce the SPOP version 2.0.
So, to be clear, from the patch, connections to agents announcing the SPOP
version 1.0 will be rejected.
This patch must be backported in 1.8.
A recent fix on the SPOE revealed a mismatch between the SPOE specification and
the modsecurity implementation on the way flags are encoded or decoded. They
must be exchanged using the network bytes order and not the host one.
Be careful though, this patch breaks the compatiblity with HAProxy SPOE before
commit c4dcaff3 ("BUG/MEDIUM: spoe: Flags are not encoded in network order").
A recent fix on the SPOE revealed a mismatch between the SPOE specification and
the mod_defender implementation on the way flags are encoded or decoded. They
must be exchanged using the network bytes order and not the host one.
Be careful though, this patch breaks the compatiblity with HAProxy SPOE before
commit c4dcaff3 ("BUG/MEDIUM: spoe: Flags are not encoded in network order").
Buf is unsigned, so nbargs will be negative for more then 127 args.
Note that I cant test this bug because I cant put sufficient args
on the configuration line. It is just detected reading code.
[wt: this can be backported to 1.8 & 1.7]
While the haproxy workers usually are running chrooted the master
process is not. This patch is a pretty safe defense in depth measure
to ensure haproxy cannot touch sensitive parts of the file system.
ProtectSystem takes non-boolean arguments in newer SystemD versions,
but setting those would leave older systems such as Ubuntu Xenial
unprotected. Distro maintainers and system administrators could
adapt the ProtectSystem value to the SystemD version they ship.
Commit f4cfcf9 ("MINOR: debug/flags: Add missing flags") added a number
of missing flags but a few of them were incorrect, hiding real values.
This can be backported to 1.8.
There were several unused variables in halog.c that each caused a
compiler warning [-Wunused-but-set-variable]. This patch simply
removes the declaration of said vairables and any instance where the
unused variable was assigned a value.
The declaration of main() in iprange.c did not specify a type, causing
a compiler warning [-Wimplicit-int]. This patch simply declares main()
to be type 'int' and calls exit(0) at the end of the function.
This variable was used by the wrapper which was removed in
a6cfa9098e. The correct way to do seamless reload is now to enable
"expose-fd listeners" on the stat socket.
Being an external agent, it's confusing that it uses haproxy's internal
types and it seems to have encouraged other implementations to do so.
Let's completely remove any reference to struct sample and use the
native DATA types instead of converting to and from haproxy's sample
types.
This patch adds support for `Type=notify` to the systemd unit.
Supporting `Type=notify` improves both starting as well as reloading
of the unit, because systemd will be let known when the action completed.
See this quote from `systemd.service(5)`:
> Note however that reloading a daemon by sending a signal (as with the
> example line above) is usually not a good choice, because this is an
> asynchronous operation and hence not suitable to order reloads of
> multiple services against each other. It is strongly recommended to
> set ExecReload= to a command that not only triggers a configuration
> reload of the daemon, but also synchronously waits for it to complete.
By making systemd aware of a reload in progress it is able to wait until
the reload actually succeeded.
This patch introduces both a new `USE_SYSTEMD` build option which controls
including the sd-daemon library as well as a `-Ws` runtime option which
runs haproxy in master-worker mode with systemd support.
When haproxy is running in master-worker mode with systemd support it will
send status messages to systemd using `sd_notify(3)` in the following cases:
- The master process forked off the worker processes (READY=1)
- The master process entered the `mworker_reload()` function (RELOADING=1)
- The master process received the SIGUSR1 or SIGTERM signal (STOPPING=1)
Change the unit file to specify `Type=notify` and replace master-worker
mode (`-W`) with master-worker mode with systemd support (`-Ws`).
Future evolutions of this feature could include making use of the `STATUS`
feature of `sd_notify()` to send information about the number of active
connections to systemd. This would require bidirectional communication
between the master and the workers and thus is left for future work.
This one was built by studying the HPACK Huffman table (RFC7541
appendix B). It creates 5 small tables (4*512 bytes, 1*64 bytes) to
map one byte at a time from the input stream based on the following
observations :
* rht_bit31_24[256] is indexed on bits 31..24 when < 0xfe
* rht_bit24_17[256] is indexed on bits 24..17 when 31..24 >= 0xfe
* rht_bit15_11_fe[32] is indexed on bits 15..11 when 24..17 == 0xfe
* rht_bit15_8[256] is indexed on bits 15..8 when 24..17 == 0xff
* rht_bit11_4[256] is indexed on bits 11..4 when 15..8 == 0xff
* when 11..4 == 0xff, 3..2 provide the following mapping :
* 00 => 0x0a, 01 => 0x0d, 10 => 0x16, 11 => EOS
The same buffer is used for a request and its response. So we need to be sure
to correctly reset info when the response is encoded. And here there was a
bug. The pointer on the end of the frame was not updated. So it was not
possible to encode a response bigger than the corresponding request.
On x86_64, when gcc instruments functions and compiles at -O0, it saves
the function's return value in register rbx before calling the trace
callback. It provides a nice opportunity to display certain useful
values (flags, booleans etc) during trace sessions. It's absolutely
not guaranteed that it will always work but it provides a considerable
help when it does so it's worth activating it. When building on a
different architecture, the value 0 is always reported as the return
value. On x86_64 with optimizations (-O), the RBX register will not
necessarily match and random values will be reported, but since it's
not the primary target it's not a problem.
Now any call to trace() in the code will automatically appear interleaved
with the call sequence and timestamped in the trace file. They appear with
a '#' on the 3rd argument (caller's pointer) in order to make them easy to
spot. If the trace functionality is not used, a dmumy weak function is used
instead so that it doesn't require to recompile every time traces are
enabled/disabled.
The trace decoder knows how to deal with these messages, detects them and
indents them similarly to the currently traced function. This can be used
to print function arguments for example.
Note that we systematically flush the log when calling trace() to ensure we
never miss important events, so this may impact performance.
The trace() function uses the same format as printf() so it should be easy
to setup during debugging sessions.
These flags are not exactly for the data layer, they instead indicate
what is expected from the transport layer. Since we're going to split
the connection between the transport and the data layers to insert a
mux layer, it's important to have a clear idea of what each layer does.
All function conn_data_* used to manipulate these flags were renamed to
conn_xprt_*.
After careful inspection, this flag is set at exactly two places :
- once in the health-check receive callback after receipt of a
response
- once in the stream interface's shutw() code where CF_SHUTW is
always set on chn->flags
The flag was checked in the checks before deciding to send data, but
when it is set, the wake() callback immediately closes the connection
so the CO_FL_SOCK_WR_SH flag is also set.
The flag was also checked in si_conn_send(), but checking the channel's
flag instead is enough and even reveals that one check involving it
could never match.
So it's time to remove this flag and replace its check with a check of
CF_SHUTW in the stream interface. This way each layer is responsible
for its shutdown, this will ease insertion of the mux layer.
This flag is both confusing and wrong. It is supposed to report the
fact that the data layer has received a shutdown, but in fact this is
reported by CO_FL_SOCK_RD_SH which is set by the transport layer after
this condition is detected. The only case where the flag above is set
is in the stream interface where CF_SHUTR is also set on the receiving
channel.
In addition, it was checked in the health checks code (while never set)
and was always test jointly with CO_FL_SOCK_RD_SH everywhere, except in
conn_data_read0_pending() which incorrectly doesn't match the second
time it's called and is fortunately protected by an extra check on
(ic->flags & CF_SHUTR).
This patch gets rid of the flag completely. Now conn_data_read0_pending()
accurately reports the fact that the transport layer has detected the end
of the stream, regardless of the fact that this state was already consumed,
and the stream interface watches ic->flags&CF_SHUTR to know if the channel
was already closed by the upper layer (which it already used to do).
The now unused conn_data_read0() function was removed.
The ->init() callback of the connection's data layer was only used to
complete the session's initialisation since sessions and streams were
split apart in 1.6. The problem is that it creates a big confusion in
the layers' roles as the session has to register a dummy data layer
when waiting for a handshake to complete, then hand it off to the
stream which will replace it.
The real need is to notify that the transport has finished initializing.
This should enable a better splitting between these layers.
This patch thus introduces a connection-specific callback called
xprt_done_cb() which informs about handshake successes or failures. With
this, data->init() can disappear, CO_FL_INIT_DATA as well, and we don't
need to register a dummy data->wake() callback to be notified of errors.
Add plug_qdisc.c source file which may help in how to programatically
use plug queueing disciplines with its README file.
Such code may be useful to reproduce painful network application bugs.
Very early in the connection rework process leading to v1.5-dev12, commit
56a77e5 ("MEDIUM: connection: complete the polling cleanups") marked the
end of use for this flag which since was never set anymore, but it continues
to be tested. Let's kill it now.