When a response generated by HAProxy is handled by the H1 mux and the
corresponding request has not been fully received, close mode is
forced. Thus, the client is notified that the connection will most likely be
closed abruptly, without waiting for the end of the request.
The flag HTX_FL_PROXY_RESP is now set on responses generated by HAProxy,
excluding responses returned by applets and services. It is an informative flag
set by the application layer.
When an error file was loaded, the flag HTX_SL_F_XFER_LEN was never set on the
HTX start line because of a bug. During header parsing, the flag
H1_MF_XFER_LEN is never set on the h1m, yet it was used as the condition to
set HTX_SL_F_XFER_LEN on the HTX start line. Instead, we must rely only on the
flags H1_MF_CLEN or H1_MF_CHNK.
Because of this bug, it was impossible to keep a connection alive for a response
generated by HAProxy. Now the flag HTX_SL_F_XFER_LEN is set when an error file
has a content length (chunked responses are unsupported at this stage) and the
connection may be kept alive if no connection header is specified to
explicitly close it.
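An illustrative sketch of the corrected condition, using only the H1_MF_CLEN
and H1_MF_CHNK flags mentioned above (surrounding variable names are
approximate, not the literal patch):
    /* set the flag from the framing actually parsed, not from H1_MF_XFER_LEN */
    if (h1m.flags & (H1_MF_CLEN | H1_MF_CHNK))
            sl->flags |= HTX_SL_F_XFER_LEN;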
This patch must be backported to 2.0 and 1.9.
It is currently not possible to figure out the exact haproxy version from a
core file, for the sole reason that the version is stored in a const
string and as such ends up in the .text section, which is not part of a
core file. By turning these strings into variables we move them to the data
section and they appear in core files. In order to help find them,
we just prepend an extra variable in front of them, and we're able to
immediately spot the version strings from a core file:
$ strings core | fgrep -A2 'HAProxy version'
HAProxy version follows
2.1-dev2-e0f48a-88
2019/10/15
(These are haproxy_version and haproxy_date respectively). This may be
backported to 2.0 since this part is not supposed to impact anything but
the developer's time spent debugging.
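For illustration, the change boils down to declarations along these lines
(simplified; the marker variable's exact name is an assumption):
    /* const data ends up in a read-only section missing from core files;
     * plain writable arrays land in the data section and get dumped. */
    char haproxy_version_here[] = "HAProxy version follows"; /* marker, assumed name */
    char haproxy_version[]      = "2.1-dev2-e0f48a-88";
    char haproxy_date[]         = "2019/10/15";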
246c024 ("MINOR: ssl: load the ocsp in/from the ckch") broke the loading
of OCSP files. The function ssl_sock_load_ocsp_response_from_file() was
not returning 0 upon success, which led to an error after the .ocsp file was
read.
The error messages for OCSP in ssl_sock_load_crt_file_into_ckch() add a
double extension to the filename, which can be confusing: the messages
reference a .issuer.issuer file.
If the user agent data contains special characters that are interpreted as
format specifiers by the vfprintf() function, haproxy crashes. The string
"%s %s %s" may be used as an example.
% curl -A "%s %s %s" localhost:10080/index.html
curl: (52) Empty reply from server
haproxy log:
00000000:WURFL-test.clireq[00c7:ffffffff]: GET /index.html HTTP/1.1
00000000:WURFL-test.clihdr[00c7:ffffffff]: host: localhost:10080
00000000:WURFL-test.clihdr[00c7:ffffffff]: user-agent: %s %s %s
00000000:WURFL-test.clihdr[00c7:ffffffff]: accept: */*
segmentation fault (core dumped)
gdb 'where' output:
#0 strlen () at ../sysdeps/x86_64/strlen.S:106
#1 0x00007f7c014a8da8 in _IO_vfprintf_internal (s=s@entry=0x7ffc808fe750, format=<optimized out>,
format@entry=0x7ffc808fe9c0 "WURFL: retrieve header request returns [%s %s %s]\n",
ap=ap@entry=0x7ffc808fe8b8) at vfprintf.c:1637
#2 0x00007f7c014cfe89 in _IO_vsnprintf (
string=0x55cb772c34e0 "WURFL: retrieve header request returns [(null) %s %s %s B,w\313U",
maxlen=<optimized out>,
format=format@entry=0x7ffc808fe9c0 "WURFL: retrieve header request returns [%s %s %s]\n",
args=args@entry=0x7ffc808fe8b8) at vsnprintf.c:114
#3 0x000055cb758f898f in send_log (p=p@entry=0x0, level=level@entry=5,
format=format@entry=0x7ffc808fe9c0 "WURFL: retrieve header request returns [%s %s %s]\n")
at src/log.c:1477
#4 0x000055cb75845e0b in ha_wurfl_log (
message=message@entry=0x55cb75989460 "WURFL: retrieve header request returns [%s]\n") at src/wurfl.c:47
#5 0x000055cb7584614a in ha_wurfl_retrieve_header (header_name=<optimized out>, wh=0x7ffc808fec70)
at src/wurfl.c:763
If WURFL (actually HAProxy) is not compiled with the debug option
enabled (-DWURFL_DEBUG), this bug does not come to light.
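The fix follows the classic format-string pattern; a hedged sketch, not the
literal wurfl.c diff:
    /* unsafe: user-supplied header content is interpreted as a format string */
    send_log(NULL, LOG_NOTICE, message);
    /* safe: the content is only ever consumed as a "%s" argument */
    send_log(NULL, LOG_NOTICE, "%s", message);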
This patch could be backported to every version supporting
ScientiaMobile's WURFL (as far back as 1.7).
An absolute path must be used, otherwise the requests are rejected by HAProxy
because of the recent changes. In addition, the configuration has been slightly
updated to remove warnings at startup.
As stated in RFC7230#5.4, a client must send a field-value for the Host
header that is identical to the authority if the target URI includes one. So
now, by default, if the authority, when provided, does not match the value of
the Host header, an error is triggered. To relax this behavior, it is possible
to set the option "accept-invalid-http-request". In that case, an HTTP error is
captured without interrupting the request parsing.
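For example, a frontend that needs to tolerate such requests could relax the
check like this (illustrative configuration):
    frontend fe_main
        mode http
        bind :10080
        option accept-invalid-http-request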
There is no reason for a client to send several Host headers. It may even be
considered a bug. However, it is totally invalid for them to carry different
values. So now, in such a case, an error is triggered during request
parsing. In addition, when several Host headers are found with the same value,
only the first instance is kept and the others are skipped.
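For instance, a request such as the following is now rejected (illustrative
example):
    GET / HTTP/1.1
    Host: example.com
    Host: other.example.com    <- repeated Host header with a different value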
When the option "accept-invalid-http-request" is enabled, some parsing errors
are ignored, but the position of the error is reported. In legacy HTTP mode,
such errors were captured, so we now do the same in the H1 multiplexer.
If required, this patch may be backported to 2.0 and 1.9.
When an outgoing HTX message is formatted to a raw message, DATA blocks may be
split so as not to transfer more data than expected. But if the buffer is
almost full, the formatting is interrupted, leaving some unused free space in
the buffer, because the data are too large to be copied in one go.
Now, we transfer as much data as possible. When the message is chunked, we also
account for the size used to encode the data.
When an outgoing HTX message is formatted to a raw message and we fail to copy
the data of an HTX block into the output buffer, we now mark it as full.
Before, this was only done by calling the function buf_room_for_htx_data(),
but this function is designed to optimize input processing.
This patch must be backported to 2.0 and 1.9.
When raw data are copied or appended into a chunk, the result must not exceed
the chunk size, but it may reach it. Unlike the functions that copy or append a
string, there is no terminating null byte to account for.
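A simplified sketch of the intended bound check (field names follow the struct
buffer used for chunks; illustrative, not the literal patch):
    /* raw data may completely fill the chunk: reject only a real overflow */
    if (chk->data + len > chk->size)
            return 0;            /* failure: the copy would not fit */
    memcpy(chk->area + chk->data, src, len);
    chk->data += len;
    return 1;                    /* success: the chunk may now be exactly full */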
This patch must be backported as far as 1.8. Note that in 1.8, the functions
chunk_cpy() and chunk_cat() don't exist.
In the htx_*_to_h1() functions, most of the time several calls to
chunk_memcat() are chained. The expected size is always compared to the
available room in the buffer to make sure the full copy will succeed. But this
is a bit risky because it relies on chunk_memcat() evaluating the available
room in the buffer the same way the HTX functions do. Unfortunately, it does
not: a bug in chunk_memcat() always leaves a byte unused in the buffer. So, for
instance, when a chunk is copied into an almost full buffer, the last CRLF may
be skipped.
To fix the issue, we now rely only on the result of chunk_memcat().
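In practice this means checking each call's return value instead of
pre-computing the room; a minimal sketch (surrounding names assumed):
    /* stop as soon as a copy fails instead of trusting a separate room estimate */
    if (!chunk_memcat(&outbuf, "\r\n", 2))
            goto full;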
This patch must be backported to 2.0 and 1.9.
The SSL engines code was written below the OCSP #ifdef, which means you
can't build the engines code if OCSP is deactivated in the SSL lib.
Could be backported to every version since 1.8.
A NULL dereference can occur when inserting SNIs while checking for
duplicates, if there are already several sni_ctx with the same key.
Fix issue #321.
Don't try to load the files containing the issuer and the OCSP response
each time we generate an SSL_CTX.
The .ocsp and the .issuer are now loaded into the struct
cert_key_and_chain only once, and then read from this structure when
creating an SSL_CTX.
Don't try to load the file containing the sctl each time we generate an
SSL_CTX.
The .sctl is now loaded into the struct cert_key_and_chain only once, and
then read from this structure when creating an SSL_CTX.
Note that this now makes it possible to use sctl with multi-cert
bundles.
$ echo -e "set ssl cert certificate.pem <<\n$(cat certificate2.pem)\n" | \
socat stdio /var/run/haproxy.stat
Certificate updated!
The operation is locked at the ckch level with a HA_SPINLOCK_T, which
prevents the ckch architecture (ckch_store, ckch_inst...) from being modified
at the same time. So you can't perform a certificate update from multiple CLI
connections at the same time.
SNI trees are also locked with a HA_RWLOCK_T, so read operations are
blocked only during a certificate update.
Bundles are supported but you need to update each file (.rsa|.ecdsa|.dsa)
independently. If a file is used in the configuration as a bundle AND
as a unique certificate, both will be updated.
Bundles, directories and crt-lists are supported; however, filters in
crt-lists are currently unsupported.
The code tries to allocate every SNI and certificate instance first,
so it can roll back the operation if that was unsuccessful.
If you have too many instances of the certificate (at least 20000 in my
tests on my laptop), the function can take too much time and be killed
by the watchdog. This will be fixed later. Also, with too many
certificates it's possible that socat exits before the end of the
generation without displaying a message; consider increasing the socat
timeout in this case (-t2 for example).
The size of the certificate is currently limited by the maximum size of
a payload, which must fit in a buffer.
The ssl_sock_load_{multi}_ckchs() functions were renamed and modified:
- they allocate a ckch_inst and load the SNIs into it
- they return a ckch_inst or NULL
- the sni_ctx are no longer added to the SNI trees from there
- they were renamed to ckch_inst_new_load_{multi}_store()
- a new ssl_sock_load_ckchs() function calls
ckch_inst_new_load_{multi}_store() and adds the sni_ctx to the SNI trees.
ssl_sock_load_multi_ckchs() is now able to fail without polluting the
bind_conf trees or leaking memory.
It is a prerequisite for loading certificates on the fly with the CLI.
The insertion of the sni_ctxs into the trees is done once everything has
been allocated correctly.
ssl_sock_load_ckchn() is now able to fail without polluting the
bind_conf trees or leaking memory.
It is a prerequisite for loading certificates on the fly with the CLI.
The insertion of the sni_ctxs into the trees is done once everything has
been allocated correctly.
In order to allow the creation of sni_ctx at runtime, we need to split
the function to allow rollback.
We need to be able to allocate all required sni_ctxs before inserting
them, so we can roll back if an allocation fails.
The function was split in 2 parts.
The first one, ckch_inst_add_cert_sni(), allocates a struct sni_ctx, fills
it with the right data and inserts it into the ckch_inst's list of sni_ctx.
The second one takes every sni_ctx in the ckch_inst and inserts them into
the bind_conf's SNI tree.
struct ckch_inst represents an instance of a certificate (ckch_node)
used in a bind_conf. Every sni_ctx created for one ckch_node in a
bind_conf is linked in this structure.
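A simplified sketch of the structure (illustrative; the real definition may
contain more fields and slightly different names):
    struct ckch_inst {
            struct ckch_store *ckch_store; /* certificate storage this instance was built from */
            struct bind_conf  *bind_conf;  /* the bind_conf using this instance */
            struct list        sni_ctx;    /* sni_ctx created for this certificate on this bind_conf */
            struct list        by_ckchs;   /* linking of this instance in its storage's list */
    };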
This patch allocates the ckch_inst for each bind_conf and inserts the
sni_ctx into its linked list.
The ssl_sock_populate_sni_keytypes_hplr() function does not return an
error upon an allocation failure.
The process would probably crash during the configuration parsing if the
allocation fails, since it tries to copy some data into the allocated
memory.
This patch could be backported as far as 1.5.
This patch frees the sni_keytype nodes once the sni_ctxs have been
allocated in ssl_sock_load_multi_ckchn().
Could be backported to every version using the multi-cert SSL bundles.
The ssl_sock_add_cert_sni() function never returns an error when a
sni_ctx allocation fails. It silently ignores the problem and continues
trying to allocate the other SNIs.
It is unlikely that an SNI allocation will succeed after one failure, and
we don't want to start a configuration without all the SNIs. So, to avoid
any problem, we now return -1 upon an SNI allocation error and stop the
configuration parsing.
This patch must be backported to every version supporting the crt-list
SNI filters (as far back as 1.5).
A ckch_store is a storage structure which contains a pointer to one or several
cert_key_and_chain structures.
This patch renames ckch_node to ckch_store, and ckch_n/ckchn to ckchs.
As using an mt_list for the tasklet list is costly, use a regular list instead,
but add an mt_list for tasklets woken up by other threads, to be run on the
current thread. At the beginning of process_runnable_tasks(), we just take
the new list and merge it into the task_list.
This should give us performance comparable to before we started using an
mt_list, but allows us to use tasklet_wakeup() from other threads.
This macro atomically cuts the head of a list and returns the list
of elements as a detached list, meaning that they're all linked
together without any head. If the list was empty, NULL is returned.
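A hedged usage sketch of the idea described by the two entries above (the
macro and field names here are assumptions, not necessarily the exact ones):
    /* at the start of process_runnable_tasks(): atomically detach everything
     * queued by other threads, then append the detached, headless list to the
     * thread-local task_list before running it */
    struct mt_list *elt = MT_LIST_BEHEAD(&sched->shared_tasklet_list);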
I introduced this mistake when adding the descriptions for the stats
metrics; it's even amazing it built and worked at all! This was
reported by Travis CI on non-GNU platforms:
src/stats.c:92:39: warning: use of GNU 'missing =' extension in designator [-Wgnu-designator]
[INF_NAME] { .name = "Name", .desc = "Product name" },
^
=
No backport is needed.
Issue #277 reports a strange problem related to a fast-spinning
applet which seems to show valid progress being made. It's uncertain how
this can happen; maybe some very specific timing patterns manage to place
just a few bytes in each buffer and result in the peers applet being called
a lot. But it appears possible to artificially cross the spinning threshold
by asking for a monster stats page (500 MB) and limiting the send() size to
1 MSS (1460 bytes), causing the stats applet to be called for very small
blocks which most often do not leave enough room to place a new chunk.
The idea developed in this patch consists in not crashing for an applet
which reaches a very high call rate if it shows some indication of
progress. Detecting progress on applets is not trivial, but in our case
we know that they must at least not claim to wait for a buffer allocation
if this buffer is present, not wait for room if the buffer is empty, not
ask for more data without polling if such data are still present, nor leave
with an empty input buffer without having written anything or read anything
from the other side while a shutw is pending.
Doing so doesn't affect the normal behavior or abuses of our existing
applets, and at least protects against an applet performing an
early return without processing events, or one causing an endless
loop by asking for impossible conditions.
This must be backported to 2.0.
Now "show info desc", "show info typed desc" and "show stat typed desc"
will report (hopefully) accurate descriptions of each field. These ones
were verified in the code. When some metrics are specific to the process
or the thread, they are indicated. Sometimes a config option is known
for a setting and it is reported as well. The purpose mainly is to help
sysadmins in field more easily sort out issues vs non-issues. In part
inspired by this very informative talk :
https://kernel-recipes.org/en/2019/metrics-are-money/
Example:
$ socat - /var/run/haproxy.sock <<< "show info desc"
Name: HAProxy:"Product name"
Version: 2.1-dev2-991035-31:"Product version"
Release_date: 2019/10/09:"Date of latest source code update"
Nbthread: 1:"Number of started threads (global.nbthread)"
Nbproc: 1:"Number of started worker processes (global.nbproc)"
Process_num: 1:"Relative process number (1..Nbproc)"
Pid: 11975:"This worker process identifier for the system"
Uptime: 0d 0h00m10s:"How long ago this worker process was started (days+hours+minutes+seconds)"
Uptime_sec: 10:"How long ago this worker process was started (seconds)"
Memmax_MB: 0:"Worker process's hard limit on memory usage in MB (-m on command line)"
PoolAlloc_MB: 0:"Amount of memory allocated in pools (in MB)"
PoolUsed_MB: 0:"Amount of pool memory currently used (in MB)"
PoolFailed: 0:"Number of failed pool allocations since this worker was started"
Ulimit-n: 300000:"Hard limit on the number of per-process file descriptors"
Maxsock: 300000:"Hard limit on the number of per-process sockets"
Maxconn: 149982:"Hard limit on the number of per-process connections (configured or imposed by Ulimit-n)"
Hard_maxconn: 149982:"Hard limit on the number of per-process connections (imposed by Memmax_MB or Ulimit-n)"
CurrConns: 0:"Current number of connections on this worker process"
CumConns: 1:"Total number of connections on this worker process since started"
CumReq: 1:"Total number of requests on this worker process since started"
MaxSslConns: 0:"Hard limit on the number of per-process SSL endpoints (front+back), 0=unlimited"
CurrSslConns: 0:"Current number of SSL endpoints on this worker process (front+back)"
CumSslConns: 0:"Total number of SSL endpoints on this worker process since started (front+back)"
Maxpipes: 0:"Hard limit on the number of pipes for splicing, 0=unlimited"
PipesUsed: 0:"Current number of pipes in use in this worker process"
PipesFree: 0:"Current number of allocated and available pipes in this worker process"
ConnRate: 0:"Number of front connections created on this worker process over the last second"
ConnRateLimit: 0:"Hard limit for ConnRate (global.maxconnrate)"
MaxConnRate: 0:"Highest ConnRate reached on this worker process since started (in connections per second)"
SessRate: 0:"Number of sessions created on this worker process over the last second"
SessRateLimit: 0:"Hard limit for SessRate (global.maxsessrate)"
MaxSessRate: 0:"Highest SessRate reached on this worker process since started (in sessions per second)"
SslRate: 0:"Number of SSL connections created on this worker process over the last second"
SslRateLimit: 0:"Hard limit for SslRate (global.maxsslrate)"
MaxSslRate: 0:"Highest SslRate reached on this worker process since started (in connections per second)"
SslFrontendKeyRate: 0:"Number of SSL keys created on frontends in this worker process over the last second"
SslFrontendMaxKeyRate: 0:"Highest SslFrontendKeyRate reached on this worker process since started (in SSL keys per second)"
SslFrontendSessionReuse_pct: 0:"Percent of frontend SSL connections which did not require a new key"
SslBackendKeyRate: 0:"Number of SSL keys created on backends in this worker process over the last second"
SslBackendMaxKeyRate: 0:"Highest SslBackendKeyRate reached on this worker process since started (in SSL keys per second)"
SslCacheLookups: 0:"Total number of SSL session ID lookups in the SSL session cache on this worker since started"
SslCacheMisses: 0:"Total number of SSL session ID lookups that didn't find a session in the SSL session cache on this worker since started"
CompressBpsIn: 0:"Number of bytes submitted to HTTP compression in this worker process over the last second"
CompressBpsOut: 0:"Number of bytes out of HTTP compression in this worker process over the last second"
CompressBpsRateLim: 0:"Limit of CompressBpsOut beyond which HTTP compression is automatically disabled"
Tasks: 10:"Total number of tasks in the current worker process (active + sleeping)"
Run_queue: 1:"Total number of active tasks+tasklets in the current worker process"
Idle_pct: 100:"Percentage of last second spent waiting in the current worker thread"
node: wtap.local:"Node name (global.node)"
Stopping: 0:"1 if the worker process is currently stopping, otherwise zero"
Jobs: 14:"Current number of active jobs on the current worker process (frontend connections, master connections, listeners)"
Unstoppable Jobs: 0:"Current number of unstoppable jobs on the current worker process (master connections)"
Listeners: 13:"Current number of active listeners on the current worker process"
ActivePeers: 0:"Current number of verified active peers connections on the current worker process"
ConnectedPeers: 0:"Current number of peers having passed the connection step on the current worker process"
DroppedLogs: 0:"Total number of dropped logs for current worker process since started"
BusyPolling: 0:"1 if busy-polling is currently in use on the worker process, otherwise zero (config.busy-polling)"
FailedResolutions: 0:"Total number of failed DNS resolutions in current worker process since started"
TotalBytesOut: 0:"Total number of bytes emitted by current worker process since started"
BytesOutRate: 0:"Number of bytes emitted by current worker process over the last second"
Now "show info" supports "desc" after the default and "typed" formats,
and "show stat" supports this after the typed format. In both cases
this appends the description for the represented metric between double
quotes. The same could be done for JSON output but would possibly require
to update the schema first.
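For example, the typed variants can be queried the same way (output omitted
here):
    $ socat - /var/run/haproxy.sock <<< "show info typed desc"
    $ socat - /var/run/haproxy.sock <<< "show stat typed desc"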