Several bugs were introduced recently due to a misunderstanding of how
this function works and what it was supposed to do. Since it's supposed
to only return the pointer to a rule which aborts further processing of
the request, let's rename it to avoid further issues.
The function was also slightly cleaned up without any functional change.
It happens that all of them call parse_logformat_line() which sets
proxy->to_log with a number of flags affecting the line format for
all three users. For example, having a unique-id specified disables
the default log-format since fe->to_log is tested when the session
is established.
Similarly, having "option logasap" will cause "+" to be inserted in
unique-id or headers referencing some of the fields depending on
LW_BYTES.
This patch first removes most of the dependency on fe->to_log whenever
possible. The first possible cleanup is to stop checking fe->to_log
for being null, considering that it always contains at least LW_INIT
when any such usage is made of the log-format!
Also, some checks are wrong. s->logs.logwait cannot be nulled by
"logwait &= ~LW_*" since LW_INIT is always there. This results in
getting the wrong log at the end of a request or session when a
unique-id or add-header is set, because logwait is still not null
but the log-format is not checked.
Further cleanups are required. Most LW_* flags should be removed or at
least replaced with what they really mean (eg: depend on client-side
connection, depend on server-side connection, etc...) and this should
only affect logging, not other mechanisms.
This patch fixes the default log-format and tries to limit interferences
between the log formats, but does not pretend to do more for the moment,
since it's the most visible breakage.
After the response headers are sent and the request processing is done,
the buffers are wiped out and the stream interface is closed. We must
then disable the request analysers, otherwise some processing will
happen on a closed stream interface and empty buffers which do not
match, causing all sort of crashes. This issue was introduced with
recent work on the stats, and was reported by Seri.
David BERARD reported that http-request add-header passes a \0 along
with the header field, which of course is not appropriate. This is
caused by build_logline() which sometimes returns the size with the
trailing zero and sometimes can return an empty string. Let's fix
this function instead of fixing the places where it's used.
Previous commit was still wrong, it broke add-header and set-header
because we don't want to leave on these actions.
The http_check_access_rule() function should be redesigned, it was
initially thought for allow/deny rules but now it is executing other
non-final rules and at the same time returning a pointer to the last
final rule. That becomes a bit confusing and will need to be addressed
before we implement redirect and return.
This commit adding http-request add-header/set-header unfortunately introduced
a regression to the handling of the stats page which is not matched anymore.
Thanks to Dmitry Sivachenko for reporting this.
Released version 1.5-dev16 with the following main changes :
- BUG/MEDIUM: ssl: Prevent ssl error from affecting other connections.
- BUG/MINOR: ssl: error is not reported if it occurs simultaneously with peer close detection.
- MINOR: ssl: add fetch and acl "ssl_c_used" to check if current SSL session uses a client certificate.
- MINOR: contrib: make the iprange tool grep for addresses
- CLEANUP: polling: gcc doesn't always optimize constants away
- OPTIM: poll: optimize fd management functions for low register count CPUs
- CLEANUP: poll: remove a useless double-check on fdtab[fd].owner
- OPTIM: epoll: use a temp variable for intermediary flag computations
- OPTIM: epoll: current fd does not count as a new one
- BUG/MINOR: poll: the I/O handler was called twice for polled I/Os
- MINOR: http: make resp_ver and status ACLs check for the presence of a response
- BUG/MEDIUM: stream-interface: fix possible stalls during transfers
- BUG/MINOR: stream_interface: don't return when the fd is already set
- BUG/MEDIUM: connection: always update connection flags prior to computing polling
- CLEANUP: buffer: use buffer_empty() instead of buffer_len()==0
- BUG/MAJOR: stream_interface: fix occasional data transfer freezes
- BUG/MEDIUM: stream_interface: fix another case where the reader might not be woken up
- BUG/MINOR: http: don't abort client connection on premature responses
- BUILD: no need to clean up when making git-tar
- MINOR: log: add a tag for amount of bytes uploaded from client to server
- BUG/MEDIUM: log: fix possible segfault during config parsing
- MEDIUM: log: change a few log tokens to make them easier to remember
- BUG/MINOR: log: add_to_logformat_list() used the wrong constants
- MEDIUM: log-format: make the format parser more robust and more extensible
- MINOR: sample: support cast from bool to string
- MINOR: samples: add a function to fetch and convert any sample to a string
- MINOR: log: add lf_text_len
- MEDIUM: log: add the ability to include samples in logs
- REORG: stats: massive code reorg and cleanup
- REORG: stats: move the HTTP header injection to proto_http
- REORG: stats: functions are now HTTP/CLI agnostic
- BUG/MINOR: log: fix regression introduced by commit 8a3f52
- MINOR: chunks: centralize the trash chunk allocation
- MEDIUM: stats: use hover boxes instead of title to report details
- MEDIUM: stats: use multi-line tips to display detailed counters
- MINOR: tools: simplify the use of the int to ascii macros
- MINOR: stats: replace STAT_FMT_CSV with STAT_FMT_HTML
- MINOR: http: prepare to support more http-request actions
- MINOR: log: make parse_logformat_string() take a const char *
- MEDIUM: http: add http-request 'add-header' and 'set-header' to build headers
These two new statements allow to pass information extracted from the request
to the server. It's particularly useful for passing SSL information to the
server, but may be used for various other purposes such as combining headers
together to emulate internal variables.
These macros (U2H, U2A, LIM2A, ...) have been used with an explicit
index for the local storage variable, making it difficult to change
log formats and causing a few issues from time to time. Let's have
a single macro with a rotating index so that up to 10 conversions
may be used in a single call.
Frontend, listener, server and backend detailed counters are now spread
over several lines in a table when the pointer hovers over the <u> area.
The values are much more readble, and the extra space gained this way
allowed to report some percentages.
Note: some incoherencies still exist between some counters. For example,
the backend's cum_conn is increased when a session is assigned instead
of increasing cum_sess.
Using titles to report detailed information is not convenient. The
browser decides to wrap the line where it wants, generally the box
quickly fades away, and it's not possible to copy-paste the text
from the box.
By using two levels, we can make a block appear/disappear depending
on whether its parent it being hovered or not, for example :
.tip { display: none; }
u:hover .tip { display: block; }
or better:
.tip { display: block; visibility: hidden; }
u:hover .tip { visibility: visible; }
Toggling visibility ensures that the place required to display the block
is always reserved. This is important to display boxes that are close to
the edges.
Then using <span class="tip">this is a box</span> will make the text
appear only when the upper <u> is hovered.
But this still adds much text. So instead we use a generic <div> tag
which we don't use anywhere else. That way we don't have to specify a
class :
div { display: block; visibility: hidden; }
u:hover div { visibility: hidden; }
This works pretty well, even in old browsers from 2005.
This commit does not change the display format, it only replaces the
title attribute with the div tag. Later commits will adjust the layout.
At the moment, we need trash chunks almost everywhere and the only
correctly implemented one is in the sample code. Let's move this to
the chunks so that all other places can use this allocator.
Additionally, the get_trash_chunk() function now really returns two
different chunks. Previously it used to always overwrite the same
chunk and point it to a different buffer, which was a bit tricky
because it's not obvious that two consecutive results do alias each
other.
The commit above improved error reporting during log parsing, but as
a result, some shared strings such as httplog_format are truncated
during parsing. This is observable upon startup because the second
proxy to use httplog emits a warning.
Let's have the logformat parser duplicate the string while parsing it.
All the functions that were called "http" or "raw" have been cleaned up
to support either only HTML and use that word in their name, or support
both HTML and CSV and be usable both from the HTTP and the CLI entry
points.
stats_http_dump() now explicitly checks for HTML mode before adding
HTML-specific headers/trailers, and has been renamed since it's now
used by both entry points.
Some more duplicated code could be removed. The patch looks big but it's
mostly due to re-indents resulting from the moves or comments updates.
The calling sequences now look like this :
cli_io_handler()
-> stats_dump_sess_to_buffer() // "show sess"
-> stats_dump_errors_to_buffer() // "show errors"
-> stats_dump_info_to_buffer() // "show info"
-> stats_dump_stat_to_buffer() // "show stat"
-> stats_dump_csv_header()
-> stats_dump_proxy_to_buffer()
-> stats_dump_fe_stats()
-> stats_dump_li_stats()
-> stats_dump_sv_stats()
-> stats_dump_be_stats()
http_stats_io_handler()
-> stats_dump_stat_to_buffer() // same as above, but used for CSV or HTML
-> stats_dump_csv_header() // emits the CSV headers (same as above)
-> stats_dump_html_head() // emits the HTML headers
-> stats_dump_html_info() // emits the equivalent of "show info" at the top
-> stats_dump_proxy_to_buffer() // same as above, valid for CSV and HTML
-> stats_dump_html_px_hdr()
-> stats_dump_fe_stats()
-> stats_dump_li_stats()
-> stats_dump_sv_stats()
-> stats_dump_be_stats()
-> stats_dump_html_px_end()
-> stats_dump_html_end() // emits HTML trailer
The HTTP header injection that are performed in dumpstats when responding
or when redirecting a POST request have nothing to do in dumpstats. They
do not use any state from the stats, and are 100% HTTP. Let's make the
headers there in the HTTP core, and have dumpstats only produce stats.
The dumpstats code looks like a spaghetti plate. Several functions are
supposed to be able to do several things but rely on complex states to
dispatch the work to independant functions. Most of the HTML output is
performed within the switch/case statements of the whole state machine.
Let's clean this up by adding new functions to emit the data and have
a few more iterators to avoid relying on so complex states.
The new stats dump sequence looks like this for CLI and for HTTP :
cli_io_handler()
-> stats_dump_sess_to_buffer() // "show sess"
-> stats_dump_errors_to_buffer() // "show errors"
-> stats_dump_raw_info_to_buffer() // "show info"
-> stats_dump_raw_info()
-> stats_dump_raw_stat_to_buffer() // "show stat"
-> stats_dump_csv_header()
-> stats_dump_proxy()
-> stats_dump_px_hdr()
-> stats_dump_fe_stats()
-> stats_dump_li_stats()
-> stats_dump_sv_stats()
-> stats_dump_be_stats()
-> stats_dump_px_end()
http_stats_io_handler()
-> stats_http_redir()
-> stats_dump_http() // also emits the HTTP headers
-> stats_dump_html_head() // emits the HTML headers
-> stats_dump_csv_header() // emits the CSV headers (same as above)
-> stats_dump_http_info() // note: ignores non-HTML output
-> stats_dump_proxy() // same as above
-> stats_dump_http_end() // emits HTML trailer
Using %[expression] it becomes possible to make the log engine fetch
some samples from the request or the response and provide them in the
logs. Note that this feature is still limited, it does not yet allow
to apply converters, to limit the output length, nor to specify the
direction which should be fetched when a fetch function works in both
directions.
However it's quite convenient to log SSL information or to include some
information that are used in stick tables.
It is worth noting that this has been done in the generic log format
handler, which means that the same information may be used to build the
unique-id header and to pass the information to a backend server.
The log-format parser reached a limit making it hard to add new features.
It also suffers from a weak handling of certain incorrect corner cases,
for example "%{foo}" is emitted as a litteral while syntactically it's an
argument to no variable. Also the argument parser had to redo some of the
job with some cases causing minor memory leaks (eg: ignored args).
This work aims at improving the situation so that slightly better reporting
is possible and that it becomes possible to extend the log format. The code
has a few more states but looks significantly simpler. The parser is now
capable of reporting ignored arguments and truncated lines.
The <type> argument was checked against LOG_FMT_* but it was passed as
LF_* which are two independant enums. It happens that the 3 first entries
in these enums do match, but this broke some experimental changes which
required another state, so let's fix this now.
Some log tokens have evolved in a way that is not completely logical.
For example, frontend tokens sometimes begin with an 'f' and sometimes
with an 'F'. Same for backend and server.
So let's change a few cases without disrupting compatibility with existing
setups :
Bi => bi
Bp => bp
Ci => ci
Cp => cp
Fi => fi
Fp => fp
Si => si
Sp => sp
cc => CC
cs => CS
st => ST
The old ones are still supported but deprecated and will be unsupported by
the 1.5 release. However, a warning message is emitted when they're encounterd
and it indicates what token should be used to replace them.
When log format arguments are specified within braces with no '+' nor '-'
prefix, the NULL string is compared with known keywords causing a crash.
This only happens during parsing so it does not affect runtime processing.
When a server responds prematurely to a POST request, haproxy used to
cause the transfer to be aborted before the end. This is problematic
because this causes the client to receive a TCP reset when it tries to
push more data, generally preventing it from receiving the response
which contain the reason for the premature reponse (eg: "entity too
large" or an authentication request).
From now on we take care of allowing the upload traffic to flow to the
server even when the response has been received, since the server is
supposed to drain it. That way the client receives the server response.
This bug has been present since 1.4 and the fix should probably be
backported there.
The code review during the chase for the POST freeze uncovered another possible
issue which might appear when we perform an incomplete read and want to stop because
of READ_DONTWAIT or because we reached the maximum read_poll limit. Reading is
disabled but SI_FL_WAIT_ROOM was not set, possibly causing some cases where a
send() on the other side would not wake the reader up until another activity
on the same side calls the update function which fixes its status.
Since the changes in connection management, it became necessary to re-enable
polling after a fast-forward transfer would complete.
One such issue was addressed after dev12 by commit 9f7c6a18 (BUG/MAJOR:
stream_interface: certain workloads could cause get stuck) but unfortunately,
it was incomplete as very subtle cases would occasionally remain unaddressed
when a buffer was marked with the NOEXP flag, which is used during POST
uploads. The wake up must be performed even when the flag is there, the
flag is used only to refresh the timeout.
Too many conditions need to be hit together for the situation to be
reproducible, but it systematically appears for some users.
It is particularly important to credit Sander Klein and John Rood from
Picturae ICT ( http://picturae.com/ ) for reporting this bug on the mailing
list, providing configs and countless traces showing the bug in action, and
for their patience testing litteraly tens of snapshots and versions of
supposed fixes during a full week to narrow the commit range until the bug
was really knocked down! As a side effect of their numerous tests, several
other bugs were fixed.
stream_int_chk_rcv_conn() did not clear connection flags before updating them. It
is unsure whether this could have caused the stalled transfers that have been
reported since dev15.
In order to avoid such further issues, we now use a simple inline function to do
all the job.
Back in the days where polling was made with select() where all FDs
were checked at once, stream_int_chk_snd_conn() used to check whether
the file descriptor it was passed was ready or not, so that it did
not perform the work for nothing.
Right now FDs are checked just before calling the I/O handler so this
test never matches at best, or may return false information at worst.
Since conn_fd_handler() always clears the flags upon exit, it looks
like a missed event cannot happen right now. Still, better remove
this outdated check than wait for it to cause issues.
Sander Klein reported a rare case of POST transfers being stalled
after a few megabytes since dev15. One possible culprit is the fix
for the CPU spinning issues which is not totally correct, because
stream_int_chk_snd_conn() would inconditionally enable the
CO_FL_CURR_WR_ENA flag.
What could theorically happen is the following sequence :
1) send buffer is empty, server-side polling is disabled
2) client sends some data
3) such data are forwarded to the server using
stream_int_chk_snd_conn()
4) conn->flags |= CO_FL_CURR_WR_ENA
5) si_conn_send_loop() is called
6) raw_sock_from_buf() does a partial write due to full kernel buffers
7) stream_int_chk_snd_conn() detects this and requests to be called
to send the remaining data using __conn_data_want_send(), and clears
the SI_FL_WAIT_DATA flag on the stream interface, indicating that it
is already congestionned.
8) conn_cond_update_polling() calls conn_data_update_polling() which
sees that both CO_FL_DATA_WR_ENA and CO_FL_CURR_WR_ENA are set, so
it does not enable polling on the output fd.
9) the next chunk from the client fills the buffer
10) stream_int_chk_snd_conn() is called again
11) SI_FL_WAIT_DATA is already cleared, so the function immediately
returns without doing anything.
12) the buffer is now full with the FD write polling disabled and
everything deadlocks.
Not that there is no reason for such an issue not to happen the other
way around, from server to client, except maybe that due to the speed
difference between the client and the server, client-side polling is
always enabled and the buffer is never empty.
All this shows that the new polling still looks fragile, in part due
to the double information on the FD status, being both in fdtab[] and
in the connection, which looks unavoidable. We should probably have
some functions to tighten the relation between such flags and avoid
manipulating them by hand.
Also, the effects of chk_snd() on the polling are still under-estimated,
while the relation between the stream_int and the FD is still too much
present. Maybe the function should be rethought to only call the connection's
fd handler. The connection model probably needs two calling conventions
for bottom half and upper half.
J. Maurice reported that ssllabs.com test affects unrelated
legitimate traffic and cause SSL errors and broken connections.
Sometimes openssl store read/write/handshake errors in a global stack. This
stack is not specific to the current session. Openssl API does not clean the
stack at the beginning of a new read/write. And the function used to retrieve
error ID after read/write, returns the generic error SSL_ERROR_SSL if the
global stack is not empty.
The fix consists in cleaning the errors stack after read/write/handshake
errors.
The two ACL fetches "resp_ver" and "status", if used in a request despite
the warning, would return a match of zero length. This is inappropriate,
better return a non-match to be more consistent with other ACL processing.
When a polled I/O event is detected, the event is added to the updates
list and the I/O handler is called. Upon return, if the event handler
did not experience an EAGAIN, the event remains in the updates list so
that it will be processed later. But if the event was already in the
spec list, its state is updated and it will be called again immediately
upon exit, by fd_process_spec_events(), so this creates unfairness
between speculative events and polled events.
So don't call the I/O handler upon I/O detection when the FD already is
in the spec list. The fd events are still updated so that the spec list
is up to date with the possible I/O change.
The epoll loop checks for newly appeared FDs in order to process them early
if they're accepted sockets. Since the introduction of the fd_ev_set()
calls before the iocb(), the current FD is always in the update list,
and we don't want to check it again, so we must assign the old_updt
index just before calling the I/O handler.
Playing with fdtab[fd].ev makes gcc constantly reload the pointers
because it does not know they don't alias. Use a temporary variable
instead. This saves a few operations in the fast path.
Looking at the assembly code that updt_fd() and alloc/release_spec_entry
produce in the polling loops, it's clear that gcc has to recompute pointers
several times in a row because of limited spare registers. By better
grouping adjacent structure updates, we improve the code size by around
60 bytes in the fast path on x86.
In ev_poll and ev_epoll, we have a bit-to-bit mapping between the POLL_
constants and the FD_POLL_ constants. A comment said that gcc was able
to detect this and to automatically apply a mask. Things have possibly
changed since the output assembly doesn't always reflect this. So let's
perform an explicit assignment when bits are equal.
The iprange tool is handy for transforming network range formats, but
it's common to need a tool for running quick checks on the output.
The tool now supports a list of addresses on the command line, and it
will only output those which match. It's absolutely inefficient but is
handy for debugging.
Released version 1.5-dev15 with the following main changes :
- DOC: add a few precisions on compression
- BUG/MEDIUM: ssl: Fix handshake failure on session resumption with client cert.
- BUG/MINOR: ssl: One free session in cache remains unused.
- BUG/MEDIUM: ssl: first outgoing connection would fail with {ca,crt}-ignore-err
- MEDIUM: ssl: manage shared cache by blocks for huge sessions.
- MINOR: acl: add fetch for server session rate
- BUG/MINOR: compression: Content-Type is case insensitive
- MINOR: compression: disable on multipart or status != 200
- BUG/MINOR: http: don't report client aborts as server errors
- MINOR: stats: compute the ratio of compressed response based on 2xx responses
- MINOR: http: factor out the content-type checks
- BUG/MAJOR: stats: correctly check for a possible divide error when showing compression ratios
- BUILD: ssl: OpenSSL 0.9.6 has no renegociation
- BUG/MINOR: http: disable compression when message has no body
- MINOR: compression: make the stats a bit more robust
- BUG/MEDIUM: comp: DEFAULT_MAXZLIBMEM was expressed in bytes and not megabytes
- MINOR: connection: don't remove failed handshake flags
- MEDIUM: connection: add an error code in connections
- MEDIUM: connection: add minimal error reporting in logs for incomplete connections
- MEDIUM: connection: add error reporting for the PROXY protocol header
- MEDIUM: connection: add error reporting for the SSL
- DOC: document the connection error format in logs
- BUG/MINOR: http: don't log a 503 on client errors while waiting for requests
- BUILD: stdbool is not portable
- BUILD: ssl: NAME_MAX is not portable, use MAXPATHLEN instead
- BUG/MAJOR: raw_sock: must check error code on hangup
- BUG/MAJOR: polling: do not set speculative events on ERR nor HUP
- BUG/MEDIUM: session: fix FD leak when transport layer logging is enabled
- MINOR: stats: add a few more information on session dump
- BUG/MINOR: tcp: set the ADDR_TO_SET flag on outgoing connections
- CLEANUP: connection: remove unused server/proxy/task/si_applet declarations
- BUG/MEDIUM: tcp: process could theorically crash on lack of source ports
- MINOR: cfgparse: mention "interface" in the list of allowed "source" options
- MEDIUM: connection: introduce "struct conn_src" for servers and proxies
- CLEANUP: proto_tcp: use the same code to bind servers and backends
- CLEANUP: backend: use the same tproxy address selection code for servers and backends
- BUG/MEDIUM: stick-tables: conversions to strings were broken in dev13
- MEDIUM: proto_tcp: add support for tracking L7 information
- MEDIUM: counters: add sc1_trackers/sc2_trackers
- MINOR: http: add the "base32" pattern fetch function
- MINOR: http: add the "base32+src" fetch method.
- CLEANUP: session: use an array for the stick counters
- BUG/MINOR: proto_tcp: fix parsing of "table" in track-sc1/2
- BUG/MINOR: proto_tcp: bidirectional fetches not supported anymore in track-sc1/2
- BUG/MAJOR: connection: always recompute polling status upon I/O
- BUG/MINOR: connection: remove a few synchronous calls to polling updates
- MINOR: config: improve error checking on TCP stick-table tracking
- DOC: add some clarifications to the readme