haproxy public development tree
Go to file
Willy Tarreau 2eab6d3543 BUG/MINOR: h1: do not accept '#' as part of the URI component
Seth Manesse and Paul Plasil reported that the "path" sample fetch
function incorrectly accepts '#' as part of the path component. This
can in some cases lead to misrouted requests for rules that would apply
on the suffix:

    use_backend static if { path_end .png .jpg .gif .css .js }

Note that this behavior can be selectively configured using
"normalize-uri fragment-encode" and "normalize-uri fragment-strip".

The problem is that while the RFC says that this '#' must never be
emitted, as often it doesn't suggest how servers should handle it. A
diminishing number of servers still do accept it and trim it silently,
while others are rejecting it, as indicated in the conversation below
with other implementers:

   https://lists.w3.org/Archives/Public/ietf-http-wg/2023JulSep/0070.html

Looking at logs from publicly exposed servers, such requests appear at
a rate of roughly 1 per million and only come from attacks or poorly
written web crawlers incorrectly following links found on various pages.

Thus it looks like the best solution to this problem is to simply reject
such ambiguous requests by default, and include this in the list of
controls that can be disabled using "option accept-invalid-http-request".

We're already rejecting URIs containing any control char anyway, so we
should also reject '#'.

In the H1 parser for the H1_MSG_RQURI state, there is an accelerated
parser for bytes 0x21..0x7e that has been tightened to 0x24..0x7e (it
should not impact perf since 0x21..0x23 are not supposed to appear in
a URI anyway). This way '#' falls through the fine-grained filter and
we can add the special case for it also conditionned by a check on the
proxy's option "accept-invalid-http-request", with no overhead for the
vast majority of valid URIs. Here this information is available through
h1m->err_pos that's set to -2 when the option is here (so we don't need
to change the API to expose the proxy). Example with a trivial GET
through netcat:

  [08/Aug/2023:16:16:52.651] frontend layer1 (#2): invalid request
    backend <NONE> (#-1), server <NONE> (#-1), event #0, src 127.0.0.1:50812
    buffer starts at 0 (including 0 out), 16361 free,
    len 23, wraps at 16336, error at position 7
    H1 connection flags 0x00000000, H1 stream flags 0x00000810
    H1 msg state MSG_RQURI(4), H1 msg flags 0x00001400
    H1 chunk len 0 bytes, H1 body len 0 bytes :

    00000  GET /aa#bb HTTP/1.0\r\n
    00021  \r\n

This should be progressively backported to all stable versions along with
the following patch:

    REGTESTS: http-rules: add accept-invalid-http-request for normalize-uri tests

Similar fixes for h2 and h3 will come in followup patches.

Thanks to Seth Manesse and Paul Plasil for reporting this problem with
detailed explanations.
2023-08-08 19:56:11 +02:00
.github CI: explicitely highlight VTest result section if there's something 2023-07-17 15:56:53 +02:00
addons MINOR: tree-wide: use free_acl_cond() where relevant 2023-05-11 15:37:04 +02:00
admin MINOR: acme.sh: add the deploy script for acme.sh in admin directory 2023-04-26 17:32:15 +02:00
dev DEV: add a Lua helper script for SSL keys logging 2023-05-24 16:08:23 +02:00
doc MINOR: acl: add acl() sample fetch 2023-08-01 10:49:06 +02:00
examples EXAMPLES: maintain haproxy 2.8 retrocompatibility for lua mailers script 2023-07-11 16:04:22 +02:00
include MINOR: h2: pass accept-invalid-http-request down the request parser 2023-08-08 19:10:54 +02:00
reg-tests REGTESTS: http-rules: add accept-invalid-http-request for normalize-uri tests 2023-08-08 19:55:51 +02:00
scripts SCRIPTS: publish-release: update the umask to keep group write access 2023-05-24 22:49:12 +02:00
src BUG/MINOR: h1: do not accept '#' as part of the URI component 2023-08-08 19:56:11 +02:00
tests TESTS: add a unit test for one_among_mask() 2022-06-21 20:29:57 +02:00
.cirrus.yml CI: cirrus-ci: bump FreeBSD image to 13-1 2023-04-23 09:44:53 +02:00
.gitattributes
.gitignore CONTRIB: Add vi file extensions to .gitignore 2023-06-02 18:14:34 +02:00
.mailmap
.travis.yml
BRANCHES
BSDmakefile BUILD: makefile: commit the tiny FreeBSD makefile stub 2023-05-24 17:17:36 +02:00
CHANGELOG [RELEASE] Released version 2.9-dev2 2023-07-21 20:29:42 +02:00
CONTRIBUTING
INSTALL DOC: install: Document how to build a limited support for QUIC 2023-07-21 20:27:13 +02:00
LICENSE
MAINTAINERS CLEANUP: assorted typo fixes in the code and comments 2022-11-30 14:02:36 +01:00
Makefile MINOR: quic: Split QUIC connection code into three parts 2023-07-27 10:51:03 +02:00
README
SUBVERS
VERDATE [RELEASE] Released version 2.9-dev2 2023-07-21 20:29:42 +02:00
VERSION [RELEASE] Released version 2.9-dev2 2023-07-21 20:29:42 +02:00

The HAProxy documentation has been split into a number of different files for
ease of use.

Please refer to the following files depending on what you're looking for :

  - INSTALL for instructions on how to build and install HAProxy
  - BRANCHES to understand the project's life cycle and what version to use
  - LICENSE for the project's license
  - CONTRIBUTING for the process to follow to submit contributions

The more detailed documentation is located into the doc/ directory :

  - doc/intro.txt for a quick introduction on HAProxy
  - doc/configuration.txt for the configuration's reference manual
  - doc/lua.txt for the Lua's reference manual
  - doc/SPOE.txt for how to use the SPOE engine
  - doc/network-namespaces.txt for how to use network namespaces under Linux
  - doc/management.txt for the management guide
  - doc/regression-testing.txt for how to use the regression testing suite
  - doc/peers.txt for the peers protocol reference
  - doc/coding-style.txt for how to adopt HAProxy's coding style
  - doc/internals for developer-specific documentation (not all up to date)