Commit Graph

198 Commits

Author SHA1 Message Date
Frédéric Lécaille
0bec807e08 MINOR: shctx: Shared objects block by block allocation.
This patch makes shctx capable of storing objects in several parts,
each parts being made of several blocks. There is no more need to
walk through until reaching the end of a row to append new blocks.

A new pointer to a struct shared_block member, named last_reserved,
has been added to struct shared_block so that to memorize the last block which was
reserved by shctx_row_reserve_hot(). Same thing about "last_append" pointer which
is used to memorize the last block used by shctx_row_data_append() to store the data.
2018-10-24 04:35:53 +02:00
Willy Tarreau
61c112aa5b REORG: http: move HTTP rules parsing to http_rules.c
These ones are mostly called from cfgparse.c for the parsing and do
not depend on the HTTP representation. The functions's prototypes
were moved to proto/http_rules.h, making this file work exactly like
tcp_rules. Ideally we should stop calling these functions directly
from cfgparse and register keywords, but there are a few cases where
that wouldn't work (stats http-request) so it's probably not worth
trying to go this far.
2018-10-02 18:28:05 +02:00
Willy Tarreau
6b952c8101 REORG: http: move http_get_path() to http.c
This function is purely HTTP once http_txn is put aside. So the original
one was renamed to http_txn_get_path() and it extracts the relevant offsets
from the txn to pass them to http_get_path(). One benefit of the new version
is that it returns the length at the same time so that allowed to slightly
simplify http_get_path_from_string() which had to look up the end pointer
previously and which is not needed anymore.
2018-09-11 10:30:25 +02:00
Willy Tarreau
83061a820e MAJOR: chunks: replace struct chunk with struct buffer
Now all the code used to manipulate chunks uses a struct buffer instead.
The functions are still called "chunk*", and some of them will progressively
move to the generic buffer handling code as they are cleaned up.
2018-07-19 16:23:43 +02:00
Willy Tarreau
843b7cbe9d MEDIUM: chunks: make the chunk struct's fields match the buffer struct
Chunks are only a subset of a buffer (a non-wrapping version with no head
offset). Despite this we still carry a lot of duplicated code between
buffers and chunks. Replacing chunks with buffers would significantly
reduce the maintenance efforts. This first patch renames the chunk's
fields to match the name and types used by struct buffers, with the goal
of isolating the code changes from the declaration changes.

Most of the changes were made with spatch using this coccinelle script :

  @rule_d1@
  typedef chunk;
  struct chunk chunk;
  @@
  - chunk.str
  + chunk.area

  @rule_d2@
  typedef chunk;
  struct chunk chunk;
  @@
  - chunk.len
  + chunk.data

  @rule_i1@
  typedef chunk;
  struct chunk *chunk;
  @@
  - chunk->str
  + chunk->area

  @rule_i2@
  typedef chunk;
  struct chunk *chunk;
  @@
  - chunk->len
  + chunk->data

Some minor updates to 3 http functions had to be performed to take size_t
ints instead of ints in order to match the unsigned length here.
2018-07-19 16:23:43 +02:00
Willy Tarreau
c9fa0480af MAJOR: buffer: finalize buffer detachment
Now the buffers only contain the header and a pointer to the storage
area which can be anywhere. This will significantly simplify buffer
swapping and will make it possible to map chunks on buffers as well.

The buf_empty variable was removed, as now it's enough to have size==0
and area==NULL to designate the empty buffer (thus a non-allocated head
is the empty buffer by default). buf_wanted for now is indicated by
size==0 and area==(void *)1.

The channels and the checks now embed the buffer's head, and the only
pointer is to the storage area. This slightly increases the unallocated
buffer size (3 extra ints for the empty buffer) but considerably
simplifies dynamic buffer management. It will also later permit to
detach unused checks.

The way the struct buffer is arranged has proven quite efficient on a
number of tests, which makes sense given that size is always accessed
and often first, followed by the othe ones.
2018-07-19 16:23:43 +02:00
Willy Tarreau
178b987025 MINOR: cache: use the new buffer API
A few direct accesses to buf->p now use ci_head() instead.
2018-07-19 16:23:42 +02:00
Olivier Houchard
acd1403794 MINOR: buffer: Use b_add()/bo_add() instead of accessing b->i/b->o.
Use the newly available functions instead of using the buffer fields directly.
2018-07-19 16:23:42 +02:00
Willy Tarreau
8f9c72d301 MINOR: buffer: remove bi_end()
It was replaced by ci_tail() when the channel is known, or b_tail() in
other cases.
2018-07-19 16:23:40 +02:00
Willy Tarreau
dda2e41881 MINOR: buffer: remove bi_ptr()
It's now been replaced by b_head() when b->o is null, ci_head() when
the channel is known, or b_peek(b, b->o) in other situations.
2018-07-19 16:23:40 +02:00
Willy Tarreau
7194d3cc3b MINOR: buffer: split bi_contig_data() into ci_contig_data and b_config_data()
This function was sometimes used from a channel and sometimes from a buffer.
In both cases it requires knowledge of the size of the output data (to skip
them). Here the split ensures the channel can deal with this point, and that
other places not having output data can continue to work.
2018-07-19 16:23:40 +02:00
Willy Tarreau
bcbd39370f MINOR: channel/buffer: replace b_{adv,rew} with c_{adv,rew}
These ones manipulate the output data count which will be specific to
the channel soon, so prepare the call points to use the channel only.
The b_* functions are now unused and were removed.
2018-07-19 16:23:40 +02:00
Aurélien Nephtali
abbf607105 MEDIUM: cli: Add payload support
In order to use arbitrary data in the CLI (multiple lines or group of words
that must be considered as a whole, for example), it is now possible to add a
payload to the commands. To do so, the first line needs to end with a special
pattern: <<\n. Everything that follows will be left untouched by the CLI parser
and will be passed to the commands parsers.

Per-command support will need to be added to take advantage of this
feature.

Signed-off-by: Aurélien Nephtali <aurelien.nephtali@corp.ovh.com>
2018-04-26 14:19:33 +02:00
Willy Tarreau
1093a4586c BUG/MAJOR: cache: always initialize newly created objects
Recent commit 5bd37fa ("BUG/MAJOR: cache: fix random crashes caused by
incorrect delete() on non-first blocks") addressed an issue where
dangling objects could be deleted in the cache, but even after this fix
some similar segfaults were reported at the same place (cache_free_blocks()).
The tree was always corrupted as well. Placing some traces revealed
that this time it's caused by a missing initialization in
http_action_store_cache() : while object->eb.key is used to note that
the object is not in the tree, the first retrieved block may contain
random data and is not initialized. Further, this entry can be updated
later without the object being inserted into the tree. Thus, if at the
end the object is not stored and the blocks are put back to the avail
list, the next attempt to use them will find eb.key != 0 and will try
to delete the uninitialized block, will see that eb.node.leaf_p is not
NULL (random data), and will dereference it as well as a few other
uninitialized pointers. It was harder to trigger than the previous one,
despite being very closely related. This time the following config
was used :

  listen l1
    mode http
    bind :8888
    http-request cache-use c1
    http-response cache-store c1
    server s1 127.0.0.1:8000

  cache c1
    total-max-size 4
    max-age 10

Httpterm was running on port 8000.

And it was stressed this way :

  $ inject -o 1 -u 500 -P 1 -G '127.0.0.1:8888/?s=4097&p=1&x=%s'
  ... wait 5 seconds then Ctrl-C ...
  # wait 3 seconds doing nothing
  $ inject -o 1 -u 500 -P 1 -G '127.0.0.1:8888/?s=4097&p=1&x=%s'
  => segfault

Other values don't work well. The size and the small pieces in the
responses (p=1) are critical to make it work.

Here the fix consists in pre-zeroing object->eb.key AND object->eb.leaf_p
just after the object is allocated so as to stay consistent with other
locations. Ideally this could be simplified later by only relying on
eb->node.leaf_p everywhere since in the end the key alone is not a
reliable indicator, so that we use only one indicator of being part of
the tree or not.

This fix needs to be backported to 1.8.
2018-04-06 19:02:25 +02:00
Willy Tarreau
5bd37fa625 BUG/MAJOR: cache: fix random crashes caused by incorrect delete() on non-first blocks
Several segfaults were reported in the cache, each time in eb_delete()
called from cache_free_blocks() itself called from shctx_row_reserve_hot().
Each time the tree node was corrupted with random cached data (often JS or
HTML contents).

The problem comes from an incompatibility between the cache's expectations
and the recycling algorithm used in the shctx. The shctx allocates and
releases a chain of blocks at once. And when it needs to allocate N blocks
from the avail list while a chain of M>N is found, it picks the first N
from the list, moves them to the hot list, and marks all remaining M-N
blocks as isolated blocks (chains of 1).

For each such released block, the shctx->free_block() callback is used
and passed a pointer to the first and current block of the chain. For
the cache, it's cache_free_blocks(). What this function does is check
that the current block is the first one, and in this case delete the
object from the tree and mark it as not in tree by setting key to zero.

The problem this causes is that the tail blocks when M>N become first
blocks for the next call to shctx_row_reserve_hot(), these ones will
be passed to cache_free_blocks() as list heads, and will be sent to
eb_delete() despite containing only cached data.

The simplest solution for now is to mark each block as holding no cache
object by setting key to zero all the time. It keeps the principle used
elsewhere in the code. The SSL code is not subject to this problem
because it relies on the block's len not being null, which happens
immediately after a block was released. It was uncertain however whether
this method is suitable for the cache. It is not critical though since
this code is going to change soon in 1.9 to dynamically allocate only
the number of required blocks.

This fix must be backported to 1.8. Thanks to Thierry for providing
exploitable cores.
2018-04-04 20:17:03 +02:00
Willy Tarreau
afe1de5d98 BUG/MINOR: cache: fix "show cache" output
The "show cache" command used to dump the header for each entry into into
the handler loop, making it repeated every ~16kB of output data. Additionally
chunk_appendf() was used instead of chunk_printf(), causing the output to
repeat already emitted lines, and the output size to grow in O(n^2). It used
to take several minutes to report tens of millions of objects from a small
cache containing only a few thousands. There was no more impact though.

This fix must be backported to 1.8.
2018-04-04 11:56:43 +02:00
Willy Tarreau
d4569d1937 BUG/MEDIUM: cache: don't cache the response on no-cache="set-cookie"
If the server mentions no-cache="set-cookie" in the response headers,
we must guarantee that any set-cookie field will not be stored. We
cannot edit the stored response on the fly to trim the set-cookie
header so we can refrain from storing a response containing such a
header. In theory we could use TX_SCK_PRESENT for this but this one
is only set when the cookie is being watched by the configuration.
Since these responses are not very frequent and often accompanied
with a set-cookie header, let's simply refrain from caching whenever
such directive is present.

This needs to be backported to 1.8.
2017-12-22 18:03:04 +01:00
Willy Tarreau
504455c533 BUG/MEDIUM: cache: respect the request cache-control header
Till now if a client emitted a request featureing a cache-control header,
this one was not respected and a stale object could still be delievered.r
 This patch ensures that :
  - cache-control: no-cache disables retrieval from the cache but does
    not prevent the newly fetched object from being stored ;
  - cache-control: no-store can safely retrieve from the cache but prevents
    from storing any fetched object
  - cache-control: max-age/max-stale/min-fresh act like no-cache
  - pragma: no-cache acts like cache-control: no-cache.

This needs to be backported to 1.8.
2017-12-22 17:56:18 +01:00
Willy Tarreau
c9bd34c7e0 BUG/MEDIUM: cache: replace old object on store
Currently the cache aborts a store operation if the object to store
already exists in the cache. This is used to avoid storing multiple
copies at the same time on concurrent accesses. It causes an issue
though, which is that existing unexpired objects cannot be updated.
This happens when any request criterion disables the retrieval from
the cache (eg: with max-age or any other cache-control condition).

For now, let's simply replace the previous existing entry by unlinking
it from the index. This could possibly be improved in the future if
needed.

This fix needs to be backported to 1.8.
2017-12-22 17:56:18 +01:00
Willy Tarreau
7704b1e89a BUG/MEDIUM: cache: do not try to retrieve host-less requests from the cache
All HTTP/1.1 requests the Host header share the same hash key 0 and
will be return the first cached object. Let's add the check on the call
to sha1_hosturi() to prevent this from happening.

This must be backported to 1.8.
2017-12-22 17:56:17 +01:00
Willy Tarreau
faf2909f9f BUG/MINOR: cache: do not force the TX_CACHEABLE flag before checking cacheability
The cache used to set this flag before calling
check_response_for_cacheability() due to the way the flags were previously
set (too late), but this is a bad idea as it loses the information of the
implicit caching rules related to the method and the status code. Let's
only rely on what was determined during the request and response parsing
instead and not change it.

This fix must be backported to 1.8, and it requires that the following
patches are also merged :
 - MINOR: http: adjust the list of supposedly cacheable methods
 - MINOR: http: update the list of cacheable status codes as per RFC7231
 - MINOR: http: start to compute the transaction's cacheability from the request
 - BUG/MINOR: http: do not ignore cache-control: public
2017-12-22 15:49:15 +01:00
William Lallemand
bcd9101a66 BUG/MEDIUM: cache: bad computation of the remaining size
The cache was not setting the hdrs_len to zero when we are called
in the http_forward_data with headers + body.

The consequence is to always try to store a size - the size of headers,
during the calls to http_forward_data even when it has already forwarded
the headers.

Thanks to Cyril Bonté for reporting this bug.

Must be backported to 1.8.
2017-11-28 12:06:06 +01:00
Willy Tarreau
fd5efb5936 CLEANUP: cache: more efficiently pack the struct cache
By having the cache id on 33 bytes as the first member, it was
creating a hole and forcing the "hot" remaining part to be split
across two cache lines. Let's move the id at the end as it's used
only during config parsing.
2017-11-26 11:10:53 +01:00
William Lallemand
49b4453b58 MEDIUM: cache: max-age configuration keyword
Add a configuration keyword to change the max-age.
The default one is still 60s.
2017-11-24 19:31:01 +01:00
William Lallemand
a71cd1d407 MINOR: cache: replace a fprint() by an abort()
In the applet I/O handler we can never get an object bigger than a
buffer, so we should never reach this case.
2017-11-24 19:00:07 +01:00
Willy Tarreau
bafbe01028 CLEANUP: pools: rename all pool functions and pointers to remove this "2"
During the migration to the second version of the pools, the new
functions and pool pointers were all called "pool_something2()" and
"pool2_something". Now there's no more pool v1 code and it's a real
pain to still have to deal with this. Let's clean this up now by
removing the "2" everywhere, and by renaming the pool heads
"pool_head_something".
2017-11-24 17:49:53 +01:00
Olivier Houchard
fbc74e8556 MINOR/CLEANUP: proxy: rename "proxy" to "proxies_list"
Rename the global variable "proxy" to "proxies_list".
There's been multiple proxies in haproxy for quite some time, and "proxy"
is a potential source of bugs, a number of functions have a "proxy" argument,
and some code used "proxy" when it really meant "px" or "curproxy". It worked
by pure luck, because it usually happened while parsing the config, and thus
"proxy" pointed to the currently parsed proxy, but we should probably not
rely on this.

[wt: some of these are definitely fixes that are worth backporting]
2017-11-24 17:21:27 +01:00
Christopher Faulet
767a84bcc0 CLEANUP: log: Rename Alert/Warning in ha_alert/ha_warning 2017-11-24 17:19:12 +01:00
William Lallemand
ecb73b12c1 MINOR: cache: move the refcount decrease in the applet release
Move the refcount decrease of the cache in the release callback of the
applet. We don't need to decrease it in the applet code.
2017-11-24 15:04:36 +01:00
William Lallemand
49dc048c25 BUG/MEDIUM: cache: free ressources in chn_end_analyze
Upon an aborted HTTP connection, or an error, the filter cache does not
decrement the refcount and does not free the allocated ressources.
2017-11-24 15:04:36 +01:00
William Lallemand
f528fff46b MEDIUM: cache: store sha1 for hashing the cache key
The cache was relying on the txn->uri for creating its key, which was a
big problem when there was no log activated.

This patch does a sha1 of the host + uri, and stores it in the txn.
When a object is stored, the eb32node uses the first 32 bits of the hash
as a key, and the whole hash is stored in the cache entry.

During a lookup, the truncated hash is used, and when it matches an
entry we check the real sha1.
2017-11-23 20:20:04 +01:00
William Lallemand
e899af89b5 BUG/MEDIUM: cache fix cli_kws structure
The cli_kws structure was not ended and was causing undefined behavior.
2017-11-22 16:56:58 +01:00
William Lallemand
55e7674bc4 BUG/MEDIUM: cache: refcount forbids to free the objects
Some refcount decrementation were forgotten and they were forbidding to
reuse the objects in some cases.
2017-11-22 15:13:54 +01:00
William Lallemand
0872766e31 BUG/MEDIUM: cache: use key=0 as a condition for freeing
The cache was trying to remove objects from the tree while they were
already removed from it. We set the key to 0 as a check for not trying
to remove the object from the tree when we are still using the object.
2017-11-22 15:13:54 +01:00
William Lallemand
1f49a366fd MEDIUM: cache: "show cache" on the cli
The cli command "show cache" displays the status of the cache, the first
displayed line is the shctx informations with how much blocks available
blocks it contains (blocks are 1k by default).

The next lines are the objects stored in the cache tree, the pointer,
the size of the object and how much blocks it uses, a refcount for the
number of users of the object, and the remaining expiration time (which
can be negative if expired)

Example:

    $ echo "show cache" | socat - /run/haproxy.sock
    0x7fa54e9ab03a: foobar (shctx:0x7fa54e9ab000, available blocks:3921)
    0x7fa54ed65b8c (size: 43190 (43 blocks), refcount:2, expire: 2)
    0x7fa54ecf1b4c (size: 45238 (45 blocks), refcount:0, expire: 2)
    0x7fa54ed70cec (size: 61622 (61 blocks), refcount:0, expire: 2)
    0x7fa54ecdbcac (size: 42166 (42 blocks), refcount:1, expire: 2)
    0x7fa54ec9736c (size: 44214 (44 blocks), refcount:2, expire: 2)
    0x7fa54eca28ec (size: 46262 (46 blocks), refcount:2, expire: -2)
2017-11-21 21:35:04 +01:00
William Lallemand
75d93291c9 CLEANUP: cache: reorder includes 2017-11-21 21:35:04 +01:00
William Lallemand
eee5c39715 CLEANUP: cache: remove wrong comment 2017-11-20 19:22:27 +01:00
William Lallemand
a400a3a6d0 BUG/MEDIUM: cache: free callback to remove from tree
Call the shctx free_blocks callback in order to remove the row from the
cache tree.

Put the row in the hot list during allocation, forbid the blocks to be
stolen by a free or a row_reserve
2017-11-20 19:22:27 +01:00
William Lallemand
e1533f5790 MINOR: cache: disable cache if shctx_row_data_append fail
Disable the cache if the append of data failed, it should never happen
because the allocated row size is at least equal to the size of the
object to allocate.
2017-11-14 15:20:44 +01:00
William Lallemand
10935bc547 MINOR: cache: forward data with headers
Forward the remaining headers with the data in the first call of
cache_store_http_forward_data().

Previously the headers were forwarded first, and the function left,
implying an additionnal call to cache_store_http_forward_data() for the
data.

Cc: Christopher Faulet <cfaulet@haproxy.com>
2017-11-14 15:20:44 +01:00
William Lallemand
9d5f54daad BUG/MEDIUM: cache: use msg->sov to forward header
Use msg->sov to forward headers instead of msg->eoh. It can causes some
problem because eoh does not contains the last \r\n, and the filter does
not support to send the headers partially.

Cc: Christopher Faulet <cfaulet@haproxy.com>
2017-11-14 15:20:44 +01:00
William Lallemand
18f133adb3 BUG/MEDIUM: cache: does not cache if no Content-Length
In the case of Transfer-Encoding: chunked, there is no Content-Length
which causes the cache to allocate a too small shctx row for the data.

It's not possible to allocate a shctx row for the chunks, we need to be
able to allocate on-the-fly the shctx blocks during the data transfer.
2017-11-11 14:01:21 +01:00
William Lallemand
9c54c53f2f BUG/MEDIUM: cache: don't try to resolve wrong filters
Don't try to resolve wrong filters which are not cache filters during
the post configuration callback.
2017-11-02 16:58:25 +01:00
Olivier Houchard
fccf840cdf MINOR: cache: Don't confuse act_return and act_parse_ret. 2017-11-01 15:10:51 +01:00
Olivier Houchard
cd2867a012 MINOR: cache: Remove useless test for nonzero.
Don't bother testing if len is nonzero, we know it is, as we're in the
"else" part of a if (!len), and testing it confuses clang into thinking
ret may be left uninitialized.
2017-11-01 15:10:51 +01:00
William Lallemand
77c1197bfb MEDIUM: cache: deliver objects from cache
Lookup objects in the cache and deliver them using the http-request
action "cache-use".
2017-10-31 21:17:19 +01:00
William Lallemand
4da3f8a1f2 MEDIUM: cache: store objects in cache
Store object in the cache. The cache use an shctx for storage.

It uses an http-response action to store the headers and a filter to
store the body. The http-response action is used in order to allow
modifications by other actions before caching.
2017-10-31 21:17:19 +01:00
William Lallemand
41db46035e MEDIUM: cache: configuration parsing and initialization
Parse a configuration section "cache" and a http-{response,request}
actions.

Example:

    listen frt
        mode http
        http-response cache-store foobar
        http-request cache-use foobar

    cache foobar
        total-max-size 4   # size in megabytes
2017-10-31 21:17:19 +01:00