Till now the callers had to know which one to call for specific use cases.
Let's fuse them now since a single one will remain after the API migration.
Given that bi_del() may only be used where o==0, just combine the two tests
by first removing output data then only input.
This will be important so that we can parse a buffer without touching it.
Now we indicate where from the buffer's head we plan to start to copy, and
for how many bytes. This will be used by send functions to loop at the end
of the buffer without having to update the buffer's output byte count.
This new functoin limits itself to the amount of data available in the
buffer and doesn't care about the direction anymore. It's only called
from co_getblk() which already checks that no more than the available
output bytes is requested.
These ones were merged into a single b_contig_space() that covers both
(the bo_ case was a simplified version of the other one). The function
doesn't use ->i nor ->o anymore.
This function was sometimes used from a channel and sometimes from a buffer.
In both cases it requires knowledge of the size of the output data (to skip
them). Here the split ensures the channel can deal with this point, and that
other places not having output data can continue to work.
These ones manipulate the output data count which will be specific to
the channel soon, so prepare the call points to use the channel only.
The b_* functions are now unused and were removed.
Where relevant, the channel version is used instead. The buffer version
was ported to be more generic and now takes a swap buffer and the output
byte count to know where to set the alignment point. The H2 mux still
uses buffer_slow_realign() with buf->o but it will change later.
Here's the list of newly introduced functions :
- b_data(), returning the total amount of data in the buffer (currently i+o)
- b_orig(), returning the origin of the storage area, that is, the place of
position 0.
- b_wrap(), pointer to wrapping point (currently data+size)
- b_size(), returning the size of the buffer
- b_room(), returning the amount of bytes left available
- b_full(), returning true if the buffer is full, otherwise false
- b_stop(), pointer to end of data mark (currently p+i), used to compute
distances or a stop pointer for a loop.
- b_peek(), this one will help make the transition to the new buffer model.
It returns a pointer to a position in the buffer known from an offest
relative to the beginning of the data in the buffer. Thus, we can replace
the following occurrences :
bo_ptr(b) => b_peek(b, 0);
bo_end(b) => b_peek(b, b->o);
bi_ptr(b) => b_peek(b, b->o);
bi_end(b) => b_peek(b, b->i + b->o);
b_ptr(b, ofs) => b_peek(b, b->o + ofs);
- b_head(), pointer to the beginning of data (currently bo_ptr())
- b_tail(), pointer to first free place (currently bi_ptr())
- b_next() / b_next_ofs(), pointer to the next byte, taking wrapping
into account.
- b_dist(), returning the distance between two pointers belonging to a buffer
- b_reset(), which resets the buffer
- b_space_wraps(), indicating if the free space wraps around the buffer
- b_almost_full(), indicating if 3/4 or more of the buffer are used
Some of these are provided with the unchecked variants using the "__"
prefix, or with the "_ofs" suffix indicating they return a relative
position to the buffer's origin instead of a pointer.
Cc: Olivier Houchard <ohouchard@haproxy.com>
The buffer code currently depends on pools and other stuff and is not
really autonomous anymore. The rewrite of the new API is an opportunity
to clean this up. This patch creates a new file (buf.h) which does not
depend on other elements and which will only contain what is needed to
perform the most basic buffer operations. The new API will be introduced
in this file and the conversion will be finished once buffer.h is empty.
The definition of struct buffer was moved to this new file, using more
explicity stdint types for the sizes and offsets.
Most new functions will be implemented in two variants :
__b_something() : unchecked variant, no wrapping is expected
b_something() : wrapping-checked variant
This way callers will be able to select which one to use depending on
the use cases.
When the block of data need to be split to support the wrapping, the start of
the second block of data was wrong. We must be sure to skup data copied during
the first memcpy.
This patch must be backported to 1.8.
When the block of data need to be split to support the wrapping, the start of
the second block of data was wrong. We must be sure to skip data copied during
the first memcpy.
This patch must be backported to 1.8, 1.7, 1.6 and 1.5.
Since commit cf975d4 ("MINOR: pools/threads: Implement lockless memory
pools."), we support lockless pools. However the parts dedicated to
detecting use-after-free are not present in this part, making DEBUG_UAF
useless in this situation.
The present patch sets a new define CONFIG_HAP_LOCKLESS_POOLS when such
a compatible architecture is detected, and when pool debugging is not
requested, then makes use of this everywhere in pools and buffers
functions. This way enabling DEBUG_UAF will automatically disable the
lockless version.
No backport is needed as this is purely 1.9-dev.
Commit 9dcf9b6 ("MINOR: threads: Use __decl_hathreads to declare locks")
accidently lost a few "extern" in certain lock declarations, possibly
causing certain entries to be declared at multiple places. Apparently
it hasn't caused any harm though.
The offending ones were :
- fdtab_lock
- fdcache_lock
- poll_lock
- buffer_wq_lock
During the migration to the second version of the pools, the new
functions and pool pointers were all called "pool_something2()" and
"pool2_something". Now there's no more pool v1 code and it's a real
pain to still have to deal with this. Let's clean this up now by
removing the "2" everywhere, and by renaming the pool heads
"pool_head_something".
b_alloc_margin is, strickly speeking, thread-safe. It will not crash
HAproxy. But its contract is not respected anymore in a multithreaded
environment. In this function, we need to be sure to have <margin> buffers
available in the pool after the allocation. So to have this guarantee, we must
lock the memory pool during all the operation. This also means, we must call
internal and lockless memory functions (prefixed with '__').
For the record, this patch fixes a pernicious bug happens after a soft reload
where some streams can be blocked infinitly, waiting for a buffer in the
buffer_wq list. This happens because, during a soft reload, pool_gc2 is called,
making some calls to b_alloc_fast fail.
This is specific to threads, no backport is needed.
This macro should be used to declare variables or struct members depending on
the USE_THREAD compile option. It avoids the encapsulation of such declarations
between #ifdef/#endif. It is used to declare all lock variables.
This function incorrectly dealt with the case where data doesn't
wrap but lies at the end of the buffer, resulting in Lukas' reported
data corruption with HTTP/2. No backport is needed, it was introduced
for HTTP/2 in 1.8-dev.
First, we use atomic operations to update jobs/totalconn/actconn variables,
listener's nbconn variable and listener's counters. Then we add a lock on
listeners to protect access to their information. And finally, listener queues
(global and per proxy) are also protected by a lock. Here, because access to
these queues are unusal, we use the same lock for all queues instead of a global
one for the global queue and a lock per proxy for others.
We used to have bo_{get,put}_{chr,blk,str} to retrieve/send data to
the output area of a buffer, but not the equivalent ones for the input
area. This will be needed to copy uploaded data frames in HTTP/2.
Thus function returns the number of blocks. When a buffer is full and
properly aligned, buf->p loops back the beginning, and the test in the
code doesn't cover that specific case, so it returns two chunks, a full
one and an empty one. It's harmless but can sometimes have a small impact
on performance and definitely makes the code hard to debug.
This function returns true if the available buffer space wraps. This
will be used to detect if it's worth realigning a buffer when it lacks
contigous space.
bi_istput() injects the ist string into the input region of the buffer,
it will be used to feed small data chunks into the conn_stream. bo_istput()
does the same into the output region of the buffer, it will be used to send
data via the transport layer and assumes there's no input data.
In order to match known patterns in wrapping buffer, we'll introduce new
string manipulation functions for buffers. The new function b_isteq()
relies on an ist string for the pattern and compares it against any
location in the buffer relative to <p>. The second function bi_eat()
is specially designed to match input contents.
This simply reduces the amount of output data from the buffer after
they have been transferred, in a way that is more natural than by
fiddling with buf->o. b_del() was renamed to bi_del() to avoid any
ambiguity (it's not yet used).
These ones return respectively the pointer to the end of the buffer and
the distance between b->p and the end. These will simplify a bit some
new code needed to parse directly from a wrapping buffer.
swap_buffer is a global variable only used by buffer_slow_realign. So it has
been moved from global.h to buffer.c and it is allocated by init_buffer
function. deinit_buffer function has been added to release it. It is also used
to destroy the buffers' pool.
These functions was added in commit 637f8f2c ("BUG/MEDIUM: buffers: Fix how
input/output data are injected into buffers").
This patch fixes hidden bugs. When a buffer is full (buf->i + buf->o ==
buf->size), instead of returning 0, these functions can return buf->size. Today,
this never happens because callers already check if the buffer is full before
calling bi/bo_contig_space. But to avoid possible bugs if calling conditions
changed, we slightly refactored these functions.
The function buffer_contig_space is buggy and could lead to pernicious bugs
(never hitted until now, AFAIK). This function should return the number of bytes
that can be written into the buffer at once (without wrapping).
First, this function is used to inject input data (bi_putblk) and to inject
output data (bo_putblk and bo_inject). But there is no context. So it cannot
decide where contiguous space should placed. For input data, it should be after
bi_end(buf) (ie, buf->p + buf->i modulo wrapping calculation). For output data,
it should be after bo_end(buf) (ie, buf->p) and input data are assumed to not
exist (else there is no space at all).
Then, considering we need to inject input data, this function does not always
returns the right value. And when we need to inject output data, we must be sure
to have no input data at all (buf->i == 0), else the result can also be wrong
(but this is the caller responsibility, so everything should be fine here).
The buffer can be in 3 different states:
1) no wrapping
<---- o ----><----- i ----->
+------------+------------+-------------+------------+
| |oooooooooooo|iiiiiiiiiiiii|xxxxxxxxxxxx|
+------------+------------+-------------+------------+
^ <contig_space>
p ^ ^
l r
2) input wrapping
...---> <---- o ----><-------- i -------...
+-----+------------+------------+--------------------+
|iiiii|xxxxxxxxxxxx|oooooooooooo|iiiiiiiiiiiiiiiiiiii|
+-----+------------+------------+--------------------+
<contig_space> ^
^ ^ p
l r
3) output wrapping
...------ o ------><----- i -----> <----...
+------------------+-------------+------------+------+
|oooooooooooooooooo|iiiiiiiiiiiii|xxxxxxxxxxxx|oooooo|
+------------------+-------------+------------+------+
^ <contig_space>
p ^ ^
l r
buffer_contig_space returns (l - r). The cases 1 and 3 are correctly
handled. But for the second case, r is wrong. It points on the buffer's end
(buf->data + buf->size). It should be bo_end(buf) (ie, buf->p - buf->o).
To fix the bug, the function has been splitted. Now, bi_contig_space and
bo_contig_space should be used to know the contiguous space available to insert,
respectively, input data and output data. For bo_contig_space, input data are
assumed to not exist. And the right version is used, depending what we want to
do.
In addition, to clarify the buffer's API, buffer_realign does not return value
anymore. So it has the same API than buffer_slow_realign.
This patch can be backported in 1.7, 1.6 and 1.5.
When an entity tries to get a buffer, if it cannot be allocted, for example
because the number of buffers which may be allocated per process is limited,
this entity is added in a list (called <buffer_wq>) and wait for an available
buffer.
Historically, the <buffer_wq> list was logically attached to streams because it
were the only entities likely to be added in it. Now, applets can also be
waiting for a free buffer. And with filters, we could imagine to have more other
entities waiting for a buffer. So it make sense to have a generic list.
Anyway, with the current design there is a bug. When an applet failed to get a
buffer, it will wait. But we add the stream attached to the applet in
<buffer_wq>, instead of the applet itself. So when a buffer is available, we
wake up the stream and not the waiting applet. So, it is possible to have
waiting applets and never awakened.
So, now, <buffer_wq> is independant from streams. And we really add the waiting
entity in <buffer_wq>. To be generic, the entity is responsible to define the
callback used to awaken it.
In addition, applets will still request an input buffer when they become
active. But they will not be sleeped anymore if no buffer are available. So this
is the responsibility to the applet I/O handler to check if this buffer is
allocated or not. This way, an applet can decide if this buffer is required or
not and can do additional processing if not.
[wt: backport to 1.7 and 1.6]
The function buffer_contig_space() returns the contiguous space avalaible
to add data (at the end of the input side) while the function
hlua_channel_send_yield() needs to insert data starting at p. Here we
introduce a new function bi_space_for_replace() which returns the amount
of space that can be inserted at the head of the input side with one of
the buffer_replace* functions.
This patch proposes a function that returns the space avalaible after buf->p.
This function's name was poorly chosen and is confusing to the point of
being suspiciously used at some places. The operations it does always
consider the ability to forward pending input data before receiving new
data. This is not obvious at all, especially at some places where it was
used when consuming outgoing data to know if the buffer has any chance
to ever get the missing data. The code needs to be re-audited with that
in mind. Care must be taken with existing code since the polarity of the
function was switched with the renaming.