These both flags are set after releasing the applet, in
appctx_shut(). Concretly, it means the applet is shutdown for reads and
writes. Once set, the applet's I/O handler was no longer called. Tests on
these flags are useless. There is no chance to match them.
This case does not exist yet with the H1 multiplexer, but applets may decide to
not produce data if there is not enough room in the destination buffer (the
applet's outbuf or the opposite SE buffer). It is true for the stats applets for
instance. However this case is not properly handled when the zero-copy
forwarding is in-use.
To fix the issue, the se_done_ff() function was modified to return the number of
bytes really forwarded and to subs for sends if nothing was forwarded while the
zero-copy forwarding was blocked by the producer. On the applet side, we take
care to block the zero-copy forwarding if the applet requests more room. At the
end, zero-copy forwarding is unblocked if something was forwarded.
This way, it is now possible for the stats applet to report a full buffer and
block the zero-copy forwarding, even if the buffer is not really full, by
requesting more room.
No backport needed.
An issue was introduced when zero-copy forwarding was added to the stats and
cache applets. There is no test to be sure the upper layer is ready to use
the zero-copy forwarding. So these applets refuse to deliver the response
into the applet's output buffer if the zero-copy forwarding is supported by
the opposite endpoint. It is especially an issue when a filter, like the
compression, is in-use on the response channel.
Because of this bug, the response is not delivered and the applet is woken
up in loop to produce data.
To fix the issue, an appctx flag was added, APPCTX_FL_FASTFWD, to know when
the zero-copy forwarding is in-use. We rely on this flag to not fill the
outbuf in the applet's I/O handler.
No backport needed.
This simplifies a bit the stats applet. Because the CLI part was not
refactored yet to use the applet's buffers, there are 3 ways to produce
data:
* the HTX message for the HTTP stats when zero-copy forwarding is not
used
* raw data in the opposite endpoint buffer for the HTTP stats when
zero-copy forwarding is used
* the channel buffer when the CLI "show stat" command is evaluated
There is already a dedicated function to take care to copy data at the right
place. There is now also a dedicated function to check us the output buffer
is almost full.
Commit 91b77c1632 ("MEDIUM: mux-h1: Support zero-copy forwarding for chunks with
an unknown size") was recently pushed but it contains 3 bugs. The first one is
during the nego. The extra size reserved for the CRLF at the end of the chunk
must not be added to the offset value. Indeed, the CRLF will be appended after
the data and not prepended to them.
The second one, still during the nego, is an integer overflow when the available
room in the output buffer is computed.
Finally, the last one is when the chunk itself is formatted. This part was
totally buggy if the output buffer was not empty at the beginning.
No backport needed.
A packet is considered as reordered when it is detected as lost because its packet
number is above the largest acknowledeged packet number by at least the
packet reordering threshold value.
Add ->nb_reordered_pkt new quic_loss struct member at the same location that
the number of lost packets to count such packets.
Should be backported to 2.6.
Let's say that the largest packet number acknowledged by the peer is #10, when inspecting
the non already acknowledged packets to detect if they are lost or not, this is the
case a least if the difference between this largest packet number and and their
packet numbers are bigger or equal to the packet reordering threshold as defined
by the RFC 9002. This latter must not be less than QUIC_LOSS_PACKET_THRESHOLD(3).
Which such a value, packets #7 and oldest are detected as lost if non acknowledged,
contrary to packet number #8 or #9.
So, the packet loss detection is very sensitive to such a network characteristic
where non acknowledged packets are distant from each others by their packet number
differences.
Do not use this static value anymore for the packet reordering threshold which is used
as a criteria to detect packet loss. In place, make it depend on the difference
between the number of the last transmitted packet and the number of the oldest
one among the packet which are still in flight before being inspected to be
deemed as lost.
Add new tune.quic.reorder-ratio setting to apply a ratio in percent to this
dynamic packet reorder threshold.
Should be backported to 2.6.
The new formula for K CUBIC which arrives with RFC 9438 is as follows:
K = cubic_root((W_max - cwnd_epoch) / C)
Note that W_max is c->last_w_max, and cwnd_epoch is c->cwnd when entering
quic_cubic_update() just after a congestion event.
Must be backported as far as 2.6.
The formula for K CUBIC calculation is as follows:
K = cubic_root(W_max * (1 - beta_quic) / C).
Note that this does not match the comment. But the aim of this patch is to not
hide a bug inside another patch to update this K CUBIC calculation.
The unit of C is bytes/s^3 (or segments/s^3). And we want to store K as
milliseconds. So, the conversion inside the cubic_root() to convert seconds in
milliseconds is wrong. The unit used here is bytes/(ms/1000)^3 or
bytes*1000^3/ms^3. That said, it is preferable to compute K as seconds, then
convert to milliseconds as done by this patch.
Must be backported as far as 2.6.
The CLI command "update ssl ocsp-response" was forcefully removing an
OCSP response from the update tree regardless of whether it used to be
in it beforehand or not. But since the main OCSP upate task works by
removing the entry being currently updated from the update tree and then
reinserting it when the update process is over, it meant that in the CLI
command code we were modifying a structure that was already being used.
These concurrent accesses were not properly locked on the "regular"
update case because it was assumed that once an entry was removed from
the update tree, the update task was the only one able to work on it.
Rather than locking the whole update process, an "updating" flag was
added to the certificate_ocsp in order to prevent the "update ssl
ocsp-response" command from trying to update a response already being
updated.
An easy way to reproduce this crash was to perform two "simultaneous"
calls to "update ssl ocsp-response" on the same certificate. It would
then crash on an eb64_delete call in the main ocsp update task function.
This patch can be backported up to 2.8.
Released version 3.0-dev3 with the following main changes :
- DOC: configuration: clarify http-request wait-for-body
- BUG/MAJOR: ssl_sock: Always clear retry flags in read/write functions
- MINOR: h3: add traces for stream sending function
- BUG/MEDIUM: h3: do not crash on invalid response status code
- BUG/MEDIUM: qpack: allow 6xx..9xx status codes
- BUG/MEDIUM: quic: fix crash on invalid qc_stream_buf_free() BUG_ON
- CLEANUP: log: deinitialization of the log buffer in one function
- BUG/MINOR: h1: Don't support LF only at the end of chunks
- BUG/MEDIUM: h1: Don't support LF only to mark the end of a chunk size
- MINOR: ssl: add HAVE_SSL_0RTT constant
- MINOR: ssl: rename HA_OPENSSL_HAVE_0RTT_SUPPORT constant to HAVE_SSL_0RTT_QUIC
- MEDIUM: ssl/quic: always compile the ssl_conf.early_data test
- DOC: httpclient: add dedicated httpclient section
- BUG/MINOR: h1-htx: properly initialize the err_pos field
- BUG/MEDIUM: h1: always reject the NUL character in header values
- CLEANUP: h1: remove unused function h1_measure_trailers()
- BUG/MINOR: ssl/quic: fix 0RTT define
- MINOR: mux-quic: prepare for earlier flow control update
- MINOR: mux-quic: define a flow control related type
- MEDIUM: mux-quic: limit stream flow control on snd_buf
- MEDIUM: mux-quic: limit conn flow control on snd_buf
- MINOR: mux-quic: remove unneeded sent-offset fields
- MINOR: mux-quic: check fctl during STREAM frame build
- MAJOR: mux-quic: remove intermediary Tx buffer
- MEDIUM: mux-quic: simplify sending API
- MEDIUM: mux-quic: release Tx buf on too small room
- MEDIUM: mux-quic: properly handle conn Tx buf exhaustion
- MINOR: mux-quic: realign Tx buffer if possible
- CLEANUP: connection: remove obsolete comment in header file
- OPTIM: connection: progressive hash for conn_calculate_hash()
- MINOR: tcp_act: fix alphabetical ordering of tcp request content actions
- MINOR: tcp-act: Rename "set-{mark,tos}" to "set-fc-{mark,tos}"
- MINOR: hlua: Rename set_{tos, mark} to set_fc_{tos, mark}
- MEDIUM: tcp-act: <expr> support for set-fc-{mark,tos} actions
- MEDIUM: tcp-act/backend: support for set-bc-{mark,tos} actions
- MINOR: stats: Be able to access to registered stats modules from anywhere
- MEDIUM: stats: Be able to access a specific field into a stats module
- MINOR: promex: Add a param to override the description when a metric is dumped
- MINOR: promex: Add info in the promex context to dump extra counters
- MEDIUM: promex: Dump frontends extra counters if requested
- MEDIUM: promex: Dump backends extra counters if requested
- MEDIUM: promex: Dump servers extra counters if requested
- MEDIUM: promex: Dump listeners extra counters if requested
- DOC: promex: Add documentation about extra-counters
- MINOR: promex: Always limit the number of labels dumped for each metric
- MEDIUM: promex: Simplify the context using generic pointers for restart points
- MINOR: promex: Remove unsued htx parameter when a metric is dumped
- MEDIUM: promex: Add a registration mechanism to support modules
- MEDIUM: promex: Dump metrics of registered modules with a way to filter them
- MEDIUM: promex/stick-table: Dump stick-table metrics via a promex module
- MEDIUM: promex/resolvers: Dump resolvers metrics via a promex module
- MINOR: promex: Rename dump functions to use the right wording
- MINOR: promex: Always pass the final name and description to promex_dmp_ts()
- MEDIUM: promex: Add support for filters on metric names
- REGTESTS: promex: Adapt script to be less verbose
- MINOR: compiler: add a new DO_NOT_FOLD() macro to prevent code folding
- MINOR: debug: make sure calls to ha_crash_now() are never merged
- MINOR: debug: make ABORT_NOW() store the caller's line number when using abort
- BUG/MINOR: diag: always show the version before dumping a diag warning
- BUG/MINOR: diag: run the final diags before quitting when using -c
- MINOR: acl: add extra diagnostics about suspicious string patterns
- BUG/MINOR: quic: Wrong ack ranges handling when reaching the limit.
- BUILD: quic: Variable name typo inside a BUG_ON().
- DOC: config: fix typo for '%ms' log format alternative
- DOC: config: fix ordering for "txn.*" fetches
- MINOR: stream: add "txn.redispatch" fetch
- BUILD: debug: remove leftover parentheses in ABORT_NOW()
- MINOR: debug: make BUG_ON() catch build errors even without DEBUG_STRICT
- BUG/MINOR: ssl: Fix error message after ssl_sock_load_ocsp call
- MINOR: debug: support passing an optional message in ABORT_NOW()
- MINOR: debug: add an optional message argument to the BUG_ON() family
- DEBUG: make the "debug dev {debug|warn|check}" command print a message
- CLEANUP: quic: Code clarifications for QUIC CUBIC (RFC 9438)
- BUG/MINOR: quic: fix possible integer wrap around in cubic window calculation
- MINOR: quic: Stop using 1024th of a second.
- CI: github: abandon asan matrix.py helper
- CI: ssl: add yet another OpenSSL download fallback
- DOC: install: clarify WolfSSL chroot requirements
- MINOR: task: Move wait_event in the task header file
- MINOR: stconn: Be able to detect applets using HTX
- MINOR: stconn: Explicitly use an appctx to attach a stconn on it
- MINOR: stconn: Be prepared to handle error when a SC is attached to an applet
- MINOR: applet: Add dedicated IN/OUT buffers for appctx
- MINOR: applet: Add traces to debug receive/send and block/wake events
- MINOR: applet: Add support for callback functions to exchange data with channels
- MINOR: applet: Implement default functions to exchange data with channels
- MEDIUM: stconn: Add functions to handle applets I/O from the SC layer
- MEDIM: applet: Add the applet handler based on IN/OUT buffers
- MINOR: applet: Show IN/OUT buffers in trace messages when used
- MINOR: applet: Add flags on the appctx and stop abusing its state
- MINIOR: applet: Add flags to deal with ends of input, ends of stream and errors
- MINOR: applet: Remove appctx state field to only used the flags
- MINOR: applet: Add an appctx flag to report shutdown to applets
- MEDIUM: applet: Use appctx flags to report EOS/EOI/ERROR to SE
- MINOR: applet: Add callback function to deal with zero-copy forwarding
- MEDIUM: applet: Add support for zero-copy forwarding from an applet
- MINOR: applet: Automatically handle applets having more data for the stream
- MEDIUM: stats: Don't interrupt processing on partial post
- MAJOR: stats: Update HTTP stats applet to handle its own buffers
- MEDIUM: cache: Temporarily remove zero-copy forwarding support
- MAJOR: cache: Update HTTP cache applet to handle its own buffers
- MAJOR: cache: Send cached objects using zero-copy forwarding
- MINOR: stconn: Add support for flags during zero-copy forwarding negotiation
- MINOR: mux-h1: Be able to define the length of a chunk size when it is prepended
- MEDIUM: stconn: Nofify requested size during zero-copy forwarding nego is exact
- MINOR: mux-h1: Stop zero-copy forwarding during nego for too big requested size
- MEDIUM: mux-h1: Support zero-copy forwarding for chunks with an unknown size
- MAJOR: stats: Send stats dump over HTTP using zero-copy forwarding
- MEDIUM: applet: Simplify a bit API to exchange data with applets
- MINOR: cache: Remove unsed .data_sent field from the cache applet context
- MINOR: applet: Use an option to disable zero-copy forwarding for all applets
- MINOR: applet: Identify applets using their own buffers via a flag
- BUG/MINOR: ssl: Duplicate ocsp update mode when dup'ing ckch
- MINOR: ssl: Use OCSP_CERTID instead of ckch_store in ckch_store_build_certid
- BUG/MINOR: ssl: Clear the ckch instance when deleting a crt-list line
- BUG/MEDIUM: ocsp: Separate refcount per instance and per store
- BUG/MINOR: ssl: Destroy ckch instances before the store during deinit
- BUG/MINOR: ssl: Reenable ocsp auto-update after an "add ssl crt-list"
- REGTESTS: ssl: Add OCSP related tests
- REGTESTS: ssl: Fix empty line in cli command input
- DOC: install: recommend pcre2
- DOC: config: fix misplaced "txn.conn_retries"
- DOC: config: fix typos for "bytes_{in,out}"
- DOC: config: fix misplaced "bytes_{in,out}"
- DOC: config: add more custom log format table alternatives
- MINOR: stream: rename "txn.redispatch" to "txn.redispatched"
- MINOR: sample: implement bc_{be,srv}_queue samples
- BUG/MINOR: mux-h2: count rejected DATA frames against the connection's flow control
- MINOR: mux-h2: count excess of CONTINUATION frames as a glitch
- MINOR: mux-h2: count late reduction of INITIAL_WINDOW_SIZE as a glitch
- DOC: internal: update missing data types in peers-v2.0.txt
- MEDIUM: stick-tables: add a new stored type for glitch_cnt and glitch_rate
- MINOR: session: add the necessary functions to update the per-session glitches
- MEDIUM: mux-h2: update session trackers with number of glitches
- BUG/MINOR: server/cli: add missing LF at the end of certain notice/error lines
- BUG/MINOR: vars/cli: fix missing LF after "get var" output
- BUG/MEDIUM: cli: fix once for all the problem of missing trailing LFs
- MINOR: cli: make sure to always print a pending message after release()
- MINOR: cli: always reset the applet task's timeout
- MINOR: cli: add a new "wait" command to wait for a certain delay
- BUG/MINOR: applet: Always release empty appctx buffers after processing
- MINOR: server: split the server deletion code in two parts
- MINOR: cli/wait: make the wait command support a more detailed help message
- MINOR: cli/wait: also support an unrecoverable failure status
- MINOR: cli/wait: also pass up to 4 arguments to the external conditions
- MINOR: cli/wait: add a condition to wait on a server to become unused
- CI: Update to actions/cache@v4
- BUILD: address a few remaining calloc(size, n) cases
- BUG/MEDIUM: pool: fix rare risk of deadlock in pool_flush()
As reported by github user @JB0925 in issue #2427, there is a possible
crash in pool_flush(). The problem is that if the free_list is not empty
in the first test, and is empty at the moment the xchg() is performed,
for example because another thread called it in parallel, we place a
POOL_BUSY there that is never removed later, causing the next thread to
wait forever.
This was introduced in 2.5 with commit 2a4523f6f ("BUG/MAJOR: pools: fix
possible race with free() in the lockless variant"). It has probably
very rarely been detected, because:
- pool_flush() is only called when stopping is set
- the function does nothing if global pools are disabled, which is
the case on most modern systems with a fast memory allocator.
It's possible to reproduce it by modifying __task_free() to call
pool_flush() on 1% of the calls instead of only when stopping.
The fix is quite simple, it consists in moving the zeroing of the
entry in the break path after verifying that the entry was not already
busy.
This must be backported wherever commit 2a4523f6f is.
In issue #2427 Ilya reports that gcc-14 rightfully complains about
sizeof() being placed in the left term of calloc(). There's no impact
but it's a bad pattern that gets copy-pasted over time. Let's fix the
few remaining occurrences (debug.c, halog, udp-perturb).
This can be backported to all branches, and the irrelevant parts dropped.
The "wait" command now supports a condition, "srv-unused", which waits
for the designated server to become totally unused, indicating that it
is removable. Upon each wakeup it calls srv_check_for_deletion() to
verify if conditions are met, if not if it's recoverable, or if it's
not recoverable, and proceeds according to this, never waiting for a
final decision longer than the configured delay.
The purpose is to make it possible to remove servers from the CLI after
waiting for their sessions to be terminated:
$ socat -t5 /path/to/socket - <<< "
disable server px/srv1
shutdown sessions server px/srv1
wait 2s srv-unused px/srv1
del server px/srv1"
Or even wait for connections to terminate themselves:
$ socat -t70 /path/to/socket - <<< "
disable server px/srv1
wait 1m srv-unused px/srv1
del server px/srv1"
Conditions will need to have context, arguments etc from the command line.
Since these will vary with time (otherwise we wouldn't wait), let's just
pass them as text (possibly pre-processed). We're starting with 4 strings
that are expected to be allocated by strdup() and are always sent to free()
upon release.
Since we'll support waiting for an action to succeed or permanently
fail, we need the ability to return an unrecoverable failure. Let's
add CLI_WAIT_ERR_FAIL for this. A static error message may be placed
into ctx->msg to report to the user why the failure is unrecoverable.
We'll need to be able to verify whether or not a server may be deleted.
For now, both the verification and the action are performed in the same
function, at once under thread isolation. The goal here is to extract
the verification code into a new function that will perform these checks,
return a status between success/recoverable/non-recoverable failure, and
will also return a message for the caller.
When an applet is using its own buffers, it is important to release them, if
empty, after processing to recycle unsued buffers. It is not a leak because
these buffers are necessarily released when the applet is released. But this
leads to an excess of buffer allocations.
No need to backport.
This allows to insert delays between commands, i.e. to collect a same
set of metrics at a fixed interval. E.g:
$ socat -t20 /path/to/socket <<< "show activity; wait 10s; show activity"
The goal will be to extend the feature to optionally support waiting on
certain conditions. For this reason the struct definitions and enums were
placed into cli-t.h.
The CLI applet doesn't make use of its timeout at all, only the stream
does. That's a wonder because it allows any command's I/O handler to
trivially set a wakeup timer by simply touching the task's ->expire
field, and the I/O handler will automatically be woken up again. The
only condition for this is that we properly take care of clearing that
timeout whenever we finish processing a command and switch back to the
PROMPT state. That's what this patch does.
If a release handler produces a final message, it's currently left
pending in the CLI context and needs another I/O event to be dumped
because immediately after calling ->release, we check for states
OUTPUT and above and we wait until more data arrives.
This patch adds continue statement to go back to the loop immediately
after leaving the release handler in order to attempt to emit the
output message.
At this point it's not sure whether any release handlers are producing
messages, so it's probably not needed to backport this.
Some commands are still missing their trailing LF, and very few were even
already spotted in the past emitting more than one. The risk of missing
this LF is particularly high, especially when tests are run in non-
interactive mode where the output looks good at first glance. The problem
is that once run in interactive mode, the missing empty line makes the
command not being complete, and scripts can wait forever.
Let's tackle the problem at its root: messages emitted at the end must
always end with an LF and we know some miss it. Thus, in cli_output_msg()
we now start by removing the trailing LFs from the string, and we always
add exactly one. This way the trailing LF added by correct functions are
silently ignored and all functions are now correct.
This would need to be progressively backported to all supported versions
in order to address them all at once, though the risk of breaking a legacy
script relying on the wrong output is never zero. At first it should at
least go as far as the lastest LTS (2.8), and maybe another one depending
on user demands. Note that it also requires previous patch ("BUG/MINOR:
vars/cli: fix missing LF after "get var" output") because it fixes a test
for a bogus output for "get var" in a VTC.
"get var" on the CLI was also missing an LF, and the vtest as well, so
that fixing only the code breaks the vtest. This must be backported to
2.4 as the issue was brought with commit c35eb38f1d ("MINOR: vars/cli:
add a "get var" CLI command to retrieve global variables").
Some cli_err(), cli_msg() or even ha_error() etc are missing the trailing
LF, which breaks the continuity of the CLI parsing: the extra LF that serves
to mark the end of the command is in fact taken as the missing LF and no
extra one is added.
This patch adds the missing LF on identified messages. It might be worth
trying to proceed in a more generic way with this, given the amount of
code that is possibly at risk.
We now update the session's tracked counters with the observed glitches.
In order to avoid incurring a high cost, e.g. if many small frames contain
issues, we batch the updates around h2_process_demux() by directly passing
the difference. Indeed, for now all functions that increment glitches are
called from h2_process_demux(). If that were to change, we'd just need to
keep the value of the last synced counter in the h2c struct instead of the
stack.
The regtest was updated to verify that the 3rd client that does not cause
issue still sees the counter resulting from client 2's mistakes. The rate
is also verified, considering it shouldn't fail since the period is very
long (1m).
This adds a new pair of stored types in the stick-tables:
- glitch_cnt
- glitch_rate
These keep count of the number of glitches reported on a front connection,
in order to decide how to act with a badly defective client or a potential
attacker. For now nothing updates these counters, but all the infrastructure
needed to configure, update and retrieve them was added, including the doc.
No regtest was added yet since they're not filled yet.
This is apparently the only location where the stored data types are
documented, but it was quite outdated as it stopped at gpc1 rate. This
patch adds the missing types (up to and including gpc_rate).
It's quite uncommon for a client to decide to change the connection's
initial window size after the settings exchange phase, unless it tries
to increase it. One of the impacts depending is that it updates all
streams, so it can be expensive, depending on the stacks, and may even
be used to construct an attack. For this reason, we now count a glitch
when this happens.
A test with h2spec shows that it triggers 9 across a full test.
Here we consider that if a HEADERS frame is made of more than 4 fragments
whose average size is lower than 1kB, that's very likely an abuse so we
count a glitch per 16 fragments, which means 1 glitch per 1kB frame in a
16kB buffer. This means that an abuser sending 1600 1-byte frames would
increase the counter by 100, and that sending 100 headers per request in
individual frames each results in a count of ~7 to be added per request.
A test consisting in sending 100M requests made of 101 frames each over
a connection resulted in ~695M glitches to be counted for this connection.
Note that no special care is taken to avoid wrapping since it already takes
a very long time to reach 100M and there's no particular impact of wrapping
here (roughly 1M/s).
RFC9113 clarified a point regarding the payload from DATA frames sent to
closed streams. It must always be counted against the connection's flow
control. In practice it should really have no practical effect, but if
repeated upload attempts are aborted, this might cause the client's
window to progressively shrink since not being ACKed.
It's probably not necessary to backport this, unless another patch
depends on it.
The fetch will return true if the stream was redispatched: this is a
past action, thus we rename the fetch to better reflect its true
meaning and prevent confusions.
Documentation was updated.
While at it, the fetch was moved from internal states section to Layer 4
section, which is where it belongs.
No backport needed unless 92b2edb (" MINOR: stream: add "txn.redispatch"
fetch") gets backported.
Counters are managed at the stream level and also work in TCP mode.
They were found in the Layer 7 section, moving them to the Layer 4
section instead.
This could be backported in 2.9 with fa0a304f3 ("DOC: config: add an
index of sample fetch keywords")
An extra space was placed at the start of "bytes_out" description,
and dconv was having a hard time to properly render the text in html
format because of that.
Finally, remove an extra line feed.
This should be backported in 2.9 with c7424a1ba ("MINOR: samples:
implement bytes_in and bytes_out samples")
txn.conn_retries was inserted in the internal states sample table, but it
should belong to Layer 4 sample table instead (SMP_USE_L4SRV)
This should be backported in 2.9 with fa0a304f3 ("DOC: config: add an
index of sample fetch keywords")
The 'set ssl cert' command was failing because of empty lines in the
contents of the PEM file used to perform the update.
We were also missing the issuer in the newly created ckch_store, which
then raised an error when committing the transaction.
If a certificate that has an OCSP uri is unused and gets added to a
crt-list with the ocsp auto update option "on", it would not have been
inserted into the auto update tree because this insertion was only
working on the first call of the ssl_sock_load_ocsp function.
If the configuration used a crt-list like the following:
cert1.pem *
cert2.pem [ocsp-update on] *
Then calling "del ssl crt-list" on the second line and then reverting
the delete by calling "add ssl crt-list" with the same line, then the
cert2.pem would not appear in the ocsp update list (can be checked
thanks to "show ssl ocsp-updates" command).
This patch ensures that in such a case we still perform the insertion in
the update tree.
This patch can be backported up to branch 2.8.
The ckch_store's free'ing function might end up calling
'ssl_sock_free_ocsp' if the corresponding certificate had ocsp data.
This ocsp cleanup function expects for the 'refcount_instance' member of
the certificate_ocsp structure to be 0, meaning that no live
ckch instance kept a reference on this certificate_ocsp structure.
But since in ckch_store_free we were destroying the ckch_data before
destroying the linked instances, the BUG_ON would fail during a standard
deinit. Reversing the cleanup order fixes the problem.
Must be backported to 2.8.
With the current way OCSP responses are stored, a single OCSP response
is stored (in a certificate_ocsp structure) when it is loaded during a
certificate parsing, and each ckch_inst that references it increments
its refcount. The reference to the certificate_ocsp is actually kept in
the SSL_CTX linked to each ckch_inst, in an ex_data entry that gets
freed when he context is freed.
One of the downside of this implementation is that is every ckch_inst
referencing a certificate_ocsp gets detroyed, then the OCSP response is
removed from the system. So if we were to remove all crt-list lines
containing a given certificate (that has an OCSP response), the response
would be destroyed even if the certificate remains in the system (as an
unused certificate). In such a case, we would want the OCSP response not
to be "usable", since it is not used by any ckch_inst, but still remain
in the OCSP response tree so that if the certificate gets reused (via an
"add ssl crt-list" command for instance), its OCSP response is still
known as well. But we would also like such an entry not to be updated
automatically anymore once no instance uses it. An easy way to do it
could have been to keep a reference to the certificate_ocsp structure in
the ckch_store as well, on top of all the ones in the ckch_instances,
and to remove the ocsp response from the update tree once the refcount
falls to 1, but it would not work because of the way the ocsp response
tree keys are calculated. They are decorrelated from the ckch_store and
are the actual OCSP_CERTIDs, which is a combination of the issuer's name
hash and key hash, and the certificate's serial number. So two copies of
the same certificate but with different names would still point to the
same ocsp response tree entry.
The solution that answers to all the needs expressed aboved is actually
to have two reference counters in the certificate_ocsp structure, one
for the actual ckch instances and one for the ckch stores. If the
instance refcount becomes 0 then we remove the entry from the auto
update tree, and if the store reference becomes 0 we can then remove the
OCSP response from the tree. This would allow to chain some "del ssl
crt-list" and "add ssl crt-list" CLI commands without losing any
functionality.
Must be backported to 2.8.
When deleting a crt-list line through a "del ssl crt-list" call on the
CLI, we ended up free'ing the corresponding ckch instances without fully
clearing their contents. It left some dangling references on other
objects because the attache SSL_CTX was not deleted, as well as all the
ex_data referenced by it (OCSP responses for instance).
This patch can be backported up to branch 2.4.
The only useful information taken out of the ckch_store in order to copy
an OCSP certid into a buffer (later used as a key for entries in the
OCSP response tree) is the ocsp_certid field of the ckch_data structure.
We then don't need to pass a pointer to the full ckch_store to
ckch_store_build_certid or even any information related to the store
itself.
The ckch_store_build_certid is then converted into a helper function
that simply takes an OCSP_CERTID and converts it into a char buffer.
When calling ckchs_dup (during a "set ssl cert" CLI command), if the
modified store had OCSP auto update enabled then the new certificate
would not keep the previous update mode and would not appear in the auto
update list.
This patch can be backported to 2.8.
At the beginning of the 3.0-dev cycle, the zero-copy forwarding support was
added only for the cache applet with an option to disable it. This was a
hack, waiting for a better integration with applets. It is now possible to
implement the zero-copy forwarding for any applets. So the specific option
for the cache applet was renamed to be used for all applets. And this option
is now also checked for the stats applet.
Concretely, 'tune.cache.zero-copy-forwarding' was renamed to
'tune.applet.zero-copy-forwarding'.