Commit Graph

18087 Commits

Author SHA1 Message Date
Willy Tarreau
d60269f93f DOC: design: add some thoughts about how to handle the update_list
This one is a real problem as it outlives the closure of the FD, and
some subtle changes are required.
2022-07-15 19:43:10 +02:00
Willy Tarreau
91a7c164b4 MINOR: task: move the niced_tasks counter to the thread group context
This one is only used as a hint to improve scheduling latency, so there
is no more point in keeping it global since each thread group handles
its own run q
2022-07-15 19:43:10 +02:00
Willy Tarreau
b0e7712fb2 MEDIUM: task/thread: move the task shared wait queues per thread group
Their migration was postponed for convenience only but now's time for
having the shared wait queues per thread group and not just per process,
otherwise the WQ lock uses a huge amount of CPU alone.
2022-07-15 19:43:10 +02:00
Willy Tarreau
82e378aa8a MINOR: fd/thread: get rid of thread_mask()
Since commit d2494e048 ("BUG/MEDIUM: peers/config: properly set the
thread mask") there must not remain any single case of a receiver that
is bound nowhere, so there's no need anymore for thread_mask().

We're adding a test in fd_insert() to make sure this doesn't happen by
accident though, but the function was removed and its rare uses were
replaced with the original value of the bind_thread msak.
2022-07-15 19:43:10 +02:00
Willy Tarreau
6bdf9452c0 MINOR: cli/threads: always bind CLI to thread group 1
When using multiple groups, the stats socket starts to emit errors and
it's not natural to have to touch the global section just to specify
"thread 1/all". Let's pre-attach these sockets to thread group 1. This
will cause errors when trying to change the group but this really is not
a problem for now as thread groups are not enabled by default. This will
make sure configs remain portable and may possibly be relaxed later.
2022-07-15 19:43:10 +02:00
Willy Tarreau
dcbd763fe9 MINOR: mworker/threads: limit the mworker sockets to group 1
As a side effect of commit 34aae2fd1 ("MEDIUM: mworker: set the iocb of
the socketpair without using fd_insert()"), a config may now refuse to
start if there are multiple groups configured because the default bind
mask may span over multiple groups, and it is not possible to force it
to work differently.

Let's just assign thread group 1 to the master<->worker sockets so that
the thread bindings automatically resolve to a single group. The same was
done for the master side of the socket even if it's not used. It will
avoid being forgotten in the future.
2022-07-15 19:43:10 +02:00
Willy Tarreau
5b09341c02 MEDIUM: cpu-map: replace the process number with the thread group number
The principle remains the same, but instead of having a single process
and ignoring extra ones, now we set the affinity masks for the respective
threads of all groups.

The doc was updated with a few extra examples.
2022-07-15 19:43:10 +02:00
Willy Tarreau
1b2b59bfa7 MINOR: thread: remove MAX_THREADS limitation
This one is now causing difficulties during the development phase and
it's going to disappear anyway, let's get rid of it.
2022-07-15 19:43:10 +02:00
Willy Tarreau
e5715bface MEDIUM: poller: disable thread-groups for poll() and select()
These old legacy pollers are not designed for this. They're still
using a shared list of events for all threads, this will not scale at
all, so there's no point in enabling thread-groups there. Modern
systems have epoll, kqueue or event ports and do not need these ones.

We arrange for failing at boot time, only when thread-groups > 1 so
that existing setups will remain unaffected.

If there's a compelling reason for supporting thread groups with these
pollers in the future, the rework should not be too hard, it would just
consume a lot of memory to have an fd_evts[] array per thread, but that
is doable.
2022-07-15 19:43:10 +02:00
Willy Tarreau
b1093c6ba2 MEDIUM: poller: program the update in fd_update_events() for a migrated FD
When an FD is migrated, all pollers program an update. That's useless
code duplication, and when thread groups will be supported, this will
require an extra round of locking just to verify the update_mask on
return. Let's just program the update direction from fd_update_events()
as it already does for closed FDs, this becomes more logical.
2022-07-15 19:43:10 +02:00
Willy Tarreau
1b927eb3c3 MEDIUM: proto: stop protocols under thread isolation during soft stop
protocol_stop_now() is called from do_soft_stop_now() running on any
thread that received the signal. The problem is that it will call some
listener handlers to close the FD, resulting in an fd_delete() being
called from the wrong group. That's not clean and we cannot even rely
on the thread mask to show up.

One interesting long-term approach could be to have kill queues for
FDs, and maybe we'll need them in the long run. However that doesn't
work well for listeners in this situation.

Let's simply isolate ourselves during this instant. We know we'll be
alone dealing with the close and that the FD will be instantly deleted
since not in use by any other thread. It's not the cleanest solution
but it should last long enough without causing trouble.
2022-07-15 19:43:10 +02:00
Willy Tarreau
7aa41196cf MEDIUM: debug/threads: make the lock debugging take tgroups into account
Since we have to use masks to verify owners/waiters, we have no other
option but to have them per group. This definitely inflates the size
of the locks, but this is only used for extreme debugging anyway so
that's not dramatic.

Thus as of now, all masks in the lock stats are local bit masks, derived
from ti->ltid_bit. Since at boot ltid_bit might not be set, we just take
care of this situation (since some structs are initialized under look
during boot), and use bit 0 from group 0 only.
2022-07-15 19:41:26 +02:00
Willy Tarreau
4d9888ca69 CLEANUP: fd: get rid of the __GET_{NEXT,PREV} macros
They were initially made to deal with both the cache and the update list
but there's no cache anymore and keeping them for the update list adds a
lot of obfuscation that is really not desired. Let's get rid of them now.

Their purpose was simply to get a pointer to fdtab[fd].update.{,next,prev}
in order to perform atomic tests and modifications. The offset passed in
argument to the functions (fd_add_to_fd_list() and fd_rm_from_fd_list())
was the offset of the ->update field in fdtab, and as it's not used anymore
it was removed. This also removes a number of casts, though those used by
the atomic ops have to remain since only scalars are supported.
2022-07-15 19:41:26 +02:00
Willy Tarreau
740038c8b9 MINOR: listener/config: make "thread" always support up to LONGBITS
The difference is subtle but in one place there was MAXTHREADS and this
will not work anymore once it goes over 64.
2022-07-15 19:41:26 +02:00
Willy Tarreau
acd644197f MEDIUM: config: remove the "process" keyword on "bind" lines
It was deprecated, marked for removal in 2.7 and was already emitting a
warning, let's get rid of it. Note that we've kept the keyword detection
to suggest to use "thread" instead.
2022-07-15 19:41:26 +02:00
Willy Tarreau
94f763b5e4 MEDIUM: config: remove deprecated "bind-process" directives from frontends
This was already causing a deprecation warning and was marked for removal
in 2.7, now it happens. An error message indicates this doesn't exist
anymore.
2022-07-15 19:41:26 +02:00
Willy Tarreau
91f7a1af34 CLEANUP: applet: remove the obsolete command context from the appctx
The "ctx" and "st2" parts in the appctx were marked for removal in 2.7
and were emulated using memcpy/memset etc for possible external code.
Let's remove this now.
2022-07-15 19:41:26 +02:00
Willy Tarreau
9a7fa90239 MINOR: cli/activity: add a thread number argument to "show activity"
The output of "show activity" can be so large that the output is visually
unreadable on a screen. Let's add an option to filter on the desired
column (actually the thread number), use "0" to report only the first
column (aggregated/sum/avg), and use "-1", the default, for the normal
detailed dump.
2022-07-15 19:41:26 +02:00
Willy Tarreau
dadf00e226 DEBUG: cli: add a new "debug dev deadlock" expert command
This command will create the requested number of tasks competing on a
lock, resulting in triggering the watchdog and crashing the process.
This will help stress the watchdog and inspect the lock debugging parts.
2022-07-15 19:41:26 +02:00
Willy Tarreau
dd75b64cdf MINOR: cli/streams: show a stream's tgid next to its thread ID
We now display both the global thread ID and the tgid/ltid pair so that
it's easier to match it with the FD.
2022-07-15 19:41:26 +02:00
Willy Tarreau
f0c86ddfe8 BUG/MEDIUM: debug: fix parallel thread dumps again
The previous attempt to fix thread dumps in commit 672972604 ("BUG/MEDIUM:
debug: fix possible hang when multiple threads dump at once") still had
some shortcomings. Sometimes parallel dumps are jerky essentially due to
the way that threads synchronize on startup and end. In addition the risk
of waiting forever for a stopped thread exists, and panics happening in
parallel to thread dumps are not more reliable either.

This commit revisits the state transitions so that all threads may request
a dump in parallel, that all of them wait for each other in the handler,
and that one thread is responsible for counting every other and checking
that the total matches the number of active threads.

Then for stopping there's a finishing phase that all threads wait for so
that none quits this area too early. Given that we now know the number of
participants to the dump, we can let them each decrement the counter when
leaving so that another dump may only start after the last participant
has completely left.

Now many thread dumps in parallel are running fine, so do panics. No
backport is needed as this was the result of the changes for thread
groups.
2022-07-15 19:41:26 +02:00
Willy Tarreau
55433f9b34 BUG/MINOR: debug: enter ha_panic() only once
Some panic dumps are mangled or truncated due to the watchdog firing at
the same time on multiple threads and calling ha_panic() simultaneously.
What may happen in this case is that the second one waits for the first
one to finish but as soon as it's done the second one resets the buffer
and dumps again, sometimes resetting the first one's dump. Also the first
one's abort() may trigger while the second one is currently dumping,
resulting in a full dump followed by a truncated one, leading to
confusion. Sometimes some lines appear in the middle of a dump as well.
It doesn't happen often and is easier to trigger by causing massive
deadlocks.

There's no reason for the process to resist to a panic, so we can safely
add a counter and no nothing on subsequent calls. Ideally we'd wait there
forever but as this may happen inside a signal handler (e.g. watchdog),
it doesn't always work, so the easiest thing to do is to return so that
the thread is interrupted as soon as possible and brought to the debug
handler to be dumped.

This should be backported, at least to 2.6 and possibly to older versions
as well.
2022-07-15 19:41:26 +02:00
Willy Tarreau
f15c75a2d3 BUG/MINOR: thread: use the correct thread's group in ha_tkillall()
In ha_tkillall(), the current thread's group was used to check for the
thread being running instead of using the target thread's group mask.
Most of the time it would not have any effect unless some groups are
uneven where it can lead to incomplete thread dumps for example.

No backport is needed, this is purely 2.7.
2022-07-15 19:41:26 +02:00
Willy Tarreau
52f238d326 BUG/MEDIUM: cli/threads: make "show threads" more robust on applets
Running several concurrent "show threads" in loops might occasionally
cause a segfault when trying to retrieve the stream from appctx_sc()
which may be null while the applet is finishing. It's not easy to
reproduce, it requires 3-5 sessions in parallel for about a minute
or so. The appctx_sc must be checked before passing it to sc_strm().

This must be backported to 2.6 which also has the bug.
2022-07-15 19:41:26 +02:00
Willy Tarreau
9b0f0d146f BUG/MINOR: threads: produce correct global mask for tgroup > 1
In thread_resolve_group_mask(), if a global thread number is passed
and it belongs to a group greater than 1, an incorrect shift resulted
in shifting that ID again which made it appear nowhere or in a wrong
group possibly. The bug was introduced in 2.5 with commit 627def9e5
("MINOR: threads: add a new function to resolve config groups and
masks") though the groups only starts to be usable in 2.7, so there
is no impact for this bug, hence no backport is needed.
2022-07-15 19:41:26 +02:00
Amaury Denoyelle
114c9c87ce MINOR: h3: implement graceful shutdown with GOAWAY
Implement graceful shutdown as specified in RFC 9114. A GOAWAY frame is
generated with stream ID to indicate range of processed requests.

This process is done via the release app protocol operation. The MUX
is responsible to emit the generated GOAWAY frame after app release. A
CONNECTION_CLOSE will be emitted once there is no unacknowledged STREAM
frames.
2022-07-15 15:56:13 +02:00
Amaury Denoyelle
d701039773 MINOR: h3: store control stream in h3c
Store a reference to the HTTP/3 control stream in h3c context.

This will be useful to implement GOAWAY emission without having to store
the control stream ID on opening.
2022-07-15 15:56:13 +02:00
Amaury Denoyelle
a154dc0290 MINOR: mux-quic: send one last time before release
Call qc_send() on qc_release(). This is mostly useful for application
protocol with a connection closing procedure. Most notably, this will be
useful to implement HTTP/3 GOAWAY emission.
2022-07-15 15:56:13 +02:00
Amaury Denoyelle
c49d5d1a4b CLEANUP: mux-quic: move qc_release()
This change is purely cosmetic. qc_release() function is moved just
before qc_io_cb(). It's cleaner as it brings it closer where it is used.
More importantly, this will be required to be able to use it in
qc_send() function.
2022-07-15 15:56:13 +02:00
Amaury Denoyelle
240b1b108b MEDIUM: quic: send CONNECTION_CLOSE on released MUX
Send a CONNECTION_CLOSE if the MUX has been released and all STREAM data
are acknowledged. This is useful to prevent a client from trying to use
a connection which have the upper layer closed.

To implement this a new function qc_check_close_on_released_mux() has
been added. It is called on QUIC MUX release notification and each time
a qc_stream_desc has been released.

This commit is associated with the previous one :
  MINOR: mux-quic/h3: schedule CONNECTION_CLOSE on app release
Both patches are required to prevent the risk of browsers stuck on
webpage loading if MUX has been released. On CONNECTION_CLOSE reception,
the client will reopen a new QUIC connection.
2022-07-15 15:56:13 +02:00
Amaury Denoyelle
069288b4c0 MINOR: mux-quic/h3: prepare CONNECTION_CLOSE on release
When MUX is released, a CONNECTION_CLOSE frame should be emitted. This
will ensure that the client does not use anymore a half-dead connection.

App protocol layer is responsible to provide the error code via release
callback. For HTTP/3 NO_ERROR is used as specified in RFC 9114. If no
release callback is provided, generic QUIC NO_ERROR code is used. Note
that a graceful shutdown is used : quic_conn must emit CONNECTION_CLOSE
frame when possible. This will be provided in another patch.

This change should limit the risk of browsers stuck on webpage loading
if MUX has been released. On CONNECTION_CLOSE reception, the client will
reopen a new QUIC connection.
2022-07-15 15:20:33 +02:00
Amaury Denoyelle
d666d740d2 MINOR: mux-quic: support app graceful shutdown
Adjust qcc_emit_cc_app() to allow the delay of emission of a
CONNECTION_CLOSE. This will only set the error code but the quic-conn
layer is not flagged for immediate close. The quic-conn will be
responsible to shut the connection when deemed suitable.

This change will allow to implement application graceful shutdown, such
as HTTP/3 with GOAWAY emission. This will allow to emit closing frames
on MUX release. Once all work is done at the lower layer, the quic-conn
should emit a CONNECTION_CLOSE with the registered error code.
2022-07-15 15:06:59 +02:00
Amaury Denoyelle
57e6db7021 MINOR: quic: define a generic QUIC error type
Define a new structure quic_err to abstract a QUIC error type. This
allows to easily differentiate a transport and an application error
code. This simplifies error transmission from QUIC MUX and H3 layers.

This new type is defined in quic_frame module. It is used to replace
<err_code> field in <quic_conn>. QUIC_FL_CONN_APP_ALERT flag is removed
as it is now useless.

Utility functions are defined to be able to quickly instantiate
transport, tls and application errors.
2022-07-15 14:57:49 +02:00
Amaury Denoyelle
41cd879383 CLEANUP: quic: clean up include on quic_frame-t.h
quic_frame-t.h and xprt_quic-t.h include themselves mutually. This may
cause some troubles later.

In fact, xprt_quic does not need to include quic_frame so remove this.
And as quic_frame is a generic source file which is included in multiple
places, it is useful to also remove the xprt_quic include in it. Use
forward declaration for this.
2022-07-15 14:54:24 +02:00
Amaury Denoyelle
72d86509f1 BUG/MINOR: quic: fix closing state on NO_ERROR code sent
Reception is disabled as soon as a CONNECTION_CLOSE emission is
required. An early return is done on qc_lstnr_pkt_rcv() to implement
this.

This condition is not functional if the error code sent is NO_ERROR
(0x00). To fix this, check the quic-conn flags instead of the
error code.

Currently this bug has no impact has NO_ERROR emission is not used.

This can be backported up to 2.6.
2022-07-13 15:33:15 +02:00
Willy Tarreau
672972604f BUG/MEDIUM: debug: fix possible hang when multiple threads dump at once
A bug in the thread dumper was introduced by commit 00c27b50c ("MEDIUM:
debug: make the thread dumper not rely on a thread mask anymore"). If
two or more threads try to trigger a thread dump exactly at the same
time, the second one may loop indefinitely trying to set the value to 1
while the other ones will wait for it to finish dumping before leaving.
This is a consequence of a logic change using thread numbers instead of
a thread mask, as threads do not need to see all other ones there anymore.

No backport is needed, this is only for 2.7.
2022-07-13 09:03:02 +02:00
Amaury Denoyelle
a5b5075211 MEDIUM: mux-quic: implement STOP_SENDING handling
Implement support for STOP_SENDING frame parsing. The stream is resetted
as specified by RFC 9000. This will automatically interrupt all future
send operation in qc_send(). A RESET_STREAM will be sent with the code
extracted from the original STOP_SENDING frame.
2022-07-11 16:45:04 +02:00
Amaury Denoyelle
843a1196b3 MEDIUM: mux-quic: implement RESET_STREAM emission
Implement functions to be able to reset a stream via RESET_STREAM.

If needed, a qcs instance is flagged with QC_SF_TO_RESET to schedule a
stream reset. This will interrupt all future send operations.

On stream emission, if a stream is flagged with QC_SF_TO_RESET, a
RESET_STREAM frame is generated and emitted to the transport layer. If
this operation succeeds, the stream is locally closed. If upper layer is
instantiated, error flag is set on it.
2022-07-11 16:45:04 +02:00
Amaury Denoyelle
20d1f84ce4 MINOR: mux-quic: use stream states to mark as detached
Adjust condition to detach a qcs instance : if the stream is not locally
close it is not directly free. This should improve stream closing by
ensuring that either FIN or a RESET_STREAM is sent before destroying it.
2022-07-11 16:41:10 +02:00
Amaury Denoyelle
38e6006da1 MINOR: mux-quic: define basic stream states
Implement a basic state machine to represent stream lifecycle. By
default a stream is idle. It is marked as open when sending or receiving
the first data on a stream.

Bidirectional streams has two states to represent the closing on both
receive and send channels. This distinction does not exists for
unidirectional streams which passed automatically from open to close
state.

This patch is mostly internal and has a limited visible impact. Some
behaviors are slightly updated :
* closed streams are garbage collected at the start of io handler
* send operation is interrupted if a stream is close locally

Outside of this, there is no functional change. However, some additional
BUG_ON guards are implemented to ensure that we do not conduct invalid
operation on a stream. This should strengthen the code safety. Also,
stream states are displayed on trace which should help debugging.
2022-07-11 16:37:21 +02:00
Amaury Denoyelle
b68559a9aa MINOR: mux-quic: support stream opening via MAX_STREAM_DATA
MAX_STREAM_DATA can be used as the first frame of a stream. In this
case, the stream should be opened, if it respects flow-control limit. To
implement this, simply replace plain lookup in stream tree by
qcc_get_qcs() at the start of the parsing function. This automatically
takes care of opening the stream if not already done.

As specified by RFC 9000, if MAX_STREAM_DATA is receive for a
receive-only stream, a STREAM_STATE_ERROR connection error is emitted.
2022-07-11 16:24:03 +02:00
Amaury Denoyelle
57161b7d0c MINOR: mux-quic: do not ack STREAM frames on unrecoverable error
Improve return path for qcc_recv() on STREAM parsing. It returns 0 on
success. On error, a non-zero value is returned which indicates to the
caller that the packet containing the frame should not be acknowledged.

When qcc_recv() generates a CONNECTION_CLOSE or RESET_STREAM, either
directly or via qcc_get_qcs(), an error is returned which ensure that no
acknowledgement is generated. This required an adjustment on
qcc_get_qcs() API which now returns a success/error code. The stream
instance is returned via a new out argument.
2022-07-11 16:24:03 +02:00
Amaury Denoyelle
5fbb8691d4 MINOR: mux-quic: filter send/receive-only streams on frame parsing
Extend the function qcc_get_qcs() to be able to filter send/receive-only
unidirectional streams. A connection error STREAM_STATE_ERROR is emitted
if this new filter does not match.

This will be useful when various frames handlers are converted with
qcc_get_qcs(). Depending on the frame type, it will be easy to filter on
the forbidden stream types as specified in RFC 9000.
2022-07-11 16:24:03 +02:00
Amaury Denoyelle
4561f84ad4 MINOR: mux-quic: implement qcs_alert()
Implement a simple function to notify a possible subscriber or wake up
the upper layer if a special condition happens on a stream. For the
moment, this is only used to replace identical code in
qc_wake_some_streams().
2022-07-11 16:24:03 +02:00
Amaury Denoyelle
392e94e985 MINOR: mux-quic: add traces on frame parsing functions
Add traces for parsing functions for MAX_DATA and MAX_STREAM_DATA.
2022-07-11 16:24:03 +02:00
Amaury Denoyelle
c1a6dfd477 MINOR: mux-quic: rename stream purge function
Rename qc_release_detached_streams() to qc_purge_streams(). The aim is
to have a more generic name. It's expected to complete this function to
add other criteria to purge dead streams.

Also the function documentation has been corrected. It does not return a
number of streams. Instead it is a boolean value, to true if at least
one stream was released.
2022-07-11 16:24:03 +02:00
Amaury Denoyelle
b143723411 REORG: mux-quic: rename stream initialization function
Rename both qcc_open_stream_local/remote() functions to
qcc_init_stream_local/remote(). This change is purely cosmetic. It will
reduces the ambiguity with the soon to be implemented OPEN states for
QCS instances.
2022-07-11 16:24:03 +02:00
Amaury Denoyelle
e53b489826 BUG/MEDIUM: mux-quic: fix server chunked encoding response
QUIC MUX was not able to correctly deal with server response using
chunked transfer-encoding. All data will be transfered correctly to the
client but the FIN bit is missing. The transfer will never stop as the
client will wait indefinitely for the FIN bit.

This bug happened because the HTX message representing a chunked encoded
payload contains a final empty block with the EOM flag. However,
emission is skipped by QUIC MUX if there is no data to transfer. To fix
this, the condition was completed to ensure that there is no need to
send the FIN signal. If this is false, data emission will proceed even
if there is no data : this will generate an empty QUIC STREAM frame with
FIN set which will mark the end of the transfer.

To ensure that a FIN STREAM frame is sent only one time,
QC_SF_FIN_STREAM is resetted on send confirmation from the transport in
qcc_streams_sent_done().

This bug was reproduced when dealing with chunked transfer-encoding
response for the HTTP server.

This must be backported up to 2.6.
2022-07-11 16:21:52 +02:00
Willy Tarreau
a88e8bf428 BUILD: http: silence an uninitialized warning affecting gcc-5
When building with gcc-5, one can see this warning:

  src/http_fetch.c: In function 'smp_fetch_meth':
  src/http_fetch.c:356:6: warning: 'htx' may be used uninitialized in this function [-Wmaybe-uninitialized]
     sl = http_get_stline(htx);
        ^

It's wrong since the only way to reach this code is to have met the
same condition a few lines before and initialized the htx variable.
The reason in fact is that the same test happens on different variables
of distinct types, so the compiler possibly doesn't know that the
condition is the same. Newer gcc versions do not have this problem.
Let's just move the assignment earlier and have the exact same test,
as it's sufficient to shut this up. This may have to be backported
to 2.6 since the code is the same there.
2022-07-10 14:13:48 +02:00
Willy Tarreau
481edaceb8 BUILD: debug: silence warning on gcc-5
In 2.6, 8a0fd3a36 ("BUILD: debug: work around gcc-12 excessive
-Warray-bounds warnings") disabled some warnings that were reported
around the the BUG() statement. But the -Wnull-dereference warning
isn't known from gcc-5, it only arrived in gcc-6, hence makes gcc-5
complain loudly that it doesn't know this directive. Let's just
condition this one to gcc-6.
2022-07-10 14:13:48 +02:00