BUG/MINOR: server: shut down streams under thread isolation

Since the beginning of thread support, the shutdown of streams attached
to a server was run under the server's lock, but that's not sufficient.
It indeed turns out that shutting down streams (either from the CLI using
"shutdown sessions server XXX" or due to "on-error shutdown-sessions")
iterates over all the streams to shut them down, but stream_shutdown()
has no way to protect its actions against concurrent actions from the
stream itself on another thread, and streams offer no such provisions
anyway.

The impact is some rare but possible crashes when shutting down streams
from the CLI in cmopetition with high server traffic. The probability
is low enough to mark it minor, though it was observed in the field.

At least since 2.4 the streams are arranged in per-thread lists, so it
likely would be possible using the event subsystem to delegate these
events to dedicated per-thread tasks which would address the problem.
But server streams don't get killed often enough to justify such extra
complexity, so better just run the loop under thread isolation.

It also shows that the internal API could probably be improved to
support a lighter thread exclusion instead of full isolation: various
places want to only exclude one thread and here it could work. But
again there's no point doing this for now.

This patch should be backported to all stable branches. It's important
to carefully check that this srv_shutdowns_streams() function is never
called itself under isolation in older versions (though at first glance
it looks OK).
This commit is contained in:
Willy Tarreau 2024-09-21 19:35:35 +02:00
parent e77c73316a
commit b500e84e24

View File

@ -2015,7 +2015,9 @@ static int srv_has_streams(struct server *srv)
* code in <why>, which must be one of SF_ERR_* indicating the reason for the
* shutdown.
*
* Must be called with the server lock held.
* Note: the function will switch to thread isolation (due to shutdown_stream()
* modifying the streams directly), and commonly called with the server's lock
* held as well though not needed anymore (this is not a bug).
*/
void srv_shutdown_streams(struct server *srv, int why)
{
@ -2023,10 +2025,14 @@ void srv_shutdown_streams(struct server *srv, int why)
struct mt_list back;
int thr;
thread_isolate();
for (thr = 0; thr < global.nbthread; thr++)
MT_LIST_FOR_EACH_ENTRY_LOCKED(stream, &srv->per_thr[thr].streams, by_srv, back)
if (stream->srv_conn == srv)
stream_shutdown(stream, why);
thread_release();
}
/* Shutdown all connections of all backup servers of a proxy. The caller must