DOC: management: document the "trace" and "show trace" commands

At the moment the subsystem is still not complete and the various modules
do not yet produce traces (some dirty experimental code for H2 exists) but
this aims at easing a broad adoption.

Among the missing elements, we can enumerate the lack of configuration
of the sinks (e.g. it's still not possible to change their output format
nor enable/disable timestamps) and since timestamps are not availalbe in
the sinks, they are not collected nor passed by the traces.
This commit is contained in:
Willy Tarreau 2019-08-22 20:06:04 +02:00
parent d8b99edeed
commit f909c91e8a

View File

@ -2563,6 +2563,16 @@ show schema json
verifiers may be used to verify the output of "show info json" and "show
stat json" against the schema.
show trace [<source>]
Show the current trace status. For each source a line is displayed with a
single-character status indicating if the trace is stopped, waiting, or
running. The output sink used by the trace is indicated (or "none" if none
was set), as well as the number of dropped events in this sink, followed by a
brief description of the source. If a source name is specified, a detailed
list of all events supported by the source, and their status for each action
(report, start, pause, stop), indicated by a "+" if they are enabled, or a
"-" otherwise. All these events are independent and an event might trigger
a start without being reported and conversely.
shutdown frontend <frontend>
Completely delete the specified frontend. All the ports it was bound to will
@ -2593,6 +2603,151 @@ shutdown sessions server <backend>/<server>
maintenance mode, for instance. Such terminated sessions are reported with a
'K' flag in the logs.
trace
The "trace" command alone lists the trace sources, their current status, and
their brief descriptions. It is only meant as a menu to enter next levels,
see other "trace" commands below.
trace 0
Immediately stops all traces. This is made to be used as a quick solution
to terminate a debugging session or as an emergency action to be used in case
complex traces were enabled on multiple sources and impact the service.
trace <source> event [ [+|-|!]<name> ]
Without argument, this will list all the events supported by the designated
source. They are prefixed with a "-" if they are not enabled, or a "+" if
they are enabled. It is important to note that a single trace may be labelled
with multiple events, and as long as any of the enabled events matches one of
the events labelled on the trace, the event will be passed to the trace
subsystem. For example, receiving an HTTP/2 frame of type HEADERS may trigger
a frame event and a stream event since the frame creates a new stream. If
either the frame event or the stream event are enabled for this source, the
frame will be passed to the trace framework.
With an argument, it is possible to toggle the state of each event and
individually enable or disable them. Two special keywords are supported,
"none", which matches no event, and is used to disable all events at once,
and "any" which matches all events, and is used to enable all events at
once. Other events are specific to the event source. It is possible to
enable one event by specifying its name, optionally prefixed with '+' for
better readability. It is possible to disable one event by specifying its
name prefixed by a '-' or a '!'.
One way to completely disable a trace source is to pass "event none", and
this source will instantly be totally ignored.
trace <source> level [<level>]
Without argument, this will list all detail levels for this source, and the
current one will be indicated by a star ('*') prepended in front of it. With
an argument, this will change the detail level to the specified level. Detail
levels are a form of filters that are applied before reporting the events.
These filters are used to report a level of detail suitable for the use case.
For example a developer might need to know precisely where in the code an
HTTP header was considered invalid while the end user may not even care about
this header's validity at all. There are currently 5 distinct levels for a
trace :
user this will report information that are suitable for use by a
regular haproxy user who wants to observe his traffic.
Typically some HTTP requests and responses will be reported
without much detail. Most sources will set this as the
default level to ease operations.
payload in addition to what is reported at the "user" level, it will
also display more detailed information about the contents,
which may be HTTP headers, or unencoded contents.
proto in addition to what is reported at the "payload" level, it
also display protocol-level information. This can for
example be the raw data exchanged over the wire after
encoding or frames received before decoding.
state in addition to what is reported at the "proto" level, it
will also display state transitions (or failed transitions)
which happen in parsers, so this will show attempts to
perform an operation while the "proto" level only shows
the final operation.
developer it reports everything available, which can include advanced
information such as "breaking out of this loop" that are
only relevant to a developer trying to understand a bug that
only happens once in a while in field.
It is highly recommended to always use the "user" level only and switch to
other levels only if instructed to do so by a developer. Also it is a good
idea to first configure the events before switching to higher levels, as it
may save from dumping many lines if no filter is applied.
trace <source> lock [criterion]
Without argument, this will list all the criteria supported by this source
for lock-on processing, and display the current choice by a star ('*') in
front of it. Lock-on means that the source will focus on the first matching
event and only stick to the criterion which triggered this event, and ignore
all other ones until the trace stops. This allows for example to take a trace
on a single connection or on a single stream. The following criteria are
supported by some traces, though not necessarily all, since some of them
might not be available to the source :
backend lock on the backend that started the trace
connection lock on the connection that started the trace
frontend lock on the frontend that started the trace
listener lock on the listener that started the trace
nothing do not lock on anything
server lock on the server that started the trace
session lock on the session that started the trace
thread lock on the thread that started the trace
In addition to this, each source may provide up to 4 specific criteria such
as internal states or connection IDs. For example in HTTP/2 it is possible
to lock on the H2 stream and ignore other streams once a strace starts.
When a criterion is passed in argument, this one is used instead of the
other ones and any existing tracking is immediately terminated so that it can
restart with the new criterion. The special keyword "nothing" is supported by
all sources to permanently disable tracking.
trace <source> { pause | start | stop } [ [+|-|!]event]
Without argument, this will list the events enabled to automatically pause,
start, or stop a trace for this source. These events are specific to each
trace source. With an argument, this will either enable the event for the
specified action (if optionally prefixed by a '+') or disable it (if
prefixed by a '-' or '!'). The special keyword "now" is not an event and
requests to take the action immediately. The keywords "none" and "any" are
supported just like in "trace event".
The 3 supported actions are respectively "pause", "start" and "stop". The
"pause" action enumerates events which will cause a running trace to stop and
wait for a new start event to restart it. The "start" action enumerates the
events which switch the trace into the waiting mode until one of the start
events appears. And the "stop" action enumerates the events which definitely
stop the trace until it is manually enabled again. In practice it makes sense
to manually start a trace using "start now" without caring about events, and
to stop it using "stop now". In order to capture more subtle event sequences,
setting "start" to a normal event (like receiving an HTTP request) and "stop"
to a very rare event like emitting a certain error, will ensure that the last
captured events will match the desired criteria. And the pause event is
useful to detect the end of a sequence, disable the lock-on and wait for
another opportunity to take a capture. In this case it can make sense to
enable lock-on to spot only one specific criterion (e.g. a stream), and have
"start" set to anything that starts this criterion (e.g. all events which
create a stream), "stop" set to the expected anomaly, and "pause" to anything
that ends that criterion (e.g. any end of stream event). In this case the
trace log will contain complete sequences of perfectly clean series affecting
a single object, until the last sequence containing everything from the
beginning to the anomaly.
trace <source> sink [<sink>]
Without argument, this will list all event sinks available for this source,
and the currently configured one will have a star ('*') prepended in front
of it. Sink "none" is always available and means that all events are simply
dropped, though their processing is not ignored (e.g. lock-on does occur).
Other sinks are available depending on configuration and build options, but
typically "stdout" and "stderr" will be usable in debug mode, and in-memory
ring buffers should be available as well. When a name is specified, the sink
instantly changes for the specified source. Events are not changed during a
sink change. In the worst case some may be lost if an invalid sink is used
(or "none"), but operations do continue to a different destination.
9.4. Master CLI
---------------