each command's -s option, but instead of parsing gdb output, member
values are read directly from memory, so the command is much faster
for 1-, 2-, 4-, and 8-byte members.
(Alexandr_Terekhov@epam.com)
"bt" option, which can be accessed using "bt -o", and where "bt -O"
will toggle the original and optional methods as the default. The
original backtrace method has adopted two changes/features from
the optional method:
(1) ORIG_X0 and SYSCALLNO registers are not displayed in kernel
exception frames.
(2) stackframe entry text locations are modified to be the PC
address of the branch instruction instead of the subsequent
"return" PC address contained in the stackframe link register.
Accordingly, these are the essential differences between the original
and optional methods:
(1) optional: the backtrace will start with the IPI exception frame
located on the process stack.
(2) original: the starting point of backtraces for the active,
non-crashing, tasks, will continue to have crash_save_cpu()
on the IRQ stack as the starting point.
(3) optional: the exception entry stackframe adjusted to be located
farther down in the IRQ stack.
(4) optional: bt -f does not display IRQ stack memory above the
adjusted exception entry stackframe.
(5) optional: may display "(Next exception frame might be wrong)".
(takahiro.akashi@linaro.org, anderson@redhat.com)
all tasks for evidence of stack overflows. It does so by verifying
the thread_info.task address, ensuring the thread_info.cpu value is
a valid cpu number, and checking the end of the stack for the
STACK_END_MAGIC value.
(anderson@redhat.com)
layout and associated KASLR support that was introduced in Linux 4.6.
The kernel text and static data has been moved from unity-mapped
memory into the vmalloc region, and its start address can be
randomized if CONFIG_RANDOMIZE_BASE is configured. Related support
is being put into the kernel's kdump code, the kexec-tools package,
and makedumpfile(8); with that in place, the analysis of Linux 4.6
ARM64 dumpfiles with or without KASLR enabled should work normally
by entering "crash vmlinux vmcore". On live systems, Linux 4.6 ARM64
kernels will only work automatically if CONFIG_RANDOMIZE_BASE is not
configured. Unfortunately, if CONFIG_RANDOMIZE_BASE is configured
on a live system, two --machdep command line arguments are required,
at least for the time being. The arguments are:
--machdep phys_offset=<base physical address>
--machdep kimage_voffset=<kernel kimage_voffset value>
Without the patch, any attempt to analyze a Linux 4.6 ARM64 kernel
fails during initialization with a stream of "read error" messages
followed by "crash: vmlinux and vmcore do not match!".
(takahiro.akashi@linaro.org)
are specified by the QEMU mem-path argument of a memory-backend-file
object. This allows the running of a live crash session against a
QEMU guest from the host machine. In this example, the /tmp/MEM file
on a QEMU host represents the guest's physical memory:
$ qemu-kvm ...other-options... \
-object memory-backend-file,id=MEM,size=128m,mem-path=/tmp/MEM,share=on \
-numa node,memdev=MEM -m 128
and a live session run can be run against the guest kernel like so:
$ crash <path-to-guest-vmlinux> live:/tmp/MEM@0
By prepending the ramdump image name with "live:", the crash session will
act as if it were running a normal live session.
(oleg@redhat.com)
option searches for data structures of a specified size or within a
range of specified sizes. The -m option searches for data structures
that contain a member of a given type. If a structure contains
another structure, the members of the embedded structure will also
be subject to the search. The type string may be a substring of the
data type name. The output displays the size and name of the data
structure.
(Alexandr_Terekhov@epam.com, anderson@redhat.com)
which were introduced in Linux 4.5 by this commit:
commit 132cd887b5c54758d04bf25c52fa48f45e843a30
arm64: Modify stack trace and dump for use with irq_stack
Without the patch, if an active task was operating on its per-cpu
IRQ stack on dumpfiles generated by kdump, its backtrace would start
at the exception frame that was laid down on the process stack.
This patch also adds support for "bt -E" to search IRQ stacks for
exception frames, and the "mach" command displays the addresses
of each per-cpu IRQ stack.
(anderson@redhat.com)
on/off with the "set" command. When set to "on", gdb's printing of
arrays will be set to "pretty", so that the display of each array
element will consume one line.
(anderson@redhat.com)
conjunction with "-s", and requires that the "start" address is the
address of a list_head, or other similar list linkage structure whose
first member points to the next linkage structure. The "-l <offset>"
argument is the offset of the embedded list linkage structure in the
specified "-s" data structure; it can be either a number of bytes or
expressed in "struct.member" format.
(anderson@redhat.com)
"dis -s" option if the kernel source code is not located in the
standard location that is compiled into the kernel's debuginfo data.
The directory argument should point to the top-level directory of the
kernel source tree.
(anderson@redhat.com)
line number that is associated with a specified text location,
followed by a source code listing if it is available on the host
machine. The line associated with the text location will be marked
with an asterisk; depending upon gdb's internal "listsize" variable,
several lines will precede the marked location. If a "count" argument
is entered, it specifies the number of source code lines to be
displayed after the marked location; otherwise the remaining source
code of the containing function will be displayed.
(anderson@redhat.com)
option is context-sensitive, similar to the the regular "files"
command when used without an argument, but replaces the FILE and
DENTRY columns with I_MAPPING and NRPAGES columns that reflect
each open file's inode.i_mapping address_space structure address,
and the address_space.nrpages count within it; this shows how
many of each open file's pages are currently in the system's
page cache. The "files -p <inode>" option takes the address
of an inode, and dumps all of its pages that are currently in the
system's page cache, borrowing the "kmem -p" page structure output.
(yangoliver@gmail.com)
state. Without the patch:
(1) The "ps" command's ST column shows "??" for tasks in the
TASK_WAKING state.
(2) The "ps" command's ST column shows "??" for tasks in the
TASK_PARKED state in Linux 3.14 and later kernels.
(3) The STATE field of the initial system banner and the "set"
command are incorrect if the task state has the TASK_WAKING,
TASK_WAKEKILL modifier, or TASK_PARKED bits set in Linux 3.14
and later kernels.
(4) The "foreach DE" task identifier fails if a task with a PID
number of 0xDE (222) exists.
(5) The "foreach" command's "SW", "PA", "TR" and "DE" task
identifiers inadvertently select all tasks in kernel versions
that do not have those states.
(6) The "help -t" output would display incorrect values for the
TASK_WAKEKILL, TASK_WAKING and TASK_PARKED states in Linux 3.14
and later kernels.
Lastly, support for the TASK_NOLOAD modifier introduced in Linux 4.2
has been added to STATE field of the "set" command and the initial
system banner.
(anderson@redhat.com)
"union", "task", "list" and "tree" commands. If a specified
structure member contains an embedded structure, the output may
be restricted to just the embedded structure by expressing the
.member argument as "member.member". If a specified structure
member is an array, the output may be restricted to a single array
element by expressing the .member argument as "member[index]".
Furthermore, these embedded member specifications may extend beyond
one level deep, for example, by expressing the member argument as
"member.member.member", or "member[index].member".
(Alexandr_Terekhov@epam.com, anderson@redhat.com)
but it allows the user to specify the page struct members to be
displayed. The option takes a comma-separated list of one or
more page struct members, which will be displayed following the
page structure address. The "flags" member will always be expressed
in hexadecimal format, and the "_count" and "_mapcount" members will
always be expressed in decimal format. Otherwise, all other members
will be displayed in hexadecimal format unless the current output
radix is 10 and the member is a signed/unsigned integer. Members
that are data structures may be specified by the data structure's
member name, or expanded to specify a member of that data structure.
For example, "-m lru" refers to a list_head data structure, in which
case both the list_head.next and list_head.prev pointer values will
be displayed; if "-m lru.next" is specified, just the list_head.next
value will be displayed.
(atomlin@redhat.com, anderson@redhat.com)
data of specified cpus. It can be used in conjunction with all runq
command options. The cpus must be specified in a comma- and/or
dash-separated list; for examples, "3", "1,8,9", "1-23", or "1,8-15".
(anderson@redhat.com)
based upon the task_struct's sched_entity.last_arrival. Without the
patch, it indicates that either the task_struct's last_run or
timestamp value are used.
(anderson@redhat.com)
initial support is restricted to 32-bit MIPS kernels that are
configured as little-endian. With respect to dumpfile types, only
ELF vmcores are recognized. In addition to building crash as a
32-bit MIPS binary, it is also possible to build crash as an x86
binary on an x86 or x86_64 host so that crash analysis of MIPS
dumpfiles can be performed on an x86 or x86_64 host. The x86 binary
can be built by entering "make target=MIPS" for the initial build;
subsequent builds with MIPS support can be accomplished by entering
"make" alone.
(rabin@rab.in)
adds support for displaying the new s390x vector registers. For
ELF dumps, the registers are taken from the VX ELF notes; for s390
dumps. the registers are taken from memory. The option produces the
same output as the -a option, but also displays the vector registers
for all active tasks.
(holzheu@linux.vnet.ibm.com)
the mm_struct address pointer in its task_struct is NULL'd out, and
as a result, the "vm" command looks like this:
crash> vm
PID: 4563 TASK: ffff88049863f500 CPU: 8 COMMAND: "postgres"
MM PGD RSS TOTAL_VM
0 0 0k 0k
However, the mm_struct address can be retrieved from the task's
kernel stack and entered manually with this option, which allows the
"vm" command to attempt to dump the virtual memory data of the task.
It may, or may not, work, depending upon how far the virtual memory
deconstruction has proceeded. This option only verifies that the
address entered is from the "mm_struct" slab cache, and that
its mm_struct.mm_count is non-zero.
(qiaonuohan@cn.fujitsu.com, anderson@redhat.com)
information, which will be appended to the traditional output of
the command. For example:
crash> kmem -i
PAGES TOTAL PERCENTAGE
TOTAL MEM 1965332 7.5 GB ----
FREE 78080 305 MB 3% of TOTAL MEM
USED 1887252 7.2 GB 96% of TOTAL MEM
SHARED 789954 3 GB 40% of TOTAL MEM
BUFFERS 110606 432.1 MB 5% of TOTAL MEM
CACHED 1212645 4.6 GB 61% of TOTAL MEM
SLAB 146563 572.5 MB 7% of TOTAL MEM
TOTAL SWAP 1970175 7.5 GB ----
SWAP USED 5 20 KB 0% of TOTAL SWAP
SWAP FREE 1970170 7.5 GB 99% of TOTAL SWAP
COMMIT LIMIT 2952841 11.3 GB ----
COMMITTED 1150595 4.4 GB 38% of TOTAL LIMIT
The COMMIT LIMIT and COMMITTED information is similar to that
displayed by the CommitLimit and Committed_AS lines in /proc/meminfo.
(atomlin@redhat.com)
of network devices with respect the network namespace of the current
context, or that of a task specified by the optional "pid" or "task"
argument. The former "net -n <address>" option that translates
an IPv4 address expressed as a decimal or hexadecimal value into a
standard numbers-and-dots notation has been changed to "net -N".
(ws@parallels.com)
"virsh dump --memory-only --format <compression-type>" command,
where the compression-type is either "kdump-zlib", "kdump-lzo" or
"kdump-snappy". Without the patch, if an x86_64 guest kernel was loaded
with a non-zero "phys_base", the "--machdep phys_base=<offset>" command
line option was required as a workaround or the crash session would fail
with the warning message "WARNING: cannot read linux_banner string"
followed by the fatal error message "crash: vmlinux and <dumpfile name>
do not match!".
(anderson@redhat.com)
information. If the "tainted_mask" symbol exists, the option will
show its hexadecimal value and translate each bit set to the symbolic
letter of the taint type. On kernels prior to 2.6.28 which had the
"tainted" symbol, only its hexadecimal value is shown. The relevant
kernel sources should be consulted for the meaning of the letter(s)
or hexadecimal bit value(s).
(anderson@redhat.com)
the header of compressed kdumps, and the new DUMP_ELF_INCOMPLETE flag
in the header of ELF kdumps. If the makedumpfile(8) facility fails
to complete the creation of compressed or ELF kdump vmcore files
due to ENOSPC or other error, it will mark the vmcore as incomplete.
If either flag is set, the crash utility will issue a warning that
the dumpfile is known to be incomplete during initialization, just
prior to the system banner display. When reads are attempted on
missing data, a read error will be returned. As an alternative,
zero-filled data will be returned if the "--zero_excluded" command
line flag is used, or the "zero_excluded" runtime variable is set
to "on". In either case, the read errors or zero-filled memory
may cause the crash session to fail entirely, cause commands to
fail, or may result in other unpredictable runtime behavior.
(anderson@redhat.com, zhouwj-fnst@cn.fujitsu.com)
either "show" (the default) or "hide". When set to "hide", certain
command output associated with offline cpus will be hidden from view,
and the output will indicate that the cpu is "[OFFLINE]". The new
variable can be set during invocation on the crash command line via
the option "--offline [show|hide]". During runtime, or in a .crashrc
or other crash input file, the variable can be set by entering
"set offline [show|hide]". The commands or options that are affected
when the variable is set to "hide" are as follows:
o On X86_64 machines, the "bt -E" option will not search exception
stacks associated with offline cpus.
o On X86_64 machines, the "mach" command will append "[OFFLINE]"
to the addresses of IRQ and exception stacks associated with
offline cpus.
o On X86_64 machines, the "mach -c" command will not display the
cpuinfo_x86 data structure associated with offline cpus.
o The "help -r" option has been fixed so as to not attempt to
display register sets of offline cpus from ELF kdump vmcores,
compressed kdump vmcores, and ELF kdump clones created by
"virsh dump --memory-only".
o The "bt -c" option will not accept an offline cpu number.
o The "set -c" option will not accept an offline cpu number.
o The "irq -s" option will not display statistics associated with
offline cpus.
o The "timer" command will not display hrtimer data associated
with offline cpus.
o The "timer -r" option will not display hrtimer data associated
with offline cpus.
o The "ptov" command will append "[OFFLINE]" when translating a
per-cpu address offset to a virtal address of an offline cpu.
o The "kmem -o" option will append "[OFFLINE]" to the base per-cpu
virtual address of an offline cpu.
o The "kmem -S" option in CONFIG_SLUB kernels will not display
per-cpu data associated with offline cpus.
o When a per-cpu address reference is passed to the "struct"
command, the data structure will not be displayed for offline
cpus.
o When a per-cpu symbol and cpu reference is passed to the "p"
command, the data will not be displayed for offline cpus.
o When the "ps -[l|m]" option is passed the optional "-C [cpus]"
option, the tasks queued on offline cpus are not shown.
o The "runq" command and the "runq [-t/-m/-g/-d]" options will not
display runqueue data for offline cpus.
o The "ps" command will replace the ">" active task indicator to
a "-" for offline cpus.
The initial system information banner and the "sys" command will
display the total number of cpus as before, but will append the count
of offline cpus. Lastly, a fix has been made for the initialization
time determination of the maximum number of per-cpu objects queued
in a CONFIG_SLAB kmem_cache so as to continue checking all cpus
higher than the first offline cpu. These changes in behavior are not
dependent upon the setting of the crash "offline" variable.
(qiaonuohan@cn.fujitsu.com)
TASK_PARKED state in Linux 3.9 and later kernels. Without the patch,
the command's "ST" column entry for parked tasks shows "??". The
state column will now show "PA", and the foreach command will accept
"PA" as a "state" argument.
(anderson@redhat.com)
the number identifying the command. However, unlike the similar "r"
pseudo-command, if the number is a command name in the user's PATH,
maintain the current behavior and execute that command.
(anderson@redhat.com)
more "ramdump" files may be entered on the crash command line
in an ordered pair format consisting of the RAM dump filename
and the starting physical address expressed in hexadecimal,
connected with an ampersand:
$ crash vmlinux ramdump@address [ramdump@address]
A temporary ELF header will be created in /var/tmp, and the
combination of the header and the ramdump file(s) will be handled
like a normal ELF vmcore. The ELF header will only exist during
the crash session. If desired, an optional "-o <filename>"
may be entered to create a permanent ELF vmcore file from the
ramdump file(s).
(vinayakm.list@gmail.com, paawan1982@yahoo.com, anderson@redhat.com)
for Linux 3.13 and later kernels if the option is attempted, and in
the "help mount" output, similar to the deprecated "mount -d" option.
(anderson@redhat.com)
option to "runq -t", but which displays the amount of time that the
active task on each cpu has been running, expressed in a format
consisting of days, hours, minutes, seconds and milliseconds.
(anderson@redhat.com)
option to "ps -l", but which translates the task timestamp value from
a decimal or hexadecimal nanoseconds value into a more human-readable
string consisting of the number of days, hours, minutes, seconds and
milliseconds that have elapsed since the task started executing on a
cpu. More accurately described, it is the time difference between
the timestamp copied from the per-cpu runqueue clock when the task
last started executing compared to the most current value of the
per-cpu runqueue clock.
(anderson@redhat.com, bud.brown@redhat.com)
In addition, a new "ps -C <cpu-specifier>" option has been added
that can only be used with "ps -l" and "ps -m", which sorts the
global task list into per-cpu blocks; the cpu-specifier uses the
standard comma or dash separated list, expressed as "-C 1,3,5",
"-C 1-3", "-C 1,3,5-7,10", or "-Call" or "-Ca" for all cpus.
(anderson@redhat.com)
of the active task on one or more cpus. The cpus must be specified
in a comma- and/or dash-separated list; for examples ""3", "1,8,9",
"1-23", or "1,8,9-14". Similar to "bt -a", the option is only
applicable with crash dumps.
(atomlin@redhat.com)
The hash queue is used for gathering and verifying lists, and the
original count of 128 may be overwhelmed if a list is extremely
large. For example, on a 256GB system with 192GB of free pages,
the "kmem -f" command takes hours to complete; with this patch,
the time is reduced to a few minutes. In addition, a new command
line option "--hash <count>" has been added to allow a user to
override the default hash queue head count of 32768.
(anderson@redhat.com)
that that are configured with CONFIG_RANDOMIZE_BASE. When set to
"auto", the KASLR relocation value will be determined automatically
by comparing the "_stext" symbol value compiled into the vmlinux file
with the _stext symbol value stored in kdump vmcoreinfo data; on live
systems the comparison will be made with the "_stext" symbol value
that is found in /proc/kallsyms.
(ahonig@google.com, anderson@redhat.com)
X86_64 kernels that are configured with CONFIG_RANDOMIZE_BASE.
The offset value must be equal to the difference between the
symbol values compiled into the vmlinux file and their relocated
value.
(ahonig@google.com, anderson@redhat.com)
a slab object, consisting of the slab cache name and the address
value, separated by a colon, and encompassed in brackets:
[slab-cache-name:address]
Enhanced the "bt -F" option such that if "-F" is entered twice,
and if the stack frame contents reference a slab cache object, both
the slab cache name and the stack contents will be displayed within
brackets.
Enhanced the "rd -S" option such that if "-S" is entered twice,
and if the memory contents reference a slab cache object, both the
slab cache name and the memory contents will be displayed within
brackets.
(anderson@redhat.com)