and c65eacbe290b8141554c71b2c94489e73ade8c8d, which have introduced a
new CONFIG_THREAD_INFO_IN_TASK configuration. This configuration
moves each task's thread_info structure from the base of its kernel
stack into its task_struct. Without the patch, the crash session
fails during initialization with the error "crash: invalid structure
member offset: thread_info_cpu".
(anderson@redhat.com)
whose device driver uses the blk-mq interface. Currently "dev -d"
always displays 0 in all fields for the blk-mq disk because blk-mq
does not increment/decrement request_list.count[2] on I/O creation
and I/O completion. The following values are used in blk-mq in such
situations:
- I/O creation: blk_mq_ctx.rq_dispatched[2]
- I/O completion: blk_mq_ctx.rq_completed[2]
So, we can get the counter of in-progress I/Os as follows:
in progress I/Os == rq_dispatched - rq_completed
This patch displays the result of the above calculation for the disk.
It determines that the device driver uses blk-mq if
request_queue.mq_ops is not NULL. The "DRV" field is displayed as
"N/A(MQ)" if the value for in-flight in the device driver does not
exist for blk-mq.
(m.mizuma@jp.fujitsu.com)
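
The calculation above can be sketched in C as follows. This is a
minimal illustration, not the patch itself: the mq_ctx_counters
structure and the mq_in_progress() function are hypothetical stand-ins
for the blk_mq_ctx counters, and summing over several contexts before
subtracting is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the blk_mq_ctx counters. */
struct mq_ctx_counters {
    unsigned long rq_dispatched[2];   /* incremented on I/O creation */
    unsigned long rq_completed[2];    /* incremented on I/O completion */
};

/*
 * Return the number of in-progress I/Os for one slot of the
 * two-element counter arrays, accumulated over nr_ctx contexts:
 *     in progress == rq_dispatched - rq_completed
 */
unsigned long mq_in_progress(const struct mq_ctx_counters *ctx,
                             size_t nr_ctx, int slot)
{
    unsigned long dispatched = 0, completed = 0;
    size_t i;

    assert(ctx != NULL && (slot == 0 || slot == 1));
    for (i = 0; i < nr_ctx; i++) {
        dispatched += ctx[i].rq_dispatched[slot];
        completed += ctx[i].rq_completed[slot];
    }
    return dispatched - completed;
}
```

Accumulating both totals first and subtracting once keeps the result
correct even when the dispatch and completion of a request were
counted in different contexts.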
as is done with 32-bit ARM. Without the patch, a crash session may
fail during the "gathering module symbol data" stage with a message
similar to "crash: store_module_symbols_v2: total: 15 mcnt: 16".
(takahiro.akashi@linaro.org)
that are not declared as "char *" types. Change two prior direct
callers of resizebuf() to use RESIZEBUF(), and fix two prior users of
RESIZEBUF() to correctly calculate the need to resize their buffers.
(anderson@redhat.com)
in which Thomas Gleixner redesigned the kernel timer mechanism to
switch to a non-cascading wheel. Without the patch, the "timer"
command fails with the message "timer: zero-size memory allocation!
(called from <address>)".
(anderson@redhat.com)
each command's -s option, but instead of parsing gdb output, member
values are read directly from memory, so the command is much faster
for 1-, 2-, 4-, and 8-byte members.
(Alexandr_Terekhov@epam.com)
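
Reading a small member directly from memory can be illustrated with
the sketch below. It assumes that a raw copy of the structure is
already in a local buffer, that the member's offset and size are
known, and that the buffer has the same byte order as the host. The
read_member() name is hypothetical and is not a crash function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Decode an unsigned integer member of 1, 2, 4, or 8 bytes from a raw
 * copy of a structure. Returns 0 on success, or -1 if the size is not
 * one of the directly readable sizes.
 */
int read_member(const void *buf, size_t offset, size_t size, uint64_t *out)
{
    const unsigned char *p;
    uint8_t v8;
    uint16_t v16;
    uint32_t v32;
    uint64_t v64;

    assert(buf != NULL && out != NULL);
    p = (const unsigned char *)buf + offset;

    /* memcpy() avoids unaligned accesses into the raw buffer. */
    switch (size) {
    case 1: memcpy(&v8, p, sizeof(v8)); *out = v8; return 0;
    case 2: memcpy(&v16, p, sizeof(v16)); *out = v16; return 0;
    case 4: memcpy(&v32, p, sizeof(v32)); *out = v32; return 0;
    case 8: memcpy(&v64, p, sizeof(v64)); *out = v64; return 0;
    default: return -1;
    }
}
```

Members of any other size would still need a slower path, which is
why the speedup applies only to the four sizes listed.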
initialization. In the unlikely case where the ordering of module
symbol name strings does not match the order of the kernel_symbol
structures, a faulty module symbol list entry may be created that
contains a bogus name string.
(sebastien.piechurski@bull.net)
unlikely case where the symbol's name string is composed entirely of
hexadecimal characters. For example, without the patch, "sym e820"
fails with the error message "sym: invalid address: e820".
(anderson@redhat.com)
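
The ambiguity can be illustrated with a small sketch. It shows one
plausible ordering, in which an argument is treated as a symbol name
whenever it matches a known symbol and is parsed as an address only
otherwise. The function names, the arg_kind enum, and the flat symbol
array are hypothetical and do not reflect how crash is implemented.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum arg_kind { ARG_SYMBOL, ARG_ADDRESS, ARG_INVALID };

/* Return 1 if s is non-empty and consists only of hex digits. */
int all_hex_digits(const char *s)
{
    assert(s != NULL);
    if (*s == '\0')
        return 0;
    for (; *s; s++)
        if (!isxdigit((unsigned char)*s))
            return 0;
    return 1;
}

/*
 * Classify a command argument: a symbol name match wins, so a name
 * such as "e820" is not mistaken for the address 0xe820.
 */
enum arg_kind classify_sym_arg(const char *arg,
                               const char *const *symbols, size_t nsyms)
{
    size_t i;

    assert(arg != NULL && (symbols != NULL || nsyms == 0));
    for (i = 0; i < nsyms; i++)
        if (strcmp(arg, symbols[i]) == 0)
            return ARG_SYMBOL;
    return all_hex_digits(arg) ? ARG_ADDRESS : ARG_INVALID;
}
```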
could be identified because of the "randomize_modules" kernel symbol,
and if it existed, the "--kaslr=<offset>" and/or "--kaslr=auto"
options were unnecessary. Since the "randomize_modules" symbol was
removed in Linux 4.1, this patch has replaced the KASLR identifier
with the "module_load_offset" symbol, which was also introduced in
Linux 3.15, but still remains.
(anderson@redhat.com)
layout and associated KASLR support that was introduced in Linux 4.6.
The kernel text and static data have been moved from unity-mapped
memory into the vmalloc region, and its start address can be
randomized if CONFIG_RANDOMIZE_BASE is configured. Related support
is being put into the kernel's kdump code, the kexec-tools package,
and makedumpfile(8); with that in place, the analysis of Linux 4.6
ARM64 dumpfiles with or without KASLR enabled should work normally
by entering "crash vmlinux vmcore". On live systems, Linux 4.6 ARM64
kernels will only work automatically if CONFIG_RANDOMIZE_BASE is not
configured. Unfortunately, if CONFIG_RANDOMIZE_BASE is configured
on a live system, two --machdep command line arguments are required,
at least for the time being. The arguments are:
--machdep phys_offset=<base physical address>
--machdep kimage_voffset=<kernel kimage_voffset value>
Without the patch, any attempt to analyze a Linux 4.6 ARM64 kernel
fails during initialization with a stream of "read error" messages
followed by "crash: vmlinux and vmcore do not match!".
(takahiro.akashi@linaro.org)
version supports running against a live kernel. Compressed kdump
support is also here, but the crash dump support for the kernel,
kexec-tools, and makedumpfile is still pending. Initial work was
done by Karl Volz with help from Bob Picco.
(dave.kleikamp@oracle.com)
option searches for data structures of a specified size or within a
range of specified sizes. The -m option searches for data structures
that contain a member of a given type. If a structure contains
another structure, the members of the embedded structure will also
be subject to the search. The type string may be a substring of the
data type name. The output displays the size and name of the data
structure.
(Alexandr_Terekhov@epam.com, anderson@redhat.com)
kmem_cache shown by "kmem -s" in kernels configured with CONFIG_SLUB.
Without the patch, the values under the ALLOCATED column may be too
large because cached per-cpu objects are counted as allocated.
(vinayakm.list@gmail.com)
address is the highest text symbol value in a kernel module. Without
the patch, the disassembly may continue past the end of the function,
or may show nothing at all. The patch utilizes in-kernel kallsyms
symbol size information instead of disassembling until reaching the
address of the next symbol in the module.
(anderson@redhat.com)
the kernel has dynamically downsized from the size indicated by the
debuginfo data. At this time, only "kmem_cache" and "task_struct"
structures that have been downsized are registered, but others may be
added in the future. If a downsized data structure is passed to gdb
for display, gdb will request a read of the "full" data structure,
which may flow into a memory region that was either filtered by
makedumpfile(8), or perhaps into non-existent memory, thereby killing
the generating command immediately due to a partial read. With this
patch, commands such as "struct" and "task" that reference downsized
data structures will have their reads flagged to return successfully
if a partial read error occurs.
(anderson@redhat.com)
which contain this kernel commit:
commit 1d798ca3f16437c71ff63e36597ff07f9c12e4d6
mm: make compound_head() robust
The commit above removes the PG_tail and PG_compound page.flags bits
and the page.first_page member, and introduces a page.compound_head
member, which is a pointer to the head page and whose bit 0 acts as
the tail flag. Without the patch, a SLAB or SLUB warning message
that indicates "cannot determine how compound pages are linked" is
displayed during initialization, and any command that tracks compound
pages will be affected.
(anderson@redhat.com)
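
The encoding described above, with bit 0 of page.compound_head acting
as the tail flag, can be decoded as in the following sketch. The
function names are hypothetical, and the page addresses are passed in
as plain 64-bit values that are assumed to have been read from the
dumpfile already.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if bit 0 of a page.compound_head value marks a tail page. */
int page_is_tail(uint64_t compound_head)
{
    return (int)(compound_head & 1);
}

/*
 * Return the address of the head page: for a tail page this is the
 * compound_head value with the tail flag cleared; any other page is
 * its own head.
 */
uint64_t head_page(uint64_t page_addr, uint64_t compound_head)
{
    /* page structures are aligned, so bit 0 of an address is clear */
    assert((page_addr & 1) == 0);
    if (page_is_tail(compound_head))
        return compound_head - 1;
    return page_addr;
}
```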
instructions. Without the patch, "dis [-f] <function>" may continue
beyond the end of a function, disassembling the memory that is in
between the target function and the next function.
(anderson@redhat.com)
returns a count of symbols with the same name. Export a new
is_symbol_text() function, which checks whether a specified symbol
entry is of type 't' or 'T'.
(atomlin@redhat.com, anderson@redhat.com)
format if an invalid structure and/or member is used as an argument.
Without the patch, the command will display the expected error
indicating "task: invalid structure member reference", but then will
be followed by a stream of "task: recursive temporary file usage"
error messages.
(anderson@redhat.com)
option is context-sensitive, similar to the regular "files"
command when used without an argument, but replaces the FILE and
DENTRY columns with I_MAPPING and NRPAGES columns that reflect
each open file's inode.i_mapping address_space structure address,
and the address_space.nrpages count within it; this shows how
many of each open file's pages are currently in the system's
page cache. The "files -p <inode>" option takes the address
of an inode, and dumps all of its pages that are currently in the
system's page cache, borrowing the "kmem -p" page structure output.
(yangoliver@gmail.com)
the patch, if the debuginfo data of an ARM64 kernel module that
contains a per-cpu section is loaded by "mod -s <module>" or
"mod -S", commands such as "bt" or "sym" may incorrectly translate
the module's virtual addresses to symbol names.
(Jan.Karlsson@sonymobile.com)
"union", "task", "list" and "tree" commands. If a specified
structure member contains an embedded structure, the output may
be restricted to just the embedded structure by expressing the
.member argument as "member.member". If a specified structure
member is an array, the output may be restricted to a single array
element by expressing the .member argument as "member[index]".
Furthermore, these embedded member specifications may extend beyond
one level deep, for example, by expressing the member argument as
"member.member.member", or "member[index].member".
(Alexandr_Terekhov@epam.com, anderson@redhat.com)
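
To make the member syntax concrete, the sketch below splits a
specification such as "member[index].member" into individual steps.
It is an illustrative parser only: the member_step structure and the
parse_member_spec() function are hypothetical, names are limited to
63 characters, and at most one index per member is accepted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One parsed step: a member name and an array index (-1 if none). */
struct member_step {
    char name[64];
    long index;
};

/*
 * Split spec into steps[]. Returns the number of steps parsed, or -1
 * if the specification is malformed or has more than max_steps steps.
 */
int parse_member_spec(const char *spec, struct member_step *steps,
                      int max_steps)
{
    const char *p = spec;
    int n = 0;

    assert(spec != NULL && steps != NULL);
    while (*p) {
        size_t len = strcspn(p, ".[");   /* length of the member name */
        char *end;

        if (n == max_steps || len == 0 || len >= sizeof(steps[0].name))
            return -1;
        memcpy(steps[n].name, p, len);
        steps[n].name[len] = '\0';
        steps[n].index = -1;
        p += len;

        if (*p == '[') {                 /* optional "[index]" */
            steps[n].index = strtol(p + 1, &end, 10);
            if (end == p + 1 || *end != ']' || steps[n].index < 0)
                return -1;
            p = end + 1;
        }
        n++;

        if (*p == '.') {                 /* another level follows */
            if (*++p == '\0')
                return -1;               /* trailing dot */
        } else if (*p != '\0') {
            return -1;
        }
    }
    return n;
}
```

With this sketch, "member[3].inner.leaf" yields three steps: "member"
with index 3, then "inner" and "leaf" with no index.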
(1) The MIPS general purpose registers in the elf_gregset_t
don't start at index 0 but at index 6.
(2) Adjust for the kernel's pt_regs structure changes between
kernel versions. For example, fields are inserted into the
middle based on build time options, and the amount of padding
at the head of the structure was changed relatively recently.
To handle this, split the structure definition into two parts
and get the offsets of these two parts dynamically.
(3) Do not display each parsed kernel symbol during initialization
when invoked with "crash -d8".
(4) Add support for loading raw MIPS ramdump dumpfiles.
(5) Add support for compressed kdump dumpfiles.
(rabinv@axis.com)
a bundle of data that describes a structure member. The function
receives a pointer to a struct_member_data structure, in which the
caller has initialized the "structure" and "member" name pointers:
struct struct_member_data {
        char *structure;
        char *member;
        long type;
        long unsigned_type;
        long length;
        long offset;
        long bitpos;
        long bitsize;
};
A gdb "printm" command is crafted using those two fields, and the
output of the command is used to initialize the remaining six fields.
Adapted from Qiao Nuohan's "pstruct" extension module.
(anderson@redhat.com, qiaonuohan@cn.fujitsu.com)
initial support is restricted to 32-bit MIPS kernels that are
configured as little-endian. With respect to dumpfile types, only
ELF vmcores are recognized. In addition to building crash as a
32-bit MIPS binary, it is also possible to build crash as an x86
binary on an x86 or x86_64 host so that crash analysis of MIPS
dumpfiles can be performed on an x86 or x86_64 host. The x86 binary
can be built by entering "make target=MIPS" for the initial build;
subsequent builds with MIPS support can be accomplished by entering
"make" alone.
(rabin@rab.in)
the mm_struct address pointer in its task_struct is NULL'd out, and
as a result, the "vm" command looks like this:
crash> vm
PID: 4563   TASK: ffff88049863f500  CPU: 8   COMMAND: "postgres"
       MM               PGD          RSS    TOTAL_VM
       0                 0            0k       0k
However, the mm_struct address can be retrieved from the task's
kernel stack and entered manually with this option, which allows the
"vm" command to attempt to dump the virtual memory data of the task.
It may, or may not, work, depending upon how far the virtual memory
deconstruction has proceeded. This option only verifies that the
address entered is from the "mm_struct" slab cache, and that
its mm_struct.mm_count is non-zero.
(qiaonuohan@cn.fujitsu.com, anderson@redhat.com)
information, which will be appended to the traditional output of
the command. For example:
crash> kmem -i
                  PAGES        TOTAL      PERCENTAGE
     TOTAL MEM  1965332       7.5 GB         ----
          FREE    78080       305 MB    3% of TOTAL MEM
          USED  1887252       7.2 GB   96% of TOTAL MEM
        SHARED   789954         3 GB   40% of TOTAL MEM
       BUFFERS   110606     432.1 MB    5% of TOTAL MEM
        CACHED  1212645       4.6 GB   61% of TOTAL MEM
          SLAB   146563     572.5 MB    7% of TOTAL MEM
    TOTAL SWAP  1970175       7.5 GB         ----
     SWAP USED        5        20 KB    0% of TOTAL SWAP
     SWAP FREE  1970170       7.5 GB   99% of TOTAL SWAP
  COMMIT LIMIT  2952841      11.3 GB         ----
     COMMITTED  1150595       4.4 GB   38% of TOTAL LIMIT
The COMMIT LIMIT and COMMITTED information is similar to that
displayed by the CommitLimit and Committed_AS lines in /proc/meminfo.
(atomlin@redhat.com)
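
The percentages in the sample output are consistent with an integer
division that truncates toward zero; for example, COMMITTED relative
to COMMIT LIMIT is 1150595 * 100 / 2952841, which truncates to 38.
The sketch below reproduces that arithmetic. The truncating behavior
is inferred from the sample numbers rather than confirmed from the
crash sources, and pct_of() is a hypothetical name.

```c
#include <assert.h>

/* Integer percentage of part relative to whole, truncated toward zero. */
unsigned long long pct_of(unsigned long long part, unsigned long long whole)
{
    assert(whole != 0);
    return (part * 100ULL) / whole;
}
```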
special text region delimiter symbols declared in vmlinux.lds.S with
VMLINUX_SYMBOL(), such as __sched_text_start, __lock_text_start,
__kprobes_text_start, __entry_text_start and __irqentry_text_start.
Without the patch, if the addresses of those symbols are the same
value as the first "real" symbol in those text regions, commands
such as "dis" and "sym" may show the "_text_start" symbol name
instead of the desired text symbol name.
(qiaonuohan@cn.fujitsu.com, anderson@redhat.com)
of network devices with respect to the network namespace of the current
context, or that of a task specified by the optional "pid" or "task"
argument. The former "net -n <address>" option that translates
an IPv4 address expressed as a decimal or hexadecimal value into a
standard numbers-and-dots notation has been changed to "net -N".
(ws@parallels.com)
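
The translation of a numeric IPv4 address into numbers-and-dots
notation can be sketched as follows. The byte order is an assumption
made for this example: the first octet is taken from the least
significant byte of the value, so 1041236234 (0x3e10010a) becomes
10.1.16.62. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Split addr into four octets, least significant byte first. */
void ipv4_octets(uint32_t addr, unsigned char octets[4])
{
    int i;

    assert(octets != NULL);
    for (i = 0; i < 4; i++)
        octets[i] = (unsigned char)((addr >> (8 * i)) & 0xff);
}

/*
 * Write the numbers-and-dots form of addr into buf. Returns the
 * length of the formatted string, as reported by snprintf().
 */
int ipv4_to_dotted(uint32_t addr, char *buf, size_t buflen)
{
    unsigned char o[4];

    assert(buf != NULL && buflen > 0);
    ipv4_octets(addr, o);
    return snprintf(buf, buflen, "%u.%u.%u.%u",
                    (unsigned)o[0], (unsigned)o[1],
                    (unsigned)o[2], (unsigned)o[3]);
}
```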
configured with CONFIG_SLAB:
commit bf0dea23a9c094ae869a88bb694fbe966671bf6d
mm/slab: use percpu allocator for cpu cache
The commit above redesigned the kmem_cache.array_cache[] from a
hardwired array to a per-cpu pointer referencing external array_cache
structures. Without the patch, the crash session would fail during
initialization with the message "crash: cannot resolve cache_cache".
Note that it could be worked around by using the "--no_kmem_cache"
command line option, with a resulting loss of functionality for
commands requiring slab-related data.
(anderson@redhat.com)
either "show" (the default) or "hide". When set to "hide", certain
command output associated with offline cpus will be hidden from view,
and the output will indicate that the cpu is "[OFFLINE]". The new
variable can be set during invocation on the crash command line via
the option "--offline [show|hide]". During runtime, or in a .crashrc
or other crash input file, the variable can be set by entering
"set offline [show|hide]". The commands or options that are affected
when the variable is set to "hide" are as follows:
o On X86_64 machines, the "bt -E" option will not search exception
stacks associated with offline cpus.
o On X86_64 machines, the "mach" command will append "[OFFLINE]"
to the addresses of IRQ and exception stacks associated with
offline cpus.
o On X86_64 machines, the "mach -c" command will not display the
cpuinfo_x86 data structure associated with offline cpus.
o The "help -r" option has been fixed so as to not attempt to
display register sets of offline cpus from ELF kdump vmcores,
compressed kdump vmcores, and ELF kdump clones created by
"virsh dump --memory-only".
o The "bt -c" option will not accept an offline cpu number.
o The "set -c" option will not accept an offline cpu number.
o The "irq -s" option will not display statistics associated with
offline cpus.
o The "timer" command will not display hrtimer data associated
with offline cpus.
o The "timer -r" option will not display hrtimer data associated
with offline cpus.
o The "ptov" command will append "[OFFLINE]" when translating a
per-cpu address offset to a virtual address of an offline cpu.
o The "kmem -o" option will append "[OFFLINE]" to the base per-cpu
virtual address of an offline cpu.
o The "kmem -S" option in CONFIG_SLUB kernels will not display
per-cpu data associated with offline cpus.
o When a per-cpu address reference is passed to the "struct"
command, the data structure will not be displayed for offline
cpus.
o When a per-cpu symbol and cpu reference is passed to the "p"
command, the data will not be displayed for offline cpus.
o When the "ps -[l|m]" option is passed the optional "-C [cpus]"
option, the tasks queued on offline cpus are not shown.
o The "runq" command and the "runq [-t/-m/-g/-d]" options will not
display runqueue data for offline cpus.
o The "ps" command will replace the ">" active task indicator with
a "-" for offline cpus.
The initial system information banner and the "sys" command will
display the total number of cpus as before, but will append the count
of offline cpus. Lastly, a fix has been made for the initialization
time determination of the maximum number of per-cpu objects queued
in a CONFIG_SLAB kmem_cache so as to continue checking all cpus
higher than the first offline cpu. These changes in behavior are not
dependent upon the setting of the crash "offline" variable.
(qiaonuohan@cn.fujitsu.com)
configured with CONFIG_SLUB to display each per-cpu kmem_cache_cpu
address and the contents of its per-cpu partial list.
(qiaonuohan@cn.fujitsu.com)
is followed by a vmlinux file on the crash command line. When the
crash session ends, two errors will occur:
(1) the vmlinux file will be deleted
(2) the temporary uncompressed version of the vmlinux.debug file
will remain in /var/tmp
This problem also occurs in the highly unlikely case where a
compressed vmlinux file is followed by a vmlinux.debug file on the
command line, and the uncompressed temporary version of the vmlinux
file is larger than the vmlinux.debug file. In that case:
(1) the vmlinux.debug file will be deleted
(2) the temporary uncompressed version of the vmlinux file
will remain in /var/tmp
(dmair@suse.com)
that was built with "make target=ARM64" in order to analyze ARM64
dumpfiles on an x86_64 host. Without the patch, if the extend
command is used with an extension module built in the same manner,
it fails with the message "extend: <module>.so: not an ELF format
object file".
(Jan.Karlsson@sonymobile.com)
the cgroup_name() function now utilizes kernfs_name(). Without the
patch, the command fails with the error message "runq: invalid
structure member offset: cgroup_dentry".
(anderson@redhat.com)
can now be readily identified because of new kernel symbols that
have been added. For those kernels, the new "--kaslr=<offset>"
and/or "--kaslr=auto" options are not necessary for ELF or compressed
kdump vmcores, or for live systems that have /proc/kallsyms showing
the relocated symbol values. A new KASLR initialization function
called kaslr_init() is now called by symtab_init() prior to the
initial symbol-sorting operation. If kaslr_init() determines that
KASLR may be in effect, it will trigger a search for the relevant
vmlinux symbols during the sorting operation, which in turn will
cause the relocation value to be automatically calculated.
(anderson@redhat.com)
kernel system call alias/wrapper names, for example, "SyS_read" and
"compat_SyS_futex" instead of "sys_read" and "compat_sys_futex".
Without the patch, commands such as "dis", "sym <address>", and
"sys -c" display the alias/wrapper name instead of the real system
call name in Linux 3.10 and later kernels.
(anderson@redhat.com)
that are configured with CONFIG_RANDOMIZE_BASE. When set to
"auto", the KASLR relocation value will be determined automatically
by comparing the "_stext" symbol value compiled into the vmlinux file
with the _stext symbol value stored in kdump vmcoreinfo data; on live
systems the comparison will be made with the "_stext" symbol value
that is found in /proc/kallsyms.
(ahonig@google.com, anderson@redhat.com)
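
The comparison described above amounts to a single subtraction. The
sketch below assumes that both "_stext" values have already been
obtained, one from the vmlinux file and one from the vmcoreinfo data
or /proc/kallsyms, and that the relocated value is not lower than the
compiled-in value. The function names and the addresses used with
them are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Relocation value: relocated _stext minus the vmlinux _stext. */
uint64_t kaslr_offset(uint64_t vmlinux_stext, uint64_t runtime_stext)
{
    assert(runtime_stext >= vmlinux_stext);
    return runtime_stext - vmlinux_stext;
}

/* Apply the relocation value to any symbol value from vmlinux. */
uint64_t relocate(uint64_t vmlinux_value, uint64_t offset)
{
    return vmlinux_value + offset;
}
```

For instance, a vmlinux "_stext" of 0xffffffff81000000 and a relocated
"_stext" of 0xffffffff8a000000 give a relocation value of 0x9000000,
which is then added to every other vmlinux symbol value.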
X86_64 kernels that are configured with CONFIG_RANDOMIZE_BASE.
The offset value must be equal to the difference between the
symbol values compiled into the vmlinux file and their relocated
value.
(ahonig@google.com, anderson@redhat.com)