kernels that have the following Linux 3.18 commit, which removes the
special per-cpu exception stack for handling stack segment faults:
commit 6f442be2fb22be02cafa606f1769fa1e6f894441
x86_64, traps: Stop using IST for #SS
Without this patch, backtraces that originate on any of the other 4
per-cpu exception stacks will be mis-labeled at the transition point
back to the previous stack. For example, backtraces that that
originate on the NMI stack will indicate that they are coming from
the "DOUBLEFAULT" stack. The patch examines all idt_table entries
during initialization, looking for gate descriptors that have
non-zero index values, and when found, pulls out out the handler
function address; from that information, the exception stack name
string array is properly initialized rather than being hard-coded.
This fix also properly labels the exception stack names on x86_64
CONFIG_PREEMPT_RT realtime kernels, which only utilize 3 exception
stacks instead of the traditional 5 (now 4 with this kernel commit),
instead of just showing "RT". Also, without the patch, the "mach"
command will mis-label the stack names when it displays the base
addresses of each per-cpu exception stack.
(anderson@redhat.com)
the mm_struct address pointer in its task_struct is NULL'd out, and
as a result, the "vm" command looks like this:
crash> vm
PID: 4563 TASK: ffff88049863f500 CPU: 8 COMMAND: "postgres"
MM PGD RSS TOTAL_VM
0 0 0k 0k
However, the mm_struct address can be retrieved from the task's
kernel stack and entered manually with this option, which allows the
"vm" command to attempt to dump the virtual memory data of the task.
It may, or may not, work, depending upon how far the virtual memory
deconstruction has proceeded. This option only verifies that the
address entered is from the "mm_struct" slab cache, and that
its mm_struct.mm_count is non-zero.
(qiaonuohan@cn.fujitsu.com, anderson@redhat.com)
and "bt -F[F]" options. Without the patch, if the page structure
associated with a memory address still contains a (stale) pointer to
the address of a kmem_cache structure, but whose page.flags does not
have the PG_slab bit set, the address is incorrectly presumed to be
contained within that slab cache. As as result, the "kmem" command
may display one or more messages indicating a "bad inuse counter", a
"bad next pointer" or a "bad s_mem pointer", followed by an "address
not found in cache" error message. The "rd -S[S]" and "bt -F[F]"
commands may mislabel memory locations as belonging to slab caches.
(anderson@redhat.com)
information, which will be appended to the traditional output of
the command. For example:
crash> kmem -i
PAGES TOTAL PERCENTAGE
TOTAL MEM 1965332 7.5 GB ----
FREE 78080 305 MB 3% of TOTAL MEM
USED 1887252 7.2 GB 96% of TOTAL MEM
SHARED 789954 3 GB 40% of TOTAL MEM
BUFFERS 110606 432.1 MB 5% of TOTAL MEM
CACHED 1212645 4.6 GB 61% of TOTAL MEM
SLAB 146563 572.5 MB 7% of TOTAL MEM
TOTAL SWAP 1970175 7.5 GB ----
SWAP USED 5 20 KB 0% of TOTAL SWAP
SWAP FREE 1970170 7.5 GB 99% of TOTAL SWAP
COMMIT LIMIT 2952841 11.3 GB ----
COMMITTED 1150595 4.4 GB 38% of TOTAL LIMIT
The COMMIT LIMIT and COMMITTED information is similar to that
displayed by the CommitLimit and Committed_AS lines in /proc/meminfo.
(atomlin@redhat.com)
special text region delimiter symbols declared in vmlinux.lds.S with
VMLINUX_SYMBOL(), such as __sched_text_start, __lock_text_start,
__kprobes_text_start, __entry_text_start and __irqentry_text_start.
Without the patch, if the addresses of those symbols are the same
value as the first "real" symbol in those text regions, commands
such as "dis" and "sym" may show the "_text_start" symbol name
instead of the desired text symbol name.
(qiaonuohan@cn.fujitsu.com, anderson@redhat.com)
of network devices with respect the network namespace of the current
context, or that of a task specified by the optional "pid" or "task"
argument. The former "net -n <address>" option that translates
an IPv4 address expressed as a decimal or hexadecimal value into a
standard numbers-and-dots notation has been changed to "net -N".
(ws@parallels.com)
wait_queue_head_t structure. Without the patch, if the entries
on the list are dynamically-created __wait_queue structures on
kernel stacks, the tasks owning the kernel stack are not displayed.
(anderson@redhat.com)
the active tasks in x86_64 ELF or compressed dumpfiles created by the
KVM "virsh dump --memory-only" facility. Without the patch, the
backtraces of active tasks may show an invalid starting frame that
indicates "__schedule". The fix displays the exception RIP and dumps
the register contents that are stored in the dumpfile header. If the
active task was operating in the kernel, the backtrace continues from
there; if the task was operating in user-space, the backtrace is
complete at that point.
(anderson@redhat.com)
built on an X86_64 host with "make target=X86" or "make target=ARM"
is used on a live X86_64 system without specifying a vmlinux
namelist. Without the patch, the session fails with the message
"crash: cannot find booted kernel -- please enter namelist argument".
The error message will be "crash: compiled for the X86 architecture"
or "crash: compiled for the ARM architecture".
(anderson@redhat.com)
architectures. Without the patch, the command fails with the message
"crash: seek error: physical address: <address> type: log_buf
pointer", followed by "crash: cannot read log_buf value". This bug
was introduced in crash-7.0.0 by a patch that added support for the
PPC64 BOOK3E processor family.
(anderson@redhat.com)
for analyzing little-endian PPC64 dumpfiles on an x86_64 host, which
can be done by entering "make target=PPC64". After the initial build
is complete, subsequent builds can be done by entering "make" alone.
(anderson@redhat.com)
if they are running Linux 3.12 and later kernels. Older kernels
without GENERIC_HARDIRQ support will fail with the error message
"irq: cannot determine number of IRQs".
(sebott@linux.vnet.ibm.com)
"virsh dump --memory-only --format <compression-type>" command,
where the compression-type is either "kdump-zlib", "kdump-lzo" or
"kdump-snappy". Without the patch, if an x86_64 guest kernel was loaded
with a non-zero "phys_base", the "--machdep phys_base=<offset>" command
line option was required as a workaround or the crash session would fail
with the warning message "WARNING: cannot read linux_banner string"
followed by the fatal error message "crash: vmlinux and <dumpfile name>
do not match!".
(anderson@redhat.com)
information. If the "tainted_mask" symbol exists, the option will
show its hexadecimal value and translate each bit set to the symbolic
letter of the taint type. On kernels prior to 2.6.28 which had the
"tainted" symbol, only its hexadecimal value is shown. The relevant
kernel sources should be consulted for the meaning of the letter(s)
or hexadecimal bit value(s).
(anderson@redhat.com)
configured with CONFIG_SLAB:
commit bf0dea23a9c094ae869a88bb694fbe966671bf6d
mm/slab: use percpu allocator for cpu cache
The commit above redesigned the kmem_cache.array_cache[] from a
hardwired array to a per-cpu pointer referencing external array_cache
structures. Without the patch, the crash session would fail during
initialization with the message "crash: cannot resolve cache_cache".
Note that it could be worked around by using the "--no_kmem_cache"
command line option, with a resulting loss of functionality for
commands requiring slab-related data.
(anderson@redhat.com)
the crash utility will fail to initialize, typically with a message
indicating "no debugging data available". However, it has been
reported (on a 32-bit ARM system) that the initialization sequence
continued on beyond that message point, and the session failed later
on with the message "neither runqueue nor rq structures exist". As
an aid to understanding why the session failed, if the target kernel
is configured with CONFIG_IKCONFIG, and CONFIG_DEBUG_INFO_REDUCED has
been set to "y", a relevant warning message will be displayed.
(anderson@redhat.com)
the header of compressed kdumps, and the new DUMP_ELF_INCOMPLETE flag
in the header of ELF kdumps. If the makedumpfile(8) facility fails
to complete the creation of compressed or ELF kdump vmcore files
due to ENOSPC or other error, it will mark the vmcore as incomplete.
If either flag is set, the crash utility will issue a warning that
the dumpfile is known to be incomplete during initialization, just
prior to the system banner display. When reads are attempted on
missing data, a read error will be returned. As an alternative,
zero-filled data will be returned if the "--zero_excluded" command
line flag is used, or the "zero_excluded" runtime variable is set
to "on". In either case, the read errors or zero-filled memory
may cause the crash session to fail entirely, cause commands to
fail, or may result in other unpredictable runtime behavior.
(anderson@redhat.com, zhouwj-fnst@cn.fujitsu.com)
the patch, if a dumpfile read targets physical memory in the first
memory page stored in the second or later sequential split dumpfile,
incorrect data will be returned.
(qiaonuohan@cn.fujitsu.com)
if the NT_PRSTATUS notes in a compressed kdump are invalid/corrupt.
If all cpus are online but the dumpfile initialization that cycles
through the NT_PRSTATUS notes does not find exactly one note per
cpu, then the register contents in those notes should not be used.
(anderson@redhat.com)
while "please wait... (determining panic task)" is being displayed.
This was caused by a patch introduced in crash-7.0.8, and can only
happen when analyzing dumpfiles whose header does not contain the
requisite information to determine the panic task and the active
tasks do not have any crash-related traces in their kernel stacks.
It should be noted that the SIGSEGV can be avoided by entering
"--no_panic" on the crash command line.
(anderson@redhat.com)
Without the patch, if certain patterns of cpus are offline, the count
may be too small, causing cpu-dependent commands to not recognize
online cpus.
(Jan.Karlsson@sonymobile.com, anderson@redhat.com)
Without the patch, if certain patterns of cpus are offline, the count
may be too small, causing cpu-dependent commands to not recognize
online cpus.
(Jan.Karlsson@sonymobile.com, anderson@redhat.com)
banner, the "sys" command, and the X86_64 "mach" command, to only
show the "OFFLINE" cpu count if there are actually offline cpus.
(anderson@redhat.com)
either "show" (the default) or "hide". When set to "hide", certain
command output associated with offline cpus will be hidden from view,
and the output will indicate that the cpu is "[OFFLINE]". The new
variable can be set during invocation on the crash command line via
the option "--offline [show|hide]". During runtime, or in a .crashrc
or other crash input file, the variable can be set by entering
"set offline [show|hide]". The commands or options that are affected
when the variable is set to "hide" are as follows:
o On X86_64 machines, the "bt -E" option will not search exception
stacks associated with offline cpus.
o On X86_64 machines, the "mach" command will append "[OFFLINE]"
to the addresses of IRQ and exception stacks associated with
offline cpus.
o On X86_64 machines, the "mach -c" command will not display the
cpuinfo_x86 data structure associated with offline cpus.
o The "help -r" option has been fixed so as to not attempt to
display register sets of offline cpus from ELF kdump vmcores,
compressed kdump vmcores, and ELF kdump clones created by
"virsh dump --memory-only".
o The "bt -c" option will not accept an offline cpu number.
o The "set -c" option will not accept an offline cpu number.
o The "irq -s" option will not display statistics associated with
offline cpus.
o The "timer" command will not display hrtimer data associated
with offline cpus.
o The "timer -r" option will not display hrtimer data associated
with offline cpus.
o The "ptov" command will append "[OFFLINE]" when translating a
per-cpu address offset to a virtal address of an offline cpu.
o The "kmem -o" option will append "[OFFLINE]" to the base per-cpu
virtual address of an offline cpu.
o The "kmem -S" option in CONFIG_SLUB kernels will not display
per-cpu data associated with offline cpus.
o When a per-cpu address reference is passed to the "struct"
command, the data structure will not be displayed for offline
cpus.
o When a per-cpu symbol and cpu reference is passed to the "p"
command, the data will not be displayed for offline cpus.
o When the "ps -[l|m]" option is passed the optional "-C [cpus]"
option, the tasks queued on offline cpus are not shown.
o The "runq" command and the "runq [-t/-m/-g/-d]" options will not
display runqueue data for offline cpus.
o The "ps" command will replace the ">" active task indicator to
a "-" for offline cpus.
The initial system information banner and the "sys" command will
display the total number of cpus as before, but will append the count
of offline cpus. Lastly, a fix has been made for the initialization
time determination of the maximum number of per-cpu objects queued
in a CONFIG_SLAB kmem_cache so as to continue checking all cpus
higher than the first offline cpu. These changes in behavior are not
dependent upon the setting of the crash "offline" variable.
(qiaonuohan@cn.fujitsu.com)
CONFIG_SLAB kmem_cache per-cpu array_cache.limit value during
session initialization. In a recently seen vmcore, several of the
array_cache.limit values were corrupted such that they were stored
as negative values, which in turn caused the "kmem -[sS]" options
to fail immediately with a dump of the internal memory buffer
allocation statistics and the error message "kmem: cannot allocate
any more memory!".
(anderson@redhat.com)
TASK_PARKED state in Linux 3.9 and later kernels. Without the patch,
the command's "ST" column entry for parked tasks shows "??". The
state column will now show "PA", and the foreach command will accept
"PA" as a "state" argument.
(anderson@redhat.com)
not the same endian as the crash utility binary. Without the patch
the filename is shown with the incorrect/opposite endian type.
(hukeping@huawei.com)
was not configured with CONFIG_IKCONFIG. Without the patch, the
initial system banner and the "sys" command show "UPTIME: (cannot
calculate: unknown HZ value)", the "ps -t" option shows "RUN TIME:
(cannot calculate: unknown HZ value)", and the "timer -r" option
kills the crash session with a floating point exception.
(hukeping@huawei.com)
introduced in crash-7.0.8. Without this patch, it is possible that
the "ps" command may fail prematurely with the error message
"ps: bsearch for tgid failed: task: <address> tgid: <number>"
when running on a live system or against a "live" dumpfile.
(panfy.fnst@cn.fujitsu.com)
utility source tree on PPC and PPC64 machines. Without the patch,
both PPC and PPC64 will get #define'd if the extension module build
procedure does not #define one or the other, which in turn causes
multiple conflicting declarations.
(anderson@redhat.com)
an LPAE enabled kernel by first checking whether CONFIG_ARM_LPAE
exists in the vmcoreinfo data, and if it does not, by then checking
whether the next higher symbol above "swapper_pg_dir" is 0x5000 bytes
higher in value.
(sdu.liu@huawei.com)
module to be built outside of a crash source tree on a ppc64le PPC64
little-endian host. Without the patch, "make -f snap.mk" would fail
to compile, indicating "gcc: error: macro name missing after '-D'"
(anderson@redhat.com)
gathering of tasks from the kernel pid_hash[] in 2.6.24 and later
kernels. Without the patch, if an entry in a pid_hash[] chain is
not related to the "init_pid_ns" pid_namespace structure, any
remaining entries in the hlist chain are skipped.
(vvs@parallels.com)
(1) task.c: initialize the "curr" and "curr_my_q" variables in the
dump_tasks_in_task_group_cfs_rq() function.
(2) ramdump.c: make the "rd" and "len" return values from read()
and write() calls in write_elf() to be ssize_t types.
(3) cmdline.c: make the parsed PATH string buffer equal to the size
of the PATH string + 1 to prevent a possible buffer overflow
when a command line starts with a "!".
(anderson@redhat.com)
are searched for the currently-running kernel on live systems. This
will automatically locate the vmlinux namelist for kernels that were
locally installed with "make modules_install install".
(lrintel@redhat.com)
on S390X machines. The output of CPU timer and clock comparator has
always been incorrect because:
- We added S390X_WORD_SIZE (8) instead of 4 to get the second word
- We did not left shift the clock comparator by 8
The fix gets the complete 64 bit values and by shifting the clock
comparator correctly.
(holzheu@linux.vnet.ibm.com)
a crash-7.0.4 patch which added per-thread task_struct.rss_stat page
counts to the task's mm_struct.rss_stat page counts in order to show
an accurate/synchronized RSS value. Without the patch, the "ps"
command performance would degrade as the number of tasks increased,
most notably when there were thousands of tasks.
(panfy.fnst@cn.fujitsu.com, anderson@redhat.com)
created by older development versions of KVM tools in which the
cpu version id was 12, but the cpu device headers did not contain
the additional XSAVE related fields.
(uobergfe@redhat.com)
that were created with a cpu version id of 12 or greater that contain
additional XSAVE related fields in their cpu device headers. Without
the patch, active tasks running on cpus above 0 may have truncated
backtraces.
(uobergfe@redhat.com)
Since this required a large number of patches to be applied to
architecture-neutral files in the gdb-7.6 tree, the changes are
only applied if the host build system is a ppc64le.
(ptesarik@suse.cz, normand@linux.vnet.ibm.com)
by the "kmem [-sS]" options for kernels configured with CONFIG_SLUB.
Without the patch, the contents of several structure members are not
validated, and may generate bogus or never-ending output, typically
seen when running the commands on a "live dump" where the dumpfile
was taken while the kernel was still running. The patch aborts the
relevant parts of per-kmem_cache output when invalid data is
encountered or if an object list contains duplicate entries, and
error messages have been enhanced to more accurately describe the
issues encountered.
(anderson@redhat.com)
error message (typically when reading the "cpu_possible_mask") until
it is confirmed that all of the following are true:
(1) /dev/crash does not exist, and
(2) /dev/mem is restricted via CONFIG_STRICT_DEVMEM, and
(3) /proc/kcore cannot be read/accessed.
The "kernel may be configured with CONFIG_STRICT_DEVMEM" and
the "trying /proc/kcore as an alternative" messages will still
be displayed when appropriate. The read error message be displayed
only if all three live memory read options fail.
(anderson@redhat.com)
be compressed and named "crash.ko.xz". Without the patch, the driver
is not recognized and loaded, and as a result the /dev/mem driver
and/or /proc/kcore will be tried as the live memory source.
(anderson@redhat.com)