Without the patch, if certain patterns of cpus are offline, the count
may be too small, causing cpu-dependent commands to not recognize
online cpus.
(Jan.Karlsson@sonymobile.com, anderson@redhat.com)
banner, the "sys" command, and the X86_64 "mach" command, to only
show the "OFFLINE" cpu count if there are actually offline cpus.
(anderson@redhat.com)
either "show" (the default) or "hide". When set to "hide", certain
command output associated with offline cpus will be hidden from view,
and the output will indicate that the cpu is "[OFFLINE]". The new
variable can be set during invocation on the crash command line via
the option "--offline [show|hide]". During runtime, or in a .crashrc
or other crash input file, the variable can be set by entering
"set offline [show|hide]". The commands or options that are affected
when the variable is set to "hide" are as follows:
o On X86_64 machines, the "bt -E" option will not search exception
stacks associated with offline cpus.
o On X86_64 machines, the "mach" command will append "[OFFLINE]"
to the addresses of IRQ and exception stacks associated with
offline cpus.
o On X86_64 machines, the "mach -c" command will not display the
cpuinfo_x86 data structure associated with offline cpus.
o The "help -r" option has been fixed so as to not attempt to
display register sets of offline cpus from ELF kdump vmcores,
compressed kdump vmcores, and ELF kdump clones created by
"virsh dump --memory-only".
o The "bt -c" option will not accept an offline cpu number.
o The "set -c" option will not accept an offline cpu number.
o The "irq -s" option will not display statistics associated with
offline cpus.
o The "timer" command will not display hrtimer data associated
with offline cpus.
o The "timer -r" option will not display hrtimer data associated
with offline cpus.
o The "ptov" command will append "[OFFLINE]" when translating a
per-cpu address offset to a virtual address of an offline cpu.
o The "kmem -o" option will append "[OFFLINE]" to the base per-cpu
virtual address of an offline cpu.
o The "kmem -S" option in CONFIG_SLUB kernels will not display
per-cpu data associated with offline cpus.
o When a per-cpu address reference is passed to the "struct"
command, the data structure will not be displayed for offline
cpus.
o When a per-cpu symbol and cpu reference is passed to the "p"
command, the data will not be displayed for offline cpus.
o When the "ps -[l|m]" option is passed the optional "-C [cpus]"
option, the tasks queued on offline cpus are not shown.
o The "runq" command and the "runq [-t/-m/-g/-d]" options will not
display runqueue data for offline cpus.
o The "ps" command will replace the ">" active task indicator with
a "-" for offline cpus.
The initial system information banner and the "sys" command will
display the total number of cpus as before, but will append the count
of offline cpus. Lastly, a fix has been made for the initialization
time determination of the maximum number of per-cpu objects queued
in a CONFIG_SLAB kmem_cache so as to continue checking all cpus
higher than the first offline cpu. These changes in behavior are not
dependent upon the setting of the crash "offline" variable.
(qiaonuohan@cn.fujitsu.com)
CONFIG_SLAB kmem_cache per-cpu array_cache.limit value during
session initialization. In a recently seen vmcore, several of the
array_cache.limit values were corrupted such that they were stored
as negative values, which in turn caused the "kmem -[sS]" options
to fail immediately with a dump of the internal memory buffer
allocation statistics and the error message "kmem: cannot allocate
any more memory!".
(anderson@redhat.com)
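
One plausible way to guard against corrupted limit values of the kind described above is sketched below. This is an illustration only, not the crash utility's code: the function name and the `upper_bound` ceiling are hypothetical.

```python
def max_valid_limit(limits, upper_bound=1 << 16):
    """Return the largest plausible array_cache.limit value.

    Negative or absurdly large values are treated as corrupt and ignored.
    `upper_bound` is a hypothetical ceiling chosen for illustration.
    """
    valid = [value for value in limits if 0 < value <= upper_bound]
    return max(valid) if valid else 0
```

Sizing a buffer from the filtered maximum, rather than from the raw values, avoids requesting an impossibly large allocation when a limit has been read as a negative number.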
TASK_PARKED state in Linux 3.9 and later kernels. Without the patch,
the command's "ST" column entry for parked tasks shows "??". The
state column will now show "PA", and the foreach command will accept
"PA" as a "state" argument.
(anderson@redhat.com)
not the same endian as the crash utility binary. Without the patch
the filename is shown with the incorrect/opposite endian type.
(hukeping@huawei.com)
was not configured with CONFIG_IKCONFIG. Without the patch, the
initial system banner and the "sys" command show "UPTIME: (cannot
calculate: unknown HZ value)", the "ps -t" option shows "RUN TIME:
(cannot calculate: unknown HZ value)", and the "timer -r" option
kills the crash session with a floating point exception.
(hukeping@huawei.com)
introduced in crash-7.0.8. Without this patch, it is possible that
the "ps" command may fail prematurely with the error message
"ps: bsearch for tgid failed: task: <address> tgid: <number>"
when running on a live system or against a "live" dumpfile.
(panfy.fnst@cn.fujitsu.com)
utility source tree on PPC and PPC64 machines. Without the patch,
both PPC and PPC64 will get #define'd if the extension module build
procedure does not #define one or the other, which in turn causes
multiple conflicting declarations.
(anderson@redhat.com)
an LPAE enabled kernel by first checking whether CONFIG_ARM_LPAE
exists in the vmcoreinfo data, and if it does not, by then checking
whether the next higher symbol above "swapper_pg_dir" is 0x5000 bytes
higher in value.
(sdu.liu@huawei.com)
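
The two-step LPAE check described above can be sketched as follows. This is a minimal illustration under stated assumptions: the vmcoreinfo data and the symbol table are modeled as plain dictionaries, and `is_lpae_kernel` is a hypothetical name.

```python
def is_lpae_kernel(vmcoreinfo, symbols):
    """Guess whether a 32-bit ARM kernel was built with LPAE.

    vmcoreinfo: dict of vmcoreinfo keys (hypothetical representation).
    symbols:    dict mapping symbol name -> address.
    """
    # Step 1: trust vmcoreinfo if it mentions CONFIG_ARM_LPAE.
    if "CONFIG_ARM_LPAE" in vmcoreinfo:
        return True
    # Step 2: otherwise compare swapper_pg_dir with the next higher symbol.
    base = symbols.get("swapper_pg_dir")
    if base is None:
        return False
    higher = [addr for addr in symbols.values() if addr > base]
    return bool(higher) and min(higher) - base == 0x5000
```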
module to be built outside of a crash source tree on a ppc64le PPC64
little-endian host. Without the patch, "make -f snap.mk" would fail
to compile, indicating "gcc: error: macro name missing after '-D'".
(anderson@redhat.com)
gathering of tasks from the kernel pid_hash[] in 2.6.24 and later
kernels. Without the patch, if an entry in a pid_hash[] chain is
not related to the "init_pid_ns" pid_namespace structure, any
remaining entries in the hlist chain are skipped.
(vvs@parallels.com)
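
The difference between skipping one entry and abandoning the rest of a chain can be sketched as follows. The chain is modeled as a list of (namespace, task) pairs, which is an assumption made for illustration and not the kernel's hlist layout.

```python
def gather_tasks(chain, init_pid_ns):
    """Collect tasks from one hash chain, keeping only init_pid_ns entries.

    chain: iterable of (pid_namespace, task) pairs (hypothetical model).
    """
    tasks = []
    for namespace, task in chain:
        if namespace != init_pid_ns:
            # Skip only this entry; stopping here would lose the
            # remaining entries in the chain.
            continue
        tasks.append(task)
    return tasks
```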
(1) task.c: initialize the "curr" and "curr_my_q" variables in the
dump_tasks_in_task_group_cfs_rq() function.
(2) ramdump.c: declare the "rd" and "len" return values from read()
and write() calls in write_elf() as ssize_t types.
(3) cmdline.c: make the parsed PATH string buffer equal to the size
of the PATH string + 1 to prevent a possible buffer overflow
when a command line starts with a "!".
(anderson@redhat.com)
are searched for the currently-running kernel on live systems. This
will automatically locate the vmlinux namelist for kernels that were
locally installed with "make modules_install install".
(lrintel@redhat.com)
on S390X machines. The output of CPU timer and clock comparator has
always been incorrect because:
- We added S390X_WORD_SIZE (8) instead of 4 to get the second word
- We did not left shift the clock comparator by 8
The fix reads the complete 64-bit values and shifts the clock
comparator correctly.
(holzheu@linux.vnet.ibm.com)
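
The arithmetic behind the two corrections can be sketched as follows. This is an illustration only: the buffer layout and function names are hypothetical, and the mask simply mimics 64-bit unsigned wraparound.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF

def read_u64_be(buf, offset):
    """Read one complete big-endian 64-bit value from a byte buffer."""
    return struct.unpack_from(">Q", buf, offset)[0]

def clock_comparator(buf, offset):
    """Return the stored clock comparator shifted left by 8 bits."""
    return (read_u64_be(buf, offset) << 8) & MASK64
```

Reading the value as a single 64-bit quantity avoids having to compute the offset of the second 32-bit word at all, which is where the original off-by-4 error arose.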
a crash-7.0.4 patch which added per-thread task_struct.rss_stat page
counts to the task's mm_struct.rss_stat page counts in order to show
an accurate/synchronized RSS value. Without the patch, the "ps"
command performance would degrade as the number of tasks increased,
most notably when there were thousands of tasks.
(panfy.fnst@cn.fujitsu.com, anderson@redhat.com)
created by older development versions of KVM tools in which the
cpu version id was 12, but the cpu device headers did not contain
the additional XSAVE related fields.
(uobergfe@redhat.com)
that were created with a cpu version id of 12 or greater that contain
additional XSAVE related fields in their cpu device headers. Without
the patch, active tasks running on cpus above 0 may have truncated
backtraces.
(uobergfe@redhat.com)
Since this required a large number of patches to be applied to
architecture-neutral files in the gdb-7.6 tree, the changes are
only applied if the host build system is a ppc64le.
(ptesarik@suse.cz, normand@linux.vnet.ibm.com)
by the "kmem [-sS]" options for kernels configured with CONFIG_SLUB.
Without the patch, the contents of several structure members are not
validated, and may generate bogus or never-ending output, typically
seen when running the commands on a "live dump" where the dumpfile
was taken while the kernel was still running. The patch aborts the
relevant parts of per-kmem_cache output when invalid data is
encountered or if an object list contains duplicate entries, and
error messages have been enhanced to more accurately describe the
issues encountered.
(anderson@redhat.com)
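
A defensive list walk of the kind described above might look like the following sketch. The free list is modeled as a dictionary from object address to next address, which is an assumption for illustration, and `walk_freelist` is a hypothetical name.

```python
def walk_freelist(head, next_of, max_objects):
    """Walk a free list, aborting on invalid or duplicate entries.

    next_of: dict mapping object address -> next address (0 terminates).
    max_objects: the most entries a single slab could legitimately hold.
    """
    seen = set()
    order = []
    addr = head
    while addr:
        if addr in seen:
            raise ValueError(f"duplicate freelist entry: {addr:#x}")
        if len(order) >= max_objects:
            raise ValueError("freelist exceeds the slab's object count")
        if addr not in next_of:
            raise ValueError(f"invalid freelist pointer: {addr:#x}")
        seen.add(addr)
        order.append(addr)
        addr = next_of[addr]
    return order
```

Checking for duplicates is what prevents never-ending output when a list that was being modified at dump time loops back on itself.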
error message (typically when reading the "cpu_possible_mask") until
it is confirmed that all of the following are true:
(1) /dev/crash does not exist, and
(2) /dev/mem is restricted via CONFIG_STRICT_DEVMEM, and
(3) /proc/kcore cannot be read/accessed.
The "kernel may be configured with CONFIG_STRICT_DEVMEM" and
the "trying /proc/kcore as an alternative" messages will still
be displayed when appropriate. The read error message will be
displayed only if all three live memory read options fail.
(anderson@redhat.com)
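
The deferred-error behavior described above can be sketched as follows. Each memory source is modeled as a name plus a probe callable, which is an assumption for illustration and not how the crash utility is structured.

```python
def pick_live_source(probes):
    """Return the first usable live memory source and the ones that failed.

    probes: ordered list of (name, callable) pairs; each callable returns
    True if that source can be read (hypothetical model).
    """
    failures = []
    for name, probe in probes:
        if probe():
            return name, failures
        failures.append(name)
    # Only now, with every option exhausted, is a read error reported.
    raise OSError("read error: " + ", ".join(failures) + " all failed")
```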
be compressed and named "crash.ko.xz". Without the patch, the driver
is not recognized and loaded, and as a result the /dev/mem driver
and/or /proc/kcore will be tried as the live memory source.
(anderson@redhat.com)
the number identifying the command. However, unlike the similar "r"
pseudo-command, if the number is a command name in the user's PATH,
maintain the current behavior and execute that command.
(anderson@redhat.com)
EM_ARM and EM_AARCH64 values as "e_machine" types, and ELFOSABI_LINUX
as an "e_ident[EI_OSABI]" type. Without the patch, the e_machine
translation would show "40 (unsupported)" for 32-bit ARM, or
"183 (unsupported)" on ARM64; and the ELFOSABI_LINUX type would
be translated as "3 (?)".
(anderson@redhat.com)
more "ramdump" files may be entered on the crash command line
in an ordered pair format consisting of the RAM dump filename
and the starting physical address expressed in hexadecimal,
connected with an at sign ("@"):
$ crash vmlinux ramdump@address [ramdump@address]
A temporary ELF header will be created in /var/tmp, and the
combination of the header and the ramdump file(s) will be handled
like a normal ELF vmcore. The ELF header will only exist during
the crash session. If desired, an optional "-o <filename>"
may be entered to create a permanent ELF vmcore file from the
ramdump file(s).
(vinayakm.list@gmail.com, paawan1982@yahoo.com, anderson@redhat.com)
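
Splitting a "ramdump@address" argument into its two parts can be sketched as follows. The function name is hypothetical and the example filename is made up; the code only illustrates the ordered-pair format described above.

```python
def parse_ramdump_arg(arg):
    """Split 'filename@hexaddress' into (filename, start_address)."""
    filename, sep, address = arg.rpartition("@")
    if not sep or not filename or not address:
        raise ValueError(f"expected ramdump@address, got {arg!r}")
    # The address is hexadecimal, with or without a leading "0x".
    return filename, int(address, 16)
```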
the active tasks in the kernel's per-cpu crash_notes, there is an
initialization-time warning message indicating "could not retrieve
crash_notes". It has been changed to a more meaningful warning
message indicating "cannot retrieve registers for active tasks".
(anderson@redhat.com)
configured with CONFIG_SLUB to display the address of each per-cpu
kmem_cache_cpu address and the contents of its per-cpu partial list.
(qiaonuohan@cn.fujitsu.com)
kernel's VA_BITS value. It currently is hardwired in the kernel to
one of two values depending upon whether 4K or 64K pages are
configured. However, there are plans to support 16K pages, to make
VA_BITS a configurable value, and to make the number of page-table
levels configurable. Towards that end, the crash utility has been
changed to determine the VA_BITS value based upon known kernel
virtual addresses, and to then calculate the relevant kernel virtual
address ranges based on that value instead of hardwiring them based upon
the page size.
(anderson@redhat.com)
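
One way to derive a VA_BITS value from a known kernel virtual address is sketched below. It rests on a loudly stated assumption that is not taken from the text above: the sample address lies in the upper half of the kernel's virtual address range, so that bit VA_BITS-1 and every bit above it are set. The function name and the sample addresses are hypothetical.

```python
def calc_va_bits(kernel_vaddr):
    """Estimate VA_BITS from a 64-bit kernel virtual address.

    ASSUMPTION: the address is in the upper half of the kernel range,
    so the highest clear bit is bit VA_BITS - 2.
    """
    for bit in range(63, -1, -1):
        if not kernel_vaddr & (1 << bit):
            return bit + 2
    raise ValueError("address has no clear bits")
```

For example, under this assumption an address such as 0xffffffc000080000 yields 39, and 0xfffffe0000080000 yields 42.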
extension module, do not attempt to read the crash_notes. Since the
dumpfile was taken while running on a live system, the crash_notes,
if configured into the kernel, would not contain valid data. Without
the patch, the message "WARNING: could not retrieve crash_notes" is
displayed during session initialization.
(anderson@redhat.com)
that begin with "__crc_". Without the patch, several thousand of
them may be displayed by "sym -l" prior to the first kernel virtual
address symbol.
(anderson@redhat.com)
for Linux 3.13 and later kernels if the option is attempted, and in
the "help mount" output, similar to the deprecated "mount -d" option.
(anderson@redhat.com)
in the dumpfile header is not contained within the file, attempts
to analyze it with a vmlinux file, or using the "crash --osrelease"
or "crash --log" options with just the vmcore, will result in the
crash utility spinning forever, endlessly performing reads of 0 bytes
from the file without recognizing the EOF condition.
(dwysocha@redhat.com)
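
The failure mode described above, a read loop that never notices end-of-file, can be avoided with a check like the one in this sketch. It is a generic illustration in Python, not the crash utility's code, and `read_exact` is a hypothetical name.

```python
import io

def read_exact(stream, count):
    """Read exactly `count` bytes, failing cleanly at end-of-file."""
    chunks = []
    remaining = count
    while remaining:
        chunk = stream.read(remaining)
        if not chunk:
            # A zero-byte read means EOF; retrying would spin forever.
            raise EOFError(f"file ended with {remaining} bytes still unread")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```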
containing commit eee5cc2702929fd41cce28058dc6d6717f723f87, which
removed the super_block.s_files list_head member and the open files
list that it contained. Without the patch, the command option fails
with the error message "mount: invalid structure member offset:
super_block_s_files".
(anderson@redhat.com)
"kpatch" modules. Without the patch, the command would display
"mod: cannot find or load object file for <kpatch-module> module".
(anderson@redhat.com)
Without the patch, the command fails with a dump of the crash utility
memory allocation statistics, ending with "search: cannot allocate
any more memory!".
(anderson@redhat.com)
is followed by a vmlinux file on the crash command line. When the
crash session ends, two errors will occur:
(1) the vmlinux file will be deleted
(2) the temporary uncompressed version of the vmlinux.debug file
will remain in /var/tmp
This problem also occurs in the highly unlikely case where a
compressed vmlinux file is followed by a vmlinux.debug file on the
command line, and the uncompressed temporary version of the vmlinux
file is larger than the vmlinux.debug file. In that case:
(1) the vmlinux.debug file will be deleted
(2) the temporary uncompressed version of the vmlinux file
will remain in /var/tmp
(dmair@suse.com)
was configured with more than 4GB of memory. Without the patch, the
crash session may fail during initialization with the error message
"crash: vmlinux and <dumpfile> do not match!".
(dslutz@verizon.com)
CONFIG_ARM_LPAE. The patch implements the virtual-to-physical
address translation of 64-bit PTEs used by ARM LPAE kernels.
(sdu.liu@huawei.com, weijitao@huawei.com)
that was built with "make target=ARM64" in order to analyze ARM64
dumpfiles on an x86_64 host. Without the patch, if the extend
command is used with an extension module built in the same manner,
it fails with the message "extend: <module>.so: not an ELF format
object file".
(Jan.Karlsson@sonymobile.com)
the cgroup_name() function now utilizes kernfs_name(). Without the
patch, the command fails with the error message "runq: invalid
structure member offset: cgroup_dentry".
(anderson@redhat.com)
done from within a previously-existing build tree, "patch -N" the
gdb sources, and start the rebuild from the gdb-<version> directory
instead of the gdb-<version>/gdb directory.
(anderson@redhat.com)
and by the "sys" command when using a System.map file with a
Linux 3.0 and later debug kernel. Without the patch, the kernel
version is not displayed in parentheses following the debug kernel
name.
(anderson@redhat.com)