Commit Graph

302 Commits

Author SHA1 Message Date
Dave Anderson
c4bb18f5fc Fix for the "timer" command on Linux 4.2 and later kernels, which
contain this kernel commit that modifies the tvec_root and tvec
data structures:

  commit bc7a34b8b9ebfb0f4b8a35a72a0b134fd6c5ef50
  timer: Use hlist for the timer wheel hash buckets

Without the patch, the "timer" command will spew messages indicating
"timer: invalid list entry: 0", followed by "timer: ignoring faulty
timer list at index <number> of timer array".
(anderson@redhat.com)
2015-08-25 16:14:27 -04:00
Dave Anderson
4744ba766d If the method of determining how compound pages are linked cannot be
accomplished due to page struct related changes in upstream kernels,
issue a WARNING message during session initialization.
(anderson@redhat.com)
2015-08-19 14:22:32 -04:00
Dave Anderson
b3c6380340 Reduce the unnecessary error messages if a directory is used as a
command line argument.  Without the patch, six error messages are
displayed:

  crash: unable to read dump file /tmp
  /tmp: ELF header read: Is a directory
  /tmp: ELF header read: Is a directory
  crash: /tmp: read: Is a directory
  read_maps: unable to read header from /tmp, errno = 1
  crash: vmw: Failed to read '/tmp': [Error 21] Is a directory

With the patch applied, the functions that generate those messages
are not called; only the standard "not a supported file format",
and "Usage" messages will be displayed.
(anderson@redhat.com)
2015-08-18 16:37:16 -04:00
Dave Anderson
2152a1fdea Minor cleanup and error handling fix-up for the "dis" command.
Without the patch, if the target address of "dis -r" or "dis -f"
is not an exact address of an instruction, "dis -r" will continue
beyond the target address, and "dis -f" will show nothing.
(anderson@redhat.com)
2015-08-14 11:14:06 -04:00
Dave Anderson
2e3d3f20d3 Fix for the "dis" command on architectures with variable-length
instructions.  Without the patch, "dis [-f] <function>" may continue
beyond the end of a function, disassembling the memory that is in
between the target function and the next function.
(anderson@redhat.com)
2015-08-14 10:19:56 -04:00
Dave Anderson
cc5244d86b Fix for the S390X "dis" command to prevent jump target addresses
from being displayed as kernel system call alias/wrapper names, for
example, "SyS_read+<offset>" instead of "sys_read+<offset>".
(anderson@redhat.com)
2015-08-14 09:17:30 -04:00
Dave Anderson
48add7d9b6 Fix for the PPC64 "dis" command to prevent conditional branch
target addresses from being displayed as kernel system call
alias/wrapper names, for example, "SyS_read+<offset>" instead
of "sys_read+<offset>".
(anderson@redhat.com)
2015-08-13 16:38:14 -04:00
Dave Anderson
0807455490 Fix for the ARM64 "dis" command to prevent branch target addresses
from being displayed as kernel system call alias/wrapper names, for
example, "SyS_read+<offset>" instead of "sys_read+<offset>".
(anderson@redhat.com)
2015-08-12 16:02:16 -04:00
Dave Anderson
4935c333a6 Introduction of the "dis -f <address>" option, which disassembles
from the target address until the end of the function.
(atomlin@redhat.com)
2015-08-12 13:49:15 -04:00
Dave Anderson
3c2fc5f2a0 When searching all kernel stacks for evidence of a panic task in
"live" s390x dumpfiles created by the VMDUMP, stand-alone dump, or
"virsh dump" facilities, none of which explicitly mark the dumpfile
as a "live dump", run a standard "bt" backtrace on each kernel stack
instead of the text-address-only "bt -t".  Without the patch, an
invalid text reference may be found in a task's kernel stack due to
the common zero-based user and kernel virtual address space ranges of
the s390x, causing the task to be mistakenly set as the "PANIC" task.
(holzheu@linux.vnet.ibm.com)
2015-08-12 09:30:29 -04:00
Dave Anderson
9681db206b Second part of:
Do not search for a panic task in s390x dumpfiles that are marked
  as a "live dump"...
The first part prevented a search of the active tasks; this part
prevents the last-ditch search of all tasks.
(anderson@redhat.com)
2015-08-11 10:42:21 -04:00
Dave Anderson
67b4843394 Mark the "crash" task that generated a snapshot vmcore utilizing the
"snap.so" extension module as "(ACTIVE)" in the STATE field of
the initial system banner and the "set" command.  Without the patch,
the task's STATE field shows it as the "(PANIC)" task.
(anderson@redhat.com)
2015-08-11 10:27:04 -04:00
Dave Anderson
a640cbb1b5 Do not search for a panic task in s390x dumpfiles that are marked as
a "live dump".  Without the patch, an exhaustive, unnecessary, search
of all kernel stacks that looks for evidence of a system crash may
find an invalid reference in a task's kernel stack due to the common
zero-based user and kernel virtual address space ranges of the s390x,
causing the task to be mistakenly set as the "PANIC" task.
(holzheu@linux.vnet.ibm.com, anderson@redhat.com)
2015-08-10 14:03:27 -04:00
Dave Anderson
8119552763 Fix for the RSS value displayed by the "ps" command on big-endian
machines running Linux 2.6.34 and later.  Without the patch, a task's
RSS value will be erroneously calculated by using twice its file pages
instead of adding its file pages to its anonymous pages.
(anderson@redhat.com)
2015-08-05 15:04:25 -04:00
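The arithmetic behind the RSS fix above can be sketched as follows. This is a minimal illustration, not the crash utility's code: the rss_counters structure and both function names are hypothetical, and the sketch only contrasts the intended sum with the faulty one described in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical holder for a task's two resident-page counters. */
struct rss_counters {
    long file_pages;    /* resident file-backed pages */
    long anon_pages;    /* resident anonymous pages */
};

/* Intended calculation: file pages plus anonymous pages. */
long rss_pages(const struct rss_counters *c)
{
    assert(c != NULL);
    return c->file_pages + c->anon_pages;
}

/* Faulty calculation described above: the file page count is used twice. */
long rss_pages_faulty(const struct rss_counters *c)
{
    assert(c != NULL);
    return c->file_pages + c->file_pages;
}
```

For a task with 100 file pages and 250 anonymous pages, the first function yields 350 pages and the faulty one yields 200.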
Dave Anderson
e90f049c22 If a kdump dumpfile is marked as incomplete in its ELF or compressed
kdump header, and the user has not used the --zero_excluded command
line option, append a note to the incomplete dump WARNING message
shown during invocation that suggests the use of --zero_excluded.
(zhouwj-fnst@cn.fujitsu.com)
2015-08-04 11:50:02 -04:00
Dave Anderson
9f809b8e2c Fix for the extensions/trace.c extension module to account for
kernels that are not configured with CONFIG_TRACE_MAX_TRACER.
Without the patch, the module fails to load with the error message
"failed to init the offset, struct: trace_array, member: max_offset".
(rabinv@axis.com)
2015-08-03 14:14:23 -04:00
Dave Anderson
9c102f9948 Fix for a segmentation violation generated by the ARM64 "bt -[f|F]"
options when analyzing the active tasks in vmcores generated by the
kdump facility.  This bug is a regression that was introduced in
crash-7.1.2 by commit 15a58e4070, which
was an enhancement of the ARM64 backtrace capability for active tasks
in kdump vmcores.
(anderson@redhat.com)
2015-08-03 13:55:02 -04:00
Dave Anderson
2e3b89ed93 Fix for the "kmem -s <address>", "bt -F[F]", and "rd -S[S]"
options in kernels configured with CONFIG_SLUB.  Without the patch,
if a referenced slab object address comes from a slab cache that
utilizes a multiple-page slab, and the object is located within
a tail page of that slab cache, it will not be recognized as a slab
object.  The "bt -F[F]" and "rd -S[S]" options will just show the
object address, and the "kmem -s <address>" option will indicate
"kmem: address is not allocated in slab subsystem: <address>".
This bug is a regression that was introduced in crash-7.1.0 by commit
8b2cb365d7, which addressed a bug where
stale slab object addresses were incorrectly being recognized as
valid slab objects.
(anderson@redhat.com)
2015-07-17 10:41:32 -04:00
Dave Anderson
8eb8fcc719 Fix for the "crash --osrelease" option for flattened format dumpfiles
in the unlikely event that the dumpfile header does not contain the
VMCOREINFO note section from the original ELF /proc/vmcore.  Without
the patch, the command displays nothing instead of showing "unknown".
(anderson@redhat.com)
2015-07-14 14:56:11 -04:00
Dave Anderson
94b8342c71 crash-7.1.1 -> crash-7.1.2
2015-07-13 10:42:01 -04:00
Dave Anderson
b3be954095 If a symbol or symbol+offset argument is passed to the "dis" command,
and there are multiple text symbols with the same symbol name, then
display a message indicating that there are "duplicate text symbols
found", followed by a list of the symbols.  Without the patch, the
duplicate symbol with the lowest virtual address is used.
(atomlin@redhat.com, anderson@redhat.com)
2015-07-09 16:52:30 -04:00
Dave Anderson
21874fe737 Export the previously static symbol_name_count() function, which
returns a count of symbols with the same name.  Export a new
is_symbol_text() function, which checks whether a specified symbol
entry is of type 't' or 'T'.
(atomlin@redhat.com, anderson@redhat.com)
2015-07-09 12:56:29 -04:00
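The two helpers described in the commit above can be approximated with the sketch below. It is an illustration under stated assumptions: the sym_entry structure, the flat array it is stored in, and the names sym_is_text() and sym_name_count() are all hypothetical stand-ins, not the crash utility's actual symbol table or function signatures.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified symbol record. */
struct sym_entry {
    unsigned long value;    /* symbol address */
    const char *name;       /* symbol name */
    char type;              /* nm-style type character */
};

/* Return 1 if the entry is a text symbol (type 't' or 'T'), else 0. */
int sym_is_text(const struct sym_entry *sp)
{
    assert(sp != NULL);
    return sp->type == 't' || sp->type == 'T';
}

/* Count the entries in a table of n symbols that share the given name. */
int sym_name_count(const struct sym_entry *tab, int n, const char *name)
{
    int i, count = 0;

    assert(tab != NULL && name != NULL);
    for (i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            count++;
    return count;
}
```

A caller could combine the two to detect duplicate text symbols: a name count above 1, restricted to entries for which the text check succeeds, indicates an ambiguous argument.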
Dave Anderson
a8921b155f Update the extensions/eppic.mk file to clone the eppic source code
from https://github.com/lucchouina/eppic.git.
(lucchouina@gmail.com)
2015-07-09 10:45:18 -04:00
Dave Anderson
203853b71e Fix compiler warning generated by extensions/trace.c when compiled
with gcc version 5.  Without the patch, the message "warning: the
use of 'mktemp' is dangerous, better use 'mkstemp'" is generated.
(anderson@redhat.com)
2015-07-08 09:14:18 -04:00
Dave Anderson
39fa580b6d If the starting hexadecimal address of a function is passed to the
"dis" command without a count argument, disassemble the entire
function -- similar to when a symbol name of a function is passed
without a count argument.  Without the patch, only one instruction
is displayed.
(atomlin@redhat.com)
2015-07-06 10:05:04 -04:00
Dave Anderson
9a3cef8342 Force the 32-bit MIPS extensions/eppic.so to be compiled with -m32.
This is required when "make extensions" is executed after the top
level crash binary has been built with "make TARGET=MIPS" on an
x86_64 host.
(rabinv@axis.com)
2015-07-06 09:29:17 -04:00
Dave Anderson
c69e75877d Fix for the error handling of the "foreach task -R struct.member"
format if an invalid structure and/or member is used as an argument.
Without the patch, the command will display the expected error
indicating "task: invalid structure member reference", but then will
be followed by a stream of "task: recursive temporary file usage"
error messages.
(anderson@redhat.com)
2015-07-03 19:04:10 -04:00
Dave Anderson
0ab34ff030 Modified the qualification for the execution of the "runq -g" option.
Without the patch, if the target kernel was not configured with both
CONFIG_FAIR_GROUP_SCHED and CONFIG_RT_GROUP_SCHED, the command fails
with the message "runq: -g option not supported or applicable on this
architecture or kernel".  With this patch, if the kernel was built
with either CONFIG_FAIR_GROUP_SCHED or CONFIG_RT_GROUP_SCHED, the
command will execute.
(rabinv@axis.com)
2015-07-02 15:39:10 -04:00
Dave Anderson
3106fee2be Implementation of two new "files" command options. The "files -c"
option is context-sensitive, similar to the regular "files"
command when used without an argument, but replaces the FILE and
DENTRY columns with I_MAPPING and NRPAGES columns that reflect
each open file's inode.i_mapping address_space structure address,
and the address_space.nrpages count within it; this shows how
many of each open file's pages are currently in the system's
page cache.  The "files -p <inode>" option takes the address
of an inode, and dumps all of its pages that are currently in the
system's page cache, borrowing the "kmem -p" page structure output.
(yangoliver@gmail.com)
2015-07-02 15:16:53 -04:00
Dave Anderson
7a2ff137fe Commit f95ecdc330 above to speed up
"crash --osrelease" for flattened format dumpfiles inadvertently
broke the option for ELF kdump and compressed kdump dumpfiles.
(anderson@redhat.com)
2015-07-01 16:25:56 -04:00
Dave Anderson
4af766ba41 Fix for the "timer" command when run on a kernel with a large number
of cpus.  Without the patch, the command may fail prematurely with
a dump of the internal crash utility allocated buffer statistics
followed by the message "timer: cannot allocate any more memory!".
(anderson@redhat.com)
2015-07-01 15:30:15 -04:00
Dave Anderson
df8ab4efc2 Fix for the PPC64 "bt" command to align its exception frame verifier
function with the most recent version of the kernel's getvecname()
function, which was updated in Linux 3.12.  Without the patch, the
"Hypervisor Decrementer", "Emulation Assist", "Hypervisor Doorbell",
"Altivec Unavailable", "Instruction Breakpoint", "Denormalisation",
"HMI" and "Altivec Assist" exception types are not recognized and
their exception frames are not displayed; the "Doorbell" exception type
is marked as a "reserved" exception type.
(anderson@redhat.com)
2015-06-29 10:19:25 -04:00
Dave Anderson
af28417771 Fix for the "bt" command on little-endian PPC64 machines for tasks
that are blocked in __schedule().  Without the patch, there will be
two "__switch_to" frames displayed before the normal "__schedule"
frame that is used as the starting point for blocked tasks.
(anderson@redhat.com)
2015-06-29 10:16:18 -04:00
Dave Anderson
4c46b8b028 Fix for the PPC64 "bt" command for active non-panic tasks. Without
the patch, the backtrace may fail immediately with the error message
"bt: invalid kernel virtual address: f  type: Regs NIP value".
(anderson@redhat.com)
2015-06-29 10:13:08 -04:00
Dave Anderson
68b413b9e4 Fix for the internal memory allocation functionality. Without the
patch, in the unlikely event where the GETBUF() facility has to
utilize malloc() to allocate a buffer, and CTRL-c is entered while
that buffer is being zeroed out before being returned to the caller,
it may result in a never-ending set of "malloc-free mismatch" error
messages.
(anderson@redhat.com)
2015-06-25 11:42:54 -04:00
Dave Anderson
cd93c8a0b5 Several fixes associated with the gathering and display of task
state.  Without the patch:
  (1) The "ps" command's ST column shows "??" for tasks in the
      TASK_WAKING state.
  (2) The "ps" command's ST column shows "??" for tasks in the
      TASK_PARKED state in Linux 3.14 and later kernels.
  (3) The STATE field of the initial system banner and the "set"
      command are incorrect if the task state has the TASK_WAKING,
      TASK_WAKEKILL modifier, or TASK_PARKED bits set in Linux 3.14
      and later kernels.
  (4) The "foreach DE" task identifier fails if a task with a PID
      number of 0xDE (222) exists.
  (5) The "foreach" command's "SW", "PA", "TR" and "DE" task
      identifiers inadvertently select all tasks in kernel versions
      that do not have those states.
  (6) The "help -t" output would display incorrect values for the
      TASK_WAKEKILL, TASK_WAKING and TASK_PARKED states in Linux 3.14
      and later kernels.
Lastly, support for the TASK_NOLOAD modifier introduced in Linux 4.2
has been added to the STATE field of the "set" command and the initial
system banner.
(anderson@redhat.com)
2015-06-23 15:07:25 -04:00
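The collision in item (4) of the commit above arises because the string "DE" is also a valid hexadecimal number. The sketch below only illustrates that ambiguity and one plausible way to resolve it, by testing for a state identifier before attempting a numeric parse. The function names and the identifier table are hypothetical, the table lists only the four identifiers named in item (5), and nothing here is taken from the crash utility's source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Only the identifiers named in the commit message above. */
static const char *const state_ids[] = { "SW", "PA", "TR", "DE" };

/* Return 1 if arg exactly matches a known state identifier, else 0. */
int is_state_identifier(const char *arg)
{
    size_t i;

    assert(arg != NULL);
    for (i = 0; i < sizeof(state_ids) / sizeof(state_ids[0]); i++)
        if (strcmp(arg, state_ids[i]) == 0)
            return 1;
    return 0;
}

/* Parse arg as hexadecimal; return -1 unless every character is consumed. */
long parse_hex(const char *arg)
{
    char *end;
    long val;

    assert(arg != NULL);
    val = strtol(arg, &end, 16);
    if (end == arg || *end != '\0')
        return -1;
    return val;
}
```

Here parse_hex("DE") returns 222, so an argument parser that tries the numeric interpretation first would match a task with that PID; checking is_state_identifier() first avoids the collision.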
Dave Anderson
fbf9a6fed1 Fix for the initialization-time sorting mechanism required for
"flattened format" dumpfiles if the dumpfile is truncated/incomplete.
Without the patch, the sorting function continues performing invalid
reads beyond the end of the dumpfile, which may lead to an infinite loop
instead of a session-ending error message.  In addition, since the
sorting operation may take several minutes, a "please wait" message
with an incrementing percentage-complete counter will be displayed.
(anderson@redhat.com)
2015-06-18 15:33:50 -04:00
Dave Anderson
f95ecdc330 Speed up the "crash --osrelease" option when used with "flattened"
format dumpfiles.  Without the patch, the rearranged data array
initialization is performed before the vmcoreinfo data in the
header is read, which can take a significant amount of time with
large dumpfiles.  The patch simply looks for the appropriate
vmcoreinfo data string near the beginning of the dumpfile.
(anderson@redhat.com)
2015-06-16 16:40:19 -04:00
Dave Anderson
005eb9e502 Fix to prevent an unnecessary/temporary GETBUF() memory allocation
of 1 MB by the dump_mem_map() utility function when the kernel is
configured with CONFIG_SPARSEMEM.
(yangoliver@gmail.com)
2015-06-16 11:09:28 -04:00
Dave Anderson
cefda9be11 Fix for the S390X "bt" command when running against kernels that have
Linux 4.0 commit 2f859d0dad818765117c1cecb24b3bc7f4592074, which
removes the "async_stack" and "panic_stack" members from the "pcpu"
structure.  Without the patch, backtraces of active tasks that were
executing I/O or machine check interrupts are not displayed, while
other tasks may generate fatal readmem() errors of type "readmem_ul".
(holzheu@linux.vnet.ibm.com)
2015-06-09 10:10:52 -04:00
Dave Anderson
23939b2c44 Enabled the "crash --log vmcore" command line option on the ARM64
architecture.  Without the patch, the option fails with the message
"crash: crash --log not implemented on ARM64: TBD".
(anderson@redhat.com)
2015-06-04 14:01:38 -04:00
Dave Anderson
2e84d38c93 Enabled the "bt -R" option on the ARM64 architecture. Without the
patch, the option fails with the message "bt: -R option not supported
or applicable on this architecture or kernel".
(anderson@redhat.com)
2015-06-04 10:36:06 -04:00
Dave Anderson
15a58e4070 Enhancement of the ARM64 backtrace capability. Without the patch,
backtraces of the active tasks start at the function that is saved
in each per-cpu ELF note.  With the patch, the backtrace will start
at the "crash_kexec" function on the panicking cpu, and at the
"crash_save_cpu" function on the other active cpus.  By doing so,
the backtrace will display the exception handling functions leading
to crash_kexec() or crash_save_cpu(), as well as the exception frame
register set as it was at the time of the fatal exception on the
panic cpu, or when the shutdown IPI was received on the other cpus.
(anderson@redhat.com)
2015-06-02 16:03:11 -04:00
Dave Anderson
6b04220e76 crash-7.1.0 -> crash-7.1.1
2015-05-27 10:55:27 -04:00
Dave Anderson
8b752d7b95 Fix for the handling of ARM64 kernel module per-cpu symbols. Without
the patch, if the debuginfo data of an ARM64 kernel module that
contains a per-cpu section is loaded by "mod -s <module>" or
"mod -S", commands such as "bt" or "sym" may incorrectly translate
the module's virtual addresses to symbol names.
(Jan.Karlsson@sonymobile.com)
2015-05-27 10:43:54 -04:00
Dave Anderson
3cbecbcd3c Fix for any command that passes strings to gdb for evaluation,
where the string contains a parentheses-within-parentheses
expression along with a ">" or ">>" operator inside the outermost
set of parentheses.  Without the patch, a command such as the
following fails like so:

  crash> p ((1+1) >> 1)
  p: gdb request failed: p ((1+1)
  crash>

(anderson@redhat.com)
2015-05-21 17:28:11 -04:00
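The failure shown in the commit above comes from treating a ">" inside nested parentheses as an output redirection. A minimal sketch of one way to avoid that is to track parenthesis depth and accept a ">" only at depth zero. The function name find_redirect() is hypothetical, and this is not the crash utility's actual parsing code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the first '>' that lies outside all parentheses,
 * or -1 if the command contains no such character.
 */
int find_redirect(const char *cmd)
{
    int depth = 0;
    size_t i;

    assert(cmd != NULL);
    for (i = 0; cmd[i] != '\0'; i++) {
        if (cmd[i] == '(')
            depth++;                /* entering a parenthesized group */
        else if (cmd[i] == ')' && depth > 0)
            depth--;                /* leaving a parenthesized group */
        else if (cmd[i] == '>' && depth == 0)
            return (int)i;          /* a genuine redirection operator */
    }
    return -1;
}
```

With this approach the command "p ((1+1) >> 1)" contains no redirection, so the whole string would be passed on for evaluation, while "p (1+1) > out.txt" reports a redirection at index 8.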
Dave Anderson
042639e3f5 Enhanced the "struct.member" display capability of the "struct",
"union", "task", "list" and "tree" commands.  If a specified
structure member contains an embedded structure, the output may
be restricted to just the embedded structure by expressing the
.member argument as "member.member".  If a specified structure
member is an array, the output may be restricted to a single array
element by expressing the .member argument as "member[index]".
Furthermore, these embedded member specifications may extend beyond
one level deep, for example, by expressing the member argument as
"member.member.member", or "member[index].member".
(Alexandr_Terekhov@epam.com, anderson@redhat.com)
2015-05-21 16:46:10 -04:00
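The nested member syntax described in the commit above, such as "member.member.member" or "member[index].member", can be broken into components with a small parser. The sketch below is illustrative only: the member_ref structure, the MAX_NAME limit, and parse_member_path() are hypothetical names, and the crash utility's own handling may differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NAME 64

/* One component of a member path; index is -1 when no [index] is given. */
struct member_ref {
    char name[MAX_NAME];
    long index;
};

/*
 * Split a path such as "member[3].next" into components.
 * Returns the number of components stored in out, or -1 on a syntax error
 * or if more than max components are present.
 */
int parse_member_path(const char *path, struct member_ref *out, int max)
{
    int n = 0;

    assert(path != NULL && out != NULL);
    while (*path != '\0') {
        size_t len = strcspn(path, ".[");   /* length of the member name */

        if (len == 0 || len >= MAX_NAME || n == max)
            return -1;
        memcpy(out[n].name, path, len);
        out[n].name[len] = '\0';
        out[n].index = -1;
        path += len;

        if (*path == '[') {                 /* optional array index */
            char *end;

            out[n].index = strtol(path + 1, &end, 10);
            if (end == path + 1 || *end != ']' || out[n].index < 0)
                return -1;
            path = end + 1;
        }
        n++;

        if (*path == '.') {                 /* another level follows */
            path++;
            if (*path == '\0')
                return -1;                  /* trailing dot */
        } else if (*path != '\0') {
            return -1;                      /* unexpected character */
        }
    }
    return n;
}
```

Parsing "member[3].next" yields two components, "member" with index 3 and "next" with no index, which a caller could then resolve one level at a time.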
Dave Anderson
d4040e2fb4 Fixes for the translation of ARM64 PTEs, as displayed by the "vm -p"
and "vtop" commands.  Without the patch, if "vm -p" references a
swapped-out page on Linux 4.0 and later kernels, the SWAP location
may indicate "(unknown swap location)", and will show an invalid
OFFSET value; on Linux 3.13 and later kernels, running "vtop" on a
user virtual address incorrectly translates the PTE contents of
swapped out pages by showing a PHYSICAL address and FLAGS translation
instead of the SWAP device and OFFSET.  It is possible that there may
be PTE bit translation errors on other kernel versions; the patch
addresses the changes in ARM64 PTE bit definitions made in Linux
3.11, 3.13, and 4.0 kernels.
(anderson@redhat.com)
2015-05-21 09:55:29 -04:00
Dave Anderson
4119e19053 Fix for the DATE display in the initial system banner and by the
"sys" command to account for the Linux 3.17 change that moved
the "timekeeper" symbol and structure into a containing tk_core
structure; the "shadow_timekeeper" timekeeper will be used as an
alternative.  Without the patch, the DATE shows something within
a few hours of the Linux epoch, such as "Wed Dec 31 18:00:00 1969".
(kmcmartin@redhat.com)
2015-05-19 17:09:06 -04:00
Dave Anderson
7623eee904 Fix for the ARM64 page size determination on Linux 4.1 and later
kernels.  Without the patch, the crash session fails during
initialization with the message "crash: invalid/unsupported page
size: 98304" on kernels with 64K pages.  On kernels with 4K pages,
the message is "crash: invalid/unsupported page size: 6144".  In
addition, the "-p <page-size>" command line override option
had no effect on ARM64; that has been fixed as well.
(anderson@redhat.com)
2015-05-19 10:20:04 -04:00