595 Commits

Author SHA1 Message Date
Dave Anderson
95daa11b82 Fix for the "bpf -t" option. Although highly unlikely, without the
patch, the target function name of a BPF bytecode call instruction
may fail to be resolved correctly.
(anderson@redhat.com)
2018-06-01 15:28:55 -04:00
Dave Anderson
9446958fe2 Fix to address a "__builtin___snprintf_chk" compiler warning if bpf.c
is compiled with -D_FORTIFY_SOURCE=2.
(anderson@redhat.com)
2018-06-01 14:01:01 -04:00
Dave Anderson
da49e2010b Update for the recognition of the new x86_64 CPU_ENTRY_AREA virtual
address range introduced in Linux 4.15.  The memory range exists
above the vmemmap range and below the mapped kernel static text/data
region, and is where all of the x86_64 exception stacks have been moved.
Without the patch, reads from the new memory region fail because the
address range is not recognized as a legitimate virtual address.
Most notable is the failure of "bt" on tasks whose backtraces
originate from any of the exception stacks, which fail with the two
error messages "bt: seek error: kernel virtual address: <address>
type: stack contents" followed by "bt: read of stack at <address>
failed".
(anderson@redhat.com)
2018-06-01 10:58:00 -04:00
Dave Anderson
a6cd8408d1 Fix for the x86 and x86_64 "mach -m" option on Linux 4.12 and later
kernels to account for the structure name changes "e820map" to
"e820_table", and "e820entry" to "e820_entry", and for the symbol
name change from "e820" to "e820_table".  Also updated the display
output to properly translate E820_PRAM and E820_RESERVED_KERN entries.
Without the patch on all kernels, E820_PRAM and E820_RESERVED_KERN
entries show "type 12" and "type 128" respectively.  Without the
patch on Linux 4.12 and later kernels, the command fails with the
error message "mach: cannot resolve e820".
(anderson@redhat.com)
2018-05-31 11:43:14 -04:00
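The type translation described in the entry above can be pictured as a small lookup table.  The following is only a sketch, not the crash utility's actual code: the helper name e820_type_name() is hypothetical, the values 12 and 128 are taken from the commit message, and the remaining numeric values are assumed to match the kernel's E820 type definitions.

```c
#include <stdio.h>

/* E820 type values.  12 and 128 come from the commit message; the
 * others are assumed to match the kernel's E820 definitions. */
enum e820_type {
    E820_RAM           = 1,
    E820_RESERVED      = 2,
    E820_ACPI          = 3,
    E820_NVS           = 4,
    E820_UNUSABLE      = 5,
    E820_PMEM          = 7,
    E820_PRAM          = 12,
    E820_RESERVED_KERN = 128
};

/* Hypothetical helper: translate an E820 type into a printable name.
 * Unrecognized types fall back to "type N", formatted into buf. */
const char *e820_type_name(unsigned int type, char *buf, size_t len)
{
    switch (type) {
    case E820_RAM:           return "E820_RAM";
    case E820_RESERVED:      return "E820_RESERVED";
    case E820_ACPI:          return "E820_ACPI";
    case E820_NVS:           return "E820_NVS";
    case E820_UNUSABLE:      return "E820_UNUSABLE";
    case E820_PMEM:          return "E820_PMEM";
    case E820_PRAM:          return "E820_PRAM";
    case E820_RESERVED_KERN: return "E820_RESERVED_KERN";
    default:
        snprintf(buf, len, "type %u", type);
        return buf;
    }
}
```

Without entries for E820_PRAM and E820_RESERVED_KERN, both values would take the default branch, which corresponds to the "type 12" and "type 128" output mentioned above.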
Dave Anderson
46d2121960 Fix for the "timer -r" command on Linux 4.10 and later kernels that
contain commit 2456e855354415bfaeb7badaa14e11b3e02c8466, titled
"ktime: Get rid of the union".  Without the patch, the command fails
with the error message "timer: invalid structure member offset:
ktime_t_sec".
(k-hagio@ab.jp.nec.com)
2018-05-29 14:04:03 -04:00
Dave Anderson
393350cbd1 Mark start of 7.2.4 development phase with version 7.2.3++ 2018-05-29 14:02:47 -04:00
Dave Anderson
06fc95c733 crash-7.2.2 -> crash-7.2.3 2018-05-17 13:41:20 -04:00
Dave Anderson
6bc9665296 Fix for a third, highly unlikely, crash-7.2.2 buffer overrun
regression, that could potentially occur during session
initialization.
(anderson@redhat.com)
2018-05-17 12:14:40 -04:00
Dave Anderson
40dfdd081e Fix for a second crash-7.2.2 buffer overrun regression that may
cause the "rd -S" option to generate a segmentation violation
if a displayed memory location contains a slab object address.
(anderson@redhat.com)
2018-05-17 12:13:44 -04:00
Dave Anderson
7fcefcd4fe Fix for a crash-7.2.2 regression that may cause the "mount"
command to generate a segmentation violation.  The bug is
dependent upon the compiler version used to build the crash
utility, where a buffer overrun is not seen with more recent
versions of gcc, which hide the bug due to a different stack
layout of a function's local variables.
(anderson@redhat.com)
2018-05-17 12:12:31 -04:00
Dave Anderson
ce7363fdef Mark start of 7.2.3 development phase with version 7.2.2++ 2018-05-17 11:15:58 -04:00
Dave Anderson
ccb285d9fd crash-7.2.1 -> crash-7.2.2 2018-05-16 15:11:03 -04:00
Dave Anderson
e354204c65 Updates for the presumption that system call names begin with "sys_".
In Linux 4.17, x86_64 system calls may begin with "__x64_sys", where,
for example, "sys_read" has been replaced by "__x64_sys_read".
(anderson@redhat.com)
2018-05-15 16:33:56 -04:00
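A minimal sketch of how both naming schemes might be recognized is shown below.  The helper name syscall_base_name() is hypothetical and is not taken from the crash sources; only the two prefixes come from the commit message.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if sym looks like a system call symbol, return a
 * pointer to the bare name (e.g. "read"); otherwise return NULL.
 * The longer "__x64_sys_" prefix is checked before "sys_". */
const char *syscall_base_name(const char *sym)
{
    static const char *const prefixes[] = { "__x64_sys_", "sys_" };
    size_t i;

    for (i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++) {
        size_t n = strlen(prefixes[i]);

        if (strncmp(sym, prefixes[i], n) == 0)
            return sym + n;
    }
    return NULL;
}
```

With this sketch, "sys_read" and "__x64_sys_read" both yield "read", while an unrelated symbol such as "schedule" yields NULL.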
Dave Anderson
1e1bd9c4c1 Fix for the "bpf" command display on Linux 4.17-rc1 and later kernels,
which contain two new program types, BPF_PROG_TYPE_RAW_TRACEPOINT
and BPF_PROG_TYPE_CGROUP_SOCK_ADDR.  Without the patch, the dynamic
header string created for bpf programs overran into the bpf map
header, creating one long combined header string.
(anderson@redhat.com)
2018-05-11 15:54:32 -04:00
Dave Anderson
6946bc2e95 Trivial formatting fix to "bpf" help page.
(anderson@redhat.com)
2018-05-08 15:11:27 -04:00
Dave Anderson
8a846ffa5c Fix for infrequent failures of the x86 "bt" command to handle cases
where a user space task has "resume_userspace" or "entry_INT80_32"
at the top of the stack, or was interrupted by the crash NMI
while handling a timer interrupt.  Without the patch, the backtrace
would be preceded by the error message "bt: cannot resolve stack
trace", followed by a dump of the text symbols found on the stack and
all possible exception frames.
(anderson@redhat.com)
2018-05-08 13:49:57 -04:00
Dave Anderson
483f98dab7 Fix for a compilation error of the new "bpf.c" file when building
on older host systems where CLOCK_BOOTTIME does not exist.
(anderson@redhat.com)
2018-05-08 09:28:32 -04:00
Dave Anderson
23b23ce165 Second stage of the new "bpf" command. This patch adds additional
per-program and per-map data for the "bpf -p ID" and "bpf -m ID"
options, containing data items shown by the "bpftool prog list"
and "bpftool map list" options; new "bpf -P" and "bpf -M" options
have been added that dump the extra data for all loaded programs
or maps.
(anderson@redhat.com)
2018-05-07 11:45:21 -04:00
Dave Anderson
759dc0c50d Fix, and an update, for the "ipcs" command. The fix addresses an
error where IPCS entries are not displayed because of a faulty
read of the "deleted" member of the embedded "kern_ipc_perm" data
structure.  The "deleted" member was being read as a 4-byte integer,
but since it is declared as a "bool" type, only the lowest byte gets
set to 1 or 0.  Since the structure is not zeroed-out when allocated,
stale data may be left in the upper 3 bytes, and the IPCS entry
gets rejected.  The update is required for Linux 4.11 and greater
kernels, which reimplemented the IDR facility to use radix trees
in kernel commit 0a835c4f090af2c76fc2932c539c3b32fd21fbbb, titled
"Reimplement IDR and IDA using the radix tree".  Without the patch,
if any IPCS entry exists, the command would fail with the message
"ipcs: invalid structure member offset: idr_top".
(anderson@redhat.com)
2018-04-30 10:38:26 -04:00
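The faulty read of the "deleted" member described above can be reproduced in isolation.  The sketch below is not the crash utility's code; the function names are hypothetical, and it assumes, as the commit message states, that the kernel's "bool" occupies a single byte.

```c
#include <stdint.h>
#include <string.h>

/* Faulty read: treat the 4 bytes starting at the member's address as an
 * integer.  Stale data in the 3 bytes after the bool makes it non-zero. */
int deleted_as_int(const unsigned char *member)
{
    int32_t value;

    memcpy(&value, member, sizeof(value));
    return value != 0;
}

/* Corrected read: only the single byte that holds the bool is examined. */
int deleted_as_bool(const unsigned char *member)
{
    return member[0] != 0;
}
```

Given the raw bytes { 0x00, 0x5a, 0x5a, 0x5a }, where the bool itself is 0 and the following bytes hold stale data, deleted_as_int() reports the entry as deleted while deleted_as_bool() does not.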
Dave Anderson
48b1708609 For live system analysis, if neither "/dev/mem" nor the "/dev/crash"
memory driver exists, try to use "/proc/kcore".  Without
the patch, the session fails immediately with the error message
"crash: /dev/mem: No such file or directory".
(anderson@redhat.com)
2018-04-26 14:05:00 -04:00
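The fallback described above amounts to picking the first memory source that exists.  The following is a rough sketch under stated assumptions: the function and array names are hypothetical, and the relative order of "/dev/mem" and "/dev/crash" is assumed, since the commit message only states that "/proc/kcore" is tried when neither of the other two exists.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: return the first path in the list that exists,
 * or NULL if none of them do. */
const char *first_existing(const char *const *paths, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (access(paths[i], F_OK) == 0)
            return paths[i];
    return NULL;
}

/* Assumed candidate order; only the position of /proc/kcore as the
 * last resort is taken from the commit message. */
static const char *const live_memory_sources[] = {
    "/dev/mem", "/dev/crash", "/proc/kcore"
};

/* Hypothetical wrapper: choose the memory source for a live session. */
const char *pick_live_memory_source(void)
{
    return first_existing(live_memory_sources,
                          sizeof(live_memory_sources) /
                          sizeof(live_memory_sources[0]));
}
```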
Dave Anderson
d66564ae3a Fix for the determination of the ARM64 phys_offset value when
running live against /proc/kcore.  Without the patch, the message
"WARNING: cannot access vmalloc'd module memory" may be displayed
during session initialization, and vmalloc/module memory will be
inaccessible.  It should be noted that at the time of this patch,
the upstream (4.16.0) version of /proc/kcore does not work correctly
for ARM64, because PT_LOAD segments for unity-mapped blocks of
physical memory are not generated.
(anderson@redhat.com)
2018-04-25 17:02:54 -04:00
Dave Anderson
6504149678 Fix for an s390x session initialization-time warning that indicates
"WARNING: cannot determine MAX_PHYSMEM_BITS" on Linux 4.15 and later
kernels containing commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4,
which changed the data type of "mem_section" from an array to a
pointer.  Without the patch, the s390x manner of determining
MAX_PHYSMEM_BITS fails because it presumes that "mem_section" is
an array, and as a result, displays the warning message.
(anderson@redhat.com)
2018-04-24 11:52:17 -04:00
Dave Anderson
83b7cbc612 Fix for a compilation error on ARM64. Without the patch, the
compilation of the new bpf.c file fails with the error message
"bpf.c:881:18: error: conflicting types for 'u64'".
(anderson@redhat.com)
2018-04-23 11:52:56 -04:00
Dave Anderson
4e8c1f3720 Fix for the "ps -a" option for a user task that has utilized
"prctl(PR_SET_MM, ...)" to self-modify its memory map such
that the stack locations of its command line arguments and
environment variables are not contiguous.  Without the
patch, the command may fail with a dump of the crash utility's
internal buffer usage statistics followed by "ps: cannot allocate
any more memory!".
(k-hagio@ab.jp.nec.com)
2018-04-20 16:17:37 -04:00
Dave Anderson
11eceac4ef Fixes to address several gcc-8.0.1 compiler warnings that are generated
when building with "make warn".  The warnings are all false alarm
messages of type [-Wformat-overflow=], [-Wformat-truncation=] and
[-Wstringop-truncation]; the affected files are extensions.c, task.c,
kernel.c, memory.c, remote.c, symbols.c, filesys.c and xen_hyper.c.
(anderson@redhat.com)
2018-04-20 14:37:52 -04:00
Dave Anderson
dacfbe8ab1 Introduction of a new "bpf" command that displays information about
loaded eBPF (extended Berkeley Packet Filter) programs and maps.
Because of its upstream fluidity, the capabilities of this command
will be an ongoing task.  In its initial form, the command displays
the addresses, basic information, and key data structures of eBPF
programs and maps.  It also translates the bytecode, and disassembles
the jited code, of loaded eBPF programs.
(anderson@redhat.com)
2018-04-19 15:53:40 -04:00
Dave Anderson
9aa345148c Display a fatal error message if the "tree -l" option is attempted
with radix trees.  Without the patch, the option would be silently
ignored.
(neelx@redhat.com)
2018-04-18 09:36:25 -04:00
Dave Anderson
90642e6ffa Added a new "tree -l" option for the rbtree display, which dumps
the tree sorted in linear order, starting with the leftmost node and
progressing to the right.  Also, if a corrupted rb_node pointer is
encountered, do not fail immediately, but rather display the rb_node
address and the corrupt pointer and continue.
(neelx@redhat.com)
2018-04-17 10:12:02 -04:00
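The linear order produced by "tree -l" corresponds to an in-order walk of the tree.  The sketch below uses a simplified, hypothetical node layout (a key plus left and right pointers) rather than the kernel's rb_node, and is not the crash utility's implementation.

```c
#include <stddef.h>

/* Simplified stand-in for a binary tree node; the kernel's rb_node
 * carries no key and has a different layout. */
struct node {
    int key;
    struct node *left;
    struct node *right;
};

/* In-order walk: visit the left subtree, then the node, then the right
 * subtree, so keys are emitted starting with the leftmost node and
 * progressing to the right.  Stores at most max keys and returns the
 * number stored so far. */
size_t walk_linear(const struct node *n, int *out, size_t pos, size_t max)
{
    if (n == NULL)
        return pos;
    pos = walk_linear(n->left, out, pos, max);
    if (pos < max)
        out[pos++] = n->key;
    return walk_linear(n->right, out, pos, max);
}
```

For a root holding 2 with children 1 (left) and 3 (right), the walk stores 1, 2, 3.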
Dave Anderson
6588de928a Speed up the "ps -r" option by stashing the length of the
task_struct.rlim or signal_struct.rlim array in the internal
array_table[].  Without the patch, the length of the array
is determined by a call to the embedded gdb module for each
task, and as a result, the command takes a minute or more
per 1000 tasks.  With the patch applied, it only takes about
0.5 seconds per 1000 tasks.
(k-hagio@ab.jp.nec.com)
2018-04-16 16:10:44 -04:00
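The speedup comes from performing the expensive lookup once and reusing the stashed result.  A minimal sketch of that pattern follows; all names are hypothetical, the returned length of 16 is a placeholder, and the real code stores the value in the internal array_table[] rather than in a standalone variable.

```c
/* Counts how often the simulated expensive query is made. */
static int slow_lookups;

/* -1 means the length has not been determined yet. */
static int cached_rlim_len = -1;

/* Stand-in for the costly call into the embedded gdb module. */
int query_array_len_slow(void)
{
    slow_lookups++;
    return 16;   /* placeholder value for illustration only */
}

/* Return the array length, querying only on the first call. */
int rlim_array_len(void)
{
    if (cached_rlim_len < 0)
        cached_rlim_len = query_array_len_slow();
    return cached_rlim_len;
}
```

However many tasks are processed, query_array_len_slow() runs only once.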
Dave Anderson
c1a8d0c968 Optimization of the crash startup time and "ps" command processing
time when analyzing dumpfiles/systems with extremely large task
counts.  For example, running with a dumpfile containing over a
million tasks, startup time and "ps" processing time was reduced
from 45 minutes to less than 40 seconds.
(gthelen@google.com)
2018-04-10 11:18:14 -04:00
Dave Anderson
bb3f55c28d Speed up the "bt" command by avoiding the text value cache that
was put in place many years ago when the crash utility supported the
analysis of remote dumpfiles using the deprecated "crash daemon"
running on the remote host.  The performance improvement will be
most noticeable when running the first instance of "foreach bt",
where there would often be a "hitch" when it was determining the
framesize of kernel module text return addresses.
(anderson@redhat.com)
2018-04-05 16:31:53 -04:00
Dave Anderson
33421b0b5a Fix for the x86_64 "bt" command for kernels that are configured with
CONFIG_FRAME_POINTER.  Without the patch, the per-text-return-address
framesize cache may contain invalid entries for functions that have
an "and $0xfffffffffffffff0,%rsp" instruction in their prologue,
which aligns the stack on a 16-byte boundary; therefore any cached
framesize for a text-return-address in such a function may be
incorrect depending upon the alignment of the stack address of a
calling function.  If an invalid cached framesize is utilized by
"bt", the backtrace may skip over several frames, or may display
one or more invalid (stale) frames.  The patch introduces a new
cache that contains functions for which framesize values should
not be cached.
(anderson@redhat.com)
2018-04-05 16:05:51 -04:00
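The reason a cached framesize cannot be trusted for such functions can be shown with plain arithmetic.  The sketch below only models the effect of the "and $0xfffffffffffffff0,%rsp" instruction; the function names and the sample stack addresses are hypothetical.

```c
#include <stdint.h>

/* Model of "and $0xfffffffffffffff0,%rsp": round the stack pointer
 * down to a 16-byte boundary. */
uint64_t align_rsp_16(uint64_t rsp)
{
    return rsp & 0xfffffffffffffff0ULL;
}

/* Number of bytes the alignment step adds to the frame.  It depends on
 * the incoming stack pointer, not on the function itself. */
uint64_t alignment_slack(uint64_t rsp)
{
    return rsp - align_rsp_16(rsp);
}
```

An incoming stack pointer ending in 0x78 loses 8 bytes to alignment, while one ending in 0x70 loses none, so the same text return address can be associated with two different framesizes depending on the caller's stack alignment.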
Dave Anderson
6088a29f7e Fix for the "bt" command on 4.16 and later kernels in which the
"thread_union" data structure is not contained in the vmlinux file's
debuginfo data.  Without the patch, the kernel stack size is not
calculated correctly, and defaults to 8K.  As a result "bt" fails
with the message "bt: invalid RSP: <address> bt->stackbase/stacktop:
<address>/<address> cpu: <number>".
(efault@gmx.de)
2018-04-05 11:07:59 -04:00
Dave Anderson
5d172b230c Commit 45b74b8953 added support for
calculating phys_base and the mapped kernel offset for KASLR-enabled
kernels on SADUMP dumpfiles by using a technique developed by Takao
Indoh. Originally, the patchset included support for kdumps, but this
was dropped in v2, as it was deemed unnecessary due to the upstream
implementation of the "vmcoreinfo device" in QEMU.  However, there
are still several reasons for which the vmcoreinfo device may not be
present at the time when a memory dump is taken from a VM, ranging
from a host running older QEMU/libvirt versions, to misconfigured VMs
or environments running hypervisors that don't support this device.
This patchset generalizes the KASLR-related functions from sadump.c
and moves them to kaslr_helper.c, and makes kdump analysis fall back
to KASLR offset calculation if vmcoreinfo data is missing.
(slp@redhat.com)
2018-03-29 10:26:29 -04:00
Dave Anderson
907196e93d VMware VMSS dumpfiles contain the state of each vCPU at the time
when the VM was suspended.  This patch enables crash to read the
relevant registers from each vCPU state for use as the starting hooks
by the "bt" command.  Also, support for "help -[D|n]" to display
dumpfile contents, and "help -r" to display vCPU register sets has
been implemented.  This is also the first step towards implementing
automatic KASLR offset calculations for VMSS dumpfiles.
(slp@redhat.com)
2018-03-26 13:56:29 -04:00
Dave Anderson
df4679ddb5 Fix the "help foreach" argument list to include the new "gleader"
task qualifier option that was added in version 7.1.2.
(anderson@redhat.com)
2018-03-22 09:22:51 -04:00
Dave Anderson
f4072ebf13 Fixes for 32-bit X86 "bt" command on kernels that have been compiled
with retpoline gcc support.  Without the patch, backtraces may fail
with the error message "bt: cannot resolve stack trace", followed by
the text symbols found on the stack and possible exception frames.
(anderson@redhat.com)
2018-03-14 10:26:20 -04:00
Dave Anderson
4141373d9d Implemented the x86_64 machdep->is_page_ptr() plugin function. If
the kernel is configured with CONFIG_SPARSEMEM_VMEMMAP, the plugin
function optimizes the mem_section search, reducing the computation
effort and time consumed by commands that repeatedly call the
is_page_ptr() function on large-memory systems.
(k-hagio@ab.jp.nec.com)
2018-03-06 11:26:05 -05:00
Dave Anderson
d586679b86 As the first step in optimizing the is_page_ptr() function, save
the maximum SPARSEMEM section number during initialization, and
use it as the topmost delimiter in subsequent mem_section searches.
Also allow for per-architecture machdep->is_page_ptr() plugin functions.
(anderson@redhat.com)
2018-03-02 14:53:16 -05:00
Dave Anderson
6de5d2c034 Implemented a new "ps -A" option that restricts the task output to
just the active tasks on each cpu.
(atomlin@redhat.com)
2018-03-01 09:39:29 -05:00
Dave Anderson
a002f07040 Fix the search for the booted kernel on a live system to prevent
selecting the unusable "vmlinux.o" file found in private build
directories.  Without the patch, the non-executable vmlinux.o file
may be selected, and the resulting fatal error message indicates a
somewhat misleading "crash: cannot resolve _stext".
(bhsharma@redhat.com, anderson@redhat.com)
2018-02-28 16:13:51 -05:00
Dave Anderson
764e2d0997 Fix to support Linux 4.16-rc1 and later ARM64 kernels, which
fail during session initialization with the error message
"crash: cannot determine page size".  The failure to determine
the page size is due to the combination of the following kernel
commits:
  - Linux 4.6 commit 6ad1fe5d9077a1ab40bf74b61994d2e770b00b14
    arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
  - Linux 4.10 commit 4b65a5db362783ab4b04ca1c1d2ad70ed9b0ba2a
    arm64: Introduce uaccess_{disable,enable} functionality based on TTBR0_EL1
  - Linux 4.16 commit 1e1b8c04fa3451e2b7190930adae43c95f0fae31
    arm64: entry: Move the trampoline to be before PAN
(takahiro.akashi@linaro.org)
2018-02-15 14:43:10 -05:00
Dave Anderson
b732bceec9 Mark start of 7.2.2 development phase with version 7.2.1++ 2018-02-15 14:40:52 -05:00
Dave Anderson
5fb0e4b6fb crash-7.2.0 -> crash-7.2.1 2018-02-13 14:44:06 -05:00
Dave Anderson
ddace9720f Fix for the ARM64 "bt" command in kernels that contain commit
30d88c0e3ace625a92eead9ca0ad94093a8f59fe, titled "arm64: entry:
Apply BP hardening for suspicious interrupts from EL0".  Without
the patch, there may be invalid kernel exception frames
displayed on an active task's kernel stack, often below a stackframe
of the "do_el0_ia_bp_hardening" function; the address translation
of the PC and LR values in the bogus exception frame will
display "[unknown or invalid address]".
(anderson@redhat.com)
2018-02-09 16:26:27 -05:00
Dave Anderson
a38e3ec4cb Fix for the ARM64 "bt" command running against Linux 4.14 and
later kernels.  Without the patch, the backtraces of the active
tasks in a kdump-generated dumpfile are truncated:  the panic
task will just show the "crash_kexec" frame
and the kernel-entry user-space exception frame; the non-panic
tasks will show their backtraces starting from the stackframe
addresses captured in the per-cpu NT_PRSTATUS notes, and will
not display the exception frame generated by the NMI callback,
nor any stackframes on the IRQ stack.
(anderson@redhat.com)
2018-02-09 14:58:34 -05:00
Dave Anderson
fad29db973 Fix the sample crash.ko memory driver to prevent an s390x kernel
addressing exception.  Legitimate pages of RAM that successfully
pass the page_is_ram() and pfn_valid() verifier functions may not
be provided by the s390x hypervisor, and the memcpy() from the
non-existent memory to the bounce buffer panics the kernel.  The
patch replaces the memcpy() call with probe_kernel_read().
(anderson@redhat.com)
2018-02-08 10:23:50 -05:00
Dave Anderson
e4499a9de6 Since Xen commit 666aca08175b ("sched: use the auto-generated list of
schedulers") crash cannot open Xen vmcores because the "schedulers"
symbol no longer exists.  Xen 4.7 implemented schedulers as its own
section in "xen/arch/x86/xen.lds.S", delimited by the two symbols
"__start_schedulers_array" and "__end_schedulers_array".  Without
the patch, the crash session fails during initialization with the
error message "crash: cannot resolve schedulers".
(npajkovsky@suse.cz)
2018-02-02 12:04:03 -05:00
Dave Anderson
b5a331ac2b Add a new "foreach gleader" qualifier option, restricting the output
to user-space tasks that are thread group leaders.
(Jan.Karlsson@sony.com)
2018-02-02 11:28:14 -05:00
Dave Anderson
e9ae5eb974 Xen commit 615588563e99a23aaf37037c3fee0c413b051f4d (Xen 4.0.0)
extended the direct mapping to 5 TB.  This area was previously
reserved for future use, so it is OK to simply change the upper
bound unconditionally.
(ptesarik@suse.com)
2018-02-02 09:34:55 -05:00