In x86_64_irq_eframe_link_init(), a "push xxx" instruction is searched for in the
address range from "common_interrupt" to the next nearby symbol, in order to
calculate the value of irq_eframe_link. The search distance is given by
max_instructions, which is calculated as the end address of the range minus the
start address. Crash then asks gdb to disassemble max_instructions instructions.
Using max_instructions as the number of instructions to disassemble is inappropriate
because most x86_64 instructions are longer than 1 byte; as a consequence, many
more instructions than are actually needed get disassembled.
In crash with gdb-7.6, the extra instructions are skipped by "if (!strstr(buf, sp->name))",
which ends the search at the first instruction that does not belong to the symbol:
0xffffffff8005d5b4 <common_interrupt+0>: cld
0xffffffff8005d5b5 <common_interrupt+1>: sub $0x48,%rsp
...
0xffffffff8005d61e <common_interrupt+106>: leaveq
0xffffffff8005d61f <exit_intr>: mov %gs:0x10,%rcx <--- searching stops here
...
In crash with gdb-10.2, "exit_intr" is not shown in the disassembly even though the
symbol really exists. As a result, the search for "push xxx" runs on to the wrong place:
0xffffffff8005d5b4 <common_interrupt+0>: cld
0xffffffff8005d5b5 <common_interrupt+1>: sub $0x48,%rsp
...
0xffffffff8005d61e <common_interrupt+106>: leave
0xffffffff8005d61f <common_interrupt+107>: mov %gs:0x10,%rcx <--- searching continues
...
(gdb) p exit_intr
$1 = {<text variable, no debug info>} 0xffffffff8005d61f <common_interrupt+107>
(gdb) info symbol exit_intr
common_interrupt + 107 in section .text
The previous way of determining the start and end of the search range is not stable, and
it can cause a regression in which the "bt" command prints a wrong IRQ stack. This patch
fixes the bug by removing the max_instructions calculation and directly asking gdb to
disassemble the address range from "common_interrupt" to the next nearby symbol.
Signed-off-by: Tao Liu <ltao@redhat.com>
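The range-bounded search described above can be sketched as follows. This is
only an illustration, not the actual crash code: find_push_in_range() is a
hypothetical helper, and it assumes disassembly lines in the
"0xADDR <symbol+offset>: instruction" form shown in the examples. The stop
condition is the instruction address, so it does not depend on which symbol
name gdb chooses to print.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: scan gdb-style disassembly lines and return the
 * index of the first "push" instruction whose address lies in
 * [start, end). Returns -1 if there is none.
 */
int find_push_in_range(const char *const *lines, size_t nlines,
                       unsigned long long start, unsigned long long end)
{
    for (size_t i = 0; i < nlines; i++) {
        char *rest;
        unsigned long long addr = strtoull(lines[i], &rest, 16);

        if (rest == lines[i])
            continue;           /* no leading address on this line */
        if (addr < start)
            continue;           /* before the range */
        if (addr >= end)
            break;              /* reached the next symbol: stop */

        /* Look only at the text after "<symbol+offset>:" so that a
         * symbol name containing "push" cannot match. */
        const char *insn = strchr(rest, ':');
        if (insn != NULL && strstr(insn, "push") != NULL)
            return (int)i;
    }
    return -1;
}
```

With the gdb-10.2 output above, passing the address of "exit_intr" as the end
of the range stops the search at 0xffffffff8005d61f even though that line is
labelled <common_interrupt+107>.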
Provide an API for crash_target to fetch the registers of a given
CPU. This allows gdb to perform commands such as "bt",
"frame", and "info locals".
The high-level API is crash_get_cpu_reg(). It calls the machine
(architecture) specific function machdep->get_cpu_reg().
Input arguments such as the register number and register size
come from gdb arch information, so get_cpu_reg()
implementations in crash must understand them.
Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
gdb-10 produces reduced output for the `bt` command.
The changed disassembler output is the reason for the missing frames
in the backtrace: the call instruction mnemonic for x86_64 was changed
from "callq" to "call" in gdb-10.
Fix the issue by adding a search for the word "call" in the disassembler
parser.
Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
Reported-by: Kazuhito Hagio <k-hagio-ab@nec.com>
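A minimal sketch of a check that accepts both mnemonics is shown below. The
helper name is_call_insn() is hypothetical and the matching is stricter than
a plain substring search; it is not the parser change made in crash.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if a disassembly line holds a call
 * instruction, whether it is printed as "callq" (gdb-7.6) or "call"
 * (gdb-10).
 */
bool is_call_insn(const char *line)
{
    /* Skip the "0xADDR <symbol+offset>:" prefix if there is one. */
    const char *insn = strchr(line, ':');

    insn = insn ? insn + 1 : line;
    while (*insn == ' ' || *insn == '\t')
        insn++;

    if (strncmp(insn, "call", 4) != 0)
        return false;
    insn += 4;
    if (*insn == 'q')
        insn++;

    /* The mnemonic must end here, so longer words do not match. */
    return *insn == ' ' || *insn == '\t' || *insn == '\0';
}
```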
Main changes:
[1] update gdb-7.6.patch to gdb-10.2.patch, and keep all functionality
and good compatibility
[2] remove unneeded patches(gdb-7.6-proc_service.h.patch and
gdb-7.6-ppc64le-support.patch)
[3] to make the c++ compiler happy, add the extern "C" to eliminate
compilation issues, also add CXXFLAGS=-m32 to generate proper
32bit object files
[4] the parameter types of some functions are changed, eg, the set of
prettyprint variables
[5] eliminate error_hook() and SJLJ while running in C++ code (after
gdb_command_funnel()); use the try-catch mechanism instead
[6] request_types() is reworked so that it no longer calls
GNU_GET_NEXT_DATATYPE multiple times, but makes a single use of
GNU_ITERATE_DATATYPES with a proper callback instead. The complete
iteration now happens on the C++ side.
[7] remove "struct global_iterator" from request structure, but add
several fields (including callback pointer) to be able to perform
iteration on C++ side
[8] type of "linux_banner" symbol is reported as 'D' by new gdb as its
section ".rodata" marked as writable in vmlinux
[9] BFD API has changed.
[10] deprecated_command_loop_hook has been deprecated, so call crash's
main_loop() directly from gdb's captured_main()
[11] remove the hooks previously used for that in target.c. Add crash_target
for gdb to provide target operations such as xfer_partial to read
and write crash dump memory.
Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Since at least kernel v2.6.30 the __per_cpu_offset gets initialized to
__per_cpu_load. So first check if the __per_cpu_offset was set to a
proper value before reading any per cpu variable to prevent potential
bugs.
[ kh: added check for the existence of __per_cpu_load ]
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
The upstream kernel commit edcb5cf84f05e5d2e2af25422a72ccde359fcca9
("x86/paravirt/xen: Remove xen_patch()") broke crash compatibility.
This change adds a check for both symbols: "xen_patch" and
its replacement, "paravirt_patch_default". Without the patch,
crash fails with an error message like this:
crash: seek error: physical address: 83640e000 type: "pud page"
Resolves: https://github.com/crash-utility/crash/issues/78
Closes: https://github.com/crash-utility/crash/pull/79
Signed-off-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
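The idea of checking for either symbol can be sketched as below. The symbol
table here is a hypothetical NULL-terminated list of names, and
symbol_present() and pick_patch_symbol() are illustrative names only; the
sketch does not show how crash looks up symbols or what it does with the
result.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a symbol lookup: search a NULL-terminated
 * list of symbol names. */
int symbol_present(const char *const *symtab, const char *name)
{
    for (size_t i = 0; symtab[i] != NULL; i++)
        if (strcmp(symtab[i], name) == 0)
            return 1;
    return 0;
}

/* Return whichever of the two symbols the kernel provides, trying the
 * older name first, or NULL if neither exists. */
const char *pick_patch_symbol(const char *const *symtab)
{
    static const char *const candidates[] = {
        "xen_patch",
        "paravirt_patch_default",
    };

    for (size_t i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++)
        if (symbol_present(symtab, candidates[i]))
            return candidates[i];
    return NULL;
}
```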
Fix "bt" command on Linux 5.12-rc1 and later kernels that contain
commit 951c2a51ae75 ("x86/irq/64: Adjust the per CPU irq stack pointer
by 8"). Without the patch, the "bt" command and some of its options
that read irq stack fail with the error message "bt: read of stack at
<address> failed".
Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
Linux 5.10 introduced SEV-ES support, which added a new (5th)
exception stack: 'VC_stack'.
'struct exception_stacks' cannot be used to obtain the size
of the VC stack, as the size of the VC stack is zero there. Try
another structure, 'struct cea_exception_stacks', first, as it
represents the actual CPU entry area with valid stack sizes and
guard pages.
Handle the case where the VC stack is not mapped (present).
This happens when SEV-ES is not active or not supported.
Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
Use the LA57 bit in CR4 to check whether 5-level paging is enabled.
Initialize machdep to the 5-level paging operation mode used by
x86_64_kvtop().
Replace the *_get_cr3_idtr() set of functions with *_get_cr3_cr4_idtr().
[ kh: added malloc for p4d page ]
Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
Acked-by: Lianbo Jiang <lijiang@redhat.com>
Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
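For illustration, the CR4 check can be sketched as below. LA57 is bit 12 of
CR4 on x86_64; the macro and function names are made up for this sketch and
are not the ones used in crash.

```c
#include <stdbool.h>
#include <stdint.h>

/* CR4.LA57 (bit 12): set when 5-level paging is enabled. */
#define CR4_LA57 (1ULL << 12)

/* Return true if the saved CR4 value indicates 5-level paging. */
bool la57_enabled(uint64_t cr4)
{
    return (cr4 & CR4_LA57) != 0;
}

/* Number of page table levels implied by the saved CR4 value. */
int paging_levels(uint64_t cr4)
{
    return la57_enabled(cr4) ? 5 : 4;
}
```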
x86_64_exception_frame() called with combined flags including
EFRAME_VERIFY does not perform the verify. It's only done when
EFRAME_VERIFY is the only flag set.
Correct the condition so that the verify is performed whenever the
EFRAME_VERIFY flag is set, i.e. verify requests are always performed.
This fixes stack overrun "seek errors" seen on an x86_64 core when
backtracing a PID at an IRQ stack where the interrupt handler doesn't
save a pt_regs. Higher layers than the top frame on the IRQ stack were
not displayed.
However, this breaks bt -e and bt -E for exceptions on userspace stacks. Those
use the constant 0 as the kvaddr argument to x86_64_exception_frame()
and pass the userspace stack position in the local argument.
x86_64_exception_frame() only verifies the kvaddr argument. Zero is not
accessible and EFRAME_VERIFY always fails for those cases.
Modify the EFRAME_VERIFY block in x86_64_exception_frame() to choose
kvaddr or local to verify using the same condition used to assign one of
them to pt_regs_buf later in the same function.
[ kh: modified commit message ]
Signed-off-by: David Mair <dmair@suse.com>
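The difference between the two flag tests can be sketched as follows. The
flag values and function names are assumptions made for the example; only
the comparison-versus-mask distinction is taken from the description above.

```c
/* Illustrative flag values only; the real definitions live in crash. */
#define EFRAME_PRINT  0x1UL
#define EFRAME_VERIFY 0x2UL

/* Old behaviour: true only when EFRAME_VERIFY is the sole flag set. */
int verify_requested_old(unsigned long flags)
{
    return flags == EFRAME_VERIFY;
}

/* Corrected behaviour: true whenever the EFRAME_VERIFY bit is set,
 * regardless of what other flags accompany it. */
int verify_requested_new(unsigned long flags)
{
    return (flags & EFRAME_VERIFY) != 0;
}
```

With combined flags such as EFRAME_PRINT|EFRAME_VERIFY, the old test skips
the verify while the corrected test performs it.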
Add support for 1GB huge pages to "vtop" command on x86_64. Without
this patch, the command with a user virtual address corresponding to
a 1GB huge page fails with the error message "vtop: seek error:
physical address: <address> type: "page table"":
crash> vtop 7f6e40000000
VIRTUAL PHYSICAL
vtop: seek error: physical address: 3f53f000f000 type: "page table"
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Chu Kaiping <chukaiping@foxmail.com>
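As a rough sketch of what 1GB huge page translation involves: on x86_64 a
PUD entry with the page-size bit (bit 7) set maps a 1GB page directly, so
the walk must stop there instead of treating the entry as a pointer to a
lower-level table. The names below are illustrative and are not the ones
used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_PSE       (1ULL << 7)             /* page-size bit in the entry */
#define PUD_PAGE_SIZE  (1ULL << 30)            /* 1GB */
#define PHYS_ADDR_MASK 0x000ffffffffff000ULL   /* physical address bits 51:12 */

/* A PUD entry with PSE set maps a 1GB page; there is no PMD/PTE below it. */
bool pud_is_1gb_page(uint64_t pud)
{
    return (pud & PAGE_PSE) != 0;
}

/* Physical address = 1GB-aligned base from the entry + offset within 1GB. */
uint64_t pud_1gb_to_phys(uint64_t pud, uint64_t vaddr)
{
    uint64_t base = pud & PHYS_ADDR_MASK & ~(PUD_PAGE_SIZE - 1);

    return base + (vaddr & (PUD_PAGE_SIZE - 1));
}
```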
the error message "crash: cannot resolve init_tss". This is caused
by a change in the Xen hypervisor with commit 78884406256, from
4.12.0-rc5-763-g7888440625. In that patch the tss_struct structure
was renamed to tss64 and the tss_page structure was introduced,
which contains a single tss64. Now tss information is accessible
via the symbol "per_cpu__tss_page".
(dietmar.hahn@ts.fujitsu.com)
are NOT configured with CONFIG_RANDOMIZE_BASE and have backported
kernel commit d52888aa2753e3063a9d3a0c9f72f94aa9809c15, titled
"x86/mm: Move LDT remap out of KASLR region on 5-level paging",
which modified the 4-level and 5-level paging PAGE_OFFSET values.
Without this patch, the crash session fails during initialization
with the error message "crash: seek error: kernel virtual address:
<address> type: "tss_struct ist array"".
(anderson@redhat.com)
to allow the use of a negative decimal number as the value. Without
the patch, only the hexadecimal representation of the value would be
accepted.
(v-santy@microsoft.com, anderson@redhat.com)
019b17b3ffe48100e52f609ca1c6ed6e5a40cba1, titled "x86/exceptions: Add
structs for exception stacks". Without the patch, the exception
stack sizes cannot be determined, and as a result backtraces
that initiate from an exception stack will fail with error messages
indicating "bt: invalid kernel virtual address: <address> type:
stack contents" and then "bt: read of stack at <address> failed".
(anderson@redhat.com)
e6401c13093173aad709a5c6de00cf8d692ee786, titled "x86/irq/64: Split
the IRQ stack into its own". Without the patch, the per-cpu IRQ
stack addresses cannot be determined, and as a result backtraces
that utilize an IRQ stack will fail.
(anderson@redhat.com)
by the KVM "virsh dump" facility if the kernel is KASLR-enabled and
does not have the phys_base value stored in vmcoreinfo data. Without
the patch, the message "WARNING: cannot determine physical base
address: defaulting to 0" is displayed, and the crash session fails
to initialize.
(zhiche.yy@alibaba-inc.com)
are NOT configured with CONFIG_RANDOMIZE_BASE and have backported
kernel commit d52888aa2753e3063a9d3a0c9f72f94aa9809c15, titled
"x86/mm: Move LDT remap out of KASLR region on 5-level paging",
which modified the 4-level and 5-level paging PAGE_OFFSET values.
Without this patch, the crash session fails during initialization
with the error message "crash: read error: kernel virtual address:
<address> type: tss_struct ist array".
(anderson@redhat.com)
configured with CONFIG_RANDOMIZE_BASE. Linux 4.20 introduced
kernel commit d52888aa2753e3063a9d3a0c9f72f94aa9809c15, titled
"x86/mm: Move LDT remap out of KASLR region on 5-level paging",
which modified the 4-level and 5-level paging PAGE_OFFSET values.
Without this patch, the crash session fails during initialization
with the error message "crash: read error: kernel virtual address:
<address> type: tss_struct ist array". For kernels prior to
Linux 4.20.0 which have backports of the kernel commit, the kernel's
PAGE_OFFSET value must be manually specified via the command line
option "--machdep page_offset=ffff888000000000" for kernels with
4-level page tables, or "--machdep page_offset=ff11000000000000"
for kernels with 5-level paging. (or alternatively the shorter
version "-m page_offset=<address>" may be used). The command
line option requirement may be revisited in the future.
(anderson@redhat.com)
with Xen 4.11.0 during initialization, which fails with the error
message "crash: invalid kernel virtual address: <address> type:
fill_pcpu_struct", followed by "WARNING: cannot fill pcpu_struct"
and "crash: cannot read cpu_info". The second fix prevents a
segmentation violation associated with a crash-7.1.1 commit that
addressed the Xen 4.5.0 hypervisor symbol name change from
"dom0" to "hardware_domain".
(dietmar.hahn@ts.fujitsu.com)
header of /proc/kcore in Linux 4.19 and later kernels. This patch
introduces support for live session /proc/kcore VMCOREINFO access by
the crash utility's internal pc->read_vmcoreinfo() function. New
uses include the initialization of the x86_64 phys_base value, and
the arm64 phys_offset, page size, and VA bits count.
(anderson@redhat.com)
cannot be referenced symbolically, such as when the exception occurs
while running in seccomp BPF filter code. Without the patch, the
exception frame register dump is preceded by "[exception RIP: unknown
or invalid address]", and then followed by "bt: WARNING: possibly
bogus exception frame". With the patch applied, the translation of
the exception RIP will show "[exception RIP: no symbolic reference]",
and there will be no warning message.
(anderson@redhat.com)
Linux 4.17 commit a7412546d8cb5ad578805060b4006f2a021b5868, titled
"x86/mm: Adjust vmalloc base and size at boot-time", which increases
the region's size from 32TB to 1280TB when 5-level pagetables are
enabled. Also presume that virtual addresses above the end of the
vmalloc space up to the beginning of vmemmap space are translatable
via 5-level page tables. Without the patch, mapped virtual addresses
may fail translation in whatever command accesses them, with errors
indicating "seek error: kernel virtual address: <mapped-address>
type: <type-string>"
(anderson@redhat.com)
user-space "vtop" commands. The swap offset bits in an x86_64 PTE
were changed in Linux 4.6, and then again in Linux 4.18.1 with the
new L1TF security patchset. Without the patch, an incorrect swap
offset value is shown on the later kernels, or on older kernels
with an L1TF backport.
(anderson@redhat.com)
initialization on live systems running a kernel that is configured
with CONFIG_X86_5LEVEL. Without the patch, a message indicating
"crash: read error: kernel virtual address: <address> type:
__pgtable_l5_enabled" will be displayed if /proc/kcore gets
selected as the live memory source after /dev/mem is determined
to be unusable.
(anderson@redhat.com)
and later kernels. This patch adds support for user virtual address
translation when the kernel is configured with CONFIG_X86_5LEVEL.
(douly.fnst@cn.fujitsu.com)
from not being displayed. Without the patch, if the RIP in a pt_regs
structure on the stack is not a kernel text address, such as a NULL
pointer, it is not recognized as an exception frame and the register
set is not displayed.
(anderson@redhat.com)
and later kernels. With this patch, the usage of 5-level page tables
is automatically detected on live systems and when running against
vmcores that contain the new "NUMBER(pgtable_l5_enabled)" VMCOREINFO
entry. Without the patch, the "--machdep vm=5level" command line
option is required.
(douly.fnst@cn.fujitsu.com, anderson@redhat.com)
containing commit 3aa99fc3e708b9cd9b4cfe2df0b7a66cf293e3cf, titled
"x86/entry/64: Remove 'interrupt' macro". Without the patch, the
exception frame display generated by an interrupt exception will
show incorrect contents, and be followed by the message "bt: WARNING:
possibly bogus exception frame".
(anderson@redhat.com)
unusable because the kernel was configured with CONFIG_STRICT_DEVMEM,
the first memory read during session initialization will fail. The
current behavior results in a readmem() error message, followed by two
notification messages that indicate that /dev/mem is restricted and
a switch to using /proc/kcore will be attempted; the readmem is
reattempted from /proc/kcore, and if successful, the session will
continue initialization. With this patch, the behavior will change
such that if the switch to /proc/kcore and the reattempted readmem()
are successful, no messages will be displayed unless the crash
session is invoked with "crash -d<number>".
(anderson@redhat.com)
address range introduced in Linux 4.15. The memory range exists
above the vmemmap range and below the mapped kernel static text/data
region, and is where all of the x86_64 exception stacks have been moved.
Without the patch, reads from the new memory region fail because the
address range is not recognized as a legitimate virtual address.
Most notable is the failure of "bt" on tasks whose backtraces
originate from any of the exception stacks, which fail with the two
error messages "bt: seek error: kernel virtual address: <address>
type: stack contents" followed by "bt: read of stack at <address>
failed".
(anderson@redhat.com)
kernels to account for the structure name changes "e820map" to
"e820_table", and "e820entry" to "e820_entry", and for the symbol
name change from "e820" to "e820_table". Also updated the display
output to properly translate E820_PRAM and E820_RESERVED_KERN entries.
Without the patch on all kernels, E820_PRAM and E820_RESERVED_KERN
entries show "type 12" and "type 128" respectively. Without the
patch on Linux 4.12 and later kernels, the command fails with the
error message "mach: cannot resolve e820".
(anderson@redhat.com)
In Linux 4.17, x86_64 system calls may begin with "__x64_sys", where,
for example, "sys_read" has been replaced by "__x64_sys_read".
(anderson@redhat.com)
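One way to treat both naming schemes alike is to strip the new prefix before
comparing names. The following is a sketch under that assumption, with a
hypothetical helper name; it is not the code used in crash.

```c
#include <string.h>

/*
 * Hypothetical helper: map "__x64_sys_read" to "sys_read" so that both
 * naming schemes compare equal. Names without the "__x64_sys_" form are
 * returned unchanged.
 */
const char *syscall_base_name(const char *sym)
{
    static const char prefix[] = "__x64_";
    size_t len = sizeof(prefix) - 1;

    if (strncmp(sym, prefix, len) == 0 && strncmp(sym + len, "sys_", 4) == 0)
        return sym + len;
    return sym;
}
```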
CONFIG_FRAME_POINTER. Without the patch, the per-text-return-address
framesize cache may contain invalid entries for functions that have
an "and $0xfffffffffffffff0,%rsp" instruction in their prologue,
which aligns the stack on a 16-byte boundary; therefore any cached
framesize for a text-return-address in such a function may be
incorrect depending upon the alignment of the stack address of a
calling function. If an invalid cached framesize is utilized by
"bt", the backtrace may skip over several frames, or may display
one or more invalid (stale) frames. The patch introduces a new
cache that contains functions for which framesize values should
not be cached.
(anderson@redhat.com)
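The arithmetic behind the problem can be illustrated with a small sketch
(function names are made up for the example). The "and" instruction rounds
the stack pointer down to a 16-byte boundary, so the number of bytes it
removes depends on the incoming stack address, and the same function can
therefore have different frame sizes on different calls.

```c
#include <stdint.h>

/* Value of %rsp after "and $0xfffffffffffffff0,%rsp". */
uint64_t align_rsp_16(uint64_t rsp)
{
    return rsp & 0xfffffffffffffff0ULL;
}

/* Extra bytes the alignment adds to the frame for a given incoming %rsp. */
uint64_t alignment_adjust(uint64_t rsp)
{
    return rsp - align_rsp_16(rsp);
}
```

For an incoming %rsp ending in ...e78 the adjustment is 8 bytes, while for
one ending in ...e70 it is 0, which is why a framesize cached from one call
may be wrong for another.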
calculating phys_base and the mapped kernel offset for KASLR-enabled
kernels on SADUMP dumpfiles by using a technique developed by Takao
Indoh. Originally, the patchset included support for kdumps, but this
was dropped in v2, as it was deemed unnecessary due to the upstream
implementation of the "vmcoreinfo device" in QEMU. However, there
are still several reasons for which the vmcoreinfo device may not be
present at the time when a memory dump is taken from a VM, ranging
from a host running older QEMU/libvirt versions, to misconfigured VMs
or environments running hypervisors that don't support this device.
This patchset generalizes the KASLR-related functions from sadump.c
and moves them to kaslr_helper.c, and makes kdump analysis fall back
to KASLR offset calculation if vmcoreinfo data is missing.
(slp@redhat.com)
when the VM was suspended. This patch enables crash to read the
relevant registers from each vCPU state for use as the starting hooks
by the "bt" command. Also, support for "help -[D|n]" to display
dumpfile contents, and "help -r" to display vCPU register sets has
been implemented. This is also the first step towards implementing
automatic KASLR offset calculations for VMSS dumpfiles.
(slp@redhat.com)
the kernel is configured with CONFIG_SPARSEMEM_VMEMMAP, the plugin
function optimizes the mem_section search, reducing the computation
effort and time consumed by commands that repeatedly call the
is_page_ptr() function on large-memory systems.
(k-hagio@ab.jp.nec.com)
patch is a cleanup/collaboration of the original logic used by the
various vtop functions, where several new common functions have been
added for extracting page table entries from PGD, P4D, PUD, PMD and
PTE pages. The usage of the former PML4 and UPML pages has been
replaced with the use of the common PGD page, and the PUD page is used
in 4-level page table translation. Support for 5-level page tables
has been incorporated into the existing x86_64_kvtop() and
x86_64_uvtop_level4() functions. Backwards compatibility for older
legacy kernels has been maintained. The third phase of support will
automatically detect whether the kernel proper, and whether an
individual user task, is utilizing 5-level page tables. This patch
enables support for kernel-only 5-level page tables by entering the
command line option "--machdep vm=5level".
(douly.fnst@cn.fujitsu.com)
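To illustrate how the table levels relate to a virtual address, the sketch
below extracts the per-level indexes using the standard x86_64 layout: 9
index bits per level above a 12-bit page offset, with the PGD index starting
at bit 48 for 5-level paging and at bit 39 for 4-level paging. The function
names are illustrative only.

```c
#include <stdint.h>

#define TABLE_INDEX_MASK 0x1ffULL   /* 512 entries per table */

/* 5-level paging: PGD index comes from bits 56:48, P4D from bits 47:39. */
unsigned pgd_index_5level(uint64_t va) { return (va >> 48) & TABLE_INDEX_MASK; }
unsigned p4d_index(uint64_t va)        { return (va >> 39) & TABLE_INDEX_MASK; }

/* 4-level paging: there is no P4D level, so PGD uses bits 47:39. */
unsigned pgd_index_4level(uint64_t va) { return (va >> 39) & TABLE_INDEX_MASK; }

/* The lower levels are the same in both modes. */
unsigned pud_index(uint64_t va) { return (va >> 30) & TABLE_INDEX_MASK; }
unsigned pmd_index(uint64_t va) { return (va >> 21) & TABLE_INDEX_MASK; }
unsigned pte_index(uint64_t va) { return (va >> 12) & TABLE_INDEX_MASK; }
```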
c482feefe1aeb150156248ba0fd3e029bc886605, titled "x86/entry/64: Make
cpu_entry_area.tss read-only". Without the patch, the addresses and
sizes of the x86_64 exception stacks cannot be determined; therefore
if a backtrace starts on one of the exception stacks, then the "bt"
command will fail.
(anderson@redhat.com)
"bt" command may indicate "bt: cannot transition from exception stack
to current process stack" if the crash callback NMI occurred while an
active task was running on the new entry trampoline stack. This has
only been tested on the RHEL7 backport of the upstream patch because
as of this commit, crash does not run on 4.15-rc kernels. Further
changes may be required for upstream kernels, and distributions that
implement the kernel changes differently than upstream.
(anderson@redhat.com)
backports of, kernel commit 4950d6d48a0c43cc61d0bbb76fb10e0214b79c66,
titled "x86/dumpstack: Remove 64-byte gap at end of irq stack".
Without the patch, backtraces fail to transition from the IRQ stack
back to the process stack, showing an error message such as
"bt: cannot transition exception stack to IRQ stack to current
process stack".
(anderson@redhat.com)
translation mechanism. Without the patch, when verifying that the
PAGE_PRESENT bit is set in the top-level page table, it would always
test positively, and the translation would continue parsing the
remainder of the page tables. This would virtually never be a
problem in practice because if the top-level page table entry
existed, its PAGE_PRESENT bit would be set.
(oleksandr@redhat.com, anderson@redhat.com)
dumpfile facility. SADUMP dumpfile headers do not contain phys_base
or VMCOREINFO notes, so without this patch, the crash session fails
during initialization with the message "crash: seek error: kernel
virtual address: <address> type: "page_offset_base"". This patch
calculates the phys_base value and the KASLR offset using the IDTR
and CR3 registers from the dumpfile header.
(indou.takao@jp.fujitsu.com)
"kimage_voffset" value in the ELF header. Without the patch, it is
necessary to use the "--machdep kvimage_offset=<value>" command line
option, or the session fails with the message "crash: vmlinux and
vmcore do not match!".
(anderson@redhat.com)
x86_64 "bt" command. Kernels configured with CONFIG_ORC_UNWINDER
contain .orc_unwind and .orc_unwind_ip sections that can be queried
to determine the stack frame size of any text address within a kernel
function. For kernels not configured with CONFIG_FRAME_POINTER,
the crash utility does frame size calculation by disassembling a
function from its beginning to the specified text address, counting
the push, pop, and add/sub rsp instructions, accounting for retq
instructions that occur in the middle of a function. With this patch,
access to the new ORC sections has been plugged into the existing
frame size calculator, resulting in a more efficient and accurate
manner of determining frame sizes, and as a result, more accurate
backtraces.
(anderson@redhat.com)
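The disassembly-based calculation described above can be approximated by the
sketch below. It is a simplification: frame_size_from_insns() is a
hypothetical helper that takes bare instruction text, counts only push, pop
and add/sub on %rsp, and does not attempt the handling of retq in the middle
of a function that the text mentions.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: estimate the stack frame size, in bytes, built up
 * by the instructions from a function's start to a given text address.
 */
long frame_size_from_insns(const char *const *insns, size_t n)
{
    long size = 0;

    for (size_t i = 0; i < n; i++) {
        const char *s = insns[i];

        if (strncmp(s, "push", 4) == 0)
            size += 8;                          /* one 8-byte slot */
        else if (strncmp(s, "pop", 3) == 0)
            size -= 8;
        else if (strncmp(s, "sub $", 5) == 0 && strstr(s, ",%rsp"))
            size += strtol(s + 5, NULL, 0);     /* sub $imm,%rsp grows it */
        else if (strncmp(s, "add $", 5) == 0 && strstr(s, ",%rsp"))
            size -= strtol(s + 5, NULL, 0);     /* add $imm,%rsp shrinks it */
    }
    return size;
}
```

An ORC lookup replaces this whole walk with a table query keyed by the text
address, which is why the text above describes it as more efficient.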
"x86/boot/64: Rename init_level4_pgt and early_level4_pgt". Without
the patch, the crash session fails during initialization with the
error message "crash: cannot resolve "init_level4_pgt"".
(anderson@redhat.com)
sets of virtual memory offsets have been #define'd and helper macros
and placeholder functions for the p4d page tables have been added.
The only functional changes with this patchset are dynamically-set
PGDIR_SHIFT and PHYSICAL_MASK_SHIFT values that are based upon the
kernel configuration.
(anderson@redhat.com)