Avoid code duplication and increase maintainebility of these helper
functions. As their only difference was the application of position
relative relocations, consolidate them and add a flag for exactly this
feature.
This is purely stylistic and not changing functionality.
* src/abg-dwarf-reader.cc(try_reading_first_ksymtab_entry):
New function to consolidate functionality for
try_reading_first_ksymtab_entry_using_{pre,}v4_19_format functions.
(try_reading_first_ksymtab_entry_using_v4_19_format,
try_reading_first_ksymtab_entry_using_pre_v4_19_format):
refactor to use try_reading_first_ksymtab_entry
Signed-off-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
Because DWARF sometimes emit decl-only classes (real one, with no
members) with a size property, and the rest of the time, would emit
the same decl-only class without a size property, comparing the two
might yield some false positives.
This patch handles those beasts when comparing classes.
* include/abg-comp-filter.h (is_decl_only_class_with_size_change):
Declare an overload.
* include/abg-fwd.h (look_through_decl_only_class): Declare an
overload.
* src/abg-comp-filter.cc (is_decl_only_class_with_size_change):
Define an overload that takes class_or_union& type. Re-write the
previous overload in terms of this new one.
* src/abg-ir.cc (look_through_decl_only_class): Define a new
overload that takes a class_or_union&. Rewrite the previous
overload in terms of this one.
(equals): In the overload for class_or_union&, use
is_decl_only_class_with_size_change to detect cases of decl-only
classes that differ only by their size attribute and avoid
comparing them.
* tests/data/test-annotate/test21-pr19092.so.abi: Adjust.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Likewise.
* tests/data/test-diff-filter/test41-report-0.txt: Likewise.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
The leaf_diff_node_marker_visitor pass which collects leaf diff nodes
for the leaf diff reporter considers bogus decl-only classes (that have the
is-declaration-only flag set, are empty, and yet have a non-nil size
property) originated from bogus DWARF.
The leaf reporter thus potentially reports size changes among
decl-only classes, which does not make sense. Two decl-only classes
of the same name should always be considered equal, in this context.
This patch thus teaches the leaf_diff_node_marker_visitor to avoid
collecting a leaf diff node that is about a size change on a true
decl-only class.
* include/abg-comp-filter.h (is_decl_only_class_with_size_change):
Declare new function.
* src/abg-comp-filter.cc (is_decl_only_class_with_size_change):
Define new function.
* src/abg-comparison.cc
(leaf_diff_node_marker_visitor::visit_begin): Use the newly
defined is_decl_only_class_with_size_change above to ignore bogus
decl-only classes with a size change.
* tests/data/test-diff-suppr/test45-abi-report-1.txt: New test input.
* tests/data/test-diff-suppr/test45-abi-wl.xml: Likewise.
* tests/data/test-diff-suppr/test45-abi.xml: Likewise.
* tests/data/test-diff-suppr/test45-abi.suppr.txt: New reference
output for the test input above.
* tests/data/test-diff-suppr/test46-PR25128-base.xml: New test input.
* tests/data/test-diff-suppr/test46-PR25128-new.xml: Likewise.
* tests/data/test-diff-suppr/test46-PR25128-report-1.txt: New
reference input for the test input above.
* tests/data/Makefile.am: Add the new test material to source distribution.
* tests/test-diff-suppr.cc (in_out_spec): Add the new test input
above to this test harness.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
Right now, the symbol names that are part of a Linux Kernel ABI
white list are matched against function and variable names that appear
in the DWARF information.
In other words, when Libabigail processes the Linux Kernel ABI
whitelist, it generates a suppression specifications which keeps
functions and variables (as described by DWARF) whose names match the
symbol names specified in the white list. All other functions and
variables are dropped. But that doesn't apply to ELF symbols.
Libabigail generates no suppression at all for ELF symbols. It only
considers variables and functions described in the debug information.
With this patch, Libabiagil now generates a suppression specification
which keeps functions and variables whose ELF symbol name match the
symbol names specified in the whitelist. The suppression
specification also drops ELF symbols whose name don't match the names
specified in the white list.
Note that this patch uses the previous commit which description is:
"Support symbol_name_not_regexp in [suppress_{function, variable}]"
* src/abg-tools-utils.cc (gen_suppr_spec_from_kernel_abi_whitelist):
Generate a suppression specification which considers the name of
the symbol associated to a function/variable, rather than just the
name of said function/variable.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
In the suppress_function and suppress_variable directives of the
suppression specification language, we lack the
'symbol_name_not_regexp' properties, that would allow users to specify
which (function/variable) symbols to *keep* as opposed to specifying
which symbols to suppress.
This patch adds that feature. That will later allow us to make the
linux kernel symbol white lists[1] functionality use this feature;
that is, upon analysing the content of a kernel symbol whitelist which
lists a symbol named "foo", Libabigail would automatically generate a
suppression specification which contains, e.g a 'suppress_function"
directive that has this new 'symbol_name_not_regexp' property which
value is set to "foo".
Note that the patch makes sure that feature is supported when
analyzing both abixml and DWARF formats.
[1]: You can learn about what a Linux Kernel symbols white list is by
reading about it at
https://sourceware.org/libabigail/manual/kmidiff.html#environment.
* doc/manuals/libabigail-concepts.rst: Document the new
symbol_name_not_regexp properties for the
suppress_{function,variable} directives.
* include/abg-suppression.h
({function,variable}_suppression::{g,s}et_symbol_name_not_regex_str):
Declare new member functions.
* src/abg-dwarf-reader.cc
(read_context::is_elf_symbol_suppressed): Define new member functions.
(read_context::{load_symbol_maps_from_symtab_section,
populate_symbol_map_from_ksymtab,
populate_symbol_map_from_ksymtab_reloc}): Drop suppressed symbols
when reading symbol tables.
({function,variable}_is_suppressed): Consider that in C, the
linkage name is _by default_ the same as the function/variable
name. Remove local variable.
* include/abg-ir.h (elf_symbol_is_{function,variable}): Add ...
* src/abg-ir.cc (elf_symbol_is_{function,variable}): ... new
functions.
* src/abg-reader.cc (build_elf_symbol): Take an additional boolean
to detect and drop suppressed symbols.
(build_elf_symbol_db): Adjust the call to build_elf_symbol to make
it detect and drop suppressed symbols.
(read_corpus_from_input): Be mindful that the set of symbols for a
given corpus can be empty because of suppression specifications.
* src/abg-suppression-priv.h
({function,variable}_suppression::priv::symbol_name_not_regex[_str_]):
Add new data members.
(function,variable}_suppression::priv::get_symbol_name_not_regex):
Add new member functions.
({function,variable}_is_suppressed): Guard against empty name.
(is_elf_symbol_suppressed): Define new function template.
* src/abg-suppression.cc
({function,variable}_suppression::{g,s}et_symbol_name_not_regex_str):
Define new member functions.
({function,variable}_suppression::suppresses_function)
(suppression_matches_{function,variable}_sym_name)
(read_{function,variable}_suppression): Support the new
"symbol_name_not_regex" property.
* tests/data/test-diff-suppr/test44-suppr-sym-name-not-regexp-report-1.txt:
New test reference report.
* tests/data/test-diff-suppr/test44-suppr-sym-name-not-regexp-report-2.txt:
Likewise.
* tests/data/test-diff-suppr/test44-suppr-sym-name-not-regexp-v{0,1}.c:
Sources of the new test input.
* tests/data/test-diff-suppr/test44-suppr-sym-name-not-regexp-v{0,1}.o:
New test input binaries.
* tests/data/test-diff-suppr/test44-suppr-sym-name-not-regexp-v{0,1}.o.abi:
New test input abixml files.
* tests/data/test-diff-suppr/test44-suppr-sym-name-not-regexp.suppr.txt:
Next test suppression specification.
* tests/data/Makefile.am: Add the new test material above to
source distribution.
* tests/test-diff-suppr.cc (in_out_specs): Add the input tests
above to the test harness.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
Add the missing line breaks.
* tools/kmidiff.c (display_usage): add missing line breaks to
help text
Signed-off-by: Matthias Maennich <maennich@google.com>
When loading the corpus from elf while specifying a whitelist, we might
be able to ignore the symbol table. In any case we have to load the elf
properties into the context, such as the binary's architecture.
Otherwise they are missing from the internal / xml representation.
Previously, elf properties were not loaded when a whitelist was
specified. Fix that.
* src/abg-dwarf-reader.cc (read_corpus_from_elf):
unconditionally load elf properties into context
Signed-off-by: Matthias Maennich <maennich@google.com>
In the previous commit 2f7248f, we were just taking the first address
referred to by the DW_AT_ranges attribute as being the address of the
symbol of the function we are looking at. But there can be cases
where this is not true, as explained at
https://sourceware.org/bugzilla/show_bug.cgi?id=25058#c7.
We really need to get the first address that represents an exported
and defined function symbol, which is pointed to by the DW_AT_ranges
attribute.
And this is what this patch does.
* src/abg-dwarf-reader.cc
(read_context::get_first_exported_fn_address_from_DW_AT_ranges):
Rename read_context::get_first_address_from_DW_AT_ranges into
this. Walk through the addresses referred to by the DW_AT_ranges
attribute until we find one that is for an exported function
symbol, rather than just picking the first address of the set.
(read_context::get_function_address): Adjust.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
Usually, function DIEs (DW_TAG_subprogram) refer to the address of the
underlying ELF symbol by using the DW_AT_low_pc attribute. However,
there are cases where it does so by using the DW_AT_ranges attribute.
In those cases, the first address of the sequence defined in the value
of that attribute is the address of the ELF symbol.
The problem is that the DWARF reader of Libabigail fails to get the
address of the underlying ELF symbol when the DW_AT_low_pc attribute
is missing. Rather, it should then look at the value of the
DW_AT_ranges attribute instead.
This is what this patch does.
* src/abg-dwarf-reader.cc
(read_context::get_first_address_from_DW_AT_ranges): Define new
member function.
(read_context::get_function_address): Use the new
read_context::get_first_address_from_DW_AT_ranges here.
* tests/data/test-diff-dwarf/PR25058-liblttng-ctl-report-1.txt:
New reference test output.
* tests/data/test-diff-dwarf/PR25058-liblttng-ctl.so: New test
input binary.
* tests/data/test-diff-dwarf/PR25058-liblttng-ctl2.10.so: New test
input binary.
* tests/data/Makefile.am: Add the new test materials above to
source distribution.
* tests/test-diff-dwarf.cc (in_out_specs): Add the new input test
input binary files to this test harness.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
My patch "568dee1 PR25042 - Support string form DW_FORM_strx{1,4} from
DWARF 5" introduced a thinko in configure.ac. The thinko triggers a
regression test issue on old systems where we don't support
DW_FORM_strx from DWARF 5. Fixed thus.
* configure.ac: Fix thinko when setting the HAVE_DW_FORM_strx
macro.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
* src/abg-dwarf-reader.cc (compare_dies_string_attribute_value):
Fix a typo in the comment of this function.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
* configure.ac: Detect the presence of the DW_FORM_strx{1,4}
enumerators.
* src/abg-dwarf-reader.cc (form_is_DW_FORM_strx): Define new
function.
(compare_dies_string_attribute_value): Use the new
form_is_DW_FORM_strx here.
* tests/data/Makefile.am: Add the new test input files below to
source distribution.
* tests/data/test-read-dwarf/PR25042-libgdbm-clang-dwarf5.so.6.0.0:
New binary test input file.
* tests/data/test-read-dwarf/PR25042-libgdbm-clang-dwarf5.so.6.0.0.abi:
Reference output of the new binary test input file.
* tests/test-read-dwarf.cc (in_out_specs): Add the input test
files above to the test harness, for platforms that support the
DW_FORM_strx form.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
When writting a suppression specification in which the user wants to
keep a family of types (whose names set is specified by a regular
expression) and suppress/drop all other types, one needs to write
something like:
[suppress_type]
name_regexp = (?!the-regexp-of-the-types-to-keep)
It would be nicer (like what is done for other properties that take
regular expressions as value in suppression specifications) to be able
to write:
[suppress_type]
name_not_regexp = the-regexp-of-types-to-keep
This patch does just that.
It augments the abigail::suppr::type_suppression type to make it carry
the new 'name_not_regex' property. It updates the suppression engine
to take the 'name_not_regex' property into account when interpreting
instances of abigail::suppr::type_suppression. The parser for type
suppression directives is updated to recognize the new name_not_regexp
property. The manual has been updated accordingly to describe the new
property. New regression tests have been added.
* doc/manuals/libabigail-concepts.rst: Update this to document the
new name_not_regexp property of the suppress_type directive.
* include/abg-suppression.h
(type_suppression::{g,s}et_type_name_not_regex_str): Declare new accessors.
* src/abg-suppression-priv.h
(type_suppression::priv::{type_name_not_regex_str_,
type_name_not_regex_}): Define new data members.
(type_suppression::priv::{get_type_name_not_regex,
set_type_name_not_regex, get_type_name_not_regex_str,
set_type_name_not_regex_str}): Define new member functions.
* src/abg-suppression.cc
(type_suppression::get_type_name_regex_str): Fix comments.
(type_suppression::{set_type_name_not_regex_str,
get_type_name_not_regex_str}): Define new data members.
(suppression_matches_type_name): Adapt to support the new
type_name_not_regex property.
(read_type_suppression): Support parsing the type_name_not_regexp
property.
* tests/data/test-diff-suppr/test42-negative-suppr-type-report-0.txt:
New test reference output.
* tests/data/test-diff-suppr/test42-negative-suppr-type-report-1.txt: Likewise.
* tests/data/test-diff-suppr/test42-negative-suppr-type-suppr-1.txt:
New test input.
* tests/data/test-diff-suppr/test42-negative-suppr-type-suppr-2.txt: Likewise.
* tests/data/test-diff-suppr/test42-negative-suppr-type-v0.{cc, o}: Likewise.
* tests/data/test-diff-suppr/test42-negative-suppr-type-v1.{cc,
o}: Likewise.
* tests/data/Makefile.am: Add the test files above to source
distribution.
* tests/test-diff-suppr.cc (int_out_specs): Add the new tests to
the harness.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
In the comparison engine, when a sub-type of a function type (say, a
parameter type size change) has been suppressed, this suppression is
not necessarily well propagated to the function carrying the function
type, because the parameter type size, for instance, is considered as
a type local change to that function; and we generally don't propagate
suppression to a non-suppressed parent diff node that already carries
a local change.
This leads to an empty change report for the function we are looking
at because the only sub-type change has been suppressed.
This patch properly propagates the suppressed-ness in that case, so
that the parent function diff node is suppressed as well.
* src/abg-comparison.cc
(suppression_categorization_visitor::visit_end): Propagate
suppression-ness from suppressed function type diff node to its
parent function node if the latter doesn't have any local non-type
change.
* tests/data/test-diff-suppr/test43-suppr-direct-fn-subtype-report-1.txt:
New test reference output.
* tests/data/test-diff-suppr/test43-suppr-direct-fn-subtype-suppr-1.txt:
New test input suppression file.
* tests/data/test-diff-suppr/test43-suppr-direct-fn-subtype-v{0,1}.cc:
Source code of input binary file.
* tests/data/test-diff-suppr/test43-suppr-direct-fn-subtype-v{0,1}.o:
Input binary files.
* tests/data/Makefile.am: Add the new test input files above to
source distribution.
* tests/test-diff-suppr.cc (in_out_specs): Add the test input to
test harness.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
While looking at something else, I noticed that we could be missing
type size changes on function parameters when using has_type_change.
Fixed thus.
* src/abg-comp-filter.cc (has_type_change): Support function
parameters.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
When the endianness of the ELF binary differs from the endianness of
the host, some byte swapping needs to happen when we read the reloc
section to either determine the format of the kernel symbol table or
to get the set of symbols referenced by the kernel symbol table.
So we need to use elf_getdata rather than elf_rawdata to read the data
from the reloc section, because the former handles the proper byte
swapping for us.
This patch does just that and thus fixes the build breakage that is
occuring when running the testreaddwarf test on s390x (big endian),
especially when trying to read the AARCH64 little endian binary
data/test-read-dwarf/PR25007-sdhci.ko.
* src/abg-dwarf-reader.cc
(read_context::{get_ksymtab_format_module,
populate_symbol_map_from_ksymtab_reloc}): Use elf_getdata rather
than elf_rawdata.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
When analyzing an AARCH64 linux kernel module built with support for
either R_AARCH64_ABS64 or R_AARCH64_PREL32 relocations, we need these
macros to be defined in elf.h (i.e a recent enough version of libelf),
otherwise we cannot properly support those kernel modules using the
scheme that uses the relocation table of the __ksymtab and
__ksymtab_gpl sections to read those sections.
In the future, I think we should automatically fallback to another way
of trying to read those sections if those macros are not defined, and
emit a message hinting at what is happening, when in verbose mode. I
am keeping it as is for the moment, so that we can get a better case
of the when these macros are not defined and whatnot.
In the mean time, this patch conditionalizes the test that reads a
kernel module build with support for these relocations to avoid
running it on platform that support these relocations.
* tests/test-read-dwarf.cc: Do not run the test on
PR25007-sdhci.ko if the macros R_AARCH64_PREL32 and
R_AARCH64_ABS64 are not defined.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
Now that there are proper facilities to lookup ELF symbols inside the
ELF/DWARF reader and get a native GElf_Sym type instance (from
libelf), we don't need to carry the value of the symbol (that is
relevant only that low level anyway) in the abigail::ir::elf_symbol
type.
This patch removes that property.
* include/abg-ir.h (elf_symbol::{elf_symbol, create}): Remove the
'val' parameter.
* src/abg-dwarf-reader.cc (elf_symbol::get_value): Remove this
member function declaration.
(lookup_symbol_from_sysv_hash_tab)
(lookup_symbol_from_gnu_hash_tab, lookup_symbol_from_symtab)
(create_default_var_sym, create_default_fn_sym)
(read_context::lookup_elf_symbol_from_index): Adjust calls to
creating elf_symbol instances.
* src/abg-ir.cc (elf_symbol::priv::value_): Remove this data
member.
(elf_symbol::{priv::priv, elf_symbol, create): Adjust.
* src/abg-reader.cc (build_elf_symbol): Likewise.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
In relocatable files, two symbols listed in the .symtab section can
have the same value and yet be different. That is because those
symbols can be *defined* in different sections. And the value of
those symbols represent addresses (offsets) within their own
respective sections (a.k.a section-relative addresses).
In the same time, symbol address as referred-to in the DWARF
information are *not* section-relative, rather, they are relative to
the beginning of the whole binary.
Until now, the DWARF-referred-to symbol addresses were translated into
section-relative addresses, so that they could be compared to the
other section-relative addresses we were getting from listing the
symbols and their values from the .symtab section. The problem with
that approach is that, during the translation from binary-relative to
section-relative addresses we were wrongly assuming that all symbols
referenced from the DWARF were defined in the .text section. This is
wrong especially for ET_REL files because they could be defined in
sections named .foo.text or .bar.text, for instance.
This leads to issues where we wrongly consider that two symbols having the
same value are the same. Because we wrongly assume that they are all
defined in the same .text section.
This patch fixes this problem by translating the section-relative
addresses we see in .symtab into binary-relative addresses by adding
the address of the section to the section-relative address. Those
binary-addresses can thus safely be compared to the binary-relative
addresses we see in the DWARF. And also, when two symbols have the
same binary-relative address, we can now safely assume that they are
the same -- they are aliases, basically.
* src/abg-dwarf-reader.cc
(read_context::{lookup_native_elf_symbol_from_index,
maybe_adjust_et_rel_sym_addr_to_abs_addr}): Define new member
functions.
(read_context::lookup_elf_symbol_from_index): Add a new overload.
Write the old overloads in terms of the new one.
(read_context::{load_symbol_maps_from_symtab_section,
populate_symbol_map_from_ksymtab_reloc}): Use the new
maybe_adjust_et_rel_sym_addr_to_abs_addr function to translate the
symbol value/address into a binary-relative address before adding
it to the addr->sym maps.
(read_context::maybe_adjust_{fn, var}_sym_address): Do not adjust
DWARF-referred-to addresses of ET_REL symbols anymore.
* tests/data/test-read-dwarf/PR25007-sdhci.ko: New binary test input.
* tests/data/test-read-dwarf/PR25007-sdhci.ko.abi: ABI
representation of the above.
* tests/test-read-dwarf.cc: Add the new test input to the harness.
* tests/data/test-diff-dwarf/test28-vtable-changes-report-0.txt: Adjust.
* tests/data/test-diff-filter/test20-inline-report-0.txt: Likewise.
* tests/data/test-diff-filter/test20-inline-report-1.txt: Likewise.
* tests/data/test-diff-filter/test41-report-0.txt: Likewise.
* tests/data/test-diff-filter/test9-report.txt: Likewise.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
The patch:
"e687032 Support pre and post v4.19 ksymtabs for Linux kernel modules"
introduces the use of the R_AARCH64_{ABS64, PREL32} macros. However,
some older "elf.h" don't define these. When compiling on these older
platforms, we thus need to avoid using these new macros.
With this patch, the configure system detects the presence of these
macros and defines the HAVE_R_AARCH64_{ABS64, PREL32}_MACRO macros
accordingly.
Note that just to comply with what's in there in the code already, we
don't directly do "#ifdef R_AARCH64_ABS64", but rather "#ifdef
HAVE_R_AARCH64_ABS64_MACRO", to allow cases where we want to
artificially disable the "feature" at configure time, in the future.
* configure.ac: Define macros HAVE_R_AARCH64_{ABS64, PREL32}_MACRO
if the macros R_AARCH64_{ABS64, PREL32} are present.
* src/abg-dwarf-reader.cc
(read_context::get_ksymtab_format_module): Conditionalize the use
of R_AARCH64_{ABS64, PREL32} using HAVE_R_AARCH64_{ABS64, PREL32}_MACRO.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
As described in commit ad8c2531fb, the format of the Linux kernel
ksymtab changed in v4.19 to use relative references instead of absolute
references. This changes the type of relocations emitted for ksymtab
sections to be place-relative 32-bit relocations instead of absolute
relocations. One side-effect of this is that libdwfl will not relocate
the ksymtab sections due to the PC-relative relocations. This breaks
load_kernel_symbol_table() for kernel modules because it only reads in
zeros from the unrelocated ksymtab section and is subsequently unable to
determine what exported symbols it refers to. Since a vmlinux binary is
already fully linked and relocated (and therefore we can read its
ksymtab section just fine), this problem is only relevant to Linux
kernel modules.
To work around this, we utilize the ksymtab relocation sections to
determine which symbols the ksymtab entries refer to. We do this by
inspecting each relocation's r_info field for the symbol table index and
from there we are able to read each symbol's value and subsequently add
that to the set of exported symbols.
In addition, for Linux kernel modules, we can utilize relocation types
to implement a new heuristic to determine the ksymtab format we have.
The presence of PC-relative relocations suggest the new v4.19 format,
and absolute relocation types suggest the old pre v4.19 format.
* include/abg-ir.h (elf_symbol::{elf_symbol, create}): Take new
symbol value and shndx parameters.
(elf_symbol::{get_value, get_shndx}): Declare new accessors.
* src/abg-ir.cc (elf_symbol::priv::{value_, shndx_}): New data
members.
(elf_symbol::priv::priv): Adjust.
(elf_symbol::elf_symbol): Take new value and is_linux_string_cst
parameters.
(elf_symbol::create): Likewise.
(elf_symbol::{get_value, get_is_linux_string_cst}): Define new
accessors.
* src/abg-reader.cc (build_elf_symbol): Adjust.
* src/abg-dwarf-reader.cc (binary_is_linux_kernel)
(binary_is_linux_kernel): New static functions.
(lookup_symbol_from_sysv_hash_tab)
(lookup_symbol_from_gnu_hash_tab)
(lookup_symbol_from_symtab): Adjust.
(read_context::{ksymtab_reloc_section_,
ksymtab_gpl_reloc_section_, ksymtab_strings_section_}): New data
members.
(read_context::read_context): Initialize ksymtab_reloc_section_,
ksymtab_gpl_reloc_section_, ksymtab_strings_section_.
(read_context::{find_ksymtab_reloc_section,
find_ksymtab_gpl_reloc_section, find_ksymtab_strings_section,
find_any_ksymtab_reloc_section, get_ksymtab_format_module,
populate_symbol_map_from_ksymtab,
populate_symbol_map_from_ksymtab_reloc, is_linux_kernel_module}):
New member functions.
(read_context::load_kernel_symbol_table): Adjust to call either
populate_symbol_map_from_ksymtab{_reloc,} depending on ksymtab
format.
(read_context::get_ksymtab_format): Adjust to call
get_ksymtab_format_module for linux kernel modules.
(read_context::lookup_elf_symbol_from_index): Adjust.
(create_default_var_sym, create_default_fn_sym): Adjust.
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
When emitting abixml, profiling shows that we spend a great deal of
time testing if a given type has been emitted already, to avoid
emitting a given type more than once. This makes the serialization
phase take more time than the binary analysis phase!
This patch leverages the fact that we already have the set of
canonical types in the system. While emitting that set entirely, we
don't need to test if a type has been emitted already because we know
by definition that every type is present just once in that set, more
or less. OK, because there are also types that don't have canonical
types (for instance, declaration-only class/structs), we'll still have
to check of those types have already been emitted, but this is a very
small set to handle.
The patch thus organizes the canonical types per scope, so that when
emitting a scope and the canonical types within it, the type is
emitted in its correct namespace.
Then, when emitting a translation unit and each namespaces in it, the
patch emits the canonical types of those namespaces.
The patch arranges for some ancillary things that are needed to make
the whole picture be coherent enough for things to keep working.
Testing shows that we gained ~ 30% of performance by doing this, while
analysing the whole linux kernel 5.1 version. We went from ~ 3m30s
minutes to less than 2m30s.
With this patch, the serialization phase now takes less time than the
analysis time.
* include/abg-fwd.h (is_decl_slow)
(peel_pointer_or_reference_type): Declare new functions.
* include/abg-ir.h (struct canonical_type_hash): Define new type.
(type_base_ptr_set_type, type_base_ptrs_type)
(type_base_sptrs_type, canonical_type_sptr_set_type): Define new
typedefs.
(environment::get_canonical_types_map): Declare new member
function.
(scope_decl::{get_canonical_types, get_sorted_canonical_types}):
Declare new member functions.
* src/abg-ir.cc (is_ptr_ref_or_qual_type)
(peel_pointer_or_reference_type, is_decl_slow): Define new
functions.
(environment::{get_canonical_types_map}): Define new member
functions.
(canonical_type_hash::operator()): Likewise.
(scope_decl::{get_canonical_types, get_sorted_canonical_types}):
Likewise.
(struct type_topo_comp): Define new comparison functor type.
(environment::{sorted_canonical_types_}): Define new data member.
(scope_decl::priv::{canonical_types_, sorted_canonical_types_}):
Likewise.
(scope_decl::is_empty): Take the presence of canonical types into
account when determining if a scope is empty or not.
(is_decl): Make this work for cases where the artifact at hand is
a type which has a declaration, as opposed to being a pure
declaration like a variable or a function.
(canonicalize): Add the canonical type the list of canonical types
of its scope.
* src/abg-dwarf-reader.cc (read_context::die_is_in_cplus_plus):
Define new member function.
* src/abg-writer.cc (write_type, write_canonical_types_of_scope):
Define new static functions.
(fn_type_ptr_set_type): Define new typedef.
(write_context::{m_referenced_fn_types_set,
m_referenced_non_canonical_types_set}): Add new data members.
(write_context::m_referenced_types_set): Renamed
m_referenced_types_map into this.
(write_context::get_referenced_types): Adjust.
(write_context::get_referenced_{function_types,
non_canonical_types}):
(write_context::record_type_as_referenced): Adjust to add the
referenced type in the proper set which would be one of the three
following: write_context::{get_referenced_types,
get_referenced_function_types,
get_referenced_non_canonical_types}.
(write_context::{type_is_referenced, clear_referenced}): Adjust.
(write_translation_unit): Use the new
write_canonical_types_of_scope. Also emit declaration-only
classes that have member types. Do not test if a given type of a
given scope has been emitted, in general, as this was super slow
given the number of types. Emit referenced function types (as
these don't belong to any scope). Rather than using the expensive
"is_function_type" on *all* the referenced types, just walk the
set write_context::get_referenced_function_types. Likewise,
rather than using type_base::get_naked_canonical_type on
*all* the referenced types, just walk the set
write_context::get_referenced_non_canonical_types
(write_class): Use write_canonical_types_of_scope here.
* tools/abilint.cc (main): Support linting corpus group abixml
files.
* tests/data/test-annotate/libtest23.so.abi: Adjust.
* tests/data/test-annotate/libtest24-drop-fns-2.so.abi: Likewise.
* tests/data/test-annotate/libtest24-drop-fns.so.abi: Likewise.
* tests/data/test-annotate/test-anonymous-members-0.o.abi: Likewise.
* tests/data/test-annotate/test0.abi: Likewise.
* tests/data/test-annotate/test1.abi: Likewise.
* tests/data/test-annotate/test13-pr18894.so.abi: Likewise.
* tests/data/test-annotate/test14-pr18893.so.abi: Likewise.
* tests/data/test-annotate/test15-pr18892.so.abi: Likewise.
* tests/data/test-annotate/test17-pr19027.so.abi: Likewise.
* tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise.
* tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
* tests/data/test-annotate/test2.so.abi: Likewise.
* tests/data/test-annotate/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise.
* tests/data/test-annotate/test21-pr19092.so.abi: Likewise.
* tests/data/test-annotate/test4.so.abi: Likewise.
* tests/data/test-annotate/test6.so.abi: Likewise.
* tests/data/test-annotate/test7.so.abi: Likewise.
* tests/data/test-annotate/test8-qualified-this-pointer.so.abi: Likewise.
* tests/data/test-read-dwarf/PR22015-libboost_iostreams.so.abi: Likewise.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
* tests/data/test-read-dwarf/PR24378-fn-is-not-scope.abi: Likewise.
* tests/data/test-read-dwarf/libtest23.so.abi: Likewise.
* tests/data/test-read-dwarf/libtest24-drop-fns-2.so.abi: Likewise.
* tests/data/test-read-dwarf/libtest24-drop-fns.so.abi: Likewise.
* tests/data/test-read-dwarf/test0.abi: Likewise.
* tests/data/test-read-dwarf/test1.abi: Likewise.
* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Likewise.
* tests/data/test-read-dwarf/test11-pr18828.so.abi: Likewise.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise.
* tests/data/test-read-dwarf/test13-pr18894.so.abi: Likewise.
* tests/data/test-read-dwarf/test14-pr18893.so.abi: Likewise.
* tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise.
* tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise.
* tests/data/test-read-dwarf/test17-pr19027.so.abi: Likewise.
* tests/data/test-read-dwarf/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise.
* tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
* tests/data/test-read-dwarf/test2.so.abi: Likewise.
* tests/data/test-read-dwarf/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi: Likewise.
* tests/data/test-read-dwarf/test4.so.abi: Likewise.
* tests/data/test-read-dwarf/test6.so.abi: Likewise.
* tests/data/test-read-dwarf/test7.so.abi: Likewise.
* tests/data/test-read-dwarf/test8-qualified-this-pointer.so.abi: Likewise.
* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Likewise.
* tests/data/test-read-write/test10.xml: Likewise.
* tests/data/test-read-write/test14.xml: Likewise.
* tests/data/test-read-write/test15.xml: Likewise.
* tests/data/test-read-write/test17.xml: Likewise.
* tests/data/test-read-write/test18.xml: Likewise.
* tests/data/test-read-write/test19.xml: Likewise.
* tests/data/test-read-write/test2.xml: Likewise.
* tests/data/test-read-write/test20.xml: Likewise.
* tests/data/test-read-write/test21.xml: Likewise.
* tests/data/test-read-write/test22.xml: Likewise.
* tests/data/test-read-write/test23.xml: Likewise.
* tests/data/test-read-write/test24.xml: Likewise.
* tests/data/test-read-write/test25.xml: Likewise.
* tests/data/test-read-write/test26.xml: Likewise.
* tests/data/test-read-write/test27.xml: Likewise.
* tests/data/test-read-write/test28-without-std-fns-ref.xml: Likewise.
* tests/data/test-read-write/test28-without-std-vars-ref.xml: Likewise.
* tests/data/test-read-write/test3.xml: Likewise.
* tests/data/test-read-write/test6.xml: Likewise.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
Kernel modules without exported symbols (no use of EXPORT_SYMBOL*()),
will not have a __ksymtab_strings section. Libabigail will therefore
assume they are usual ELF binaries. That leads to wrong results as
now all ELF symbols are considered part of the ABI. That is obviously
wrong. Instead consider binaries having a .modinfo section to be kernel
binaries. We keep the __ksymtab_strings condition as vmlinux has no
.modinfo section but a __ksymtab_strings if symbols are exported.
One case is still open (and requires maybe some documentation): if a
kernel does not export symbols (no module support), none of the
conditions apply. But, who would be interested in the ABI of a kernel
that does not expose any?
* src/abg-dwarf-reader.cc(is_linux_kernel_binary): consider
binaries only having a .modinfo section to be kernel binaries
Co-developed-by: Alessio Balsini <balsini@android.com>
Signed-off-by: Alessio Balsini <balsini@android.com>
Signed-off-by: Matthias Maennich <maennich@google.com>
On older compilers (such as g++ 4.8), the default C++ standard is set to
gnu++98. When compiling libabigail with --enable-cxx11=yes, src/ and
tests/ where compiled with the correct flag, while tools/ was compiled
without specifying a standard. With a compiler falling back to gnu++98
that leads to unresolved references when linking the tools against the
libabigail library. Fix that by consistently using the std= flag across
the code base.
* configure.ac: add -std=c++11 flag to CXXFLAGS when compiling
for C++11
* src/Makefile.am: drop now obsolete setting of the -std flag
* tests/Makefile.am: likewise
Reported-by: Chun-Hung Wu <Chun-hung.Wu@mediatek.com>
Signed-off-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
Libabigail's filtering engine fails to recognize an enum changing into
a compatible integer (or vice versa) as a harmless change.
This patch fixes that.
* include/abg-comparison.h (peel_typedef_or_qualified_type_diff):
Declare new function.
(peel_pointer_or_qualified_type_diff): Rename
peel_pointer_or_qualified_type into this.
* include/abg-fwd.h (is_enum_type): Declare a new overload for
type_or_decl_base*.
* src/abg-comp-filter.cc (has_harmless_enum_to_int_change): Define
new static function.
* src/abg-comparison.cc (categorize_harmless_diff_node): Use the
new has_harmless_enum_to_int_change here.
(peel_pointer_or_qualified_type_diff): Renamed
peel_pointer_or_qualified_type into this.
(is_diff_of_basic_type): Adjust.
(peel_typedef_or_qualified_type_diff): Define new function.
* test-diff-filter/PR24787-lib{one, two}.so: New test input
binaries.
* test-diff-filter/PR24787-{one, two}.c: Source files of the test
input binaries above.
* test-diff-filter/PR24787-report-0.txt: Test output reference.
* tests/data/Makefile.am: Add the new testing material to source
distribution.
* tests/test-diff-filter.cc (in_out_specs): Add the new test to
the test harness.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
While doing my recent optimization work, it became useful to have an
idea of the time different parts of the processing pipeline are
taking.
This patch introduces an abigail::tools_utils::timer type that is easy
to use to time a given part of the code and emit the elapsed time to
an output stream.
This abigail::tools_utils::timer type is thus used to time various
parts of the processing pipeline involved in abidw. Just using the
existing --verbose option now yields timing information.
* include/abg-tools-utils.h (class timer): Declare new type.
(operator<<(ostream&, const timer&)): Declare new streaming
operator for the new timer type.
* src/abg-tools-utils.cc (struct timer::priv): Define new type.
(timer::{timer, start, stop, value_in_seconds, value,
value_as_string, ~timer}): Define member functions.
(operator<<(ostream& o, const timer& t)): Define streaming
operator.
(build_corpus_group_from_kernel_dist_under): Add timing logs to
the linux kernel reading process.
* src/abg-dwarf-reader.cc
(read_context::canonicalize_types_scheduled): Add timing logs to
type canonicalization.
(read_debug_info_into_corpus): Add timing logs for the whole debug
info loading and internal representation building process.
* tools/abidw.cc (load_corpus_and_write_abixml): Add timing logs
for the binary loading and serizalization process.
(load_kernel_corpus_group_and_write_abixml): Add timing logs the
Linux Kernel binary loading and writing process.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
GCC 9+ rightfully complains about some indentation issues in the
types_defined_same_linux_kernel_corpus_public function.
This patch fixes it and adds more comments.
* src/abg-ir.cc (types_defined_same_linux_kernel_corpus_public):
Fix indentation and add comments.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
During type canonicalization there are observations that can speed-up
type comparison significantly without impacting correctness too much.
Typically, when two types are of the same name and kind, are found in
the same corpus and are defined in the same translation unit, they
ought to be the same type, even in C. So there is no need in this
case to actually perform the structural comparison of the two types
which does have a quadratic performance at best.
Using this optimization made the loading of the
drivers/gpu/drm/i915/i915.ko module go from a quasi inifite time (many
hours on my system) to less than two minutes. I am confining this
optimization to the Linux kernel case only for now, but I believe it
could benefit all C programs. I am waiting for more testing before
applying it more broadly.
Also, while looking at this, I noticed that when loading several
corpora into a given corpus group (i.e, loading several linux kernel
binaries to represent a single conceptual kernel), we sometimes fail
to recognize that a type defined in a header file that is included in
several corpora is actually the same type, and should be re-used,
rather than being re-defined in each corpus. This later adds stress
(time and space) on the system as we need to canonicalize and
de-duplicate these type later on.
This is because the "per-corpus" type maps that we use to lookup a
type by name and location when we see it (so that we know it's defined
in a different corpus of our current group) should really be
per-corpus-group type maps! That is a type can be defined in the
corpus representing a .ko binary, and that type would be seen again in
another .ko binary later. Until now, we were wrongly considering that
types were to be first defined in the corpus of the vmlinux binary,
and then could be re-used later.
I have thus fixed the code so that whenever we add a type to its
scope, the relevant per-corpus type maps are updated, as well as the
per-corpus-group ones, so that we can later lookup types in those
per-corpus-group type maps to know if a type is already defined in any
corpus of the group.
* include/abg-corpus.h (corpus::origin): Add a new
LINUX_KERNEL_BINARY_ORIGIN enumerator.
(corpus::{s,g}et_group): Declare new member
functions.
(class corpus): Make the corpus_group class friend of this one.
(corpus_group::get_main_corpus): Declare new member function.
* src/abg-corpus-priv.h (corpus::priv::group): Define new data
member.
(corpus::priv::priv): Initialize the new corpus::priv::group data
member.
* src/abg-corpus.cc (corpus::{g,s}et_group): Define new member
functions.
(corpus_group::get_main_corpus): Likewise.
(corpus_group::add_corpus): Use the new corpus::set_group() here
to to make the corpus be aware of the group it belongs to.
* src/abg-dwarf-reader.cc (read_debug_info_into_corpus): Set the
current corpus origin to the corpus::LINUX_KERNEL_BINARY_ORIGIN if
we are looking at a Linux Kernel binary.
(read_context::main_corpus_from_current_group): Use the
corpus_group::get_main_corpus method.
(should_reuse_type_from_corpus_group): Return the corpus group,
rather than the main corpus.
(read_debug_info_into_corpus): Add the current corpus to the
current corpus group before the debug info reading is done. That
way, the corpus group will be accessible from the current corpus
during the construction of the internal representation.
(read_and_add_corpus_to_group_from_elf): Add the corpus to the
group only if it wasn't added to it before.
* include/abg-ir.h (operator{==,!=}): Declare new deep equality
and inequality operators for class_or_union_sptr and
union_decl_sptr.
* src/abg-ir.cc (types_defined_same_linux_kernel_corpus_public):
Define a new static function.
(type_base::get_canonical_type_for): Use the new
types_defined_same_linux_kernel_corpus_public here to speed up
type comparison.
(equals): In the overload of class_or_union, use the new
types_defined_same_linux_kernel_corpus_public as well, to speed up
type comparison.
(operator{==,!=}): Define new deep equality and inequality
operators for class_or_union_sptr and union_decl_sptr.
(maybe_update_types_lookup_map): In the overload function for
type_decl_sptr, class_decl_sptr, union_decl_sptr,
enum_type_decl_sptr, typedef_decl_sptr, qualified_type_def_sptr,
reference_type_def_sptr, array_type_def_sptr,
array_type_def::subrange_sptr, and function_type_sptr, update the
type lookup maps of the containing corpus group as well, not just
the ones of the current corpus.
* src/abg-reader.cc (build_enum_type_decl): Forgot to set the
"is-anonymous" flag. Oops, fix this.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Adjust.
* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Adjust.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Adjust.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi: Adjust.
* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Adjust.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
Introduce a compatibility layer for C++11 code by adding
include/abg-cxx-compat.h. abg-cxx-compat defines a new namespace
abg_compat and defines
abg_compat::hash
abg_compat::shared_ptr
abg_compat::weak_ptr
abg_compat::dynamic_pointer_cast
abg_compat::static_pointer_cast
abg_compat::unordered_map
abg_compat::unordered_set
based on definitions from std::tr1 (std=gnu++98) or std:: (std=gnu++11).
I decided for introducing abg_compat:: rather than polluting abigail::
to allow an easier transition to C++11 at a later time and to not subtly
break existing code.
As the shared_ptr in C++11 defines shared_ptr::operator bool() explicit,
some locations where a shared_ptr is assigned to boolean, needed to be
adjusted to explicitly cast to bool.
* include/abg-cxx-compat.h: new file introducing the abg_compat
namespace to provide C++11 functionality from either std::tr1
or std::
* include/Makefile.am: Add the new abg-cxx-compat.h to source
distribution.
* include/abg-comparison.h: replace std::tr1 usage by abg_compat
and adjust includes accordingly: likewise
* include/abg-diff-utils.h: likewise
* include/abg-fwd.h: likewise
* include/abg-ini.h: likewise
* include/abg-interned-str.h: likewise
* include/abg-ir.h: likewise
* include/abg-libxml-utils.h: likewise
* include/abg-libzip-utils.h: likewise
* include/abg-reporter.h: likewise
* include/abg-sptr-utils.h: likewise
* include/abg-suppression.h: likewise
* include/abg-tools-utils.h: likewise
* include/abg-workers.h: likewise
* src/abg-comp-filter.cc: likewise
* src/abg-comparison-priv.h: likewise
* src/abg-corpus.cc: likewise
* src/abg-dwarf-reader.cc: likewise
* src/abg-hash.cc: likewise
* src/abg-ir.cc: likewise
* src/abg-reader.cc: likewise
* src/abg-suppression.cc: likewise
* src/abg-tools-utils.cc: likewise
* src/abg-writer.cc: likewise
* tests/test-diff-filter.cc: likewise
* tests/test-diff-pkg.cc: likewise
* tests/test-read-dwarf.cc: likewise
* tests/test-read-write.cc: likewise
* tests/test-types-stability.cc: likewise
* tests/test-write-read-archive.cc: likewise
* tools/abicompat.cc: likewise
* tools/abidiff.cc: likewise
* tools/abidw.cc: likewise
* tools/abilint.cc: likewise
* tools/abipkgdiff.cc: likewise
Signed-off-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
__gnu_cxx::stdio_filebuf is a GNU extension only available in certain
std libraries. It is not e.g. in libc++. In order to be able to compile
with using libc++, replace the usage of __gnu_cxx::stdio_filebuf with
standard C++ methods. In this case, reopen the temporary file with a
std::fstream and expose that stream rather than the previously exposed
std::iostream.
* include/abg-tools-utils.h (get_stream): Change return type to
std::fstream
* src/abg-corpus.cc: remove unused #include of ext/stdio_filebuf.h
* src/abg-tools-utils (temp_file::priv): remove filebuf_ member,
and replace iostream_ by fstream_ with changing the shared_ptr
type accordingly
(temp_file::priv::priv): initialize fstream_ based on
temporary file name
(temp_file::priv::~priv): adjust destruction accordingly
(temp_file::is_good): test the fstream rather than the fd
(temp_file::get_stream): adjust return type to std::fstream
and adjust implementation based on the changes in temp_file::priv
* src/Makefile.am: remove gnu extension from c++ standard flag
* tests/Makefile.am: likewise
Signed-off-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
* src/abg-dwarf-reader.cc (addr_elf_symbol_sptr_map_sptr): Fix a
typo in the comment of this typedef.
* src/abg-ir.cc (hash_type_or_decl): Fix typo in a comment.
* src/abg-writer.cc (write_translation_unit): Remove useless
vertical space.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
It looks like due to a typo, we are never caching the name of the
function_type, so we are computing it all the time, *OOOPS*. So this
is having an impact when comparing instance of function_type during
de-duplication at abixml writting time.
Things are faster now, thanks to this patch.
* src/abg-ir.cc (function_type::get_cached_name): Really cache the
computed name of function_type instances.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
When we dynamically hash types in the abixml writter, we use
hash_type_or_decl. This function uses runtime type identification to
determine if the (type) artifact is a decl or a type, and based on
that, choose how to compute its hash value. Profiling shows that
using the RTTI in hash_type_or_decl at this point is a hotspot.
Because we know that the type ABI is a *type*, we obviously can avoid
using RTTI there.
The patch thus implements a hash_type function, and uses that in the
xml writter. Emitting the abixml output is faster with this patch.
* include/abg-fwd.h (hash_type): Declare new function.
* src/abg-ir.cc (hash_type): Define new function.
* src/abg-writer.cc (type_hasher::operator()): Use the new
hash_type rather than the old hash_type_or_decl.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
Profiling showed that a number of use of dynamic_cast are a speed
bottleneck.
This patch implements a poor-man's RTTI that allows us to implement a
form of dynamic_cast that is specific to the types of the internal
reprenstation that are in the namespace abigail::ir. It speeds up
things greatly.
Basically, the base type of all ABI artifacts
(abigail::ir::type_or_decl_base) now contains three new data members.
The first one contains a bitmap that identifies the type of artifact.
The second one contains a pointer to the dynamic type sub-object of
the current instance of the artifact. The last one contains either a
pointer to the type_base sub-object of the current instance of ABI
artifact if it's a type, or a pointer to the type_decl sub-object of
the current instance.
Together these three data members allow the patch to implement the
abigail::ir::{is_type(), is_decl(), is_<type_kind>_type} functions
that we need to make the code base noticeably faster when using abidw
on a big vmlinux binary.
* include/abg-fwd.h (is_type_decl): Replace the overloads
that takes a type_base* and/or a decl_base* by one that takes a
type_or_decl_base*.
* include/abg-ir.h (type_or_decl_base::type_or_decl_kind): Define
new enum.
(type_or_decl_base::{kind, runtime_type_instance,
type_or_decl_base_pointer}): Declare new accessors.
(operator{|,|=,&,&=): Declare new operators for the new
type_or_decl_base::type_or_decl_kind enum.
(global_scope::global_scope): Move the definition of this
constructor to ...
* src/abg-ir.cc (global_scope::global_scope): ... here.
(type_or_decl_base::priv::{kind_, rtti_, type_or_decl_ptr_}):
Add new data members.
(type_or_decl_base::priv::priv): Take a
type_or_decl_base::type_or_decl_kind enum.
(type_or_decl_base::priv::kind): Define new accessors.
(operator{|,|=,&,&=): Define new operators for the new
type_or_decl_base::type_or_decl_kind enum.
(type_or_decl_base::type_or_decl_base): Take a
type_or_decl_base::type_or_decl_kind enum.
(type_or_decl_base::{kind, runtime_type_instance,
type_or_decl_base_pointer}): Define new accessors.
(decl_base::decl_base, scope_decl::scope_decl)
(type_base::type_base, scope_type_decl::scope_type_decl)
(class_or_union::class_or_union) : Adjust to set the runtime type
identifier of the instances of these types.
(global_scope::global_scope, type_decl::type_decl)
(qualified_type_def::qualified_type_def)
(pointer_type_def::pointer_type_def)
(reference_type_def::reference_type_def
array_type_def::subrange_type::subrange_type)
(array_type_def::array_type_def, enum_type_decl::enum_type_decl)
(typedef_decl::typedef_decl, var_decl::var_decl)
(function_type::function_type, method_type::method_type)
(function_decl::function_decl)
(function_decl::parameter::parameter, method_decl::method_decl)
(class_decl::class_decl, class_decl::base_spec::base_spec)
(union_decl::union_decl, template_decl::template_decl)
(type_tparameter::type_tparameter)
(non_type_tparameter::non_type_tparameter)
(template_tparameter::template_tparameter)
(type_composition::type_composition)
(function_tdecl::function_tdecl, function_tdecl::function_tdecl)
(class_tdecl::class_tdecl):
Likewise and call runtime_type_instance() here to set the runtime
type instance pointers of the current instance.
(is_decl, is_type, is_class_type, is_pointer_type): Adjust to use
the new poor-man's rtti machinery.
(is_type_decl): Replace the overloads that takes a type_base*
and/or a decl_base* by one that takes a type_or_decl_base*.
(pointer_type_def::operator==, class_decl::operator==): Use the
poor-man's rtti machinery to replace dynamic_cast.
hash_type_or_decl: Replace dynamic_cast<const type_base> by
is_type() and dynamic_cast<const decl_base*> by is_decl().
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
For a reason, anonymous types are not canonicalized. I think this is
due to the fact that because they have no name,
read_context::lookup_type_from_die(die) used by maybe_canonicalize_type()
falls short in trying to canonicalize the *DIE*.
So later, at comparison time, things can be really slow because we
can't do canonical comparison; we ressort to structural comparison.
This patch ensures that even anonymous types are canonicalized.
* src/abg-dwarf-reader.cc (maybe_canonicalize_type): Add two new
overloads. One that takes type_base_sptr, one that takes a
Dwarf_Die* and type_base_sptr. These force canonicalization for
anonymous types.
(build_function_type): Schedule function types for
canonicalization.
(build_ir_node_from_die): For struct/classes and unions, use the
new overload of maybe_canonicalize_type to schedule
canonicalization.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Adjust.
* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Adjust.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Adjust.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi: Adjust.
* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Adjust.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
In preparation for some coming patches, I figured it'd be more type
safe to make the Dwarf_Die parameter of maybe_canonicalize_type be a
const pointer. The patch subsequently adjusts code that needs adjusting.
* src/abg-dwarf-reader.cc (maybe_canonicalize_type): Make the
first parameter const.
(read_context::{get_canonical_die, lookup_artifact_from_die,
lookup_type_from_die, schedule_type_for_late_canonicalization}):
Adjust.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
Since the previous commit filters out harmless changes inside unions,
this one allows those changes to be reported whenever the user runs
abidiff with the --harmless option.
The patch just displays the "before" and "after" of the union because
some of the harmless changes are not tracked anymore.
Because we want to display the before and after of the union we use
the function get_class_or_union_flat_representation. The patch adds a
"qualified_name" boolean parameter to that function so that we can
choose to display union members names in a non-qualified fashion,
which is the natural way of displaying those names in the context of a
union (or class) representation. Because
get_class_or_union_flat_representation uses
type_or_decl_base::get_pretty_representation, the patch has also added a
"qualified_name" boolean parameter to that function so that we can
choose to display names in a non-qualified manner.
* include/abg-fwd.h (get_class_or_union_flat_representation): Add
a "qualified_name" boolean parameter.
* include/abg-ir.h ({type_or_decl_base, decl_base, type_decl,
namespace_decl, array_type_def::subrange_type, array_type_def,
enum_type_decl, typedef_decl, var_decl, function_decl,
function_decl::parameter, function_type, method_type, class_decl,
union_decl}::get_pretty_representation): Likewise.
* src/abg-ir.cc ({type_or_decl_base, decl_base, type_decl,
namespace_decl, array_type_def::subrange_type, array_type_def, enum_type_decl,
typedef_decl, var_decl, function_decl, function_decl::parameter,
function_type, method_type, class_decl, union_decl,
}::get_pretty_representation): Adjust the code to emit qualified
or non-qualified names depending on the new "qualified_name"
boolean parameter.
(get_class_or_union_flat_representation): Likewise.
* src/abg-default-reporter.cc (default_reporter::report): Use
get_class_or_union_flat_representation with the new
"qualified_name" boolean set to false.
* tests/data/test-diff-dwarf/test38-union-report-0.txt: Adjust.
* tests/test-diff-filter.cc (in_out_specs): Run the test harness
on test-PR24731-v{0,1}.o make abidiff use the --harmless option.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
When union data members are re-ordered, abidiff reports the
re-ordering as if it was a meaningful ABI change.
This patch teaches Libabigail to categorize that benign type layout
change as a HARMLESS_UNION_CHANGE_CATEGORY kind of change and ignore it.
* include/abg-comp-filter.h (union_diff_has_harmless_changes):
Declare new function and ...
* src/abg-comp-filter.cc (union_diff_has_harmless_changes):
... define it here.
(categorize_harmless_diff_node): Use the new
union_diff_has_harmless_changes here.
* include/abg-comparison.h (HARMLESS_UNION_CHANGE_CATEGORY): Add a
new enumerator to diff_category enum. Adjust the value of the
other enumerators.
* src/abg-comparison.cc (get_default_harmless_categories_bitmap):
Add the new HARMLESS_UNION_CHANGE_CATEGORY in here.
(operator<<(ostream& o, diff_category c)): Support the new
HARMLESS_UNION_CHANGE_CATEGORY.
* tests/data/test-diff-filter/test-PR24731-report-0.txt: Likewise.
* tests/data/test-diff-filter/test-PR24731-report-1.txt: Likewise.
* tests/data/test-diff-filter/test-PR24731-v0.c: Likewise.
* tests/data/test-diff-filter/test-PR24731-v0.o: Likewise.
* tests/data/test-diff-filter/test-PR24731-v1.c: Likewise.
* tests/data/test-diff-filter/test-PR24731-v1.o: Likewise.
* tests/data/Makefile.am: Add the new test material above to
source distribution.
* tests/test-diff-filter.cc (in_out_spec): Add the new test input
to this test harness.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
When comparing internal decl names (as part of decl comparison), we
need to take into account the fact that a given decl might be
anonymous and that it might have anonymous scopes in its tree of
containing scopes.
For instance, "__anonymous_struct__1::foo" and
"__anonymous_struct__2::foo" are considered equivalent.
So are "__anonymous_struct__1::foo::__anonymous_struct__2::bar" and
"__anonymous_struct__10::foo::__anonymous_struct__11::bar".
But "__anonymous_struct__1::bar::__anonymous_struct__2::baz" and
"__anonymous_struct__10::foo::__anonymous_struct__11::bar" are not.
This patch introduces the function tools_utils::decl_names_equal that
compares fully qualified names by taking into account anonymous
component names.
That function is thus used in the equals() function overload for
decl_base types. Because tools_utils::decl_names_equal compares strings the
usual way (character by character) it's slower than comparing
instances of interned_string in a O(1) time. So the patch carefully
tries to use tools_utils::decl_names_equal sparringly; that is, it
uses it only when we are looking at decls that have some anonymous
scope. That way, we use the fast interned_string comparison most of
the time. By doing this, we barely see any performance degradation
while running abidw --noout on a full blown vmlinux binary.
* include/abg-ir.h (decl_base::{get_has_anonymous_parent,
set_has_anonymous_parent,
get_is_anonymous_or_has_anonymous_parent}): Declare new member
functions.
* src/abg-ir.cc (decl_base::priv::has_anonymous_parent_): Define
new data member.
(decl_base::priv): Initialize the new data member.
(decl_base::{get_has_anonymous_parent, set_has_anonymous_parent,
get_is_anonymous_or_has_anonymous_parent}): Define new member
functions.
(equals): In the overload for decl_base, use the new
decl_names_equal for decls that have anonymous scopes.
(scope_decl::add_member_decl): Propagate the
decl_base::has_anonymous_parent_ property.
* include/abg-tools-utils.h
(get_anonymous_struct_internal_name_prefix)
(get_anonymous_union_internal_name_prefix)
(get_anonymous_enum_internal_name_prefix, decl_names_equal):
Declare new functions.
* src/abg-comp-filter.cc (has_harmless_name_change): Handle the
case where the name change is actually from an anonymous name to
another one, using the new decl_names_equal function.
* src/abg-dwarf-reader.cc
(get_internal_anonymous_die_prefix_name): Renamed
get_internal_anonynous_die_base_name into this. Use the new
get_anonymous_{struct, union, enum}_internal_name_prefix functions
here.
(get_internal_anonymous_die_name, die_qualified_type_name)
(build_enum_type, add_or_update_class_type)
(add_or_update_union_type): Adjust.
* src/abg-tools-utils.cc (get_anonymous_struct_internal_name_prefix)
(get_anonymous_union_internal_name_prefix)
(get_anonymous_enum_internal_name_prefix, decl_names_equal):
Define new functions.
* tests/test-tools-utils.cc: New test file.
* tests/Makefile.am: Add new runtesttoolsutils test, built from
test-tools-utils.cc.
* tests/data/test-diff-dwarf/test46-rust-report-0.txt: Adjust.
* tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-3.txt:
Likewise.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
In order to build this (external!) header file stand-alone, it required
some minor fixes. I.e. adding some includes and using declarations.
* include/abg-reporter.h: fix includes and using declarations
Signed-off-by: Matthias Maennich <maennich@google.com>
While looking at something else, I realized that
compare_dies_string_attribute_value had some indentation that was off.
Fixed thus.
* src/abg-dwarf-reader.cc (compare_dies_string_attribute_value):
Fix indentation.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
Profiling shows that the *number of calls* to
compare_dies_string_attribute_value by compare_as_decl_dies represents
a hotspot. That is, compare_dies_string_attribute_value itself
doesn't necessarily takes long, but it's being called "too much",
especially when compare_as_decl_dies is called for DIEs representing
classes/structs.
This patch thus reduces the calls to
compare_dies_string_attribute_value from 3 to 1 for classes/structs.
This optimization makes abidw's reading be 10% faster (from ~5min:15s
to ~ 4min:45s) on a fullblow vmlinux. Note that abidw writting time
hasn't been yet optimized.
* src/abg-dwarf-reader.cc (die_is_class_type): Take a const
pointer to Dwarf_Die.
(compare_as_decl_dies): For classes/structs, call
compare_dies_string_attribute_value just once to compare the
DW_AT_name attribute values.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
When looking at a C program, during function decl DIE de-duplication
at we can rely on linkage names of function declarations to quickly
determine if two function decls are equal, in a given binary.
This patch uses that observation to speed up function decl DIE
de-duplication. abidw --noout vmlinux goes from 8 to 5 minutes with
this.
* src/abg-dwarf-reader.cc (read_context::{die_is_in_c,
die_is_in_c_or_cplusplus}): Define new member functions.
(fn_die_equal_by_linkage_name): Define new static function.
(compare_dies): In the case for for DW_TAG_subprogram, use the new
fn_die_equal_by_linkage_name.
* tests/data/test-annotate/test15-pr18892.so.abi: Adjust.
* tests/data/test-annotate/test21-pr19092.so.abi: Adjust.
* tests/data/test-read-dwarf/test15-pr18892.so.abi: Adjust.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Adjust.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
Whenever we see a function type inside a translation unit, if it
matches one that has already been seen -- i.e, one that has the same
textual representation -- we should be able to re-use that same
function type without having to compare their types to be sure they
are the same, as part of the type canonicalization process.
This slittly increases analysis speed (by a few tens of seconds on a
total of 8 minutes) by decreasing the load on type canonicalization
when anlyzing vmlinux. It also slightly reduces memory consumption,
so I am getting it in for now.
* src/abg-dwarf-reader.cc (istring_fn_type_map_type): Declare new
typedef.
(die_is_function_type): Define new static function.
(read_context::per_tu_repr_to_fn_type_maps_): Define new data
member ...
(read_context::per_tu_repr_to_fn_type_maps): ... and its accessor.
(read_context::{associate_die_repr_to_fn_type_per_tu,
lookup_fn_type_from_die_repr_per_tu}): Define new member
functions.
(build_function_type): Use the new
read_context::lookup_fn_type_from_die_repr_per_tu and
read_context::associate_die_repr_to_fn_type_per_tu functions,
instead of read_context::lookup_type_from_die.
* tests/data/test-annotate/test13-pr18894.so.abi: Adjust.
* tests/data/test-annotate/test14-pr18893.so.abi: Adjust.
* tests/data/test-annotate/test21-pr19092.so.abi: Adjust.
* tests/data/test-read-dwarf/test13-pr18894.so.abi: Adjust.
* tests/data/test-read-dwarf/test14-pr18893.so.abi: Adjust.
* tests/data/test-read-dwarf/test16-pr18904.so.abi: Adjust.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Adjust.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
This is another attempt at handling anonymous decls comparison. It's
not the full blown method that I'd like, but this one seems to be fast
enough. In this method, we take the immediate scope (and whether it's
anonymous or not) of the anonymous decl into account.
* include/abg-interned-str.h (interned_string::clear): Add new
member function.
* src/abg-ir.cc (equals): In the overload for decl_base, consider
the scope of the current (anonymous) decl. If that scope is
anonymous then take that into account as well.
* tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-3.txt:
Adjust.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>