dwarf-reader: Leverage ODR & DWZ

When DWARF debug info has been preprocessed with the DWZ tool, I think
we can assume that if two DIEs originating from the .gnu_debugaltlink
section have different offset, it means they are different, even if
they represent two types of the same nature and of the same name.
This is the whole point of DWZ.

When we process two DIEs originating from C++, it's possible "in
general" to assume that the One Definition Rule is in effect, meaning
that if two types of the same nature have the same name, they ought to
represent the same entity, meaning, they are the same type.

These two observations can lead to faster comparison of two aggregate
types of the same nature and of the same name.

This patch implements these two optimizations and use them by
default.

The first one is used by default on binaries that contains a
.gnu_debugaltlink section, which is the hint we use to detect that DWZ
was used to factorize DWARF DIEs.

The second of is used by default on DWARF DIEs originating from C++.

These optimizations can be de-activated on abidw and abidiff using the
--no-leverage-dwarf-factorization and --no-assume-odr-for-cplusplus.

	* doc/manuals/abidiff.rst: Add documentation for
	--no-leverage-dwarf-factorization and
	--no-assume-odr-for-cplusplus
	* doc/manuals/abidw.rst: Likewise.
	* doc/manuals/abipkgdiff.rst: Likewise.
	* include/abg-fe-iface.h (options::{leverage_dwarf_factorization,
	assume_odr_for_cplusplus}): New data members.
	* src/abg-dwarf-reader.cc (reader::leverage_dwarf_factorization_):
	New data member.
	(reader::leverage_dwarf_factorization): New accessor.
	(compare_dies): If we are allowed to leverage the DWARF
	factorization and if two type DIEs coming from the
	.gnu_debugaltlink DWARF section have different offset, then they
	are different.  Also, if we are allowed to assume ODR, use it to
	speed up class/struct/unions comparisons.
	* tools/abidiff.cc (options::{assume_odr_for_cplusplus,
	leverage_dwarf_factorization}): Define new data members.
	(options::options): Initialize them.
	(display_usage): Add new help strings for
	--no-leverage-dwarf-factorization and
	--no-assume-odr-for-cplusplus.
	(parse_command_line): Parse these new options.
	(set_generic_options): New function.
	(main): Use the new set_generic_options function.
	* tools/abidw.cc (options::{assume_odr_for_cplusplus,
	leverage_dwarf_factorization}): Define new data members.
	(options::options): Initialize them.
	(display_usage): Add new help strings for
	--no-leverage-dwarf-factorization and
	--no-assume-odr-for-cplusplus.
	(parse_command_line): Parse these new options.
	(set_generic_options): New function.
	(load_corpus_and_write_abixml): Use the new set_generic_options
	function.
	* tools/abipkgdiff.cc (options::{assume_odr_for_cplusplus,
	leverage_dwarf_factorization}): Define new data members.
	(options::options): Initialize them.
	(display_usage): Add new help strings for
	--no-leverage-dwarf-factorization and
	--no-assume-odr-for-cplusplus.
	(parse_command_line): Parse these new options.
	(set_generic_options): New function.
	(compare): Use it.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
This commit is contained in:
Dodji Seketeli 2022-11-23 11:54:08 +01:00
parent 754364d5ca
commit d222b44733
9 changed files with 205 additions and 15 deletions

View File

@ -580,6 +580,26 @@ Options
changes. Added or removed functions and variables do not have any
diff nodes tree associated to them.
* ``--no-assume-odr-for-cplusplus``
When analysing a binary originating from C++ code using `DWARF`_
debug information, libabigail assumes the `One Definition Rule`_
to speed-up the analysis. In that case, when several types have
the same name in the binary, they are assumed to all be equal.
This option disables that assumption and instructs libabigail to
actually actually compare the types to determine if they are
equal.
* ``--no-leverage-dwarf-factorization``
When analysing a binary which `DWARF`_ debug information was
processed with the `DWZ`_ tool, the type information is supposed
to be already factorized. That context is used by libabigail to
perform some speed optimizations.
This option disables those optimizations.
* ``--ctf``
When comparing binaries, extract ABI information from `CTF`_ debug
@ -810,3 +830,6 @@ Usage examples
.. _ELF: http://en.wikipedia.org/wiki/Executable_and_Linkable_Format
.. _DWARF: http://www.dwarfstd.org
.. _CTF: https://raw.githubusercontent.com/wiki/oracle/binutils-gdb/files/ctf-spec.pdf
.. _ODR: https://en.wikipedia.org/wiki/One_Definition_Rule
.. _One Definition Rule: https://en.wikipedia.org/wiki/One_Definition_Rule
.. _DWZ: https://sourceware.org/dwz

View File

@ -327,6 +327,25 @@ Options
This option is available only if the package was configured with
the --enable-debug-type-canonicalization option.
* ``--no-assume-odr-for-cplusplus``
When analysing a binary originating from C++ code using `DWARF`_
debug information, libabigail assumes the `One Definition Rule`_
to speed-up the analysis. In that case, when several types have
the same name in the binary, they are assumed to all be equal.
This option disables that assumption and instructs libabigail to
actually actually compare the types to determine if they are
equal.
* ``--no-leverage-dwarf-factorization``
When analysing a binary which `DWARF`_ debug information was
processed with the `DWZ`_ tool, the type information is supposed
to be already factorized. That context is used by libabigail to
perform some speed optimizations.
This option disables those optimizations.
* ``--ctf``
@ -370,3 +389,6 @@ standard `here
.. _GNU: http://www.gnu.org
.. _Linux Kernel: https://kernel.org/
.. _CTF: https://raw.githubusercontent.com/wiki/oracle/binutils-gdb/files/ctf-spec.pdf
.. _ODR: https://en.wikipedia.org/wiki/One_Definition_Rule
.. _One Definition Rule: https://en.wikipedia.org/wiki/One_Definition_Rule
.. _DWZ: https://sourceware.org/dwz

View File

@ -528,6 +528,26 @@ Options
$ abipkgdiff --self-check --d1 mesa-libGLU-debuginfo-9.0.1-3.fc33.x86_64.rpm mesa-libGLU-9.0.1-3.fc33.x86_64.rpm
==== SELF CHECK SUCCEEDED for 'libGLU.so.1.3.1' ====
$
* ``--no-assume-odr-for-cplusplus``
When analysing a binary originating from C++ code using `DWARF`_
debug information, libabigail assumes the `One Definition Rule`_
to speed-up the analysis. In that case, when several types have
the same name in the binary, they are assumed to all be equal.
This option disables that assumption and instructs libabigail to
actually actually compare the types to determine if they are
equal.
* ``--no-leverage-dwarf-factorization``
When analysing a binary which `DWARF`_ debug information was
processed with the `DWZ`_ tool, the type information is supposed
to be already factorized. That context is used by libabigail to
perform some speed optimizations.
This option disables those optimizations.
* ``--ctf``
@ -554,3 +574,6 @@ In the later case, the value of the exit code is the same as for the
.. _DWARF: http://www.dwarfstd.org
.. _CTF: https://raw.githubusercontent.com/wiki/oracle/binutils-gdb/files/ctf-spec.pdf
.. _Development Package: https://fedoraproject.org/wiki/Packaging:Guidelines?rd=Packaging/Guidelines#Devel_Packages
.. _ODR: https://en.wikipedia.org/wiki/One_Definition_Rule
.. _One Definition Rule: https://en.wikipedia.org/wiki/One_Definition_Rule
.. _DWZ: https://sourceware.org/dwz

View File

@ -63,7 +63,9 @@ protected:
bool load_all_types = false;
bool drop_undefined_syms = false;
bool show_stats = false;
bool do_log = false;;
bool do_log = false;
bool leverage_dwarf_factorization = true;
bool assume_odr_for_cplusplus = true;
options_type(environment&);
};// font_end_iface::options_type

View File

@ -74,6 +74,7 @@ using std::stack;
using std::deque;
using std::list;
using std::map;
using abg_compat::optional;
using namespace elf_helpers; // TODO: avoid using namespace
@ -1857,6 +1858,7 @@ public:
mutable size_t compare_count_;
mutable size_t canonical_propagated_count_;
mutable size_t cancelled_propagation_count_;
mutable optional<bool> leverage_dwarf_factorization_;
protected:
@ -5114,6 +5116,29 @@ public:
load_in_linux_kernel_mode(bool f)
{options().load_in_linux_kernel_mode = f;}
/// Test if it's allowed to assume that the DWARF debug info has
/// been factorized (for instance, with the DWZ tool) so that if two
/// type DIEs originating from the .gnu_debugaltlink section have
/// different offsets, they represent different types.
///
/// @return true iff we can assume that the DWARF debug info has
/// been factorized.
bool
leverage_dwarf_factorization() const
{
if (!leverage_dwarf_factorization_.has_value())
{
if (options().leverage_dwarf_factorization
&& elf_helpers::find_section_by_name(elf_handle(),
".gnu_debugaltlink"))
leverage_dwarf_factorization_ = true;
else
leverage_dwarf_factorization_ = false;
}
ABG_ASSERT(leverage_dwarf_factorization_.has_value());
return *leverage_dwarf_factorization_;
}
/// Getter of the "show_stats" flag.
///
/// This flag tells if we should emit statistics about various
@ -10556,6 +10581,12 @@ compare_dies(const reader& rdr,
if (l_offset == r_offset)
return COMPARISON_RESULT_EQUAL;
if (rdr.leverage_dwarf_factorization()
&& (l_die_source == ALT_DEBUG_INFO_DIE_SOURCE
&& r_die_source == ALT_DEBUG_INFO_DIE_SOURCE))
if (l_offset != r_offset)
return COMPARISON_RESULT_DIFFERENT;
comparison_result result = COMPARISON_RESULT_EQUAL;
if (maybe_get_cached_type_comparison_result(rdr, l_tag,
dies_being_compared,
@ -10706,6 +10737,12 @@ compare_dies(const reader& rdr,
if (!compare_as_decl_and_type_dies(rdr, l, r))
SET_RESULT_TO_FALSE(result, l, r);
else if (rdr.options().assume_odr_for_cplusplus
&& rdr.odr_is_relevant(l)
&& rdr.odr_is_relevant(r)
&& !die_is_anonymous(l)
&& !die_is_anonymous(r))
result = COMPARISON_RESULT_EQUAL;
else
{
aggregates_being_compared.add(dies_being_compared);

View File

@ -346,7 +346,7 @@ static InOutSpec in_out_specs[] =
{
"data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64.rpm",
"data/test-diff-pkg/tbb-4.3-3.20141204.fc23.x86_64.rpm",
"--no-default-suppression",
"--no-default-suppression --no-assume-odr-for-cplusplus",
"",
"data/test-diff-pkg/tbb-debuginfo-4.1-9.20130314.fc22.x86_64.rpm",
"data/test-diff-pkg/tbb-debuginfo-4.3-3.20141204.fc23.x86_64.rpm",

View File

@ -108,6 +108,8 @@ struct options
bool show_redundant_changes;
bool show_symbols_not_referenced_by_debug_info;
bool show_impacted_interfaces;
bool assume_odr_for_cplusplus;
bool leverage_dwarf_factorization;
bool dump_diff_tree;
bool show_stats;
bool do_log;
@ -159,6 +161,8 @@ struct options
show_redundant_changes(),
show_symbols_not_referenced_by_debug_info(true),
show_impacted_interfaces(),
assume_odr_for_cplusplus(true),
leverage_dwarf_factorization(true),
dump_diff_tree(),
show_stats(),
do_log()
@ -259,6 +263,10 @@ display_usage(const string& prog_name, ostream& out)
<< " --no-redundant do not display redundant changes "
"(this is the default)\n"
<< " --impacted-interfaces display interfaces impacted by leaf changes\n"
<< " --no-leverage-dwarf-factorization do not use DWZ optimisations to "
"speed-up the analysis of the binary\n"
<< " --no-assume-odr-for-cplusplus do not assume the ODR to speed-up the "
"analysis of the binary\n"
<< " --dump-diff-tree emit a debug dump of the internal diff tree to "
"the error output stream\n"
<< " --stats show statistics about various internal stuff\n"
@ -615,6 +623,10 @@ parse_command_line(int argc, char* argv[], options& opts)
opts.show_redundant_changes = false;
else if (!strcmp(argv[i], "--impacted-interfaces"))
opts.show_impacted_interfaces = true;
else if (!strcmp(argv[i], "--no-leverage-dwarf-factorization"))
opts.leverage_dwarf_factorization = false;
else if (!strcmp(argv[i], "--no-assume-odr-for-cplusplus"))
opts.leverage_dwarf_factorization = false;
else if (!strcmp(argv[i], "--dump-diff-tree"))
opts.dump_diff_tree = true;
else if (!strcmp(argv[i], "--stats"))
@ -800,6 +812,23 @@ set_diff_context_from_opts(diff_context_sptr ctxt,
ctxt->dump_diff_tree(opts.dump_diff_tree);
}
/// Set a bunch of tunable buttons on the ELF-based reader from the
/// command-line options.
///
/// @param rdr the reader to tune.
///
/// @param opts the command line options.
static void
set_generic_options(abigail::elf_based_reader& rdr, options& opts)
{
rdr.options().show_stats = opts.show_stats;
rdr.options().do_log = opts.do_log;
rdr.options().leverage_dwarf_factorization =
opts.leverage_dwarf_factorization;
rdr.options().assume_odr_for_cplusplus =
opts.assume_odr_for_cplusplus;
}
/// Set suppression specifications to the @p read_context used to load
/// the ABI corpus from the ELF/DWARF file.
///
@ -1208,9 +1237,7 @@ main(int argc, char* argv[])
env, requested_fe_kind,
opts.show_all_types);
ABG_ASSERT(rdr);
rdr->options().show_stats = opts.show_stats;
rdr->options().do_log = opts.do_log;
set_generic_options(*rdr, opts);
set_suppressions(*rdr, opts);
c1 = rdr->read_corpus(c1_status);
@ -1283,8 +1310,7 @@ main(int argc, char* argv[])
opts.show_all_types);
ABG_ASSERT(rdr);
rdr->options().show_stats = opts.show_stats;
rdr->options().do_log = opts.do_log;
set_generic_options(*rdr, opts);
set_suppressions(*rdr, opts);
c2 = rdr->read_corpus(c2_status);

View File

@ -117,6 +117,8 @@ struct options
bool do_log;
bool drop_private_types;
bool drop_undefined_syms;
bool assume_odr_for_cplusplus;
bool leverage_dwarf_factorization;
optional<bool> exported_interfaces_only;
type_id_style_kind type_id_style;
#ifdef WITH_DEBUG_SELF_COMPARISON
@ -156,6 +158,8 @@ struct options
do_log(),
drop_private_types(false),
drop_undefined_syms(false),
assume_odr_for_cplusplus(true),
leverage_dwarf_factorization(true),
type_id_style(SEQUENCE_TYPE_ID_STYLE)
{}
@ -226,6 +230,10 @@ display_usage(const string& prog_name, ostream& out)
#ifdef WITH_CTF
<< " --ctf use CTF instead of DWARF in ELF files\n"
#endif
<< " --no-leverage-dwarf-factorization do not use DWZ optimisations to "
"speed-up the analysis of the binary\n"
<< " --no-assume-odr-for-cplusplus do not assume the ODR to speed-up the "
"analysis of the binary\n"
<< " --annotate annotate the ABI artifacts emitted in the output\n"
<< " --stats show statistics about various internal stuff\n"
<< " --verbose show verbose messages about internal stuff\n";
@ -398,6 +406,10 @@ parse_command_line(int argc, char* argv[], options& opts)
|| !strcmp(argv[i], "debug-die-canonicalization"))
opts.debug_die_canonicalization = true;
#endif
else if (!strcmp (argv[i], "--no-assume-odr-for-cplusplus"))
opts.assume_odr_for_cplusplus = false;
else if (!strcmp (argv[i], "--no-leverage-dwarf-factorization"))
opts.leverage_dwarf_factorization = false;
else if (!strcmp(argv[i], "--annotate"))
opts.annotate = true;
else if (!strcmp(argv[i], "--stats"))
@ -520,6 +532,24 @@ set_suppressions(abigail::elf_based_reader& rdr, options& opts)
rdr.add_suppressions(opts.kabi_whitelist_supprs);
}
/// Set a bunch of tunable buttons on the ELF-based reader from the
/// command-line options.
///
/// @param rdr the reader to tune.
///
/// @param opts the command line options.
static void
set_generic_options(abigail::elf_based_reader& rdr, options& opts)
{
rdr.options().drop_undefined_syms = opts.drop_undefined_syms;
rdr.options().show_stats = opts.show_stats;
rdr.options().do_log = opts.do_log;
rdr.options().leverage_dwarf_factorization =
opts.leverage_dwarf_factorization;
rdr.options().assume_odr_for_cplusplus =
opts.assume_odr_for_cplusplus;
}
/// Load an ABI @ref corpus (the internal representation of the ABI of
/// a binary) and write it out as an abixml.
///
@ -569,10 +599,9 @@ load_corpus_and_write_abixml(char* argv[],
opts.linux_kernel_mode);
ABG_ASSERT(reader);
// ... then tune a bunch of "buttons" on the newly created reader ...
reader->options().drop_undefined_syms = opts.drop_undefined_syms;
reader->options().show_stats = opts.show_stats;
reader->options().do_log = opts.do_log;
// ... then tune a bunch of "buttons" on the newly created reader
// ...
set_generic_options(*reader, opts);
set_suppressions(*reader, opts);
// If the user asked us to check if we found the "alternate debug

View File

@ -205,6 +205,8 @@ public:
bool show_added_binaries;
bool fail_if_no_debug_info;
bool show_identical_binaries;
bool leverage_dwarf_factorization;
bool assume_odr_for_cplusplus;
bool self_check;
optional<bool> exported_interfaces_only;
#ifdef WITH_CTF
@ -248,6 +250,8 @@ public:
show_added_binaries(true),
fail_if_no_debug_info(),
show_identical_binaries(),
leverage_dwarf_factorization(true),
assume_odr_for_cplusplus(true),
self_check()
#ifdef WITH_CTF
,
@ -892,6 +896,10 @@ display_usage(const string& prog_name, ostream& out)
<< " --no-parallel do not execute in parallel\n"
<< " --fail-no-dbg fail if no debug info was found\n"
<< " --show-identical-binaries show the names of identical binaries\n"
<< " --no-leverage-dwarf-factorization do not use DWZ optimisations to "
"speed-up the analysis of the binary\n"
<< " --no-assume-odr-for-cplusplus do not assume the ODR to speed-up the"
"analysis of the binary\n"
<< " --verbose emit verbose progress messages\n"
<< " --self-check perform a sanity check by comparing "
"binaries inside the input package against their ABIXML representation\n"
@ -1224,6 +1232,24 @@ set_diff_context_from_opts(diff_context_sptr ctxt,
ctxt->add_suppressions(supprs);
}
/// Set a bunch of tunable buttons on the ELF-based reader from the
/// command-line options.
///
/// @param rdr the reader to tune.
///
/// @param opts the command line options.
static void
set_generic_options(abigail::elf_based_reader& rdr, const options& opts)
{
if (!opts.kabi_suppressions.empty())
rdr.add_suppressions(opts.kabi_suppressions);
rdr.options().leverage_dwarf_factorization =
opts.leverage_dwarf_factorization;
rdr.options().assume_odr_for_cplusplus =
opts.assume_odr_for_cplusplus;
}
/// Compare the ABI two elf files, using their associated debug info.
///
/// The result of the comparison is emitted to standard output.
@ -1336,8 +1362,7 @@ compare(const elf_file& elf1,
ABG_ASSERT(reader);
reader->add_suppressions(priv_types_supprs1);
if (!opts.kabi_suppressions.empty())
reader->add_suppressions(opts.kabi_suppressions);
set_generic_options(*reader, opts);
corpus1 = reader->read_corpus(c1_status);
@ -1436,8 +1461,7 @@ compare(const elf_file& elf1,
ABG_ASSERT(reader);
reader->add_suppressions(priv_types_supprs2);
if (!opts.kabi_suppressions.empty())
reader->add_suppressions(opts.kabi_suppressions);
set_generic_options(*reader, opts);
corpus2 = reader->read_corpus(c2_status);
@ -3350,6 +3374,10 @@ parse_command_line(int argc, char* argv[], options& opts)
opts.show_added_binaries = false;
else if (!strcmp(argv[i], "--fail-no-dbg"))
opts.fail_if_no_debug_info = true;
else if (!strcmp(argv[i], "--no-leverage-dwarf-factorization"))
opts.leverage_dwarf_factorization = false;
else if (!strcmp(argv[i], "--no-assume-odr-for-cplusplus"))
opts.assume_odr_for_cplusplus = false;
else if (!strcmp(argv[i], "--verbose"))
opts.verbose = true;
else if (!strcmp(argv[i], "--no-abignore"))