The Git repository of the Libabigail Project
Dodji Seketeli 9a0abd846b Use the ODR to speed up type canonicalization
This is the last patch of the series of 11 patches that started at the
patch with the subject:

    constify is_class_type()

The cover letter of this patch follows.

While analyzing some libraries like libmozjs.so[1], it appeared that
type canonicalization spends a significant amount of time comparing
composite types that are re-defined again and again in each
translation unit.

The One Definition Rule[2] says that two types with the same name
shall designate the same thing; so when a type T being canonicalized
has the same name as a canonical type C in the same ABI corpus, this
patch considers C to be the canonical type of T, without comparing T
and C structurally.

Before this patch, `abidw --noout libmozjs.so` was taking
approximately 5 minutes; with the patch, it takes 1 minute and 30
seconds.

To do this, the patch changes ABI artifacts to carry a pointer to the
corpus they belong to.  Whenever an ABI artifact is added to a given
context, the corpus of that context is propagated to the artifact;
this is possible because each artifact now has a property that
records the corpus it belongs to.
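
The propagation described above can be pictured with the minimal
sketch below.  It is not the libabigail code: the `corpus`,
`artifact` and `scope` types and the `add_member` and `same_corpus`
functions are hypothetical, simplified stand-ins, used only to
illustrate the idea of a context handing its corpus down to the
artifacts added to it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT the libabigail classes.
struct corpus
{
  std::string path;
};

struct artifact
{
  std::string name;
  // The corpus this artifact belongs to; null until it is added to a scope.
  const corpus* owner = nullptr;

  explicit artifact(const std::string& n) : name(n) {}
};

struct scope
{
  const corpus* owner = nullptr;
  std::vector<std::shared_ptr<artifact>> members;

  // Adding a member propagates the corpus of the scope to that member.
  void
  add_member(const std::shared_ptr<artifact>& a)
  {
    a->owner = owner;
    members.push_back(a);
  }
};

// Two artifacts come from the same corpus only if both have a corpus
// and that corpus is the same object.
bool
same_corpus(const artifact& a, const artifact& b)
{
  return a.owner != nullptr && a.owner == b.owner;
}
```

In this sketch, an artifact that was never added to a scope has no
corpus, so `same_corpus` returns false for it.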

During type canonicalization, the ODR-based optimization outlined
above is performed, as we can now compare the corpus of a given type
against that of another type; it's now possible to know if two types
come from the same corpus.

There are a few cases, though, where the optimization is not performed:
  - anonymous struct; when a struct is anonymous (it has no name, as
    described in the DWARF), the DWARF reader gives it a name
    nonetheless, so that diagnostics can refer to that anonymous type.
    But then all anonymous types in the system have the same name.  So
    when faced with two anonymous types (with the same name) from the
    same corpus, it's wrong to consider that they name the same thing.
    The patch added an "is_anonymous" property to types created by the
    DWARF reader so that such anonymous types can be detected by the
    type canonicalizer; they are thus not involved in this
    optimization.  Note that the abixml writer and reader have been
    updated to emit and read this property.
  - typedefs.  I have seen in some boost code two typedefs of the same
    name refer to different underlying types.  I believe this is a
    violation of the ODR.  I'll need to investigate this later.  And I
    think we really need to detect these ODR violations as part of
    this enhancement request:
    https://sourceware.org/bugzilla/show_bug.cgi?id=18941.
  - pointers, references, arrays and function types, as they can refer
    to the two exceptions above.
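
The optimization and its exceptions can be summarized by the sketch
below.  It is only an illustration under stated assumptions, not the
code of type_base::get_canonical_type_for: every name in it (`type`,
`kind`, `odr_applies`, `structurally_equal`, `canonicalizer`) is
hypothetical, the structural comparison is reduced to comparing a
list of member strings, and the choice to allow the fast path only
for a single `class_or_enum` kind is an assumption of the sketch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified model; names are illustrative, not libabigail's.
struct corpus {};

enum class kind { class_or_enum, typedef_, pointer, reference, array, function };

struct type
{
  std::string name;
  kind k = kind::class_or_enum;
  bool is_anonymous = false;
  const corpus* owner = nullptr;
  std::vector<std::string> members;  // stand-in for the type's structure
  const type* canonical = nullptr;
};

// Stand-in for the expensive structural comparison.
bool
structurally_equal(const type& a, const type& b)
{
  return a.k == b.k && a.members == b.members;
}

// The fast path is skipped for anonymous types, typedefs, pointers,
// references, arrays and function types, and for types with no corpus.
bool
odr_applies(const type& t)
{
  return t.owner != nullptr && !t.is_anonymous && t.k == kind::class_or_enum;
}

struct canonicalizer
{
  std::map<std::string, std::vector<const type*>> by_name;
  unsigned structural_comparisons = 0;

  // The caller must keep every canonicalized type alive.
  const type*
  canonicalize(type& t)
  {
    std::vector<const type*>& candidates = by_name[t.name];
    for (const type* c : candidates)
      {
        // ODR fast path: same name and same corpus, so assume same type.
        if (odr_applies(t) && odr_applies(*c) && t.owner == c->owner)
          return t.canonical = c;
        // Otherwise fall back to comparing the two types structurally.
        ++structural_comparisons;
        if (structurally_equal(t, *c))
          return t.canonical = c;
      }
    // No match: t becomes a new canonical type for its name.
    candidates.push_back(&t);
    return t.canonical = &t;
  }
};
```

In this model, two named types with the same name from the same
corpus are unified without any structural comparison, whereas two
anonymous types that happen to share a reader-assigned name are still
compared member by member.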

This is the last patch of the series which aimed at speeding up type
canonicalization in the context of types being re-defined a lot in
translation units.

[1]: Instructions to build libmozjs.so from the mongodb sources:
	- git clone https://github.com/mongodb/mongo.git
	- cd mongo
	- scons --link-model=dynamic build/opt/third_party/mozjs-38/libmozjs.so

[2]: One Definition Rule: https://en.wikipedia.org/wiki/One_Definition_Rule

	* include/abg-fwd.h (class corpus): Forward-declare this.
	(is_anonymous_type): Declare this new function.
	* include/abg-ir.h (corpus_sptr, corpus_wptr): Declare these
	typedefs here too.
	(translation_unit::{g,s}et_corpus): Declare new member functions.
	(type_or_decl_base::{g,s}et_corpus): Likewise.
	* src/abg-ir.cc (translation_unit::priv::corpus): New data member.
	(translation_unit::priv::priv): Initialize it.
	(translation_unit::{g,s}et_corpus): Define new accessors.
	(translation_unit::get_global_scope): Propagate the corpus of the
	translation unit to its newly created global scope.
	(translation_unit::bind_function_type_life_time): Propagate the
	corpus of the translation_unit to the added function type.
	(type_or_decl_base::priv::corpus_): Add new data member.
	(type_or_decl_base::priv::priv): Initialize it.
	(type_or_decl_base::{g,s}et_corpus): Define new accessors.
	(scope_decl::{add,insert}_member_decl): Propagate the context's
	corpus to the member added to the context.
	(decl_base::priv::is_anonymous_): Add new data member.
	(decl_base::priv::priv): Initialize it.
	(decl_base::{s,g}et_is_anonymous): Define accessors.
	(is_anonymous_type): Define a new test function.
	(decl_base::set_name): Update the "is_anonymous" property.
	(type_base::get_canonical_type_for): Implement the ODR-based
	optimization to type canonicalization.
	* src/abg-corpus.cc (corpus::add): When a translation unit is
	added to a corpus, set the corpus of the translation unit.
	* src/abg-dwarf-reader.cc (build_enum_type)
	(build_class_type_and_add_to_ir): Set the "is_anonymous" flag on
	anonymous enums and classes.
	* src/abg-reader.cc (read_is_anonymous): Define new static
	function.
	(build_type_decl, build_enum_type, build_class_decl): Call the new
	read_is_anonymous function and set the "is_anonymous" property on
	the built type declaration.
	* src/abg-writer.cc (write_is_anonymous): Define new static
	function.
	(write_type_decl, write_enum_type_decl, write_class_decl): Write
	the "is_anonymous" property.
	* tests/data/test-diff-filter/test31-pr18535-libstdc++-report-0.txt:
	Adjust.
	* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Likewise.
	* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Likewise.
	* tests/data/test-read-dwarf/test11-pr18828.so.abi: Likewise.
	* tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise.
	* tests/data/test-read-dwarf/test13-pr18894.so.abi: Likewise.
	* tests/data/test-read-dwarf/test14-pr18893.so.abi: Likewise.
	* tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise.
	* tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise.
	* tests/data/test-read-dwarf/test17-pr19027.so.abi: Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-10-04 13:52:25 +02:00
doc Add a new --abidiff option to abidw 2015-09-21 15:14:26 +02:00
include Use the ODR to speed up type canonicalization 2015-10-04 13:52:25 +02:00
m4 Delete ltsugar.m4 and pkg.m4 files from m4/ 2015-01-06 09:54:45 +01:00
scripts Initial DOT work. 2013-07-23 23:13:55 +02:00
src Use the ODR to speed up type canonicalization 2015-10-04 13:52:25 +02:00
tests Use the ODR to speed up type canonicalization 2015-10-04 13:52:25 +02:00
tools Remove some dead code in abilint 2015-09-21 15:20:10 +02:00
.gitignore Update .gitignore 2014-11-01 12:10:06 +01:00
abigail.m4 For usage from within GCC set header path to $includedir/libabigail 2013-08-14 16:10:15 +02:00
AUTHORS Initial AUTHORS and README 2013-02-28 13:25:20 +01:00
ChangeLog Update ChangeLog file 2015-06-25 08:13:21 +02:00
COMMIT-LOG-GUIDELINES Allow introductory text in commit log and ignore it when generating ChangeLog 2014-11-18 23:18:06 +01:00
COMPILING Encourage people to use autoreconf -i 2015-10-01 10:40:51 +02:00
config.h.in Make abipkgdiff compare tar archives containing binaries 2015-08-22 14:32:20 +02:00
configure.ac Misc style cleanups 2015-08-22 14:32:20 +02:00
CONTRIBUTING Update the CONTRIBUTING file 2015-03-19 12:47:59 +01:00
COPYING Use a better wording for the COPYING file 2015-04-22 09:53:18 +02:00
COPYING-GPLV3 Update licence texts 2015-04-20 13:51:21 +02:00
COPYING-LGPLV2 Initial import of gen-changelog.py 2014-11-18 23:18:06 +01:00
COPYING-LGPLV3 LGPLv3 License the library 2013-07-23 23:13:55 +02:00
gen-changelog.py [gen-changelog] Make subject line always come first 2014-11-18 23:18:06 +01:00
install-sh Add missing autoconfiscation files into version control 2013-03-01 00:47:49 +01:00
libabigail.pc.in Make libxml2 a private dependency wrt pkconfig 2013-08-22 17:41:29 +02:00
ltmain.sh Add missing autoconfiscation files into version control 2013-03-01 00:47:49 +01:00
Makefile.am Update licence texts 2015-04-20 13:51:21 +02:00
README Fix wording in README 2015-09-05 10:30:00 +02:00

This is the Application Binary Interface Generic Analysis and
Instrumentation Library.

It aims at constructing, manipulating, serializing and de-serializing
ABI-relevant artifacts.

The set of artifacts that we are interested in is made of quantities
like the types, variables, functions and declarations of a given
library or program.  For a given library or program this set of
quantities is called an ABI corpus.

This library aims at (among other things) providing a way to compare
two ABI Corpora (apparently the plural of corpus is corpora, heh,
that's cool), providing detailed information about their differences,
and helping build tools to infer interesting conclusions about these
differences.

You are welcome to contribute to this project after reading the
CONTRIBUTING and COMMIT-LOG-GUIDELINES files in the source tree.

Communicating with the maintainers of this project -- including
sending patches to be included in the source code -- happens via email
at libabigail@sourceware.org.