libabigail/tests
Dodji Seketeli 912eb7e36b Speed up type canonicalization by avoiding recursive hashing
Recursive type hashing was showing up as the major hot spot of
performance profiles.  After spending a few days trying to speed it
up, I have officially declared recursive tree node hashing a slow
process and I am giving up.

I have thus decided to not use that at type canonicalization time.

Rather, I am proposing a new type canonicalization routine where types
are first hashed by hashing their pretty representation string.

Basically, if N is the total number of types in the system and C the
number of equivalence classes (or the number of canonical types),
the number of type comparisons done by a naive type canonicalization
routine is N*C.  In the worst case C is equal to N itself, so the
worst-case number of comparisons is N*N.

By using a hash table to store the canonical types, keyed by a hash of
their pretty representation string, the number of type comparisons can
be brought down to N*P, where P is the largest number of canonical
types whose pretty representation string hashes collide.  That number
P is usually small; my measurements show that P usually goes from 1 to
3.  Moreover, computing the hash of the pretty representation string
of a type is way faster than using the recursive type hash!
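
As a rough illustration of the bucketed lookup described above, here is
a minimal sketch.  It is not the libabigail code: toy_type,
deep_equals, canonical_types_map and canonicalize are hypothetical
names, and layout_id merely stands in for everything a real structural
comparison would inspect.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an IR type; not libabigail's type_base.
struct toy_type
{
  std::string pretty_repr;   // e.g. "struct S"
  int layout_id;             // stand-in for what a deep comparison looks at
  const toy_type* canonical; // set by canonicalize()

  toy_type(const std::string& r, int l)
    : pretty_repr(r), layout_id(l), canonical(nullptr)
  {}
};

// Counts how many (expensive) structural comparisons were performed.
static std::size_t nb_comparisons = 0;

// Stand-in for the expensive structural comparison of two types.
bool
deep_equals(const toy_type& l, const toy_type& r)
{
  ++nb_comparisons;
  return l.pretty_repr == r.pretty_repr && l.layout_id == r.layout_id;
}

// Pretty representation string -> canonical types sharing that string.
typedef std::unordered_map<std::string, std::vector<const toy_type*> >
canonical_types_map;

// Compare T only against the canonical types of its own bucket.  If
// none matches, T becomes a new canonical type in that bucket.
const toy_type*
canonicalize(toy_type& t, canonical_types_map& m)
{
  std::vector<const toy_type*>& candidates = m[t.pretty_repr];
  for (const toy_type* c : candidates)
    if (deep_equals(*c, t))
      return t.canonical = c;
  candidates.push_back(&t);
  return t.canonical = &t;
}
```

In this sketch a type is only ever compared against the canonical
types that share its pretty representation string, so the number of
deep comparisons per type is bounded by the size of its bucket rather
than by the total number of canonical types.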

As a result, running abidw on the libcilkrts.so library from GCC goes
from 12 minutes to 0.4 seconds!

Incidentally, now that we are not trying to speed up the recursive
type hashing process, all the complicated business we had around
caching the result of the hashing is gone!  I was thinking that hash
caching was inherently a bad idea, especially for recursive types
(those that refer to themselves directly or indirectly), because in
those cases the hash value can differ depending on when it was
cached.

The abixml writer's code doesn't use the recursive type hash anymore
either; it uses the pointer value of the canonical type as hash.
Super fast too!
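
To make the idea of hashing on the canonical type pointer concrete,
here is a small sketch.  It is only an illustration under assumptions:
toy_type, toy_type_hasher, toy_type_equal and toy_type_ptr_map are
hypothetical names, and hashing the name is merely a stand-in for a
slower fallback used when no canonical type is available.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an IR type; not libabigail's type_base.
struct toy_type
{
  std::string name;
  const toy_type* canonical; // null until the type is canonicalized

  toy_type(const std::string& n, const toy_type* c)
    : name(n), canonical(c)
  {}
};

// Hash a type by the pointer value of its canonical type.
struct toy_type_hasher
{
  std::size_t
  operator()(const toy_type* t) const
  {
    if (t->canonical)
      // Fast path: one pointer hash, no traversal of the type.
      return std::hash<const toy_type*>()(t->canonical);
    // Assumed slow path, standing in for a recursive type hash.
    return std::hash<std::string>()(t->name);
  }
};

// Two canonicalized types are equal iff they share a canonical type.
struct toy_type_equal
{
  bool
  operator()(const toy_type* l, const toy_type* r) const
  {
    if (l->canonical && r->canonical)
      return l->canonical == r->canonical;
    if (!l->canonical && !r->canonical)
      return l->name == r->name; // stand-in for a deep comparison
    return false;
  }
};

// Maps a type to, say, the ID string emitted for it.
typedef std::unordered_map<const toy_type*, std::string,
			   toy_type_hasher, toy_type_equal>
toy_type_ptr_map;
```

With such a map, looking up any type that shares a canonical type with
an already recorded one finds the same entry, at the cost of a single
pointer hash.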

The patch had to fix pieces here and there to comply with the fact
that canonical types are now used across the board in a mandatory
fashion.

	* include/abg-ir.h (canonical_types_map_type): Adjust this typedef
	to make it point to an unordered_map whose key is now a string
	and whose value is a vector of types.
	(type_or_decl_base::{get_cached_hash_value, set_cached_hash_value,
	cached_hash}): Remove these member functions and type.
	(struct type_base::cached_hash): Remove.
	* src/abg-ir.cc (struct type_or_decl_base::priv::hash_): Remove.
	(type_or_decl_base::priv::priv): Adjust.
	(type_or_decl_base::{g,s}et_cached_hash_value): Remove.
	(type_base::get_canonical_type_for): For declaration-only classes,
	look at their definition for the canonical_type.  Do not use
	recursive type hashing anymore.  Rather, use the pretty
	representation string, and hash that.
	(class_decl::base_spec::get_hash): Do away with hash value caching
	here.
	(class_decl::operator==): For decl-only classes, look at their
	definitions for canonical types.
	(hash_type_or_decl): Adjust comment.  Use the canonical type
	pointer value for type hash.  That's the fast path.  Otherwise, if
	not available, fall back to a slow path which is the recursive
	type hash we were using before.
	* src/abg-dwarf-reader.cc (maybe_canonicalize_type): Schedule all
	classes and typedefs to classes for late canonicalization.
	* src/abg-hash.cc (type_base::dynamic_hash::operator()): There is
	no hash value caching anymore.
	(type_base::cached_hash::operator()): Remove.
	* src/abg-reader.cc (read_context::get_type): Slight style
	adjustment.
	(read_translation_unit_from_file)
	(read_translation_unit_from_buffer): Do not forget to canonicalize
	types when reading just one translation unit.
	(build_type_tparameter, build_template_tparameter): Canonicalize
	the type.
	* src/abg-writer.cc (struct type_hasher): New hasher type.
	(type_ptr_map): Use a deep pointer comparison equal operator
	functor, and canonical types as type hash values.
	(write_class_decl): Do not write size and alignment on decl-only
	classes.  Do not record decl-only classes as being emitted.  Their
	definition must be emitted before.
	* tests/test-read-write.cc (main): Do not do abi testing on
	translation units (as opposed to doing it on abi corpora) as that
	code is not ready yet.  We need to know how to diff namespaces.
	* tests/data/test-abidiff/test-PR18791-report0.txt: Adjust.
	* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Likewise.
	* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Likewise.
	* tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise.
	* tests/data/test-read-dwarf/test13-pr18894.so.abi: Likewise.
	* tests/data/test-read-dwarf/test14-pr18893.so.abi: Likewise.
	* tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise.
	* tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-09-21 13:51:31 +02:00
..
data Speed up type canonicalization by avoiding recursive hashing 2015-09-21 13:51:31 +02:00
Makefile.am Re-arrange some regression tests order 2015-09-02 15:33:27 +02:00
print-diff-tree.cc Introduce the concept of environment 2015-09-07 23:35:29 +02:00
runtestcanonicalizetypes.sh.in Fix the new regression test for type canonicalizing 2015-02-19 11:44:19 +01:00
test-abicompat.cc Fix type synthesis to fix abicompat weak mode 2015-07-20 17:11:32 +02:00
test-abidiff.cc Introduce the concept of environment 2015-09-07 23:35:29 +02:00
test-alt-dwarf-file.cc Expose a new libabigail::tools_utils namespace 2015-01-08 12:28:14 +01:00
test-core-diff.cc Expose a new libabigail::tools_utils namespace 2015-01-08 12:28:14 +01:00
test-diff2.cc Update copyright years 2015-01-07 17:52:10 +01:00
test-diff-dwarf.cc Introduce the concept of environment 2015-09-07 23:35:29 +02:00
test-diff-filter.cc Bug 18904 - Fix support for C++ rvalue references 2015-09-02 14:42:16 +02:00
test-diff-pkg.cc Misc style cleanups 2015-08-22 14:32:20 +02:00
test-diff-suppr.cc Support source_location_not_in and source_location_not_regexp suppressions 2015-09-16 20:54:40 +02:00
test-dot.cc Correct DOT merge. 2013-07-23 23:13:55 +02:00
test-ir-walker.cc Introduce the concept of environment 2015-09-07 23:35:29 +02:00
test-lookup-syms.cc Expose a new libabigail::tools_utils namespace 2015-01-08 12:28:14 +01:00
test-read-dwarf.cc Introduce the concept of environment 2015-09-07 23:35:29 +02:00
test-read-write.cc Speed up type canonicalization by avoiding recursive hashing 2015-09-21 13:51:31 +02:00
test-svg.cc Add svg generation. 2013-07-23 23:13:54 +02:00
test-utils.cc Update copyright years 2015-01-07 17:52:10 +01:00
test-utils.h Update copyright years 2015-01-07 17:52:10 +01:00
test-write-read-archive.cc Fix archive writing support 2015-04-24 19:59:19 +02:00