Commit Graph

790 Commits

Author SHA1 Message Date
Dodji Seketeli
876dab386e When reading DWARF set member type access where the type is built
The DWARF reader assumes that the DIEs for all member types are seen
by build_class_type_and_add_to_ir(), as member type DIEs of the DIE of
the class.  Well that assumption is not correct because there can be
errors in the DWARF we are looking at.  One of these errors I stumbled
across is that a DIE for a typedef that should be a member typedef is
actually a child of a *function* DIE.  And that function DIE is a
child of the class.  Go figure.  In any case, get_scope_for_die()
already fixes that up and behaves as if the DIE of the typedef is a
child of the DIE of the class.  A side effect of this is that when
build_class_type_and_add_to_ir() reads the DIE of the class, it never
sees the DIE for that typedef.

The takeaway of this state of affairs is that we cannot rely on
build_class_type_and_add_to_ir() to update the member access specifier
for member types because it does not see all member types.  Rather
build_ir_node_from_die() detects (reliably) that the type is a member
type and updates the access specifier there.

I also realize that the "is_member_type" flag of
build_ir_node_from_die() and friends is useless now because inside
build_ir_node_from_die(), to know that the type we are building is
a member type, we just need to look at the scope and see if it's a
class type.

So by doing all this, this patch fixes the fact that some types were
not being canonicalized because build_class_type_and_add_to_ir() was
not seeing them.  Ahhhh, DWARF.
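
To make the idea concrete, here is a minimal, self-contained sketch (not
the actual libabigail code).  The types below are simplified stand-ins,
and the signature given to maybe_set_member_type_access_specifier() is an
assumption made for illustration.  It shows is_class() returning a
pointer rather than a bool, and member-ness being deduced from the scope
alone:

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the IR types.
enum class access_specifier
{
  no_access,
  public_access,
  protected_access,
  private_access
};

struct decl_base
{
  virtual ~decl_base() = default;
  decl_base* scope = nullptr;  // enclosing scope, if any
  access_specifier access = access_specifier::no_access;
};

struct class_decl : decl_base {};
struct typedef_decl : decl_base {};

// Returns the class_decl if D is a class, and a null pointer otherwise,
// so callers get both the answer and the down-cast pointer.
class_decl*
is_class(decl_base* d)
{return dynamic_cast<class_decl*>(d);}

// Sets the access specifier of TYPE only if its scope is a class.
// Returns true iff TYPE was recognized as a member type.
bool
maybe_set_member_type_access_specifier(decl_base* type,
                                       access_specifier from_die)
{
  assert(type);
  if (!is_class(type->scope))
    return false;
  type->access = from_die;
  return true;
}
```

With this shape no extra "is_member_type" flag has to be threaded through
the callers; the scope of the type carries that information.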

	* include/abg-fwd.h (is_class(decl_base*)): Return a class_decl*
	rather than just a bool.
	* abg-ir.cc (is_class(decl_base*)): Return a class_decl* rather
	than just a bool.  Simplify the implementation.
	* src/abg-dwarf-reader.cc
	(maybe_set_member_type_access_specifier): Define new static
	function.
	(build_ir_node_from_die): Remove the is_member_type flag.  When
	building member types set their access specifier.  Simplify the
	logic of detecting that a type is a member type; basically
	delegate that to the new maybe_set_member_type_access_specifier().
	(build_class_type_and_add_to_ir): Do not try to set the member
	type access specifiers anymore.
	(build_qualified_type, build_pointer_type, build_reference_type)
	(build_typedef_type, build_var_decl, build_function_decl): Adjust.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-24 13:15:10 +01:00
Dodji Seketeli
14607415fb Fix enum_diff::has_changes()
Now that we have type canonicalizing, there is no need for trying to
be smart when comparing types; just do the comparison and it should be
fast.  Plus in the case of enum_diff, we were just getting it wrong as we were
not checking several parts of the enum type, like the member access
specifiers if it was a member type, etc ...

	* src/abg-comparison.cc (enum_diff::has_changes): Just use the
	normal comparison operator to compare the two enums here.  It's
	fast now.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-24 13:15:10 +01:00
Dodji Seketeli
dc2f054d03 Build the set of exported decls directly during DWARF loading
Until now, after the ABI corpus was built from DWARF, the translation
units of the corpus were walked and each function was considered for
addition into the set of exported decls.  During that walking, a first
version of the set was put into a std::list and then, a set of filters
(user-provided tunables like a list of regular expressions to keep or
remove some functions from the exported decls) is applied to that list
and the final set of exported decls is put in a std::vector.

Profiling has shown that this process of building the set of exported
decls is a hot spot and also that the current use of std::list was a
big memory consumer especially on binaries with large exported symbol
tables.

So this patch builds the set of exported decls "on the fly", during
DWARF reading, as opposed to waiting after the DWARF is read and
having to walk the corpus again.  The corpus defines a policy object
that encapsulates the methods for determining if a function or
variable ought to be part of the set of exported decls.  The DWARF
reader uses that policy object to determine which functions and
variables among those built during the reading ought to be part of the
exported decls; the policy object also has a reference to the final
vector (managed by the corpus) that must hold the exported decls, so
the decls are put in that vector directly without unnecessary copying.
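
A minimal sketch of such a policy object follows.  It is an illustration
under assumptions, not the real corpus::exported_decls_builder: the
function_decl stand-in, its has_public_symbol field and the regex-based
"keep" filter are hypothetical.  The point it shows is that the builder
holds a reference to the final vector and appends to it directly:

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, pared-down stand-in for a function declaration.
struct function_decl
{
  std::string id;
  bool has_public_symbol;
};

// Policy object: decides whether a function belongs to the set of
// exported decls and, if so, appends it to a vector it does not own.
class exported_decls_builder
{
  std::vector<const function_decl*>& fns_;  // final vector, owned elsewhere
  std::vector<std::regex> keep_;            // empty means "keep everything"

public:
  exported_decls_builder(std::vector<const function_decl*>& fns,
                         std::vector<std::regex> keep)
    : fns_(fns), keep_(std::move(keep))
  {}

  // Returns true iff FN was added to the exported functions.
  bool
  maybe_add_fn_to_exported_fns(const function_decl* fn)
  {
    assert(fn);
    if (!fn->has_public_symbol)
      return false;
    if (!keep_.empty())
      {
        bool matched = false;
        for (const std::regex& re : keep_)
          if (std::regex_match(fn->id, re))
            {
              matched = true;
              break;
            }
        if (!matched)
          return false;
      }
    fns_.push_back(fn);  // no intermediate list, no copy of the decl
    return true;
  }
};
```

The reader would call maybe_add_fn_to_exported_fns() once per function it
builds, so no second walk over the corpus is needed.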

Profiling also showed that the string copying done by
{var_decl,function_decl}::get_id() was a hot spot.  So the patch
returns a reference there.

With this patch applied, the peak memory consumption of abidiff on
libabigail.so itself (abidiff libabigail.so libabigail.so) is 54MB of
resident and takes 2 minutes and 16s (on my slow system).  Without the
patch the peak consumption was more than 300MB and it was taking
slightly longer.

For the test of bug
https://sourceware.org/bugzilla/show_bug.cgi?id=17948, memory
consumption and wall clock time are down from 3.4GB and 1m59s to
760MB and 0m43s.

	* include/abg-ir.h ({var,function}_decl::get_id): Return a
	reference.
	* src/abg-ir.cc ({var,function}_decl::get_id): Return a reference
	to the string rather than copying it over.
	* include/abg-corpus.h (class corpus::exported_decls_builder):
	Declare new type.
	(corpus::{sort_functions, sort_variables,
	maybe_drop_some_exported_decls, get_exported_decls_builder}):
	Declare new methods.
	* src/abg-corpus.h (corpus::exported_decls_builder::priv): Define
	new type.
	(class symtab_build_visitor_type): Remove this type that is
	useless now.
	(corpus::exported_decls_builder::{exported_decls_builder,
	exported_functions, exported_variables,
	maybe_add_fn_to_exported_fns, maybe_add_var_to_exported_vars}):
	Define new functions.
	(corpus::priv::is_public_decl_table_built): Remove this data
	member.  It's now useless.
	(corpus::priv::priv): Adjust.
	(corpus::priv::build_public_decl_table): Remove this member
	function.  It's now useless.
	(corpus::{priv::build_unreferenced_symbols_tables, get_functions,
	get_variables}): No need to build the public decls table here.
	It's already built by the time the corpus is read from DWARF now.
	(corpus::{sort_functions, sort_variables,
	maybe_drop_some_exported_decls, get_exported_decls_builder}):
	Define new member functions.
	* src/abg-dwarf-reader.cc (read_context::exported_decls_builder):
	New data member.
	(read_context::read_context): Initialize it.
	(read_context::{exported_decls_builder,
	maybe_add_fn_to_exported_fns, maybe_add_var_to_exported_vars}):
	Define new member functions.
	(read_debug_info_into_corpus): Get the new
	'exported_decls_builder' object from the corpus and stick it into
	the read context so the DWARF reading code can use it to build the
	exported decls set.  When the DWARF reading is done, sort the set
	of exported functions and variables that was built.
	(build_ir_node_from_die): When a function or variable is built,
	consider putting it into the set of exported decls.
	* tools/abicompat.cc (main): Now that the exported decls is built
	*before* we had a chance to stick the list of symbol IDs to keep,
	call corpus::maybe_drop_some_exported_decls() to update the set of
	exported decls we should consider for the corpus.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-24 13:15:10 +01:00
Dodji Seketeli
69d6c26828 Fix canonicalizing of member types ... *AGAIN*
So abidiff libabigail.so libabigail.so is broken again.  Sigh.

It was broken by this wrong commit:

    commit 5b33223cb7
    Author: Dodji Seketeli <dodji@redhat.com>
    Date:   Fri Feb 20 13:48:48 2015 +0100

	Simplify canonicalizing handling for typedefs

	    * src/abg-dwarf-reader.cc (build_ir_node_from_die): For typedefs,
	    we don't need to test that the current scope is a class to know
	    that we are looking at a member type.  Just looking at the
	    is_member flag is enough.

So the issue arises when for instance, we are reading a class that
defines a member typedef (or enum) and uses that enum as the type of a
data member.  When reading that data member (before reading the
definition of the typedef), we read the type of the data member; so we
hit the typedef.  But build_ir_node_from_die() cannot fully construct
the scope of the typedef before handing off the typedef because we are
currently building it!  So it hands out a non-complete version of the
class that is being built;  'is_member' is not set to 'true' because
we are getting the type of the data member; it's not *necessarily* a
member type.  So we need to check !is_class_type(scope) to know if we
are given a member type.  I am now thinking that the "is_member" flag
is actually useless.  I think I'll remove it in a later patch.

Anyway, this fixes 'abidiff libabigail.so libabigail.so' again.  I
have some stashed patches that bring its time down to ~ 45 seconds.
So we are getting close to being able to include that *ultimate* test in
the regression test suite.  Oh well.

	* src/abg-dwarf-reader.cc (build_ir_node_from_die): When building
	typedefs, enum and member classes, check that the scope is a
	member class to detect if we are building a member type.  In which
	case the caller is going to handle the canonicalizing of the
	member type *after* its access specification has been adjusted.
	Otherwise, that adjustment happens after the type has been
	canonicalized and bad things happen at comparison time.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-22 22:43:23 +01:00
Dodji Seketeli
56d958641c Bug 17649 Avoid endless looping on diff graph with cycles
Bug URL: https://sourceware.org/bugzilla/show_bug.cgi?id=17649.

abidiff stumbled across a diff graph with cycles.  And it kept
walking that graph endlessly.  Of course.

It turned out on such graphs with cycles, the categorizing code that
uses abigail::comparison::diff::traverse() to walk the graph and
categorize the diff nodes was traversing the same class of equivalence
of certain diff nodes more than once without even noticing.

This patch changes the logic of the diff graph traversing code to make
it always call diff_node_visitor::visit_begin() on the visitor for a
diff node prior to visiting it (visiting means calling
diff_node_visitor::visit()) and diff_node_visitor::visit_end() after
visiting it.

But when the diff node has already been visited and it's reached again
by the traversing code (in case of a cycle) then the
diff_node_visitor::visit_begin() is called, but
diff_node_visitor::visit() is *NOT*.  Then
diff_node_visitor::visit_end() is called.  In other words, even when
the diff node is not visited (because it's already been visited) the
pair diff_node_visitor::{visit_begin,visit_end}() is called.

This avoids traversing the diff node (or rather the equivalence class
of the diff node) more than once even in presence of cycles, but still
gives a chance to custom visitors to detect that they are seeing a
cycle and act accordingly if need be.  This is a kind of cycle
detection feature.
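
The traversal protocol described above can be sketched as follows.  This
is a simplified illustration, not the code of diff::traverse(): the
diff_node stand-in, the explicit "visited" set and the counting_visitor
are assumptions, and the sketch tracks individual nodes by address where
the real code reasons about equivalence classes of diff nodes:

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a node of the diff graph.
struct diff_node
{
  std::vector<diff_node*> children;
};

struct diff_node_visitor
{
  virtual ~diff_node_visitor() = default;
  virtual void visit_begin(diff_node*) {}
  virtual void visit(diff_node*) {}
  virtual void visit_end(diff_node*) {}
};

// visit_begin()/visit_end() are always called as a pair.  visit() and
// the walk over the children happen only the first time a node is seen.
void
traverse(diff_node* n,
         diff_node_visitor& v,
         std::unordered_set<diff_node*>& visited)
{
  assert(n);
  v.visit_begin(n);
  if (visited.insert(n).second)
    {
      v.visit(n);
      for (diff_node* child : n->children)
        traverse(child, v, visited);
    }
  v.visit_end(n);
}

// Example visitor that just counts the calls it receives.
struct counting_visitor : diff_node_visitor
{
  int begins = 0;
  int visits = 0;
  int ends = 0;

  void visit_begin(diff_node*) override {++begins;}
  void visit(diff_node*) override {++visits;}
  void visit_end(diff_node*) override {++ends;}
};
```

A visitor that sees a visit_begin()/visit_end() pair with no visit() in
between knows it has reached an already-visited node, which is how it can
notice a cycle.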

Then the code of the (harmless and harmful categorization) filters has
been adapted to always rely on the cycle detection feature.  The code
of the category propagation visitor has also been adapted to propagate
the category of a given diff node to and from its canonical diff node.

	* include/abg-comp-filter.h (harm{less,ful}_filter::visit_end):
	Declare new methods.
	* include/abg-comparison.h (diff_context::maybe_apply_filters):
	Remove the traverse_nodes_once flag.
	* src/abg-comp-filter.cc (apply_filter): Force the traversing to
	operate in cycle avoidance mode.
	(harm{less,ful}_filter::visit): Update the category of the
	canonical node too.
	(harm{less,ful}_filter::visit_end): Define new method.
	* src/abg-comparison.cc (diff_context::maybe_apply_filters):
	Remove the traverse_nodes_once flag.  Adjust.  Simplify logic.
	(diff::traverse): Always call diff_node_visitor::{visit_begin,visit_end}.  If
	the node has already been visited previously then do not call
	diff_node_visitor::visit() and do not visit the children nodes.
	(category_propagation_visitor::visit_end):  If the node has
	already been visited, then propagate the category from the
	canonical nodes of the children nodes.
	(propagate_categories):  Force the traversing to operate in cycle
	avoidance mode.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-21 15:16:48 +01:00
Dodji Seketeli
5f929d456c Add missing new line after reporting alignment changes
	* src/abg-comparison.cc (distinct_diff::report): After calling
	report_size_and_alignment_changes, one needs to add a new line if
	some stuff got emitted out the output stream.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-20 15:13:19 +01:00
Dodji Seketeli
bbf550f493 Add type checking overloads that ease their calling from GDB
	* include/abg-fwd.h (is_class_type, is_pointer, is_reference_type)
	(is_qualified_type): Declare overloads that take naked (non-smart)
	pointers.
	* src/abg-ir.cc (is_class_type, is_pointer, is_reference_type)
	(is_qualified_type): Define overloads that take naked (non-smart)
	pointers.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-20 14:23:15 +01:00
Dodji Seketeli
80b3e35a5d Remove overly eager assert in distinct_diff::report
Since we now have proper diff node filtering capabilities, it appears
that there can be a distinct diff node that is deemed to be reported
even though its child node is not; this happens when the distinct
diff node does carry local changes.  So remove the assert that was
saying otherwise and enjoy one less abort.

	* src/abg-comparison.cc (distinct_diff::report): Remove over-eager
	assert.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-20 14:19:33 +01:00
Dodji Seketeli
49cc99fa7d Factorize late canonicalizing code in the dwarf reader
The late canonicalizing code needed factorizing to increase
maintainability.

	* src/abg-dwarf-reader.cc
	(read_context::{canonicalize_types_scheduled,
	perform_late_type_canonicalizing}):  Factorize these from ...
	(build_translation_unit_and_add_to_ir): ... here.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-20 14:07:39 +01:00
Dodji Seketeli
4e9f871357 Rename schedule_type_for_canonicalization -> schedule_type_for_late_canonicalization
	* src/abg-dwarf-reader.cc
	(read_context::schedule_type_for_late_canonicalization): Renamed
	read_context::schedule_type_for_canonicalization into this.  Also,
	add some sanity checking code in there.
	(build_class_type_and_add_to_ir, maybe_canonicalize_type): Adjust.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-20 14:05:54 +01:00
Dodji Seketeli
885e2ab4be Adjust semantics of the 'is_member' flag of build_ir_node_from_die()
It turns out the new 'is_member' flag of build_ir_node_from_die()
really means 'is this DIE for member type'.  So let's make it clear
now.

	* src/abg-dwarf-reader.cc (build_ir_node_from_die): Rename
	is_member into is_member_type.  Adjust.
	(get_scope_for_die, build_translation_unit_and_add_to_ir)
	(build_namespace_decl_and_add_to_ir): Adjust.
	(build_class_type_and_add_to_ir): Adjust.  Set the flag to false
	when calling build_ir_node_from_die() to build a function_decl.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-20 14:00:53 +01:00
Dodji Seketeli
5b33223cb7 Simplify canonicalizing handling for typedefs
	* src/abg-dwarf-reader.cc (build_ir_node_from_die): For typedefs,
	we don't need to test that the current scope is a class to know
	that we are looking at a member type.  Just looking at the
	is_member flag is enough.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-20 13:48:48 +01:00
Dodji Seketeli
cab7817dd0 Do not miss canonicalizing opportunities on non-member class types
	* src/abg-dwarf-reader.cc (build_ir_node_from_die): When a class
	is not a member type, then it at least ought to be scheduled for
	late canonicalizing.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-20 13:43:25 +01:00
Dodji Seketeli
0a19dd37db Fix handling of canonicalizing of member enum types
A member enum type's canonicalizing is handled by the member type
handling code of the build_class_type_and_add_to_ir().  So do not try
to canonicalize from elsewhere.

	* src/abg-dwarf-reader.cc (build_ir_node_from_die): Once we've
	built the enum type by calling build_enum_type(), do not try to
	canonicalize it here if it's a member type.  The calling
	build_class_type_and_add_to_ir() must deal with it already.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-20 13:39:20 +01:00
Dodji Seketeli
25b69bb7a3 Stick qualified, pointer, reference and array types into the global scope
In the debug info emitted by GCC up to at least 4.4.x, pointer, reference, array and
qualified types were in the global scope.  In GCC 4.8.x they can
belong to the scope of their sub-type.  The comparison code of
libabigail can thus (wrongly) consider that a qualified type described
by GCC 4.4.x debug info is different from the debug info of the *same*
qualified type emitted by GCC 4.8.x just because their scopes are
different.  The scope of qualified, pointer and reference types should
not matter anyway.  So this patch makes these composite types belong
to the global scope, irrespective of where they appear in the debug
info.  I have seen this when comparing libstdc++ from RHEL 6.5 and 7.
This is visible now that we have type canonicalizing.

	* src/abg-dwarf-reader.cc (build_class_type_and_add_to_ir): Do not
	consider qualified, pointer, reference and array types as member
	types.  Only typedef, class and enum types are.
	(build_ir_node_from_die): Stick base, pointer, reference,
	qualified and array types into the global scope.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-20 13:34:59 +01:00
Dodji Seketeli
0f92cd6a35 Avoid creating multiple versions of certain composite types
Sometimes during the reading of debug info, when creating a given
composite type T, calling build_ir_node_from_die() to get the IR node for
its sub-type might lead to the creation of T, while we are in the
function that is supposed to create it.  In that case, just return the
type that has then been created from underneath us, rather than creating
a new one.
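
Here is a small sketch of that "look again after building the sub-type"
pattern.  It is an illustration only: the map keyed by a DIE offset, the
build_subtype callback and the build_pointer_type() signature are
assumptions, and the strong reference to the pointed-to type is a
simplification:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>

// Hypothetical, simplified IR types.
struct type_base
{
  virtual ~type_base() = default;
};

struct pointer_type : type_base
{
  std::shared_ptr<type_base> pointed_to;
};

using die_offset = unsigned long;
using type_map = std::map<die_offset, std::shared_ptr<type_base>>;

// Builds the pointer type for DIE.  Building the sub-type may re-enter
// this function for the same DIE, so the map is consulted *after* the
// sub-type has been built and the existing type is reused if present.
std::shared_ptr<type_base>
build_pointer_type(die_offset die,
                   type_map& built,
                   const std::function<std::shared_ptr<type_base>()>& build_subtype)
{
  assert(build_subtype);
  std::shared_ptr<type_base> sub = build_subtype();

  type_map::const_iterator it = built.find(die);
  if (it != built.end())
    return it->second;  // created from underneath us; do not duplicate

  std::shared_ptr<pointer_type> result = std::make_shared<pointer_type>();
  result->pointed_to = sub;
  built[die] = result;
  return result;
}
```

Without the second lookup, a re-entrant call would leave two distinct
objects representing the same composite type.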

	* src/abg-dwarf-reader.cc (build_qualified_type)
	(build_pointer_type_def, build_reference_type, build_array_type)
	(build_typedef_type): If the composite type we are about to create
	was already created, just return the one that exists already.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-20 12:59:48 +01:00
Dodji Seketeli
7c7c326b2b Do not forget to canonicalize enum underlying type and void type
It appears we were not canonicalizing the underlying type of enum type
and the void type.  We could catch this now that we are requiring that
abigail::ir::strip_typedef() works on canonicalized types only.  This
patch fixes that.

	* src/abg-dwarf-reader.cc (build_enum_type): Canonicalize the
	underlying type of the enum type.
	(build_ir_node_for_void_type): Canonicalize the void type.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-20 12:51:25 +01:00
Dodji Seketeli
d4e34afa9b Do not forget to associate DIE to the types they represent
There are cases where we forget to associate some DIEs to the types
they represent.  I have seen this while comparing libstdc++ from RHEL
6.5 and RHEL 7.

This patch moves the DIE->type association to the build_* functions
that actually creates the types, as opposed to doing it in the callers
of these build_* functions.

	* src/abg-dwarf-reader.cc (build_type_decl, build_enum_type)
	(build_qualified_type, build_pointer_type_def)
	(build_reference_type, build_typedef_type)
	(build_class_type_and_add_to_ir): Take a new flag that says if the
	DIE is from the alternate debug info section or not.  Perform the
	DIE->type association in these functions.  Note that in
	build_class_type_and_add_to_ir we are now doing the DIE->type
	association even for declaration-only classes.  And for member
	types, do not bother doing the association because it's already
	been done by build_ir_node_from_die().
	(build_ir_node_from_die): Do not do the DIE->type association here
	anymore.  Adjust to the new signature of the build_* functions
	above that actually build the types.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-20 12:39:41 +01:00
Dodji Seketeli
668116ed16 Clear per-TU data before reading debug info for a TU
A corpus is made of several translation units; when reading the
corpus, there is data (maintained in the reading context) that is
relevant to all the translation units.  But there is also data that is
relevant to the current translation unit being read only; and that data
should be cleared before starting to read the next translation unit.
Otherwise, bad and subtle issues happen.  This patch clears some
per-TU data that I forgot to clear recently.

The patch also does some code factorizing to increase maintainability.

	* src/abg-dwarf-reader.cc
	(read_context::die_type_map): New accessor for the two DIE->Type
	maps we have; the one of the main debug info section and the one
	of the alternate debug info section.
	(read_context::{associate_die_to_type,
	lookup_type_from_die_offset}): use the new die_type_map()
	accessor.
	(read_context::clear_per_translation_unit_data): Factorize this
	from build_translation_unit_and_add_to_ir().  Also, add code to
	clear the DIE->type map as well as the vectors of offsets of the
	types of the DIEs to canonicalize after the translation unit has
	been read.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-20 12:21:58 +01:00
Dodji Seketeli
dbe88f1103 Fix the new regression test for type canonicalizing
	* tests/runtestcanonicalizetypes.sh.in (binaries): Refer to
	abg-tools-utils, not abg-tools-utils.o; the extension is computed
	automatically, depending on the underlying platform.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-19 11:44:19 +01:00
Dodji Seketeli
cc3f6a86a7 Make strip_typedef() act on canonical types only
strip_typedef(), when constructing new pointers,
references and other composite types was building new types that
weakly referred to their sub-types; for instance, a pointer type
has a weak reference on its pointed-type.  That means the referred-to
type must be 'owned' by something else.  That means that strip_typedef()
needs to create types whose lifetime is "long enough".  This patch
ensures that strip_typedef() returns a canonical type; and we are sure
that a canonical type is live during the entire life time of the
libabigail library itself.
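
The lifetime problem can be shown with a short sketch based on
std::weak_ptr.  Everything here is a hypothetical stand-in (type_base,
pointer_type, the canonical_registry class and make_pointer_to()); it is
only meant to show why a weakly referenced sub-type needs a long-lived
owner:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified IR types.
struct type_base
{
  std::string name;
};

struct pointer_type
{
  std::weak_ptr<type_base> pointed_to;  // does not keep the type alive
};

// Hypothetical long-lived owner of canonical types.
class canonical_registry
{
  std::vector<std::shared_ptr<type_base>> owned_;

public:
  // Returns the canonical type named NAME, creating it if needed.
  std::shared_ptr<type_base>
  canonicalize(const std::string& name)
  {
    for (const std::shared_ptr<type_base>& t : owned_)
      if (t->name == name)
        return t;
    owned_.push_back(std::make_shared<type_base>(type_base{name}));
    return owned_.back();
  }
};

pointer_type
make_pointer_to(const std::shared_ptr<type_base>& t)
{
  assert(t);
  return pointer_type{t};
}
```

A pointer_type built on a temporary type dangles as soon as the temporary
goes away, whereas one built on a type returned by the registry stays
valid for as long as the registry lives.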

So that means strip_typedef can only be used after types have been
canonicalized.  To that end, this patch changes is_class_type() to
make it not strip typedefs.  That way, is_class_type() can be used
even when canonicalized types are not yet available.  The patch then
introduces a new is_compatible_with_class_type() function that strips
typedef.  The code of type_size_changed() that wanted to strip
typedefs is then adjusted to use this new
is_compatible_with_class_type() instead.

	* include/abg-fwd.h (is_compatible_with_class_type): Declare new
	function.
	(canonicalize): Move the declaration here, from ...
	* include/abg-ir.h (canonicalize): ... here.
	* src/abg-ir.cc (strip_typedef): Assert that the input type is
	canonicalized.  Make sure that weak references are on
	canonicalized types.  Make sure that the returned type is a
	canonical one.
	(canonicalize): Make this return the canonical type that it has
	computed.
	* src/abg-comp-filter.cc (type_size_changed): Use the new
	is_compatible_with_class_type() function, instead of
	is_class_type().

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-19 11:44:11 +01:00
Dodji Seketeli
3afb235d0c Speed up function_decl::get_id() and var_decl::get_id()
function_decl::get_id() and var_decl::get_id() showed up high on CPU usage
profiles.  This patch thus implements caching versions of these
functions, which yielded a speed-up of at least 10% on binaries with a lot
of publicly exported symbols.
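
A minimal sketch of such a caching getter is shown below.  The class is a
hypothetical stand-in, the rule used to compute the ID (linkage name if
present, plain name otherwise) is an assumption, and the computations()
counter exists only to make the caching observable.  The sketch also
returns a const reference, so that callers do not copy the string:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, pared-down stand-in for a function declaration.
class function_decl
{
  std::string name_;
  std::string linkage_name_;
  mutable std::string id_;             // cache; empty means "not computed"
  mutable unsigned computations_ = 0;  // instrumentation for the example

public:
  function_decl(std::string name, std::string linkage_name)
    : name_(std::move(name)), linkage_name_(std::move(linkage_name))
  {
    // The empty string is the cache sentinel, so a name is required.
    assert(!name_.empty());
  }

  // Computes the ID on the first call and returns the cached string,
  // by reference, on every call.
  const std::string&
  get_id() const
  {
    if (id_.empty())
      {
        id_ = linkage_name_.empty() ? name_ : linkage_name_;
        ++computations_;
      }
    return id_;
  }

  unsigned
  computations() const
  {return computations_;}
};
```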

	* src/abg-ir.cc (var_decl::priv::id_): New data member.
	(var_decl::get_id): Cache the result on the first invocation and
	return it on subsequent invocations.
	(function_decl::priv::id_): New data member.
	(function_decl::get_id): Cache the result on the first invocation
	and return it on subsequent invocations.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-18 21:32:38 +01:00
Dodji Seketeli
2346a328eb Speed up symbol version reading
Profiling showed that symbol version info reading was a hot spot.
This patch implements caching the symbol table and symbol
version-related sections.  It yielded speed-ups of at least 20% on
big binaries with a lot of versioned symbols.

	* src/abg-dwarf-reader.cc (find_symbol_table_section)
	(get_symbol_versionning_sections): Forward declare these existing
	static functions.
	(read_context::{symtab_section_,
	symbol_versionning_sections_loaded_,
	symbol_versionning_sections_found_, versym_section_,
	verdef_section, verneed_section}): New data members.
	(read_context::read_context): Initialize them.
	(read_context::{find_symbol_table_section,
	get_symbol_versionning_sections, get_version_for_symbol}):
	Implement caching versions of their existing non-caching
	counterparts.
	(read_context::lookup_elf_symbol_from_index): Use the new caching
	functions read_context::find_symbol_table_section and
	read_context::get_version_for_symbol.
	(read_context::load_symbol_maps): Likewise, use the new caching
	function read_context::find_symbol_table_section.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-18 21:32:38 +01:00
Dodji Seketeli
955c466ba7 Stop traversing function/variable node when added to symbol table
During the building of the list of publicly exported functions and
variables from a corpus, when we visit a function it's useless (and
time consuming) to visit the sub-nodes of the function.  This patch
does away with that.

	* src/abg-corpus.cc (symtab_build_visitor_type::visit_begin):
	Replace symtab_build_visitor_type::visit_end with this and return
	false.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-18 21:32:38 +01:00
Dodji Seketeli
ba8ecf46a1 Do not apply diff filters to sub-trees not carrying changes
Some *huge* sub-trees might not carry any change.  In that case do not
bother applying the filter because eventually no filter is going to be
applied anyway.  This can save us a lot of walking time.

	* src/abg-comp-filter.cc ({harmless, harmful}_filter::visit): Do
	not try to do the categorizing on a diff sub-tree that does
	not carry any change.
	* src/abg-comparison.cc (diff_context::maybe_apply_filters): Do
	not bother trying to apply the filters on a diff sub-tree that
	does not carry any change.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-18 21:32:37 +01:00
Dodji Seketeli
8b28d171c3 Canonicalize types either early or late after TU reading
While trying to diff two identical files (abidiff foo.so foo.so) it
appeared that canonicalizing types during, e.g., the DWARF reading
process was leading to subtle errors because it's extremely hard to
know when a type is complete.  That is, during the building of a class
type C, a pointer to C can be built before C is complete.  Worse, even
after reading the DIE (from DWARF) of class C, there can be DIEs seen
later in the translation unit that modify type C.  In these late
cases, one needs to wait -- not only until C is fully built, but also
sometimes, until after the translation unit is fully built -- to
canonicalize C and then the pointer to C.  That kind of thing.

So now there are two possible points in time when canonicalization of
a type can happen.  It can happen early, when the type is built.  This
is the case for basic types and composite types for which all
sub-types are canonicalized already.  It can happen late, right after
we've finished reading the debug info for the current translation
unit.
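
The early/late decision can be sketched like this.  It is a simplified
illustration: type_node, its "canonical" flag and the shape of
read_context are assumptions, and setting a flag stands in for the real
canonicalization work:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for an IR type.
struct type_node
{
  bool canonical = false;
  std::vector<std::shared_ptr<type_node>> subtypes;
};

// True if at least one direct sub-type of T is not canonicalized yet.
bool
type_has_non_canonicalized_subtype(const type_node& t)
{
  for (const std::shared_ptr<type_node>& s : t.subtypes)
    if (!s->canonical)
      return true;
  return false;
}

struct read_context
{
  std::vector<std::shared_ptr<type_node>> types_to_canonicalize;

  // Early canonicalization when all sub-types are ready; otherwise the
  // type is scheduled for the end of the translation unit.
  void
  maybe_canonicalize_type(const std::shared_ptr<type_node>& t)
  {
    assert(t);
    if (type_has_non_canonicalized_subtype(*t))
      types_to_canonicalize.push_back(t);
    else
      t->canonical = true;
  }

  // To be called once the translation unit has been fully read.
  void
  perform_late_type_canonicalizing()
  {
    for (const std::shared_ptr<type_node>& t : types_to_canonicalize)
      t->canonical = true;
    types_to_canonicalize.clear();
  }
};
```

In this sketch a pointer to a class that is still being built is queued,
and it only becomes canonical when perform_late_type_canonicalizing()
runs after the translation unit is complete.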

So this patch fixes the IR traversal and uses that to walk the
translation unit (or even types) after it's built.  It does away with
the first attempt to perform early canonicalizing only.

The patch also handles type canonicalizing while reading xml-abi
format.

	* include/abg-fwd.h (is_class_type)
	(type_has_non_canonicalized_subtype): Declare new functions.
	(is_member_type): Remove the overload that takes a decl_base_sptr.
	It's superfluous.  We just need the one that takes a
	type_base_sptr.
	* include/abg-ir.h (translation_unit::{is_constructed,
	set_is_constructed}): Add new methods.
	(class_decl::has_virtual_member_functions): Likewise.
	(class decl_base): Make it virtually inherit ir_traversable_base.
	(class type_base): Make this virtually inherit traversable_base
	too.
	(type_base::canonicalize): Renamed enable_canonical_equality
	into this.
	(type_base::traverse): Declare new virtual method.
	(canonicalize): Renamed enable_canonical_equality into this.
	(scope_type_decl::traverse): Declare new virtual method.
	(namespace_decl::get_pretty_representation): Declare new virtual
	method.
	(function_type::traverse): Likewise.
	(class_decl::base_spec::traverse): Likewise.
	(ir_node_visitor::visit): Remove the overloads and replace each of
	them with a pair of ...
	(ir_node_visitor::{visit_begin, visit_end}): ... these.
	* include/abg-traverse.h (traversable_base::visiting): New
	method.
	(traversable_base::visiting_): New data member.
	(traversable_base::traversable_base): New constructor.
	* src/abg-ir.cc ({scope_decl, type_decl, namespace_decl,
	qualified_type_def, pointer_type_def, reference_type_def,
	array_type_def, enum_type_decl, typedef_decl, var_decl,
	function_decl, function_decl::parameter, class_decl,
	class_decl::member_function_template,
	class_decl::member_class_template, function_tdecl,
	class_tdecl}::traverse): Fix this to properly set the
	traversable_base::visiting_ flag and to reflect the new signatures
	of the ir_node_visitor methods.
	({type_base, scope_type_decl, function_type,
	class_decl::base_spec}::traverse): New method.
	(type_base::get_canonical_type_for): Handle the case of the type
	already having a canonical type.  Properly hash the type using the
	dynamic type hasher.  Look through declaration-only classes to
	consider the definition of the class instead.  Fix logic to have a
	single point of return, to ease debugging.
	(canonicalize): Renamed enable_canonical_equality into this.
	(namespace_decl::get_pretty_representation): Define new method.
	(ir_node_visitor::visit): Replace each of these overloads with a
	pair of visit_begin/visit_end ones.
	(translation_unit::priv::is_constructed_): New data member.
	(translation_unit::priv::priv): Initialize it.
	(translation_unit::{is_constructed, set_is_constructed}): Define
	new methods.
	(is_member_type(const decl_base_sptr)): Remove.
	(is_class_type(decl_base *d)): Define new function.
	(class_decl::has_virtual_member_functions): Define new method.
	(equals(const class_decl&, const class_decl&, change_kind*)): If
	the containing translation unit is not constructed yet, do not
	take virtual member functions into account when comparing the
	classes.  This is because when reading from DWARF, there can be
	DIEs that change the number of virtual member functions after the
	DIE of the class.  So one needs to start taking virtual members
	into account only after the translation unit has been constructed.
	(class non_canonicalized_subtype_detector): Define new type.
	(type_has_non_canonicalized_subtype): Define new function.
	* src/abg-corpus.cc (symtab_build_visitor_type::visit): Renamed
	this into symtab_build_visitor_type::visit_end.
	* src/abg-dwarf-reader.cc (die_type_map_type): New typedef.
	(die_class_map_type): This is now a typedef on a map of
	Dwarf_Off/class_decl_sptr.
	(read_context::{die_type_map_, alternate_die_type_map_,
	types_to_canonicalize_, alt_types_to_canonicalize_}): New data
	members.
	(read_context::{associate_die_to_decl,
	associate_die_to_decl_primary}): Make these methods public.
	(read_context::{associate_die_to_type,
	lookup_type_from_die_offset, is_wip_class_die_offset,
	types_to_canonicalize, schedule_type_for_canonicalization}):
	Define new methods.
	(build_type_decl, build_enum_type)
	(build_class_type_and_add_to_ir, build_qualified_type)
	(build_pointer_type_def, build_reference_type, build_array_type)
	(build_typedef_type, build_function_decl): Do not canonicalize
	types here.
	(maybe_canonicalize_type): Define new function.
	(build_ir_node_from_die): Take a new flag that says if the ir node
	is a member type/function or not. Early-canonicalize base types.
	Canonicalize composite types that have only canonicalized
	sub-types.  Schedule the other types for late canonicalizing.  For
	class types, early canonicalize those that are non-member types,
	that are fully constructed and that have only canonicalized
	sub-types.  Adjust to the new signature of build_ir_node_from_die.
	(get_scope_for_die, build_namespace_decl_and_add_to_ir)
	(build_qualified_type, build_pointer_type_def)
	(build_reference_type, build_array_type, build_typedef_type)
	(build_var_decl, build_function_decl): Adjust for the new
	signature of build_ir_node_from_die.
	(build_translation_unit_and_add_to_ir): Likewise.  Perform the
	late canonicalizing of the types that have been scheduled for
	that.
	(build_class_type_and_add_to_ir): Return a class_decl_sptr, not a
	decl_base_sptr.  Adjust for the new signature of
	build_ir_node_from_die.  Early canonicalize member types that are
	created and added to a given class, or schedule them for late
	canonicalizing.
	* src/abg-reader.cc (class read_context::{m_wip_classes_map,
	m_types_to_canonicalize}): New data members.
	(read_context::{clear_types_to_canonicalize,
	clear_wip_classes_map, mark_class_as_wip, unmark_class_as_wip,
	is_wip_class, maybe_canonicalize_type,
	schedule_type_for_late_canonicalizing,
	perform_late_type_canonicalizing}): Add new method definitions.
	(read_context::clear_per_translation_unit_data): Call
	read_context::clear_types_to_canonicalize().
	(read_translation_unit_from_input): Call
	read_context::perform_late_type_canonicalizing() at the end of the
	function.
	(build_function_decl): Fix the function type canonicalizing (per
	translation) that was already in place.  Do the canonicalizing of
	these only when the type is fully built.  Oops.  This was really
	broken.  Also, when the function type is constructed, consider it
	for type canonicalizing.
	(build_type_decl): Early canonicalize basic types.
	(build_qualified_type_decl, build_pointer_type_def)
	(build_pointer_type_def, build_reference_type_def)
	(build_array_type_def, build_enum_type_decl, build_typedef_decl):
	Handle the canonicalizing for these composite types: either early
	or late.
	(build_class_decl): Likewise.  Also, mark this class as 'being
	built' until it's fully built.  This helps the canonicalizing code
	to know that it should leave a class alone until it's fully built.
	* tests/test-ir-walker.cc (struct name_printing_visitor): Adjust
	to the visitor methods naming change.
	* configure.ac: Generate the tests/runtestcanonicalizetypes.sh
	testing script from tests/runtestcanonicalizetypes.sh.in.
	* tests/runtestcanonicalizetypes.sh.in: Add the template for the
	new runtestcanonicalizetypes.sh script that tests for type
	canonicalizing.
	* tests/Makefile.am: Add the new runtestcanonicalizetypes.sh
	regression testing script to the build system.
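
The early/late canonicalizing scheme described in the entries above can be sketched as follows.  This is only an illustration under stated assumptions: type_node_sketch and reader_sketch are hypothetical stand-ins, and although the method names maybe_canonicalize_type and perform_late_type_canonicalizing come from the entries above, their bodies here are guesses at the general idea, not the actual libabigail code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an IR type node.
struct type_node_sketch
{
  std::vector<type_node_sketch*> subtypes;
  bool canonicalized;
  type_node_sketch() : canonicalized(false) {}
};

// True if at least one direct sub-type has no canonical type yet.
inline bool
has_non_canonicalized_subtype(const type_node_sketch& t)
{
  for (const type_node_sketch* s : t.subtypes)
    if (!s->canonicalized)
      return true;
  return false;
}

// Hypothetical stand-in for the reading context.
struct reader_sketch
{
  std::vector<type_node_sketch*> types_to_canonicalize;

  // Canonicalize T right away when all its sub-types are already
  // canonicalized; otherwise schedule it for later.
  void
  maybe_canonicalize_type(type_node_sketch& t)
  {
    if (has_non_canonicalized_subtype(t))
      types_to_canonicalize.push_back(&t);
    else
      t.canonicalized = true;
  }

  // Meant to run once the translation unit is fully built.
  void
  perform_late_type_canonicalizing()
  {
    for (type_node_sketch* t : types_to_canonicalize)
      t->canonicalized = true;
    types_to_canonicalize.clear();
  }
};
```

In this sketch a base type with no sub-types is canonicalized immediately, whereas a class and a pointer to that class (a cycle) both wait for the late pass.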

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-18 21:32:37 +01:00
Dodji Seketeli
f968b7d422 Factorize per TU data clearing in the xml-abi reader
It's more maintainable to have a function that clears the per
translation unit data that needs to be erased before we parse the next
translation unit using the same reading context.  This patch does just
that.  Subsequent patches use this cleanup.

	* src/abg-reader.cc
	(read_context::clear_per_translation_unit_data): Factorize this
	function out of ...
	(read_context::read_translation_unit_from_input): ... this one.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-18 19:16:14 +01:00
Dodji Seketeli
4f73543250 Use the deep type sptr equality operator when possible
While looking at something else, it occurred to me that we ought to use
the deep sptr equality operator when we can.

	* src/abg-ir.cc (equals):  On function_decl overload, use the deep
	sptr type equality operator when comparing types.
	(non_type_tparameter::operator==): Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-18 19:16:13 +01:00
Dodji Seketeli
988d2bafb4 Properly compare virtualness of member functions
* src/abg-ir.cc (equals(const function_decl&, const
	function_decl&, change_kind*)): Compare virtualness of member
	function before comparing their vtable offsets.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-18 19:16:13 +01:00
Dodji Seketeli
d1d4965ee1 Misc style fixes
* include/abg-ir.h (reference_type_def::get_pointed_to_type): use
	type_base_sptr, rather than shared_ptr<type_base>
	(typedef_decl::get_underlying_type): Likewise.
	(function_decl::get_return_type): Likewise.
	(function_decl::set_type): Likewise.
	(class_decl::member_class_template::as_class_tdecl): Likewise.
	* src/abg-comparison.cc (compute_diff): Remove useless vertical
	space.
	(corpus_diff::traverse): Add a vertical space after this.
	* src/abg-dwarf-reader.cc (type_ptr_map): Remove this unused
	typedef.
	(get_version_for_symbol)
	(finish_member_function_reading): Fix the comments of these
	functions.
	* src/abg-reader.cc (build_function_decl): Return a
	function_decl_sptr rather than a shared_ptr<function_decl>.
	(build_qualified_type_decl)
	(build_pointer_type_def, build_reference_type_def)
	(build_array_type_def, build_typedef_decl, build_class_decl): Use
	the is_<somekind_of_type> functions here, rather than using the
	dynamic cast.  This increases maintainability.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-18 19:15:17 +01:00
Dodji Seketeli
e39715f6c1 Optimize compressed debug info reading for speed
Profiling of abidiff or rather abidw on an Xorg binary that is
dwarf-compressed revealed that find_last_import_unit_point_before_die
was a hot spot.  This patch optimizes it for speed by looking for the
inclusion point in reverse topological order.

	* src/abg-dwarf-reader.cc
	(find_last_import_unit_point_before_die): Look for the inclusion
	point of the partial unit in reverse topological order.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-11 10:38:11 +01:00
Dodji Seketeli
3e033263fc Share private data of class_diff nodes
Profiling has shown that there can be a *lot* of class_diff nodes that
actually represent the same diffs.  In other words, these instances of
class_diff are in the same equivalence class.  In that
case the private data of these class_diff nodes consume a lot of
redundant memory.  This patch is an optimization that leverages that
insight.  It shares the private data of the class_diff nodes that are
in the same equivalence class.

This makes the memory consumption of abidiff on the Xorg binaries of
RHEL 6 and 7 drop from more than 6GB to less than 370MB; execution
time drops from 15 to 7 minutes.
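
The sharing idea can be sketched as below.  All the names (diff_priv_sketch, class_diff_sketch, make_class_diff and the string key standing for the equivalence class) are hypothetical; this only illustrates handing the private data of the canonical instance to every equivalent node, and is not the actual libabigail code.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the expensive per-diff private data.
struct diff_priv_sketch
{
  std::map<std::string, int> changed_members;
};

// Hypothetical stand-in for a class_diff node.
struct class_diff_sketch
{
  std::string key;                         // names the equivalence class
  std::shared_ptr<diff_priv_sketch> priv;  // shared by equivalent nodes
};

typedef std::map<std::string, std::shared_ptr<diff_priv_sketch> >
canonical_priv_map;

// Build a diff node whose private data is the one of the canonical
// instance for KEY.  The first node built for KEY creates that data.
inline class_diff_sketch
make_class_diff(canonical_priv_map& canonical, const std::string& key)
{
  std::shared_ptr<diff_priv_sketch>& p = canonical[key];
  if (!p)
    p = std::make_shared<diff_priv_sketch>();
  return class_diff_sketch{key, p};
}
```

With this layout the memory cost of the private data is paid once per equivalence class rather than once per node.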

	* src/abg-comparison.cc (class_diff::class_diff): Do not
	initialize the private data of class_diff here.
	(compute_diff): In the overload for class_diff, initialize the
	private data of the new instance of class_diff to the private data
	of its canonical instance.
	(redundancy_marking_visitor::visit_begin): If a node is marked
	redundant, do not dare visit its children.  In cases of classes
	that have members that reference themselves, this prevents us from
	wrongly marking some of the data member changes as being
	redundant.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-10 16:08:26 +01:00
Dodji Seketeli
899bbd75cf Do not crash when applying filters to a NULL diff
While looking at something else, I stumbled across this two-liner.

	* src/abg-comparison.cc (diff_context::maybe_apply_filters): Do
	not crash when called with a NULL diff.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-10 11:41:48 +01:00
Dodji Seketeli
8fbd4f93ba Initial implementation of canonical type comparison in the IR
Comparing types that are equal showed up high in profiles.  This patch
is an answer to that.  It implements the notion of canonical type for
types known to libabigail.  Then when comparing two types, if they
have canonical types, just comparing the pointer value of their
canonical type is enough.  This speeds up type comparison somewhat;
comparing the Xorg binaries from rhel 6 and 7 goes from more than 20h
(I gave up after that) to under 15 minutes.
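
A minimal sketch of the canonical type idea follows.  The names type_sketch, canonical_map, canonicalize and types_equal are hypothetical, and the structural comparison is reduced to a name and a size; the real IR is much richer, so take this as an illustration of the technique only.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an IR type.
struct type_sketch
{
  std::string name;
  unsigned size_in_bits;
  const type_sketch* canonical;  // null until canonicalized
};

typedef std::shared_ptr<type_sketch> type_sptr;
// Hash value -> canonical types that have that hash value.
typedef std::unordered_map<std::size_t, std::vector<type_sptr> > canonical_map;

// The slow path: member-wise comparison.
inline bool
structurally_equal(const type_sketch& l, const type_sketch& r)
{return l.name == r.name && l.size_in_bits == r.size_in_bits;}

// Find (or register) the canonical type of T and remember it in T.
inline const type_sketch*
canonicalize(canonical_map& m, const type_sptr& t)
{
  if (t->canonical)
    return t->canonical;
  std::vector<type_sptr>& bucket = m[std::hash<std::string>()(t->name)];
  for (const type_sptr& c : bucket)
    if (structurally_equal(*c, *t))
      return t->canonical = c.get();
  bucket.push_back(t);  // T becomes the canonical type of its class
  return t->canonical = t.get();
}

// The fast path: when both types are canonicalized, one pointer
// comparison is enough.
inline bool
types_equal(const type_sketch& l, const type_sketch& r)
{
  if (l.canonical && r.canonical)
    return l.canonical == r.canonical;
  return structurally_equal(l, r);
}
```

The structural comparison is paid once per type, at canonicalization time; every later comparison between canonicalized types is a pointer comparison.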

	* include/abg-ir.h (class type_base): Pimplify this class.
	(type_base::canonical_types_map_type): New typedef.
	(type_base::{get_canonical_types_map, get_canonical_type_for,
	get_canonical_type}): Declare new member functions.
	(enable_canonical_equality): Declare new function.
	(struct type_base::hash): Declare this functor here.
	* src/abg-ir.cc ():
	* src/abg-dwarf-reader.cc (build_type_decl, build_enum_type)
	(build_class_type_and_add_to_ir, build_qualified_type)
	(build_pointer_type_def, build_reference_type, build_array_type)
	(build_typedef_type, build_function_decl): Enable canonical
	equality for the resulting type returned by these functions.
	* src/abg-hash.cc (type_base::hash::operator()(const type_base&)):
	Adjust as this is now out-of-line.  Also, add two overloads for
	type_base* and type_base_sptr.
	(struct type_base::priv): Define new type for private data of
	type_base.
	(type_base::{get_canonical_types_map, get_canonical_type_for,
	get_canonical_type}): Define new member functions.
	(enable_canonical_equality): Define new function.
	(type_base::{type_base, set_size_in_bits, get_size_in_bits,
	set_alignment_in_bits, get_alignment_in_bits}): Adjust.
	({type_decl, scope_type_decl, qualified_type_def,
	pointer_type_def, reference_type_def, array_type_def,
	enum_type_decl, typedef_decl, function_type,
	class_decl}::operator==): If the types being compared have
	canonical type then use them for comparison.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-09 17:01:48 +01:00
Dodji Seketeli
c27ec0db35 Don't walk the diff tree when there are no suppressions
Profiling showed that we were walking the diff tree to apply
suppressions even when there were no suppressions to apply.  This
patch does away with that behaviour.

	* src/abg-comparison.cc (apply_suppressions): Do not walk the diff
	tree to apply suppressions when there are no suppressions to
	apply.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-07 11:00:05 +01:00
Dodji Seketeli
6fa1dca62a Speedup some diff::has_changes() implementations
Some implementations of diff::has_changes() are high on performance
profiles. This patch makes them faster by trying to detect as quickly
as possible cases where the diff node actually has changes.  The
longest path is to detect when the diff node does *not* have changes.
We'll deal with the latter case later.
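
The fast path can be pictured with the sketch below.  subject_sketch, hash_of and deep_equal are hypothetical names, and the hash combination is an arbitrary choice made for the example; the point is only that different hash values prove a difference, whereas equal hash values still require the slow comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical stand-in for a diff subject.
struct subject_sketch
{
  std::string name;
  unsigned size_in_bits;
};

inline std::size_t
hash_of(const subject_sketch& s)
{
  std::size_t h = std::hash<std::string>()(s.name);
  // Arbitrary hash combination, chosen for this example only.
  return h ^ (std::hash<unsigned>()(s.size_in_bits)
	      + 0x9e3779b9 + (h << 6) + (h >> 2));
}

// The slow path.
inline bool
deep_equal(const subject_sketch& l, const subject_sketch& r)
{return l.name == r.name && l.size_in_bits == r.size_in_bits;}

inline bool
has_changes(const subject_sketch& first, const subject_sketch& second)
{
  // Different hash values imply different subjects: get out early.
  if (hash_of(first) != hash_of(second))
    return true;
  // Same hash value: this may be a collision, so compare for real.
  return !deep_equal(first, second);
}
```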

	* src/abg-comparison.cc ({distinct_diff, var_diff,
	class_diff}::has_changes): Use the hash value of the diff subjects
	to detect quickly if they differ.  If they don't, then go the slow
	path of comparing the types.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-07 11:00:05 +01:00
Dodji Seketeli
34b94a06da Get out as early as possible when comparing different ABI artefacts
When the 'equals' overloaded functions were introduced, it was
to get the possibility to have a hint about the kind of difference
(local or sub-type difference) there was between two different ABI
artifacts.  To do that, it was quite common to keep on comparing the
two artifacts even when we knew they were different, because we needed
to know all the kinds of differences there were.

Now, profiling shows that doing this generally is too costly.  So,
this patch adds a way of doing it only when necessary.
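
The optional out-parameter pattern can be sketched like this.  var_sketch, equals_sketch and the enumerator names are hypothetical and much simpler than the real change_kind machinery; the sketch only shows how a null pointer lets the comparison return at the first difference.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified kinds of change, usable as a bit mask.
enum change_kind_sketch
{
  NO_CHANGE_KIND = 0,
  LOCAL_CHANGE_KIND = 1,
  SUBTYPE_CHANGE_KIND = 2
};

// Hypothetical stand-in for an ABI artifact.
struct var_sketch
{
  std::string name;
  std::string type_name;
};

// Compare L and R.  If K is null, return as soon as a difference is
// found.  Otherwise keep comparing and record every kind of change.
inline bool
equals_sketch(const var_sketch& l, const var_sketch& r, unsigned* k)
{
  bool result = true;
  if (l.name != r.name)
    {
      result = false;
      if (!k)
	return false;
      *k |= LOCAL_CHANGE_KIND;
    }
  if (l.type_name != r.type_name)
    {
      result = false;
      if (!k)
	return false;
      *k |= SUBTYPE_CHANGE_KIND;
    }
  return result;
}

// Plain equality does not care about the kind of change.
inline bool
operator==(const var_sketch& l, const var_sketch& r)
{return equals_sketch(l, r, 0);}
```

Only callers that pass a non-null pointer pay for the exhaustive comparison.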

	* include/abg-ir.h (equals): Turn the last parameter of type
	change_kind& into a change_kind*.  Do this on all the overloads'
	declarations.
	* src/abg-ir.cc (equals): Do the same for the definitions of the
	overloads and adapt them to report about the kind of changes that make
	the two ABI artifacts different -- only if the change_kind pointer
	is non-null.  That way, callers have a way to choose if they want
	to go the expensive route of knowing what kind of changes there
	are.
	({decl_base, scope_decl, type_base, scope_type_decl,
	qualified_type_def, pointer_type_def, pointer_type_def,
	reference_type_def, array_type_def, enum_type_decl, typedef_decl,
	var_decl, function_type, function_decl, function_decl::parameter,
	class_decl::base_spec, class_decl}::operator==): Adjust to the new
	signature of equals; call it with the change_kind* parameter set
	to NULL.
	* src/abg-comparison.cc ({var_diff, pointer_diff, array_diff,
	reference_diff, qualified_type_diff, enum_diff, class_diff,
	base_diff, scope_diff, fn_parm_diff, function_decl_diff,
	type_decl_diff, typedef_diff}::has_local_changes): Adjust.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-07 11:00:05 +01:00
Dodji Seketeli
3b3dbf6643 Rename diff::length() into diff::has_changes()
Since it turned out that the length of the changes carried by a diff
node has never been used in the algorithms of the comparison engine,
the diff::length() feels wrong.  What we want is rather a name like
diff::has_changes() so this is what this patch does.

	* include/abg-comparison.h (*::has_changes): Rename the ::length()
	method of all the diff types that inherit the diff class into
	this, in the class declarations.
	* src/abg-comparison.cc (*::has_changes): Do the same as in the
	declarations, in the definitions.
	(diff::to_be_reported, distinct_diff::has_local_changes)
	(distinct_diff::report, distinct_diff::, array_diff::has_changes)
	(reference_diff::has_changes, qualified_type_diff::has_changes)
	(enum_diff::has_changes, translation_unit_diff::has_changes)
	(suppression_categorization_visitor::visit_end)
	(redundancy_marking_visitor::visit_begin): Adjust.
	* tests/test-diff-dwarf.cc (main): Adjust.
	* tools/abidiff.cc (main): Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-02-05 12:44:59 +01:00
Dodji Seketeli
32352341c5 Add a method to diff_context to dump a diff tree to error output
For debugging purposes it's very convenient to be able to dump a diff
tree to error output.  This patch just adds that possibility.

	* include/abg-comparison.h (diff_context::error_output_stream):
	Make this function const.
	(diff_context::{do_dump_diff_tree}): Declare new methods.
	* src/abg-comparison.cc (diff_context::error_output_stream): Make
	this function const.
	(diff_context::do_dump_diff_tree): Define new methods.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-01-27 12:49:40 +01:00
Dodji Seketeli
2173563f3c Keep children nodes of class_diff and scope_diff sorted
I realized that some children nodes of class_diff and scope_diff nodes
appear in the order laid out by the hash map that contains them.  This
can be quite random depending on various factors.  Moreover, the
reporting code sorts the children nodes before walking them to
emit reports, so the walking order of the reporting code is (or can
be) different from the natural walking order used, for instance, by
the categorization or redundancy detection code.  This can have weird
side effects, especially for reporting about redundancy where the
walking order matters.

This patch thus sorts the children nodes of the class_diff and
scope_diff nodes and hopefully updates all the code that needs
updating to take that into account.
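
As an illustration, turning an unordered map of children into a sorted vector could look like the sketch below.  diff_sketch, diff_comp_sketch and sort_diff_map are hypothetical names, and sorting by name is an assumption made for the example; the real comparison functors may use other criteria.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a child diff node.
struct diff_sketch
{
  std::string name;
};
typedef std::shared_ptr<diff_sketch> diff_sptr;

// Order the diff nodes by the name of their subject.
struct diff_comp_sketch
{
  bool
  operator()(const diff_sptr& l, const diff_sptr& r) const
  {return l->name < r->name;}
};

// Copy the values of the hash map into a vector and sort it, so that
// every walker sees the children in the same, stable order.
inline std::vector<diff_sptr>
sort_diff_map(const std::unordered_map<std::string, diff_sptr>& m)
{
  std::vector<diff_sptr> sorted;
  sorted.reserve(m.size());
  for (const auto& entry : m)
    sorted.push_back(entry.second);
  std::sort(sorted.begin(), sorted.end(), diff_comp_sketch());
  return sorted;
}
```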

	* include/abg-comparison.h (decl_diff_base, type_diff_base):
	Forward declare these types.
	(diff_sptrs_type, decl_diff_base_sptr, decl_diff_base_sptrs_type)
	(type_diff_base_sptr, type_diff_base_sptrs_type)
	(base_diff_sptrs_type, string_type_diff_base_sptr_map)
	(string_decl_diff_base_sptr_map, string_diff_sptr_map): New
	typedefs.
	(changed_type_or_decl, changed_parm, changed_parms_type)
	(string_changed_type_or_decl_map)
	(unsigned_changed_type_or_decl_map, changed_type_or_decl_vector):
	Remove typedefs.
	(class_diff::changed_base): Make this return a
	base_diff_sptrs_type now.  No more a string_base_diff_sptr_map.
	(class_diff::changed_member_fns): Make this return a
	function_decl_diff_sptrs_type, no more a
	string_changed_member_function_sptr_map.
	(class_diff::changed_types): Make this return a diff_sptrs_type,
	not a string_changed_type_or_decl_map anymore.
	(class_diff::changed_decls): Make this return a diff_sptrs_type,
	not a string_changed_type_or_decl_map anymore.
	* src/abg-comp-filter.cc (has_virtual_mem_fn_change)
	(has_non_virtual_mem_fn_change): Adjust.
	* src/abg-comparison.cc (compute_diff): For the decl_base_sptr and
	type_base_sptr overloads, assert that the resulting diff is
	non-null.
	(class_diff::priv::{sorted_changed_base_,
	sorted_changed_member_types_, sorted_subtype_changed_dm_,
	sorted_changed_dm_, sorted_changed_member_functions_,
	sorted_changed_member_class_tmpls_}): New data members.
	(class_diff::priv::changed_member_types_): Changed the type of
	this from string_changed_type_or_decl_map to string_diff_sptr_map.
	(class_diff::priv::changed_member_functions_): Changed the type of
	this from string_changed_member_function_sptr_map to
	string_function_decl_diff_sptr_map.
	(class_diff::priv::changed_member_class_tmpls_): Changed the type
	of this from string_changed_type_or_decl_map to
	string_diff_sptr_map.
	(class_diff::ensure_lookup_tables_populated): Adjust.  Initialize
	the new sorted members class_diff::priv::{sorted_changed_bases_,
	sorted_subtype_changed_dm_, sorted_changed_dm_,
	sorted_changed_member_functions_, sorted_changed_member_types_}.
	(class_diff::priv::{member_type_has_changed,
	member_class_tmpl_has_changed, count_filtered_bases,
	count_filtered_subtype_changed_dm, count_filtered_changed_mem_fns,
	}): Adjust.
	(class_diff::chain_into_hierarchy): Adjust:  The children nodes of
	class_diff are now laid out in a sorted way.
	(class_diff::{changed_bases, changed_member_fns}): Adjust.
	(base_diff_comp, virtual_member_function_diff_comp): New types.
	(sort_string_base_diff_sptr_map)
	(sort_string_virtual_member_function_diff_sptr_map): New static
	functions.
	(data_member_diff_comp): Renamed var_diff_comp into this.
	(sort_unsigned_data_member_diff_sptr_map): Renamed sort_var_diffs
	into this and adjust.
	(class_diff::report): Do not sort the nodes we are about to emit
	here.  Just use the natural order of the nodes in their parent
	tree as they should now be sorted.
	(scope_diff::priv::{changed_types_, changed_decls_}): Change the
	type of these from string_changed_type_or_decl_map to
	string_diff_sptr_map.
	(scope_diff::priv::{sorted_changed_types_,
	sorted_changed_decls_}): New data members.
	(scope_diff::ensure_lookup_tables_populated): Adjust.  Initialize
	the new scope_diff::priv::sorted_changed_{types_, decls_}.
	(scope_diff::chain_into_hierarchy): Adjust.  The children of
	scope_diff are now sorted.
	(scope_diff::changed_{types, decls}): Return the sorted vectors of
	children nodes.
	(struct changed_type_or_decl_comp): Remove.
	(struct diff_comp): New type.
	(sort_changed_type_or_decl): Remove.
	(sort_string_diff_sptr_map): New static function.
	(scope_diff::report): Adjust.  Do not sort children nodes here
	ourselves before reporting about them.  Rather, use the natural
	topological order of the children as they are now sorted.
	(corpus_diff::priv::sorted_changed_vars_): Renamed
	corpus_diff::priv::changed_vars_ into this to make it more
	explicit that the things it holds are sorted.
	(corpus_diff::changed_variables_sorted): Adjust.
	(corpus_diff::priv::ensure_lookup_tables_populated): Likewise.
	(corpus_diff::priv::apply_filters_and_compute_diff_stats):
	Likewise.
	(corpus_diff::priv::categorize_redundant_changed_sub_nodes):
	Likewise.
	(corpus_diff::priv::clear_redundancy_categorization): Likewise.
	(corpus_diff::priv::maybe_dump_diff_tree): Likewise.
	(corpus_diff::report): Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-01-27 12:49:40 +01:00
Dodji Seketeli
a00ed6bf41 Hand-code the string representation of GElf_Ehdr::e_machine
I was using elfutils/libebl.h to get a string representation of the
machine architecture of the elf file.  It appears elfutils/libebl.h
is an internal header not meant to be used by client code of
elfutils.  So this patch hand-codes the string representation of the
value of the GElf_Ehdr data member and does away with the need for the
elfutils/libebl.h header as well as for libebl.
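
A hand-coded mapping of that kind could look like the sketch below.  The numeric values are the e_machine values of the ELF specification, but the constant names, the function name e_machine_to_string_sketch and the returned strings are all assumptions made for the example; they are not necessarily what the patch emits.

```cpp
#include <cassert>
#include <string>

// e_machine values from the ELF specification, redefined here so that
// the sketch does not depend on <elf.h>.
const unsigned em_386_sketch = 3;
const unsigned em_ppc64_sketch = 21;
const unsigned em_s390_sketch = 22;
const unsigned em_arm_sketch = 40;
const unsigned em_x86_64_sketch = 62;
const unsigned em_aarch64_sketch = 183;

// Return a string for the machine architecture.  The strings are
// illustrative only.
inline std::string
e_machine_to_string_sketch(unsigned e_machine)
{
  switch (e_machine)
    {
    case em_386_sketch:
      return "elf-intel-80386";
    case em_ppc64_sketch:
      return "elf-powerpc-64";
    case em_s390_sketch:
      return "elf-ibm-s390";
    case em_arm_sketch:
      return "elf-arm";
    case em_x86_64_sketch:
      return "elf-amd-x86_64";
    case em_aarch64_sketch:
      return "elf-arm-aarch64";
    default:
      return "elf-unknown-arch";
    }
}
```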

	* configure.ac: Do not check for elfutils/libebl.h and libebl.a
	anymore.
	* src/abg-dwarf-reader.cc: Do not include elfutils/libebl.h
	anymore.
	(e_machine_to_string): Define new static
	function.
	(read_context::load_elf_architecture): Use the new
	e_machine_to_string() function rather than ebl_backend_name() and
	ebl_openbackend().
	* tests/data/test-diff-dwarf/test-23-diff-arch-report-0.txt: Adjust.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-01-26 22:17:39 +01:00
Dodji Seketeli
29bc673dc0 Fix chaining of descendant node of qualified type diff node
While looking at the abidiff report emitted for two versions of the
TBB library, I noticed that some diff nodes were not marked as
redundant as they should be.  As a result, they were being reported as
having "been reported earlier", which seems to be an acceptable cruft
to me especially now that the comparison IR can do proper redundancy
detection and marking.

I tracked that down and it's because the child node of a
qualified_type_diff is just the underlying type diff node, whereas
during reporting, we report about the leaf underlying type diff node,
which can be different from just the underlying type diff node
because the leaf is always non-qualified.

The fix is to make the child node of qualified_type_diff be the leaf
underlying type diff node, so that diff tree walking (for the purpose
of redundancy detection) and reporting are all looking at the same
tree.

	* include/abg-comparison.h
	(qualified_type_diff::leaf_underlying_type_diff): Declare new
	accessor.
	* src/abg-comparison.cc (get_leaf_type): Forward declare this
	static function.
	(qualified_type_diff::priv::leaf_underlying_type_diff): Define new
	data member.
	(qualified_type_diff::leaf_underlying_type_diff): Define this new
	accessor.
	(qualified_type_diff::chain_into_hierarchy): Call
	leaf_underlying_type_diff() here rather than
	underlying_type_diff().
	(qualified_type_diff::report): Use leaf_underlying_type_diff()
	rather than re-computing the diff between the two leaf underlying
	type diff nodes.
	* libtest26-qualified-redundant-node-v{0,1}.so: New binary test
	input files.
	* tests/data/test-diff-filter/test26-qualified-redundant-node-v{0,1}.cc:
	Source code for the binary test inputs above.
	* tests/test-diff-filter.cc (in_out_specs): Add the new test input
	to the vector of test input data over which to run this test
	harness.
	* tests/data/test-diff-filter/test26-qualified-redundant-node-report-{0,1}.txt:
	New test input files.
	* tests/data/Makefile.am: Add the new test input data to the
	source distribution.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-01-26 12:12:57 +01:00
Dodji Seketeli
ddfb37ab17 Recognize cyclic diff tree nodes as being redundant
Okay I need to introduce some vocabulary here.  Suppose we have the
version 1 of a library named library-v1.so whose source code is:

    struct S
    {
     int m0;
     struct S* m2;
    };

    int
    foo(struct S* ptr)
    {
      return ptr->m0;
    }

And now suppose we have a version 2 of that library named
library-v2.so whose source code is modified so that a new data member
is inserted into struct S:

    struct S
    {
     int m0;
     char m1; /* <--- a new data member is inserted here.  */
     struct S* m2;
    };

    int
    foo(struct S* ptr)
    {
      return ptr->m0;
    }

struct S is said to be a cyclic type because it contains a (data)
member whose type refers to struct S itself, namely, the type of the
data member S::m2 is struct S*, which refers to struct S.

So, by analogy, the diff node tree that represents the changes of
struct S is also said to be cyclic, for similar reasons: the diff
node of the change of S::m2 refers to the diff node of the change of
the type of S::m2, namely the diff node of struct S*, which refers to
the diff node for the change of struct S itself.

Now let's talk about redundancy.  When walking the diff node tree of
struct S in a depth-first manner, at some point, we look at the diff
node for the data member S::m2, and we end up looking at the diff node
of its type which is the diff node for struct S*; we keep walking and
eventually we look at the diff node of the change of the underlying type
of struct S, which is the diff node of struct S, and hah! that is a
redundant node because it's the first node that we visited when
visiting the diff node of ...  struct S!  So the diff tree node for
the change of struct S is not only a cyclic node, it's a redundant
diff node as well, and its second occurrence is located at the point
of appearance of data member S::m2.  Hence the wording "cyclic
redundant diff tree node".  There! We have our vocabulary all set now.

This patch enhances the code of the comparison engine so that a cyclic
diff tree node is marked as redundant from the point of its second
occurrence, onward.

First the patch separates the notion of visiting a diff node from the
notion of traversing it.  Now traversing a diff node means visiting it
and visiting its children nodes.  So one can visit a node without
traversing it, but one can not traverse a node without visiting it.

So, when walking diff node trees, we need to avoid ending up in an
infinite loop in the presence of cyclic nodes.  This is why re-traversing
a node that is already being traversed is forbidden by this patch, but
visiting a node that is being visited is allowed.  Before this patch,
the notions of visiting and traversing were conflated in one and were
not very clear; and one couldn't visit a node that was currently being
visited.  As a result, in presence of a cyclic node, its redundant
nature wasn't being recognized, and so the diff tree node was not
being flagged as being redundant.  Diff reports were then cluttered by
redundant references to changes involving cyclic types.
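
The visiting/traversing distinction can be illustrated with the sketch below.  node_sketch, visit_record and traverse_sketch are hypothetical names, and redundancy is recorded per occurrence in a log rather than on the node, which is a simplification made for the example and not the way the comparison engine stores it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a diff tree node.
struct node_sketch
{
  std::string name;
  std::vector<node_sketch*> children;
  bool being_traversed;

  node_sketch(const std::string& n) : name(n), being_traversed(false) {}
};

// One entry per visit: the node name and whether that occurrence is a
// redundant (cyclic) one.
typedef std::pair<std::string, bool> visit_record;

// Traversing a node means visiting it and then traversing its
// children.  A node that is already being traversed is still visited,
// and flagged as redundant, but it is not traversed again.
inline void
traverse_sketch(node_sketch& n, std::vector<visit_record>& log)
{
  const bool is_cyclic_occurrence = n.being_traversed;
  log.push_back(visit_record(n.name, is_cyclic_occurrence));  // the visit
  if (is_cyclic_occurrence)
    return;  // do not loop forever
  n.being_traversed = true;
  for (std::size_t i = 0; i < n.children.size(); ++i)
    traverse_sketch(*n.children[i], log);
  n.being_traversed = false;
}
```

On the struct S example above, the walk visits struct S, then S::m2, then struct S*, then struct S a second time; only that last occurrence is flagged as redundant, and the walk stops there.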

	* include/abg-comparison.h (enum visiting_kind): Rename
	enumerator DO_NOT_MARK_VISITED_NODES_AS_TRAVERSED into
	DO_NOT_MARK_VISITED_NODES_AS_VISITED.
	(diff_context::diff_has_been_visited): Rename
	diff_context::diff_has_been_traversed into this.
	(diff_context::mark_diff_as_visited): Rename
	diff_context::mark_diff_as_traversed into this.
	(diff_context::forget_visited_diffs): Rename
	diff_context::forget_traversed_diffs into this.
	(diff_context::forbid_visiting_a_node_twice): Rename
	diff_context::forbid_traversing_a_node_twice into this.
	(diff_context::visiting_a_node_twice_is_forbidden): Rename
	diff_context::traversing_a_node_twice_is_forbidden into this.
	(diff::is_traversing): Move this from protected to public.
	* src/abg-comparison.cc (diff_context::priv::visited_diff_nodes_):
	Rename diff_context::priv::traversed_diff_nodes_ into this.
	(diff_context::priv::forbid_visiting_a_node_twice_): Rename
	diff_context::priv::forbid_traversing_a_node_twice_ into this.
	(diff_context::priv::priv): Adjust.
	(diff_context::diff_has_been_visited): Rename
	diff_context::diff_has_been_traversed into this.  Adjust.
	(diff_context::mark_diff_as_visited): Rename
	diff_context::mark_diff_as_traversed into this.  Adjust.
	(diff_context::forget_visited_diffs): Rename
	diff_context::forget_traversed_diffs into this.  Adjust.
	(diff_context::forbid_visiting_a_node_twice): Rename
	diff_context::forbid_traversing_a_node_twice into this.
	(diff_context::visiting_a_node_twice_is_forbidden): Rename
	diff_context::traversing_a_node_twice_is_forbidden into this.
	(diff_context::maybe_apply_filters): Adjust.
	(diff::end_traversing): Remove the 'mark_as_traversed' parameter
	of this.  Remove the visited-marking code.
	(diff::traverse): This is the crux of the changes of this patch.
	Avoid traversing a node that is being traversed, but one can visit
	a node being visited.  Also, traversing a node means visiting it
	and visiting its children nodes.
	(diff::is_filtered_out):  Simplify logic for filtering redundant
	code.  Basically all nodes that are redundant are filtered.  All
	the complicated logic that was due when diff nodes were shared is
	not relevant anymore.
	(corpus_diff::priv::categorize_redundant_changed_sub_nodes)
	(propagate_categories, apply_suppressions)
	(diff_node_printer::diff_node_printer, print_diff_tree)
	(categorize_redundant_changed_sub_nodes)
	(clear_redundancy_categorization)
	(clear_redundancy_categorization): Adjust.
	(redundancy_marking_visitor::visit_begin): Adjust.  Also, if the
	current diff node is already being traversed (that's a cyclic
	node) then mark it as redundant.
	* src/abg-comp-filter.cc (apply_filter): Adjust.
	* tests/data/test-diff-filter/test16-report-2.txt: New test input data.
	* tests/data/test-diff-filter/libtest25-cyclic-type-v{0,1}.so: New
	test input binaries.
	* tests/data/test-diff-filter/test25-cyclic-type-v{0,1}.cc: Source
	code for the test input binaries.
	* tests/data/test-diff-filter/test25-cyclic-type-report-0.txt: New
	test input data.
	* tests/data/test-diff-filter/test25-cyclic-type-report-1.txt:
	Likewise.
	* tests/test-diff-filter.cc (in_out_specs): Add the new test
	inputs above to the list of test input data over which to run this
	test harness.
	* tests/data/Makefile.am: Add the new test files above to source
	distribution.
	* tests/data/test-diff-filter/test16-report.txt: Adjust.
	* tests/data/test-diff-filter/test17-0-report.txt: Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-01-24 22:48:40 +01:00
Dodji Seketeli
f3344623d3 Tighten the condition for creating a cloned function from DWARF
So, apparently, the DWARF reader can be too eager to clone functions
whose DIEs have a DW_AT_abstract_origin.  It seems that there are
cases where the second DIE (the one that has the DW_AT_abstract_origin
attribute) has the same linkage name as the first one.  In that
case, no cloning should happen.  And this should fix
https://sourceware.org/bugzilla/show_bug.cgi?id=17861.

	* src/abg-dwarf-reader.cc (build_ir_node): Re-indent.  Also,
	consider that when a DIE C refers to a DIE A via the
	DW_AT_abstract_origin attribute, C represents a clone of A, only if C
	and A have *different* linkage names.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-01-20 12:37:56 +01:00
Dodji Seketeli
af0923b1b0 Fix the output of the array diff report
While working on something else, I realized the report of the array
diff change wasn't referring to the pretty representation of the array
when talking about the changes of array element type; rather it was
just referring to the array name.  I think referring to the pretty
representation of the array is more helpful.  This patch does just
that.

	* src/abg-comparison.cc (array_diff::report): Refer to the pretty
	representation of the array when talking about changes of the
	array element type.
	* src/abg-ir.cc (equals): In the overload for array_type, use the
	equality operator that knows how to handle null pointers to
	element type.  This avoids crashes when the pointer to element
	type is null.
	* tests/data/test-diff-dwarf/test10-report.txt: Adjust.
	* tests/data/test-diff-filter/test24-compatible-vars-report-1.txt:
	Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-01-19 15:00:44 +01:00
Dodji Seketeli
63c81f028d Do not install the generated documentation by default
* doc/manuals/Makefile.am: Do not install the generated
	documentation by default

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-01-14 18:46:06 +01:00
Dodji Seketeli
3ea7c4682e Make sure to install html docs & gziped info on make install
* doc/manuals/Makefile.am: Make sure to install html
	docs & gzipped info on make install.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-01-14 14:36:20 +01:00
Dodji Seketeli
527d7ab218 Do not install the abinilint program
This program is meant to be used by libabigail developers to debug its
ini file parsing facilities.  So there is no need to install it.

	* tools/Makefile.am: Add abinilint to the noinst_PROGRAMS primary.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-01-14 13:20:46 +01:00
Dodji Seketeli
3d969dbe05 Small grammar fix in a manpage title
* doc/manuals/conf.py: Fix the grammar of the title of the abidiff
	man page.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-01-13 18:33:38 +01:00
Dodji Seketeli
5059907727 Generate texinfo documentation properly
There was a texinfo documentation that was being generated up to now,
but I haven't really looked at it.  Now that I have handled man pages
generation, I thought I'd give the texinfo generation a closer look
and ensure it's in a correct shape.  This patch cleans the generation
process up, changes the documentation markup so that it looks OK in
the generated texinfo file and handles the install of the generated
texinfo.

	* doc/manuals/Makefile.am: Generate texinfo doc, install it and
	uninstall it.
	* doc/manuals/libabigail-tools.rst: Do not use the :doc: syntax to
	refer to documents because it doesn't seem to work with sphinx
	right now.  Rather, use a table of contents.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-01-13 18:33:23 +01:00