mirror of
git://sourceware.org/git/libabigail.git
synced 2024-12-16 15:04:46 +00:00
18569fc154
2286 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Dodji Seketeli
|
d1a8eae8ed |
ir: Avoid infinite loop during type canonicalization
While looking at something else, I noticed an infinite loop during type canonicalization, especially when cancelling canonical type propagation on some types. Fixed thus. This helps address https://bugzilla.redhat.com/show_bug.cgi?id=1951501 * src/abg-ir-priv.h (environment::priv::collect_types_that_depends_on): Don't try to collect a type that has already been collected. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
4e029df894 |
writer: escape enum linkage name in abixml
While looking at something else, I stumbled across this bug where the linkage names of enums are not escaped in abixml. So "forbidden" characters like '<' can sneak in. Fixed thus. This helps address https://bugzilla.redhat.com/show_bug.cgi?id=1951501 * src/abg-writer.cc (write_enum_type_decl): Escape linkage name. |
||
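The escaping described in the commit above can be illustrated with a minimal sketch. It is not libabigail's writer code: `escape_xml_attribute` is a hypothetical helper that only shows why a character such as '<' in a linkage name has to be replaced before it is written into an XML attribute.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not libabigail's API): replace the characters that
// are not allowed verbatim inside an XML attribute value.
std::string escape_xml_attribute(const std::string& in) {
  std::string out;
  out.reserve(in.size());
  for (char c : in) {
    switch (c) {
      case '<':  out += "&lt;";   break;
      case '>':  out += "&gt;";   break;
      case '&':  out += "&amp;";  break;
      case '"':  out += "&quot;"; break;
      case '\'': out += "&apos;"; break;
      default:   out += c;        break;
    }
  }
  return out;
}
```

A linkage name that contains template brackets goes through unchanged except for the reserved characters, so the resulting abixml stays well-formed.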
Dodji Seketeli
|
022faf705f |
RHBZ-1944096 - assertion failure during self comparison of systemd
When reading the abixml representing an enumerator whose value is exactly either LLONG_MIN or LLONG_MAX, build_enum_type_decl fails because we wrongly think that an underflow or overflow happened while using strtoll. This patch fixes the condition used to detect {under,over}flow when using strtoll. * src/abg-reader.cc (build_enum_type_decl): When strtoll detects an underflow or overflow, it sets errno to ERANGE. So take that into account. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
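The strtoll condition discussed in the commit above can be sketched as follows. This is an illustration rather than the code in build_enum_type_decl: `parse_long_long` is a hypothetical name. The point is that a return value of LLONG_MIN or LLONG_MAX alone does not indicate a range error; only errno set to ERANGE does.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical helper: parse a base-10 long long, reporting failure only on
// malformed input or on a real underflow/overflow.
bool parse_long_long(const std::string& text, long long& value) {
  errno = 0;  // strtoll only sets errno on error, so clear it first
  char* end = nullptr;
  const long long parsed = std::strtoll(text.c_str(), &end, 10);
  if (end == text.c_str() || *end != '\0')
    return false;  // no digits at all, or trailing garbage
  if (errno == ERANGE)
    return false;  // the value did not fit in a long long
  value = parsed;  // LLONG_MIN and LLONG_MAX are valid results here
  return true;
}
```

With this check, the text form of LLONG_MAX parses successfully, while the same digits with one more appended are rejected.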
Dodji Seketeli
|
190350a35f |
Bug 27985 - abidiff: bad array types in report
Reporting the change in array type exhibits a glitch in the type name. As the bug report says: The resulting abidiff output contains: type of 'int numbers[2]' changed: type name changed from 'void[2]' to 'void[3]' array type size changed from 64 to 96 array type subrange 1 changed length from 2 to 3 instead of type of 'int numbers[2]' changed: type name changed from 'int[2]' to 'int[3]' array type size changed from 64 to 96 array type subrange 1 changed length from 2 to 3 The problem comes from array_type_def::get_qualified_name() where we fail to generate a "new" qualified name once the type of the array is canonicalized. Fixed thus. * src/abg-ir.cc (array_type_def::get_qualified_name): Use the cache for temporary qualified names when the type is not yet canonicalized. That way, the cache for (non-temporary) qualified names is used only for canonicalized types. * tests/data/test-abidiff/test-PR27985-report.txt: Reference output for the new test. * tests/data/test-abidiff/test-PR27985-v{0,1}.c: Source code for the new test binary inputs. * tests/data/test-abidiff/test-PR27985-v{0,1}.o: New test binary inputs. * tests/data/test-abidiff/test-PR27985-v{0,1}.o.abi: New test abixml input. * tests/data/Makefile.am: Add the new test materials above to source distribution. * tests/test-abidiff.cc (specs): Add the tests above to the harness. * tests/data/test-diff-pkg/nss-3.23.0-1.0.fc23.x86_64-report-0.txt: Adjust. * tests/data/test-abidiff-exit/qualifier-typedef-array-report-1.txt: Adjust. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Giuliano Procida
|
0907d84aef |
abg-writer: faster referenced type emission tests
When determining whether a referenced type should be emitted, various tests are done: - has the type been emitted already? hash table lookup - does the translation unit match? string comparison - is this the last translation unit? read bool variable The translation unit tests were added in recent commits and followed the hash table lookups. This resulted in a performance regression affecting Android continuous integration tests. The lookups require a hash calculation and an equality check if the hash is present. The equality checks are expensive deep equalities rather than pointer comparisons. This change reorders the tests so that the lookups happen last. This speeds up abidw by more than a factor of 10 for one Android library. * src/abg-writer.cc (write_translation_unit): Reorder referenced type emission tests for efficiency. Consolidate related comments. Signed-off-by: Giuliano Procida <gprocida@google.com> Reviewed-by: Matthias Maennich <maennich@google.com> Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
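The reordering described in the commit above amounts to running the cheap tests before the expensive lookup. The sketch below uses hypothetical types (`TypeRef`, `EmittedSet`) and assumes the rule is "emit if not yet emitted and either the translation unit matches or this is the last one"; it does not reproduce write_translation_unit. A counter on the set makes the saved lookups visible.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical stand-ins for a referenced type and the set of emitted types.
struct TypeRef {
  std::string id;
  std::string tu_path;  // translation unit the type belongs to
};

struct EmittedSet {
  std::unordered_set<std::string> ids;
  mutable std::size_t lookups = 0;  // counts the (expensive) lookups
  bool contains(const std::string& id) const {
    ++lookups;
    return ids.count(id) != 0;
  }
};

// Cheapest tests first: a bool read, then a string comparison, and only
// then the hash table lookup.
bool should_emit(const TypeRef& type, const std::string& current_tu,
                 bool is_last_tu, const EmittedSet& emitted) {
  if (!is_last_tu && type.tu_path != current_tu)
    return false;
  return !emitted.contains(type.id);
}
```

When the translation unit does not match and it is not the last one, the function returns without touching the set at all.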
Dodji Seketeli
|
ca08bae742 |
RHBZ 1925886 - Compare anonymous types without qualified names
An anonymous struct/union is, by definition, an entity that is not named (unless a naming typedef is provided for it). It turns out that in C++ binaries, there are anonymous types that are logically equivalent (as far as ABI is concerned) because they have the same members and layout, but turn out to be evaluated as being different because they are defined in different name spaces. And because they are not named, showing them as being different just because of their name space doesn't bring anything but spurious error reporting. Consider the DWARF representing this: struct S { union { int a; int b; } member; }; where the 'member' is of type S::<anonymous-union>. Probably due to LTO, we see some DWARF that represents the type of 'member' as just <anonymous-union>, in some translation units. I could not generate that DWARF from a small test case, myself. But it comes from the binary 'usr/bin/lto-dump', from the https://bugzilla.redhat.com/show_bug.cgi?id=1925886 problem report. So in that case, we want the S::<anonymous-union> to compare equal to the <anonymous-union>, otherwise, this produces spurious type changes, especially when doing self comparison. This is what this patch does. * include/abg-fwd.h (is_anonymous_type): Constify this function. * src/abg-ir.cc (equals): In the overload for decl_base, do not take scope of anonymous types into account. In the overload for array_type_def do not peel off typedefs. This is not directly related to anonymous types, but it makes comparison more robust against naming typedefs used for anonymous types in array elements. (get_type_name): Do not take into account the scope of anonymous types when building internal representation of types. Note that the internal representation is what is used for canonicalization. This means that all anonymous types are compared against each other during type canonicalization. * src/abg-reader.cc (build_class_decl): Do not try to re-use anonymous types, just like we already do for DWARF. 
* tests/data/test-annotate/test17-pr19027.so.abi: Adjust. * tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise. * tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise. * tests/data/test-diff-filter/test31-pr18535-libstdc++-report-0.txt: Likewise. * tests/data/test-diff-filter/test31-pr18535-libstdc++-report-1.txt: Likewise. * tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise. * tests/data/test-read-dwarf/test-libaaudio.so.abi: Likewise. * tests/data/test-read-dwarf/test-libandroid.so.abi: Likewise. * tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Likewise. * tests/data/test-read-dwarf/test11-pr18828.so.abi: Likewise. * tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise. * tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise. * tests/data/test-read-dwarf/test17-pr19027.so.abi: Likewise. * tests/data/test-read-dwarf/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise. * tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise. * tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi: Likewise. * tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Likewise. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
dd0861c9a8 |
Bug 27236 - Don't forget to emit some referenced types
Since we arranged to only emit referenced types in translation units where they belong, it appears that in some cases we forget to emit some referenced types. This is because some referenced types might belong to a translation unit that is *already* emitted by the time we detect that a type is referenced. To fix this correctly, we should probably have a pass that walks the corpus to detect referenced types, so that we have their set even before we start emitting translation units. But for now, the patch just detects when we are emitting the last translation unit. In that case all the non-emitted referenced types are emitted. It doesn't seem to be an issue if those don't belong to that translation unit, compared to their original (from the DWARF) type. * include/abg-writer.h (write_translation_unit): Add a new parameter that says if we are emitting the last TU. * src/abg-writer.cc (write_translation_unit::{type_is_emitted, decl_only_type_is_emitted}): Constify these methods. (write_context::has_non_emitted_referenced_types): Define new member function using the const methods above. (write_translation_unit): When emitting the last TU, emit all the referenced types. (write_corpus): Set signal when emitting the last translation unit. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
cc2574121f |
Bug 27236 - Allow updating classes from abixml
In some rare cases, a class can be defined piece-wise in the abixml. build_class_decl currently prevents that from happening, leading to some spurious self comparison errors. Fixed thus. * src/abg-reader.cc (build_class_decl): Keep going when the class has already been built. The rest of the code knows how to add new stuff. * tests/data/test-abidiff/test-PR18791-report0.txt: Adjust. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
39ba859603 |
Bug 27236 - Fix the canonical type propagation optimization
While working on another bug, it turned out the initial fix for the bug https://sourceware.org/bugzilla/show_bug.cgi?id=27236 was just papering over the real issue. I think the real issue is that the "canonical type propagation" optimization was being done even in cases where it shouldn't have been done. This patch recognizes the limits of that optimization and avoids performing it when we are off limits. So here is what that optimization is. The text below is also present in the comments in the source code. I am putting it here to explain the context. During the canonicalization of a type T (which doesn't yet have a canonical type), T is compared structurally (member-wise) against a type C which already has a canonical type. The comparison expression is C == T. During that structural comparison, if a subtype of C (which also already has a canonical type) is structurally compared to a subtype of T (which doesn't yet have a canonical type) and if they are equal, then we can deduce that the canonical type of the subtype of C is the canonical type of the subtype of T. Thus, we can canonicalize the sub-type of T during the canonicalization of T itself. That canonicalization of the sub-type of T is what we call "propagating the canonical type of the sub-type of C onto the sub-type of T". It's also called "on-the-fly canonicalization". It's on the fly because it happens during a comparison -- which itself happens during the canonicalization of T. So this is the general description of the "canonical type propagation optimization". Now we must recognize the limits of that optimization. Said otherwise, there is a case when a type is *NOT* eligible for this canonical type propagation optimization. The reason why a type is deemed NON-eligible for the canonical type propagation optimization is that it "depends" on a recursively present type. Let me explain. Suppose we have a type T that has sub-types named ST0 and ST1. Suppose ST1 itself has a sub-type that is T itself. 
In this case, we say that T is a recursive type, because it has T (itself) as one of its sub-types:

    T
    +-- ST0
    |
    +-- ST1
    |    +
    |    |
    |    +-- T
    |
    +-- ST2

ST1 is said to "depend" on T because it has T as a sub-type. But because T is recursive, then ST1 is said to depend on a recursive type. Notice however that ST0 does not depend on any recursive type. Now suppose we are comparing T to a type T' that has the same structure with sub-types ST0', ST1' and ST2'. During the comparison of ST1 against ST1', their sub-type T is compared against T'. Because T (resp. T') is a recursive type that is already being compared, the comparison of T against T' (as sub-types of ST1 and ST1') returns true, meaning they are considered equal. This is done so that we don't enter an infinite recursion. That means ST1 is also deemed equal to ST1'. If we are in the course of the canonicalization of T' and thus if T (as well as all of its sub-types) is already canonicalized, then the canonical type propagation optimization will make us propagate the canonical type of ST1 onto ST1'. So the canonical type of ST1' will be equal to the canonical type of ST1 as a result of that optimization. But then, later down the road, when ST2 is compared against ST2', let's suppose that we find out that they are different. Meaning that ST2 != ST2'. This means that T != T', i.e., the canonicalization of T' failed for now. But most importantly, it means that the propagation of the canonical type of ST1 to ST1' must now be invalidated. Meaning, ST1' must now be considered as not having any canonical type. In other words, during type canonicalization, if ST1' depends on a recursive type T', its propagated canonical type must be invalidated (set to nullptr) if T' appears to be different from T, that is, if the canonicalization of T' temporarily failed. This means that any sub-type that depends on recursive types and that has been the target of the canonical type propagation optimization must be tracked. 
If the dependant recursive type fails its canonicalization, then the sub-type being compared must have its propagated canonical type cleared. In other words, its propagated canonical type must be cancelled. This concept of cancelling the propagated canonical type when needed is what this patch introduces. New data members have been introduced to the environment::priv private structure. Those are to keep track of the stack of sub-types being compared so that we can detect if a candidate to the canonical type propagation optimization depends on a recursive type. There is also a data structure in there to track the targets of the canonical type propagation optimization that "might" need to see their propagated canonical types be cancelled. Then new functions have been introduced to detect when a type depends on a recursive type, to cancel or confirm propagated canonical types etc. In abg-ir.cc, the RETURN* macros used in the equals() overloads have been factorized using the newly introduced function templates return_comparison_result(). This now contains the book keeping that was previously done (in the RETURN* macros) to detect recursive cycles in the comparison, as well as triggering the canonical type propagation. This is also where the logic of properly limiting the optimization is implemented now. * include/abg-ir.h (pointer_set): This typedef is now for an unordered_set<uintptr_t> rather than an unordered_set<size_t>. (environment::priv_): Make this public so that code in free form function from abg-ir.cc can access it. * src/abg-ir-priv.h (struct type_base::priv): Move this private structure here, from abg-ir.cc. (type_base::priv::{depends_on_recursive_type_, canonical_type_propagated_}): Added these two new data members. (type_base::priv::priv): Initialize the two new data members. 
(type_base::priv::{depends_on_recursive_type, set_depends_on_recursive_type, set_does_not_depend_on_recursive_type, canonical_type_propagated, set_canonical_type_propagated, clear_propagated_canonical_type}): Define new member functions. (struct environment::priv): Move this struct here, from abg-ir.cc. (environment::priv::{types_with_non_confirmed_propagated_ct_, left_type_comp_operands_, right_type_comp_operands_}): New data members. (environment::priv::{mark_dependant_types, mark_dependant_types_compared_until, confirm_ct_propagation, collect_types_that_depends_on, cancel_ct_propagation, remove_from_types_with_non_confirmed_propagated_ct}): New member functions. * src/abg-ir.cc (struct environment::priv, struct) (type_base::priv, struct class_or_union::priv): Move these struct to include/abg-ir-priv.h. (push_composite_type_comparison_operands) (pop_composite_type_comparison_operands) (mark_dependant_types_compared_until) (maybe_cancel_propagated_canonical_type): Define new functions. (notify_equality_failed, mark_types_as_being_compared): Re-indent. (is_comparison_cycle_detected, return_comparison_result): Define new function templates. (RETURN_TRUE_IF_COMPARISON_CYCLE_DETECTED): Define new macro. (equals(const function_type& l, const function_type& r)): Redefine the RETURN macro using the new return_comparison_result function template. Use the new RETURN_TRUE_IF_COMPARISON_CYCLE_DETECTED and mark_types_as_being_compared functions. (equals(const class_or_union& l, const class_or_union&, change_kind*)): Likewise. (equals(const class_decl& l, const class_decl&, change_kind*)): Likewise. Because this uses another equal() function to compare the class_or_union part the type, ensure that no canonical type propagation occurs at that point. (types_are_being_compared): Remove as it's not used anymore. (maybe_propagate_canonical_type): Use the new environment::priv::propagate_ct() function here. 
(method_matches_at_least_one_in_vector): Ensure the right-hand-side operand of the equality stays on the right. This is important because the equals() functions expect that. * src/abg-reader.cc (build_type): Ensure all types are canonicalized. * tests/data/test-diff-dwarf/PR25058-liblttng-ctl-report-1.txt: Adjust. * tests/data/test-diff-pkg/nss-3.23.0-1.0.fc23.x86_64-report-0.txt: Likewise. * tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-2.txt: Likewise. * tests/data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-0.txt: Likewise. * tests/data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-1.txt: Likewise. * tests/data/test-read-dwarf/test-libaaudio.so.abi: Likewise. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
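The hazard described in the commit above (a sub-type is deemed equal only because a recursive type was assumed equal, and that result must be cancelled if the enclosing comparison later fails) can be shown with a deliberately simplified sketch. Everything here is hypothetical: `Type`, `Comparison` and their members are illustration names, not libabigail's data structures, and the bookkeeping is much coarser than what the patch implements.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IR type node.
struct Type {
  std::string name;
  std::vector<const Type*> subtypes;
};

using TypePair = std::pair<const Type*, const Type*>;

struct Comparison {
  std::set<TypePair> in_progress;     // pairs currently being compared
  std::vector<TypePair> provisional;  // equal only under a cycle assumption
  std::vector<TypePair> cancelled;    // provisional results later invalidated

  bool equals(const Type& l, const Type& r, bool& used_cycle) {
    const TypePair key(&l, &r);
    if (in_progress.count(key)) {
      used_cycle = true;  // recursive pair: assume equal to stop recursion
      return true;
    }
    if (l.name != r.name || l.subtypes.size() != r.subtypes.size())
      return false;
    in_progress.insert(key);
    bool result = true;
    bool cycle_below = false;
    for (std::size_t i = 0; result && i < l.subtypes.size(); ++i)
      result = equals(*l.subtypes[i], *r.subtypes[i], cycle_below);
    in_progress.erase(key);
    if (result && cycle_below)
      provisional.push_back(key);  // this "true" depends on the assumption
    used_cycle = used_cycle || cycle_below;
    return result;
  }

  // Top-level entry point: confirm or cancel the provisional results.
  bool compare(const Type& l, const Type& r) {
    bool used_cycle = false;
    const bool result = equals(l, r, used_cycle);
    if (!result)
      cancelled.insert(cancelled.end(), provisional.begin(), provisional.end());
    provisional.clear();
    return result;
  }
};
```

With T = {ST0, ST1 -> T, ST2} compared against a T' whose ST2' differs, ST1 and ST1' are first deemed equal and then end up in `cancelled`, which is the situation where a propagated canonical type would have to be cleared.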
Dodji Seketeli
|
46b1ab08b0 |
Bug 27995 - Self comparison error from abixml file
There are several self comparison issues uncovered by comparing the file test-PR27995.abi (provided in the bug report) against itself. This patch addresses them all, as well as the regressions induced on some of the test suite, and updates the other reference test suite outputs that need it. In the equals overload for decl_base, we compare the non-internal versions of qualified decl names. For var_decls of anonymous class or union types, the non-internal version is the flat-representation of the type. Thus a benign change in a data member name of the anonymous type might cause the equals function to wrongly consider the var_decls to be different. The internal version of the qualified decl name should return a name that is stable for types, irrespective of these benign variations. The patch thus makes the equals overload for decl_base compare internal versions of qualified decl names instead. The patch ensures that enum_type_decl::get_pretty_representation returns an internal pretty representation that is "stable" for anonymous types. Basically, all anonymous enums will have the same name, which looks like "__anonymous_enum__". This is to ensure two things: first, that all anonymous enums are compared against each other during type canonicalization, ensuring that when two anonymous enums are canonically different, it really is because of changes in their enumerators or basic type, not because of anything having to do with their artificial names. Second, that in the equals overload for decl_base, their internal qualified names always compare equal. This nullifies the risk of having anonymous types compare different because of their (non existent) name. This is because libabigail's dwarf reader assigns artificial unique names to anonymous types, so we don't want to use these during actual type comparison. 
We do something similar for class_decl::get_pretty_representation and union_decl::get_pretty_representation where the pretty internal representation for class/union decl would now be __anonymous_{struct,union}__. The patch scouts the uses of get_pretty_representation() to make sure to avoid using the internal-form of the pretty representations when it's not warranted. It also updates the doxygen comments of the overloads of that function. In the abixml reader, we were wrongly canonicalizing array types early, even before they were fully created. This was leading to spurious type changes down the road. The patch also fixes the caching of the name of function types by making it consistent with caching of the names of the other types of the system. The idea is that we don't cache the name of a function type until it's canonicalized. This is because the type might be edited during its pre-canonicalization life time; and that editing might change its name. However once the type is canonicalized, it becomes immutable. At that point we can cache its name, for performance purposes. Note that we need to do that both for the "internal version" of the type name (used for canonicalization purposes) and the "non-internal version" one, which is used for other purposes. This caching scheme wasn't respected for function types, so we were caching a potentially wrong name for the type after its canonicalization. Last but not least, there is a problem that makes canonical type comparison different from structural type comparison. Let's consider these two declarations: typedef int FirstInt; typedef int SecondInt; Now, consider these two pointer types: FirstInt* and SecondInt*; These two pointer types are canonically different because they have different type names. This is because during type canonicalization, types with the same "pretty representation" are compared against each other. 
So types with different type names will certainly have different pretty representations and won't be compared; they are thus going to have different canonical types. However, FirstInt* and SecondInt* do compare equal, structurally, because the equals overload for pointer_type_def compares the pointed-to types of pointers by peeling off typedefs. So, here, as both pointed-to types are 'int' when the typedefs are peeled off, the two pointers structurally compare equal. This discrepancy between structural and canonical equality introduces subtle and spurious type changes depending on the order in which types are canonicalized. For instance: struct {FirstInt* m0;}; /* First type. */ struct {SecondInt* m0;}; /* Second type. */ If FirstInt* and SecondInt* are canonicalized before their containing anonymous types, then the two anonymous types will compare different (because FirstInt* and SecondInt* compare different) and have different canonical types. If, however, the anonymous types are canonicalized before FirstInt* and SecondInt*, then they will compare equal because FirstInt* and SecondInt* are structurally equivalent. FirstInt* and SecondInt* will be canonicalized later and have different canonical types (because they have different type names) despite being structurally equivalent. The change in the order of canonicalization can happen when canonicalizing types from a corpus coming from DWARF as opposed to canonicalizing types from a corpus coming from abixml. The patch fixes this discrepancy by not peeling off typedefs from the pointed-to types when comparing pointers. Note that this makes us regress on bug https://sourceware.org/bugzilla/show_bug.cgi?id=27236, where the typedef peeling was introduced. In hindsight, introducing that typedef peeling was a mistake. I'll try to address that bug again in a subsequent patch. * doc/manuals/abidiff.rst: Add documentation for the --debug option. 
* src/abg-ir.cc (equals): In the overload for decl_base consider the internal version of qualified decl name. In the overload for pointer_type_def do not peel typedefs off from the compared pointed-to types. In the overload for typedef_decl compare the typedef as a decl as well. In the overload for var_decl, compare variables that have the same ELF symbols without taking into account their qualified name, rather than their name. Stop comparing data member without considering their names. In the overload for class_or_union, when a decl-only class that is ODR-relevant is compared against another type, assume that equality if names are equal. This is useful in environments where some TUs are ODR-relevant and others aren't. (*::get_pretty_representation): Update doxygen comments. (enum_type_decl::get_pretty_representation): Return an internal pretty representation that is stable across all anonymous enums. (var_decl::get_anon_dm_reliable_name): Use the non-internal pretty representation for anonymous data members. (function_type::priv::temp_internal_cached_name_): New data member. (function_type::get_cached_name): Cache the internal name after the function type is canonicalized. Make sure internal name and non-internal name are cached separately. (class_or_union::find_anonymous_data_member): Look for the anonymous data member by looking at its non-internal name. ({class, union}_decl::get_pretty_representation): Use something like "class __anonymous_{union,struct}__" for all anonymous classes, so that they can all be compared against each other during type canonicalization. (type_has_sub_type_changes): Use non-internal pretty representation. (hash_type_or_decl, function_decl_is_less_than:): Use internal pretty representation for comparison here. * src/abg-reader.cc (read_context::maybe_canonicalize_type): Don't early canonicalize array types. * src/abg-writer.cc (annotate): Use non-internal pretty representation. 
* tests/data/test-diff-filter/test-PR27995-report-0.txt: New reference report. * tests/data/test-diff-filter/test-PR27995.abi: New test input abixml file. * tests/data/Makefile.am: Add test-PR27995.abi, test-PR27995-report-0.txt to the source distribution. * tests/data/test-annotate/libtest23.so.abi: Adjust. * tests/data/test-diff-dwarf/test6-report.txt: Adjust. * tests/data/test-diff-filter/test31-pr18535-libstdc++-report-0.txt: Adjust. * tests/data/test-diff-filter/test31-pr18535-libstdc++-report-1.txt: Adjust. * tests/data/test-diff-filter/test41-report-0.txt: Adjust. * tests/data/test-diff-filter/test43-decl-only-def-change-leaf-report-0.txt: Adjust. * tests/data/test-diff-filter/test8-report.txt: Adjust. * tests/data/test-diff-pkg/libICE-1.0.6-1.el6.x86_64.rpm--libICE-1.0.9-2.el7.x86_64.rpm-report-0.txt: Adjust. * tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-0.txt: Adjust. * tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-2.txt: Adjust. * tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-3.txt: Adjust. * tests/data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-0.txt: Adjust. * tests/data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-1.txt: Adjust. * tests/data/test-diff-suppr/test39-opaque-type-report-0.txt: Adjust. * tests/data/test-read-dwarf/PR22015-libboost_iostreams.so.abi: Adjust. * tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Adjust. * tests/data/test-read-dwarf/libtest23.so.abi: Adjust. * tests/data/test-read-dwarf/test-libandroid.so.abi: Adjust. * tests/data/test-read-dwarf/test11-pr18828.so.abi: Adjust. * tests/data/test-read-dwarf/test12-pr18844.so.abi: Adjust. * tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Adjust. * tests/test-diff-filter.cc (in_out_specs): Add the test-PR27995.abi to the test harness. 
* tools/abidiff.cc (options::do_debug): New data member. (options::options): Initialize it. (parse_command_line): Parse --debug. (main): Activate self comparison debug if the user provided --debug. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
2d276b67ed |
ir: Tighten type comparison optimization for Linux kernel binaries
types_defined_same_linux_kernel_corpus_public() performs an optimization while comparing two types in the context of the Linux kernel. If two types of the same kind and name are defined in the same corpus and in the same file, then they ought to be equal. For two anonymous classes that have naming typedefs, the function forgets to ensure that the naming typedefs have the same name. I have no binary that exhibits the potential issue, but I stumbled upon the problem while looking at something else that uncovered the problem. This change doesn't impact any of the binaries of the regression suite at the moment, though. Fixed thus. * src/abg-ir.cc (types_defined_same_linux_kernel_corpus_public): Ensure that anonymous classes with naming typedefs have identical typedef names. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
d71518dbf0 |
ir: Tighten the test for anonymous data member
In is_anonymous_data_member(), we only test that the name of the data member is empty; we forget to test that decl_base::get_is_anonymous() is true. This might make us wrongly think that a data member is anonymous in cases like in the equals() function for var_decl, where we temporarily set the name of the compared var_decl to "" before invoking the decl_base::operator==. We do this to perform the comparison by not taking into account the name of the variable. This hasn't yet happened on the binaries of the regression test suite, but it's definitely wrong so I am fixing it here. * src/abg-ir.cc: (is_anonymous_data_member): Consider decl_base::get_is_anonymous as well. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
8926b2f3d1 |
ir: Improve the debugging facilities
While looking at something else, I stumbled across some minor issues in the debugging facilities I use to track self comparison problems. I added a missing ABG_RETURN macro in the stack of equals() functions to better detect when there is a change, under the debugger. I also fixed get_debug_representation() to properly display the class/enum name (as expected) rather than their pretty representation. * src/abg-ir.cc (maybe_compare_as_member_decls): Add a missing ABG_RETURN (get_debug_representation): Display the name of class and enums, not their pretty representation. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Giuliano Procida
|
cfd81dec10 |
PR28060 - Invalid offset for bitfields
Bitfield and other member offsets can be specified in DWARF using: - DW_AT_data_bit_offset, or - DW_AT_data_member_location and optionally DW_AT_bit_offset. The code would only use the value DW_AT_data_member_location if there was no DW_AT_bit_offset. This commit fixes this and adjusts documentation and affected tests. * src/abg-dwarf-reader.cc (read_and_convert_DW_at_bit_offset): Update documentation. (die_member_offset): Treat DW_AT_bit_offset as an optional adjustment to DW_AT_data_member_location. * tests/data/test-annotate/test13-pr18894.so.abi: Update. * tests/data/test-annotate/test15-pr18892.so.abi: Update. * tests/data/test-annotate/test17-pr19027.so.abi: Update. * tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi: Update. * tests/data/test-annotate/test21-pr19092.so.abi: Update. * tests/data/test-diff-dwarf-abixml/PR25409-librte_bus_dpaa.so.20.0.abi: Regenerate. * tests/data/test-diff-pkg/libcdio-0.94-1.fc26.x86_64--libcdio-0.94-2.fc26.x86_64-report.1.txt: Report now empty. * tests/data/test-read-dwarf/PR25007-sdhci.ko.abi: Update. * tests/data/test-read-dwarf/PR25042-libgdbm-clang-dwarf5.so.6.0.0.abi: Update. * tests/data/test-read-dwarf/test13-pr18894.so.abi: Update. * tests/data/test-read-dwarf/test15-pr18892.so.abi: Update. * tests/data/test-read-dwarf/test17-pr19027.so.abi: Update. * tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi: Update. * tests/data/test-read-dwarf/test21-pr19092.so.abi: Update. * tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi: Update. Signed-off-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
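The offset rule described in the PR28060 commit above can be sketched as follows. This is a minimal illustration and not libabigail's code: the `member_attrs` struct and both function names are hypothetical, and the little-endian conversion assumes the DWARF convention in which DW_AT_bit_offset counts bits from the most significant bit of the storage unit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bundle of the attributes read from one data member DIE.
struct member_attrs
{
  bool     has_data_bit_offset;      // DW_AT_data_bit_offset present?
  uint64_t data_bit_offset;          // in bits, from the start of the struct
  bool     has_data_member_location; // DW_AT_data_member_location present?
  uint64_t data_member_location;     // in bytes
  bool     has_bit_offset;           // DW_AT_bit_offset present?
  uint64_t bit_offset;               // in bits, counted from the MSB of the storage unit
  uint64_t bit_size;                 // DW_AT_bit_size
  uint64_t storage_byte_size;        // DW_AT_byte_size of the storage unit
};

// Convert DW_AT_bit_offset into an offset counted from the start of the
// storage unit. Assumption: on big-endian targets the value can be used
// as is, on little-endian targets it has to be mirrored.
uint64_t convert_bit_offset(uint64_t bit_offset, uint64_t bit_size,
                            uint64_t storage_byte_size, bool is_big_endian)
{
  if (is_big_endian)
    return bit_offset;
  const uint64_t storage_bits = storage_byte_size * 8;
  assert(bit_offset + bit_size <= storage_bits);
  return storage_bits - bit_offset - bit_size;
}

// Compute the member offset in bits. DW_AT_bit_offset is treated as an
// optional adjustment added to DW_AT_data_member_location, never as a
// replacement for it. Returns false if no offset attribute is present.
bool member_offset_in_bits(const member_attrs& a, bool is_big_endian,
                           uint64_t& offset)
{
  if (a.has_data_bit_offset)
    {
      offset = a.data_bit_offset;
      return true;
    }
  if (!a.has_data_member_location)
    return false;
  offset = a.data_member_location * 8;
  if (a.has_bit_offset)
    offset += convert_bit_offset(a.bit_offset, a.bit_size,
                                 a.storage_byte_size, is_big_endian);
  return true;
}
```

The point of the sketch is the last function: the byte location always contributes, and the bit offset only refines it.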
Giuliano Procida
|
e6fd6b8a57 |
abg-ir.h: add declaration of operator<< for elf_symbol::visibility
There is a formatted output operator for elf_symbol::visibility in abg-ir.cc. However, it had no visible declaration and was not usable by library users. This commit adds the declaration. * include/abg-ir.h (operator<<(elf_symbol::visibility): Add declaration. Signed-off-by: Giuliano Procida <gprocida@google.com> |
||
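To illustrate why the declaration matters in the commit above, here is a small sketch. The enum, its enumerator names and the printed strings are hypothetical stand-ins, not the real elf_symbol::visibility definition. The operator is defined once, and callers can only use it if a declaration is visible to them, which is what a header provides.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

namespace sketch
{
// Hypothetical stand-in for an enum such as elf_symbol::visibility.
enum class visibility
{
  default_visibility,
  protected_visibility,
  hidden_visibility,
  internal_visibility
};

// This line is what would go in the public header: without it, code in
// other translation units cannot call the operator defined below.
std::ostream& operator<<(std::ostream& o, visibility v);

// This definition is what would stay in the .cc file.
std::ostream& operator<<(std::ostream& o, visibility v)
{
  switch (v)
    {
    case visibility::default_visibility:   o << "DEFAULT";   break;
    case visibility::protected_visibility: o << "PROTECTED"; break;
    case visibility::hidden_visibility:    o << "HIDDEN";    break;
    case visibility::internal_visibility:  o << "INTERNAL";  break;
    }
  return o;
}

// Small helper showing the operator in use.
std::string to_string(visibility v)
{
  std::ostringstream s;
  s << v;
  return s.str();
}
} // namespace sketch
```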
Giuliano Procida
|
401ec26be6 |
ir: remove "is Linux string constant" property from elf_symbol
This boolean property was obsoleted by the new symtab reader implementation. It has no users. Following this change, the find_ksymtab_strings_section function joins find_ksymtab_section and find_ksymtab_gpl_section in having no users. * include/abg-ir.h (elf_symbol::elf_symbol): Drop is_linux_string_cst argument. (elf_symbol::create): Likewise. (elf_symbol::get_is_linux_string_cst): Drop method. * src/abg-dwarf-reader.cc (lookup_symbol_from_sysv_hash_tab): Remove code that gets the index of the __ksymtab_strings section. Drop corresponding elf_symbol::create argument. (lookup_symbol_from_gnu_hash_tab): Likewise. (lookup_symbol_from_symtab): Likewise. (create_default_fn_sym): Drop false is_linux_string_cst argument to elf_symbol::create. * src/abg-ir.cc (elf_symbol::priv::is_linux_string_cst_): Drop member variable. (elf_symbol::priv default ctor): Drop initialisation of is_linux_string_cst_. (elf_symbol::priv normal ctor): Drop is_linux_string_cst argument and corresponding is_linux_string_cst_ initialisation. (elf_symbol::elf_symbol ctor): Drop is_linux_string_cst argument and corresponding forwarding to priv ctor. (elf_symbol::create): Drop is_linux_string_cst argument and corresponding forwarding to ctor. (elf_symbol::get_is_linux_string_cst): Drop method. * src/abg-reader.cc (build_elf_symbol): Drop false is_linux_string_cst argument to elf_symbol::create. * src/abg-symtab-reader.cc (symtab::load): Likewise. Signed-off-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Matthias Maennich
|
86c06ad684 |
Consistently use std::unique_ptr for private implementations (pimpl)
In the absence of non-refcounting smart pointers before C++11, std::shared_ptr was commonly used instead. Having bumped the standard to C++11 allows us to use std::unique_ptr consistently, avoiding any costs involved with shared_ptr ref counting. Hence do that and add default virtual destructors where required. * include/abg-comparison.h (diff_maps): use unique_ptr for priv_ (diff_context): Likewise. (diff_traversable_base): Likewise. (type_diff_base): Likewise. (decl_diff_base): Likewise. (distinct_diff): Likewise. (var_diff): Likewise. (pointer_diff): Likewise. (reference_diff): Likewise. (array_diff): Likewise. (qualified_type_diff): Likewise. (enum_diff): Likewise. (class_or_union_diff): Likewise. (class_diff): Likewise. (base_diff): Likewise. (scope_diff): Likewise. (fn_parm_diff): Likewise. (function_type_diff): Likewise. (function_decl_diff): Likewise. (typedef_diff): Likewise. (translation_unit_diff): Likewise. (diff_stats): Likewise. (diff_node_visitor): Likewise. * include/abg-corpus.h (corpus): Likewise. (exported_decls_builder): Likewise. (corpus_group): Likewise. * include/abg-ini.h (property): Likewise. (property_value): Likewise. (string_property_value): Likewise. (list_property_value): Likewise. (tuple_property_value): Likewise. (simple_property): Likewise. (list_property): Likewise. (tuple_property): Likewise. (config): Likewise. (section): Likewise. (function_call_expr): Likewise. * include/abg-interned-str.h (interned_string_pool): Likewise. * include/abg-ir.h (environment): Likewise. (location_manager): Likewise. (type_maps): Likewise. (translation_unit): Likewise. (elf_symbol::version): Likewise. (type_or_decl_base): Likewise. (scope_decl): Likewise. (qualified_type_def): Likewise. (pointer_type_def): Likewise. (array_type_def): Likewise. (subrange_type): Likewise. (enum_type_decl): Likewise. (enum_type_decl::enumerator): Likewise. (typedef_decl): Likewise. (dm_context_rel): Likewise. (var_decl): Likewise. (function_decl::parameter): Likewise. 
(function_type): Likewise. (method_type): Likewise. (template_decl): Likewise. (template_parameter): Likewise. (type_tparameter): Likewise. (non_type_tparameter): Likewise. (template_tparameter): Likewise. (type_composition): Likewise. (function_tdecl): Likewise. (class_tdecl): Likewise. (class_decl::base_spec): Likewise. (ir_node_visitor): Likewise. * include/abg-suppression.h (suppression_base): Likewise. (type_suppression::insertion_range): Likewise. (type_suppression::insertion_range::boundary): Likewise. (type_suppression::insertion_range::integer_boundary): Likewise. (type_suppression::insertion_range::fn_call_expr_boundary): Likewise. (function_suppression): Likewise. (function_suppression::parameter_spec): Likewise. (file_suppression): Likewise. * include/abg-tools-utils.h (temp_file): Likewise. (timer): Likewise. * include/abg-traverse.h (traversable_base): Likewise. * include/abg-workers.h (queue): Likewise. * src/abg-comparison.cc (diff_context): add default destructor. (diff_maps): Likewise. (corpus_diff): Likewise. (diff_node_visitor): Likewise. (class_or_union_diff::get_priv): adjust return type. (class_diff::get_priv): adjust return type. * src/abg-corpus.cc (corpus): add default destructor. * src/abg-ir.cc (location_manager): Likewise. (type_maps): Likewise. (elf_symbol::version): Likewise. (array_type_def::subrange_type): Likewise. (enum_type_decl::enumerator): Likewise. (function_decl::parameter): Likewise. (class_decl::base_spec): Likewise. (ir_node_visitor): Likewise. Signed-off-by: Matthias Maennich <maennich@google.com> |
||
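The pimpl change described in the commit above follows a well-known C++11 pattern, sketched here on a hypothetical `widget` class (the class and its members are illustrative and do not come from libabigail). The key constraint is that the destructor must be defined where the private type is complete, which is why the commit adds default destructors in the .cc files.

```cpp
#include <cassert>
#include <memory>
#include <string>

// ---- What would go in the header ----
class widget
{
  struct priv;                  // only declared here, defined in the .cc
  std::unique_ptr<priv> priv_;  // no reference counting, unlike shared_ptr

public:
  explicit widget(const std::string& name);
  // Declared here, defined below: std::unique_ptr needs the complete
  // type of 'priv' at the point where the destructor is instantiated.
  virtual ~widget();
  const std::string& name() const;
};

// ---- What would go in the .cc file ----
struct widget::priv
{
  std::string name_;
  explicit priv(const std::string& n) : name_(n) {}
};

widget::widget(const std::string& name)
  : priv_(new priv(name))       // std::make_unique is C++14, not C++11
{}

// Defaulted destructor, placed after the definition of widget::priv.
widget::~widget() = default;

const std::string& widget::name() const
{ return priv_->name_; }
```

One consequence of this design is that the class is no longer implicitly copyable, since std::unique_ptr is move-only. With std::shared_ptr, copies would silently share the same private data.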
Matthias Maennich
|
578ba12139 |
symtab-reader: add support for binaries compiled with CFI
Control-Flow-Integrity (CFI) when enabled in clang built binaries introduces an indirection when looking up ELF symbols. For DSO, the symbol table (.dynsym) will still contain the symbols, but additional symbols with suffix .cfi will be added to the full .symtab. Unfortunately, the DWARF debug information refers to CFI symbols by address to the .cfi suffixed variants as they point to the actual implementation. When the dwarf reader is determining whether to suppress variable or function declarations, it does so by identifying if there is an associated ELF symbol at the given address read from DWARF. Unless we know about the alternative address, this will fail and the type information will be suppressed. Hence add the .cfi symbol values to the lookup map to associate their address with the corresponding publicly exported symbol. * src/abg-symtab-reader.cc (symtab::load_): use new add_alternative_address_lookups method. (add_alternative_address_lookups): New method. * src/abg-symtab-reader.h (add_alternative_address_lookups): new function declaration. * tests/data/test-read-dwarf/test-libaaudio.so: New test data. * tests/data/test-read-dwarf/test-libaaudio.so.abi: New test data. * tests/data/Makefile.am: Add the two new tests input to source distribution. * tests/test-read-dwarf.cc: New test case. Reported-by: Dan Albert <danalbert@google.com> Reviewed-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Matthias Maennich <maennich@google.com> Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
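The lookup idea in the CFI commit above can be sketched like this. It is a simplified model under stated assumptions: the `sym` struct and `build_addr_lookup` are hypothetical names, symbols are plain name/address pairs, and the real symtab reader works on ELF data structures instead.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified view of one ELF symbol.
struct sym
{
  std::string name;
  uint64_t    addr;
  bool        is_exported; // true if publicly exported (e.g. in .dynsym)
};

// Build an address -> exported symbol name map. Besides the address of
// each exported symbol, also register the address of its "<name>.cfi"
// variant, because debug info may refer to that address.
std::map<uint64_t, std::string>
build_addr_lookup(const std::vector<sym>& symtab)
{
  std::map<uint64_t, std::string> by_addr;
  std::set<std::string> exported;

  for (const sym& s : symtab)
    if (s.is_exported)
      {
        by_addr[s.addr] = s.name;
        exported.insert(s.name);
      }

  const std::string suffix = ".cfi";
  for (const sym& s : symtab)
    {
      if (s.name.size() <= suffix.size())
        continue;
      const std::size_t base_len = s.name.size() - suffix.size();
      if (s.name.compare(base_len, suffix.size(), suffix) != 0)
        continue;
      const std::string base = s.name.substr(0, base_len);
      // Only add the alternative address if the base name is exported;
      // emplace() does not overwrite an already registered address.
      if (exported.count(base))
        by_addr.emplace(s.addr, base);
    }
  return by_addr;
}
```

With such a map, an address taken from the debug info that points at the .cfi implementation still resolves to the public symbol, so the corresponding declaration is not dropped.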
Matthias Maennich
|
3a22dfaff6 |
elf-helpers: refactor find_symbol_table_section
Refactor the acquisition of symtabs to explicitly provide functionality to get the .symtab and .dynsym sections. A later patch will make use of that to acquire .symtab while find_symbol_table_section() still provides .dynsym as default symbol table. This also adds a new overload to find_section to acquire the first section by type and adjusts find_symbol_table_section() to make use of those functions. * src/abg-elf-helpers.cc(find_section): New overload. (find_symtab_section): New function. (find_dynsym_section): New function. (find_symbol_table_section): Use new find_*_section functions. * src/abg-elf-helpers.h(find_section): New overload declaration. (find_symtab_section): New function declaration. (find_dynsym_section): New function declaration. Reviewed-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Matthias Maennich <maennich@google.com> |
||
Dodji Seketeli
|
e2e253e5b1 |
Bug 27980 - Fix updating of type scope upon type canonicalization
Once a type T is canonicalized, its scope is updated so that the vector returned by scope_decl::get_canonical_types() now contains the new canonical type of T. This works, obviously, even when the scope is itself a type. This works well on binaries compiled using C only because, currently, libabigail de-duplicates the DIEs of types. This means that if the scope of T is a non-anonymous type, the class of equivalence of that scope contains just one element. So updating the scope of T implies updating just one scope. On binaries where some files are compiled using C++ however, type DIEs are not de-duplicated. This is just because that feature hasn't yet been implemented in libabigail. Anyway, in that case, if the scope of T is a non-anonymous type, the class of equivalence of that scope contains more than one element. So updating the scope of T implies updating the scope of all the elements of the class of equivalence T. In practice, that means updating the canonical type (scope) of T. Libabigail fails to update the canonical type (scope) of T. Later at abixml emitting time, just emitting the canonical types of the scope of T is not enough to emit the canonical type of T. And that's how the abixml emitter forgets to emit some types as reported in the bug https://sourceware.org/bugzilla/show_bug.cgi?id=27980. This patch fixes that issue. I also noticed that when emitting abixml for unions, the emitter fails to emit the canonical member types of the union, unlike what is done for class types. So that is fixed as well. The binary provided in the bug report is added to the regression testsuite. * src/abg-ir.cc (canonicalize): Update the scope_decl::get_canonical_types() of canonical type of the containing type of the newly canonicalized type. * src/abg-writer.cc (write_union_decl): Write the canonical types contained in the current union scope, just like we do for classes. * tests/data/test-read-dwarf/test16-pr18904.so.abi: Adjust. 
* tests/data/test-types-stability/pr27980-libc.so: New binary input file. * tests/data/Makefile.am: Add the test input file above to source distribution. * tests/test-types-stability.cc (elf_paths): Add the new test input file to this test harness. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Giuliano Procida
|
caf06d7e5c |
abg-reader: Create a fresh corpus object per corpus
Currently the XML reader reuses the same corpus object for all corpora in a corpus group. This has an unwanted side-effect: any abi-instr with the same path in different corpora will collide and parts of the ABI will be lost. Creating a new corpus object for every abi-corpus element seems like the right thing to do. Testing with large ABIs containing many corpora also shows a modest (~10%) abidiff speed improvement. * src/abg-reader.cc (read_corpus_from_input): Always create a fresh corpus object for each abi-corpus XML element. Signed-off-by: Giuliano Procida <gprocida@google.com> |
||
Giuliano Procida
|
25bd77e31e |
abg-reader: Ensure corpus always has a symtab reader
In the presence of an empty abi-corpus element and with the following change to always allocate a fresh corpus object, such objects can sometimes be left without a symtab reader, instead of inheriting one from the previous corpus. The reader is called to obtain sorted lists of symbols during ABI comparisons. The simplest way to avoid a crash is to maintain the invariant that a reader object is always present. With this change, if there is bad XML preventing symbols from being read, no error is raised as before, but the logic has been tweaked so that abi-instr parsing will nevertheless be attempted. * src/abg-reader.cc (read_symbol_db_from_input): Fix documentation for this function. Allow "successful parsing" to include the case where no symbols were present in the input. (read_corpus_from_input): Unconditionally set a symtab reader on the corpus object. Unconditionally parse the abi-instr of a corpus. Signed-off-by: Giuliano Procida <gprocida@google.com> |
||
Giuliano Procida
|
5ccbfd4f29 |
dwarf-reader: Create new corpus unconditionally
The DWARF reader appears to create a new corpus object only if one is not already present. However, the only case where there can be multiple corpora is when build_corpus_group_from_kernel_dist_under is called and this function clears down the reader context, including the current corpus, between reading ELF objects. So it's clearer to just create a fresh corpus object unconditionally in the DWARF reader. * src/abg-dwarf-reader.cc (read_debug_info_into_corpus): Create new corpus object unconditionally. Signed-off-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Ben Woodard via Libabigail
|
519d7ce8e5 |
Fix trivial typo when printing version string
When abicompat prints its version string, it does not terminate it with a newline the way that other commands do. Contributed by Bolo. * tools/abicompat.cc (main): Add a newline after version string. Signed-off-by: Ben Woodard <woodard@redhat.com> Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
d604e33793 |
Revert "Fix trivial typo when printing version string"
This reverts commit
|
||
Ben Woodard via Libabigail
|
ad619f14ea |
Fix trivial typo when printing version string
When abicompat prints its version string, it does not terminate it with a newline the way that other commands do. Contributed by Bolo. Signed-off-by: Ben Woodard <woodard@redhat.com> |
||
Dodji Seketeli
|
e330b57a6a |
doc: Fix typo
David Marchand <dmarchand@redhat.com> found this typo. Fixed thus. * doc/manuals/libabigail-concepts.rst: Fix typo. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
9238ff4b07 |
abg-reader: Fix typo
* src/abg-reader.cc (read_context::maybe_check_abixml_canonical_type_stability): Fix typo. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
923a355f16 |
abidw: Remove temporary .typeid files when using --debug-abidiff
I noticed that the temporary typeid file generated by abidw when using the --debug-abidiff option was left behind. This patch removes it. * tools/abidw.cc (load_corpus_and_write_abixml): Remove temporary typeid file after its use. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
9681ab04d2 |
Fix recursive array type definition
This is a follow-up of the patch below: commit |
||
Dodji Seketeli
|
1cfbff1b30 |
rhbz1951526 - SELF CHECK FAILED for 'gimp-2.10'
This is a fix for bug https://bugzilla.redhat.com/show_bug.cgi?id=1951526. Although it's a patch for one bug, it addresses several different issues that cause the observed self comparison failure. As is often the case with this kind of problem, the failure is difficult to reproduce on a synthetic test case so I'll explain the root causes in this commit log. There are 4 different root causes to this problem. As I couldn't come up with a reduced test case for each one of them I am adding the fixes for those 4 issues in this commit, along with a new regression test extracted from the initial bugzilla problem report. So, overall, the symptom we are seeing here is that when we build an IR for the input binary gimp-2.0, save that IR into abixml, and read back that abixml into another IR, comparing the two IRs shows changes; it should show no change whatsoever. This is what we call in libabigail jargon a self comparison (or self check) failure. As alluded to in my introduction above, there appear to be 4 different root causes for that self comparison failure. 1/ The first cause has to do with a situation where two anonymous enums are (wrongly) considered different from an ABI point of view. 
Using the debugging capabilities recently gained by libabigail, I could notice that the two enums are: (gdb) p debug(&l) enum __anonymous_enum__ : unnamed-enum-underlying-type-32 { // size in bits: 32 // translation unit: /usr/src/debug/gimp-2.10.22-2.el9.1.aarch64/app/<artificial>-757de // @: 0x698fb68, @canonical: 0 GIMP_INTERPOLATION_NONE = 0, GIMP_INTERPOLATION_LINEAR = 1, GIMP_INTERPOLATION_CUBIC = 2, GIMP_INTERPOLATION_NOHALO = 3, GIMP_INTERPOLATION_LOHALO = 4, }; $1 = (abigail::ir::decl_base *) 0x698fba0 (gdb) p debug(&r) enum __anonymous_enum__ : unnamed-enum-underlying-type-32 { // size in bits: 32 // translation unit: /usr/src/debug/gimp-2.10.22-2.el9.1.aarch64/app/app.c // @: 0xa6d83e8, @canonical: 0 GIMP_INTERPOLATION_NONE = 0, GIMP_INTERPOLATION_LINEAR = 1, GIMP_INTERPOLATION_CUBIC = 2, GIMP_INTERPOLATION_NOHALO = 3, GIMP_INTERPOLATION_LOHALO = 4, GIMP_INTERPOLATION_LANCZOS = 3, }; $2 = (abigail::ir::decl_base *) 0xa6d8420 (gdb) Note how the second enum has a new enumerator named 'GIMP_INTERPOLATION_LANCZOS', but its value is '3', which is the exact same value as that of the existing enumerator GIMP_INTERPOLATION_NOHALO. During type canonicalization of the IR from the input binary, libabigail (wrongly) considers these two enums as being different. This leads to the type 'Gimp*' (or any type indirectly using either one of the anonymous enums above) coming from one translation unit being considered different from a type 'Gimp*' coming from another translation unit, just because they use one version of the anonymous enum above rather than the other. This leads to a *LOT* of spurious type changes from the first IR, that are saved into abixml. To fix this first problem, this patch introduces "two modes" of comparing enums. There is a binary-only mode which only looks at enumerator values, not enumerator names. And then there is the source-level mode which looks at both enumerator names and values when comparing enums. 
The former mode is used during type canonicalization. However, when a change is detected between two enums, then the diff-IR built to describe the change is constructed using the latter mode. Using the latter mode makes it possible to describe precisely things like enumerator insertion/removal by referring to the names of the added/removed enumerators. 2/ The second root cause is that a struct, say, 'struct _GimpImage' from a translation unit is considered different from a 'struct _GimpImage' from another translation unit because the DWARF reader wrongly assigns them different sizes. Here is what it looks like in the debugger: (gdb) p debug(&l) struct _GimpImage { // size in bits: 384 // definition point: ../../app/core/gimpimage.h:39:1 // translation unit: /usr/src/debug/gimp-2.10.22-2.el9.1.aarch64/app/<artificial>-757de // @: 0x69b9d10, @canonical: 0 GimpViewable parent_instance;' Gimp* gimp;' GimpImagePrivate* priv;' }; $8 = (abigail::ir::type_base *) 0x69b9d10 (gdb) p debug(&r) struct _GimpImage { // size in bits: 0 // definition point: :0:0 // translation unit: /usr/src/debug/gimp-2.10.22-2.el9.1.aarch64/app/<artificial>-8813f // @: 0x6ac7a50, @canonical: 0 }; Notice how the second 'struct _GimpImage' has a size of zero. This is because when reading the DWARF, we first encounter the DIE for the first 'struct _GimpImage' and we properly build a type for it, along with its declaration. Then when we encounter another DIE defining 'struct _GimpImage' again, from a different translation unit, the DWARF reader recognizes that it's a DIE for a declaration of 'struct _GimpImage' and fails to re-use the previous definition for 'struct _GimpImage'. So it wrongly builds a declaration-only 'struct _GimpImage' for it, hence the second struct _GimpImage with a zero size. Here again that creates spurious changes (after type canonicalization) in types using struct _GimpImage. And that is a lot of types, including things like 'Gimp*' and the like. 
The fix for this root cause issue is to change add_or_update_class_type in the DWARF reader to recognize that we are seeing a type declaration for which there was already a definition and return that definition instead of creating a new declaration. 3/ The third root cause is better explained with a "screen shot". Consider these two 'versions' of the same struct _GdkDevice from two different translation units: struct _GdkDevice { // size in bits: 576 // definition point: /usr/include/gtk-2.0/gdk/gdkinput.h:98:1 // translation unit: /usr/src/debug/gimp-2.10.22-2.el9.1.aarch64/app/<artificial>-2d0352 // @: 0x8820530, @canonical: 0 GObject parent_instance;' gchar* name; // uses canonical type '@0x6892980' GdkInputSource source;' GdkInputMode mode;' gboolean has_cursor; // uses canonical type '@0x688dd00' gint num_axes; // uses canonical type '@0x688dd00' GdkDeviceAxis* axes;' gint num_keys; // uses canonical type '@0x688dd00' GdkDeviceKey* keys;' }; $9 = (abigail::ir::type_base *) 0x8820530 (gdb) p debug(&r) struct _GdkDevice { // size in bits: 576 // definition point: /usr/include/gtk-2.0/gdk/gdkinput.h:98:1 // translation unit: /usr/src/debug/gimp-2.10.22-2.el9.1.aarch64/app/<artificial>-1fdb18 // @: 0x7cd71e0, @canonical: 0 GObject parent_instance;' gchar* _g_sealed__name; // uses canonical type '@0x6892980' GdkInputSource _g_sealed__source;' GdkInputMode _g_sealed__mode;' gboolean _g_sealed__has_cursor; // uses canonical type '@0x688dd00' gint _g_sealed__num_axes; // uses canonical type '@0x688dd00' GdkDeviceAxis* _g_sealed__axes;' gint _g_sealed__num_keys; // uses canonical type '@0x688dd00' GdkDeviceKey* _g_sealed__keys;' }; $10 = (abigail::ir::type_base *) 0x7cd71e0 (gdb) Notice how the name of the second data member 'name' was changed to '_g_sealed__name'. A similar renaming happens to several other data member names. The offsets and types of the struct _GdkDevice haven't changed however. So from an ABI standpoint, the two versions of that struct are equal. 
Libabigail considers them different however. Because that type is used by tons of other types of the binary being analyzed, this leads to lots of spurious canonical type differences that shouldn't be there. These three issues are magnified by the fact that the gimp binary is compiled using "link time optimization". That brings in a lot more opportunities to see these underlying issues that have been there for a long time. 4/ The fourth and last root cause issue. When the abixml writer emits a translation unit (TU), it keeps track of the 'non-emitted referred to type' of the currently emitted translation unit and emits them at the end of each TU. For instance, if the type 'Gimp*' (pointer to Gimp) was emitted, and yet the referred-to type 'Gimp' wasn't emitted, the TU writer makes sure to emit the referred-to 'Gimp' type at the end of the TU. This has been going on for quite some time now. The problem however is that although the non-emitted referred-to type was referred to in this current TU, it might not have been *DEFINED* in this TU. In that case, it should not be emitted in this TU. Otherwise, the TU where that type is defined in the abixml might appear different from where it is defined in the initial binary, leading to self comparison failures down the road. This patch ensures that a non-emitted referred-to type is always emitted in the TU it belongs to. 5/ After doing all this, it appears that we were forgetting to emit some function types that were defined in TUs emitted earlier and yet were being referred-to later. Looking closer, I realized that we should just emit function types seen in a given TU, regardless of the referred-to relation. The problem with that is that function types are special in libabigail because there are two situations in which they are created. Basically, a function type is created by the DWARF DIE DW_TAG_subroutine_type. 
This is for instance how pointers to functions are represented in DWARF, namely, by a DW_TAG_pointer_type that points to a DW_TAG_subroutine_type. That is represented in the libabigail IR by an instance of the abigail::ir::function_type type. This is represented in abixml as a 'function-type' XML element. But then, libabigail considers that all decls have a type. This applies obviously for variables or data members. Right. But then, libabigail considers that a function is also a decl, which has a type. And the type of a function is a function type, represented by the same abigail::ir::function_type. A practical difference with the former situation is that function decls are *NOT* represented in abixml using a 'function-type' element. Instead a 'function-decl' XML element uses return type and parameter elements to represent the types involved with a function decl. Said otherwise, the former 'function type' concept used to represent the type of functions in the libabigail IR is artificial. This artificial-ness was not explicitly expressed in libabigail. This patch now expresses that artificial-ness for function types. So the abixml writer now just decides to not emit artificial function types, and emits all the non-artificial function types instead. This addresses this last issue by being able to emit all non-artificial function types defined in a given TU, without having to bother with the fact that they are referred-to or not. Together, fixing these 5 problems fixes this reported problem. The changes to the reference test outputs are adjustments needed because the abixml output indeed changes. * include/abg-ir.h (environment::use_enum_binary_only_equality): Declare accessors. (type_or_decl_base::{s,g}et_is_artificial): Likewise. (decl_base::{s,g}et_is_artificial): Remove accessors. * src/abg-ir.cc (environment::priv::use_enum_binary_only_equality): Define new data member. (environment::priv::use_enum_binary_only_equality): Define accessors. 
(type_or_decl_base::priv::is_artificial_): Define new data member. It has actually moved here from decl_base::priv::is_artificial_. (type_or_decl_base::priv::priv): Initialize it. (type_or_decl_base::{g,s}et_is_artificial): Define accessors. (decl_base::is_artificial_): Move this to type_or_decl_base::is_artificial_. (maybe_adjust_canonical_type): In a given class of equivalence of function types, if there is one non-artificial function type, then the entire class of equivalence is considered non-artificial; so flag the canonical function type as being non-artificial. (is_enumerator_present_in_enum): Define new static function. (equals): Re-arrange the overload for enums so the order of the enumerators doesn't count in the comparison. Also, two enums with different numbers of enumerators can still be equal, with the right redundancy. In the overload for var_decl, avoid taking into account the names of data members in the comparison. (enum_type_decl::enumerator::operator==): In the binary-level comparison mode, only compare the value of enumerators, not their name. * src/abg-comparison.cc (compute_diff): In the overload for enum_type_decl, if the enums compare different using binary-level comparison, then use source-level comparison to build the diff-IR. * src/abg-dwarf-reader.cc (read_context::compare_before_canonicalisation): Compare enums using binary-level comparison. (add_or_update_class_type): If we are looking at the definition of an existing declaration that has been already defined then use the previous definition, in case we are going to need to update the definition. Also, update the size only if it's needed. (build_function_type): By default, consider the newly built function type as artificial. (build_ir_node_from_die): When looking at a DW_TAG_subroutine_type DIE, consider the built function type as non-artificial. 
* src/abg-reader.cc (read_context::maybe_check_abixml_canonical_type_stability): Don't consider declaration-only classes in an ODR context because they don't have canonical types. (build_function_decl): Flag the function type of the function as artificial. (build_class_decl): Make sure to reuse class types that were already created. * src/abg-writer.cc (write_translation_unit): Allow emitting empty classes. Make sure referenced types are emitted in the translation unit where they belong. Avoid emitting artificial function types. * tests/data/test-alt-dwarf-file/rhbz1951526/rhbz1951526-report-0.txt: New test reference output. * tests/data/test-alt-dwarf-file/rhbz1951526/usr/bin/gimp-2.10: New reference test binary input. * tests/data/test-alt-dwarf-file/rhbz1951526/usr/lib/debug/.dwz/gimp-2.10.22-2.el9.1.aarch64: Likewise. * tests/data/test-alt-dwarf-file/rhbz1951526/usr/lib/debug/usr/bin/gimp-2.10-2.10.22-2.el9.1.aarch64.debug: Likewise. * tests/data/Makefile.am: Add the new test files to source directory. * tests/test-alt-dwarf-file.cc: Add the new test inputs to this test harness. * tests/data/test-abidiff/test-PR18791-report0.txt: Adjust. * tests/data/test-abidiff/test-enum0-report.txt: Likewise. * tests/data/test-annotate/libtest23.so.abi: Likewise. * tests/data/test-annotate/libtest24-drop-fns-2.so.abi: Likewise. * tests/data/test-annotate/libtest24-drop-fns.so.abi: Likewise. * tests/data/test-annotate/test-anonymous-members-0.o.abi: Likewise. * tests/data/test-annotate/test13-pr18894.so.abi: Likewise. * tests/data/test-annotate/test14-pr18893.so.abi: Likewise. * tests/data/test-annotate/test15-pr18892.so.abi: Likewise. * tests/data/test-annotate/test17-pr19027.so.abi: Likewise. * tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise. * tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise. * tests/data/test-annotate/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise. 
* tests/data/test-annotate/test21-pr19092.so.abi: Likewise. * tests/data/test-diff-dwarf-abixml/test0-pr19026-libvtkIOSQL-6.1.so.1.abi: Likewise. * tests/data/test-diff-dwarf/PR25058-liblttng-ctl-report-1.txt: Likewise. * tests/data/test-diff-dwarf/test6-report.txt: Likewise. * tests/data/test-diff-filter/test31-pr18535-libstdc++-report-0.txt: Likewise. * tests/data/test-diff-filter/test31-pr18535-libstdc++-report-1.txt: Likewise. * tests/data/test-diff-filter/test8-report.txt: Likewise. * tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-2.txt: Likewise. * tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-3.txt: Likewise. * tests/data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-0.txt: Likewise. * tests/data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-1.txt: Likewise. * tests/data/test-read-dwarf/PR22015-libboost_iostreams.so.abi: Likewise. * tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise. * tests/data/test-read-dwarf/PR25007-sdhci.ko.abi: Likewise. * tests/data/test-read-dwarf/PR25042-libgdbm-clang-dwarf5.so.6.0.0.abi: Likewise. * tests/data/test-read-dwarf/PR26261/PR26261-exe.abi: Likewise. * tests/data/test-read-dwarf/libtest23.so.abi: Likewise. * tests/data/test-read-dwarf/libtest24-drop-fns-2.so.abi: Likewise. * tests/data/test-read-dwarf/libtest24-drop-fns.so.abi: Likewise. * tests/data/test-read-dwarf/test-libandroid.so.abi: Likewise. * tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Likewise. * tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise. * tests/data/test-read-dwarf/test13-pr18894.so.abi: Likewise. * tests/data/test-read-dwarf/test14-pr18893.so.abi: Likewise. * tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise. * tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise. * tests/data/test-read-dwarf/test17-pr19027.so.abi: Likewise. 
* tests/data/test-read-dwarf/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise. * tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise. * tests/data/test-read-dwarf/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise. * tests/data/test-read-dwarf/test21-pr19092.so.abi: Likewise. * tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi: Likewise. * tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Likewise. * tests/data/test-read-write/test28-without-std-fns-ref.xml: Likewise. * tests/data/test-read-write/test28-without-std-vars-ref.xml: Likewise. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
ed87d0a29b |
reader: Canonicalizing a type once is enough
While looking at something else, I noticed that the abixml reader was trying to canonicalize each type twice. Once should be enough. * src/abg-reader.cc (build_type): Don't try to canonicalize the type here because all the sub-routines of this function (which actually build the type) already try to canonicalize it. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
f7ad3366fb |
ir: make 'debug(artefact)' support showing enums
While debugging something else, I realized that 'debug(artifact)' couldn't show the enumerators of an enum. I also realized that we were not showing the 'declaration-only-ness' of the artefact either. This patch fixes that. * src/abg-ir.cc (get_debug_representation): Add support for showing details for enums. Also show declaration-only-ness for class or unions. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
f7e6ce3160 |
location::expand() shouldn't crash when no location manager available
While debugging, I noticed that trying to expand a location not yet associated with any location manager would crash. This patch fixes that. * src/abg-ir.cc (location::expand): When no location manager is present, just expand to an empty location. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
d1b4247c16 |
Add environment::{get_type_id_from_pointer,get_canonical_type_from_type_id}
When debugging self comparison issues, once the abixml file is read back into memory, I often want to get the type-id of an artifact that was read from abixml or get the canonical type of an artifact which type-id is known. Part of that information is indirectly present in the data member abigail::reader::reader_context::m_pointer_type_id_map after the .typeid file is loaded from file into memory. The problem is that the instance of abigail::reader::reader_context is transient as it's destroyed quickly after the abixml file is read. We want it to stay alive longer. So this patch moves that data member into abigail::environment instead, along with its accessors. The patch then adds the new member functions environment::{get_type_id_from_pointer,get_canonical_type_from_type_id} to get the type-id of an artifact de-serialized from abixml and the canonical type of an artifact for which we know the type-id string. * include/abg-ir.h (environment::{get_pointer_type_id_map, get_type_id_from_pointer, get_canonical_type_from_type_id}): Declare new member functions. * src/abg-ir.cc (environment::{get_pointer_type_id_map, get_type_id_from_pointer, get_canonical_type_from_type_id}): Define member functions. (environment::priv::pointer_type_id_map_): Move this data member here from ... * src/abg-reader.cc (read_context::m_pointer_type_id_map): ... here. (read_context::get_pointer_type_id_map): Remove this as it's now defined in environment::get_pointer_type_id_map. (read_context::maybe_check_abixml_canonical_type_stability): Adjust. (build_type): Likewise. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
e9e3e06454 |
ir: Enable setting breakpoint on first type inequality
When debugging type canonicalization in type_base::get_canonical_type_for, I more often than not want to know why a type compares different to another. Until now, I've been doing that by stepping in the debugger. I figure a much more efficient way of doing that is to be able to set a breakpoint on the first occurrence of type inequality. To do that, I am adding a few macros to use in the 'equals' functions to return their value: ABG_RETURN(value), ABG_RETURN_EQUAL(l,r) and ABG_RETURN_FALSE. Those invoke a new function called 'notify_equality_failed' when the result of the comparison is false. This allows one to just set a debugger breakpoint on 'notify_equality_failed' to know when and why the type comparison fails. These macros invoke notify_equality_failed only if the WITH_DEBUG_SELF_COMPARISON macro is defined. Otherwise, they do what the code was doing previously. Said otherwise, this whole shebang is enabled only when the code is configured with --enable-debug-self-comparison. This patch incurs no functional change. * src/abg-ir.cc (notify_equality_failed): Define new static function if WITH_DEBUG_SELF_COMPARISON is defined. (ABG_RETURN_EQUAL, ABG_RETURN_FALSE, ABG_RETURN): Define new macros. (try_canonical_compare): Use ABG_RETURN_EQUAL rather than just returning the result of a comparison. (equals): In all the overloads, use the new ABG_RETURN* macros, rather than just returning boolean values. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
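The breakpoint trick described in this commit can be illustrated with a minimal, self-contained sketch. The macro names and `notify_equality_failed` come from the commit message; the macro bodies, the `toy_type` struct and the `equality_failures` counter are assumptions made for illustration and are not libabigail's actual definitions.

```cpp
#include <cassert>

// Stand-in for what --enable-debug-self-comparison would define.
#define WITH_DEBUG_SELF_COMPARISON 1

// Hypothetical counter, only here so the effect is observable
// without attaching a debugger.
static int equality_failures = 0;

#ifdef WITH_DEBUG_SELF_COMPARISON
// A breakpoint set on this function stops at the first comparison
// that yields false.
static void notify_equality_failed() { ++equality_failures; }

#define ABG_RETURN(value)                         \
  do {                                            \
    bool abg_result_ = (value);                   \
    if (!abg_result_) notify_equality_failed();   \
    return abg_result_;                           \
  } while (false)
#else
// Without the debug option, the macro is a plain return.
#define ABG_RETURN(value) return (value)
#endif

#define ABG_RETURN_EQUAL(l, r) ABG_RETURN((l) == (r))
#define ABG_RETURN_FALSE ABG_RETURN(false)

// Hypothetical type with two properties to compare.
struct toy_type
{
  int size_in_bits;
  int alignment_in_bits;
};

// An 'equals' function that returns through the macros.
bool equals(const toy_type& l, const toy_type& r)
{
  if (l.size_in_bits != r.size_in_bits)
    ABG_RETURN_FALSE;
  ABG_RETURN_EQUAL(l.alignment_in_bits, r.alignment_in_bits);
}
```

With this in place, a debugger session would only need `break notify_equality_failed` to stop on the first failing comparison; the backtrace then shows which `equals` overload returned false.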
||
Dodji Seketeli
|
b00ba10e1d |
xml reader: Fix recursive qualified & reference type definition
This is a followup patch for the fix for
https://bugzilla.redhat.com/show_bug.cgi?id=1944088, which was in the
patch:
commit
|
||
Dodji Seketeli
|
7a9fa3fe5a |
abixml reader: Fix recursive type definition handling
This should fix self comparison bug https://bugzilla.redhat.com/show_bug.cgi?id=1944088 This arose from a self comparison check failing on the library libgvpr.so.2 from the graphviz-2.44.0-17.el9.aarch64.rpm package. Now that we have facilities to see what type (instantiated from the abixml representation of the libgvpr.so library) exactly the canonicalization process is failing for, I decided to use it ;-) I extracted the package and its associated debug info into a directory named 'extract' and ran abidw --debug-abidiff on it: $ build/tool/abidw --debug-abidiff -d extract/usr/lib/debug extract/usr/lib64/libgvpr.so.2 That yielded the output below: error: problem detected with type 'typedef Vmalloc_t' from second corpus error: canonical type for type 'typedef Vmalloc_t' of type-id 'type-id-170' changed from 'd72618' to '14a7448' error: problem detected with type 'Vmalloc_t*' from second corpus error: canonical type for type 'Vmalloc_t*' of type-id 'type-id-188' changed from 'd72ba8' to '14a7968' [...] This tells me that "typedef Vmalloc_t", created from the abixml compares different from its originating peer that was created from the binary directly. The same goes for the pointer type "Vmalloc_t*", etc. Using the new debugging/logging functionalities from the command line of the debugger, I could see that in the abixml reader, build_typedef_decl can fail subtly when the underlying type of the typedef refers to the typedef itself. In that case, we need to ensure that the typedef created by build_typedef_decl is the same one that is used by the underlying type, which is not the case at the moment. At the moment, the underlying type would create a new typedef beside the one currently being created by build_typedef_decl. That leads to more than one typedef in the system to designate "typedef Vmalloc_t". And that wreaks havoc later down the road. This patch arranges so that build_typedef_decl creates the typedef "early" before the underlying type is created. 
That typedef temporarily has no underlying type. It's registered as being the typedef for the type-id string that identifies it in the abixml. And then the function goes to create the underlying type. This arrangement ensures that if the underlying type refers to the typedef being created (via its type-id string), then the typedef that was created early is effectively re-used. This ensures that a typedef which recursively refers to itself is properly represented. It's only when the underlying type is fully created that it's added to the typedef. Something similar is done for pointer types, in build_pointer_type_def. Note that to do this, the patch adjusts the typedef_decl and pointer_type_def classes so that they can be created with no underlying/pointed-to types. The underlying/pointed-to type can thus be added later. I believe this patch is the minimal patch necessary to fix this issue. The graphviz RPM is added to the regression test suite for good measure. After visual inspection, I realized that there are other types besides typedef and pointer types that exhibit the same class of problem even if they are not involved in this issue on this particular binary. A subsequent patch is going to address the problem for those types, namely, qualified and reference types. * include/abg-ir.h (pointer_type_def::pointer_type_def): Declare a constructor with no pointed-to type. (pointer_type_def::set_pointed_to_type): Declare new method. (typedef_decl::typedef_decl): Declare a constructor with no underlying type. * src/abg-ir.cc (pointer_type_def::pointer_type_def): Define a constructor with no pointed-to type. The pointed-to type can thus later be set when it becomes available. (pointer_type_def::set_pointed_to_type): Define new method. (pointer_type_def::get_qualified_name): Make this work on a pointer type that (momentarily) has no pointed-to type. (typedef_decl::typedef_decl): Define a constructor with no underlying type. 
(typedef_decl::get_size_in_bits): Make this work on a typedef that has (momentarily) no underlying type. (typedef_decl::set_underlying_type): Update the size and alignment of the typedef from its new underlying type. * src/abg-reader.cc (build_pointer_type_def): Construct the pointer type early /BEFORE/ we even try to construct its pointed-to type. Associate this incomplete type with the type-id. Then try to construct the pointed-to type. During the construction of the pointed-to type, if this pointer is needed (due to recursion) then the incomplete pointer type can be used, leading to just one pointer type used (recursively) as it should be. (build_typedef_decl): Likewise for building typedef type early without its underlying type so that it can be used by the underlying type if needed. * tests/data/test-diff-pkg/graphviz-2.44.0-18.el9.aarch64-self-check-report-0.txt: New test reference output. * tests/data/test-diff-pkg/graphviz-2.44.0-18.el9.aarch64.rpm: New binary test input. * tests/data/test-diff-pkg/graphviz-debuginfo-2.44.0-18.el9.aarch64.rpm: Likewise. * tests/data/Makefile.am: Add the new test material above to source distribution. * tests/test-diff-pkg.cc (in_out_specs): Add the test inputs above to this test harness. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
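The "create early, register, then build the underlying type" arrangement described in this commit can be sketched in a few lines. This is a toy model under stated assumptions, not the abixml reader: `type_desc`, `toy_type` and `toy_reader` are hypothetical names, and every type is reduced to a name plus a single referenced type-id.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical flattened description of one type element: its name
// and the type-id of the type it refers to (empty if none).
struct type_desc
{
  std::string name;
  std::string refers_to;
};

// Hypothetical in-memory type; 'underlying' is set after creation.
struct toy_type
{
  std::string name;
  toy_type* underlying = nullptr;
};

class toy_reader
{
  std::map<std::string, type_desc> descs_;
  std::map<std::string, std::unique_ptr<toy_type>> built_;

public:
  explicit toy_reader(std::map<std::string, type_desc> descs)
    : descs_(std::move(descs))
  {}

  toy_type* build(const std::string& id)
  {
    // If the type was already created (even if still incomplete),
    // reuse it rather than creating a second one.
    auto found = built_.find(id);
    if (found != built_.end())
      return found->second.get();

    const type_desc& desc = descs_.at(id);

    // Create the type early, with no underlying type yet, and
    // register it under its type-id before recursing.
    auto owned = std::make_unique<toy_type>();
    owned->name = desc.name;
    toy_type* result = owned.get();
    built_[id] = std::move(owned);

    // Only now build the underlying type; a recursive reference to
    // 'id' resolves to the object registered above.
    if (!desc.refers_to.empty())
      result->underlying = build(desc.refers_to);
    return result;
  }

  std::size_t type_count() const { return built_.size(); }
};
```

If the registration step were moved after the recursive call, a cycle such as typedef, struct, pointer, typedef would never terminate or would produce duplicate objects for the same type-id, which is the class of problem the commit describes.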
||
Dodji Seketeli
|
d94947440e |
Introduce artificial locations
When an abixml file is "read in" and the resulting in-memory internal representation is saved back into abixml, the saved result can often differ from the initial input in a non-deterministic manner. That read-write instability is undesirable because it generates unnecessary changes that cloud our ability to build reliable regression tests, among other things. Also, that unnecessarily increases the changes to the existing regression test reference outputs leading to a lot more churn than necessary. This patch tries to minimize that abixml read-write instability in preparation of patches that would otherwise cause too much churn in reference output files of the regression test suite. The main reason why this read-write instability occurs is that a lot of type definitions don't have source location. For instance, all the types that are not user defined fall into that category. Those types can't be topologically sorted by using their location as a sorting criterion. Instead, we are currently using the order in which those location-less types are processed by the reader as the output (i.e, write time) order. The problem with that approach is that the processing order can be dependent on the order in which OTHER TYPES like class types are processed. And that order can be changed by patches in the future. That in and of itself shouldn't change the write order of these types. For instance, if a class Foo has data members and member functions whose types are non-user-defined types, then, the order in which those data members are processed can possibly determine the order in which those non-user-defined types are processed. This patch thus introduces the concept of artificial location. A *NON-ARTIFICIAL* location is a source location that was emitted by the original emitter of the type meta-data. In the case of DWARF type meta-data, the compiler originally emitted source location. That location when read is considered non-artificial, or natural, if you prefer. 
In the case of abixml however, an artificial location would be the source location at which an XML element is encountered. For instance, consider the abixml file "path/to/example.abi" below: 1 <abi-corpus version='2.0' path='path/to/example.abi'> 2 <abi-instr address-size='64' path='test24-drop-fns.cc' language='LANG_C_plus_plus'> 3 <type-decl name='bool' size-in-bits='8' id='type-id-1'/> 4 </abi-instr> 5 </abi-corpus> I've added line numbers for ease of reading. At line 3 of that file, the non-user defined type name "bool" is defined using the XML element "type-decl". Note how that element lacks the "filepath", "line" and "column" attributes that would collectively define the source location of that type. So this type "bool" doesn't carry any natural location. The abixml reader can however generate an artificial location for it. The filepath of that artificial location would thus be the path to that ABI corpus, i.e, "path/to/example.abi". The line number would be 3. The column would be left to zero. That artificial location will never be explicitly written down as an XML attribute as it can always be implicitly retrieved by construction. The patch changes the internal representation so that each ABI artifact of the internal representation can now carry both an artificial and a natural location. When two artifacts have an artificial location, then it's used to topologically sort them. The one that is defined topologically "earlier" obviously comes first. When two artifacts have a natural location then it's used to topologically sort them. Otherwise, they are sorted lexicographically. This makes the output of abilint a lot more read-write stable. * include/abg-fwd.h (get_artificial_or_natural_location): Declare new function. * include/abg-ir.h (location::location): Initialize & copy ... (location::is_artificial_): ... a new data member. (location::{g,s}et_is_artificial): New accessors. (location::{operator=}): Adjust. 
(type_or_decl_base::{set,get,has}_artificial_location): Declare new member functions. * src/abg-ir.cc (decl_topo_comp::operator()): In the overload for decl_base*, use artificial location for topological sort in priority. Otherwise, use natural location. Otherwise, sort lexicographically. (type_topo_comp::operator()): In the overload for type_base*, use lexicographical sort only for types that don't have location at all. (type_or_decl_base::priv::artificial_location_): Define new data member. (type_or_decl_base::{set,get,has}_artificial_location): Define new member functions. (decl_base::priv): Allow a constructor without location. That one sets no natural location to the artifact. (decl_base::decl_base): Use decl_base::set_location in the constructor now. (decl_base::set_location): Adjust this to support setting a natural or an artificial location. (get_debug_representation): Emit debugging log showing the location of an artifact, using its artificial location in priority. (get_natural_or_artificial_location): Define new function. * src/abg-reader.cc (read_artificial_location) (maybe_set_artificial_location): Define new static functions. (read_location): Read artificial location when no natural location was found. (build_namespace_decl, build_function_decl, build_type_decl) (build_qualified_type_decl, build_pointer_type_def) (build_reference_type_def, build_subrange_type) (build_array_type_def, build_enum_type_decl, build_typedef_decl) (build_class_decl, build_union_decl, build_function_tdecl) (build_class_tdecl, build_type_tparameter) (build_non_type_tparameter, build_template_tparameter): Read and set artificial location. * src/abg-writer.cc (write_location): Don't serialize artificial locations. (write_namespace_decl): Topologically sort member declarations before serializing them. * tests/data/test-read-write/test28-without-std-fns-ref.xml: Adjust. * tests/data/test-read-write/test28-without-std-vars-ref.xml: Likewise. 
* tests/data/test-annotate/libtest23.so.abi: Likewise. * tests/data/test-annotate/libtest24-drop-fns-2.so.abi: Likewise. * tests/data/test-annotate/libtest24-drop-fns.so.abi: Likewise. * tests/data/test-annotate/test0.abi: Likewise. * tests/data/test-annotate/test13-pr18894.so.abi: Likewise. * tests/data/test-annotate/test14-pr18893.so.abi: Likewise. * tests/data/test-annotate/test15-pr18892.so.abi: Likewise. * tests/data/test-annotate/test17-pr19027.so.abi: Likewise. * tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise. * tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise. * tests/data/test-annotate/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise. * tests/data/test-annotate/test21-pr19092.so.abi: Likewise. * tests/data/test-read-dwarf/PR22015-libboost_iostreams.so.abi: Likewise. * tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise. * tests/data/test-read-dwarf/PR25007-sdhci.ko.abi: Likewise. * tests/data/test-read-dwarf/PR25042-libgdbm-clang-dwarf5.so.6.0.0.abi: Likewise. * tests/data/test-read-dwarf/PR26261/PR26261-exe.abi: Likewise. * tests/data/test-read-dwarf/libtest23.so.abi: Likewise. * tests/data/test-read-dwarf/libtest24-drop-fns-2.so.abi: Likewise. * tests/data/test-read-dwarf/libtest24-drop-fns.so.abi: Likewise. * tests/data/test-read-dwarf/test-libandroid.so.abi: Likewise. * tests/data/test-read-dwarf/test-suppressed-alias.o.abi: Likewise. * tests/data/test-read-dwarf/test0.abi: Likewise. * tests/data/test-read-dwarf/test0.hash.abi: Likewise. * tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Likewise. * tests/data/test-read-dwarf/test11-pr18828.so.abi: Likewise. * tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise. * tests/data/test-read-dwarf/test13-pr18894.so.abi: Likewise. * tests/data/test-read-dwarf/test14-pr18893.so.abi: Likewise. * tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise. * tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise. 
* tests/data/test-read-dwarf/test17-pr19027.so.abi: Likewise. * tests/data/test-read-dwarf/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise. * tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise. * tests/data/test-read-dwarf/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise. * tests/data/test-read-dwarf/test21-pr19092.so.abi: Likewise. * tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi: Likewise. * tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Likewise. * tests/data/test-read-write/test28-without-std-fns-ref.xml: Likewise. * tests/data/test-read-write/test28-without-std-vars-ref.xml: Likewise. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
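The sorting priority described in this commit (artificial location first, then natural location, then lexicographic order) can be sketched as a comparator. The `toy_location` and `toy_decl` structs and the function names below are hypothetical stand-ins, not the `decl_topo_comp` implementation; in particular, this sketch makes no claim about how mixed inputs are kept consistent in the real code.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical location: 'present' is false when no location is known.
struct toy_location
{
  bool present;
  std::string path;
  unsigned line;
  unsigned column;
};

// Hypothetical declaration carrying both kinds of location.
struct toy_decl
{
  std::string name;
  toy_location natural;    // emitted by the compiler, e.g. via DWARF
  toy_location artificial; // position of the XML element in the abixml
};

static bool location_less(const toy_location& l, const toy_location& r)
{
  return std::tie(l.path, l.line, l.column)
         < std::tie(r.path, r.line, r.column);
}

// Artificial locations take priority, then natural locations, and
// names are compared only as a last resort.
bool decl_topo_less(const toy_decl& l, const toy_decl& r)
{
  if (l.artificial.present && r.artificial.present)
    return location_less(l.artificial, r.artificial);
  if (l.natural.present && r.natural.present)
    return location_less(l.natural, r.natural);
  return l.name < r.name;
}

// Sort a copy of the declarations and return their names in order.
std::vector<std::string> sorted_names(std::vector<toy_decl> decls)
{
  std::sort(decls.begin(), decls.end(), decl_topo_less);
  std::vector<std::string> names;
  for (const toy_decl& d : decls)
    names.push_back(d.name);
  return names;
}
```

For location-less built-in types read from the same abixml file, this keeps the order in which their elements appear in the file, so writing the corpus back does not reshuffle them.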
||
Dodji Seketeli
|
27d2927107 |
Detect abixml canonical type instability during abidw --debug-abidiff
In the debugging mode of self comparison induced by the invocation of "abidw --debug-abidiff <binary>", it's useful to be able to ensure the following invariant: The pointer value of the canonical type of a type T that is serialized into abixml with the id string "type-id-12" (for instance) must keep the same canonical type pointer value when that abixml file is de-serialized back into memory. This is possible mainly because libabigail stays loaded in memory all the time during both serialization and de-serialization. This patch adds support for detecting when that invariant is not respected. In other words it detects when the type built from de-serializing the type which id is "type-id-12" (for instance) has a canonical type which pointer value is different from the pointer value of the canonical type (of the type) that was serialized as having the type id "type-id-12". This is done in three phases. The first phase happens in the code of abidw itself; after the abixml is written on disk, another file called the "typeid file" is written on disk as well. That latter file contains a set of records; each record associates a "type id string" (like the type IDs that appear in the abixml file) to the pointer value of the canonical type that matches that type id string. That file is thus now available for manual inspection during a later debugger session. This is done by invoking the new function write_canonical_type_ids. The second phase appears right before abixml loading time. The typeid file is read back and the association "type-id string" <-> canonical type pointer value is stored in a hash map that is returned by environment::get_type_id_canonical_type_map(). This is done by invoking the new function load_canonical_type_ids. The third phase happens right after the canonicalization (triggered in the abixml reader) of a type coming from abixml, corresponding to a given type id. 
It checks if the pointer value of the canonical type just computed is the same as the one associated to that type id in the map returned by environment::get_type_id_canonical_type_map. This is a way of verifying the "stability" of a canonical type during its serialization and de-serialization to and from abixml and it's done as part of "abidw --debug-abidiff <binary>". Just as an example, here is the kind of error output that I am getting on a real life debugging session on a binary that exhibits self comparison error: $ abidw --debug-abidiff -d <some-binary> error: problem detected with type 'typedef Vmalloc_t' from second corpus error: canonical type for type 'typedef Vmalloc_t' of type-id 'type-id-179' changed from '1a083e8' to '21369b8' [...] $ From this output, I see that the first type for which libabigail exhibits an instability on the pointer value of the canonical type is the type 'typedef Vmalloc_t'. In other words, when that type is saved to abixml, the type we read back is different. This needs further debugging but at least it pinpoints exactly what type we are seeing the core issue on first. This is of tremendous help in the root cause analysis needed to understand why the self comparison is failing. * include/abg-ir.h (environment::get_type_id_canonical_type_map): Declare new data member. * src/abg-ir.cc (environment::priv::type_id_canonical_type_map_): Define new data member. (environment::get_type_id_canonical_type_map): Define new method. * include/abg-reader.h (load_canonical_type_ids): Declare new function. * src/abg-reader.cc (read_context::m_pointer_type_id_map): Define new data member. (read_context::{get_pointer_type_id_map, maybe_check_abixml_canonical_type_stability}): Define new methods. (read_context::{maybe_canonicalize_type, perform_late_type_canonicalizing}): Invoke maybe_perform_self_comparison_canonical_type_check after canonicalization to perform canonicalization type stability checking. 
(build_type): Associate the pointer value for the newly built type with the type id string identifying it in the abixml. Once the abixml representation is dropped from memory and we are about to perform type canonicalization, we can still know what the type id of a given type coming from abixml was; it's thus possible to verify that the canonical type associated to that type id is the same as the one stored in the typeid file. (read_type_id_string): Define new static function. (load_canonical_type_ids): Define new function. * include/abg-writer.h (write_canonical_type_ids): Likewise. * src/abg-writer.cc (write_canonical_type_ids): Define new function overloads. * tools/abidw.cc (options::type_id_file_path): New data member. (load_corpus_and_write_abixml): Write and read back the typeid file. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
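The three phases described in this commit (write the records, load them back, check each canonical type against them) can be modelled with a short sketch. Everything here is an assumption made for illustration: the one-record-per-line text format, the `type_id_map` alias and the three function names are hypothetical, and the real typeid file written by write_canonical_type_ids may be laid out differently.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical map: type-id string -> canonical type pointer value.
using type_id_map = std::map<std::string, std::uintptr_t>;

// Phase 1: write one "type-id pointer-in-hex" record per line.
void write_type_ids(const type_id_map& ids, std::ostream& out)
{
  for (const auto& entry : ids)
    out << entry.first << ' ' << std::hex << entry.second << '\n';
}

// Phase 2: read the records back into a map.
type_id_map load_type_ids(std::istream& in)
{
  type_id_map ids;
  std::string id;
  std::uintptr_t pointer_value = 0;
  while (in >> id >> std::hex >> pointer_value)
    ids[id] = pointer_value;
  return ids;
}

// Phase 3: after canonicalizing the type read for 'id', check that
// its canonical type pointer is the one that was recorded.
bool canonical_type_is_stable(const type_id_map& recorded,
                              const std::string& id,
                              std::uintptr_t canonical_now)
{
  auto found = recorded.find(id);
  return found != recorded.end() && found->second == canonical_now;
}
```

In this model, the error quoted in the commit message corresponds to `canonical_type_is_stable` returning false for 'type-id-179', where the recorded value is 1a083e8 and the freshly computed one is 21369b8.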
||
Dodji Seketeli
|
104468d1a4 |
Detect failed self comparison in type canonicalization of abixml
During the self comparison triggered by "abidw --abidiff <binary>", some comparison errors can happen when canonicalizing types that are "de-serialized" from the abixml that was serialized from the input binary. This patch adds some debugging checks and messaging to emit a message when a type from the abixml appears to not "match" the original type from the initial corpus it originated from. This is the more detailed description: Let's consider a type T coming from the corpus of the input binary. That input corpus is serialized into abixml and de-serialized again into a second corpus that we shall name the abixml corpus. From that second corpus, let's consider the type T' that is the result of serializing T into abixml and de-serializing it again. T is said to be the original type of T'. If T is a canonical type, then T' should equal T. Otherwise, if T is not a canonical type, its canonical type should equal the canonical type of T'. For the sake of simplicity, let's consider that T is a canonical type. During the canonicalization of T', T' should equal T. Each and every canonical type coming from the abixml corpus should be equal to its original type from the binary corpus. If a T' is different from its original type T, then there is an "equality problem" between T and T'. In other words, there is a mismatch between T and T'. We want to be notified of that problem so that we can debug it further and fix it. So this patch introduces the option "abidw --debug-abidiff <binary>" to trigger the "debug self comparison mode". At canonicalization time, we detect that we are in that debug self comparison mode and during canonicalization of types from the abixml corpus, it detects when they compare different from their counterpart from the original corpus. This debugging capability can be enabled at configure time with a new --enable-debug-self-comparison configure option. 
That option defines a new WITH_DEBUG_SELF_COMPARISON compile time macro that is used to conditionally compile the implementation of this debugging feature. So, one example of this might look like this: abidw --debug-abidiff bin: error: problem detected with type 'typedef Vmalloc_t' from second corpus error: problem detected with type 'Vmalloc_t*' from second corpus [...] So that means the "typedef Vmalloc_t" read from the abixml compares different from its original type where it should not. So armed with this new insight, I know I need to debug that comparison in particular to see why it wrongly results in two different types. * doc/manuals/abidw.rst: Add documentation for the --debug-abidiff option. * include/abg-ir.h (environment::{set_self_comparison_debug_input, get_self_comparison_debug_inputs, self_comparison_debug_is_on}): Declare new methods. * configure.ac: Define a new --enable-debug-self-comparison option that is disabled by default. That option defines a new WITH_DEBUG_SELF_COMPARISON preprocessor macro. * src/abg-ir.cc (environment::priv::{first_self_comparison_corpus_, second_self_comparison_corpus_, self_comparison_debug_on_}): New data members. Also, re-indent the data members. (environment::{set_self_comparison_debug_input, get_self_comparison_debug_inputs, self_comparison_debug_is_on}): Define new method. (type_base::get_canonical_type_for): In the "debug self comparison mode", if a type coming from the second corpus compares different from its counterpart coming from the first corpus then log a debug message. * src/abg-dwarf-reader.cc (read_debug_info_into_corpus): When loading the first corpus, if the debug self comparison mode is on, then save that corpus on the side in the environment. * src/abg-reader.cc (read_corpus_from_input): When loading the second corpus, if the debug self comparison mode is on, then save that corpus on the side in the environment. 
* tools/abidw.cc: Include the config.h file for preprocessor macros defined at configure (options::debug_abidiff): New data member. (parse_command_line): Parse the --debug-abidiff option. (load_corpus_and_write_abixml): Switch the self debug mode on when the --debug-abidiff option is provided. Use a read_context for the abixml loading. That is going to be useful for subsequent patches. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
6eee409137 |
Add primitives callable from the command line of the debugger
During debugging it can be extremely useful to be able to visualize the data members of a class type, instance of abigail::ir::class_decl*. It's actually useful to visualize the pretty representation (type name and kind) of all types and decls that inherit abigail::ir::type_or_decl_base, basically. Today, in the debugger, if we have a variable defined as "abigail::ir::type_or_decl_base* t", we can type: $ p t->get_pretty_representation(true, true); This would display something like: $ typedef foo_t However, if 't' is declared as: "abigail::ir::class_decl* t", then if we type: (gdb) p t->get_pretty_representation(true, true); We'll get something like: class foo_klass (gdb) So we get the kind and the name of the ABI artifact; but in case of a class, we don't get the details of its data members. This patch introduces a function named "debug" which would be invoked on the 't' above like this: (gdb) p debug(t) It would yield: struct tm { // size in bits: 448 // translation unit: test24-drop-fns.cc // @: 0x5387a0, @canonical: 0x5387a0 int tm_sec; // uses canonical type '@0x538270' int tm_min; // uses canonical type '@0x538270' int tm_hour; // uses canonical type '@0x538270' int tm_mday; // uses canonical type '@0x538270' int tm_mon; // uses canonical type '@0x538270' int tm_year; // uses canonical type '@0x538270' int tm_wday; // uses canonical type '@0x538270' int tm_yday; // uses canonical type '@0x538270' int tm_isdst; // uses canonical type '@0x538270' long int tm_gmtoff; // uses canonical type '@0x461200' const char* tm_zone; // uses canonical type '@0x544528' }; (gdb) This gives much more information to understand what 't' designates. The patch also provides functions to retrieve one data member from a given type that happens to designate a class type. 
For instance: (gdb) p get_data_member(t, "tm_sec") This would yield: $19 = std::shared_ptr<abigail::ir::var_decl> (use count 4, weak count 0) = {get() = 0x9d9a80} We could visualize that data member by doing: (gdb) p debug(get_data_member(t, "tm_sec")._M_ptr) int tm::tm_sec (gdb) The patch also provides a new 'debug_equals' function that allows us to easily perform an artifact comparison from the command line of the debugger, as well as methods to the environment type to poke at the canonical types available in the environment. These new debugging primitives already proved priceless while debugging issues that are fixed by subsequent patches to come. * include/abg-fwd.h (get_debug_representation, get_data_member) (debug, debug_equals): Declare new functions. * include/abg-ir.h (environment::{get_canonical_types, get_canonical_type}): Declare new member functions. * src/abg-ir.cc (environment::{get_canonical_types, get_canonical_type}): Define new member functions. (get_debug_representation, get_data_member) (debug, debug_equals): Define new functions. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
e89bf5abe8 |
Peel array types when peeling pointers from a type
In peel_typedef_pointer_or_reference_type, we want to peel typedefs and pointer types (in general) from a given type. We need to peel array types as well, as those are conceptually a pointer-like type as well. This patch does that. * src/abg-ir.cc (peel_typedef_pointer_or_reference_type): In the overloads for type_base_sptr and type_base*, peel array type off as well. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
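As a rough illustration of what "peeling" means here, the sketch below strips typedef, pointer, reference and array layers until it reaches a type that is none of those. The `type_kind` enum, `toy_type` struct and the function name are hypothetical; libabigail's peel_typedef_pointer_or_reference_type works on its own IR classes.

```cpp
#include <cassert>

enum class type_kind { basic, typedef_decl, pointer, reference, array };

// Hypothetical type node: 'inner' is the underlying, pointed-to,
// referred-to or element type, depending on 'kind'.
struct toy_type
{
  type_kind kind;
  const toy_type* inner;
};

// Strip typedef, pointer, reference and array layers. Arrays are
// treated like the other pointer-like kinds, as the commit describes.
const toy_type*
peel_typedef_pointer_reference_or_array(const toy_type* t)
{
  while (t && t->inner
         && (t->kind == type_kind::typedef_decl
             || t->kind == type_kind::pointer
             || t->kind == type_kind::reference
             || t->kind == type_kind::array))
    t = t->inner;
  return t;
}
```

Without the `array` case in the loop condition, peeling a typedef of a pointer to `int[2]` would stop at the array type instead of reaching `int`.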
||
Dodji Seketeli
|
fa5ff32afb |
Fix DWARF type DIE canonicalization
While looking at something else, I noticed that the DWARF type DIE canonicalization code wasn't taking the type of array elements into account when comparing arrays. This patch fixes that. * src/abg-dwarf-reader.cc (compare_dies): When comparing array type DIEs, take into account the type of the elements of the arrays. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
073185e7ab |
Miscellaneous indentation and comments cleanups
While looking at something else, I did some indentation and comments cleanups. * src/abg-ir.cc (environment::priv::{config_, canonical_types_, sorted_canonical_types_, void_type_, variadic_marker_type_}): Re-indent these data members. (peel_typedef_pointer_or_reference_type): Fix comment. (var_decl::var_decl): Likewise. (function_decl::function_decl): Add a comment. * src/abg-reader.cc (handle_reference_type_def): Fix indentation of parameters. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
26c41c060b |
Fix thinko in configure.ac
* configure.ac: Fix a thinko I spotted while looking at something else. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
1656f9dd7b |
reader: Use xmlFirstElementChild/xmlNextElementSibling to iterate over children elements
Use xmlFirstElementChild/xmlNextElementSibling to iterate over element children nodes rather than doing it by hand in the various for loops. * src/abg-reader.cc (walk_xml_node_to_map_type_ids) (read_translation_unit, read_translation_unit_from_input) (read_symbol_db_from_input, build_needed) (read_elf_needed_from_input, read_corpus_group_from_input) (build_namespace_decl, build_elf_symbol_db, build_function_decl) (build_function_type, build_array_type_def, build_enum_type_decl) (build_class_decl, build_union_decl, build_function_tdecl) (build_class_tdecl, build_type_composition) (build_template_tparameter): Use xmlFirstElementChild/xmlNextElementSibling rather than poking at xmlNode::children and looping over xmlNode::next by hand. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
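The difference between walking `children`/`next` by hand and using element-only accessors, as in commit 1656f9dd7b above, can be sketched with a toy node structure. The `Node` type and the `first_element_child` / `next_element_sibling` helpers below are hypothetical stand-ins written for this example; they only mimic the documented behaviour of libxml2's xmlFirstElementChild and xmlNextElementSibling (skipping non-element nodes such as whitespace text) and do not use libxml2 itself.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in (assumption) for an XML tree node: a type, a first child
// and a next sibling. This is not the libxml2 xmlNode structure.
enum class NodeType { Element, Text };

struct Node {
  NodeType type;
  std::string name;
  Node* children;  // first child of any type
  Node* next;      // next sibling of any type
  Node(NodeType t, std::string n)
    : type(t), name(std::move(n)), children(nullptr), next(nullptr) {}
};

// Starting at 'n' (inclusive), return the first element node, or nullptr.
Node* skip_to_element(Node* n) {
  while (n && n->type != NodeType::Element)
    n = n->next;
  return n;
}

Node* first_element_child(Node* parent) {
  return parent ? skip_to_element(parent->children) : nullptr;
}

Node* next_element_sibling(Node* n) {
  return n ? skip_to_element(n->next) : nullptr;
}

// Collect the names of the element children, ignoring text nodes.
std::vector<std::string> child_element_names(Node* parent) {
  std::vector<std::string> names;
  for (Node* c = first_element_child(parent); c; c = next_element_sibling(c))
    names.push_back(c->name);
  return names;
}

// Builds a parent whose children are: text, element, text, element.
std::vector<std::string> demo_child_names() {
  Node corpus(NodeType::Element, "abi-corpus");
  Node ws1(NodeType::Text, "\n");
  Node needed(NodeType::Element, "elf-needed");
  Node ws2(NodeType::Text, "\n");
  Node instr(NodeType::Element, "abi-instr");
  corpus.children = &ws1;
  ws1.next = &needed;
  needed.next = &ws2;
  ws2.next = &instr;
  return child_element_names(&corpus);
}
```

The loop in `child_element_names` never has to test the node type itself, which is the simplification the commit describes for the reader's `for` loops.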
Dodji Seketeli
|
09c7a773a3 |
reader: Use xmlFirstElementChild and xmlNextElementSibling rather than xml::advance_to_next_sibling_element
The xml::advance_to_next_sibling_element is redundant with the xmlNextElementSibling API of libxml. Similarly, xmlFirstElementChild is redundant with using xml::advance_to_next_sibling_element on the xmlNode::children data member. Let's use the libxml API instead. * include/abg-libxml-utils.h (advance_to_next_sibling_element): Remove the declaration of this function. * src/abg-libxml-utils.cc (go_to_next_sibling_element_or_stay) (advance_to_next_sibling_element): Remove definitions of these functions. * src/abg-reader.cc (read_translation_unit_from_input) (read_elf_needed_from_input, read_corpus_group_from_input): Use xmlNextElementSibling instead of xml::advance_to_next_sibling_element. (read_corpus_from_input): Likewise. Also, use xmlFirstElementChild instead of xml::advance_to_next_sibling_element on the xmlNode::children data member. (read_corpus_group_from_input): use xmlFirstElementChild instead of xml::advance_to_next_sibling_element on the xmlNode::children data member. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
Dodji Seketeli
|
dd55550355 |
reader: Handle 'abi-corpus' element being possibly empty
This problem was reported at https://sourceware.org/bugzilla/show_bug.cgi?id=27616. The abixml reader wrongly assumes that the 'abi-corpus' element is always non-empty. Note that until now, the only emitter of abixml consumed in practice was abg-writer.cc and it only emits non-empty 'abi-corpus' elements. So the issue wasn't exposed. So, the reader assumes that an 'abi-corpus' element has at least a text node. For instance, consider this minimal input file named test-v0.abi: $ cat test-v0.abi <abi-corpus-group architecture='elf-arm-aarch64'> <abi-corpus path='vmlinux' architecture='elf-arm-aarch64'> </abi-corpus> </abi-corpus-group> $ Now, compare it to this file, test-v1.abi, where the abi-corpus element is an empty element (doesn't even contain any text): $ cat test-v1.abi <abi-corpus-group architecture='elf-arm-aarch64'> <abi-corpus path='vmlinux'/> </abi-corpus-group> $ Comparing the two files with abidiff (wrongly) reports: $ abidiff test-v0.abi test-v1.abi ELF architecture changed Functions changes summary: 0 Removed, 0 Changed, 0 Added function Variables changes summary: 0 Removed, 0 Changed, 0 Added variable architecture changed from 'elf-arm-aarch64' to '' $ What's happening is that read_corpus_from_input is getting out early when it sees that the node is empty. This is at: xmlNodePtr node = ctxt.get_corpus_node(); @@ -1907,10 +1925,14 @@ read_corpus_from_input(read_context& ctxt) corp.set_soname(reinterpret_cast<char*>(soname_str.get())); } if (!node->children) // <---- we get out early here and we return nil; // forget about the properties of // the current empty corpus element node So, at its core, fixing the issue at hand involves avoiding the early return there. But then, it turns out that's not enough. In the current setting, the different abixml processing entry points are designed to be used in a semi "streaming" mode. So for instance, read_translation_unit_from_input can be invoked repeatedly to "stream in" the next translation unit at each invocation.
Alternatively, the lower level xmlTextReaderNext can be used to iterate over XML nodes until we reach the translation unit XML element we are interested in. At that point xmlTextReaderExpand can be used to expand the XML node, then we let the context know that this is the current node of the corpus that needs to be processed, using read_context::set_corpus_node. Once we've done that, read_translation_unit_from_input can be called to process that particular corpus node. Note that the corpus node at hand that needs to be processed will be retrieved by read_context::get_corpus_node. These two modes of operation are also available for read_corpus_from_input, read_symbol_db_from_input, read_elf_needed_from_input etc. Today, these functions all assume that the current node returned by read_context::get_corpus_node is the node /before/ the node of the corpus to be processed. So they all start looking at the /next sibling/ of the node returned by read_context::get_corpus_node. So the code was implicitly assuming that read_context::get_corpus_node was pointing to a text node that was before the node of the corpus that we want to process. This is wrong. read_context::get_corpus_node should just return the current node of the corpus that needs to be processed and voila. And so read_context::set_corpus_node should be used to set the current node of the corpus to the current element node that needs to be processed. That's the spirit of the change done by this patch. As its name suggests, the existing xml::advance_to_next_sibling_element is used to skip non-element XML nodes (including text nodes) and move to the next element node to process, which is set to the context using read_context::set_corpus_node. Then the actual processing functions like read_corpus_from_input get the node to process, using read_context::get_corpus_node and process it rather than processing the sibling node that comes after it.
The other changes prevent related crashes that I noticed while doing various tests, update the abilint tool used to read and debug abixml input files, and add better documentation. * src/abg-reader.cc (read_context::get_corpus_node): Add comment to this member function. (read_translation_unit_from_input, read_symbol_db_from_input) (read_elf_needed_from_input): Start processing the current node of the corpus that needs to be processed rather than its next sibling. Once the processing is done, set the new "current node of the corpus to be processed" properly by skipping to the next element node to be processed. (read_corpus_from_input): Don't get out early when the 'abi-corpus' element is empty. If, however, it has child nodes, skip to the first child element and flag it -- using read_context::set_corpus_node -- as being the element node to be processed by the processing facilities of the reader. If we are in a mode where we called xmlTextReaderExpand ourselves to get the node to process, then it means we need to free that node indirectly by calling xmlTextReaderNext. In that case, that node should not be flagged by read_context::set_corpus_node. Add more comments. * src/abg-corpus.cc (corpus::is_empty): Do not crash when no symtab is around. * src/abg-libxml-utils.cc (go_to_next_sibling_element_or_stay): Fix typo in comment. (advance_to_next_sibling_element): Don't crash when given a nil node. * tests/data/test-abidiff/test-PR27616-squished-v0.abi: Add new test input. * tests/data/test-abidiff/test-PR27616-squished-v1.abi: Likewise. * tests/data/test-abidiff/test-PR27616-v0.xml: Likewise. * tests/data/test-abidiff/test-PR27616-v1.xml: Likewise. * tests/data/Makefile.am: Add the new test inputs above to source distribution. * tests/test-abidiff.cc (specs): Add the new test inputs above to this harness. * tools/abilint.cc (main): Support writing corpus groups. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |
||
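The core of the fix in commit dd55550355 above, keeping the properties of an 'abi-corpus' element even when it has no children, can be sketched as follows. Everything here is an assumption made for illustration: `Element`, `Corpus`, `attr_or_empty` and `read_corpus` are hypothetical names in a toy model, not the reader code in src/abg-reader.cc, and the attribute handling is reduced to a string map.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified element: attributes plus child element names.
struct Element {
  std::map<std::string, std::string> attrs;
  std::vector<std::string> children;
};

// Hypothetical, simplified corpus carrying only what the sketch needs.
struct Corpus {
  std::string path;
  std::string architecture;
  std::size_t child_count;
};

std::string attr_or_empty(const Element& e, const std::string& key) {
  std::map<std::string, std::string>::const_iterator it = e.attrs.find(key);
  return it == e.attrs.end() ? std::string() : it->second;
}

// Read the element's properties first, then its children. An element
// without children still yields a corpus with its properties set,
// instead of being discarded by an early return.
Corpus read_corpus(const Element& e) {
  Corpus c;
  c.path = attr_or_empty(e, "path");
  c.architecture = attr_or_empty(e, "architecture");
  c.child_count = e.children.size();  // zero for an empty element
  return c;
}

// Models an empty element that still carries attributes.
Corpus demo_empty_corpus() {
  Element e;
  e.attrs["path"] = "vmlinux";
  e.attrs["architecture"] = "elf-arm-aarch64";
  return read_corpus(e);
}
```

In this model, two inputs that differ only in whether the element is written as an empty element or with whitespace content produce the same `Corpus`, which is the behaviour the commit message asks of abidiff.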
Dodji Seketeli
|
b215a21153 |
dwarf-reader: properly set artificial-ness in opaque types
get_opaque_version_of_type forgets to set the "is-artificial" property according to the initial type the opaque type is derived from. This can lead to some instability in the abixml output. Fixed thus. * src/abg-dwarf-reader.cc (get_opaque_version_of_type): Propagate the artificial-ness of the original type here. * tests/data/test-read-dwarf/PR27700/test-PR27700.abi: Adjust. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |