libabigail/include/abg-reader.h
Dodji Seketeli 27d2927107 Detect abixml canonical type instability during abidw --debug-abidiff
In the debugging mode of self comparison induced by the invocation of
"abidw --debug-abidiff <binary>", it's useful to be able to ensure the
following invariant:

    The pointer value of the canonical type of a type T that is
    serialized into abixml with the id string "type-id-12" (for
    instance) must keep the same canonical type pointer value when
    that abixml file is de-serialized back into memory.  This is
    possible mainly because libabigail stays loaded in memory all the
    time during both serialization and de-serialization.

This patch adds support for detecting when that invariant is not
respected.

In other words it detects when the type build from de-serializing the
type which id is "type-id-12" (for instance) has a canonical type
which pointer value is different from the pointer value of the
canonical type (of the type) that was serialized as having the type id
"type-id-12".

This is done in three phases.

The first phase happens in the code of abidw itself; after the abixml
is written on disk, another file called the "typeid file" is written
on disk as well.  That later file contains a set of records; each
record associates a "type id string" (like the type IDs that appear in
the abixml file) to the pointer value of the canonical type that
matches that type id string.  That file is thus now available for
manual inspection during a later debugger session.  This is done by
invoking the new function write_canonical_type_ids.

The second phase appears right before abixml loading time.  The typeid
file is read back and the association "type-id string" <-> is stored
in a hash map that is returned by
environment::get_type_id_canonical_type_map().  This is done by
invoking the new function load_canonical_type_ids.

The third phase happens right after the canonicalization (triggered in
the abixml reader) of a type coming from abixml, corresponding to a
given type id.  It checks if the pointer value of the canonicalization
type just computed is the same as the one associated to that type id
in the map returned by environment::get_type_id_canonical_type_map.

This is a way of verifying the "stability" of a canonical type during
its serialization and de-serialization to and from abixml and it's
done as part of "abidw --debug-abidiff <binary>".

Just as an example, here is the kind of error output that I am getting
on a real life debugging session on a binary that exhibits self
comparison error:

    $ abidw  --debug-abidiff -d <some-binary>
    error: problem detected with type 'typedef Vmalloc_t' from second corpus
    error: canonical type for type 'typedef Vmalloc_t' of type-id 'type-id-179' changed from '1a083e8' to '21369b8'
    [...]
    $

From this output, I see that the first type for which libabigail
exhibits an instability on the pointer value of the canonical type is
the type 'typedef Vmalloc_t'.  In other words, when that type is saved
to abixml, the type we read back is different.  This needs further
debugging but at least it pinpoints exactly what type we are seeing
the core issue on first.  This is of a tremendous help in the root
cause analysis needed to understand why the self comparison is
failing.

	* include/abg-ir.h (environment::get_type_id_canonical_type_map):
	Declare new data member.
	* src/abg-ir.cc (environment::priv::type_id_canonical_type_map_):
	Define new data member.
	(environment::get_type_id_canonical_type_map): Define new method.
	* include/abg-reader.h (load_canonical_type_ids): Declare new
	function.
	* src/abg-reader.cc (read_context::m_pointer_type_id_map):
	Define new data member.
	(read_context::{get_pointer_type_id_map,
	maybe_check_abixml_canonical_type_stability}): Define new methods.
	(read_context::{maybe_canonicalize_type,
	perform_late_type_canonicalizing}): Invoke
	maybe_perform_self_comparison_canonical_type_check after
	canonicalization to perform canonicalization type stability
	checking.
	(build_type): Associate the pointer value for the newly built type
	with the type id string identifying it in the abixml.  Once the
	abixml representation is dropped from memory and we are about to
	perform type canonicalization, we can still know what the type id
	of a given type coming from abixml was; it's thus possible to
	verify that the canonical type associated to that type id is the
	same as the one stored in the typeid file.
	(read_type_id_string): Define new static function.
	(load_canonical_type_ids): Define new function.
	* include/abg-writer.h (write_canonical_type_ids): Likewise.
	* src/abg-writer.cc (write_canonical_type_ids): Define new
	function overloads.
	* tools/abidw.cc (options::type_id_file_path): New data member.
	(load_corpus_and_write_abixml): Write and read back the typeid
	file.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2021-05-25 12:24:26 +02:00

94 lines
2.1 KiB
C++

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
// -*- Mode: C++ -*-
//
// Copyright (C) 2013-2021 Red Hat, Inc.
//
// Author: Dodji Seketeli
/// @file
///
/// This file contains the declarations of the entry points to
/// de-serialize an instance of @ref abigail::translation_unit from an
/// ABI Instrumentation file in libabigail native XML format.
#ifndef __ABG_READER_H__
#define __ABG_READER_H__
#include <istream>
#include "abg-corpus.h"
#include "abg-suppression.h"
namespace abigail
{
namespace xml_reader
{
using namespace abigail::ir;
translation_unit_sptr
read_translation_unit_from_file(const std::string& file_path,
environment* env);
translation_unit_sptr
read_translation_unit_from_buffer(const std::string& file_path,
environment* env);
translation_unit_sptr
read_translation_unit_from_istream(std::istream* in,
environment* env);
class read_context;
/// A convenience typedef for a shared pointer to read_context.
typedef shared_ptr<read_context> read_context_sptr;
read_context_sptr
create_native_xml_read_context(const string& path, environment *env);
read_context_sptr
create_native_xml_read_context(std::istream* in, environment* env);
const string&
read_context_get_path(const read_context&);
corpus_sptr
read_corpus_from_native_xml(std::istream* in,
environment* env);
corpus_sptr
read_corpus_from_native_xml_file(const string& path,
environment* env);
corpus_sptr
read_corpus_from_input(read_context& ctxt);
corpus_group_sptr
read_corpus_group_from_input(read_context& ctxt);
corpus_group_sptr
read_corpus_group_from_native_xml(std::istream* in,
environment* env);
corpus_group_sptr
read_corpus_group_from_native_xml_file(const string& path,
environment* env);
void
add_read_context_suppressions(read_context& ctxt,
const suppr::suppressions_type& supprs);
void
consider_types_not_reachable_from_public_interfaces(read_context& ctxt,
bool flag);
}//end xml_reader
#ifdef WITH_DEBUG_SELF_COMPARISON
bool
load_canonical_type_ids(xml_reader::read_context& ctxt,
const string& file_path);
#endif
}//end namespace abigail
#endif // __ABG_READER_H__