libabigail/include/abg-corpus.h
Dodji Seketeli cf8eba68c3 Implement string interning for Libabigail
This patch implements string interning optimization.  One can read
about the principles of this optimization at
https://en.wikipedia.org/wiki/String_interning.

The patch introduces an abigail::interned_string type, as well as an
abigail::interned_string_pool type.  Each environment type owns a
string pool and strings are interned in that pool for all types and
decls of that environments.  The interned_string has methods to
interact seemingly with std::string including a hashing function.  Of
course hashing and comparing interned_string is faster than for
std::string.

To enable ABI artifacts to intern strings, each constructor of ABI
artifacts now takes the environment it's constructed in as parameter.
From the environment, it can thus use the interned string pool.

The patch then changes declaration names to be of type
interned_string, and performs the necessary adjustments.  The hash
maps that hash strings coming from those declaration names are
adjusted to hash interned_string.

	* include/Makefile.am: Add the new abg-interned-str.h file to
	source distribution.
	* include/abg-corpus.h (corpus::corpus): Re-arrange the order of
	* src/abg-corpus.cc
	(corpus::exported_decls_builder::priv::get_id): Return
	interned_string rather than std::string.
	(corpus::corpus): Re-arrange the order of parameters: take an
	environment as first parameter.  parameters: take an environment
	as first parameter.
	* include/abg-dwarf-reader.h (lookup_symbol_from_elf)
	(lookup_public_function_symbol_from_elf): Likewise.
	* src/abg-dwarf-reader.cc (lookup_symbol_from_sysv_hash_tab)
	(lookup_symbol_from_gnu_hash_tab)
	(lookup_symbol_from_elf_hash_tab, lookup_symbol_from_symtab)
	(lookup_symbol_from_elf, lookup_public_function_symbol_from_elf)
	(lookup_public_variable_symbol_from_elf, lookup_symbol_from_elf)
	(lookup_public_function_symbol_from_elf): Take an environment as
	first parameter and adjust.
	(build_translation_unit_and_add_to_ir)
	(build_namespace_decl_and_add_to_ir, build_type_decl)
	(build_enum_type, finish_member_function_reading)
	(build_class_type_and_add_to_ir, build_function_type)
	(read_debug_info_into_corpus, read_corpus_from_elf): Adjust.
	* include/abg-fwd.h: Include abg-interned-str.h
	(get_type_name, get_function_type_name, get_method_type_name):
	Return a interned_string, rather than a std::string.
	* include/abg-interned-str.h: New declarations for interned strings
	and their pool.
	* include/abg-ir.h (environment::intern): Declare new method.
	(elf_symbol::{g,s}et_environment): Likewise.
	(type_or_decl_base::type_or_decl_base): Make the default
	constructor private.
	({translation, type_or_decl_base}::set_environment)
	(set_environment_for_artifact): Take a const environment*.
	(elf_symbol::elf_symbol)
	(elf_symbol::create)
	(type_or_decl_base::type_or_decl_base)
	(translation::translation, decl_base::decl_base)
	(scope_decl::scope_decl, type_base::type_base)
	(type_decl::type_decl, scope_type_decl::scope_type_decl)
	(namespace_decl::namespace_decl)
	(enum_type_decl::enumerator::enumerator)
	(function_type::function_type, method_type::method_type)
	(template_decl::template_decl, function_tdecl::function_tdecl)
	(class_tdecl::class_tdecl, class_decl::class_decl): Take an
	environment.
	(type_or_decl_base::operator=)
	(enum_type_decl::enumerator::get_environment): Declare new method.
	(decl_base::{peek_qualified_name, peek_temporary_qualified_name,
	get_qualified_name, get_name, get_qualified_parent_name,
	get_linkage_name}, qualified_type_def::get_qualified_name)
	(reference_type_def::get_qualified_name)
	(array_type_def::get_qualified_name)
	(enum_type_decl::enumerator::{get_name, get_qualified_name})
	({var,function}_decl::get_id)
	(function_decl::parameter::{get_type_name, get_name_id}): Return
	an interned_string, rather than a std::string.
	(decl_base::{set_qualified_name, set_temporary_qualified_name,
	get_qualified_name, set_linkage_name})
	(qualified_type_def::get_qualified_name)
	(reference_type_def::get_qualified_name)
	(array_type_def::get_qualified_name)
	(function_decl::parameter::get_qualified_name): Take an
	interned_string, rather than a std::string.
	(class_decl::member_{class,function}_template::member_{class,function}_template):
	Adjust.
	* src/abg-ir.cc (environment_setter::env_): Make this be a pointer
	to const environment.
	(environment_setter::visit_begin): Adjust.
	(interned_string_pool::priv): Define new type.
	(interned_string_pool::*): Define the method declared in
	abg-interned-str. h.
	(operator==, operator!=, operator+): Define operator for interned_string and
	std::string
	(operator<<): Define for interned_string.
	(translation_unit::priv::env_): Make this be a pointer to const
	environment.
	(translation_unit::priv::priv): Take a pointer to const
	environment.
	(elf_symbol::priv::env_): New data member.
	(elf_symbol::priv::priv): Adjust.  Make an overoad take an
	environment.
	(translation_unit::{g,s}et_environment): Adjust.
	(interned_string_bool_map_type): New typedef.
	(environment::priv::classes_being_compared_): Make this hastable
	of string be a hashtable of interned_string.
	(environment::priv::string_pool_): New data member.
	(environment::{get_void_type_decl,
	get_variadic_parameter_type_decl}): Adjust.
	(type_or_decl_base::priv::env_): Make this be a pointer to const
	environment.
	(type_or_decl::base::priv::priv): Adjust.
	(type_or_decl_base::set_environment)
	(set_environment_for_artifact): Take a pointer to const
	environment.
	(elf_symbol::{g,s}et_environment, environment::intern)
	(type_or_decl_base::operator=): Define new methods.
	(decl_base::priv::{name_, qualified_parent_name_,
	temporary_qualified_name_, qualified_name_, linkage_name_}): Make
	these data member be of tpe interned_string.
	(decl_base::priv::priv): Make this take an environment. Adjust.
	(decl_base::{peek_qualified_name, peek_temporary_qualified_name,
	get_linkage_name, get_qualified_parent_name, get_name,
	get_qualified_name}, get_type_name, get_function_type_name)
	(get_method_type_name, get_node_name)
	(qualified_type_def::get_qualified_name)
	(pointer_type_def::get_qualified_name)
	(array_type_def::get_qualified_name)
	(enum_type_decl::enumerator::get_qualified_name)
	(var_decl::get_id, function_decl::get_id)
	(function_decl::parameter::get_{name_id, type_name}): Return an
	interned_string.
	(decl_base::{set_qualified_name, set_temporary_qualified_name})
	(qualified_type_def::get_qualified_name)
	(pointer_type_def::get_qualified_name)
	(reference_type_def::get_qualified_name)
	(array_type_def::get_qualified_name)
	(function_decl::parameter::get_qualified_name): Take an
	interned_string.
	(decl_base::{set_name, set_linkage_name}): Intern the std::string
	passed in parameter.
	(equals): In the overload for decl_base, adjust for a little speed
	optimization that is justified by profiling.
	(pointer_type_def::priv::{internal_qualified_name_,
	temp_internal_qualified_name_}): Make these data member be
	interned_string.
	(enum_type_decl::enumerator::priv::env_): New data member.
	(enum_type_decl::enumerator::priv::{name_, qualified_name}): Make
	these data member be of type interned_string.
	(enum_type_decl::enumerator::get_environment): New method.
	(enum_type_decl::enumerator::priv::priv) Adjust.
	(typedef_decl::operator==): Implement a little speed optimization.
	(var_decl::priv::nake_type_): New data member.
	(var_decl::priv::id_): Make this data member be of type
	interned_string.
	(equals): In the overload for var_decl, function_type,
	function_decl, adjust for the use of interned_string.
	(function_decl::priv::id_): Make this be of type interned_string.
	(scope_decl::{add_member_decl, insert_member_decl})
	(lookup_function_type_in_translation_unit)
	(synthesize_type_from_translation_unit, lookup_node_in_scope)
	(lookup_type_in_scope, scope_decl::scope_decl)
	(qualified_type_def::qualified_type_def)
	(qualified_type_def::get_qualified_name)
	(pointer_type_def::pointer_type_def)
	(reference_type_def::reference_type_def)
	(array_type_def::array_type_def, array_type_def::append_subrange)
	(array_type_def::get_qualified_name)
	(enum_type_decl::enum_type_decl)
	(enum_type_decl::enumerator::get_qualified_name)
	(enum_type_decl::enumerator::set_name)
	(typedef_decl::typedef_decl, var_decl::var_decl)
	(function_type::function_type, method_type::method_type)
	(function_decl::function_decl)
	(function_decl::parameter::parameter)
	(class_decl::priv::comparison_started)
	(class_decl::add_base_specifier)
	(class_decl::base_spec::base_spec)
	(class_decl::method_decl::method_decl)
	(type_tparameter::type_tparameter)
	(non_type_tparameter::non_type_tparameter)
	(template_tparameter::template_tparameter)
	(type_composition::type_composition)
	(function_tdecl::function_tdecl, class_tdecl::class_tdecl)
	(qualified_name_setter::do_update): Adjust.
	(translation_unit::translation_unit, elf_symbol::elf_symbol)
	(elf_symbol::create, type_or_decl_base::type_or_decl_base)
	(decl_base::decl_base, type_base::type_base)
	(type_decl::type_decl, scope_type_decl::scope_type_decl)
	(namespace_decl::namespace_decl)
	(enum_type_decl::enumerator::enumerator, class_decl::class_decl)
	(template_decl::template_decl, function_tdecl::function_tdecl)
	(class_tdecl::class_tdecl): Take an environment.
	* src/abg-comparison.cc
	(function_suppression::suppresses_function): Adjust.
	* src/abg-reader.cc (read_translation_unit)
	(read_corpus_from_input, build_namespace_decl, build_elf_symbol)
	(build_function_parameter, build_function_decl, build_type_decl)
	(build_function_type, build_enum_type_decl, build_enum_type_decl)
	(build_class_decl, build_function_tdecl, build_class_tdecl)
	(read_corpus_from_native_xml): Likewise.
	* src/abg-writer.cc (id_manager::m_cur_id): Make this mutable.
	(id_manager::m_env): New data member.
	(id_manager::id_manager): Adjust.
	(id_manager::get_environment): New method.
	(id_manager::{get_id, get_id_with_prefix}): Return an
	interned_string.
	(type_ptr_map): Make this be a hash map of type_base* ->
	interned_string, rather a type_base* -> string.
	(write_context::m_env): New data member.
	(write_context::m_type_id_map): Make this data member be mutable.
	(write_context::m_emitted_type_id_map): Make this be a hash map of
	interned_string -> bool, rather than string -> bool.
	(write_context::write_context): Take an environment and adjust.
	(write_context::get_environment): New method.
	(write_context::get_id_manager): New const overload.
	(write_context::get_id_for_type): Return an interned_string; adjust.
	(write_context::{record_type_id_as_emitted,
	record_type_as_referenced}): Adjust.
	(write_context::type_id_is_emitted): Take an interned_string.
	(write_context::{type_is_emitted,
	record_decl_only_type_as_emitted}): Adjust.
	(write_translation_unit, write_corpus_to_native_xml, dump):
	Adjust.
	* tools/abisym.cc (main): Adjust.
	* tests/data/test-read-write/test22.xml: Adjust.
	* tests/data/test-read-write/test23.xml: Adjust.
	* tests/data/test-read-write/test26.xml: Adjust.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2016-02-24 15:13:20 +01:00

330 lines
7.2 KiB
C++

// -*- mode: C++ -*-
//
// Copyright (C) 2013-2016 Red Hat, Inc.
//
// This file is part of the GNU Application Binary Interface Generic
// Analysis and Instrumentation Library (libabigail). This library is
// free software; you can redistribute it and/or modify it under the
// terms of the GNU Lesser General Public License as published by the
// Free Software Foundation; either version 3, or (at your option) any
// later version.
// This library is distributed in the hope that it will be useful, but
// WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
// General Lesser Public License for more details.
// You should have received a copy of the GNU Lesser General Public
// License along with this program; see the file COPYING-LGPLV3. If
// not, see <http://www.gnu.org/licenses/>.
/// @file
#ifndef __ABG_CORPUS_H__
#define __ABG_CORPUS_H__
#include <abg-ir.h>
namespace abigail
{
namespace ir
{
class corpus;
/// A convenience typedef for shared pointer to @ref corpus.
typedef shared_ptr<corpus> corpus_sptr;
/// This is the abstraction of a set of translation units (themselves
/// seen as bundles of unitary abi artefacts like types and decls)
/// bundled together as a corpus. A corpus is thus the Application
/// binary interface of a program, a library or just a set of modules
/// put together.
class corpus
{
public:
struct priv;
/// Convenience typedef for shared_ptr of corpus::priv
typedef shared_ptr<priv> priv_sptr;
/// A convenience typedef for std::vector<string>.
typedef vector<string> strings_type;
/// Convenience typedef for std::vector<abigail::ir::function_decl*>
typedef vector<function_decl*> functions;
///Convenience typedef for std::vector<abigail::ir::var_decl*>
typedef vector<var_decl*> variables;
class exported_decls_builder;
/// Convenience typedef for shared_ptr<exported_decls_builder>.
typedef shared_ptr<exported_decls_builder> exported_decls_builder_sptr;
/// This abstracts where the corpus comes from. That is, either it
/// has been read from the native xml format, from DWARF or built
/// artificially using the library's API.
enum origin
{
ARTIFICIAL_ORIGIN = 0,
NATIVE_XML_ORIGIN,
DWARF_ORIGIN
};
private:
shared_ptr<priv> priv_;
corpus();
void record_canonical_type(const type_base_sptr&) const;
type_base_sptr lookup_canonical_type(const string& qualified_name) const;
public:
corpus(ir::environment*, const string&);
const environment*
get_environment() const;
environment*
get_environment();
void
set_environment(environment*) const;
void
add(const translation_unit_sptr);
const translation_units&
get_translation_units() const;
void
drop_translation_units();
origin
get_origin() const;
void
set_origin(origin);
string&
get_path() const;
void
set_path(const string&);
const vector<string>&
get_needed() const;
void
set_needed(const vector<string>&);
const string&
get_soname();
void
set_soname(const string&);
const string&
get_architecture_name();
void
set_architecture_name(const string&);
bool
is_empty() const;
bool
operator==(const corpus&) const;
void
set_fun_symbol_map(string_elf_symbols_map_sptr);
void
set_undefined_fun_symbol_map(string_elf_symbols_map_sptr);
void
set_var_symbol_map(string_elf_symbols_map_sptr);
void
set_undefined_var_symbol_map(string_elf_symbols_map_sptr);
const string_elf_symbols_map_sptr
get_fun_symbol_map_sptr() const;
const string_elf_symbols_map_type&
get_fun_symbol_map() const;
const string_elf_symbols_map_sptr
get_undefined_fun_symbol_map_sptr() const;
const string_elf_symbols_map_type&
get_undefined_fun_symbol_map() const;
const elf_symbols&
get_sorted_fun_symbols() const;
const elf_symbols&
get_sorted_undefined_fun_symbols() const;
const string_elf_symbols_map_sptr
get_var_symbol_map_sptr() const;
const string_elf_symbols_map_type&
get_var_symbol_map() const;
const string_elf_symbols_map_sptr
get_undefined_var_symbol_map_sptr() const;
const string_elf_symbols_map_type&
get_undefined_var_symbol_map() const;
const elf_symbols&
get_sorted_var_symbols() const;
const elf_symbols&
get_sorted_undefined_var_symbols() const;
const elf_symbol_sptr
lookup_function_symbol(const string& n) const;
const elf_symbol_sptr
lookup_function_symbol(const string& symbol_name,
const elf_symbol::version& version) const;
const elf_symbol_sptr
lookup_function_symbol(const elf_symbol& symbol) const;
const elf_symbol_sptr
lookup_variable_symbol(const string& n) const;
const elf_symbol_sptr
lookup_variable_symbol(const string& symbol_name,
const elf_symbol::version& version) const;
const elf_symbol_sptr
lookup_variable_symbol(const elf_symbol& symbol) const;
const functions&
get_functions() const;
const vector<function_decl*>*
lookup_functions(const string& id) const;
void
sort_functions();
const variables&
get_variables() const;
void
sort_variables();
const elf_symbols&
get_unreferenced_function_symbols() const;
const elf_symbols&
get_unreferenced_variable_symbols() const;
vector<string>&
get_regex_patterns_of_fns_to_suppress();
const vector<string>&
get_regex_patterns_of_fns_to_suppress() const;
vector<string>&
get_regex_patterns_of_vars_to_suppress();
const vector<string>&
get_regex_patterns_of_vars_to_suppress() const;
vector<string>&
get_regex_patterns_of_fns_to_keep();
const vector<string>&
get_regex_patterns_of_fns_to_keep() const;
vector<string>&
get_sym_ids_of_fns_to_keep();
const vector<string>&
get_sym_ids_of_fns_to_keep() const;
vector<string>&
get_regex_patterns_of_vars_to_keep();
const vector<string>&
get_regex_patterns_of_vars_to_keep() const;
vector<string>&
get_sym_ids_of_vars_to_keep();
const vector<string>&
get_sym_ids_of_vars_to_keep() const;
void
maybe_drop_some_exported_decls();
exported_decls_builder_sptr
get_exported_decls_builder() const;
friend class type_base;
};// end class corpus.
/// Abstracts the building of the set of exported variables and
/// functions.
///
/// Given a function or variable, this type can decide if it belongs
/// to the list of exported functions and variables based on all the
/// parameters needed.
class corpus::exported_decls_builder
{
public:
class priv;
/// Convenience typedef for shared_ptr<priv>
typedef shared_ptr<priv> priv_sptr;
friend class corpus;
private:
priv_sptr priv_;
// Forbid default construction.
exported_decls_builder();
public:
exported_decls_builder(functions& fns,
variables& vars,
strings_type& fns_suppress_regexps,
strings_type& vars_suppress_regexps,
strings_type& fns_keep_regexps,
strings_type& vars_keep_regexps,
strings_type& sym_id_of_fns_to_keep,
strings_type& sym_id_of_vars_to_keep);
const functions&
exported_functions() const;
functions&
exported_functions();
const variables&
exported_variables() const;
variables&
exported_variables();
void
maybe_add_fn_to_exported_fns(function_decl*);
void
maybe_add_var_to_exported_vars(var_decl*);
}; //corpus::exported_decls_builder
}// end namespace ir
}//end namespace abigail
#endif //__ABG_CORPUS_H__