libabigail/include/abg-tools-utils.h
Dodji Seketeli 7b35e89315 Bug 19139 - DWARF reader doesn't handle garbage in function names
In this bug, the DWARF debug info of the binary (generated by Intel's
ICC compiler) has interesting constructs like:

     [ 6b5a0]    subprogram
		 decl_line            (data2) 787
		 decl_column          (data1) 15
		 decl_file            (data1) 46
		 declaration          (flag)
		 accessibility        (data1) public (1)
		 type                 (ref4) [ 6b56a]
		 prototyped           (flag)
		 name                 (string) "ldiv"
		 MIPS_linkage_name    (string) "ldiv"
     [ 6b5b6]      formal_parameter
		   type                 (ref4) [ 5f2aa]
		   name                 (string) "$Ë2"
     [ 6b5bf]      formal_parameter
		   type                 (ref4) [ 5f2aa]
		   name                 (string) "$Ë3"

Note the strings that make up the name of the formal parameters of the
function, near the end:

     [ 6b5b6]      formal_parameter
		   type                 (ref4) [ 5f2aa]
		   name                 (string) "$Ë2"
     [ 6b5bf]      formal_parameter
		   type                 (ref4) [ 5f2aa]
		   name                 (string) "$Ë3"

The strings "$Ë2" and "$Ë3" (which are the names of the
parameters of the function) are garbage.

Libabigail's DWARF reader naively uses those strings as names for the
function parameters, in the type of the function.

Then, the abixml writer emits an XML document, with these strings as
property values, representing the name of the type of the function.

And of course, the XML reader later chokes when it tries to read that
XML document, saying that the property is not valid UTF-8.

This patch addresses the issue by dropping those garbage names on the
floor, for function type names.  In that context, any string that is
not made of ASCII characters is considered as being garbage, for now.
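
As an illustration only, the check and the "drop on the floor" step
could look like the sketch below.  The real string_is_ascii is
defined in src/abg-tools-utils.cc and is not shown on this page, so
the body here is an assumption, and sanitize_parameter_name is a
hypothetical helper that does not exist in libabigail.

```cpp
#include <cassert>
#include <string>

// Sketch of an ASCII test; assumed behaviour, not the actual
// definition from src/abg-tools-utils.cc.
static bool
string_is_ascii_sketch(const std::string& s)
{
  for (std::string::const_iterator i = s.begin(); i != s.end(); ++i)
    // Any byte with the high bit set is outside 7-bit ASCII.
    if (static_cast<unsigned char>(*i) > 0x7f)
      return false;
  return true;
}

// Hypothetical helper: keep an ASCII name as is, replace anything
// else with the empty string (i.e. an anonymous parameter).
static std::string
sanitize_parameter_name(const std::string& name)
{
  return string_is_ascii_sketch(name) ? name : std::string();
}
```

With this sketch, a name like "ldiv" is kept, while a name containing
the byte 0xCB is replaced by an empty string.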

The patch, in the abixml writer, also escapes function parameters
names so that they don't contain characters that are not allowed in
XML.  The abixml reader already handles the un-escaping of the names
it reads, so I think there is nothing to do there.
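
For readers unfamiliar with what escaping means here, the sketch
below replaces the five characters that XML reserves in attribute
values with their predefined entities.  It is a stand-alone
illustration; escape_xml_attribute_sketch is a hypothetical name and
is not the routine that src/abg-writer.cc actually calls.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: escape characters that cannot appear
// literally in an XML attribute value.
static std::string
escape_xml_attribute_sketch(const std::string& in)
{
  std::string out;
  out.reserve(in.size());
  for (std::string::const_iterator i = in.begin(); i != in.end(); ++i)
    switch (*i)
      {
      case '<':  out += "&lt;";   break;
      case '>':  out += "&gt;";   break;
      case '&':  out += "&amp;";  break;
      case '"':  out += "&quot;"; break;
      case '\'': out += "&apos;"; break;
      default:   out += *i;       break;
      }
  return out;
}
```

Note that escaping alone does not make a non-UTF-8 byte sequence
valid; that is why the reader-side change above drops non-ASCII
names instead.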

Ultimately, I guess I should get the unicode value of the characters
of that string, encode the string into UTF-8 and use the result as the
name for the parameter.  That would mean using UTF-8 strings for
function parameter names and for all declaration names.  But that
is too much work for too little gain for now.  The great majority of
the binaries we are dealing with are still using ASCII for declaration
names.
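
To give an idea of what that conversion might involve, here is a
minimal sketch that assumes the name bytes are ISO-8859-1 (Latin-1),
where each byte value is also its Unicode code point.  Nothing in
the bug report establishes which encoding ICC actually used, so both
that assumption and the function name latin1_to_utf8_sketch are
hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: re-encode a Latin-1 byte string as UTF-8.
// Assumes every input byte is a code point in U+0000..U+00FF.
static std::string
latin1_to_utf8_sketch(const std::string& in)
{
  std::string out;
  out.reserve(in.size() * 2);
  for (std::string::const_iterator i = in.begin(); i != in.end(); ++i)
    {
      unsigned char c = static_cast<unsigned char>(*i);
      if (c < 0x80)
	// ASCII bytes are encoded as themselves.
	out += static_cast<char>(c);
      else
	{
	  // U+0080..U+00FF need two bytes: 110xxxxx 10xxxxxx.
	  out += static_cast<char>(0xC0 | (c >> 6));
	  out += static_cast<char>(0x80 | (c & 0x3F));
	}
    }
  return out;
}
```

Under that assumption, the byte 0xCB (Ë) becomes the two bytes 0xC3
0x8B, and ASCII characters are left untouched.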

The patch also introduces a new test harness that runs "abidw
--abidiff" on a bunch of input binaries.  This harness runs over the
binaries that were submitted in this bug report.

	* include/abg-tools-utils.h (string_is_ascii): Declare new
	function ...
	* src/abg-tools-utils.cc (string_is_ascii): ... and define it.
	* src/abg-writer.cc (write_function_type): Escape forbidden XML
	characters in function type names.
	* src/abg-dwarf-reader.cc (build_function_type):  If a parameter
	name is not ascii, drop it on the floor.
	* tests/data/test-types-stability/pr19139-DomainNeighborMapInst.o:
	New test input binary.
	* tests/data/test-types-stability/pr19202-libmpi_gpfs.so.5.0:
	Likewise.
	* tests/data/Makefile.am: Add the new binaries above to the build
	system.
	* tests/test-types-stability.cc: New test harness.
	* tests/Makefile.am: Add the new test harness to the build system.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-11-05 16:40:22 +01:00


// -*- Mode: C++ -*-
//
// Copyright (C) 2013-2015 Red Hat, Inc.
//
// This file is part of the GNU Application Binary Interface Generic
// Analysis and Instrumentation Library (libabigail). This library is
// free software; you can redistribute it and/or modify it under the
// terms of the GNU Lesser General Public License as published by the
// Free Software Foundation; either version 3, or (at your option) any
// later version.
// This library is distributed in the hope that it will be useful, but
// WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
// General Lesser Public License for more details.
// You should have received a copy of the GNU Lesser General Public
// License along with this program; see the file COPYING-LGPLV3. If
// not, see <http://www.gnu.org/licenses/>.
///@file
#include <tr1/memory>
#include <string>
#include <ostream>
#include <istream>
#include <iostream>
#include <fstream>
namespace abigail
{
namespace tools_utils
{
using std::ostream;
using std::istream;
using std::ifstream;
using std::string;
using std::tr1::shared_ptr;
bool file_exists(const string&);
bool is_regular_file(const string&);
bool is_dir(const string&);
bool base_name(string const& path,
	       string& file_name);
bool dir_name(string const& path,
	      string& path_dir_name);
bool ensure_dir_path_created(const string&);
bool ensure_parent_dir_created(const string&);
bool check_file(const string& path, ostream& out);
bool string_ends_with(const string&, const string&);
bool string_is_ascii(const string&);
class temp_file;
/// Convenience typedef for a shared_ptr to @ref temp_file.
typedef shared_ptr<temp_file> temp_file_sptr;
/// A temporary file.
///
/// This is a helper class around the mkstemp API.
///
/// Once the temporary file is created, users can interact with it
/// using an iostream. They can also get the path to the newly
/// created temporary file.
///
/// When the instance of @ref temp_file is destroyed, the underlying
/// resources are de-allocated, the underlying temporary file is
/// closed and removed.
class temp_file
{
struct priv;
typedef shared_ptr<priv> priv_sptr;
priv_sptr priv_;
temp_file();
public:
bool
is_good() const;
const char*
get_path() const;
std::iostream&
get_stream();
static temp_file_sptr
create();
}; // end class temp_file
size_t
get_random_number();
string
get_random_number_as_string();
/// The different types of files understood by the bi* suite of tools.
enum file_type
{
/// A file type we don't know about.
FILE_TYPE_UNKNOWN,
/// The native xml file format representing a translation unit.
FILE_TYPE_NATIVE_BI,
/// An elf file. Reading this kind of file should yield an
/// abigail::corpus type.
FILE_TYPE_ELF,
/// An archive (AR) file.
FILE_TYPE_AR,
/// A native xml file format representing a corpus of one or several
/// translation units.
FILE_TYPE_XML_CORPUS,
/// A zip file, possibly containing a corpus of one or several
/// translation units.
FILE_TYPE_ZIP_CORPUS,
/// An RPM (.rpm) binary file
FILE_TYPE_RPM,
/// An SRPM (.src.rpm) file
FILE_TYPE_SRPM,
/// A DEB (.deb) binary file
FILE_TYPE_DEB,
/// A plain directory
FILE_TYPE_DIR,
/// A tar archive. The archive can be compressed with the popular
/// compression schemes recognized by GNU tar.
FILE_TYPE_TAR
};
/// Exit status for abidiff and abicompat tools.
///
/// It's actually a bit mask. The value of each enumerator is a power
/// of two.
enum abidiff_status
{
/// This is for when the compared ABIs are equal.
///
/// Its numerical value is 0.
ABIDIFF_OK = 0,
/// This bit is set if there is an application error.
///
/// Its numerical value is 1.
ABIDIFF_ERROR = 1,
/// This bit is set if the tool is invoked in an inappropriate
/// manner.
///
/// Its numerical value is 2.
ABIDIFF_USAGE_ERROR = 1 << 1,
/// This bit is set if the ABIs being compared are different.
///
/// Its numerical value is 4.
ABIDIFF_ABI_CHANGE = 1 << 2,
/// This bit is set if the ABIs being compared are different *and*
/// are incompatible.
///
/// Its numerical value is 8.
ABIDIFF_ABI_INCOMPATIBLE_CHANGE = 1 << 3
};
abidiff_status
operator|(abidiff_status, abidiff_status);
abidiff_status
operator&(abidiff_status, abidiff_status);
abidiff_status&
operator|=(abidiff_status& l, abidiff_status r);
bool
abidiff_status_has_error(abidiff_status s);
bool
abidiff_status_has_abi_change(abidiff_status s);
bool
abidiff_status_has_incompatible_abi_change(abidiff_status s);
ostream&
operator<<(ostream& output, file_type r);
file_type guess_file_type(istream& in);
file_type guess_file_type(const string& file_path);
std::tr1::shared_ptr<char>
make_path_absolute(const char* p);
}// end namespace tools_utils
}//end namespace abigail
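
The abidiff_status enum declared above is a bit mask, and the header
only declares the operators that combine its values.  The sketch
below shows one plausible way such operators can be written.  It
lives in its own namespace, sketch, and redeclares the enum so that
it stands alone; the operator bodies and the has_flag helper are
assumptions and are not taken from src/abg-tools-utils.cc.

```cpp
#include <cassert>

namespace sketch
{
// Same enumerators and values as abidiff_status in the header above.
enum abidiff_status
{
  ABIDIFF_OK = 0,
  ABIDIFF_ERROR = 1,
  ABIDIFF_USAGE_ERROR = 1 << 1,
  ABIDIFF_ABI_CHANGE = 1 << 2,
  ABIDIFF_ABI_INCOMPATIBLE_CHANGE = 1 << 3
};

// Combine two statuses by OR-ing their bits.
inline abidiff_status
operator|(abidiff_status l, abidiff_status r)
{
  return static_cast<abidiff_status>(static_cast<unsigned>(l)
				     | static_cast<unsigned>(r));
}

// Keep only the bits set in both statuses.
inline abidiff_status
operator&(abidiff_status l, abidiff_status r)
{
  return static_cast<abidiff_status>(static_cast<unsigned>(l)
				     & static_cast<unsigned>(r));
}

// Accumulate a status into an existing one.
inline abidiff_status&
operator|=(abidiff_status& l, abidiff_status r)
{
  l = l | r;
  return l;
}

// Hypothetical helper: true if every bit of 'flag' is set in 's'.
inline bool
has_flag(abidiff_status s, abidiff_status flag)
{
  return (s & flag) == flag;
}
} // end namespace sketch
```

A tool would typically start from ABIDIFF_OK, OR in a bit for each
condition it detects, and return the accumulated value as its exit
code; an incompatible change would then yield 4 | 8, that is 12.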