libabigail/tests/test-types-stability.cc

86 lines
2.6 KiB
C++
Raw Normal View History

Bug 19139 - DWARF reader doesn't handle garbage in function names In this bug, the DWARF debug info of the binary (generated by Intel's ICC compiler) has interesting constructs like: [ 6b5a0] subprogram decl_line (data2) 787 decl_column (data1) 15 decl_file (data1) 46 declaration (flag) accessibility (data1) public (1) type (ref4) [ 6b56a] prototyped (flag) name (string) "ldiv" MIPS_linkage_name (string) "ldiv" [ 6b5b6] formal_parameter type (ref4) [ 5f2aa] name (string) "$Ë2" [ 6b5bf] formal_parameter type (ref4) [ 5f2aa] name (string) "$Ë3" Note the strings that make up the name of the formal parameters of the function, near the end: [ 6b5b6] formal_parameter type (ref4) [ 5f2aa] name (string) "$Ë2" [ 6b5bf] formal_parameter type (ref4) [ 5f2aa] name (string) "$Ë3" The strings "$Ë2" and $Ë3" (which are the names of the parameters of the function) are garbage. Libabigail's DWARF reader naively uses those strings as names for the function parameters, in the type of the function. Then, the abixml writer emits an XML document, with these strings as property values, representing the name of the type of the function. And of course, the XML later chokes when it tries to read that XML document, saying that the property is not valid UTF-8. This patch addresses the issue by dropping those garbage names on the floor, for function type names. In that context, any string that is not made of ASCII characters is considered as being garbage, for now. The patch, in the abixml writer, also escapes function parameters names so that they don't contain characters that are not allowed in XML. The abixml reader already handles the un-escaping of the names it reads, so I think there is nothing to do there. Ultimately, I guess I should get the unicode value of the characters of that string, encode the string into UTF-8 and use the result as the name for the parameter. That would mean using UTF-8 strings for function parameter names, and, for all declarations names. But that is too much for worfk too little gain for now. The great majority of the binaries we are dealing with are still using ASCII for declaration names. The patch also introduces a new test harness that runs "abidw --abidiff" on a bunch of input binaries. This harness runs over the binaries that were submitted in this bug report. * include/abg-tools-utils.h (string_is_ascii): Declare new function ... * src/abg-tools-utils.cc (string_is_ascii): ... and define it. * src/abg-writer.cc (write_function_type): Escape forbidden XML characters in function type names. * src/abg-dwarf-reader.cc (build_function_type): If a parameter name is not ascii, drop it on the floor. * tests/data/test-types-stability/pr19139-DomainNeighborMapInst.o: New test input binary. * tests/data/test-types-stability/pr19202-libmpi_gpfs.so.5.0: Likewise. * tests/data/Makefile.am: Add the new binaries above to the build system. * tests/test-types-stability.cc: New test harness. * tests/Makefile.am: Add the new test harness to the build system. Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-11-05 15:01:56 +00:00
// -*- Mode: C++ -*-
//
// Copyright (C) 2013-2015 Red Hat, Inc.
//
// This file is part of the GNU Application Binary Interface Generic
// Analysis and Instrumentation Library (libabigail). This library is
// free software; you can redistribute it and/or modify it under the
// terms of the GNU Lesser General Public License as published by the
// Free Software Foundation; either version 3, or (at your option) any
// later version.
// This library is distributed in the hope that it will be useful, but
// WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
// General Lesser Public License for more details.
// You should have received a copy of the GNU Lesser General Public
// License along with this program; see the file COPYING-LGPLV3. If
// not, see <http://www.gnu.org/licenses/>.
// Author: Dodji Seketeli
/// @file
///
/// This program tests that the representation of types by the
/// internal representation of libabigail is stable through reading
/// from ELF/DWARF, constructing an internal represenation, saving that
/// internal presentation to the abixml format, reading from that
/// abixml format and constructing an internal representation from it
/// again.
///
/// This program thus compares the internal representation that is
/// built from reading from ELF/DWARF and the one that is built from
/// the abixml (which itself results from the serialization of the
/// first internal representation to abixml).
///
/// The comparison is expected to yield the empty set.
#include <string>
#include <fstream>
#include <iostream>
#include <cstdlib>
#include "abg-tools-utils.h"
#include "test-utils.h"
#include "abg-dwarf-reader.h"
#include "abg-comparison.h"
using std::string;
using std::ofstream;
using std::cerr;
// A set of elf files to test type stability for.
const char* elf_paths[] =
{
"data/test-types-stability/pr19139-DomainNeighborMapInst.o",
"data/test-types-stability/pr19202-libmpi_gpfs.so.5.0",
"data/test-types-stability/pr19026-libvtkIOSQL-6.1.so.1",
Bug 19139 - DWARF reader doesn't handle garbage in function names In this bug, the DWARF debug info of the binary (generated by Intel's ICC compiler) has interesting constructs like: [ 6b5a0] subprogram decl_line (data2) 787 decl_column (data1) 15 decl_file (data1) 46 declaration (flag) accessibility (data1) public (1) type (ref4) [ 6b56a] prototyped (flag) name (string) "ldiv" MIPS_linkage_name (string) "ldiv" [ 6b5b6] formal_parameter type (ref4) [ 5f2aa] name (string) "$Ë2" [ 6b5bf] formal_parameter type (ref4) [ 5f2aa] name (string) "$Ë3" Note the strings that make up the name of the formal parameters of the function, near the end: [ 6b5b6] formal_parameter type (ref4) [ 5f2aa] name (string) "$Ë2" [ 6b5bf] formal_parameter type (ref4) [ 5f2aa] name (string) "$Ë3" The strings "$Ë2" and $Ë3" (which are the names of the parameters of the function) are garbage. Libabigail's DWARF reader naively uses those strings as names for the function parameters, in the type of the function. Then, the abixml writer emits an XML document, with these strings as property values, representing the name of the type of the function. And of course, the XML later chokes when it tries to read that XML document, saying that the property is not valid UTF-8. This patch addresses the issue by dropping those garbage names on the floor, for function type names. In that context, any string that is not made of ASCII characters is considered as being garbage, for now. The patch, in the abixml writer, also escapes function parameters names so that they don't contain characters that are not allowed in XML. The abixml reader already handles the un-escaping of the names it reads, so I think there is nothing to do there. Ultimately, I guess I should get the unicode value of the characters of that string, encode the string into UTF-8 and use the result as the name for the parameter. That would mean using UTF-8 strings for function parameter names, and, for all declarations names. But that is too much for worfk too little gain for now. The great majority of the binaries we are dealing with are still using ASCII for declaration names. The patch also introduces a new test harness that runs "abidw --abidiff" on a bunch of input binaries. This harness runs over the binaries that were submitted in this bug report. * include/abg-tools-utils.h (string_is_ascii): Declare new function ... * src/abg-tools-utils.cc (string_is_ascii): ... and define it. * src/abg-writer.cc (write_function_type): Escape forbidden XML characters in function type names. * src/abg-dwarf-reader.cc (build_function_type): If a parameter name is not ascii, drop it on the floor. * tests/data/test-types-stability/pr19139-DomainNeighborMapInst.o: New test input binary. * tests/data/test-types-stability/pr19202-libmpi_gpfs.so.5.0: Likewise. * tests/data/Makefile.am: Add the new binaries above to the build system. * tests/test-types-stability.cc: New test harness. * tests/Makefile.am: Add the new test harness to the build system. Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-11-05 15:01:56 +00:00
// The below should always be the last element of array.
0
};
int
main()
{
using abigail::tests::get_src_dir;
using abigail::tests::get_build_dir;
bool is_ok = true;
for (const char** p = elf_paths; p && *p; ++p)
{
string abidw = get_build_dir() + "/tools/abidw";
string elf_path = get_src_dir() + "/tests/" + *p;
string cmd = abidw + " --abidiff " + elf_path;
if (system(cmd.c_str()))
{
cerr << "IR stability issue detected for binary "
<< elf_path;
is_ok = false;
}
}
return !is_ok;
}