Bug 19139 - DWARF reader doesn't handle garbage in function names
In this bug, the DWARF debug info of the binary (generated by Intel's
ICC compiler) has interesting constructs like:
[ 6b5a0] subprogram
decl_line (data2) 787
decl_column (data1) 15
decl_file (data1) 46
declaration (flag)
accessibility (data1) public (1)
type (ref4) [ 6b56a]
prototyped (flag)
name (string) "ldiv"
MIPS_linkage_name (string) "ldiv"
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
Note the strings that make up the name of the formal parameters of the
function, near the end:
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
The strings "$Ë2" and $Ë3" (which are the names of the
parameters of the function) are garbage.
Libabigail's DWARF reader naively uses those strings as names for the
function parameters, in the type of the function.
Then, the abixml writer emits an XML document, with these strings as
property values, representing the name of the type of the function.
And of course, the XML later chokes when it tries to read that XML
document, saying that the property is not valid UTF-8.
This patch addresses the issue by dropping those garbage names on the
floor, for function type names. In that context, any string that is
not made of ASCII characters is considered as being garbage, for now.
The patch, in the abixml writer, also escapes function parameters
names so that they don't contain characters that are not allowed in
XML. The abixml reader already handles the un-escaping of the names
it reads, so I think there is nothing to do there.
Ultimately, I guess I should get the unicode value of the characters
of that string, encode the string into UTF-8 and use the result as the
name for the parameter. That would mean using UTF-8 strings for
function parameter names, and, for all declarations names. But that
is too much for worfk too little gain for now. The great majority of
the binaries we are dealing with are still using ASCII for declaration
names.
The patch also introduces a new test harness that runs "abidw
--abidiff" on a bunch of input binaries. This harness runs over the
binaries that were submitted in this bug report.
* include/abg-tools-utils.h (string_is_ascii): Declare new
function ...
* src/abg-tools-utils.cc (string_is_ascii): ... and define it.
* src/abg-writer.cc (write_function_type): Escape forbidden XML
characters in function type names.
* src/abg-dwarf-reader.cc (build_function_type): If a parameter
name is not ascii, drop it on the floor.
* tests/data/test-types-stability/pr19139-DomainNeighborMapInst.o:
New test input binary.
* tests/data/test-types-stability/pr19202-libmpi_gpfs.so.5.0:
Likewise.
* tests/data/Makefile.am: Add the new binaries above to the build
system.
* tests/test-types-stability.cc: New test harness.
* tests/Makefile.am: Add the new test harness to the build system.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-11-05 15:01:56 +00:00
|
|
|
// -*- Mode: C++ -*-
|
|
|
|
//
|
2018-01-08 16:54:56 +00:00
|
|
|
// Copyright (C) 2013-2018 Red Hat, Inc.
|
Bug 19139 - DWARF reader doesn't handle garbage in function names
In this bug, the DWARF debug info of the binary (generated by Intel's
ICC compiler) has interesting constructs like:
[ 6b5a0] subprogram
decl_line (data2) 787
decl_column (data1) 15
decl_file (data1) 46
declaration (flag)
accessibility (data1) public (1)
type (ref4) [ 6b56a]
prototyped (flag)
name (string) "ldiv"
MIPS_linkage_name (string) "ldiv"
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
Note the strings that make up the name of the formal parameters of the
function, near the end:
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
The strings "$Ë2" and $Ë3" (which are the names of the
parameters of the function) are garbage.
Libabigail's DWARF reader naively uses those strings as names for the
function parameters, in the type of the function.
Then, the abixml writer emits an XML document, with these strings as
property values, representing the name of the type of the function.
And of course, the XML later chokes when it tries to read that XML
document, saying that the property is not valid UTF-8.
This patch addresses the issue by dropping those garbage names on the
floor, for function type names. In that context, any string that is
not made of ASCII characters is considered as being garbage, for now.
The patch, in the abixml writer, also escapes function parameters
names so that they don't contain characters that are not allowed in
XML. The abixml reader already handles the un-escaping of the names
it reads, so I think there is nothing to do there.
Ultimately, I guess I should get the unicode value of the characters
of that string, encode the string into UTF-8 and use the result as the
name for the parameter. That would mean using UTF-8 strings for
function parameter names, and, for all declarations names. But that
is too much for worfk too little gain for now. The great majority of
the binaries we are dealing with are still using ASCII for declaration
names.
The patch also introduces a new test harness that runs "abidw
--abidiff" on a bunch of input binaries. This harness runs over the
binaries that were submitted in this bug report.
* include/abg-tools-utils.h (string_is_ascii): Declare new
function ...
* src/abg-tools-utils.cc (string_is_ascii): ... and define it.
* src/abg-writer.cc (write_function_type): Escape forbidden XML
characters in function type names.
* src/abg-dwarf-reader.cc (build_function_type): If a parameter
name is not ascii, drop it on the floor.
* tests/data/test-types-stability/pr19139-DomainNeighborMapInst.o:
New test input binary.
* tests/data/test-types-stability/pr19202-libmpi_gpfs.so.5.0:
Likewise.
* tests/data/Makefile.am: Add the new binaries above to the build
system.
* tests/test-types-stability.cc: New test harness.
* tests/Makefile.am: Add the new test harness to the build system.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-11-05 15:01:56 +00:00
|
|
|
//
|
|
|
|
// This file is part of the GNU Application Binary Interface Generic
|
|
|
|
// Analysis and Instrumentation Library (libabigail). This library is
|
|
|
|
// free software; you can redistribute it and/or modify it under the
|
|
|
|
// terms of the GNU Lesser General Public License as published by the
|
|
|
|
// Free Software Foundation; either version 3, or (at your option) any
|
|
|
|
// later version.
|
|
|
|
|
|
|
|
// This library is distributed in the hope that it will be useful, but
|
|
|
|
// WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
|
|
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
|
|
|
// General Lesser Public License for more details.
|
|
|
|
|
|
|
|
// You should have received a copy of the GNU Lesser General Public
|
|
|
|
// License along with this program; see the file COPYING-LGPLV3. If
|
|
|
|
// not, see <http://www.gnu.org/licenses/>.
|
|
|
|
|
|
|
|
// Author: Dodji Seketeli
|
|
|
|
|
|
|
|
/// @file
|
|
|
|
///
|
|
|
|
/// This program tests that the representation of types by the
|
|
|
|
/// internal representation of libabigail is stable through reading
|
|
|
|
/// from ELF/DWARF, constructing an internal represenation, saving that
|
|
|
|
/// internal presentation to the abixml format, reading from that
|
|
|
|
/// abixml format and constructing an internal representation from it
|
|
|
|
/// again.
|
|
|
|
///
|
|
|
|
/// This program thus compares the internal representation that is
|
|
|
|
/// built from reading from ELF/DWARF and the one that is built from
|
|
|
|
/// the abixml (which itself results from the serialization of the
|
|
|
|
/// first internal representation to abixml).
|
|
|
|
///
|
|
|
|
/// The comparison is expected to yield the empty set.
|
|
|
|
|
|
|
|
#include <string>
|
|
|
|
#include <fstream>
|
|
|
|
#include <iostream>
|
|
|
|
#include <cstdlib>
|
|
|
|
#include "abg-tools-utils.h"
|
|
|
|
#include "test-utils.h"
|
|
|
|
#include "abg-dwarf-reader.h"
|
|
|
|
#include "abg-comparison.h"
|
2016-01-15 12:34:16 +00:00
|
|
|
#include "abg-workers.h"
|
Bug 19139 - DWARF reader doesn't handle garbage in function names
In this bug, the DWARF debug info of the binary (generated by Intel's
ICC compiler) has interesting constructs like:
[ 6b5a0] subprogram
decl_line (data2) 787
decl_column (data1) 15
decl_file (data1) 46
declaration (flag)
accessibility (data1) public (1)
type (ref4) [ 6b56a]
prototyped (flag)
name (string) "ldiv"
MIPS_linkage_name (string) "ldiv"
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
Note the strings that make up the name of the formal parameters of the
function, near the end:
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
The strings "$Ë2" and $Ë3" (which are the names of the
parameters of the function) are garbage.
Libabigail's DWARF reader naively uses those strings as names for the
function parameters, in the type of the function.
Then, the abixml writer emits an XML document, with these strings as
property values, representing the name of the type of the function.
And of course, the XML later chokes when it tries to read that XML
document, saying that the property is not valid UTF-8.
This patch addresses the issue by dropping those garbage names on the
floor, for function type names. In that context, any string that is
not made of ASCII characters is considered as being garbage, for now.
The patch, in the abixml writer, also escapes function parameters
names so that they don't contain characters that are not allowed in
XML. The abixml reader already handles the un-escaping of the names
it reads, so I think there is nothing to do there.
Ultimately, I guess I should get the unicode value of the characters
of that string, encode the string into UTF-8 and use the result as the
name for the parameter. That would mean using UTF-8 strings for
function parameter names, and, for all declarations names. But that
is too much for worfk too little gain for now. The great majority of
the binaries we are dealing with are still using ASCII for declaration
names.
The patch also introduces a new test harness that runs "abidw
--abidiff" on a bunch of input binaries. This harness runs over the
binaries that were submitted in this bug report.
* include/abg-tools-utils.h (string_is_ascii): Declare new
function ...
* src/abg-tools-utils.cc (string_is_ascii): ... and define it.
* src/abg-writer.cc (write_function_type): Escape forbidden XML
characters in function type names.
* src/abg-dwarf-reader.cc (build_function_type): If a parameter
name is not ascii, drop it on the floor.
* tests/data/test-types-stability/pr19139-DomainNeighborMapInst.o:
New test input binary.
* tests/data/test-types-stability/pr19202-libmpi_gpfs.so.5.0:
Likewise.
* tests/data/Makefile.am: Add the new binaries above to the build
system.
* tests/test-types-stability.cc: New test harness.
* tests/Makefile.am: Add the new test harness to the build system.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-11-05 15:01:56 +00:00
|
|
|
|
|
|
|
using std::string;
|
|
|
|
using std::ofstream;
|
|
|
|
using std::cerr;
|
|
|
|
|
|
|
|
// A set of elf files to test type stability for.
|
|
|
|
const char* elf_paths[] =
|
|
|
|
{
|
2016-01-13 14:34:42 +00:00
|
|
|
"data/test-types-stability/pr19434-elf0",
|
Bug 19139 - DWARF reader doesn't handle garbage in function names
In this bug, the DWARF debug info of the binary (generated by Intel's
ICC compiler) has interesting constructs like:
[ 6b5a0] subprogram
decl_line (data2) 787
decl_column (data1) 15
decl_file (data1) 46
declaration (flag)
accessibility (data1) public (1)
type (ref4) [ 6b56a]
prototyped (flag)
name (string) "ldiv"
MIPS_linkage_name (string) "ldiv"
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
Note the strings that make up the name of the formal parameters of the
function, near the end:
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
The strings "$Ë2" and $Ë3" (which are the names of the
parameters of the function) are garbage.
Libabigail's DWARF reader naively uses those strings as names for the
function parameters, in the type of the function.
Then, the abixml writer emits an XML document, with these strings as
property values, representing the name of the type of the function.
And of course, the XML later chokes when it tries to read that XML
document, saying that the property is not valid UTF-8.
This patch addresses the issue by dropping those garbage names on the
floor, for function type names. In that context, any string that is
not made of ASCII characters is considered as being garbage, for now.
The patch, in the abixml writer, also escapes function parameters
names so that they don't contain characters that are not allowed in
XML. The abixml reader already handles the un-escaping of the names
it reads, so I think there is nothing to do there.
Ultimately, I guess I should get the unicode value of the characters
of that string, encode the string into UTF-8 and use the result as the
name for the parameter. That would mean using UTF-8 strings for
function parameter names, and, for all declarations names. But that
is too much for worfk too little gain for now. The great majority of
the binaries we are dealing with are still using ASCII for declaration
names.
The patch also introduces a new test harness that runs "abidw
--abidiff" on a bunch of input binaries. This harness runs over the
binaries that were submitted in this bug report.
* include/abg-tools-utils.h (string_is_ascii): Declare new
function ...
* src/abg-tools-utils.cc (string_is_ascii): ... and define it.
* src/abg-writer.cc (write_function_type): Escape forbidden XML
characters in function type names.
* src/abg-dwarf-reader.cc (build_function_type): If a parameter
name is not ascii, drop it on the floor.
* tests/data/test-types-stability/pr19139-DomainNeighborMapInst.o:
New test input binary.
* tests/data/test-types-stability/pr19202-libmpi_gpfs.so.5.0:
Likewise.
* tests/data/Makefile.am: Add the new binaries above to the build
system.
* tests/test-types-stability.cc: New test harness.
* tests/Makefile.am: Add the new test harness to the build system.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-11-05 15:01:56 +00:00
|
|
|
"data/test-types-stability/pr19139-DomainNeighborMapInst.o",
|
|
|
|
"data/test-types-stability/pr19202-libmpi_gpfs.so.5.0",
|
2015-11-09 13:22:25 +00:00
|
|
|
"data/test-types-stability/pr19026-libvtkIOSQL-6.1.so.1",
|
2016-01-08 21:22:58 +00:00
|
|
|
"data/test-types-stability/pr19138-elf0",
|
2016-01-18 07:51:21 +00:00
|
|
|
"data/test-types-stability/pr19433-custom0",
|
Bug 19141 - Libabigail doesn't support common ELF symbols
Libabigail's internal representation of elf symbols fails to account
for common symbols in relocatable files. There can be several common
symbols of the same name (defined in a section of SHN_COMMON kind).
In that case, Libabigail wrongly considers these multiple instances of
the same common symbol as being alias, and that breaks some
basic assumptions about aliases. Oops.
This patch adds support for the common symbols (and the fact that
relocatable files can have several instances of the same common
symbol) and amends the ELF reader to make it properly represent those.
* include/abg-ir.h (elf_symbol::elf_symbol): Take a new flag to
say if the symbol is common.
(elf_symbol::{is_common_symbol, has_other_common_instances,
get_next_common_instance, add_common_instance}): New member functions.
* src/abg-ir.cc (elf_symbol::priv::{is_common_,
next_common_instance_): New data members.
(elf_symbol::priv::priv): Adjust.
(elf_symbol::{elf_symbol, create}): Take a new flag to say if the
symbol is common.
(textually_equals): Adjust to account for symbol common-ness.
(elf_symbol::{is_common_symbol, has_other_common_instances,
get_next_common_instance, add_common_instance}): Define new member
functions.
(elf_symbol::add_alias): Drive-by fix; compare symbols using
pointer value. Value comparison is not necessary.
* src/abg-dwarf-reader.cc (lookup_symbol_from_sysv_hash_tab)
(lookup_symbol_from_gnu_hash_tab, lookup_symbol_from_symtab)
(read_context::lookup_elf_symbol_from_index): Adjust the creation
of the symbol to account for common-ness.
(read_context::load_symbol_maps): Recognize instances of a given
common symbol and represent them as such. Do not mistake this
with symbol aliases.
* src/abg-reader.cc (build_elf_symbol): Adjust the creation of the
symbol to account for common-ness.
* src/abg-writer.cc (write_elf_symbol): Adjust symbol
serialization to account common-ness.
* tests/data/test-types-stability/pr19141-get5d.o: Add new test
binary input.
* tests/data/test-types-stability/pr19142-topo.o: Likewise.
* tests/data/Makefile.am: Add the new test inputs to source distribution.
* tests/test-types-stability.cc (elf_paths): The the new test
inputs into account.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2016-01-20 12:37:52 +00:00
|
|
|
"data/test-types-stability/pr19141-get5d.o",
|
|
|
|
"data/test-types-stability/pr19142-topo.o",
|
2016-01-21 09:48:32 +00:00
|
|
|
"data/test-types-stability/pr19204-libtcmalloc.so.4.2.6-xlc",
|
Bug 19139 - DWARF reader doesn't handle garbage in function names
In this bug, the DWARF debug info of the binary (generated by Intel's
ICC compiler) has interesting constructs like:
[ 6b5a0] subprogram
decl_line (data2) 787
decl_column (data1) 15
decl_file (data1) 46
declaration (flag)
accessibility (data1) public (1)
type (ref4) [ 6b56a]
prototyped (flag)
name (string) "ldiv"
MIPS_linkage_name (string) "ldiv"
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
Note the strings that make up the name of the formal parameters of the
function, near the end:
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
The strings "$Ë2" and $Ë3" (which are the names of the
parameters of the function) are garbage.
Libabigail's DWARF reader naively uses those strings as names for the
function parameters, in the type of the function.
Then, the abixml writer emits an XML document, with these strings as
property values, representing the name of the type of the function.
And of course, the XML later chokes when it tries to read that XML
document, saying that the property is not valid UTF-8.
This patch addresses the issue by dropping those garbage names on the
floor, for function type names. In that context, any string that is
not made of ASCII characters is considered as being garbage, for now.
The patch, in the abixml writer, also escapes function parameters
names so that they don't contain characters that are not allowed in
XML. The abixml reader already handles the un-escaping of the names
it reads, so I think there is nothing to do there.
Ultimately, I guess I should get the unicode value of the characters
of that string, encode the string into UTF-8 and use the result as the
name for the parameter. That would mean using UTF-8 strings for
function parameter names, and, for all declarations names. But that
is too much for worfk too little gain for now. The great majority of
the binaries we are dealing with are still using ASCII for declaration
names.
The patch also introduces a new test harness that runs "abidw
--abidiff" on a bunch of input binaries. This harness runs over the
binaries that were submitted in this bug report.
* include/abg-tools-utils.h (string_is_ascii): Declare new
function ...
* src/abg-tools-utils.cc (string_is_ascii): ... and define it.
* src/abg-writer.cc (write_function_type): Escape forbidden XML
characters in function type names.
* src/abg-dwarf-reader.cc (build_function_type): If a parameter
name is not ascii, drop it on the floor.
* tests/data/test-types-stability/pr19139-DomainNeighborMapInst.o:
New test input binary.
* tests/data/test-types-stability/pr19202-libmpi_gpfs.so.5.0:
Likewise.
* tests/data/Makefile.am: Add the new binaries above to the build
system.
* tests/test-types-stability.cc: New test harness.
* tests/Makefile.am: Add the new test harness to the build system.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-11-05 15:01:56 +00:00
|
|
|
// The below should always be the last element of array.
|
|
|
|
0
|
|
|
|
};
|
|
|
|
|
2016-01-15 12:34:16 +00:00
|
|
|
/// A task which launches abidw --abidiff on a binary
|
|
|
|
/// passed to the constructor of the task.
|
|
|
|
struct test_task : public abigail::workers::task
|
|
|
|
{
|
|
|
|
string path;
|
|
|
|
string error_message;
|
|
|
|
bool is_ok;
|
|
|
|
|
|
|
|
/// The constructor of the test task.
|
|
|
|
///
|
|
|
|
/// @param elf_path the path to the elf binary on which we are
|
|
|
|
/// supposed to launch abidw --abidiff.
|
|
|
|
test_task(const string& elf_path)
|
|
|
|
: path(elf_path),
|
|
|
|
is_ok(true)
|
|
|
|
{}
|
|
|
|
|
|
|
|
/// This virtual function overload actually performs the job of the task.
|
|
|
|
///
|
|
|
|
/// It calls abidw --abidiff on the binary refered to by the task.
|
|
|
|
/// It thus stores a flag saying if the result of abidw --abidiff is
|
|
|
|
/// OK or not.
|
|
|
|
virtual void
|
|
|
|
perform()
|
|
|
|
{
|
|
|
|
using abigail::tests::get_src_dir;
|
|
|
|
using abigail::tests::get_build_dir;
|
|
|
|
|
|
|
|
string abidw = string(get_build_dir()) + "/tools/abidw";
|
|
|
|
string elf_path = string(get_src_dir()) + "/tests/" + path;
|
|
|
|
string cmd = abidw + " --abidiff " + elf_path;
|
|
|
|
if (system(cmd.c_str()))
|
|
|
|
{
|
|
|
|
error_message = "IR stability issue detected for binary " + elf_path;
|
|
|
|
is_ok = false;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}; // end struct test_task
|
|
|
|
|
|
|
|
/// A convenience typedef for a shared_ptr to @ref test_task.
|
|
|
|
typedef shared_ptr<test_task> test_task_sptr;
|
|
|
|
|
Bug 19139 - DWARF reader doesn't handle garbage in function names
In this bug, the DWARF debug info of the binary (generated by Intel's
ICC compiler) has interesting constructs like:
[ 6b5a0] subprogram
decl_line (data2) 787
decl_column (data1) 15
decl_file (data1) 46
declaration (flag)
accessibility (data1) public (1)
type (ref4) [ 6b56a]
prototyped (flag)
name (string) "ldiv"
MIPS_linkage_name (string) "ldiv"
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
Note the strings that make up the name of the formal parameters of the
function, near the end:
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
The strings "$Ë2" and $Ë3" (which are the names of the
parameters of the function) are garbage.
Libabigail's DWARF reader naively uses those strings as names for the
function parameters, in the type of the function.
Then, the abixml writer emits an XML document, with these strings as
property values, representing the name of the type of the function.
And of course, the XML later chokes when it tries to read that XML
document, saying that the property is not valid UTF-8.
This patch addresses the issue by dropping those garbage names on the
floor, for function type names. In that context, any string that is
not made of ASCII characters is considered as being garbage, for now.
The patch, in the abixml writer, also escapes function parameters
names so that they don't contain characters that are not allowed in
XML. The abixml reader already handles the un-escaping of the names
it reads, so I think there is nothing to do there.
Ultimately, I guess I should get the unicode value of the characters
of that string, encode the string into UTF-8 and use the result as the
name for the parameter. That would mean using UTF-8 strings for
function parameter names, and, for all declarations names. But that
is too much for worfk too little gain for now. The great majority of
the binaries we are dealing with are still using ASCII for declaration
names.
The patch also introduces a new test harness that runs "abidw
--abidiff" on a bunch of input binaries. This harness runs over the
binaries that were submitted in this bug report.
* include/abg-tools-utils.h (string_is_ascii): Declare new
function ...
* src/abg-tools-utils.cc (string_is_ascii): ... and define it.
* src/abg-writer.cc (write_function_type): Escape forbidden XML
characters in function type names.
* src/abg-dwarf-reader.cc (build_function_type): If a parameter
name is not ascii, drop it on the floor.
* tests/data/test-types-stability/pr19139-DomainNeighborMapInst.o:
New test input binary.
* tests/data/test-types-stability/pr19202-libmpi_gpfs.so.5.0:
Likewise.
* tests/data/Makefile.am: Add the new binaries above to the build
system.
* tests/test-types-stability.cc: New test harness.
* tests/Makefile.am: Add the new test harness to the build system.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-11-05 15:01:56 +00:00
|
|
|
int
|
|
|
|
main()
|
|
|
|
{
|
2016-01-15 12:34:16 +00:00
|
|
|
using std::vector;
|
|
|
|
using std::tr1::dynamic_pointer_cast;
|
|
|
|
using abigail::workers::queue;
|
|
|
|
using abigail::workers::task;
|
|
|
|
using abigail::workers::task_sptr;
|
|
|
|
using abigail::workers::get_number_of_threads;
|
Bug 19139 - DWARF reader doesn't handle garbage in function names
In this bug, the DWARF debug info of the binary (generated by Intel's
ICC compiler) has interesting constructs like:
[ 6b5a0] subprogram
decl_line (data2) 787
decl_column (data1) 15
decl_file (data1) 46
declaration (flag)
accessibility (data1) public (1)
type (ref4) [ 6b56a]
prototyped (flag)
name (string) "ldiv"
MIPS_linkage_name (string) "ldiv"
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
Note the strings that make up the name of the formal parameters of the
function, near the end:
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
The strings "$Ë2" and $Ë3" (which are the names of the
parameters of the function) are garbage.
Libabigail's DWARF reader naively uses those strings as names for the
function parameters, in the type of the function.
Then, the abixml writer emits an XML document, with these strings as
property values, representing the name of the type of the function.
And of course, the XML later chokes when it tries to read that XML
document, saying that the property is not valid UTF-8.
This patch addresses the issue by dropping those garbage names on the
floor, for function type names. In that context, any string that is
not made of ASCII characters is considered as being garbage, for now.
The patch, in the abixml writer, also escapes function parameters
names so that they don't contain characters that are not allowed in
XML. The abixml reader already handles the un-escaping of the names
it reads, so I think there is nothing to do there.
Ultimately, I guess I should get the unicode value of the characters
of that string, encode the string into UTF-8 and use the result as the
name for the parameter. That would mean using UTF-8 strings for
function parameter names, and, for all declarations names. But that
is too much for worfk too little gain for now. The great majority of
the binaries we are dealing with are still using ASCII for declaration
names.
The patch also introduces a new test harness that runs "abidw
--abidiff" on a bunch of input binaries. This harness runs over the
binaries that were submitted in this bug report.
* include/abg-tools-utils.h (string_is_ascii): Declare new
function ...
* src/abg-tools-utils.cc (string_is_ascii): ... and define it.
* src/abg-writer.cc (write_function_type): Escape forbidden XML
characters in function type names.
* src/abg-dwarf-reader.cc (build_function_type): If a parameter
name is not ascii, drop it on the floor.
* tests/data/test-types-stability/pr19139-DomainNeighborMapInst.o:
New test input binary.
* tests/data/test-types-stability/pr19202-libmpi_gpfs.so.5.0:
Likewise.
* tests/data/Makefile.am: Add the new binaries above to the build
system.
* tests/test-types-stability.cc: New test harness.
* tests/Makefile.am: Add the new test harness to the build system.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-11-05 15:01:56 +00:00
|
|
|
|
2016-01-15 12:34:16 +00:00
|
|
|
/// Create a task queue. The max number of worker threads of the
|
|
|
|
/// queue is the number of the concurrent threads supported by the
|
|
|
|
/// processor of the machine this code runs on.
|
|
|
|
const size_t num_tests = sizeof(elf_paths) / sizeof (char*) - 1;
|
|
|
|
size_t num_workers = std::min(get_number_of_threads(), num_tests);
|
|
|
|
queue task_queue(num_workers);
|
Bug 19139 - DWARF reader doesn't handle garbage in function names
In this bug, the DWARF debug info of the binary (generated by Intel's
ICC compiler) has interesting constructs like:
[ 6b5a0] subprogram
decl_line (data2) 787
decl_column (data1) 15
decl_file (data1) 46
declaration (flag)
accessibility (data1) public (1)
type (ref4) [ 6b56a]
prototyped (flag)
name (string) "ldiv"
MIPS_linkage_name (string) "ldiv"
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
Note the strings that make up the name of the formal parameters of the
function, near the end:
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
The strings "$Ë2" and $Ë3" (which are the names of the
parameters of the function) are garbage.
Libabigail's DWARF reader naively uses those strings as names for the
function parameters, in the type of the function.
Then, the abixml writer emits an XML document, with these strings as
property values, representing the name of the type of the function.
And of course, the XML later chokes when it tries to read that XML
document, saying that the property is not valid UTF-8.
This patch addresses the issue by dropping those garbage names on the
floor, for function type names. In that context, any string that is
not made of ASCII characters is considered as being garbage, for now.
The patch, in the abixml writer, also escapes function parameters
names so that they don't contain characters that are not allowed in
XML. The abixml reader already handles the un-escaping of the names
it reads, so I think there is nothing to do there.
Ultimately, I guess I should get the unicode value of the characters
of that string, encode the string into UTF-8 and use the result as the
name for the parameter. That would mean using UTF-8 strings for
function parameter names, and, for all declarations names. But that
is too much for worfk too little gain for now. The great majority of
the binaries we are dealing with are still using ASCII for declaration
names.
The patch also introduces a new test harness that runs "abidw
--abidiff" on a bunch of input binaries. This harness runs over the
binaries that were submitted in this bug report.
* include/abg-tools-utils.h (string_is_ascii): Declare new
function ...
* src/abg-tools-utils.cc (string_is_ascii): ... and define it.
* src/abg-writer.cc (write_function_type): Escape forbidden XML
characters in function type names.
* src/abg-dwarf-reader.cc (build_function_type): If a parameter
name is not ascii, drop it on the floor.
* tests/data/test-types-stability/pr19139-DomainNeighborMapInst.o:
New test input binary.
* tests/data/test-types-stability/pr19202-libmpi_gpfs.so.5.0:
Likewise.
* tests/data/Makefile.am: Add the new binaries above to the build
system.
* tests/test-types-stability.cc: New test harness.
* tests/Makefile.am: Add the new test harness to the build system.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-11-05 15:01:56 +00:00
|
|
|
|
2016-01-15 12:34:16 +00:00
|
|
|
/// Create one task per binary registered for this test, and push
|
|
|
|
/// them to the task queue. Pushing a task to the queue triggers
|
|
|
|
/// a worker thread that starts working on the task.
|
Bug 19139 - DWARF reader doesn't handle garbage in function names
In this bug, the DWARF debug info of the binary (generated by Intel's
ICC compiler) has interesting constructs like:
[ 6b5a0] subprogram
decl_line (data2) 787
decl_column (data1) 15
decl_file (data1) 46
declaration (flag)
accessibility (data1) public (1)
type (ref4) [ 6b56a]
prototyped (flag)
name (string) "ldiv"
MIPS_linkage_name (string) "ldiv"
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
Note the strings that make up the name of the formal parameters of the
function, near the end:
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
The strings "$Ë2" and $Ë3" (which are the names of the
parameters of the function) are garbage.
Libabigail's DWARF reader naively uses those strings as names for the
function parameters, in the type of the function.
Then, the abixml writer emits an XML document, with these strings as
property values, representing the name of the type of the function.
And of course, the XML later chokes when it tries to read that XML
document, saying that the property is not valid UTF-8.
This patch addresses the issue by dropping those garbage names on the
floor, for function type names. In that context, any string that is
not made of ASCII characters is considered as being garbage, for now.
The patch, in the abixml writer, also escapes function parameters
names so that they don't contain characters that are not allowed in
XML. The abixml reader already handles the un-escaping of the names
it reads, so I think there is nothing to do there.
Ultimately, I guess I should get the unicode value of the characters
of that string, encode the string into UTF-8 and use the result as the
name for the parameter. That would mean using UTF-8 strings for
function parameter names, and, for all declarations names. But that
is too much for worfk too little gain for now. The great majority of
the binaries we are dealing with are still using ASCII for declaration
names.
The patch also introduces a new test harness that runs "abidw
--abidiff" on a bunch of input binaries. This harness runs over the
binaries that were submitted in this bug report.
* include/abg-tools-utils.h (string_is_ascii): Declare new
function ...
* src/abg-tools-utils.cc (string_is_ascii): ... and define it.
* src/abg-writer.cc (write_function_type): Escape forbidden XML
characters in function type names.
* src/abg-dwarf-reader.cc (build_function_type): If a parameter
name is not ascii, drop it on the floor.
* tests/data/test-types-stability/pr19139-DomainNeighborMapInst.o:
New test input binary.
* tests/data/test-types-stability/pr19202-libmpi_gpfs.so.5.0:
Likewise.
* tests/data/Makefile.am: Add the new binaries above to the build
system.
* tests/test-types-stability.cc: New test harness.
* tests/Makefile.am: Add the new test harness to the build system.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-11-05 15:01:56 +00:00
|
|
|
for (const char** p = elf_paths; p && *p; ++p)
|
|
|
|
{
|
2016-01-15 12:34:16 +00:00
|
|
|
test_task_sptr t(new test_task(*p));
|
|
|
|
assert(task_queue.schedule_task(t));
|
|
|
|
}
|
|
|
|
|
|
|
|
/// Wait for all worker threads to finish their job, and wind down.
|
|
|
|
task_queue.wait_for_workers_to_complete();
|
|
|
|
|
|
|
|
// Now walk the results and print whatever error messages need to be
|
|
|
|
// printed.
|
|
|
|
|
|
|
|
const vector<task_sptr>& completed_tasks =
|
|
|
|
task_queue.get_completed_tasks();
|
|
|
|
|
|
|
|
assert(completed_tasks.size () == num_tests);
|
|
|
|
|
|
|
|
bool is_ok = true;
|
|
|
|
for (vector<task_sptr>::const_iterator ti = completed_tasks.begin();
|
|
|
|
ti != completed_tasks.end();
|
|
|
|
++ti)
|
|
|
|
{
|
|
|
|
test_task_sptr t = dynamic_pointer_cast<test_task>(*ti);
|
|
|
|
if (!t->is_ok)
|
Bug 19139 - DWARF reader doesn't handle garbage in function names
In this bug, the DWARF debug info of the binary (generated by Intel's
ICC compiler) has interesting constructs like:
[ 6b5a0] subprogram
decl_line (data2) 787
decl_column (data1) 15
decl_file (data1) 46
declaration (flag)
accessibility (data1) public (1)
type (ref4) [ 6b56a]
prototyped (flag)
name (string) "ldiv"
MIPS_linkage_name (string) "ldiv"
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
Note the strings that make up the name of the formal parameters of the
function, near the end:
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
The strings "$Ë2" and $Ë3" (which are the names of the
parameters of the function) are garbage.
Libabigail's DWARF reader naively uses those strings as names for the
function parameters, in the type of the function.
Then, the abixml writer emits an XML document, with these strings as
property values, representing the name of the type of the function.
And of course, the XML later chokes when it tries to read that XML
document, saying that the property is not valid UTF-8.
This patch addresses the issue by dropping those garbage names on the
floor, for function type names. In that context, any string that is
not made of ASCII characters is considered as being garbage, for now.
The patch, in the abixml writer, also escapes function parameters
names so that they don't contain characters that are not allowed in
XML. The abixml reader already handles the un-escaping of the names
it reads, so I think there is nothing to do there.
Ultimately, I guess I should get the unicode value of the characters
of that string, encode the string into UTF-8 and use the result as the
name for the parameter. That would mean using UTF-8 strings for
function parameter names, and, for all declarations names. But that
is too much for worfk too little gain for now. The great majority of
the binaries we are dealing with are still using ASCII for declaration
names.
The patch also introduces a new test harness that runs "abidw
--abidiff" on a bunch of input binaries. This harness runs over the
binaries that were submitted in this bug report.
* include/abg-tools-utils.h (string_is_ascii): Declare new
function ...
* src/abg-tools-utils.cc (string_is_ascii): ... and define it.
* src/abg-writer.cc (write_function_type): Escape forbidden XML
characters in function type names.
* src/abg-dwarf-reader.cc (build_function_type): If a parameter
name is not ascii, drop it on the floor.
* tests/data/test-types-stability/pr19139-DomainNeighborMapInst.o:
New test input binary.
* tests/data/test-types-stability/pr19202-libmpi_gpfs.so.5.0:
Likewise.
* tests/data/Makefile.am: Add the new binaries above to the build
system.
* tests/test-types-stability.cc: New test harness.
* tests/Makefile.am: Add the new test harness to the build system.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-11-05 15:01:56 +00:00
|
|
|
{
|
|
|
|
is_ok = false;
|
2016-01-15 12:34:16 +00:00
|
|
|
cerr << t->error_message << "\n";
|
Bug 19139 - DWARF reader doesn't handle garbage in function names
In this bug, the DWARF debug info of the binary (generated by Intel's
ICC compiler) has interesting constructs like:
[ 6b5a0] subprogram
decl_line (data2) 787
decl_column (data1) 15
decl_file (data1) 46
declaration (flag)
accessibility (data1) public (1)
type (ref4) [ 6b56a]
prototyped (flag)
name (string) "ldiv"
MIPS_linkage_name (string) "ldiv"
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
Note the strings that make up the name of the formal parameters of the
function, near the end:
[ 6b5b6] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë2"
[ 6b5bf] formal_parameter
type (ref4) [ 5f2aa]
name (string) "$Ë3"
The strings "$Ë2" and $Ë3" (which are the names of the
parameters of the function) are garbage.
Libabigail's DWARF reader naively uses those strings as names for the
function parameters, in the type of the function.
Then, the abixml writer emits an XML document, with these strings as
property values, representing the name of the type of the function.
And of course, the XML later chokes when it tries to read that XML
document, saying that the property is not valid UTF-8.
This patch addresses the issue by dropping those garbage names on the
floor, for function type names. In that context, any string that is
not made of ASCII characters is considered as being garbage, for now.
The patch, in the abixml writer, also escapes function parameters
names so that they don't contain characters that are not allowed in
XML. The abixml reader already handles the un-escaping of the names
it reads, so I think there is nothing to do there.
Ultimately, I guess I should get the unicode value of the characters
of that string, encode the string into UTF-8 and use the result as the
name for the parameter. That would mean using UTF-8 strings for
function parameter names, and, for all declarations names. But that
is too much for worfk too little gain for now. The great majority of
the binaries we are dealing with are still using ASCII for declaration
names.
The patch also introduces a new test harness that runs "abidw
--abidiff" on a bunch of input binaries. This harness runs over the
binaries that were submitted in this bug report.
* include/abg-tools-utils.h (string_is_ascii): Declare new
function ...
* src/abg-tools-utils.cc (string_is_ascii): ... and define it.
* src/abg-writer.cc (write_function_type): Escape forbidden XML
characters in function type names.
* src/abg-dwarf-reader.cc (build_function_type): If a parameter
name is not ascii, drop it on the floor.
* tests/data/test-types-stability/pr19139-DomainNeighborMapInst.o:
New test input binary.
* tests/data/test-types-stability/pr19202-libmpi_gpfs.so.5.0:
Likewise.
* tests/data/Makefile.am: Add the new binaries above to the build
system.
* tests/test-types-stability.cc: New test harness.
* tests/Makefile.am: Add the new test harness to the build system.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2015-11-05 15:01:56 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return !is_ok;
|
|
|
|
}
|