libabigail/tests/test-read-ctf.cc
Guillermo E. Martinez e64d32bee3 ctf-reader: Add support to read CTF information from the Linux Kernel
This patch is meant to extract ABI information from the CTF data
stored in the Linux kernel build directory.  It depends on the
vmlinux.ctfa archive file.

In order to generate the CTF information, the Linux Kernel build
system must support the 'make ctf' command, which causes the compiler
to be run with -gctf, thus emitting the CTF information for the
Kernel.

The target 'ctf' in the Linux Makefile generates a 'vmlinux.ctfa' file
that will be used by the ctf reader in libabigail. The 'vmlinux.ctfa'
archive has multiple 'ctf dictionaries' called "CTF archive members".

There is one CTF archive member for built-in kernel modules (in
`vmlinux') and one for each out-of-tree kernel module organized in a
parent-child hierarchy.

There is also a CTF archive member called `shared_ctf' which is a
parent dictionary containing shared symbols and CTF types used by more
than one kernel object.  These common types are stored in 'types_map'
in the ctf reader, ignoring the ctf dictionary name.  The CTF API has
the machinery for looking for a shared type in the parent dictionary
referred to in a given child dictionary. This CTF layout can be dumped
by using the objdump tool.
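
To illustrate why the dictionary name matters when storing types, here is a
minimal sketch of a dictionary-aware lookup key.  It assumes that types from
the shared parent dictionary are keyed by type id alone, while types from any
other dictionary get the dictionary name as a prefix.  The names
`make_type_key', `types_map' (as a plain std::map of strings), `lookup_type'
and the "!" separator are hypothetical and are not taken from the ctf reader.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical key builder: types from the shared parent dictionary are
// keyed by id alone, types from any other dictionary by "<dict>!<id>".
inline std::string
make_type_key(const std::string& dict_name, std::uint64_t type_id)
{
  const std::string id = std::to_string(type_id);
  if (dict_name == "shared_ctf")
    return id;
  return dict_name + "!" + id;
}

// Hypothetical type map: key -> human-readable type name.
using types_map = std::map<std::string, std::string>;

// Return a pointer to the stored type name, or nullptr if it is unknown.
inline const std::string*
lookup_type(const types_map& types, const std::string& dict_name,
            std::uint64_t type_id)
{
  auto it = types.find(make_type_key(dict_name, type_id));
  return it == types.end() ? nullptr : &it->second;
}
```

With such a key, the same numeric type id can exist in two child
dictionaries without one entry overwriting the other.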

Because the _same_ CTF archive is used to build the vmlinux corpus and
the corpora of the kernel modules (which, by the way, all belong to the
same corpus group), repeatedly opening and closing the CTF archive is
very time consuming during the CTF extraction.

So, extraction becomes more than three times faster (from ~2m:50s to
~50s) by keeping the CTF archive open for a given group, that is, by
using the same ctf_archive_t pointer while building all the various
corpora.
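
The idea of opening the archive once per group can be sketched as follows.
This is only an illustration under stated assumptions: `fake_archive' stands
in for the real CTF archive handle, and `group_context' and `opens_for_group'
are hypothetical names, not part of libabigail or libctf.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Stand-in for an expensive-to-open archive; counts how often it is opened.
struct fake_archive
{
  static int open_count;
  std::string path;
  explicit fake_archive(const std::string& p) : path(p) { ++open_count; }
};
int fake_archive::open_count = 0;

// Hypothetical per-group context: opens the archive lazily, once, and
// releases it in its destructor after all members have been processed.
class group_context
{
  std::string archive_path_;
  std::unique_ptr<fake_archive> archive_;

public:
  explicit group_context(const std::string& p) : archive_path_(p) {}

  // Build one corpus; returns a label just to have something observable.
  std::string
  read_corpus(const std::string& member)
  {
    if (!archive_)
      archive_.reset(new fake_archive(archive_path_));
    return member + "@" + archive_->path;
  }
};

// Read every member of a group through the same context and report how
// many times the archive had to be opened.
inline int
opens_for_group(const std::vector<std::string>& members)
{
  fake_archive::open_count = 0;
  group_context ctxt("vmlinux.ctfa");
  for (const auto& m : members)
    ctxt.read_corpus(m);
  return fake_archive::open_count;
}
```

However many members the group has, the archive is opened at most once, and
it is closed when the context goes out of scope.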

We just invoke `reset_read_context' for each new corpus.  Note that
the `read_context::ctfa' data member should be updated if the
corpus::origin data member is set to `LINUX_KERNEL_BINARY_ORIGIN' and
the file to be processed is not 'vmlinux'.

Note that `ctf_close' must be called only after all of the group's
members have been processed, so it is executed from the destructor of
`read_context'.

The basic algorithm used to generate the Linux corpus is the
following:

   1. Look for vmlinux, the *.ko objects, and the vmlinux.ctfa file.
   vmlinux and the *.ko objects are used to extract the ELF symbols;
   vmlinux.ctfa contains the CTF type information for non-static
   variable and function symbols.

   2. `process_ctf_archive' iterates over the public symbols of vmlinux
   and its modules.  For each symbol, the ctf reader searches its
   dictionary by symbol name for CTF information; if the information
   is found, it builds a `var_decl' or a `function_decl' depending on
   the result of `ctf_type_kind'.

This algorithm is also applied to ELF files (exec, dyn, rel), so
instead of iterating on all ctf_types it just loops on the public
symbols.
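
The symbol-driven loop described above can be sketched as follows.  The
dictionary is modelled as a plain std::map, and `type_kind', `decl',
`dictionary' and `build_decls' are hypothetical stand-ins rather than the
libctf or libabigail types.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-in for the kind reported for a symbol's type.
enum class type_kind { function, variable };

// Stand-in for a var_decl / function_decl.
struct decl
{
  std::string name;
  bool is_function;
};

// Hypothetical dictionary: symbol name -> kind of the type found for it.
using dictionary = std::map<std::string, type_kind>;

// Walk the public symbols only; symbols without type information in the
// dictionary are skipped, and dictionary entries that no public symbol
// refers to are never visited.
inline std::vector<decl>
build_decls(const std::vector<std::string>& public_symbols,
            const dictionary& dict)
{
  std::vector<decl> result;
  for (const auto& sym : public_symbols)
    {
      auto it = dict.find(sym);
      if (it == dict.end())
        continue;
      result.push_back({sym, it->second == type_kind::function});
    }
  return result;
}
```

The cost of the loop follows the number of exported symbols rather than the
total number of types stored in the dictionary.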

	* abg-elf-reader-common.h: Include ctf-api.h file.
	(read_and_add_corpus_to_group_from_elf, set_read_context_corpus_group)
	(reset_read_context, dic_type_key): Declare new member functions.
	* include/abg-ir.cc (types_defined_same_linux_kernel_corpus_public): Use
	bitwise to know the corpus `origin'.
	* src/abg-ctf-reader.cc: Include map, algorithms header files.
	(read_context::type_map): Change from unordered_map to std::map storing
	ctf dictionary name as part of the key.
	(read_context::is_elf_exec): Add new member variable.
	(read_context::{cur_corpus_, cur_corpus_group_}): Likewise.
	(read_context::unknown_types_set): Likewise.
	(read_context::{current_corpus_group, main_corpus_from_current_group,
	has_corpus_group, current_corpus_is_main_corpus_from_current_group,
	should_reuse_type_from_corpus_group}): Add new member functions.
	(read_context::{add_unknown_type, lookup_unknown_type, initialize}):
	Likewise.
	(read_context::{add_type, lookup_type}): Add new `ctf_dict_t' type
	argument.
	(ctf_reader::{process_ctf_typedef, process_ctf_base_type,
	process_ctf_function_type, process_ctf_forward_type,
	process_ctf_struct_type, process_ctf_union_type, process_ctf_array_type,
	process_ctf_qualified_type, process_ctf_enum_type}): Add code to `reuse'
	types already registered in main corpus `should_reuse_type_from_corpus_group'.
	Use new `lookup_type' and `add_type' operations on `read_context::types_map'.
	Replace function calls to the new ctf interface. Add verifier to not build
	types duplicated by recursive calling chain.
	(ctf_reader::process_ctf_type): Add code to return immediately if the
	ctf type is unknown. Add unknown types to `unknown_types_set'.
	(ctf_reader::process_ctf_archive): Change comment.
	Add code to iterate over global symbols, searching by symbol name in the
	ctf dictionary using `ctf_lookup_{variable,by_symbol_name}' depending on
	the ELF file type and corpus type, creating a `{var,function}_decl'
	according to the value returned by `ctf_type_kind'.  Also close the ctf
	dict and call `canonicalize_all_types'.
	(slurp_elf_info): Set `is_elf_exec' depending on ELF type.  Also return
	success if corpus origin is Linux and symbol table was read.
	(ctf_reader::read_corpus): Add current corpus.  Set corpus origin to
	`LINUX_KERNEL_BINARY_ORIGIN' if `is_linux_kernel' returns true.  Verify
	the ctf reader status; the ctf archive is now opened using
	`ctf_arc{open,bufopen}' depending on whether the corpus origin has the
	`corpus::LINUX_KERNEL_BINARY_ORIGIN' bit set. Use
	`sort_{function,variables}' calls after extracting ctf information.
	`ctf_close' is called from `read_context' destructor.
	(read_context::{set_read_context_corpus_group, reset_read_context,
	read_and_add_corpus_to_group_from_elf, dic_type_key}): Add new member
	function implementation.
	* include/abg-tools-utils.h (build_corpus_group_from_kernel_dist_under):
	Add `origin' parameter with default `corpus::DWARF_ORIGIN'.
	* src/abg-tools-utils.cc: Use `abg-ctf-reader.h' file.
	(maybe_load_vmlinux_dwarf_corpus): Add new function.
	(maybe_load_vmlinux_ctf_corpus): Likewise.
	(build_corpus_group_from_kernel_dist_under): Update comments.
	Add new `origin' argument. Use `maybe_load_vmlinux_dwarf_corpus'
	or `maybe_load_vmlinux_ctf_corpus' according to `origin' value.
	* src/abg-corpus.h (corpus::origin): Update `origin' type
	values in enum.
	* src/abg-corpus-priv.h (corpus::priv): Replace `origin' type
	from `corpus::origin' to `uint32_t'.
	* src/abg-corpus.cc (corpus::{get,set}_origin): Replace data
	type from `corpus::origin' to `uint32_t'.
	* tools/abidw.cc (main): Use the --ctf argument to select the CTF
	debug format.
	* tests/test-read-ctf.cc: Add new tests to harness.
	* tests/data/test-read-ctf/test-PR27700.abi: New test expected
	  result.
	* tests/data/test-read-ctf/test-anonymous-fields.o.abi: Likewise.
	* tests/data/test-read-ctf/test-enum-many-ctf.o.hash.abi: Likewise.
	* tests/data/test-read-ctf/test-enum-many.o.hash.abi: Likewise.
	* tests/data/test-read-ctf/test-enum-symbol-ctf.o.hash.abi: Likewise.
	* tests/data/test-read-common/test-PR26568-2.o: Adjust.
	* tests/data/test-read-ctf/test-PR26568-1.o.abi: Likewise.
	* tests/data/test-read-ctf/test-PR26568-2.o.abi: Likewise.
	* tests/data/test-read-ctf/test-ambiguous-struct-A.o.hash.abi: Likewise.
	* tests/data/test-read-ctf/test-ambiguous-struct-B.c: Likewise.
	* tests/data/test-read-ctf/test-ambiguous-struct-B.o: Likewise.
	* tests/data/test-read-ctf/test-ambiguous-struct-B.o.hash.abi: Likewise.
	* tests/data/test-read-ctf/test-array-of-pointers.abi: Likewise.
	* tests/data/test-read-ctf/test-callback.abi: Likewise.
	* tests/data/test-read-ctf/test-callback2.abi: Likewise.
	* tests/data/test-read-ctf/test-conflicting-type-syms-a.o.hash.abi:
	Likewise.
	* tests/data/test-read-ctf/test-conflicting-type-syms-b.o.hash.abi:
	Likewise.
	* tests/data/test-read-ctf/test-dynamic-array.o.abi: Likewise.
	* tests/data/test-read-ctf/test-enum-ctf.o.abi: Likewise.
	* tests/data/test-read-ctf/test-enum-symbol.o.hash.abi: Likewise.
	* tests/data/test-read-ctf/test-enum.o.abi: Likewise.
	* tests/data/test-read-ctf/test-forward-type-decl.abi: Likewise.
	* tests/data/test-read-ctf/test-functions-declaration.abi: Likewise.
	* tests/data/test-read-ctf/test-list-struct.abi: Likewise.
	* tests/data/test-read-ctf/test0: Likewise.
	* tests/data/test-read-ctf/test0.abi: Likewise.
	* tests/data/test-read-ctf/test0.c: Likewise.
	* tests/data/test-read-ctf/test0.hash.abi: Likewise.
	* tests/data/test-read-ctf/test1.so.abi: Likewise.
	* tests/data/test-read-ctf/test1.so.hash.abi: Likewise.
	* tests/data/test-read-ctf/test2.so.abi: Likewise.
	* tests/data/test-read-ctf/test2.so.hash.abi: Likewise.
	* tests/data/test-read-ctf/test3.so.abi: Likewise.
	* tests/data/test-read-ctf/test3.so.hash.abi: Likewise.
	* tests/data/test-read-ctf/test4.so.abi: Likewise.
	* tests/data/test-read-ctf/test4.so.hash.abi: Likewise.
	* tests/data/test-read-ctf/test5.o.abi: Likewise.
	* tests/data/test-read-ctf/test7.o.abi: Likewise.
	* tests/data/test-read-ctf/test8.o.abi: Likewise.
	* tests/data/test-read-ctf/test9.o.abi: Likewise.
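
The ChangeLog above mentions that the corpus `origin' is now stored as a
`uint32_t' and tested with bitwise operations.  A minimal sketch of that
pattern follows; the enumerator values and the `has_origin' helper are
illustrative assumptions, and the real definitions in abg-corpus.h may differ.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative flag values; the real enumerators may differ.
enum origin : std::uint32_t
{
  ARTIFICIAL_ORIGIN = 0,
  DWARF_ORIGIN = 1u << 1,
  CTF_ORIGIN = 1u << 2,
  LINUX_KERNEL_BINARY_ORIGIN = 1u << 3
};

// Hypothetical helper: true if the given flag bit is set in value.
inline bool
has_origin(std::uint32_t value, origin flag)
{ return (value & flag) != 0; }
```

Storing the origin as a bit field lets a corpus be, for example, both
CTF-based and a Linux kernel binary at the same time, which a plain
equality comparison on an enum value cannot express.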

Signed-off-by: Guillermo E. Martinez <guillermo.e.martinez@oracle.com>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2022-05-13 09:11:37 +02:00

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
// -*- Mode: C++ -*-
//
// Copyright (C) 2021 Oracle, Inc.
//
// Author: Guillermo E. Martinez
/// @file
///
/// This file implements the CTF testsuite. It reads ELF binaries
/// containing CTF, saves them in XML corpus files and diffs the
/// corpus files against reference XML corpus files.
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <memory>
#include <string>
#include <vector>
#include "abg-ctf-reader.h"
#include "test-read-common.h"
using std::string;
using std::cerr;
using abigail::tests::read_common::InOutSpec;
using abigail::tests::read_common::test_task;
using abigail::tests::read_common::display_usage;
using abigail::tests::read_common::options;
using abigail::ctf_reader::read_context_sptr;
using abigail::ctf_reader::create_read_context;
using abigail::xml_writer::SEQUENCE_TYPE_ID_STYLE;
using abigail::xml_writer::HASH_TYPE_ID_STYLE;
using abigail::tools_utils::emit_prefix;
static InOutSpec in_out_specs[] =
{
{
"data/test-read-ctf/test0",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test0.abi",
"output/test-read-ctf/test0.abi"
},
{
"data/test-read-ctf/test0",
"",
"",
HASH_TYPE_ID_STYLE,
"data/test-read-ctf/test0.hash.abi",
"output/test-read-ctf/test0.hash.abi"
},
{
"data/test-read-ctf/test1.so",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test1.so.abi",
"output/test-read-ctf/test1.so.abi"
},
{
"data/test-read-ctf/test1.so",
"",
"",
HASH_TYPE_ID_STYLE,
"data/test-read-ctf/test1.so.hash.abi",
"output/test-read-ctf/test1.so.hash.abi"
},
{
"data/test-read-ctf/test2.so",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test2.so.abi",
"output/test-read-ctf/test2.so.abi"
},
{
"data/test-read-ctf/test2.so",
"",
"",
HASH_TYPE_ID_STYLE,
"data/test-read-ctf/test2.so.hash.abi",
"output/test-read-ctf/test2.so.hash.abi"
},
{
"data/test-read-common/test3.so",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test3.so.abi",
"output/test-read-ctf/test3.so.abi"
},
{
"data/test-read-common/test3.so",
"",
"",
HASH_TYPE_ID_STYLE,
"data/test-read-ctf/test3.so.hash.abi",
"output/test-read-ctf/test3.so.hash.abi"
},
{
"data/test-read-ctf/test-enum-many.o",
"",
"",
HASH_TYPE_ID_STYLE,
"data/test-read-ctf/test-enum-many.o.hash.abi",
"output/test-read-ctf/test-enum-many.o.hash.abi"
},
{
"data/test-read-ctf/test-ambiguous-struct-A.o",
"",
"",
HASH_TYPE_ID_STYLE,
"data/test-read-ctf/test-ambiguous-struct-A.o.hash.abi",
"output/test-read-ctf/test-ambiguous-struct-A.o.hash.abi"
},
{
"data/test-read-ctf/test-ambiguous-struct-B.o",
"",
"",
HASH_TYPE_ID_STYLE,
"data/test-read-ctf/test-ambiguous-struct-B.o.hash.abi",
"output/test-read-ctf/test-ambiguous-struct-B.o.hash.abi"
},
{
"data/test-read-ctf/test-conflicting-type-syms-a.o",
"",
"",
HASH_TYPE_ID_STYLE,
"data/test-read-ctf/test-conflicting-type-syms-a.o.hash.abi",
"output/test-read-ctf/test-conflicting-type-syms-a.o.hash.abi"
},
{
"data/test-read-ctf/test-conflicting-type-syms-b.o",
"",
"",
HASH_TYPE_ID_STYLE,
"data/test-read-ctf/test-conflicting-type-syms-b.o.hash.abi",
"output/test-read-ctf/test-conflicting-type-syms-b.o.hash.abi"
},
{
"data/test-read-common/test4.so",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test4.so.abi",
"output/test-read-ctf/test4.so.abi"
},
{
"data/test-read-common/test4.so",
"",
"",
HASH_TYPE_ID_STYLE,
"data/test-read-ctf/test4.so.hash.abi",
"output/test-read-ctf/test4.so.hash.abi"
},
{
"data/test-read-ctf/test5.o",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test5.o.abi",
"output/test-read-ctf/test5.o.abi"
},
{
"data/test-read-ctf/test7.o",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test7.o.abi",
"output/test-read-ctf/test7.o.abi"
},
{
"data/test-read-ctf/test8.o",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test8.o.abi",
"output/test-read-ctf/test8.o.abi"
},
{
"data/test-read-ctf/test9.o",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test9.o.abi",
"output/test-read-ctf/test9.o.abi"
},
{
"data/test-read-ctf/test-enum.o",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test-enum.o.abi",
"output/test-read-ctf/test-enum.o.abi"
},
{
"data/test-read-ctf/test-enum-symbol.o",
"",
"",
HASH_TYPE_ID_STYLE,
"data/test-read-ctf/test-enum-symbol.o.hash.abi",
"output/test-read-ctf/test-enum-symbol.o.hash.abi"
},
{
"data/test-read-ctf/test-dynamic-array.o",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test-dynamic-array.o.abi",
"output/test-read-ctf/test-dynamic-array.o.abi"
},
{
"data/test-read-ctf/test-anonymous-fields.o",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test-anonymous-fields.o.abi",
"output/test-read-ctf/test-anonymous-fields.o.abi"
},
{
"data/test-read-common/PR27700/test-PR27700.o",
"",
"data/test-read-common/PR27700/pub-incdir",
HASH_TYPE_ID_STYLE,
"data/test-read-ctf/PR27700/test-PR27700.abi",
"output/test-read-ctf/PR27700/test-PR27700.abi",
},
{
"data/test-read-ctf/test-callback.o",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test-callback.abi",
"output/test-read-ctf/test-callback.abi",
},
{
"data/test-read-ctf/test-array-of-pointers.o",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test-array-of-pointers.abi",
"output/test-read-ctf/test-array-of-pointers.abi",
},
{
"data/test-read-ctf/test-functions-declaration.o",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test-functions-declaration.abi",
"output/test-read-ctf/test-functions-declaration.abi",
},
{
"data/test-read-ctf/test-forward-type-decl.o",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test-forward-type-decl.abi",
"output/test-read-ctf/test-forward-type-decl.abi",
},
{
"data/test-read-ctf/test-list-struct.o",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test-list-struct.abi",
"output/test-read-ctf/test-list-struct.abi",
},
{
"data/test-read-common/test-PR26568-1.o",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test-PR26568-1.o.abi",
"output/test-read-ctf/test-PR26568-1.o.abi",
},
{
"data/test-read-common/test-PR26568-2.o",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test-PR26568-2.o.abi",
"output/test-read-ctf/test-PR26568-2.o.abi",
},
{
"data/test-read-ctf/test-callback2.o",
"",
"",
SEQUENCE_TYPE_ID_STYLE,
"data/test-read-ctf/test-callback2.abi",
"output/test-read-ctf/test-callback2.abi",
},
// This should be the last entry.
{NULL, NULL, NULL, SEQUENCE_TYPE_ID_STYLE, NULL, NULL}
};
/// Task specialization to perform CTF tests.
struct test_task_ctf : public test_task
{
  test_task_ctf(const InOutSpec &s,
                string& a_out_abi_base,
                string& a_in_elf_base,
                string& a_in_abi_base);
  virtual void
  perform();

  virtual
  ~test_task_ctf()
  {}
}; // end struct test_task_ctf

/// Constructor.
///
/// Task to be executed for each CTF test entry in @ref
/// abigail::tests::read_common::InOutSpec.
///
/// @param s the test specification, i.e. one entry of the array of
/// tests.
///
/// @param a_out_abi_base the output base directory for abixml files.
///
/// @param a_in_elf_base the input base directory for object files.
///
/// @param a_in_abi_base the input base directory for expected
/// abixml files.
test_task_ctf::test_task_ctf(const InOutSpec &s,
                             string& a_out_abi_base,
                             string& a_in_elf_base,
                             string& a_in_abi_base)
  : test_task(s, a_out_abi_base, a_in_elf_base, a_in_abi_base)
{}

/// The thread function to execute each CTF test entry in @ref
/// abigail::tests::read_common::InOutSpec.
///
/// This reads the corpus into memory, saves it to disk, loads it
/// again and compares the new in-memory representation against the
/// saved one.
void
test_task_ctf::perform()
{
  abigail::ir::environment_sptr env;

  set_in_elf_path();
  set_in_suppr_spec_path();

  env.reset(new abigail::ir::environment);
  abigail::elf_reader::status status =
    abigail::elf_reader::STATUS_UNKNOWN;

  ABG_ASSERT(abigail::tools_utils::file_exists(in_elf_path));

  read_context_sptr ctxt = create_read_context(in_elf_path,
                                               env.get());
  ABG_ASSERT(ctxt);

  corpus_sptr corp = read_corpus(ctxt.get(), status);

  // if there is no output and no input, assume that we do not care about the
  // actual read result, just that it succeeded.
  if (!spec.in_abi_path && !spec.out_abi_path)
    {
      // Phew! we made it here and we did not crash! yay!
      return;
    }
  if (!corp)
    {
      error_message = string("failed to read ") + in_elf_path + "\n";
      is_ok = false;
      return;
    }
  corp->set_path(spec.in_elf_path);
  // Do not take architecture names in comparison so that these
  // test input binaries can come from whatever arch the
  // programmer likes.
  corp->set_architecture_name("");

  if (!(is_ok = set_out_abi_path()))
    return;

  if (!(is_ok = serialize_corpus(out_abi_path, corp)))
    return;

  if (!(is_ok = run_abidw("--ctf ")))
    return;

  if (!(is_ok = run_diff()))
    return;
}

/// Create a new CTF task instance to be executed by the testsuite.
///
/// @param s the @ref abigail::tests::read_common::InOutSpec
/// tests container.
///
/// @param a_out_abi_base the output base directory for abixml files.
///
/// @param a_in_elf_base the input base directory for object files.
///
/// @param a_in_abi_base the input base directory for abixml files.
///
/// @return abigail::tests::read_common::test_task instance.
static test_task*
new_task(const InOutSpec* s, string& a_out_abi_base,
         string& a_in_elf_base, string& a_in_abi_base)
{
  return new test_task_ctf(*s, a_out_abi_base,
                           a_in_elf_base, a_in_abi_base);
}

int
main(int argc, char *argv[])
{
  options opts;
  if (!parse_command_line(argc, argv, opts))
    {
      if (!opts.wrong_option.empty())
        emit_prefix(argv[0], cerr)
          << "unrecognized option: " << opts.wrong_option << "\n";
      display_usage(argv[0], cerr);
      return 1;
    }

  // compute number of tests to be executed.
  const size_t num_tests = sizeof(in_out_specs) / sizeof(InOutSpec) - 1;

  return run_tests(num_tests, in_out_specs, opts, new_task);
}