libabigail/tests/test-diff2.cc

// -*- Mode: C++ -*-
// Copyright (C) 2013-2015 Red Hat, Inc.
//
// This file is part of the GNU Application Binary Interface Generic
// Analysis and Instrumentation Library (libabigail). This library is
// free software; you can redistribute it and/or modify it under the
// terms of the GNU Lesser General Public License as published by the
// Free Software Foundation; either version 3, or (at your option) any
// later version.
// This library is distributed in the hope that it will be useful, but
// WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
// Lesser General Public License for more details.
// You should have received a copy of the GNU Lesser General Public
// License along with this program; see the file COPYING-LGPLV3. If
// not, see <http://www.gnu.org/licenses/>.
//
// Author: Dodji Seketeli
//
/// @file
///
/// This file implements a simple command line utility for
/// interactively testing the diff2 algorithms declared and defined in
/// abg-diff-utils.{h,cc}
///
/// The resulting binary name is testdiff2. Run it to see a help message
/// showing you how to use it.
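///
/// A few illustrative invocations, using the options listed in the
/// help message below.  The ./testdiff2 path and the two input
/// strings are assumptions made for the sake of the example; they are
/// not taken from the test suite.
///
/// @code
///   ./testdiff2 --seslen abcabba cbabac
///   ./testdiff2 --seslen --reverse abcabba cbabac
///   ./testdiff2 --middle-snake abcabba cbabac
///   ./testdiff2 --lcs abcabba cbabac
///   ./testdiff2 --ses abcabba cbabac
/// @endcode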
#include <cstring>
#include <iostream>
#include <string>
#include "abg-diff-utils.h"
using std::cout;
using std::string;
using abigail::diff_utils::ses_len;
using abigail::diff_utils::point;
using abigail::diff_utils::snake;
using abigail::diff_utils::compute_middle_snake;
using abigail::diff_utils::print_snake;
using abigail::diff_utils::compute_lcs;
using abigail::diff_utils::edit_script;
using abigail::diff_utils::compute_ses;
using abigail::diff_utils::display_edit_script;

/// The options set from the command line.
struct options
{
  bool show_help;
  bool ses_len;
  bool reverse;
  bool middle_snake;
  bool lcs;
  bool ses;
  const char* str1;
  const char* str2;

  options()
    : show_help(false),
      ses_len(false),
      reverse(false),
      middle_snake(false),
      lcs(false),
      ses(false),
      str1(0),
      str2(0)
  {}
};// end struct options

/// Print the usage message to standard output.
static void
show_help(const string& progname)
{
  cout << "usage: " << progname << " [options] str1 str2\n"
       << "where [options] can be:\n"
       << "--seslen print the length of the SES of the two strings\n"
       << "--reverse compute the d-paths in reverse order when applicable\n"
       << "--middle-snake display middle snake & length of SES\n"
       << "--lcs display the longest common subsequence\n"
       << "--ses display the shortest edit script transforming str1 into str2\n";
}

/// Fill OPTS from the command line.  OPTS.show_help is set whenever
/// the command line is not usable as is.
static void
parse_command_line(int argc, char* argv[], options &opts)
{
  if (argc < 3)
    {
      opts.show_help = true;
      return;
    }

  for (int i = 1; i < argc; ++i)
    {
      if (argv[i][0] != '-')
	{
	  if (!opts.str1)
	    opts.str1 = argv[i];
	  else if (!opts.str2)
	    opts.str2 = argv[i];
	  else
	    {
	      opts.show_help = true;
	      return;
	    }
	}
      else if (strcmp(argv[i], "--seslen") == 0)
	opts.ses_len = true;
      else if (strcmp(argv[i], "--reverse") == 0)
	opts.reverse = true;
      else if (strcmp(argv[i], "--middle-snake") == 0)
	opts.middle_snake = true;
      else if (strcmp(argv[i], "--lcs") == 0)
	opts.lcs = true;
      else if (strcmp(argv[i], "--ses") == 0)
	opts.ses = true;
      else
	opts.show_help = true;
    }

  // Both strings are required; without this check, a command line
  // made only of options would pass null pointers to the diff
  // functions called from main.
  if (!opts.str1 || !opts.str2)
    opts.show_help = true;
}

int
main(int argc, char*argv[])
{
  options opts;

  parse_command_line(argc, argv, opts);

  if (opts.show_help)
    {
      show_help(argv[0]);
      return -1;
    }

  if (opts.ses_len)
    {
      int len = ses_len(opts.str1, opts.str2, opts.reverse);
      cout << len << "\n";
      return 0;
    }

  if (opts.middle_snake)
    {
      int ses_len = 0;
      snake s;
      if (compute_middle_snake(opts.str1, opts.str2,
                               s, ses_len))
        {
          print_snake(opts.str1, opts.str2, s, cout);
          cout << "ses len: " << ses_len << "\n";
        }
      return 0;
    }

  if (opts.lcs)
    {
      string lcs;
      int ses_len = 0;
      compute_lcs(opts.str1, opts.str2, ses_len, lcs);
      cout << "lcs: " << lcs << "\n"
           << "ses len: " << ses_len << "\n";
      return 0;
    }

  if (opts.ses)
    {
      edit_script ses;
      compute_ses(opts.str1, opts.str2, ses);
      display_edit_script(ses, opts.str1, opts.str2, cout);
      return 0;
    }

  return 0;
}