Commit Graph

14 Commits

Author SHA1 Message Date
Dodji Seketeli
ef94728513 Small optimization hinted by profiling
* include/abg-diff-utils.h (d_path_vec::max_d): Avoid using member
	functions.  This is relevant only when compiling w/o optimization.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2014-02-10 15:51:10 +01:00
Dodji Seketeli
de342efd76 Fix allocation of diff_utils::d_path_vec
* include/abg-diff-utils.h (d_path_vec::d_path_vec):  Do not
	forget to allocate enough data for reverse vectors as well.  The
	comment of the constructor is accurate.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2014-02-10 15:51:10 +01:00
Dodji Seketeli
e8dd2cf705 Ease debugging of abigail::diff_utils::compute_diff
* include/abg-diff-utils.h (compute_diff): Add asserts on for the
	length of the shortest edit script during the divide and conquer
	part of the diff algorithm.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2014-02-10 15:51:09 +01:00
Dodji Seketeli
b219598e7b Fix further reaching reverse path calculation in core diff algo
* include/abg-diff-utils.h (end_of_frr_d_path_in_k_plus_delta):
	Favour moving left when the two abscissas at the previous steps
	are equal.
	(compute_diff): Update the length of the shortest edit script when
	the size of one of the inputs is zero.
	* tests/test-core-diff.cc (in_out_spec): Add a new input to diff
	two sequences for regression testing.
	* tests/data/test-core-diff/report13.txt: New reference for
	the comparison of the new regression test above.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2014-02-10 15:51:09 +01:00
Dodji Seketeli
e6e4e8425e Remove debugging assertion when diffing
* include/abg-diff-utils.h (d_path_vec::at): Do not check for
	bounds.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2014-01-20 12:15:07 +01:00
Dodji Seketeli
73814ba4fa Misc Doxygen API doc fixes
* include/abg-comparison.h: Various doxygen api doc string fixes.
	* include/abg-diff-utils.h: Likewise.
	* include/abg-dwarf-reader.h: Likewise.
	* include/abg-ir.h: Likewise.
	* include/abg-reader.h: Likewise.
	* include/abg-writer.h: Likewise.
	* src/abg-comparison.cc: Likewise.
	* src/abg-corpus.cc: Likewise.
	* src/abg-dwarf-reader.cc: Likewise.
	* src/abg-ir.cc: Likewise.
	* src/abg-libxml-utils.cc: Likewise.
	* src/abg-reader.cc: Likewise.
	* src/abg-writer.cc: Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2014-01-17 15:44:25 +01:00
Dodji Seketeli
fbb6b1bc73 Initial support for diffing ABI corpus files
* include/abg-comparison.h (string_function_ptr_map)
	(changed_function_ptr, string_changed_function_ptr_map)
	(corpus_diff_sptr): New convenience typedefs.
	(translation_unit_diff): Add comments.
	(class corpus_diff): New type.
	(compute_diff): New overload for corpus_diff.
	* include/abg-corpus.h (corpus::{functions, variables}): New
	typedefs.
	(corpus::{operator==, get_functions, get_variables}): New members.
	* include/abg-diff-utils.h (struct deep_ptr_eq_functor): New
	functor.
	* include/abg-ir.h (translation_unit::operator==): New member
	equality operator.
	* src/abg-comparison.cc (struct corpus_diff::priv): New private
	struct holding the private members of corpus_diff.
	(corpus_diff::priv::{lookup_tables_empty, clear_lookup_tables,
	ensure_lookup_tables_populated}): Define new private member functions.
	(corpus_diff::{corpus_diff, first_corpus, second_corpus,
	function_changes, variable_changes, length, report}): New public members.
	(struct noop_deleter): New struct.
	(compute_diff): New implementation for corpus_diff.
	* src/abg-corpus.cc (struct corpus::priv): Renamed corpus::impl
	into this.  Add new fns, vars and is_symbol_table_built data
	members.
	(corpus::priv::build_symbol_table): New member function.
	(class symtab_build_visitor_type): New visitor type to build the
	symbol table.
	(struct func_comp, struct var_comp): New comparison functors.
	(corpus::priv::build_symbol_table): Define new member function.
	(corpus::{corpus, add, get_translation_units, operator==,
	get_functions, get_variables}): Define new members.
	* src/abg-ir.cc (translation_unit::operator==): Define new member
	equality operator.
	(operator==(translation_unit_sptr l, translation_unit_sptr r)):
	Define new equality operator.
	* tools/abg-tools-utils.h (enum file_type): New enum.
	(guess_file_type): Declare new function.
	* tools/abg-tools-utils.cc (guess_file_type): define new function.
	* tools/bidiff.cc (main): Guess the type of the files given in
	input and support elf files reading and diffing.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2013-12-23 14:05:19 +01:00
Dodji Seketeli
dbc0225415 Generalize use of equality operator in core diff algorithms
* include/abg-diff-utils.h (struct default_eq_functor): New
	equality functor.
	(end_of_fr_d_path_in_k, end_of_frr_d_path_in_k_plus_delta): Add a
	new equality functor template parameter and document it.  Use it
	to compare the elements of the sequences given in argument.
	(compute_middle_snake, ses_len, compute_diff): Add a new equality
	functor template parameter and document it.  Adjust call to
	end_of_frr_d_path_in_k_plus_delta, end_of_fr_d_path_in_k and
	compute_middle_snake.
	(ses_len, compute_diff): Add a new overload that uses a
	default_eq_functor as comparison functor, to avoid breaking
	existing client code.
	* src/abg-diff-utils.cc (compute_middle_snake): Adjust the call to
	the compute_middle_snake.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2013-12-23 13:40:14 +01:00
Dodji Seketeli
9be945584a Re-write middle snakes management in core diff algorithms
* include/abg-diff-utils.h (point::set): New overload..
	(point::{add, operator<, operator>, operator<=, operator>=}): New
	methods.
	(point::operator!=): Constify.
	(point::operator==): Constify. Cleanup.
	(point::operator=): Keep emptiness.
	(class snake): New class definition
	(d_path_vec::{over_bounds, offset}): New methods.
	(d_path_vec::check_index_against_bound): Don't take a bound
	parameter anymore.  Use the new over_bound method above.  Fix up
	error reporting.
	(d_path_vec::d_path_vec): Fix d_path_vec size allocation.
	(d_path_vec::operator[]): Use the d_path_vec::at method to check
	all accesses against the bounds.  This is slower, but at least we
	can expect to have something that is more robust.  We can remove
	the bound checking later when we are sure the code has been tested
	enough.  Also use the new offset() method.
	(d_path_vec::at): Take long long.
	(ends_of_furthest_d_paths_overlap): Constify input parameters.
	(end_of_fr_d_path_in_k, end_of_frr_d_path_in_k_plus_delta): Take
	an instance of the new snake in parameter, rather than a bare end
	point that wasn't carrying enough information about the snake.
	Record the snake which consists of up to four points: a begin
	point, an intermediate point, a diagonal start point and an end
	point.  Return that snake upon successful completion.
	(compute_middle_snake): Take an instance of snake, rather than the
	two points that were supposed to represent a snake and with which
	we were loosing information before.  Revisit/simplify the logic of
	this function; this literally goes forward or in reverse, gets the
	resulting snake returned by the end_of_fr_d_path_in_k and
	end_of_frr_d_path_in_k_plus_delta functions, detect if these snakes
	overlap and just return the current snake.  Much simpler.  The
	caller now gets a snake, which has much more information than the
	previous snake approximation made of just two points.  Bonus
	point, this follows almost to the word, what the paper says.
	(maybe_record_match_point, find_snake_start_point): Remove these
	as there are not used by compute_middle_snake anymore.
	(print_snake, ses_len): Update these to take/handle a snake.
	(snake_end_points): New declaration.
	(compute_diff): When we are getting an empty first sequence, this
	means that we are inserting the second sequence *before* the
	beginning of the first sequence; keep this information by setting
	the insertion point index to -1, rather than zero.  Update this to
	get/handle snakes, rather than free points vaguely representing
	snakes.  Now that compute_middle_snake returns real snakes, handle
	the information we are getting.  Basically for edit scripts of
	length equal to 1, as the snake carries all the necessary
	information about the non-diagonal edge (as well as the diagonal
	edges), we (can) now precisely update the current edit script (as
	well as the longest common sub-sequence).  For edit scripts of
	length greater than 1, better at which points to divide the
	problem and consequently, at which points to conquer it back --
	better following The Paper to the letter.
	(display_edit_script): Update this for the use of instances of
	snake.
	* src/abg-diff-utils.cc (ends_of_furthest_d_paths_overlap): Update
	for constification of inputs.
	(snake_end_points): Define new function.
	(compute_middle_snake): Adapt for the taking an instance of snake.
	* tests/test-diff2.cc (main): Update for using instances of snake.
	* tests/test-core-diff.cc: Add new tests.
	* tests/data/test-core-diff/report0.txt: Update for output
	adaptation.
	* tests/data/test-core-diff/report6.txt: Likewise.
	* tests/data/test-core-diff/report7.txt: Likewise.
	* tests/data/test-core-diff/report8.txt: New test data.
	* tests/data/test-core-diff/report9.txt: Likewise.
	* tests/data/test-core-diff/report10.txt: Likewise.
	* tests/data/test-core-diff/report11.txt: Likewise.
	* tests/data/test-core-diff/report12.txt: Likewise.
	* tests/data/test-core-diff/report3.txt: Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2013-11-19 11:25:47 +01:00
Dodji Seketeli
997e2d9328 Fix middle snake determination & ses len computation for d == 1
* include/abg-diff-utils.h (compute_middle_snake): After the
	overlap determination happened, finding the middle snake can
	require keep on building the current path until the "end".  The
	end meaning reaching the max of D.  And that max is (M + N)/2 + 1.
	In the extreme cases were middle snake was on the very last step
	(M + N) + 1, we were not finding the middle snake.  Fix this.
	(compute_diff): When d == 1 and the first edge on the edit graph
	is a non-diagonal edge and when a_base != a_begin, we were failing
	to properly initialize x,y to find that non-diagonal edge.  Also
	we were failing to correctly compute the size of the sequence.
	Fix these.
	* tests/test-core-diff.cc: Add a new regression test for the two
	cases above.
	* tests/data/test-core-diff/report7.txt: New reference data for
	the new regression test.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2013-11-19 11:22:56 +01:00
Dodji Seketeli
dca9690970 Simplify & cleanup compute_diff core api
* include/abg-diff-utils.h (insertion::inserted_): Changed the
	type of this from vector<int> to vector<unsigned>.
	(insertion::{insertion, inserted_indexes}): Adjust.
	(compute_diff): Add two new simpler overloads.  Implement them in
	term of the former more complex overload.
	(compute_lcs): Adjust for the vector<int> -> vector<unsigned>
	change.
	* src/abg-diff-utils.cc (compute_lcs, compute_ses): Adjust for the
	compute_diff change above.
	* src/abg-comparison.cc (compute_diff, report_changes): Adjust for
	the compute_diff & vector<unsigned> changes above..

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2013-11-19 11:22:05 +01:00
Dodji Seketeli
fe1b7062eb Fix middle snake determination
* include/abg-diff-utils.h (point::{operator!=,operator==}): New
	operators.
	(end_of_fr_d_path_in_k, end_of_frr_d_path_in_k_plus_delta): Allow
	the initial point (-1,-1) that is not a point addressing elements
	of the input sequences, but that is the starting point of the
	forward paths and the ending point of reverse paths in the "Linear
	Refinement" of the algorithm.
	(is_match_point, maybe_record_match_point)
	(find_snake_start_point): New functions.
	(find_last_snake_in_path): Remove this.  It's not used anymore.
	(compute_middle_snake): Allow checking for overlapping paths even
	on points that are outside of the edit graph boundaries.  Once the
	overlap is detected, if a non-empty snake has been seen already,
	report it as the middle snake.  Otherwise, keep building the path
	until the end and report the last snake encountered as the middle
	snake.  Add comments.
	(compute_diff): For the d == 1 case, fix the logic of the finding
	the non-diagonal edge.  Fix typos.  Add comments.
	(display_edit_script): Fix report glitches.
	* tests/data/test-core-diff/report3.txt: Update as per the report
	glitch above.
	* tests/data/test-core-diff/report4.txt: Likewise.
	* tests/data/test-core-diff/report5.txt: Likewise.
	* tests/data/test-core-diff/report6.txt: New reference report for
	a new test.
	* tests/test-core-diff.cc: Add a new test for negative delta.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2013-11-19 11:20:19 +01:00
Dodji Seketeli
0b199ebe03 Fix core diff algorithms for negative deltas
* diff2.h (point::point): New copy constructor.
	(point::{operator+=, operator=}):  Use point::set.
	(point::{operator--, operator++,}): New operators.
	(d_path_vec::{a_size_, b_size_}): New members.
	(d_path_vec::max_d_): Remove this member.
	(d_path_vec::max_d): Compute this, now that max_d_ was removed.
	(point_is_valid_in_graph): Declare this new function.
	(end_of_fr_d_path_in_k, ): Return
	a bool when the end of furthest reaching past found is within the
	bounds of the edit graph.  Add comments.
	(end_of_frr_d_path_in_k_plus_delta): Likewise.  Also, delta can be
	negative; support that.  Do not cross the boundaries of the edit
	graph when following a diagonal edge.
	(find_last_snake_in_path): New function.
	(compute_middle_snake): Make forward/reverse d_path_vec be big
	enough to hold paths for M+N differences.  Normally M+N/2 should
	be enough, but we were getting weird out of bound errors.  Let's
	handle it this way for now.  Do not require that we check for
	overlap only when we are on a diagonal edge.  Once we detected an
	overlap, use the new find_last_snake_in_path to find the
	boundaries of the snake.
	(ses_len): Delta can be negative.
	(display_edit): Small minor English nit.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2013-11-19 11:19:34 +01:00
Dodji Seketeli
f54ad28548 Lay down the foundations of computing the diff between two class_decl
* include/abg-diff-utils.h: New file.
	* src/abg-diff-utils.cc: Likewise.  Implement the code diffing
	algorithms from Eugene Myers.
	* include/abg-comparison.h: New file. First short at defining the
	basic APIs to compute the diff of two classes.
	* src/abg-comparison.cc: New file.  Start the implementation of
	the above header.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2013-11-19 11:19:21 +01:00