Commit Graph

5 Commits

Author SHA1 Message Date
Dodji Seketeli
b219598e7b Fix further reaching reverse path calculation in core diff algo
* include/abg-diff-utils.h (end_of_frr_d_path_in_k_plus_delta):
	Favour moving left when the two abscissas at the previous steps
	are equal.
	(compute_diff): Update the length of the shortest edit script when
	the size of one of the inputs is zero.
	* tests/test-core-diff.cc (in_out_spec): Add a new input to diff
	two sequences for regression testing.
	* tests/data/test-core-diff/report13.txt: New reference for
	the comparison of the new regression test above.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2014-02-10 15:51:09 +01:00
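
The compute_diff change above concerns the degenerate case where one input is empty. A minimal sketch of that case, assuming string inputs; the helper name ses_len_with_empty_input is hypothetical and is not taken from abg-diff-utils.h:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, for illustration only: length of the shortest
// edit script when at least one of the two inputs is empty.
inline std::size_t
ses_len_with_empty_input(const std::string& a, const std::string& b)
{
  // This sketch only covers the degenerate case.
  assert(a.empty() || b.empty());
  // If a is empty, every element of b is inserted; if b is empty,
  // every element of a is deleted.  Either way the script length is
  // the size of the non-empty input (or zero if both are empty).
  return a.empty() ? b.size() : a.size();
}
```

Diffing an empty sequence against "abc" therefore yields a script of length 3 (three insertions), which is the kind of value the regression test is meant to pin down.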
Dodji Seketeli
9be945584a Re-write middle snakes management in core diff algorithms
* include/abg-diff-utils.h (point::set): New overload.
	(point::{add, operator<, operator>, operator<=, operator>=}): New
	methods.
	(point::operator!=): Constify.
	(point::operator==): Constify. Cleanup.
	(point::operator=): Keep emptiness.
	(class snake): New class definition.
	(d_path_vec::{over_bounds, offset}): New methods.
	(d_path_vec::check_index_against_bound): Don't take a bound
	parameter anymore.  Use the new over_bounds method above.  Fix up
	error reporting.
	(d_path_vec::d_path_vec): Fix d_path_vec size allocation.
	(d_path_vec::operator[]): Use the d_path_vec::at method to check
	all accesses against the bounds.  This is slower, but at least we
	can expect to have something that is more robust.  We can remove
	the bound checking later when we are sure the code has been tested
	enough.  Also use the new offset() method.
	(d_path_vec::at): Take long long.
	(ends_of_furthest_d_paths_overlap): Constify input parameters.
	(end_of_fr_d_path_in_k, end_of_frr_d_path_in_k_plus_delta): Take
	an instance of the new snake in parameter, rather than a bare end
	point that wasn't carrying enough information about the snake.
	Record the snake which consists of up to four points: a begin
	point, an intermediate point, a diagonal start point and an end
	point.  Return that snake upon successful completion.
	(compute_middle_snake): Take an instance of snake, rather than the
	two points that were supposed to represent a snake and with which
	we were losing information before.  Revisit/simplify the logic of
	this function; this literally goes forward or in reverse, gets the
	resulting snake returned by the end_of_fr_d_path_in_k and
	end_of_frr_d_path_in_k_plus_delta functions, detects if these snakes
	overlap and just returns the current snake.  Much simpler.  The
	caller now gets a snake, which has much more information than the
	previous snake approximation made of just two points.  Bonus
	point, this follows almost to the word, what the paper says.
	(maybe_record_match_point, find_snake_start_point): Remove these
	as they are not used by compute_middle_snake anymore.
	(print_snake, ses_len): Update these to take/handle a snake.
	(snake_end_points): New declaration.
	(compute_diff): When we are getting an empty first sequence, this
	means that we are inserting the second sequence *before* the
	beginning of the first sequence; keep this information by setting
	the insertion point index to -1, rather than zero.  Update this to
	get/handle snakes, rather than free points vaguely representing
	snakes.  Now that compute_middle_snake returns real snakes, handle
	the information we are getting.  Basically for edit scripts of
	length equal to 1, as the snake carries all the necessary
	information about the non-diagonal edge (as well as the diagonal
	edges), we (can) now precisely update the current edit script (as
	well as the longest common sub-sequence).  For edit scripts of
	length greater than 1, we now know better at which points to
	divide the problem and consequently, at which points to conquer
	it back -- better following The Paper to the letter.
	(display_edit_script): Update this for the use of instances of
	snake.
	* src/abg-diff-utils.cc (ends_of_furthest_d_paths_overlap): Update
	for constification of inputs.
	(snake_end_points): Define new function.
	(compute_middle_snake): Adapt for taking an instance of snake.
	* tests/test-diff2.cc (main): Update for using instances of snake.
	* tests/test-core-diff.cc: Add new tests.
	* tests/data/test-core-diff/report0.txt: Update for output
	adaptation.
	* tests/data/test-core-diff/report6.txt: Likewise.
	* tests/data/test-core-diff/report7.txt: Likewise.
	* tests/data/test-core-diff/report8.txt: New test data.
	* tests/data/test-core-diff/report9.txt: Likewise.
	* tests/data/test-core-diff/report10.txt: Likewise.
	* tests/data/test-core-diff/report11.txt: Likewise.
	* tests/data/test-core-diff/report12.txt: Likewise.
	* tests/data/test-core-diff/report3.txt: Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2013-11-19 11:25:47 +01:00
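
The commit message above describes a snake as up to four points: a begin point, an intermediate point, a diagonal start point and an end point. The following is a rough sketch of such a structure, not the class from include/abg-diff-utils.h. The type and member names (point_sketch, snake_sketch, edge_kind) are hypothetical, and it assumes that begin/intermediate delimit the non-diagonal edge, that the snake is a forward one, and that points address sequence elements inclusively.

```cpp
#include <cassert>

// Illustrative stand-in for a point that can be empty (unset).
struct point_sketch
{
  int x = -1;
  int y = -1;
  bool empty = true;

  void set(int new_x, int new_y) { x = new_x; y = new_y; empty = false; }
};

enum class edge_kind { none, deletion, insertion };

// Illustrative stand-in for a snake made of up to four points.
struct snake_sketch
{
  point_sketch begin;          // assumed: start of the non-diagonal edge
  point_sketch intermediate;   // assumed: end of the non-diagonal edge
  point_sketch diagonal_start; // first point of the diagonal run
  point_sketch end;            // last point of the snake

  // Classify the non-diagonal edge of a forward snake.  In the edit
  // graph, a horizontal move deletes an element of the first sequence
  // and a vertical move inserts an element of the second one.
  edge_kind non_diagonal_edge() const
  {
    if (begin.empty || intermediate.empty)
      return edge_kind::none;
    if (intermediate.x == begin.x + 1 && intermediate.y == begin.y)
      return edge_kind::deletion;
    if (intermediate.y == begin.y + 1 && intermediate.x == begin.x)
      return edge_kind::insertion;
    return edge_kind::none;
  }

  // Number of matched elements on the diagonal run, assuming both
  // diagonal_start and end are inclusive.
  int diagonal_length() const
  {
    if (diagonal_start.empty || end.empty)
      return 0;
    return end.x - diagonal_start.x + 1;
  }
};
```

With all four points recorded, a caller handling an edit script of length 1 can read both the single non-diagonal edge and the matched run directly from the snake, which is the information a bare pair of end points could not carry.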
Dodji Seketeli
997e2d9328 Fix middle snake determination & ses len computation for d == 1
* include/abg-diff-utils.h (compute_middle_snake): After the
	overlap determination happened, finding the middle snake can
	require to keep building the current path until the "end".  The
	end meaning reaching the max of D.  And that max is (M + N)/2 + 1.
	In the extreme cases where the middle snake was on the very last
	step, (M + N)/2 + 1, we were not finding the middle snake.  Fix this.
	(compute_diff): When d == 1 and the first edge on the edit graph
	is a non-diagonal edge and when a_base != a_begin, we were failing
	to properly initialize x,y to find that non-diagonal edge.  Also
	we were failing to correctly compute the size of the sequence.
	Fix these.
	* tests/test-core-diff.cc: Add a new regression test for the two
	cases above.
	* tests/data/test-core-diff/report7.txt: New reference data for
	the new regression test.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2013-11-19 11:22:56 +01:00
Dodji Seketeli
fe1b7062eb Fix middle snake determination
* include/abg-diff-utils.h (point::{operator!=,operator==}): New
	operators.
	(end_of_fr_d_path_in_k, end_of_frr_d_path_in_k_plus_delta): Allow
	the initial point (-1,-1) that is not a point addressing elements
	of the input sequences, but that is the starting point of the
	forward paths and the ending point of reverse paths in the "Linear
	Refinement" of the algorithm.
	(is_match_point, maybe_record_match_point)
	(find_snake_start_point): New functions.
	(find_last_snake_in_path): Remove this.  It's not used anymore.
	(compute_middle_snake): Allow checking for overlapping paths even
	on points that are outside of the edit graph boundaries.  Once the
	overlap is detected, if a non-empty snake has been seen already,
	report it as the middle snake.  Otherwise, keep building the path
	until the end and report the last snake encountered as the middle
	snake.  Add comments.
	(compute_diff): For the d == 1 case, fix the logic of finding
	the non-diagonal edge.  Fix typos.  Add comments.
	(display_edit_script): Fix report glitches.
	* tests/data/test-core-diff/report3.txt: Update as per the report
	glitch above.
	* tests/data/test-core-diff/report4.txt: Likewise.
	* tests/data/test-core-diff/report5.txt: Likewise.
	* tests/data/test-core-diff/report6.txt: New reference report for
	a new test.
	* tests/test-core-diff.cc: Add a new test for negative delta.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2013-11-19 11:20:19 +01:00
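
The commit message above allows the initial point (-1,-1), which does not address any element of the input sequences, and introduces is_match_point. A minimal sketch of a match-point test that tolerates such out-of-range points, assuming string inputs; the name and signature of is_match_point_sketch are hypothetical and may differ from the real function:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true when (x, y) addresses equal elements of
// a and b.  Points outside either sequence, such as the initial
// point (-1, -1), are accepted as input but never match.
inline bool
is_match_point_sketch(const std::string& a, const std::string& b,
		      int x, int y)
{
  if (x < 0 || y < 0)
    return false;
  if (static_cast<std::size_t>(x) >= a.size()
      || static_cast<std::size_t>(y) >= b.size())
    return false;
  return a[x] == b[y];
}
```

Treating out-of-range points as "no match" rather than as an error lets path-building code start from (-1,-1) without special-casing every access.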
Dodji Seketeli
90467f10f2 Initial regression test facility for core diff algorithms
* tests/data/test-core-diff/report0.txt: New test reference data.
	* tests/data/test-core-diff/report1.txt: Likewise.
	* tests/data/test-core-diff/report2.txt: Likewise.
	* tests/data/test-core-diff/report3.txt: Likewise.
	* tests/data/test-core-diff/report4.txt: Likewise.
	* tests/data/test-core-diff/report5.txt: Likewise.
	* tests/test-core-diff.cc: New regression test program.
	* tests/Makefile.am: Add these new files to the build system.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
2013-11-19 11:20:03 +01:00
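
The regression facility above pairs a test program with reference reports (report0.txt and so on). One way such a comparison could work is sketched below, assuming the test output and the reference are available as streams. The function first_differing_line is hypothetical; how tests/test-core-diff.cc actually compares its output is not stated here.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: compare two streams line by line.  Returns 0
// when they hold the same lines, otherwise the 1-based number of the
// first line that differs (including when one stream ends early).
inline unsigned
first_differing_line(std::istream& actual, std::istream& expected)
{
  std::string a, e;
  unsigned line = 0;
  for (;;)
    {
      bool got_a = static_cast<bool>(std::getline(actual, a));
      bool got_e = static_cast<bool>(std::getline(expected, e));
      ++line;
      if (!got_a && !got_e)
	return 0;	// both streams ended together: no difference
      if (got_a != got_e || a != e)
	return line;	// one ended early, or the lines differ
    }
}
```

A test driver could open the generated report and the reference file as std::ifstream objects, pass both to this function, and fail the test on any non-zero result.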