Program Systems: Theory and Applications, 2015 Volume 6, Issue 1, Pages 189–197 (Mi ps164)

This article is cited in 3 papers

Mathematical Foundations of Programming

A model and algorithm for sequence alignment

S. V. Znamenskij

Program Systems Institute of RAS

Abstract: The change detection problem consists in identifying the common and the differing strings of two texts, and its solution is usually not unique. The best alignment is canonically identified by finding a longest common subsequence (LCS), an approach widely used for various purposes. However, many recent version control systems prefer alternative heuristic algorithms that are not only faster but also usually produce a better alignment than an LCS does.
Two basic shortcomings of known alignment algorithms are outlined in the paper. The sequence alignment problem is considered as an abstract model of change detection in collaborative text editing, designed to minimize the probability of a merge conflict. A new cost function is defined as the probability of overlap between the detected changes and a random string. Optimizing this function avoids both shortcomings mentioned above. A simple cubic-time algorithm is proposed.
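
For orientation, the canonical LCS-based alignment that the abstract takes as its baseline can be sketched as below. This is the classical quadratic dynamic-programming method, not the cubic algorithm or the probabilistic cost function proposed in the paper; the function name lcs_alignment and its return format are assumptions made only for this illustration.

```python
def lcs_alignment(a, b):
    """Return index pairs (i, j) with a[i] == b[j] that form one longest
    common subsequence of the sequences a and b (classical baseline)."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the start to recover one optimal alignment.
    pairs = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1  # ties are broken arbitrarily, here in favour of skipping a[i]
        else:
            j += 1
    return pairs


# Two "files" given as lists of lines; the LCS here has length 1,
# and both (0, 1) and (1, 0) are equally long alignments.
old = ["a", "b"]
new = ["b", "a"]
print(lcs_alignment(old, new))
```

The example input has two equally long alignments, and the sketch returns whichever one its tie-breaking rule happens to reach. This is the non-uniqueness the abstract refers to: the LCS length alone does not say which alignment is preferable.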

Key words and phrases: similarity of strings, sequence alignment, software development, diff, LCS, edit distance, Levenshtein metric.

UDC: 004.416

Received: 14.12.2014
Accepted: 28.01.2015

Language: English


