Paper Information - A Space Efficient Algorithm for Finding the Best Non-Overlapping Alignment Score


Repeating patterns make up a significant fraction of DNA and protein molecules. These repeating regions are important to biological function because they may act as catalytic, regulatory or evolutionary sites and because they have been implicated in human disease. Additionally, these regions often serve as useful laboratory tools for such tasks as localizing genes on a chromosome and DNA fingerprinting. In this paper, we present a space efficient algorithm for finding the maximum alignment score for any two substrings of a single string T under the condition that the substrings do not overlap. In a biological context, this corresponds to the largest repeating region in the molecule. The algorithm runs in O(n² log² n) time and uses only O(n²) space.
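
To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the paper's algorithm. It relies on the observation that any two non-overlapping substrings of T are separated by some split point k, so the best non-overlapping score is the maximum, over all k, of the best local alignment score between T[:k] and T[k:]. This baseline takes roughly O(n³) time. The function names and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration only.

```python
def local_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best local alignment score between a and b (Smith-Waterman style).

    Only two rows of the dynamic programming table are kept at a time.
    The scoring values are illustrative assumptions, not from the paper.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def best_nonoverlapping_score(t, **scoring):
    """Brute-force baseline: try every split point k of t.

    The two aligned substrings come from t[:k] and t[k:], so they
    cannot overlap.
    """
    return max(
        (local_alignment_score(t[:k], t[k:], **scoring)
         for k in range(1, len(t))),
        default=0,
    )
```

For example, `best_nonoverlapping_score("ACGTACGT")` evaluates the split after the fourth character, where the two copies of `ACGT` align exactly. The split-point formulation is simple to verify, which makes it a useful reference when checking a faster implementation on small inputs.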

Gary Benson
