Edit Distance with Block Deletions

Several variants of the edit distance problem with block deletions are considered. Polynomial-time algorithms are presented that compute the optimal edit distance with block deletions when character insertions and character moves are allowed but block moves are not. We show that edit distance with both block moves and block deletions is NP-complete (that is, the problem is in NP, so any proposed solution can be verified in polynomial time, and every problem in NP can be reduced to it in polynomial time), and that it can be reduced, within a constant factor, to the variant with non-recursive block moves and block deletions.
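
To make the notion of a block deletion concrete, the following is a minimal dynamic-programming sketch under a deliberately simplified cost model that is assumed here for illustration and is not taken from the paper: deleting any contiguous block of the source costs 1, inserting a single character costs 1, matching equal characters is free, and character moves and block moves are omitted entirely. The function name `block_delete_distance` is hypothetical.

```python
def block_delete_distance(source: str, target: str) -> int:
    """Cost of turning source into target under the simplified model:
    unit-cost deletion of a contiguous block, unit-cost character
    insertion, free matches. Moves are NOT modelled (assumption)."""
    n, m = len(source), len(target)
    inf = n + m + 2  # larger than any achievable cost (at most 1 + m)

    # dist[i][j]: cheapest way to turn source[:i] into target[:j]
    dist = [[inf] * (m + 1) for _ in range(n + 1)]
    # best[j]: minimum of dist[k][j] over all rows k already completed
    best = [inf] * (m + 1)

    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                cost = 0
            else:
                cost = inf
                if j > 0:
                    # insert target[j-1]
                    cost = min(cost, dist[i][j - 1] + 1)
                if i > 0:
                    # delete the block source[k:i] for the best k < i
                    cost = min(cost, best[j] + 1)
                if i > 0 and j > 0 and source[i - 1] == target[j - 1]:
                    # characters match, no cost
                    cost = min(cost, dist[i - 1][j - 1])
            dist[i][j] = cost
        # fold the finished row into the running minima only now,
        # so that row i + 1 sees exactly the rows k <= i
        for j in range(m + 1):
            best[j] = min(best[j], dist[i][j])

    return dist[n][m]


if __name__ == "__main__":
    print(block_delete_distance("abXYZc", "abc"))
```

Keeping a running minimum per column lets each block deletion be evaluated in constant time, so this sketch runs in O(nm) time for strings of lengths n and m. Under this model, removing the block "XYZ" from "abXYZc" counts as a single operation, whereas ordinary character-level edit distance would charge three deletions.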
