Rotations of Periodic Strings and Short Superstrings

This paper presents two simple approximation algorithms for the shortest superstring problem, with approximation ratios $2\frac{2}{3}$ ($\approx 2.67$) and $2\frac{25}{42}$ ($\approx 2.596$). The framework of our improved algorithms is similar to that of previous algorithms in the sense that they construct a superstring by computing some optimal cycle covers on the distance graph of the given strings and then break and merge the cycles to finally obtain a Hamiltonian path, but we make use of new bounds on the overlap between two strings. We prove that for each periodic semi-infinite string $\alpha = a_1 a_2 \cdots$ of period $q$, there exists an integer $k$ such that for any (finite) string $s$ of period $p$ which is inequivalent to $\alpha$, the overlap between $s$ and the rotation $\alpha_k = a_k a_{k+1} \cdots$ is at most $p + \frac{1}{2}q$. Moreover, if $p \le q$, then the overlap between $s$ and $\alpha_k$ is not larger than $\frac{2}{3}(p + q)$. The bounds are tight. In the previous shortest superstring algorithms, $p + q$ was used as the standard (tight) bound on the overlap between two strings with periods $p$ and $q$.
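The quantities in the overlap bounds can be made concrete with a small sketch. The Python below is an illustration only, not the paper's algorithm: the helper names `period`, `overlap`, and `rotation_prefix` are hypothetical, the semi-infinite string is approximated by a finite prefix, and the overlap is taken as the longest suffix of $s$ that is a prefix of the rotation. It shows how the choice of $k$ changes the overlap for one hand-picked pair of inequivalent strings.

```python
def period(s: str) -> int:
    """Smallest p >= 1 such that s[i] == s[i + p] for every valid i."""
    for p in range(1, len(s) + 1):
        if all(s[i] == s[i + p] for i in range(len(s) - p)):
            return p
    return len(s)  # only reached for the empty string


def overlap(s: str, t: str) -> int:
    """Length of the longest suffix of s that is also a prefix of t."""
    for length in range(min(len(s), len(t)), 0, -1):
        if s[-length:] == t[:length]:
            return length
    return 0


def rotation_prefix(root: str, k: int, n: int) -> str:
    """First n characters of the rotation a_k a_{k+1} ... of root^infinity.

    k is 1-indexed, matching the abstract's notation (an assumption of
    this sketch, as is truncating the semi-infinite string to n characters).
    """
    return "".join(root[(k - 1 + i) % len(root)] for i in range(n))


if __name__ == "__main__":
    root = "aab"   # alpha = aabaab..., period q = 3
    s = "aaaa"     # period p = 1, inequivalent to alpha
    p, q = period(s), period(root * 3)
    for k in range(1, q + 1):
        ov = overlap(s, rotation_prefix(root, k, 3 * q))
        print(f"k={k}: overlap={ov}, p + q/2 = {p + q / 2}, p + q = {p + q}")
```

For this pair the overlap depends on the rotation chosen, and every rotation here stays below both $p + \frac{1}{2}q$ and the classical $p + q$. A single example like this illustrates the definitions and does not test the theorem, which concerns one $k$ that works for all inequivalent $s$.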
