Dealing with errors in interactive sequencing by hybridization

MOTIVATION: A realistic approach to sequencing by hybridization must deal with realistic sequencing errors. Techniques developed for this setting can also be applied to similar sequencing tasks.

RESULTS: We provide the first algorithms for interactive sequencing by hybridization which are robust in the presence of hybridization errors. Under a strong error model allowing both positive and negative hybridization errors without repeated queries, we demonstrate accurate and efficient reconstruction with error rates up to 7%. Under the weaker traditional error model of Shamir and Tsur (Proceedings of the Fifth International Conference on Computational Molecular Biology (RECOMB-01), pp. 269-277, 2000), we obtain accurate reconstructions with up to 20% false negative hybridization errors. Finally, we establish theoretical bounds on the performance of the sequential probing algorithm of Skiena and Sundaram (J. Comput. Biol., 2, 333-353, 1995) under the strong error model.

AVAILABILITY: Freely available upon request.

CONTACT: skiena@cs.sunysb.edu
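The strong error model mentioned in the results can be illustrated with a small simulation. The sketch below is not the authors' implementation: the class name `NoisyHybridizationOracle`, its parameters, and the reading of "without repeated queries" as "asking the same probe again returns the same (possibly wrong) answer" are all assumptions made for illustration.

```python
import random


class NoisyHybridizationOracle:
    """Hypothetical substring-query oracle with hybridization errors.

    Assumption: a probe "hybridizes" if it occurs as a substring of the
    target. Answers are cached, so repeating a query cannot average out
    an error (one possible reading of the strong error model).
    """

    def __init__(self, target, false_pos=0.0, false_neg=0.0, seed=0):
        self.target = target
        self.false_pos = false_pos  # P(report a hit for an absent probe)
        self.false_neg = false_neg  # P(report a miss for a present probe)
        self._rng = random.Random(seed)
        self._cache = {}

    def query(self, probe):
        """Return the (possibly erroneous) hybridization result for probe."""
        if probe not in self._cache:
            truth = probe in self.target
            flip_rate = self.false_neg if truth else self.false_pos
            flipped = self._rng.random() < flip_rate
            # Store the answer once; later calls reuse it unchanged.
            self._cache[probe] = truth != flipped
        return self._cache[probe]


if __name__ == "__main__":
    oracle = NoisyHybridizationOracle("ACGTACGGT", false_pos=0.07, false_neg=0.07)
    for probe in ("ACG", "GGT", "TTT"):
        print(probe, oracle.query(probe))
```

Setting `false_pos=0.0` and choosing only `false_neg` gives a one-sided variant with false negatives only; whether that matches the traditional model cited in the abstract in every detail is not confirmed by the abstract itself.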

[1] S. P. Fodor, et al. Light-directed, spatially addressable parallel chemical synthesis. Science, 1991.

[2] R. Drmanac, et al. DNA sequencing by hybridization: 100 bases read by a non-gel-based method. Proceedings of the National Academy of Sciences of the United States of America, 1991.

[3] Jacek Blazewicz, et al. DNA Sequencing With Positive and Negative Errors. J. Comput. Biol., 1999.

[4] Roded Sharan, et al. On the Complexity of Positional Sequencing by Hybridization. CPM, 1999.

[5] C. T. Caskey, et al. A rapid scanning strip for tri- and dinucleotide short tandem repeats. Nucleic Acids Research, 1994.

[6] N. G. de Bruijn. A combinatorial problem. 1946.

[7] Ron Shamir, et al. Large Scale Sequencing by Hybridization. J. Comput. Biol., 2002.

[8] A. Blanchard, et al. High-density oligonucleotide arrays. 1996.

[9] Pavel A. Pevzner, et al. Towards DNA Sequencing Chips. MFCS, 1994.

[10] Fred Russell Kramer, et al. Oligonucleotide Arrays: New Concepts and Possibilities. Bio/Technology, 1994.

[11] Semyon Kruglyak, et al. Multistage Sequencing by Hybridization. J. Comput. Biol., 1998.

[12] W. Bains, et al. A novel method for nucleic acid sequence determination. Journal of Theoretical Biology, 1988.

[13] Alan M. Frieze, et al. Optimal Sequencing by Hybridization in Rounds. J. Comput. Biol., 2002.

[14] Steven Skiena, et al. Fabricating arrays of strings. RECOMB '97, 1997.

[15] P. Pevzner. 1-Tuple DNA sequencing: computer analysis. Journal of Biomolecular Structure & Dynamics, 1989.

[16] P. Pevzner, et al. Improved chips for sequencing by hybridization. Journal of Biomolecular Structure & Dynamics, 1991.

[17] Eli Upfal, et al. Sequencing-by-Hybridization at the Information-Theory Bound: An Optimal Algorithm. J. Comput. Biol., 2000.

[18] Steven Skiena, et al. Reconstructing Strings from Substrings. J. Comput. Biol., 1995.

[19] Steven Skiena, et al. Reconstructing strings from substrings in rounds. Proceedings of IEEE 36th Annual Foundations of Computer Science, 1995.