Simplified Partial Digest Problem: Enumerative and Dynamic Programming Algorithms

We study the simplified partial digest problem (SPDP), a mathematical model for a new simplified partial digest method of genome mapping. This method is easy to implement in the laboratory and robust to experimental errors. SPDP is NP-hard in the strong sense. We present an $O(n 2^n)$ time enumerative algorithm (ENUM) and an $O(n^{2q})$ time dynamic programming algorithm for the error-free SPDP, where $n$ is the number of restriction sites and $q$ is the number of distinct intersite distances. We also give examples of the problem that have $2^{\frac{n+2}{3}-1}$ noncongruent solutions. These examples partially answer a question recently posed in the literature about the number of solutions of SPDP. We adapt ENUM to handle SPDP with imprecise input data. Finally, we describe and discuss the results of computational experiments with our algorithms.
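To make the error-free problem concrete, the following is a minimal brute-force sketch, not the ENUM or dynamic programming algorithm of the paper. It assumes the usual formulation of SPDP, which the abstract itself does not spell out: a complete digest yields the multiset of adjacent intersite distances, a short digest cuts each molecule at exactly one site and yields the two complementary fragment lengths for every site, and a solution is an ordering of the complete-digest fragments whose cut positions reproduce the short-digest multiset. Mirror-image orderings are treated as congruent. The function name `spdp_brute_force` and its parameters are hypothetical.

```python
from collections import Counter
from itertools import permutations


def spdp_brute_force(single_cut, complete):
    """Return all noncongruent orderings of `complete` consistent with `single_cut`.

    single_cut: fragment lengths from the short digest (assumed: two
                complementary lengths per restriction site).
    complete:   fragment lengths from the complete digest (assumed: the
                distances between adjacent sites, including both ends).
    Exhaustive search over permutations; illustration only.
    """
    total = sum(complete)            # length of the whole molecule
    target = Counter(single_cut)     # multiset to be reproduced
    solutions = set()
    for order in set(permutations(complete)):
        pos = 0
        produced = Counter()
        # Every prefix sum except the last is a restriction site position.
        for length in order[:-1]:
            pos += length
            produced[pos] += 1           # fragment left of the cut
            produced[total - pos] += 1   # fragment right of the cut
        if produced == target:
            # An ordering and its mirror image are congruent: keep one.
            solutions.add(min(order, order[::-1]))
    return sorted(solutions)


if __name__ == "__main__":
    # Sites at positions 1, 5, 8 on a molecule of length 10.
    print(spdp_brute_force([1, 2, 5, 5, 8, 9], [2, 3, 4, 1]))
```

The search visits every distinct permutation of the complete-digest fragments, so it is usable only for very small instances; it is meant to show what counts as a solution and why congruent (mirrored) maps are counted once.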
