Two applications of the divide&conquer principle in the molecular sciences

This paper addresses two problems from the molecular sciences: the enumeration of fullerene-type isomers and the alignment of biosequences. We report on two algorithms for these problems, both based on the well-known and widely used Divide&Conquer principle. That is, each algorithm attacks the original problem by associating with it a suitable number of much simpler problems whose solutions can be “glued together” to yield a solution of the original, rather complex task. The considerable improvements achieved in this way illustrate that the present-day molecular sciences offer many worthwhile opportunities for the effective use of fundamental algorithmic principles and architectures.
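
To make the "divide, solve, glue together" pattern concrete, the following is a minimal sketch and not the algorithms reported in this paper. It assumes a much simpler stand-in task, the longest common subsequence of two strings, and uses Hirschberg's linear-space technique: the first sequence is cut at its midpoint, a matching cut position in the second sequence is chosen from two score rows, the two halves are solved recursively, and the partial results are concatenated. The names `lcs_row` and `lcs` are hypothetical and chosen only for illustration.

```python
def lcs_row(a: str, b: str) -> list[int]:
    """Return the last row of the LCS-length table.

    Entry j is the LCS length of `a` and the prefix b[:j].
    Only two rows are kept in memory at any time.
    """
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev


def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of `a` and `b`."""
    # Base cases: trivial problems are solved directly.
    if not a or not b:
        return ""
    if len(a) == 1:
        return a if a in b else ""

    # Divide: cut `a` at its midpoint.
    mid = len(a) // 2
    n = len(b)

    # Score every possible cut position k in `b`:
    # left[k]      = LCS length of a[:mid] and b[:k]
    # right[n - k] = LCS length of a[mid:] and b[k:] (computed on reversed strings)
    left = lcs_row(a[:mid], b)
    right = lcs_row(a[mid:][::-1], b[::-1])
    k = max(range(n + 1), key=lambda j: left[j] + right[n - j])

    # Conquer and glue: solve both halves and concatenate the results.
    return lcs(a[:mid], b[:k]) + lcs(a[mid:], b[k:])


if __name__ == "__main__":
    print(lcs("ABCBDAB", "BDCABA"))
```

In this pairwise toy case the cut position is provably optimal, so gluing the two partial solutions loses nothing. Whether and how cut positions are chosen for several sequences at once is a separate question that this sketch does not address.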
