Simultaneous Solution of the RNA Folding, Alignment and Protosequence Problems

The alignment of finite sequences, the inference of ribonucleic acid (RNA) secondary structures (folding), and the reconstruction of ancestral sequences on a phylogenetic tree are three problems with dynamic programming solutions, which we formulate in a common mathematical framework. Combining the objective functions for alignment (parsimony, or minimal mutations) and folding (free energy), we present an algorithm that solves all three problems simultaneously for a set of $N$ sequences of length $n$ in time proportional to $n^{3N}$ and storage proportional to $n^{2N}$. Incorporating a “cutting corners” constraint against biologically unlikely alignments reduces these requirements to time proportional to $n^{3}$ and storage proportional to $n^{2}$, for fixed $N$.
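As a rough illustration of how a “cutting corners” constraint shrinks a dynamic programming table, the sketch below applies the idea to the simplest of the three problems, pairwise alignment under a minimal-mutation objective. It is not the algorithm of the paper: it handles only two sequences, ignores folding, assumes unit costs for substitutions and insertions or deletions, and assumes the constraint takes the form of a diagonal band $|i - j| \le$ `band`. The function name `banded_edit_distance` and the parameter `band` are hypothetical.

```python
def banded_edit_distance(a, b, band=None):
    """Unit-cost minimal-mutation alignment score of sequences a and b.

    If `band` is given, only table cells with |i - j| <= band are evaluated
    (an assumed form of the "cutting corners" constraint). Alignments that
    leave the band are treated as forbidden and score infinity.
    """
    INF = float("inf")
    n, m = len(a), len(b)
    # The final cell (n, m) lies outside the band: no admissible alignment.
    if band is not None and abs(n - m) > band:
        return INF

    # Row 0: aligning the empty prefix of a with b[:j] costs j insertions.
    prev = [INF] * (m + 1)
    for j in range(m + 1):
        if band is None or j <= band:
            prev[j] = j

    for i in range(1, n + 1):
        cur = [INF] * (m + 1)
        lo = 0 if band is None else max(0, i - band)
        hi = m if band is None else min(m, i + band)
        for j in range(lo, hi + 1):
            if j == 0:
                cur[j] = i  # a[:i] against the empty prefix: i deletions
                continue
            cur[j] = min(
                prev[j - 1] + (a[i - 1] != b[j - 1]),  # match or substitution
                prev[j] + 1,                           # deletion from a
                cur[j - 1] + 1,                        # insertion into a
            )
        prev = cur
    return prev[m]


if __name__ == "__main__":
    print(banded_edit_distance("GGAUCC", "GGUCC"))          # unconstrained
    print(banded_edit_distance("GGAUCC", "GGUCC", band=1))  # banded
```

With `band=None` the loop visits every cell of the $(n+1) \times (m+1)$ table; with a fixed band it visits only a strip around the diagonal, so the work grows linearly rather than quadratically in the sequence length. The same kind of restriction on admissible alignments is what allows the exponent of $n$ to stop depending on $N$ in the combined problem.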
