RNA folding via algebraic dynamic programming

The aim of this thesis is to apply the framework of Algebraic Dynamic Programming (ADP) to a well-known problem of established significance in bioinformatics, to implement the current state of the art, and finally to go one step further and solve one of the open problems. Ab initio RNA secondary structure folding of a single sequence was chosen because it fits these requirements well, for two reasons. First, the field is compact and shows a clear path: from the first description of the Nearest Neighbor model by Tinoco and others in a Nature paper from 1971, via the base pair maximization algorithm by Nussinov and others in 1978, to the first efficient and complete solution to the free energy minimization problem by Zuker and Stiegler in 1981, and on to a number of further refinements to date (Tinoco et al., 1971; Nussinov et al., 1978; Zuker and Stiegler, 1981; Wuchty et al., 1999; Lyngsoe et al., 1999). Second, a paper by Zuker and Sankoff from 1984 clearly describes an open problem that, to our knowledge, has not yet been solved (Zuker and Sankoff, 1984): reducing the structure space of a given RNA to saturated secondary structures, that is, structures whose helices cannot be extended any further by legal base pairs.
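
To make the notion of a saturated structure concrete, the following is a minimal sketch of one possible reading of the definition above, not the method developed in the thesis. It checks whether any base pair of a structure in dot-bracket notation could be stacked by a further legal pair, either outward or inward. The names `LEGAL_PAIRS`, `MIN_HAIRPIN`, `pair_table`, `can_pair` and `is_saturated` are hypothetical, and two assumptions are made that the text does not state: legal pairs are Watson-Crick pairs plus the G-U wobble pair, and a hairpin loop needs at least three unpaired bases.

```python
# Assumption: legal pairs are Watson-Crick pairs plus the G-U wobble pair.
LEGAL_PAIRS = {("A", "U"), ("U", "A"), ("C", "G"),
               ("G", "C"), ("G", "U"), ("U", "G")}
# Assumption: a hairpin loop must contain at least three unpaired bases.
MIN_HAIRPIN = 3


def pair_table(structure):
    """Map every paired position to its partner (dot-bracket input)."""
    partner = {}
    stack = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            opening = stack.pop()
            partner[opening] = pos
            partner[pos] = opening
    if stack:
        raise ValueError("unbalanced structure")
    return partner


def can_pair(seq, partner, i, j):
    """True if i and j are in range, unpaired, far enough apart, and legal."""
    if i < 0 or j >= len(seq) or i >= j:
        return False
    if i in partner or j in partner:
        return False
    if j - i - 1 < MIN_HAIRPIN:
        return False
    return (seq[i], seq[j]) in LEGAL_PAIRS


def is_saturated(seq, structure):
    """True if no existing pair can be stacked by one more legal pair."""
    if len(seq) != len(structure):
        raise ValueError("sequence and structure differ in length")
    partner = pair_table(structure)
    for i, j in partner.items():
        if i > j:
            continue  # visit each pair once, from its 5' end
        extends_outward = can_pair(seq, partner, i - 1, j + 1)
        extends_inward = can_pair(seq, partner, i + 1, j - 1)
        if extends_outward or extends_inward:
            return False
    return True
```

For the sequence `GGGAAACCC`, the structure `(((...)))` passes this check, whereas `.((...)).` and `((.....))` do not, because their helix could still be extended by one G-C pair outward or inward, respectively. A structure without any base pairs passes trivially under this reading, which is one reason the precise definition used in the thesis may be stricter.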

[1] Graham Hutton, et al. Higher-order functions for parsing. 1992, Journal of Functional Programming.

[2] Manolo Gouy, et al. An energy model that predicts the correct folding of both the tRNA and the 5S RNA molecules. 1984, Nucleic Acids Res.

[3] K. Flaherty, et al. Three-dimensional structure of a hammerhead ribozyme. 1994, Nature.

[4] R. Durbin, et al. Biological sequence analysis: Background on probability. 1998.

[5] D. Sankoff, et al. RNA secondary structures and their prediction. 1984.

[6] I. Tinoco, et al. Stability of ribonucleic acid double-stranded helices. 1974, Journal of molecular biology.

[7] A. Klug, et al. The crystal structure of an all-RNA hammerhead ribozyme. 1995, Nucleic acids symposium series.

[8] D. Turner, et al. Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. 1998, Biochemistry.

[9] Sidney Altman, et al. Enzymatic cleavage of RNA by RNA. 1986.

[10] M. Zuker. On finding all suboptimal foldings of an RNA molecule. 1989, Science.

[11] Paulien Hogeweg, et al. Energy directed folding of RNA sequences. 1984, Nucleic Acids Res.

[12] A. E. Walter, et al. Coaxial stacking of helixes enhances binding of oligoribonucleotides and improves predictions of RNA folding. 1994, Proceedings of the National Academy of Sciences of the United States of America.

[13] Eugene W. Myers, et al. ReAligner: a program for refining DNA sequence multi-alignments. 1997, RECOMB '97.

[14] Susan R. Wilson. Introduction to Computational Biology: Maps, Sequences and Genomes. 1996.

[15] J. Sabina, et al. Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. 1999, Journal of molecular biology.

[16] David B. Searls, et al. Linguistic approaches to biological sequences. 1997, Comput. Appl. Biosci.

[17] T. Cech, et al. Self-splicing and enzymatic activity of an intervening sequence RNA from Tetrahymena. 1990, Bioscience reports.

[18] Michael Zuker, et al. Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. 1981, Nucleic Acids Res.

[19] Christian N. S. Pedersen, et al. Fast evaluation of internal loops in RNA secondary structure prediction. 1999, Bioinform.

[20] I. Tinoco, et al. Estimation of Secondary Structure in Ribonucleic Acids. 1971, Nature.

[21] M. Zuker, et al. Structural analysis by energy dot plot of a large mRNA. 1993, Journal of molecular biology.

[22] S. B. Needleman, et al. A general method applicable to the search for similarities in the amino acid sequence of two proteins. 1970, Journal of molecular biology.

[23] F. Crick. Central Dogma of Molecular Biology. 1970, Nature.

[24] T. Steitz, et al. The structural basis of ribosome activity in peptide bond synthesis. 2000, Science.

[25] G. Steger, et al. Description of RNA folding by "simulated annealing". 1996, Journal of molecular biology.

[26] R. K. Shyamasundar, et al. Introduction to algorithms. 1996.

[27] M. Grunberg-Manago, et al. Escherichia coli phenylalanyl-tRNA synthetase operon region. Evidence for an attenuation mechanism. Identification of the gene for the ribosomal protein L20. 1983, Journal of molecular biology.

[28] Ignacio Tinoco, et al. A dynamic programming algorithm for finding alternative RNA secondary structures. 1986, Nucleic Acids Res.

[29] Robert E. Bruccoleri, et al. An improved algorithm for nucleic acid secondary structure display. 1988, Comput. Appl. Biosci.

[30] Temple F. Smith, et al. Comparison of biosequences. 1981.

[31] J. Ninio, et al. [Prediction of secondary structures of nucleic acids: algorithmic and physical aspects]. 1985, Biochimie.

[32] Robert Giegerich, et al. A General Pattern Matching Language for Specific Motifs in RNA Secondary Structure. 1999.

[33] T. Cech. RNA as an enzyme. 1986, Biochemistry international.

[34] R. De Wachter, et al. DCSE, an interactive tool for sequence alignment and secondary structure research. 1993, Computer applications in the biosciences: CABIOS.

[35] G. L. Eliceiri. Small nucleolar RNAs. 1999, Cellular and Molecular Life Sciences CMLS.

[36] D. Turner, et al. RNA structure prediction. 1988, Annual review of biophysics and biophysical chemistry.

[37] Walter Fontana, et al. Fast folding and comparison of RNA secondary structures. 1994.

[38] Robert Giegerich, et al. Reducing the Conformation Space in RNA Structure Prediction. 2001, German Conference on Bioinformatics.

[39] 김동규, et al. [Book review] "Algorithms on Strings, Trees, and Sequences". 2000.

[40] Fabrice Lefebvre. An Optimized Parsing Algorithm Well Suited to RNA Folding. 1995, ISMB.

[41] Bruce A. Shapiro, et al. Generating non-overlapping displays of nucleic acid secondary structure. 1984, Nucleic Acids Res.

[42] H. Lütcke. Signal recognition particle (SRP), a ubiquitous initiator of protein translocation. 1995, European journal of biochemistry.

[43] Homer Jacobson, et al. Intramolecular Reaction in Polycondensations. I. The Theory of Linear Systems. 1950.

[44] M. Zuker. Calculating nucleic acid secondary structure. 2000, Current opinion in structural biology.

[45] David Haussler, et al. Recent Methods for RNA Modeling Using Stochastic Context-Free Grammars. 1994, CPM.

[46] C. Zwieb, et al. Comparative sequence analysis of tmRNA. 1999, Nucleic acids research.

[47] C. Pleij, et al. The computer simulation of RNA folding pathways using a genetic algorithm. 1995, Journal of molecular biology.

[48] Mireille Régnier, et al. Automatic RNA Secondary Structure Prediction with a Comparative Approach. 2002, Comput. Chem.

[49] A. A. Mironov, et al. A kinetic model of RNA folding. 1993, Bio Systems.

[50] Jerrold R. Griggs, et al. Algorithms for Loop Matchings. 1978.