RNA secondary structure prediction based on free energy and phylogenetic analysis.

We describe a computational method for predicting RNA secondary structure that combines free energy and comparative sequence analysis. Starting from a homology-based sequence alignment, all pairings that are favorable with respect to the Turner energy function are identified. Each potentially paired region within the multiple sequence alignment is then scored with a function that combines predicted free energy and sequence covariation using optimized weights. High-scoring regions are ranked and incorporated sequentially to define a growing secondary structure. With a single set of optimized parameters, the method accurately predicts the foldings of several test RNAs previously defined by extensive phylogenetic and experimental data (tRNA, 5S rRNA, SRP RNA, tmRNA, and 16S rRNA). The algorithm correctly predicts approximately 80% of the secondary structure. A range of parameters has been tested to define the minimal sequence information content required for accurate prediction and to assess the importance of individual terms in the prediction scheme. This analysis indicates that prediction accuracy depends most strongly on covariational information and only weakly on the energetic terms. However, relatively few sequences are sufficient to provide the covariational information required for an accurate prediction: secondary structures can be accurately defined from alignments of as few as five sequences, and predictions improve only moderately when additional sequences are included.
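The score-rank-incorporate procedure described above can be illustrated with a minimal Python sketch. It is not the published implementation, and everything specific in it is an assumption made for illustration: the toy `PAIR_ENERGY` table stands in for the Turner energy function, mutual information stands in for the covariation term, the weights `w_cov` and `w_energy` are arbitrary rather than the optimized values, and the compatibility rule (no shared positions, no crossing pairs) is one plausible way to grow a structure.

```python
import math
from collections import Counter

# Toy per-pair energies; an assumption for illustration, NOT the Turner parameters.
PAIR_ENERGY = {"GC": -3.0, "CG": -3.0, "AU": -2.0, "UA": -2.0, "GU": -1.0, "UG": -1.0}
NON_PAIR_PENALTY = 1.0  # hypothetical penalty for a non-canonical pair


def mutual_information(col_i, col_j):
    """Mutual information (bits) between two alignment columns."""
    n = len(col_i)
    fi, fj = Counter(col_i), Counter(col_j)
    fij = Counter(zip(col_i, col_j))
    return sum((c / n) * math.log2(c * n / (fi[a] * fj[b]))
               for (a, b), c in fij.items())


def helix_pairs(helix):
    """Expand a helix (i, j, length) into base pairs (i+k, j-k)."""
    i, j, length = helix
    return [(i + k, j - k) for k in range(length)]


def score_helix(alignment, helix, w_cov=1.0, w_energy=0.1):
    """Weighted sum of covariation and (negated) mean pair energy."""
    cov = energy = 0.0
    for i, j in helix_pairs(helix):
        col_i = [seq[i] for seq in alignment]
        col_j = [seq[j] for seq in alignment]
        cov += mutual_information(col_i, col_j)
        energy += sum(PAIR_ENERGY.get(a + b, NON_PAIR_PENALTY)
                      for a, b in zip(col_i, col_j)) / len(alignment)
    # Lower (more negative) energy is favorable, so it is subtracted.
    return w_cov * cov - w_energy * energy


def compatible(helix, accepted):
    """True if the helix shares no position and crosses no accepted pair."""
    for other in accepted:
        for i, j in helix_pairs(helix):
            for k, l in helix_pairs(other):
                if len({i, j, k, l}) < 4:
                    return False
                if i < k < j < l or k < i < l < j:
                    return False
    return True


def build_structure(alignment, candidates, **weights):
    """Rank candidate helices by score and add them greedily."""
    ranked = sorted(candidates,
                    key=lambda h: score_helix(alignment, h, **weights),
                    reverse=True)
    accepted = []
    for helix in ranked:
        if compatible(helix, accepted):
            accepted.append(helix)
    return accepted


# Hypothetical five-sequence alignment of a small hairpin (0-based columns).
alignment = [
    "GGCAUUCGUGCC",
    "GACAUUCGUGUC",
    "GGUAUUCGUACC",
    "CGCGUUCGCGCG",
    "GGCAUUCGUGCC",
]
candidates = [(1, 7, 2), (0, 11, 4)]  # (5' start, 3' end, length)
structure = build_structure(alignment, candidates)
print(structure)
```

In this toy alignment the helix `(0, 11, 4)` is supported by compensatory changes in every sequence, whereas `(1, 7, 2)` shows no covariation and pairs poorly, so the greedy pass accepts the first and rejects the second because the two share positions.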
