Structural RNA has lower folding energy than random RNA of the same dinucleotide frequency.

We present results of computer experiments indicating that several RNAs for which the native state (minimum free energy secondary structure) is functionally important (type III hammerhead ribozymes, signal recognition particle RNAs, U2 small nucleolar spliceosomal RNAs, certain riboswitches, etc.) all have lower folding energy than random RNAs of the same length and dinucleotide frequency. In contrast, we find that whole mRNA, as well as the 5'-UTR, 3'-UTR, and coding sequence (CDS) regions of mRNA, have folding energies comparable to those of random RNA, although there may be a statistically insignificant trace signal in the 3'-UTR and CDS regions. Various authors have used approximate nucleotide pattern matching and the computation of minimum free energy as filters to detect potential RNAs in ESTs and genomes. We introduce the new concept of the asymptotic Z-score and describe a fast, whole-genome scanning algorithm that computes asymptotic minimum free energy Z-scores of moving-window contents. Asymptotic Z-score computations offer another filter, to be used alongside nucleotide pattern matching and minimum free energy computations, for detecting potential functional RNAs in ESTs and genomic regions.
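
The comparison against random RNA of the same dinucleotide frequency can be illustrated with a minimal sketch of an ordinary (not asymptotic) Z-score. It is not the authors' implementation. The shuffle follows the general Euler-path idea for preserving dinucleotide counts; the names `dinucleotide_shuffle`, `z_score`, and `toy_energy` are hypothetical, and `toy_energy` is only a stand-in for a real minimum free energy routine, which would come from a thermodynamic folding program.

```python
import random
from statistics import mean, pstdev

def dinucleotide_shuffle(seq, rng):
    """Return a random sequence with exactly the same dinucleotide counts as seq."""
    if len(seq) < 3:
        return seq
    # Successor lists: succ[x] holds every nucleotide that follows x in seq.
    succ = {}
    for a, b in zip(seq, seq[1:]):
        succ.setdefault(a, []).append(b)
    last = seq[-1]
    vertices = [v for v in succ if v != last]
    # Pick one "final exit" edge per vertex until every vertex can reach
    # the last nucleotide by following final exits only.
    while True:
        final_edge = {v: rng.choice(succ[v]) for v in vertices}
        ok = True
        for v in vertices:
            seen, cur = set(), v
            while cur != last and cur not in seen:
                seen.add(cur)
                cur = final_edge[cur]
            if cur != last:
                ok = False
                break
        if ok:
            break
    # Shuffle the remaining exits; the chosen final exit is always used last.
    order = {}
    for v, outs in succ.items():
        outs = list(outs)
        if v != last:
            outs.remove(final_edge[v])
            rng.shuffle(outs)
            outs.append(final_edge[v])
        else:
            rng.shuffle(outs)
        order[v] = outs
    # Walk the edges, starting from the original first nucleotide.
    pos = {v: 0 for v in order}
    cur, result = seq[0], [seq[0]]
    for _ in range(len(seq) - 1):
        nxt = order[cur][pos[cur]]
        pos[cur] += 1
        result.append(nxt)
        cur = nxt
    return "".join(result)

# Hypothetical placeholder energy: NOT a thermodynamic model. It merely counts
# positions i and n-1-i that could pair, so that the example runs on its own.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def toy_energy(seq):
    n = len(seq)
    return -float(sum((seq[i], seq[n - 1 - i]) in PAIRS for i in range(n // 2)))

def z_score(seq, energy, n_random=100, seed=0):
    """Z-score of energy(seq) against dinucleotide-preserving shuffles of seq."""
    rng = random.Random(seed)
    sample = [energy(dinucleotide_shuffle(seq, rng)) for _ in range(n_random)]
    sd = pstdev(sample)
    if sd == 0:
        return 0.0  # all shuffles scored identically; no signal to report
    return (energy(seq) - mean(sample)) / sd
```

Under this convention a strongly negative Z-score means the sequence folds to a lower energy than its shuffled counterparts. Calling `z_score(window, mfe)` with a real folding routine passed as `mfe` would be the natural substitution; the sample size `n_random` and the use of the population standard deviation are choices made here for illustration.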
