ProbKnot: fast prediction of RNA secondary structure including pseudoknots.

Predicting RNA secondary structures that include pseudoknots is a significant challenge. Here, ProbKnot, a new algorithm capable of predicting pseudoknots of any topology, is reported. ProbKnot assembles maximum expected accuracy structures from computed base-pairing probabilities in O(N^2) time, where N is the length of the sequence. Its performance was measured by comparing predicted structures with known structures for a large database of RNA sequences with fewer than 700 nucleotides. The percentage of known pairs correctly predicted was 69.3%, and the percentage of predicted pairs found in the known structure was 61.3%. This performance is the highest of four tested algorithms that are capable of pseudoknot prediction. The program is available for download at: http://rna.urmc.rochester.edu/RNAstructure.html.
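
The idea of assembling a structure from base-pairing probabilities in O(N^2) time, and the two accuracy percentages reported above, can be illustrated with a minimal Python sketch. The abstract does not spell out the assembly rule, so the sketch assumes one: a pair i-j is kept when j is the most probable partner of i and i is the most probable partner of j. The function names, the `min_prob` threshold, the exact-match scoring, and the toy probability matrix are all hypothetical and are not taken from the RNAstructure package. Because the rule places no nesting constraint on the selected pairs, crossing (pseudoknotted) pairs can appear in the output.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[int, int]


def symmetric_matrix(n: int, entries: Dict[Pair, float]) -> List[List[float]]:
    """Build a symmetric N x N pair-probability matrix from sparse entries."""
    m = [[0.0] * n for _ in range(n)]
    for (i, j), p in entries.items():
        m[i][j] = m[j][i] = p
    return m


def mutual_best_pairs(prob: List[List[float]], min_prob: float = 0.0) -> Set[Pair]:
    """Keep pairs (i, j), i < j, where each nucleotide is the other's
    most probable partner (assumed rule; see the lead-in)."""
    n = len(prob)
    # One scan per row finds each nucleotide's best partner: O(N^2) in total.
    best = [max(range(n), key=lambda j, i=i: prob[i][j]) for i in range(n)]
    pairs: Set[Pair] = set()
    for i in range(n):
        j = best[i]
        # Require mutual agreement and a probability above the threshold.
        if i < j and best[j] == i and prob[i][j] > min_prob:
            pairs.add((i, j))
    return pairs


def score(predicted: Set[Pair], known: Set[Pair]) -> Tuple[float, float]:
    """Return (fraction of known pairs predicted, fraction of predicted pairs
    that are known). Exact pair matching is an assumption of this sketch."""
    correct = len(predicted & known)
    sensitivity = correct / len(known) if known else 0.0
    ppv = correct / len(predicted) if predicted else 0.0
    return sensitivity, ppv


# Toy 6-nucleotide example with made-up probabilities (0-based indices).
toy = symmetric_matrix(6, {(0, 3): 0.9, (1, 5): 0.8, (2, 4): 0.3, (2, 5): 0.4})
predicted = mutual_best_pairs(toy)
known = {(0, 3), (1, 5), (2, 4)}
sensitivity, ppv = score(predicted, known)
```

In this toy case the pairs (0, 3) and (1, 5) cross each other, so the selected structure contains a pseudoknot. The pair (2, 4) is dropped because nucleotide 2 prefers 5, while 5 prefers 1. The probability matrix itself would normally come from a partition function calculation, which this sketch does not perform.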
