Topology and prediction of RNA pseudoknots

MOTIVATION: Several dynamic programming algorithms for predicting RNA structures with pseudoknots have been proposed that differ dramatically from one another in the classes of structures they consider.

RESULTS: Here, we use the natural topological classification of RNA structures in terms of irreducible components that are embeddable in surfaces of fixed genus. To the conventional secondary structures we add four building blocks of genus one, from which certain structures of arbitrarily high genus can be constructed. A corresponding unambiguous multiple context-free grammar provides an efficient dynamic programming approach for energy minimization, partition function computation and stochastic sampling. It admits a topology-dependent parametrization of pseudoknot penalties that increases the sensitivity and positive predictive value of predicted base pairs by 10-20% compared with earlier approaches. More general models based on building blocks of higher genus are also discussed.

AVAILABILITY: The source code of gfold is freely available at http://www.combinatorics.cn/cbpc/gfold.tar.gz.

CONTACT: duck@santafe.edu

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
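To make the notion of genus used above concrete, the following is a minimal sketch of how the genus of a single-backbone structure could be computed from its base pairs. It is not taken from gfold; the function names `parse_pairs` and `genus`, and the use of extended dot-bracket strings with the bracket types `()`, `[]`, `{}` and `<>`, are assumptions made for illustration. The sketch relies on the standard Euler characteristic relation for a diagram whose backbone is collapsed to one vertex: with k base pairs and r boundary components, 2 - 2g = 1 - k + r.

```python
def parse_pairs(dot_bracket):
    """Return sorted (i, j) base pairs from an extended dot-bracket string.

    Each bracket type is matched on its own stack, so arcs written with
    different bracket types may cross. Any other character is unpaired.
    """
    openers = {"(": ")", "[": "]", "{": "}", "<": ">"}
    closers = {close: open_ for open_, close in openers.items()}
    stacks = {open_: [] for open_ in openers}
    pairs = []
    for pos, ch in enumerate(dot_bracket):
        if ch in openers:
            stacks[ch].append(pos)
        elif ch in closers:
            stack = stacks[closers[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at position {pos}")
            pairs.append((stack.pop(), pos))
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return sorted(pairs)


def genus(pairs):
    """Topological genus of a one-backbone diagram given its arcs."""
    if not pairs:
        return 0
    # Unpaired positions do not change the genus, so relabel only the
    # paired positions as 0 .. 2k-1 in backbone order.
    endpoints = sorted(pos for pair in pairs for pos in pair)
    rank = {pos: r for r, pos in enumerate(endpoints)}
    m = len(endpoints)
    partner = [0] * m
    for i, j in pairs:
        partner[rank[i]] = rank[j]
        partner[rank[j]] = rank[i]
    # Boundary components are the cycles of "jump to the partner, then
    # take one step along the (cyclically closed) backbone".
    seen = [False] * m
    boundaries = 0
    for start in range(m):
        if seen[start]:
            continue
        boundaries += 1
        x = start
        while not seen[x]:
            seen[x] = True
            x = (partner[x] + 1) % m
    # Euler characteristic: 2 - 2g = 1 - k + r
    return (1 + len(pairs) - boundaries) // 2


if __name__ == "__main__":
    for name, structure in [
        ("hairpin", "((..))"),
        ("H-type pseudoknot", "((..[[..))..]]"),
        ("kissing hairpin", "([)<]>"),
        ("two H-types in series", "([)]([)]"),
    ]:
        print(name, genus(parse_pairs(structure)))
```

Under these assumptions a nested (secondary) structure has genus 0, an H-type pseudoknot and a kissing hairpin each have genus 1, and two H-type pseudoknots placed side by side have genus 2, which illustrates how genus-one pieces can combine into structures of higher genus.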
