Thermodynamics of RNA structures by Wang–Landau sampling

Motivation: Thermodynamics-based dynamic programming algorithms for RNA secondary structure have been of immense importance in molecular biology, with applications ranging from the detection of novel selenoproteins in expressed sequence tag (EST) data to the determination of microRNA genes and their targets. Dynamic programming algorithms have been developed to compute the minimum free energy secondary structure and partition function of a given RNA sequence, the minimum free energy and partition function for the hybridization of two RNA molecules, etc. However, the applicability of dynamic programming methods depends on disallowing certain types of interactions (pseudoknots, zig-zags, etc.), as their inclusion renders structure prediction a nondeterministic polynomial time (NP)-complete problem. Nevertheless, such interactions have been observed in X-ray structures.

Results: A non-Boltzmannian Monte Carlo algorithm was designed by Wang and Landau to estimate the density of states for complex systems, such as the Ising model, that exhibit a phase transition. In this article, we apply the Wang–Landau (WL) method to compute the density of states for secondary structures of a given RNA sequence, and for hybridizations of two RNA sequences. Our method is shown to be much faster than existing software, such as RNAsubopt. From the density of states, we compute the partition function over all secondary structures and over all pseudoknot-free hybridizations. The advantage of the WL method is that by adding a function to evaluate the free energy of arbitrary pseudoknotted structures and of arbitrary hybridizations, we can estimate thermodynamic parameters for situations known to be NP-complete. This extension to pseudoknots will be made in the sequel to this article; in contrast, the current article describes the WL algorithm applied to pseudoknot-free secondary structures and hybridizations.
Availability: The WL RNA hybridization web server is under construction at http://bioinformatics.bc.edu/clotelab/. Contact: clote@bc.edu
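
The two steps named in the abstract, estimating a density of states g(E) by WL sampling and then computing the partition function Z = Σ_E g(E)·exp(−E/RT) from it, can be illustrated with a minimal sketch. It is not the authors' implementation. As a stand-in for the space of secondary structures, it assumes a toy system of n independent two-state sites, each contributing −1 kcal/mol when "formed", so that the exact density of states is binomial and the estimate can be checked. The function names, the single-site toggle move, the flatness threshold of 0.8, the batch size and the halving schedule for ln f are all illustrative assumptions.

```python
import math
import random

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def toy_energy(state):
    """Toy energy (assumption, not an RNA model): -1 kcal/mol per formed site."""
    return -sum(state)

def wang_landau(n_sites, energy, ln_f_final=1e-3, flatness=0.8, batch=2000, seed=0):
    """Estimate ln g(E), up to an additive constant, by Wang-Landau sampling."""
    rng = random.Random(seed)
    state = [0] * n_sites
    e = energy(state)
    ln_g = {e: 0.0}          # running estimate of ln g(E) per energy level
    ln_f = 1.0               # log of the modification factor
    while ln_f > ln_f_final:
        hist = {}            # visit histogram, reset at every stage
        flat = False
        while not flat:
            for _ in range(batch):
                i = rng.randrange(n_sites)
                state[i] ^= 1                      # propose: toggle one site
                e_new = energy(state)
                # accept with probability min(1, g(E) / g(E_new))
                log_ratio = ln_g.get(e, 0.0) - ln_g.get(e_new, 0.0)
                if math.log(1.0 - rng.random()) <= log_ratio:
                    e = e_new
                else:
                    state[i] ^= 1                  # reject: undo the toggle
                ln_g[e] = ln_g.get(e, 0.0) + ln_f  # penalise the current level
                hist[e] = hist.get(e, 0) + 1
            mean = sum(hist.values()) / len(ln_g)
            flat = all(hist.get(lvl, 0) >= flatness * mean for lvl in ln_g)
        ln_f /= 2.0          # refine the modification factor
    return ln_g

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def normalize(ln_g, ln_total_states):
    """Shift ln g so that the states sum to a known total count."""
    shift = ln_total_states - log_sum_exp(list(ln_g.values()))
    return {e: v + shift for e, v in ln_g.items()}

def ln_partition_function(ln_g, temperature_k=310.15):
    """ln Z = ln sum_E g(E) * exp(-E / RT), with E in kcal/mol."""
    rt = R_KCAL * temperature_k
    return log_sum_exp([v - e / rt for e, v in ln_g.items()])

n = 8
ln_g = normalize(wang_landau(n, toy_energy), n * math.log(2))
ln_z = ln_partition_function(ln_g)
```

Because WL sampling determines g(E) only up to a constant factor, the sketch fixes the scale using the known total of 2^n states. For RNA one would instead need an independent normalization, for example the count of a known energy level. Working with ln g and a log-sum-exp avoids overflow when g(E) spans many orders of magnitude.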
