The significance of thermodynamic models in the accuracy improvement of RNA secondary structure prediction using permutation-based simulated annealing

Ribonucleic acid (RNA), a single-stranded linear molecule, is essential to all biological systems. Different regions of the same RNA strand fold together via base-pair interactions to form intricate secondary and tertiary structures that guide crucial homeostatic processes in living organisms. Since the structure of an RNA molecule is key to its function, algorithms for predicting RNA structure are of great value. This paper discusses significant improvements made to SARNA-Predict, an RNA secondary structure prediction algorithm based on Simulated Annealing (SA). One major improvement is the incorporation of a sophisticated thermodynamic model (efn2). mfold uses this model to rank sub-optimal structures after folding, but cannot use it directly during structure prediction. Experiments were performed on eight individual known structures from four RNA classes (5S rRNA, Group I intron 23S rRNA, Group I intron 16S rRNA and 16S rRNA). The data demonstrate the robustness and effectiveness of the improved prediction algorithm, which surpasses the dynamic programming algorithm mfold in prediction accuracy on all tested structures.
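To make the idea of permutation-based simulated annealing concrete, the following is a minimal sketch and not the SARNA-Predict implementation. It rests on several assumptions that do not come from the abstract: a state is a permutation of candidate helices, a permutation is decoded greedily into a pseudoknot-free structure, a move swaps two positions, and cooling is geometric. The energy function `toy_energy` simply counts base pairs and is a stand-in for a real thermodynamic model such as efn2; all function names and parameter values are hypothetical.

```python
import math
import random

# Canonical and G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_helices(seq, min_len=3, min_loop=3):
    """Enumerate maximal stacked helices as lists of (i, j) base pairs."""
    n = len(seq)
    helices = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Skip pairs that can be extended outward: they belong to a longer helix.
            if i > 0 and j < n - 1 and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            pairs, a, b = [], i, j
            while b - a > min_loop and (seq[a], seq[b]) in PAIRS:
                pairs.append((a, b))
                a, b = a + 1, b - 1
            if len(pairs) >= min_len:
                helices.append(pairs)
    return helices


def conflicts(h1, h2):
    """True if two helices share a base or cross each other (pseudoknot)."""
    bases1 = {b for pair in h1 for b in pair}
    if any(b in bases1 for pair in h2 for b in pair):
        return True
    return any(i < k < j < l or k < i < l < j for i, j in h1 for k, l in h2)


def decode(perm, helices):
    """Assumed decoding: add helices in permutation order if compatible."""
    chosen = []
    for idx in perm:
        if not any(conflicts(helices[idx], c) for c in chosen):
            chosen.append(helices[idx])
    return chosen


def toy_energy(structure):
    """Stand-in energy (NOT efn2): minus the number of base pairs."""
    return -float(sum(len(h) for h in structure))


def anneal(seq, t_start=5.0, t_end=0.05, cooling=0.95, steps_per_t=50, seed=0):
    """Simulated annealing over helix permutations with Metropolis acceptance."""
    rng = random.Random(seed)
    helices = find_helices(seq)
    perm = list(range(len(helices)))
    rng.shuffle(perm)
    cur_e = toy_energy(decode(perm, helices))
    best_perm, best_e = perm[:], cur_e
    t = t_start
    while t > t_end and len(perm) > 1:
        for _ in range(steps_per_t):
            i, j = rng.sample(range(len(perm)), 2)
            perm[i], perm[j] = perm[j], perm[i]  # propose a swap
            new_e = toy_energy(decode(perm, helices))
            delta = new_e - cur_e
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cur_e = new_e  # accept the move
                if cur_e < best_e:
                    best_perm, best_e = perm[:], cur_e
            else:
                perm[i], perm[j] = perm[j], perm[i]  # reject: undo the swap
        t *= cooling  # geometric cooling schedule
    return decode(best_perm, helices), best_e


def to_dot_bracket(n, structure):
    """Render a structure in dot-bracket notation."""
    out = ["."] * n
    for helix in structure:
        for i, j in helix:
            out[i], out[j] = "(", ")"
    return "".join(out)


if __name__ == "__main__":
    sequence = "GGGAAAUCCC"
    structure, energy = anneal(sequence)
    print(to_dot_bracket(len(sequence), structure), energy)
```

Because the search only ever evaluates complete structures, swapping `toy_energy` for a more detailed whole-structure energy function requires no change to the annealing loop. This is one plausible reading of why a model like efn2 can be used during an SA search while a dynamic programming method can only apply it afterwards.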
