SARNA-Predict: Accuracy Improvement of RNA Secondary Structure Prediction Using Permutation-Based Simulated Annealing

Ribonucleic acid (RNA), a single-stranded linear molecule, is essential to all biological systems. Different regions of the same RNA strand fold together via base-pair interactions to form intricate secondary and tertiary structures that guide crucial homeostatic processes in living organisms. Since the structure of an RNA molecule is the key to its function, algorithms for predicting RNA structure are of great value. In this article, we demonstrate the usefulness of SARNA-Predict, an RNA secondary structure prediction algorithm based on Simulated Annealing (SA). We evaluate the prediction accuracy of SARNA-Predict by comparing it with eight state-of-the-art RNA prediction algorithms: mfold, Pseudoknot (pknotsRE), NUPACK, pknotsRG-mfe, Sfold, HotKnots, ILM, and STAR. These algorithms represent three different classes of techniques: heuristic, dynamic programming, and statistical sampling. Prediction accuracy was verified against native structures. Experiments were performed on 33 individual known structures from eleven RNA classes (tRNA, viral RNA, antigenomic HDV, telomerase RNA, tmRNA, rRNA, RNaseP, 5S rRNA, Group I intron 23S rRNA, Group I intron 16S rRNA, and 16S rRNA). The results presented in this paper demonstrate that SARNA-Predict can outperform other state-of-the-art algorithms in terms of prediction accuracy. Furthermore, incorporating a more sophisticated thermodynamic model (efn2) substantially improves prediction accuracy.
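To make the idea of permutation-based simulated annealing concrete, the following is a minimal illustrative sketch and not the SARNA-Predict implementation. It assumes that a candidate structure is encoded as a permutation of candidate helices, decoded greedily by keeping each helix that shares no base with the helices already kept. The helix list `HELICES`, its energy values, the swap move, the geometric cooling schedule, and all parameter defaults are hypothetical choices made for illustration; a real predictor would score structures with a thermodynamic model such as efn2.

```python
import math
import random

# Hypothetical toy helices: (set of base positions used, free energy).
# These values are invented for illustration, not thermodynamic parameters.
HELICES = [
    (frozenset({1, 2, 3, 20, 21, 22}), -5.0),
    (frozenset({3, 4, 5, 15, 16, 17}), -4.0),
    (frozenset({7, 8, 12, 13}), -2.5),
    (frozenset({21, 22, 30, 31}), -3.0),
]


def decode(perm, helices):
    """Walk the permutation and keep each helix that shares no base
    with the helices already kept (assumed decoding rule)."""
    used = set()
    chosen = []
    for idx in perm:
        bases, _ = helices[idx]
        if used.isdisjoint(bases):
            chosen.append(idx)
            used.update(bases)
    return chosen


def energy(perm, helices):
    """Sum the toy energies of the helices selected by decode()."""
    return sum(helices[i][1] for i in decode(perm, helices))


def anneal(helices, t_start=10.0, t_end=0.01, cooling=0.95,
           steps_per_temp=50, seed=0):
    """Simulated annealing over permutations with a swap move,
    Metropolis acceptance, and geometric cooling."""
    rng = random.Random(seed)
    current = list(range(len(helices)))
    rng.shuffle(current)
    current_e = energy(current, helices)
    best, best_e = current[:], current_e
    t = t_start
    while t > t_end:
        for _ in range(steps_per_temp):
            # Neighbor: swap two positions in the permutation.
            i, j = rng.sample(range(len(current)), 2)
            candidate = current[:]
            candidate[i], candidate[j] = candidate[j], candidate[i]
            cand_e = energy(candidate, helices)
            delta = cand_e - current_e
            # Always accept improvements; accept worse moves
            # with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_e = candidate, cand_e
                if current_e < best_e:
                    best, best_e = current[:], current_e
        t *= cooling
    return best, best_e


if __name__ == "__main__":
    perm, e = anneal(HELICES)
    print("permutation:", perm)
    print("helices kept:", decode(perm, HELICES))
    print("toy energy:", e)
```

Because every permutation decodes to a conflict-free set of helices, the search never has to repair an invalid structure; the annealing schedule only controls how readily energetically worse orderings are accepted.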
