SARNA-Predict: A Study of RNA Secondary Structure Prediction Using Different Annealing Schedules

This paper presents an algorithm for RNA secondary structure prediction based on simulated annealing (SA) and studies the effect of different annealing schedules. SA is known to be effective on many types of minimization problems and can approximate global minima in the solution space. Based on free energy minimization, this permutation-based SA algorithm heuristically searches, within given constraints, for a structure whose free energy is close to the minimum free energy ΔG for that strand. Other contributions of this paper include the use of a permutation-based encoding for RNA secondary structure and the swap mutation operator. A detailed study of the convergence behavior of the algorithm is also conducted, and various annealing schedules are investigated. The prediction accuracy of the new algorithm is evaluated by comparison with the dynamic programming algorithm mfold on thirteen individual known structures from four RNA classes (5S rRNA, Group I intron 23S rRNA, Group I intron 16S rRNA and 16S rRNA). Although dynamic programming algorithms for RNA folding are guaranteed to give the mathematically optimal (minimum energy) structure, the fundamental problem of this approach seems to be that the thermodynamic model is only accurate within 5-10%. It is therefore difficult for a single-sequence folding algorithm to resolve which of the plausible lowest-energy structures is correct. The new algorithm showed results comparable with mfold and demonstrated a slightly higher specificity.
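The ingredients named in the abstract (a permutation encoding, a swap mutation operator, and interchangeable annealing schedules) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the `HELICES` table and its energies are toy values rather than output of a thermodynamic model, the greedy `decode` step is one plausible way to turn a permutation into a structure, and the `geometric` and `linear` schedules are generic textbook choices, not necessarily the ones studied in the paper.

```python
import math
import random

# Hypothetical candidate helices: (base positions used, toy free energy in kcal/mol).
# Values are illustrative only, not derived from any real thermodynamic model.
HELICES = [
    (frozenset({0, 1, 2, 17, 18, 19}), -4.0),
    (frozenset({2, 3, 4, 10, 11, 12}), -5.5),
    (frozenset({5, 6, 14, 15}), -2.0),
    (frozenset({0, 1, 8, 9}), -1.5),
    (frozenset({13, 14, 18, 19}), -3.0),
]

def decode(perm):
    """Add helices in permutation order, skipping any that reuse a base."""
    used, chosen = set(), []
    for idx in perm:
        bases, _ = HELICES[idx]
        if used.isdisjoint(bases):
            used |= bases
            chosen.append(idx)
    return chosen

def energy(perm):
    """Toy free energy: sum of the energies of the helices kept by decode()."""
    return sum(HELICES[i][1] for i in decode(perm))

def swap_mutation(perm, rng):
    """Return a copy of perm with two distinct positions exchanged."""
    i, j = rng.sample(range(len(perm)), 2)
    new = list(perm)
    new[i], new[j] = new[j], new[i]
    return new

def geometric(t0, alpha):
    """Geometric cooling: T_k = t0 * alpha**k, floored to stay positive."""
    return lambda step: max(t0 * alpha ** step, 1e-9)

def linear(t0, n_steps):
    """Linear cooling from t0 towards zero, floored to stay positive."""
    return lambda step: max(t0 * (1 - step / n_steps), 1e-9)

def anneal(schedule, n_steps, seed=0):
    """Simulated annealing over helix permutations with Metropolis acceptance."""
    rng = random.Random(seed)
    current = list(range(len(HELICES)))
    rng.shuffle(current)
    cur_e = energy(current)
    best, best_e = current, cur_e
    for step in range(n_steps):
        t = schedule(step)
        cand = swap_mutation(current, rng)
        cand_e = energy(cand)
        delta = cand_e - cur_e
        # Always accept improvements; accept worse moves with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, cur_e = cand, cand_e
            if cur_e < best_e:
                best, best_e = current, cur_e
    return best, best_e

if __name__ == "__main__":
    for name, sched in [("geometric", geometric(5.0, 0.99)), ("linear", linear(5.0, 500))]:
        perm, e = anneal(sched, 500)
        print(name, decode(perm), e)
```

Passing the schedule in as a function keeps the search loop unchanged while schedules are compared, which is the kind of experiment the abstract describes. The sketch treats any two helices that share a base as conflicting and ignores other structural constraints.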

[1]  Christian N. S. Pedersen,et al.  Pseudoknots in RNA secondary structures , 2000, RECOMB '00.

[2]  C. D. Gelatt,et al.  Optimization by Simulated Annealing , 1983, Science.

[3]  Michael Zuker,et al.  Mfold web server for nucleic acid folding and hybridization prediction , 2003, Nucleic Acids Res..

[4]  G. Steger,et al.  Description of RNA folding by "simulated annealing". , 1996, Journal of molecular biology.

[5]  Emile H. L. Aarts,et al.  Parallel implementations of the statistical cooling algorithm , 1986, Integr..

[6]  Yinghao Li Directed annealing search in constraint satisfaction and optimization , 1997 .

[7]  Peter F Stadler,et al.  Fast and reliable prediction of noncoding RNAs , 2005, Proc. Natl. Acad. Sci. USA.

[8]  T. Steitz,et al.  The structural basis of ribosome activity in peptide bond synthesis. , 2000, Science.

[9]  J. Couzin Small RNAs Make Big Splash , 2002, Science.

[10]  Kathryn A. Dowsland,et al.  General Cooling Schedules for a Simulated Annealing Based Timetabling System , 1995, PATAT.

[11]  Saeed Zolfaghari,et al.  Adaptive temperature control for simulated annealing: a comparative study , 2004, Comput. Oper. Res..

[12]  Robert Azencott,et al.  Simulated annealing : parallelization techniques , 1992 .

[13]  Sean R Eddy,et al.  How do RNA folding algorithms work? , 2004, Nature Biotechnology.

[14]  Huang,et al.  AN EFFICIENT GENERAL COOLING SCHEDULE FOR SIMULATED ANNEALING , 1986 .

[15]  Kay C. Wiese,et al.  Using stacking-energies (INN and INN-HB) for improving the accuracy of RNA secondary structure prediction with an evolutionary algorithm - a comparison to known structures , 2004, Proceedings of the 2004 Congress on Evolutionary Computation (IEEE Cat. No.04TH8753).

[16]  Michael Zuker,et al.  Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information , 1981, Nucleic Acids Res..

[17]  Dimitrios L. Kalpaxis,et al.  Localization of spermine binding sites in 23S rRNA by photoaffinity labeling: parsing the spermine contribution to ribosomal 50S subunit functions , 2005, Nucleic acids research.

[18]  I. Tinoco,et al.  How RNA folds. , 1999, Journal of molecular biology.

[19]  Pierre Baldi,et al.  Assessing the accuracy of prediction algorithms for classification: an overview , 2000, Bioinform..

[20]  Ye Ding,et al.  Structure clustering features on the Sfold Web server , 2005, Bioinform..

[21]  Herbert H. Tsang,et al.  SARNA-Predict: A Simulated Annealing Algorithm for RNA Secondary Structure Prediction , 2006, 2006 IEEE Symposium on Computational Intelligence and Bioinformatics and Computational Biology.

[22]  Bruce E. Hajek,et al.  Cooling Schedules for Optimal Annealing , 1988, Math. Oper. Res..

[23]  Andrew Hendriks,et al.  Comparison of P-RnaPredict and mfold - algorithms for RNA secondary structure prediction , 2006, Bioinform..

[24]  Nan Yu,et al.  The Comparative RNA Web (CRW) Site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs , 2002, BMC Bioinformatics.

[25]  Nan Yu,et al.  The Comparative RNA Web (CRW) Site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs: Correction , 2002, BMC Bioinformatics.

[26]  Robert Giegerich,et al.  Beyond Mfold: Recent advances in RNA bioinformatics , 2006, Journal of Biotechnology.

[27]  David H Mathews,et al.  Revolutions in RNA secondary structure prediction. , 2006, Journal of molecular biology.

[28]  Donald Geman,et al.  Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Emile H. L. Aarts,et al.  Simulated annealing and Boltzmann machines - a stochastic approach to combinatorial optimization and neural computing , 1990, Wiley-Interscience series in discrete mathematics and optimization.

[30]  I. Tinoco,et al.  Estimation of Secondary Structure in Ribonucleic Acids , 1971, Nature.

[31]  C. Pleij,et al.  An APL-programmed genetic algorithm for the prediction of RNA secondary structure. , 1995, Journal of theoretical biology.

[32]  R. Wagner,et al.  5S RNA structure and function. , 1988, Methods in enzymology.

[33]  N. Metropolis,et al.  Equation of State Calculations by Fast Computing Machines , 1953, Resonance.

[34]  A. E. Eiben,et al.  Introduction to Evolutionary Computing , 2003, Natural Computing Series.

[35]  D. Turner,et al.  Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. , 1998, Biochemistry.

[36]  Jennifer Couzin,et al.  Small RNAs Make Big Splash , 2002, Science.