Comparison of P-RnaPredict and mfold - algorithms for RNA secondary structure prediction

MOTIVATION Ribonucleic acid (RNA) is vital in numerous stages of protein synthesis; it also plays important functional and structural roles within the cell. The function of an RNA molecule within a particular organic system is principally determined by its structure. The physical methods currently available for structure determination are time-consuming and expensive, so computational methods for structure prediction are sought after. The energies involved in the formation of secondary structure elements are significantly greater than those of tertiary elements; RNA structure prediction therefore focuses on secondary structure.

RESULTS We present P-RnaPredict, a parallel evolutionary algorithm for RNA secondary structure prediction. The speedup provided by parallelization is investigated with five sequences and is shown to be dramatic, especially for longer sequences. The prediction accuracy of P-RnaPredict is evaluated through comparison with 10 individual known structures from 3 RNA classes (5S rRNA, Group I intron 16S rRNA and 16S rRNA) and with the mfold dynamic programming algorithm. On certain sequences, P-RnaPredict predicts structures with higher true positive base pair counts and lower false positive counts than mfold.

AVAILABILITY P-RnaPredict is available for non-commercial usage. Interested parties should contact Kay C. Wiese (wiese@cs.sfu.ca).
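
The accuracy measure named in the results, true positive and false positive base pair counts against a known structure, can be illustrated with a minimal Python sketch. It assumes structures are written in dot-bracket notation without pseudoknots, which the abstract does not specify; the function names `base_pairs` and `compare_structures` and the example strings are hypothetical, and this is not the authors' evaluation code.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) index pairs encoded by a dot-bracket string.

    Assumption: '(' and ')' mark paired bases, any other character is unpaired,
    and the structure contains no pseudoknots.
    """
    stack = []
    pairs = set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def compare_structures(predicted, known):
    """Count true positive, false positive and false negative base pairs."""
    if len(predicted) != len(known):
        raise ValueError("structures must have the same length")
    p = base_pairs(predicted)
    k = base_pairs(known)
    true_pos = len(p & k)    # predicted pairs that are in the known structure
    false_pos = len(p - k)   # predicted pairs that are not in the known structure
    false_neg = len(k - p)   # known pairs that the prediction missed
    return true_pos, false_pos, false_neg


# Hypothetical 11-nucleotide example: a known hairpin and a shorter predicted stem.
known = "((((...))))"
predicted = "(((.....)))"
tp, fp, fn = compare_structures(predicted, known)
print(tp, fp, fn)
```

Running the same comparison for each algorithm's prediction against the same known structure gives the per-sequence counts that such an evaluation would tabulate.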
