A comprehensive comparison of comparative RNA structure prediction approaches

Background: An increasing number of researchers have released novel RNA structure analysis and prediction algorithms for comparative approaches to structure prediction. Yet independent benchmarking of these algorithms is rarely performed, although it is now common practice for protein-folding, gene-finding and multiple-sequence-alignment algorithms.

Results: Here we evaluate a number of RNA folding algorithms using reliable RNA data-sets and compare their relative performance.

Conclusions: We conclude that comparative data can enhance structure prediction, but structure-prediction algorithms vary widely in both sensitivity and selectivity across different lengths and homologies. Furthermore, we outline some directions for future research.
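
To make the two performance measures named in the conclusions concrete, the following is a minimal sketch of how sensitivity and selectivity are commonly computed for a predicted secondary structure against a reference structure. It assumes both structures are given in dot-bracket notation without pseudoknots and uses the textbook definitions (sensitivity = TP / (TP + FN), selectivity = TP / (TP + FP)); the exact definitions used in the paper may differ, and the function names are hypothetical.

```python
def pairs_from_dot_bracket(structure):
    """Return the set of base pairs (i, j), 0-based, encoded by a
    pseudoknot-free dot-bracket string such as "((..))"."""
    stack = []
    pairs = set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def sensitivity_selectivity(reference, predicted):
    """Compare predicted base pairs with reference base pairs.

    Assumed (textbook) definitions, not necessarily those of the paper:
      sensitivity = TP / (TP + FN)
      selectivity = TP / (TP + FP)
    """
    ref = pairs_from_dot_bracket(reference)
    pred = pairs_from_dot_bracket(predicted)
    tp = len(ref & pred)   # pairs present in both structures
    fn = len(ref - pred)   # reference pairs the prediction missed
    fp = len(pred - ref)   # predicted pairs absent from the reference
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    selectivity = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, selectivity


if __name__ == "__main__":
    reference = "((((....))))"
    predicted = ".((((..))))."
    print(sensitivity_selectivity(reference, predicted))
```

In this toy case the prediction recovers three of the four reference pairs and adds one pair that is not in the reference, so both measures are 0.75. A prediction that is merely a subset of the reference would keep a selectivity of 1.0 while losing sensitivity, which is why the two values are reported together.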
