Computational RNA Structure Prediction

The view of RNA as a simple information-transfer molecule has been continuously challenged since the discovery of ribozymes, a class of RNA molecules with enzyme-like function. Moreover, the recent discovery of tiny RNA molecules, such as microRNAs and small interfering RNAs, is transforming our thinking about how gene expression is regulated. Thus, RNA molecules are now known to carry out a large repertoire of biological functions within cells, including information transfer, enzymatic catalysis, and regulation of cellular processes. Like proteins, functional RNA molecules fold into a native three-dimensional (3D) conformation that is essential for their biological activity. Despite advances in understanding the folding and unfolding of RNA, our knowledge of the atomic mechanism by which RNA molecules adopt their biologically active structure is still limited. In this review, we outline the general principles that govern RNA structure and describe the databases and algorithms for analyzing and predicting RNA secondary and tertiary structure. Finally, we assess how the current coverage of the RNA structural space affects comparative modeling of RNA structures.
