Prediction of consensus RNA secondary structures including pseudoknots

Most functional RNA molecules have characteristic structures that are highly conserved in evolution, and many of them contain pseudoknots. Here, we present a method for computing consensus structures, including pseudoknots, from alignments of a few sequences. The algorithm combines thermodynamic and covariation information to assign scores to all possible base pairs; the base pairs are then chosen with the help of the maximum weighted matching algorithm. We applied our algorithm to a number of different types of RNA known to contain pseudoknots. All pseudoknots were predicted correctly, and more than 85 percent of the base pairs were identified.
