Predicting RNA secondary structure by the comparative approach: how to select the homologous sequences

Background: The secondary structure of an RNA must be known before the relationship between its structure and function can be determined. One way to predict the secondary structure of an RNA is to identify covarying residues that maintain the pairings (Watson-Crick, Wobble and non-canonical pairings). This "comparative approach" consists of identifying mutations from homologous sequence alignments. The sequences must covary enough for compensatory mutations to be revealed, but comparison is difficult if they are too different. Thus the choice of homologous sequences is critical. While many possible combinations of homologous sequences may be used for prediction, only a few will give good structure predictions. This can be due to poor quality alignment in stems or to the variability of certain sequences. This problem of sequence selection is currently unsolved.

Results: This paper describes an algorithm, SSCA, which measures the suitability of sequences for the comparative approach. It is based on evolutionary models with structure constraints, particularly those on sequence variations and stem alignment. We propose three models, based on different constraints on sequence alignments. We show the results of the SSCA algorithm for predicting the secondary structure of several RNAs. SSCA enabled us to choose sets of homologous sequences that gave better predictions than arbitrarily chosen sets of homologous sequences.

Conclusion: SSCA is an algorithm for selecting combinations of RNA homologous sequences suitable for secondary structure predictions with the comparative approach.
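
The covariation signal that the comparative approach relies on can be illustrated with a small sketch. This is not the SSCA algorithm and not any of its three models; it is a generic mutual-information measure between two alignment columns, shown only to make the idea of "covarying residues that maintain the pairings" concrete. The function names, the toy alignment, and the restriction to Watson-Crick and G-U pairs in `PAIRING` are assumptions made for illustration.

```python
from collections import Counter
from math import log2

# Assumption for this sketch: only Watson-Crick and G-U wobble pairs count
# as "pairing"; non-canonical pairs are ignored.
PAIRING = {("A", "U"), ("U", "A"), ("G", "C"),
           ("C", "G"), ("G", "U"), ("U", "G")}


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j.

    `alignment` is a list of equal-length, gap-free RNA strings.
    A high value means the two columns vary together.
    """
    n = len(alignment)
    col_i = Counter(seq[i] for seq in alignment)
    col_j = Counter(seq[j] for seq in alignment)
    pairs = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi


def pairing_fraction(alignment, i, j):
    """Fraction of sequences whose residues at columns i and j can pair."""
    hits = sum((seq[i], seq[j]) in PAIRING for seq in alignment)
    return hits / len(alignment)


# Hypothetical toy alignment: columns 0 and 4 change together and always
# pair, while columns 1 to 3 are fully conserved.
toy_alignment = [
    "GAAAC",
    "CAAAG",
    "AAAAU",
    "UAAAA",
]

if __name__ == "__main__":
    print(mutual_information(toy_alignment, 0, 4))
    print(pairing_fraction(toy_alignment, 0, 4))
    print(mutual_information(toy_alignment, 0, 1))
```

In the toy alignment, columns 0 and 4 carry compensatory changes, so their mutual information is high and every sequence keeps the pair. A fully conserved column such as column 1 yields zero mutual information with any other column, which mirrors the point made in the Background: sequences that vary too little reveal no compensatory mutations, so the choice of homologous sequences determines how much signal is available.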
