Structural analysis of aligned RNAs

Knowledge about classes of non-coding RNAs (ncRNAs) is growing rapidly, and structure is the main characteristic shared by members of the same class. Correct characterization of such classes therefore requires a detailed analysis of their structural features. In this manuscript I present RNAlishapes, which combines secondary structure analysis methods such as suboptimal folding and shape abstraction with a comparative approach known as RNA alignment folding. RNAlishapes uses an extended thermodynamic model together with covariance scoring, which rewards covariation of paired bases. Applied to a set of bacterial trp-operon leaders, the algorithm used shape abstraction to identify the two alternating conformations of this attenuator. Besides providing in-depth analysis methods for aligned RNAs, the tool also shows fairly good prediction accuracy. RNAlishapes thus provides the community with a powerful tool for structural analysis of classes of RNAs and is also a reasonable method for consensus structure prediction based on sequence alignments. RNAlishapes is available for online use and download at .
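
To illustrate what shape abstraction means here, the following minimal Python sketch maps a dot-bracket secondary structure to its most abstract shape, which keeps only the nesting and adjacency of helices. It is not the RNAlishapes implementation; the function name `abstract_shape` and the toy structures are assumptions made for illustration, and the sketch covers only the coarsest abstraction level, not the finer ones.

```python
def abstract_shape(dot_bracket: str) -> str:
    """Map a dot-bracket structure to its most abstract shape string.

    Hypothetical helper for illustration only: unpaired bases are ignored,
    and base pairs separated only by stacking, bulges, or internal loops
    are merged into a single helix, written as one bracket pair.
    """
    root = []          # helices at the top level of the structure
    stack = [root]     # path of currently open base pairs
    for ch in dot_bracket:
        if ch == "(":
            node = []
            stack[-1].append(node)
            stack.append(node)
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced structure: extra ')'")
            stack.pop()
        elif ch != ".":
            raise ValueError(f"unexpected character: {ch!r}")
    if len(stack) != 1:
        raise ValueError("unbalanced structure: missing ')'")

    def render(node):
        # A pair enclosing exactly one pair belongs to the same helix.
        while len(node) == 1:
            node = node[0]
        return "[" + "".join(render(child) for child in node) + "]"

    # "_" denotes the completely unpaired structure in this sketch.
    return "".join(render(child) for child in root) or "_"


# Two toy conformations (not real trp-leader structures) with distinct shapes.
print(abstract_shape("((((....))))....((((....))))"))
print(abstract_shape("....((((..((....))..))))...."))
```

Because many suboptimal structures collapse onto the same shape string, grouping candidates by shape is what allows alternating conformations, such as those of an attenuator, to show up as a small number of distinct classes.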
