Clustering RNA structural motifs in ribosomal RNAs using secondary structural alignment

RNA structural motifs are the building blocks of complex RNA architecture. Identifying non-coding RNA structural motifs is a critical step toward understanding their structures and functions. In this article, we present a clustering approach for de novo RNA structural motif identification. We applied our approach to a data set containing 5S, 16S and 23S rRNAs and rediscovered many known motifs, including the GNRA tetraloop, kink-turn, C-loop, sarcin–ricin, reverse kink-turn, hook-turn, E-loop and tandem-sheared motifs, with higher accuracy than the state-of-the-art clustering method. We also identified a number of potential novel instances of the GNRA tetraloop, kink-turn, sarcin–ricin and tandem-sheared motifs. More importantly, our clustering analysis revealed several novel structural motif families. We identified a highly asymmetric bulge loop motif that resembles a rope sling. We also found an internal loop motif that can significantly increase the twist of the helix. Finally, we discovered a subfamily of the hexaloop motif whose geometry differs significantly from that of the currently known hexaloop motif. The discoveries presented in this article substantially expand current knowledge of RNA structural motifs.
