De novo discovery of structural motifs in RNA 3D structures through clustering

Abstract As functional components of the three-dimensional (3D) conformation of an RNA, RNA structural motifs provide an easy way to associate molecular architectures with their biological mechanisms. In past years, many computational tools have been developed to search for motif instances using existing knowledge of well-studied families. With the rapidly increasing number of resolved RNA 3D structures, there is now an urgent need to discover novel motifs from the newly available data. In this work, we classify all the loops in non-redundant RNA 3D structures with a clustering pipeline to detect plausible RNA structural motif families. Compared with other clustering approaches, our method has two benefits. First, the underlying alignment algorithm is tolerant of variations in 3D structures. Second, sophisticated downstream analysis has been performed to ensure that the clusters are valid and easily applied to further research. The final clustering results contain many interesting new variants of known motif families, such as the GNAA tetraloop, kink-turn, sarcin-ricin and T-loop. We have also discovered potential novel functional motifs conserved in ribosomal RNA, sgRNA, SRP RNA, riboswitches and ribozymes.
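The general idea of grouping loops by pairwise alignment score can be illustrated with a minimal sketch. It is not the pipeline described in this work: the single-linkage grouping rule, the function names `cluster_loops` and `toy_score`, the score values and the threshold are all assumptions chosen for illustration, and a real pipeline would obtain its scores from a structural alignment tool.

```python
from itertools import combinations

# Hypothetical pairwise alignment scores between loops (illustrative values only).
TOY_SCORES = {("A", "B"): 0.9, ("B", "C"): 0.8, ("D", "E"): 0.95}


def toy_score(a, b):
    """Look up a symmetric toy score; unlisted pairs are treated as dissimilar."""
    return TOY_SCORES.get(tuple(sorted((a, b))), 0.0)


def cluster_loops(loop_ids, similarity, threshold):
    """Group loops by single linkage: two loops share a cluster when a chain of
    pairwise scores >= threshold connects them (an assumed grouping rule)."""
    parent = {loop: loop for loop in loop_ids}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(loop_ids, 2):
        if similarity(a, b) >= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for loop in loop_ids:
        clusters.setdefault(find(loop), []).append(loop)
    # Largest clusters first; ties broken alphabetically for a stable order.
    return sorted((sorted(m) for m in clusters.values()), key=lambda c: (-len(c), c))


if __name__ == "__main__":
    print(cluster_loops(["A", "B", "C", "D", "E", "F"], toy_score, threshold=0.7))
```

In this toy run, loops A, B and C fall into one candidate family, D and E into another, and F remains a singleton. Single linkage is the simplest choice, but it can chain weakly related loops together, so a stricter grouping rule may be preferable in practice.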
