Automated identification of RNA 3D modules with discriminative power in RNA structural alignments

Recent progress in RNA structure prediction is beginning to fill the ‘gap’ in 2D RNA structure prediction, where, for example, regions predicted as internal loops often form non-canonical base pairs. This is increasingly recognized as the number of known RNA 3D modules steadily grows. There is general interest in matching structural modules known from one molecule to other molecules whose 3D structure is not yet known. We have created a pipeline, metaRNAmodules, which completely automates the extraction of putative modules from the FR3D database and the mapping of such modules to Rfam alignments to obtain comparative evidence. Subsequently, the modules, initially represented by a graph, are turned into models for the RMDetect program, which makes it possible to test their discriminative power using real and randomized Rfam alignments. An initial extraction of 22 495 3D modules from all PDB files results in 977 internal loop and 17 hairpin modules with clear discriminatory power. Many of these modules are only minor variants of each other. Indeed, mapping the modules onto Rfam families results in 35 unique locations in 11 different families. The metaRNAmodules pipeline source for the internal loop modules is available at http://rth.dk/resources/mrm.
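The idea of testing discriminative power on real versus randomized alignments can be illustrated with a minimal sketch. It is not the metaRNAmodules or RMDetect implementation: the column shuffle is a simplified stand-in for whatever randomization the pipeline uses, and `toy_score`, `shuffle_columns` and `empirical_p_value` are hypothetical names introduced here only for illustration. In practice, the score would come from a module model evaluated on the alignment.

```python
import random
from typing import Callable, List

Alignment = List[str]  # one aligned sequence per row, all rows the same length


def shuffle_columns(alignment: Alignment, rng: random.Random) -> Alignment:
    """Permute the alignment columns; each column's content stays intact."""
    if not alignment:
        return []
    order = list(range(len(alignment[0])))
    rng.shuffle(order)
    return ["".join(row[i] for i in order) for row in alignment]


def empirical_p_value(
    alignment: Alignment,
    score: Callable[[Alignment], float],
    n_shuffles: int = 200,
    seed: int = 0,
) -> float:
    """Fraction of shuffled alignments scoring at least as high as the real one."""
    rng = random.Random(seed)
    observed = score(alignment)
    hits = sum(
        score(shuffle_columns(alignment, rng)) >= observed
        for _ in range(n_shuffles)
    )
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


def toy_score(alignment: Alignment) -> float:
    """Hypothetical stand-in for a module model: count 'GA' dinucleotides."""
    return float(sum(row.count("GA") for row in alignment))


if __name__ == "__main__":
    real = ["GAGAGAGA", "GAGAGAGA"]
    print(empirical_p_value(real, toy_score))
```

A low value indicates that the signal depends on the column order of the real alignment and is rarely reproduced by chance. A simple column permutation does not preserve dinucleotide frequencies, so a more careful null model would be needed for real data.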
