RNA global alignment in the joint sequence–structure space using elastic shape analysis

The functions of RNAs, like those of proteins, are determined by their structures, which are in turn determined by their sequences. Comparing and aligning RNA molecules provides an effective means of predicting their functions and understanding their evolutionary relationships. For RNA sequence alignment, most methods developed for protein and DNA sequence alignment can be applied directly. RNA three-dimensional (3D) structure alignment, on the other hand, tends to be more difficult than protein structure alignment because RNAs lack the regular secondary structures observed in proteins. Most existing RNA 3D structure alignment methods use only the backbone geometry and ignore the sequence information. Using both sequence and backbone geometry in RNA alignment may not only produce more accurate classification but also deepen our understanding of the sequence–structure–function relationship of RNA molecules. In this study, we developed a new RNA alignment method based on elastic shape analysis (ESA). ESA treats RNA structures as three-dimensional curves with sequence information encoded in additional dimensions, so that the alignment can be performed in the joint sequence–structure space. The similarity between two RNA molecules is quantified by a formal distance, the geodesic distance. ESA thus provides a rigorous mathematical framework for RNA structure comparison: means and covariances of full structures can be defined and computed, and probability distributions on spaces of such structures can be constructed for a group of RNAs. We further applied our method to predict the functions of RNA molecules, and it showed superior performance compared with previous methods when tested on benchmark datasets. The programs are available at http://stat.fsu.edu/~jinfeng/ESA.html.
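To make the idea of a joint sequence–structure curve concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes a one-hot base encoding with a hypothetical `weight` parameter, represents each curve by its square-root velocity function, and takes the arc length on the unit sphere as the distance. The function names (`joint_curve`, `srvf`, `preshape_distance`) are illustrative. The optimization over rotations and reparameterizations that a full elastic alignment requires is omitted, and both curves must have the same number of sampled points.

```python
import numpy as np

BASES = "ACGU"  # assumed alphabet; one extra dimension per base


def joint_curve(coords, sequence, weight=1.0):
    """Stack 3D backbone points with a weighted one-hot base encoding (assumed)."""
    coords = np.asarray(coords, dtype=float)
    if coords.shape != (len(sequence), 3):
        raise ValueError("need one 3D point per nucleotide")
    onehot = np.zeros((len(sequence), len(BASES)))
    for i, base in enumerate(sequence):
        onehot[i, BASES.index(base)] = 1.0
    return np.hstack([coords, weight * onehot])


def _integrate(values, t):
    """Trapezoidal rule on the sample grid t."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(t)))


def srvf(curve):
    """Square-root velocity function q = v / sqrt(|v|), rescaled to unit L2 norm."""
    t = np.linspace(0.0, 1.0, curve.shape[0])
    v = np.gradient(curve, t, axis=0)
    speed = np.linalg.norm(v, axis=1)
    q = v / np.sqrt(np.maximum(speed, 1e-12))[:, None]
    return q / np.sqrt(_integrate(np.sum(q * q, axis=1), t)), t


def preshape_distance(curve_a, curve_b):
    """Arc length between two unit-norm SRVFs; no rotation or reparameterization search."""
    qa, t = srvf(curve_a)
    qb, _ = srvf(curve_b)
    inner = _integrate(np.sum(qa * qb, axis=1), t)
    return float(np.arccos(np.clip(inner, -1.0, 1.0)))


# Toy data: a short helix-like backbone with two sequences that differ at one base.
idx = np.arange(8)
backbone = np.column_stack([np.cos(idx), np.sin(idx), 0.5 * idx])
rna_a = joint_curve(backbone, "GGCAUGCC")
rna_b = joint_curve(backbone, "GGCUUGCC")
print(preshape_distance(rna_a, rna_a), preshape_distance(rna_a, rna_b))
```

In this sketch, two molecules with identical backbones but different sequences still have a nonzero distance, because the base encoding contributes to the curve's velocity. The `weight` parameter controls how strongly sequence differences count relative to geometric ones.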
