Graph-based sampling for approximating global helical topologies of RNA

Significance

RNA molecules are important components of the cellular machinery and perform many essential roles, including catalysis, transcription, and regulation. Because their structural features are intimately connected to their biological functions, there is great interest in predicting RNA structure from sequence. Present RNA 3D folding algorithms are limited to small RNA structures because they sample RNA structure space inefficiently. We report a computational approach that predicts RNA 3D topologies by hierarchical sampling of candidate topologies, represented as 3D graphs and guided by geometrical measures derived from known structures. The combination of tools shows great promise for assembling global features of RNA architecture. Applications to RNA design can be envisioned.

A current challenge in RNA structure prediction is the description of global helical arrangements compatible with a given secondary structure. Here we address this problem by developing a hierarchical graph sampling/data mining approach to reduce conformational space and accelerate global sampling of candidate topologies. Starting from a 2D structure, we construct an initial graph from size measures deduced from solved RNAs and from junction topologies predicted by our data-mining algorithm RNAJAG, which is trained on known RNAs. We sample these graphs in 3D space guided by knowledge-based statistical potentials derived from bending and torsion measures of internal loops as well as radii of gyration for known RNAs. Graph sampling results for 30 representative RNAs are analyzed and compared with reference graphs from both solved structures and structures predicted by available programs. This comparison indicates promise for our graph-based sampling approach for characterizing global helical arrangements in large RNAs: graph RMSDs range from 2.52 to 28.24 Å for RNAs of size 25–158 nucleotides, and more than half of our graph predictions improve upon other programs.
The efficiency of graph sampling, however, comes with an additional step: translating candidate graphs into atomic models. Such models can be built with the same graph partitioning and build-up procedures we used for RNA design.
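To make the idea of sampling 3D graphs under a radius-of-gyration term concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a linear chain of vertices (real RNA graphs are trees with junctions), a hypothetical harmonic penalty on the deviation from a target radius of gyration, and a generic Metropolis acceptance rule with pivot moves that bend the chain at one vertex while preserving edge lengths. The function names, the target value of 12.0 Å, the 10 Å edge length, and the temperature are all illustrative assumptions; the bending and torsion potentials mentioned above are omitted.

```python
import math
import random


def radius_of_gyration(points):
    """Root-mean-square distance of the vertices from their centroid."""
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]
    return math.sqrt(sum(math.dist(p, centroid) ** 2 for p in points) / n)


def rg_energy(points, target_rg, weight=1.0):
    """Hypothetical harmonic penalty on deviation from a target Rg (assumption)."""
    return weight * (radius_of_gyration(points) - target_rg) ** 2


def random_unit_vector(rng):
    """Uniformly distributed direction from normalized Gaussian components."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]


def rotate(point, pivot, axis, angle):
    """Rotate point about a unit axis through pivot (Rodrigues formula)."""
    v = [point[i] - pivot[i] for i in range(3)]
    kx, ky, kz = axis
    cross = [ky * v[2] - kz * v[1], kz * v[0] - kx * v[2], kx * v[1] - ky * v[0]]
    dot = sum(axis[i] * v[i] for i in range(3))
    c, s = math.cos(angle), math.sin(angle)
    return [pivot[i] + v[i] * c + cross[i] * s + axis[i] * dot * (1.0 - c)
            for i in range(3)]


def pivot_move(points, rng, max_angle=0.5):
    """Bend the chain at a random interior vertex; edge lengths are preserved."""
    k = rng.randrange(1, len(points) - 1)
    axis = random_unit_vector(rng)
    angle = rng.uniform(-max_angle, max_angle)
    return points[:k + 1] + [rotate(p, points[k], axis, angle)
                             for p in points[k + 1:]]


def sample(points, target_rg, steps=2000, temperature=1.0, seed=0):
    """Metropolis sampling; returns the lowest-energy graph seen and its energy."""
    rng = random.Random(seed)
    current, e_cur = points, rg_energy(points, target_rg)
    best, e_best = current, e_cur
    for _ in range(steps):
        trial = pivot_move(current, rng)
        e_trial = rg_energy(trial, target_rg)
        accept = e_trial <= e_cur or rng.random() < math.exp(
            -(e_trial - e_cur) / temperature)
        if accept:
            current, e_cur = trial, e_trial
            if e_cur < e_best:
                best, e_best = current, e_cur
    return best, e_best


# Toy input: a straight chain of 6 vertices spaced 10 Å apart (assumed values).
chain = [[10.0 * i, 0.0, 0.0] for i in range(6)]
best, energy = sample(chain, target_rg=12.0)
```

In this sketch the pivot move plays the role of bending at an internal loop. A fuller treatment would add bend and torsion terms to the energy and handle branched graphs by rotating whole subtrees.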
