Integrating end-to-end learning with deep geometrical potentials for ab initio RNA structure prediction

RNAs are fundamental in living cells and perform critical functions determined by their tertiary architectures. However, accurate modeling of 3D RNA structure remains a challenging problem. Here we present a novel method, DRfold, to predict RNA tertiary structures by simultaneous learning of local frame rotations and geometric restraints from experimentally solved RNA structures, where the learned knowledge is converted into a hybrid energy potential to guide subsequent RNA structure construction. The method significantly outperforms previous approaches, by >75.6% in TM-score, on a nonredundant dataset containing recently released structures. Detailed analyses showed that the major contributions to the improvements arise from the deep end-to-end learning supervised with atom coordinates and from the composite energy function integrating complementary information from geometry restraints and end-to-end learning models. The open-source DRfold program allows large-scale application of high-resolution RNA structure modeling and can be further improved with future releases of RNA structure databases.
