ProteinGCN: Protein model quality assessment using Graph Convolutional Networks

Blind estimation of local (per-residue) and global (for the whole structure) accuracies in protein structure models is an essential step in many protein modeling applications. With the recent developments in deep-learning, single-model quality assessment methods have been also advanced, primarily through the use of 2D and 3D convolutional deep neural networks. Here we explore an alternative approach and train a graph convolutional network with nodes representing protein atoms and edges connecting spatially adjacent atom pairs on the dataset Rosetta-300k which contains a set of 300k conformations from 2,897 proteins. We show that our proposed architecture, ProteinGCN, is capable of predicting both local and global accuracies in protein models at state-of-the-art levels. Further, the number of free parameters in ProteinGCN is almost 1-2 orders of magnitude smaller compared to the 3D convolutional networks proposed earlier. We provide the source code of our work to encourage reproducible research.1

[1]  E. Boersma,et al.  Prevention of Catheter-Related Bacteremia with a Daily Ethanol Lock in Patients with Tunnelled Catheters: A Randomized, Placebo-Controlled Trial , 2010, PloS one.

[2]  Jure Leskovec,et al.  Inductive Representation Learning on Large Graphs , 2017, NIPS.

[3]  Alexander D. MacKerell,et al.  CHARMM: The Energy Function and Its Parameterization , 2002 .

[4]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[5]  Guillaume Pagès,et al.  Protein model quality assessment using 3D oriented convolutional neural networks , 2018 .

[6]  Alán Aspuru-Guzik,et al.  Convolutional Networks on Graphs for Learning Molecular Fingerprints , 2015, NIPS.

[7]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[8]  Alex Fout,et al.  Protein Interface Prediction using Graph Convolutional Networks , 2017, NIPS.

[9]  W. L. Jorgensen,et al.  Development and Testing of the OPLS All-Atom Force Field on Conformational Energetics and Properties of Organic Liquids , 1996 .

[10]  Adam Zemla,et al.  LGA: a method for finding 3D similarities in protein structures , 2003, Nucleic Acids Res..

[11]  Vijay S. Pande,et al.  Low Data Drug Discovery with One-Shot Learning , 2016, ACS central science.

[12]  Kliment Olechnovič,et al.  CAD‐score: A new contact area difference‐based function for evaluation of protein structural models , 2013, Proteins.

[13]  Yoshua Bengio,et al.  Deep convolutional networks for quality assessment of protein folds , 2018, Bioinform..

[14]  Vijay S. Pande,et al.  Molecular graph convolutions: moving beyond fingerprints , 2016, Journal of Computer-Aided Molecular Design.

[15]  Yang Zhang,et al.  Scoring function for automated assessment of protein structure template quality , 2004, Proteins.

[16]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[17]  Joan Bruna,et al.  Spectral Networks and Locally Connected Networks on Graphs , 2013, ICLR.

[18]  Ah Chung Tsoi,et al.  The Graph Neural Network Model , 2009, IEEE Transactions on Neural Networks.

[19]  Wei Zhang,et al.  A point‐charge force field for molecular mechanics simulations of proteins based on condensed‐phase quantum mechanical calculations , 2003, J. Comput. Chem..

[20]  J. Skolnick,et al.  GOAP: a generalized orientation-dependent, all-atom statistical potential for protein structure prediction. , 2011, Biophysical journal.

[21]  Hongyi Zhou,et al.  Distance‐scaled, finite ideal‐gas reference state improves structure‐derived potentials of mean force for structure selection and stability prediction , 2002, Protein science : a publication of the Protein Society.

[22]  D. Baker,et al.  Relaxation of backbone bond geometry improves protein energy landscape modeling , 2014, Protein science : a publication of the Protein Society.

[23]  David Baker,et al.  High-resolution comparative modeling with RosettaCM. , 2013, Structure.

[24]  Maria Leonor Pacheco,et al.  of the Association for Computational Linguistics: , 2001 .

[25]  Marco Biasini,et al.  lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests , 2013, Bioinform..

[26]  Padmini Rajagopalan,et al.  MT-CGCNN: Integrating Crystal Graph Convolutional Neural Network with Multitask Learning for Material Property Prediction , 2018, ArXiv.

[27]  F. Scarselli,et al.  A new model for learning in graph domains , 2005, Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005..

[28]  Arne Elofsson,et al.  ProQ3D: improved model quality assessments using deep learning , 2016, Bioinform..

[29]  Jens Meiler,et al.  ROSETTA3: an object-oriented software suite for the simulation and design of macromolecules. , 2011, Methods in enzymology.

[30]  Kliment Olechnovič,et al.  VoroMQA: Assessment of protein structure quality using interatomic contact areas , 2017, Proteins.

[31]  Yang Zhang,et al.  A Novel Side-Chain Orientation Dependent Potential Derived from Random-Walk Reference State for Protein Fold Selection and Structure Prediction , 2010, PloS one.

[32]  Sergei Grudinin,et al.  Protein model quality assessment using 3D oriented convolutional neural networks , 2018, bioRxiv.

[33]  Samuel S. Schoenholz,et al.  Neural Message Passing for Quantum Chemistry , 2017, ICML.

[34]  Wilfred F. van Gunsteren,et al.  An improved GROMOS96 force field for aliphatic hydrocarbons in the condensed phase , 2001, J. Comput. Chem..

[35]  Natalia Gimelshein,et al.  PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.

[36]  Geoffrey E. Hinton,et al.  Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.

[37]  Minkyung Baek,et al.  Assessment of protein model structure accuracy estimation in CASP13: Challenges in the era of deep learning , 2019, Proteins.

[38]  Jeffrey C Grossman,et al.  Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties. , 2017, Physical review letters.

[39]  Partha Pratim Talukdar,et al.  ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations , 2020, AAAI.

[40]  Gyu Rie Lee,et al.  High‐accuracy refinement using Rosetta in CASP13 , 2019, Proteins.

[41]  Pierre Vandergheynst,et al.  Geometric Deep Learning: Going beyond Euclidean data , 2016, IEEE Signal Process. Mag..

[42]  Diego Marcheggiani,et al.  Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling , 2017, EMNLP.