Energy-based Graph Convolutional Networks for Scoring Protein Docking Models

Structural information about protein-protein interactions, often missing at the interactome scale, is important for mechanistic understanding of cells and rational discovery of therapeutics. Protein docking provides a computational alternative for predicting such information. However, ranking near-native docked models high among a large number of candidates, often known as the scoring problem, remains a critical challenge. Moreover, estimating model quality, also known as the quality assessment problem, is rarely addressed in protein docking. In this study, these two challenging problems are regarded as relative and absolute scoring, respectively, and are addressed in one physics-inspired deep learning framework. We represent proteins and encounter complexes as intra- and inter-molecular residue contact graphs with atom-resolution node and edge features. We also propose a novel graph convolutional kernel that pools interacting nodes’ features through edge features so that generalized interaction energies can be learned directly from graph data. The resulting energy-based graph convolutional networks (EGCN) with multi-head attention are trained to predict intra- and inter-molecular energies, binding affinities, and quality measures (interface RMSD) for encounter complexes. Compared to a state-of-the-art scoring function for model ranking, EGCN significantly improves ranking for a CAPRI test set involving homology docking and is comparable for Score_set, a CAPRI benchmark set generated by diverse community-wide docking protocols unknown to the training data. For Score_set quality assessment, EGCN shows about 27% improvement over our previous efforts. By learning directly from structure data in graph representation, EGCN represents the first successful development of graph convolutional networks for protein docking.
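The idea of pooling interacting nodes' features through edge features can be illustrated with a minimal sketch. This is not the EGCN kernel itself, whose exact form is not given here. It assumes a simple bilinear form in which each edge feature channel k gates a pairwise term x_i^T W_k y_j between a residue of one protein and a residue of the other. The names `bilinear`, `pooled_energy`, and all values below are hypothetical and chosen only for illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def bilinear(x: Vector, w: Matrix, y: Vector) -> float:
    """Return the bilinear form x^T W y."""
    return sum(x[a] * w[a][b] * y[b]
               for a in range(len(x))
               for b in range(len(y)))


def pooled_energy(nodes_a: List[Vector],
                  nodes_b: List[Vector],
                  edges: List[List[Vector]],
                  weights: List[Matrix]) -> float:
    """Sum pairwise node interactions, each weighted by its edge features.

    edges[i][j][k] is the k-th feature of the edge between node i of
    protein A and node j of protein B; weights[k] is the (assumed)
    learnable matrix for edge channel k.
    """
    total = 0.0
    for i, x in enumerate(nodes_a):
        for j, y in enumerate(nodes_b):
            for k, w in enumerate(weights):
                total += edges[i][j][k] * bilinear(x, w, y)
    return total


# Toy inputs (hypothetical): two residues in A, one in B, one edge channel.
nodes_a = [[1.0, 0.0], [0.0, 1.0]]
nodes_b = [[1.0, 1.0]]
edges = [[[1.0]], [[0.0]]]            # only residue 0 of A contacts B
weights = [[[-1.0, 0.0], [0.0, 2.0]]]

energy = pooled_energy(nodes_a, nodes_b, edges, weights)
print(energy)
```

In this toy case only the contacting pair contributes, so a pair with zero edge features adds nothing to the sum. In a trained network the weight matrices would be learned from data rather than fixed by hand.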
