Heterogeneous graph convolutional neural network for protein-ligand scoring

Aim: Drug discovery is a long process, often requiring decades of research. It remains an active area of research in both academia and industry, with efforts focused on reducing time and cost. Computational simulations such as molecular docking enable fast exploration of large compound databases to extract the most promising candidate molecules for further in vitro and in vivo tests. Structure-based molecular docking is a complex process that combines surface exploration and energy estimation to find the minimal free energy of binding, which corresponds to the best interaction location. Methods: This work proposes the heterogeneous graph score (HGScore), a new scoring function developed for protein-small compound complexes. Each complex is represented by a heterogeneous graph, which separates edges according to their class (inter- or intra-molecular). A heterogeneous graph convolutional network (HGCN) then processes information differently depending on the type of edge crossed. Finally, the model outputs the affinity score of the complex. Results: HGScore was tested on the comparative assessment of scoring functions (CASF) 2013 and 2016 benchmarks for scoring, ranking, and docking powers. It outperforms classical methods and ranks among the best artificial intelligence (AI) methods. Conclusions: HGScore thus offers a new way to represent protein-ligand interactions. Because the representation relies on classical graph neural networks (GNNs) and splits the learning process by edge type, the proposed model is well suited to future transfer learning on other biological complexes (protein-DNA, protein-sugar, protein-protein, etc.).
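The idea of separating edges by class and treating each class differently during message passing can be illustrated with a minimal sketch. It is not the HGScore implementation: the distance cutoff, the function names (`build_hetero_edges`, `hetero_message_pass`), the scalar node features, and the fixed per-edge-type weights are all assumptions made for illustration. An actual HGCN would learn weight matrices per edge type rather than use hand-set scalars.

```python
import math
from collections import defaultdict


def build_hetero_edges(protein_xyz, ligand_xyz, cutoff=4.0):
    """Group atom pairs closer than `cutoff` by edge class.

    Nodes are identified as ("protein", i) or ("ligand", j).
    The cutoff value is an illustrative assumption.
    """
    nodes = [("protein", i, p) for i, p in enumerate(protein_xyz)]
    nodes += [("ligand", j, q) for j, q in enumerate(ligand_xyz)]
    edges = defaultdict(list)
    for a in range(len(nodes)):
        for b in range(a + 1, len(nodes)):
            kind_a, idx_a, pos_a = nodes[a]
            kind_b, idx_b, pos_b = nodes[b]
            if math.dist(pos_a, pos_b) > cutoff:
                continue
            # Same molecule -> intra-molecular edge; otherwise inter-molecular.
            edge_type = f"intra_{kind_a}" if kind_a == kind_b else "inter"
            edges[edge_type].append(((kind_a, idx_a), (kind_b, idx_b)))
    return dict(edges)


def hetero_message_pass(features, edges, weights):
    """One round of message passing with a separate weight per edge type.

    Each node adds its neighbours' previous features, scaled by the weight
    of the edge type that connects them (hypothetical scalar weights).
    """
    updated = dict(features)
    for edge_type, pairs in edges.items():
        w = weights[edge_type]
        for u, v in pairs:
            updated[u] += w * features[v]
            updated[v] += w * features[u]
    return updated


# Toy complex: two protein atoms, two ligand atoms (coordinates in angstroms).
protein = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
ligand = [(0.0, 3.0, 0.0), (10.0, 10.0, 10.0)]

edges = build_hetero_edges(protein, ligand)
features = {("protein", 0): 1.0, ("protein", 1): 1.0,
            ("ligand", 0): 1.0, ("ligand", 1): 1.0}
weights = {"intra_protein": 0.5, "intra_ligand": 0.5, "inter": 2.0}
updated = hetero_message_pass(features, edges, weights)
```

In this toy example the distant ligand atom receives no messages, while the ligand atom in contact with the protein aggregates information only through inter-molecular edges, which is the kind of discrimination by edge type the abstract describes.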
