Molecular Property Prediction Based on a Multichannel Substructure Graph

Molecular property prediction is an important task in drug design, and deep learning methods have proven effective at extracting molecular features. In this paper, we propose a multichannel substructure-graph gated recurrent unit (GRU) architecture: a novel GRU-based neural network that applies attention mechanisms to molecular substructures to learn and predict properties. The architecture extracts molecular features at both the node level and the molecule level, capturing fine-grained and coarse-grained information respectively. Three bidirectional GRUs, one per channel, extract features that are combined to generate the molecular representation. Attention weights are assigned to the entities in the molecule to evaluate their individual contributions. We compare our model with benchmark models on molecular property prediction for both regression and classification tasks, and the results show that our model has strong robustness and generalizability.
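The attention step described above can be illustrated with a minimal sketch. The abstract does not specify the scoring function, so this example assumes a simple dot-product score against a context vector followed by a softmax; the function name `attention_pool` and the `context` vector are hypothetical and are not taken from the paper.

```python
import math


def attention_pool(features, context):
    """Pool substructure feature vectors into a single vector.

    Assumption: scores are dot products with `context`, normalized by a
    softmax. The paper's actual scoring function may differ.
    """
    # One scalar score per substructure.
    scores = [sum(f * c for f, c in zip(vec, context)) for vec in features]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the feature vectors, one dimension at a time.
    dim = len(features[0])
    pooled = [sum(w * vec[i] for w, vec in zip(weights, features)) for i in range(dim)]
    return pooled, weights


# Toy input: three substructures with two-dimensional features (made-up values).
substructures = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(substructures, context=[1.0, 0.0])
print(weights, pooled)
```

The returned weights sum to one, so they can be read as the relative contribution of each substructure to the pooled representation, which is the interpretive role the abstract gives to attention.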
