Hyperbolic relational graph convolution networks plus: a simple but highly efficient QSAR-modeling method

Accurate prediction of the druggability and bioactivity of compounds can reduce the high cost and long timelines of drug discovery. After more than five decades of continuous development, quantitative structure-activity relationship (QSAR) methods have become indispensable tools for fast, reliable and affordable assessment of the physicochemical and biological properties of compounds in drug-discovery programs. Current QSAR methods fall mainly into two types: descriptor-based methods and graph-based methods. The former are built on predefined molecular descriptors, whereas the latter are built on simple atom and bond information. In this study, we present a simple but highly efficient modeling method that combines molecular graphs and molecular descriptors as the input of a modified graph neural network, called hyperbolic relational graph convolution network plus (HRGCN+). The evaluation results show that HRGCN+ achieves state-of-the-art performance on 11 drug-discovery-related datasets. We also explored how adding traditional molecular descriptors affects the predictions of graph-based methods, and found that the descriptors do boost their predictive power. The results also highlight the strong robustness of our method to noise. In addition, our method provides a way to interpret models at both the atom and descriptor levels, which can help medicinal chemists extract hidden information from complex datasets. An online HRGCN+ prediction service is available at https://quantum.tencent.com/hrgcn/.
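The idea of feeding both a molecular graph and a descriptor vector into one model can be illustrated with a minimal sketch. It is not the HRGCN+ architecture, whose details are not given in this abstract. It assumes a single generic relational graph convolution layer (one weight matrix per bond type, mean-normalised over neighbours), a sum readout, and a plain concatenation with the descriptors before a logistic output. All names (`rgcn_layer`, `predict`, `params`), the toy molecule, the descriptor values and the random weights are hypothetical placeholders.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weight matrix; stands in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def rgcn_layer(h, edges, W_self, W_rel):
    """One relational graph convolution step (illustrative, not HRGCN+).

    h:     list of atom feature vectors
    edges: directed (src, dst, relation) triples; relation = bond-type index
    """
    # Self-connection term for every atom.
    out = [matvec(W_self, h_i) for h_i in h]
    # Count incoming neighbours per (atom, relation) for mean normalisation.
    counts = {}
    for _, dst, rel in edges:
        counts[(dst, rel)] = counts.get((dst, rel), 0) + 1
    # Add relation-specific messages from neighbours.
    for src, dst, rel in edges:
        msg = matvec(W_rel[rel], h[src])
        c = counts[(dst, rel)]
        out[dst] = [o + m / c for o, m in zip(out[dst], msg)]
    # ReLU non-linearity.
    return [[max(0.0, v) for v in row] for row in out]


def predict(atom_feats, edges, descriptors, params):
    """Fuse a graph embedding with molecular descriptors and score it."""
    h = rgcn_layer(atom_feats, edges, params["W_self"], params["W_rel"])
    graph_vec = [sum(col) for col in zip(*h)]   # sum readout over atoms
    fused = graph_vec + list(descriptors)       # graph + descriptor input
    logit = sum(w * v for w, v in zip(params["w_out"], fused)) + params["b_out"]
    return 1.0 / (1.0 + math.exp(-logit))       # probability-like score


# Toy molecule: heavy atoms of acetaldehyde, C-C=O.
# Atom features are one-hot [is_carbon, is_oxygen] (an assumption).
atom_feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# Bond types: 0 = single, 1 = double; each bond is listed in both directions.
edges = [(0, 1, 0), (1, 0, 0), (1, 2, 1), (2, 1, 1)]
# Two made-up, pre-scaled descriptor values (placeholders, not real data).
descriptors = [0.44, -0.22]

rng = random.Random(0)
hidden = 4
params = {
    "W_self": rand_matrix(hidden, 2, rng),
    "W_rel": [rand_matrix(hidden, 2, rng) for _ in range(2)],
    "w_out": [rng.uniform(-0.5, 0.5) for _ in range(hidden + len(descriptors))],
    "b_out": 0.0,
}

p = predict(atom_feats, edges, descriptors, params)
print(f"untrained score: {p:.3f}")
```

Because the weights are random, the printed score carries no chemical meaning. The sketch only shows where the two input types meet: the descriptors enter after the graph readout, so per-atom and per-descriptor contributions stay separable, which is the kind of structure that atom-level and descriptor-level interpretation relies on.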
