Weisfeiler-Lehman Embedding for Molecular Graph Neural Networks

A graph neural network (GNN) is a good choice for predicting the chemical properties of molecules. Compared with other deep networks, however, the current performance of GNNs is limited owing to the "curse of depth." Inspired by long-established feature engineering in the field of chemistry, we expand the atom representation using Weisfeiler-Lehman (WL) embedding, which is designed to capture the local atomic patterns that dominate the chemical properties of a molecule. In terms of representability, we show that WL embedding can replace the first two layers of a ReLU GNN (a normal embedding and a hidden GNN layer) with a smaller weight norm. We then demonstrate that WL embedding consistently improves empirical performance across multiple GNN architectures and several molecular graph datasets.
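
As a rough illustration of how a WL-style relabeling can expose local atomic patterns, the sketch below performs one generic Weisfeiler-Lehman relabeling step on a small molecular graph and maps each expanded label to an integer id that an embedding table could index. It is a minimal sketch under stated assumptions, not the exact procedure of this work: the function names `wl_relabel` and `build_vocab`, the choice to ignore bond types, and the toy ethanol example are all illustrative assumptions.

```python
from collections import defaultdict


def wl_relabel(atom_labels, bonds):
    """One generic WL relabeling step on an undirected molecular graph.

    atom_labels: list of atom symbols, indexed by atom id (hypothetical input format).
    bonds: list of (u, v) index pairs; bond types are ignored in this sketch.
    """
    neighbors = defaultdict(list)
    for u, v in bonds:
        neighbors[u].append(v)
        neighbors[v].append(u)
    # New label = (own label, sorted multiset of neighbor labels).
    return [
        (label, tuple(sorted(atom_labels[n] for n in neighbors[i])))
        for i, label in enumerate(atom_labels)
    ]


def build_vocab(expanded_labels):
    """Assign a consecutive integer id to each distinct expanded label."""
    vocab = {}
    for lab in expanded_labels:
        vocab.setdefault(lab, len(vocab))
    return vocab


# Toy example: heavy atoms of ethanol, C-C-O (hydrogens omitted).
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]

expanded = wl_relabel(atoms, bonds)
vocab = build_vocab(expanded)
ids = [vocab[lab] for lab in expanded]
```

In this toy case the two carbon atoms, which share the plain label "C", receive different expanded labels because one is bonded to oxygen and the other is not. That distinction between local neighborhoods is the kind of pattern an expanded atom representation can pass to the first GNN layer.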
