Deep Graph Neural Networks with Shallow Subgraph Samplers

While Graph Neural Networks (GNNs) are powerful models for learning representations on graphs, most state-of-the-art models do not gain significant accuracy beyond two to three layers. Deep GNNs fundamentally need to address two challenges: 1) an expressivity challenge due to oversmoothing, and 2) a computation challenge due to neighborhood explosion. We propose a simple "deep GNN, shallow sampler" design principle to improve both the accuracy and the efficiency of GNNs: to generate the representation of a target node, we use a deep GNN to pass messages only within a shallow, localized subgraph. A properly sampled subgraph may exclude irrelevant or even noisy nodes while still preserving the critical neighbor features and graph structures. The deep GNN then smooths the informative local signals to enhance feature learning, rather than oversmoothing the global graph signals into just "white noise". We theoretically justify why the combination of deep GNNs with shallow samplers yields the best learning performance. We then propose various sampling algorithms and neural architecture extensions to achieve good empirical results. On the largest public graph dataset, ogbn-papers100M, we achieve state-of-the-art accuracy with an order of magnitude reduction in hardware cost.
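The "deep GNN, shallow sampler" idea can be illustrated with a minimal sketch. It assumes a plain k-hop sampler and a weight-free mean aggregator as stand-ins; these are not the sampling algorithms or architectures proposed in the paper, and the names `k_hop_subgraph`, `deep_gnn_on_subgraph`, `num_hops`, and `num_layers` are hypothetical. The point it shows is that the number of message-passing layers can exceed the subgraph depth while the receptive field stays fixed.

```python
from collections import deque


def k_hop_subgraph(adj, target, num_hops):
    """Return the set of nodes within num_hops of target (plain BFS).

    Assumption: a k-hop neighborhood is only one possible shallow sampler.
    """
    dist = {target: 0}
    queue = deque([target])
    while queue:
        u = queue.popleft()
        if dist[u] == num_hops:
            continue  # do not expand beyond the shallow boundary
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def deep_gnn_on_subgraph(adj, feats, target, num_hops, num_layers):
    """Run num_layers rounds of mean aggregation inside a num_hops subgraph.

    Assumption: mean aggregation with self-loops and no learned weights
    stands in for a real GNN layer. num_layers may exceed num_hops.
    """
    nodes = k_hop_subgraph(adj, target, num_hops)
    # Induced subgraph: keep only edges whose endpoints are both sampled.
    sub_adj = {u: [v for v in adj[u] if v in nodes] for u in nodes}
    h = {u: list(feats[u]) for u in nodes}
    for _ in range(num_layers):
        new_h = {}
        for u in nodes:
            group = [u] + sub_adj[u]  # self-loop plus in-subgraph neighbors
            dim = len(h[u])
            new_h[u] = [sum(h[v][d] for v in group) / len(group)
                        for d in range(dim)]
        h = new_h
    return h[target], nodes


# Toy path graph 0-1-2-3-4-5 with one scalar feature per node.
path_adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 5] for i in range(6)}
path_feats = {i: [float(i)] for i in range(6)}

# Five layers of message passing, but only a 2-hop subgraph around node 0.
embedding, sampled = deep_gnn_on_subgraph(
    path_adj, path_feats, target=0, num_hops=2, num_layers=5
)
print(sorted(sampled), embedding)
```

In this sketch, nodes 3 to 5 never influence the embedding of node 0 however many layers are stacked, whereas a 5-layer GNN on the full graph would aggregate from all six nodes.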
