Decoupling the Depth and Scope of Graph Neural Networks

State-of-the-art Graph Neural Networks (GNNs) have limited scalability with respect to graph size and model size. On large graphs, increasing the model depth often means exponential expansion of the scope (i.e., the receptive field). Beyond just a few layers, two fundamental challenges emerge: (1) degraded expressivity due to oversmoothing, and (2) expensive computation due to neighborhood explosion. We propose a design principle that decouples the depth and scope of GNNs: to generate the representation of a target entity (i.e., a node or an edge), we first extract a localized subgraph as the bounded-size scope, and then apply a GNN of arbitrary depth on top of that subgraph. A properly extracted subgraph contains a small number of critical neighbors while excluding irrelevant ones. The GNN, no matter how deep, smooths the local neighborhood into an informative representation rather than oversmoothing the global graph into "white noise". Theoretically, decoupling improves GNN expressive power from the perspectives of graph signal processing (GCN), function approximation (GraphSAGE), and topological learning (GIN). Empirically, on seven graphs (with up to 110M nodes) and six backbone GNN architectures, our design achieves state-of-the-art accuracy with orders-of-magnitude reductions in computation and hardware cost.
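
To make the design principle concrete, below is a minimal PyTorch sketch of "decoupled depth and scope". It is an illustration under stated assumptions, not the paper's implementation: the graph is assumed to be an adjacency list `adj`, the scope extractor is a simple 2-hop BFS capped at `max_scope` nodes (the paper leaves the choice of extractor open), and the backbone is a plain dense GCN with row-normalized propagation. All names (`extract_scope`, `DeepGCN`, `max_scope`) are hypothetical. The point it demonstrates is that the GNN depth (`depth=8`) is chosen independently of the scope radius (`num_hops=2`).

```python
# Minimal sketch of decoupling GNN depth from scope (illustrative, not the
# paper's released code). Assumptions: adjacency-list graph, 2-hop BFS
# extractor capped at `max_scope` nodes, dense row-normalized GCN backbone.
import torch
import torch.nn as nn

def extract_scope(adj, target, num_hops=2, max_scope=64):
    """BFS out to `num_hops` hops around `target`, keeping at most
    `max_scope` nodes. The returned node set is the bounded scope;
    the GNN depth is chosen independently of `num_hops`."""
    visited, frontier = {target}, [target]
    for _ in range(num_hops):
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in visited and len(visited) < max_scope:
                    visited.add(v)
                    nxt.append(v)
        frontier = nxt
    nodes = sorted(visited)
    idx = {v: i for i, v in enumerate(nodes)}
    # Dense adjacency of the induced subgraph, with self-loops.
    a = torch.eye(len(nodes))
    for u in nodes:
        for v in adj[u]:
            if v in idx:
                a[idx[u], idx[v]] = 1.0
    deg = a.sum(dim=1, keepdim=True)
    return nodes, a / deg  # row-normalized propagation matrix

class DeepGCN(nn.Module):
    """A GCN whose depth may exceed the subgraph radius: the extra layers
    keep smoothing *within* the bounded scope instead of expanding it."""
    def __init__(self, in_dim, hid_dim, depth=8):
        super().__init__()
        dims = [in_dim] + [hid_dim] * depth
        self.layers = nn.ModuleList(
            nn.Linear(d_in, d_out) for d_in, d_out in zip(dims, dims[1:]))

    def forward(self, a_norm, x):
        for layer in self.layers:
            x = torch.relu(layer(a_norm @ x))
        return x

# Usage: embed target node 0 of a toy 5-node graph with an 8-layer GCN
# applied to its 2-hop scope.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 4], 3: [1], 4: [2]}
feat = torch.randn(5, 16)
nodes, a_norm = extract_scope(adj, target=0, num_hops=2)
h = DeepGCN(16, 32, depth=8)(a_norm, feat[nodes])
target_embedding = h[nodes.index(0)]  # representation of the target node
```

Because the propagation matrix `a_norm` is fixed to the bounded subgraph, stacking more layers repeatedly smooths features within the scope rather than enlarging the receptive field, which is precisely the decoupling the abstract describes.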
