Communication-Efficient Sampling for Distributed Training of Graph Convolutional Networks

Training Graph Convolutional Networks (GCNs) is expensive because it requires recursively aggregating data from neighboring nodes. To reduce this computation overhead, previous works have proposed various neighbor sampling methods that estimate the aggregation result from a small number of sampled neighbors. Although these methods successfully accelerate training, they mainly target the single-machine setting. Because real-world graphs are large, training GCNs in distributed systems is desirable. However, we find that existing neighbor sampling methods do not work well in a distributed setting. Specifically, a naive implementation may incur a large volume of feature-vector communication among machines. To address this problem, we propose a communication-efficient neighbor sampling method. Our main idea is to assign higher sampling probabilities to local nodes so that remote nodes are accessed less frequently. We present an algorithm that determines the local sampling probabilities and ensures that the skewed neighbor sampling does not significantly affect the convergence of training. Our experiments on node classification benchmarks show that our method significantly reduces the communication overhead of distributed GCN training with little accuracy loss.
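The abstract does not detail how the local sampling probabilities are chosen, but the core idea of locality-skewed neighbor sampling can be illustrated with a minimal sketch. In the snippet below, neighbors stored on the local machine receive a boosted sampling probability, and importance weights keep the aggregation estimate unbiased. The function name, the `partition_of` map, and the fixed `local_boost` factor are hypothetical placeholders; the paper's actual algorithm tunes these probabilities to preserve convergence rather than using a constant boost.

```python
import numpy as np

def skewed_neighbor_sample(neighbors, partition_of, my_part, k,
                           local_boost=4.0, rng=None):
    """Illustrative locality-aware neighbor sampling (hypothetical sketch).

    Neighbors on the local partition get `local_boost` times the sampling
    score of remote neighbors. Sampling is with replacement, so weighting
    each sample by 1 / (k * p_i) keeps the aggregation estimate unbiased.
    """
    rng = rng or np.random.default_rng()
    neighbors = np.asarray(neighbors)

    # Higher unnormalized score for neighbors stored on this machine.
    is_local = np.array([partition_of[v] == my_part for v in neighbors])
    scores = np.where(is_local, local_boost, 1.0)
    probs = scores / scores.sum()

    # Draw k neighbors (with replacement) under the skewed distribution.
    idx = rng.choice(len(neighbors), size=k, replace=True, p=probs)

    # Importance weights compensate for the skew, so the expected weighted
    # sum matches the full-neighborhood aggregation.
    weights = 1.0 / (k * probs[idx])
    return neighbors[idx], weights
```

For example, with `local_boost=4.0` a local neighbor is four times as likely to be drawn as a remote one, so feature vectors are fetched from other machines correspondingly less often, while the 1/(k·p_i) weights leave the expected aggregation result unchanged.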
