L^2-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks

Graph convolutional networks (GCNs) are increasingly popular in many applications, yet remain notoriously hard to train over large graph datasets, since they must compute node representations recursively from their neighbors. Current GCN training algorithms suffer either from high computational cost that grows exponentially with the number of layers, or from high memory usage for loading the entire graph and node embeddings. In this paper, we propose a novel efficient layer-wise training framework for GCNs (L-GCN) that disentangles feature aggregation and feature transformation during training, greatly reducing time and memory complexity. We present a theoretical analysis of L-GCN under the graph isomorphism framework, showing that under mild conditions L-GCN yields GCNs as powerful as those produced by the more costly conventional training algorithm. We further propose L^2-GCN, which learns a controller for each layer that automatically adjusts the number of training epochs per layer in L-GCN. Experiments show that L-GCN is at least an order of magnitude faster than state-of-the-art methods, with memory usage that stays constant regardless of dataset size, while maintaining comparable prediction performance. With the learned controller, L^2-GCN can further cut training time in half. Our code is available at this https URL.
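The core idea of the layer-wise scheme, training each layer's feature transformation on features that have already been aggregated, can be sketched as follows. This is a minimal illustrative sketch, not the authors' released code: it assumes a precomputed normalized sparse adjacency A_hat, dense node features X, labels y, and a boolean train_mask, and the helper names (train_layer, layerwise_train), the throwaway per-layer classifier, and the fixed epoch count (which L^2-GCN would instead adjust with its learned controller) are all hypothetical.

    import torch
    import torch.nn as nn

    def train_layer(H, y, train_mask, in_dim, out_dim, n_classes,
                    epochs=100, lr=0.01):
        """Train one layer's feature transformation on already-aggregated
        features H = A_hat @ X; the aggregation itself needs no gradients."""
        layer = nn.Linear(in_dim, out_dim)
        clf = nn.Linear(out_dim, n_classes)  # throwaway auxiliary classifier
        opt = torch.optim.Adam(
            list(layer.parameters()) + list(clf.parameters()), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            opt.zero_grad()
            z = torch.relu(layer(H))
            loss = loss_fn(clf(z[train_mask]), y[train_mask])
            loss.backward()
            opt.step()
        return layer

    def layerwise_train(A_hat, X, y, train_mask, dims, n_classes):
        """Train a GCN layer by layer: aggregate, fit one transformation,
        freeze it, and feed its outputs to the next layer."""
        H = X
        layers = []
        for in_dim, out_dim in zip(dims[:-1], dims[1:]):
            H = torch.sparse.mm(A_hat, H)  # feature aggregation, no gradients
            layer = train_layer(H, y, train_mask, in_dim, out_dim, n_classes)
            with torch.no_grad():
                H = torch.relu(layer(H))   # frozen transform -> next layer's input
            layers.append(layer)
        return layers, H

Because aggregation happens outside the gradient computation, each layer trains like a small MLP over node features and never needs the recursive neighborhood expansion that makes end-to-end GCN training costly.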
