Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling

Graph convolutional networks (GCNs) have recently achieved great empirical success in learning graph-structured data. To address their scalability issue, which stems from the recursive embedding of neighboring features, graph topology sampling has been proposed to reduce the memory and computational cost of training GCNs, and in many empirical studies it has achieved test performance comparable to training without topology sampling. To the best of our knowledge, this paper provides the first theoretical justification of graph topology sampling in training (up to) three-layer GCNs for semi-supervised node classification. We formally characterize sufficient conditions on graph topology sampling under which GCN training achieves a diminishing generalization error. Moreover, our analysis tackles the non-convex interaction of weights across layers, which is under-explored in existing theoretical analyses of GCNs. The paper explicitly characterizes the impact of graph structure and topology sampling on generalization performance and sample complexity, and the theoretical findings are corroborated by numerical experiments.
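
To make the sampling mechanism concrete, below is a minimal NumPy sketch of one GCN layer with uniform neighbor sampling: each node aggregates over at most a fixed number of sampled neighbors instead of its full neighborhood, which is the source of the recursive-embedding cost the abstract refers to. All function and variable names are illustrative, not from the paper, which analyzes more general sampling schemes.

```python
import numpy as np

def sampled_gcn_layer(adj_lists, features, weight, num_samples, rng):
    """One GCN layer with uniform neighbor (graph topology) sampling.

    A minimal sketch, assuming uniform sampling without replacement:
    instead of aggregating over every neighbor, each node aggregates
    over at most `num_samples` sampled neighbors plus a self-loop.
    `adj_lists` maps node id -> list of neighbor ids.
    """
    n, _ = features.shape
    agg = np.zeros_like(features)
    for v in range(n):
        neighbors = adj_lists[v]
        if len(neighbors) > num_samples:
            # Topology sampling: keep a random subset of the neighborhood.
            neighbors = rng.choice(neighbors, size=num_samples, replace=False)
        # Average the (sampled) neighbor features together with node v's own.
        agg[v] = features[list(neighbors) + [v]].mean(axis=0)
    # Linear transform followed by ReLU, as in a standard GCN layer.
    return np.maximum(agg @ weight, 0.0)

# Toy usage: a 4-node cycle graph, 3-d features, 2 sampled neighbors per node.
rng = np.random.default_rng(0)
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
X = rng.standard_normal((4, 3))
W = rng.standard_normal((3, 5))
H = sampled_gcn_layer(adj, X, W, num_samples=2, rng=rng)
print(H.shape)  # (4, 5)
```

Because only a bounded number of neighbor features is touched per node, the per-layer cost scales with the sample budget rather than with node degrees, which is what makes this style of training tractable on large graphs.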
