Stability and Generalization of Graph Convolutional Neural Networks

Inspired by convolutional neural networks on 1D and 2D data, graph convolutional neural networks (GCNNs) have been developed for various learning tasks on graph data and have shown superior performance on real-world datasets. Despite this success, theoretical analysis of GCNN models, in particular of their generalization properties, remains scarce. In this paper, we take a first step towards a deeper theoretical understanding of GCNN models by analyzing the stability of single-layer GCNN models and deriving their generalization guarantees in a semi-supervised graph learning setting. In particular, we show that the algorithmic stability of a GCNN model depends on the largest absolute eigenvalue of its graph convolution filter. Moreover, to ensure the uniform stability needed for strong generalization guarantees, the largest absolute eigenvalue must be independent of the graph size. Our results offer new insights into the design of new and improved graph convolution filters with guaranteed algorithmic stability. We evaluate the generalization gap and stability on various real-world graph datasets and show that the empirical results support our theoretical findings. To the best of our knowledge, we are the first to study stability bounds on graph learning in a semi-supervised setting and to derive generalization bounds for GCNN models.
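
To make the dependence on the largest absolute eigenvalue concrete, the following is a minimal sketch, not the paper's experimental code. It assumes two illustrative filters that the abstract does not name: an unnormalized filter A + I and a symmetrically normalized filter D^{-1/2}(A + I)D^{-1/2}, where D is the degree matrix of A + I. The helper names and the choice of a star graph are hypothetical and serve only to show how the quantity can be checked against graph size.

```python
import numpy as np


def star_adjacency(n):
    """Adjacency matrix of a star graph: node 0 is linked to every other node."""
    A = np.zeros((n, n))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    return A


def unnormalized_filter(A):
    """Illustrative filter A + I (assumed form, not taken from the abstract)."""
    return A + np.eye(A.shape[0])


def normalized_filter(A):
    """Illustrative filter D^{-1/2} (A + I) D^{-1/2}, with D the degrees of A + I."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    # Scale row i and column j by d_i^{-1/2} and d_j^{-1/2} respectively.
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]


def largest_abs_eigenvalue(M):
    """Largest absolute eigenvalue of a symmetric matrix."""
    return float(np.max(np.abs(np.linalg.eigvalsh(M))))


# Compare how the quantity behaves as the graph grows.
for n in (10, 100, 1000):
    A = star_adjacency(n)
    print(
        n,
        largest_abs_eigenvalue(unnormalized_filter(A)),
        largest_abs_eigenvalue(normalized_filter(A)),
    )
```

On a star graph with n nodes, the largest absolute eigenvalue of A + I is 1 + sqrt(n - 1), so it grows with the graph, whereas the normalized filter is similar to a row-stochastic matrix and its largest absolute eigenvalue stays at 1 for every n. Under the abstract's criterion, only the second behavior is compatible with uniform stability.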
