An Anatomy of Graph Neural Networks Going Deep via the Lens of Mutual Information: Exponential Decay vs. Full Preservation

Graph Convolutional Network (GCN) has attracted intensive interests recently. One major limitation of GCN is that it often cannot benefit from using a deep architecture, while traditional CNN and an alternative Graph Neural Network architecture, namely GraphCNN, often achieve better quality with a deeper neural architecture. How can we explain this phenomenon? In this paper, we take the first step towards answering this question. We first conduct a systematic empirical study on the accuracy of GCN, GraphCNN, and ResNet-18 on 2D images and identified relative importance of different factors in architectural design. This inspired a novel theoretical analysis on the mutual information between the input and the output after $l$ GCN and GraphCNN layers. We identified regimes in which GCN suffers exponentially fast information lose and show that GraphCNN requires a much weaker condition for similar behavior to happen.

[1]  Sebastian Ruder,et al.  An overview of gradient descent optimization algorithms , 2016, Vestnik komp'iuternykh i informatsionnykh tekhnologii.

[2]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[3]  Nathan D. Cahill,et al.  Robust Spatial Filtering With Graph Convolutional Neural Networks , 2017, IEEE Journal of Selected Topics in Signal Processing.

[4]  Cyrus Shahabi,et al.  Graph Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting , 2017, ArXiv.

[5]  Yoshua Bengio,et al.  Random Search for Hyper-Parameter Optimization , 2012, J. Mach. Learn. Res..

[6]  Wojciech Czarnecki,et al.  On Loss Functions for Deep Neural Networks in Classification , 2017, ArXiv.

[7]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[8]  Bernard Ghanem,et al.  Can GCNs Go as Deep as CNNs? , 2019, ArXiv.

[9]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[10]  Yoshua Bengio,et al.  Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.

[11]  Timothy Baldwin,et al.  Semi-supervised User Geolocation via Graph Convolutional Networks , 2018, ACL.

[12]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Gholamreza Haffari,et al.  Graph-to-Sequence Learning using Gated Graph Neural Networks , 2018, ACL.

[14]  Yoshua Bengio,et al.  Mutual Information Neural Estimation , 2018, ICML.

[15]  Danfei Xu,et al.  Scene Graph Generation by Iterative Message Passing , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Taiji Suzuki,et al.  Graph Neural Networks Exponentially Lose Expressive Power for Node Classification , 2019, ICLR.

[17]  Cyrus Shahabi,et al.  Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting , 2017, ICLR.

[18]  Joan Bruna,et al.  Spectral Networks and Locally Connected Networks on Graphs , 2013, ICLR.

[19]  Zhiyuan Liu,et al.  Graph Neural Networks: A Review of Methods and Applications , 2018, AI Open.

[20]  Kilian Q. Weinberger,et al.  Simplifying Graph Convolutional Networks , 2019, ICML.

[21]  Quoc V. Le,et al.  Neural Architecture Search with Reinforcement Learning , 2016, ICLR.

[22]  Taiji Suzuki,et al.  On Asymptotic Behaviors of Graph CNNs from Dynamical Systems Perspective , 2019, ArXiv.

[23]  Bernard Ghanem,et al.  DeepGCNs: Can GCNs Go As Deep As CNNs? , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[24]  Xiao-Ming Wu,et al.  Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning , 2018, AAAI.

[25]  Xavier Bresson,et al.  Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering , 2016, NIPS.

[26]  Philip S. Yu,et al.  A Comprehensive Survey on Graph Neural Networks , 2019, IEEE Transactions on Neural Networks and Learning Systems.