Towards Deeper Graph Neural Networks with Differentiable Group Normalization

Graph neural networks (GNNs), which learn the representation of a node by aggregating the representations of its neighbors, have become an effective computational tool in downstream applications. Over-smoothing is one of the key issues that limit the performance of GNNs as the number of layers increases. This is because stacked aggregators make node representations converge to indistinguishable vectors. Several attempts have been made to tackle the issue by bringing linked node pairs close together and keeping unlinked pairs distinct. However, they often ignore the intrinsic community structures and therefore result in sub-optimal performance. The representations of nodes within the same community/class need to be similar to facilitate classification, while different classes are expected to be separated in the embedding space. To bridge the gap, we introduce two over-smoothing metrics and a novel technique, differentiable group normalization (DGN). DGN normalizes nodes within the same group independently to increase their smoothness, and separates node distributions among different groups to significantly alleviate the over-smoothing issue. Experiments on real-world datasets demonstrate that DGN makes GNN models more robust to over-smoothing and achieves better performance with deeper GNNs.
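The over-smoothing effect described above can be illustrated with a minimal sketch. It assumes a toy graph of two triangles joined by one edge, a plain mean aggregator with self-loops, and no learned weights or nonlinearities. The helper names (`mean_aggregator`, `mean_pairwise_distance`) are hypothetical, and the distance used here is only an illustrative proxy, not one of the two over-smoothing metrics introduced in the paper, nor an implementation of DGN.

```python
import numpy as np


def mean_aggregator(adj):
    """Row-normalized adjacency with self-loops: each node averages itself and its neighbors."""
    a_hat = adj + np.eye(adj.shape[0])
    return a_hat / a_hat.sum(axis=1, keepdims=True)


def mean_pairwise_distance(h):
    """Average Euclidean distance over all distinct node pairs (illustrative proxy only)."""
    diff = h[:, None, :] - h[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    n = h.shape[0]
    return dist.sum() / (n * (n - 1))


# Toy graph (assumption): two triangles {0,1,2} and {3,4,5} joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

rng = np.random.default_rng(0)
features = rng.normal(size=(6, 4))  # random initial node features
propagate = mean_aggregator(adj)

# Stack aggregation steps and record how far apart the node representations remain.
distances = {}
h = features.copy()
depth = 0
for target in (0, 2, 8, 32, 128):
    while depth < target:
        h = propagate @ h
        depth += 1
    distances[target] = mean_pairwise_distance(h)

for k, d in distances.items():
    print(f"{k:>3} layers: mean pairwise distance = {d:.6f}")
```

On a connected graph the repeated averaging pulls all rows toward a single vector, so the two triangles, which could stand in for two communities, end up indistinguishable as depth grows. This is the situation that group-aware normalization is meant to counteract.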
