PairNorm: Tackling Oversmoothing in GNNs

The performance of graph neural nets (GNNs) is known to gradually decrease with increasing number of layers. This decay is partly attributed to oversmoothing, where repeated graph convolutions eventually make node embeddings indistinguishable. We take a closer look at two different interpretations, aiming to quantify oversmoothing. Our main contribution is PairNorm, a novel normalization layer that is based on a careful analysis of the graph convolution operator, which prevents all node embeddings from becoming too similar. What is more, PairNorm is fast, easy to implement without any change to network architecture nor any additional parameters, and is broadly applicable to any GNN. Experiments on real-world graphs demonstrate that PairNorm makes deeper GCN, GAT, and SGC models more robust against oversmoothing, and significantly boosts performance for a new problem setting that benefits from deeper GNNs. Code is available at this https URL.

[1]  Wenbing Huang,et al.  The Truly Deep Graph Convolutional Networks for Node Classification , 2019, ArXiv.

[2]  Takanori Maehara,et al.  Revisiting Graph Neural Networks: All We Have is Low-Pass Filters , 2019, ArXiv.

[3]  Yoshua Bengio,et al.  GMNN: Graph Markov Neural Networks , 2019, ICML.

[4]  Matthias Fey,et al.  Just Jump: Dynamic Neighborhood Aggregation in Graph Neural Networks , 2019, ArXiv.

[5]  Bernard Ghanem,et al.  Can GCNs Go as Deep as CNNs? , 2019, ArXiv.

[6]  Kilian Q. Weinberger,et al.  Simplifying Graph Convolutional Networks , 2019, ICML.

[7]  Stephan Günnemann,et al.  Pitfalls of Graph Neural Network Evaluation , 2018, ArXiv.

[8]  Stephan Günnemann,et al.  Predict then Propagate: Combining neural networks with personalized pagerank for classification on graphs , 2018, ICLR 2018.

[9]  Ken-ichi Kawarabayashi,et al.  Representation Learning on Graphs with Jumping Knowledge Networks , 2018, ICML.

[10]  Xiao-Ming Wu,et al.  Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning , 2018, AAAI.

[11]  Pietro Liò,et al.  Graph Attention Networks , 2017, ICLR.

[12]  Jure Leskovec,et al.  Inductive Representation Learning on Large Graphs , 2017, NIPS.

[13]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[14]  Tim Salimans,et al.  Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks , 2016, NIPS.

[15]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[17]  Lise Getoor,et al.  Collective Classification in Network Data , 2008, AI Mag..