Simplifying Graph Convolutional Networks

Graph Convolutional Networks (GCNs) and their variants have experienced significant attention and have become the de facto methods for learning graph representations. GCNs derive inspiration primarily from recent deep learning approaches, and as a result, may inherit unnecessary complexity and redundant computation. In this paper, we reduce this excess complexity through successively removing nonlinearities and collapsing weight matrices between consecutive layers. We theoretically analyze the resulting linear model and show that it corresponds to a fixed low-pass filter followed by a linear classifier. Notably, our experimental evaluation demonstrates that these simplifications do not negatively impact accuracy in many downstream applications. Moreover, the resulting model scales to larger datasets, is naturally interpretable, and yields up to two orders of magnitude speedup over FastGCN.

[1]  F ROSENBLATT,et al.  The perceptron: a probabilistic model for information storage and organization in the brain. , 1958, Psychological review.

[2]  J. Orbach Principles of Neurodynamics. Perceptrons and the Theory of Brain Mechanisms. , 1962 .

[3]  Christopher G. Harris,et al.  A Combined Corner and Edge Detector , 1988, Alvey Vision Conference.

[4]  Jorge Nocedal,et al.  On the limited memory BFGS method for large scale optimization , 1989, Math. Program..

[5]  Lawrence D. Jackel,et al.  Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.

[6]  Geoffrey E. Hinton,et al.  Phoneme recognition using time-delay neural networks , 1989, IEEE Trans. Acoust. Speech Signal Process..

[7]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[8]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[9]  Zoubin Ghahramani,et al.  Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions , 2003, ICML 2003.

[10]  Bernhard Schölkopf,et al.  Learning with Local and Global Consistency , 2003, NIPS.

[11]  Mikhail Belkin,et al.  Semi-Supervised Learning on Riemannian Manifolds , 2004, Machine Learning.

[12]  Mikhail Belkin,et al.  Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples , 2006, J. Mach. Learn. Res..

[13]  Lise Getoor,et al.  Collective Classification in Network Data , 2008, AI Mag..

[14]  Brendan T. O'Connor,et al.  A Latent Variable Model for Geographic Lexical Variation , 2010, EMNLP.

[15]  Léon Bottou,et al.  Large-Scale Machine Learning with Stochastic Gradient Descent , 2010, COMPSTAT.

[16]  Timothy Baldwin,et al.  Geolocation Prediction in Social Media Data by Finding Location Indicative Words , 2012, COLING.

[17]  Jason Baldridge,et al.  Supervised Text-based Geolocation Using Language Models on an Adaptive Grid , 2012, EMNLP.

[18]  Hossein Mobahi,et al.  Deep Learning via Semi-supervised Embedding , 2012, Neural Networks: Tricks of the Trade.

[19]  David D. Cox,et al.  Hyperopt: A Python Library for Optimizing the Hyperparameters of Machine Learning Algorithms , 2013, SciPy.

[20]  Joan Bruna,et al.  Spectral Networks and Locally Connected Networks on Graphs , 2013, ICLR.

[21]  Steven Skiena,et al.  DeepWalk: online learning of social representations , 2014, KDD.

[22]  Pinar Yanardag,et al.  Deep Graph Kernels , 2015, KDD.

[23]  Jitendra Malik,et al.  Hypercolumns for object segmentation and fine-grained localization , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[25]  O. A. von Lilienfeld,et al.  Electronic spectra from TDDFT and machine learning in chemical space. , 2015, The Journal of chemical physics.

[26]  Donald F. Towsley,et al.  Diffusion-Convolutional Neural Networks , 2015, NIPS.

[27]  Ruslan Salakhutdinov,et al.  Revisiting Semi-Supervised Learning with Graph Embeddings , 2016, ICML.

[28]  Kilian Q. Weinberger,et al.  Deep Networks with Stochastic Depth , 2016, ECCV.

[29]  Xavier Bresson,et al.  Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering , 2016, NIPS.

[30]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[31]  Danqi Chen,et al.  Position-aware Attention and Supervised Data Improve Slot Filling , 2017, EMNLP.

[32]  Jure Leskovec,et al.  Inductive Representation Learning on Large Graphs , 2017, NIPS.

[33]  Mathias Niepert,et al.  Learning Graph Representations with Embedding Propagation , 2017, NIPS.

[34]  Jonathan Masci,et al.  Geometric Deep Learning on Graphs and Manifolds Using Mixture Model CNNs , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Chong Wang,et al.  Attention-based Graph Neural Network for Semi-supervised Learning , 2018, ArXiv.

[36]  Razvan Pascanu,et al.  Relational inductive biases, deep learning, and graph networks , 2018, ArXiv.

[37]  Pietro Liò,et al.  Graph Attention Networks , 2017, ICLR.

[38]  Xiao-Ming Wu,et al.  Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning , 2018, AAAI.

[39]  Christopher D. Manning,et al.  Graph Convolution over Pruned Dependency Trees Improves Relation Extraction , 2018, EMNLP.

[40]  Timothy Baldwin,et al.  Semi-supervised User Geolocation via Graph Convolutional Networks , 2018, ACL.

[41]  Edith Cohen,et al.  Bootstrapped Graph Diffusions: Exposing the Power of Nonlinearity , 2018, Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems.

[42]  Chen Cai,et al.  A simple yet effective baseline for non-attribute graph classification , 2018, ArXiv.

[43]  Hao Ma,et al.  GaAN: Gated Attention Networks for Learning on Large and Spatiotemporal Graphs , 2018, UAI.

[44]  Jessica B. Hamrick,et al.  LanczosNet : Multi-Scale Deep Graph Convolutional Networks , 2018 .

[45]  Yixin Chen,et al.  An End-to-End Deep Learning Architecture for Graph Classification , 2018, AAAI.

[46]  Cao Xiao,et al.  FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling , 2018, ICLR.

[47]  Abhinav Gupta,et al.  Zero-Shot Recognition via Semantic Embeddings and Knowledge Graphs , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[48]  Joonseok Lee,et al.  N-GCN: Multi-scale Graph Convolution for Semi-supervised Node Classification , 2018, UAI.

[49]  Yuan Luo,et al.  Graph Convolutional Networks for Text Classification , 2018, AAAI.

[50]  Jure Leskovec,et al.  How Powerful are Graph Neural Networks? , 2018, ICLR.

[51]  Edith Cohen,et al.  Bootstrapped Graph Diffusions: Exposing the Power of Nonlinearity , 2018, PERV.

[52]  Philip S. Yu,et al.  A Comprehensive Survey on Graph Neural Networks , 2019, IEEE Transactions on Neural Networks and Learning Systems.

[53]  Qimai Li,et al.  Label Efficient Semi-Supervised Learning via Graph Filtering , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[54]  Hao Wang,et al.  Rethinking Knowledge Graph Propagation for Zero-Shot Learning , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[55]  Pietro Liò,et al.  Deep Graph Infomax , 2018, ICLR.

[56]  Stephan Günnemann,et al.  Predict then Propagate: Graph Neural Networks meet Personalized PageRank , 2018, ICLR.

[57]  Ryan A. Rossi,et al.  Attention Models in Graphs , 2018, ACM Trans. Knowl. Discov. Data.

[58]  Wei-Lun Chao,et al.  Classifier and Exemplar Synthesis for Zero-Shot Learning , 2018, International Journal of Computer Vision.

[59]  Zhiyuan Liu,et al.  Graph Neural Networks: A Review of Methods and Applications , 2018, AI Open.