Classifying Graphs as Images with Convolutional Neural Networks

The task of graph classification is currently dominated by graph kernels, which, while powerful, suffer from some significant limitations. Convolutional Neural Networks (CNNs) offer a very appealing alternative. However, processing graphs with CNNs is not trivial. To address this challenge, many sophisticated extensions of CNNs have recently been proposed. In this paper, we show that a classical 2D CNN architecture designed for images can also be used for graph processing in a completely off-the-shelf manner; the only prerequisite is to encode graphs as stacks of two-dimensional histograms of their node embeddings. Despite its simplicity, our method proves very competitive with state-of-the-art graph kernels, and even outperforms them by a wide margin on some datasets. Our approach is also preferable in terms of time complexity. Code and data are publicly available.
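The encoding step described above can be illustrated with a minimal sketch. It assumes that node embeddings are already available as an (n_nodes, d) array and that each channel is the 2D histogram of one pair of consecutive embedding dimensions; the function name `graph_to_image`, the bin count, the value range, and the pairing of dimensions are illustrative assumptions, not details taken from the abstract.

```python
import numpy as np


def graph_to_image(embeddings, bins=14, value_range=(-1.0, 1.0)):
    """Encode an (n_nodes, d) embedding matrix as a stack of 2D histograms.

    Assumption: channel c is the histogram of dimensions (2c, 2c + 1).
    Points outside value_range are not counted.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    n_nodes, d = embeddings.shape
    n_channels = d // 2  # a trailing odd dimension is ignored
    # Shared bin edges so every graph maps to an image of the same size.
    edges = np.linspace(value_range[0], value_range[1], bins + 1)
    image = np.zeros((n_channels, bins, bins))
    for c in range(n_channels):
        x = embeddings[:, 2 * c]
        y = embeddings[:, 2 * c + 1]
        hist, _, _ = np.histogram2d(x, y, bins=[edges, edges])
        image[c] = hist
    return image


# Random values stand in for real node embeddings (hypothetical data).
rng = np.random.default_rng(0)
fake_embeddings = rng.uniform(-1.0, 1.0, size=(30, 4))
image = graph_to_image(fake_embeddings, bins=8)
print(image.shape)  # (channels, height, width)
```

Because the bin edges are fixed, graphs with different numbers of nodes all yield arrays of the same shape, which is what lets a standard 2D CNN consume them as multi-channel images.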
