MILE: A Multi-Level Framework for Scalable Graph Embedding

Recently there has been a surge of interest in designing graph embedding methods, yet few, if any, can scale to graphs with millions of nodes due to both computational complexity and memory requirements. In this paper, we address this limitation by introducing the MultI-Level Embedding (MILE) framework -- a generic methodology that allows contemporary graph embedding methods to scale to large graphs. MILE repeatedly coarsens the graph into smaller ones using a hybrid matching technique that preserves the backbone structure of the graph. It then applies an existing embedding method to the coarsest graph and refines the embeddings back to the original graph through a graph convolutional network that it learns. The MILE framework is agnostic to the underlying graph embedding technique and can be applied to many existing methods without modifying them. We employ our framework with several popular graph embedding techniques and evaluate the resulting embeddings on real-world graphs. Experimental results on five large-scale datasets demonstrate that MILE speeds up graph embedding by an order of magnitude while generating embeddings of better quality for the task of node classification. MILE comfortably scales to a graph with 9 million nodes and 40 million edges, on which existing methods run out of memory or take too long to compute on a modern workstation. Our code and data are publicly available with detailed instructions for adding new base embedding methods: \url{this https URL}.
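The three-phase pipeline described above (coarsen, base-embed, refine) can be illustrated with a minimal sketch. The function names and the specific choices below are illustrative assumptions, not MILE's actual implementation: the coarsening here uses plain greedy heavy-edge matching rather than MILE's hybrid matching, the base embedding is a trivial eigenvector placeholder standing in for any off-the-shelf method (DeepWalk, LINE, etc.), and the refinement uses a single fixed propagation step, whereas MILE learns the graph-convolution refiner.

```python
import numpy as np

def coarsen(adj):
    """One level of coarsening via greedy heavy-edge matching (a
    simplification of MILE's hybrid matching). Returns the coarse
    adjacency and the binary matrix M (n x n_coarse) mapping fine
    nodes to coarse nodes."""
    n = adj.shape[0]
    matched = np.full(n, -1)
    groups = []
    for u in range(n):
        if matched[u] != -1:
            continue
        # Match u with its heaviest still-unmatched neighbour, if any.
        nbrs = [v for v in np.where(adj[u] > 0)[0]
                if v != u and matched[v] == -1]
        if nbrs:
            v = max(nbrs, key=lambda w: adj[u, w])
            matched[u] = matched[v] = len(groups)
            groups.append([u, v])
        else:
            matched[u] = len(groups)
            groups.append([u])
    M = np.zeros((n, len(groups)))
    for u in range(n):
        M[u, matched[u]] = 1.0
    coarse_adj = M.T @ adj @ M
    np.fill_diagonal(coarse_adj, 0.0)  # drop self-loops from merged pairs
    return coarse_adj, M

def base_embed(adj, dim):
    """Placeholder base embedding: top eigenvectors of the adjacency.
    MILE would invoke any existing embedding method here."""
    _, vecs = np.linalg.eigh(adj)
    return vecs[:, -dim:]

def refine(embedding, M, adj):
    """Project coarse embeddings to the finer level, then smooth with
    one fixed propagation step D^{-1} A E. MILE instead *learns* a
    graph-convolution refiner; this is only a stand-in."""
    E = M @ embedding                       # merged nodes share an embedding
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                     # guard isolated nodes
    return (adj @ E) / deg

def mile_sketch(adj, dim=2, levels=2):
    """Coarsen `levels` times, embed the coarsest graph, refine back up."""
    adjs, matchings = [adj], []
    for _ in range(levels):
        coarse_adj, M = coarsen(adjs[-1])
        adjs.append(coarse_adj)
        matchings.append(M)
    E = base_embed(adjs[-1], dim)
    for level in range(levels - 1, -1, -1):
        E = refine(E, matchings[level], adjs[level])
    return E
```

On an 8-node cycle graph, two levels of matching shrink the graph to 2 super-nodes before embedding, and refinement projects the result back to all 8 nodes; the same structure lets the expensive base embedding run only on the smallest graph.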
