Latent Network Summarization: Bridging Network Embedding and Summarization

Motivated by the computational and storage challenges that dense embeddings pose, we introduce the problem of latent network summarization that aims to learn a compact, latent representation of the graph structure with dimensionality that is independent of the input graph size (\i.e., #nodes and #edges), while retaining the ability to derive node representations on the fly. We propose Multi-LENS, an inductive multi-level latent network summarization approach that leverages a set of relational operators and relational functions (compositions of operators) to capture the structure of egonets and higher-order subgraphs, respectively. The structure is stored in low-rank, size-independent structural feature matrices, which along with the relational functions comprise our latent network summary. Multi-LENS is general and naturally supports both homogeneous and heterogeneous graphs with or without directionality, weights, attributes or labels. Extensive experiments on real graphs show 3.5-34.3% improvement in AUC for link prediction, while requiring 80-2152x less output storage space than baseline embedding methods on large datasets. As application areas, we show the effectiveness of Multi-LENS in detecting anomalies and events in the Enron email communication graph and Twitter co-mention graph.

[1]  Danai Koutra,et al.  Exploratory Analysis of Graph Data by Leveraging Domain Knowledge , 2017, 2017 IEEE International Conference on Data Mining (ICDM).

[2]  Nisheeth Shrivastava,et al.  Graph summarization with bounded error , 2008, SIGMOD Conference.

[3]  Ryan A. Rossi,et al.  Learning Role-based Graph Embeddings , 2018, ArXiv.

[4]  Steven Skiena,et al.  DeepWalk: online learning of social representations , 2014, KDD.

[5]  Ryan A. Rossi,et al.  Role Discovery in Networks , 2014, IEEE Transactions on Knowledge and Data Engineering.

[6]  Daniel J. Abadi,et al.  Scalable Pattern Matching over Compressed Graphs via Dedensification , 2016, KDD.

[7]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[8]  Jean-Loup Guillaume,et al.  Fast unfolding of communities in large networks , 2008, 0803.0476.

[9]  M. Brand,et al.  Fast low-rank modifications of the thin singular value decomposition , 2006 .

[10]  Alan M. Frieze,et al.  Fast monte-carlo algorithms for finding low-rank approximations , 2004, JACM.

[11]  Ryan A. Rossi,et al.  Deep Inductive Network Representation Learning , 2018, WWW.

[12]  Danai Koutra,et al.  Graph Summarization Methods and Applications , 2016, ACM Comput. Surv..

[13]  Mingzhe Wang,et al.  LINE: Large-scale Information Network Embedding , 2015, WWW.

[14]  Wei Lu,et al.  Deep Neural Networks for Learning Graph Representations , 2016, AAAI.

[15]  Patrick J. Wolfe,et al.  A Spectral Framework for Anomalous Subgraph Detection , 2014, IEEE Transactions on Signal Processing.

[16]  S. Havlin,et al.  Scale-free networks are ultrasmall. , 2002, Physical review letters.

[17]  Huan Liu,et al.  Leveraging social media networks for classification , 2011, Data Mining and Knowledge Discovery.

[18]  Ryan A. Rossi,et al.  The Network Data Repository with Interactive Graph Analytics and Visualization , 2015, AAAI.

[19]  Qiongkai Xu,et al.  GraRep: Learning Graph Representations with Global Structural Information , 2015, CIKM.

[20]  Palash Goyal,et al.  Graph Embedding Techniques, Applications, and Performance: A Survey , 2017, Knowl. Based Syst..

[21]  Jure Leskovec,et al.  node2vec: Scalable Feature Learning for Networks , 2016, KDD.

[22]  Stephan Günnemann,et al.  Deep Gaussian Embedding of Graphs: Unsupervised Inductive Learning via Ranking , 2017, ICLR.

[23]  Mark Heimann,et al.  REGAL: Representation Learning-based Graph Alignment , 2018, CIKM.

[24]  Omer Levy,et al.  Neural Word Embedding as Implicit Matrix Factorization , 2014, NIPS.

[25]  Jure Leskovec,et al.  Inductive Representation Learning on Large Graphs , 2017, NIPS.

[26]  Jiawei Han,et al.  AspEm: Embedding Learning by Aspects in Heterogeneous Information Networks , 2018, SDM.

[27]  Daniel R. Figueiredo,et al.  struc2vec: Learning Node Representations from Structural Identity , 2017, KDD.

[28]  Jian Li,et al.  Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec , 2017, WSDM.

[29]  Nitesh V. Chawla,et al.  metapath2vec: Scalable Representation Learning for Heterogeneous Networks , 2017, KDD.

[30]  Danai Koutra,et al.  RolX: structural role extraction & mining in large graphs , 2012, KDD.

[31]  Craig A. Knoblock,et al.  Unsupervised Entity Resolution on Multi-type Graphs , 2016, SEMWEB.

[32]  Danai Koutra,et al.  DELTACON: A Principled Massive-Graph Similarity Function , 2013, SDM.

[33]  Danai Koutra,et al.  TimeCrunch: Interpretable Dynamic Graph Summarization , 2015, KDD.