LASAGNE: Locality and Structure Aware Graph Node Embedding

We propose LASAGNE, a methodology for unsupervised learning of locality- and structure-aware graph node embeddings. In particular, we show that the performance of existing random-walk-based approaches depends strongly on the structural properties of the graph, e.g., the size of the graph, whether it has a flat or upward-sloping Network Community Profile (NCP), whether it is expander-like, and whether the classes of interest are more k-core-like or more peripheral. For larger graphs with flat NCPs that are strongly expander-like, the random walks used by existing methods expand rapidly and touch many dissimilar nodes, which leads to lower-quality vector representations that are less useful for downstream tasks. Rather than relying on global random walks or on neighbors within fixed hop distances, LASAGNE exploits strongly local Approximate Personalized PageRank stationary distributions to engineer local information into node embeddings more precisely. This yields more meaningful and more useful vector representations of nodes, particularly in poorly-structured graphs. We show that LASAGNE significantly improves downstream multi-label classification on larger graphs with flat NCPs and performs comparably to existing methods on smaller graphs with upward-sloping NCPs.
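To make the notion of a strongly local Approximate Personalized PageRank vector concrete, the sketch below implements the well-known push-style approximation of Andersen, Chung, and Lang for the lazy random walk. It is an illustration only, not the authors' implementation: the function name `approximate_ppr`, the parameters `alpha` (teleport probability) and `eps` (per-degree residual tolerance), their default values, and the toy graph are all assumptions made for this example.

```python
from collections import deque


def approximate_ppr(adj, seed, alpha=0.15, eps=1e-4):
    """Push-style approximate personalized PageRank (lazy-walk variant).

    adj   : dict mapping each node to a list of its neighbors (undirected)
    seed  : node on which the personalization is concentrated
    alpha : teleport probability (hypothetical default)
    eps   : a node is pushed while residual[node] >= eps * degree(node)

    Returns (p, r): the approximate PPR vector and the leftover residual,
    both as sparse dicts. Only nodes near the seed are ever touched.
    """
    p = {}                # approximate PPR mass settled so far
    r = {seed: 1.0}       # residual mass still to be distributed
    queue = deque([seed])
    queued = {seed}

    while queue:
        u = queue.popleft()
        queued.discard(u)
        deg = len(adj[u])
        ru = r.get(u, 0.0)
        if deg == 0 or ru < eps * deg:
            continue      # nothing to push from this node

        # Push: settle an alpha fraction at u, keep half of the rest at u
        # (laziness), and spread the other half evenly over the neighbors.
        p[u] = p.get(u, 0.0) + alpha * ru
        r[u] = (1.0 - alpha) * ru / 2.0
        share = (1.0 - alpha) * ru / (2.0 * deg)
        for v in adj[u]:
            r[v] = r.get(v, 0.0) + share
            if v not in queued and r[v] >= eps * len(adj[v]):
                queue.append(v)
                queued.add(v)
        if u not in queued and r[u] >= eps * deg:
            queue.append(u)
            queued.add(u)

    return p, r


# Toy graph (an assumption for illustration): two triangles joined by one edge.
toy_graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
p, r = approximate_ppr(toy_graph, seed=0)
ranked = sorted(p, key=p.get, reverse=True)  # nodes ordered by PPR mass
```

Each push conserves mass, so the entries of `p` and `r` always sum to one, and the loop stops once every residual falls below `eps` times the node's degree. Larger values of `eps` keep the computation closer to the seed. One plausible way to use such a vector in an embedding pipeline is to sample context nodes in proportion to `p`; that step is an assumption here and is not spelled out in the abstract.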
