Semantic Proximity Search on Heterogeneous Graph by Proximity Embedding

Many real-world networks contain a rich collection of objects. The semantics of these objects let us capture different classes of proximity, which enables an important task: semantic proximity search. At the core of semantic proximity search is measuring proximity on a heterogeneous graph, whose nodes are objects of various types. Most existing methods measure the proximity between two nodes by engineering features of the graph structure that connects them. Recent developments in graph embedding offer a chance to avoid this feature engineering, yet very little work has applied graph embedding to semantic proximity search. We also observe that graph embedding methods typically focus on embedding nodes, which is an “indirect” way to learn proximity. We therefore introduce a new concept, proximity embedding, which directly embeds the network structure between two possibly distant nodes. We design the proximity embedding to flexibly support both symmetric and asymmetric proximities. From the proximity embedding, we can readily estimate the proximity score between two nodes and thus enable search on the graph. We evaluate our proximity embedding method on three real-world public data sets and show that it outperforms state-of-the-art baselines. We release the code for proximity embedding.
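The idea of embedding the structure between two nodes, rather than the nodes themselves, can be illustrated with a small sketch. The abstract does not specify the architecture, so everything below is an assumption made for illustration: the toy graph, the fixed type vectors in `TYPE_VEC`, the position-weighted path encoder `embed_path`, the max-pooling over paths, and the linear `score` function are all hypothetical and are not the authors' model. The sketch only shows how a pair-level vector can be built from the paths connecting two nodes, and how an order-sensitive encoder yields asymmetric proximities while a direction-averaged one yields symmetric proximities.

```python
# Toy heterogeneous graph (hypothetical): each node has a type, edges are undirected.
NODE_TYPE = {"alice": "user", "bob": "user", "carol": "user",
             "mit": "school", "acme": "employer"}
EDGES = [("alice", "mit"), ("bob", "mit"), ("bob", "acme"), ("carol", "acme")]

# Hypothetical fixed type vectors; a trained model would learn these instead.
TYPE_VEC = {"user": [1.0, 0.0, 0.0],
            "school": [0.0, 1.0, 0.0],
            "employer": [0.0, 0.0, 1.0]}
DIM = 3


def build_adjacency(edges):
    """Build an undirected adjacency list from an edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj


def simple_paths(adj, src, dst, max_len):
    """Return all simple paths from src to dst with at most max_len nodes."""
    paths, stack = [], [[src]]
    while stack:
        path = stack.pop()
        if path[-1] == dst:
            paths.append(path)
            continue
        if len(path) >= max_len:
            continue
        for nxt in adj.get(path[-1], []):
            if nxt not in path:
                stack.append(path + [nxt])
    return paths


def embed_path(path, decay=0.5):
    """Order-sensitive path vector: the node at position i is weighted by decay**i."""
    vec = [0.0] * DIM
    for i, node in enumerate(path):
        weight = decay ** i
        for k, x in enumerate(TYPE_VEC[NODE_TYPE[node]]):
            vec[k] += weight * x
    return vec


def proximity_embedding(adj, q, v, max_len=4, symmetric=False):
    """Embed the structure between q and v by max-pooling over path vectors."""
    paths = simple_paths(adj, q, v, max_len)
    if not paths:
        return [0.0] * DIM
    vecs = []
    for path in paths:
        vec = embed_path(path)
        if symmetric:
            # Add the reversed reading so that (q, v) and (v, q) coincide.
            vec = [a + b for a, b in zip(vec, embed_path(path[::-1]))]
        vecs.append(vec)
    return [max(column) for column in zip(*vecs)]


def score(theta, embedding):
    """Proximity score as a dot product with a (hypothetical) parameter vector."""
    return sum(t * e for t, e in zip(theta, embedding))


adj = build_adjacency(EDGES)
emb = proximity_embedding(adj, "alice", "bob")
print(emb, score([1.0, 2.0, 0.0], emb))
```

With `symmetric=False`, the embeddings of ("alice", "carol") and ("carol", "alice") differ because the encoder weights nodes by their position along the path; with `symmetric=True`, they coincide. In a learned setting, `theta` and the type vectors would be fit to labeled proximity examples for each semantic class, which this sketch does not attempt.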
