User guided entity similarity search using meta-path selection in heterogeneous information networks

With the emergence of web-based social and information applications, entity similarity search in information networks, which aims to find entities highly similar to a given query entity, has gained wide attention. However, heterogeneous information networks contain multi-typed entities and relationships that carry diverse semantic meanings, so similarity measurement can be ambiguous without context. In this paper, we investigate entity similarity search and the resulting ambiguity problems in heterogeneous information networks. We propose to use a meta-path-based ranking model ensemble to represent the semantic meanings of similarity queries, and we explore the possibility of using user guidance to understand users' queries. Experiments on real-world datasets show that our framework significantly outperforms competitor methods.
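The ambiguity described above can be illustrated with a minimal sketch. It assumes a toy bibliographic network, two meta-paths (author-paper-author, "APA", and author-paper-venue-paper-author, "APVPA"), and a PathSim-style normalized path count as the per-path similarity. The data, the function names, and the hand-set weights are all hypothetical; the abstract does not specify the paper's actual ranking models or how user guidance sets the weights.

```python
# Toy bibliographic network (hypothetical data, for illustration only).
AUTHORS = ["A0", "A1", "A2"]
AUTHOR_PAPER = [[1, 1, 0, 0],   # A0 wrote P0, P1
                [0, 1, 1, 0],   # A1 wrote P1, P2
                [0, 0, 0, 1]]   # A2 wrote P3
PAPER_VENUE = [[1, 0],          # P0 appeared in V0
               [1, 0],          # P1 appeared in V0
               [0, 1],          # P2 appeared in V1
               [1, 0]]          # P3 appeared in V0


def matmul(a, b):
    """Multiply two dense matrices given as lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def commuting_matrix(half_path):
    """Path counts for the symmetric meta-path: half_path, then its reverse."""
    return matmul(half_path, transpose(half_path))


def path_sim(counts, i, j):
    """PathSim-style score: 2 * paths(i, j) / (paths(i, i) + paths(j, j))."""
    denom = counts[i][i] + counts[j][j]
    return 2 * counts[i][j] / denom if denom else 0.0


def rank(query, weights, matrices):
    """Rank all other authors by a weighted sum of per-meta-path scores."""
    scores = []
    for j in range(len(AUTHORS)):
        if j == query:
            continue
        score = sum(weights[name] * path_sim(m, query, j)
                    for name, m in matrices.items())
        scores.append((AUTHORS[j], score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


MATRICES = {
    "APA": commuting_matrix(AUTHOR_PAPER),
    "APVPA": commuting_matrix(matmul(AUTHOR_PAPER, PAPER_VENUE)),
}

# Assumed weights standing in for two different user intents.
print(rank(0, {"APA": 1.0, "APVPA": 0.0}, MATRICES))  # co-authorship intent
print(rank(0, {"APA": 0.0, "APVPA": 1.0}, MATRICES))  # shared-venue intent
```

For the same query author A0, the co-authorship weighting ranks A1 first, while the shared-venue weighting ranks A2 first. The same query therefore has different answers depending on which meta-path the user has in mind, which is the ambiguity that user guidance is meant to resolve.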
