Heterogeneous information networks, which contain multiple types of objects and links, are ubiquitous in the real world; examples include bibliographic networks, cyber-physical networks, and social media networks. Although researchers have studied various data mining tasks on information networks, interactive, query-based network exploration techniques have not been addressed systematically, even though they are highly desirable for exploring large-scale information networks.
In this demo, we introduce and demonstrate our recent research project on query-driven discovery of semantically similar substructures in heterogeneous networks. Given a subgraph query, our system searches a large information network and efficiently finds a list of subgraphs that are structurally identical and semantically similar to the query. Because data mining methods are used to obtain semantically similar entities (nodes), we use the term discovery to describe this process. To achieve high efficiency and scalability, we design and implement a filter-and-verification search framework, which first generates promising subgraph candidates using offline indices built from data mining results and then verifies the candidates with a recursive pruning matching process. The proposed system demonstrates the effectiveness of our query-driven semantic similarity search framework and the efficiency of the proposed methodology on multiple real-world heterogeneous information networks.
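As a rough illustration of the filter-and-verification idea, the following Python sketch runs both phases on a toy network. It is not the system's implementation: the function names (`filter_candidates`, `verify`), the dictionary-based similarity index, the similarity threshold, and the toy data are all assumptions made for this example. The sketch also assumes that the offline index only lists nodes of the same type as the query node, and it checks only that every query edge is preserved in the match.

```python
from typing import Dict, List, Set, Tuple

Node = str


def filter_candidates(
    query_nodes: List[Node],
    sim_index: Dict[Node, Dict[Node, float]],
    threshold: float,
) -> Dict[Node, Set[Node]]:
    """Filter phase: look up each query node in a precomputed (offline)
    similarity index and keep data nodes whose score meets the threshold."""
    return {
        q: {v for v, score in sim_index.get(q, {}).items() if score >= threshold}
        for q in query_nodes
    }


def verify(
    query_edges: List[Tuple[Node, Node]],
    candidates: Dict[Node, Set[Node]],
    data_adj: Dict[Node, Set[Node]],
) -> List[Dict[Node, Node]]:
    """Verification phase: recursive backtracking that prunes a partial
    match as soon as a query edge has no counterpart in the data network."""
    # Undirected adjacency of the query graph.
    q_adj: Dict[Node, Set[Node]] = {}
    for a, b in query_edges:
        q_adj.setdefault(a, set()).add(b)
        q_adj.setdefault(b, set()).add(a)

    # Match query nodes with the fewest candidates first (a common heuristic).
    order = sorted(candidates, key=lambda q: len(candidates[q]))
    results: List[Dict[Node, Node]] = []

    def extend(i: int, mapping: Dict[Node, Node]) -> None:
        if i == len(order):
            results.append(dict(mapping))
            return
        q = order[i]
        for v in sorted(candidates[q]):
            if v in mapping.values():
                continue  # each data node is used at most once
            # Prune: every already-matched query neighbour of q must be
            # adjacent to v in the data network.
            if all(
                mapping[n] in data_adj.get(v, set())
                for n in q_adj.get(q, ())
                if n in mapping
            ):
                mapping[q] = v
                extend(i + 1, mapping)
                del mapping[q]

    extend(0, {})
    return results


# Toy bibliographic network (hypothetical): authors A1..A3, venues V1 and V2.
data_adj = {
    "A1": {"V1"}, "A2": {"V2"}, "A3": {"V2"},
    "V1": {"A1"}, "V2": {"A2", "A3"},
}
# Hypothetical offline index: query node -> {similar node: similarity score}.
sim_index = {
    "A1": {"A1": 1.0, "A2": 0.8, "A3": 0.3},
    "V1": {"V1": 1.0, "V2": 0.7},
}
query_nodes = ["A1", "V1"]
query_edges = [("A1", "V1")]

candidates = filter_candidates(query_nodes, sim_index, threshold=0.5)
matches = verify(query_edges, candidates, data_adj)
```

In this toy run the filter phase discards A3 (score 0.3), and the verification phase rejects pairings such as (A1, V2) because no corresponding edge exists in the data network.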