Path-Graph Fusion Based Community Detection over Heterogeneous Information Network

As a natural and general representation of real-world data, the heterogeneous information network (HIN) has been used to model complex, heterogeneous data in many tasks, and community detection on heterogeneous networks has received much attention in recent years. Most HIN-based methods rely on meta-path-based similarity, but these approaches face two challenges. First, the similarity obtained directly from a meta-path is often a biased measure. Second, it is unclear how to effectively aggregate the similarities from different meta-paths for clustering. In this paper, we propose a path-graph fusion based community detection model called PGFCluster. Our model uses a PathSim-based normalization to eliminate similarity bias. In addition, we design a flexible fusion mechanism that dynamically optimizes the fusion result to obtain the best community partition. Experiments on two real-world datasets demonstrate the effectiveness of our model compared with other methods.
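To illustrate the bias that a PathSim-style normalization addresses, the following is a minimal sketch of the standard PathSim measure for a symmetric meta-path, s(x, y) = 2·M[x, y] / (M[x, x] + M[y, y]), where M is the commuting (path-count) matrix. It is not the PGFCluster implementation: the function name `pathsim`, the toy author-paper matrix, and the choice of an author-paper-author meta-path are assumptions made only for this example, and the exact normalization used in the model may differ.

```python
import numpy as np


def pathsim(adjacency: np.ndarray) -> np.ndarray:
    """PathSim for the symmetric meta-path induced by `adjacency`.

    `adjacency` is an (objects x neighbours) incidence matrix W, so the
    commuting matrix of the round-trip meta-path is M = W @ W.T.
    """
    commuting = adjacency @ adjacency.T          # raw path counts M[x, y]
    diag = np.diag(commuting).astype(float)      # self path counts M[x, x]
    denom = diag[:, None] + diag[None, :]        # M[x, x] + M[y, y]
    sim = np.zeros_like(denom)
    # Leave the similarity at 0 where both objects have no paths at all.
    np.divide(2.0 * commuting, denom, out=sim, where=denom > 0)
    return sim


# Hypothetical toy data: 3 authors x 4 papers (1 = author wrote paper).
author_paper = np.array([
    [1, 1, 0, 0],   # author 0: two papers
    [1, 1, 1, 1],   # author 1: four papers (highly visible)
    [0, 0, 1, 0],   # author 2: one paper
])

raw = author_paper @ author_paper.T   # unnormalized meta-path counts
sim = pathsim(author_paper)           # normalized similarities in [0, 1]
```

In the raw counts, the prolific author 1 has a larger self-count than any other entry, which is the kind of visibility bias the abstract refers to. After normalization every object has self-similarity 1 and pairwise values are scaled by the two objects' own path counts.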
