A framework for joint community detection across multiple related networks

Community detection in networks is an active area of research with many practical applications. However, most early work in this area has focused on partitioning a single network or a bipartite graph into clusters or communities. With the rapid proliferation of online social media, it has become increasingly common for web users to have a noticeable presence across multiple web sites. This raises the question of whether information from several networks can be combined to improve community detection. In this paper, we present a framework that identifies communities simultaneously across different networks and learns the correspondences between them. The framework is applicable to networks generated from multiple web sites as well as to those derived from heterogeneous nodes of the same web site. It also allows prior information about the potential relationships between communities in different networks to be incorporated. We performed extensive experiments on both synthetic and real-life data sets to evaluate the effectiveness of the framework. The results show that simultaneous community detection outperforms three alternative methods, including normalized cut and matrix factorization on a single network or a bipartite graph.
