Do more views of a graph help? Community detection and clustering in multi-graphs

Given a co-authorship network, how well can we cluster the participating authors into communities? If we also consider the citation network over the same individuals, can we do a better job? More generally, given a network with multiple types (or views) of edges (e.g., collaboration, citation, friendship), can community detection and graph clustering benefit from the additional views? In this work, we propose Multi-CLUS and GraphFuse, two multi-graph clustering techniques powered by Minimum Description Length and tensor analysis, respectively. We evaluate both approaches experimentally on real and synthetic networks. Our results demonstrate higher clustering accuracy than state-of-the-art baselines that do not exploit the multi-view nature of the network data. Finally, we address the fundamental question posed in the title and provide a comprehensive answer based on our systematic analysis.
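To make the multi-view setting concrete, the following minimal sketch shows one common way to represent a network with several edge types: a three-mode adjacency tensor (nodes × nodes × views). It is an illustration under stated assumptions, not the construction used by Multi-CLUS or GraphFuse; the function name `build_view_tensor`, the toy edge lists, and the choice of undirected, unweighted edges are all hypothetical.

```python
import numpy as np

def build_view_tensor(n_nodes, views):
    """Stack per-view edge lists into an (n_nodes, n_nodes, n_views) tensor.

    Assumption for illustration: edges are undirected and unweighted.
    """
    tensor = np.zeros((n_nodes, n_nodes, len(views)))
    for k, edges in enumerate(views):
        for i, j in edges:
            tensor[i, j, k] = 1.0
            tensor[j, i, k] = 1.0  # mirror the edge to keep each view symmetric
    return tensor

# Hypothetical toy data: two views over the same six authors.
collaboration = [(0, 1), (1, 2), (3, 4)]
citation = [(0, 2), (3, 4), (4, 5)]

X = build_view_tensor(6, [collaboration, citation])

# Summing over the view mode collapses the tensor into a single weighted
# graph, i.e. the input a view-agnostic clustering method would see.
aggregated = X.sum(axis=2)
```

Collapsing the views, as in `aggregated`, discards which edge type contributed each link; a tensor-based method can instead operate on `X` directly and keep that information.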
