Is Objective Function the Silver Bullet? A Case Study of Community Detection Algorithms on Social Networks

Community detection, or cluster detection, in networks is a well-studied, albeit hard, problem. Given the scale and complexity of modern-day social networks, detecting ``reasonable'' communities is harder still. Since the first use of the k-means algorithm in the 1960s, many community detection algorithms have been invented. Most were developed with specific goals in mind, and the notion of a ``meaningful'' community varies widely from one algorithm to another. As the number of community detection algorithms has grown, so has the number of evaluation measures and objective functions, such as modularity and internal density. In this paper we divide these measures into two categories, according to whether or not they rely on ground truth. Our work aims to answer whether these commonly used objective functions are consistent with the real performance of community detection algorithms across a number of homogeneous and heterogeneous networks. Seven representative algorithms are compared under various performance metrics on various real-world social networks.
