Community Detection in Large Social Networks Based on Relationship Density

Numerous algorithms have been proposed for detecting the underlying community structure of social networks. However, most existing methods focus on modularity or structural similarity, and their effectiveness on complex networks is still far from satisfactory. In this paper, we propose a novel community detection algorithm based on a newly defined relationship density in social networks. Because vertices differ in influence, we introduce core vertices and auxiliary vertices to represent, respectively, the users who influence others significantly and those who have a negligible impact. The proposed method consists of two stages: 1) core vertices are clustered into kernel communities in decreasing order of relationship density, and 2) each auxiliary vertex is assigned to the closest connected community formed in the first stage. Experiments on three real-world networks validate the proposed community detection algorithm based on relationship density (CDRD), which improves appreciably on the baseline methods in terms of F1-score.
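The two-stage structure described above can be illustrated with a minimal Python sketch. The abstract does not give the definition of relationship density or the rule that separates core from auxiliary vertices, so everything specific here is an assumption made for illustration: the `density` function (Jaccard overlap of closed neighbourhoods), the `core_degree` and `min_density` thresholds, and the function names are hypothetical stand-ins, not the CDRD definitions.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def density(adj, u, v):
    """Stand-in edge score (an assumption, not the paper's definition):
    Jaccard overlap of the closed neighbourhoods of u and v."""
    nu, nv = adj[u] | {u}, adj[v] | {v}
    return len(nu & nv) / len(nu | nv)


def detect(adj, core_degree=3, min_density=0.5):
    """Two-stage sketch: cluster core vertices, then attach auxiliary ones.
    Both thresholds are hypothetical parameters chosen for illustration."""
    # Assumed core criterion: degree at or above a threshold.
    core = {v for v in adj if len(adj[v]) >= core_degree}
    parent = {v: v for v in core}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Stage 1: merge core vertices along edges in decreasing order of density.
    scored = [(density(adj, u, v), u, v)
              for u in core for v in adj[u] if v in core and u < v]
    for d, u, v in sorted(scored, reverse=True):
        if d < min_density:
            break
        parent[find(u)] = find(v)
    label = {v: find(v) for v in core}

    # Stage 2: each auxiliary vertex joins the community of its
    # highest-scoring core neighbour; isolated ones stay unlabelled.
    for v in adj.keys() - core:
        core_nbrs = [u for u in adj[v] if u in core]
        if core_nbrs:
            best = max(core_nbrs, key=lambda u: density(adj, v, u))
            label[v] = label[best]
    return label


# Two 4-cliques joined by the bridge 3-4, plus pendant vertices 8 and 9.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
         (3, 4), (0, 8), (7, 9)]
labels = detect(build_adjacency(edges))
```

On this toy graph the bridge edge scores below the assumed threshold, so the two cliques remain separate kernel communities and each pendant vertex is attached to the clique it touches. Swapping in the paper's actual density measure would only require replacing `density`.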
