An Empirical Study of Community Overlap: Ground-truth, Algorithmic Solutions, and Implications

In real-world social networks, communities tend to overlap because a vertex can belong to multiple communities. To identify these overlapping communities, a number of overlapping community detection methods have been proposed in recent years. However, there have been very few studies on the characteristics and implications of community overlap. In this paper, we investigate the properties of the nodes and edges that lie within the overlapped regions between communities, using both ground-truth communities and algorithmic communities derived from state-of-the-art overlapping community detection methods. We find that overlapped nodes and overlapped edges play different roles from those outside the overlapped regions. Using real-world data, we empirically show that highly overlapped nodes are involved in structural holes of a network. We also show that overlapped nodes and edges play an important role in forming new links in evolving networks and in diffusing information through a network.
