Community detection in networks: Structural communities versus ground truth

Algorithms that find communities in networks rely only on structural information and search for cohesive subsets of nodes. Most scholars, however, implicitly or explicitly assume that structural communities represent groups of nodes with similar (nontopological) properties or functions. Until now, this hypothesis could not be verified because network datasets with information on the classification of the nodes were lacking. We show that traditional community detection methods fail to find the metadata groups in many large networks. Our results show a marked separation between structural communities and metadata groups, in line with recent findings. This means that either our current modeling of community structure has to be substantially modified, or metadata groups may not be recoverable from topology alone.
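One common way to quantify how closely a detected partition matches a set of metadata groups is the normalized mutual information (NMI) between the two labelings. The sketch below is illustrative only and is not necessarily the measure used in this work. It assumes non-overlapping groups, with every node assigned exactly one structural label and one metadata label; the function name and the toy labelings are hypothetical.

```python
from collections import Counter
from math import log


def normalized_mutual_information(labels_a, labels_b):
    """NMI between two non-overlapping partitions of the same nodes.

    Each argument is a sequence where position i holds the group label
    of node i. Returns a value in [0, 1]; 1 means identical partitions.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same nodes")
    n = len(labels_a)
    if n == 0:
        raise ValueError("partitions must not be empty")

    count_a = Counter(labels_a)                 # group sizes in partition A
    count_b = Counter(labels_b)                 # group sizes in partition B
    count_ab = Counter(zip(labels_a, labels_b)) # sizes of pairwise overlaps

    # Mutual information I(A; B) from the joint and marginal frequencies.
    mutual_info = sum(
        (c / n) * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in count_ab.items()
    )
    # Shannon entropies H(A) and H(B) of the two partitions.
    entropy_a = -sum((c / n) * log(c / n) for c in count_a.values())
    entropy_b = -sum((c / n) * log(c / n) for c in count_b.values())

    if entropy_a + entropy_b == 0:
        # Both partitions put every node in a single group: identical.
        return 1.0
    return 2 * mutual_info / (entropy_a + entropy_b)


# Toy data (hypothetical): six nodes, two structural communities.
structural = [0, 0, 0, 1, 1, 1]
metadata_aligned = ["x", "x", "x", "y", "y", "y"]   # matches the structure
metadata_unaligned = ["x", "y", "x", "y", "x", "y"] # cuts across it

nmi_aligned = normalized_mutual_information(structural, metadata_aligned)
nmi_unaligned = normalized_mutual_information(structural, metadata_unaligned)
print(f"aligned: {nmi_aligned:.3f}, unaligned: {nmi_unaligned:.3f}")
```

A score near 1 indicates that the structural communities reproduce the metadata groups, while a score near 0 indicates the kind of separation described above. Overlapping communities would require a different similarity measure.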
