Ego-Splitting Framework: from Non-Overlapping to Overlapping Clusters

We propose ego-splitting, a new framework for detecting clusters in complex networks that leverages the local structures known as ego-nets (i.e., the subgraphs induced by the neighborhood of each node) to decouple overlapping clusters. Ego-splitting is a highly scalable and flexible framework, with provable theoretical guarantees, that reduces the complex overlapping clustering problem to a simpler and more amenable non-overlapping (partitioning) problem. With it, we can scale community detection to graphs with tens of billions of edges and outperform previous solutions based on ego-net analysis. More precisely, our framework works in two steps: a local ego-net analysis phase and a global graph partitioning phase. In the local step, we first partition each node's ego-net using a non-overlapping partitioning algorithm. We then use the computed clusters to split each node into its persona nodes, which represent the instantiations of the node in its communities. Finally, in the global step, we partition the newly created persona graph; mapping each persona back to its original node yields an overlapping clustering of the original graph.
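The two steps can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the abstract: connected components stand in for the partitioning algorithm in both the local and the global step, the ego-net of a node excludes the node itself, the graph is undirected and unweighted, and the names `ego_splitting`, `connected_components`, and `persona_of` are hypothetical. It is meant to show the persona construction, not to reproduce the implementation evaluated in the paper.

```python
from collections import defaultdict


def connected_components(nodes, adj):
    """Partition `nodes` into the connected components of the subgraph they induce."""
    nodes = set(nodes)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        component, stack = set(), [start]
        while stack:
            x = stack.pop()
            component.add(x)
            for y in adj[x]:
                if y in nodes and y not in seen:
                    seen.add(y)
                    stack.append(y)
        components.append(component)
    return components


def ego_splitting(edges, partition=connected_components):
    """Return overlapping clusters of an undirected graph given as an edge list.

    `partition` is an assumed stand-in for any non-overlapping clustering
    routine with the signature partition(nodes, adjacency) -> list of sets.
    """
    edges = [(u, v) for u, v in edges if u != v]  # ignore self-loops
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Local step: partition each ego-net (the neighbors of u, without u).
    # Each cluster becomes one persona (u, i) of node u.
    persona_of = {}  # (u, v) -> persona of u that owns the edge towards v
    for u in list(adj):
        for i, cluster in enumerate(partition(adj[u], adj)):
            for v in cluster:
                persona_of[(u, v)] = (u, i)

    # Persona graph: every original edge connects exactly two personas.
    persona_adj = defaultdict(set)
    for u, v in edges:
        a, b = persona_of[(u, v)], persona_of[(v, u)]
        persona_adj[a].add(b)
        persona_adj[b].add(a)

    # Global step: partition the persona graph, then map personas back to nodes.
    clusters = partition(list(persona_adj), persona_adj)
    return [{node for node, _ in cluster} for cluster in clusters]


# Two triangles that share node 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4)]
print(sorted(sorted(c) for c in ego_splitting(edges)))
# → [[0, 1, 2], [2, 3, 4]]
```

In this toy graph, node 2 has an ego-net with two components, {0, 1} and {3, 4}, so it is split into two personas and ends up in both clusters. Every other node has a single persona. On real networks, connected components would usually be too coarse, and a stronger partitioning routine could be passed through the `partition` argument.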