Deep Community Detection

A deep community in a graph is a connected component that can only be seen after removal of nodes or edges from the rest of the graph. This paper formulates the problem of detecting deep communities as multi-stage node removal that maximizes a new centrality measure, called the local Fiedler vector centrality (LFVC), at each stage. The LFVC is associated with the sensitivity of algebraic connectivity to node or edge removals. We prove that a greedy node/edge removal strategy, based on successive maximization of LFVC, has bounded performance loss relative to the optimal, but intractable, combinatorial batch removal strategy. Under a stochastic block model framework, we show that the greedy LFVC strategy can extract deep communities with probability one as the number of observations becomes large. We apply the greedy LFVC strategy to real-world social network datasets. Compared with conventional community detection methods, the greedy LFVC strategy shows improved ability to identify important communities and key members in the network.
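The greedy strategy described above can be illustrated with a minimal Python sketch. The abstract does not state the LFVC formula, so the sketch assumes the node score is the sum, over a node's neighbors, of squared differences of Fiedler vector entries, and that each stage operates on the largest remaining connected component. The function names (`fiedler_vector`, `node_lfvc`, `greedy_lfvc_removal`) and the toy graph are hypothetical and are not taken from the paper.

```python
import numpy as np

def fiedler_vector(adj):
    """Eigenvector of the graph Laplacian for its second-smallest eigenvalue."""
    lap = np.diag(adj.sum(axis=1)) - adj
    _, vecs = np.linalg.eigh(lap)  # eigenvalues are returned in ascending order
    return vecs[:, 1]

def node_lfvc(adj):
    """Assumed node score: sum over neighbors j of (y_i - y_j)^2."""
    y = fiedler_vector(adj)
    diff = y[:, None] - y[None, :]
    return (adj * diff ** 2).sum(axis=1)

def connected_components(adj, nodes):
    """Connected components of the subgraph induced by `nodes`."""
    unseen = set(nodes)
    comps = []
    while unseen:
        stack = [unseen.pop()]
        comp = []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in list(unseen):
                if adj[u, v] > 0:
                    unseen.remove(v)
                    stack.append(v)
        comps.append(sorted(comp))
    return comps

def greedy_lfvc_removal(adj, num_removals):
    """Remove one node per stage: the highest-scoring node of the largest
    remaining component (an assumption of this sketch)."""
    adj = np.array(adj, dtype=float)
    alive = list(range(adj.shape[0]))
    removed = []
    for _ in range(num_removals):
        largest = max(connected_components(adj, alive), key=len)
        if len(largest) < 3:
            break  # nothing meaningful left to split
        scores = node_lfvc(adj[np.ix_(largest, largest)])
        target = largest[int(np.argmax(scores))]
        removed.append(target)
        adj[target, :] = 0.0
        adj[:, target] = 0.0
        alive.remove(target)
    return removed, connected_components(adj, alive)

# Toy graph: two triangles (0,1,2) and (4,5,6) joined through node 3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
demo_adj = np.zeros((7, 7))
for i, j in edges:
    demo_adj[i, j] = demo_adj[j, i] = 1.0

removed, communities = greedy_lfvc_removal(demo_adj, num_removals=1)
```

On this toy graph the single removal stage picks the bridging node 3, after which the two triangles appear as separate connected components. A dense eigendecomposition is used only for brevity; a large sparse network would call for a sparse eigensolver instead.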
