A Tensor Spectral Approach to Learning Mixed Membership Community Models

Detecting hidden communities from observed interactions is a classical problem. Theoretical analysis of community detection has so far been mostly limited to models with non-overlapping communities such as the stochastic block model. In this paper, we provide guaranteed community detection for a family of probabilistic network models with overlapping communities, termed as the mixed membership Dirichlet model, first introduced in Airoldi et al. (2008). This model allows for nodes to have fractional memberships in multiple communities and assumes that the community memberships are drawn from a Dirichlet distribution. Moreover, it contains the stochastic block model as a special case. We propose a unified approach to learning communities in these models via a tensor spectral decomposition approach. Our estimator uses low-order moment tensor of the observed network, consisting of 3-star counts. Our learning method is based on simple linear algebraic operations such as singular value decomposition and tensor power iterations. We provide guaranteed recovery of community memberships and model parameters, and present a careful finite sample analysis of our learning method. Additionally, our results match the best known scaling requirements for the special case of the (homogeneous) stochastic block model.

[1]  Michael J. Freedman,et al.  Scalable Inference of Overlapping Communities , 2012, NIPS.

[2]  Anima Anandkumar,et al.  Two SVDs Suffice: Spectral decompositions for probabilistic topic modeling and latent Dirichlet allocation , 2012, NIPS 2012.

[3]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[4]  S. Boorman,et al.  Social Structure from Multiple Networks. I. Blockmodels of Roles and Positions , 1976, American Journal of Sociology.

[5]  Fan Chung Graham,et al.  Spectral Clustering of Graphs with General Degrees in the Extended Planted Partition Model , 2012, COLT.

[6]  E. Xing,et al.  A state-space mixed membership blockmodel for dynamic network tomography , 2008, 0901.0135.

[7]  Yuchung J. Wang,et al.  Stochastic Blockmodels for Directed Graphs , 1987 .

[8]  Edoardo M. Airoldi,et al.  Mixed Membership Stochastic Blockmodels , 2007, NIPS.

[9]  Ittai Abraham,et al.  Low-Distortion Inference of Latent Similarities from a Multiplex Social Network , 2012, SIAM J. Comput..

[10]  Joel A. Tropp,et al.  User-Friendly Tail Bounds for Sums of Random Matrices , 2010, Found. Comput. Math..

[11]  Sanjeev Arora,et al.  Finding overlapping communities in social networks: toward a rigorous approach , 2011, EC '12.

[12]  G. Sapiro,et al.  A collaborative framework for 3D alignment and classification of heterogeneous subvolumes in cryo-electron tomography. , 2013, Journal of structural biology.

[13]  M. Jackson,et al.  An Economic Model of Friendship: Homophily, Minorities and Segregation , 2007 .

[14]  K. Pearson Contributions to the Mathematical Theory of Evolution , 1894 .

[15]  Anima Anandkumar,et al.  A Method of Moments for Mixture Models and Hidden Markov Models , 2012, COLT.

[16]  Anima Anandkumar,et al.  Tensor decompositions for learning latent variable models , 2012, J. Mach. Learn. Res..

[17]  P. Lazarsfeld,et al.  Friendship as Social process: a substantive and methodological analysis , 1964 .

[18]  Kathryn B. Laskey,et al.  Stochastic blockmodels: First steps , 1983 .

[19]  Tamara G. Kolda,et al.  Shifted Power Method for Computing Tensor Eigenpairs , 2010, SIAM J. Matrix Anal. Appl..

[20]  Morroe Berger,et al.  Freedom and control in modern society , 1954 .

[21]  P. Bickel,et al.  The method of moments and degree distributions for network models , 2011, 1202.5101.

[22]  T. Vicsek,et al.  Uncovering the overlapping community structure of complex networks in nature and society , 2005, Nature.

[23]  T. Snijders,et al.  Estimation and Prediction for Stochastic Blockmodels for Graphs with Latent Block Structure , 1997 .

[24]  Christopher J. Hillar,et al.  Most Tensor Problems Are NP-Hard , 2009, JACM.

[25]  M. M. Meyer,et al.  Statistical Analysis of Multiple Sociometric Relations. , 1985 .

[26]  Frank McSherry,et al.  Spectral partitioning of random graphs , 2001, Proceedings 2001 IEEE International Conference on Cluster Computing.

[27]  Matus Telgarsky Dirichlet draws are sparse with high probability , 2013, ArXiv.

[28]  Umesh V. Vazirani,et al.  An Introduction to Computational Learning Theory , 1994 .

[29]  M. McPherson,et al.  Birds of a Feather: Homophily in Social Networks , 2001 .

[30]  Tamara G. Kolda,et al.  Tensor Decompositions and Applications , 2009, SIAM Rev..

[31]  Sujay Sanghavi,et al.  Clustering Sparse Graphs , 2012, NIPS.

[32]  Mark Braverman,et al.  I Like Her more than You: Self-determined Communities , 2012, ArXiv.

[33]  Yi Ma,et al.  The Augmented Lagrange Multiplier Method for Exact Recovery of Corrupted Low-Rank Matrices , 2010, Journal of structural biology.