Spatial compactness meets topical consistency: jointly modeling links and content for community detection

In this paper, we address the problem of discovering communities in a social network that are both topically meaningful and compact (densely connected). Assuming the social network to be an integer-weighted graph (where a weight can intuitively be defined as the number of common friends, followers, documents exchanged, etc.), we transform the network into a more efficient representation in which each user is a bag of her one-hop neighbors. We propose a mixed-membership model that identifies compact communities using this transformation. Next, we augment the representation and the model to incorporate user-content information, which imposes topical consistency on the communities. In our model, a user can belong to multiple communities and a community can participate in multiple topics. This allows us to discover community memberships as well as community and user interests. Our method outperforms other well-known baselines on two real-world social networks. Finally, we provide a fast, parallel approximation of our method.
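To make the bag-of-neighbors transformation concrete, here is a minimal sketch in Python. It rests on assumptions that the abstract does not state: the graph is treated as undirected, and an integer edge weight is read as the number of times a neighbor is repeated in a user's bag. The function name `bags_of_neighbors` and the toy edge list are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def bags_of_neighbors(weighted_edges):
    """Turn an integer-weighted edge list into one bag of neighbors per user.

    Assumption (not from the paper): edges are undirected, and a weight w
    means the neighbor appears w times in the user's bag.
    """
    bags = defaultdict(Counter)
    for u, v, w in weighted_edges:
        if w <= 0:
            continue  # skip edges that carry no interaction
        bags[u][v] += w
        bags[v][u] += w
    return dict(bags)


# Hypothetical toy network: (user, user, integer weight).
edges = [("ann", "bob", 3), ("ann", "cat", 1), ("bob", "cat", 2)]
bags = bags_of_neighbors(edges)

for user in sorted(bags):
    # Each bag plays the role of a "document" whose "words" are neighbors.
    print(user, sorted(bags[user].elements()))
```

Under these assumptions each user resembles a document whose tokens are neighbor identifiers, which is the form of input a mixed-membership model would typically consume.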
