Overlapping community detection with preference and locality information: a non-negative matrix factorization approach

Community detection plays an important role in understanding the structures and patterns of complex networks. In real-world networks, a node in most cases belongs to multiple communities, so communities overlap with one another. Matrix factorization (MF) is a popular technique for overlapping community detection. However, existing MF approaches use only the existence of a link and ignore the implicit preference information it carries. In this paper, we first propose a Preference-based Non-negative Matrix Factorization (PNMF) model that takes link preference information into account. Unlike traditional MF approaches based on value approximation, our model maximizes the likelihood of the preference order for each node. It thereby overcomes the indiscriminate penalty problem, in which the objective function penalizes non-linked pairs inside one community as heavily as those spanning two communities. Moreover, we propose a Locality-based Non-negative Matrix Factorization (LNMF) model that further incorporates the concept of locality and generalizes the preference system of PNMF. In particular, we define a subgraph called the “K-degree local network” to set a boundary between local non-neighbors and other non-neighbors, and we treat these two classes of non-neighbors explicitly in the objective function. Through experiments on various benchmark networks, we show that PNMF outperforms state-of-the-art baselines and that the generalized LNMF model further improves on PNMF on datasets with high locality.
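
The preference idea can be illustrated with a minimal sketch. It is not the formulation from the paper, which the abstract does not spell out. It assumes a pairwise logistic likelihood in the style of Bayesian Personalized Ranking, where a node u with non-negative membership vector F_u should score each neighbor i higher than each non-neighbor j. It also assumes that the “K-degree local network” of u consists of the nodes within K hops of u. The function names `preference_log_likelihood` and `split_non_neighbors` are hypothetical.

```python
import numpy as np
from collections import deque


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def preference_log_likelihood(adj, F):
    """Sum of log sigmoid(F_u.F_i - F_u.F_j) over all triples
    (u, neighbor i, non-neighbor j). Assumed pairwise form."""
    n = adj.shape[0]
    scores = F @ F.T  # scores[u, v] = F_u . F_v
    total = 0.0
    for u in range(n):
        neighbors = [i for i in range(n) if i != u and adj[u, i]]
        non_neighbors = [j for j in range(n) if j != u and not adj[u, j]]
        for i in neighbors:
            for j in non_neighbors:
                total += np.log(sigmoid(scores[u, i] - scores[u, j]))
    return total


def split_non_neighbors(adj, u, k):
    """Breadth-first search up to k hops from u. Returns (local, other):
    non-neighbors within k hops, and non-neighbors beyond k hops.
    The k-hop reading of "K-degree local network" is an assumption."""
    n = adj.shape[0]
    dist = {u: 0}
    queue = deque([u])
    while queue:
        v = queue.popleft()
        if dist[v] == k:
            continue  # do not expand past k hops
        for w in range(n):
            if adj[v, w] and w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    local = sorted(w for w, d in dist.items() if d >= 2)
    other = sorted(w for w in range(n) if w not in dist)
    return local, other


# Toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
adj = np.zeros((6, 6), dtype=int)
for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[a, b] = adj[b, a] = 1

# Memberships that match the triangles, and memberships that do not.
F_good = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
F_bad = np.array([[1, 0], [0, 1], [1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)

ll_good = preference_log_likelihood(adj, F_good)
ll_bad = preference_log_likelihood(adj, F_bad)
local, other = split_non_neighbors(adj, 0, 2)
```

On this toy graph the memberships that match the two triangles obtain the higher log-likelihood, and for node 0 with K = 2 the only local non-neighbor is node 3, while nodes 4 and 5 fall outside the local network. A locality-aware objective could then weight or order these two groups differently; how LNMF actually does so is not stated in the abstract.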
