Semi-supervised community detection based on non-negative matrix factorization with node popularity

Building on graph regularization, we present a semi-supervised, NMF-based model that exploits prior information about a network. By introducing node popularity parameters, we propose a refined model, PSSNMF, which is particularly suited to networks with large degree heterogeneity and unbalanced community structure. Introducing the prior information into the model together with node popularity is more effective than encoding it directly into the adjacency matrix.

Many studies have shown that community detection based on topological information alone often yields relatively low accuracy. Several approaches improve performance by using background information, but they ignore the effect of node degrees on the usefulness of that prior information. In this paper, we combine graph regularization with pairwise constraints to obtain a semi-supervised non-negative matrix factorization (SSNMF) model for community detection. To alleviate the influence of heterogeneous node degrees and community sizes, we then propose an improved model, PSSNMF, which introduces node popularity and so uses the prior information more effectively. Extensive experiments on both artificial and real-world networks show that the proposed method improves the accuracy of community detection, especially on networks with large degree heterogeneity and unbalanced community structure.
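
The abstract does not state the SSNMF objective, so the following is only a minimal sketch of the general idea of combining symmetric NMF with a graph-regularization term built from must-link pairwise constraints. Everything specific here is an assumption made for illustration, not the paper's formulation: the objective ||A - HH^T||_F^2 + lam * tr(H^T L H), the damped multiplicative update, and the names `ssnmf_sketch`, `must_link`, and `lam`. The node popularity parameters of PSSNMF are not modeled.

```python
import numpy as np

def ssnmf_sketch(A, must_link, k, lam=1.0, n_iter=500, seed=0):
    """Illustrative graph-regularized symmetric NMF (assumed objective).

    Minimizes ||A - H H^T||_F^2 + lam * tr(H^T L H) subject to H >= 0,
    where L = D - M is the Laplacian of the must-link constraint graph.
    """
    n = A.shape[0]
    # M[i, j] = 1 when nodes i and j are known to share a community.
    M = np.zeros((n, n))
    for i, j in must_link:
        M[i, j] = M[j, i] = 1.0
    D = np.diag(M.sum(axis=1))

    rng = np.random.default_rng(seed)
    H = rng.random((n, k)) + 1e-3  # strictly positive start
    eps = 1e-12                    # guards against division by zero
    for _ in range(n_iter):
        # Negative and positive parts of the gradient, divided by 4.
        num = A @ H + 0.5 * lam * (M @ H)
        den = H @ (H.T @ H) + 0.5 * lam * (D @ H) + eps
        # Damped multiplicative step; the factor is >= 0.5, so H stays positive.
        H *= 0.5 + 0.5 * num / den
    return H

# Toy network: two 4-node cliques joined by a single edge.
A = np.zeros((8, 8))
A[:4, :4] = 1.0
A[4:, 4:] = 1.0
np.fill_diagonal(A, 0.0)
A[3, 4] = A[4, 3] = 1.0

# Prior information: one known same-community pair in each clique.
H = ssnmf_sketch(A, must_link=[(0, 1), (5, 6)], k=2)
labels = H.argmax(axis=1)  # assign each node to its strongest factor
```

Reading each node's community from the largest entry in its row of `H` is one common convention for NMF-based detection. Setting `lam=0` reduces the sketch to plain symmetric NMF, which gives a baseline for judging how much the constraints contribute.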
