A Nonnegative Matrix Factorization Approach for Multiple Local Community Detection

Existing work on local community detection in social networks focuses on finding the single community to which a few seed members most likely belong. In this work, we address the much harder problem of multiple local community detection and propose a Nonnegative Matrix Factorization (NMF) algorithm that finds multiple local communities for a single seed chosen at random among nodes that belong to multiple ground-truth communities. The number of communities detected for the seed is determined automatically by the algorithm. We first apply Breadth-First Search to sample the input graph up to a number of levels that depends on the network density. We then apply NMF to the adjacency matrix of the sampled subgraph to estimate the number of communities and to cluster the nodes of the subgraph into communities. Our method differs from existing NMF-based community detection methods in that it does not use an "argmax" function to assign nodes to communities. We evaluate the method on real-world networks, where it achieves good accuracy, as measured by the F1 score, in comparison with the state-of-the-art local community detection algorithm.
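The pipeline outlined above (depth-limited BFS sampling, then NMF on the adjacency matrix of the sampled subgraph, then a non-argmax assignment) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the number of communities is estimated or which rule replaces argmax, so this sketch makes several assumptions: the number of communities `k` is passed in by hand, the factorization uses the standard Lee-Seung multiplicative updates for squared error, and `memberships` applies a hypothetical relative-threshold rule controlled by `ratio`. All function names and the toy graph are illustrative only.

```python
from collections import deque
import random


def bfs_sample(adj, seed, max_depth):
    """Return the nodes within max_depth hops of seed (BFS sampling)."""
    depth = {seed: 0}
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        if depth[u] == max_depth:
            continue  # do not expand beyond the depth limit
        for v in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    return set(depth)


def adjacency_matrix(adj, nodes):
    """Dense 0/1 adjacency matrix of the subgraph induced by nodes."""
    order = sorted(nodes)
    index = {n: i for i, n in enumerate(order)}
    A = [[0.0] * len(order) for _ in order]
    for u in order:
        for v in adj[u]:
            if v in index:
                A[index[u]][index[v]] = 1.0
    return order, A


def matmul(X, Y):
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def frobenius_error(A, W, H):
    """Squared Frobenius norm of A - W H."""
    R = matmul(W, H)
    return sum((a - r) ** 2 for ra, rr in zip(A, R) for a, r in zip(ra, rr))


def nmf(A, k, iters=200, eps=1e-9, rng_seed=0):
    """Factor A ~ W H with Lee-Seung multiplicative updates (squared error)."""
    rng = random.Random(rng_seed)
    n, m = len(A), len(A[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return W, H


def memberships(order, W, ratio=0.5):
    """Hypothetical non-argmax rule (an assumption, not the paper's rule):
    keep every community whose weight is at least ratio * the node's top weight."""
    result = {}
    for node, row in zip(order, W):
        top = max(row)
        result[node] = [c for c, w in enumerate(row) if w >= ratio * top]
    return result


# Toy graph: two triangles sharing seed node 0, plus a tail 4-5-6.
adj = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1], 3: [0, 4],
       4: [0, 3, 5], 5: [4, 6], 6: [5]}
sample = bfs_sample(adj, seed=0, max_depth=1)
order, A = adjacency_matrix(adj, sample)
W, H = nmf(A, k=2)
found = memberships(order, W)
print(sorted(sample), found[0])
```

Unlike an argmax assignment, a threshold rule of this kind lets a node such as the seed be reported in more than one community, which is the property the multiple-community setting needs. The specific threshold and the fixed `k` are placeholders for whatever the full paper specifies.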
