Fuzzy clustering in community detection based on nonnegative matrix factorization with two novel evaluation criteria

Clustering, or community detection, is one of the most important problems in social network analysis. Because communities in such networks often overlap, fuzzy clustering is a suitable way to cluster them. In fuzzy clustering, what matters is not only whether each node is assigned to the correct clusters but also the degree of membership of each node in each cluster. In this paper, we introduce a new fuzzy clustering algorithm based on nonnegative matrix factorization (NMF). Unlike well-known fuzzy clustering techniques such as fuzzy c-means (FCM), the proposed method does not depend on any parameter. It also produces memberships that reflect the network structure, and so it can distinguish overlapping nodes from non-overlapping nodes well. To evaluate the validity of such fuzzy clustering algorithms, we propose two new evaluation criteria (SFEC and UFEC), which are built on the neighborhood structure of nodes and can assess the memberships themselves. Experimental results on several real-world networks and many artificial networks show the effectiveness and reliability of the proposed criteria.
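To make the general idea concrete, the following is a minimal sketch of how an NMF factor can be read as fuzzy community memberships. It is not the algorithm proposed in the paper, whose details the abstract does not give: it uses the standard multiplicative update rules for the Frobenius-norm objective, and the function name `nmf_memberships`, the number of communities `k`, the iteration count, the random seed, and the toy graph are all assumptions made for illustration.

```python
import numpy as np

def nmf_memberships(A, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate A by W @ H (all entries nonnegative) and return
    the row-normalized W as a fuzzy membership matrix.

    Hypothetical helper: generic multiplicative-update NMF,
    not the parameter-free method described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Strictly positive random start, so the updates stay nonnegative.
    W = rng.random((n, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Multiplicative updates for min ||A - W H||_F^2.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    # Normalize each row so a node's memberships sum to (about) 1.
    return W / (W.sum(axis=1, keepdims=True) + eps)

# Toy graph (assumed): two triangles {0, 1, 2} and {2, 3, 4} sharing node 2.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]
A = np.eye(5)  # self-loops keep every row of A nonzero
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

U = nmf_memberships(A, k=2)
print(np.round(U, 3))  # one row per node, one column per community
```

Each row of `U` gives one node's degrees of membership in the `k` communities. A node whose row spreads its weight across several columns is a candidate overlapping node, while a row dominated by one column suggests a non-overlapping node. Because the factorization starts from a random initialization, the exact values and the column order can vary with the seed.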
