Quadratic Program-Based Modularity Maximization for Fuzzy Community Detection in Social Networks

One of the most important elements of social network analysis is community detection, i.e., finding groups of similar people based on their traits. In this paper, we present the fuzzy modularity maximization (FMM) approach for community detection, which finds overlapping (that is, fuzzy) communities, where appropriate, by maximizing a generalized form of Newman's modularity. The first proposed FMM solution uses a tree-based structure to find a globally optimal solution, while the second uses alternating optimization to search efficiently for a locally optimal solution. Both approaches are built on a proposed algorithm called one-step modularity maximization (OSMM), which computes the optimal cluster memberships for one person in the social network. We prove that OSMM can be formulated as a simplified quadratic knapsack optimization problem, which can be solved in O(n) time. We then propose a tree-based algorithm, called FMM/Find Best Leaf Node (FMM/FBLN), which represents sequences of OSMM steps in a tree-based structure. We prove that FMM/FBLN finds globally optimal solutions for FMM; however, its time complexity is O(n^d), d ≥ 2, which makes it impractical for most real-world networks. To combat this inefficiency, we propose five heuristic-based alternating optimization schemes, FMM/H1-H5, all of which are shown to have O(n^2) time complexity. We compare the results of the FMM/H solutions with those of two state-of-the-art community detection algorithms, MULTICUT spectral FCM (MSFCM) and GALS, and with those of two fuzzy community detection algorithms, GA and the vertex-similarity based gradient-descent method (VSGD), on ten real-world datasets. We conclude that one of the five heuristic algorithms (FMM/H2) is very competitive with GALS and much more effective than MSFCM, GA, and VSGD. Furthermore, all of the FMM/H schemes are at least two orders of magnitude faster than GALS in run time. Finally, FMM/H, unlike GALS (which only produces crisp partitions) and MSFCM (which always finds fuzzy partitions), is the only fuzzy community detection algorithm to date that can find the max-modularity partition, whether fuzzy or crisp.
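
To make the objective concrete, the following is a minimal Python sketch of one commonly used soft generalization of Newman's modularity, Q = tr(U B U^T) / (2m), where B = A - k k^T / (2m) is the modularity matrix, k is the degree vector, and U holds the cluster memberships (one row per cluster, each node's memberships summing to 1). The abstract does not state the formula, so this form, the function name `generalized_modularity`, and the toy graph are assumptions made for illustration only; the sketch evaluates a given partition and is not the OSMM or FMM procedure.

```python
def generalized_modularity(adj, memberships):
    """Soft modularity tr(U B U^T) / (2m) for an undirected graph.

    adj         -- symmetric n x n adjacency matrix (list of lists)
    memberships -- c x n matrix U; column j holds node j's memberships,
                   assumed non-negative and summing to 1 over the c clusters
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)  # sum of degrees = 2m for an undirected graph
    q = 0.0
    for u in memberships:  # one quadratic form u^T B u per cluster
        for i in range(n):
            for j in range(n):
                b_ij = adj[i][j] - degrees[i] * degrees[j] / two_m
                q += u[i] * b_ij * u[j]
    return q / two_m


def two_triangles():
    """Hypothetical toy graph: two triangles joined by the edge (2, 3)."""
    edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
    adj = [[0] * 6 for _ in range(6)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj


ADJ = two_triangles()
# Crisp partition: each triangle is its own community.
CRISP = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
# Fuzzy partition: the two bridge nodes are shared equally.
FUZZY = [[1, 1, 0.5, 0.5, 0, 0], [0, 0, 0.5, 0.5, 1, 1]]

print("crisp:", generalized_modularity(ADJ, CRISP))
print("fuzzy:", generalized_modularity(ADJ, FUZZY))
```

With a crisp (0/1) membership matrix this expression reduces to Newman's original modularity. On this toy graph the crisp split scores 5/14 and the fuzzy split scores 1/7, so under this assumed form the maximizer is crisp here, which is consistent with the abstract's point that fuzzy communities should be reported only where appropriate.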

[1]  Tam'as Vicsek,et al.  Modularity measure of networks with overlapping communities , 2009, 0910.5072.

[2]  Pablo M. Gleiser,et al.  Community Structure in Jazz , 2003, Adv. Complex Syst..

[3]  Boleslaw K. Szymanski,et al.  Overlapping community detection in networks: The state-of-the-art and comparative study , 2011, CSUR.

[4]  Jian Liu,et al.  Fuzzy modularity and fuzzy community structure in networks , 2010 .

[5]  M. Newman,et al.  Finding community structure in networks using the eigenvectors of matrices. , 2006, Physical review. E, Statistical, nonlinear, and soft matter physics.

[6]  R. Lambiotte,et al.  Line graphs, link partitions, and overlapping communities. , 2009, Physical review. E, Statistical, nonlinear, and soft matter physics.

[7]  Shinji Mizuno,et al.  An $$O(\sqrt n L)$$ iteration potential reduction algorithm for linear complementarity problems , 1991, Math. Program..

[8]  Pan Hui,et al.  Handbook of Optimization in Complex Networks , 2012 .

[9]  A. Arenas,et al.  Community detection in complex networks using extremal optimization. , 2005, Physical review. E, Statistical, nonlinear, and soft matter physics.

[10]  Bin Wu,et al.  A New Genetic Algorithm for Community Detection , 2009, Complex.

[11]  T. Nepusz,et al.  Fuzzy communities and the concept of bridgeness in complex networks. , 2007, Physical review. E, Statistical, nonlinear, and soft matter physics.

[12]  Shihua Zhang,et al.  Identification of overlapping community structure in complex networks using fuzzy c-means clustering , 2007 .

[13]  Steve Gregory,et al.  Fuzzy overlapping communities in networks , 2010, ArXiv.

[14]  V. Carchiolo,et al.  Extending the definition of modularity to directed graphs with overlapping communities , 2008, 0801.1647.

[15]  Donald E. Knuth,et al.  The art of computer programming. Vol.2: Seminumerical algorithms , 1981 .

[16]  R. A. Fisher,et al.  Statistical Tables for Biological, Agricultural and Medical Research , 1956 .

[17]  Jean-Loup Guillaume,et al.  Fast unfolding of communities in large networks , 2008, 0803.0476.

[18]  Aaas News,et al.  Book Reviews , 1893, Buffalo Medical and Surgical Journal.

[19]  Judd Harrison Michael,et al.  Modeling the communication network in a sawmill , 1997 .

[20]  Y. Ye,et al.  Algorithms for the solution of quadratic knapsack problems , 1991 .

[21]  Clara Pizzuti,et al.  Community detection in social networks with genetic algorithms , 2008, GECCO '08.

[22]  L. Tippett Statistical Tables: For Biological, Agricultural and Medical Research , 1954 .

[23]  M E J Newman,et al.  Community structure in social and biological networks , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[24]  W. Zachary,et al.  An Information Flow Model for Conflict and Fission in Small Groups , 1977, Journal of Anthropological Research.

[25]  Marimuthu Palaniswami,et al.  A Soft Modularity Function For Detecting Fuzzy Communities in Social Networks , 2013, IEEE Transactions on Fuzzy Systems.

[26]  Donald Goldfarb,et al.  An O(n3L) primal interior point algorithm for convex quadratic programming , 1991, Math. Program..

[27]  Haluk Bingol,et al.  Community Detection in Complex Networks Using Genetic Algorithms , 2006, 0711.0491.

[28]  Donald E. Knuth,et al.  The Stanford GraphBase - a platform for combinatorial computing , 1993 .

[29]  P. Brucker Review of recent development: An O( n) algorithm for quadratic knapsack problems , 1984 .

[30]  David Thomas,et al.  The Art in Computer Programming , 2001 .

[31]  Panos M. Pardalos,et al.  An algorithm for a singly constrained class of quadratic programs subject to upper and lower bounds , 1990, Math. Program..

[32]  M E J Newman,et al.  Finding and evaluating community structure in networks. , 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.

[33]  A Díaz-Guilera,et al.  Self-similar community structure in a network of human interactions. , 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.

[34]  Dayou Liu,et al.  Genetic Algorithm with Local Search for Community Mining in Complex Networks , 2010, 2010 22nd IEEE International Conference on Tools with Artificial Intelligence.

[35]  Panos M. Pardalos,et al.  Handbook of Optimization in Complex Networks , 2012 .

[36]  U. Brandes,et al.  Maximizing Modularity is hard , 2006, physics/0608255.

[37]  Mauricio G. C. Resende,et al.  A Polynomial-Time Primal-Dual Affine Scaling Algorithm for Linear and Convex Quadratic Programming and Its Power Series Extension , 1990, Math. Oper. Res..

[38]  J. Wishart Statistical tables , 2018, Global Education Monitoring Report.

[39]  Valdis E. Krebs Proxy Networks --Analyzing One Network To Reveal Another , 2003 .

[40]  L. Khachiyan,et al.  The polynomial solvability of convex quadratic programming , 1980 .

[41]  J. Bezdek,et al.  VAT: a tool for visual assessment of (cluster) tendency , 2002, Proceedings of the 2002 International Joint Conference on Neural Networks. IJCNN'02 (Cat. No.02CH37290).

[42]  Santo Fortunato,et al.  Community detection in graphs , 2009, ArXiv.

[43]  James M. Keller,et al.  Fuzzy Models and Algorithms for Pattern Recognition and Image Processing , 1999 .

[44]  Timothy C. Havens,et al.  A Generalized Fuzzy T-norm Formulation of Fuzzy Modularity for Community Detection in Social Networks , 2013, WCSC.

[45]  D. Lusseau,et al.  The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations , 2003, Behavioral Ecology and Sociobiology.