Game theory based algorithms for community detection

Community detection is important because it helps in understanding how information spreads in a social network. Real complex networks have an inbuilt structure that captures and characterizes the dynamics between their nodes. Links are more likely to form between similar nodes, which gives rise to a community structure. The more friends two people have in common, the more influence each can exert on the other. People use their attributes to assess how similar others are to them; in turn, those attributes are influenced by the people they interact with and by the network structure. Hence, we assume that communities capture homophily, since people in the same community share many similar features. The contributions of this thesis are as follows:
• We propose NashDisjoint, an algorithm that detects disjoint communities in any given network.
• We evaluate NashDisjoint on the standard LFR benchmarks and find that it performs at least as well as state-of-the-art algorithms in all cases with a mixing factor below 0.55.
• On real social networks, we observe that the modularity values of the community structure detected by NashDisjoint are comparable to those of one of the best modularity optimization algorithms to date.
• We propose NashOverlap, an algorithm that detects overlapping communities in any given network.
• We evaluate NashOverlap on the standard LFR benchmarks and find that it performs far better than state-of-the-art algorithms in around 152 different scenarios, generated by varying the number of nodes, the mixing factor, and the overlapping membership.
• Both NashDisjoint and NashOverlap are modeled as a sequence of weighted potential games.
• We run NashOverlap on the DBLP dataset to detect the top collaboration groups. The top collaboration group is a giant component, and the second-largest collaboration group is much smaller. This suggests that the scientific collaboration network is highly connected and that more interdisciplinary work is possible in the future, which is a good sign for the development of science. The diameter of the largest collaboration group in both datasets is between 9 and 11, and its average path length is between 4 and 5, which strengthens this argument. In addition, the average number of intermediate researchers connecting any pair of researchers in the giant component is quite small.
• We compare these results on the DBLP collaboration network with those of the COPRA algorithm.
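The abstract reports modularity values for disjoint partitions without stating the formula. As a minimal sketch, assuming the standard Newman modularity for an undirected, unweighted graph, Q = Σ_c [ L_c / m − (d_c / 2m)² ], where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes, the score of a given partition could be computed as follows. The names `modularity`, `edges`, and `community_of` are illustrative only and are not taken from the thesis; this is not the NashDisjoint algorithm itself, only a way to score a partition it (or any other method) might return.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a disjoint partition.

    edges        -- list of (u, v) pairs of an undirected, unweighted graph
    community_of -- dict mapping each node to its community label
    (Both names are hypothetical; the thesis does not define this interface.)
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")

    internal = defaultdict(int)    # L_c: edges with both endpoints in c
    degree_sum = defaultdict(int)  # d_c: total degree of nodes in c

    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1

    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

q_two = modularity(edges, two_groups)  # 5/14, about 0.357
q_one = modularity(edges, one_group)   # 0.0
```

On this toy graph, splitting at the bridge scores higher than placing every node in a single community, which is the behaviour a modularity comparison between algorithms relies on.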
