Finding and Testing Network Communities by Lumped Markov Chains

Identifying communities (or clusters), namely groups of nodes with comparatively strong internal connectivity, is a fundamental task for understanding the structure and function of a network. Yet there is a lack of formal criteria for defining communities and for testing their significance. We propose a sharp definition based on a quality threshold. By means of a lumped Markov chain model of a random walker, a quality measure called “persistence probability” is associated with a cluster, which is then defined as an “α-community” if this probability is not smaller than α. Consistently, a partition composed of α-communities is an “α-partition.” These definitions turn out to be very effective for finding and testing communities. If a set of candidate partitions is available, setting the desired α-level allows one to immediately select the α-partition with the finest decomposition. Simultaneously, the persistence probabilities quantify the quality of each single community. Because it assesses each cluster individually, this approach can also disclose single well-defined communities even in networks that overall do not possess a definite clusterized structure.
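The definition above can be illustrated with a minimal sketch. It is not taken from the paper's own code: it assumes an undirected, connected network with no isolated nodes, given as a symmetric adjacency matrix. The function names (`persistence_probabilities`, `is_alpha_partition`) and the toy graph are hypothetical. The persistence probability of a cluster is read off as the diagonal entry of the lumped transition matrix, i.e. the probability that a stationary random walker currently in the cluster is still in it after one step.

```python
import numpy as np

def persistence_probabilities(A, labels):
    """Diagonal of the lumped random-walk chain, one value per cluster.

    Assumes A is a symmetric (undirected) adjacency matrix with no
    zero rows, and labels[i] is the cluster of node i.
    """
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    strength = A.sum(axis=1)               # node degrees / strengths
    P = A / strength[:, None]              # row-stochastic transition matrix
    pi = strength / strength.sum()         # stationary distribution (undirected case)
    clusters = np.unique(labels)
    H = (labels[:, None] == clusters[None, :]).astype(float)  # node-to-cluster indicator
    Pi = pi @ H                            # stationary mass of each cluster
    U = (H.T @ (pi[:, None] * P) @ H) / Pi[:, None]           # lumped chain
    return clusters, np.diag(U)

def is_alpha_partition(A, labels, alpha):
    """True if every cluster has persistence probability >= alpha."""
    _, u = persistence_probabilities(A, labels)
    return bool(np.all(u >= alpha))

# Toy graph (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
labels = [0, 0, 0, 1, 1, 1]

clusters, u = persistence_probabilities(A, labels)
print(dict(zip(clusters.tolist(), u.tolist())))
print(is_alpha_partition(A, labels, alpha=0.8))
```

In this toy case each triangle has 6 internal edge endpoints out of a total degree of 7, so both persistence probabilities equal 6/7. The partition is therefore an α-partition for α = 0.8 but not for α = 0.9. For directed networks the stationary distribution would have to be computed from the transition matrix instead of from the degrees.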
