Uncovering the overlapping community structure of complex networks in nature and society

Many complex systems in nature and society can be described as networks that capture the intricate web of connections among their constituent units. A key question is how to interpret the global organization of such networks as the coexistence of their structural subunits (communities), which correspond to more densely interconnected parts. Identifying these a priori unknown building blocks (such as functionally related proteins, industrial sectors and groups of people) is crucial to understanding the structural and functional properties of networks. Existing deterministic methods for large networks find separated communities, whereas most real networks are made of highly overlapping cohesive groups of nodes. Here we introduce an approach to analysing the main statistical features of the interwoven sets of overlapping communities, a step towards uncovering the modular structure of complex systems. After defining a set of new characteristic quantities for the statistics of communities, we apply an efficient technique for exploring overlapping communities on a large scale. We find that overlaps are significant, and that the distributions we introduce reveal universal features of networks. Our studies of collaboration, word-association and protein interaction graphs show that the web of communities has non-trivial correlations and specific scaling properties.
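
To make the idea of overlapping communities concrete, the following is a minimal sketch of one way such communities can be found: k-clique percolation, in which a community is the union of k-cliques that can be reached from one another through cliques sharing k-1 nodes. The abstract does not spell out its technique, so treating it as clique percolation is an assumption here. The function names `k_clique_communities` and `membership_counts` and the toy graph are hypothetical, and the brute-force clique search is only suitable for very small graphs.

```python
from itertools import combinations


def k_clique_communities(edges, k=3):
    """Return overlapping communities as sorted node lists.

    A community is the union of k-cliques connected through a chain of
    cliques in which neighbours share k-1 nodes. Illustrative sketch only:
    cliques are enumerated by brute force.
    """
    # Build an undirected adjacency map.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Enumerate every k-node subset whose nodes are all pairwise linked.
    nodes = sorted(adj)
    cliques = [
        frozenset(c)
        for c in combinations(nodes, k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]

    # Union-find over cliques: merge two cliques when they share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # Each union-find component yields one community (union of its cliques).
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())


def membership_counts(communities):
    """Count how many communities each node belongs to."""
    counts = {}
    for community in communities:
        for node in community:
            counts[node] = counts.get(node, 0) + 1
    return counts


# Toy graph (hypothetical): two fully connected groups sharing node "d".
toy_edges = list(combinations("abcd", 2)) + list(combinations("defg", 2))
communities = k_clique_communities(toy_edges, k=3)
counts = membership_counts(communities)
print(communities)
print(counts["d"])
```

In this toy graph the two groups are reported as separate communities even though they share node `d`, which therefore has a membership count of 2. A method that partitions nodes into disjoint sets would have to assign `d` to only one of them.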
