Exploring community structure in biological networks with random graphs

Background: Community structure is ubiquitous in biological networks. Interest in unraveling the community structure of biological systems has grown because it may provide important insights into a system's functional components and into the impact of local structures on dynamics at a global scale. Choosing an appropriate community detection algorithm for an empirical network can be difficult, however, because the many available algorithms are based on a variety of cost functions and are hard to validate. Even when community structure is identified in an empirical system, disentangling its effect from that of other network properties, such as clustering coefficient and assortativity, can be a challenge.

Results: Here, we develop a generative model that produces undirected, simple, connected graphs with specified degrees and a specified pattern of communities, while keeping the graph structure as random as possible. We also demonstrate two important applications of the model: (a) generating networks that can be used to benchmark existing and new algorithms for detecting communities in biological networks; and (b) generating null models that serve as random controls when investigating whether complex network features in empirical biological networks are more than a byproduct of degree and modularity.

Conclusion: Our model allows for the systematic study of the presence of community structure and its impact on network function and dynamics. This is a crucial step in unraveling the functional consequences of the structural properties of biological systems and in uncovering the mechanisms that drive these systems.
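To make the idea of a null model that holds degrees and communities fixed more concrete, the following is a minimal Python sketch. It is not the generative model described above. It swaps in a standard degree-preserving edge-swap randomization and assumes the community assignment is given as a node-to-label dictionary. The function names (`rewire`, `is_connected`, `block_pairs`) and the toy network are hypothetical and used only for illustration. Each accepted swap keeps the graph simple and connected, leaves every node's degree unchanged, and leaves unchanged the number of edges within and between each pair of communities.

```python
import random
from collections import deque


def is_connected(adj):
    """Breadth-first search: True if every node is reachable from the first one."""
    nodes = list(adj)
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(nodes)


def block_pairs(community, e1, e2):
    """Sorted community-label pairs joined by two edges (order-independent)."""
    return sorted(tuple(sorted((community[u], community[v]))) for u, v in (e1, e2))


def rewire(edges, community, n_attempts=1000, seed=0):
    """Randomize a connected simple graph by double edge swaps (illustrative only).

    A proposed swap (a,b),(c,d) -> (a,d),(c,b) is rejected if it would create a
    self-loop or duplicate edge, change the community pairs joined by the two
    edges, or disconnect the graph.
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]  # work on a copy
    adj = {u: set() for u in community}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for _ in range(n_attempts):
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both orientations of the second edge
        if a == d or c == b:
            continue  # would create a self-loop
        if d in adj[a] or b in adj[c]:
            continue  # would create a duplicate edge
        if block_pairs(community, (a, b), (c, d)) != block_pairs(community, (a, d), (c, b)):
            continue  # would change within/between-community edge counts
        # Apply the swap tentatively.
        adj[a].remove(b); adj[b].remove(a)
        adj[c].remove(d); adj[d].remove(c)
        adj[a].add(d); adj[d].add(a)
        adj[c].add(b); adj[b].add(c)
        if is_connected(adj):
            edges[i], edges[j] = (a, d), (c, b)
        else:
            # Revert: the swap would disconnect the graph.
            adj[a].remove(d); adj[d].remove(a)
            adj[c].remove(b); adj[b].remove(c)
            adj[a].add(b); adj[b].add(a)
            adj[c].add(d); adj[d].add(c)
    return edges


# Toy network (hypothetical): two communities of four nodes, joined by two edges.
community = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1}
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2),
         (4, 5), (5, 6), (6, 7), (7, 4), (5, 7),
         (3, 4), (1, 6)]
rewired = rewire(edges, community, n_attempts=500, seed=1)
```

Comparing a statistic such as the clustering coefficient on the empirical network against its distribution over many such rewired copies is one way to check whether that statistic is more than a byproduct of degree and community structure. Re-checking connectivity after every swap is simple but slow, so a sketch like this is only practical for small networks.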
