Configuring Random Graph Models with Fixed Degree Sequences

Random graph null models are widely used across research communities that analyze network datasets, including social, information, and economic networks, as well as food webs, protein-protein interactions, and neuronal networks. Configuration models, the most popular family of random graph null models, are defined as uniform distributions over a space of graphs with a fixed degree sequence. Commonly, properties of an empirical network are compared to properties of an ensemble of graphs from a configuration model in order to quantify whether the empirical properties are meaningful or are instead a common consequence of the particular degree sequence. In this work we study the subtle but important decisions underlying the specification of a configuration model, and investigate the role these choices play in graph sampling procedures and in a suite of applications. We place particular emphasis on specifying the appropriate graph labeling (stub-labeled or vertex-labeled) under which to consider a null model, a choice that closely connects the study of random graphs to the study of random contingency tables. We show that the choice of graph labeling is inconsequential for studies of simple graphs, but can have a significant impact on analyses of multigraphs or graphs with self-loops. We demonstrate the importance of these choices through a series of three vignettes, analyzing network datasets under many different configuration models and observing substantial differences in study conclusions under different models. We argue that in each case, only one of the possible configuration models is appropriate. While our work focuses on undirected static networks, it aims to guide the study of directed networks, dynamic networks, and all other network contexts that are suitably studied through the lens of random graph null models.
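To make the labeling distinction concrete, the following minimal sketch enumerates every stub matching for a tiny degree sequence and counts how many matchings produce each vertex-labeled multigraph. It is an illustration under stated assumptions rather than the sampling procedure studied in this work: the helper names `stub_matchings` and `count_graphs` are hypothetical, and brute-force enumeration is only feasible for very small degree sequences.

```python
from collections import Counter


def stub_matchings(stubs):
    """Yield every perfect matching of a list of stubs (half-edges).

    Stubs are distinguished by their position in the list, so two stubs
    attached to the same vertex are still treated as distinct objects.
    """
    if not stubs:
        yield []
        return
    first, rest = stubs[0], stubs[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for matching in stub_matchings(remaining):
            yield [(first, partner)] + matching


def count_graphs(degrees):
    """Count the stub matchings that yield each vertex-labeled multigraph.

    `degrees[v]` is the degree of vertex v. Each graph is stored as a
    sorted tuple of edges, so matchings that differ only in which stubs
    were paired collapse onto the same key.
    """
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    counts = Counter()
    for matching in stub_matchings(stubs):
        graph = tuple(sorted(tuple(sorted(edge)) for edge in matching))
        counts[graph] += 1
    return counts


# Two vertices of degree 2: self-loops and multiedges are allowed.
counts = count_graphs([2, 2])
total = sum(counts.values())
for graph, n in sorted(counts.items()):
    print(graph, f"stub-labeled: {n}/{total}", f"vertex-labeled: 1/{len(counts)}")
```

For the degree sequence (2, 2) there are three stub matchings but only two distinct vertex-labeled multigraphs: the pair of self-loops arises from one matching and the double edge from two. A uniform distribution over stub matchings therefore assigns them probabilities 1/3 and 2/3, while a uniform distribution over vertex-labeled graphs assigns 1/2 to each. For a degree sequence such as (1, 1, 1, 1), where every realization is a simple graph, each graph arises from the same number of matchings and the two distributions coincide.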
