Constructing and sampling graphs with a prescribed joint degree distribution

One of the most influential recent results in network analysis is that many natural networks exhibit a power-law or log-normal degree distribution. This has inspired numerous generative models that match this property. However, more recent work has shown that although these generative models have the right degree distribution, they are not good models for real-life networks because they differ on other important metrics such as conductance. We believe this is, in part, because many of these real-world networks have very different joint degree distributions, where the joint degree distribution gives the probability that a randomly selected edge lies between nodes of degree k and l. Assortativity is a summary statistic derived from the joint degree distribution, and it has been previously noted that social networks tend to be assortative, while biological and technological networks tend to be disassortative. We suggest that understanding the relationship between network structure and the joint degree distribution of graphs is an interesting avenue for further research. Important tools for such studies are algorithms that can generate random instances of graphs with the same joint degree distribution. This is the main topic of this article, and we study the problem from both a theoretical and a practical perspective. We provide an algorithm for constructing simple graphs from a given joint degree distribution, and a Markov chain Monte Carlo method for sampling them. We also show that the state space of simple graphs with a fixed degree distribution is connected via endpoint switches. We empirically evaluate the mixing time of this Markov chain using experiments based on the autocorrelation of each edge. These experiments show that our Markov chain mixes quickly on the real graphs tested, allowing our techniques to be used in practice.
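To make the definition of the joint degree distribution concrete, the following is a minimal Python sketch that computes it from an edge list. It is an illustration only and not the construction or sampling algorithm of the article; the function name `joint_degree_distribution`, the edge-list input format, and the convention of storing each degree pair as an unordered pair (k, l) with k <= l are all assumptions made here. The input is assumed to be a simple undirected graph with no repeated edges.

```python
from collections import Counter


def joint_degree_distribution(edges):
    """Map each unordered degree pair (k, l), k <= l, to the fraction of
    edges that join a degree-k node to a degree-l node.

    Hypothetical helper for illustration; assumes a simple undirected
    graph given as a non-empty list of (u, v) pairs with no duplicates.
    """
    # First pass: compute the degree of every node.
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # Second pass: count edges by the degrees of their two endpoints.
    counts = Counter()
    for u, v in edges:
        k, l = sorted((degree[u], degree[v]))
        counts[(k, l)] += 1

    # Normalize by the number of edges to obtain probabilities.
    m = len(edges)
    return {pair: c / m for pair, c in counts.items()}


# Path graph 0-1-2-3: the end nodes have degree 1, the inner nodes degree 2.
edges = [(0, 1), (1, 2), (2, 3)]
jdd = joint_degree_distribution(edges)
```

On this path graph, two of the three edges join a degree-1 node to a degree-2 node and one edge joins two degree-2 nodes, so the sketch assigns 2/3 to the pair (1, 2) and 1/3 to the pair (2, 2).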
