A scalable null model for directed graphs matching all degree distributions: In, out, and reciprocal

Degree distributions are arguably the most important property of real-world networks. The classic edge configuration model or the Chung-Lu model can generate an undirected graph with any desired degree distribution, which makes either one a good null model for comparing algorithms or performing experimental studies. Scalable algorithms that implement these models exist, and they are invaluable in the study of graphs. However, real-world networks are often directed and contain a significant proportion of reciprocal edges. Two nodes that each point to the other (a reciprocal edge) are more strongly related than two nodes where only one points to the other (a one-way edge). Despite their importance, reciprocal edges have been disregarded by most directed graph models. We propose a null model for directed graphs, inspired by the Chung-Lu model, that matches the in-, out-, and reciprocal-degree distributions of real graphs. Our algorithm is scalable and requires O(m) random numbers to generate a graph with m edges. We perform a series of experiments on real datasets and compare our model with existing graph models.
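As a rough illustration of how a Chung-Lu-style generator can handle reciprocal and one-way edges separately while drawing O(m) random numbers, here is a minimal Python sketch. It is not the algorithm from the paper: the function name `sample_reciprocal_chung_lu`, the three per-node target lists (`d_rec`, `d_out`, `d_in`), and the rule of discarding self-loops and colliding edges are all assumptions made for this example.

```python
import random


def sample_reciprocal_chung_lu(d_rec, d_out, d_in, seed=None):
    """Hypothetical sketch of a Chung-Lu-style directed generator.

    d_rec[i]: target number of reciprocal (two-way) edges at node i
    d_out[i]: target number of one-way edges leaving node i
    d_in[i]:  target number of one-way edges entering node i

    Returns (reciprocal, one_way): a set of unordered pairs stored as
    (u, v) with u < v, and a set of directed pairs (source, target).
    """
    if not (len(d_rec) == len(d_out) == len(d_in)):
        raise ValueError("degree lists must have the same length")
    if sum(d_out) != sum(d_in):
        raise ValueError("one-way out- and in-degrees must have equal sums")

    rng = random.Random(seed)
    nodes = range(len(d_rec))
    m_rec = sum(d_rec) // 2  # each reciprocal edge uses two endpoint slots
    m_one = sum(d_out)       # each one-way edge uses one source, one target

    reciprocal = set()
    if m_rec > 0:
        # Draw 2 * m_rec endpoints, each chosen proportionally to d_rec.
        ends = rng.choices(nodes, weights=d_rec, k=2 * m_rec)
        for u, v in zip(ends[::2], ends[1::2]):
            if u != v:  # assumption: self-loops are simply discarded
                reciprocal.add((min(u, v), max(u, v)))

    one_way = set()
    if m_one > 0:
        # Sources are chosen proportionally to d_out, targets to d_in.
        sources = rng.choices(nodes, weights=d_out, k=m_one)
        targets = rng.choices(nodes, weights=d_in, k=m_one)
        for u, v in zip(sources, targets):
            if u == v:
                continue
            if (min(u, v), max(u, v)) in reciprocal:
                continue  # the pair is already joined in both directions
            if (v, u) in one_way:
                continue  # would silently create an extra reciprocal edge
            one_way.add((u, v))

    return reciprocal, one_way


rec, one = sample_reciprocal_chung_lu(
    d_rec=[2, 1, 1, 0], d_out=[1, 0, 2, 1], d_in=[1, 2, 0, 1], seed=0
)
print(sorted(rec), sorted(one))
```

The sketch draws 2 * m_rec + 2 * m_one random node indices in total, which is linear in the number of edges. Because self-loops and collisions are discarded rather than redrawn, the realized degrees only approximate the targets and tend to fall slightly below them.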
