Graph model selection using maximum likelihood

In recent years, theoretical graph models such as preferential attachment and small-world models have proliferated, motivated by real-world graphs such as the Internet topology. To address the natural question of which model best fits a particular data set, we propose a model selection criterion for graph models. Since each model is in fact a probability distribution over graphs, we suggest using maximum likelihood (ML) to compare graph models and to select their parameters. For graph models, however, computing likelihoods is a difficult algorithmic task. We design and implement MCMC algorithms for computing the maximum likelihood for four popular models: a power-law random graph model, a preferential attachment model, a small-world model, and a uniform random graph model. We hope that this novel use of ML will make comparisons between graph models more objective.
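As a minimal illustration of the selection principle only (not the MCMC algorithms described above), consider the simplest case under a stated assumption: an Erdős–Rényi G(n, p) model on labeled graphs, where every possible edge appears independently with probability p. Whether this matches the paper's uniform random graph model is an assumption. In this setting the likelihood of an observed graph has a closed form and is maximized at p = m / C(n, 2), where m is the number of observed edges. The function names below are hypothetical.

```python
import math


def gnp_log_likelihood(n, m, p):
    """Log-probability of one specific labeled graph with n vertices and
    m edges under G(n, p) (assumed model, for illustration only)."""
    pairs = n * (n - 1) // 2  # number of possible edges
    if p == 0.0:
        return 0.0 if m == 0 else float("-inf")
    if p == 1.0:
        return 0.0 if m == pairs else float("-inf")
    # m edges are present, (pairs - m) are absent, all independent
    return m * math.log(p) + (pairs - m) * math.log(1.0 - p)


def gnp_mle(n, m):
    """Maximum-likelihood estimate of p: the observed edge density."""
    pairs = n * (n - 1) // 2
    return m / pairs


if __name__ == "__main__":
    n, m = 4, 3  # toy graph: 4 vertices, 3 edges
    p_hat = gnp_mle(n, m)
    print(p_hat, gnp_log_likelihood(n, m, p_hat))
```

Two candidate models would then be compared by their maximized log-likelihoods on the same observed graph, with the higher value indicating the better fit. For models without such a closed form, the likelihood has to be estimated, which is where the MCMC algorithms come in.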
