In-Degree and PageRank: Why Do They Follow Similar Power Laws?

PageRank is a popularity measure designed by Google to rank Web pages. Experiments confirm that PageRank values obey a power law with the same exponent as In-Degree values. This paper presents a novel mathematical model that explains this phenomenon. The relation between PageRank and In-Degree is modeled through a stochastic equation, which is inspired by the original definition of PageRank, and is analogous to the well-known distributional identity for the busy period in the M/G/1 queue. Further, we employ the theory of regular variation and Tauberian theorems to prove analytically that the tail distributions of PageRank and In-Degree differ only by a multiplicative constant, for which we derive a closed-form expression. Our analytical results are in good agreement with experimental data.
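For orientation, the ingredients named in the abstract can be written out as follows. The first line is the standard PageRank definition with damping factor $c$, where $n$ is the number of pages and $d_j$ is the out-degree of page $j$ (the correction for pages without out-links is omitted). The second line is the classical busy-period identity for the M/G/1 queue, where $S$ is a service time, $N(S)$ is the number of arrivals during that service time, and the $B_i$ are independent copies of $B$. The last two lines are only a sketch of the kind of stochastic equation and tail relation described above, not necessarily the paper's exact formulation: the symbols $R$ (scaled PageRank, $R = n \cdot PR$), $N$ (in-degree), $R_j$ (independent copies of $R$), $D_j$ (out-degrees of the linking pages), and $K$ (the multiplicative constant) are assumptions introduced here for illustration.

```latex
\begin{align*}
% Standard PageRank definition (dangling-node correction omitted)
PR(i) &= c \sum_{j \to i} \frac{PR(j)}{d_j} + \frac{1-c}{n}, \\
% Classical M/G/1 busy-period identity
B &\stackrel{d}{=} S + \sum_{i=1}^{N(S)} B_i, \\
% Illustrative stochastic equation for scaled PageRank (assumed notation)
R &\stackrel{d}{=} c \sum_{j=1}^{N} \frac{R_j}{D_j} + (1-c), \\
% Tail equivalence stated in the abstract; K is a placeholder for the constant
P(R > x) &\sim K \, P(N > x) \quad \text{as } x \to \infty.
\end{align*}
```

The structural parallel is that, in both the busy-period identity and the PageRank equation, the quantity of interest equals a fixed contribution plus a random sum of independent copies of itself, which is what makes regular variation and Tauberian arguments applicable to the tail.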
