Distribution of PageRank Mass Among Principal Components of the Web

We study the PageRank mass of the principal components of a bow-tie Web graph as a function of the damping factor c. Using a singular perturbation approach, we show that the PageRank share of the IN and SCC components remains high even for very large values of the damping factor, even though it drops to zero as c → 1. However, a detailed study of the OUT component reveals the presence of "dead-ends" (small groups of pages linking only to each other) that receive an unfairly high ranking when c is close to one. We argue that this problem can be mitigated by choosing c as small as 1/2.
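
The dependence of component mass on c can be illustrated on a toy example. The following is a minimal sketch, not the paper's method: it runs plain power iteration with uniform teleportation on a hypothetical seven-page bow-tie graph (two IN pages, a three-page SCC, and a two-page "dead-end" in OUT) and prints the PageRank mass of each component for several values of c. The graph, the function names `pagerank` and `component_mass`, and the chosen values of c are all assumptions made for illustration.

```python
# Hypothetical toy bow-tie graph (not taken from the paper).
# Every page has at least one out-link, so no dangling-node handling is needed.
GRAPH = {
    0: [2],        # IN page, links into the SCC
    1: [2, 3],     # IN page, links into the SCC
    2: [3],        # SCC cycle 2 -> 3 -> 4 -> 2
    3: [4],
    4: [2, 5],     # SCC page that also links out to OUT
    5: [6],        # OUT "dead-end": pages 5 and 6 link only to each other
    6: [5],
}
IN, SCC, OUT = [0, 1], [2, 3, 4], [5, 6]


def pagerank(out_links, c, tol=1e-12, max_iter=100_000):
    """Power iteration for pi = c * pi * P + (1 - c) / n * 1 (uniform teleportation)."""
    n = len(out_links)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new = [(1.0 - c) / n] * n
        for u, targets in out_links.items():
            share = c * pi[u] / len(targets)
            for v in targets:
                new[v] += share
        # Stop once the L1 change between successive iterates is below tol.
        if sum(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


def component_mass(pi, nodes):
    """Total PageRank assigned to the given set of pages."""
    return sum(pi[v] for v in nodes)


for c in (0.5, 0.85, 0.99):
    pi = pagerank(GRAPH, c)
    print(
        f"c={c:.2f}  IN={component_mass(pi, IN):.4f}  "
        f"SCC={component_mass(pi, SCC):.4f}  OUT={component_mass(pi, OUT):.4f}"
    )
```

In this toy graph the IN pages have no in-links, so each receives exactly (1 - c)/n and the IN mass vanishes as c → 1, while the dead-end pair in OUT absorbs an increasing share of the mass.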
