An Inner-Outer Iteration for Computing PageRank

We present a new iterative scheme for PageRank computation. The algorithm is applied to the linear system formulation of the problem, using inner-outer stationary iterations. It is simple, easy to implement and parallelize, and requires minimal storage overhead. Our convergence analysis shows that the algorithm remains effective with a crude inner tolerance and is not sensitive to the choice of the parameters involved. The same idea can be used as a preconditioning technique for nonstationary schemes. Numerical examples featuring matrices of dimensions exceeding 100,000,000, in sequential and parallel environments, demonstrate the merits of our technique. Our code is available online for viewing and testing, along with several large-scale examples.
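To make the inner-outer idea concrete, the following is a minimal sketch, not the authors' released code. It assumes the linear system takes the form (I - alpha*P) x = (1 - alpha) v with a column-stochastic matrix P, and that each outer step solves (I - beta*P) y = (alpha - beta) P x + (1 - alpha) v approximately by a Richardson inner iteration with beta < alpha. The function names, the dense list-of-rows storage, and the default values of `alpha`, `beta`, and the tolerances are illustrative assumptions.

```python
def matvec(P, x):
    """Return P @ x for a dense matrix stored as a list of rows."""
    return [sum(p_ij * x_j for p_ij, x_j in zip(row, x)) for row in P]


def inner_outer_pagerank(P, v, alpha=0.85, beta=0.5, outer_tol=1e-10,
                         inner_tol=1e-2, max_outer=1000, max_inner=1000):
    """Approximate the solution of (I - alpha*P) x = (1 - alpha) * v.

    P: column-stochastic matrix (list of rows); v: probability vector.
    beta (< alpha) and the crude inner_tol are illustrative choices.
    """
    n = len(v)
    x = list(v)
    for _ in range(max_outer):
        Px = matvec(P, x)
        # 1-norm residual of the original (alpha) system.
        residual = sum(abs(alpha * Px[i] + (1 - alpha) * v[i] - x[i])
                       for i in range(n))
        if residual < outer_tol:
            break
        # Right-hand side of the outer system (I - beta*P) y = f.
        f = [(alpha - beta) * Px[i] + (1 - alpha) * v[i] for i in range(n)]
        # Inner Richardson iteration y <- beta*P*y + f, started from x.
        y = x
        for _ in range(max_inner):
            Py = matvec(P, y)
            y_next = [beta * Py[i] + f[i] for i in range(n)]
            # y_next - y equals the residual of the inner system at y.
            inner_residual = sum(abs(y_next[i] - y[i]) for i in range(n))
            y = y_next
            if inner_residual < inner_tol:
                break
        x = y
    return x


# Tiny 3-node example; every column of P sums to 1.
P = [[0.0, 0.5, 1.0 / 3.0],
     [0.5, 0.0, 1.0 / 3.0],
     [0.5, 0.5, 1.0 / 3.0]]
v = [1.0 / 3.0] * 3
x = inner_outer_pagerank(P, v)
```

In this sketch the first inner step from `y = x` coincides with one power-iteration step, so once the outer residual drops below `inner_tol` the loop behaves like the plain power method. Only a few extra vectors (`f`, `y`, and one product) are needed beyond those of the power method, which is in line with the minimal storage overhead mentioned above.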
