A Power–Arnoldi algorithm for computing PageRank

SUMMARY The PageRank algorithm plays an important role in modern search engine technology. It uses the classical power method to compute the principal eigenvector of the Google matrix representing the web link graph. However, when the largest eigenvalue is not well separated from the second one, the power method may converge slowly. This happens when the damping factor is close to 1. It is therefore worth developing techniques that are more sophisticated than the power method. The approach presented here, called Power–Arnoldi, periodically combines the power method with the thick restarted Arnoldi algorithm. The justification for this new approach is presented. Numerical tests illustrate the efficiency and convergence behaviour of the new algorithm. Copyright © 2007 John Wiley & Sons, Ltd.
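
The slowdown of the power method as the damping factor approaches 1 can be illustrated with a minimal sketch. This is plain power iteration only, not the Power–Arnoldi algorithm of the paper. The five-page graph `LINKS`, the function names, the uniform teleportation vector and the tolerance are all assumptions made for illustration. The graph has two closed cycles, so the second eigenvalue of the Google matrix has modulus equal to the damping factor `alpha`.

```python
# Hypothetical toy web graph: LINKS[j] lists the pages that page j links to.
# Pages 0-1 and pages 2-3 form two closed cycles; page 4 links into the first.
LINKS = [[1], [0], [3], [2], [0]]


def google_step(links, x, alpha):
    """Apply one power-method step with damping factor alpha.

    Assumes uniform teleportation and spreads the mass of dangling
    pages (pages without out-links) uniformly over all pages.
    """
    n = len(links)
    y = [0.0] * n
    dangling = 0.0
    for j, outs in enumerate(links):
        if outs:
            share = alpha * x[j] / len(outs)
            for i in outs:
                y[i] += share
        else:
            dangling += x[j]
    base = (alpha * dangling + (1.0 - alpha) * sum(x)) / n
    return [v + base for v in y]


def power_method(links, alpha, tol=1e-8, max_iter=100000):
    """Iterate until the 1-norm change between iterates drops below tol.

    Returns the approximate PageRank vector and the iteration count.
    """
    n = len(links)
    x = [1.0 / n] * n
    for k in range(1, max_iter + 1):
        y = google_step(links, x, alpha)
        total = sum(y)
        y = [v / total for v in y]  # keep the iterate a probability vector
        change = sum(abs(a - b) for a, b in zip(y, x))
        x = y
        if change < tol:
            return x, k
    return x, max_iter


if __name__ == "__main__":
    for alpha in (0.85, 0.99):
        _, iterations = power_method(LINKS, alpha)
        print(f"alpha = {alpha}: {iterations} iterations")
```

On this graph the error shrinks by roughly a factor of `alpha` per step, so the iteration count at `alpha = 0.99` is many times the count at `alpha = 0.85`. This is the regime in which the summary argues for a more sophisticated method.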
