Distributed PageRank computation based on iterative aggregation-disaggregation methods

PageRank is widely used as a major factor in search engine ranking systems. Computing it, however, requires global link graph information, so a distributed solution incurs prohibitive communication cost to reach accurate results. In this paper, we propose a distributed PageRank computation algorithm based on the iterative aggregation-disaggregation (IAD) method with Block Jacobi smoothing. The basic idea is divide-and-conquer: we treat each web site as a node in order to exploit the block structure of hyperlinks. Each node computes its local PageRank itself and then updates it through low-cost communication with a coordinator. We prove the global convergence of the Block Jacobi method and then analyze the communication overhead and major advantages of our algorithm. Experiments on three real web graphs show that our method converges 5-7 times faster than the traditional Power method. We believe our work provides an efficient and practical distributed solution for PageRank on large-scale Web graphs.
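To make the divide-and-conquer idea concrete, the following is a minimal sketch of a Block Jacobi iteration applied to the PageRank linear system (I - alpha P^T) x = (1 - alpha) v, with one block per web site. It is not the paper's full IAD algorithm: the aggregation step and the coordinator protocol are omitted, and everything runs in a single process on a dense matrix. The function names, the damping factor 0.85, the uniform teleportation vector, and the assumption that P is row-stochastic with no dangling pages are all illustrative choices, not details taken from the paper.

```python
import numpy as np


def pagerank_power(P, alpha=0.85, tol=1e-10, max_iter=1000):
    """Reference power iteration for a row-stochastic P (no dangling pages assumed)."""
    n = P.shape[0]
    v = np.full(n, 1.0 / n)  # uniform teleportation vector (assumption)
    x = v.copy()
    for it in range(1, max_iter + 1):
        x_new = alpha * (P.T @ x) + (1 - alpha) * v
        if np.abs(x_new - x).sum() < tol:
            return x_new, it
        x = x_new
    return x, max_iter


def pagerank_block_jacobi(P, blocks, alpha=0.85, tol=1e-10, max_iter=1000):
    """Block Jacobi on (I - alpha * P^T) x = (1 - alpha) * v.

    `blocks` is a list of index lists, one per site (hypothetical partition).
    In each sweep every block solves its own small system using only the
    previous iterate of the other blocks, so the solves are independent.
    """
    n = P.shape[0]
    A = np.eye(n) - alpha * P.T
    b = np.full(n, (1 - alpha) / n)
    x = np.full(n, 1.0 / n)
    for it in range(1, max_iter + 1):
        x_new = np.empty(n)
        for idx in blocks:
            idx = np.asarray(idx)
            A_ii = A[np.ix_(idx, idx)]
            # Right-hand side: b_i minus the contribution of all other blocks.
            rhs = b[idx] - A[idx] @ x + A_ii @ x[idx]
            x_new[idx] = np.linalg.solve(A_ii, rhs)
        if np.abs(x_new - x).sum() < tol:
            return x_new / x_new.sum(), it
        x = x_new
    return x / x.sum(), max_iter


if __name__ == "__main__":
    # Toy graph: two "sites" of two pages each.
    P = np.array([
        [0.0, 1.0, 0.0, 0.0],
        [0.5, 0.0, 0.5, 0.0],
        [0.0, 0.0, 0.0, 1.0],
        [0.5, 0.0, 0.5, 0.0],
    ])
    x_bj, sweeps = pagerank_block_jacobi(P, blocks=[[0, 1], [2, 3]])
    x_pm, steps = pagerank_power(P)
    print("block Jacobi:", x_bj, "sweeps:", sweeps)
    print("power method:", x_pm, "steps:", steps)
```

In a distributed setting each block solve would run on the node that owns the site, and only the off-block contribution would need to be exchanged between sweeps; the sketch keeps everything local purely for readability.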
