Fast Distributed PageRank Computation

Over the last decade, PageRank has gained importance in a wide range of applications and domains, ever since it first proved effective at determining node importance in large graphs (and was a pioneering idea behind Google’s search engine). In distributed computing alone, PageRank vectors, or more generally random-walk-based quantities, have been used for applications ranging from determining important nodes and load balancing to search and identifying connectivity structures. Surprisingly, however, there has been little work towards designing provably efficient, fully distributed algorithms for computing PageRank. The difficulty is that traditional iterative methods in the style of matrix-vector multiplication may not always adapt well to the distributed setting, owing to communication bandwidth restrictions and convergence rates.
