Computing Personalized PageRank Quickly by Exploiting Graph Structures

We propose a new scalable algorithm that computes Personalized PageRank (PPR) very quickly. The Power method is a state-of-the-art algorithm for computing exact PPR; however, it requires many iterations, so reducing the number of iterations is the main challenge. We achieve this by exploiting the graph structures of web graphs and social networks. Our algorithm converges very quickly: it requires up to 7.5 times fewer iterations than the Power method and is up to five times faster in actual computation time. To the best of our knowledge, this is the first work to use graph structures explicitly to solve PPR quickly. Our contributions can be summarized as follows.

1. We provide an algorithm for computing a tree decomposition that is more efficient and scalable than any previous algorithm.
2. Using this algorithm, we can obtain a core-tree decomposition of any web graph or social network. This decomposes the graph into (1) the core, which behaves like an expander graph, and (2) a graph of small tree-width, which behaves like a tree in an algorithmic sense.
3. We apply a direct method to the small tree-width graph to construct an LU decomposition.
4. Building on the LU decomposition and using it as a preconditioner, we apply the GMRES method (a state-of-the-art advanced iterative method) to compute PPR for whole web graphs and social networks.
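To make the baseline concrete, the following minimal sketch shows the Power method for PPR on a toy graph, together with a dense direct solve of the equivalent linear system (I - alpha * P^T) x = (1 - alpha) * e_s as a cross-check. It is not the algorithm proposed here: there is no tree decomposition, LU preconditioner, or GMRES. The function names `ppr_power` and `ppr_direct`, the damping factor `alpha = 0.85`, the tolerance, and the four-node graph are all assumptions chosen for illustration.

```python
def ppr_power(adj, source, alpha=0.85, tol=1e-10, max_iter=10_000):
    """Personalized PageRank by power iteration (illustrative sketch).

    adj maps each node to a non-empty list of out-neighbours.
    Returns (scores, number_of_iterations).
    """
    nodes = sorted(adj)
    x = {v: 0.0 for v in nodes}
    x[source] = 1.0
    for it in range(1, max_iter + 1):
        nxt = {v: 0.0 for v in nodes}
        nxt[source] = 1.0 - alpha  # restart mass returns to the source
        for u in nodes:
            share = alpha * x[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        diff = sum(abs(nxt[v] - x[v]) for v in nodes)
        x = nxt
        if diff < tol:
            return x, it
    return x, max_iter


def ppr_direct(adj, source, alpha=0.85):
    """Solve (I - alpha * P^T) x = (1 - alpha) * e_source by Gaussian elimination."""
    nodes = sorted(adj)
    idx = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    # Augmented matrix [A | b], starting from the identity.
    a = [[1.0 if i == j else 0.0 for j in range(n)] + [0.0] for i in range(n)]
    for u in nodes:
        for v in adj[u]:
            a[idx[v]][idx[u]] -= alpha / len(adj[u])
    a[idx[source]][n] = 1.0 - alpha
    # Forward elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for k in range(col, n + 1):
                a[r][k] -= f * a[col][k]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = a[i][n] - sum(a[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / a[i][i]
    return {v: x[idx[v]] for v in nodes}


# Hypothetical toy graph: every node has at least one out-link.
graph = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
scores, iterations = ppr_power(graph, source=0)
exact = ppr_direct(graph, source=0)
print("power iterations:", iterations)
print("max difference:", max(abs(scores[v] - exact[v]) for v in graph))
```

The power iteration's error shrinks by roughly a factor of `alpha` per step, which is why many iterations are needed for tight tolerances; a direct solve avoids iteration entirely but is only practical here because the toy matrix is tiny and dense elimination costs cubic time.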
