Algorithms on Exact Personalized PageRank

Personalized PageRank, one of the best-known graph computation problems, is an effective approach for computing the similarity score between two nodes, and it is widely used in applications such as link prediction and recommendation. Because computing the exact Personalized PageRank Vector (PPV) is expensive in both time and space, most existing studies compute the PPV approximately. In this paper, we propose novel and efficient distributed algorithms that compute the PPV exactly, based on graph partitioning, on a general coordinator-based shared-nothing distributed computing platform. Our algorithms take three aspects into account: the load balance, the communication cost, and the computation cost of each machine. The proposed algorithms require only a single round of communication between each machine and the coordinator at query time. The communication cost is bounded, and the workload on each machine is balanced. Comprehensive experiments conducted on five real datasets demonstrate the efficiency and scalability of the proposed methods.

CCS Concepts: • Mathematics of computing → Graph algorithms; • Theory of computation → Parallel computing models
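
For readers unfamiliar with the quantity being computed, the following is a minimal single-machine sketch of a PPV, not the distributed algorithm proposed in the paper. It assumes the common random-walk-with-restart formulation, in which the walker returns to the source node with probability `alpha` at every step and otherwise follows a uniformly chosen out-edge. The function name `personalized_pagerank`, the default `alpha=0.15`, the adjacency-dictionary input, and the treatment of dangling nodes (their mass returns to the source) are all illustrative assumptions rather than details taken from the abstract.

```python
def personalized_pagerank(adj, source, alpha=0.15, tol=1e-12, max_iter=10_000):
    """Iterate r = alpha * e_source + (1 - alpha) * P^T r to a fixed point.

    adj maps every node to a list of its out-neighbors; every neighbor
    must itself be a key of adj (hypothetical input format).
    """
    nodes = list(adj)
    ppv = {v: 0.0 for v in nodes}
    ppv[source] = 1.0  # start with all probability mass on the source

    for _ in range(max_iter):
        nxt = {v: 0.0 for v in nodes}
        # Total mass stays 1, so the restart step contributes exactly alpha.
        nxt[source] += alpha
        for v in nodes:
            mass = (1.0 - alpha) * ppv[v]
            out = adj[v]
            if out:
                share = mass / len(out)
                for w in out:
                    nxt[w] += share
            else:
                # Assumption: a dangling node sends its mass back to the source.
                nxt[source] += mass
        diff = sum(abs(nxt[v] - ppv[v]) for v in nodes)
        ppv = nxt
        if diff < tol:
            break
    return ppv


# Tiny example graph: node "d" cannot be reached from the source "a".
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a"]}
ppv = personalized_pagerank(graph, "a")
```

Each iteration touches every edge once, which hints at why the cost of exact computation on large graphs motivates the partitioning-based distributed approach described in the abstract.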
