Personalized Page Rank on Knowledge Graphs: Particle Filtering is all you need!

Graphs are everywhere, and knowledge graphs are commonly stored in graph databases. Personalized Page Rank (PPR) is a particularly important task for supporting search and exploration within such datasets. PPR computes the proximity between query nodes and other nodes in the graph, and is used for, among other things, entity exploration, query expansion, and product recommendation. Unfortunately, the exact computation of PPR is expensive. While different solutions have been proposed to compute PPR values with high precision, they are extremely complex to implement and in some cases require heavy preprocessing. In this work, we argue that a better approach exists: particle filtering. Particle filtering methods produce ranks with sufficient precision while exploiting what graph database architectures are already optimized for: navigating local connections. We present an implementation of this approach in a popular commercial database and show that it outperforms the functionality already implemented in that database. With this, we aim to motivate future research to optimize and improve upon this direction.
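To make the idea concrete, the following is a minimal sketch of particle-filtering PPR, not the implementation evaluated in this work. It assumes an in-memory adjacency dictionary rather than a graph database, and the names `particle_filtering_ppr`, `num_particles`, `restart`, and `min_threshold` are hypothetical choices made for illustration. A fixed budget of particles starts at the query nodes; at each step every node keeps a `restart` fraction of its particles as score and passes the rest to its neighbors in proportion to edge weight, and nodes left with too few particles are dropped. Only local neighbor lookups are needed.

```python
from collections import defaultdict


def particle_filtering_ppr(graph, seeds, num_particles=1000.0,
                           restart=0.15, min_threshold=1.0):
    """Approximate PPR scores by spreading particles from the seed nodes.

    graph: dict mapping node -> dict of neighbor -> positive edge weight
           (an assumed in-memory stand-in for a graph database).
    seeds: non-empty list of query nodes.
    """
    # Split the particle budget evenly over the query nodes.
    particles = {s: num_particles / len(seeds) for s in seeds}
    scores = defaultdict(float)

    while particles:
        next_particles = defaultdict(float)
        for node, count in particles.items():
            # A restart fraction of the particles settles here as score.
            scores[node] += count * restart
            outgoing = count * (1.0 - restart)
            remaining = outgoing

            neighbors = graph.get(node, {})
            total_weight = sum(neighbors.values())
            if total_weight <= 0:
                continue  # dead end: the moving particles are lost

            # Visit the heaviest edges first so that a small budget
            # is spent on the strongest connections.
            for nb, w in sorted(neighbors.items(), key=lambda kv: -kv[1]):
                if remaining <= min_threshold:
                    break
                share = max(outgoing * w / total_weight, min_threshold)
                share = min(share, remaining)  # never create particles
                next_particles[nb] += share
                remaining -= share

        # Drop nodes that received too few particles to matter.
        particles = {n: c for n, c in next_particles.items()
                     if c > min_threshold}

    return dict(scores)


if __name__ == "__main__":
    toy = {"a": {"b": 1.0, "c": 1.0}, "b": {"a": 1.0}, "c": {"a": 1.0}}
    ranking = particle_filtering_ppr(toy, ["a"])
    for node, score in sorted(ranking.items(), key=lambda kv: -kv[1]):
        print(node, round(score, 2))
```

The loop terminates because each round moves at most a `1 - restart` fraction of the particles and discards any node holding no more than `min_threshold` of them, so the set of active nodes eventually empties. Raising `num_particles` or lowering `min_threshold` trades running time for precision in this sketch.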
