Personalized PageRank in Uncertain Graphs with Mutually Exclusive Edges

Node-ranking measures such as personalized PageRank are used in many web and social-network prediction and recommendation applications. These measures are effective when the underlying graph is certain, but they are difficult to apply under uncertainty because they are not designed for graphs that carry uncertain information, such as edges that mutually exclude each other. Existing techniques can be extended naively, for example by encoding uncertainties as edge weights or by computing all possible scenarios; however, as we discuss in this paper, these extensions either introduce large errors or are very expensive to compute, since the number of possible worlds can grow exponentially with the amount of uncertainty. To tackle this challenge, we propose an efficient Uncertain Personalized PageRank (UPPR) algorithm that approximately computes personalized PageRank values on a graph with edge uncertainties. UPPR avoids enumerating all possible worlds, yet achieves comparable accuracy by carefully encoding edge uncertainties in a data structure that leads to fast approximations. Experimental results show that UPPR is very efficient in terms of execution time and that its accuracy is comparable to or better than that of more costly alternatives.
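To make the cost of the "all possible scenarios" baseline concrete, the following is a minimal sketch of exhaustive possible-world enumeration. It is not the UPPR algorithm, and it rests on assumptions that the abstract does not state: exactly one edge from each mutually exclusive group exists in every world, groups are independent, the probabilities within a group sum to 1, and the quantity of interest is the expected personalized PageRank score. The function names, the restart probability of 0.15, and the toy graph are all hypothetical.

```python
from itertools import product


def personalized_pagerank(nodes, edges, seed, alpha=0.15, iters=100):
    """Power iteration for personalized PageRank on a directed graph.

    alpha is the restart probability (an assumed value, not from the paper).
    """
    out = {u: [v for (s, v) in edges if s == u] for u in nodes}
    p = {u: 1.0 if u == seed else 0.0 for u in nodes}
    for _ in range(iters):
        nxt = {u: 0.0 for u in nodes}
        nxt[seed] += alpha  # restart mass always returns to the seed
        for u in nodes:
            if out[u]:
                share = (1 - alpha) * p[u] / len(out[u])
                for v in out[u]:
                    nxt[v] += share
            else:
                # Dangling node: send its walking mass back to the seed.
                nxt[seed] += (1 - alpha) * p[u]
        p = nxt
    return p


def expected_ppr(nodes, certain_edges, exclusive_groups, seed):
    """Expected PPR by enumerating every possible world.

    exclusive_groups is a list of groups; each group is a list of
    (edge, probability) pairs, and (by assumption) exactly one edge
    of each group is present in any given world.
    """
    expected = {u: 0.0 for u in nodes}
    worlds = 0
    for choice in product(*exclusive_groups):
        prob = 1.0
        edges = list(certain_edges)
        for edge, edge_prob in choice:
            prob *= edge_prob
            edges.append(edge)
        scores = personalized_pagerank(nodes, edges, seed)
        for u in nodes:
            expected[u] += prob * scores[u]
        worlds += 1
    return expected, worlds


# Hypothetical toy graph: three certain edges and two exclusive groups.
nodes = ["a", "b", "c", "d"]
certain = [("a", "b"), ("b", "c"), ("c", "a")]
groups = [
    [(("a", "c"), 0.6), (("a", "d"), 0.4)],
    [(("d", "a"), 0.5), (("d", "b"), 0.5)],
]
expected, worlds = expected_ppr(nodes, certain, groups, seed="a")
print(worlds, expected)
```

With k independent groups of m alternatives each, the loop runs one full PageRank computation for each of the m^k worlds, which is the exponential growth that motivates an approximation.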
