An In-Depth Comparison of s-t Reliability Algorithms over Uncertain Graphs

Uncertain, or probabilistic, graphs are increasingly used to represent noisy linked data in many emerging applications, and have recently attracted the attention of the database research community. A fundamental problem on uncertain graphs is s-t reliability, which measures the probability that a target node t is reachable from a source node s in an uncertain graph, i.e., a graph in which every edge is assigned a probability of existence. Because the s-t reliability problem is #P-hard, various efficient sampling- and indexing-based estimation algorithms have been proposed in the literature. However, since they have not been thoroughly compared with each other, it is not clear whether the later algorithms outperform the earlier ones. More importantly, the comparison frameworks, datasets, and metrics were often inconsistent across these works (e.g., different convergence criteria were employed to find the optimal number of samples). We address this serious concern by re-implementing six state-of-the-art s-t reliability estimation methods in a common system and code base, and evaluating them on several medium- and large-scale real-world graph datasets with identical evaluation metrics and query workloads. Through a systematic and in-depth analysis of the experimental results, we report surprising findings, such as that many follow-up algorithms can actually be several orders of magnitude less efficient, less accurate, and more memory intensive than the ones that were proposed earlier. We conclude by discussing our recommendations on the road ahead.
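
To make the definition of s-t reliability concrete, the following is a minimal sketch of a plain Monte Carlo estimator, the simplest kind of sampling-based approach. It is not taken from any of the six compared methods. The function name `mc_reliability`, the edge-list format `(u, v, p)`, the treatment of edges as directed and independent, and the fixed sample count are all assumptions made for illustration.

```python
import random
from collections import deque


def mc_reliability(edges, s, t, num_samples=10000, seed=0):
    """Estimate the probability that t is reachable from s.

    edges: iterable of (u, v, p) tuples, read here as directed edges
           u -> v that exist independently with probability p
           (an assumption of this sketch).
    """
    rng = random.Random(seed)

    # Adjacency list: node -> list of (neighbor, existence probability).
    adj = {}
    for u, v, p in edges:
        adj.setdefault(u, []).append((v, p))

    hits = 0
    for _ in range(num_samples):
        # Breadth-first search over one sampled possible world.
        # Each edge is sampled lazily, only when the search reaches it;
        # every node is expanded at most once, so no edge is sampled twice.
        seen = {s}
        queue = deque([s])
        reached = (s == t)
        while queue and not reached:
            u = queue.popleft()
            for v, p in adj.get(u, []):
                if v not in seen and rng.random() < p:
                    if v == t:
                        reached = True
                        break
                    seen.add(v)
                    queue.append(v)
        hits += reached

    # Fraction of sampled worlds in which t was reachable from s.
    return hits / num_samples


# Hypothetical toy graph: a direct edge s -> t plus a two-hop path via a.
toy_edges = [("s", "t", 0.5), ("s", "a", 0.5), ("a", "t", 0.5)]
estimate = mc_reliability(toy_edges, "s", "t", num_samples=20000, seed=1)
```

For this toy graph the exact reliability is 1 - (1 - 0.5)(1 - 0.25) = 0.625, so the estimate should land close to that value. The number of samples needed for a stable estimate depends on the chosen convergence criterion, which is one of the inconsistencies across prior work noted above.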
