DBL: Efficient Reachability Queries on Dynamic Graphs (Complete Version)

The reachability query is a fundamental problem on graphs and has been extensively studied in academia and industry. Since graphs are subject to frequent updates in many applications, it is essential to support efficient graph updates while offering good performance on reachability queries. Existing solutions compress the original graph into a Directed Acyclic Graph (DAG) and propose efficient query processing and index update techniques on top of it. However, they focus on optimizing the scenarios where the Strongly Connected Components (SCCs) remain unchanged, and they overlook the prohibitively high cost of maintaining the DAG when SCCs are updated. In this paper, we propose DBL, an efficient DAG-free index that supports reachability queries on dynamic graphs with insertion-only updates. DBL builds on two complementary indexes: the Dynamic Landmark (DL) label and the Bidirectional Leaf (BL) label. The former leverages landmark nodes to quickly confirm reachable pairs, whereas the latter prunes unreachable pairs by indexing the leaf nodes in the graph. We evaluate DBL against state-of-the-art dynamic reachability indexes with extensive experiments on real-world datasets. The results demonstrate that DBL achieves orders-of-magnitude speedup in index updates while still delivering competitive query efficiency.
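To make the division of labor between the two labels concrete, the following is a minimal Python sketch and not the paper's implementation. It assumes that a DL label stores, for each node, the landmarks that reach it and the landmarks it reaches, and that a BL label stores the leaf nodes (taken here to mean nodes with no incoming or no outgoing edges) on either side of each node. The function names, the plain-set representation, and the BFS fallback are all illustrative assumptions. The labels are built once from a static edge list, so the insertion-only update procedure is not covered.

```python
from collections import deque


def reach_set(adj, start):
    """Return every node reachable from `start` (itself included) via BFS."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def build_labels(nodes, edges, landmarks):
    """Build hypothetical DL/BL labels as plain sets (an assumed layout)."""
    fwd = {n: [] for n in nodes}
    bwd = {n: [] for n in nodes}
    for u, v in edges:
        fwd[u].append(v)
        bwd[v].append(u)

    # Assumption: "leaf nodes" are nodes with no in-edges or no out-edges.
    sources = [n for n in nodes if not bwd[n]]
    sinks = [n for n in nodes if not fwd[n]]

    names = ("dl_in", "dl_out", "bl_in", "bl_out")
    labels = {name: {n: set() for n in nodes} for name in names}
    for lm in landmarks:
        for n in reach_set(fwd, lm):
            labels["dl_in"][n].add(lm)    # landmark lm reaches n
        for n in reach_set(bwd, lm):
            labels["dl_out"][n].add(lm)   # n reaches landmark lm
    for s in sources:
        for n in reach_set(fwd, s):
            labels["bl_in"][n].add(s)     # source leaf s reaches n
    for t in sinks:
        for n in reach_set(bwd, t):
            labels["bl_out"][n].add(t)    # n reaches sink leaf t
    return fwd, labels


def query(fwd, labels, u, v):
    """Return (reachable, reason) for the question "does u reach v?"."""
    if u == v:
        return True, "trivial"
    # DL: a shared landmark w with u -> w and w -> v proves reachability.
    if labels["dl_out"][u] & labels["dl_in"][v]:
        return True, "landmark"
    # BL: if u reaches v, every source leaf reaching u also reaches v,
    # and every sink leaf reachable from v is also reachable from u.
    if not labels["bl_in"][u] <= labels["bl_in"][v]:
        return False, "leaf"
    if not labels["bl_out"][v] <= labels["bl_out"][u]:
        return False, "leaf"
    # Neither label decides the pair, so fall back to a plain search.
    return v in reach_set(fwd, u), "search"


# Toy graph with a cycle between b and c; b is the only landmark.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "b"), ("c", "d"), ("e", "d")]
fwd, labels = build_labels(nodes, edges, landmarks=["b"])

print(query(fwd, labels, "a", "d"))  # (True, 'landmark')
print(query(fwd, labels, "d", "a"))  # (False, 'leaf')
print(query(fwd, labels, "e", "d"))  # (True, 'search')
```

In this toy graph the landmark check answers the positive pair (a, d), the leaf check rejects (d, a), and only (e, d) needs the fallback search. The sketch never contracts the cycle between b and c into a single node, which mirrors the DAG-free design in spirit only.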
