Decentralized Search for Shortest Path Approximation in Large-Scale Complex Networks

Finding approximate shortest paths in extremely large-scale complex networks is a challenging problem: existing approaches incur large overhead to achieve high accuracy and diversity in the estimated paths, especially on graphs with millions of vertices. In this paper, we propose an online search approach based on preprocessed indexes to approximate point-to-point shortest paths. The approach finds more accurate and more diverse paths with limited index overhead and low search overhead. Furthermore, we introduce a new path-degree-based index construction algorithm that greatly increases approximation accuracy and incurs no additional index overhead. To handle extremely large graphs, we build a query processing system that runs our algorithm on distributed graph processing platforms. The system also supports parallel processing of online searches to achieve high throughput for a large number of queries. We evaluate our algorithm on various real-world graphs from different disciplines with up to billions of edges, and we demonstrate that our system can process hundreds of thousands of queries per second on these graphs with reduced overhead.
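
To make the idea of an index-guided online search concrete, the following Python sketch may help. It is an illustration under stated assumptions, not the algorithm of this paper: the graph is assumed to be unweighted, undirected, and connected; the index is a plain landmark index built by breadth-first search; and the function names (`build_index`, `estimate`, `approximate_path`) and the bidirectional greedy rule are hypothetical choices made for this example. The search walks from both endpoints, at each step taking the neighbor move that most reduces the landmark-based distance estimate, so it only inspects local neighborhoods and the stored index.

```python
from collections import deque


def bfs_distances(graph, source):
    """Hop distances from `source` to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_index(graph, landmarks):
    """Preprocessing (assumed scheme): store each vertex's distance to every landmark."""
    per_landmark = [bfs_distances(graph, l) for l in landmarks]
    return {v: [d[v] for d in per_landmark] for v in graph}


def estimate(index, u, w):
    """Upper bound on d(u, w) via the best landmark (triangle inequality)."""
    return min(a + b for a, b in zip(index[u], index[w]))


def approximate_path(graph, index, s, t):
    """Bidirectional greedy search guided only by the index and local neighbors."""
    left, right = [s], [t]
    # Stop once the two frontiers meet or become adjacent.
    while left[-1] != right[-1] and right[-1] not in graph[left[-1]]:
        u, w = left[-1], right[-1]
        # Candidate moves: advance the source side (0) or the target side (1).
        moves = [(estimate(index, v, w), 0, v) for v in graph[u]]
        moves += [(estimate(index, u, v), 1, v) for v in graph[w]]
        _, side, v = min(moves, key=lambda m: m[:2])
        (left if side == 0 else right).append(v)
    if left[-1] == right[-1]:
        right.pop()  # avoid repeating the meeting vertex
    return left + right[::-1]


# Toy example: a 6-cycle with two landmarks (both chosen arbitrarily).
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
ring_index = build_index(ring, landmarks=[0, 3])
path = approximate_path(ring, ring_index, 2, 4)
```

Because some move always lowers the estimate by at least one hop while the two frontiers are distinct, the walk terminates on a connected graph and the returned path is never longer than the initial landmark estimate. How close it comes to the true shortest path depends on the landmark choice, which this sketch leaves arbitrary.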
