TEDI: Efficient Shortest Path Query Answering on Graphs

Efficient shortest path query answering in large graphs has a growing number of applications, such as ranked keyword search in databases, social networks, ontology reasoning and bioinformatics. A shortest path query on a graph finds the shortest path between given source and target vertices. Current techniques for efficient evaluation of such queries are based on the pre-computation of compressed Breadth First Search trees of the graph. However, they scale poorly to large graphs. To address this problem, we propose TEDI, an indexing and query processing scheme for shortest path query answering. TEDI is based on the tree decomposition methodology. The graph is first decomposed into a tree in which each node (a.k.a. bag) contains more than one vertex from the graph. The shortest paths among the vertices of each bag are stored in that bag, and these local paths together with the tree form the index of the graph. Based on this index, a bottom-up operation can be executed to find the shortest path for any given source and target vertices. Our experimental results show that TEDI offers orders-of-magnitude performance improvement over existing approaches in index construction time, index size and query answering time.
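The idea of answering queries from per-bag local distances can be illustrated with a minimal Python sketch. It is not the TEDI implementation: the graph, the hand-built tree decomposition, and all function names (`build_index`, `tree_path`, `query`) are assumptions made for illustration. The sketch handles only unweighted, undirected graphs, returns distances rather than paths, fills the bags with a plain BFS from every vertex, and walks the tree path between two bags instead of the bottom-up traversal described in the abstract. It relies on the standard property that the intersection of two adjacent bags separates the vertices on either side of that tree edge.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def build_index(adj, bags):
    """For each bag, store the graph distance between every pair of its vertices.
    Assumption: distances come from a full BFS per vertex, chosen for brevity."""
    full = {v: bfs_distances(adj, v) for v in adj}
    inf = float("inf")
    return {name: {(a, b): full[a].get(b, inf) for a in bag for b in bag}
            for name, bag in bags.items()}

def tree_path(tree, start, goal):
    """Unique path between two bags of the decomposition tree."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        for y in tree[x]:
            if y not in parent:
                parent[y] = x
                queue.append(y)
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def query(bags, tree, index, u, v):
    """Distance from u to v using only the per-bag distances and the tree."""
    start = next(n for n, bag in bags.items() if u in bag)
    goal = next(n for n, bag in bags.items() if v in bag)
    path = tree_path(tree, start, goal)
    # Distances from u to every vertex of the first bag are stored directly.
    dist = {w: index[start][(u, w)] for w in bags[start]}
    for prev, cur in zip(path, path[1:]):
        shared = bags[prev] & bags[cur]  # separator between the two sides
        dist = {w: min((dist[s] + index[cur][(s, w)] for s in shared),
                       default=float("inf"))
                for w in bags[cur]}
    return dist[v]

# Hypothetical example graph (adjacency lists) and a valid tree decomposition.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3, 4], 3: [2, 4], 4: [2, 3, 5], 5: [4]}
bags = {"A": {0, 1, 2}, "B": {2, 3, 4}, "C": {4, 5}}
tree = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
index = build_index(adj, bags)

print(query(bags, tree, index, 0, 5))
```

In this sketch a query touches only the bags on one tree path, so its cost depends on the length of that path and the bag sizes rather than on the size of the whole graph.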
