Shortest Path Computing in Relational DBMSs

This paper uses shortest path discovery as a case study of efficient relational approaches to graph search queries. First, we abstract three enhanced relational operators and, based on them, introduce the FEM framework to bridge the gap between relational operations and graph operations. We show that features introduced by recent SQL standards, such as window functions and the merge statement, can improve the performance of the FEM framework. Second, we propose an edge-weight-aware graph partitioning schema and design a bi-directional restrictive BFS (breadth-first search) over the partitioned tables, which improves scalability and performance without extra indexing overhead. Extensive experimental results show that our relational approach, together with these optimization strategies, achieves high scalability and performance.
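To make the idea of running a graph search as set-at-a-time relational operations concrete, the following is a minimal sketch, not the paper's FEM operators. It assumes a hypothetical schema (`edge`, `visited`) and non-negative edge weights, and it runs a Dijkstra-style loop in SQLite through Python's `sqlite3` module. A window function (`ROW_NUMBER`) keeps the cheapest candidate per target node, and because SQLite has no `MERGE` statement, an `INSERT ... ON CONFLICT DO UPDATE` upsert stands in for it. Graph partitioning and bi-directional search are not covered.

```python
import sqlite3


def shortest_distances(edges, source):
    """Single-source shortest distances via iterative SQL statements.

    edges: iterable of (src, dst, cost) with non-negative cost (assumption).
    Table and column names are hypothetical, chosen for illustration only.
    """
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE edge(src INTEGER, dst INTEGER, cost REAL);
        -- state: 0 = candidate, 2 = current frontier, 1 = finalized
        CREATE TABLE visited(nid INTEGER PRIMARY KEY, dist REAL,
                             pred INTEGER, state INTEGER);
    """)
    con.executemany("INSERT INTO edge VALUES (?, ?, ?)", edges)
    con.execute("INSERT INTO visited VALUES (?, 0, NULL, 0)", (source,))

    while True:
        # Select the frontier: all candidates at the minimum distance.
        cur = con.execute("""
            UPDATE visited SET state = 2
            WHERE state = 0
              AND dist = (SELECT MIN(dist) FROM visited WHERE state = 0)
        """)
        if cur.rowcount == 0:
            break  # no candidates left

        # Expand the frontier with a join, keep the best candidate per
        # target node with a window function, and merge it into visited
        # with an upsert (a stand-in for SQL MERGE).
        con.execute("""
            INSERT INTO visited(nid, dist, pred, state)
            SELECT nid, dist, pred, 0 FROM (
                SELECT e.dst AS nid,
                       v.dist + e.cost AS dist,
                       v.nid AS pred,
                       ROW_NUMBER() OVER (
                           PARTITION BY e.dst ORDER BY v.dist + e.cost
                       ) AS rn
                FROM visited AS v JOIN edge AS e ON e.src = v.nid
                WHERE v.state = 2
            ) WHERE rn = 1
            ON CONFLICT(nid) DO UPDATE
                SET dist = excluded.dist, pred = excluded.pred
                WHERE excluded.dist < visited.dist
        """)

        # Finalize the nodes that were just expanded.
        con.execute("UPDATE visited SET state = 1 WHERE state = 2")

    result = dict(con.execute("SELECT nid, dist FROM visited"))
    con.close()
    return result
```

Each iteration issues three statements regardless of how many nodes are in the frontier, which is the kind of set-oriented evaluation a relational engine is built for. The sketch needs an SQLite build with window function and upsert support (version 3.25 or later).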
