Efficient algorithms for generalized subgraph query processing

We study a new type of graph query that injectively maps its edges to paths in the graphs of a given database, where the length of each path is bounded by a threshold given by the weight of the corresponding query edge. We present important applications of this query and identify the new challenges that arise in processing it. We then devise a cost model for the branch-and-bound algorithm framework used to process the query, and propose an efficient algorithm that minimizes the cost overhead. We also develop three indexing techniques to answer such queries efficiently online. Finally, we verify the efficiency of the proposed indexes with extensive experiments on large real and synthetic datasets.
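To make the query semantics concrete, the following is a minimal brute-force sketch of the matching condition, not the branch-and-bound algorithm or any of the indexes proposed in the paper. It rests on assumptions that the abstract does not state: graphs are unweighted, undirected, and vertex-labeled; path length is counted in hops; and a match is an injective, label-preserving vertex mapping under which the endpoints of every query edge lie within the edge's weight of each other. Any further constraints the paper places on the matched paths are not modeled, and the names `bfs_distances` and `matches` are hypothetical.

```python
from collections import deque
from itertools import permutations


def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def matches(query_edges, query_labels, adj, labels):
    """Return True if some injective, label-preserving vertex mapping sends
    every query edge (u, v, w) to data vertices at hop distance <= w.

    Assumption: exhaustive enumeration, exponential in the query size;
    for illustration only.
    """
    dist = {v: bfs_distances(adj, v) for v in adj}
    q_vertices = list(query_labels)
    # permutations() yields tuples of distinct data vertices, so the
    # resulting vertex mapping is injective by construction.
    for image in permutations(adj, len(q_vertices)):
        m = dict(zip(q_vertices, image))
        if any(query_labels[q] != labels[m[q]] for q in q_vertices):
            continue
        if all(dist[m[u]].get(m[v], float("inf")) <= w
               for u, v, w in query_edges):
            return True
    return False


# Data graph: a path a - x - b with labels A, X, B.
adj = {"a": ["x"], "x": ["a", "b"], "b": ["x"]}
labels = {"a": "A", "x": "X", "b": "B"}
query_labels = {0: "A", 1: "B"}

within_two = matches([(0, 1, 2)], query_labels, adj, labels)
within_one = matches([(0, 1, 1)], query_labels, adj, labels)
```

In this toy graph the only A-B pair is two hops apart, so the query edge with weight 2 matches and the one with weight 1 does not. Setting every weight to 1 recovers ordinary label-preserving subgraph matching, which is why the query can be read as a generalization of the subgraph query.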
