Efficient $K$-concern Matching in a Large Graph

Exact subgraph matching is widely used for relation exploration in linked data. The existing literature on matching algorithms focuses on exhaustive enumeration of all matches, which is both time- and space-consuming and can be infeasible for a large graph. In real-world applications such as social search, users are often interested in only some of the entities in a query graph. Extracting the distinct groups of concerned vertices from an exhaustive enumeration of all matches is not trivial. In this paper, we define the problem of $K$-concern matching, which returns the distinct groups of $K$ concerned vertices over all matches. To find $K$-concern matches efficiently in a large graph, we devise a new strategy for the general backtracking algorithm. To minimize the total search space, we propose methods for search order optimization and candidate set pruning. Experiments on real and synthetic datasets show that our method outperforms all existing matching algorithms on $K$-concern matching in query performance.
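To make the problem statement concrete, the following is a minimal sketch of $K$-concern matching on small labeled, undirected graphs. It is not the strategy proposed in the paper, whose search order optimization and candidate set pruning are not described in this abstract. The function name `k_concern_matches`, the adjacency-set graph representation, and the choice to place concerned vertices first in the search order are all assumptions made for illustration. The sketch enumerates assignments of the concerned vertices and, for the remaining query vertices, stops at the first completion instead of enumerating every match.

```python
from typing import Dict, Hashable, List, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]  # undirected graph as adjacency sets


def k_concern_matches(
    query: Graph,
    data: Graph,
    q_labels: Dict[Hashable, str],
    d_labels: Dict[Hashable, str],
    concerned: List[Hashable],
) -> Set[Tuple[Hashable, ...]]:
    """Distinct tuples of data vertices matched to the `concerned` query vertices.

    Illustrative backtracking only (hypothetical helper, not the paper's algorithm).
    """
    # Assumed search order: concerned vertices first, then the rest.
    order = list(concerned) + [u for u in query if u not in concerned]
    k = len(concerned)
    results: Set[Tuple[Hashable, ...]] = set()

    def extend(depth: int, mapping: Dict[Hashable, Hashable]) -> bool:
        """Return True if `mapping` can be completed to at least one full match."""
        if depth == len(order):
            return True
        u = order[depth]
        used = set(mapping.values())
        found_any = False
        for v in data:
            # Injective mapping with equal vertex labels.
            if v in used or d_labels[v] != q_labels[u]:
                continue
            # Every already-mapped query neighbour of u must be adjacent to v.
            if any(w in mapping and mapping[w] not in data[v] for w in query[u]):
                continue
            mapping[u] = v
            found = extend(depth + 1, mapping)
            if found and depth == k - 1:
                # All concerned vertices are fixed and a completion exists.
                results.add(tuple(mapping[c] for c in concerned))
            del mapping[u]
            if found and depth >= k:
                # Non-concerned vertex: one witness is enough, stop early.
                return True
            found_any = found_any or found
        return found_any

    extend(0, {})
    return results


# Toy example: two people (a, c) linked to the same company (b).
query = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
q_labels = {"a": "P", "b": "C", "c": "P"}
data = {
    "p1": {"c1"}, "p2": {"c1"}, "p3": {"c2"},
    "c1": {"p1", "p2"}, "c2": {"p3"},
}
d_labels = {"p1": "P", "p2": "P", "p3": "P", "c1": "C", "c2": "C"}

companies = k_concern_matches(query, data, q_labels, d_labels, ["b"])
pairs = k_concern_matches(query, data, q_labels, d_labels, ["a", "b"])
```

In this toy graph the query has two full matches, both mapping `b` to `c1`, so the $1$-concern result for `b` contains a single group. Results are returned here as ordered tuples aligned with `concerned`; whether groups should instead be treated as unordered sets depends on the paper's formal definition, which the abstract does not give.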
