Top-$k$ Subgraph Query Based on Frequent Structure in Large-Scale Dynamic Graphs

Frequent structures have emerged as a way to address structural pattern mining problems in domains such as chemistry and Web applications. Top-$k$ subgraph query, an important graph search technique, is widely used in emerging fields. Unlike the traditional subgraph query, the query studied in this paper has two distinguishing properties: 1) data graphs change dynamically over time, through vertex and edge insertions and deletions and frequent label updates, and 2) queries are constrained by label values and by $k$. Existing graph indexing and query techniques rarely consider both. This paper therefore proposes a method called frequent subgraph dynamic Top-$k$ query with label value constraint to address both challenges. We propose a two-level index that combines frequent structure mapping with label value aggregation; it locates the query structure quickly, then prunes and filters candidates according to the constraint to narrow the search range and improve query efficiency. The method uses the two-level index to filter the initial result and then revises the query results according to the dynamic changes of the graph. In addition, we propose an incremental maintenance strategy for the index, which updates only the affected contents and so avoids the high cost of a global update. The experimental results demonstrate that the proposed Top-$k$ query method outperforms the baseline approaches by up to 37%.
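The abstract gives no implementation details, so the following is only a minimal illustrative sketch of the general idea, not the paper's algorithm. It assumes each match of a frequent structure carries a single numeric label value, that the first index level maps a structure key to its matches, and that the second level keeps an aggregated upper bound (the maximum label value) used for pruning. The class `TwoLevelIndex`, its methods, and the keys `"triangle"` and `"path3"` are all hypothetical names introduced for illustration.

```python
import heapq
from collections import defaultdict


class TwoLevelIndex:
    """Toy two-level index (hypothetical): structure -> matches, plus a value bound."""

    def __init__(self):
        # Level 1: frequent-structure key -> {match_id: label value}
        self.matches = defaultdict(dict)
        # Level 2: frequent-structure key -> largest label value among its matches
        self.max_value = {}

    def insert(self, structure, match_id, value):
        """Add or update one match; only this structure's entries are touched."""
        old = self.matches[structure].get(match_id)
        self.matches[structure][match_id] = value
        current_max = self.max_value.get(structure)
        if current_max is None or value >= current_max:
            self.max_value[structure] = value
        elif old is not None and old == current_max:
            # The previous maximum was lowered, so recompute this one bound.
            self.max_value[structure] = max(self.matches[structure].values())

    def delete(self, structure, match_id):
        """Remove one match and repair the aggregate for its structure only."""
        bucket = self.matches.get(structure)
        if not bucket or match_id not in bucket:
            return
        removed = bucket.pop(match_id)
        if not bucket:
            del self.matches[structure]
            del self.max_value[structure]
        elif removed == self.max_value[structure]:
            self.max_value[structure] = max(bucket.values())

    def top_k(self, structure, k, min_value=float("-inf")):
        """Return up to k (match_id, value) pairs with value >= min_value."""
        bucket = self.matches.get(structure)
        # Prune the whole structure if its bound cannot satisfy the constraint.
        if not bucket or self.max_value[structure] < min_value:
            return []
        candidates = ((v, m) for m, v in bucket.items() if v >= min_value)
        return [(m, v) for v, m in heapq.nlargest(k, candidates)]


idx = TwoLevelIndex()
idx.insert("triangle", "m1", 5)
idx.insert("triangle", "m2", 9)
idx.insert("triangle", "m3", 7)
idx.insert("path3", "p1", 2)
best_two = idx.top_k("triangle", 2)
```

In this sketch an insertion or deletion touches only the entries of the affected structure, which mirrors, in a very simplified form, the idea of updating part of the index instead of rebuilding it globally.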
