Evaluating Mixed Patterns on Large Data Graphs Using Bitmap Views

Developing efficient and scalable techniques for pattern queries over large graphs is crucial for modern applications such as social networks, Web analysis, and bioinformatics. In this paper, we address the problem of efficiently finding the homomorphic matches of tree pattern queries with child and descendant edges (mixed pattern queries) over a large data graph. We propose a novel type of materialized view to accelerate the evaluation. A materialized view is the set of occurrence lists of the pattern's nodes in the data graph. The occurrence lists are stored as compressed bitmaps on the inverted lists of the node labels in the data graph. Reachability information between occurrence list nodes is provided by a node reachability index. This technique not only minimizes the materialization space but also reduces CPU and I/O costs by translating the processing of materialized views into bitwise operations. We provide conditions for view usability using the concept of pattern node coverage. We design a holistic bottom-up algorithm that efficiently computes pattern query matches in the data graph using bitmap views. An extensive experimental evaluation shows that our method evaluates mixed patterns up to several orders of magnitude faster than existing algorithms.
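To make the storage idea concrete, the following minimal Python sketch shows how an occurrence list can be encoded as a bitmap over the inverted list of a node label, and how two views covering the same pattern node can be combined with a single bitwise AND. It is an illustration under stated assumptions, not the paper's implementation: the toy graph, the helper names (`build_inverted_lists`, `to_bitmap`, `from_bitmap`), and the use of plain Python integers in place of compressed bitmaps are all hypothetical, and the reachability index is omitted.

```python
from typing import Dict, Iterable, List

# Hypothetical toy data graph: node id -> label.
NODE_LABELS: Dict[int, str] = {0: "a", 1: "b", 2: "b", 3: "c", 4: "b", 5: "c"}


def build_inverted_lists(labels: Dict[int, str]) -> Dict[str, List[int]]:
    """Map each label to the sorted list of node ids carrying that label."""
    inverted: Dict[str, List[int]] = {}
    for node in sorted(labels):
        inverted.setdefault(labels[node], []).append(node)
    return inverted


def to_bitmap(occurrences: Iterable[int], inverted_list: List[int]) -> int:
    """Encode an occurrence list as a bitmap: bit i is set when the i-th
    entry of the label's inverted list belongs to the occurrence list.
    A Python int stands in for a compressed bitmap (an assumption)."""
    position = {node: i for i, node in enumerate(inverted_list)}
    bitmap = 0
    for node in occurrences:
        bitmap |= 1 << position[node]
    return bitmap


def from_bitmap(bitmap: int, inverted_list: List[int]) -> List[int]:
    """Decode a bitmap back into the node ids it represents."""
    return [node for i, node in enumerate(inverted_list) if (bitmap >> i) & 1]


inverted = build_inverted_lists(NODE_LABELS)

# Two hypothetical views that both cover a pattern node labeled "b".
view1_b = to_bitmap([1, 2], inverted["b"])
view2_b = to_bitmap([2, 4], inverted["b"])

# Intersecting the occurrence lists is one bitwise AND on the bitmaps.
combined = view1_b & view2_b
candidates = from_bitmap(combined, inverted["b"])
print(candidates)  # [2]
```

Because each bitmap is positional with respect to the inverted list, a view stores no node identifiers of its own, which is where the space saving comes from. In a full evaluation, the surviving candidates would still have to be checked against child and descendant edges, for example with a reachability index.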
