BB-Graph: A New Subgraph Isomorphism Algorithm for Efficiently Querying Big Graph Databases

With the emergence of big data, the graph database model has become popular because it models complex applications well and supports fast querying, especially for workloads that require costly join operations in relational database management systems (RDBMSs). Finding all exact matches of a query graph in a big graph database, known as the subgraph isomorphism problem, remains a major challenge. Although a number of related studies exist in the literature, the problem is NP-hard, and there is still a need for an algorithm that works efficiently for all types of queries. Current subgraph isomorphism approaches build on Ullmann's idea of pruning out irrelevant candidates. For some graph databases and queries, however, the existing pruning techniques are not adequate to handle complex queries. Moreover, many existing algorithms need large indices, which cause extra memory consumption. Motivated by these limitations, we introduce a new subgraph isomorphism algorithm, BB-Graph, for querying big graph databases efficiently without requiring a large data structure to be stored in main memory. We test BB-Graph against two popular existing approaches, GraphQL and Cypher of Neo4j. Our experiments use a very big graph database application (Population Database) and the publicly available World Cup graph database application. We show that, for most query types, BB-Graph performs better than both of the approaches used for comparison.
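To make the problem statement concrete, the following is a minimal sketch of the generic filter-then-backtrack pattern that pruning-based matchers share. It is not BB-Graph, and it is not Ullmann's refinement procedure: the degree filter is a deliberately simple stand-in for candidate pruning. The names `Graph`, `candidates`, and `find_matches` are hypothetical, graphs are assumed to be undirected and unlabeled, and matches are non-induced (extra edges among matched data vertices are allowed).

```python
from typing import Dict, Iterator, List, Set

# Assumption: an undirected, unlabeled graph stored as adjacency sets.
Graph = Dict[str, Set[str]]


def candidates(query: Graph, data: Graph) -> Dict[str, List[str]]:
    """Pruning step: a data vertex can host a query vertex only if
    its degree is at least the query vertex's degree."""
    return {
        q: [d for d in sorted(data) if len(data[d]) >= len(query[q])]
        for q in query
    }


def find_matches(query: Graph, data: Graph) -> Iterator[Dict[str, str]]:
    """Yield every injective mapping of query vertices to data vertices
    that preserves all query edges."""
    cands = candidates(query, data)
    # Visit the query vertices with the fewest candidates first.
    order = sorted(query, key=lambda q: len(cands[q]))

    def extend(i: int, mapping: Dict[str, str]) -> Iterator[Dict[str, str]]:
        if i == len(order):
            yield dict(mapping)
            return
        q = order[i]
        used = set(mapping.values())
        for d in cands[q]:
            if d in used:
                continue
            # Each already-mapped neighbour of q must map to a neighbour of d.
            if all(mapping[n] in data[d] for n in query[q] if n in mapping):
                mapping[q] = d
                yield from extend(i + 1, mapping)
                del mapping[q]

    yield from extend(0, {})


if __name__ == "__main__":
    # Query: a triangle. Data: a triangle with one pendant vertex.
    query = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
    data = {"1": {"2", "3"}, "2": {"1", "3"}, "3": {"1", "2", "4"}, "4": {"3"}}
    for match in find_matches(query, data):
        print(match)
```

In this toy input the degree filter removes the pendant vertex `4` from every candidate list before the search starts. The worst-case search remains exponential, which is why the quality of the pruning step matters so much on large graphs.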
