Querying graph patterns

Graph data appears in a variety of application domains, and many uses of it, such as querying, matching, and transforming data, naturally result in incompletely specified graph data, i.e., graph patterns. While queries need to be posed against such data, techniques for querying patterns are generally lacking, and the properties of such queries are not well understood. Our goal is to study the basics of querying graph patterns. We first identify key features of patterns, such as node and label variables and edges specified by regular expressions, and define a classification of patterns based on them. We then study standard graph queries on graph patterns, and give precise characterizations of both data and combined complexity for each class of patterns. Where complexity is high, we further analyze the features that lead to intractability, as well as restrictions that lower the complexity. We introduce a new automata model for query answering with two modes of acceptance: one captures queries returning nodes, and the other captures queries returning paths. We study properties of such automata, and the key computational tasks associated with them. Finally, we provide additional restrictions for tractability, and show that some intractable cases can be naturally cast as instances of the constraint satisfaction problem.
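To make the pattern features concrete, the following is a minimal sketch of a graph pattern with a node variable, a label variable, and a regular-expression edge, together with a brute-force test of whether the pattern can be mapped into a concrete graph. It assumes a homomorphism-style reading of patterns, which the abstract does not spell out. The encoding (triples, the `("label" | "var" | "regex", value)` edge specs, single-character labels) and the names `path_words`, `edge_ok`, and `represents` are hypothetical choices made for illustration, and the search over paths is cut off at `max_len` edges, so it is not the paper's algorithm and is only exhaustive up to that bound.

```python
import re
from itertools import product


def path_words(graph, src, dst, max_len):
    """Yield the label words of paths from src to dst with at most max_len edges."""
    frontier = [(src, "")]
    for _ in range(max_len + 1):
        next_frontier = []
        for node, word in frontier:
            if node == dst:
                yield word
            for (u, a, v) in graph:
                if u == node:
                    next_frontier.append((v, word + a))
        frontier = next_frontier


def edge_ok(edge, graph, h, g, max_len):
    """Check one pattern edge under node assignment h and label assignment g."""
    u, (kind, value), v = edge
    # Node constants are not in h, so they map to themselves.
    src, dst = h.get(u, u), h.get(v, v)
    if kind == "regex":
        # Assumption: labels are single characters, so a path's word is a plain string.
        return any(re.fullmatch(value, w) for w in path_words(graph, src, dst, max_len))
    label = g[value] if kind == "var" else value
    return (src, label, dst) in graph


def represents(pattern, graph, node_vars, label_vars, max_len=4):
    """Try every assignment of variables to the graph's nodes and labels."""
    nodes = sorted({x for (u, _, v) in graph for x in (u, v)})
    labels = sorted({a for (_, a, _) in graph})
    nv, lv = sorted(node_vars), sorted(label_vars)
    for n_img in product(nodes, repeat=len(nv)):
        for l_img in product(labels, repeat=len(lv)):
            h, g = dict(zip(nv, n_img)), dict(zip(lv, l_img))
            if all(edge_ok(e, graph, h, g, max_len) for e in pattern):
                return True
    return False


# A concrete graph: n1 -a-> n2 -b-> n3 -b-> n4
graph = {("n1", "a", "n2"), ("n2", "b", "n3"), ("n3", "b", "n4")}

# Pattern: n1 -X-> x, then a path labelled b+ from x to n4
# (x is a node variable, X is a label variable).
pattern = [("n1", ("var", "X"), "x"), ("x", ("regex", "b+"), "n4")]

# A pattern that demands a b-labelled edge leaving n1.
bad_pattern = [("n1", ("label", "b"), "x")]

print(represents(pattern, graph, {"x"}, {"X"}))
print(represents(bad_pattern, graph, {"x"}, set()))
```

The exhaustive search over assignments is exponential in the number of variables, which is in line with the abstract's remark that some classes of patterns lead to intractability; the sketch is meant only to illustrate the three pattern features, not to be efficient.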
