Dependencies for Graphs

This article proposes a class of dependencies for graphs, referred to as graph entity dependencies (GEDs). A GED is defined as a combination of a graph pattern and an attribute dependency. In a uniform format, GEDs can express graph functional dependencies with constant literals to catch inconsistencies, and keys carrying id literals to identify entities (vertices) in a graph. We revise the chase for GEDs and prove its Church-Rosser property. We characterize GED satisfiability and implication, and establish the complexity of these problems and the validation problem for GEDs, in the presence and absence of constant literals and id literals. We also develop a sound, complete and independent axiom system for finite implication of GEDs. In addition, we extend GEDs with built-in predicates or disjunctions, to strike a balance between the expressive power and complexity. We settle the complexity of the satisfiability, implication, and validation problems for these extensions.
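
To make the definition concrete, the following is a minimal sketch of GED validation on a toy graph. It is an illustration only, not the authors' algorithm. It assumes that a GED consists of a pattern plus two sets of literals X and Y, that pattern matches are homomorphisms, and that a literal over a missing attribute does not hold. The graph data, the tuple encoding of literals, and the helper names `matches`, `holds`, and `violations` are all hypothetical.

```python
from itertools import product

_MISSING = object()  # sentinel: an absent attribute never equals anything

# Toy graph (made-up data): node id -> (label, attributes).
NODES = {
    "a1": ("album", {"title": "Bleach", "release": 1989}),
    "a2": ("album", {"title": "Bleach", "release": 1992}),
    "r1": ("artist", {"name": "Nirvana"}),
}
# Edges as (source, edge label, target).
EDGES = {("a1", "recorded_by", "r1"), ("a2", "recorded_by", "r1")}


def matches(pattern, nodes, edges):
    """Brute-force enumeration of homomorphisms from the pattern to the graph."""
    p_nodes, p_edges = pattern
    variables = list(p_nodes)
    for image in product(nodes, repeat=len(variables)):
        h = dict(zip(variables, image))
        labels_ok = all(nodes[h[x]][0] == p_nodes[x] for x in variables)
        edges_ok = all((h[s], lab, h[t]) in edges for s, lab, t in p_edges)
        if labels_ok and edges_ok:
            yield h


def holds(literal, h, nodes):
    """Check one literal under match h (assumed encoding, see lead-in)."""
    kind = literal[0]
    if kind == "const":            # x.A = c
        _, x, attr, value = literal
        return nodes[h[x]][1].get(attr, _MISSING) == value
    if kind == "var":              # x.A = y.B
        _, x, a, y, b = literal
        left = nodes[h[x]][1].get(a, _MISSING)
        right = nodes[h[y]][1].get(b, _MISSING)
        return left is not _MISSING and left == right
    if kind == "id":               # x.id = y.id: both map to the same vertex
        _, x, y = literal
        return h[x] == h[y]
    raise ValueError(f"unknown literal kind: {kind}")


def violations(ged, nodes, edges):
    """Matches that satisfy every literal in X but not every literal in Y."""
    pattern, X, Y = ged
    return [
        h for h in matches(pattern, nodes, edges)
        if all(holds(l, h, nodes) for l in X)
        and not all(holds(l, h, nodes) for l in Y)
    ]


# Key-style GED: two albums by the same artist with equal titles are one entity.
KEY = (
    ({"x": "album", "y": "album", "z": "artist"},
     [("x", "recorded_by", "z"), ("y", "recorded_by", "z")]),
    [("var", "x", "title", "y", "title")],
    [("id", "x", "y")],
)

# GFD-style GED with constant literals: an album titled "Bleach" has release 1989.
GFD = (
    ({"x": "album"}, []),
    [("const", "x", "title", "Bleach")],
    [("const", "x", "release", 1989)],
)

key_violations = violations(KEY, NODES, EDGES)
gfd_violations = violations(GFD, NODES, EDGES)
print(len(key_violations), "key violations;", len(gfd_violations), "GFD violation(s)")
```

In this toy graph the key-style dependency flags the two distinct album vertices that share a title and an artist, while the constant-literal dependency flags the album whose release year disagrees with the expected constant. The enumeration is exponential in the pattern size and is meant only to show the semantics.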
