Independence in Database Relations

We investigate the implication problem for independence atoms $X \bot Y$ between disjoint attribute sets $X$ and $Y$ on database schemata. A relation satisfies $X \bot Y$ if, for every $X$-value and every $Y$-value that occurs in the relation, there is some tuple in the relation in which that $X$-value occurs together with that $Y$-value. We establish an axiomatization by a finite set of Horn rules, and derive an algorithm that decides the implication problem in low-degree polynomial time in the size of the input. We show how to construct Armstrong relations, which satisfy an arbitrarily given set of independence atoms and violate every independence atom not implied by the given set. Our results establish independence atoms as an efficient subclass of embedded multivalued data dependencies; the latter are not axiomatizable by a finite set of Horn rules, and their implication problem is undecidable.
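The satisfaction condition above can be illustrated with a minimal sketch. It assumes a relation is represented as a list of dictionaries mapping attribute names to values. The function name `satisfies_independence` and this representation are hypothetical choices made for illustration, not taken from the paper, and the sketch only checks whether a given relation satisfies one atom; it is not the implication algorithm.

```python
from itertools import product


def satisfies_independence(relation, X, Y):
    """Check whether `relation` satisfies the independence atom X ⊥ Y.

    relation: list of dicts mapping attribute names to values (assumed layout).
    X, Y: disjoint tuples of attribute names.
    """
    # Project every tuple onto X, onto Y, and onto the pair (X, Y).
    x_values = {tuple(t[a] for a in X) for t in relation}
    y_values = {tuple(t[a] for a in Y) for t in relation}
    xy_values = {
        (tuple(t[a] for a in X), tuple(t[a] for a in Y)) for t in relation
    }
    # The atom holds if every occurring X-value is combined with every
    # occurring Y-value in at least one tuple of the relation.
    return all((x, y) in xy_values for x, y in product(x_values, y_values))


# A full cross product of A-values and B-values satisfies A ⊥ B.
full = [
    {"A": 1, "B": "a"},
    {"A": 1, "B": "b"},
    {"A": 2, "B": "a"},
    {"A": 2, "B": "b"},
]
# Dropping one combination (A=2 with B="b") violates A ⊥ B.
partial = full[:3]

result_full = satisfies_independence(full, ("A",), ("B",))
result_partial = satisfies_independence(partial, ("A",), ("B",))
```

Here `result_full` is `True` and `result_partial` is `False`: in `partial`, the value 2 for `A` and the value `"b"` for `B` both occur, but never in the same tuple.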
