Structure Identification in Relational Data

This paper presents several investigations into the prospects for identifying meaningful structures in empirical data, that is, structures that permit effective organization of the data to meet the requirements of future queries. We propose a general framework in which the notion of identifiability is given a precise formal definition similar to that of learnability. Using this framework, we then explore whether a tractable procedure exists for deciding if a given relation is decomposable into a constraint network or a CNF theory with desirable topology and, if the answer is positive, for identifying the desired decomposition. Finally, we address the problem of expressing a given relation as a Horn theory and, if this is impossible, finding the best k-Horn approximation to the given relation. We show that both problems can be solved in time polynomial in the length of the data.
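To make the Horn-representability question concrete, the following minimal sketch relies on a standard fact from propositional logic rather than on the procedure developed in this paper: a set of 0/1 tuples is the model set of some Horn theory exactly when it is closed under componentwise AND. The function names `is_horn_representable` and `horn_closure` are hypothetical and chosen only for illustration, and the relation is assumed to be given as an explicit collection of equal-length 0/1 tuples.

```python
from itertools import combinations


def is_horn_representable(relation):
    """Return True if the 0/1 tuples are closed under componentwise AND.

    Closure under AND is the classical characterization of the model
    sets of Horn theories; this pairwise check is quadratic in the
    number of tuples.
    """
    models = set(relation)
    for a, b in combinations(models, 2):
        meet = tuple(x & y for x, y in zip(a, b))
        if meet not in models:
            return False
    return True


def horn_closure(relation):
    """Return the smallest superset of the relation closed under AND.

    Illustration only: the closure can be much larger than the input,
    so this is not a polynomial-time procedure in general.
    """
    models = set(relation)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow during the pass.
        for a, b in combinations(list(models), 2):
            meet = tuple(x & y for x, y in zip(a, b))
            if meet not in models:
                models.add(meet)
                changed = True
    return models


if __name__ == "__main__":
    r = {(1, 1, 0), (1, 0, 1)}
    print(is_horn_representable(r))
    print(sorted(horn_closure(r)))
```

For the relation {(1, 1, 0), (1, 0, 1)}, the componentwise AND of the two tuples, (1, 0, 0), is missing, so no Horn theory has exactly this set of models; adding that tuple yields the smallest Horn-representable superset. The k-Horn approximation discussed above additionally bounds clause length, which this sketch does not attempt.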
