Automated database schema design using mined data dependencies

Data dependencies are used in database schema design to enforce the correctness of a database as well as to reduce redundant data. These dependencies are usually determined from the semantics of the attributes and are then enforced upon the relations. This article describes a bottom-up procedure for discovering multivalued dependencies (MVDs) in observed data without knowing a priori the relationships among the attributes. The proposed algorithm is an application of the technique we designed for learning conditional independencies in probabilistic reasoning. A prototype system for automated database schema design has been implemented. Experiments were carried out to demonstrate both the effectiveness and efficiency of our method. © 1998 John Wiley & Sons, Inc.
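To make the notion of an MVD holding in observed data concrete, the following is a minimal sketch of the standard definitional test only, not the discovery algorithm proposed in the article. It assumes a relation given as a list of tuples over named attributes; the function name `mvd_holds` and the sample course/teacher/book data are hypothetical and chosen purely for illustration. An MVD X ->> Y holds when, within every group of tuples sharing the same X-value, the Y-values and the values of the remaining attributes occur in all combinations.

```python
from collections import defaultdict


def mvd_holds(rows, attrs, x, y):
    """Return True if the MVD x ->> y holds in the relation `rows`.

    rows  : iterable of tuples, one value per attribute in `attrs`
    attrs : sequence of attribute names, in tuple order
    x, y  : sequences of attribute names (left and right side of the MVD)
    """
    idx = {a: i for i, a in enumerate(attrs)}
    y_only = [a for a in y if a not in x]
    # z holds every attribute that appears on neither side of the MVD.
    z = [a for a in attrs if a not in x and a not in y]

    def project(row, names):
        return tuple(row[idx[a]] for a in names)

    # Group the distinct (Y, Z) value pairs by their X-value.
    groups = defaultdict(set)
    for row in set(rows):
        groups[project(row, x)].add((project(row, y_only), project(row, z)))

    # Each group is a subset of (its Y-values) x (its Z-values); the MVD
    # holds exactly when every group equals that full cross product.
    for pairs in groups.values():
        y_values = {p[0] for p in pairs}
        z_values = {p[1] for p in pairs}
        if len(pairs) != len(y_values) * len(z_values):
            return False
    return True


# Hypothetical sample relation for illustration.
attrs = ("course", "teacher", "book")
rows = [
    ("db", "smith", "A"),
    ("db", "smith", "B"),
    ("db", "jones", "A"),
    ("db", "jones", "B"),
    ("ai", "lee", "C"),
]

holds_full = mvd_holds(rows, attrs, ["course"], ["teacher"])
# Dropping one ("db", ...) tuple breaks the cross-product structure.
holds_partial = mvd_holds(rows[1:], attrs, ["course"], ["teacher"])
```

A test of this kind only verifies a single candidate dependency on a given instance; a discovery procedure such as the one the article describes must additionally decide which candidates to examine.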
