Database design for incomplete relations

Although there has been a vast amount of research in the area of relational database design, to our knowledge there has been very little work that considers whether this theory remains valid when relations in the database may be incomplete. When relations are incomplete, and thus contain null values, the problem of whether satisfaction is additive arises. Additivity is the property that satisfaction of a set of functional dependencies (FDs) F in an incomplete relation is equivalent to the individual satisfaction of each member of F. It is well known that, in general, satisfaction of FDs is not additive. Previously we have shown that satisfaction is additive if and only if the set of FDs is monodependent. We conclude that monodependence is a fundamental desirable property of a set of FDs when considering incomplete information in relational database design. We show that when the set of FDs F satisfies either the intersection property or the split-freeness property, the problem of finding an optimum cover of F can be solved in polynomial time in the size of F; in general, this problem is known to be NP-complete. We also show that when F satisfies the split-freeness property, deciding whether there is a superkey of cardinality k or less can be solved in polynomial time in the size of F, since all the keys have the same cardinality. If F satisfies only the intersection property, this problem is NP-complete, as in the general case. Moreover, we show that when F satisfies either the intersection property or the split-freeness property, deciding whether an attribute is prime can be solved in polynomial time in the size of F; in general, this problem is known to be NP-complete. Assume that a relation schema R is in an appropriate normal form with respect to a set of FDs F. We show that when F satisfies the intersection property, the notions of second normal form and third normal form are equivalent.
We also show that when R is in Boyce-Codd Normal Form (BCNF), F is monodependent if and only if either there is a unique key for R or, for all keys X for R, the cardinality of X is one less than the number of attributes associated with R. Finally, we tackle a long-standing problem in relational database theory by showing that when a set of FDs F over R satisfies the intersection property, it also satisfies the split-freeness property (i.e., is monodependent) if and only if every lossless join decomposition of R with respect to F is also dependency preserving. As a corollary of this result, we are able to show that when F satisfies the intersection property, it also satisfies the split-freeness property (i.e., is monodependent) if and only if every lossless join decomposition of R that is in BCNF is also dependency preserving. Our final result is that when F is monodependent, there exists a unique optimum lossless join decomposition of R that is in BCNF and is also dependency preserving. Furthermore, this decomposition can be obtained in polynomial time in the size of F.
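
The BCNF characterisation stated above (a unique key, or every key having cardinality one less than the number of attributes of R) can be checked mechanically on small schemas. The following Python sketch is illustrative only and is not taken from the paper: the helper names `fd`, `closure`, `candidate_keys`, `is_bcnf` and `bcnf_key_condition` are hypothetical, keys are found by brute-force enumeration (exponential in the number of attributes, unlike the polynomial-time results claimed in the abstract), and it tests only the key condition, not monodependence itself.

```python
from itertools import combinations

def fd(lhs, rhs):
    """Build an FD lhs -> rhs from strings of single-letter attribute names."""
    return (frozenset(lhs), frozenset(rhs))

def closure(attrs, fds):
    """Attribute closure of attrs under the FDs (standard fixpoint iteration)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def candidate_keys(schema, fds):
    """All minimal keys of schema, enumerated by increasing size (brute force)."""
    keys = []
    for size in range(len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            x = frozenset(combo)
            if any(k <= x for k in keys):
                continue  # x contains a smaller key, so it is not minimal
            if closure(x, fds) == schema:
                keys.append(x)
    return keys

def is_bcnf(schema, fds):
    """R is in BCNF if every nontrivial FD in F has a superkey as left-hand side."""
    return all(rhs <= lhs or closure(lhs, fds) == schema for lhs, rhs in fds)

def bcnf_key_condition(schema, fds):
    """Key condition from the abstract: unique key, or all keys of size |R| - 1."""
    keys = candidate_keys(schema, fds)
    return len(keys) == 1 or all(len(k) == len(schema) - 1 for k in keys)

# R(A, B, C) with A -> BC: BCNF, and A is the unique key.
r1, f1 = frozenset("ABC"), [fd("A", "BC")]
# R(A, B, C) with A -> BC and B -> AC: BCNF, but two keys of cardinality 1.
r2, f2 = frozenset("ABC"), [fd("A", "BC"), fd("B", "AC")]

print(is_bcnf(r1, f1), bcnf_key_condition(r1, f1))
print(is_bcnf(r2, f2), bcnf_key_condition(r2, f2))
```

In the second schema both FDs have a key on the left, so R is in BCNF, yet there are two keys whose cardinality is not one less than the number of attributes; by the result stated above, such an F would not be monodependent.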
