Membership Problems for Data Dependencies in Relational Expressions

In relational databases, a query can be formulated as a relational algebra expression using projection, selection, restriction, cross product, and union. In this paper, we consider the membership problem: determining whether a given dependency d is valid in a given relational expression E over a given database scheme R, that is, whether every instance of the view scheme defined by E satisfies d (assuming that the underlying constraints in R are always satisfied). First, consider the case where each relation scheme in R is associated with functional dependencies (FDs) as constraints, and d is an FD. Then the complement of the membership problem is NP-complete. However, if E contains no union, the membership problem can be solved in polynomial time. Furthermore, if E contains neither a union nor a projection, we can construct in polynomial time a cover for the valid FDs in E, that is, a set of FDs that implies every valid FD in E. Next, consider the case where each relation scheme in R is associated with multivalued dependencies (MVDs) as well as FDs, and d is an FD or an MVD. Even if E consists of selections and cross products only, the membership problem is NP-hard. However, if E contains no union and each relation scheme name in R occurs in E at most once, the membership problem can be solved in polynomial time. As a corollary, it can be determined in polynomial time whether a given FD or MVD is valid in R1⋈⋯⋈Rs, where R1,…,Rs are relation schemes with FDs and MVDs, and Ri⋈Rj is the natural join of Ri and Rj.
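
As a point of orientation, the simplest instance of the membership problem is a single relation scheme with FDs and no relational expression around it. In that case, deciding whether an FD X → Y follows from a set of FDs is the standard attribute-closure test. The sketch below illustrates only that textbook base case and is not the algorithm of this paper; the names `fd`, `closure`, and `implies` are hypothetical helpers introduced for illustration, and attributes are assumed to be single-character strings.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# An FD X -> Y is represented as a pair (X, Y) of attribute sets.
FD = Tuple[FrozenSet[str], FrozenSet[str]]


def fd(lhs: Iterable[str], rhs: Iterable[str]) -> FD:
    """Build an FD from two iterables of attribute names (illustrative helper)."""
    return (frozenset(lhs), frozenset(rhs))


def closure(attrs: Iterable[str], fds: List[FD]) -> Set[str]:
    """Return the set of attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its left side is covered and it adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implies(fds: List[FD], lhs: Iterable[str], rhs: Iterable[str]) -> bool:
    """Decide whether `fds` implies the FD lhs -> rhs on a single relation scheme."""
    return set(rhs) <= closure(lhs, fds)


# Example FD set on a scheme with attributes A, B, C: A -> B and B -> C.
example_fds = [fd("A", "B"), fd("B", "C")]
```

With `example_fds`, a call such as `implies(example_fds, "A", "C")` asks whether A → C follows by transitivity. This naive loop runs in polynomial time (quadratic in the worst case), which is enough to illustrate the single-scheme case; the results stated in the abstract concern the harder setting where the FD must hold in every instance of a view defined by E.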
