Incomplete Information in Relational Databases

ABSTRACT

This paper concerns the semantics of Codd's relational model of data. It formulates precise conditions that a semantically meaningful extension of the usual relational operators, such as projection, selection, union, and join, must satisfy when these are extended from operators on relations to operators on tables in which “null values” of various kinds are allowed. These conditions require that the system be safe, in the sense that no incorrect conclusion is derivable by using a specified subset Ω of the relational operators, and that it be complete, in the sense that all valid conclusions expressible by relational expressions using operators in Ω are in fact derivable in the system.

Two such systems of practical interest are shown. The first, based on Codd's usual null values, supports projection and selection. The second, based on many different (“marked”) null values, or variables, allowed to appear in a table, is shown to correctly support projection, positive selection (with no negation occurring in the selection condition), union, and renaming of attributes, which allows for processing arbitrary conjunctive queries. A very desirable property of this system is that all relational operators on tables are performed in exactly the same way as on the usual relations.

A third system, mainly of theoretical interest, supporting projection, selection, union, join, and renaming, is also discussed. Under a so-called closed world assumption, it can also handle the difference operator. It is based on a device called a conditional table and is crucial to the proof of the correctness of the second system. All systems considered allow for relational expressions containing arbitrarily many different relation symbols, and no form of the universal relation assumption is required.

Categories and Subject Descriptors: H.2.3 [Database Management]: Languages - query languages; H.2.4 [Database Management]: Systems - query processing

General Terms: Theory
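The second system described in the abstract can be illustrated with a minimal sketch, assuming a table is a set of tuples whose entries are either constants or marked nulls. The names `Null`, `project`, `select_eq`, `union`, and `sure_answers` are hypothetical and chosen only for illustration; they do not come from the paper. The operators are written exactly as they would be for ordinary relations, with each marked null treated as one more value that is equal only to itself. Reading off the null-free tuples of the result as the certain answers is an assumption of this sketch about how such a result would be interpreted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Null:
    """A marked null: an unknown value identified by its label (hypothetical helper)."""
    label: str


def project(table, positions):
    # Ordinary projection: keep the listed column positions, drop duplicates.
    return {tuple(row[i] for i in positions) for row in table}


def select_eq(table, position, constant):
    # Positive selection (column = constant). A marked null never equals
    # a constant, so rows with a null in that column are not selected.
    return {row for row in table if row[position] == constant}


def union(left, right):
    # Ordinary set union of two tables with the same arity.
    return left | right


def sure_answers(table):
    # Assumption of this sketch: tuples without nulls are the certain answers.
    return {row for row in table if not any(isinstance(v, Null) for v in row)}


x1 = Null("x1")  # the same unknown department for alice and carol
employees = {("alice", x1), ("bob", "sales"), ("carol", x1)}
contractors = {("dave", "sales"), ("erin", Null("x2"))}

staff = union(employees, contractors)
in_sales = project(select_eq(staff, 1, "sales"), [0])
departments = project(staff, [1])

print(sorted(sure_answers(in_sales)))     # [('bob',), ('dave',)]
print(sorted(sure_answers(departments)))  # [('sales',)]
```

Note that `departments` still contains the nulls `x1` and `x2` as entries; only the final step filters them out. Nothing in the operators themselves is null-specific, which is the property the abstract highlights for this system.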
