Towards a minimal set of operations for nested relations

Since the first publications on non-first-normal-form relations in the late 1970s and early 1980s, a variety of formalizations of the data structure of, and the operations for, nested relations have been devised. The data structure itself is defined almost identically in the various approaches; only some subtle differences concerning special cases can be observed. For instance, neither the VERSO relations [AB84] nor the PNF relations of [RKS85] allow relations without an "atomic key", i.e. the set of attributes forming a key must not contain relation-valued attributes; such keys are, however, allowed in [SS84/86].
Concerning operations for nested relations, however, a much broader range of languages has been proposed. Formal operations have been defined in algebra, calculus and SQL style. In this position paper we compare several proposals of algebras for nested relations. We distinguish between "minimal" extensions of the flat algebra (for example those given by [FT83] or [RKS85]), i.e. those that try to get along with just the "Nest" and "Unnest" operations, and "maximal" extensions that supply nested operations (e.g. [AB84, SS84/86]). Others have investigated equivalences between algebras and calculi for nested relations, e.g. [RKS85], van Gucht, Abiteboul and Beeri (for the latter, see this workshop). This issue is related to the problem of identifying a minimal subset of algebraic operators that gives the full expressive power of a certain calculus. (The problem of the powerset operation is treated e.g. by Beeri.)
In the sequel we concentrate on a specific topic: a minimal subset of the algebra from [SS84/86] can be shown to consist of the usual selection, union and unnest operations plus extended projection. In particular, nest, difference and Cartesian product (and thus join) can be expressed in terms of the other operations. This exciting result can be derived by utilizing the concept of "dynamic constants" as introduced in [SS84/86].
Therefore, we argue in favor of maximally extended algebras as opposed to the "minimal" extensions. Problems with empty subrelations and the corresponding null values (cf. [Scho86]) are responsible for the fact that the operations of the maximally extended algebra cannot be expressed by unnesting, applying flat algebra operations and finally nesting back. Rather, a nested algebra must be defined on its own [AB84, SS84/86] to achieve the desired expressive power. This is the reason why others have restricted the scope of the relations considered to certain normal forms (e.g. PNF) or have changed the definitions of the basic algebra operations to reflect these normal forms. With the maximally extended algebras we do not need such restrictions. The algebra defined in [SS84/86] allows relational expressions to be applied at any place in an algebra operation where relation-valued attributes, i.e. subrelations, occur. As an example, consider a nested relation representing departments and employees: Dept(dno, dname, ..., mgrno, Emp(eno, ename, ...)).
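
The problem with empty subrelations can be made concrete with a small Python sketch. It is an illustration under simplifying assumptions, not the formal algebra of [SS84/86]: relations are modeled as lists of dictionaries, unnest is assumed to drop tuples whose subrelation is empty (no null values are generated), and the function names `unnest`, `nest` and `select_in_sub` as well as the sample data are hypothetical. Only the attribute names dno, dname, eno and ename are taken from the Dept example; the elided attributes are left out.

```python
# Hypothetical sample data for Dept(dno, dname, Emp(eno, ename)).
dept = [
    {"dno": 1, "dname": "Sales",
     "Emp": [{"eno": 10, "ename": "Ann"}, {"eno": 11, "ename": "Bob"}]},
    {"dno": 2, "dname": "Research", "Emp": []},  # empty subrelation
]


def unnest(rel, attr):
    """Emit one flat tuple per member of the subrelation `attr`.

    Assumption: tuples with an empty subrelation produce no output
    (no null-valued tuples are generated).
    """
    out = []
    for t in rel:
        rest = {k: v for k, v in t.items() if k != attr}
        for sub in t[attr]:
            out.append({**rest, **sub})
    return out


def nest(rel, attr, sub_attrs):
    """Group tuples that agree outside `sub_attrs`; collect the rest in `attr`."""
    groups = {}
    for t in rel:
        key = tuple(sorted((k, v) for k, v in t.items() if k not in sub_attrs))
        sub = {k: t[k] for k in sub_attrs}
        members = groups.setdefault(key, [])
        if sub not in members:  # relations are sets: no duplicate members
            members.append(sub)
    return [{**dict(key), attr: members} for key, members in groups.items()]


def select_in_sub(rel, attr, pred):
    """Apply a selection inside the subrelation `attr` of every tuple.

    The outer tuple is kept even if its subrelation becomes empty.
    """
    return [{**t, attr: [s for s in t[attr] if pred(s)]} for t in rel]


# Unnesting and nesting back loses the department without employees.
round_trip = nest(unnest(dept, "Emp"), "Emp", ["eno", "ename"])

# The same selection, once applied inside the subrelation and once
# applied on the flat relation between unnest and nest.
nested_sel = select_in_sub(dept, "Emp", lambda e: e["eno"] > 10)
flat_sel = nest(
    [t for t in unnest(dept, "Emp") if t["eno"] > 10], "Emp", ["eno", "ename"]
)
```

In this sketch `nested_sel` still contains both departments (department 2 with an empty Emp subrelation), whereas `round_trip` and `flat_sel` contain only department 1. The selection applied inside the subrelation therefore cannot be reproduced by the unnest, flat selection, nest sequence under the stated assumptions.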