On Impossibility of Decremental Recomputation of Recursive Queries in Relational Calculus and SQL

We study the problem of maintaining recursively defined views, such as the transitive closure of a relation, in traditional relational languages that do not have recursion mechanisms. In particular, we show that the transitive closure cannot be maintained in relational calculus under deletion of edges. We use new proof techniques to show this result. These proof techniques generalize to other languages, for example, to the language for nested relations that also contains a number of aggregate functions. Such a language is considered in this paper as a theoretical reconstruction of SQL. Our proof techniques also generalize to other recursive queries. Consequently, we show that a number of recursive queries cannot be maintained in an SQL-like language. We show that this continues to be true in the presence of certain auxiliary relations. We also relate the complexity of updating transitive closure to that of updating the same-generation query and show that the latter is strictly harder than the former. Then we extend this result to that of updating queries based on context-free sets.

1 Problem Statement and Summary

It is well known that relational calculus (equivalently, first-order logic) cannot express recursive queries such as transitive closure [1]. However, in a real database system, it is reasonable to store both the relation and its transitive closure and update the latter whenever edges are added to or removed from the former. This is known as view maintenance. In this paper we consider whether this update problem for transitive closure and other recursive queries can be solved using relational calculus or its practical SQL-like extensions. We also compare the complexity of maintaining transitive closure against the complexity of maintaining "same generation" and context-free chain queries. In this paper, we use the letter $R$ to denote a binary relation, and $R^{*}$ to denote its transitive closure.
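To make the notation concrete, the following is a minimal sketch of computing the transitive closure of a small binary relation. It assumes relations are represented as Python sets of pairs; the function name `transitive_closure` is hypothetical and chosen only for illustration. The loop that runs until no new pairs appear is exactly the kind of recursion that relational calculus lacks.

```python
def transitive_closure(edges):
    """Return the transitive closure of a binary relation given as a set of (u, v) pairs."""
    closure = set(edges)
    while True:
        # Compose the current closure with itself: (u, w) whenever (u, v) and (v, w).
        new_pairs = {(u, w) for (u, v) in closure for (v2, w) in closure if v == v2}
        if new_pairs <= closure:
            # Fixpoint reached: no composition yields a pair we do not already have.
            return closure
        closure |= new_pairs


R = {(1, 2), (2, 3), (3, 4)}
R_star = transitive_closure(R)
print(sorted(R_star))
# prints [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

Here the closure is the non-reflexive one: a pair $(u, u)$ appears only if $u$ lies on a cycle.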
It can be proved [6, 2] that given $R$, its transitive closure $R^{*}$, and a new edge $(x, y)$ to be added to $R$, the transitive closure $R^{*}_{+(x,y)}$ of $R \cup \{(x, y)\}$ can be expressed in first-order logic and thus in relational calculus. In particular, for all $u$ and $v$, $R^{*}_{+(x,y)}(u, v)$ iff $R^{*}(u, v)$, or $R^{*}(u, x)$ and $y = v$, or $u = x$ and $R^{*}(y, v)$, or $R^{*}(u, x)$ and $R^{*}(y, v)$. Thus transitive closure can be incrementally maintained in a relational database.

The problem of updating the transitive closure after an edge has been removed is more difficult. The best positive solution so far is that of Dong and Su [5]. They proved that if $R$ is acyclic, then the transitive closure $R^{*}_{-(x,y)}$ of $R$ with the edge $(x, y)$ removed can be defined in first-order logic in terms of $R$, $R^{*}$, and $(x, y)$. Thus transitive closure can be decrementally maintained in a relational database provided the relation involved is acyclic. But this is not satisfactory because acyclicity cannot be tested in relational calculus [10].

Database Programming Languages, 1995

Another solution is that of Immerman and Patnaik [14]. They proved that transitive closure of undirected graphs can always be maintained, provided some auxiliary ternary relations can be used. Dong and Su [7] strengthened this result further by showing that transitive closure of undirected graphs can be maintained using only auxiliary binary relations. They also showed that it cannot be done using only auxiliary unary relations.

In Section 2, we prove that transitive closure cannot be decrementally maintained in a relational database in general. That is, $R^{*}_{-(x,y)}$ cannot be expressed in relational calculus in terms of $R$, $R^{*}$, and $(x, y)$ when $R$ is a directed graph that is not necessarily acyclic. We also consider the problem of maintaining transitive closure in a context where some auxiliary relations are available. Dong and Su [7] also obtained results that are similar to ours.
However, the proof techniques involved are very different. Most importantly, their proof technique is only applicable to the particular case of maintaining transitive closure in relational calculus. Ours is much simpler and can be generalized to more expressive languages and other recursive queries. In particular, instead of transitive closure, any query complete for DLOGSPACE can be used.

In Section 3 we show that our technique extends naturally to prove that transitive closure cannot be decrementally maintained using query languages having the power of SQL. That is, we show that the availability of arithmetic operations and GROUP-BY does not help at all. We also extend this result to the presence of simple auxiliary relations. In addition, we exhibit a query that illustrates the additional power of using an SQL-like language incrementally. This query, which is inexpressible in SQL, is expressible incrementally in SQL with certain auxiliary relations but is not expressible incrementally in first-order logic with the same auxiliary relations.

In Section 4, we compare the complexity of maintaining transitive closure against the complexity of maintaining other queries. We prove that it is strictly more difficult to maintain the "same generation" query than to maintain transitive closure. We are also able to generalize this result and show that maintaining context-free chain queries (in a certain sense to be defined) is at least as hard as maintaining transitive closure.

In Section 5 we extend our basic technique to show that the same-generation query cannot be maintained (incrementally or decrementally) in SQL-like languages.

2 Recomputation of Recursive Queries in Relational Calculus

The purpose of this section is to show that the transitive closure of a relation cannot be decrementally maintained in relational calculus or first-order logic.
That is,

Theorem 2.1 There is no relational calculus expression that defines the transitive closure $R^{*}_{-(x,y)}$ of $R - \{(x, y)\}$ in terms of a binary relation $R$, its transitive closure $R^{*}$, and an edge $(x, y)$.

We introduce a new proof technique that is different from that of [7]. In particular, our technique does not rely on games and can be readily extended to other queries and languages. For example, we will show that the analog of Theorem 2.1 holds for a language having the expressive power of SQL.
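The contrast between insertion and deletion can be illustrated with a small sketch. It is not the proof technique of this paper, only an illustration under stated assumptions: relations are Python sets of pairs, and the names `transitive_closure`, `insert_edge`, and `delete_edge_by_recomputation` are hypothetical. The insertion update applies the case analysis from Section 1 once to each pair $(u, v)$, with no iteration to a fixpoint. Because the sketch uses the non-reflexive closure, it also adds the new edge $(x, y)$ itself, which the four listed cases cover only if $R^{*}$ is taken to be reflexive. For deletion, the sketch simply recomputes the closure with a loop, which is a baseline that uses recursion and therefore lies outside relational calculus.

```python
def transitive_closure(edges):
    """Naive fixpoint computation of the (non-reflexive) transitive closure."""
    closure = set(edges)
    while True:
        new_pairs = {(u, w) for (u, v) in closure for (v2, w) in closure if v == v2}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs


def insert_edge(closure, x, y):
    """Update R* after adding edge (x, y), using only R*, x and y.

    Each pair (u, v) is decided by a fixed Boolean test on the old closure,
    so no iteration to a fixpoint is needed.
    """
    nodes = {n for pair in closure for n in pair} | {x, y}

    def reaches(a, b):
        return (a, b) in closure

    updated = set()
    for u in nodes:
        for v in nodes:
            if (
                reaches(u, v)
                or (reaches(u, x) and y == v)
                or (u == x and reaches(y, v))
                or (reaches(u, x) and reaches(y, v))
                or (u == x and v == y)  # the new edge itself (assumption: non-reflexive closure)
            ):
                updated.add((u, v))
    return updated


def delete_edge_by_recomputation(relation, x, y):
    """Baseline for deletion: recompute the closure of R - {(x, y)} from scratch."""
    return transitive_closure(relation - {(x, y)})


path = {(1, 2), (2, 3)}
cycle = path | {(3, 1)}

# Insertion: one pass over pairs reproduces the closure of the cycle.
print(insert_edge(transitive_closure(path), 3, 1) == transitive_closure(cycle))
# prints True

# Deletion: removing (3, 1) from the cycle leaves only the closure of the path.
print(sorted(delete_edge_by_recomputation(cycle, 3, 1)))
# prints [(1, 2), (1, 3), (2, 3)]
```

In the cycle example every pair is in $R^{*}$ before the deletion, so $R^{*}$ alone does not reveal which pairs survive. The update has to consult $R$ itself, and Theorem 2.1 states that no relational calculus expression can do this in general.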