Relational queries computable in polynomial time (Extended Abstract)

Query languages for relational databases have received considerable attention. In 1972 Codd [Cod72] showed that two natural mathematical languages for queries (one algebraic and the other a version of first-order predicate calculus) have identical powers of expressibility. Query languages which are as expressive as Codd's Relational Calculus are sometimes called complete. This term is misleading, however, because many interesting queries are not expressible in "complete" languages.

In this paper we show:

Theorem 2: The Fixpoint Hierarchy collapses at the first fixpoint level. That is, any query expressible with several applications of least fixpoint can already be expressed with one.

We also show:

Theorem 1: Let L be a query language consisting of relational calculus plus the least fixpoint operator. Suppose that L contains a relation symbol for a total ordering relation on the domain (e.g. lexicographic ordering). Then the queries expressible in L are exactly the queries computable in polynomial time.

Theorem 1 was discovered independently by M. Vardi [Var82]. It gives a simple syntactic categorization of those queries which can be answered in polynomial time. Of course, queries requiring polynomial time in the size of the database are usually prohibitively expensive. We also consider weaker languages for expressing less complex queries.
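
As an informal illustration of the least fixpoint operator (not taken from the paper), consider transitive closure, a standard example of a query that relational calculus alone cannot express. The sketch below assumes a binary edge relation E over a finite domain and computes the least fixpoint of T(x, y) <- E(x, y) or (exists z)(E(x, z) and T(z, y)) by iterating from the empty relation. The names least_fixpoint, transitive_closure, and path_edges are hypothetical and chosen only for this example.

```python
from typing import Callable, FrozenSet, Tuple

Edge = Tuple[int, int]
Relation = FrozenSet[Edge]


def least_fixpoint(step: Callable[[Relation], Relation]) -> Tuple[Relation, int]:
    """Iterate a monotone operator from the empty relation until it stabilizes.

    Returns the fixpoint together with the number of stages that added tuples.
    """
    current: Relation = frozenset()
    stages = 0
    while True:
        nxt = step(current)
        if nxt == current:
            return current, stages
        current = nxt
        stages += 1


def transitive_closure(edges: Relation) -> Tuple[Relation, int]:
    """Least fixpoint of T(x, y) <- E(x, y) or exists z. E(x, z) and T(z, y)."""

    def step(t: Relation) -> Relation:
        # One first-order evaluation: join E with the current stage of T.
        joined = {(x, y) for (x, z) in edges for (w, y) in t if z == w}
        return frozenset(edges | joined)

    return least_fixpoint(step)


# Illustrative input: a directed path 1 -> 2 -> 3 -> 4.
path_edges = frozenset({(1, 2), (2, 3), (3, 4)})
closure, stages = transitive_closure(path_edges)
```

Because the operator is monotone and a binary relation over an n-element domain holds at most n^2 tuples, the iteration stabilizes within n^2 stages, and each stage is a single first-order evaluation. This is the informal reason a least fixpoint query can be answered in time polynomial in the size of the database.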