Low Complexity Aggregation in GraphLog and Datalog

We present facilities for computing aggregate functions over sets of tuples and along paths in a database graph. We show how Datalog can be extended to compute a large class of queries with aggregates without incurring the large expense of a language with general set manipulation capabilities. In particular, we aim for queries that can be executed efficiently in parallel, using the class NC and its various subclasses as formal models of low parallel complexity. Our approach retains the standard relational notion of relations as sets of tuples and does not require the introduction of multisets. For the case where no rules are recursive, the language is exactly as expressive as Klug's first-order language with aggregates. We show that this class of non-recursive programs cannot express transitive closure (unless LOGSPACE = NLOGSPACE), thus providing evidence for a widely believed but never proven folk result. We also study the expressive power and complexity of languages that support aggregation over recursion. We then describe how these facilities, together with facilities for manipulating the lengths of paths in database graphs, are incorporated into our visual query language GraphLog. While GraphLog could easily be extended to handle all the queries described above, we prefer to restrict the language in a natural way to avoid explicit recursion; all recursion is expressed as transitive closure. We show that this guarantees that all expressible queries are in NC. We analyze other proposals and show that they can express queries that are logspace-complete for P and thus unlikely to be efficiently parallelizable.
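As an informal illustration of the two kinds of aggregation named above, the following Python sketch sums a column over a set of tuples and takes the minimum of summed edge weights along paths. It is not the paper's formalism or the GraphLog syntax: the function names `group_sum` and `shortest_path_lengths`, the tuple layouts, and the assumption of non-negative edge weights are all hypothetical choices made for this example.

```python
from collections import defaultdict


def group_sum(relation, key_idx, val_idx):
    """Sum column val_idx for each value of column key_idx.

    `relation` is a set of tuples. Rows that share the same summed value
    stay distinct as long as some other attribute differs, so no multiset
    is needed (an assumption of this sketch).
    """
    totals = defaultdict(int)
    for row in relation:
        totals[row[key_idx]] += row[val_idx]
    return {(key, total) for key, total in totals.items()}


def shortest_path_lengths(edges):
    """Aggregate along paths: min over paths of the sum of edge weights.

    `edges` is a set of (source, target, weight) tuples with non-negative
    weights (assumed). Each round composes the current relation with
    itself, so the path length covered at least doubles per round.
    """
    dist = {}
    for u, v, w in edges:
        dist[(u, v)] = min(w, dist.get((u, v), w))
    while True:
        new = dict(dist)
        for (u, mid), d1 in dist.items():
            for (mid2, v), d2 in dist.items():
                if mid == mid2 and d1 + d2 < new.get((u, v), float("inf")):
                    new[(u, v)] = d1 + d2
        if new == dist:  # fixpoint reached: no shorter path was found
            return {(u, v, d) for (u, v), d in new.items()}
        dist = new


sales = {("alice", "toys", 10), ("bob", "toys", 10), ("carol", "books", 7)}
per_category = group_sum(sales, key_idx=1, val_idx=2)

edges = {("a", "b", 1), ("b", "c", 2), ("a", "c", 5), ("c", "d", 1)}
paths = shortest_path_lengths(edges)
```

The doubling loop is chosen only because it needs a number of rounds logarithmic in the longest path considered, which loosely echoes the abstract's interest in low parallel complexity. It makes no claim about how the paper evaluates such queries.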
