User-Defined Aggregates in Database Languages

User-defined aggregates (UDAs) can be the linchpin of sophisticated data mining functions and other advanced database applications, but they find little support in current database systems. In this paper, we describe the SQL-AG prototype, which overcomes these limitations by supporting UDAs as originally proposed in Postgres and SQL3. We then extend the power and flexibility of UDAs by adding (i) early returns, to express online aggregation, and (ii) syntactically recognizable monotonic UDAs, which can be used in recursive queries to support applications, such as Bill of Materials (BoM) and greedy algorithms for graph optimization, that cannot be expressed under stratified aggregation. This paper proposes a unified solution to both the theoretical and practical problems of UDAs and demonstrates the power of UDAs in dealing with advanced database applications.
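To make the idea of early returns concrete, the following is a minimal Python sketch, not SQL-AG syntax and not the prototype's implementation. It assumes a UDA is specified by three steps (here called `initialize`, `iterate`, and `terminate`, names chosen for illustration) and that an early return is simply a partial result emitted before the input is exhausted. The class `OnlineAvg`, the driver `run_uda`, and the parameter `every` are all hypothetical.

```python
from typing import Iterable, Iterator, Optional


class OnlineAvg:
    """Hypothetical UDA: an average that also reports partial results.

    Assumption: an "early return" is modeled as a non-None value
    returned by initialize/iterate after every `every` input tuples.
    """

    def __init__(self, every: int = 2) -> None:
        self.every = every
        self.total = 0.0
        self.count = 0

    def initialize(self, value: float) -> Optional[float]:
        # Called once, on the first input tuple.
        self.total, self.count = float(value), 1
        return self._early_return()

    def iterate(self, value: float) -> Optional[float]:
        # Called on every subsequent input tuple.
        self.total += value
        self.count += 1
        return self._early_return()

    def terminate(self) -> float:
        # Called once, after the last input tuple.
        return self.total / self.count

    def _early_return(self) -> Optional[float]:
        # Emit the running average every `every` tuples, else nothing.
        if self.count % self.every == 0:
            return self.total / self.count
        return None


def run_uda(uda: OnlineAvg, values: Iterable[float]) -> Iterator[float]:
    """Drive the UDA over `values`, yielding early returns and the final result."""
    it = iter(values)
    try:
        first = next(it)
    except StopIteration:
        return  # Empty input: no result in this sketch.
    partial = uda.initialize(first)
    if partial is not None:
        yield partial
    for v in it:
        partial = uda.iterate(v)
        if partial is not None:
            yield partial
    yield uda.terminate()


results = list(run_uda(OnlineAvg(every=2), [10, 20, 30, 40, 50]))
```

In this sketch a consumer sees approximate averages while the scan is still in progress, which is the behavior online aggregation aims for; how often partial results are produced is a design choice left to the UDA author.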
