Aggregate Predicate Support in DBMS

In this paper we consider aggregate predicates and their support in database systems. Aggregate predicates are the predicate counterpart of aggregate functions: they are used to search for tuples that satisfy some aggregate property over a set of tuples, as opposed to simply computing an aggregate property over a set of tuples. The importance of aggregate predicates is exemplified by many modern applications that require ranked search, or top-k queries. Such queries are the norm in multimedia and spatial databases.

In order to support the concept of aggregate predicates in DBMS, we introduce several extensions in the query language and the database engine. Specifically, we extend the SQL syntax to handle aggregate predicates and work out the semantics of such extensions so that they behave correctly in the existing database model. We also introduce a new rk_SORT operator into the database engine, and study relevant indexing and query optimization issues.

Our approach provides several advantages, including enhanced usability and improved performance. By supporting aggregate predicates natively in the database engine, we are able to reuse existing indexing and query optimization techniques, without sacrificing generality or incurring the runtime overhead of database-external approaches. To the best of our knowledge, the proposed framework is the first to support user-defined indexing with aggregate predicates and search based upon user-defined ranking. We also provide empirical results from a simulation study that validates the effectiveness of our approach.
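The distinction between an aggregate function and an aggregate predicate can be illustrated with a minimal Python sketch. It is only an illustration of the concept under stated assumptions: it does not reproduce the SQL extensions or the rk_SORT operator proposed in the paper, and the names `sq_dist_to_origin`, `aggregate_function_min`, and `aggregate_predicate_top_k` are hypothetical. The aggregate function returns a single computed value, whereas the aggregate predicate returns the tuples that qualify under a ranking, here a top-k nearest-neighbor search.

```python
import heapq

def sq_dist_to_origin(point):
    """Hypothetical ranking function: squared distance to the origin."""
    x, y = point
    return x * x + y * y

def aggregate_function_min(tuples, score):
    """Aggregate function: compute one value (the minimum score) over the set."""
    return min(score(t) for t in tuples)

def aggregate_predicate_top_k(tuples, score, k):
    """Aggregate predicate (illustrative): select the k tuples with the
    smallest scores, i.e. a ranked top-k search over the set."""
    return heapq.nsmallest(k, tuples, key=score)

# Hypothetical sample data for illustration only.
points = [(0, 0), (3, 4), (1, 1), (6, 8), (2, 0)]

best_score = aggregate_function_min(points, sq_dist_to_origin)
nearest = aggregate_predicate_top_k(points, sq_dist_to_origin, 3)
```

In this sketch `best_score` is a number, while `nearest` is a list of tuples drawn from the input. Evaluating such a predicate outside the database requires fetching the whole set first, which is the kind of database-external overhead the abstract refers to.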
