NiagaraCQ: a scalable continuous query system for Internet databases

Continuous queries are persistent queries that allow users to receive new results when they become available. While continuous query systems can transform a passive web into an active environment, they need to be able to support millions of queries due to the scale of the Internet. No existing system has achieved this level of scalability. NiagaraCQ addresses this problem by grouping continuous queries based on the observation that many web queries share similar structures. Grouped queries can share common computation, tend to fit in memory, and can reduce I/O cost significantly. Furthermore, grouping on selection predicates can eliminate a large number of unnecessary query invocations. Our grouping technique is distinguished from previous group optimization approaches in the following ways. First, we use an incremental group optimization strategy with dynamic re-grouping. New queries are added to existing query groups without having to regroup already installed queries. Second, we use a query-split scheme that requires minimal changes to a general-purpose query engine. Third, NiagaraCQ groups both change-based and timer-based queries in a uniform way. To ensure that NiagaraCQ is scalable, we have also employed other techniques, including incremental evaluation of continuous queries, use of both pull and push models for detecting heterogeneous data source changes, and memory caching. This paper presents the design of the NiagaraCQ system and gives some experimental results on the system's performance and scalability.
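The idea of grouping on selection predicates can be illustrated with a minimal sketch. It is not the NiagaraCQ implementation: the `Query` and `GroupIndex` names, the restriction to single equality predicates, and the in-memory dictionary layout are all assumptions made for illustration. Queries that share a predicate structure (here, the same attribute) are placed in one group, and their differing constants are kept in a lookup table, so a changed tuple triggers only the queries whose constant matches instead of every installed query.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Hypothetical continuous query with one equality selection predicate."""
    query_id: str
    attribute: str  # attribute tested by the predicate, e.g. "symbol"
    constant: str   # constant it is compared against, e.g. "INTC"


class GroupIndex:
    """Groups queries by predicate structure (assumed here: the attribute name)."""

    def __init__(self):
        # attribute -> constant -> ids of the queries sharing that predicate
        self._groups = defaultdict(lambda: defaultdict(list))

    def install(self, query):
        # Incremental grouping: a new query joins an existing group (or starts
        # a new one) without touching queries that are already installed.
        self._groups[query.attribute][query.constant].append(query.query_id)

    def affected(self, changed_tuple):
        # One table lookup per group replaces one invocation per query.
        hits = []
        for attribute, constant_table in self._groups.items():
            value = changed_tuple.get(attribute)
            hits.extend(constant_table.get(value, []))
        return hits


index = GroupIndex()
index.install(Query("Q1", "symbol", "INTC"))
index.install(Query("Q2", "symbol", "MSFT"))
index.install(Query("Q3", "symbol", "INTC"))

# Only the queries whose constant matches the changed tuple are selected.
print(index.affected({"symbol": "INTC", "price": 120}))
```

In this sketch, installing `Q3` after `Q1` and `Q2` only appends to an existing table entry, which mirrors the incremental strategy the abstract describes; the query-split scheme, timer-based queries, and re-grouping are outside its scope.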
