Efficient and Adaptive Processing of Multiple Continuous Queries

Continuous queries are long-running queries executed on data streams over a potentially open-ended time interval specified by the user. During query execution, the data streams are likely to exhibit fluctuating characteristics, such as varying inter-arrival times and varying data characteristics. In the presence of such unpredictable factors, continuous query systems must still handle a large number of queries efficiently while offering acceptable performance for each individual query. In this paper, we propose and discuss a novel framework, called AdaptiveCQ, for the efficient processing of multiple continuous queries. In our framework, multiple queries share intermediate results at a fine level of granularity. Unlike previous approaches to sharing or reuse, which relied on materialization to disk, AdaptiveCQ allows on-the-fly sharing of results. We show that this feature improves both the initial query response time and the overall response time. Finally, AdaptiveCQ, which extends the idea behind the eddy query-processing model, adapts well to fluctuations in data stream characteristics through this combination of fine-grained and on-the-fly sharing. We implemented AdaptiveCQ from scratch in Java and used it to conduct our experiments. We present experimental results that substantiate our claim that AdaptiveCQ can provide substantial performance improvements over existing methods of reusing intermediate results that rely on materialization to disk. In addition, we show that AdaptiveCQ adapts well to fluctuations in the query environment.
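
To make the idea of fine-grained, on-the-fly sharing concrete, the following is a minimal sketch in Java. It is not the AdaptiveCQ implementation, and the class, method, and field names (SharedFilterSketch, Query, route) are hypothetical. It assumes that each query is a conjunction of selection predicates drawn from a common pool, and it shows one simple way several queries can reuse a predicate result computed for an incoming tuple in memory, with nothing materialized to disk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

// Illustrative only: names and structure are assumptions, not taken from the paper.
class SharedFilterSketch {

    // A hypothetical query: a conjunction of predicates, referenced by index
    // into a pool of predicates shared by all registered queries.
    static final class Query {
        final String name;
        final int[] predicateIds;

        Query(String name, int... predicateIds) {
            this.name = name;
            this.predicateIds = predicateIds;
        }
    }

    // Routes one stream tuple to every query it satisfies. Each distinct
    // predicate is evaluated at most once per tuple; later queries reuse the
    // in-memory result. evalCount[0] is incremented per actual evaluation.
    static List<String> route(int tuple, List<IntPredicate> predicates,
                              List<Query> queries, int[] evalCount) {
        Boolean[] cache = new Boolean[predicates.size()]; // null = not yet evaluated
        List<String> matched = new ArrayList<>();
        for (Query q : queries) {
            boolean ok = true;
            for (int id : q.predicateIds) {
                if (cache[id] == null) {
                    cache[id] = predicates.get(id).test(tuple);
                    evalCount[0]++;
                }
                if (!cache[id]) {
                    ok = false;
                    break;
                }
            }
            if (ok) {
                matched.add(q.name);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<IntPredicate> predicates = Arrays.asList(
                x -> x > 10,      // predicate 0
                x -> x % 2 == 0); // predicate 1
        List<Query> queries = Arrays.asList(
                new Query("Q1", 0),
                new Query("Q2", 0, 1),
                new Query("Q3", 1));
        int[] evalCount = {0};
        System.out.println(route(12, predicates, queries, evalCount)
                + " with " + evalCount[0] + " predicate evaluations");
    }
}
```

In this sketch, three queries reference four predicates in total, yet a tuple triggers at most two evaluations because results are shared across queries. An eddy-style system would additionally reorder operators per tuple as stream characteristics change; that adaptive routing is deliberately left out here.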
