Continuously adaptive continuous queries over streams

We present a continuously adaptive, continuous query (CACQ) implementation based on the eddy query processing framework. We show that our design provides significant performance benefits over existing approaches to evaluating continuous queries, not only because of its adaptivity, but also because of the aggressive cross-query sharing of work and space that it enables. By breaking the abstraction of shared relational algebra expressions, our Telegraph CACQ implementation is able to share physical operators (both selections and join state) at a very fine grain. We augment these features with a grouped-filter index to simultaneously evaluate multiple selection predicates. We include measurements of the performance of our core system, along with a comparison to existing continuous query approaches.
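
To make the idea of evaluating many selection predicates at once more concrete, the following is a minimal sketch of an index over single-attribute predicates drawn from several queries. The abstract does not describe the internal layout of the grouped filter, so everything here is an assumption made for illustration: the class name `GroupedFilter`, the methods `add` and `matches`, the restriction to the operators `>`, `<`, and `==`, and the use of sorted lists with binary search.

```python
import bisect
from collections import defaultdict


class GroupedFilter:
    """Hypothetical index over predicates on ONE attribute, from many queries.

    Illustrative only: not the data structure used in the paper.
    """

    def __init__(self):
        # Parallel sorted lists: constants and the query that owns each one.
        self._gt_consts, self._gt_ids = [], []   # predicates 'attr > c'
        self._lt_consts, self._lt_ids = [], []   # predicates 'attr < c'
        self._eq = defaultdict(set)              # predicates 'attr == c'

    @staticmethod
    def _insert(consts, ids, constant, query_id):
        # Keep both lists aligned and sorted by constant.
        i = bisect.bisect_right(consts, constant)
        consts.insert(i, constant)
        ids.insert(i, query_id)

    def add(self, query_id, op, constant):
        """Register one predicate 'attr <op> constant' for a query."""
        if op == ">":
            self._insert(self._gt_consts, self._gt_ids, constant, query_id)
        elif op == "<":
            self._insert(self._lt_consts, self._lt_ids, constant, query_id)
        elif op == "==":
            self._eq[constant].add(query_id)
        else:
            raise ValueError(f"unsupported operator: {op!r}")

    def matches(self, value):
        """Return the ids of all queries whose predicate accepts `value`."""
        result = set()
        # 'attr > c' holds for every constant strictly below the value.
        n = bisect.bisect_left(self._gt_consts, value)
        result.update(self._gt_ids[:n])
        # 'attr < c' holds for every constant strictly above the value.
        m = bisect.bisect_right(self._lt_consts, value)
        result.update(self._lt_ids[m:])
        # 'attr == c' is a direct hash lookup.
        result.update(self._eq.get(value, ()))
        return result


gf = GroupedFilter()
gf.add("q1", ">", 10)
gf.add("q2", ">", 20)
gf.add("q3", "<", 15)
gf.add("q4", "==", 12)
print(gf.matches(12))  # the set containing q1, q3 and q4 (order may vary)
```

In this sketch, one probe per incoming tuple locates the satisfied predicates for all registered queries by binary search and a hash lookup, rather than testing each query's predicate separately. Collecting the matching ids is still proportional to the number of matches.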
