Query Processing, Approximation, and Resource Management in a Data Stream Management System

This paper describes our ongoing work on the Stanford Stream Data Manager (STREAM), a system for executing continuous queries over multiple continuous data streams. STREAM supports a declarative query language and copes with high data rates and query workloads by providing approximate answers when resources are limited. We describe the specific contributions made so far and enumerate our next steps toward a general-purpose Data Stream Management System.
