Querying Sliding Windows Over Online Data Streams

A data stream is a real-time, continuous, ordered sequence of items generated by sources such as sensor networks, Internet traffic flow, credit card transaction logs, and on-line financial tickers. Processing continuous queries over data streams introduces a number of research problems, one of which concerns evaluating queries over sliding windows defined on the inputs. In this paper, we describe our research on sliding window query processing, with an emphasis on query models and algebras, physical and logical optimization, efficient processing of multiple windowed queries, and generating approximate answers. We outline previous work in streaming query processing and sliding window algorithms, summarize our contributions to date, and identify directions for future work.
