On Indexing Sliding Windows over Online Data Streams

We consider the problem of indexing sliding windows in main memory over online data streams. Our proposed data structures and query semantics are based on dividing the sliding window into sub-windows. By classifying windowed operators according to their method of execution, we motivate the need for two types of windowed indices: those that provide a list of attribute values and their counts for answering set-valued queries, and those that provide direct access to tuples for answering attribute-valued queries. We propose and evaluate indices for both cases and show that our techniques are more efficient than executing windowed queries without an index.
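To make the first kind of index concrete, the following is a minimal sketch of a count-based sliding window divided into sub-windows, where each sub-window keeps its own table of attribute values and counts. It is an illustration under stated assumptions and not the data structure evaluated in the paper: the class name `SubWindowCountIndex`, its parameters, and the choice of a count-based (rather than time-based) window are all hypothetical.

```python
from collections import Counter, deque


class SubWindowCountIndex:
    """Hypothetical sketch: a sliding window split into fixed-size sub-windows.

    Each sub-window holds a Counter mapping attribute values to counts.
    Expiry is coarse-grained: the oldest sub-window is dropped as a whole.
    """

    def __init__(self, num_subwindows, subwindow_size):
        self.num_subwindows = num_subwindows
        self.subwindow_size = subwindow_size
        self.subwindows = deque([Counter()])  # newest sub-window is last
        self.newest_fill = 0  # number of tuples in the newest sub-window

    def insert(self, value):
        """Add one attribute value, rolling over sub-windows as needed."""
        if self.newest_fill == self.subwindow_size:
            # The newest sub-window is full, so open a fresh one.
            self.subwindows.append(Counter())
            self.newest_fill = 0
            # Expire the oldest sub-window once the window is over capacity.
            if len(self.subwindows) > self.num_subwindows:
                self.subwindows.popleft()
        self.subwindows[-1][value] += 1
        self.newest_fill += 1

    def value_counts(self):
        """Set-valued query: every value in the window with its count."""
        total = Counter()
        for sub in self.subwindows:
            total.update(sub)
        return total

    def count(self, value):
        """Count of a single value, summed across sub-windows."""
        return sum(sub[value] for sub in self.subwindows)


index = SubWindowCountIndex(num_subwindows=2, subwindow_size=2)
for v in ["a", "b", "a", "c", "d"]:
    index.insert(v)
print(dict(index.value_counts()))
```

In this sketch, expiring old data costs one `popleft` per sub-window instead of one deletion per tuple, at the price of a window whose size varies between full and nearly full sub-window multiples. An index of the second kind, giving direct access to tuples, would store tuple references in each sub-window rather than counts.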
