Window Specification over Data Streams

Several query languages have been proposed for managing data streams in modern monitoring applications. Continuous queries expressed in these languages usually employ windowing constructs in order to extract finite portions of the potentially unbounded stream. Explicitly or not, window specifications rely on ordering. Usually, timestamps are attached to all tuples flowing into the system as a means to provide ordered access to data items. Several window types have been implemented in stream prototype systems, but a precise definition of their semantics is still lacking. In this paper, we describe a formal framework for expressing windows in continuous queries over data streams. After classifying windows according to their basic characteristics, we give algebraic expressions for the most significant window types commonly appearing in applications. As an essential step towards a stream algebra, we then propose formal definitions for the windowed analogs of typical relational operators, such as join, union or aggregation, and we identify several properties useful to query optimization.
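To make the windowing idea concrete, here is a minimal Python sketch of a time-based sliding window applied to timestamped tuples, followed by a windowed aggregate. It is not the formal definition proposed in the paper. The names `StreamTuple`, `sliding_window`, and `windowed_sum` are hypothetical, and the half-open interval (t - width, t] is an assumption chosen for illustration.

```python
from typing import Iterable, List, NamedTuple


class StreamTuple(NamedTuple):
    """A stream item carrying the timestamp that provides ordering."""
    timestamp: int
    value: float


def sliding_window(stream: Iterable[StreamTuple], t: int, width: int) -> List[StreamTuple]:
    """Return the finite portion of the stream with timestamps in (t - width, t].

    `stream` stands for the prefix of the unbounded stream seen so far.
    The interval bounds are an assumption of this sketch.
    """
    return [s for s in stream if t - width < s.timestamp <= t]


def windowed_sum(stream: Iterable[StreamTuple], t: int, width: int) -> float:
    """Windowed analog of SUM: aggregate only the tuples inside the window at time t."""
    return sum(s.value for s in sliding_window(stream, t, width))


if __name__ == "__main__":
    seen = [
        StreamTuple(1, 10.0),
        StreamTuple(2, 20.0),
        StreamTuple(3, 30.0),
        StreamTuple(5, 50.0),
    ]
    # Window of width 3 evaluated at time 5 covers timestamps 3, 4, 5.
    print(sliding_window(seen, t=5, width=3))
    print(windowed_sum(seen, t=5, width=3))
```

Re-evaluating the same functions at successive values of `t` mimics how a continuous query refreshes its result as the window slides over newly arriving tuples.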
