Research on Sliding Window Join Semantics and Join Algorithm in Heterogeneous Data Streams

Sliding windows over data streams have rich semantics, and different streams may use different window semantics, so the semantics of joins between windows of different types become very complicated. The basic join semantics of data streams, the join semantics of tuple-based sliding windows, and the join semantics of time-based sliding windows partly define the semantics of stream joins, but they do not resolve the heterogeneity of sliding windows. In this paper we present a join semantic model for multiple data streams based on matching window identifiers. Window identifiers hide the differences in window attribute, window size, and window slide. A sliding window is divided into a number of sub-windows; when the newest sub-window fills up, it is appended to the sliding window and the oldest sub-window in the sliding window is removed. We use the equivalence of the overlapping sub-windows shared by adjacent sliding windows to reduce the number of join computations. We propose a corresponding window join algorithm that maintains the windows. Theoretical and experimental analysis shows that the window-identifier join model can synchronize multiple data streams.
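
The sub-window mechanism described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm, which the abstract does not spell out. The names `SubWindowedStream` and `join_windows`, the integer sub-window identifiers, the equality join predicate, and the per-pair result cache are all assumptions made for illustration. The sketch shows two ideas only: a window advances by one whole sub-window at a time, and join results for a pair of sub-windows that survives into the next sliding window can be reused rather than recomputed.

```python
from collections import deque
from itertools import count


class SubWindowedStream:
    """Sliding window kept as a queue of full, fixed-capacity sub-windows (hypothetical)."""

    def __init__(self, sub_capacity, num_sub_windows):
        self.sub_capacity = sub_capacity        # tuples per sub-window
        self.num_sub_windows = num_sub_windows  # sub-windows per sliding window
        self.window = deque()                   # entries are (sub_id, tuples)
        self.filling = []                       # newest, not yet full sub-window
        self._ids = count()                     # assumed identifier scheme: 0, 1, 2, ...

    def push(self, item):
        """Add one tuple. Return (appended_id, evicted_id) when a sub-window fills, else None."""
        self.filling.append(item)
        if len(self.filling) < self.sub_capacity:
            return None
        sub_id = next(self._ids)
        self.window.append((sub_id, self.filling))
        self.filling = []
        evicted = None
        if len(self.window) > self.num_sub_windows:
            evicted = self.window.popleft()[0]  # drop the oldest sub-window
        return sub_id, evicted


def join_windows(left, right, key, cache):
    """Equi-join two current windows, reusing cached results per sub-window pair."""
    live = set()
    result = []
    for lid, ltuples in left.window:
        for rid, rtuples in right.window:
            pair = (lid, rid)
            live.add(pair)
            if pair not in cache:               # only new pairs are actually joined
                cache[pair] = [(l, r) for l in ltuples for r in rtuples
                               if key(l) == key(r)]
            result.extend(cache[pair])
    for pair in list(cache):                    # forget pairs with an evicted sub-window
        if pair not in live:
            del cache[pair]
    return result
```

In this sketch the two streams may use different sub-window capacities and window lengths; the join only ever refers to sub-windows by identifier, which is one way to read the abstract's claim that identifiers hide differences in window size and slide.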