Streaming data integration: Challenges and opportunities

In this position paper, we motivate the need for streaming data integration in three main forms: across multiple streaming data sources, over multiple stream processing engine instances, and between stream processing engines and traditional database systems. We argue that this need presents a broad range of challenges and opportunities for new research. We provide an overview of the young state of the art in this area and then discuss a selected set of concrete research topics currently under investigation within our MaxStream federated stream processing project at ETH Zurich.
