Data Stream Management Systems

In many application fields, such as production lines or stock analysis, large amounts of data must be generated and processed at high rates. Such continuous data flows, of unknown size and with no defined end, are called data streams. Processing and analyzing data streams is a challenge for conventional data management systems because they have to operate and deliver results in real time. Data Stream Management Systems (DSMS), an advancement of database management systems, have been developed to address these issues. A DSMS has to adapt to the notion of data streams on various levels, such as query languages, processing, and optimization. In this chapter, we give an overview of the basics of data streams, the architectural principles of DSMS, and the query languages they use. Furthermore, we detail data quality aspects in DSMS, as these play an important role in various applications based on data streams. Finally, the chapter includes a list of research and commercial DSMS and their key properties.
