Toward Simulation-Based Optimization in Data Stream Management Systems

Our demonstration introduces a novel system architecture that substantially simplifies optimization in data stream management systems (DSMS). The basic idea is to decouple optimization from the operative system by means of a secondary optimization system, which bears the burden of determining new query plans. Within the secondary system, which typically runs on a separate machine, we simulate the original data streams using suitable statistical models of them. Because the simulation can run at much higher rates than the original streams, we can examine and assess new query plans in less time and without the risk of degrading the plan currently running in the operative system; only plans that have proven themselves in the simulation are migrated into the operative system. In our demonstration, we present our prototype implementation of this optimization architecture and show the interaction between the primary and secondary systems as well as the key features of the overall optimization process.
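To make the decoupling concrete, the following minimal Python sketch mimics what a secondary system might do: fit a statistical model to a sample of the original stream, simulate tuples from that model, and accept a candidate plan only if it is cheaper on the simulated data. It is an illustration under stated assumptions, not the prototype's implementation. The Gaussian stream model, the per-tuple cost model, and all names (`fit_model`, `simulate`, `plan_cost`, `choose_plan`) are hypothetical, and plan migration itself is not modelled.

```python
import random
import statistics


def fit_model(sample):
    """Summarize an observed stream sample by its mean and standard deviation.

    Assumption: a Gaussian is an adequate model of the stream values.
    """
    return statistics.mean(sample), statistics.stdev(sample)


def simulate(model, n, seed=0):
    """Draw n synthetic tuples from the fitted model (no waiting for real data)."""
    mean, std = model
    rng = random.Random(seed)
    return [rng.gauss(mean, std) for _ in range(n)]


def plan_cost(plan, tuples):
    """Hypothetical cost model: a plan is an ordered list of (cost, predicate)
    filters; each filter charges its cost for every tuple that reaches it."""
    total = 0.0
    for value in tuples:
        for cost, predicate in plan:
            total += cost
            if not predicate(value):
                break  # tuple is dropped, later filters never see it
    return total


def choose_plan(current, candidates, tuples):
    """Return the cheapest plan on the simulated tuples and its cost.

    The current plan is kept unless a candidate is strictly cheaper.
    """
    best, best_cost = current, plan_cost(current, tuples)
    for plan in candidates:
        cost = plan_cost(plan, tuples)
        if cost < best_cost:
            best, best_cost = plan, cost
    return best, best_cost


# Two filters: a cheap, selective one and an expensive, unselective one.
cheap_selective = (1.0, lambda v: v > 45.0)
expensive_broad = (5.0, lambda v: v > 0.0)

current_plan = [expensive_broad, cheap_selective]    # running in the primary system
candidate_plan = [cheap_selective, expensive_broad]  # proposed reordering

model = fit_model([10.0, 20.0, 30.0, 40.0, 50.0])    # sample taken from the stream
simulated = simulate(model, n=1000)

best_plan, best_cost = choose_plan(current_plan, [candidate_plan], simulated)
migrate = best_plan is not current_plan
```

In this sketch the candidate is approved because applying the cheap, selective filter first spares most tuples the expensive filter; the primary system is never touched while the comparison runs.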
