Dynamic Wavelet Synopses Management over Sliding Windows in Sensor Networks

Because data streams are dynamic, a sliding window is commonly used to generate synopses that approximate the most recent data within the retrospective horizon, so that queries can be answered and patterns discovered. In this paper, we propose a dynamic scheme for managing wavelet synopses in sensor networks. We define a data structure, the sliding dual tree (SDT), that generates dynamic synopses adapting to insertions and deletions in the most recent sliding window. By exploiting the properties of the Haar wavelet transform, we develop several operations that incrementally maintain the SDT over consecutive time windows in a time- and space-efficient manner. These operations work directly in the transformed time-frequency domain, without storing or reconstructing the original data. Our analysis shows that the SDT-based approach greatly reduces the resources required for synopsis generation and maximizes the storage utilization of wavelet synopses with respect to window length and quality measures. We also show that the approximation error of the dynamic wavelet synopses, measured as the L2-norm error, can be updated incrementally, and we derive a bound on the overestimation of this error caused by the incremental thresholding scheme. The synopses can answer various kinds of numerical queries, such as point and distance queries. In addition, the SDT can adapt to resource allocation to further improve overall storage utilization over time. Our experimental results demonstrate that the proposed framework can outperform current techniques on both real and synthetic data.
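To make the role of the Haar transform and the L2-norm error concrete, the following is a minimal Python sketch of a thresholded Haar synopsis over a single fixed window. It is not the SDT and does not implement the incremental operations described above; the function names (`haar_transform`, `inverse_haar`, `keep_largest`, `dropped_energy`), the sample window, and the keep-the-k-largest thresholding rule are all assumptions made for illustration. It relies only on a standard property of the orthonormal Haar basis: the squared L2 error of the reconstruction equals the summed squares of the discarded coefficients, so the error can be tracked in the coefficient domain without the original data.

```python
import math


def haar_transform(data):
    """Orthonormal Haar decomposition; len(data) must be a power of two."""
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    s = math.sqrt(2.0)
    approx = [float(x) for x in data]
    details_all = []  # ordered from coarsest to finest level
    while len(approx) > 1:
        pairs = range(0, len(approx), 2)
        averages = [(approx[i] + approx[i + 1]) / s for i in pairs]
        details = [(approx[i] - approx[i + 1]) / s for i in pairs]
        details_all = details + details_all  # coarser levels go in front
        approx = averages
    return approx + details_all


def inverse_haar(coeffs):
    """Invert haar_transform, rebuilding one resolution level at a time."""
    s = math.sqrt(2.0)
    approx = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(approx)]
        pos += len(approx)
        finer = []
        for a, d in zip(approx, details):
            finer.append((a + d) / s)
            finer.append((a - d) / s)
        approx = finer
    return approx


def keep_largest(coeffs, k):
    """Illustrative thresholding: zero all but the k largest-magnitude coefficients."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:k])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


def dropped_energy(coeffs, kept):
    """Squared L2 error computed purely from the discarded coefficients."""
    return sum((c - q) ** 2 for c, q in zip(coeffs, kept))


if __name__ == "__main__":
    window = [2, 2, 0, 2, 3, 5, 4, 4]  # hypothetical sensor readings
    coeffs = haar_transform(window)
    synopsis = keep_largest(coeffs, 3)
    approx = inverse_haar(synopsis)
    direct = sum((x - y) ** 2 for x, y in zip(window, approx))
    print("squared L2 error from data:        ", direct)
    print("squared L2 error from coefficients:", dropped_energy(coeffs, synopsis))
```

The two printed values agree up to floating-point rounding. A point query in this sketch is simply an index into `inverse_haar(synopsis)`; a sliding-window scheme such as the one proposed here would instead update the coefficients as readings enter and leave the window.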
