ε-Approximation to data streams in sensor networks

Rapid developments in processor, memory, and radio technology have enabled decentralized sensor networks built from small, inexpensive nodes capable of sensing, computation, and wireless communication. Because sensor networks have limited communication bandwidth and other resource constraints, there is an important practical need to compress the time series data generated by sensor nodes online and with a precision guarantee. Although many data compression algorithms have been proposed to reduce data volume, their offline nature or super-linear time complexity prevents them from being applied directly to time series data generated by sensor nodes. To remedy these deficiencies, we propose GDPLA, an optimal online algorithm for constructing a disconnected piecewise linear approximation of a time series, which guarantees that the vertical distance between each real data point and the corresponding fitted line is at most ε. GDPLA generates the minimum number of segments needed to approximate a time series within this precision guarantee, and it runs in linear time O(n) with a constant coefficient of at most 6, where one unit denotes the cost of comparing the slopes of two lines. Its low cost makes the method well suited to resource-constrained sensor networks. Extensive experiments on a real dataset demonstrate the superior compression performance of our approach.
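To make the ε guarantee concrete, the following Python sketch builds a disconnected piecewise linear approximation online. It is not GDPLA: it is a simpler greedy slope-bounding filter that anchors each segment at its first data point, so it does not in general produce the minimum number of segments. The function name `greedy_pla`, the segment tuple layout, and the assumption of strictly increasing timestamps are illustrative choices, not part of the paper.

```python
from typing import Iterable, List, Tuple

# Hypothetical segment layout: (t_start, t_end, y_start, slope)
Segment = Tuple[float, float, float, float]

NEG_INF = float("-inf")
POS_INF = float("inf")


def _pick_slope(lo: float, hi: float) -> float:
    """Choose a slope inside [lo, hi]; a single-point segment gets slope 0."""
    return 0.0 if lo == NEG_INF else (lo + hi) / 2.0


def greedy_pla(points: Iterable[Tuple[float, float]], eps: float) -> List[Segment]:
    """Greedy online PLA with vertical error at most eps (not optimal).

    Assumes timestamps are strictly increasing. Each segment is anchored
    at its first point (t0, y0); we keep the interval [lo, hi] of slopes
    for which every point seen so far lies within eps of the line.
    """
    segments: List[Segment] = []
    it = iter(points)
    try:
        t0, y0 = next(it)
    except StopIteration:
        return segments

    lo, hi = NEG_INF, POS_INF
    t_last = t0
    for t, y in it:
        dt = t - t0
        # Slopes that keep the line within [y - eps, y + eps] at time t.
        new_lo = max(lo, (y - eps - y0) / dt)
        new_hi = min(hi, (y + eps - y0) / dt)
        if new_lo <= new_hi:
            # The current segment can still cover this point.
            lo, hi, t_last = new_lo, new_hi, t
        else:
            # No feasible slope remains: close the segment and start a
            # new, disconnected one anchored at the current point.
            segments.append((t0, t_last, y0, _pick_slope(lo, hi)))
            t0, y0, t_last = t, y, t
            lo, hi = NEG_INF, POS_INF
    segments.append((t0, t_last, y0, _pick_slope(lo, hi)))
    return segments
```

Each point costs a constant number of arithmetic operations and comparisons, so the sketch runs in O(n) time with O(1) state per open segment. For example, `greedy_pla([(0, 0), (1, 1), (2, 2), (3, 3), (4, 0), (5, -3)], 0.5)` splits the stream at t = 4, where no single slope through the first anchor can stay within ε of all points.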
