Drawing dominant dataset from big sensory data in wireless sensor networks

The volume of sensory data is growing explosively with the increasing popularity of Wireless Sensor Networks (WSNs). In many applications the sensory data already exceeds several petabytes annually, which is beyond the computation and transmission capabilities of conventional WSNs. On the other hand, the information carried by big sensory data is highly redundant because of the strong correlation among sensory data. In this paper, we define the concept of the ε-dominant dataset: a small dataset that can represent the vast information carried by big sensory data with an information loss rate less than ε, where ε can be arbitrarily small. We prove that drawing the minimum ε-dominant dataset is polynomial-time solvable and provide a centralized algorithm with O(n^3) time complexity. Furthermore, a distributed algorithm with constant complexity (O(1)) is also designed. It is shown that the result returned by the distributed algorithm satisfies the ε requirement with a near-optimal size. Finally, extensive experiments on real data and simulations are carried out. The results indicate that all the proposed algorithms achieve high performance in terms of accuracy and energy efficiency.
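
To make the idea of a small representative subset concrete, the following is a minimal sketch and not the algorithm of the paper. The abstract does not give the formal definition of the information loss rate, so the sketch assumes a hypothetical one: each sensor's time series is replaced by its closest selected series, and the loss rate is the squared replacement error divided by the total signal energy. The greedy selection, the function names `loss_rate` and `greedy_dominant_subset`, and the sample readings are all illustrative assumptions; a greedy heuristic of this kind is not guaranteed to find a minimum subset.

```python
def loss_rate(data, chosen):
    """Hypothetical loss measure (an assumption, not the paper's definition):
    squared error from replacing each series by its closest chosen series,
    divided by the total signal energy."""
    total = sum(x * x for series in data for x in series)
    if total == 0:
        return 0.0
    if not chosen:
        return 1.0  # nothing kept, so all information is lost
    lost = 0.0
    for series in data:
        lost += min(
            sum((a - b) ** 2 for a, b in zip(series, data[j]))
            for j in chosen
        )
    return lost / total


def greedy_dominant_subset(data, eps):
    """Greedy heuristic: keep adding the series that lowers the loss rate
    the most until the loss rate drops below eps."""
    chosen = []
    remaining = set(range(len(data)))
    while loss_rate(data, chosen) >= eps and remaining:
        best = min(sorted(remaining),
                   key=lambda j: loss_rate(data, chosen + [j]))
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Made-up readings: two pairs of strongly correlated sensors.
readings = [
    [20.0, 21.0, 22.0],
    [20.1, 21.1, 22.1],
    [5.0, 5.0, 5.0],
    [5.1, 4.9, 5.0],
]
chosen = greedy_dominant_subset(readings, eps=0.01)
print(chosen, loss_rate(readings, chosen))
```

With these made-up readings the sketch keeps one series from each correlated pair, which illustrates how strong correlation lets a small subset stand in for the full data under a given ε.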
