IoT Data Compression: Sensor-Agnostic Approach

Managing bulk sensor data is one of the challenging problems in developing Internet of Things (IoT) applications. The high volume of sensor data calls for an appropriate sensor data compression technique that addresses energy-efficient transmission, storage optimization on tiny sensor devices, and cost-effective sensor analytics. Conventional lossy compression methods cannot deliver a significant gain when processing high-volume sensor data, because they are less likely to exploit the intrinsic contextual characteristics of that data. In this paper, we propose SensCompr, a dynamic lossy compression method specific to sensor datasets that is easily realizable with standard compression methods. SensCompr leverages robust statistical and information-theoretic techniques and does not require specific physical modeling. It is an information-centric approach that exhaustively analyzes the inherent properties of sensor data to extract the useful information content embedded in it, and adapts the parameters of the compression scheme accordingly to maximize compression gain while optimizing information loss. SensCompr has been successfully applied to compress large sets of heterogeneous real sensor datasets, such as ECG, EEG, smart meter, and accelerometer data. To the best of our knowledge, this is the first dynamic compression technique centered on 'sensor information content' to be proposed and implemented specifically for IoT applications, and the method is independent of sensor data type.
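The abstract does not give SensCompr's actual algorithm, so the following is only a minimal illustrative sketch of the general idea it describes: measure a signal's spread with a robust statistic, estimate its information content, adapt a lossy parameter to those measurements, and hand the result to a standard compressor. Every name here (`mad`, `normalized_entropy`, `choose_step`, `compress`, `decompress`), the step-selection rule, and the `coarse`/`fine` constants are assumptions made for illustration, not the authors' method. The input is assumed to be a non-empty list of floats.

```python
import math
import struct
import zlib
from collections import Counter
from statistics import median


def mad(samples):
    """Median absolute deviation: a robust estimate of spread."""
    m = median(samples)
    return median([abs(x - m) for x in samples])


def normalized_entropy(samples, bins=32):
    """Shannon entropy of a fixed-width histogram, scaled to [0, 1]."""
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return 0.0
    width = (hi - lo) / bins
    counts = Counter(min(int((x - lo) / width), bins - 1) for x in samples)
    n = len(samples)
    h = -sum(c / n * math.log2(c / n) for c in counts.values())
    return h / math.log2(bins)


def choose_step(samples, coarse=0.5, fine=0.05):
    """Hypothetical rule: higher information content -> finer quantization."""
    spread = mad(samples) or 1.0  # fall back when the MAD is zero
    h = normalized_entropy(samples)
    return spread * (coarse - (coarse - fine) * h)


def compress(samples):
    """Quantize with the adapted step, delta-encode, then apply zlib."""
    step = choose_step(samples)
    q = [round(x / step) for x in samples]
    deltas = [q[0]] + [b - a for a, b in zip(q, q[1:])]
    payload = struct.pack(f"<d{len(deltas)}q", step, *deltas)
    return zlib.compress(payload, 9)


def decompress(blob):
    """Invert compress(); each value is recovered to within about step / 2."""
    payload = zlib.decompress(blob)
    n = (len(payload) - 8) // 8
    step, *deltas = struct.unpack(f"<d{n}q", payload)
    out, acc = [], 0
    for d in deltas:
        acc += d
        out.append(acc * step)
    return out
```

In this sketch the only lossy stage is the quantization, so the reconstruction error is bounded by roughly half the chosen step, and the step is stored in the payload so that the decoder needs no side information. A real sensor-agnostic scheme would presumably choose its parameters with far more care than this single linear rule.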
