Recursive Principal Component Analysis-Based Data Outlier Detection and Sensor Data Aggregation in IoT Systems

The Internet of Things (IoT) is emerging as the underlying technology of our connected society, enabling many advanced applications. In IoT-enabled applications, information about the application's surroundings is gathered by networked sensors, especially wireless sensors because they can be deployed without fixed infrastructure. However, the pervasive deployment of wireless sensor nodes generates a massive amount of sensor data, and data outliers occur frequently due to the dynamic nature of wireless channels. Because the operation of IoT systems relies on sensor data, data redundancy and data outliers can significantly reduce the effectiveness of IoT applications or even mislead systems into unsafe conditions. In this paper, a cluster-based data analysis framework is proposed using recursive principal component analysis (R-PCA), which aggregates the redundant data and detects the outliers at the same time. More specifically, at a cluster head, spatially correlated sensor data collected from cluster members are aggregated by extracting the principal components (PCs), and potential data outliers are identified by an abnormal squared prediction error (SPE) score, defined as the squared residual that remains after the PCs are extracted. With R-PCA, the parameters of the PCA model can be recursively updated to adapt to changes in the IoT system. The cluster-based data analysis framework also relieves the sensor nodes of computational and processing burdens. Simulations based on practical databases confirm that the proposed framework efficiently aggregates the correlated sensor data with high recovery accuracy. The proposed method also improves data outlier detection accuracy compared with other existing algorithms.
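As an illustration of the cluster-head processing described above, the following is a minimal sketch rather than the paper's algorithm. It assumes synthetic readings from eight hypothetical sensors, a single retained PC, an empirical 99th-percentile SPE threshold, and a simple forgetting-factor update of the mean and covariance; none of these choices is taken from the paper, and the function names are illustrative only.

```python
import numpy as np

def fit_pca(X, k):
    """Estimate mean, covariance, and the top-k principal directions of X (rows = samples)."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)      # eigenvalues come back in ascending order
    P = eigvecs[:, ::-1][:, :k]           # keep the k directions with largest variance
    return mean, cov, P

def spe_score(x, mean, P):
    """Return the SPE of one sample and its PC scores (the aggregated data)."""
    centered = x - mean
    scores = P.T @ centered               # k values sent instead of all readings
    residual = centered - P @ scores      # part of the sample the PCs cannot explain
    return float(residual @ residual), scores

def recursive_update(x, mean, cov, k, forgetting=0.99):
    """Assumed update rule: exponentially weighted mean/covariance, then re-decompose."""
    new_mean = forgetting * mean + (1.0 - forgetting) * x
    d = x - new_mean
    new_cov = forgetting * cov + (1.0 - forgetting) * np.outer(d, d)
    _, eigvecs = np.linalg.eigh(new_cov)
    return new_mean, new_cov, eigvecs[:, ::-1][:, :k]

# Synthetic cluster: 8 sensors observe one shared signal plus small independent noise.
rng = np.random.default_rng(0)
signal = rng.normal(20.0, 2.0, size=(500, 1))
X_train = signal + rng.normal(0.0, 0.1, size=(500, 8))

k = 1
mean, cov, P = fit_pca(X_train, k)

# Empirical threshold (an assumption): 99th percentile of the training SPE values.
train_spe = np.array([spe_score(x, mean, P)[0] for x in X_train])
threshold = float(np.quantile(train_spe, 0.99))

x_ok = 21.0 + rng.normal(0.0, 0.1, size=8)
x_bad = x_ok.copy()
x_bad[3] += 5.0                           # inject a spike on one sensor

spe_ok, scores_ok = spe_score(x_ok, mean, P)
spe_bad, _ = spe_score(x_bad, mean, P)
is_outlier = spe_bad > threshold

# Only samples judged normal are used to adapt the model.
mean, cov, P = recursive_update(x_ok, mean, cov, k)
```

In this sketch the cluster head would forward `scores_ok` (one value) in place of the eight raw readings, and the spiked sample is flagged because its residual is far larger than anything seen in training. Re-running a full eigendecomposition on every update is the simplest choice for illustration; a genuinely recursive scheme would update the eigenvectors incrementally.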
