Outlier aware data aggregation in distributed wireless sensor network using robust principal component analysis

This paper proposes a technique based on robust principal component analysis (PCA) for detecting anomalous or faulty sensor data in a distributed wireless sensor network, with a focus on data integrity and accuracy. The technique has three key features: it exploits the correlation among sensor data to expose anomalies that span several neighboring sensors, it does not require error-free data to construct the PCA model, and it operates in a distributed fashion. The algorithm has two steps. First, an accurate estimate of the correlation among sensor data is computed and used to build a robust PCA model for fault detection. This locally developed, correlation-based model gives more weight to observations that are close to one another than to distant ones, and it imposes no constraints on model design. Second, the Mahalanobis distance, a multivariate distance metric, is used to measure how far the current sensor readings lie from the sensor data model. Combined with PCA, the Mahalanobis distance is extended to test whether a sensor node is an outlier with respect to the model defined by the principal components. We evaluated the algorithm in simulation with synthetic and real sensor data streams. The results show that our approach outperforms existing methods in accuracy, even when processing corrupted data.
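The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the pairwise weighting with parameter `beta`, the coordinate-wise median as center, the synthetic three-sensor data, and the threshold value are all assumptions made for illustration. The scatter matrix is built from pairwise differences so that close observations contribute more than distant ones, and the Mahalanobis distance is then computed in principal-component coordinates.

```python
import numpy as np

def pairwise_weighted_scatter(X, beta=1.0):
    """Scatter matrix from pairwise differences; close pairs get larger weights.

    The exponential weighting and `beta` are illustrative assumptions.
    """
    S_inv = np.linalg.pinv(np.cov(X, rowvar=False))
    i, j = np.triu_indices(X.shape[0], k=1)
    D = X[i] - X[j]                                # all pairwise differences
    d2 = np.einsum("ij,jk,ik->i", D, S_inv, D)     # squared distance of each pair
    w = np.exp(-0.5 * beta * d2)                   # distant pairs are down-weighted
    return (D * w[:, None]).T @ D / w.sum()

def fit_robust_pca(X, n_components=None):
    """Return (center, eigenvalues, eigenvectors), eigenvalues in descending order."""
    center = np.median(X, axis=0)                  # assumed robust location estimate
    eigvals, eigvecs = np.linalg.eigh(pairwise_weighted_scatter(X))
    order = np.argsort(eigvals)[::-1][:n_components]   # None keeps all components
    return center, eigvals[order], eigvecs[:, order]

def mahalanobis_sq(x, model):
    """Squared Mahalanobis distance of x, computed on the retained components."""
    center, eigvals, eigvecs = model
    scores = (x - center) @ eigvecs                # project onto principal axes
    return float(np.sum(scores**2 / eigvals))

# Synthetic data (assumption): three neighboring sensors observe one shared signal.
rng = np.random.default_rng(0)
base = 20.0 + 2.0 * rng.standard_normal(200)
X = base[:, None] + 0.3 * rng.standard_normal((200, 3))
X[:6, 1] += 40.0                                   # six corrupted training rows

model = fit_robust_pca(X)
THRESHOLD = 11.34   # illustrative cut-off (about the 0.99 chi-square quantile, 3 dof)

score_ok = mahalanobis_sq(np.array([21.0, 21.1, 20.9]), model)
score_bad = mahalanobis_sq(np.array([20.0, 35.0, 20.0]), model)
print("consistent reading flagged:", score_ok > THRESHOLD)
print("faulty reading flagged:", score_bad > THRESHOLD)
```

In this sketch the model is fitted on data that already contains corrupted rows, mirroring the claim that error-free training data is not required. Because the weighted scatter is only proportional to the covariance, the threshold is a rough guide rather than a calibrated test.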
