Data Quality and Failures Characterization of Sensing Data in Environmental Applications

Environmental monitoring, which aims to discover and understand environmental laws and changes, is one of the most important application domains for sensor networks. The success of these applications is determined by the quality of the collected data. It is therefore crucial to analyze the collected sensing data carefully, which not only helps us understand the features of the monitored field but also unveils limitations and opportunities that should be considered in future sensor system design. In this paper, we take an initial step and analyze one month of sensing data collected from a real-world water system surveillance application, focusing on data similarity, data abnormality, and failure patterns. Our major findings include: (1) Information similarity, including pattern similarity and numerical similarity, is very common, which provides a good opportunity to trade off energy efficiency against data quality; (2) Spatial and multi-modality correlation analysis provides a way to evaluate data integrity and to detect conflicting data, which usually indicate sensor malfunctions or interesting events; and (3) Harsh external environmental conditions may be the most important factor causing failures in environmental applications. Communication failures, mainly caused by a lack of synchronization, account for the largest portion among all failure types.
