A Data Cleaning Method and Its Application for Earthen Site Data Monitored by WSN

This paper presents a data cleaning method for denoising and outlier detection in earthen site monitoring data collected by a wireless sensor network (WSN). The method, named DC_ESVS, exploits the temporal and spatial characteristics of WSN monitoring data. It combines the cubic exponential smoothing algorithm with a voting strategy and applies a decision rule to denoise the data and detect outliers. Experiments on WSN monitoring data from the Xi’an Tang Hanguangmen city wall site demonstrate the detection accuracy of the method. Experiments on another dataset, monitoring data from the Ming Great Wall in Shaanxi, also show good performance.
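
The abstract does not spell out the decision rule, so the following is only a minimal sketch of how cubic exponential smoothing and neighbor voting might be combined; it is not the DC_ESVS implementation. It assumes Brown's single-parameter form of cubic smoothing for the temporal step, and a simple majority vote among neighboring nodes for the spatial step. The function names `smooth_and_mark` and `vote_outliers`, the parameters `alpha` and `threshold`, and the rule itself are all hypothetical. A reading is flagged when it deviates from its own forecast while a majority of its neighbors do not deviate at the same time step.

```python
from typing import Dict, List, Sequence, Tuple


def smooth_and_mark(series: Sequence[float], alpha: float,
                    threshold: float) -> Tuple[List[float], List[bool]]:
    """Return (one-step forecasts, deviation marks) for one sensor's series.

    Uses Brown's cubic (triple) exponential smoothing. Assumption: a reading
    that deviates from its forecast by more than `threshold` is replaced by
    the forecast when updating the smoothing state, so a single spike does
    not distort later forecasts.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if not series:
        return [], []
    s1 = s2 = s3 = series[0]          # initialise all three smoothed values
    forecast = series[0]
    k = alpha / (2.0 * (1.0 - alpha) ** 2)
    forecasts: List[float] = []
    marks: List[bool] = []
    for x in series:
        deviates = abs(x - forecast) > threshold
        forecasts.append(forecast)
        marks.append(deviates)
        value = forecast if deviates else x
        # Three nested exponential smoothings.
        s1 = alpha * value + (1.0 - alpha) * s1
        s2 = alpha * s1 + (1.0 - alpha) * s2
        s3 = alpha * s2 + (1.0 - alpha) * s3
        # Quadratic trend coefficients and the forecast one step ahead.
        a = 3.0 * s1 - 3.0 * s2 + s3
        b = k * ((6.0 - 5.0 * alpha) * s1
                 - 2.0 * (5.0 - 4.0 * alpha) * s2
                 + (4.0 - 3.0 * alpha) * s3)
        c = alpha * k * (s1 - 2.0 * s2 + s3)
        forecast = a + b + c
    return forecasts, marks


def vote_outliers(readings: Dict[str, List[float]],
                  neighbors: Dict[str, List[str]],
                  alpha: float = 0.5,
                  threshold: float = 5.0) -> Dict[str, List[bool]]:
    """Flag readings that deviate temporally without majority neighbor support.

    Assumes all series are time-aligned and of equal length. A node with no
    neighbors is judged on its temporal deviation alone.
    """
    marks = {node: smooth_and_mark(series, alpha, threshold)[1]
             for node, series in readings.items()}
    flags: Dict[str, List[bool]] = {}
    for node, node_marks in marks.items():
        peers = neighbors.get(node, [])
        flags[node] = []
        for t, deviates in enumerate(node_marks):
            agreeing = sum(1 for p in peers if marks[p][t])
            majority_agrees = 2 * agreeing > len(peers)
            flags[node].append(deviates and not majority_agrees)
    return flags
```

Under this assumed rule, a spike seen by one node only is treated as an outlier, whereas a simultaneous change across most neighboring nodes is treated as a genuine environmental event and left unflagged. Replacing deviating readings with the forecast keeps the sketch short, but it would prevent the smoother from adapting to a sustained real change, so a fuller implementation would need a different update policy.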
