Evaluation of missing value imputation methods for wireless soil datasets

Abstract

Soil data are essential for hydrologists who model and predict the evolution of water–soil environments. At present, soil data are often collected by unattended wireless sensing systems, whose unreliability inevitably produces continuous runs of missing values; this differs from manually collected datasets, in which data losses are sparsely distributed. This paper investigates seven typical methods used to infill missing soil data and, in particular, also attempts to apply the extreme learning machine to missing-data infilling. The work aims to answer the following question: are existing methods suitable for wireless sensor soil datasets with continuous missing values, and how well do they perform? Using a real-world soil dataset with complete samples as the benchmark, we evaluate and compare these methods and analyze the possible reasons behind their performance. The study provides insights for designing new methods that can effectively handle missing values in wireless sensor soil datasets.
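The benchmark idea described in the abstract (remove values from a complete series, infill them, and score the result against the known truth) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the synthetic series, the gap position and length, the two simple infilling methods (mean fill and linear interpolation), and the RMSE score are stand-ins, not the seven methods, the dataset, or the metrics used in the paper.

```python
import numpy as np

def make_gap(series, start, length):
    """Return a copy of `series` with one contiguous run of NaNs (a continuous gap)."""
    gapped = series.astype(float).copy()
    gapped[start:start + length] = np.nan
    return gapped

def fill_mean(gapped):
    """Replace each missing value with the mean of the observed values."""
    filled = gapped.copy()
    filled[np.isnan(filled)] = np.nanmean(gapped)
    return filled

def fill_linear(gapped):
    """Linearly interpolate across missing values from the observed neighbours."""
    idx = np.arange(gapped.size)
    observed = ~np.isnan(gapped)
    return np.interp(idx, idx[observed], gapped[observed])

def rmse_on_gap(truth, filled, gapped):
    """RMSE computed only over the positions that were removed."""
    missing = np.isnan(gapped)
    return float(np.sqrt(np.mean((truth[missing] - filled[missing]) ** 2)))

# Synthetic stand-in for a complete soil-moisture series (hypothetical, not real data).
t = np.arange(200)
truth = 0.30 + 0.05 * np.sin(2 * np.pi * t / 50)

# Hypothetical continuous gap of 30 consecutive samples.
gapped = make_gap(truth, start=80, length=30)

scores = {
    "mean": rmse_on_gap(truth, fill_mean(gapped), gapped),
    "linear": rmse_on_gap(truth, fill_linear(gapped), gapped),
}
print(scores)
```

Scoring only the removed positions keeps the comparison focused on infilling quality; varying `length` is one way to see how a method degrades as the gap grows.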
