Missing value imputation on missing completely at random data using multilayer perceptrons

Data mining relies on data files that often contain missing values. This paper presents a methodological framework for developing an automated data imputation model based on artificial neural networks. Fifteen real and simulated data sets, ranging in size from 47 to 1389 records, are each subjected to a perturbation experiment in which missing values are generated at random, with the probability of a missing value set to 0.05. Several architectures and learning algorithms for the multilayer perceptron are tested and compared with three classic imputation procedures: mean/mode imputation, regression and hot-deck. Across the different performance measures considered, the results suggest that this approach improves the quality of a database with missing values, and that the best results are clearly obtained with the multilayer perceptron model in data sets with categorical variables. Three learning rules (Levenberg-Marquardt, BFGS Quasi-Newton and Conjugate Gradient Fletcher-Reeves Update) and a small number of hidden nodes are recommended.
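The perturbation experiment can be illustrated with a minimal sketch, which is not the authors' implementation. It removes values completely at random with probability 0.05, applies the mean/mode baseline, and scores the imputed cells against the known originals. The synthetic data set, the function names (`perturb_mcar`, `mean_mode_impute`, `score`) and the two performance measures (RMSE for numeric cells, hit rate for categorical cells) are assumptions chosen for illustration. The multilayer perceptron itself is not implemented here; a trained network would simply replace `mean_mode_impute` in the same loop.

```python
import random
from statistics import mean, mode


def perturb_mcar(records, p_missing=0.05, seed=0):
    """Copy the records, setting each cell to None with probability p_missing.

    Every cell is masked independently of all data values, which is the
    missing-completely-at-random (MCAR) mechanism.
    """
    rng = random.Random(seed)
    return [[None if rng.random() < p_missing else v for v in row]
            for row in records]


def mean_mode_impute(records, categorical):
    """Fill None cells with the column mean (numeric) or mode (categorical).

    Assumes every column keeps at least one observed value.
    """
    n_cols = len(records[0])
    fill = []
    for j in range(n_cols):
        observed = [row[j] for row in records if row[j] is not None]
        fill.append(mode(observed) if j in categorical else mean(observed))
    return [[fill[j] if v is None else v for j, v in enumerate(row)]
            for row in records]


def score(original, perturbed, imputed, categorical):
    """Return (RMSE on masked numeric cells, hit rate on masked categorical cells)."""
    sq_err, n_num, hits, n_cat = 0.0, 0, 0, 0
    for row_o, row_p, row_i in zip(original, perturbed, imputed):
        for j, (o, p, i) in enumerate(zip(row_o, row_p, row_i)):
            if p is not None:
                continue  # only cells that were masked are scored
            if j in categorical:
                n_cat += 1
                hits += (o == i)
            else:
                n_num += 1
                sq_err += (o - i) ** 2
    rmse = (sq_err / n_num) ** 0.5 if n_num else None
    hit_rate = hits / n_cat if n_cat else None
    return rmse, hit_rate


# Hypothetical data set: 500 records, two numeric columns, one categorical.
rng = random.Random(42)
data = [[rng.gauss(50, 10), rng.gauss(0, 1), rng.choice(["a", "a", "b"])]
        for _ in range(500)]
categorical_cols = {2}

perturbed = perturb_mcar(data, p_missing=0.05, seed=1)
imputed = mean_mode_impute(perturbed, categorical_cols)
rmse, hit_rate = score(data, perturbed, imputed, categorical_cols)
print("RMSE on numeric cells:", rmse)
print("Hit rate on categorical cells:", hit_rate)
```

Because the original values are known before masking, any imputation procedure can be scored on exactly the same masked cells, which is what makes a comparison between methods possible.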
