Treatment of missing data using neural networks and genetic algorithms

This paper introduces a method for approximating missing data in a database using a combination of genetic algorithms and neural networks. The proposed method uses a genetic algorithm to minimise an error function derived from an auto-associative neural network. We investigate how accurately the method approximates missing data as the number of missing values within a single record increases. Multi-layer perceptron (MLP) and radial basis function (RBF) neural networks are employed. Results obtained using the RBF network are found to be better than those from the MLP. Results from a combination of the MLP and RBF networks are found to be better than those obtained using either network individually.
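The core idea, a genetic algorithm searching for the missing entries that minimise the reconstruction error of an auto-associative model, can be illustrated with a minimal sketch. This is not the paper's implementation. As a simplifying assumption, a linear auto-associative map obtained by PCA stands in for the trained MLP or RBF network, and the genetic algorithm is a basic real-valued one with elitism, blend crossover, and Gaussian mutation. All function names, the synthetic data, and the parameter values are hypothetical.

```python
import numpy as np


def fit_linear_autoassociator(X, n_hidden):
    """Fit a linear auto-associative map by PCA.

    Assumption: this stands in for the trained MLP/RBF auto-associative network.
    """
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    V = vt[:n_hidden].T  # (n_features, n_hidden) principal directions

    def reconstruct(x):
        # Project onto the principal subspace and map back to input space.
        return mean + (x - mean) @ V @ V.T

    return reconstruct


def reconstruction_error(reconstruct, record, missing_idx, candidate):
    """Squared error between a completed record and its reconstruction."""
    x = record.copy()
    x[missing_idx] = candidate
    return float(np.sum((x - reconstruct(x)) ** 2))


def ga_impute(reconstruct, record, missing_idx, bounds,
              pop_size=40, generations=60, mutation_scale=0.05, seed=0):
    """Search for missing values with a simple real-valued genetic algorithm."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pop = rng.uniform(lo, hi, size=(pop_size, len(missing_idx)))

    def evaluate(population):
        return np.array([
            reconstruction_error(reconstruct, record, missing_idx, c)
            for c in population
        ])

    for _ in range(generations):
        # Elitism: keep the better half of the population unchanged.
        elite = pop[np.argsort(evaluate(pop))[: pop_size // 2]]
        n_children = pop_size - len(elite)
        # Blend crossover between randomly chosen elite parents.
        pa = elite[rng.integers(0, len(elite), size=n_children)]
        pb = elite[rng.integers(0, len(elite), size=n_children)]
        alpha = rng.uniform(size=(n_children, 1))
        children = alpha * pa + (1.0 - alpha) * pb
        # Gaussian mutation, clipped to the search bounds.
        children += rng.normal(scale=mutation_scale * (hi - lo),
                               size=children.shape)
        pop = np.vstack([elite, np.clip(children, lo, hi)])

    return pop[np.argmin(evaluate(pop))]


# Synthetic, noise-free data lying on a 2-D subspace of a 4-D space.
data_rng = np.random.default_rng(1)
A = np.array([[1.0, 0.0, 1.0, 1.0],
              [0.0, 1.0, 1.0, -1.0]])
X_train = data_rng.normal(size=(200, 2)) @ A
reconstruct = fit_linear_autoassociator(X_train, n_hidden=2)

# Complete record is [1.0, -0.5, 0.5, 1.5]; the third entry is hidden.
record = np.array([1.0, -0.5, np.nan, 1.5])
imputed = ga_impute(reconstruct, record, missing_idx=[2], bounds=(-3.0, 3.0))
print("imputed value:", imputed[0], "true value: 0.5")
```

In this toy setting the error is an exact quadratic in the unknown entry, so the search is easy. With a nonlinear network the error surface can have several local minima, which is the usual motivation for a global search such as a genetic algorithm.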
