Auto Associative Extreme Learning Machine Based Hybrids for Data Imputation

This chapter presents three novel hybrid techniques for data imputation: (1) Auto-Associative Extreme Learning Machine (AAELM) with Principal Component Analysis (PCA) (PCA-AAELM), (2) Gray System Theory (GST) combined with PCA-AAELM (Gray+PCA-AAELM), and (3) AAELM with the Evolving Clustering Method (ECM) (ECM-AAELM). Our prime concern is to remove the randomness that AAELM's random weights introduce, with the help of ECM and PCA. The chapter also proposes local learning by invoking ECM as a preprocessor for AAELM. The proposed methods are tested on several regression, classification, and bank datasets using 10-fold cross-validation. The results, in terms of Mean Absolute Percentage Error (MAPE), are compared with those of K-Means+Multilayer Perceptron (MLP) imputation (Ankaiah & Ravi, 2011), K-Medoids+MLP, K-Means+GRNN, K-Medoids+GRNN (Nishanth & Ravi, 2013), PSO_Covariance imputation (Krishna & Ravi, 2013), and ECM-Imputation (Gautam & Ravi, 2014). The proposed methods achieved better imputation on most of the datasets, as evidenced by the Wilcoxon signed-rank test.
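To make the PCA-AAELM idea concrete, the following is a minimal sketch under stated assumptions, not the chapter's implementation. It assumes that the random input-to-hidden weights of an auto-associative ELM are replaced by the principal directions of the complete records, that the hidden activation is tanh, that output weights are solved by the Moore-Penrose pseudo-inverse, and that missing cells are first filled with training means before being passed through the network. The function names (`fit_pca_aaelm`, `impute`, `mape`) and the toy data are hypothetical.

```python
import numpy as np

def fit_pca_aaelm(X, n_hidden):
    """Fit an auto-associative ELM whose hidden weights come from PCA (assumed design)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions of the complete records.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_hidden].T                 # deterministic weights instead of random ones
    H = np.tanh(Xc @ W)                 # hidden-layer activations
    beta = np.linalg.pinv(H) @ Xc       # least-squares output weights
    return mean, W, beta

def impute(X_missing, model):
    """Fill NaN cells with the network's reconstruction; observed cells are kept."""
    mean, W, beta = model
    X_missing = np.asarray(X_missing, dtype=float)
    mask = np.isnan(X_missing)
    filled = np.where(mask, mean, X_missing)        # assumed initial guess: training mean
    recon = mean + np.tanh((filled - mean) @ W) @ beta
    return np.where(mask, recon, X_missing)

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs(actual - predicted) / np.abs(actual))

# Toy data, purely illustrative.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 4)) + 5.0
model = fit_pca_aaelm(X_train, n_hidden=3)
X_test = np.array([[5.1, np.nan, 4.8, 5.3]])
X_imputed = impute(X_test, model)
```

Because the hidden weights are derived from the data rather than drawn at random, fitting twice on the same records yields the same model, which is the property the abstract attributes to the PCA and ECM hybrids.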

[1]  Amaury Lendasse,et al.  X-SOM and L-SOM: A double classification approach for missing value imputation , 2010, Neurocomputing.

[2]  M. Beynon,et al.  Variable precision rough set theory and data discretisation: an application to corporate failure prediction , 2001 .

[3]  Benito E. Flores,et al.  A pragmatic view of accuracy measurement in forecasting , 1986 .

[4]  M. Marseguerra,et al.  The AutoAssociative Neural Network in signal analysis: II. Application to on-line monitoring of a simulated BWR component , 2005 .

[5]  L. L. Doove,et al.  Recursive partitioning for missing data imputation in the presence of interaction effects , 2014, Comput. Stat. Data Anal..

[6]  Bruno Crémilleux,et al.  MVC - a preprocessing method to deal with missing values , 1999, Knowl. Based Syst..

[7]  Ahmet Arslan,et al.  A hybrid method for imputation of missing values using optimized fuzzy c-means with support vector regression and a genetic algorithm , 2013, Inf. Sci..

[8]  Peter K. Sharpe,et al.  Dealing with missing values in neural network-based diagnostic systems , 1995, Neural Computing & Applications.

[9]  Vadlamani Ravi,et al.  A new online data imputation method based on general regression auto associative neural network , 2014, Neurocomputing.

[10]  Tshilidzi Marwala,et al.  The use of genetic algorithms and neural networks to approximate missing data in database , 2005, IEEE 3rd International Conference on Computational Cybernetics, 2005. ICCC 2005..

[11]  Vadlamani Ravi,et al.  Soft computing based imputation and hybrid data and text mining: The case of predicting the severity of phishing alerts , 2012, Expert Syst. Appl..

[12]  Vadlamani Ravi,et al.  A Computational Intelligence Based Online Data Imputation Method: An Application For Banking , 2013, J. Inf. Process. Syst..

[13]  César Hervás-Martínez,et al.  PCA-ELM: A Robust and Pruned Extreme Learning Machine Approach Based on Principal Component Analysis , 2012, Neural Processing Letters.

[14]  Md Zahidul Islam,et al.  Missing value imputation using decision trees and decision forests by splitting and merging records: Two novel techniques , 2013, Knowl. Based Syst..

[15]  Tariq Samad,et al.  Self–organization with partial data , 1992 .

[16]  F. Wilcoxon Individual Comparisons by Ranking Methods , 1945 .

[17]  Qinbao Song,et al.  A new imputation method for small software project data sets , 2007, J. Syst. Softw..

[18]  Peter C. Austin,et al.  Bayesian modeling of missing data in clinical research , 2005, Comput. Stat. Data Anal..

[19]  Alessandro G. Di Nuovo,et al.  Missing data analysis with fuzzy C-Means: A study of its application in a psychological scenario , 2011, Expert Syst. Appl..

[20]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[21]  Slobodan P. Simonovic,et al.  Estimation of missing streamflow data using principles of chaos theory , 2002 .

[22]  Bogdan Gabrys,et al.  Neuro-fuzzy approach to processing inputs with missing values in pattern recognition problems , 2002, Int. J. Approx. Reason..

[23]  Tshilidzi Marwala,et al.  A dynamic programming approach to missing data estimation using neural networks , 2013, Inf. Sci..

[24]  Bing Yu,et al.  Missing data analyses: a hybrid multiple imputation algorithm using Gray System Theory and entropy based on clustering , 2013, Applied Intelligence.

[25]  Soo-Young Lee,et al.  Training Algorithm with Incomplete Data for Feed-Forward Neural Networks , 1999, Neural Processing Letters.

[26]  Tshilidzi Marwala,et al.  Partial imputation of unseen records to improve classification using a hybrid multi-layered artificial immune system and genetic algorithm , 2013, Appl. Soft Comput..

[27]  Esther-Lydia Silva-Ramírez,et al.  Missing value imputation on missing completely at random data using multilayer perceptrons , 2011, Neural Networks.

[28]  S. Nordbotten Neural network imputation applied to the Norwegian 1990 population census data , 1996 .

[29]  T. V. Geetha,et al.  Indian Logic Ontology based Automatic Query Refinement , 2008 .

[30]  Aníbal R. Figueiras-Vidal,et al.  Classifying patterns with missing values using Multi-Task Learning perceptrons , 2013, Expert Syst. Appl..

[31]  Fengzhan Tian,et al.  A selective Bayes Classifier for classifying incomplete data based on gain ratio , 2008, Knowl. Based Syst..

[32]  Juan Carlos Figueroa García,et al.  Missing data imputation in multivariate data by evolutionary algorithms , 2011, Comput. Hum. Behav..

[33]  Shichao Zhang,et al.  The Journal of Systems and Software , 2012 .

[34]  Teresa B. Ludermir,et al.  Comparison of new activation functions in neural network for forecasting financial time series , 2011, Neural Computing and Applications.

[35]  Vadlamani Ravi,et al.  Counter propagation auto-associative neural network based data imputation , 2015, Inf. Sci..

[36]  Pilsung Kang,et al.  Locally linear reconstruction based missing value imputation for supervised learning , 2013, Neurocomputing.

[37]  Amit Gupta,et al.  Estimating Missing Values Using Neural Networks , 1996 .

[38]  Ignacio Olmeda,et al.  Hybrid Classifiers for Financial Multicriteria Decision Making: The Case of Bankruptcy Prediction , 1997 .

[39]  Serpil Canbas,et al.  Prediction of commercial bank failure via multivariate statistical analysis of financial structures: The Turkish case , 2005, Eur. J. Oper. Res..

[40]  John O. Odiyo,et al.  Filling of missing rainfall data in Luvuvhu River Catchment using artificial neural networks , 2011 .

[41]  Deng Ju-Long,et al.  Control problems of grey systems , 1982 .

[42]  Michel Ballings,et al.  Kernel Factory: An ensemble of kernel machines , 2013, Expert Syst. Appl..

[43]  Vadlamani Ravi,et al.  Evolving clustering based data imputation , 2014, 2014 International Conference on Circuits, Power and Computing Technologies [ICCPCT-2014].

[44]  Shichao Zhang,et al.  Noisy data elimination using mutual k-nearest neighbor for classification mining , 2012, J. Syst. Softw..

[45]  Chee Kheong Siew,et al.  Extreme learning machine: Theory and applications , 2006, Neurocomputing.

[46]  Vadlamani Ravi,et al.  Particle swarm optimization and covariance matrix based data imputation , 2013, 2013 IEEE International Conference on Computational Intelligence and Computing Research.