Missing Data Estimation Using Cuckoo Search Algorithm

This chapter brings together two related areas, deep learning and swarm intelligence, for missing data estimation in high-dimensional datasets. The growing number of studies in deep learning warrants a closer look at its possible application in this domain. Missing data are unavoidable in present-day datasets and pose challenges that are nontrivial for existing techniques, which rely on narrow artificial intelligence architectures and computational intelligence methods. The difficulty can be attributed to the large number of samples and the high number of features. In this chapter, we propose a new imputation framework that combines a deep learning method with a swarm intelligence algorithm, called deep learning-cuckoo search (DL-CS). The technique is compared with similar approaches and other existing methods. DL-CS takes longer than existing methods to obtain accurate estimates of the missing entries, but this is considered a worthwhile trade-off given the accuracy of those estimates in a high-dimensional setting.
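The general idea of pairing a learned reconstruction model with cuckoo search can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: the missing entries are chosen to minimise the squared reconstruction error of a trained model, a two-component PCA projection stands in for the deep network, and every name and parameter value (`cuckoo_search`, `n_nests`, `pa`, `alpha`, the toy data) is hypothetical. It is not the DL-CS implementation itself.

```python
import math
import numpy as np


def levy_steps(rng, shape, beta=1.5):
    """Draw heavy-tailed step lengths using Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    u = rng.normal(0.0, sigma, shape)
    v = rng.normal(0.0, 1.0, shape)
    return u / np.abs(v) ** (1 / beta)


def cuckoo_search(objective, dim, lower, upper, n_nests=20, n_iter=300,
                  pa=0.25, alpha=0.01, seed=0):
    """Minimise `objective` over [lower, upper]^dim; returns (best, history)."""
    rng = np.random.default_rng(seed)
    nests = rng.uniform(lower, upper, (n_nests, dim))
    fitness = np.array([objective(n) for n in nests])
    history = [fitness.min()]

    def keep_improvements(candidates):
        # Greedy selection: a candidate replaces its nest only if it is better.
        candidates = np.clip(candidates, lower, upper)
        cand_fit = np.array([objective(c) for c in candidates])
        better = cand_fit < fitness
        nests[better] = candidates[better]
        fitness[better] = cand_fit[better]

    for _ in range(n_iter):
        best = nests[np.argmin(fitness)].copy()
        # Global exploration: Levy flights scaled by distance to the best nest.
        keep_improvements(nests + alpha * levy_steps(rng, nests.shape) * (nests - best))
        # Local exploration: roughly a fraction pa of components is rebuilt
        # from the difference of two randomly chosen nests.
        abandon = rng.random(nests.shape) < pa
        walk = rng.random() * (nests[rng.permutation(n_nests)]
                               - nests[rng.permutation(n_nests)])
        keep_improvements(nests + abandon * walk)
        history.append(fitness.min())

    return nests[np.argmin(fitness)].copy(), history


# Toy data (assumption): 6 features lying in a 2-D subspace, scaled into (0, 1].
data_rng = np.random.default_rng(1)
data = data_rng.random((200, 2)) @ data_rng.random((2, 6))
data /= data.max(axis=0)
train, record = data[1:], data[0].copy()

# Stand-in for the trained deep network: projection onto 2 principal components.
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
components = vt[:2]


def reconstruct(x):
    return mean + (x - mean) @ components.T @ components


missing = np.array([1, 4])  # indices treated as unobserved in `record`


def reconstruction_error(candidate):
    # Fill the missing slots with the candidate and score the whole record.
    filled = record.copy()
    filled[missing] = candidate
    return float(np.sum((filled - reconstruct(filled)) ** 2))


estimate, history = cuckoo_search(reconstruction_error, dim=missing.size,
                                  lower=0.0, upper=1.0)
imputed = record.copy()
imputed[missing] = estimate
print("true values:     ", record[missing])
print("estimated values:", estimate)
```

Because candidates are accepted greedily, the best reconstruction error recorded in `history` never increases from one iteration to the next. Each iteration evaluates the model twice per nest, which is consistent with the abstract's remark that accurate estimates come at a higher computational cost.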
