Feature Selection for Optimized High-Dimensional Biomedical Data Using an Improved Shuffled Frog Leaping Algorithm

High-dimensional biomedical datasets contain thousands of features that can be used in the molecular diagnosis of disease. However, such datasets also contain many irrelevant or weakly correlated features, which can reduce the predictive accuracy of diagnosis. Without a feature selection algorithm, it is difficult for existing classification techniques to accurately identify patterns in the features. The purpose of feature selection is not only to identify a feature subset from the original set of features without reducing the predictive accuracy of the classification algorithm, but also to reduce the computational overhead of data mining. In this paper, we present an improved shuffled frog leaping algorithm that introduces a chaos memory weight factor, an absolute balance group strategy, and an adaptive transfer factor. The proposed approach explores the space of possible feature subsets to find the subset that maximizes predictive accuracy and minimizes the number of irrelevant features in high-dimensional biomedical data. To evaluate its effectiveness, we use the K-nearest neighbor classifier in a comparative analysis against genetic algorithms, particle swarm optimization, and the standard shuffled frog leaping algorithm. Experimental results show that the improved algorithm improves both the identification of relevant feature subsets and classification accuracy.
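To make the wrapper-style evaluation concrete, the sketch below scores a candidate feature subset, encoded as a binary mask, by the leave-one-out accuracy of a K-nearest neighbor classifier restricted to the selected features. It is only an illustration under stated assumptions: the function names, the `penalty` weight on subset size, the leave-one-out protocol, and the toy data are hypothetical and are not taken from the paper. The search procedure itself (the improved shuffled frog leaping algorithm) is not shown.

```python
import math
from collections import Counter


def knn_loo_accuracy(X, y, mask, k=1):
    """Leave-one-out k-NN accuracy using only features where mask[j] is 1."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        # Euclidean distance from sample i to every other sample
        dists = []
        for m in range(len(X)):
            if m == i:
                continue
            d = math.sqrt(sum((X[i][j] - X[m][j]) ** 2 for j in cols))
            dists.append((d, y[m]))
        dists.sort(key=lambda t: t[0])
        # Majority vote among the k nearest neighbors
        votes = Counter(label for _, label in dists[:k])
        if votes.most_common(1)[0][0] == y[i]:
            correct += 1
    return correct / len(X)


def fitness(X, y, mask, k=1, penalty=0.01):
    """Hypothetical fitness: accuracy minus a small cost per selected feature."""
    acc = knn_loo_accuracy(X, y, mask, k)
    return acc - penalty * sum(mask) / len(mask)


# Toy data (assumed): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, -3.0], [0.2, 4.0],
     [1.0, -4.0], [1.1, 5.1], [1.2, -3.1]]
y = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    for mask in ([1, 0], [0, 1], [1, 1]):
        print(mask, round(fitness(X, y, mask), 3))
```

A population-based search would call a function like `fitness` on each candidate mask and move candidates toward higher-scoring masks. The size penalty is one simple way to express the stated goal of minimizing irrelevant features alongside maximizing accuracy.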
