Efficient feature selection using one-pass generalized classifier neural network and binary bat algorithm with a novel fitness function

In high-dimensional data, many features are either irrelevant to the machine learning task or redundant. This leads to two problems: overfitting and high computational overhead. This paper proposes a wrapper-based feature selection method that identifies the relevant subset of features for the machine learning task. The wrapper uses the Binary Bat Algorithm to select candidate feature subsets and the One-pass Generalized Classifier Neural Network (OGCNN) to evaluate each subset with a novel fitness function. The proposed fitness function accounts for the entropy of sensitivity and specificity, along with the classifier's accuracy and the fraction of features selected. The fitness function is compared using four classifiers (Radial Basis Function Neural Network, Probabilistic Neural Network, Extreme Learning Machine, and OGCNN) on six publicly available datasets. One-pass classifiers are chosen because they are computationally fast to train. The results suggest that OGCNN combined with the novel fitness function performs well in the majority of cases.
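To make the ingredients of such a fitness function concrete, the following is a minimal sketch and not the paper's formula, which the abstract does not state. It assumes a weighted sum of accuracy, an entropy term that rewards balance between sensitivity and specificity, and a reward for selecting fewer features. The function names, the normalization used inside the entropy term, and the weights `w_acc`, `w_ent`, and `w_feat` are all hypothetical choices made for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def balance_entropy(sensitivity, specificity):
    """Base-2 entropy of the normalized (sensitivity, specificity) pair.

    Assumption: this normalization is one plausible reading of "entropy of
    sensitivity and specificity". It equals 1.0 when the two rates are
    equal and 0.0 when one of them is zero.
    """
    total = sensitivity + specificity
    if total == 0:
        return 0.0
    h = 0.0
    for value in (sensitivity, specificity):
        p = value / total
        if p > 0:
            h -= p * math.log2(p)
    return h


def fitness(y_true, y_pred, mask, w_acc=0.6, w_ent=0.3, w_feat=0.1):
    """Hypothetical fitness (higher is better) for a binary feature mask.

    The weights are illustrative assumptions, not values from the paper.
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    selected_fraction = sum(mask) / len(mask)
    return (w_acc * accuracy
            + w_ent * balance_entropy(sensitivity, specificity)
            + w_feat * (1.0 - selected_fraction))


if __name__ == "__main__":
    # Toy example: six samples, a mask that keeps 2 of 5 features.
    y_true = [1, 1, 1, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 1]
    print(fitness(y_true, y_pred, mask=[1, 0, 1, 0, 0]))
```

In a wrapper setting, `mask` would be the bit vector proposed by the search algorithm and `y_pred` the predictions of the classifier trained on the selected columns only. The entropy term penalizes subsets that reach high accuracy by favoring one class, which plain accuracy alone would not detect on imbalanced data.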
