A novel bacterial foraging optimization algorithm for feature selection

- ACBFO and ISEDBFO are proposed based on the original bacterial foraging optimization.
- The modified chemotaxis step raises the selection probability of primary features in ACBFO.
- The swarming equation and the elimination-dispersal step are improved in ISEDBFO.
- ACBFO and ISEDBFO improve classification accuracy and convergence speed.
- The proposed algorithms significantly outperform six other metaheuristic algorithms.

Bacterial foraging optimization (BFO) is a swarm intelligence method that performs well on continuous optimization problems through its chemotaxis, swarming, reproduction, and elimination-dispersal steps. However, BFO is rarely applied to the feature selection problem. In this paper, we propose two novel BFO algorithms: the adaptive chemotaxis bacterial foraging optimization algorithm (ACBFO) and the improved swarming and elimination-dispersal bacterial foraging optimization algorithm (ISEDBFO). ACBFO introduces two improvements. First, to handle the discrete search space, the data structure of each bacterium is redefined to establish a mapping between the bacterium and a feature subset. Second, an adaptive method for evaluating the importance of features is designed, so that the primary features in the feature subset are preserved. ISEDBFO builds on ACBFO and adds two further modifications. First, to describe the cell-to-cell attraction-repulsion relationship more accurately, the swarming representation is improved by introducing the hyperbolic tangent function. Second, to retain the primary features of eliminated bacteria, a roulette-wheel technique is applied in the elimination-dispersal phase. In this study, ACBFO and ISEDBFO are tested on 10 public data sets from the UCI repository.
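To make the two core ideas concrete, the sketch below illustrates (a) the binary bacterium-to-feature-subset mapping and (b) a hyperbolic-tangent attraction-repulsion term. The parameter names (`d_attract`, `w_attract`, `h_repel`, `w_repel`) follow the classical BFO swarming function, and the `1 - tanh(x)` decay is an assumed stand-in for the paper's exact ISEDBFO equation, which is not given in the abstract.

```python
import numpy as np

rng = np.random.default_rng(42)
n_features, n_bacteria = 10, 5   # illustrative sizes, not the paper's settings

# Each bacterium is a binary vector: a 1 in position j means feature j
# belongs to the candidate subset (the bacterium-to-subset mapping).
population = rng.integers(0, 2, size=(n_bacteria, n_features)).astype(float)

def feature_subset(bacterium):
    """Decode a bacterium into the indices of its selected features."""
    return np.flatnonzero(bacterium)

def swarming_term(theta, population, d_attract=0.1, w_attract=0.2,
                  h_repel=0.1, w_repel=10.0):
    """Illustrative cell-to-cell attraction-repulsion value for one
    bacterium, with 1 - tanh(x) standing in for the classical exp(-x)
    decay; the paper's exact ISEDBFO formulation may differ."""
    dist_sq = np.sum((population - theta) ** 2, axis=1)
    attract = -d_attract * (1.0 - np.tanh(w_attract * dist_sq))
    repel = h_repel * (1.0 - np.tanh(w_repel * dist_sq))
    return float(np.sum(attract + repel))

print(feature_subset(population[0]))
print(swarming_term(population[0], population))
```

With `d_attract == h_repel`, a bacterium's contribution to its own swarming value is zero, and nearby bacteria attract while very close ones repel, mirroring the qualitative behavior of the original exponential form.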
The performance of the proposed methods is compared with approaches based on particle swarm optimization, genetic algorithms, simulated annealing, ant lion optimization, the binary bat algorithm, and cuckoo search. The experimental results demonstrate that the average classification accuracy of the proposed algorithms is nearly 3 percentage points higher than that of the other tested methods. Furthermore, the improved algorithms reduce the length of the feature subset by almost 3 features compared with the other methods. In addition, the modified methods achieve excellent performance on the Wilcoxon signed-rank test and the sensitivity-specificity test. In conclusion, the novel BFO algorithms can provide important support for expert and intelligent systems.
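The sensitivity-specificity evaluation mentioned above reduces to the standard true-positive-rate and true-negative-rate computation for binary labels. A minimal sketch (illustrative only, not the paper's evaluation code):

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity (true positive rate) and specificity (true negative
    rate) for binary class labels coded as 1 (positive) / 0 (negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec

print(sensitivity_specificity([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
# → (0.6666666666666666, 0.5)
```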
