A hybrid particle swarm optimization for feature subset selection by integrating a novel local search strategy

The proposed method embeds a local search technique in particle swarm optimization (PSO) to select a small, salient feature subset. The goal of the local search technique is to guide the PSO search process toward distinct features by using their correlation information. As a result, the proposed method selects a feature subset with reduced redundancy.

A hybrid feature selection method based on particle swarm optimization is proposed. Our method uses a novel local search to enhance the search process near global optima. The method efficiently finds the discriminative features with reduced correlations. The size of the final feature set is determined using a subset size detection scheme. Our method is compared with well-known and state-of-the-art feature selection methods.

Feature selection is widely used in data mining and machine learning tasks to build a model with a small number of features, which improves the classifier's accuracy. In this paper, a novel hybrid feature selection algorithm based on particle swarm optimization is proposed. The proposed method, called HPSO-LS, embeds a local search strategy in particle swarm optimization to select a salient feature subset whose features are less correlated with one another. The goal of the local search technique is to guide the search process of the particle swarm optimization toward distinct features by considering their correlation information. Moreover, the proposed method uses a subset size determination scheme to select a feature subset of reduced size. The performance of the proposed method has been evaluated on 13 benchmark classification problems and compared with five state-of-the-art feature selection methods.
In addition, HPSO-LS has been compared with four well-known filter-based methods, including information gain, term variance, Fisher score, and mRMR, and five well-known wrapper-based methods, including genetic algorithm, particle swarm optimization, simulated annealing, and ant colony optimization. The results demonstrate that the proposed method improves classification accuracy compared with the filter-based and wrapper-based feature selection methods. Furthermore, several statistical tests show that the superiority of the proposed method over the other methods is statistically significant.
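
To make the idea of correlation-guided refinement concrete, the following is a minimal sketch and not the HPSO-LS procedure itself, whose details are not given in this abstract. It assumes a particle is encoded as a boolean mask over features and that redundancy is measured by absolute Pearson correlation. The function names `abs_correlation` and `decorrelate_mask` and the `threshold` parameter are hypothetical, introduced only for illustration.

```python
import numpy as np


def abs_correlation(X):
    """Absolute Pearson correlation between the columns (features) of X.

    Assumes no column is constant; a constant column would yield NaN.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(corr, 0.0)  # ignore each feature's correlation with itself
    return corr


def decorrelate_mask(mask, corr, threshold=0.9):
    """Hypothetical local search step on one particle's feature mask.

    While two selected features are correlated above `threshold`, drop the
    one of the pair that is more correlated, on average, with the rest of
    the selected subset. Returns a new mask; the input is not modified.
    """
    mask = mask.copy()
    while True:
        selected = np.flatnonzero(mask)
        if selected.size < 2:
            return mask
        sub = corr[np.ix_(selected, selected)]
        i, j = np.unravel_index(np.argmax(sub), sub.shape)
        if sub[i, j] <= threshold:
            return mask  # no remaining pair is redundant enough to prune
        drop = i if sub[i].mean() >= sub[j].mean() else j
        mask[selected[drop]] = False
```

In a hybrid scheme of this kind, such a step would typically be applied to each particle's position after the usual PSO update and before fitness evaluation, so that the swarm spends its evaluations on less redundant subsets. The threshold-based pruning rule here is an assumption chosen for brevity.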
