Novel Initialisation and Updating Mechanisms in PSO for Feature Selection in Classification

In classification, feature selection is an important but difficult problem. Particle swarm optimisation (PSO) is an efficient evolutionary computation technique. However, the traditional personal best and global best updating mechanism in PSO limits its performance for feature selection, and the potential of PSO for feature selection has not been fully investigated. This paper proposes a new initialisation strategy and a new personal best and global best updating mechanism in PSO to develop a novel feature selection algorithm with the goals of minimising the number of features, maximising the classification performance and simultaneously reducing the computational time. The proposed algorithm is compared with two traditional feature selection methods, a PSO-based method whose only goal is to maximise the classification performance, and a PSO-based two-stage algorithm that considers both the number of features and the classification performance. Experiments on eight benchmark datasets show that the proposed algorithm can automatically evolve a feature subset with fewer features and higher classification performance than using all features. The proposed algorithm achieves significantly better classification performance than the two traditional methods, and it also outperforms the two PSO-based feature selection algorithms in terms of the classification performance, the number of features and the computational cost.
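
To make the wrapper setting concrete, the sketch below shows a generic binary PSO for feature selection. It is not the authors' proposed algorithm: the uniform random initialisation, the standard personal best/global best update rule, the KNN classifier, the 0.9/0.1 fitness weighting between error rate and subset size, and all parameter values are illustrative assumptions only.

```python
# Minimal sketch of a binary PSO wrapper for feature selection.
# NOT the paper's method: initialisation, pbest/gbest updates, classifier,
# fitness weighting and parameters below are assumptions for illustration.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = load_wine(return_X_y=True)
n_particles, n_features, n_iters = 20, X.shape[1], 30
w, c1, c2 = 0.7298, 1.49618, 1.49618            # common PSO coefficients

def fitness(mask):
    """Assumed fitness: weighted sum of classification error and subset size."""
    if not mask.any():
        return 1.0                               # penalise the empty subset
    acc = cross_val_score(KNeighborsClassifier(5), X[:, mask], y, cv=5).mean()
    return 0.9 * (1.0 - acc) + 0.1 * mask.mean()

# Initialisation (uniform random here; the paper proposes its own strategy).
position = rng.random((n_particles, n_features)) < 0.5
velocity = rng.uniform(-1.0, 1.0, (n_particles, n_features))
pbest = position.copy()
pbest_fit = np.array([fitness(p) for p in position])
gbest = pbest[pbest_fit.argmin()].copy()
gbest_fit = pbest_fit.min()

for _ in range(n_iters):
    r1 = rng.random((n_particles, n_features))
    r2 = rng.random((n_particles, n_features))
    velocity = (w * velocity
                + c1 * r1 * (pbest.astype(float) - position)
                + c2 * r2 * (gbest.astype(float) - position))
    # Sigmoid transfer function maps velocity to the probability of a bit being 1.
    position = rng.random((n_particles, n_features)) < 1.0 / (1.0 + np.exp(-velocity))
    fits = np.array([fitness(p) for p in position])
    improved = fits < pbest_fit                  # standard pbest update rule
    pbest[improved] = position[improved]
    pbest_fit[improved] = fits[improved]
    if pbest_fit.min() < gbest_fit:
        gbest = pbest[pbest_fit.argmin()].copy()
        gbest_fit = pbest_fit.min()

print("selected features:", np.flatnonzero(gbest), "fitness:", round(gbest_fit, 4))
```

In this sketch the global best mask is the evolved feature subset; the paper's contribution lies in replacing the random initialisation and the greedy pbest/gbest acceptance rule above with mechanisms that explicitly trade off subset size against classification performance.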
