Gaussian Transformation Based Representation in Particle Swarm Optimisation for Feature Selection

In classification, feature selection is an important but challenging task that requires a powerful search technique. Particle swarm optimisation (PSO) has recently gained much attention for solving feature selection problems, but the current representation typically forms a high-dimensional search space. A new representation based on feature clusters was recently proposed to reduce the dimensionality and improve performance, but it does not form a smooth fitness landscape, which may limit the performance of PSO. This paper proposes a new Gaussian-based transformation rule for interpreting a particle as a feature subset, which is combined with the feature cluster based representation to develop a new PSO-based feature selection algorithm. The proposed algorithm is compared with two recent PSO-based algorithms: the first uses a Gaussian-based updating mechanism with the conventional representation, and the second uses the feature cluster representation without a Gaussian distribution. Experiments on commonly used datasets of varying difficulty show that the proposed algorithm outperforms the other two algorithms in terms of both classification performance and the number of features selected, on both the training sets and the test sets. Further analyses show that the Gaussian transformation rule improves stability: the algorithm selects similar features across independent runs and almost always selects the most important features.
