Variable-Length Particle Swarm Optimization for Feature Selection on High-Dimensional Classification

With a global search mechanism, particle swarm optimization (PSO) has shown promise in feature selection (FS). However, most current PSO-based FS methods use a fixed-length representation, which is inflexible and limits the performance of PSO for FS. When applied to high-dimensional data, such methods not only consume a significant amount of memory but also incur a high computational cost. Overcoming this limitation enables PSO to work on data of much higher dimensionality, which has become increasingly common with advances in data collection technologies. In this paper, we propose the first variable-length PSO representation for FS, which allows particles to have different and shorter lengths, defines a smaller search space, and therefore improves the performance of PSO. By rearranging features in descending order of their relevance, we enable particles with shorter lengths to achieve better classification performance. Furthermore, the proposed length-changing mechanism allows PSO to escape local optima, further narrow the search space, and focus its search on smaller and more fruitful areas. Together, these strategies enable PSO to reach better solutions in a shorter time. Results on ten high-dimensional datasets of varying difficulty show that the proposed variable-length PSO achieves much smaller feature subsets with significantly higher classification performance in much shorter time than fixed-length PSO methods. The proposed method also outperforms the compared non-PSO FS methods in most cases.
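To make the core idea concrete, below is a minimal Python sketch of a variable-length particle operating over features ranked by descending relevance. It is an illustrative assumption of how such a representation could look, not the authors' implementation; names such as VariableLengthParticle, rank_features, the relevance scores, and the 0.6 selection threshold are hypothetical.

import random

def rank_features(relevance_scores):
    # Sort feature indices by descending relevance so that shorter
    # particles cover the most relevant features first.
    return sorted(range(len(relevance_scores)),
                  key=lambda i: relevance_scores[i], reverse=True)

class VariableLengthParticle:
    def __init__(self, length):
        # Each particle only spans the top-`length` ranked features,
        # so a shorter particle searches a smaller, more promising subspace.
        self.length = length
        self.position = [random.random() for _ in range(length)]
        self.velocity = [0.0] * length

    def selected_features(self, ranked_features, threshold=0.6):
        # A feature is selected when its position value exceeds the threshold.
        return [ranked_features[d] for d in range(self.length)
                if self.position[d] > threshold]

# Example: a swarm whose particles have different lengths.
scores = [0.9, 0.1, 0.7, 0.3, 0.8]   # hypothetical relevance scores
ranked = rank_features(scores)        # e.g., [0, 4, 2, 3, 1]
swarm = [VariableLengthParticle(length=l) for l in (2, 3, 5)]
for p in swarm:
    print(p.length, p.selected_features(ranked))

A length-changing mechanism, as described in the abstract, could then shrink or grow self.length during the search (truncating or extending position and velocity accordingly), letting the swarm concentrate on the smaller, more fruitful region of the ranked feature space.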
