An archive based particle swarm optimisation for feature selection in classification

Feature selection aims to select a subset of relevant features from a typically large number of original features, which is a difficult task because of the large search space. Particle swarm optimisation (PSO) is a powerful search technique, but the standard PSO has some limitations when used for feature selection. This paper proposes a new PSO based feature selection approach, which introduces an external archive to store promising solutions obtained during the search process. The solutions in the archive serve as potential leaders (i.e. global best, gbest) to guide the swarm towards an optimal feature subset with the lowest classification error rate and a smaller number of features. The proposed approach has two specific methods, PSOArR and PSOArRWS: PSOArR selects gbest from the archive at random, whereas PSOArRWS selects gbest by roulette wheel selection, considering both the classification error rate and the number of selected features. Experiments on twelve benchmark datasets show that both PSOArR and PSOArRWS can successfully select a smaller number of features and achieve similar or better classification performance than using all features. PSOArR and PSOArRWS outperform a PSO based algorithm that does not use an archive, as well as two traditional feature selection methods. The performance of PSOArR and PSOArRWS is similar.
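The two gbest selection strategies described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how PSOArRWS combines the error rate with the number of selected features, so the weighted sum and the `alpha` parameter below are assumptions, as are the names `ArchiveEntry`, `select_random`, `roulette_weights` and `select_roulette`.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ArchiveEntry:
    """One archived solution (hypothetical structure, for illustration only)."""
    mask: Sequence[bool]   # True where a feature is selected
    error_rate: float      # classification error rate in [0, 1]


def select_random(archive: List[ArchiveEntry], rng: random.Random) -> ArchiveEntry:
    """PSOArR-style choice: every archive member is equally likely to be gbest."""
    return rng.choice(archive)


def roulette_weights(archive: List[ArchiveEntry], alpha: float = 0.9) -> List[float]:
    """Assumed weighting: reward low error and few selected features.

    alpha balances the two criteria; the value 0.9 is an arbitrary choice.
    """
    weights = []
    for entry in archive:
        selected_ratio = sum(entry.mask) / len(entry.mask)
        accuracy_term = alpha * (1.0 - entry.error_rate)
        size_term = (1.0 - alpha) * (1.0 - selected_ratio)
        weights.append(accuracy_term + size_term)
    return weights


def select_roulette(archive: List[ArchiveEntry], rng: random.Random,
                    alpha: float = 0.9) -> ArchiveEntry:
    """PSOArRWS-style choice: selection probability is proportional to the weight."""
    weights = roulette_weights(archive, alpha)
    if sum(weights) <= 0.0:
        # Degenerate archive (all weights zero): fall back to a uniform choice.
        return rng.choice(archive)
    return rng.choices(archive, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(42)
    archive = [
        ArchiveEntry(mask=(True, False, False, False), error_rate=0.10),
        ArchiveEntry(mask=(True, True, True, True), error_rate=0.40),
    ]
    print("random gbest:  ", select_random(archive, rng))
    print("roulette gbest:", select_roulette(archive, rng))
```

Under this assumed weighting, an archive member with a lower error rate and fewer features is picked as gbest more often, but weaker members keep a non-zero chance, which preserves some diversity among the leaders.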
