A multi-objective algorithm for multi-label filter feature selection problem

Feature selection is an important data preprocessing step before classification. Multi-objective optimization algorithms have proven effective for feature selection problems, but few studies address multi-objective feature selection for multi-label data. In this paper, we propose a multi-objective multi-label filter feature selection algorithm based on two particle swarms (MOMFS). We use mutual information to measure both the relevance between features and label sets and the redundancy among features, and we take these two measures as the two objectives. To prevent particle swarm optimization (PSO) from becoming trapped in a local optimum and returning a false Pareto front, we employ two swarms that optimize the two objectives separately, and we propose an improved hybrid topology based on the particles' fitness values. Furthermore, we introduce an archive maintenance strategy to preserve the distribution of solutions in the archive. To study the effectiveness of the proposed algorithm, we select five multi-label evaluation criteria and perform experiments on seven multi-label data sets. MOMFS is compared with classic single-objective multi-label feature selection algorithms and with multi-objective filter and wrapper feature selection algorithms. The experimental results show that MOMFS effectively reduces the dimensionality of multi-label data and outperforms the compared approaches on the five evaluation criteria.
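To make the two objectives concrete, the following is a minimal sketch of how relevance and redundancy could be scored for a candidate feature subset with empirical mutual information over discrete data. It is not the MOMFS implementation: the abstract does not give the exact formulas, so averaging over feature-label pairs and feature-feature pairs is an assumption, and the names `mutual_information`, `objectives`, and `dominates` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(x, y):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def objectives(features, labels, mask):
    """Return (relevance, redundancy) for the subset selected by a 0/1 mask.

    features and labels are lists of columns (one sequence per feature/label).
    Assumption: relevance is the mean feature-label mutual information and
    redundancy is the mean pairwise feature-feature mutual information.
    """
    chosen = [f for f, keep in zip(features, mask) if keep]
    if not chosen:
        return 0.0, 0.0
    relevance = sum(
        mutual_information(f, l) for f in chosen for l in labels
    ) / (len(chosen) * len(labels))
    pairs = list(combinations(chosen, 2))
    redundancy = (
        sum(mutual_information(a, b) for a, b in pairs) / len(pairs)
        if pairs
        else 0.0
    )
    return relevance, redundancy


def dominates(a, b):
    """True if a Pareto-dominates b (maximize relevance, minimize redundancy)."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])


# Toy data: f2 duplicates f1, f3 is independent of the single label.
features = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
labels = [[0, 0, 1, 1]]
with_duplicate = objectives(features, labels, [1, 1, 0])
single_feature = objectives(features, labels, [1, 0, 0])
```

In this toy case the subset that keeps only the first feature is as relevant as the subset that also keeps its duplicate, but carries no redundancy, so it dominates. A multi-objective search such as the one the abstract describes would keep the non-dominated subsets in its archive.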
