Multi-label Feature Selection Method via Maximizing Correlation-based Criterion with Mutation Binary Bat Algorithm

Multi-label feature selection is a vital pre-processing step that reduces computational complexity, improves classification performance, and enhances model interpretability by selecting a discriminative subset from the original high-dimensional features. The correlation-based feature selection (CFS) criterion measures the relevance between features and labels, and the redundancy among features; it has previously been combined with hill climbing and genetic algorithms to perform multi-label feature selection. However, finding more effective optimization tools for CFS remains an open problem. In this paper, by adding a mutation operation, we modify the existing binary bat algorithm to build its mutation variant (MBBA), which adjusts the number of "1" components in each candidate solution to a fixed size. We then propose a new multi-label feature selection approach that maximizes the CFS criterion using MBBA, so as to select a fixed number of discriminative features. Our experiments on four data sets show that the proposed method is superior to three state-of-the-art approaches according to four example-based performance evaluation metrics for multi-label classification.
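The distinguishing step described above, constraining each candidate bit vector so that exactly k features are selected, can be sketched as a repair operation on a binary mask. The following is a minimal illustration only; the function name and the random add/drop strategy are assumptions, not the authors' exact mutation operator:

```python
import random

def repair_to_fixed_size(mask, k, rng=random):
    """Adjust a binary feature mask so it contains exactly k ones.

    Hypothetical sketch of the kind of mutation/repair step MBBA adds
    to the binary bat algorithm: surplus selected features are randomly
    deselected, and missing selections are randomly added, so every
    candidate encodes a subset of exactly k features.
    """
    mask = list(mask)
    ones = [i for i, b in enumerate(mask) if b == 1]
    zeros = [i for i, b in enumerate(mask) if b == 0]
    if len(ones) > k:
        # Too many features selected: randomly drop the surplus.
        for i in rng.sample(ones, len(ones) - k):
            mask[i] = 0
    elif len(ones) < k:
        # Too few features selected: randomly add the shortfall.
        for i in rng.sample(zeros, k - len(ones)):
            mask[i] = 1
    return mask

selected = repair_to_fixed_size([1, 1, 0, 1, 1, 0, 0, 1], k=3)
print(sum(selected))  # exactly 3 features remain selected
```

Applying such a repair after each position update keeps the search restricted to fixed-size feature subsets, so the CFS objective is always compared across subsets of equal cardinality.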
