Multi-label Feature Selection Method Based on Multivariate Mutual Information and Particle Swarm Optimization

Multi-label feature selection has become an indispensable pre-processing step in multi-label learning: it removes irrelevant and redundant features, reduces computational burden, improves classification performance, and enhances model interpretability. Mutual information (MI) between two random variables is widely used to describe feature-label relevance and feature-feature redundancy. Multivariate mutual information (MMI), approximated by limiting it to three-degree interactions to speed up its computation, is used to characterize the relevance between the selected feature subset and the label subset. In this paper, we combine MMI-based relevance with MI-based redundancy to define a new max-relevance and min-redundancy feature selection criterion (referred to simply as MMI). To search for a globally optimal solution, we add an auxiliary mutation operation to the existing binary particle swarm optimization with mutation, so that the number of selected features is strictly controlled; this forms a new PSO variant, M2BPSO. Integrating MMI with M2BPSO yields a novel multi-label feature selection method, MMI-PSO. Experiments on four benchmark data sets, evaluated with four instance-based classification metrics, demonstrate the effectiveness of the proposed algorithm in comparison with three state-of-the-art feature selection approaches.
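The two ingredients named above, a max-relevance and min-redundancy score and a constraint on the number of selected features, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the relevance term here uses only pairwise feature-label MI rather than the three-degree MMI approximation, and the function names `mutual_information`, `mrmr_score` and `repair_to_k` are hypothetical. In particular, `repair_to_k` is only one plausible way a mutation step could force a binary particle to select exactly k features.

```python
import numpy as np


def mutual_information(x, y):
    """Empirical MI (in nats) between two discrete 1-D arrays."""
    _, xi = np.unique(np.asarray(x).ravel(), return_inverse=True)
    _, yi = np.unique(np.asarray(y).ravel(), return_inverse=True)
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)          # joint count table
    joint /= joint.sum()                     # joint probabilities
    px = joint.sum(axis=1, keepdims=True)    # marginal of x, shape (a, 1)
    py = joint.sum(axis=0, keepdims=True)    # marginal of y, shape (1, b)
    nz = joint > 0                           # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def mrmr_score(X, Y, subset):
    """Mean feature-label MI minus mean pairwise feature-feature MI.

    Assumption: a pairwise simplification, not the MMI-based relevance
    described in the abstract.
    """
    subset = list(subset)
    relevance = np.mean([mutual_information(X[:, f], Y[:, l])
                         for f in subset for l in range(Y.shape[1])])
    if len(subset) < 2:
        return float(relevance)
    redundancy = np.mean([mutual_information(X[:, f], X[:, g])
                          for i, f in enumerate(subset)
                          for g in subset[i + 1:]])
    return float(relevance - redundancy)


def repair_to_k(bits, k, rng):
    """Flip randomly chosen bits so that exactly k features are selected.

    Assumption: a hypothetical stand-in for the auxiliary mutation,
    requiring 0 <= k <= len(bits).
    """
    bits = np.array(bits, dtype=bool)
    on = np.flatnonzero(bits)
    off = np.flatnonzero(~bits)
    if on.size > k:
        bits[rng.choice(on, size=on.size - k, replace=False)] = False
    elif on.size < k:
        bits[rng.choice(off, size=k - on.size, replace=False)] = True
    return bits
```

In a PSO loop, each particle's binary position would be passed through `repair_to_k` after the position update, and `mrmr_score` evaluated on `np.flatnonzero(position)` would serve as its fitness.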
