Multi-Label Feature Selection Based on Mutual Information

Multi-label learning has attracted considerable research attention in recent years and has been widely applied to practical problems. Like single-label learning, multi-label learning suffers from the curse of dimensionality. Feature selection is a key preprocessing step for mitigating the curse of dimensionality, and the filter approach is a simple and fast feature selection method. A filter defines a statistical criterion that ranks features by how useful they are expected to be for classification; the highest-ranked features are selected and the lowest-ranked ones are discarded. Computing the mutual information between features and labels is one of the most common filter criteria. In this paper, we study the multi-label feature selection problem through multivariate mutual information. First, we establish a general framework for multi-label feature selection based on mutual information and give a unified formula for multi-label feature selection. Second, to reduce algorithmic complexity, we simplify the computation of conditional mutual information and propose two different mutual-information-based multi-label feature selection algorithms. Finally, we compare the classification performance of different algorithms on multi-label datasets; the experimental results show that the proposed algorithms perform well in feature selection.
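To illustrate the filter approach described above, the following is a minimal sketch (not the paper's proposed algorithms) of mutual-information-based feature ranking for multi-label data: each feature is scored by summing its mutual information with every label, and features are ranked by that score. The function names `mutual_information` and `rank_features` are illustrative, and the simple sum over labels is one possible aggregation, not the paper's unified formula.

```python
import numpy as np

def mutual_information(x, y):
    """Mutual information I(x; y) in bits between two discrete vectors."""
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            p_xy = np.mean((x == xv) & (y == yv))  # joint probability
            if p_xy > 0:
                p_x = np.mean(x == xv)             # marginal of x
                p_y = np.mean(y == yv)             # marginal of y
                mi += p_xy * np.log2(p_xy / (p_x * p_y))
    return mi

def rank_features(X, Y):
    """Rank features of X (n_samples, n_features) against a binary
    label matrix Y (n_samples, n_labels) by summed MI with all labels.
    Returns (indices sorted best-first, per-feature scores)."""
    scores = np.array([
        sum(mutual_information(X[:, j], Y[:, k]) for k in range(Y.shape[1]))
        for j in range(X.shape[1])
    ])
    return np.argsort(scores)[::-1], scores
```

For example, a feature that is identical to a label attains the maximum score (the label's entropy), while an uninformative feature scores near zero, so the former is ranked first.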
