Alignment-Based Feature Selection for Multi-label Learning

Multi-label learning deals with data sets in which each example is associated with a set of labels, and the goal is to construct a model that predicts the label set of unseen examples. Like single-label data, multi-label data often have high-dimensional feature spaces that may contain redundant features, which degrade the performance of learning algorithms; feature selection is therefore just as necessary in multi-label learning. Meanwhile, information shared among labels plays an important role in multi-label learning, so measuring it is important for improving the performance of learning algorithms. In this paper, we introduce kernel alignment into multi-label learning to measure the consistency between the feature space and the label space, and use this measure to rank and select features. First, we define an ideal kernel in the label space as a convex combination of the ideal kernels induced by the individual labels, and a combined kernel in the feature space as a linear combination of kernels, one per feature. Second, by maximizing the kernel alignment between the combined kernel and the ideal kernel, the weights in both kernels are learned simultaneously; the learned label weights can be interpreted as degrees of labeling importance, a form of information among labels. Finally, features are ranked according to their weights in the combined kernel, and a feature subset consisting of the top-ranked features is selected. The result is a novel feature selection method for multi-label learning that learns and exploits the importance degrees of labels automatically; its effectiveness is demonstrated by experimental comparisons.
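To make the core quantities concrete, the following is a minimal sketch of kernel-alignment-based feature scoring. It assumes linear single-feature kernels, a uniform convex combination for the label weights, and centered alignment; the paper's actual method jointly optimizes both weight vectors by maximizing alignment, which is not reproduced here. All function names are illustrative, not from the paper.

```python
import numpy as np

def alignment(K1, K2):
    """Kernel alignment A(K1, K2) = <K1, K2>_F / (||K1||_F ||K2||_F)."""
    num = np.sum(K1 * K2)
    den = np.sqrt(np.sum(K1 * K1) * np.sum(K2 * K2))
    return num / den

def center(K):
    """Center a kernel matrix: H K H with H = I - (1/n) 11^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def ideal_label_kernel(Y, beta):
    """Convex combination of per-label ideal kernels y_l y_l^T."""
    return sum(b * np.outer(Y[:, l], Y[:, l]) for l, b in enumerate(beta))

def rank_features(X, Y, beta=None):
    """Score each feature by the alignment of its single-feature linear
    kernel with the ideal label kernel; return indices ranked best-first.
    (Simplification: the paper learns feature and label weights jointly.)"""
    q = Y.shape[1]
    if beta is None:
        beta = np.full(q, 1.0 / q)  # uniform label weights as a placeholder
    Ky = center(ideal_label_kernel(Y, beta))
    scores = np.array([
        alignment(center(np.outer(X[:, j], X[:, j])), Ky)  # linear kernel on feature j
        for j in range(X.shape[1])
    ])
    return np.argsort(scores)[::-1], scores
```

A feature subset is then obtained by keeping the top-k indices of the returned ranking; by Cauchy-Schwarz, each alignment score lies in [-1, 1], with larger values indicating stronger consistency between that feature's kernel and the label-space ideal kernel.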
