ReliefF-based Multi-label Feature Selection

In recent years, multi-label learning has been used to handle data associated with multiple labels simultaneously and has been applied in a growing range of applications. Like many other machine learning tasks, multi-label learning suffers from the curse of dimensionality, so selecting good features using the multiple labels of a dataset becomes an important step prior to classification. In this paper, we study the problem of multi-label feature selection for classification and propose ML-ReliefF, a method based on the single-label feature selection algorithm ReliefF, to select discriminative features and improve multi-label classification accuracy. Whereas other multi-label feature selection methods consider only the relationships between pairs of classes, the proposed method introduces the concept of a label set to capture relationships among more than two labels, modifies the rule for computing nearest neighbors to reflect the influence between samples and multiple labels, and incorporates the similarity between samples to reinforce the effect. Experiments on five datasets with the MLkNN classifier show that the proposed method is effective in removing irrelevant or redundant features and that the selected features are more discriminative for classification.
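To make the idea of Relief-style weighting over label sets concrete, the following is a minimal sketch and not the ML-ReliefF algorithm itself, whose details the abstract does not give. It assumes a simplification: each distinct label set is treated as one class, and the standard ReliefF update is applied (near hits lower a feature's weight, near misses weighted by class prior raise it). The function name `relieff_labelset`, the parameter `k`, and the Manhattan distance are illustrative assumptions.

```python
import numpy as np

def relieff_labelset(X, Y, k=3):
    """Relief-style feature weights, treating each distinct label set as a class.

    Illustrative assumption only; this is not the paper's ML-ReliefF.
    X: (n, d) feature matrix. Y: (n, q) binary label matrix.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=int)
    n, d = X.shape

    # Scale each feature to [0, 1] so per-feature differences are comparable.
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0
    Xs = (X - lo) / span

    # Map every row of Y (a label set) to an integer class id.
    _, cls = np.unique(Y, axis=0, return_inverse=True)
    cls = np.ravel(cls)
    classes, counts = np.unique(cls, return_counts=True)
    prior = dict(zip(classes, counts / n))

    w = np.zeros(d)
    for i in range(n):
        diff = np.abs(Xs - Xs[i])      # per-feature differences to sample i
        dist = diff.sum(axis=1)        # Manhattan distance to sample i
        for c in classes:
            idx = np.where(cls == c)[0]
            idx = idx[idx != i]        # never use the sample as its own neighbor
            if idx.size == 0:
                continue
            nearest = idx[np.argsort(dist[idx])[:k]]
            mean_diff = diff[nearest].mean(axis=0)
            if c == cls[i]:
                # Near hits: differing from same-label-set neighbors is penalized.
                w -= mean_diff / n
            else:
                # Near misses: differing from other label sets is rewarded,
                # weighted by that label set's prior as in ReliefF.
                w += prior[c] / (1.0 - prior[cls[i]]) * mean_diff / n
    return w

if __name__ == "__main__":
    # Toy data: feature 0 separates the two label sets, feature 1 does not.
    X = [[0.0, 0.3], [0.1, 0.9], [0.05, 0.5],
         [1.0, 0.4], [0.9, 0.8], [0.95, 0.1]]
    Y = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
    print(relieff_labelset(X, Y, k=2))
```

Features would then be ranked by weight and the top-ranked ones passed to a multi-label classifier. Treating whole label sets as classes ignores partial overlap between label sets, which is one reason a dedicated multi-label formulation such as the one proposed here may be preferable.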
