An Ensemble Method for High-Dimensional Multilabel Data

Multilabel learning is now receiving increasing attention from a variety of domains, and many learning algorithms have been proposed. Like traditional single-label learning, multilabel learning can also suffer from the problem of high dimensionality, yet little attention has been paid to this issue. In this paper, we propose a new ensemble learning algorithm for multilabel data. The main characteristic of our method is that it exploits features with locally discriminative capabilities for each label to serve the purpose of classification. Specifically, for each label, the discriminative capability of each feature on the positive and negative data is estimated, and the features with the highest capabilities are then selected. Finally, a binary classifier for each label is constructed on the selected features. Experimental results on benchmark data sets show that the proposed method outperforms four popular, previously published multilabel learning algorithms.
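
The following is a minimal sketch, in Python with scikit-learn, of the per-label pipeline the abstract describes: score every feature by how well it separates the positive from the negative instances of a label, keep the top-scoring features, and fit a binary classifier on them. The Fisher-score criterion, the top_k value, and the logistic-regression base learner are illustrative assumptions, not the exact components of the proposed method.

```python
# Sketch of per-label discriminative feature selection plus binary classification.
# Assumptions (not from the paper): Fisher-score-like feature scoring, top_k = 50,
# logistic regression as the base classifier, and every label having at least one
# positive and one negative training instance.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fisher_score(X, y):
    """Score each feature by how well it separates positive from negative instances."""
    pos, neg = X[y == 1], X[y == 0]
    mu_pos, mu_neg = pos.mean(axis=0), neg.mean(axis=0)
    var_pos, var_neg = pos.var(axis=0), neg.var(axis=0)
    return (mu_pos - mu_neg) ** 2 / (var_pos + var_neg + 1e-12)

def train_per_label(X, Y, top_k=50):
    """Train one binary classifier per label on its own top-scoring features."""
    models = []
    for j in range(Y.shape[1]):
        y = Y[:, j]
        idx = np.argsort(fisher_score(X, y))[::-1][:top_k]  # top features for label j
        clf = LogisticRegression(max_iter=1000).fit(X[:, idx], y)
        models.append((idx, clf))
    return models

def predict(models, X):
    """Predict all labels by querying each label-specific classifier."""
    return np.column_stack([clf.predict(X[:, idx]) for idx, clf in models])
```

In this sketch each label gets its own feature subset and classifier, which matches the abstract's emphasis on label-specific local discriminative features; an ensemble could then be formed over such per-label models.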
