Incorporating label dependency into the binary relevance framework for multi-label classification

In multi-label classification, an example can be associated with several labels simultaneously. One way to learn from multi-label data is to transform the multi-label problem into several single-label classification problems. Binary relevance is one such method: it decomposes the multi-label learning task into independent binary classification problems, one for each label in the label set, and determines the final labels of each example by aggregating the predictions of all binary classifiers. However, this approach ignores any dependency among the labels. Aiming to accurately predict label combinations, in this paper we propose a simple approach that enables the binary classifiers to discover existing label dependency by themselves. An experimental study using decision trees, a kernel method, and Naive Bayes as base-learning techniques shows the potential of the proposed approach to improve multi-label classification performance.
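To make the baseline concrete, the following is a minimal sketch of plain binary relevance as described above: one independent binary classifier per label, with the predicted label set formed by collecting every label whose classifier answers positively. It does not implement the approach proposed in the paper. The class names `BinaryRelevance` and `NearestCentroid`, the toy nearest-centroid base learner, and the example data are all assumptions chosen for illustration; any binary learner (decision tree, kernel method, Naive Bayes) could take the base learner's place.

```python
from typing import Dict, Iterable, List, Sequence, Set


class NearestCentroid:
    """Toy binary base learner (hypothetical stand-in for any binary classifier)."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "NearestCentroid":
        # Store the mean feature vector of each class that occurs in y.
        self.centroids: Dict[int, List[float]] = {}
        for cls in (0, 1):
            rows = [x for x, target in zip(X, y) if target == cls]
            if rows:
                self.centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x: Sequence[float]) -> int:
        # Return the class whose centroid is closest in squared Euclidean distance.
        def sq_dist(centroid: Sequence[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, centroid))

        return min(self.centroids, key=lambda cls: sq_dist(self.centroids[cls]))


class BinaryRelevance:
    """Plain binary relevance: one independent binary model per label."""

    def __init__(self, labels: Iterable[str], make_learner=NearestCentroid):
        self.labels = list(labels)
        self.make_learner = make_learner
        self.models: Dict[str, NearestCentroid] = {}

    def fit(self, X: Sequence[Sequence[float]], Y: Sequence[Set[str]]) -> "BinaryRelevance":
        for label in self.labels:
            # Binary target for this label: 1 if the example carries it, else 0.
            y = [1 if label in label_set else 0 for label_set in Y]
            self.models[label] = self.make_learner().fit(X, y)
        return self

    def predict(self, x: Sequence[float]) -> Set[str]:
        # Aggregate: keep every label whose own classifier predicts positive.
        return {label for label, model in self.models.items()
                if model.predict_one(x) == 1}


# Illustrative data (made up): label "a" follows feature 0, label "b" follows feature 1.
X_train = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y_train = [set(), {"b"}, {"a"}, {"a", "b"}]

br = BinaryRelevance(labels=["a", "b"]).fit(X_train, Y_train)
print(br.predict([0.9, 0.1]))
print(br.predict([0.9, 0.9]))
```

Because each per-label model is trained and queried in isolation, no model can use what the others predict, which is the limitation the abstract points to.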
