Multi-label Active Learning with Error Correcting Output Codes

Driven by the demands of practical problems, multi-label learning, in which each instance can belong to multiple classes, has become an important research topic. Compared with the single-label problem, labeling multi-label data is considerably more expensive because of the diversity and non-uniqueness of the labels. Active learning, which reduces this cost by selecting the most valuable instances for label queries, has therefore attracted considerable interest. Although several multi-label active learning (MLAL) methods have been proposed, they usually decide each label with a single classifier trained under a one-versus-all (OVA) strategy, which makes the classification model fragile and in turn seriously affects the selection criteria that rely on its predictions. In this paper, we use a new multi-label Error Correcting Output Codes (ECOC) method that determines the label of an instance on each class by combining multiple classifiers. This gives our classification model good error-correcting ability and thus ensures that the evaluation information used in the selection process is effective. We then combine two effective selection strategies, margin prediction uncertainty and label cardinality inconsistency, so that they complement each other in selecting the most informative instance. Based on this combination, we propose a novel MLAL framework, termed Multi-label Active Learning with Error Correcting Output Codes (MAOC). Experiments on multiple benchmark multi-label datasets demonstrate the efficacy of the combination in the proposed approach.
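
The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it names: deciding each label from several classifiers through a code matrix, and ranking unlabeled instances by a mix of margin uncertainty and label cardinality inconsistency. The code-matrix layout, the averaging decoder, the uncertainty proxy (one minus the smallest absolute score), the max-scaling, and the weight `beta` are all assumptions made for illustration, and every function name is hypothetical. It is not the MAOC procedure itself.

```python
def decode_scores(outputs, code_matrix):
    """Combine several binary classifiers into one score per label.

    outputs: one real-valued output in [-1, 1] per classifier, for one instance.
    code_matrix: one row per label with entries in {-1, 0, +1}; 0 means the
    classifier takes no part in deciding that label (assumed layout).
    """
    scores = []
    for row in code_matrix:
        used = sum(abs(c) for c in row)  # number of classifiers voting on this label
        scores.append(sum(c * o for c, o in zip(row, outputs)) / used)
    return scores


def margin_uncertainty(scores):
    """Assumed proxy: 1 minus the smallest |score| across labels."""
    return 1.0 - min(abs(s) for s in scores)


def cardinality_inconsistency(scores, avg_cardinality):
    """Gap between the predicted label count and the labeled-set average."""
    predicted = sum(1 for s in scores if s > 0)
    return abs(predicted - avg_cardinality)


def select_query(pool_outputs, code_matrix, avg_cardinality, beta=0.5):
    """Return the index of the pool instance with the highest combined score."""
    decoded = [decode_scores(o, code_matrix) for o in pool_outputs]
    unc = [margin_uncertainty(s) for s in decoded]
    lci = [cardinality_inconsistency(s, avg_cardinality) for s in decoded]
    # Scale each criterion by its maximum so neither dominates by magnitude;
    # fall back to 1.0 when a criterion is zero for the whole pool.
    max_unc, max_lci = max(unc) or 1.0, max(lci) or 1.0
    combined = [beta * u / max_unc + (1 - beta) * l / max_lci
                for u, l in zip(unc, lci)]
    return max(range(len(combined)), key=combined.__getitem__)


# Toy data (made up): 2 labels, 4 classifiers, 3 unlabeled instances.
CODE = [[1, 1, 0, 0], [0, 0, 1, -1]]
POOL = [[0.9, 0.8, -0.9, 0.8],
        [0.1, -0.2, 0.2, 0.1],
        [0.9, 0.9, 0.8, -0.8]]
print(select_query(POOL, CODE, avg_cardinality=1.0, beta=0.7))
```

In this toy pool the second instance has decoded scores near zero on both labels, while the third is predicted confidently but with two labels against an average of one. Shifting `beta` between the two criteria changes which of them is queried, which is the kind of complementarity the abstract refers to.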
