Multi-label active learning based on submodular functions

Abstract In multi-label learning, annotating data is especially expensive because each instance is associated with multiple labels. Active learning is therefore particularly valuable in this setting as a way to reduce the labeling cost. Recent research indicates that submodular function optimization works well for subset selection problems, providing theoretical performance guarantees while remaining extremely fast to optimize. In this paper, we propose a query strategy that constructs a submodular function over the selected instance-label pairs, which measures and combines their informativeness and representativeness. The active learning problem can thus be formulated as a submodular function maximization problem, which can be solved efficiently and effectively by a simple lazy greedy algorithm. Experimental results show that the proposed approach outperforms several state-of-the-art multi-label active learning methods.
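The lazy greedy selection mentioned in the abstract can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the paper's actual formulation: the abstract does not give the objective, so the function below (a sum of hypothetical per-candidate `uncertainty` scores for informativeness plus a facility-location coverage term over a hypothetical `similarity` matrix for representativeness) is a stand-in, and the names `lazy_greedy`, `make_gain`, and the toy data are invented for illustration. Candidates here are plain indices standing in for instance-label pairs.

```python
import heapq


def lazy_greedy(candidates, gain, budget):
    """Pick up to `budget` candidates for a monotone submodular objective.

    `gain(item, selected)` must return the marginal gain of adding `item`
    to the list `selected`. Submodularity means a gain computed earlier is
    an upper bound on the current gain, so stale entries can be re-evaluated
    lazily instead of rescoring every candidate in every round.
    """
    selected = []
    # heapq is a min-heap, so gains are negated. Each entry stores the
    # size of `selected` at the time its gain was computed.
    heap = [(-gain(c, selected), idx, 0) for idx, c in enumerate(candidates)]
    heapq.heapify(heap)
    while heap and len(selected) < budget:
        _, idx, stamp = heapq.heappop(heap)
        if stamp == len(selected):
            # The gain is up to date and no other entry's bound exceeds it.
            selected.append(candidates[idx])
        else:
            # Stale bound: recompute against the current selection, push back.
            fresh = gain(candidates[idx], selected)
            heapq.heappush(heap, (-fresh, idx, len(selected)))
    return selected


def make_gain(uncertainty, similarity):
    """Build a marginal-gain function for an assumed toy objective."""
    n = len(uncertainty)

    def objective(selected):
        # Informativeness: modular sum of (assumed) uncertainty scores.
        info = sum(uncertainty[s] for s in selected)
        # Representativeness: how well the selection covers every candidate.
        rep = sum(
            max((similarity[i][s] for s in selected), default=0.0)
            for i in range(n)
        )
        return info + rep

    def gain(item, selected):
        return objective(selected + [item]) - objective(selected)

    return gain


# Hypothetical toy data: four candidates with nonnegative scores.
uncertainty = [0.9, 0.8, 0.1, 0.2]
similarity = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.0, 0.1, 0.8, 1.0],
]
gain = make_gain(uncertainty, similarity)
selected = lazy_greedy(list(range(4)), gain, budget=2)
```

With nonnegative scores both terms are monotone and submodular, so their sum is as well, which is what allows the stale-bound shortcut. On this toy data the second pick comes from the other similarity cluster even though candidate 0 has the higher uncertainty, showing how the coverage term trades informativeness against representativeness.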
