Optimizing Active Learning for Low Annotation Budgets

When a large amount of annotated data cannot be assumed, active learning is a good strategy. It consists of learning a model on a small amount of annotated data (the annotation budget) and choosing the best set of points to annotate in order to improve the previous model and gain in generalization. In deep learning, active learning is usually implemented as an iterative process in which successive deep models are updated via fine-tuning, but this still poses some issues. First, the initial batch of annotated images has to be sufficiently large to train a deep model. This assumption is strong, especially when the total annotation budget is small. We tackle this issue with an approach inspired by transfer learning: a pre-trained model is used as a feature extractor, and only shallow classifiers are learned during the active learning iterations. The second issue is that the probability and feature estimates of early models are unreliable for the active learning task, yet samples are generally selected for annotation using acquisition functions based only on the last learned model. We introduce a novel acquisition function that exploits the iterative nature of the active learning process to select samples in a more robust fashion: samples whose predictions shift most towards uncertainty between the last two learned models are favored. A diversification step then selects samples from different regions of the classification space, introducing a representativeness component into our approach. Our approach is evaluated against competitive methods on three balanced and imbalanced datasets and outperforms them.
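To make the acquisition step concrete, below is a minimal sketch of the uncertainty-shift selection described above. It assumes entropy as the uncertainty measure and k-means clustering in the frozen feature space for the diversification step; the paper's exact formulation may differ, and the function name `select_batch` and the `pool_factor` shortlist parameter are illustrative choices, not the authors' API.

```python
import numpy as np
from sklearn.cluster import KMeans

def entropy(probs, eps=1e-12):
    """Shannon entropy of each row of class probabilities."""
    return -np.sum(probs * np.log(probs + eps), axis=1)

def select_batch(probs_prev, probs_curr, features, budget, pool_factor=5):
    """Select `budget` unlabeled samples whose uncertainty increased
    most between the last two learned models, then diversify.

    probs_prev, probs_curr: (n_pool, n_classes) class probabilities
        for the unlabeled pool from the previous and current models.
    features: (n_pool, d) representations from the fixed pre-trained
        feature extractor.
    """
    # Shift towards uncertainty: positive when the current model is
    # less certain about a sample than the previous one was.
    shift = entropy(probs_curr) - entropy(probs_prev)

    # Keep an enlarged shortlist of the highest-shift samples,
    # ordered by decreasing shift.
    shortlist = np.argsort(-shift)[: budget * pool_factor]

    # Diversification: cluster the shortlist in feature space and
    # take the highest-shift sample from each cluster, so the batch
    # covers different regions of the classification space.
    clusters = KMeans(n_clusters=budget, n_init=10).fit_predict(
        features[shortlist])
    selected = [shortlist[clusters == c][0] for c in range(budget)]
    return np.asarray(selected)
```

Because `shortlist` is sorted by decreasing shift, indexing each cluster's members and taking the first one returns its most uncertainty-shifted sample, which combines the informativeness and representativeness criteria in one pass.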
