Active Learning for Convolutional Neural Networks: A Core-Set Approach

Convolutional neural networks (CNNs) have been successfully applied to many recognition and learning tasks using a universal recipe; training a deep model on a very large dataset of supervised examples. However, this approach is rather restrictive in practice since collecting a large set of labeled images is very expensive. One way to ease this problem is coming up with smart ways for choosing images to be labelled from a very large collection (ie. active learning). Our empirical study suggests that many of the active learning heuristics in the literature are not effective when applied to CNNs in batch setting. Inspired by these limitations, we define the problem of active learning as core-set selection, ie. choosing set of points such that a model learned over the selected subset is competitive for the remaining data points. We further present a theoretical result characterizing the performance of any selected subset using the geometry of the datapoints. As an active learning algorithm, we choose the subset which is expected to yield best result according to our characterization. Our experiments show that the proposed method significantly outperforms existing approaches in image classification experiments by a large margin.

[1]  David J. C. MacKay,et al.  Information-Based Objective Functions for Active Data Selection , 1992, Neural Computation.

[2]  Andrew McCallum,et al.  Employing EM and Pool-Based Active Learning for Text Classification , 1998, ICML.

[3]  Daphne Koller,et al.  Support Vector Machine Active Learning with Applications to Text Classification , 2002, J. Mach. Learn. Res..

[4]  Andrew McCallum,et al.  Toward Optimal Active Learning through Monte Carlo Estimation of Error Reduction , 2001, ICML 2001.

[5]  Klaus Brinker,et al.  Incorporating Diversity in Active Learning with Support Vector Machines , 2003, ICML.

[6]  H. Sebastian Seung,et al.  Selective Sampling Using the Query by Committee Algorithm , 1997, Machine Learning.

[7]  Sanjoy Dasgupta,et al.  Analysis of a greedy active learning strategy , 2004, NIPS.

[8]  Sariel Har-Peled,et al.  Smaller Coresets for k-Median and k-Means Clustering , 2005, SCG.

[9]  Ivor W. Tsang,et al.  Core Vector Machines: Fast SVM Training on Very Large Data Sets , 2005, J. Mach. Learn. Res..

[10]  Rong Jin,et al.  Batch mode active learning and its application to medical image classification , 2006, ICML.

[11]  Jinbo Bi,et al.  Active learning via transductive experimental design , 2006, ICML.

[12]  G. Griffin,et al.  Caltech-256 Object Category Dataset , 2007 .

[13]  Trevor Darrell,et al.  Active Learning with Gaussian Processes for Object Categorization , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[14]  Dale Schuurmans,et al.  Discriminative Batch Mode Active Learning , 2007, NIPS.

[15]  Steve Hanneke,et al.  A bound on the label complexity of agnostic active learning , 2007, ICML '07.

[16]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[17]  Nikolaos Papanikolopoulos,et al.  Multi-class active learning for image classification , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[19]  Burr Settles,et al.  Active Learning Literature Survey , 2009 .

[20]  Nikolaos Papanikolopoulos,et al.  Multi-class batch-mode active learning for image classification , 2010, 2010 IEEE International Conference on Robotics and Automation.

[21]  Jeff A. Bilmes,et al.  Interactive Submodular Set Cover , 2010, ICML.

[22]  Yuhong Guo,et al.  Active Instance Sampling via Matrix Partition , 2010, NIPS.

[23]  Andrew Y. Ng,et al.  Reading Digits in Natural Images with Unsupervised Feature Learning , 2011 .

[24]  Shie Mannor,et al.  Robustness and generalization , 2010, Machine Learning.

[25]  Andreas Krause,et al.  Adaptive Submodularity: Theory and Applications in Active Learning and Stochastic Optimization , 2010, J. Artif. Intell. Res..

[26]  Lorenzo Bruzzone,et al.  Batch-Mode Active-Learning Methods for the Interactive Classification of Remote Sensing Images , 2011, IEEE Transactions on Geoscience and Remote Sensing.

[27]  Alexander G. Gray,et al.  UPAL: Unbiased Pool Based Active Learning , 2011, AISTATS.

[28]  Xin Li,et al.  Adaptive Active Learning for Image Classification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Shai Shalev-Shwartz,et al.  Efficient active learning of halfspaces: an aggressive approach , 2012, J. Mach. Learn. Res..

[30]  Jieping Ye,et al.  Querying discriminative and representative samples for batch mode active learning , 2013, KDD.

[31]  Allen Y. Yang,et al.  A Convex Optimization Framework for Active Learning , 2013, 2013 IEEE International Conference on Computer Vision.

[32]  Jeff A. Bilmes,et al.  Using Document Summarization Techniques for Speech Data Subset Selection , 2013, NAACL.

[33]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[34]  Yi Yang,et al.  Multi-Class Active Learning by Uncertainty Sampling with Diversity Maximization , 2015, International Journal of Computer Vision.

[35]  Tapani Raiko,et al.  Semi-supervised Learning with Ladder Networks , 2015, NIPS.

[36]  Daniel Cremers,et al.  CAPTCHA Recognition with Active Deep Learning , 2015 .

[37]  Rishabh K. Iyer,et al.  Submodularity in Data Subset Selection and Active Learning , 2015, ICML.

[38]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[39]  Ruth Urner,et al.  Active Nearest Neighbors in Changing Environments , 2015, ICML.

[40]  Franziska Abend,et al.  Facility Location Concepts Models Algorithms And Case Studies , 2016 .

[41]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Wojciech Zaremba,et al.  Improved Techniques for Training GANs , 2016, NIPS.

[43]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[44]  Yonatan Wexler,et al.  Minimizing the Maximal Loss: How and Why , 2016, ICML.

[45]  Martín Abadi,et al.  TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.

[46]  Zoubin Ghahramani,et al.  Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning , 2015, ICML.

[47]  Ruimao Zhang,et al.  Cost-Effective Active Learning for Deep Image Classification , 2017, IEEE Transactions on Circuits and Systems for Video Technology.

[48]  Zoubin Ghahramani,et al.  Deep Bayesian Active Learning with Image Data , 2017, ICML.