Weakly Supervised Image Classification with Coarse and Fine Labels

We consider image classification in a weakly supervised scenario where the training data are annotated at different levels of abstraction. A subset of the training data is annotated with coarse labels (e.g., wolf, dog), while the rest is annotated with fine labels (e.g., breeds of wolves and dogs). Each coarse label corresponds to a superclass of several fine labels. Our goal is to learn a model that can classify a new image into one of the fine classes. We investigate how the coarsely labeled data can help improve fine-label classification. Since it is usually much easier to collect data with coarse labels than data with fine labels, the problem setup considered in this paper can benefit a wide range of real-world applications. We propose a model based on convolutional neural networks (CNNs) to address this problem. We demonstrate the effectiveness of the proposed model on several benchmark datasets. Our model significantly outperforms the naive approach that discards the extra coarsely labeled data.
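The abstract does not spell out how the proposed model uses the coarse labels, so the following is only a minimal sketch of one common way to train on such mixed supervision, not the paper's method. It assumes a classifier that outputs one logit per fine class and a hypothetical fine-to-coarse mapping (FINE_TO_COARSE, with two wolf breeds and three dog breeds). For a finely labeled image the loss is the usual cross-entropy; for a coarsely labeled image the fine-class probabilities are summed within each superclass and the cross-entropy is taken over the coarse classes. All names below are illustrative assumptions.

```python
import math

# Hypothetical hierarchy (an assumption, not from the paper):
# fine class index -> coarse class index.
# Fine classes 0-1 are wolf breeds (coarse 0), 2-4 are dog breeds (coarse 1).
FINE_TO_COARSE = [0, 0, 1, 1, 1]


def softmax(logits):
    """Numerically stable softmax over a list of fine-class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def coarse_probs(fine_probs, fine_to_coarse=FINE_TO_COARSE):
    """Sum fine-class probabilities within each coarse superclass."""
    out = [0.0] * (max(fine_to_coarse) + 1)
    for p, c in zip(fine_probs, fine_to_coarse):
        out[c] += p
    return out


def example_loss(logits, label, is_fine, fine_to_coarse=FINE_TO_COARSE):
    """Negative log-likelihood of the label at whichever level it was given.

    `label` is a fine class index when `is_fine` is True,
    otherwise a coarse class index.
    """
    probs = softmax(logits)
    if is_fine:
        return -math.log(probs[label])
    return -math.log(coarse_probs(probs, fine_to_coarse)[label])


# One finely labeled and one coarsely labeled example with the same logits.
logits = [0.0, 0.0, 0.0, 0.0, 0.0]
fine_loss = example_loss(logits, label=2, is_fine=True)
coarse_loss = example_loss(logits, label=1, is_fine=False)
```

Under this sketch a coarsely labeled image still produces a gradient on the fine-class logits, which is one plausible way such data could help the fine classifier instead of being discarded.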
