Incremental Learning for Fine-Grained Image Recognition

This paper considers the problem of fine-grained image recognition with a growing vocabulary. Since in many real world applications we often have to add a new object category or visual concept with just a few images to learn from, it is crucial to develop a method that is able to generalize the recognition model from existing classes to new classes. Deep convolutional neural networks are capable of constructing powerful image representations; however, these networks usually rely on a logistic loss function that cannot handle the incremental learning problem. In this paper, we present a new method that can efficiently learn a new class given only a limited number of training examples, which we evaluate on the problems of food and clothing recognition. To illustrate the performance of our proposed method on the task of recognizing different kinds of food, when using only 1.3\% of training examples per category we achieved about 73\% of the performance (as measured by F1-score) compared to when using all available training data.

[1]  Razvan Pascanu,et al.  Theano: A CPU and GPU Math Compiler in Python , 2010, SciPy.

[2]  Andrew Zisserman,et al.  Return of the Devil in the Details: Delving Deep into Convolutional Nets , 2014, BMVC.

[3]  Tong Zhang,et al.  A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data , 2005, J. Mach. Learn. Res..

[4]  Qiang Chen,et al.  Network In Network , 2013, ICLR.

[5]  Stefan Carlsson,et al.  CNN Features Off-the-Shelf: An Astounding Baseline for Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[6]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[8]  Matthieu Guillaumin,et al.  Incremental Learning of NCM Forests for Large-Scale Image Classification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Y. Nesterov A method for solving the convex programming problem with convergence rate O(1/k^2) , 1983 .

[10]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[11]  Jianfeng Gao,et al.  Scalable training of L1-regularized log-linear models , 2007, ICML '07.

[12]  Trevor Darrell,et al.  Open-vocabulary Object Retrieval , 2014, Robotics: Science and Systems.

[13]  Xinlei Chen,et al.  NEIL: Extracting Visual Knowledge from Web Data , 2013, 2013 IEEE International Conference on Computer Vision.

[14]  Matthieu Guillaumin,et al.  Food-101 - Mining Discriminative Components with Random Forests , 2014, ECCV.

[15]  Trevor Darrell,et al.  Recognizing Image Style , 2013, BMVC.

[16]  Amit K. Roy-Chowdhury,et al.  Incremental Activity Modeling and Recognition in Streaming Videos , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Bodo Rosenhahn,et al.  Expanding object detector's Horizon: Incremental learning framework for object detection in videos , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Gabriela Csurka,et al.  Distance-Based Image Classification: Generalizing to New Classes at Near-Zero Cost , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.