Classifying low-resolution images by integrating privileged information in deep CNNs

Abstract. As introduced by [1], privileged information is complementary data attached to a training example that is unavailable for test examples. In this paper, we consider the problem of recognizing low-resolution images (the targeted task) while leveraging their high-resolution versions as privileged information. In this context, we propose a novel framework for integrating privileged information into the learning phase of a deep neural network. We present a natural multi-class formulation of the problem and provide an end-to-end training framework for the internal deep representations. Based on a detailed analysis of state-of-the-art approaches, we propose a novel loss function that combines two different ways of computing indicators of an example's difficulty from its privileged information. We validate our approach experimentally in various contexts, showing the benefit of our model for tasks such as fine-grained image classification and image recognition on a dataset containing annotation noise.
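The idea of using privileged information to estimate an example's difficulty can be illustrated with a minimal sketch. This is not the loss proposed in the paper, whose exact form the abstract does not give. It assumes a hypothetical setup in which a high-resolution (privileged) branch and a low-resolution (target) branch each output class logits, and in which the difficulty of an example is taken to be one minus the probability the privileged branch assigns to the true class. The function names and the weighting scheme are assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Multi-class cross-entropy of one example given its true label index."""
    return -math.log(softmax(logits)[label])


def difficulty(privileged_logits, label):
    """Hypothetical difficulty indicator in [0, 1].

    Assumption: an example is 'hard' when the privileged (high-resolution)
    branch gives low probability to the true class.
    """
    return 1.0 - softmax(privileged_logits)[label]


def weighted_loss(low_res_logits, high_res_logits, labels):
    """Cross-entropy on the low-resolution branch, weighted per example.

    Assumption: easier examples (low difficulty) receive larger weights,
    so examples the privileged branch itself cannot classify (possibly
    noisy labels) contribute less. The weights are normalised to sum to 1.
    """
    total, weight_sum = 0.0, 0.0
    for lr, hr, y in zip(low_res_logits, high_res_logits, labels):
        w = 1.0 - difficulty(hr, y)
        total += w * cross_entropy(lr, y)
        weight_sum += w
    return total / max(weight_sum, 1e-12)
```

At test time only the low-resolution logits would be used. The privileged branch appears solely inside the training loss, which matches the constraint that privileged information is unavailable for test examples.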

[1] Dmitry Pechyony, et al. Fast Optimization Algorithms for Solving SVM, 2012.

[2] Luc Van Gool, et al. Fast Algorithms for Linear and Kernel SVM+, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5] Bernhard Schölkopf, et al. Unifying distillation and privileged information, 2015, ICLR.

[6] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.

[7] Subhransu Maji, et al. Fine-Grained Visual Classification of Aircraft, 2013, ArXiv.

[8] Qiang Ji, et al. Classifier learning with hidden information, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9] Yi Li, et al. DeepTrack: Learning Discriminative Feature Representations by Convolutional Neural Networks for Visual Tracking, 2014, BMVC.

[10] Vladimir Vapnik, et al. A new learning paradigm: Learning using privileged information, 2009, Neural Networks.

[11] Chih-Jen Lin, et al. LIBLINEAR: A Library for Large Linear Classification, 2008, J. Mach. Learn. Res.

[12] Matthieu Cord, et al. LR-CNN for fine-grained classification with varying resolution, 2015, 2015 IEEE International Conference on Image Processing (ICIP).

[13] Eric O. Postma, et al. Learning scale-variant and scale-invariant features for deep image classification, 2016, Pattern Recognit.

[14] Trevor Darrell, et al. Learning with Side Information through Modality Hallucination, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15] Andrew Zisserman, et al. Return of the Devil in the Details: Delving Deep into Convolutional Nets, 2014, BMVC.

[16] Dumitru Erhan, et al. Going deeper with convolutions, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17] Yi Li, et al. R-FCN: Object Detection via Region-based Fully Convolutional Networks, 2016, NIPS.

[18] Kavita Bala, et al. Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19] Peter Tiño, et al. Incorporating Privileged Information Through Metric Learning, 2013, IEEE Transactions on Neural Networks and Learning Systems.

[20] Jitendra Malik, et al. Contextual Action Recognition with R*CNN, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[21] Matthieu Cord, et al. Recipe recognition with large multimodal food dataset, 2015, 2015 IEEE International Conference on Multimedia & Expo Workshops (ICMEW).

[22] Ross B. Girshick, et al. Fast R-CNN, 2015, arXiv:1504.08083.

[23] Jan Feyereisl, et al. Object Localization based on Structural SVM using Privileged Information, 2014, NIPS.

[24] Xinlei Chen, et al. Mind's eye: A recurrent visual representation for image caption generation, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25] Bernt Schiele, et al. Learning using privileged information: SVM+ and weighted SVM, 2013, Neural Networks.

[26] Ivan Laptev, et al. Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[27] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.

[28] Christoph H. Lampert, et al. Learning to Rank Using Privileged Information, 2013, 2013 IEEE International Conference on Computer Vision.

[29] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[30] Christoph H. Lampert, et al. Learning to Transfer Privileged Information, 2014, ArXiv.

[31] Stefan Carlsson, et al. CNN Features Off-the-Shelf: An Astounding Baseline for Recognition, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.