Improving classification with semi-supervised and fine-grained learning

Abstract In this paper, we propose a novel and efficient multi-stage approach that combines semi-supervised learning and fine-grained learning to improve the performance of a classification model learned from only a few samples. The fine-grained category recognition process used in our method is dubbed MSR. In this process, we cut images into multi-scale parts and feed them into the network to learn more fine-grained features. By assigning dynamic weights to these image cuts, we reduce the negative impact of background information and thus achieve more accurate predictions. Furthermore, we present the voted pseudo label (VPL), an efficient semi-supervised learning method. For unlabeled data, VPL picks the classes with non-confused labels, verified by the consensus prediction of different classification models. These two methods can be applied to most neural network models and training methods. Inspired by classifier-based adaptation, we also propose a mixed deep CNN architecture (MixDCNN). Both VPL and MSR are integrated with MixDCNN. Comprehensive experiments demonstrate the effectiveness of VPL and MSR. Without bells and whistles, we achieve state-of-the-art or even better performance on two fine-grained recognition tasks, on the Stanford Dogs and CUB Birds datasets, with accuracies of 95.6% and 85.2%, respectively.
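
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `voted_pseudo_labels` and `fuse_crop_predictions`, the `min_agreement` parameter, and the choice of a normalized weighted average for combining crops are all assumptions made for illustration. The abstract does not say how the dynamic weights are computed, so here they are simply passed in.

```python
from collections import Counter


def voted_pseudo_labels(predictions, min_agreement=None):
    """Keep unlabeled samples whose label is agreed on by several models.

    predictions: list with one entry per model; each entry is a list of
    predicted class labels, one per unlabeled sample.
    min_agreement: hypothetical parameter; number of models that must agree.
    Defaults to all models (strict consensus).
    Returns a dict {sample_index: pseudo_label}.
    """
    if not predictions:
        return {}
    n_models = len(predictions)
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every model must predict every sample")
    if min_agreement is None:
        min_agreement = n_models
    # Requiring a strict majority rules out ties between two labels.
    if not n_models / 2 < min_agreement <= n_models:
        raise ValueError("min_agreement must be a strict majority")

    selected = {}
    for i in range(n_samples):
        votes = Counter(model_preds[i] for model_preds in predictions)
        label, count = votes.most_common(1)[0]
        if count >= min_agreement:
            selected[i] = label  # consensus reached: use as pseudo label
    return selected


def fuse_crop_predictions(crop_probs, weights):
    """Combine per-crop class probabilities with per-crop weights.

    crop_probs: list of probability vectors, one per image crop.
    weights: one non-negative weight per crop (assumed given; a low
    weight would down-weight a crop dominated by background).
    Returns (predicted_class_index, fused_probabilities).
    """
    if not crop_probs or len(crop_probs) != len(weights):
        raise ValueError("need one weight per crop")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(crop_probs[0])
    fused = [0.0] * n_classes
    for probs, w in zip(crop_probs, weights):
        for c in range(n_classes):
            fused[c] += (w / total) * probs[c]
    best = max(range(n_classes), key=lambda c: fused[c])
    return best, fused
```

With three models predicting `[0, 1, 2, 1]`, `[0, 1, 1, 1]`, and `[0, 2, 2, 1]`, strict consensus keeps only samples 0 and 3; relaxing `min_agreement` to 2 keeps all four. The retained samples would then be added to the labeled training set.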
