Unsupervised and Semi-Supervised Image Classification With Weak Semantic Consistency

Supervised methods are widely used for image classification. Although they have made great progress, they rely on well-labeled samples, whereas in practice we often have large quantities of images with few or no labels. To cope with this problem, we propose a novel image classification method constrained by weak semantic consistency. We start from the extreme case in which each image is viewed as its own class, and train exemplar classifiers to separate each image from all the others. For each image, we then use the learned exemplar classifiers to predict its weak semantic correlations with the exemplars. When no labeled information is available, we cluster images using these weak semantic correlations and assign the images within one cluster to the same mid-level class. When partially labeled images are available, we use them to constrain the clustering process so that images with different semantics are assigned to different mid-level classes. The newly assigned images are used to train classifiers and to build new image representations, which are in turn used to assign similar images. The classifier training, image representation, and assignment steps are repeated until convergence. We conduct both unsupervised and semi-supervised image classification experiments on several datasets. The experimental results demonstrate the effectiveness of the proposed method in both settings.
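The iterative procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the classifier, the clustering algorithm, or the form of the label constraint, so the sketch assumes a mean-difference linear classifier, plain k-means with deterministic seeding, and a constraint that simply pins each labeled image to the mid-level class indexed by its label. All function names (`train_linear`, `represent`, `kmeans`, `weak_semantic_classify`) and the toy 2-D "images" are hypothetical.

```python
# Illustrative sketch only. The classifier, clustering, and constraint below
# are simplifying assumptions, not the components used in the paper.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_linear(pos, neg):
    """Assumed stand-in classifier: hyperplane halfway between the two means."""
    mp, mn = mean(pos), mean(neg)
    w = [p - n for p, n in zip(mp, mn)]
    b = -dot(w, [(p + n) / 2 for p, n in zip(mp, mn)])
    return w, b

def represent(x, classifiers):
    """Stack the classifier responses for x into one representation vector."""
    return [dot(w, x) + b for w, b in classifiers]

def kmeans(reps, k, labels, iters=20):
    """k-means where labeled images stay pinned to the cluster of their label."""
    centers = []
    for c in range(k):
        seeds = [r for r, l in zip(reps, labels) if l == c]
        if seeds:                      # seed the center from labeled images
            centers.append(mean(seeds))
        elif not centers:              # otherwise start from the first image
            centers.append(reps[0])
        else:                          # or from the farthest remaining image
            centers.append(max(reps, key=lambda r: min(sqdist(r, m) for m in centers)))
    assign = None
    for _ in range(iters):
        new = [labels[i] if labels[i] is not None
               else min(range(k), key=lambda c: sqdist(r, centers[c]))
               for i, r in enumerate(reps)]
        if new == assign:
            break
        assign = new
        for c in range(k):
            members = [r for r, a in zip(reps, assign) if a == c]
            if members:
                centers[c] = mean(members)
    return assign

def weak_semantic_classify(images, k, labels=None, max_rounds=10):
    n = len(images)
    labels = labels or [None] * n
    # Extreme case: every image is its own class (one exemplar classifier each).
    classifiers = [train_linear([images[i]], images[:i] + images[i + 1:])
                   for i in range(n)]
    assign = None
    for _ in range(max_rounds):
        # Represent each image by its responses to the current classifiers.
        reps = [represent(x, classifiers) for x in images]
        # Cluster the representations into k mid-level classes.
        new = kmeans(reps, k, labels)
        if new == assign:              # assignments stable: converged
            break
        assign = new
        # Retrain one classifier per non-trivial mid-level class.
        classifiers = []
        for c in range(k):
            pos = [x for x, a in zip(images, assign) if a == c]
            neg = [x for x, a in zip(images, assign) if a != c]
            if pos and neg:
                classifiers.append(train_linear(pos, neg))
    return assign

# Toy data: feature vectors standing in for images, forming two groups.
images = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
          [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
unsupervised = weak_semantic_classify(images, k=2)
semi_supervised = weak_semantic_classify(
    images, k=2, labels=[None, None, 1, 0, None, None])
print(unsupervised, semi_supervised)
```

In the semi-supervised call, two images carry labels and the remaining four are assigned by the clustering step, so the two labeled images necessarily end up in different mid-level classes. A cannot-link constraint between images with different labels would be another way to express the same idea.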
