Semi-supervised robust deep neural networks for multi-label image classification

Abstract This paper introduces a robust method for semi-supervised training of deep neural networks for multi-label image classification. The method uses a ramp loss, which is more robust to noisy and incomplete image labels than the classic hinge loss. It learns from both labeled and unlabeled data by propagating labels from the labeled images to their unlabeled neighbors in the feature space. A robust loss function is crucial here, because the initial label propagations may include many errors, which degrade the performance of non-robust loss functions. In contrast, the proposed ramp loss limits the extreme penalties incurred by samples with incorrect labels, and the label assignment improves in each iteration and contributes to the learning process. The proposed method achieves state-of-the-art results in semi-supervised learning experiments on the CIFAR-10 and STL-10 datasets, and results comparable to the state of the art in supervised learning experiments on the NUS-WIDE and MS-COCO datasets. Experimental results also verify that, as expected, the proposed method is more robust against noisy image labels.
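The two ingredients named in the abstract, a bounded ramp loss and label propagation to neighbors in feature space, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the ramp loss below uses the common textbook form (a hinge loss clipped at 1 - s, with s = -1 chosen here as an assumption), the propagation is a plain 1-nearest-neighbor rule, and the function names `hinge_loss`, `ramp_loss`, and `propagate_labels` are hypothetical.

```python
import numpy as np

def hinge_loss(margin):
    # Classic hinge loss: grows without bound as the margin becomes more negative.
    return np.maximum(0.0, 1.0 - margin)

def ramp_loss(margin, s=-1.0):
    # Textbook ramp loss (assumed form): hinge loss clipped at 1 - s.
    # Written as the difference of two hinges, H_1(z) - H_s(z).
    return np.maximum(0.0, 1.0 - margin) - np.maximum(0.0, s - margin)

def propagate_labels(feats_labeled, labels, feats_unlabeled):
    # Illustrative 1-nearest-neighbor propagation (an assumption, not the
    # paper's exact rule): each unlabeled sample copies the label vector of
    # its closest labeled sample in feature space.
    diffs = feats_unlabeled[:, None, :] - feats_labeled[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)
    nearest = sq_dists.argmin(axis=1)
    return labels[nearest]

# A badly mislabeled sample (margin = -10) dominates the hinge loss
# but contributes only a bounded penalty under the ramp loss.
margins = np.array([2.0, 0.5, -1.0, -10.0])
hinge_values = hinge_loss(margins)
ramp_values = ramp_loss(margins)

# Toy multi-label propagation: two labeled points, two unlabeled points.
feats_labeled = np.array([[0.0, 0.0], [10.0, 10.0]])
labels = np.array([[1, 0], [0, 1]])
feats_unlabeled = np.array([[1.0, 1.0], [9.0, 8.0]])
propagated = propagate_labels(feats_labeled, labels, feats_unlabeled)
```

In this sketch the hinge penalty for the margin of -10 is 11, while the ramp penalty saturates at 2, which is the sense in which a wrongly propagated label cannot dominate the training objective.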
