Semi-Supervised Robust Deep Neural Networks for Multi-Label Classification

In this paper, we propose a robust method for semi-supervised training of deep neural networks for multi-label image classification. To this end, we use the ramp loss, which is more robust to noisy and incomplete image labels than the classical hinge loss. The proposed method learns from both labeled and unlabeled data in a semi-supervised setting by propagating labels from the labeled images to their unlabeled neighbors. A robust loss function is crucial here, because the initial label propagations may include many errors, which degrade the performance of non-robust loss functions. In contrast, the ramp loss bounds the penalty incurred by samples with incorrect labels, so the label assignment improves in each iteration and contributes to the learning process. The proposed method achieves state-of-the-art results in semi-supervised learning experiments on the CIFAR-10 and STL-10 datasets, and results comparable to the state of the art in supervised learning experiments on the NUS-WIDE and MS-COCO datasets.
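The two ideas in the abstract, a bounded loss and label propagation to neighbors, can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the common binary form of the ramp loss (the hinge loss clipped at 1 - s, with a hypothetical parameter s = -1), a margin defined as the label in {-1, +1} times the classifier score, and a simple nearest-neighbor rule in Euclidean feature space for propagation. The function names are illustrative only.

```python
import numpy as np

def hinge_loss(margin):
    """Classical hinge loss max(0, 1 - margin); unbounded for large negative margins."""
    return np.maximum(0.0, 1.0 - margin)

def ramp_loss(margin, s=-1.0):
    """Ramp loss: the hinge loss clipped at 1 - s (s < 1 is an assumed parameter)."""
    return np.minimum(1.0 - s, hinge_loss(margin))

def propagate_labels(x_labeled, y_labeled, x_unlabeled):
    """Assign each unlabeled sample the label vector of its nearest labeled sample.

    Assumption: plain 1-nearest-neighbor in Euclidean distance; the paper's
    actual propagation rule may differ.
    """
    # Pairwise squared distances, shape (n_unlabeled, n_labeled).
    diffs = x_unlabeled[:, None, :] - x_labeled[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    return y_labeled[np.argmin(dists, axis=1)]

# A badly mislabeled sample (margin -10) dominates the hinge loss
# but contributes at most 1 - s = 2 to the ramp loss.
margins = np.array([2.0, 0.5, -1.0, -10.0])
hinge_values = hinge_loss(margins)
ramp_values = ramp_loss(margins)

# Toy multi-label propagation: two labeled points, two unlabeled points.
x_labeled = np.array([[0.0, 0.0], [10.0, 10.0]])
y_labeled = np.array([[1, 0], [0, 1]])
x_unlabeled = np.array([[1.0, 1.0], [9.0, 8.0]])
propagated = propagate_labels(x_labeled, y_labeled, x_unlabeled)
```

Because the ramp loss is flat beyond the clipping point, a wrongly propagated label stops contributing gradient once its margin is sufficiently negative, which is the robustness property the abstract relies on.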
