Paper information - A Self-supervised GAN for Unsupervised Few-shot Object Recognition

This paper addresses unsupervised few-shot object recognition, where all training images are unlabeled and the test images are divided into queries and a few labeled support images per object class of interest. The training and test images do not share object classes. We extend the vanilla GAN with two loss functions, both aimed at self-supervised learning. The first is a reconstruction loss that requires the discriminator to reconstruct the probabilistically sampled latent code that was used to generate the “fake” image. The second is a triplet loss that requires the discriminator to output image encodings that are closer for more similar images. Evaluation, comparisons, and detailed ablation studies are carried out in the context of few-shot classification. Our approach significantly outperforms the state of the art on the mini-ImageNet and tiered-ImageNet datasets.
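The two self-supervised losses named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the uniform prior on the latent code, the mean-squared-error form of the reconstruction loss, the Euclidean distance, the hinge form of the triplet loss, and the `margin` value are all assumptions made here for illustration, as are the function names.

```python
import math
import random


def sample_latent_code(dim, rng):
    """Draw a latent code z. The uniform prior on [-1, 1] is an assumption."""
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def reconstruction_loss(z, z_hat):
    """Mean squared error between the sampled code z and the code z_hat that
    the discriminator recovers from the generated image (assumed form)."""
    return sum((a - b) ** 2 for a, b in zip(z, z_hat)) / len(z)


def euclidean(u, v):
    """Euclidean distance between two encodings of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard hinge triplet loss (assumed form): it is zero once the negative
    encoding is at least `margin` farther from the anchor than the positive."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    return max(0.0, gap)


if __name__ == "__main__":
    rng = random.Random(0)
    z = sample_latent_code(4, rng)
    # A perfect reconstruction of the latent code incurs no loss.
    print(reconstruction_loss(z, z))
    # The similar image is already closer than the dissimilar one by more than the margin.
    print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
    # The similar image is farther than the dissimilar one, so the loss is positive.
    print(triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0]))
```

In this sketch the discriminator would be trained to minimize both terms alongside the usual adversarial objective; how the terms are weighted and how similar and dissimilar images are chosen for the triplets are not stated in the abstract and are left open here.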

Khoi Nguyen | Sinisa Todorovic
