Paper Information - Towards Cross-Granularity Few-Shot Learning: Coarse-to-Fine Pseudo-Labeling with Visual-Semantic Meta-Embedding

Few-shot learning aims to adapt rapidly to novel categories with only a handful of samples at test time, and it has predominantly been tackled with meta-learning. However, meta-learning approaches essentially learn across a variety of few-shot tasks, so they still require large-scale training data with fine-grained supervision to derive a generalized model, which incurs a prohibitive annotation cost. In this paper, we advance the few-shot classification paradigm towards a more challenging scenario, i.e., cross-granularity few-shot classification, where the model observes only coarse labels during training but is expected to perform fine-grained classification during testing. This setting substantially reduces the annotation cost, since fine-grained labeling usually requires strong domain-specific expertise. To bridge the cross-granularity gap, we approximate the fine-grained data distribution by greedily clustering each coarse class into pseudo-fine classes according to the similarity of image embeddings. We then propose a meta-embedder that jointly optimizes visual and semantic discrimination, at both the instance level and the coarse-class level, to obtain a good feature space for this coarse-to-fine pseudo-labeling process. Extensive experiments and ablation studies on three representative datasets demonstrate the effectiveness and robustness of our approach.
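The coarse-to-fine pseudo-labeling step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the clustering rule, so the version below is an assumption: it uses a simple greedy "leader" scheme in which each sample joins the most similar existing cluster of its own coarse class if the cosine similarity to that cluster's centroid reaches a threshold, and otherwise starts a new pseudo-fine class. The function name `greedy_pseudo_fine_labels` and the parameter `sim_threshold` are hypothetical and are not taken from the paper.

```python
import numpy as np


def greedy_pseudo_fine_labels(embeddings, coarse_labels, sim_threshold=0.8):
    """Assign pseudo-fine labels by greedy clustering inside each coarse class.

    Illustrative assumption only: a threshold-based leader clustering on
    cosine similarity, not necessarily the procedure used in the paper.
    """
    embeddings = np.asarray(embeddings, dtype=np.float64)
    coarse_labels = np.asarray(coarse_labels)

    # L2-normalize so that a dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)

    pseudo = np.full(len(unit), -1, dtype=np.int64)
    next_id = 0  # pseudo-fine ids are globally unique across coarse classes

    for coarse in np.unique(coarse_labels):
        sample_idx = np.flatnonzero(coarse_labels == coarse)
        centroid_sums = []  # running sum of member vectors per cluster
        cluster_ids = []    # global pseudo-fine id of each cluster

        for i in sample_idx:
            if centroid_sums:
                cents = np.stack(centroid_sums)
                cent_norms = np.linalg.norm(cents, axis=1, keepdims=True)
                cents = cents / np.clip(cent_norms, 1e-12, None)
                sims = cents @ unit[i]
                best = int(np.argmax(sims))
                if sims[best] >= sim_threshold:
                    # Close enough: join the cluster and update its centroid.
                    centroid_sums[best] = centroid_sums[best] + unit[i]
                    pseudo[i] = cluster_ids[best]
                    continue
            # No sufficiently similar cluster: open a new pseudo-fine class.
            centroid_sums.append(unit[i].copy())
            cluster_ids.append(next_id)
            pseudo[i] = next_id
            next_id += 1

    return pseudo


if __name__ == "__main__":
    # Toy 2-D embeddings: two coarse classes, each with two visual modes.
    feats = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0],
             [1.0, 0.0], [0.0, 1.0], [0.05, 1.0]]
    coarse = [0, 0, 0, 1, 1, 1]
    print(greedy_pseudo_fine_labels(feats, coarse))
```

Because clustering is restricted to samples that share a coarse label, two visually identical images from different coarse classes never receive the same pseudo-fine label. In this sketch the quality of the resulting pseudo-fine classes depends entirely on the embedding space, which is the role the abstract assigns to the meta-embedder.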
