Universal Representation Learning from Multiple Domains for Few-shot Classification

In this paper, we look at the problem of few-shot image classification, which aims to learn a classifier for previously unseen classes and domains from only a few labeled samples. Recent methods use various adaptation strategies to align their visual representations to new domains, or select the relevant ones from multiple domain-specific feature extractors. In this work, we present URL, which learns a single set of universal visual representations by distilling the knowledge of multiple domain-specific networks after co-aligning their features with the help of adapters and centered kernel alignment. We show that the universal representations can be further refined for previously unseen domains by an efficient adaptation step, in a similar spirit to distance learning methods. We rigorously evaluate our model on the recent Meta-Dataset benchmark and demonstrate that it significantly outperforms the previous methods while being more efficient.
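To make the abstract's two ingredients concrete, here is a minimal PyTorch sketch. It is an illustration under our own assumptions rather than the authors' released implementation: `linear_cka` is the standard linear centered kernel alignment similarity (Kornblith et al., 2019), used here as an alignment loss between adapter-transformed student features and a domain-specific teacher's features, and `nearest_centroid_logits` is a prototypical-networks-style distance classifier of the kind the test-time adaptation step builds on. The adapter shape, the loss form, and all function names are hypothetical.

```python
import torch

def linear_cka(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Linear centered kernel alignment between feature matrices
    x: (n, d1) and y: (n, d2), computed on the same n inputs.
    Returns a scalar in [0, 1]; higher means more similar representations."""
    x = x - x.mean(dim=0, keepdim=True)  # center each feature dimension
    y = y - y.mean(dim=0, keepdim=True)
    cross = torch.norm(y.t() @ x, p="fro") ** 2
    return cross / (torch.norm(x.t() @ x, p="fro") * torch.norm(y.t() @ y, p="fro"))

def cka_distill_loss(student_feats, teacher_feats, adapter):
    """One per-domain term of the distillation objective (sketch):
    push adapter(student) features to be CKA-similar to the teacher's."""
    return 1.0 - linear_cka(adapter(student_feats), teacher_feats)

def nearest_centroid_logits(query, support, support_labels, n_way):
    """Distance-based few-shot classifier: class centroids from the
    support set, negative Euclidean distances as logits."""
    centroids = torch.stack(
        [support[support_labels == c].mean(dim=0) for c in range(n_way)]
    )
    return -torch.cdist(query, centroids)  # (n_query, n_way)

# Toy usage with random features; all shapes are illustrative only.
student = torch.randn(64, 512)                   # universal (student) features
teacher = torch.randn(64, 512)                   # one domain-specific teacher
adapter = torch.nn.Linear(512, 512, bias=False)  # small co-alignment head
loss = cka_distill_loss(student, teacher, adapter)
loss.backward()                                  # updates the adapter parameters

support, labels = torch.randn(25, 512), torch.arange(5).repeat(5)
logits = nearest_centroid_logits(torch.randn(10, 512), support, labels, n_way=5)
```

In the full method, one such distillation term would be summed over all training domains, each with its own teacher and adapter; at test time, a natural instantiation of the "efficient adaptation step" is to learn a small transformation on top of the frozen universal features by optimizing the nearest-centroid objective on the support set. The sketch above only fixes the shapes of those ideas.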
