Rectifying the Shortcut Learning of Background: Shared Object Concentration for Few-Shot Image Recognition

Few-shot image classification aims to utilize pretrained knowledge learned from a large-scale dataset to tackle a series of downstream classification tasks, where each task typically involves only a few training examples from brand-new categories. This requires the pretrained model to focus on well-generalizable knowledge while ignoring domain-specific information. In this paper, we observe that the image background serves as a source of domain-specific knowledge: it is a shortcut for models to learn on the source dataset, but is harmful when adapting to brand-new classes. To prevent the model from learning this shortcut, we propose COSOC, a novel few-shot learning framework that automatically identifies foreground objects at both the pretraining and evaluation stages. COSOC is a two-stage algorithm motivated by the observation that foreground objects from different images within the same class share more similar patterns than their backgrounds. At the pretraining stage, for each class, we cluster contrastive-pretrained features of randomly cropped image patches, so that crops containing only foreground objects can be identified as a single cluster. We then force the pretraining model to focus on the found foreground objects via a fusion sampling strategy. At the evaluation stage, among the images of each training class in a few-shot task, we seek shared content and filter out the background; the recognized foreground objects of each class are then used to match the foregrounds of testing images. Extensive experiments on inductive few-shot learning (FSL) tasks on two benchmarks demonstrate the state-of-the-art performance of our method.

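To make the pretraining-stage idea concrete, the following is a minimal, illustrative sketch (not the paper's released implementation) of the crop-clustering step: random crops from all images of a class are embedded with a frozen contrastive encoder, clustered with k-means, and the most compact cluster is treated as the foreground cluster. The names `encoder`, `crops_per_image`, `crop_size`, and `num_clusters`, and the "tightest cluster" heuristic, are assumptions made for this sketch.

```python
# Illustrative sketch of pretraining-stage foreground identification
# (assumed interface: `images` is a list of PIL images of one class,
#  `encoder` is a frozen contrastive feature extractor).
import torch
import torchvision.transforms as T
from sklearn.cluster import KMeans


def find_foreground_crops(images, encoder, crops_per_image=8, crop_size=96,
                          num_clusters=4, device="cuda"):
    """Return, for each image, the crops assigned to the tightest cluster."""
    random_crop = T.RandomResizedCrop(crop_size, scale=(0.2, 0.6))

    # 1. Sample random crops from every image in the class.
    all_crops, owners = [], []
    for idx, img in enumerate(images):
        for _ in range(crops_per_image):
            all_crops.append(T.ToTensor()(random_crop(img)))
            owners.append(idx)
    batch = torch.stack(all_crops).to(device)

    # 2. Embed crops with the frozen contrastive encoder.
    with torch.no_grad():
        feats = encoder(batch)                      # (N_crops, d)
    feats = torch.nn.functional.normalize(feats, dim=1).cpu().numpy()

    # 3. Cluster crop features; the most compact cluster is assumed to hold
    #    foreground crops, since foregrounds of one class look more alike
    #    than their backgrounds.
    km = KMeans(n_clusters=num_clusters, n_init=10).fit(feats)
    dists = ((feats - km.cluster_centers_[km.labels_]) ** 2).sum(axis=1)
    mean_dist = [dists[km.labels_ == c].mean() for c in range(num_clusters)]
    fg_cluster = min(range(num_clusters), key=lambda c: mean_dist[c])

    # 4. Group the foreground crops by their source image, so the pretraining
    #    sampler can draw from foreground regions.
    foreground = {i: [] for i in range(len(images))}
    for crop, owner, label in zip(all_crops, owners, km.labels_):
        if label == fg_cluster:
            foreground[owner].append(crop)
    return foreground
```

In this sketch, the returned per-image foreground crops would feed a sampling strategy that biases pretraining toward foreground regions; the exact fusion sampling and the evaluation-stage shared-content matching are described in the paper and not reproduced here.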