Paper Information - Interventional Few-Shot Learning

We uncover a long-overlooked deficiency in prevailing Few-Shot Learning (FSL) methods: the pre-trained knowledge is in fact a confounder that limits performance. This finding is rooted in our causal assumption: a Structural Causal Model (SCM) of the causal relations among the pre-trained knowledge, sample features, and labels. Building on this SCM, we propose a novel FSL paradigm: Interventional Few-Shot Learning (IFSL). Specifically, we develop three effective algorithmic implementations of IFSL based on the backdoor adjustment, which is essentially a causal intervention towards the SCM of many-shot learning: the upper bound of FSL from a causal view. It is worth noting that the contribution of IFSL is orthogonal to existing fine-tuning and meta-learning based FSL methods; hence IFSL can improve all of them, achieving a new 1-/5-shot state of the art on miniImageNet, tieredImageNet, and cross-domain CUB. Code is released at this https URL.
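To make the term "backdoor adjustment" concrete, here is a minimal toy sketch of the generic textbook formula P(Y | do(X)) = Σ_d P(Y | X, D=d) P(D=d). It is not one of the three IFSL implementations from the paper. The binary variables D (standing in for the confounder), X (feature), and Y (label), along with every probability value, are made up purely for illustration.

```python
# Toy illustration of the backdoor adjustment with a single binary confounder.
# All numbers below are hypothetical; they do not come from the paper.

P_D = [0.5, 0.5]                 # P(D=d)
P_X_GIVEN_D = [[0.9, 0.1],       # P(X=x | D=0) for x = 0, 1
               [0.1, 0.9]]       # P(X=x | D=1) for x = 0, 1
P_Y1_GIVEN_XD = [[0.1, 0.3],     # P(Y=1 | X=x, D=0) for x = 0, 1
                 [0.6, 0.8]]     # P(Y=1 | X=x, D=1) for x = 0, 1


def observational(x):
    """P(Y=1 | X=x): weights each stratum by P(D=d | X=x), so D leaks in."""
    joint = [P_X_GIVEN_D[d][x] * P_D[d] for d in range(2)]  # P(X=x, D=d)
    total = sum(joint)                                      # P(X=x)
    return sum(P_Y1_GIVEN_XD[d][x] * joint[d] / total for d in range(2))


def interventional(x):
    """P(Y=1 | do(X=x)): weights each stratum by the prior P(D=d) instead."""
    return sum(P_Y1_GIVEN_XD[d][x] * P_D[d] for d in range(2))


if __name__ == "__main__":
    obs_gap = observational(1) - observational(0)
    do_gap = interventional(1) - interventional(0)
    print(f"observational gap:  {obs_gap:.2f}")
    print(f"interventional gap: {do_gap:.2f}")
```

In this toy setup, X raises P(Y=1) by 0.2 within each stratum of D, and the adjusted estimate recovers that gap. The unadjusted conditional reports a much larger gap because D influences both X and Y.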
