MetaMix: Improved Meta-Learning with Interpolation-based Consistency Regularization

Model-Agnostic Meta-Learning (MAML) and its variants are popular few-shot classification methods. They train an initializer across a variety of sampled learning tasks (also known as episodes) such that the initialized model can adapt quickly to new tasks. However, current MAML-based algorithms have limitations in forming generalizable decision boundaries. In this paper, we propose an approach called MetaMix. It generates virtual feature-target pairs within each episode to regularize the backbone models. MetaMix can be integrated with any of the MAML-based algorithms and learn the decision boundaries generalizing better to new tasks. Experiments on the mini-ImageNet, CUB, and FC100 datasets show that MetaMix improves the performance of MAML-based algorithms and achieves state-of-the-art result when integrated with Meta-Transfer Learning.

[1]  Marcin Andrychowicz,et al.  Learning to learn by gradient descent by gradient descent , 2016, NIPS.

[2]  Joshua B. Tenenbaum,et al.  Human-level concept learning through probabilistic program induction , 2015, Science.

[3]  Sergey Levine,et al.  Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.

[4]  Qing Li,et al.  Prototypical Networks for Small Footprint Text-Independent Speaker Verification , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[5]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[6]  Hang Li,et al.  Meta-SGD: Learning to Learn Quickly for Few Shot Learning , 2017, ArXiv.

[7]  Alexandre Lacoste,et al.  TADAM: Task dependent adaptive metric for improved few-shot learning , 2018, NeurIPS.

[8]  Tao Xiang,et al.  Learning to Compare: Relation Network for Few-Shot Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[9]  David Berthelot,et al.  MixMatch: A Holistic Approach to Semi-Supervised Learning , 2019, NeurIPS.

[10]  Pietro Perona,et al.  The Caltech-UCSD Birds-200-2011 Dataset , 2011 .

[11]  Bernt Schiele,et al.  Meta-Transfer Learning for Few-Shot Learning , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Pietro Perona,et al.  One-shot learning of object categories , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Richard S. Zemel,et al.  Prototypical Networks for Few-shot Learning , 2017, NIPS.

[14]  David Berthelot,et al.  ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring , 2020, ICLR.

[15]  Yun Ma,et al.  Virtual Mixup Training for Unsupervised Domain Adaptation , 2019, ArXiv.

[16]  Amos Storkey,et al.  Meta-Learning in Neural Networks: A Survey , 2020, IEEE transactions on pattern analysis and machine intelligence.

[17]  Razvan Pascanu,et al.  Meta-Learning with Latent Embedding Optimization , 2018, ICLR.

[18]  Oriol Vinyals,et al.  Matching Networks for One Shot Learning , 2016, NIPS.

[19]  Yu-Chiang Frank Wang,et al.  A Closer Look at Few-shot Classification , 2019, ICLR.

[20]  Joshua Achiam,et al.  On First-Order Meta-Learning Algorithms , 2018, ArXiv.

[21]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[22]  Hongyu Guo,et al.  MixUp as Locally Linear Out-Of-Manifold Regularization , 2018, AAAI.

[23]  Hong Yu,et al.  Meta Networks , 2017, ICML.

[24]  Qing Li,et al.  Mixing Up Real Samples and Adversarial Samples for Semi-Supervised Learning , 2020, 2020 International Joint Conference on Neural Networks (IJCNN).

[25]  James T. Kwok,et al.  Generalizing from a Few Examples , 2019, ACM Comput. Surv..

[26]  Ioannis Mitliagkas,et al.  Manifold Mixup: Learning Better Representations by Interpolating Hidden States , 2018, 1806.05236.

[27]  Yoshua Bengio,et al.  MetaGAN: An Adversarial Approach to Few-Shot Learning , 2018, NeurIPS.

[28]  David Berthelot,et al.  FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence , 2020, NeurIPS.

[29]  Bo Liu,et al.  Few-Shot Open-Set Recognition Using Meta-Learning , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Qing Li,et al.  An Investigation of Few-Shot Learning in Spoken Term Classification , 2020, INTERSPEECH.

[31]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Artëm Yankov,et al.  Few-Shot Learning with Metric-Agnostic Conditional Embeddings , 2018, ArXiv.

[33]  Yoshua Bengio,et al.  Interpolation Consistency Training for Semi-Supervised Learning , 2019, IJCAI.

[34]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[35]  Hongyi Zhang,et al.  mixup: Beyond Empirical Risk Minimization , 2017, ICLR.

[36]  Hugo Larochelle,et al.  Optimization as a Model for Few-Shot Learning , 2016, ICLR.

[37]  J. Schulman,et al.  Reptile: a Scalable Metalearning Algorithm , 2018 .

[38]  Martial Hebert,et al.  Image Deformation Meta-Networks for One-Shot Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Daan Wierstra,et al.  Meta-Learning with Memory-Augmented Neural Networks , 2016, ICML.