Meta Learning on a Sequence of Imbalanced Domains with Difficulty Awareness

Recognizing new objects from only a few labeled examples in an evolving environment is crucial for real-world machine learning systems to generalize well. Current meta learning algorithms typically assume a stationary task distribution during meta training. In this paper, we explore a more practical and challenging setting in which the task distribution changes over time with domain shift. In particular, we consider realistic scenarios where the task distribution is highly imbalanced and domain labels are unavailable. We propose a kernel-based method for domain change detection and a difficulty-aware memory management mechanism that jointly considers the imbalanced domain sizes and domain importance to learn across domains continually. Furthermore, we introduce an efficient adaptive task sampling method for meta training, which significantly reduces task gradient variance with theoretical guarantees. Finally, we propose a challenging benchmark with imbalanced domain sequences and varied domain difficulty, and extensive evaluations on this benchmark demonstrate the effectiveness of our method.
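
The abstract mentions a kernel-based method for detecting domain changes without domain labels. As a minimal sketch only (not necessarily the paper's exact statistic), one plausible instantiation compares a sliding window of recent task embeddings against a reference window with a maximum mean discrepancy (MMD) two-sample statistic; the embedding function, window sizes, and `threshold` below are hypothetical choices.

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel matrix between two sets of task embeddings (rows are samples)."""
    sq_dists = np.sum(x**2, axis=1)[:, None] + np.sum(y**2, axis=1)[None, :] - 2 * x @ y.T
    return np.exp(-gamma * sq_dists)

def mmd2(x, y, gamma=1.0):
    """Biased estimate of the squared maximum mean discrepancy between x and y."""
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())

def detect_domain_change(recent, reference, threshold=0.05, gamma=1.0):
    """Flag a domain change when the MMD between a recent window of task
    embeddings and a reference window exceeds a threshold.
    `threshold` is a hypothetical tuning parameter; in practice it could be
    calibrated from the empirical MMD distribution under no change."""
    return mmd2(recent, reference, gamma) > threshold
```

In such a scheme, each incoming task could be embedded (e.g., by average-pooling features of its support set), appended to the recent window, and compared against the stored reference window; once a change is flagged, the reference window would be reset to the new domain's embeddings.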
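The abstract also describes an adaptive task sampling scheme that reduces task gradient variance during meta training. A common way to realize this idea, shown here purely as an assumed sketch rather than the paper's algorithm, is importance sampling: draw tasks with probability proportional to a difficulty score (e.g., a running estimate of per-task loss or gradient norm) and reweight each sampled task's gradient by 1/(N * p_i) so the meta-gradient estimate stays unbiased.

```python
import numpy as np

def sample_tasks(scores, batch_size, rng=None, eps=1e-8):
    """Sample task indices with probability proportional to a difficulty score,
    returning importance weights 1 / (N * p_i) that keep the meta-gradient
    estimate unbiased. `scores` is a hypothetical per-task difficulty signal."""
    rng = np.random.default_rng() if rng is None else rng
    scores = np.asarray(scores, dtype=float) + eps      # avoid zero-probability tasks
    probs = scores / scores.sum()                       # sampling distribution
    idx = rng.choice(len(probs), size=batch_size, replace=True, p=probs)
    weights = 1.0 / (len(probs) * probs[idx])           # importance weights
    return idx, weights
```

Under this kind of scheme, the weighted meta-gradient has the same expectation as uniform task sampling, while concentrating updates on harder or higher-gradient tasks can lower its variance; the exact sampling distribution and variance guarantees in the paper may differ.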
