Hierarchical Meta Learning

Meta learning is a promising solution to few-shot learning problems. However, existing meta learning methods are restricted to scenarios where the training and application tasks share the same output structure. To obtain a meta model applicable to tasks with new structures, one must collect new training data and repeat the time-consuming meta-training procedure, which makes these methods inefficient or even inapplicable for heterogeneous few-shot learning tasks. We thus develop a novel and principled Hierarchical Meta Learning (HML) method. Unlike existing methods, which only optimize the adaptability of a meta model to similar tasks, HML also explicitly optimizes its generalizability across heterogeneous tasks. To this end, HML first factorizes a set of similar training tasks into heterogeneous ones and then trains the meta model over them at two levels, maximizing adaptation and generalization performance respectively. The resulting model can then generalize directly to new tasks. Extensive experiments on few-shot classification and regression problems clearly demonstrate the superiority of HML over fine-tuning and state-of-the-art meta learning approaches in terms of generalization across heterogeneous tasks.
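To make the two-level idea concrete, the sketch below illustrates such a training loop in a generic, first-order, MAML-style form on a toy linear-regression problem. This is only an illustration of the general scheme, not the paper's actual algorithm: the names (hml_update, adapt, task clusters), the first-order approximation, and all hyperparameters are assumptions made here for the example.

```python
# Hypothetical sketch of a two-level meta-training loop: level 1 adapts
# to individual tasks; level 2 aggregates post-adaptation gradients
# across heterogeneous task clusters. First-order approximation (the
# second-order terms through the inner step are ignored, as in FOMAML).
import numpy as np

rng = np.random.default_rng(0)

def loss_and_grad(w, X, y):
    """Squared-error loss and gradient for a linear model y = X @ w."""
    err = X @ w - y
    return 0.5 * np.mean(err ** 2), X.T @ err / len(y)

def adapt(w, support, inner_lr=0.1, steps=1):
    """Level 1: adapt meta parameters to one task's support set."""
    X, y = support
    for _ in range(steps):
        _, g = loss_and_grad(w, X, y)
        w = w - inner_lr * g
    return w

def hml_update(w, task_clusters, inner_lr=0.1, outer_lr=0.01):
    """Level 2: one meta step averaging query-set gradients of the
    adapted models over all tasks in all (heterogeneous) clusters."""
    meta_grad = np.zeros_like(w)
    n_tasks = 0
    for cluster in task_clusters:        # heterogeneous task groups
        for support, query in cluster:   # similar tasks within a group
            w_adapted = adapt(w, support, inner_lr)
            _, g = loss_and_grad(w_adapted, *query)
            meta_grad += g
            n_tasks += 1
    return w - outer_lr * meta_grad / n_tasks

def make_task(slope, n=10, d=1):
    """Toy 1-D regression task with a given ground-truth slope."""
    X = rng.normal(size=(n, d))
    return X, X @ np.full(d, slope)

# Two toy "heterogeneous" clusters of (support, query) task pairs.
clusters = [
    [(make_task(s), make_task(s)) for s in (1.0, 1.2)],
    [(make_task(s), make_task(s)) for s in (-1.0, -0.8)],
]

w = np.zeros(1)
for _ in range(200):
    w = hml_update(w, clusters)
print("meta-learned initialization:", w)
```

Under these assumptions, the inner loop plays the role of adaptation to similar tasks, while the outer average over clusters is where a method like HML would additionally enforce generalization across heterogeneous tasks.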
