Hierarchically Structured Meta-learning

In order to learn quickly from few samples, meta-learning leverages prior knowledge acquired from previous tasks. A critical challenge in meta-learning, however, is task uncertainty and heterogeneity, which cannot be handled by knowledge shared globally among all tasks. In this paper, building on gradient-based meta-learning, we propose a Hierarchically Structured Meta-Learning (HSML) algorithm that explicitly tailors the transferable knowledge to different clusters of tasks. Inspired by the way human beings organize knowledge, we employ a hierarchical task clustering structure to group tasks. As a result, the proposed approach not only addresses the challenge through knowledge customization for different clusters of tasks, but also preserves knowledge generalization among similar tasks within a cluster. In addition, to accommodate task relationships that change over time, we extend the hierarchical structure to a continual learning environment. Experimental results show that our approach achieves state-of-the-art performance on both toy regression and few-shot image classification problems.
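To make the core mechanism concrete, the sketch below illustrates the idea in PyTorch: a task embedding computed from the support set is soft-assigned to learnable cluster centers, and a cluster-conditioned gate modulates the globally shared initialization before the usual MAML-style inner-loop adaptation. This is a minimal illustration under stated assumptions, not the authors' implementation: only a single clustering level is shown (HSML stacks several levels into a hierarchy), and all names and design details here (HSMLSketch, the sigmoid gate, distance-based soft assignment) are illustrative choices.

```python
# Minimal sketch of the HSML idea (illustrative, not the authors' code):
# a task embedding is soft-assigned to clusters, and a per-cluster gate
# customizes the shared MAML initialization before inner-loop adaptation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HSMLSketch(nn.Module):
    def __init__(self, in_dim=1, hidden=40, emb_dim=8, n_clusters=4):
        super().__init__()
        # Globally shared initialization theta_0 (a small regression net).
        self.base = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        # Task-embedding network: mean-pools the (x, y) support pairs.
        self.embed = nn.Linear(in_dim + 1, emb_dim)
        # Learnable cluster centers for soft assignment (one level shown;
        # the full HSML hierarchy stacks multiple clustering levels).
        self.centers = nn.Parameter(torch.randn(n_clusters, emb_dim))
        # Per-cluster gate that rescales the shared initialization.
        n_params = sum(p.numel() for p in self.base.parameters())
        self.gate = nn.Linear(emb_dim, n_params)

    def cluster_gate(self, xs, ys):
        emb = self.embed(torch.cat([xs, ys], dim=-1)).mean(0)  # task embedding
        # Soft assignment: closer centers receive larger weight.
        assign = F.softmax(-torch.cdist(emb[None], self.centers), dim=-1)
        mixed = assign @ self.centers                  # cluster-mixed embedding
        return torch.sigmoid(self.gate(mixed)).squeeze(0)

    def adapt(self, xs, ys, inner_lr=0.01):
        # Customize theta_0 for this task's cluster, then take one MAML step.
        gate = self.cluster_gate(xs, ys)
        params, offset = [], 0
        for p in self.base.parameters():
            g = gate[offset:offset + p.numel()].view_as(p)
            params.append(p * (2 * g))   # gated (cluster-tailored) initialization
            offset += p.numel()
        loss = F.mse_loss(self._forward(xs, params), ys)
        grads = torch.autograd.grad(loss, params, create_graph=True)
        return [p - inner_lr * g for p, g in zip(params, grads)]

    def _forward(self, x, params):
        w1, b1, w2, b2 = params
        return F.linear(F.relu(F.linear(x, w1, b1)), w2, b2)

# Usage on one hypothetical 5-shot sine-regression task:
model = HSMLSketch()
xs, ys = torch.randn(5, 1), torch.sin(torch.randn(5, 1))
fast = model.adapt(xs, ys)                            # cluster-tailored fast weights
meta_loss = F.mse_loss(model._forward(xs, fast), ys)  # query set in practice
meta_loss.backward()                                  # meta-gradient through adaptation
```

The key design point the sketch captures is that customization and generalization coexist: tasks softly assigned to the same cluster share a similar gated initialization, while dissimilar tasks are routed to different gates rather than forced through one global prior.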
