Meta-Learning with Latent Embedding Optimization

Gradient-based meta-learning techniques are both widely applicable and proficient at solving challenging few-shot learning and fast adaptation problems. However, they have practical difficulties when operating on high-dimensional parameter spaces in extreme low-data regimes. We show that it is possible to bypass these limitations by learning a data-dependent latent generative representation of model parameters, and performing gradient-based meta-learning in this low-dimensional latent space. The resulting approach, latent embedding optimization (LEO), decouples the gradient-based adaptation procedure from the underlying high-dimensional space of model parameters. Our evaluation shows that LEO can achieve state-of-the-art performance on the competitive miniImageNet and tieredImageNet few-shot classification tasks. Further analysis indicates LEO is able to capture uncertainty in the data, and can perform adaptation more effectively by optimizing in latent space.
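The core idea in the abstract, adapting a low-dimensional latent code and decoding it into model parameters rather than adapting the parameters directly, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the model is a linear regressor instead of a few-shot classifier, the encoder and decoder are fixed random linear maps instead of learned networks, the latent code is deterministic rather than sampled, and the outer meta-training loop is omitted. The names `encode`, `adapt_in_latent_space`, `PARAM_DIM`, `LATENT_DIM`, and `SHOTS` are hypothetical.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def loss_and_grad(weights, inputs, targets):
    """Mean squared error of the linear model x -> weights . x, with its gradient."""
    n = len(inputs)
    errors = [sum(w * x for w, x in zip(weights, xs)) - y
              for xs, y in zip(inputs, targets)]
    loss = sum(e * e for e in errors) / n
    grad = [2.0 / n * sum(e * xs[j] for e, xs in zip(errors, inputs))
            for j in range(len(weights))]
    return loss, grad

def encode(inputs, targets, encoder):
    """Hypothetical encoder: map a crude support-set summary to a latent code."""
    n = len(inputs)
    summary = [sum(y * xs[j] for xs, y in zip(inputs, targets)) / n
               for j in range(len(inputs[0]))]
    return matvec(encoder, summary)

def adapt_in_latent_space(z, decoder, inputs, targets, lr=0.05, steps=10):
    """Inner loop: gradient descent on the latent code z, not on the weights."""
    decoder_t = [list(col) for col in zip(*decoder)]
    losses = []
    for _ in range(steps):
        weights = matvec(decoder, z)            # decode z into model parameters
        loss, grad_w = loss_and_grad(weights, inputs, targets)
        losses.append(loss)
        grad_z = matvec(decoder_t, grad_w)      # chain rule through the linear decoder
        z = [zi - lr * gi for zi, gi in zip(z, grad_z)]
    # Record the loss reached after the last update as well.
    losses.append(loss_and_grad(matvec(decoder, z), inputs, targets)[0])
    return z, losses

# Assumed toy sizes: 8 model parameters, a 2-dimensional latent code, 5 support examples.
rng = random.Random(0)
PARAM_DIM, LATENT_DIM, SHOTS = 8, 2, 5
inputs = [[rng.uniform(-1, 1) for _ in range(PARAM_DIM)] for _ in range(SHOTS)]
targets = [rng.uniform(-1, 1) for _ in range(SHOTS)]
# Stand-ins for learned networks: small random linear maps.
encoder = [[rng.uniform(-0.3, 0.3) for _ in range(PARAM_DIM)] for _ in range(LATENT_DIM)]
decoder = [[rng.uniform(-0.3, 0.3) for _ in range(LATENT_DIM)] for _ in range(PARAM_DIM)]

z0 = encode(inputs, targets, encoder)           # data-dependent starting point
z, losses = adapt_in_latent_space(z0, decoder, inputs, targets)
print(len(z), losses[0], losses[-1])
```

In this sketch the inner loop updates only the 2 latent coordinates while the 8 model weights are always recomputed from them, which is the decoupling the abstract describes. The step size is chosen small enough for this toy quadratic loss that each update cannot increase the support-set loss; a real setup would need its own tuning.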

Razvan Pascanu | Raia Hadsell | Simon Osindero | Oriol Vinyals | Dushyant Rao | Andrei A. Rusu | Jakub Sygnowski
