Paper information - Revisiting Meta-Learning as Supervised Learning

Revisiting Meta-Learning as Supervised Learning

Recent years have witnessed an abundance of new publications and approaches on meta-learning. This community-wide enthusiasm has sparked great insights but has also created a plethora of seemingly different frameworks, which can be hard to compare and evaluate. In this paper, we aim to provide a principled, unifying framework by revisiting and strengthening the connection between meta-learning and traditional supervised learning. By treating pairs of task-specific data sets and target models as (feature, label) samples, we can reduce many meta-learning algorithms to instances of supervised learning. This view not only unifies meta-learning into an intuitive and practical framework but also allows us to transfer insights from supervised learning directly to improve meta-learning. For example, we obtain a better understanding of generalization properties, and we can readily transfer well-understood techniques, such as model ensembling, pre-training, joint training, data augmentation, and even nearest-neighbor-based methods. We provide intuitive analogues of these methods in the context of meta-learning and show that they give rise to significant improvements in model performance on few-shot learning.
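The reduction described in the abstract can be illustrated with a small toy sketch. It is not the paper's implementation: the 1-D regression tasks, the `embed` function, and the helper names are all assumptions made for illustration. Each meta-training sample pairs a task data set (the feature) with a target model (the label, here just a slope), and an ordinary 1-nearest-neighbor learner is applied to those pairs.

```python
def embed(dataset):
    """Map a task data set to a feature vector.

    Hypothetical choice for this sketch: the least-squares slope
    through the origin, wrapped in a 1-tuple.
    """
    sxy = sum(x * y for x, y in dataset)
    sxx = sum(x * x for x, _ in dataset)
    return (sxy / sxx,)


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def fit_meta_learner(meta_train):
    """Fitting a 1-NN learner just stores the (feature, label) pairs."""
    return [(embed(dataset), target) for dataset, target in meta_train]


def predict_model(memory, dataset):
    """Return the target model of the nearest stored task data set."""
    query = embed(dataset)
    _, target = min(memory, key=lambda pair: squared_distance(pair[0], query))
    return target


# Meta-training set: each sample is (task data set, target model).
# The target model of the toy task y = s * x is the slope s itself.
inputs = [0.5, 1.0, 1.5]
meta_train = [([(x, s * x) for x in inputs], s)
              for s in [-2.0, -1.0, 0.0, 1.0, 2.0]]

memory = fit_meta_learner(meta_train)

# A new two-shot task with slightly noisy labels.
new_task = [(1.0, 1.0), (2.0, 1.7)]
predicted_slope = predict_model(memory, new_task)
print(predicted_slope)  # prints 1.0
```

In this sketch the meta-learner never treats the new task specially: it sees one more feature vector and returns the label of its nearest neighbor, which is the sense in which meta-learning becomes an instance of supervised learning.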
