A survey of deep meta-learning

Deep neural networks can achieve great success when given large data sets and sufficient computational resources. However, their ability to learn new concepts quickly is limited. Meta-learning addresses this issue by enabling a network to learn how to learn. The field of deep meta-learning advances at great speed but lacks a unified, in-depth overview of current techniques. With this work, we aim to bridge that gap. After providing the reader with a theoretical foundation, we investigate and summarize key methods, categorized into (i) metric-, (ii) model-, and (iii) optimization-based techniques. In addition, we identify the main open challenges, such as performance evaluation on heterogeneous benchmarks and reducing the computational costs of meta-learning.
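To make the optimization-based category concrete, the sketch below implements a first-order variant of the idea (a Reptile-style update rather than full MAML) on a toy family of 1-D linear-regression tasks. The inner loop adapts a parameter to one task with plain SGD; the outer loop moves the shared initialization toward the task-adapted weights, so that new tasks can be fitted in a few gradient steps. The task distribution, learning rates, and step counts are illustrative assumptions, not values from any particular paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def task_data(slope, n=10):
    """Sample one toy task: noisy 1-D linear regression y = slope * x."""
    x = rng.uniform(-1.0, 1.0, n)
    y = slope * x + 0.01 * rng.normal(size=n)
    return x, y

def inner_sgd(w, x, y, lr=0.1, steps=5):
    """Inner loop: adapt the scalar weight to one task by SGD on the MSE."""
    for _ in range(steps):
        grad = 2.0 * np.mean((w * x - y) * x)  # d/dw of mean squared error
        w = w - lr * grad
    return w

def reptile(meta_iters=200, meta_lr=0.3):
    """Outer loop: nudge the initialization toward each task's adapted weight."""
    w0 = 0.0
    for _ in range(meta_iters):
        slope = rng.uniform(0.5, 1.5)       # tasks differ only in their slope
        x, y = task_data(slope)
        w_adapted = inner_sgd(w0, x, y)
        w0 = w0 + meta_lr * (w_adapted - w0)  # first-order Reptile update
    return w0

w0 = reptile()
```

Because the task slopes are drawn around 1.0, the learned initialization `w0` ends up near the centre of the task distribution, from which a handful of inner SGD steps suffice to fit any new task; full MAML would instead backpropagate through the inner loop, at higher computational cost.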
