A Deep Learning Framework for Lifelong Machine Learning

Humans can learn a variety of concepts and skills incrementally over the course of their lives while exhibiting many desirable properties, such as continual learning without forgetting, forward and backward transfer of knowledge, and learning a new concept or task with only a few examples. Several lines of machine learning research, such as lifelong machine learning, few-shot learning, and transfer learning, attempt to capture these properties. However, most previous approaches demonstrate only subsets of these properties, often through distinct, complex mechanisms. In this work, we propose a simple yet powerful unified deep learning framework that supports almost all of these properties and approaches through one central mechanism. Experiments on toy examples support our claims. We also draw connections between many peculiarities of human learning (such as memory loss and savant syndrome, as popularized by the film "Rain Man") and our framework. As academics, we often lack the resources required to build and train deep neural networks with billions of parameters on hundreds of TPUs. Thus, while our framework is still conceptual and our experimental results are surely not state of the art (SOTA), we hope that this unified lifelong learning framework inspires new work towards large-scale experiments and the understanding of human learning in general. This paper is summarized in two short YouTube videos: https://youtu.be/gCuUyGETbTU (part 1) and https://youtu.be/XsaGI01b-1o (part 2).
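The abstract appeals to forward and backward transfer without defining them. One common formalization from the continual learning literature (due to Lopez-Paz and Ranzato's GEM work, not necessarily the notation this paper uses) derives both from the task-accuracy matrix recorded during sequential training. The sketch below is illustrative only; the names `R` and `b` are assumptions, not the authors' own.

```python
import numpy as np

def continual_learning_metrics(R, b):
    """Standard continual-learning metrics from an accuracy matrix.

    R[i, j]: test accuracy on task j after sequentially training on tasks 0..i
             (R is a T x T array for a sequence of T tasks).
    b[j]:    test accuracy on task j of a randomly initialized model,
             used as the baseline for forward transfer.
    Returns (average accuracy, backward transfer, forward transfer);
    negative backward transfer indicates forgetting.
    """
    T = R.shape[0]
    # Average accuracy over all tasks after the final task is learned.
    acc = R[T - 1].mean()
    # Backward transfer: how learning later tasks changed earlier-task accuracy.
    bwt = np.mean([R[T - 1, j] - R[j, j] for j in range(T - 1)])
    # Forward transfer: how learning earlier tasks helps an unseen task.
    fwt = np.mean([R[j - 1, j] - b[j] for j in range(1, T)])
    return acc, bwt, fwt
```

Under these definitions, "continual learning without forgetting" corresponds to backward transfer near zero or positive, and few-shot-style benefits show up as positive forward transfer.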
