Continual Learning for Robotics

Continual learning (CL) is a machine learning paradigm in which the data distribution and the learning objective change over time, or in which the training data and objective criteria are never all available at once. The learning process is modeled as a sequence of learning experiences, and the goal is to learn new skills throughout the sequence without forgetting what was previously learned. At the same time, continual learning aims to optimize memory use, computational cost, and speed during learning. An important challenge for machine learning is not necessarily finding solutions that work in the real world, but rather finding stable algorithms that can learn in the real world. Hence, the ideal approach would be to tackle the real world with an embodied platform: an autonomous agent. Continual learning would then be effective in an autonomous agent or robot that learns autonomously over time about the external world and incrementally develops a set of complex skills and knowledge. Robotic agents have to learn to adapt to and interact with their environment using a continuous stream of observations. Some recent approaches tackle continual learning for robotics, but most recent papers on continual learning evaluate their approaches only in simulation or on static datasets. Unfortunately, such evaluations do not provide insight into whether these solutions would help continual learning in a robotics context. This paper reviews the state of the art in continual learning, summarizes existing benchmarks and metrics, and proposes a framework for presenting and evaluating both robotics and non-robotics approaches in a way that makes transfer between the two fields easier.
