Continual World: A Robotic Benchmark For Continual Reinforcement Learning

Continual learning (CL), the ability to continuously learn and build on previously acquired knowledge, is a natural requirement for long-lived autonomous reinforcement learning (RL) agents. In building such agents, one needs to balance opposing desiderata, such as limited capacity and compute, resistance to catastrophic forgetting, and positive transfer to new tasks. Understanding the right trade-off is conceptually and computationally challenging, which we argue has led the community to focus overly on catastrophic forgetting. In response, we advocate prioritizing forward transfer and propose Continual World, a benchmark consisting of realistic and meaningfully diverse robotic tasks built on top of Meta-World [2] as a testbed. Following an in-depth empirical evaluation of existing CL methods, we pinpoint their limitations and highlight algorithmic challenges unique to the RL setting. Our benchmark aims to provide a meaningful and computationally inexpensive challenge for the community and thus help better understand existing and future solutions.
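To make the forward-transfer emphasis concrete, below is a minimal sketch of the kind of metric involved: a normalized comparison of the area under a task's success-rate training curve when learned in sequence versus a from-scratch reference run on the same task. The helper names (`auc`, `forward_transfer`) and the synthetic curves are illustrative assumptions, not the benchmark's official implementation, which may differ in detail.

```python
import numpy as np

def auc(success_rates: np.ndarray) -> float:
    """Mean success rate over training, i.e. the normalized area under
    the training curve; assumes success rates lie in [0, 1]."""
    return float(np.mean(success_rates))

def forward_transfer(cl_curve: np.ndarray, reference_curve: np.ndarray) -> float:
    """Normalized forward transfer for one task (illustrative form).

    Positive values mean the continual learner reached high success
    faster than the from-scratch reference; the denominator rescales
    by the headroom left above the reference AUC.
    """
    auc_cl = auc(cl_curve)
    auc_ref = auc(reference_curve)
    return (auc_cl - auc_ref) / (1.0 - auc_ref)

# Usage with synthetic (hypothetical) curves:
if __name__ == "__main__":
    steps = np.linspace(0.0, 1.0, 100)
    reference = np.clip(1.2 * steps, 0.0, 1.0)  # from-scratch learner
    continual = np.clip(2.0 * steps, 0.0, 1.0)  # faster, thanks to transfer
    print(f"forward transfer: {forward_transfer(continual, reference):.3f}")
```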

[1] Eugenio Culurciello, et al. Continual Reinforcement Learning in 3D Non-stationary Environments, 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[2] Sergey Levine, et al. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning, 2019, CoRL.

[3] Andrei A. Rusu, et al. Embracing Change: Continual Learning in Deep Neural Networks, 2020, Trends in Cognitive Sciences.

[4] Shie Mannor, et al. A Deep Hierarchical Approach to Lifelong Learning in Minecraft, 2016, AAAI.

[5] Marc'Aurelio Ranzato, et al. Efficient Lifelong Learning with A-GEM, 2018, ICLR.

[6] Julien Cornebise, et al. Weight Uncertainty in Neural Network, 2015, ICML.

[7] Doina Precup, et al. Towards Continual Reinforcement Learning: A Review and Perspectives, 2020, ArXiv.

[8] Wojciech Czarnecki, et al. Multi-task Deep Reinforcement Learning with PopArt, 2018, AAAI.

[9] David Filliat, et al. Don't forget, there is more than forgetting: new metrics for Continual Learning, 2018, ArXiv.

[10] Eric Eaton, et al. Lifelong Policy Gradient Learning of Factored Policies for Faster Training Without Forgetting, 2020, NeurIPS.

[11] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.

[12] Ferenc Huszár. Note on the quadratic penalties in elastic weight consolidation, 2018, Proceedings of the National Academy of Sciences.

[13] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.

[14] Quoc V. Le, et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, 2019, ICML.

[15] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.

[16] Joel Veness, et al. The Forget-me-not Process, 2016, NIPS.

[17] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[18] Tom Mitchell, et al. Jelly Bean World: A Testbed for Never-Ending Learning, 2020, ICLR.

[19] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[20] Razvan Pascanu, et al. Ray Interference: a Source of Plateaus in Deep Reinforcement Learning, 2019, ArXiv.

[21] Tinne Tuytelaars, et al. A Continual Learning Survey: Defying Forgetting in Classification Tasks, 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22] Davide Maltoni, et al. CORe50: a New Dataset and Benchmark for Continuous Object Recognition, 2017, CoRL.

[23] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.

[24] Stefan Wermter, et al. Continual Lifelong Learning with Neural Networks: A Review, 2019, Neural Networks.

[25] Ryan P. Adams, et al. On Warm-Starting Neural Network Training, 2020, NeurIPS.

[26] Martha White, et al. Meta-Learning Representations for Continual Learning, 2019, NeurIPS.

[27] Svetlana Lazebnik, et al. PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[28] Yarin Gal, et al. Towards Robust Evaluations of Continual Learning, 2018, ArXiv.

[29] Richard E. Turner, et al. Continual Learning with Adaptive Weights (CLAW), 2020, ICLR.

[30] Tinne Tuytelaars, et al. Online Continual Learning with Maximally Interfered Retrieval, 2019, ArXiv.

[31] Demis Hassabis, et al. Improved protein structure prediction using potentials from deep learning, 2020, Nature.

[32] Richard E. Turner, et al. Variational Continual Learning, 2017, ICLR.

[33] Jeffrey Scott Vitter. Random sampling with a reservoir, 1985, TOMS.

[34] Yoshua Bengio, et al. CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning, 2020, ICLR.

[35] Jeff Donahue, et al. Large Scale GAN Training for High Fidelity Natural Image Synthesis, 2018, ICLR.

[36] M. Kenward. An Introduction to the Bootstrap, 2007.

[37] Marcus Rohrbach, et al. Memory Aware Synapses: Learning what (not) to forget, 2017, ECCV.

[38] Murray Shanahan, et al. Policy Consolidation for Continual Reinforcement Learning, 2019, ICML.

[39] Yee Whye Teh, et al. Progress & Compress: A scalable framework for continual learning, 2018, ICML.

[40] Razvan Pascanu, et al. Overcoming catastrophic forgetting in neural networks, 2016, Proceedings of the National Academy of Sciences.

[41] Razvan Pascanu, et al. Adapting Auxiliary Losses Using Gradient Similarity, 2018, ArXiv.

[42] D. Hassabis, et al. Neuroscience-Inspired Artificial Intelligence, 2017, Neuron.

[43] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.

[44] Henry Zhu, et al. Soft Actor-Critic Algorithms and Applications, 2018, ArXiv.

[45] Ilya Sutskever, et al. Zero-Shot Text-to-Image Generation, 2021, ICML.

[46] Aaron Courville, et al. Continuous Coordination As a Realistic Scenario for Lifelong Learning, 2021, ICML.

[47] Pieter Abbeel, et al. Adaptive Online Planning for Continual Lifelong Learning, 2019, ArXiv.

[48] Fan-Keng Sun, et al. LAMOL: LAnguage MOdeling for Lifelong Language Learning, 2020, ICLR.

[49] Yee Whye Teh, et al. Continual Unsupervised Representation Learning, 2019, NeurIPS.
