Task-Agnostic Continual Reinforcement Learning: In Praise of a Simple Baseline
[1] Jorge Armando Mendez Mendez et al. How to Reuse and Compose Knowledge for a Lifetime of Tasks: A Survey on Continual Learning and Functional Composition, 2022, Trans. Mach. Learn. Res.
[2] Sergey Levine et al. CoMPS: Continual Meta Policy Search, 2021, ICLR.
[3] Pau Rodríguez López et al. Sequoia: A Software Framework to Unify Continual Learning Research, 2021, ArXiv.
[4] Stephen J. Roberts et al. Same State, Different Task: Continual Reinforcement Learning without Interference, 2021, AAAI.
[5] Razvan Pascanu et al. Continual World: A Robotic Benchmark For Continual Reinforcement Learning, 2021, NeurIPS.
[6] Massimo Caccia et al. Understanding Continual Learning Settings with Data Distribution Drift Analysis, 2021, ArXiv.
[7] D. Bacciu et al. Continual Learning for Recurrent Neural Networks: an Empirical Evaluation, 2021, Neural Networks.
[8] Joelle Pineau et al. Multi-Task Reinforcement Learning with Context-based Representations, 2021, ICML.
[9] Doina Precup et al. Towards Continual Reinforcement Learning: A Review and Perspectives, 2020, J. Artif. Intell. Res.
[10] Andrei A. Rusu et al. Embracing Change: Continual Learning in Deep Neural Networks, 2020, Trends in Cognitive Sciences.
[11] Eric Eaton et al. Lifelong Policy Gradient Learning of Factored Policies for Faster Training Without Forgetting, 2020, NeurIPS.
[12] Maria R. Cervera et al. Continual learning in recurrent neural networks, 2020, ICLR.
[13] Wenhao Ding et al. Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes, 2020, NeurIPS.
[14] Chelsea Finn et al. Deep Reinforcement Learning amidst Lifelong Non-Stationarity, 2020, ArXiv.
[15] Timothée Lesort. Continual Learning: Tackling Catastrophic Forgetting in Deep Neural Networks with Replay Processes, 2020, ArXiv.
[16] Yi Wu et al. Multi-Task Reinforcement Learning with Soft Modularization, 2020, NeurIPS.
[17] Sergey Levine et al. DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction, 2020, NeurIPS.
[18] David Vázquez et al. Online Fast Adaptation and Knowledge Accumulation (OSAKA): a New Approach to Continual Learning, 2020, NeurIPS.
[19] S. Levine et al. Gradient Surgery for Multi-Task Learning, 2020, NeurIPS.
[20] David Filliat et al. Regularization Shortcomings for Continual Learning, 2019, ArXiv.
[21] Natalia Gimelshein et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[22] S. Levine et al. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning, 2019, CoRL.
[23] Alex Smola et al. Meta-Q-Learning, 2019, ICLR.
[24] Francisco S. Melo et al. Multi-task Learning and Catastrophic Forgetting in Continual Reinforcement Learning, 2019, GCAI.
[25] Matthias De Lange et al. A Continual Learning Survey: Defying Forgetting in Classification Tasks, 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[26] Tinne Tuytelaars et al. Online Continual Learning with Maximally Interfered Retrieval, 2019, ArXiv.
[27] David Filliat et al. DisCoRL: Continual Reinforcement Learning via Policy Distillation, 2019, ArXiv.
[28] Tayo Obafemi-Ajayi et al. Recurrent Network and Multi-arm Bandit Methods for Multi-task Learning without Task Specification, 2019, International Joint Conference on Neural Networks (IJCNN).
[29] Yee Whye Teh et al. Task Agnostic Continual Learning via Meta Learning, 2019, ArXiv.
[30] Efthymios Tzinis et al. Continual Learning of New Sound Classes Using Generative Replay, 2019, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[31] Mikhail S. Burtsev et al. Continual and Multi-task Reinforcement Learning With Shared Episodic Memory, 2019, ArXiv.
[32] Alexander J. Smola et al. P3O: Policy-on Policy-off Policy Optimization, 2019, UAI.
[33] Andreas S. Tolias et al. Three scenarios for continual learning, 2019, ArXiv.
[34] Sergey Levine et al. Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables, 2019, ICML.
[35] Marc'Aurelio Ranzato et al. Continual Learning with Tiny Episodic Memories, 2019, ArXiv.
[36] S. Levine et al. Deep Online Learning via Meta-Learning: Continual Adaptation for Model-Based RL, 2018, ICLR.
[37] Matteo Hessel et al. Deep Reinforcement Learning and the Deadly Triad, 2018, ArXiv.
[38] David Rolnick et al. Experience Replay for Continual Learning, 2018, NeurIPS.
[39] Thomas Wolf et al. Continuous Learning in a Hierarchical Multiscale Neural Network, 2018, ACL.
[40] Elad Hoffer et al. Task Agnostic Continual Learning Using Online Variational Bayes, 2018, ArXiv:1803.10123.
[41] Philip H. S. Torr et al. Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence, 2018, ECCV.
[42] Sergey Levine et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[43] Svetlana Lazebnik et al. PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning, 2018, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Qiang Yang et al. A Survey on Multi-Task Learning, 2017, IEEE Transactions on Knowledge and Data Engineering.
[45] Marc'Aurelio Ranzato et al. Gradient Episodic Memory for Continual Learning, 2017, NIPS.
[46] Sergey Levine et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[47] Andrei A. Rusu et al. Overcoming catastrophic forgetting in neural networks, 2016, Proceedings of the National Academy of Sciences.
[48] David Silver et al. Memory-based control with recurrent neural networks, 2015, ArXiv.
[49] Jimmy Ba et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[50] Yoshua Bengio et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, 2014, ArXiv.
[51] Daniele Calandriello et al. Sparse multi-task reinforcement learning, 2014, Intelligenza Artificiale.
[52] Eric Eaton et al. Online Multi-Task Learning for Policy Gradient Methods, 2014, ICML.
[53] Finale Doshi-Velez et al. Hidden Parameter Markov Decision Processes: A Semiparametric Regression Approach for Discovering Latent Task Parametrizations, 2013, IJCAI.
[54] Yuval Tassa et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[55] Manfred Huber et al. Improving tractability of POMDPs by separation of decision and perceptual processes, 2012, IEEE International Conference on Systems, Man, and Cybernetics (SMC).
[56] Jürgen Schmidhuber et al. Solving Deep Memory POMDPs with Recurrent Policy Gradients, 2007, ICANN.
[57] Bram Bakker et al. Reinforcement Learning with Long Short-Term Memory, 2001, NIPS.
[58] R. French. Catastrophic forgetting in connectionist networks, 1999, Trends in Cognitive Sciences.
[59] Leslie Pack Kaelbling et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[60] Leslie Pack Kaelbling et al. Learning Policies for Partially Observable Environments: Scaling Up, 1997, ICML.
[61] Long Ji Lin et al. Reinforcement Learning of Non-Markov Decision Processes, 1995, Artif. Intell.
[62] M. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[63] Tom M. Mitchell et al. Reinforcement learning with hidden states, 1993.
[64] Sebastian Thrun et al. Lifelong robot learning, 1993, Robotics Auton. Syst.
[65] Long Ji Lin et al. Self-improving reactive agents based on reinforcement learning, planning and teaching, 1992, Machine Learning.
[66] G. Qi et al. Pretrained Language Model in Continual Learning: A Comparative Study, 2022, ICLR.
[67] Ruslan Salakhutdinov et al. Recurrent Model-Free RL is a Strong Baseline for Many POMDPs, 2021, ArXiv.
[68] Dit-Yan Yeung et al. Hidden-Mode Markov Decision Processes for Nonstationary Sequential Decision Making, 2001, Sequence Learning.
[69] Richard S. Sutton et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[70] Michael McCloskey et al. Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem, 1989.