Beyond Fine-Tuning: Transferring Behavior in Reinforcement Learning
Víctor Campos | Pablo Sprechmann | Steven Hansen | André Barreto | Steven Kapturowski | Alex Vitvitskyi | Adrià Puigdomènech Badia | Charles Blundell